Machine Learning for the ATLAS Experiment

(1)


Adam Bailey

Artemisa Workshop

29/05/2019

(2)

Introduction

● ATLAS TileCal group

○ Tile calorimeter and ATLAS analysis

● Currently involved in several analysis searches:

○ Standard Model H → ττ

○ Beyond the Standard Model H/A → ττ

○ Lepton flavour violating H → τ e/μ

○ Beyond the Standard Model HH → bbγγ

● For TileCal, use of neural networks in calorimeter signal reconstruction

● Group members involved in machine learning studies:

○ Adam Bailey, Fernando Carrio Argos, Luca Fiorini, Sergi Rodriguez

(3)

DNN Signal Reconstruction

● HL-LHC pile-up degrades the pulse quality, so the performance of the current LHC algorithm deteriorates

● 128 FPGAs to process the full ATLAS Tile Calorimeter

● Each FPGA processes 96 different signals at a 40 MHz rate

● Deep NN for real-time signal reconstruction at the HL-LHC

● NN runs on FPGA, but training happens offline (GPU or CPU)

A variation of the sigmoidal function is used
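
To make the structure of such a network concrete, here is a minimal NumPy sketch of the forward pass of a small fully connected network that maps digitised pulse samples to one reconstructed amplitude. It is an illustration only: the number of samples per pulse, the hidden layer size, the random weights and the plain logistic sigmoid are all assumptions, since the slides only state that a variation of the sigmoid is used.

```python
import numpy as np

def sigmoid(x):
    """Plain logistic sigmoid (assumption: the slides use an unspecified variation)."""
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, n_hidden, rng):
    """Random placeholder weights; a real network would load trained values."""
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }

def reconstruct_amplitude(samples, params):
    """Map a batch of digitised pulses (n_pulses, n_in) to one amplitude per pulse."""
    hidden = sigmoid(samples @ params["W1"] + params["b1"])
    return (hidden @ params["W2"] + params["b2"]).ravel()

rng = np.random.default_rng(0)
N_SAMPLES = 7  # assumed number of digitised samples per pulse
params = init_params(N_SAMPLES, n_hidden=8, rng=rng)
pulses = rng.normal(size=(96, N_SAMPLES))  # one toy pulse per signal handled by an FPGA
amplitudes = reconstruct_amplitude(pulses, params)
```

A network this small consists of a few matrix multiplications and element-wise activations per pulse. The training that would produce the weights is done offline and is not shown.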

(4)

DNN Signal Reconstruction

● Deep NN performs much better than the current algorithm under HL-LHC pile-up conditions

● FPGAs allow relatively complex algorithms to be run online at 40 MHz

● To do: More ML options to be studied

Plot labels: DNN Algo regression = 0.98; ATLAS Run 2 Algo regression = 0.90

(5)

Machine Learning for New Physics Searches

● Lepton flavour violation exists in nature (neutrino oscillations), but LFV in the charged sector (cLFV) is extremely suppressed in the SM:

○ BR(μ → eγ) < 10^-48

● Many models predict cLFV decays of Higgs and/or Z and/or other massive resonances.

● Low-energy results (e.g. μ → eγ, τ → eee, μ-e conversion, etc.) provide indirect constraints, but these often rely on assumptions.

● ATLAS & CMS Experiments:

○ Complementary efforts made by both collaborations

● Analysis performance plays a leading role:

○ Steep lepton reconstruction turn-on curves

○ High lepton p_T resolution

○ Misidentified lepton rate ⇒ the lower the better

→ Machine Learning can make the difference

(6)

Machine Learning for New Physics Searches

● Application of Machine Learning in LHC searches for new physics.

● Search for Lepton Flavour Violating decays of the Higgs boson

● So far using BDTs and high-level variables

● Plan to optimise a DNN by adding low-level (kinematic) variables

● Need access to a small set of ATLAS data for training, O(100 GB)

S. Rodriguez, L. Fiorini

ATLAS-CONF-2019-013

(7)

Higgs Boson Properties Measurement

● Measurement of the Higgs boson properties is becoming more and more precise

● New techniques are becoming necessary to explore specific phase spaces of the Higgs boson production (differential and Simplified Template cross-section)

(8)

ML for Higgs Boson Properties

● VBF production is very interesting for precise properties measurements, but is polluted by ggF production

● One of the main purposes of the Run 2 Higgs properties papers is the STXS measurement

○ Hence the separation of the VBF and ggF contributions becomes more important.

● Goal: Improve purity achieved by cut-based selection using ML techniques

● Use NN/BDT to separate VBF Higgs events from background and ggF production

S. Rodriguez, L. Fiorini

ML improves separation by a factor ~2 for the same VBF signal efficiency
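
A comparison "at the same signal efficiency" can be sketched as follows: fix the fraction of signal kept, derive the corresponding threshold for each discriminant, and compare how much background survives. The toy Gaussian scores below are purely illustrative assumptions and are not ATLAS results; `background_efficiency` is a hypothetical helper name, and reading "separation" as the ratio of surviving background fractions is also an assumption.

```python
import numpy as np

def background_efficiency(sig_scores, bkg_scores, signal_eff):
    """Fraction of background passing the cut that keeps `signal_eff` of the signal."""
    # Threshold chosen so that a fraction `signal_eff` of the signal lies above it.
    threshold = np.quantile(sig_scores, 1.0 - signal_eff)
    return float(np.mean(bkg_scores >= threshold))

rng = np.random.default_rng(1)
n = 20000
# Toy discriminants: the "ML" score separates the classes more than the "cut" variable.
sig_cut, bkg_cut = rng.normal(1.0, 1.0, n), rng.normal(0.0, 1.0, n)
sig_ml, bkg_ml = rng.normal(1.5, 1.0, n), rng.normal(0.0, 1.0, n)

eff = 0.5  # common signal efficiency at which both selections are compared
bkg_eff_cut = background_efficiency(sig_cut, bkg_cut, eff)
bkg_eff_ml = background_efficiency(sig_ml, bkg_ml, eff)
improvement = bkg_eff_cut / bkg_eff_ml  # > 1 means the ML score rejects more background
```

In a real study the scores would come from the cut-based variable and the trained NN/BDT evaluated on independent test samples, with event weights taken into account.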

(9)

HH → bb γγ

● If new physics exists at O(TeV), there is a good chance that it is connected to the Higgs sector.

○ Higgs boson pair production is important for understanding the structure of the electroweak spontaneous symmetry breaking mechanism

○ bbγγ is the simplest production process that is sensitive to the self-coupling

○ Can probe λ, higher-dimensional interactions, and the existence of heavier states coupled to the Higgs (JHEP 11 (2018) 040)

● Participation in bbγγ searches at 13 TeV

○ Search for heavy resonances decaying via X → HH

○ Search for non-resonant enhancement of double Higgs production above the SM

● Prepare future analyses for the HL-LHC with sensitivity to SM production of HH, related to H_λ

X → HH → bbγγ

(10)

Machine Learning for HH → bb γγ

● Looking into using MVA for both Non-Resonant and Resonant searches

● Started to use MVA to create a VBF category for the bbγγ final state

● MVA provides better separation of signal and background than cut-based selection

● The MVA uses reduced-size derived objects (100 GB in size)

● For more complex applications, training and testing take several hours, so using Artemisa would be a great improvement.

I. Sayago, A. Ruiz, L. Fiorini

(11)

BSM H/A → ττ

● Search for additional heavy BSM Higgs, 2HDM-type II

● In the MSSM: 2 Higgs doublets, 5 Higgs bosons

● For large tanβ, couplings to down-type fermions are enhanced

● Two channels: τ_lep τ_had (~46%) and τ_had τ_had (~42%)

● Split into b-tag and b-veto category
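
The quoted channel fractions follow from the single-τ branching fractions. The short check below assumes a rounded hadronic τ branching fraction of about 64.8% (a value not given on the slide) and treats the two τ decays as independent.

```python
# Approximate single-tau branching fractions (rounded values, assumed here).
BR_TAU_HAD = 0.648             # tau -> hadrons + neutrino
BR_TAU_LEP = 1.0 - BR_TAU_HAD  # tau -> electron or muon + neutrinos

# Either of the two taus can be the leptonic one, hence the factor 2.
frac_lephad = 2.0 * BR_TAU_LEP * BR_TAU_HAD
frac_hadhad = BR_TAU_HAD ** 2
frac_leplep = BR_TAU_LEP ** 2

print(f"lep-had: {frac_lephad:.1%}, had-had: {frac_hadhad:.1%}, lep-lep: {frac_leplep:.1%}")
```

With these inputs the two channels used in the analysis cover roughly 88% of all ττ decays. The remainder is the fully leptonic final state.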

A. Bailey, L. Fiorini

● Search for 2 ~back-to-back τ leptons with opposite charge, plus MET

● Effectively, the signal is a pair of ~back-to-back τ leptons, usually with some MET

● Previous and current analyses do not use machine learning

○ Plenty of scope for improvement

○ Plan on early Run 2 paper soon, later updated paper

○ Later paper will include MVA and other improvements

(12)

BSM H/A → ττ : MVA

● First MVA studies:

○ XGBoost BDT

○ 3 signal mass bins

● 2 examples from τ_hadτ_had b-veto

● Already see improvement over standard analysis method

● Higher signal masses easier to separate

● Found similar results with “simple” neural network (Keras + TensorFlow)

Plot label: Cut-based performance

(13)

BSM H/A → ττ : Future ML Studies

● Can also try more complex neural networks

● Splitting into signal mass bins is reasonable, but:

○ Splits the MC statistics

○ Leads to discontinuities in limit curves

● Better ways to deal with different signal masses?

○ Mass-parameterised neural network: output is a function of event-level quantities and true signal mass. Performance is not degraded at masses it was not trained on

● Avoid training on mass-related variables?

○ Use adversarial neural network, penalise learning signal mass?

● Channel classification:

○ Possible to use ML to separate ggF/bbH production modes?

○ Currently require 1 or more jets passing a b-tagging score cut to define the b-tag channel

○ Instead use the b-tag BDT score as input to classification BDT?
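
For the mass-parameterised network mentioned above, one common way to build the training inputs is to append the signal mass as an extra feature: signal events get their true generated mass, and background events, which have none, get a mass drawn at random from the signal mass points. The sketch below shows only this input preparation. The helper name `add_mass_feature`, the placeholder features and the three mass points are hypothetical and are not taken from the analysis.

```python
import numpy as np

def add_mass_feature(features, is_signal, true_mass, rng):
    """Append a mass column: true mass for signal, random signal mass for background."""
    mass = np.array(true_mass, dtype=float)  # copy, so the caller's array is untouched
    signal_masses = mass[is_signal]
    n_bkg = int(np.count_nonzero(~is_signal))
    # Background has no true signal mass, so sample one from the signal masses;
    # this stops the network from tagging background using the mass column alone.
    mass[~is_signal] = rng.choice(signal_masses, size=n_bkg)
    return np.column_stack([features, mass])

rng = np.random.default_rng(2)
n_sig, n_bkg = 300, 500
features = rng.normal(size=(n_sig + n_bkg, 4))  # placeholder event-level variables
is_signal = np.arange(n_sig + n_bkg) < n_sig
true_mass = np.full(n_sig + n_bkg, np.nan)
# Hypothetical signal mass points in GeV.
true_mass[:n_sig] = rng.choice([300.0, 500.0, 1000.0], size=n_sig)

x_train = add_mass_feature(features, is_signal, true_mass, rng)
```

At evaluation time the mass column would be set to the mass hypothesis under test for every event, so a single network gives a score at each mass, including values between the training points.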


(14)

Machine Learning Tools

● Summary of those used in H/A → ττ so far

● For initial tests, have been using general tools, not HEP specific

○ XGBoost for BDT

○ Keras + TensorFlow for neural network

● These require some manipulation of data:

○ Need numpy array as input

○ Then have to apply results back to ROOT ntuples

● Possible to use TMVA too

○ Useful to produce plots of input variables, correlations etc.

● Other tools: numpy, pandas, sklearn

● Note: Have found quite strict TensorFlow + CUDA version compatibility requirements
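
The data manipulation steps listed above (branches to a NumPy array, then scores back alongside the original branches) can be sketched with NumPy alone. The structured array stands in for events read from a ROOT ntuple; the branch names, values and scoring function are hypothetical, and the actual ROOT reading and writing is omitted.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Stand-in for events read from an ntuple: one named field per branch.
events = np.zeros(5, dtype=[("tau_pt", "f4"), ("met", "f4"), ("weight", "f4")])
events["tau_pt"] = [55.0, 80.0, 120.0, 65.0, 200.0]
events["met"] = [20.0, 35.0, 10.0, 50.0, 90.0]
events["weight"] = 1.0

# 1) Select the input branches and convert them to a plain 2D float array.
input_names = ["tau_pt", "met"]
x = rfn.structured_to_unstructured(events[input_names], dtype=np.float32)

# 2) Placeholder for a trained model's prediction: one score per event.
scores = 1.0 / (1.0 + np.exp(-(x[:, 0] - 100.0) / 50.0))

# 3) Attach the score as a new field next to the original branches.
events_scored = rfn.append_fields(events, "mva_score", scores.astype("f4"), usemask=False)
```

Keeping the score in the same per-event structure as the inputs preserves the event ordering, which matters when the result is written back to the ntuple.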
