Machine Learning for the ATLAS Experiment
Adam Bailey
Artemisa Workshop
29/05/2019
Introduction
● ATLAS TileCal group
○ Tile calorimeter and ATLAS analysis
● Currently involved in several analysis searches:
○ Standard Model H → ττ
○ Beyond the Standard Model H/A → ττ
○ Lepton flavour violating H → τ e/μ
○ Beyond the Standard Model HH → bbγγ
● For TileCal, use of neural networks in calorimeter signal reconstruction
● Group members involved in machine learning studies:
○ Adam Bailey, Fernando Carrio Argos, Luca Fiorini, Sergi Rodriguez
DNN Signal Reconstruction
● HL-LHC pile-up degrades the pulse quality, so the performance of the current LHC algorithm deteriorates
● 128 FPGAs to process the full ATLAS Tile Calorimeter
● Each FPGA processes 96 different signals at a 40 MHz rate
● Deep NN for real-time signal reconstruction at the HL-LHC
● NN runs on FPGA, but training happens offline (GPU or CPU)
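As a back-of-the-envelope check of the numbers above, the following sketch works out the aggregate load, taking the quoted figures at face value. The variable names are illustrative and do not come from the slides.

```python
# Figures quoted on the slide, taken at face value.
N_FPGAS = 128             # FPGAs covering the full Tile Calorimeter
SIGNALS_PER_FPGA = 96     # signals handled by each FPGA
RATE_HZ = 40e6            # 40 MHz processing rate

total_signals = N_FPGAS * SIGNALS_PER_FPGA            # signals processed in parallel
period_ns = 1e9 / RATE_HZ                             # time between consecutive samples
reconstructions_per_second = total_signals * RATE_HZ  # aggregate evaluation rate

print(f"signals in total:           {total_signals}")
print(f"time between samples:       {period_ns:.0f} ns")
print(f"reconstructions per second: {reconstructions_per_second:.3e}")
```

Under these assumptions, every signal receives a new sample each 25 ns, which is the time scale the real-time reconstruction has to keep up with.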
A variation of the sigmoidal function is used
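The slide does not say which variation of the sigmoid is used. Purely as an illustration, the sketch below compares the standard logistic sigmoid with a piecewise-linear "hard" sigmoid, a variant often chosen for fixed-point hardware because it avoids evaluating an exponential. The choice of variant and the function names are assumptions, not the authors' implementation.

```python
import math

def sigmoid(x: float) -> float:
    """Standard logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation: clip(0.25 * x + 0.5, 0, 1).

    Hypothetical stand-in for the unspecified "variation" on the slide.
    The slope 0.25 matches the derivative of the logistic sigmoid at x = 0.
    """
    return min(1.0, max(0.0, 0.25 * x + 0.5))

if __name__ == "__main__":
    for x in (-6.0, -2.0, 0.0, 2.0, 6.0):
        print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  hard_sigmoid={hard_sigmoid(x):.4f}")
```

The piecewise-linear form needs only a multiplication, an addition and a clip, which is one reason such approximations are attractive when the network has to run on an FPGA.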
DNN Signal Reconstruction
● Deep NN performs much better than current algorithm for HL-LHC pile-up conditions
● FPGAs allow relatively complex algorithms to be run online at 40 MHz
● To do: More ML options to be studied
[Figure: regression plots. DNN algorithm: 0.98; ATLAS Run 2 algorithm: 0.90]
Machine Learning for New Physics Searches
● Lepton flavor violation exists in nature (neutrino oscillations), but LFV in the charged sector (cLFV) is extremely suppressed in the SM:
○ BR(μ → eγ) < 10⁻⁴⁸
● Many models predict cLFV decays of Higgs and/or Z and/or other massive resonances.
● Low-energy results (e.g. μ → eγ, τ → eee, μ-e conversion, etc.) provide indirect constraints, but these often rely on model assumptions.
● ATLAS & CMS Experiments:
○ Complementary efforts made by both collaborations
● Analysis performance plays a leading role:
○ Steep lepton reconstruction turn-on curves
○ High lepton pT resolution
○ Misidentified lepton rate ⇒ the lower the better
→ Machine Learning can make the difference
Machine Learning for New Physics Searches
● Application of Machine Learning in LHC searches for new physics.
● Search for Lepton Flavour Violating decays of the Higgs boson
● So far using BDTs and high-level variables
● Plan to optimise a DNN by adding low-level (kinematic) variables
● Need access to a small set of ATLAS data for training, O(100 GB)
S. Rodriguez, L. Fiorini
ATLAS-CONF-2019-013
Higgs Boson Properties Measurement
● Measurement of the Higgs boson properties is becoming more and more precise
● New techniques are becoming necessary to explore specific regions of phase space in Higgs boson production (differential and Simplified Template Cross-Section, STXS, measurements)
ML for Higgs Boson Properties
● VBF production is very interesting for precise property measurements, but is contaminated by ggF production
● One of the main purposes of the Run 2 Higgs properties papers is the STXS measurement
○ Hence the separation of the VBF and ggF contributions becomes more important.
● Goal: Improve purity achieved by cut-based selection using ML techniques
● Use NN/BDT to separate VBF Higgs events from background and ggF production
S. Rodriguez, L. Fiorini
ML improves separation by a factor of ~2 for the same VBF signal efficiency
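One way to quantify a statement like this is to fix the VBF signal efficiency and compare how much background survives for two discriminants. The sketch below does this on toy Gaussian scores. The toy distributions, the 50% working point and the helper name `background_efficiency_at` are all assumptions for illustration and do not reproduce the analysis.

```python
import random

def background_efficiency_at(signal_eff, sig_scores, bkg_scores):
    """Fraction of background passing the cut that keeps `signal_eff` of the signal.

    Events pass if score >= threshold (higher score = more signal-like).
    """
    ranked = sorted(sig_scores, reverse=True)
    n_keep = max(1, round(signal_eff * len(ranked)))
    threshold = ranked[n_keep - 1]        # lowest score among the kept signal events
    n_pass = sum(1 for s in bkg_scores if s >= threshold)
    return n_pass / len(bkg_scores)

rng = random.Random(2019)                 # fixed seed for reproducibility
n = 20000
# Toy discriminants: the "ML" score separates the two classes more strongly.
cut_sig = [rng.gauss(1.0, 1.0) for _ in range(n)]
cut_bkg = [rng.gauss(0.0, 1.0) for _ in range(n)]
ml_sig = [rng.gauss(1.5, 1.0) for _ in range(n)]
ml_bkg = [rng.gauss(0.0, 1.0) for _ in range(n)]

eff = 0.5                                 # hypothetical VBF signal efficiency
bkg_cut = background_efficiency_at(eff, cut_sig, cut_bkg)
bkg_ml = background_efficiency_at(eff, ml_sig, ml_bkg)
print(f"background efficiency, toy cut-based: {bkg_cut:.3f}")
print(f"background efficiency, toy ML:        {bkg_ml:.3f}")
print(f"improvement in rejection:             {bkg_cut / bkg_ml:.2f}x")
```

Comparing at a fixed signal efficiency keeps the two selections on an equal footing, so any difference shows up directly as a change in the surviving background.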
HH → bb γγ
● If new physics exists at O(TeV), there is a good chance that it is connected to the Higgs sector.
○ Higgs boson pair production is important for probing the structure of the electroweak spontaneous symmetry breaking mechanism
○ bbγγ is the simplest final state that is sensitive to the self-coupling
○ Can probe λ, higher-dimensional interactions and the existence of heavier states coupled to the Higgs (JHEP 11 (2018) 040)
● Participation in bbγγ searches with 13 TeV data
○ Search for heavy resonances decaying X→HH
○ Search for non-resonant enhancement of double Higgs production above the SM
● Prepare future analyses for the HL-LHC with sensitivity to SM production of HH, related to Hλ
X → HH → bbγγ
Machine Learning for HH → bb γγ
● Looking into using MVA for both Non-Resonant and Resonant searches
● Started to use MVA to create a VBF category for the bbγγ final state
● MVA provides better separation of signal and background than cut-based selection
● The MVA uses reduced-size derived objects (100 GB in size)
● For more complex applications, training and testing take several hours, so using Artemisa would be a great improvement.
I. Sayago, A. Ruiz, L. Fiorini
BSM H/A → ττ
● Search for additional heavy BSM Higgs bosons, 2HDM type II
● In the MSSM: 2 Higgs doublets, 5 Higgs bosons
● For large tanβ, couplings to down-type fermions are enhanced
● Two channels - τlepτhad (~46%) and τhadτhad (~42%)
● Split into b-tag and b-veto categories
A. Bailey, L. Fiorini
● Search for 2 ~back-to-back τ, opposite charge, MET
● Effectively, the signal is two ~back-to-back τ, usually with some MET
● The previous and current analyses do not use machine learning
○ Plenty of scope for improvement
○ Plan an early Run 2 paper soon, followed by an updated paper later
○ Later paper will include MVA and other improvements
BSM H/A → ττ : MVA
● First MVA studies:
○ XGBoost BDT
○ 3 signal mass bins
● 2 examples from τhadτhad b-veto
● Already see improvement over standard analysis method
● Higher signal masses easier to separate
● Found similar results with a “simple” neural network (Keras + TensorFlow)
[Figure label: cut-based performance]
BSM H/A → ττ : Future ML Studies
● Can also try more complex neural networks
● Splitting into signal mass bins is reasonable, but:
○ Splits the MC statistics
○ Leads to discontinuities in limit curves
● Better ways to deal with different signal masses?
○ Mass-parameterised neural network: output is a function of event-level quantities and the true signal mass. Performance is not degraded at masses it was not trained on
● Avoid training on mass-related variables?
○ Use adversarial neural network, penalise learning signal mass?
● Channel classification:
○ Possible to use ML to separate ggF/bbH production modes?
○ Currently require 1 or more jets passing a b-tagging score cut to define the b-tag channel
○ Instead use the b-tag BDT score as input to classification BDT?
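The mass-parameterised idea mentioned above can be illustrated at the level of the training inputs: a signal-mass hypothesis is appended to each event's features, using the true mass for signal events and, in one common recipe, a mass drawn at random from the signal mass points for background events. The sketch below only builds such inputs. The feature values, mass points and the helper name `add_mass_parameter` are hypothetical, and no network is trained.

```python
import random

def add_mass_parameter(events, is_signal, true_masses, mass_points, rng):
    """Return copies of the feature rows with a mass hypothesis appended.

    Signal rows get their true generated mass. Background rows get a mass
    drawn at random from `mass_points` (uniformly here), so that the mass
    column by itself is not a strong discriminant.
    """
    rows = []
    for features, sig, mass in zip(events, is_signal, true_masses):
        hypothesis = mass if sig else rng.choice(mass_points)
        rows.append(list(features) + [float(hypothesis)])
    return rows

rng = random.Random(0)                      # fixed seed for reproducibility
mass_points = [300, 500, 1000]              # hypothetical signal masses in GeV
events = [[120.0, 45.0], [310.0, 80.0], [95.0, 30.0]]   # toy event-level features
is_signal = [False, True, False]
true_masses = [None, 500, None]             # only defined for signal events

training_rows = add_mass_parameter(events, is_signal, true_masses, mass_points, rng)
for row in training_rows:
    print(row)
```

At evaluation time, a network trained on such rows would typically be scanned over mass hypotheses by setting the last column to the mass under test, which is what allows a single model to replace separate per-mass-bin trainings.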
Ref.
Machine Learning Tools
● Summary of those used in H/A → ττ so far
● For initial tests, have been using general tools, not HEP specific
○ XGBoost for BDT
○ Keras + TensorFlow for neural network
● These require some manipulation of data:
○ Need numpy array as input
○ Then have to apply results back to ROOT ntuples
● Possible to use TMVA too
○ Useful to produce plots of input variables, correlations etc.
● Other tools: numpy, pandas, sklearn
● Note: Have found quite strict TensorFlow + CUDA version compatibility requirements
○ Newer versions have useful tools for parallel running.
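The data manipulation described above usually amounts to turning per-event branches into a 2D NumPy array and later attaching the classifier output as a new per-event column. The sketch below shows only that reshaping step with in-memory toy columns. Reading from and writing to ROOT files (for example with a package such as uproot) is deliberately left out, and the branch names and the `toy_score` function are hypothetical.

```python
import numpy as np

# Toy stand-in for branches read from an ntuple: one 1D array per variable.
branches = {
    "tau0_pt": np.array([85.0, 120.0, 60.0]),
    "tau1_pt": np.array([70.0, 95.0, 55.0]),
    "met": np.array([20.0, 45.0, 10.0]),
}
feature_names = ["tau0_pt", "tau1_pt", "met"]    # fixes the column order

# Shape (n_events, n_features): the 2D layout that XGBoost and Keras accept.
X = np.column_stack([branches[name] for name in feature_names])

def toy_score(features: np.ndarray) -> np.ndarray:
    """Placeholder for a trained model's prediction: one value in (0, 1) per event."""
    return 1.0 / (1.0 + np.exp(-(features.sum(axis=1) - 200.0) / 50.0))

# One score per event, stored alongside the original columns so that it can
# be written back to the ntuple as a new branch.
branches["mva_score"] = toy_score(X)
print(X.shape, branches["mva_score"])
```

Keeping an explicit `feature_names` list matters in practice: the column order used at training time has to be reproduced exactly when the model is applied back to the ntuples.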