ATLAS Data Acquisition: from Run I to Run II
William Panduro Vazquez
Royal Holloway, University of London
On behalf of the ATLAS collaboration
Overview
• DAQ System for Run 1
• Run 2 Challenges & Solutions
• Readout System upgrade
• DataFlow network overhaul
• High Level Trigger (HLT) changes
• New software concepts, techniques, libraries
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 2
ATLAS TDAQ System (Run 1)
3
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014
ATLAS TDAQ System (Run 1)
4
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014
The focus of this talk
Run 2 Challenges
• Run conditions from 2015
– Pileup
(multiple interactions per bunch crossing)– Peak of 40 in 2012
(luminosity 7x1033)• Design: avg pileup of 21 for luminosity 1034
– For Run 2 estimate max average 55
• Assuming 25 ns crossings
– With 50 ns could reach 80
• Peak value more significant for DAQ system
– Potentially larger event size (up to 2 MB) – Longer HLT processing time
• Increased buffering capacity required
• ATLAS DAQ changes for 2015
– L1 trigger rate 100 kHz (70 kHz in 2012)
– 20% increase in number of readout channels
• Higher luminosity and energy requiring more channels to read out existing hardware
• New detector components and trigger readout
– Insertable B-layer (IBL) – new pixel detector layer (talks by W. Lukas & B. Mandelli – Detector RD & Performance) – L1 topological trigger (talk by F. Pastore – Detector RD & Performance)
– In 2016: Fast tracker (FTK) (poster by T. Iizawa)
– Average rate written to disc ~1 kHz
• Run 1 reality becomes a requirement
• Peak rates potentially much higher
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 5
2012 Avg: 20.7
Run 2 Solutions
• Global requirement: higher capacity, more flexible DAQ system
– Increase buffering capacity and readout rate
• Overhaul of readout system (ROS)
– Increase network capacity
• Simpler, more scalable design
– Upgraded High Level Trigger
• Increase number of available cores
• Merge/optimise Level 2 and Event Filter Layers
• Major undertaking over tight 2 year timescale!
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 6
Readout System in Run 1
• Based on off-the shelf PCs (~150 in run 1)
• Receives data via ~1600 fibre readout links (ROLs) after L1 accept
– Stored in buffer on custom PCI card (ROBIN) – 3 ROLs per ROBIN
– 4-5 ROBINs per PC (12 – 15 ROLs) – 2 x 1 GbE output network connections
• Serve data to HLT and hold buffered data for entire HLT latency period
• Read out desired data to event building system on L2 accept, delete rejected events
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 7
• Not quite capable of satisfying max required rates for Run 2
• Run 1 experience suggests requirements will evolve beyond expectation
• Increase in number of ROLs (20%) will use majority of remaining spares
– Rack space a problem with current link density
• Beginning to run into hardware obsolescence and end-of-life problems
• Limited memory capacity per channel could limit HLT performance
• Upgrade Goals:
– Manage 100 kHz input rate from L1 while reading out 50% of this data
– Significantly increase link density and reduce system footprint Max Output Rate
Predicted
Readout Scenario
Pixel Detector
(kHz)
Hadron Calo Endcap
(kHz) 2012 Data Taking
(75 kHz L1 Rate)
30 33
2015 Rate Predictions (100 kHz L1 Rate)
53 48
Current ROS Max Lab Performance (100 kHz L1 Rate)
50 45
Example rates for ROS PCs from most demanding systems
8
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014
Card ROBIN RobinNP
Number of Readout Links 3 12
Total RAM on-board (MB) 192 8192
Memory per link (MB) 64 682
Max Input Bandwidth (MB/s) 480 1920 Output Bandwidth (MB/s) 264 1600
(3200 possible)
Readout System Upgrade
12 x 6 Gb/s optical link
1156-pin Xilinx Virtex 6 FPGA 2x DDR3 SODIMM
Gen2 x8 PCI Express
• Faster , smaller ROS PCs with higher link density
– Buffers housed on new PCIe card: RobinNP – New ROS PCs will be half the height of the old – 2 RobinNPs per ROS PC (24 ROLs)
– Link density 3x greater
• Significant space, power and cooling saving
• Much greater scope for later expansion
– Significant output network upgrade (4 x 10 GbE)
ROS Features Run 1 Run 2
Number of PCs 150 100
Max Number of PCs per Rack 10 15 Average Readout Links per PC 12 24 Max PC Input Bandwidth (MB/s) 1920 3840 Max PC Output Bandwidth (MB/s) 200 4000
• Data management done by CPU of host PC
– Original ROBIN tasks managed by on-board Power PC processor
– Host processing easier to maintain, can upgrade CPU for performance boost
• Hardware based on board developed by ALICE (Common Readout Receiver Card or ‘C-RORC’)
– Custom ATLAS firmware implementing enhanced
‘ROBIN’ functionality
Dataflow Network in Run 1
Data Collection Network (DC1, DC2)
Back-end
Network (BE1, BE2)
DC1 DC2
BE1 BE2
L2 Trigger and Event Filter Nodes (~8k cores)
Event Builder
Nodes (48) running 96 processes in total
Readout PCs ~150
Data Logger (~5) Event Filter
Nodes
(~8k cores)
Data from front-end
Offline processing and storage
10 GbE 1 GbE rack fan in/out
Level 2
Supervisors (~6)
04/07/2014
W. Panduro Vazquez (RHUL), ICHEP 2014 9
Dataflow Network in Run 2
Data Collection Network (DC1, DC2)
DC1 DC2
Combined High Level Trigger (L2, EF & EB) Nodes (~20k Cores)
Next Gen Readout PCs ~100
Data Logger (~10) Data from
front-end
Offline processing and storage
10 GbE 1 GbE rack fan in/out
• Direct 10 GbE connection to readout PCs
• Merger of L2, EF and EB functionality into HLT
• No need for back-end network layer
• Backbone implemented and
• ROS connectivity Q4 2014
HLT
Supervisor (1)
04/07/2014
W. Panduro Vazquez (RHUL), ICHEP 2014 10
High Level Trigger Upgrade
• Remove distinction between Level-2 (L2) nodes and Event Filter (EF) nodes
– Details in talk by F. Pastore (Detector RD & Performance)
• Previous complicated dataflow and control replaced by Data Collection Manager (DCM)
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 11
Run 1 Run 2
Level 1 Trigger
RoI Builder
HLT Supervisor
HLT Processor
DCM
Readout System
Data Logger
Regions of Interest
Regions of Interest Discard data
Serve Data
End-of-event processing
Data Request
Functionality Combined
• Common nodes performing complete decision process
– Optimised and more flexible event building
• Eliminates previous limit of 7 kHz
• Major saving in bandwidth and processing time
– No need to pack/unpack and transfer data between nodes
– Event building no longer needs to duplicate requests for data that the HLT already holds through prior L2 request
• More advanced processing allowing simpler, more efficient dataflow
– Better HLT node load balancing
New Software Technologies
• To achieve upgrade goals many new techniques implemented
– Advantage also taken of opportunity to upgrade from obsolete architectures
– Widespread adoption of C++11 capabilities
• Dataflow request-gathering protocol
– Now based on asynchronous messaging based on boost::asio
• Easier to maintain (fewer custom implementations/packages)
• Simpler, more generic threads
• Threading overhaul (throughout ROS and HLT)
– Adoption of C++11 threading libraries accompanied by well supported externals: Intel threaded building blocks (tbb)
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 12
ATLAS TDAQ System (Run 2)
13
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014
Summary
• Major overhaul of DAQ system under way
– Needed to handle new run conditions (pileup) and performance requirements (higher rates) – New detector and trigger components being integrated
• ROS upgrade
– New smaller, faster PCs, 3 x denser packing of links – New custom buffer hardware (RobinNP)
– Installation scheduled for Q4 2014
• Dataflow network
– Comprehensive capacity upgrade
– Simpler, more efficient structure eliminating need for backend layer – Upgrade nearing completion
• Testing and validation ongoing
• ROS connectivity to be implemented throughout summer 2014
• HLT upgrade
– Merging of level 2 trigger and event filter
– Major overhaul of trigger algorithms (see talk by F. Pastore – Detector RD & Performance) – Ongoing integration and testing
• New technology (hardware, firmware and software) throughout
– Integrating lessons of Run 1
• On schedule to be ready by end of 2014
– Regular programme of milestone weeks throughout the year to drive integration
• 2014 is proving a busy but rewarding year!
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 14
Extra Slides
Effect of Pileup on DAQ
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 16
L2 Trigger Event Filter
• Processing time vs average pileup <μ> for different parts of the HLT
• Different performance for different core types
• Average pileup also results in an
increased overall event size
Run 1 ROBIN Card
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 17
• Memory buffers implemented on custom PCI hardware (ROBIN)
– 3 readout links per card – 64 MB memory per link
– Virtex-II FPGA + PPC chip for event indexing and transfer management – 64 bit PCI (264 MB/s) Bus for data transfer to ROS PC
– Up to 5 ROBINs per PC (15 channels)
• 4 in most cases (12 channels) Virtex II FPGA
3 x 64 MB Memory buffers Power PC
Processor 64 bit 66 Mhz PCI Bus
264 MB/s 3 x 2 Gb/s
optical links
New ROS Performance
• Current baseline measurements (more performance improvement possible)
• New ROS PC: Intel E5-1650v2 (6 cores + HT) @ 3.5 Ghz
• Exceed Run 2 performance requirements
• Readout of 100% of data possible for fewer links
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 18
RobinNP Indexer Thread
(1 per ROBgroup)
Request Processor Thread
(1 per ROBgroup)DMA Monitoring Thread
(1 per ROBgroup)Async IO Threads
Free Descriptor Queue (Semaphore)
Request Queue (Semaphore)
Memory Pages (FIFO + Interrupts)
RobinNP Hardware
Queue DMA
Receive DMA (FIFO + Interrupts) Descriptor Flow
Page Flow
DMA & DMA Control
HLT Traffic and Data Collection
HLT Traffic
Data Collector Thread
Processed
Descriptor Queue (Semaphore)
RobinNPReadoutModule
TriggerIn
ROS Software Architecture
04/07/2014 W. Panduro Vazquez (RHUL), ICHEP 2014 19