(1)

Plans of the WLCG for Run3 and HL-LHC era

Jose F. Salt Cairols

Instituto de Física Corpuscular

XI CPAN DAYS

21-23 October 2019

(2)

Overview

1.- The WLCG Global Collaboration

2.- Run 3 and HL-LHC Plan

3.- The Spanish LHC Computing GRID community (LCG-ES)

4.- Usage of additional compute resources

5.- Heterogeneity and Federation

6.- Software Optimization

7.- Spanish Strategy in Computing

8.- Summary and Outlook

23/10/2019 XI CPAN Days 2

(3)

1.- The WLCG Global Collaboration

The Worldwide LHC Computing Grid (WLCG): a distributed high-throughput computing infrastructure to store, process and analyze the data produced by the LHC experiments.

In numbers:

-  167 sites, 42 countries, 63 MoUs

-  ~800 kcores

-  ~500 PB disk storage

-  ~750 PB tape storage

-  Private optical network (LHCOPN) and an overlay over the NRENs (LHCONE) with 10/100 Gbps links

CERN Computing Center

The equipment purchased by the centers (T0, T1 and T2) serves the whole collaboration (like a detector)

WLCG is a worldwide and non-stop infrastructure

It contributes to the scientific and technological progress of each center that participates in WLCG: scientific infrastructure, expert personnel, etc.

(4)

2.-Run 3 and HL/LHC Plan

BEST GUESS Run 3:

-  2021 is a very low data test run; resources are the same as 2018 for pp

-  A full heavy-ion run is likely, which will need some level of additional resources

-  2022 is a full year with a resource level of 1.5 times 2018

-  2023-24: moderate (20%) growth rates
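The best-guess bullets can be turned into a rough resource level relative to 2018. The sketch below is only an illustration under stated assumptions: the 20% growth is taken as year-on-year compounding starting from the 2022 level, and the extra resources for the heavy-ion run are ignored because the slide gives no figure for them. The function name `run3_resource_levels` is hypothetical.

```python
def run3_resource_levels(growth=0.20):
    """Resource level per year relative to 2018 (2018 = 1.0).

    Assumptions (not stated on the slide): the growth is compounded
    year on year from the 2022 level, and the additional heavy-ion
    resources are left out.
    """
    levels = {2021: 1.0, 2022: 1.5}  # values quoted in the bullets
    for year in (2023, 2024):
        levels[year] = levels[year - 1] * (1.0 + growth)
    return levels


if __name__ == "__main__":
    for year, level in run3_resource_levels().items():
        print(f"{year}: {level:.2f} x 2018")
```

Under these assumptions the 2024 level comes out at a little over twice the 2018 resources.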

(5)

From I. Bird’s talk at the 7th Scientific Computing Forum (SCF), 4th Oct 2019, CERN

Resource Evolution

(6)

- A gap of a factor of 4-5 between the ‘flat budget, 20% annual increase’ scenario and the resource requirements for HL-LHC

- Intense R&D to reduce data and resource requirements

(7)

-  Cost evolution is not well established

-  Assumed price reduction:

-  10% CPU, 15% disk, 20% tape
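To see what such price reductions mean for a flat budget, the sketch below computes how much capacity a fixed amount of money buys after a number of years. It assumes the percentages are annual unit-price reductions that stay constant over time; the slide does not state the period, so this is an interpretation. It also ignores hardware replacement and operational costs. The names `ASSUMED_PRICE_REDUCTION` and `capacity_per_budget` are hypothetical.

```python
# Price reductions quoted on the slide, read here as annual rates (assumption).
ASSUMED_PRICE_REDUCTION = {"cpu": 0.10, "disk": 0.15, "tape": 0.20}


def capacity_per_budget(reduction, years):
    """Capacity a fixed budget buys after `years`, relative to today.

    If the unit price falls by `reduction` each year, the price after
    `years` is (1 - reduction) ** years, so the same budget buys the
    inverse of that factor.
    """
    return 1.0 / (1.0 - reduction) ** years


if __name__ == "__main__":
    horizon = 7  # hypothetical number of years, chosen only for illustration
    for resource, r in ASSUMED_PRICE_REDUCTION.items():
        one_year = capacity_per_budget(r, 1)
        long_run = capacity_per_budget(r, horizon)
        print(f"{resource}: x{one_year:.3f} after 1 year, x{long_run:.2f} after {horizon} years")
```

Comparing the long-run factor with the projected growth in requirements gives a feel for why the cost evolution matters so much for the resource gap.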

(8)

3.- The Spanish LHC Computing GRID Community (LCG-ES)

Clouds:

●  CERN, CA, DE, ES, FR, IT, ND, NL, RU, TW, UK, US

The PIC Cloud (ES)

●  Tier1: PIC Barcelona

●  Provides 5% of Tier1 data processing of CERN's LHC detectors ATLAS, CMS and LHCb

●  Tier2s:

○  CMS Spanish Tier2: CIEMAT Madrid, IFCA Santander

○  ATLAS Spanish Tier2: IFIC Valencia, IFAE Barcelona, UAM Madrid

○  LHCb Spanish Tier2: USC Santiago de Compostela, UB (Universitat de Barcelona)

○  LIP Lisbon, Portugal

○  UTFSM Santiago, Chile

○  UNLP La Plata, Argentina (inactive)

-  Integrated in the WLCG project (Worldwide LHC Computing Grid) and following the ATLAS/CMS/LHCb computing models

-  We represent 4% of the total Tier-2 resources and 5% of the Tier-1 resources

Total accounting of resources:

-  CPU (HS06) = 182K

-  Disk (PB) = 14.5

-  Tape (PB) = 19.6

LCG-ES

(9)

More than 22 million finished jobs

On average, 5000 slots occupied by running jobs daily

More than 196 million events processed

More than 46 million files produced

Spanish Cloud performance in Run II

(10)

4.- Usage of additional compute resources

•  Supercomputers for LHC

–  Growing funding in supercomputing (HPC) infrastructures

•  Roadmap towards Exaflop machines

•  Countries/Funding agencies pushing HEP community to use these resources

–  EuroHPC Beur funding: 2 machines of approx. 200 PFlops by 2021, 2 ExaFlops machines by 2024

–  Data intensive computing with HPC facilities is not easy.

•  Limited or no network connectivity on compute nodes

•  Limited storage for caching I/O event data files

–  The ‘call for resource allocation’ model is not suitable

•  We need a guaranteed share of resources

•  Agreement with BSC

–  LHC applications are NOT really suited for HPC

•  No large parallelization (no use of fast node interconnects)

•  No essential use of accelerators (GPU, FPGA)

–  Substantial integration work to make HPC work for HTC

(11)

•  Use of BSC (Barcelona Supercomputing Center) resources:

–  The funding agency recommends using the BSC computing resources

–  ATLAS: effort was devoted to adapting the queues at BSC to run simulation production jobs. In 2018, IFIC and IFAE started applying to the calls for computing time, and several requests have been granted

•  Computing hours were requested from the Spanish Supercomputing Network (RES) and in Europe (PRACE). Granted: 2.8M hours for IFAE and 1.2M hours for IFIC on Mare Nostrum (BSC), and 2M hours on Lusitania (Cenit)

•  The ATLAS software and the tools needed to run ATLAS detector simulation were installed on these HPCs, so resources outside the Spanish Tier centers have been used

-  IFIC/IFAE-PIC led ATLAS simulation on opportunistic HPC resources

-  More than 60 million events simulated

-  More than 90% of jobs ended successfully

–  CMS:

•  CIEMAT/PIC: CMS cannot yet use BSC resources because the nodes lack network connectivity, which CMS needs in order to integrate them into the WMS. There is a project with the HTCondor team to address this limitation.

•  IFCA: adaptation of ALTAMIRA (RES node in Cantabria) within the Grid infrastructure (input from Ibán)

–  The grid infrastructure of the T2 has been redesigned so that, when the T2 is saturated, it checks for free HPC resources and forwards jobs there. At the moment pilot examples are running on Altamira in "parasitic" mode, but this can easily be changed.
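The overflow behaviour described for the T2 (fill the Tier-2 first, forward to HPC only when it is saturated) can be sketched as a small routing function. This is an illustrative model only, not the IFCA implementation: the `Pool` class, the `route_jobs` function and the slot counts are hypothetical, and a real setup would query the batch systems rather than use static numbers.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A pool of job slots (hypothetical model of a T2 or an HPC queue)."""
    name: str
    total_slots: int
    busy_slots: int

    @property
    def free_slots(self):
        return max(self.total_slots - self.busy_slots, 0)


def route_jobs(n_jobs, tier2, hpc):
    """Fill the Tier-2 first; overflow to free HPC slots; queue the rest."""
    to_tier2 = min(n_jobs, tier2.free_slots)
    remaining = n_jobs - to_tier2
    to_hpc = min(remaining, hpc.free_slots)
    return {"tier2": to_tier2, "hpc": to_hpc, "queued": remaining - to_hpc}


if __name__ == "__main__":
    t2 = Pool("T2", total_slots=1000, busy_slots=950)
    altamira = Pool("HPC", total_slots=500, busy_slots=480)
    print(route_jobs(100, t2, altamira))
```

Using only slots that are free at submission time corresponds to the "parasitic" mode mentioned above; a guaranteed share would instead reserve part of the HPC pool.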

-  LHCb: at the Spanish level, the LHCb groups have not started these activities yet

(12)

-  December 2018: meeting at BSC to explore the possibility of having a dedicated share for LHC computing needs

-  Take the example of another special ‘project’ agreement with BSC

–  February-April: preparation of a draft LHC Computing-BSC agreement

–  Discussion of technical and policy questions

–  July 2019: Sergi Girona (BSC) will prepare the definitive agreement document, to be approved at the November meeting of the BSC ‘Junta de Gobierno’ (BSC Executive Board)

-  February-March 2020: could be opened to users (hopefully)

Meeting at BSC in December 2018

View of Mare Nostrum

(13)

Cloud Computing Resources:

Experiments have run large scale tests using Cloud compute nodes

Google Cloud, Amazon AWS, Microsoft Azure

-> approx. 50K cores concurrently for a few days

=> Commercial cloud is

•  not profitable for either (a) storage or (b) computing,

•  but it can be useful for testing new architectures without investing

Currently there is essentially no commercial cloud use for LHC computing

Potential future opportunities:

European Open Science Cloud (EOSC)

An EU model for the use of cloud computing in the private and public sectors

(14)

ESCAPE: European Science Cluster of Astronomy & Particle Physics ESFRI Research Infrastructures

(15)

5.- Heterogeneity and resources federation

(16)
(17)

Federation is the key

•  Federation in data storage:

–  The idea is to localize bulk data in a cloud service (data lake): minimize replication, assure availability

–  Serve data to remote (or local) compute: grid, cloud, HPC, ???

–  Simple caching is all that is needed at the compute site (or none, if the network is fast)

–  Federated data at national, regional and global scales
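The "simple caching at the compute site" idea can be illustrated with a small least-recently-used (LRU) cache placed in front of a remote data lake. This is a toy sketch under stated assumptions: the `SiteCache` class and the `fetch_remote` callback are hypothetical, capacity is counted in files rather than bytes, and LRU is just one possible eviction policy, not one named in the talk.

```python
from collections import OrderedDict


class SiteCache:
    """Toy LRU file cache at a compute site (hypothetical illustration)."""

    def __init__(self, capacity, fetch_remote):
        self.capacity = capacity          # maximum number of cached files
        self.fetch_remote = fetch_remote  # callable: file name -> data
        self._files = OrderedDict()       # oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, name):
        if name in self._files:
            # Cache hit: mark the file as most recently used.
            self._files.move_to_end(name)
            self.hits += 1
            return self._files[name]
        # Cache miss: fetch from the data lake and keep a local copy.
        self.misses += 1
        data = self.fetch_remote(name)
        self._files[name] = data
        if len(self._files) > self.capacity:
            # Evict the least recently used file.
            self._files.popitem(last=False)
        return data


if __name__ == "__main__":
    data_lake = {"a.root": b"A", "b.root": b"B", "c.root": b"C"}
    cache = SiteCache(capacity=2, fetch_remote=data_lake.__getitem__)
    for name in ["a.root", "b.root", "a.root", "c.root", "b.root"]:
        cache.read(name)
    print(f"hits={cache.hits} misses={cache.misses}")
```

The hit rate of such a cache depends on data popularity, which is why access and popularity studies matter when sizing storage at 'diskless' or cache-only sites.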

(18)

•  Federation of computing resources

–  Main issue: reducing the hardware cost

–  Reducing the operational cost

–  Co-location of data and processors is not guaranteed: sites can be ‘diskless’

–  Heterogeneous computing

PIC is actively contributing to the first area (federation in data storage) with studies of data access and popularity for CMS at PIC and CIEMAT, measuring the effect on applications of accessing real data remotely

(19)

6.- Software Optimization

•  Solution could come from the software

–  50 million lines of code, mainly C++

–  “a project / experiment cannot afford to have bad software” (Graeme’s talk in Granada)

•  Initiatives:

–  HEP Software Foundation

–  IRIS-HEP: Institute for Research & Innovation in Software for HEP, 25M$, 5 years

–  Proposal for an EU Scientific Software Institute

–  In Spain: COMCHA forum

•  New hardware architectures

–  High-level parallelism, new instruction sets, …

–  Support in software frameworks for heterogeneous hardware

•  New/faster algorithms

–  Machine Learning/Deep Learning

–  Rewrite physics algorithms for new hardware

Improvement in CPU consumption by using faster physics algorithms in FASTSIM/FASTRECO

(20)

7.- Spanish Strategy in Computing

•  A common theme in many contributions to the EPPS Granada is the desire to collaborate with and benefit from LHC R&D work

•  Synergies and ‘not to reinvent the wheel’

•  Situation in different projects:

DUNE and CTA will leverage the WLCG for their computing infrastructure

Nuclear Physics Coll.:

ESCAPE addresses FAIR data management

The LHC Computing Model has been adapted to the needs and the size of the AGATA collaboration

Computing @ Future Accelerators meeting, May 2019: addressing the outstanding questions for CLIC and the Future Circular Colliders

(21)

and implies governance evolution

Our strategy in Spain could be to establish a Computing Committee in order to coordinate the study of the computing/storage needs of the different projects/initiatives. Our organization would be fully embedded in the governance model described above.

(22)

8.- Summary and Conclusions

•  The Spanish LHC GRID Computing projects have been essential for the scientific achievements in LHC projects

•  New needs and objectives for Run 3 and HL-LHC will imply deep changes in our organization and technical challenges for the HEP Computing Community

–  HPC resources/Cloud Computing/HLT

–  Resource Federation: Data Lakes

•  Export, partially or globally, the WLCG organization and perspective to other astroparticle, nuclear and high-energy scientific projects (synergy)

•  Take advantage of the experience of the LHC computing GRID groups at the Spanish centers, since they (the centers) are also involved in other non-LHC experiments

(23)

THANKS

QUESTIONS?
