Presented by:
Santiago González de la Hoz, IFIC – Valencia (Spain)
Challenges in the adoption of the EGI paradigm for an e-Science/Tier2 centre
(ES-ATLAS-T2)
Santiago González de la Hoz, Álvaro Fernández Casaní, Gabriel Amorós, Carlos Escobar, Mohamed Kaci, Alejandro Lamas, Elena Oliver, José Salt, Javier Sánchez, Miguel Villaplana
EGI-InSPIRE
Outline
Introduction to the LHC and ATLAS Computing Model
LHC and ATLAS achievements last year
ATLAS hierarchical Computing Model
ATLAS Spanish Tier-2 and Tier 3
Description
Computing Resources
Storage Resources
User Analysis Jobs
RELATION with EGI and the Ibergrid NGI
Transition from EGEE to EGI
EGI organization
Heavy User Communities: HEP case
Migration from EGEE to EGI at IFIC
Description of tools
CONCLUSIONS
LHC and ATLAS achievements last year
LHC restarted on 20 Nov 2009
First collisions on 23 Nov 2009 (900 GeV)
Christmas stop
One million events reached (before 7 TeV in March 2010)
LHC and ATLAS achievements last year
ATLAS operation has been very successful during both pp and PbPb collisions @ LHC during 2010.
Recorded almost all delivered luminosity
Sub-systems operational almost 100% of the time
DATA RECORDED
LHC and ATLAS achievements last year
DATA RECORDED
Several full data re-processings @ Tier1s have been done with improved software, alignment and calibrations
Several MC productions have been done as well, and processed with the same conditions applied for data.
[Figure: DATA PROCESSING 2010. Number of reprocessing jobs during October and November 2010. Labels: Step I (data taken Jul-Sep, L ~ 10 pb^-1), Step II, Express Stream and Calo Stream reprocessing, start of reprocessing; marked dates: Oct 8, Oct 29, Nov 8.]
LHC and ATLAS achievements last year
Huge increase in running analysis jobs: millions of jobs are run every week at hundreds of sites
Over 1000 different users during the past 6 months
LHC start-up
WORLDWIDE DATA ANALYSIS 2010
ATLAS Multi-Tier Hierarchical Model
Hierarchical Computing Model
MONARC was more than ten years ago:
The landscape has changed
We have to update the maps
Cloud boundaries have serious impact on:
Production and Data Processing
Data placement and data access
CPU utilization
Disk space usage and data availability
Network bandwidth
The scenario to have one copy of derived data per cloud is not sustainable.
The scenario to execute the entire production task per cloud is not sustainable.
Data transfer capability today can handle much higher bandwidth than expected or planned.
The network is extremely reliable
Traffic could flow more between countries as well as within them
Tier2s could be used more efficiently. Tier1s and Tier2s may become more equivalent from the network point of view. The Tier1/Tier2 hierarchy is no longer so important
2010 Lessons
The first steps toward a new model
Hierarchical Computing Model
Grid and Clouds
Local File Catalogs consolidation:
ATLAS has more than 15 LFCs: one per cloud plus 6 in the US.
If an LFC is down, the whole cloud is down
There will be one catalog at CERN and a hot backup in another geographical location (BNL)
Dynamic data placement: PD2P
Caching data which are planned to be used
Decrease number of primary replicas
T2Ds: directly connected Tier2s
Tier2s with a direct connection to all Tier1s, T2Ds and CERN
T2D selection criteria: robustness, network bandwidth and performance
Looking to the future
ATLAS Spanish Distributed Tier2
IFIC
IFAE
UAM
Enable Physics Analysis by Spanish ATLAS Users
Tier-1s send AOD data to Tier-2s
Continuous production of ATLAS MC events
Tier-2s produce simulated data and send them to Tier-1s
To contribute to ATLAS + WLCG Computing Common Tasks
Sustainable growth of infrastructure according to the scheduled ATLAS ramp-up and stable operation
T1/T2 Relationship
FTS (File Transfer Service) channels are installed for these data for production use
All other data transfers go through normal network routes
Tier 1 services:
VO Box, FTS channel server, Local file catalogue (part of Distributed Data Management)
Tier-2 and Tier-3 IFIC resources
CE | CPU | Cores | Mem/Core | HEPSpec06
ce01 | Intel(R) Pentium(R) D CPU 3.20GHz | 40 x 2 = 80 | 2GB | 40 x 2 x 5.77 = 461.6
ce04 | Intel(R) Xeon(R) CPU E5472 3.00GHz | 48 x 8 = 384 | 2GB | 48 x 8 x 9.20 = 3532.8
 | Intel(R) Xeon(R) CPU L5520 @ 2.27GHz | 32 x 8 = 256 | 2GB | 32 x 8 x 18.40 = 4710.4
Total | | 720 | | 8704.8
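The totals in the table can be re-derived with a few lines of Python. This is only a sketch of the arithmetic: it assumes that each row reads as (nodes) x (cores per node) x (HEPSpec06 per core), which is how the products in the table appear to be built; the group labels and function name are illustrative.

```python
# Worker-node groups as read from the table (interpretation is an assumption):
# (label, nodes, cores per node, HEPSpec06 per core)
NODE_GROUPS = [
    ("Pentium D 3.20GHz", 40, 2, 5.77),
    ("Xeon E5472 3.00GHz", 48, 8, 9.20),
    ("Xeon L5520 2.27GHz", 32, 8, 18.40),
]

PLEDGE_2010_HS06 = 6000  # ES-ATLAS-T2 pledge for 2010, as quoted on this slide


def capacity(groups):
    """Return (total cores, total HEPSpec06) for a list of node groups."""
    cores = sum(nodes * per_node for _, nodes, per_node, _ in groups)
    hs06 = sum(nodes * per_node * score for _, nodes, per_node, score in groups)
    return cores, hs06


total_cores, total_hs06 = capacity(NODE_GROUPS)
extra_hs06 = total_hs06 - PLEDGE_2010_HS06  # capacity beyond the pledge

print(f"cores: {total_cores}")
print(f"HEPSpec06: {total_hs06:.1f} (pledge {PLEDGE_2010_HS06}, extra {extra_hs06:.1f})")
```

Under that reading, the installed capacity exceeds the 2010 pledge by roughly 2700 HEPSpec06.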
IFIC is an e-Science centre with two infrastructures: the Tier2 and GRID-CSIC
Pledge 2010 ES-ATLAS-T2: 6000 HEPSpec06
Extra resources thanks to GRID-CSIC infrastructure:
Grid resources to be used by all the scientific communities in Spain which belong to CSIC
Storage resources
SUN X4500/40/70: 5 x (500 GB) + 14 x (1 TB). Total: 710 TB (ATLAS Pledge 2010: 523 TB)
Tier-2 Resources
Disk servers aggregated using Linux (RHEL5) + RAID5 (software) + Lustre
The 48 disks are distributed into 6 OSTs (5 x 8 + 1 x 6, plus 2 for the OS)
Lustre v1.8 (in hardware with iSCSI + HA)
One Lustre metadata server (MDS) with RAID1 redundancy.
Tier-3 Resources
Around 100 TB → 60 TB under DDM control + 40 TB under IFIC control
Space token dedicated to Tier-3 → ATLASLOCALGROUPDISK
To manage local users’ data.
It has an area on a SE but points to non-pledged space
• Switch Cisco 6509
• 10 Gbit to backbone
• 1 Gbit to worker nodes and disk servers
User Analysis jobs
Jobs run where the data are located (2010 model). Data are grouped in datasets
Users can request a replica at another site (DaTRI, DDM)
The Athena package is installed via grid jobs by the software administrators (swadmins) and is used for Monte Carlo production and analysis.
Receiving and storing the produced data, thanks to the high availability of its sites and the reliable services provided by the team managers
Providing the required distributed analysis tools to allow users to use the data and produce experimental results.
Distributed analysis through Ganga and Panda
Some tools are going to be supported by EGI
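The 2010 rule that jobs run where the data are located can be sketched as a simple lookup. This is an illustrative toy, not PanDA or DDM code: the catalog layout, the dataset names and the function name are all hypothetical.

```python
def sites_for_job(dataset, replica_catalog, online_sites):
    """Return the online sites that hold a replica of `dataset`, sorted by name."""
    holders = replica_catalog.get(dataset, set())
    return sorted(holders & online_sites)


# Hypothetical replica catalog: dataset name -> sites holding a replica
catalog = {
    "data10.example.AOD": {"IFIC", "IFAE"},
    "mc10.example.AOD": {"UAM"},
}
online = {"IFIC", "UAM"}

# A job on this dataset can only go to a site that is online and has the data.
candidates = sites_for_job("data10.example.AOD", catalog, online)

# No site holds this dataset: the user would first have to request a replica.
missing = sites_for_job("user.example.ntuple", catalog, online)

print(candidates, missing)
```

An empty result corresponds to the case where the user has to ask for a replica at another site before the job can run.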
Transition from EGEE to EGI
EGEE ended in April 2010, and EGI-InSPIRE builds on its legacy
Organized in NGIs (Pl-Grid, Ibergrid,…)
The transition is under way; for ATLAS, the important issues are:
Support for middleware and tools.
Infrastructure support according to required levels
No disruption to current operations and end users
Heavy User Communities: HEP case
This activity provides continued support for activities currently supported by EGEE while they transition to a sustainable support model within their own community or within the production infrastructure
Main EGI-InSPIRE tasks that affect ATLAS:
EGI-InSPIRE SA3:
SA3.3 Services for HEP (204PMs CERN, 60PMs INFN)
User Community Support on Services for HEP
The services used by High Energy Physics experiments at the LHC can be classified as follows (as defined in deliverable MS603):
1. Experiment services – developed, maintained and operated by the collaborations themselves
2. Middleware services – generic services at Grid middleware layer, typically operated by WLCG
3. Infrastructure services – fabric-oriented services operated by the sites
4. Database services
Experiment specific services
Experiment services provide functionality very specific to one experiment and the corresponding computing model
Use generic m/w where possible
ALICE: AliEN
ATLAS: PanDA, DDM
CMS: CRAB, Analysis Server, Production Agent, PhEDEx, DBS
LHCb: DIRAC
(as defined in deliverable MS603)
Middleware Services used by HEP
Data Management: LFC, FTS
Workload management: Ganga, Condor-G, gLite WMS, glideinWMS
Persistency services: CORAL, POOL, COOL, FroNTier
Monitoring services: HammerCloud, Experiment Dashboard, SAM, Nagios
Security Services: VOMS, VOMRS, MyProxy
Computing Services: LCG CE, CREAM CE, OSG CE, ARC CE
Storage Services: CASTOR, dCache, DPM, xrootd, StoRM, BeSTMan
(as defined in deliverable MS603)
Migration from EGEE to EGI at IFIC
The transition from EGEE to EGI-InSPIRE is under way; for ATLAS, the important issues are:
All our services run on gLite 3.2 and SLC5 (SRMv2, GridFTP, Squid, top-BDII, site-BDII, WMS, proxy, MON, ...)
Support for middleware and tools, for instance supporting NGI VOs with the GRID-CSIC infrastructure:
Infrastructure support according to required levels. Some user communities use existing NGI VOs, and for others we have created new specialized VOs. Each VO should have a dedicated person, for instance to install the software required by that community.
No disruption to current operations and end users. At IFIC, a local VO (“ific”) is used to train new users in grid technologies and tools.
The end of support for the LCG-CE service, with a completed migration to the CREAM service (same situation for the UI).
The proposal is that all sites supporting LHC experiments run CREAM and are no longer required to run the LCG-CE for LHC.
To support the new VOs
Storage Element: new space in our storage element (Lustre + StoRM) is being deployed for the new VOs without stopping the running services
Evolving ATLAS cloud model in 2011
Summer 2011: all LFCs aggregated into a single LFC at CERN (agreement between ATLAS and WLCG)
Cross Cloud Production
Current situation: Some big T2 sites already associated to many Tier1s
Adapt monitoring
Data Collection into T2s: Extend current channel validation:
Application: Atlas Distributed Analysis using the Grid supported by Panda and GANGA
How to combine all these: Job scheduler/manager: GANGA
Heterogeneous grid environment based on 3 grid infrastructures: OSG, EGEE, Nordugrid
Ganga
https://twiki.cern.ch/twiki/bin/view/Atlas/DistributedAnalysisUsingGanga
A user-friendly job definition and management tool
Allows simple switching between testing on a local batch system and large-scale data processing on distributed resources (Grid)
Developed in the context of ATLAS and LHCb
Python framework
Support for development work from UK (PPARC/GridPP), Germany (D-Grid) and EU (EGEE/ARDA)
Ganga is based on a simple, but flexible, job abstraction
A job is constructed from a set of building blocks, not all required for every job
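The idea of a job assembled from optional building blocks can be sketched in plain Python. This is not the Ganga API: the class, its field names and the backend labels are hypothetical and only illustrate the abstraction, including the switch from a local test to a Grid run.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Job:
    """Toy job abstraction (hypothetical names, not Ganga's own classes)."""
    application: str                      # what to run (always required)
    backend: str = "Local"                # where to run it
    input_dataset: Optional[str] = None   # optional building block
    output_dataset: Optional[str] = None  # optional building block
    splitter: Optional[str] = None        # optional building block

    def describe(self):
        """Return only the building blocks that are actually set."""
        return {name: value for name, value in vars(self).items() if value is not None}


# A local test job needs only the application; the backend defaults to "Local".
test_job = Job(application="analysis.py")

# Switching to the Grid changes the backend and adds data, nothing else.
grid_job = replace(test_job, backend="Grid", input_dataset="data10.example.AOD")

print(test_job.describe())
print(grid_job.describe())
```

Keeping the application separate from the backend is what lets the same job definition be tested locally and then sent to distributed resources.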
Ganga offers three ways of user interaction:
Shell command line
Interactive IPython shell
Graphical User Interface
See Kenyon’s talk on 13th April in user environment session
Panda/Ganga usage at IFIC
HammerCloud: these tests are run on a regular basis to spot potential problems at ATLAS sites (see Daniel’s talk on 12th April in the User Support Services session). Performance depends on the file system used: Lustre (at IFIC) works better without the file stager, while dCache (at IFAE and UAM) behaves better when the file stager is activated (ref)
Panda statistics (plot with the STEP09 and data-taking periods marked)
DDM
(see Fernando Barreiro’s talk on 12th April in Data management session)
Volumes managed today:
More than 43 PB
More than 1.7 million datasets
More than 130 million files distributed across >100 sites
Aggregated data transfer record on 2010-05-09:
10GB/s (Plot from the ATLAS DDM Dashboard)
In production since 2004 and considered one of the largest data management environments
• Manage the experiment’s data:
– Data placement
– Bookkeeping & accounting
– Data access to the other systems and end-users
Monitoring helps improve the reliability of the sites:
Data transfers
Job Monitoring
Site Commissioning
Conclusions
EGEE to EGI transition tasks performed:
Common middleware services and operations are now supported by EGI
A gradual migration from LCG-CE to CREAM-CE has been carried out in order to support the new VOs without interfering with the running services. The same is being done for the User Interfaces.
New space in our storage element (Lustre + StoRM) is being deployed for the new VOs without stopping the running services.
IFIC users submit their analysis jobs to where the data are, replicating the most used datasets to local storage.
Various tools are used, some of them supported by EGI, since HEP is a Heavy User Community (like biomed):
Ganga, Panda, DDM, Dashboards,…
Impact on the IFIC e-Science infrastructure for ATLAS:
The LHC restarted in Nov 2009 and successfully reached 7 TeV.
IFIC is part of the Spanish Tier-2 and has defined its Tier-3 to fulfill ATLAS requirements. Computing and Storage resources are in place according to
Backup slides
A solution: Grid technologies
The offline computing:
- Output event rate: 200 Hz ~ 10^9 events/year
- Average event size (raw data): 1.6 MB/event
Processing:
- order of 40k of today’s fastest PCs
Storage:
- Raw data recording rate 320 MB/sec
- Accumulating at 5-8 PB/year
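The rate figures quoted here are mutually consistent, as a short calculation shows. The inputs are the numbers from the slide; decimal units (1 PB = 10^9 MB) are an assumption, as are the variable names.

```python
EVENT_RATE_HZ = 200      # output event rate (from the slide)
EVENT_SIZE_MB = 1.6      # average raw event size (from the slide)
EVENTS_PER_YEAR = 1e9    # ~10^9 events/year (from the slide)
MB_PER_PB = 1e9          # decimal units assumed

# Recording rate: events per second times size per event.
raw_rate_mb_s = EVENT_RATE_HZ * EVENT_SIZE_MB

# Raw data volume per year, from the yearly event count.
raw_volume_pb = EVENTS_PER_YEAR * EVENT_SIZE_MB / MB_PER_PB

# Data-taking time implied by the yearly event count at this rate.
implied_live_seconds = EVENTS_PER_YEAR / EVENT_RATE_HZ

print(f"raw rate: {raw_rate_mb_s:.0f} MB/s")
print(f"raw volume: {raw_volume_pb:.1f} PB/year")
print(f"implied live time: {implied_live_seconds:.1e} s/year")
```

The raw-only volume comes out well below the quoted 5-8 PB/year, so that figure presumably counts more than raw data; the slide does not say what else is included.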
Worldwide LHC Computing Grid (WLCG)
ATLAS Data Challenge (DC)
ATLAS Production System (ProdSys)
Analysis Data Format
After many discussions last year in the context of the Analysis Forum, the Derived Physics Dataset (DPD) will consist (for most analyses) of skimmed/slimmed/thinned AODs plus relevant blocks of computed quantities (such as invariant masses).
Produced at Tier-1s and Tier-2s
Stored in the same format as ESD and AOD at Tier-3s