Grid User Interface: Ganga
Farida Fassi
Master de Physique Informatique
Rabat, Maroc
24-17th May, 2011
Outline
Ganga Overview
Ganga Architecture
How to use Ganga
More on Ganga usage
Brief review of Panda
Ganga Overview
• The naive approach to submitting jobs to the Grid involves the following steps:
▫ Prepare the “Job Description Language” file for job configuration
▫ Find a suitable software application (e.g. Athena)
▫ Locate the datasets on different storage elements
▫ Job splitting, monitoring and book-keeping
• Ganga combines these components to provide a front-end client for interacting with Grid infrastructures
Ganga architecture
• Ganga allows simple switching between testing on a local batch system and large-scale data processing on Grid distributed resources
▫ Jobs look the same whether they run locally or on the Grid
▫ Configure once, run anywhere
Ganga Overview
Architecture
Job Object is where the Ganga journey starts:
A job in Ganga is constructed from a set of building blocks, not all of which are required for every job
Architecture
Customized applications and a plug-in-based design ease job creation
Incremental analysis development, switching between different technologies:
First test on local machine
Intermediate sample analyzed on batch
Full sample run using GRID backends
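The plug-in idea can be illustrated with a toy model in plain Python. This is not Ganga's implementation: `EchoApplication`, `LocalBackend`, `FakeGridBackend` and `ToyJob` are hypothetical stand-ins, and both backends only report where the command would run instead of executing anything. The sketch only shows that the job definition stays the same while the backend is swapped.

```python
from dataclasses import dataclass, field


@dataclass
class EchoApplication:
    """Hypothetical application plug-in: describes what to run."""
    exe: str = "/bin/echo"
    args: list = field(default_factory=lambda: ["Hello World"])

    def command(self):
        return " ".join([self.exe, *self.args])


class LocalBackend:
    """Hypothetical backend plug-in: stands in for a local run."""
    def run(self, command):
        return "local: " + command


class FakeGridBackend:
    """Hypothetical backend plug-in: stands in for a Grid submission."""
    def run(self, command):
        return "grid: " + command


@dataclass
class ToyJob:
    """A job is assembled from building blocks: application + backend."""
    application: EchoApplication
    backend: object

    def submit(self):
        return self.backend.run(self.application.command())


app = EchoApplication()
local_result = ToyJob(application=app, backend=LocalBackend()).submit()
grid_result = ToyJob(application=app, backend=FakeGridBackend()).submit()
print(local_result)
print(grid_result)
```

Only the `backend` argument differs between the two submissions, which is the sense in which a job is configured once and run anywhere.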
A few words on analysis and the data model…
The user has to define an analysis project, specifying:
The physics process to be studied
The data format that contains the information required for the analysis
Learn some tools
[Figure: diagram of the physics process; labels: p, p, t, t, b, b, W+, W-, q, q, l, b-jet (x2), jet (x2)]
Example application: ATHENA
Athena is the ATLAS framework used to control the execution workflow
Supported Athena applications: Simulation, Reconstruction, and Analysis
Some analysis workflows
Classic analysis using AOD (ROOT file or database format)
Athena user code sequentially processes a large Monte Carlo or data stream sample on the Grid
Produces ROOT tuple output, which is further processed locally or on the Grid
Small MC sample production:
Use a Production System Transformation (Geant) to produce a small MC sample for special/official usage
ROOT:
Generic ROOT application e.g. Toy MC
How to use Ganga
• Ganga processes, in the order they are specified, any configuration files pointed to by the environment variable
▫ GANGA_CONFIG_PATH
and then processes the “.gangarc” configuration file
• This makes it possible to use group configuration files while still allowing settings to be overridden by the user configuration
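The override behaviour can be sketched with Python's standard `configparser` module, which reads sources in order and lets later values replace earlier ones. This is an illustration only, not Ganga's own loader: the two configuration strings are hypothetical stand-ins for a group file found via GANGA_CONFIG_PATH and for a user's .gangarc, and the value `Console` is just a placeholder.

```python
import configparser

# Hypothetical group-level settings (stand-in for a file on GANGA_CONFIG_PATH).
GROUP_CONFIG = """
[LCG]
VirtualOrganisation = atlas

[configuration]
TextShell = IPython
"""

# Hypothetical user-level settings (stand-in for ~/.gangarc).
USER_CONFIG = """
[configuration]
TextShell = Console
"""


def load_layered(*sources):
    """Read configuration sources in order; later sources override earlier ones."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    for text in sources:
        parser.read_string(text)
    return parser


merged = load_layered(GROUP_CONFIG, USER_CONFIG)
print(merged["LCG"]["VirtualOrganisation"])    # kept from the group file
print(merged["configuration"]["TextShell"])    # overridden by the user file
```

The group file supplies defaults for everyone, and a user only needs to list the options that should differ.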
Configurations
• Ganga creates a directory gangadir in your home directory and uses this for storing job-related files and information
▫ created at the first launch
• The location can be changed in the configuration:
[DefaultJobRepository]
local_root = /alternative/gangadir
Ganga Workspace
Example: ATLAS Analysis Job
• ATLAS Applications: Athena and AthenaMC
• Data input:
▫ DQ2Dataset: all DQ2 dataset handling in client, LFC/SE interaction on worker node, used by all backends
▫ ATLASDataset: LFC file access
▫ ATLASLocalDataset: local file system, Local/Batch backend
• Data output:
▫ DQ2OutputDataset: stores files on Grid SE, registration in DQ2
▫ AtlasOutputDataset: multipurpose for Grid and Local output
[configuration]
TextShell = IPython
...
[LCG]
VirtualOrganisation = atlas
...
[athena]
LCGOutputLocation = srm://lsrm.ific.uv.es/lustre/ific.uv.es/grid/atlas/dq2/users/
LocalOutputLocation = srm://lsrm.ific.uv.es/lustre/ific.uv.es/grid/atlas/dq2/users/
ATLAS_SOFTWARE = /opt/exp_software/atlas/prod/releases/rel_12-0_2
...
Syntax: Python ConfigParser standard
Sequence: user config > site config > release config
How to set configurations:
• release config: hardcoded configurations
• site config:
setenv GANGA_CONFIG_PATH GangaAtlas/Atlas.ini
set path = (/afs/ific.uv.es/project/atlas/software/ganga/install/4.4.2/bin/ $path)
• user config: ~/.gangarc, ganga -g
Configurations
"Hello World" example: CLIP
• From a Ganga CLIP session, a job that writes "Hello World" can be created and submitted to LCG as follows:
app = Executable()
app.exe = "/bin/echo"
app.env = {}
app.args = ["Hello World"]
# Property values set above are in fact the defaults
# for the Executable application
j = Job(application=app, backend=LCG())
j.submit()
# Check on job progress
jobs
# When the job has completed, check the output
j.peek("stdout")
Athena example: CLIP
This assumes you are in the ATLAS VO, have set up your CMT area, and have checked out and built your package in a work area:
# Application
j = Job()
j.name = 'Test-AthenaJob-IFIC'
j.application = Athena()
j.application.exclude_from_user_area = ["*.o", "*.root.*", "*.exe"]
j.application.prepare(athena_compile=False)
j.application.option_file = '$HOME/AthenaTerstArea/12.0.6/PhysicsAnalysis/AnalysisCommon/UserAnalysis/UserAnalysis-00-09-10/run/AnalysisSkeleton_topOptions.py'
j.application.atlas_release = '12.0.6'
j.application.max_events = '10'
# Input data (the dataset object must exist before its type is set)
j.inputdata = DQ2Dataset()
j.inputdata.type = 'DQ2_LOCAL'
j.inputdata.dataset = "trig1_misal1_mc12.005186.PythiaZmumu_pt100_fixed.recon.AOD.v12000601_tid005906"
# Splitter and merger
j.splitter = AthenaSplitterJob(numsubjobs=2)
j.merger = AthenaOutputMerger()
# Output data
j.outputdata = DQ2OutputDataset()
j.outputdata.outputdata = ['AnalysisSkeleton.aan.root']
# Submission
j.backend = LCG()
j.backend.CE = 'ce01.ific.uv.es:2119/jobmanager-pbs-short'
j.submit()
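To give an idea of what a splitter does with a setting such as numsubjobs=2, here is a minimal sketch in plain Python. The slides do not say how AthenaSplitterJob actually distributes files, so the round-robin strategy, the function name `split_files` and the file names below are all assumptions made for illustration.

```python
def split_files(files, numsubjobs):
    """Distribute input files over at most `numsubjobs` subjobs, round-robin."""
    if numsubjobs < 1:
        raise ValueError("numsubjobs must be at least 1")
    chunks = [files[i::numsubjobs] for i in range(numsubjobs)]
    # Drop empty chunks when there are fewer files than requested subjobs.
    return [chunk for chunk in chunks if chunk]


# Hypothetical input file names, for illustration only.
files = ["AOD._%05d.pool.root" % i for i in range(1, 6)]
subjobs = split_files(files, 2)
for index, chunk in enumerate(subjobs):
    print(index, chunk)
```

Each subjob then processes only its own chunk, and a merger combines the per-subjob outputs afterwards.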
list_plugins("type")        # List plugins of the specified type:
                            # "applications", "backends", etc.
j1 = Job(backend=LSF())     # Create a new job for LSF
a1 = Executable()           # Create an Executable application
j1.application = a1         # Set value for the job's application
j1.backend = LCG()          # Change the job's backend to LCG
export(j1, "myJob.py")      # Write job to the specified file
load("myJob.py")            # Load job(s) from the specified file
j2 = j1.copy()              # Create j2 as a copy of job j1
jobs                        # List jobs
jobs[i].subjobs             # List subjobs for split job i
Ganga CLIP commands (1)
Useful commands
Ganga CLIP commands (2)
When a job j has been defined, the following methods can be used:
j.submit() # Submit the job
j.kill() # Kill the job (if running)
j.remove() # Kill the job and delete associated files
j.peek() # List files in the job's output directory
Once a job has been submitted, it can no longer be modified and it cannot be resubmitted, but the job can be copied and the copy can be modified and submitted
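The copy-then-modify rule can be mimicked with a small toy class. This is a sketch of the behaviour described here, not Ganga code: `ToyFrozenJob` and its status values are hypothetical.

```python
import copy


class ToyFrozenJob:
    """Hypothetical job whose settings are frozen once it has been submitted."""

    def __init__(self, name):
        self.name = name
        self.status = "new"

    def __setattr__(self, key, value):
        # Status may always change; everything else only while the job is new.
        if key != "status" and getattr(self, "status", "new") != "new":
            raise AttributeError("job already submitted; copy it to make changes")
        super().__setattr__(key, value)

    def submit(self):
        if self.status != "new":
            raise RuntimeError("job cannot be resubmitted; submit a copy instead")
        self.status = "submitted"

    def copy(self):
        clone = copy.copy(self)
        clone.status = "new"  # the copy starts out unsubmitted
        return clone


j = ToyFrozenJob("analysis")
j.submit()

j2 = j.copy()              # the original is frozen, so work on a copy
j2.name = "analysis-retry"
j2.submit()
print(j.name, j.status)
print(j2.name, j2.status)
```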
Ganga architecture
[Figure: Ganga architecture diagram. User interfaces: CLIP, GUI, scripts (e.g. j = Job(backend=LSF()); j.submit()). Ganga.Core: job repository, file workspace, IN/OUT sandbox, monitoring. Plugin modules: Athena, Gaudi, CondorG, gLite, LSF.]
Support for managed production and user analysis
Coherent, homogeneous processing system layered over diverse resources
Pilot submission through CondorG, local batch or gLite WMS
PanDA
Pilot jobs are used to acquire resources. Workload jobs are assigned to successfully activated pilots based on Panda-managed brokerage criteria
Integrated data management and monitoring system
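The pilot model can be pictured with a very small simulation. It is a sketch under strong simplifications, not Panda's brokerage: the function `dispatch`, the pilot and job names, and the first-come-first-served rule are all assumptions made for illustration.

```python
from collections import deque


def dispatch(workloads, pilots):
    """Hand queued workloads to activated pilots, first come, first served."""
    queue = deque(workloads)
    assignments = {}
    for pilot, activated in pilots:
        if not activated:
            continue  # a pilot that failed to start never receives a workload
        if not queue:
            break     # nothing left to assign
        assignments[pilot] = queue.popleft()
    return assignments, list(queue)


# Hypothetical pilots (name, activated?) and workload jobs.
pilots = [("pilot-1", True), ("pilot-2", False), ("pilot-3", True)]
workloads = ["job-A", "job-B", "job-C"]

assigned, waiting = dispatch(workloads, pilots)
print(assigned)
print(waiting)
```

Because a workload is only handed to a pilot that is already running, a failure on the resource side costs a pilot rather than a user job.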
Monitoring tools
Jobs
SAM
Collect, store and expose to users information coming from different sources
You have a choice:
1) Select to see all jobs submitted in the selected time window. By default you get the last 24-hour time window.
2) Select all jobs which were terminated in the last 24 hours or are pending or running at the current moment. Then select the ‘all jobs regardless submission time’ option.
Monitoring tools
Useful links
Ganga: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/FullGangaAtlasTutorial
Panda: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/Panda
Dashboard: http://arda-dashboard.cern.ch/cms
SAM: