Cluster, Grids, and Cloud Computing
for high performance computing and business
Yudith Cardinale, PhD Universidad Simón Bolívar
Caracas, Venezuela, October 2009
High Performance Computing
◆ Use of the most powerful computers currently available (supercomputers) to solve advanced computation problems that need a lot of CPU time and/or huge quantities of data.
◆ Commonly associated with parallel programming, which uses many CPUs simultaneously.
◆ Mostly associated with computing for scientific research.
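As a rough illustration of using many CPUs simultaneously, the following minimal sketch splits one computation into chunks and hands each chunk to a separate process. The problem (a sum of squares) and the names split_range, partial_sum, and parallel_sum_of_squares are hypothetical and chosen only for the example; real HPC codes typically rely on dedicated libraries and far larger workloads.

```python
from concurrent.futures import ProcessPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks of near-equal size."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def partial_sum(chunk):
    """Work done by one CPU: the sum of squares over its own chunk."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=4):
    """Distribute the chunks over `workers` processes and combine the partial results."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split_range(n, workers)))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000, workers=4))
```

The same split, compute, and combine pattern is what lets a problem scale from one CPU to many, provided the chunks are independent of each other.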
Examples of HPC applications
◆ Petrochemical: fluid dynamics, simulation, modeling, seismic tomography
◆ Digital biology
◆ Internet and e-commerce
◆ Weather prediction
◆ Number theory
Supercomputing Architectures
◆ Traditional supercomputers were based on a few processors with vector facilities and specialized technology (30 M$!)
◆ More recently, supercomputers consist of hundreds of processors (Massively Parallel Processors, MPP) … and are still very costly!
Figure: CRAY 2 (1985-89)
Supercomputing Architectures: Clusters
◆ Generic computers connected to each other through fast local area networks and viewed as a single computer.
◆ Usually deployed to improve performance and availability over single computers, while typically being much more cost-effective than single computers of comparable speed or availability
Figure: Roadrunner, BlueGene, IBM, +1400 Tflops, 129K CPUs
Figure: Nebulae, Dawning Blade, Nvidia GPU, +1200 Tflops, 120K CPUs, Cluster
Figure: Jaguar, Cray Inc, Opteron Six Core, +1700 Tflops, 224K CPUs, MPP
Figure: New GPU Parallel Architecture, Nvidia
◆ Benefits of Clusters:
● Better cost/benefit ratio (reduced cost)
● Scalability
● Availability
● Re-usability
● Processing power
● Many, many open source tools available
◆ Clusters and MPPs dominate the TOP500 list
Top 6 (June 2010)
Site | Equipment | Country | #Proc. | Pmax | Archit. | Network | OS
Oak Ridge Nat Lab | Jaguar-Cray XT5, AMD Opt 6C | USA | 224162 | 1.76 Pflops | MPP | Proprietary | Linux
NSCS Shenzhen | Nebulae-Dawning TC36000 Blade, Intel GPU | China | 120640 | 1.27 Pflops | Cluster | Infiniband | Linux
DOE/NNSA/LLNL | Roadrunner-BladeCenter, PowerXCell, IBM | USA | 122400 | 1.04 Pflops | Cluster | Infiniband | Linux
NSF U. Tennessee | Kraken-Cray XT5, AMD Opt 6C | USA | 98928 | 832 Tflops | MPP | Proprietary | Linux
Forschungszentrum | JUGENE-Blue Gene/P | Germany | 294912 | 826 Tflops | MPP | Proprietary | CNK
NASA | Pleiades-SGI Altix ICE, Intel Xeon QC | USA | 81920 | 772 Tflops | MPP | Infiniband | SGI
Cluster categorizations
◆ Load-balancing clusters: share the computational workload and distribute it efficiently.
◆ High availability clusters: provide redundancy of data and services (backup systems)
◆ High performance (compute) clusters: provide parallel execution
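To make the load-balancing category concrete, here is a minimal sketch of one possible policy: each incoming job goes to the node with the smallest accumulated load. The function name assign_jobs, the node names, and the job costs are hypothetical; production load balancers use richer policies and live measurements.

```python
import heapq


def assign_jobs(job_costs, node_names):
    """Greedy least-loaded assignment: each job goes to the node with the smallest load so far."""
    heap = [(0.0, name) for name in node_names]  # entries are (current load, node name)
    heapq.heapify(heap)
    placement = {name: [] for name in node_names}
    for job_id, cost in enumerate(job_costs):
        load, name = heapq.heappop(heap)           # least-loaded node at this moment
        placement[name].append(job_id)
        heapq.heappush(heap, (load + cost, name))  # put it back with its new load
    return placement


if __name__ == "__main__":
    print(assign_jobs([5, 3, 2, 4], ["n1", "n2"]))
```

A high availability cluster would keep the same set of nodes but use the spare ones as backups rather than as extra capacity.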
Examples of how Clusters are used in HPC and business
◆ Load-balancing clusters:
● Transaction-based systems (banking)
● E-stores or online applications for customer service (Amazon, ...)
◆ High availability clusters:
● Google platform
● Online services (selling, banking, ...)
◆ High performance (compute) clusters:
● Science and research (human genome, weather prediction, NASA, etc.)
I want more!!
◆ Limitations: available electrical power is limited (power conditioning) … so is physical space (a room, a building?) … and cooling equipment
◆ Solution: aggregate geographically distant, distributed resources through fast networks
◆ Shared clusters offer economies of scale and more effective use of resources
Ubiquitous Computing
Technology evolution: supercomputers, clusters, ubiquitous computing (volunteer (P2P), grid, and cloud)
Ubiquitous Computing (Computer Supported Collaborative Work)
Ubiquitous Computing: two perspectives
◆ Technology used to enhance collaboration
◆ Collaboration used to enhance technology
Volunteer Computing (P2P computing)
◆ Scenario: the public
● 6+ billion citizens
● Many have Internet access
● Most of them are (or could be made) interested in science
● Large aggregate computing and storage capacity worldwide
● High percentage of idle compute time
◆ How can science benefit from that?
● People provide computing resources (volunteer computing)
Volunteer Computing (P2P computing) for science
◆ SETI@home (Search for Extraterrestrial Intelligence)
◆ Folding@home (study of protein folding) and Genome@home (human genome project)
◆ Other projects:
● Climateprediction.net
● Einstein@home (search for spinning neutron stars (pulsars))
● LHC@home (particle accelerators and detectors)
● Rosetta@home (study of protein structures)
◆ Performance:
● Currently: 500K people, 1M PCs, 6.5 PetaFlops.
● Potential: 1 billion PCs (today), 2 billion PCs (2015), GPUs, ExaFlops, Exabytes of storage.
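The step from PetaFlops to ExaFlops can be checked with simple arithmetic. The sketch below takes the two figures quoted above (6.5 PetaFlops from 1M PCs) and assumes, purely for illustration, that every additional PC would contribute the same average rate; the names flops_per_pc and projected_flops are hypothetical.

```python
PETA = 1e15
EXA = 1e18

current_flops = 6.5 * PETA  # aggregate throughput quoted above
current_pcs = 1_000_000     # number of PCs quoted above

# Average contribution of a single volunteer PC.
flops_per_pc = current_flops / current_pcs


def projected_flops(num_pcs, per_pc=flops_per_pc):
    """Aggregate throughput if every PC delivered the same average rate (an assumption)."""
    return num_pcs * per_pc


if __name__ == "__main__":
    print(f"average per PC: {flops_per_pc / 1e9:.1f} GFlops")
    for pcs in (1_000_000_000, 2_000_000_000):
        print(f"{pcs:,} PCs: {projected_flops(pcs) / EXA:.1f} ExaFlops")
```

Under that assumption a population of a billion PCs lands in the ExaFlops range, which is consistent with the potential stated above.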
Volunteer Computing: Business-to-Business (B2B)
◆ Business intelligence: collaboration among members of commerce chains (sharing customer data and business strategies)
◆ E-business collaboration: small businesses can integrate with big ones
◆ Edge services: enterprises integrate services and capabilities across wide geographic boundaries and bring data closer to the point of consumption
◆ Distributed computing and resource utilization: enterprises with large-scale networks can take advantage of idle …
… but a P2P platform is not enough … security and standards are needed
Implications of Volunteer Computing
◆ Anonymous resources
◆ Result validation, no trusted results
◆ Replication of compute
◆ Heterogeneity
◆ Applications are not “tied” to any architecture
◆ Sporadic availability of resources
◆ Sporadic availability of unidirectional network
◆ Scalability
◆ Easy to use (clients)
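Because the resources are anonymous and their results are not trusted, the same work unit is commonly computed on several volunteers and the answers are compared. A minimal sketch of such a check follows, assuming that results can be compared for exact equality; the function name validate_result and the quorum parameter are hypothetical.

```python
from collections import Counter


def validate_result(replica_results, quorum=2):
    """Return the value reported by at least `quorum` replicas, or None if no value reaches it."""
    if not replica_results:
        return None
    # most_common(1) yields the single most frequent value and its vote count.
    value, votes = Counter(replica_results).most_common(1)[0]
    return value if votes >= quorum else None


if __name__ == "__main__":
    print(validate_result([42, 42, 17]))  # two replicas agree on 42
    print(validate_result([1, 2, 3]))     # no agreement, the work unit must be reissued
```

The price of this validation is the replication of compute listed above: every accepted result costs at least `quorum` executions.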
Grid Computing Definition
Unification of geographically distributed resources
GRID = An infinite world of resources
◆ Heterogeneous resources tied together by middleware and accessible transparently through a fast network
◆ Grid infrastructure provides a common set of services and capabilities deployed across resources
◆ Methods and approaches for accessing grid services are the responsibility of final users
◆ Allows collaboration among academic, research, and scientific institutions
◆ Each institution offers its infrastructure to gain access to other infrastructures:
● increase local computing power
● better utilization of resources (execute bigger applications without investment in infrastructure)
◆ Each institution is responsible for its own infrastructure (costs, administration, maintenance)
◆ We can say that Grid Computing defines a Social Computation Model:
● All institutions collaborate with common research objectives
● All institutions benefit, in the short term or the long term.
Challenges for building grid computing
◆ Security
◆ Resource assignment and scheduling
◆ Data locality
◆ System administration
◆ Resource discovery
◆ Uniform access
◆ Computational economy
◆ Construction of applications
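Two of these challenges, resource discovery and scheduling, can be sketched together as a matchmaking step: filter the known resources by the job's requirements, then choose one of the matches. Everything below (the Resource fields, the discover and schedule functions, the best-fit policy) is a hypothetical illustration and not the interface of any particular grid middleware.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    site: str
    cpus: int
    memory_gb: int
    os: str


def discover(resources, min_cpus, min_memory_gb, os=None):
    """Resource discovery: keep the resources that satisfy the job's requirements."""
    return [
        r for r in resources
        if r.cpus >= min_cpus
        and r.memory_gb >= min_memory_gb
        and (os is None or r.os == os)
    ]


def schedule(resources, min_cpus, min_memory_gb, os=None):
    """Scheduling: among the matches, pick the smallest resource that fits (best fit)."""
    matches = discover(resources, min_cpus, min_memory_gb, os)
    return min(matches, key=lambda r: (r.cpus, r.memory_gb), default=None)


if __name__ == "__main__":
    catalog = [
        Resource("cluster-a", "site-1", 8, 16, "Linux"),
        Resource("cluster-b", "site-2", 32, 64, "Linux"),
        Resource("mpp-c", "site-3", 128, 256, "Linux"),
    ]
    print(schedule(catalog, min_cpus=16, min_memory_gb=32, os="Linux"))
```

In a real grid the catalog is itself distributed and changes over time, which is why discovery and uniform access appear as separate challenges.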
Middleware Examples (science)
◆ Globus (http://www.globus.org)
◆ gLite (http://glite.web.cern.ch/glite)
◆ Legion (http://legion.virginia.edu)
◆ UNICORE (http://www.unicore.org)
◆ SUMA (http://suma.ldc.usb.ve)
Grid Platform Examples (science)
◆ EGEE/EELA/GILDA (Europe and Latin America)
◆ TeraGrid (USA)
Implications of Grid Computing
◆ Identified resources
◆ Formal participation
◆ Trusted and secure environments
◆ Heterogeneity
◆ Integrate all types of software, hardware, operating systems, languages, and architectures
◆ High availability of resources
◆ Provide fault tolerance
◆ Scalability
Cloud Computing Definition
Unification of geographically distributed resources for SELLING
◆ Cloud Computing involves delivering hosted services over the Internet
◆ Provides easy, scalable access to computing resources and IT services
◆ Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud"
◆ Services are fully managed by providers
◆ Takes advantage of P2P and Grid computing technologies
◆ These services are broadly divided into three categories:
● Infrastructure-as-a-Service (IaaS)
● Platform-as-a-Service (PaaS)
● Software-as-a-Service (SaaS)
◆ There are public clouds and private clouds
● Public clouds sell services to anyone
● Private clouds supply services to a limited number of people
◆ We can say that Cloud Computing defines a Capitalist Computation Model:
● Everyone gets individual benefits with minimum investment
● Providers get benefits ($$$) from everyone
Challenges of Cloud Computing
◆ Virtualization
◆ QoS
◆ Accounting
◆ Web Services
Cloud Computing Examples
Business: Grid or Cloud Computing?
◆ In the enterprise world there is no notion of “sharing”: how much profit do I make if I offer my resources? (hard to measure in grid computing)
◆ Common objectives among companies? Usually they compete (commercial competition), whereas grid computing is for common goals
◆ In a business context, users prefer to have control over their resources: not viable in grid but tacit in cloud
◆ In cloud: providers save through large-scale administration of resources, users save in total cost of ownership