GRID COMPUTING – AN
INTRODUCTION
Dr G SUDHA SADASIVAM
Outline
Introduction to Grid Computing
Methods of Grid computing
Grid Middleware
Grid Architecture
Grid Computing
Grid computing is a form of distributed computing whereby
a "super and virtual computer" is composed of a cluster of
networked, loosely coupled computers, acting in concert to
perform very large tasks.
Grid computing (Foster and Kesselman, 1999) is a growing
technology that facilitates the execution of large-scale,
resource-intensive applications on geographically distributed
computing resources.
Facilitates flexible, secure, coordinated large-scale resource
sharing among dynamic collections of individuals, institutions,
and resources.
Enables communities (“virtual organizations”) to share
geographically distributed resources as they pursue common
goals.
Ian Foster and Carl Kesselman
Criteria for a Grid:
Coordinates resources that are not subject to
centralized control.
Uses standard, open, general-purpose protocols and
interfaces.
Delivers nontrivial qualities of service.
Benefits
Exploit underutilized resources
Resource load balancing
Virtualize resources across an enterprise
Data Grids, Compute Grids
Enable collaboration for virtual organizations
Grid Applications
Data- and computationally intensive applications:
This technology has been applied to computationally intensive
scientific, mathematical, and academic problems such as drug
discovery, economic forecasting, and seismic analysis, and to
back-office data processing in support of e-commerce.
A chemist may utilize hundreds of processors to screen
thousands of compounds per hour.
Teams of engineers worldwide pool resources to analyze
terabytes of structural data.
Meteorologists seek to visualize and analyze petabytes of
climate data with enormous computational demands.
Resource sharing
Computers, storage, sensors, networks, …
Sharing always conditional: issues of trust, policy,
negotiation, payment, …
Coordinated problem solving
distributed data analysis, computation, collaboration, …
Grid Topologies
• Intragrid
– Local grid within an organisation
– Trust based on personal contracts
• Extragrid
– Resources of a consortium of organisations
connected through a (Virtual) Private Network
– Trust based on Business to Business contracts
• Intergrid
– Global sharing of resources through the internet
– Trust based on certification
Computational Grid
“A computational grid is a hardware and software
infrastructure that provides dependable, consistent,
pervasive, and inexpensive access to high-end
computational capabilities.”
“The Grid: Blueprint for a New Computing
Infrastructure”, Kesselman & Foster
Example : Science Grid (US Department of Energy)
Data Grid
A data grid is a grid computing system that deals with data:
the controlled sharing and management of large
amounts of distributed data.
The data grid is the storage component of a grid environment.
Scientific and engineering applications require access to
large amounts of data, and often this data is widely
distributed. A data grid provides seamless access to the local
or remote data required to complete compute intensive
calculations.
Example :
Biomedical Informatics Research Network (BIRN),
the Southern California Earthquake Center (SCEC).
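The "seamless access to local or remote data" idea can be sketched as a replica lookup: a logical dataset name maps to several physical copies, and the caller is handed the local copy when there is one, otherwise the cheapest remote copy. This is only an illustrative sketch. The `Replica` type, the `CATALOG` contents, the site names, and the latency figures are all hypothetical and do not come from any real data grid middleware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str          # hypothetical site name
    latency_ms: float  # assumed cost of reading from that site


def choose_replica(replicas, local_site):
    """Return the local replica if one exists, else the lowest-latency remote one."""
    if not replicas:
        raise LookupError("no replica registered for this dataset")
    for replica in replicas:
        if replica.site == local_site:
            return replica
    return min(replicas, key=lambda r: r.latency_ms)


# Hypothetical catalog: logical dataset name -> physical copies.
CATALOG = {
    "seismic-2004": [Replica("site-a", 40.0), Replica("site-b", 12.0)],
    "brain-scans": [Replica("site-c", 55.0)],
}

if __name__ == "__main__":
    picked = choose_replica(CATALOG["seismic-2004"], local_site="site-z")
    print("read from", picked.site)
```

The application asks only for the logical name. Which site actually serves the data is decided by the lookup, which is what makes access look seamless to the user.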
Methods of Grid Computing
Distributed Supercomputing
High-Throughput Computing
On-Demand Computing
Data-Intensive Computing
Collaborative Computing
Logistical Networking
Distributed Supercomputing
Combining multiple high-capacity resources on
a computational grid into a single, virtual
distributed supercomputer.
Tackle problems that cannot be solved on a
single system.
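A minimal sketch of the idea, assuming a local thread pool stands in for the separate high-capacity resources: one large problem (here, a numerical estimate of pi by the midpoint rule) is cut into parts, each part is computed separately, and the partial results are combined. The function names and the choice of problem are illustrative assumptions. A real grid would dispatch the parts through its middleware rather than through threads.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(args):
    """Midpoint-rule sum of 4 / (1 + x^2) over one slice of [0, 1]."""
    start, stop, n = args
    h = 1.0 / n
    return sum(4.0 / (1.0 + ((i + 0.5) * h) ** 2) for i in range(start, stop)) * h


def estimate_pi(n=100_000, parts=4):
    # Cut the index range 0..n into `parts` contiguous slices.
    bounds = [(k * n // parts, (k + 1) * n // parts, n) for k in range(parts)]
    # Each slice stands in for the share of work sent to one resource.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return sum(pool.map(partial_sum, bounds))


if __name__ == "__main__":
    print(estimate_pi())
```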
High-Throughput Computing
Uses the grid to schedule large numbers of
loosely coupled or independent tasks, with the
goal of putting unused processor cycles to
work.
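As a rough sketch, high-throughput computing looks like a long queue of independent jobs drained by whichever worker is idle. The example below assumes a local thread pool in place of idle grid processors. `screen_compound` is a hypothetical toy job with an arbitrary scoring rule, not a real screening method.

```python
from concurrent.futures import ThreadPoolExecutor


def screen_compound(compound_id):
    """Hypothetical stand-in for one independent job."""
    score = (compound_id * 37) % 100  # toy deterministic score, not chemistry
    return compound_id, score >= 90


def run_high_throughput(job_ids, workers=4):
    # Jobs do not depend on each other, so any idle worker can take the next one.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(screen_compound, job_ids))
    return [compound_id for compound_id, hit in results if hit]


if __name__ == "__main__":
    hits = run_high_throughput(range(1000))
    print(len(hits), "compounds passed the toy screen")
```

Because no job waits on another, adding workers raises the number of jobs finished per hour without changing the program, which is the point of this method.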
On-Demand Computing
Uses grid capabilities to meet short-term
requirements for resources that are not
locally accessible.
Models real-time computing demands.
Collaborative Computing
Concerned primarily with enabling and
enhancing human-to-human interactions.
Applications are often structured in terms of a
virtual shared space.
Data-Intensive Computing
The focus is on synthesizing new information
from data that is maintained in geographically
distributed repositories, digital libraries, and
databases.
Particularly useful for distributed data mining.
Logistical Networking
Logistical networks focus on exposing storage
resources inside networks by optimizing the global
scheduling of data transport and data storage.
Contrasts with traditional networking, which does
not explicitly model storage resources in the
network.
Provides high-level services for Grid applications.
Called "logistical" because of the analogy it bears
with the systems of warehouses, depots, and
distribution channels.
A typical view of Grid environment
[Figure: User, Resource Broker, Grid Information Service, and Grid Resources, linked by arrows numbered 1 to 4. Arrow labels: Grid application, Details of Grid resources, Computational jobs, Processed jobs, Computation result.]
User: A user sends a computation- or data-intensive application to Global Grids in order to speed up the execution of the application.
Resource Broker: A resource broker distributes the jobs in an application to the Grid resources, based on the user's QoS requirements and the details of available Grid resources, for further execution.
Grid Information Service: The Grid Information Service system collects the details of the available Grid resources and passes the information to the resource broker.
Grid Resources: Grid resources (cluster, PC, supercomputer, database, instruments, etc.) in the Global Grid execute the user jobs.
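The broker's role can be sketched as a small matching routine: given resource details (as an information service might report them) and a user QoS limit, each job goes to the resource that would finish it earliest without exceeding the limit. This is a simplified illustration under stated assumptions. The `Resource` and `Job` fields, the cost model, and the greedy earliest-finish rule are all hypothetical and are not the algorithm of any particular broker.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    speed: float             # work units per second (assumed metric)
    cost: float              # price per second (assumed metric)
    busy_until: float = 0.0  # time at which the resource becomes free


@dataclass
class Job:
    name: str
    work: float              # work units


def broker(jobs, resources, max_cost_per_job):
    """Greedy sketch: earliest finish time among resources within the cost cap."""
    plan = {}
    # Place the largest jobs first so they get the best resources.
    for job in sorted(jobs, key=lambda j: j.work, reverse=True):
        candidates = []
        for res in resources:
            runtime = job.work / res.speed
            if runtime * res.cost <= max_cost_per_job:  # QoS: budget limit
                candidates.append((res.busy_until + runtime, res))
        if not candidates:
            plan[job.name] = None  # no resource meets the user's QoS
            continue
        finish, chosen = min(candidates, key=lambda c: c[0])
        chosen.busy_until = finish
        plan[job.name] = chosen.name
    return plan


if __name__ == "__main__":
    resources = [Resource("cluster", speed=10, cost=2), Resource("pc", speed=1, cost=0.1)]
    jobs = [Job("a", work=100), Job("b", work=10)]
    print(broker(jobs, resources, max_cost_per_job=25))
```

A job that no resource can run within the budget is mapped to `None` rather than silently placed, so the user can see that the QoS requirement could not be met.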