
Grid Computing & Cloud Computing

The fields of Grid, Utility and Cloud Computing share a common set of objectives: harnessing shared resources to optimally meet a great variety of demands cost-effectively and in a timely manner.

Improvements at the processor level, the availability in recent years of low-cost multi-core chips, and further progress in high-speed, low-latency interconnects have made it possible to build large-scale local clusters for distributed computing, and to extend them into wide-area collaborating clusters in the Grid. More recently, hardware support for platform virtualization on commodity machines has provided a key enabler for Cloud-based computing.

Two software trends complement these hardware advances. One is the improving maturity and capability of software for managing virtual machines; the other is the migration from a monolithic approach to constructing software solutions towards a service approach, in which complex processes are composed of loosely coupled components.

Foster, Kesselman, and Tuecke (2001) define the Grid concept as "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations... The sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs."

Utility Computing is "a business model in which computing resources are packaged as metered services" (Foster, Zhao, Raicu, & Lu, 2008) to meet on-demand resource requirements.

Cloud Computing can be described as application software delivered as services over the Internet, together with the software and hardware infrastructure in the data centers that provides those services, using business models similar to Utility Computing.

Cloud Computing leverages emerging technologies such as Web 2.0 for application services, and virtualization and dynamic provisioning support for platform services.

Grid Computing, Utility Computing and Cloud Computing differ in aspects such as their architectures, the types of coordinated institutions, the types of resources shared, the cost/business models, and the technologies used to achieve their objectives. However, all three computing environments share the common objective of harnessing shared resources to optimally meet a variety of demands cost-effectively and in a timely manner.

Basics of Grid Computing


Grid Computing harnesses distributed resources from various institutions (resource providers) to meet the demands of the clients consuming them. Resources from different providers are likely to be diverse and heterogeneous in their functions (computing, storage, software, etc.), hardware architectures (Intel x86, IBM PowerPC, etc.), and the usage policies set by their owning institutions.

Developed under the umbrella of Grid Computing, information services, name services, and resource brokering services are important technologies responsible for aggregating resource information and availability, and for selecting resources that meet clients' specific requirements and quality-of-service criteria while adhering to resource usage policies.

Cloud Services are "consumer and business products, services and solutions that are delivered and consumed in real-time over the Internet," while Cloud Computing is "an emerging IT development, deployment and delivery model, enabling real-time delivery of products, services and solutions over the Internet (i.e., enabling Cloud services)."

Interaction Models of Grid and Cloud Computing


One of the most scalable interaction models among Grid domains is peer-to-peer. Grids use heterogeneous resources from more than one resource provider belonging to the same Virtual Organization (VO) to execute their applications, so it is important for participating resource providers and consumers to have common information models, interaction protocols, application execution states, etc. The Open Grid Forum (OGF) is the organization whose goal is to establish the relevant and necessary standards for Grid computing.

Some proposed standards include the Job Submission Description Language (JSDL) and the Basic Execution Service (BES), among others. OGF also officially launched a working group, the Open Cloud Computing Interface Working Group (OCCI-WG), to develop the common APIs necessary for the lifecycle management of Cloud infrastructure services.

Distributed Computing in the Grid and Cloud


The first aspect of distributed computing in the Grid is administering and managing an interoperable collection of distributed compute resource clusters on which client jobs, typically scientific/HPC applications, are executed. This includes the procedures and protocols required to support clients through complex services built on distributed components that handle job submission, security, machine provisioning, and data staging.

The Grid also represents, as a coherent entity, a collection of compute resources that may be under different administrative domains, such as universities, but that interoperate transparently to form virtual organizations.

The second aspect of distributed computing in the Grid is that the jobs themselves are distributed, typically running on tightly coupled nodes within a cluster and leveraging middleware services.

Workloads in Clouds usually consist of more loosely coupled distributed jobs such as map/reduce, and HPC jobs
written to minimize internode communication and leverage concurrency provided by large multi-core nodes.

Layered Models and Usage Patterns in Grid and Cloud


Infrastructure

This is the layer in which Clouds share most characteristics with the original purpose of Grid middleware; examples include Eucalyptus, OpenNebula, and Amazon EC2. In these systems, users can provision execution environments in the form of virtual machines through interfaces such as APIs or command-line tools.

We use Globus as the reference Grid technology.

The user first needs to be authorized to use the system. In Grid systems this is managed through the Community Authorization Service (CAS) or by contacting a Certificate Authority that is trusted by the target institution, which issues a valid certificate.

Clouds usually offer web forms to allow the registration of new users, and have additional web applications to maintain customer databases and generate credentials, as in the case of Eucalyptus or Amazon.
Different mechanisms are employed to carry users' requests, but Web Services (WS) are the most common of them. Users either write a custom program that consumes the WS offered by providers, or use available tools; examples include the Amazon API tools for Amazon EC2.
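As an illustrative sketch of consuming such a service, the following uses the third-party boto3 Python library to request a virtual machine from Amazon EC2; the region, AMI ID and instance type are placeholder assumptions, not values from these notes.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Request a single small virtual machine from a placeholder machine image.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # hypothetical AMI ID
        InstanceType="t2.micro",
        MinCount=1,
        MaxCount=1,
    )
    print(response["Instances"][0]["InstanceId"])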

Globus offers a set of console-based scripts that facilitate communication with the Grid.

As part of the request for resource usage, users need to specify the action or task to be executed on the destination
resources. Several formats are available for this purpose. Globus supports a Resource Specification Language (RSL)
and a Job Submission Description Language (JSDL) that can define what process is to be run on the target machine,
as well as additional constraints that can be used by a matchmaking component to restrict the class of resources to be
considered, based on machine architecture, processor speed, amount of memory, etc.
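For illustration, a job description in the classic Globus RSL syntax might look like the following minimal sketch; the attribute values are hypothetical.

    & (executable = /bin/hostname)
      (count = 4)
      (maxWallTime = 10)

This asks the target resource to run /bin/hostname on four processors with a ten-minute wall-time limit.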

A match-making and scheduling phase is then involved. The GRAM component from Globus is especially flexible in this regard, with multiple adapters that allow different treatments for jobs.

More advanced and widely used adapters transfer job execution responsibility to a local resource manager such as Condor, LoadLeveler or Sun Grid Engine. These systems are capable of multiplexing the jobs sent to a site onto multiple resources.
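As a sketch, a minimal Condor submit description handed to such a local resource manager could look like this; the executable and file names are hypothetical.

    universe   = vanilla
    executable = analyze_data
    arguments  = input.dat
    output     = job.out
    error      = job.err
    log        = job.log
    queue 1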

Cloud systems have simpler job management strategies, since the jobs are homogeneous and do not need to be adapted to a variety of resources as in the case of the Grid. For example, Eucalyptus uses a Round Robin scheduling technique to alternate among machines.
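A minimal Python sketch of such round-robin placement, with hypothetical host names, is shown below; it illustrates the technique and is not Eucalyptus code.

    from itertools import cycle

    # Rotate through a fixed pool of hosts, assigning each new VM request
    # to the next host in the cycle.
    hosts = cycle(["node01", "node02", "node03"])

    def place(vm_request):
        return next(hosts)

    for vm in ["vm-a", "vm-b", "vm-c", "vm-d"]:
        print(vm, "->", place(vm))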

In the case of Cloud computing, the most important data to be transferred are the definitions of execution environments, usually in the form of Virtual Machine images. There is no standard method for transferring data in Cloud systems, but Amazon's object storage solution, the Simple Storage Service (S3), is worth noting.
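As a sketch of how a VM image might move through such a store, the following uses the boto3 library; the bucket and file names are hypothetical and the bucket is assumed to exist.

    import boto3

    s3 = boto3.client("s3")

    # Upload a virtual machine image from the user to the repository bucket...
    s3.upload_file("myimage.img", "vm-image-repository", "images/myimage.img")

    # ...and fetch it on a hosting machine before instantiation.
    s3.download_file("vm-image-repository", "images/myimage.img", "/tmp/myimage.img")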

Both Grid and Cloud systems need to offer users a method to monitor their jobs as well as their resource usage. In Globus, information about the Grid's resources is provided by the Monitoring and Discovery Service (MDS).

High-level monitoring tools have been developed on top of existing Cloud management systems such as Amazon
CloudWatch.

Platform
The interface provided by a PaaS solution allows developers to build additional services without being exposed to
the underlying physical or virtual resources.

Examples of Cloud solutions that present these features are Google App Engine, Salesforce's force.com and Microsoft Azure.

We define Platform level solutions as those containing the following two aspects:
• Abstraction from Physical Resources
• Programming API to Support New Services

Two factors distinguish PaaS solutions in Clouds from the Grid. The first is that PaaS solutions are deeply tied to Cloud vendors, and are therefore designed hand in hand with the rest of the infrastructure; the second is that provisioning resources with the required libraries is much easier than in Grid computing, allowing new nodes to be spawned with the required environment.

In Grids, installing the required software on the execution resources usually involves a human operator, making the process more costly.

Applications

There is no clear distinction between applications developed on Grids and those that use Clouds to perform execution and storage.

Grid applications have fallen mostly in the realm of scientific software, while software running in Clouds has leaned towards commercial workloads.

Possible causes for the different levels of adoption of these technologies for the development of applications:
• Lack of business opportunities in Grids.
• Complexity of Grid tools.
• Affinity with target software.

Service Orientation and Web Services


The architectural principle adopted by the Grid is Service Orientation (SO) with software components connected
by Web Services (WS).
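To illustrate the idea, the sketch below consumes a WSDL-described service with the third-party Python zeep library; the service URL and the SubmitJob operation are hypothetical.

    from zeep import Client

    # The client builds its interface from the service's WSDL contract,
    # so components stay loosely coupled to the implementation behind it.
    client = Client("https://provider.example.org/jobservice?wsdl")
    result = client.service.SubmitJob(executable="/bin/hostname", count=4)
    print(result)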

Data Management

In Grid computing, data-intensive applications such as the scientific software in domains like high-energy physics, bioinformatics, astronomy or earth sciences involve large amounts of data, sometimes at the scale of petabytes (PB) and beyond.

Data Grids have emerged in scientific and commercial settings to specifically optimize data management. For example, one of the services provided by Data Grids is replica management.

In order to retrieve data efficiently and to avoid hot spots in a distributed environment, Data Grids often keep replicas, which are either complete or partial copies of the original datasets. Replica management services are responsible for creating, registering, and managing these replicas.

GridFTP is an extension of FTP that supports parallel and striped data transfer as well as partial file transfer. FTP and GridFTP are the most widely used transport protocols for transferring bulk data in Grid applications.
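For illustration, a transfer between two GridFTP servers using four parallel streams could be requested with the globus-url-copy tool roughly as follows; the host and file names are hypothetical.

    globus-url-copy -p 4 gsiftp://gridftp.siteA.org/data/input.dat \
        gsiftp://gridftp.siteB.org/data/input.dat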

In the current state of Cloud computing, storage is usually close to computation, and therefore data management is simpler than in Grids.

Clouds still need to provide scalable and efficient techniques for transferring data. For example, we may need to move virtual machine images, which are used to instantiate execution environments in Clouds, from users to a repository and from the repository to hosting machines.

Monitoring

Although some Cloud monitoring tools have already been developed, they provide only high-level information and, in most cases, the monitoring functionality is embedded in the VM management system following vendor-specific mechanisms and models.

The current challenge for Cloud monitoring tools is to provide information about the Cloud and application/service requests at a sufficient level of detail in nearly real time, so that effective decisions can be made, rather than providing a simple graphical representation of the Cloud status.

Grid monitoring is a complex task, since the nature of the Grid means heterogeneous systems and resources.
However, monitoring is essential in the Grid to allow resource usage to be accounted for and to let users know
whether and how their jobs are running. This is also an important aspect for other tasks such as scheduling.

A Grid performance monitoring tool also needs to handle many different types of resources and should be able to adapt when communication links or other resources go down. Monitoring systems should be distributed to suit these requirements.

Ganglia is a scalable distributed monitoring system for high-performance computing environments such as clusters and Grids.

The Globus Monitoring and Discovery System (MDS) is another widely used monitoring tool that provides
information about the available resources on the Grid and their status.

Amazon CloudWatch is a web service that provides monitoring for Amazon Web Services Cloud resources such as Amazon EC2. It collects raw data from Amazon Web Services and then processes the information into readable metrics that are recorded for a period of two weeks. It provides users with visibility into resource utilization, operational performance, and overall demand patterns, including metrics such as CPU utilization, disk reads and writes, and network traffic.
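As a sketch, the CPU utilization metric for a single instance could be retrieved with boto3 as follows; the region and instance ID are placeholders.

    from datetime import datetime, timedelta
    import boto3

    cw = boto3.client("cloudwatch", region_name="us-east-1")

    # Average CPU utilization for one EC2 instance over the last hour,
    # in five-minute data points.
    stats = cw.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=300,
        Statistics=["Average"],
    )
    for point in stats["Datapoints"]:
        print(point["Timestamp"], point["Average"])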

Windows Azure Diagnostic Monitor collects data in local storage for every diagnostic type that is enabled and can transfer the data it gathers to an Azure Storage account for permanent storage. It can be scheduled to push the collected data to storage at regular intervals, or an on-demand transfer can be requested whenever the information is required.

Scheduling, Metascheduling, and Resource Provisioning


In Grid computing, scheduling techniques have evolved to incorporate additional factors, such as the heterogeneity of resources or their geographical distribution. The software component responsible for scheduling tasks in Grids is usually called a meta-scheduler or Grid resource broker. The main actions performed by a Grid resource broker are resource discovery and monitoring, resource selection, and job execution, handling and monitoring. However, it may also be responsible for additional tasks such as security mechanisms, accounting, quality-of-service (QoS) assurance, advance reservations, negotiation with other scheduling entities, policy enforcement, migration, etc.
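A minimal Python sketch of the resource-selection step follows; the resource attributes and the ranking criterion are hypothetical simplifications of what a real broker would consider.

    def select_resource(resources, job):
        # Keep only resources that satisfy the job's requirements...
        candidates = [
            r for r in resources
            if r["free_cpus"] >= job["cpus"] and r["arch"] == job["arch"]
        ]
        # ...then rank them; a real broker would also weigh QoS, policies and cost.
        return min(candidates, key=lambda r: r["load"], default=None)

    resources = [
        {"name": "clusterA", "arch": "x86_64", "free_cpus": 64, "load": 0.7},
        {"name": "clusterB", "arch": "x86_64", "free_cpus": 16, "load": 0.2},
    ]
    job = {"cpus": 8, "arch": "x86_64"}
    print(select_resource(resources, job))  # selects clusterB, the less loaded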

While in Grid computing the most important scheduling goals are optimizing application response time and resource utilization, in Cloud computing other factors become crucial, such as economic considerations and efficient resource provisioning in terms of QoS guarantees, utilization and energy.

Interoperability in Grids and Clouds

One goal of Grid computing is to provide uniform and consistent access to resources distributed across different data centers and institutions.

One tool that takes this approach for Grid interoperation is meta-brokering. Meta-brokering supports Grid interoperability from the viewpoint of resource management and scheduling; GridWay is one example.

The Open Cloud Computing Interface (OCCI) working group of the Open Grid Forum is working on defining an API for the development of interoperable tools for common tasks, including deployment, autonomic scaling and monitoring. The OpenNebula and RESERVOIR projects have provided OCCI-compliant implementations.
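As a rough sketch of the flavor of the OCCI HTTP rendering, creating a compute resource might look like the request below; the host and attribute values are hypothetical and the exact rendering should be checked against the OCCI specifications.

    POST /compute/ HTTP/1.1
    Host: cloud.example.org
    Category: compute; scheme="http://schemas.ogf.org/occi/infrastructure#"; class="kind"
    X-OCCI-Attribute: occi.compute.cores=2
    X-OCCI-Attribute: occi.compute.memory=4.0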

Security and User Management

Users in the Grid are granted privileges by site administrators based on their credentials, which are provided by a trusted Certificate Authority. The Grid Security Infrastructure (GSI) (Welch et al., 2003) is the component of the Globus middleware responsible for orchestrating security across different sites.

GSI uses X.509 Public Key Infrastructure (PKI) and SSL/TLS protocols for transport encryption.
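A minimal Python sketch of CA-anchored transport security in the same spirit, using the standard ssl module, is shown below; the host name, port and CA bundle path are hypothetical.

    import socket
    import ssl

    # Trust only certificates issued by the CAs in this bundle, much as GSI
    # trusts a configured set of Certificate Authorities.
    context = ssl.create_default_context(cafile="/etc/grid-security/ca-bundle.pem")

    host = "gridservice.example.org"
    with socket.create_connection((host, 8443)) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            print(tls.version(), tls.getpeercert()["subject"])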

In the case of Cloud computing, the lack of standardization among vendors results in multiple security models: for example, both Amazon EC2 and Eucalyptus employ pairs of X.509 certificates and private keys for authentication, while Google App Engine, an example of a PaaS solution, requires users to first log in via Google Accounts. This variety of methods makes it difficult to create new opportunities for interoperation, and the fragmentation of security models hinders the reuse of newly developed features.

Models such as the GSI infrastructure, in which different providers trust various Certificate Authorities without compromising the rest of the institutions, would allow Clouds to scale beyond single-institution boundaries.
