Module 4 Cloud Programming and Software Environments

Module 4 covers cloud programming and software environments, detailing features of cloud and grid platforms, programming support for major services like Google App Engine, Amazon AWS, and Microsoft Azure, and the management of Service Level Agreements (SLAs). It discusses capabilities, traditional features, data transport, security, and various programming paradigms in cloud computing. The module emphasizes the importance of parallel and distributed programming for enhancing application performance and resource utilization.

Uploaded by

Sohit Chauhan

Module 4: Cloud Programming and Software Environments

Kai Hwang, Geoffrey Fox, Jack Dongarra, Todd Green, Distributed and Cloud
Computing: Clusters, Grids, Clouds and the Future Internet, Morgan
Kaufmann Publishers, 2011 (Text Book 6)
and Cloud Computing, Rajkumar Buyya (Text Book 1)

CLOUD COMPUTING,DEPT OF CSE 1


Module 4
CLOUD PROGRAMMING AND SOFTWARE ENVIRONMENTS:

• Features of Cloud and Grid Platforms
• Parallel and Distributed Programming Paradigms
• Programming Support of Google App Engine
• Programming on Amazon AWS and Microsoft Azure

Text Book 6: Chapter 6: 6.1 to 6.4

SLA MANAGEMENT:

• Inspiration
• Traditional Approaches to SLA Management
• Types of SLA
• Life Cycle of SLA
• SLA Management in Cloud
• Automated Policy-based Management


6.1 FEATURES OF CLOUD AND GRID PLATFORMS
Important features in real cloud and grid platforms are summarized here.

Four tables cover the capabilities, traditional features, data features, and features for programmers and
runtime systems to use.

The entries in these tables are source references for anyone who wants to program the cloud
efficiently.



6.1.1 Cloud Capabilities and Platform Features
Commercial clouds need broad capabilities, as summarized in Table 6.1.

These capabilities offer cost-effective utility computing with the elasticity to scale up and down in power.
However, as well as this key distinguishing feature, commercial clouds offer a growing number of additional
capabilities commonly termed “Platform as a Service” (PaaS).

For Azure, current platform features include Azure Table, queues, blobs, Database SQL, and Web and Worker
roles. Amazon is often viewed as offering “just” Infrastructure as a Service (IaaS), but it continues to add platform
features including SimpleDB (similar to Azure Table), queues, notification, monitoring, content delivery network,
relational database, and MapReduce (Hadoop).

Google does not currently offer a broad-based cloud service, but the Google App Engine (GAE) offers a powerful
web application development environment.


Table 6.2 lists some low-level infrastructure features.

Table 6.3 lists the traditional programming environments for parallel and distributed systems that need to be
supported in Cloud environments. They can be supplied as part of system (Cloud Platform) or user environment.

Table 6.4 presents features emphasized by clouds and by some grids. Note that some of the features in Table 6.4
have only recently been offered in a major way. In particular, these features are not offered on academic cloud
infrastructures such as Eucalyptus, Nimbus, OpenNebula, or Sector/Sphere (although Sector is a data parallel file
system or DPFS classified in Table 6.4).


6.1.2 Traditional Features Common to Grids and Clouds

In this section, we concentrate on features related to workflow, data transport, security, and availability
concerns that are common to today’s computing grids and clouds.

6.1.2.1 Workflow

Workflow has spawned many projects in the United States and Europe. Pegasus, Taverna, and Kepler are popular,
but no choice has gained wide acceptance.

There are commercial systems such as Pipeline Pilot, AVS (dated), and the LIMS environments. A recent entry is
Trident [2] from Microsoft Research, which is built on top of Windows Workflow Foundation. Whether Trident runs on
Azure or on any other Windows machine, it will run workflow proxy services on external (Linux) environments.



6.1.2.2 Data Transport

The cost (in time and money) of data transport in (and to a lesser extent, out of) commercial clouds is often discussed as a difficulty in
using clouds.

If commercial clouds become an important component of the national cyberinfrastructure, we can expect that high-bandwidth links
will be made available between clouds and TeraGrid. The special structure of cloud data, with blocks (in Azure blobs) and tables, could
allow high-performance parallel algorithms, but initially, simple HTTP mechanisms are used to transport data [3–5] on academic
systems/TeraGrid and commercial clouds.

6.1.2.3 Security, Privacy, and Availability

The following techniques are related to security, privacy, and availability requirements for
developing a healthy and dependable cloud programming environment.

• Use virtual clustering to achieve dynamic resource provisioning with minimum overhead cost.

• Use stable and persistent data storage with fast queries for information retrieval.

• Use special APIs for authenticating users and sending e-mail using commercial accounts.

• Cloud resources are accessed with security protocols such as HTTPS and SSL.



6.1.3 Data Features and Databases

This section covers interesting programming features related to the program library, blobs, drives, DPFS, tables, and
various types of databases, including SQL, NOSQL, and nonrelational databases, as well as special queuing services.

6.1.3.1 Program Library

Many efforts have been made to design a VM image library to manage images used in academic and commercial
clouds.

6.1.3.2 Blobs and Drives

The basic storage concept in clouds is blobs for Azure and S3 for Amazon.

These can be organized (approximately, as in directories) by containers in Azure.

In addition to a service interface for blobs and S3, one can attach storage “directly” to compute instances, as with Azure
drives and the Elastic Block Store for Amazon. This concept is similar to shared file systems such as Lustre used in TeraGrid.
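
The container and blob organization described above can be pictured with a small in-memory model. This is only an illustrative sketch: the `BlobStore` class and its method names are hypothetical and are not the Azure or Amazon S3 API. It shows how a name prefix gives the approximate, directory-like grouping inside a container.

```python
class BlobStore:
    """Toy in-memory model of containers holding named blobs (hypothetical)."""

    def __init__(self):
        self._containers = {}  # container name -> {blob name -> bytes}

    def put_blob(self, container, name, data):
        # Create the container on first use, then store the blob contents.
        self._containers.setdefault(container, {})[name] = bytes(data)

    def get_blob(self, container, name):
        return self._containers[container][name]

    def list_blobs(self, container, prefix=""):
        # A prefix filter gives an approximate, directory-like view.
        blobs = self._containers.get(container, {})
        return sorted(n for n in blobs if n.startswith(prefix))


store = BlobStore()
store.put_blob("images", "2011/vm-a.img", b"\x00\x01")
store.put_blob("images", "2011/vm-b.img", b"\x02")
store.put_blob("images", "2012/vm-c.img", b"\x03")
listing = store.list_blobs("images", prefix="2011/")
```

Here `listing` contains only the two blobs whose names start with `2011/`, even though the store itself is flat.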



6.1.3.3 DPFS

This covers the support of file systems such as Google File System (MapReduce), HDFS (Hadoop), and Cosmos
(Dryad) with compute-data affinity optimized for data processing.

It could be possible to link DPFS to basic blob and drive-based architecture, but it’s simpler to use DPFS as an
application-centric storage model with compute-data affinity and blobs and drives as the repository-centric view.
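
Compute-data affinity means running a task on a node that already stores the data it needs. The sketch below is a simple greedy illustration of that idea, not the scheduler used by Google File System, HDFS, or Cosmos; the function `assign_tasks`, the block ids, and the node names are all assumptions made for the example.

```python
def assign_tasks(block_locations, free_slots):
    """Assign each block's task to a node holding a replica when possible.

    block_locations: dict of block id -> list of nodes storing a replica
    free_slots: dict of node -> number of tasks it can still accept
    Returns a dict of block id -> (node, is_local).
    """
    slots = dict(free_slots)  # copy so the caller's dict is not mutated
    assignment = {}
    for block, replicas in block_locations.items():
        local = [n for n in replicas if slots.get(n, 0) > 0]
        if local:
            node, is_local = local[0], True
        else:
            # Fall back to any node with capacity; the data must cross the network.
            candidates = [n for n, s in slots.items() if s > 0]
            if not candidates:
                raise RuntimeError("no free slots left")
            node, is_local = candidates[0], False
        slots[node] -= 1
        assignment[block] = (node, is_local)
    return assignment


plan = assign_tasks(
    {"b1": ["n1", "n2"], "b2": ["n1"], "b3": ["n1"]},
    {"n1": 2, "n2": 1, "n3": 1},
)
```

In this run the first two tasks land on `n1`, which holds their blocks; the third finds `n1` full and runs remotely on `n2`. A less greedy policy could have placed `b1` on `n2` to keep all three tasks local.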



6.1.3.4 SQL and Relational Databases

Both Amazon and Azure clouds offer relational databases and it is straightforward for academic systems to offer a
similar capability unless there are issues of huge scale where, in fact, approaches based on tables and/or
MapReduce might be more appropriate [8].

6.1.3.5 Table and NOSQL Nonrelational Databases

A substantial number of important developments have occurred regarding simplified database structures—
termed “NOSQL” [9,10]—typically emphasizing distribution and scalability.

These are present in the three major clouds: BigTable [11] in Google, SimpleDB [12] in Amazon, and Azure Table
[13] for Azure.
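
The emphasis on distribution and scalability can be illustrated with a toy table whose rows are schemaless and spread over partitions by key. This is a sketch under stated assumptions, not the BigTable, SimpleDB, or Azure Table API: the `SimpleTable` class, its methods, and the partition-key and row-key names are hypothetical.

```python
import zlib


class SimpleTable:
    """Toy schemaless table sharded by partition key (hypothetical)."""

    def __init__(self, num_partitions=4):
        self._partitions = [{} for _ in range(num_partitions)]

    def _partition_for(self, partition_key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(partition_key.encode("utf-8")) % len(self._partitions)
        return self._partitions[index]

    def insert(self, partition_key, row_key, **properties):
        # Rows are schemaless: each one carries its own set of properties.
        part = self._partition_for(partition_key)
        part[(partition_key, row_key)] = dict(properties)

    def get(self, partition_key, row_key):
        return self._partition_for(partition_key).get((partition_key, row_key))

    def query_partition(self, partition_key):
        # Only one partition is scanned, never the whole table.
        part = self._partition_for(partition_key)
        return {rk: row for (pk, rk), row in part.items() if pk == partition_key}


table = SimpleTable()
table.insert("sensors", "s1", temp=21.5)
table.insert("sensors", "s2", temp=19.0, unit="C")
table.insert("users", "u1", name="Ada")
```

Because every lookup touches a single partition, partitions could in principle live on different machines, which is the property such simplified stores trade joins and fixed schemas for.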



6.1.3.6 Queuing Services

Both Amazon and Azure offer similar scalable, robust queuing services that are used to communicate between
the components of an application.

The messages are short (less than 8 KB) and have a Representational State Transfer (REST) service interface with
“deliver at least once” semantics.

Timeouts control the length of time a client is allowed to process a message. One can build a
similar approach (on the typically smaller and less challenging academic environments), basing it on
publish-subscribe systems such as ActiveMQ [20] or NaradaBrokering [21,22].
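
The “deliver at least once” semantics and the processing timeout can be sketched with a small in-memory queue. This is an illustration only, not the Amazon or Azure queue API: the `ToyQueue` class, its method names, and the injected clock are assumptions. A received message is hidden for the timeout period; if the client does not delete it in time, it becomes visible again and is delivered a second time.

```python
import itertools


class ToyQueue:
    """In-memory queue with visibility timeouts (hypothetical sketch)."""

    def __init__(self, visibility_timeout, clock):
        self._timeout = visibility_timeout
        self._clock = clock            # callable returning the current time
        self._messages = {}            # id -> [body, time when visible again]
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, self._clock()]
        return msg_id

    def receive(self):
        now = self._clock()
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message while the client processes it.
                entry[1] = now + self._timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        # The client acknowledges successful processing by deleting.
        self._messages.pop(msg_id, None)


now = [0.0]                      # fake clock so the example is deterministic
queue = ToyQueue(visibility_timeout=30, clock=lambda: now[0])
queue.send("resize image 42")
first = queue.receive()          # delivered, now hidden for 30 time units
hidden = queue.receive()         # nothing visible while it is being processed
now[0] = 31.0                    # the client crashed and never deleted it
second = queue.receive()         # the same message is delivered again
queue.delete(second[0])
empty = queue.receive()
```

Since a message can arrive more than once, consumers of such a queue are usually written so that processing the same message twice is harmless.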



6.1.4 Programming and Runtime Support

Programming and runtime support are desired to facilitate parallel programming and provide runtime support of
important functions in today’s grids and clouds.

6.1.4.1 Worker and Web Roles

The roles introduced by Azure provide nontrivial functionality, while preserving the better affinity support that is
possible in a nonvirtualized environment.

Worker roles are basic schedulable processes and are automatically launched. Note that explicit scheduling is
unnecessary in clouds for individual worker roles and for the “gang scheduling” supported transparently in
MapReduce.

Web roles provide an interesting approach to portals.



6.1.4.2 MapReduce
There has been substantial interest in “data parallel” languages largely aimed at loosely coupled computations
which execute over different data samples.
The language and runtime generate and provide efficient execution of “many task” problems that are well
known as successful grid applications.
However, MapReduce, summarized in Table 6.5, has several advantages over traditional implementations for
many task problems, as it supports dynamic execution, strong fault tolerance, and an easy-to-use high-level
interface.
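
The high-level interface can be illustrated with the classic word-count example. The sketch below runs in a single process and is not Hadoop or Google MapReduce; the function names `map_phase`, `shuffle`, `reduce_phase`, and `map_reduce` are chosen for this example. It shows the three steps a runtime would distribute: mapping each input independently, grouping intermediate pairs by key, and reducing each group.

```python
from collections import defaultdict


def map_phase(document):
    # Emit one (word, 1) pair per word occurrence.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Group all values emitted for the same key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Combine all values for one key into a single result.
    return key, sum(values)


def map_reduce(documents):
    intermediate = []
    for doc in documents:
        # Each map call is independent, so a runtime could place it on any worker.
        intermediate.extend(map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


counts = map_reduce(["the cloud scales", "the grid and the cloud"])
```

Because map calls share no state, a failed map task can simply be rerun on another worker, which is one reason the model lends itself to fault tolerance.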

6.1.4.4 SaaS

Services are used in a similar fashion in commercial clouds and most modern distributed systems. We expect
users to package their programs wherever possible, so no special support is needed to enable SaaS.

We desire a SaaS environment that provides many useful tools to develop cloud applications over large data sets.
In addition to the technical features, such as MapReduce, BigTable, EC2, S3, Hadoop, AWS, GAE, and
WebSphere2, we need protection features that may help us to achieve scalability, security, privacy, and
availability.



6.2 PARALLEL AND DISTRIBUTED PROGRAMMING PARADIGMS

This section defines a parallel and distributed program as a parallel program running on a set of computing engines or a distributed
computing system.

The term carries the notion of two fundamental terms in computer science: distributed computing system and parallel computing.

A distributed computing system is a set of computational engines connected by a network to achieve a common goal of running a job
or an application. A computer cluster or network of workstations is an example of a distributed computing system.

Parallel computing is the simultaneous use of more than one computational engine (not necessarily connected via a network) to run a
job or an application.

For instance, parallel computing may use either a distributed or a nondistributed computing system such as a multiprocessor platform.
Running a parallel program on a distributed computing system (parallel and distributed programming) has several advantages for both
users and distributed computing systems. From the users’ perspective, it decreases application response time; from the distributed
computing system’s standpoint, it increases throughput and resource utilization.
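
The basic pattern behind a parallel program (partition the work, compute the parts on separate engines, combine the results) can be sketched as follows. This is a minimal illustration, assuming a local thread pool as a stand-in for the computing engines; the function names are chosen for the example. In CPython, threads do not speed up CPU-bound work like this, so the sketch shows the structure only; on a real distributed system each chunk would go to a separate node.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    # Work done independently by one computing engine.
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    # Split the input into one contiguous chunk per worker.
    size = max(1, -(-len(data) // workers))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Combine the partial results into the final answer.
    return sum(partials)


data = list(range(1000))
result = parallel_sum_of_squares(data)
```

The result is the same as the sequential `sum(x * x for x in data)`; only the way the work is divided changes.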
