
PUNJAB TECHNICAL UNIVERSITY

Jalandhar, Punjab 590018

A Technical Project Report


On

CLOUD COMPUTING
Submitted by

AKANSHA TYAGI
90790308965
Guided by
Mr. Narendra Kirola
eRoads Technology

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

SWAMI PARAMANAND COLLEGE OF ENGINEERING AND


TECHNOLOGY
Jaulan-Kalan, Lalru,
Distt. Mohali, Punjab
2012-2013
SWAMI PARAMANAND COLLEGE OF ENGINEERING AND
TECHNOLOGY
(Affiliated to Punjab Technical University, Jalandhar)

A
Project Report

Department of Computer Science

Swami Paramanand College Of Engineering and Technology


Jaulan-Kalan, Lalru,
Distt. Mohali, Punjab
2012-2013

CERTIFICATE
Certified that the seminar work entitled CLOUD COMPUTING is a bona fide work carried out in the
eighth semester by Akansha Tyagi (SPCET-90790308965) in partial fulfilment of the requirements for the award of
Bachelor of Technology in Computer Science from Swami Paramanand College of
Engineering and Technology during the academic year 2012-2013. She carried out the seminar work
under guidance, and no part of this work has been submitted earlier for the award of any degree.

ACKNOWLEDGEMENT
I take this opportunity to express my deepest gratitude to my team leader and project guide
Mr. Narendra Kirola (Sr. Network Engineer) for his able guidance and support in this
phase of transition from an academic to a professional life. His support and valuable inputs
helped me immensely in completing this project.
I would also like to show my deep sense of gratitude to my team members Mr. Nitish
Singh, Ms. Payal Sharma, Ms. Richa Mishra and Mr. Deepak Kumar at Eroads
Technology, Noida, who helped me with encouragement, suggestions and technical
inputs, thus contributing either directly or indirectly to the various stages of the project.
I am also grateful to Navin Kumar (NOC, Eroads Technology) for providing me this great
opportunity of industrial training at Eroads Technology.
I extend my heartiest thanks to Er. Dev Kant Sharma (HOD, Computer Science and
Engineering, SPCET) for providing me the necessary help to undergo this industrial/project
training at Eroads Technology, Noida.
And last, but not the least, I would like to thank the staff at Eroads Technology for being so
cordial and cooperative throughout the period of my training.

AKANSHA TYAGI
COMPUTER SCIENCE

CONTENTS

1. Introduction
2. Cloud Computing
   2.1 Characteristics of cloud computing
3. Need for cloud computing
4. History
5. Enabling Technologies
   5.1 Cloud computing application architecture
   5.2 Server Architecture
   5.3 Map Reduce
   5.4 Google File System
   5.5 Hadoop
6. Cloud Computing Services
   6.1 Amazon Web Services
   6.2 Google App Engine
7. Cloud Computing in the Real World
   7.1 Time Machine
   7.2 IBM Google University Academic Initiative
   7.3 SmugMug
   7.4 Nasdaq
8. Conclusion
9. References

ABSTRACT
Cloud computing is the delivery of computing as a service rather than a product,
whereby shared resources, software, and information are provided to computers and
other devices as a metered service over a network (typically the Internet).
Cloud computing provides computation, software, data access, and storage resources
without requiring cloud users to know the location and other details of the computing
infrastructure.
End users access cloud-based applications through a web browser or a lightweight
desktop or mobile app, while the business software and data are stored on servers at a
remote location. Cloud application providers strive to give the same or better service
and performance as if the software programs were installed locally on end-user
computers.
At the foundation of cloud computing is the broader concept of infrastructure
convergence (or Converged Infrastructure) and shared services. This type of data center
environment allows enterprises to get their applications up and running faster, with
easier manageability and less maintenance, and enables IT to more rapidly adjust IT
resources (such as servers, storage, and networking) to meet fluctuating and
unpredictable business demand.


This overview gives the basic concept, defines the terms used in the industry, and outlines the general
architecture and applications of Cloud computing. It gives a summary of Cloud Computing and
provides a good foundation for understanding.
Keywords: Grid, Cloud, Computing
1. INTRODUCTION
Cloud Computing, to put it simply, means Internet Computing. The Internet is commonly
visualized as clouds; hence the term cloud computing for computation done through the Internet.
With Cloud Computing users can access database resources via the Internet from anywhere, for as
long as they need, without worrying about any maintenance or management of actual resources.
Besides, databases in the cloud are very dynamic and scalable. Cloud computing is unlike grid computing,
utility computing, or autonomic computing. In fact, it is a very independent platform in terms of
computing. The best example of cloud computing is Google Apps, where any application can be
accessed using a browser and it can be deployed on thousands of computers through the Internet.
1.1 WHAT IS CLOUD COMPUTING?
Cloud computing provides the facility to access shared resources and common infrastructure, offering
services on demand over the network to perform operations that meet changing business needs. The
location of physical resources and devices being accessed are typically not known to the end user. It
also provides facilities for users to develop, deploy and manage their applications on the cloud,
which entails virtualization of resources that maintains and manages itself.
Some generic examples include:
Amazon's Elastic Compute Cloud (EC2), offering computational services that enable people to use
CPU cycles without buying more computers
Storage services such as those provided by Amazon's Simple Storage Service (S3)
Companies like Nirvana allowing

1.2. SOFTWARE AS A SERVICE (SAAS)


SaaS is a model of software deployment where an application is hosted as a service provided to
customers across the Internet. SaaS is generally used to refer to business software rather than consumer
software, which falls under Web 2.0. By removing the need to install and run an application on a user's
own computer, it is seen as a way for businesses to get the same benefits as commercial software with a
smaller cost outlay.

1.3. CLOUD STORAGE



Over time many big Internet-based companies (Amazon, Google) have come to realise that only a
small amount of their data storage capacity is being used. This has led to the renting out of space and
the storage of information on remote servers, or "clouds".
Data Cloud: Along with services, the cloud will host data. There has been some discussion of this
being a potentially useful notion, possibly aligned with the Semantic Web, though it could result in data
becoming undifferentiated.
1.4. CLOUD COMPUTING ARCHITECTURE
Cloud computing architecture, just like any other system, is categorized into two main sections: Front
End and Back End.
Front End can be the end user or client or any application (i.e. a web browser etc.) which is using cloud
services. Back End is the network of servers with any computer program and data storage system. It is
usually assumed that the cloud contains infinite storage
capacity for any software available in the market.
The cloud has different applications that are hosted on their own dedicated server farms.
The cloud has a centralized server administration system. The centralized server administers the
system, balances client supply, adjusts demands, monitors traffic and avoids congestion. This server
follows protocols, commonly known as middleware. Middleware controls the communication among
the computers in the cloud network.
Cloud architecture runs on a very important assumption, which is mostly true: the
demand for resources is not always consistent from client to cloud.
Because of this, the servers of the cloud are unable to run at their full capacity. To avoid this
scenario, a server virtualization technique is applied. In server virtualization, all physical
servers are virtualized and they run multiple servers with either the same or different applications.

1.5. CHARACTERISTICS OF CLOUD COMPUTING


Cloud computing typically entails:
High scalability
Cloud environments enable servicing of business requirements for larger audiences, through high
scalability
Agility
The cloud works in the distributed mode environment. It shares resources among users
and tasks, while improving efficiency and agility (responsiveness)
High availability and reliability
Availability of servers is high and more reliable as the chances of infrastructure
failure are minimal
Multi-sharing
With the cloud working in a distributed and shared mode, multiple users and applications
can share the same underlying infrastructure.
2. Cloud Computing

A definition for cloud computing can be given as an emerging computer


paradigm where data and services reside in massively scalable data centres in the
cloud and can be accessed from any connected devices over the internet. Cloud computing is a way of
providing various services on virtual machines allocated on top of a large physical machine pool which
resides in the cloud. Cloud computing comes into focus only when we think about what IT has always
wanted: a way to increase capacity or add different capabilities to the current setting on the fly
without investing in new infrastructure, training new personnel or licensing new software. Here "on the
fly" and "without investing or training" become the keywords in the current situation. But cloud
computing offers a better solution. We have lots of compute power and storage capabilities residing in
the distributed environment of the cloud. What cloud computing does is to harness the capabilities of
these resources and make available these resources as a single entity which can be changed to meet the
current needs of the user. The basis of cloud computing is to create a set of virtual servers on the
available vast resource pool and
give it to the clients. Any web enabled device can be used to access the resource through the virtual
servers. Based on the computing needs of the client, the infrastructure allotted to the client can be
scaled up or down.
From a business point of view, cloud computing is a method to address the scalability and availability
concerns for large scale applications which involves lesser overhead. Since the resource allocated to
the client can be varied based on the needs of the client and can be done without any fuss, the overhead
is very low.
One of the key concepts of cloud computing is that processing of 1000 times
the data need not be 1000 times harder. As and when the amount of data increases, the
cloud computing services can be used to manage the load effectively and make the
processing tasks easier. In the era of enterprise servers and personal computers, cloud computing is
basically an Internet-based network made up of large numbers of servers, mostly based on open standards, modular and inexpensive. Clouds contain vast amounts of
information and provide a variety of services.
As a metaphor for the Internet, "the cloud" is a familiar cliché, but when combined with "computing",
the meaning gets bigger and fuzzier. Some analysts and vendors define cloud computing narrowly as
an updated version of utility computing: basically virtual servers available over the Internet. Others go
very broad, arguing anything you consume outside the firewall is "in the cloud", including
conventional outsourcing.
Cloud computing comes into focus only when you think about what we always need: a way to
increase capacity or add capabilities on the fly without investing in new infrastructure, training new
personnel, or licensing new software. Cloud computing encompasses any subscription-based or pay-per-use service that, in real time over the Internet, extends ICT's existing capabilities.
Cloud computing is at an early stage, with a motley crew of providers large and small delivering
a slew of cloud-based services, from full-blown applications to storage services to spam filtering. Yes,
utility-style infrastructure providers are part of the mix, but so are SaaS (software as a service)
providers such as Salesforce.com. Today, for the most part, IT must plug into cloud-based services
individually, but cloud computing aggregators and integrators are already emerging. The Internet is
often represented as a cloud and the term cloud computing arises from that analogy.
Accenture defines cloud computing as the dynamic provisioning of IT capabilities (hardware,
software, or services) from third parties over a network. McKinsey says that clouds are hardware-based
services offering compute, network and storage capacity where: hardware management is highly
abstracted from the buyer; buyers incur infrastructure costs as variable OPEX [operating
expenditures]; and infrastructure capacity is highly elastic (up or down).
The cloud model differs from traditional outsourcing in that customers do not hand over their own IT
resources to be managed. Instead they plug into the cloud, treating it as they would an internal data
center or computer providing the same functions.
Large companies can afford to build and expand their own data centers but small- to medium-size
enterprises often choose to house their IT infrastructure in someone else's facility. A collocation center
is a type of data center where multiple customers locate network, server and storage assets, and
interconnect to a variety of telecommunications and other network service providers with a minimum
of cost and complexity. A selection of companies in the collocation and cloud arena is presented in
Table 1.
Amazon has a head start but well known companies such as Microsoft, Google, and Apple have joined
the fray.
Although not all the companies selected for Table 1 would agree on the definitions given in this article,
it is generally supposed that there are three basic types of cloud computing: Infrastructure as a Service
(IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). In IaaS, CPUs, grids or clusters,
virtualized servers, memory, networks, storage and systems software are delivered as a service.
Perhaps the best known examples are Amazon's Elastic Compute Cloud (EC2) and Simple Storage
Service (S3), but traditional IT vendors such as IBM, and telecoms providers such as AT&T and
Verizon are also offering solutions. Services are typically charged by usage and can be scaled
dynamically, i.e. capacity can be
increased or decreased more or less on demand.
PaaS provides virtualized servers on which users can run applications, or develop new ones, without
having to worry about maintaining the operating systems, server hardware, load balancing or
computing capacity. Well known examples include Microsoft's Azure and Salesforce's Force.com.
Microsoft Azure provides database and platform services starting at $0.12 per hour for compute
infrastructure; $0.15 per gigabyte for storage; and $0.10 per 10,000 transactions. For SQL Azure, a
cloud database, Microsoft is charging $9.99 for a Web Edition, which comprises up to a 1 gigabyte
relational database; and $99.99 for a Business Edition, which holds up to a 10 gigabyte relational
database. For .NET Services, a set of Web based developer tools for building cloud-based applications,
Microsoft is charging $0.15 per 100,000 message operations.
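As a rough illustration of how such metered pricing adds up, the short Python sketch below estimates a monthly bill; the rates are simply the figures quoted above (long since superseded) and the workload numbers are assumptions chosen for the example.

    # Rough monthly cost estimate using the illustrative PaaS prices quoted above.
    # The rates and the workload figures are assumptions for demonstration only.
    COMPUTE_PER_HOUR = 0.12      # $ per hour of compute
    STORAGE_PER_GB = 0.15        # $ per GB per month
    PER_10K_TRANSACTIONS = 0.10  # $ per 10,000 storage transactions

    hours = 24 * 30              # one instance running for a 30-day month
    storage_gb = 50              # assumed storage footprint
    transactions = 2_000_000     # assumed monthly transaction count

    cost = (hours * COMPUTE_PER_HOUR
            + storage_gb * STORAGE_PER_GB
            + (transactions / 10_000) * PER_10K_TRANSACTIONS)
    print(f"Estimated monthly cost: ${cost:.2f}")  # -> Estimated monthly cost: $113.90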
SaaS is software that is developed and hosted by the SaaS vendor and which the end user accesses over
the Internet. Unlike traditional applications that users install on their computers or servers, SaaS
software is owned by the vendor and runs on computers in the vendor's data center (or a collocation
facility). Broadly speaking, all customers of a SaaS vendor use the same software: these are one-size-fits-all solutions. Well known examples are Salesforce.com, Google's Gmail and Apps, instant
messaging from AOL, Yahoo and Google, and Voice-over Internet Protocol (VoIP) from Vonage and
Skype.

Pros and Cons of Cloud Computing
The great advantage of cloud computing is elasticity: the ability
to add capacity or applications almost at a moment's notice. Companies buy exactly the amount of
storage, computing power, security and other IT functions that they need from specialists in data-center
computing. They get sophisticated data center services on demand, in only the amount they need and
can pay for, at service levels set with the
vendor, with capabilities that can be added or subtracted at will.
The metered cost, pay-as-you-go approach appeals to small- and medium-sized enterprises; little or no
capital investment and maintenance cost is needed. IT is remotely managed and maintained, typically
for a monthly fee, and the company can let go of plumbing concerns. Since the vendor has many
customers, it can lower the per-unit cost to each customer. Larger companies may find it easier to
manage collaborations in the cloud, rather than having to make holes in their firewalls for contract
research organizations. SaaS deployments usually take less time than in-house ones, upgrades are
easier, and users are always using the most recent version of the application. There may be fewer bugs
because having only one version of the software reduces complexity.
This may all sound very appealing but there are downsides. In the cloud you may not have the kind of
control over your data or the performance of your applications that you need, or the ability to audit or
change the processes and policies under which users must work. Different parts of an application
might
be in many places in the cloud. Complying with federal regulations such as Sarbanes-Oxley, or an FDA
audit, is extremely difficult. Monitoring and maintenance tools are immature. It is hard to get metrics
out of the cloud and general management of the work is not simple.
There are systems management tools for the cloud environment but they may not integrate with
existing system management tools, so you are likely to need two systems. Nevertheless, cloud
computing may provide enough benefits to compensate for the inconvenience of two tools.
Cloud customers may risk losing data by having them locked into proprietary formats and may lose
control of data because tools to see who is using them or who can view them are inadequate. Data loss
is a real risk. In October 2009, 1 million US users of the T-Mobile Sidekick mobile phone and emailing
device lost data as a result of server failure at Danger, a company recently acquired by Microsoft. Bear
in mind, though, that it is easy to underestimate risks associated with the current environment while
overestimating the risk of a new one. Cloud computing is not risky for every system. Potential users
need to evaluate security measures such as firewalls, and encryption techniques and make sure that
they will
have access to data and the software or source code if the service provider goes out of business.
It may not be easy to tailor service-level agreements (SLAs) to the specific needs of a business.
Compensation for downtime may be inadequate and SLAs are unlikely to cover concomitant damages,
but not all applications have stringent uptime requirements. It is sensible to balance the cost of
guaranteeing internal uptime against the advantages of opting for the cloud. It could be that your own
IT organization is not as sophisticated as it might seem.
Calculating cost savings is also not straightforward. Having little or no capital investment may actually
have tax disadvantages. SaaS deployments are cheaper initially than in-house installations and future
costs are predictable; after 3-5 years of monthly fees, however, SaaS may prove more expensive
overall.
Large instances of EC2 are fairly expensive, but it is important to do the mathematics correctly and
make a fair estimate of the cost of an on-premises (i.e., in-house) operation.

Standards are immature and things change very rapidly in the cloud. All IaaS and SaaS providers use
different technologies and different standards. The storage infrastructure behind Amazon is different
from that of the typical data center (e.g., big Unix file systems). The Azure storage engine does not use
a standard relational database; Google's App Engine does not support an SQL database. So you cannot
just move applications to the cloud and expect them to run. At least as much work is involved in
moving an application to the cloud as is involved in moving it from an existing server to a new one.
There is also the issue of employee skills: staff may need retraining and they may resent a change to
the cloud and
fear job losses. Last but not least, there are latency and performance issues. The Internet connection
may add to latency or limit bandwidth. (Latency, in general, is the period of time that one component
in a system is wasting time waiting for another component. In networking, it is the amount of time it
takes a packet to travel from source to destination.) In future, programming models exploiting
multithreading may hide latency.
Nevertheless, the service provider, not the scientist, controls the hardware, so unanticipated
sharing and reallocation of machines may affect run times. Interoperability is limited. In general, SaaS
solutions work best for non-strategic, non-mission-critical processes that are simple and standard and
not highly integrated with other business systems. Customized applications may demand an in-house
solution, but SaaS makes sense for applications that have become commoditized, such as reservation
systems in the travel industry.
Virtualization of computers or operating systems hides the physical characteristics of a computing
platform from users; instead it shows another abstract computing platform. A hypervisor is a piece of
virtualization software that allows multiple operating systems to run on a host computer concurrently.
Virtualization providers include VMware, Microsoft, and Citrix Systems (see Table 1). Virtualization is
an enabler of cloud computing.
Recently some vendors have described solutions that emulate cloud computing on private networks
referring to these as private or internal clouds (where public or external cloud describes cloud
computing in the traditional mainstream sense). Private cloud products claim to deliver some of the
cloud's benefits while connecting customer data centers to those of external cloud providers. It has been reported
that Eli Lilly wants to benefit from both internal and external clouds and that Amylin is looking at
private cloud VMware as a complement to EC2. Other experts, however, are skeptical: one has even
gone as far as to describe private clouds as absolute rubbish.
Platform Computing has recently launched a cloud management system, Platform ISF, enabling
customers to manage workload across both virtual and physical environments and support multiple
hypervisors and operating systems from a single interface. VMware, the market leader in virtualization
technology, is moving into cloud technologies in a big way, with vSphere 4. The company is building
a huge partner network of service providers and is also releasing a vCloud API. VMware wants
customers to build a series of virtual data centers, each tailored to meet different requirements, and
then have the ability to move workloads in the virtual data centers to the infrastructure provided by
cloud vendors.
Cisco, EMC and VMware have formed a new venture called Acadia. Its strategy for private cloud
computing is based on Cisco's servers and networking, VMware's server virtualization and EMC's
storage. (Note, by the way, that EMC owns nearly 85% of VMware.) Other vendors, such as Google,

disagree with VMware's emphasis on private clouds; in return VMware says Google's online
applications are not ready for the enterprise.
Applicability
Not everyone agrees, but McKinsey has concluded as follows. Clouds already make sense for many
small and medium-size businesses, but technical, operational and financial hurdles will need to be
overcome before clouds will be used extensively by large public and private enterprises. Rather than
create unrealizable expectations for internal clouds, CIOs should focus now on the immediate
benefits of virtualizing server storage, network operations, and other critical building blocks.
They recommend that users should develop an overall strategy based on solid
business cases not cloud for the sake of cloud; use modular design in all new software to minimize
costs when it comes time to migrate to the cloud; and set up a Cloud CIO Council to advise industry.
Applications in the Pharmaceutical Industry
In the pharmaceutical sector, where large amounts of
sensitive data are currently kept behind protective firewalls, security is a real concern, as is policing
individual researchers' access to the cloud.
Nevertheless, cheminformatics vendors are starting to look at cloud options, especially in terms of
Software as a Service (SaaS) and hosted informatics. In bioinformatics and number-crunching, the
cloud has distinct advantages. EC2 billing is typically hours times number of CPUs, so, as an over-generalization, the cost of 1 CPU for 1000 hours is the same as the cost of 1000 CPUs for 1 hour. This
makes cloud computing appealing for speedy answers to complex calculations. Over the past two
years, new DNA sequencing technology has emerged allowing a much more comprehensive view of
biological systems at the genetic level. This so-called next-generation sequencing has increased by
orders of magnitude the already daunting deluge of laboratory data, resulting in an immense IT
challenge. Could the cloud
provide a solution?
An unnamed pharmaceutical company found that processing BLAST databases and query jobs was
time consuming on its internal grid and approached Cycle Computing
about running BLAST and other applications in the cloud. After the customer had approved Cycle's
security model, Cycle built a processing pipeline for BLAST that provides more than 7000 public
databases from the National Center for Biotechnology Information (NCBI), Ensembl, and the
Information Sciences Institute of the University of Southern California (ISI) that are updated weekly.
The CycleCloud BLAST service is now publicly available to all users.
2.1. Characteristics of Cloud Computing
1. Self Healing
Any application or any service running in a cloud computing environment has the property of
self-healing. In case of failure of the application, there is always a hot backup of the application ready
to take over without disruption. There are multiple copies of the same
application, each copy updating itself regularly so that at times of failure there is at least one copy of
the application which can take over without even the slightest change in its running state.
2. Multi-tenancy
With cloud computing, any application supports multi-tenancy - that is, multiple tenants at the same instant
of time. The system allows several customers to share the infrastructure allotted to them without any of
them being aware of the sharing. This is done by virtualizing the

servers on the available machine pool and then allotting the servers to multiple users. This is done in
such a way that the privacy of the users or the security of their data is not compromised.
3. Linearly Scalable
Cloud computing services are linearly scalable. The system is able to break down the workloads into
pieces and service them across the infrastructure. An exact idea of linear scalability can be obtained from
the fact that if one server is able to process, say, 1000 transactions per second, then two servers can process 2000 transactions per second.
4. Service-oriented
Cloud computing systems are all service oriented - i.e. the systems are such that they are created
out of other discrete services. Many such discrete services are combined together to form this service. This allows re-use of the different services that are
available and that are being created. Using the
services that were just created, other such services can be created.
5. SLA-Driven
Usually businesses have agreements on the amount of services. Scalability and availability issues
cause clients to break these agreements. But cloud computing services are SLA driven such that when
the system experiences peaks of load, it will automatically adjust itself so as to comply with the
service-level agreements.
The services will create additional instances of the applications on more servers so that the load can be
easily managed.
6. Virtualized
The applications in cloud computing are fully decoupled from the underlying hardware. The cloud
computing environment is a fully virtualized environment.
7. Flexible
Another feature of the cloud computing services is that they are flexible. They can be used to serve a
large variety of workload types varying from small loads of a small consumer application to very heavy loads of a commercial
application.

3. Need for cloud computing



What is Cloud computing?


Cloud computing is Internet- ("CLOUD-") based development and use of computer
technology ("COMPUTING")
Cloud computing is a general term for anything that involves delivering hosted services over the
Internet.
It is used to describe both a platform and type of application.
Cloud computing also describes applications that are extended to be accessible through the
Internet.
These cloud applications use large data centers and powerful servers that host Web
applications and Web services.
Anyone with a suitable Internet connection and a standard browser can access a cloud
application.

4. History

The Cloud is a term with a long history in telephony which has, in the past decade, been adopted as a
metaphor for Internet-based services, with a common depiction in network diagrams as a cloud outline.
The underlying concept dates back to 1960 when John McCarthy opined that "computation may
someday be organized as a public utility"; indeed it shares characteristics with service bureaus which
date back to the 1960s. The term cloud had already come into commercial use in the early 1990s to
refer to large ATM networks.
By the turn of the 21st century, the term "cloud computing" had started to appear, although most of the
focus at this time was on Software as a service (SaaS).
In 1999, Salesforce.com was established by Marc Benioff, Parker Harris, and their colleagues. They applied
many technologies of consumer web sites like Google and Yahoo! to business applications. They also
provided the concept of "On demand" and "SaaS" with their real business and successful customers.
The key for SaaS is being customizable by the customer alone or with a small amount of help. Flexibility
and speed of application development were widely welcomed and accepted by
business users.
IBM extended these concepts in 2001, as detailed in the Autonomic Computing Manifesto -- which
described advanced automation techniques such as self-monitoring, self-healing, self-configuring, and
self-optimizing in the management of complex IT systems with heterogeneous storage, servers,
applications, networks, security mechanisms, and other system elements that can be virtualized across
an enterprise.
Amazon.com played a key role in the development of cloud computing by modernizing their data
centers after the dot-com bubble and, having found that the new cloud architecture resulted in
significant internal efficiency improvements, providing access to their systems by way of Amazon Web
Services in 2006 on a utility computing basis.
2007 saw increased activity, including Google, IBM and a number of universities embarking on
a large-scale cloud computing research project, around the time the term started gaining popularity
in the mainstream press. It was a hot topic by mid-2008, and numerous cloud computing events had been scheduled.

5. Enabling Technologies
5.1 Cloud computing application architecture

Cloud Computing Architecture

Cloud architecture, the systems architecture of the software systems involved in the delivery of cloud
computing, comprises hardware and software designed by a cloud architect who typically works for a
cloud integrator. It typically involves multiple cloud components communicating with each other over
application programming interfaces, usually web services.
This closely resembles the UNIX philosophy of having multiple programs doing one thing well and
working together over universal interfaces. Complexity is controlled and the resulting systems are
more manageable than their monolithic counterparts.
Cloud architecture extends to the client, where web browsers and/or software applications access
cloud applications.
Cloud storage architecture is loosely coupled, where metadata operations are centralized, enabling
the data nodes to scale into the hundreds, each independently delivering data to applications or users.

5.2. Server Architecture


Cloud computing makes use of a large physical resource pool in the cloud. As


said above, cloud computing services and applications make use of virtual server
instances built upon this resource pool. There are two applications which help in
managing the server instances, the resources and also the management of the
resources by these virtual server instances. One of these is the Xen hypervisor which
provides an abstraction layer between the hardware and the virtual OS so that the
distribution of the resources and the processing is well managed. Another application that is widely
used is the Enomalism server management system which is used for
management of the infrastructure platform.
When Xen is used for virtualization of the servers over the infrastructure, a
thin software layer known as the Xen hypervisor is inserted between the server's
hardware and the operating system. This provides an abstraction layer that allows
each physical server to run one or more "virtual servers," effectively decoupling the
operating system and its applications from the underlying physical server. The Xen
hypervisor is a unique open source technology, developed collaboratively by the Xen
community and engineers at over 20 of the most innovative data center solution
vendors, including AMD, Cisco, Dell, HP, IBM, Intel, Mellanox, Network Appliance,
Novell, Red Hat, SGI, Sun, Unisys, VERITAS, Voltaire, and Citrix. Xen is licensed
under the GNU General Public License (GPL2) and is available at no charge in both
source and object format. The Xen hypervisor is also exceptionally lean-- less than
50,000 lines of code. That translates to extremely low overhead and near-native
performance for guests. Xen re-uses existing device drivers (both closed and open
source) from Linux, making device management easy. Moreover Xen is robust to
device driver failure and protects both guests and the hypervisor from faulty or
malicious drivers.
The Enomalism virtualized server management system is a complete virtual
server infrastructure platform. Enomalism helps in an effective management of the
resources. Enomalism can be used to tap into the cloud just as you would into a
remote server. It brings together all the features such as deployment planning, load
balancing, resource monitoring, etc. Enomalism is an open source application. It has a

very simple and easy to use web-based user interface. It has a modular architecture
which allows for the creation of additional system add-ons and plugins. It supports
one-click deployment of distributed or replicated applications on a global basis. It supports the
management of various virtual environments including KVM/QEMU,
Amazon EC2, Xen, OpenVZ, Linux Containers and VirtualBox. It has fine-grained
user permissions and access privileges.
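For illustration only, the following Python sketch shows one common way to inspect Xen guests programmatically using the libvirt bindings; it is not code from Enomalism or Xen itself, and the connection URI and the presence of a local Xen host are assumptions.

    # Minimal sketch: listing Xen guest domains via the libvirt Python bindings.
    # Assumes libvirt-python is installed and a local Xen hypervisor is running.
    import libvirt

    conn = libvirt.open("xen:///system")   # connection URI is an assumption
    try:
        for dom in conn.listAllDomains():
            state, _reason = dom.state()
            print(f"{dom.name()}: state={state}, max memory={dom.maxMemory()} KiB")
    finally:
        conn.close()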
5.3. Map Reduce
Map Reduce is a software framework developed at Google in 2003 to support
parallel computations over large (multiple petabyte) data sets on clusters of
commodity computers. This framework is largely taken from map and reduce
functions commonly used in functional programming, although the actual semantics
of the framework are not the same. It is a programming model and an associated
implementation for processing and generating large data sets. Many of the real world
tasks are expressible in this model. MapReduce implementations have been written in
C++, Java and other languages.
Programs written in this functional style are automatically parallelized and
executed on the cloud. The run-time system takes care of the details of partitioning
the input data, scheduling the program's execution across a set of machines, handling
machine failures, and managing the required inter-machine communication. This
allows programmers without any experience with parallel and distributed systems to
easily utilize the resources of a large distributed system.
The computation takes a set of input key/value pairs, and produces a set of
output key/value pairs. The user of the MapReduce library expresses the computation
as two functions: Map and Reduce.
Map, written by the user, takes an input pair and produces a set of
intermediate key/value pairs. The MapReduce library groups together all intermediate
values associated with the same intermediate key I and passes them to the Reduce

function.
The Reduce function, also written by the user, accepts an intermediate key I and a set of values for that
key. It merges together these values to form a possibly smaller set of values. Typically just zero or one
output value is produced per Reduce invocation. The intermediate values are supplied to the user's
reduce function via an iterator. This allows us to handle lists of values that are too large to fit in
memory.
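The canonical illustration is word counting. The sketch below is a minimal single-machine Python rendering of the Map and Reduce functions just described; a real MapReduce implementation would partition the input and shuffle the intermediate pairs across many machines, which the small run_mapreduce stand-in here only imitates.

    # Conceptual word count in the MapReduce style: map emits intermediate
    # (word, 1) pairs, reduce sums the values grouped under each word.
    from collections import defaultdict

    def map_fn(document_name, text):
        for word in text.split():
            yield word, 1

    def reduce_fn(word, counts):
        yield word, sum(counts)

    def run_mapreduce(inputs):
        # Stand-in for the framework: group intermediate pairs by key,
        # then hand each key and its value list to the reduce function.
        intermediate = defaultdict(list)
        for name, text in inputs:
            for key, value in map_fn(name, text):
                intermediate[key].append(value)
        results = {}
        for key, values in intermediate.items():
            for out_key, out_value in reduce_fn(key, values):
                results[out_key] = out_value
        return results

    print(run_mapreduce([("doc1", "the cloud is the future"), ("doc2", "the cloud")]))
    # {'the': 3, 'cloud': 2, 'is': 1, 'future': 1}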
Map Reduce achieves reliability by parcelling out a number of operations on the set of data to each node
in the network; each node is expected to report back periodically with completed work and status
updates. If a node falls silent for longer than that interval, the master node records the node as dead, and
sends out the node's assigned work to other nodes. Individual operations use atomic operations for
naming file outputs as a double check to ensure that there are not parallel conflicting threads running;
when files are renamed, it is possible to also copy them to another name in addition to the name of the
task (allowing for side-effects).

5.4. Google File System


Google File System (GFS) is a scalable distributed file system developed by Google for data intensive
applications. It is designed to provide efficient, reliable access to data using large clusters of
commodity hardware. It provides fault tolerance while running on inexpensive commodity hardware,
and it delivers high aggregate performance to a large number of clients.
Files are divided into chunks of 64 megabytes, which are only extremely rarely overwritten, or
shrunk; files are usually appended to or read. It is also designed and optimized to run on computing
clusters, the nodes of which consist of cheap, "commodity" computers, which means precautions must
be taken against the high failure rate of individual nodes and the subsequent data loss. Other design
decisions select for high data throughputs, even when it comes at the cost of latency.
The nodes are divided into two types: one Master node and a large number of Chunkservers.
Chunkservers store the data files, with each individual file broken up into fixed size chunks (hence the
name) of about 64 megabytes, similar to clusters or sectors in regular file systems. Each chunk is
assigned a unique 64-bit label, and logical mappings of files to constituent chunks are maintained.
Each chunk is replicated several times throughout the network, with the minimum being three, but
even more for files that have high demand or need more redundancy.
The Master server doesn't usually store the actual chunks, but rather all the metadata associated with
the chunks, such as the tables mapping the 64-bit labels to chunk locations and the files they make up,
the locations of the copies of the chunks,
what processes are reading or writing to a particular chunk, or taking a "snapshot" of the chunk
pursuant to replicating it (usually at the instigation of the Master server, when, due to node failures, the
number of copies of a chunk has fallen beneath the set number). All this metadata is kept current by the
Master server periodically receiving updates from each chunk server ("Heart-beat messages").
Permissions for modifications are handled by a system of time-limited, expiring "leases", where the
Master server grants permission to a process for a finite period of time during which no other process
will be granted permission by the Master server to modify the chunk. The modified chunkserver, which
is always the primary chunk holder, then propagates the changes to the chunkservers with the backup
copies. The changes are not saved until all chunkservers acknowledge, thus guaranteeing the
completion and atomicity of the operation. Programs access the chunks by first querying the Master
server for the locations of the desired chunks; if the chunks are not being operated on (if there are no
outstanding leases), the Master replies with the locations, and the program then contacts and receives
the data from the chunkserver directly. As opposed to many filesystems, it's not implemented in the
kernel of an Operating System but accessed through a library to avoid overhead.
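GFS itself is internal to Google, but the read path described above can be sketched in hypothetical form: the client asks the Master only for metadata and then fetches the bytes from a chunkserver directly. Every name in the sketch below (Master, ChunkServer, lookup, read_chunk) is illustrative rather than Google's actual API.

    # Hypothetical, in-memory sketch of the GFS read path described above.
    CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks

    class ChunkServer:
        def __init__(self):
            self.chunks = {}                      # chunk handle -> bytes
        def read_chunk(self, handle, offset, length):
            return self.chunks[handle][offset:offset + length]

    class Master:
        def __init__(self):
            self.file_map = {}                    # (filename, chunk index) -> (handle, replicas)
        def lookup(self, filename, chunk_index):
            return self.file_map[(filename, chunk_index)]

    def read(master, filename, offset, length):
        chunk_index = offset // CHUNK_SIZE
        handle, replicas = master.lookup(filename, chunk_index)   # metadata from the master
        return replicas[0].read_chunk(handle, offset % CHUNK_SIZE, length)  # data from a chunkserver

    # Tiny demonstration with one chunk replicated on one server.
    server = ChunkServer()
    server.chunks["chunk-0001"] = b"hello from the cloud"
    master = Master()
    master.file_map[("/logs/day1", 0)] = ("chunk-0001", [server])
    print(read(master, "/logs/day1", 6, 4))   # b'from'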
5.5. Hadoop
Hadoop is a framework for running applications on large cluster built of
commodity hardware. The Hadoop framework transparently provides applications

both reliability and data motion. Hadoop implements the computation paradigm
named MapReduce which was explained above. The application is divided into many
small fragments of work, each of which may be executed or re-executed on any node
in the cluster. In addition, it provides a distributed file system that stores data on the
compute nodes, providing very high aggregate bandwidth across the cluster. Both
MapReduce and the distributed file system are designed so that the node failures are
automatically handled by the framework. Hadoop has been implemented making use
of Java. In Hadoop, the combination of all the JAR files and classes needed to run
a MapReduce program is called a job. All of these components are themselves
collected into a JAR which is usually referred to as the job file. To execute a job, it is
submitted to a jobTracker and then executed.
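Besides native Java jobs, Hadoop also offers a streaming mode in which the map and reduce steps are ordinary executables that read standard input and write standard output. The pair of Python scripts below is a hedged word-count sketch in that style; the exact command used to submit them (the hadoop streaming JAR path, input and output directories) varies by installation and is not shown.

    # mapper.py -- Hadoop Streaming mapper: emit "word<TAB>1" for every word on stdin.
    import sys

    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

    # reducer.py -- Hadoop Streaming reducer: input arrives sorted by key,
    # so consecutive lines with the same word can be summed.
    import sys

    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                print(f"{current_word}\t{current_count}")
            current_word, current_count = word, int(count)
    if current_word is not None:
        print(f"{current_word}\t{current_count}")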
Tasks in each phase are executed in a fault-tolerant manner. If node(s) fail in
the middle of a computation the tasks assigned to them are re-distributed among the
remaining nodes. Since we are using MapReduce, having many map and reduce tasks
enables good load balancing and allows failed tasks to be re-run with smaller runtime
overhead.
The Hadoop MapReduce framework has a master/slave architecture. It has a
single master server or a jobTracker and several slave servers or taskTrackers, one per
node in the cluster. The jobTracker is the point of interaction between the users and
the framework. Users submit jobs to the jobTracker, which puts them in a queue of
pending jobs and executes them on a first-come, first-served basis. The jobTracker
manages the assignment of MapReduce jobs to the taskTrackers. The taskTrackers
execute tasks upon instruction from the jobTracker and also handle data motion
between the map and reduce phases of the MapReduce job.
Hadoop is a framework which has received wide industry adoption. Hadoop

is used along with other cloud computing technologies like the Amazon services so as
to make better use of the resources. There are many instances where Hadoop has been
used. Amazon makes use of Hadoop for processing millions of sessions which it uses
for analytics. This is made use of in a cluster which has about 1 to 100 nodes.
Facebook uses Hadoop to store copies of internal logs and dimension data sources and
use it as a source for reporting/analytics and machine learning. The New York Times
made use of Hadoop for large scale image conversions. Yahoo uses Hadoop to
support research for advertisement systems and web searching tools.

6. Cloud Computing Services


Even though cloud computing is a pretty new technology, there are many
companies offering cloud computing services. Different companies like Amazon,
Google, Yahoo, IBM and Microsoft are all players in the cloud computing services
industry. But Amazon is the pioneer in the cloud computing industry with services
like EC2 (Elastic Compute Cloud) and S3 (Simple Storage Service) dominating the

industry. Amazon has expertise in this industry and has a small advantage over the
others because of this. Microsoft has good knowledge of the fundamentals of cloud
science and is building massive data centers. IBM, the king of business computing
and traditional supercomputers, teams up with Google to get a foothold in the clouds.
Google is far and away the leader in cloud computing with the company itself built
from the ground up on hardware.
6.1. Amazon Web Services
The Amazon Web Services is the set of cloud computing services offered by
Amazon. It involves four different services. They are Elastic Compute Cloud (EC2),
Simple Storage Service (S3), Simple Queue Service (SQS) and Simple Database
Service (SDB).
1. Elastic Compute Cloud (EC2)
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute
capacity in the cloud. It is designed to make web-scale computing
easier for developers.
It provides on-demand processing power.
Amazon EC2's simple web service interface allows you to obtain and configure capacity with minimal
friction. It provides you with complete control of your computing resources and lets you run on
Amazon's proven computing environment. Amazon EC2 reduces the time required to obtain and boot
new server instances to minutes, allowing you to quickly scale capacity,
both up and down, as your computing requirements change. Amazon EC2 changes the economics of
computing by allowing you to pay only for capacity that you actually use. Amazon EC2 provides
developers the tools to build failure resilient applications and isolate themselves from common failure
scenarios.
Amazon EC2 presents a true virtual computing environment, allowing
you to use web service interfaces to requisition machines for use, load them
with your custom application environment, manage your network's access
permissions, and run your image using as many or few systems as you desire.
To set up an Amazon EC2 node we have to create an EC2 node
configuration which consists of all our applications, libraries, data and
associated configuration settings. This configuration is then saved as an AMI
(Amazon Machine Image). There are also several stock instances of Amazon
AMIs available which can be customized and used. We can then start,
terminate and monitor as many instances of the AMI as needed.
Amazon EC2 enables you to increase or decrease capacity within
minutes. You can commission one, hundreds or even thousands of server
instances simultaneously. Thus applications can automatically scale themselves
up and down depending on their needs. You have root access to each one, and
you can interact with them as you would any machine. You have the choice of
several instance types, allowing you to select a configuration of memory,
CPU, and instance storage that is optimal for your application. Amazon EC2
offers a highly reliable environment where replacement instances can be

rapidly and reliably commissioned. Amazon EC2 provides web service


interfaces to configure firewall settings that control network access to and
between groups of instances. You will be charged at the end of each month for
your EC2 resources actually consumed. So charging will be based on the
actual usage of the resources.
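As a hedged illustration of this request/terminate cycle, the sketch below uses the boto3 Python library (which postdates this report) with a placeholder AMI ID and region, and assumes AWS credentials are already configured.

    # Illustrative sketch: start one EC2 instance from an AMI, then terminate it.
    # The AMI ID, instance type and region are placeholders, not recommendations.
    import boto3

    ec2 = boto3.resource("ec2", region_name="us-east-1")

    instances = ec2.create_instances(
        ImageId="ami-12345678",      # placeholder AMI built from your own machine image
        InstanceType="t2.micro",
        MinCount=1,
        MaxCount=1,
    )
    instance = instances[0]
    instance.wait_until_running()
    instance.reload()
    print("Launched", instance.id, "at", instance.public_dns_name)

    # Pay-per-use: terminate the instance when it is no longer needed.
    instance.terminate()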
2. Simple Storage Service (S3)
S3 or Simple Storage Service offers a cloud computing storage service.
It offers services for storage of data in the cloud. It provides a highly available,
large-scale object store that has been
designed for interactive online use. S3 is storage for the Internet. It is designed to make web-scale
computing easier for developers. S3 provides a simple web
services interface that can be used to store and retrieve any amount of data, at
any time, from anywhere on the web. It gives any developer access to the
same highly scalable, reliable, fast, inexpensive data storage infrastructure that
Amazon uses to run its own global network of web sites.
Amazon S3 allows write, read and delete of objects containing from 1
byte to 5 gigabytes of data each. The number of objects that you can store is
unlimited. Each object is stored in a bucket and retrieved via a unique
developer-assigned key. A bucket can be located anywhere in Europe or the
Americas but can be accessed from anywhere. Authentication mechanisms are
provided to ensure that the data is kept secure from unauthorized access.
Objects can be made private or public, and rights can be granted to specific
users for particular objects. The S3 service also works with a pay-only-for-what-you-use
method of payment.
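A minimal sketch of the store-and-retrieve pattern, again using boto3 with a placeholder bucket name and key, and assuming credentials are configured, might look like this:

    # Illustrative sketch: write an object to S3 and read it back by its key.
    import boto3

    s3 = boto3.client("s3")
    bucket, key = "example-report-bucket", "reports/2013/cloud.txt"   # placeholders

    s3.put_object(Bucket=bucket, Key=key, Body=b"cloud computing report data")

    response = s3.get_object(Bucket=bucket, Key=key)
    print(response["Body"].read())   # b'cloud computing report data'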
3. Simple Queue Service (SQS)
Amazon Simple Queue Service (SQS) offers a reliable, highly
scalable, hosted queue for storing messages as they travel between computers.
By using SQS, developers can simply move data between distributed
components of their applications that perform different tasks, without losing
messages or requiring each component to be always available. Messages can be retained in a queue
for up to 4 days. It is simple, reliable,
secure and scalable.
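The sketch below illustrates the send/receive/delete cycle using boto3; the queue name and message body are placeholders, and credentials and region are assumed to be configured.

    # Illustrative sketch: pass a message between two loosely coupled components via SQS.
    import boto3

    sqs = boto3.resource("sqs")
    queue = sqs.create_queue(QueueName="image-conversion-jobs")   # placeholder name

    queue.send_message(MessageBody="convert page-0001.tiff to PDF")

    for message in queue.receive_messages(WaitTimeSeconds=5):
        print("Received:", message.body)
        message.delete()   # remove the message once the work item has been handled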
4. Simple Database Service (SDB)
Amazon SimpleDB is a web service for running queries on structured data
in real time. This service works in close conjunction with the Amazon S3 and
EC2, collectively providing the ability to store, process and query data sets in
the cloud. These services are designed to make web-scale computing easier
and more cost-effective to developers. Traditionally, this type of functionality
is accomplished with a clustered relational database, which requires a sizable
upfront investment and often requires a DBA to maintain and administer them.

Amazon SDB provides all these without the operational complexity. It


requires no schema, automatically indexes your data and provides a simple
API for storage and access. Developers gain access to the different
functionalities from within Amazon's proven computing environment and
are able to scale instantly and need to pay only for what they use.
6.2. Google App Engine
Google App Engine lets you run your web applications on Google's
infrastructure. App Engine applications are easy to build, easy to maintain, and easy
to scale as your traffic and data storage needs grow. You can serve your app using a
free domain name on the
appspot.com domain, or use Google Apps to serve it from
your own domain. You can share your application with the world, or limit access to
members of your organization. App Engine costs nothing to get started. Sign up for a
free account, and you can develop and publish your application at no charge and with
no obligation. A free account can use up to 500MB of persistent storage and enough
CPU and bandwidth for about 5 million page views a month.
Google App Engine makes it easy to build an application that runs reliably,
even under heavy load and with large amounts of data. The environment includes the
following features:
dynamic web serving, with full support for common web technologies
persistent storage with queries, sorting and transactions
automatic scaling and load balancing
APIs for authenticating users and sending email using Google Accounts
a fully featured local development environment that simulates Google App
Engine on your computer
Google App Engine applications are implemented using the Python
programming language. The runtime environment includes the full Python language
and most of the Python standard library. Applications run in a secure environment that
provides limited access to the underlying operating system. These limitations allow
App Engine to distribute web requests for the application across multiple servers, and
start and stop servers to meet traffic demands.
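A minimal request handler in the Python runtime's original webapp framework looked roughly like the sketch below; the handler name and URL mapping are illustrative.

    # Minimal sketch of a handler for the early App Engine Python runtime,
    # using the webapp framework bundled with the SDK of that era.
    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app

    class MainPage(webapp.RequestHandler):
        def get(self):
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.out.write('Hello from App Engine!')

    application = webapp.WSGIApplication([('/', MainPage)], debug=True)

    def main():
        run_wsgi_app(application)

    if __name__ == '__main__':
        main()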
App Engine includes a service API for integrating with Google Accounts.
Your application can allow a user to sign in with a Google account, and access the
email address and displayable name associated with the account. Using Google
Accounts lets the user start using your application faster, because the user may not
need to create a new account. It also saves you the effort of implementing a user
account system just for your application.
App Engine provides a variety of services that enable you to perform common
operations when managing your application. The following APIs are provided to
access these services: Applications can access resources on the Internet, such as web
services or other data, using App Engine's URL fetch service. Applications can send
email messages using App Engine's mail service. The mail service uses Google
infrastructure to send email messages. The Image service lets your application

manipulate images. With this API, you can resize, crop, rotate and flip images in
JPEG and PNG formats.
In theory, Google claims App Engine can scale nicely. But Google currently
places a limit of 5 million hits per month on each application. This limit nullifies App
Engine's scalability, because any small, dedicated server can have this performance.
Google will eventually allow webmasters to go beyond this limit (if they pay).

7. Cloud Computing in the Real World


7.1. Time Machine
TimesMachine is a New York Times project in which one can read any issue
from Volume 1, Number 1 of The New York Daily Times, on September 18, 1851
through to The New York Times of December 30, 1922. They made it such that one
can choose a date in history and flip electronically through the pages, displayed with
their original look and feel. Here's what they did. They scanned all their public
domain articles from 1851 to 1922 into TIFF files. They converted them into PDF files
and put them online. Using 100 Linux computers, the job took about 24 hours. Then a
coding error was discovered that required the job be rerun. That's when their software
team decided that the job of maintaining this much data was too much to do in-house.
So they made use of cloud computing services to do the work.
All the content was put in the cloud, on Amazon. They used 100 Amazon EC2 instances and completed the whole job in less than 24 hours: they uploaded all the TIFF files into the cloud and wrote a Hadoop program that did the work. Running on Amazon's EC2 computing platform, the Times' conversion app turned 4TB of TIFF data into 1.5TB of fully searchable PDF files. Both the image manipulation and the search capability were built on cloud computing services.
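The Times has not published its code, but a Hadoop Streaming job along the following lines gives a feel for the approach: each mapper receives the key of one scanned TIFF object, fetches it, converts it with a command-line tool, and writes the PDF back to storage. The bucket names and key layout are placeholders, and the sketch assumes boto3 and the libtiff tiff2pdf utility are installed on every worker.

#!/usr/bin/env python
# Illustrative Hadoop Streaming mapper: one input line = one TIFF object key.
# Bucket names and the key layout are assumptions, not the Times' actual setup.
import subprocess
import sys

import boto3

s3 = boto3.client('s3')

for line in sys.stdin:
    key = line.strip()
    if not key:
        continue
    tiff_path = '/tmp/page.tif'
    pdf_path = '/tmp/page.pdf'
    s3.download_file('example-tiff-bucket', key, tiff_path)
    # Convert the scanned page to PDF with the libtiff command-line tool.
    subprocess.check_call(['tiff2pdf', '-o', pdf_path, tiff_path])
    s3.upload_file(pdf_path, 'example-pdf-bucket', key.replace('.tif', '.pdf'))
    # Emit a key/status pair so Hadoop can count completed conversions.
    sys.stdout.write('%s\tconverted\n' % key)

Because each page converts independently, Hadoop can spread the work across as many EC2 instances as are rented, which is why the whole archive could be processed in under a day.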
7.2. IBM Google University Academic Initiative
Google and IBM came up with an initiative to advance large-scale distributed computing by providing hardware, software, and services to universities. To prepare students "to harness the potential of modern computing systems," the companies provide universities with hardware, software, and services that advance training in large-scale distributed computing. The two companies aim to reduce the cost of distributed computing research, thereby enabling academic institutions and their students to more easily contribute to this emerging computing paradigm. Eric Schmidt, CEO of Google, said in a statement: "In order to most effectively serve the long-term interests of our users, it is imperative that students are adequately equipped to harness the potential of modern computing systems and for researchers to be able to innovate ways to address emerging problems."
The first university to join the initiative is the University of Washington. Carnegie-Mellon University, MIT, Stanford University, the University of California at Berkeley, and the University of Maryland are also participating in the program.
As part of the initiative, Google and IBM are providing a cluster of several
hundred computers -- Google's custom servers and IBM BladeCenter and System x
servers. Over time, the companies expect the cluster to surpass 1,600 processors. The
Linux-based servers will run open source software including Xen's virtualization
system and Hadoop, an open source implementation of Google's distributed file
system that's managed by the Apache Software Foundation.
Students working with the cluster will have access to a Creative Commons-licensed distributed computing curriculum developed by Google and the University of Washington.
7.3. SmugMug
SmugMug is an online photo hosting application that is fully based on cloud computing services. The company doesn't own its own storage hard drives; all of its storage is hosted on Amazon S3.
7.4. Nasdaq
NASDAQ, which holds vast amounts of stock and fund data, wanted to earn extra revenue by selling historical data for those stocks and funds. But for this offering, called Market Replay, the company didn't want to worry about optimizing its databases and servers to handle the new load. So it turned to Amazon's S3 service to host the data, and created a lightweight reader app that lets users pull in the required data. The traditional approach wouldn't have gotten off the ground economically. NASDAQ took its market data and created flat files for every entity, each holding enough data for a 10-minute replay of the stock's or fund's price changes on a second-by-second basis. It adds 100,000 files per day to the several million it started with.
Key vendors across the cloud computing stack include:
Infrastructure (Cisco Systems)
Computer software (3tera, Hadoop, IBM, RightScale)
Operating systems (Solaris, AIX, Linux including Red Hat)
Platform virtualization (Citrix, Microsoft, VMware, Sun xVM, IBM)
Types of services:
Cloud computing services are broadly divided into three categories:
Infrastructure-as-a-Service (IaaS)
Platform-as-a-Service (PaaS)
Software-as-a-Service (SaaS)
Infrastructure-as-a-Service (IaaS):
Infrastructure-as-a-Service (IaaS) providers such as Amazon Web Services supply virtual servers with unique IP addresses and blocks of storage on demand. Customers get an API through which they can control their servers. Because customers pay for exactly the amount of service they use, much as they would for electricity or water, this model is also called utility computing.
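For example, with Amazon's EC2 API a customer can start and stop virtual servers programmatically. A minimal sketch using the boto3 library follows; the AMI ID, key pair name and region are placeholders, not real values.

# Sketch: provisioning and releasing a virtual server through the IaaS API.
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Launch one small on-demand instance.
reservation = ec2.run_instances(
    ImageId='ami-00000000',          # placeholder machine image
    InstanceType='t2.micro',
    KeyName='example-keypair',       # placeholder SSH key pair
    MinCount=1,
    MaxCount=1,
)
instance_id = reservation['Instances'][0]['InstanceId']
print('Started', instance_id)

# Later, release it so billing stops: pay only for what was used.
ec2.terminate_instances(InstanceIds=[instance_id])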
Platform-as-a-Service (PaaS):
Platform-as-a-Service (PaaS) is a set of software and development tools hosted on the provider's servers. Developers create applications using the provider's APIs; Google App Engine is one of the best-known Platform-as-a-Service offerings. Developers should note that there are no interoperability standards yet, so some providers may not allow you to take your application and put it on another platform.
Software-as-a-Service (SaaS):
Software-as-a-Service (SaaS) is the broadest market. In this case the provider allows the customer only to use its applications; the software interacts with the user through a user interface. These applications can be anything from Web-based email to services like Twitter.
Types by visibility:
Public cloud:
Public cloud or external cloud describes cloud computing in the traditional mainstream sense,
whereby resources are dynamically provisioned on a fine-grained, self-service basis over the
Internet, via web applications/web services, from an off-site third-party provider who
shares resources and bills on a fine-grained utility computing basis.
Hybrid cloud:
A hybrid cloud environment consisting of multiple internal and/or external providers "will be typical for most enterprises". A hybrid cloud can describe a configuration combining a local device, such as a Plug computer, with cloud services. It can also describe configurations combining virtual and physical, colocated assets, for example a mostly virtualized environment that requires physical servers, routers, or other hardware such as a network appliance acting as a firewall or spam filter.
Private cloud:
Private cloud and internal cloud are neologisms that some vendors have recently used to describe offerings that emulate cloud computing on private networks. These (typically virtualisation automation) products claim to "deliver some benefits of cloud computing without the pitfalls", capitalising on concerns about data security, corporate governance, and reliability. They have been criticized on the basis that users "still have to buy, build, and manage them" and as such do not benefit from lower up-front capital costs and less hands-on management, essentially "[lacking] the economic model that makes cloud computing such an intriguing concept".
While an analyst predicted in 2008 that private cloud networks would be the future of corporate IT, there is some uncertainty whether they are a reality even within the same firm. Analysts also claim that within five years a "huge percentage" of small and medium enterprises will get most of their computing resources from external cloud computing providers as they "will not have economies of scale to make it worth staying in the IT business" or be able to afford private clouds.
Analysts have reported on Platform's view that private clouds are a stepping stone to external
clouds, particularly for the financial services, and that future datacenters will look like internal
clouds.
The term has also been used in the logical rather than physical sense, for example in reference to
platform as a service offerings, though such offerings including Microsoft's
Azure Services Platform are not available for on-premises deployment.
How does cloud computing work?
Supercomputers today are used mainly by the military, government intelligence agencies,
universities and research labs, and large companies to tackle enormously complex
calculations for such tasks as simulating nuclear explosions, predicting climate change, designing
airplanes, and analyzing which proteins in the body are likely to bind with potential new drugs.
Cloud computing aims to apply that kind of power, measured in the tens of trillions of computations per second, to problems like analyzing risk in financial portfolios, delivering personalized medical information, even powering immersive computer games, in a way that users
can tap through the Web. It does that by networking large groups of servers that often use
low-cost consumer PC technology, with specialized connections to spread data-processing
chores across them. By contrast, the newest and most powerful desktop PCs process only about
3 billion computations a second. Let's say you're an executive at a large corporation. Your particular responsibilities include making sure that all of your employees have the right hardware and software they need to do their jobs. Buying computers for everyone isn't enough -- you also have to purchase software or software licenses to give employees the tools they require. Whenever you have a new hire, you have to buy more software or make sure your current software license allows another user. Keeping up with all of this quickly becomes a stressful job.
(Figure: A typical cloud computing system)
Soon, there may be an alternative for executives like you. Instead of installing a suite of software for each computer, you'd only have to load one application. That application would
allow workers to log into a Web-based service which hosts all the programs the user would
need for his or her job. Remote machines owned by another company would run everything
from e-mail to word processing to complex data analysis programs. It's called cloud computing,
and it could change the entire computer industry.
In a cloud computing system, there's a significant workload shift. Local computers no longer have
to do all the heavy lifting when it comes to running applications. The network of computers that
make up the cloud handles them instead. Hardware and software demands on the user's side
decrease. The only thing the user's computer needs to be able to run is the cloud computing
system's interface software, which can be as simple as a Web browser, and the cloud's network
takes care of the rest.
There's a good chance you've already used some form of cloud computing. If you have an
e-mail account with a Web-based e-mail service like Hotmail, Yahoo! Mail or Gmail, then you've
had some experience with cloud computing. Instead of running an e-mail program on your
computer, you log in to a Web e-mail account remotely. The software and storage for your
account doesn't exist on your computer -- it's on the service's computer cloud.
SPECIFIC CHARACTERISTICS / CAPABILITIES OF CLOUDS

Since clouds do not refer to a specific technology, but to a general provisioning paradigm with enhanced capabilities, it is mandatory to elaborate on these aspects. There is currently a strong tendency to regard clouds as just a new name for an old idea, which is mostly due to confusion between the cloud concept and the strongly related P/I/SaaS paradigms (see also II.A.2), but also due to the fact that similar aspects have already been addressed without the dedicated term "cloud" associated with them (see also II).
This section specifies the concrete capabilities associated with clouds that are considered essential
(required in any cloud environment) and relevant (ideally supported, but may be restricted to
specific use cases). We can thereby distinguish non-functional, economic and technological
capabilities that are addressed, or still to be addressed, by cloud systems.
Non-functional aspects represent qualities or properties of a system, rather than specific
technological requirements. Implicitly, they can be realized in multiple fashions and interpreted in
different ways, which typically leads to strong compatibility and interoperability issues between individual providers as they pursue their own approaches to realize their respective requirements,
which strongly differ between providers. Non-functional aspects are one of the key reasons why
clouds differ so strongly in their interpretation (see also II.B).
Economic considerations are one of the key reasons to introduce cloud systems in a business
environment in the first instance. The particular interest typically lies in the reduction of cost and
effort through outsourcing and / or automation of essential resource management. As has been
noted in the first section, relevant aspects thereby to consider relate to the cut-off between loss of
control and reduction of effort. With respect to hosting private clouds, the gain through cost
reduction has to be carefully balanced with the increased effort to build and run such a system.
Obviously, technological challenges implicitly arise from the non-functional and economic aspects when trying to realize them. As opposed to those aspects, technological challenges typically imply a specific realization, even though there may be no standard approach as yet and deviations may hence arise. In addition to these implicit challenges, one can identify additional technological aspects to be addressed by cloud systems, partly as a pre-condition to realize some of the high-level features, but partly also because they directly relate to specific characteristics of cloud systems.
1. NON-FUNCTIONAL ASPECTS
The most important non-functional aspects are:
Elasticity is an essential core feature of cloud systems and describes the capability of the underlying infrastructure to adapt to changing, potentially non-functional requirements, for example the amount and size of data supported by an application, the number of concurrent users, etc. One can distinguish between horizontal and vertical scalability: horizontal scalability refers to the number of instances needed to satisfy, for example, a changing number of requests, while vertical scalability refers to the size of the instances themselves and thus, implicitly, to the amount of resources required to maintain that size. Cloud scalability involves both (rapid) up- and down-scaling.
Elasticity goes one step further, though, and also allows the dynamic integration and extraction of physical resources to and from the infrastructure. Whilst from the application perspective this is identical to scaling, from the middleware management perspective it poses additional requirements, in particular regarding reliability. In general, it is assumed that changes in the resource infrastructure are announced first to the middleware manager, but with large-scale systems it is vital that such changes can be handled automatically.
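A toy illustration of the horizontal side of this is a monitoring loop that adds or removes worker instances as the request rate changes; the threshold and the start_instance / stop_instance callbacks below are hypothetical placeholders for whatever a provider's API actually offers.

# Toy horizontal-scaling decision; the provider API and threshold are assumed.
TARGET_REQS_PER_INSTANCE = 100   # assumed capacity of one instance

def rescale(current_instances, requests_per_second, start_instance, stop_instance):
    """Grow or shrink the pool so each instance serves roughly its target load."""
    desired = max(1, -(-requests_per_second // TARGET_REQS_PER_INSTANCE))  # ceiling division
    while len(current_instances) < desired:
        current_instances.append(start_instance())
    while len(current_instances) > desired:
        stop_instance(current_instances.pop())
    return current_instances

Real cloud platforms implement far more sophisticated policies, but the core idea is the same: capacity follows demand in both directions.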
Reliability is essential for all cloud systems: in order to support today's data centre-type applications in a cloud, reliability is considered one of the main features to exploit cloud capabilities. Reliability denotes the capability to ensure constant operation of the system without disruption, i.e. no loss of data, no code reset during execution, etc. It is typically achieved through redundant resource utilisation. Interestingly, many of the reliability aspects move from a hardware-based to a software-based solution (redundancy in the file system vs. RAID controllers, stateless front-end servers vs. UPS, etc.).
Notably, there is a strong relationship between availability (see below) and reliability; however, reliability focuses in particular on the prevention of loss (of data or execution progress).
Quality of Service support is a relevant capability that is essential in many use cases where
specific requirements have to be met by the outsourced services and / or resources. In business
cases, basic QoS metrics like response time, throughput etc. must be guaranteed at least, so as to
ensure that the quality guarantees of the cloud user are met. Reliability is a particular QoS aspect
which forms a specific quality requirement.
Agility and adaptability are essential features of cloud systems that strongly relate to the elastic
capabilities. It includes on-time reaction to changes in the amount of requests and size of resources,
but also adaptation to changes in the environmental conditions that e.g. require different types of
resources, different quality or different routes, etc. Implicitly, agility and adaptability require
resources (or at least their management) to be autonomic and have to enable them to provide self-*
capabilities.
Availability of services and data is an essential capability of cloud systems and was actually one
of the core aspects to give rise to clouds in the first instance. It lies in the ability to introduce
redundancy for services and data so failures can be masked transparently. Fault tolerance also
requires the ability to introduce new redundancy (e.g. previously failed or fresh nodes) in an online
manner non-intrusively (without a significant performance penalty).
With increasing concurrent access, availability is particularly achieved through replication of data /
services and distributing them across different resources to achieve load-balancing. This can be
regarded as the original essence of scalability in cloud systems.
2. ECONOMIC ASPECTS
In order to allow for economic considerations, cloud systems should help in realising the following
aspects:
Cost reduction is one of the first concerns when building up a cloud system that can adapt to changing consumer behaviour and reduce the cost of infrastructure maintenance and acquisition. Scalability and pay-per-use are essential aspects of this issue. Notably, setting up a cloud system typically entails additional costs, be it by adapting the business logic to the cloud host's specific interfaces or by enhancing the local infrastructure to be cloud-ready. See also return on investment below.
Pay per use. The capability to build up cost according to the actual consumption of resources is a
relevant feature of cloud systems. Pay per use strongly relates to quality of service support, where
specific requirements to be met by the system and hence to be paid for can be specified. One of the
key economic drivers for the current level of interest in cloud computing is the structural change in
this domain. By moving from the usual capital upfront investment model to an operational expense,
cloud computing promises to enable especially SMEs and entrepreneurs to accelerate the
development and adoption of innovative solutions.
Improved time to market is essential in particular for small and medium enterprises that want to sell their services quickly and easily, with little delay caused by acquiring and setting up the infrastructure, in particular at a scope compatible and competitive with larger industries. Larger enterprises need to be able to publish new capabilities with little overhead to remain competitive. Clouds can support this by providing infrastructures, potentially dedicated to specific use cases, that take over essential capabilities to support easy provisioning and thus reduce time to market.
Return on investment (ROI) is essential for all investors and cannot always be guaranteed; in fact, some cloud systems currently fail on this aspect. Employing a cloud system must ensure that the cost and effort vested into it is outweighed by its benefits in order to be commercially viable; this may entail direct (e.g. more customers) and indirect (e.g. benefits from advertisements) ROI. Outsourcing resources and employing (private) cloud technologies must therefore be weighed against growing the local infrastructure, and the critical cut-off points identified.
Turning CAPEX into OPEX is an implicit, and much argued, characteristic of cloud systems, as the actual cost benefit (cf. ROI) is not always clear (see e.g. [9]). Capital expenditure (CAPEX) is required to build up a local infrastructure; by outsourcing computational resources to cloud systems, it is replaced by operational expenditure (OPEX) incurred according to operational need.
Going Green is relevant not only to reduce the additional costs of energy consumption, but also to reduce the carbon footprint. Whilst carbon emission by individual machines can be estimated quite well, this information is in practice taken little into consideration when scaling systems up. Clouds principally allow reducing the consumption of unused resources (down-scaling). In addition, up-scaling should be carefully balanced not only against cost, but also against carbon emission issues. Note that beyond software stack aspects, plenty of Green IT issues are subject to development on the hardware level.
3. TECHNOLOGICAL ASPECTS
The main technological challenges that can be identified and that are commonly associated with
cloud systems are:
Virtualisation is an essential technological characteristic of clouds which hides the technological
complexity from the user and enables enhanced flexibility (through aggregation, routing and
translation). More concretely, virtualisation supports the following features:
Ease of use: by hiding the complexity of the infrastructure (including management, configuration etc.), virtualisation can make it easier for the user to develop new applications and reduces the overhead of controlling the system.
Infrastructure independency: in principle, virtualisation allows for higher interoperability by making the code platform independent.
Flexibility and Adaptability: by exposing a virtual execution environment, the underlying infrastructure can change more flexibly according to different conditions and requirements (assigning more resources, etc.).
Location independence: services can be accessed independently of the physical location of the user and the resource.
Multi-tenancy is a highly essential issue in cloud systems, where the location of code and/or data is principally unknown and the same resource may be assigned to multiple users (potentially at the same time). This affects infrastructure resources as well as data / applications / services that are hosted on shared resources but need to be made available in multiple isolated instances. Classically, all information is maintained in separate databases or tables, yet in more complicated cases information may be concurrently altered, even though it is maintained for isolated tenants. Multi-tenancy implies a lot of potential issues, ranging from data protection to legislative issues (see section III).
Security, Privacy and Compliance is obviously essential in all systems dealing with potentially
sensitive data and code.
Data Management is an essential aspect in particular for storage clouds, where data is flexibly distributed across multiple resources. Implicitly, data consistency needs to be maintained over a wide distribution of replicated data sources. At the same time, the system always needs to be aware of the data location (when replicating across data centres), taking latencies and particularly workload into consideration. As the size of data may change at any time, data management addresses both horizontal and vertical aspects of scalability. Another crucial aspect of data management is the consistency guarantees provided (eventual vs. strong consistency, transactional isolation vs. no isolation, atomic operations over individual data items vs. multiple data items, etc.).
APIs and / or Programming Enhancements are essential to exploit the cloud features: common programming models require the developer to take care of scalability and autonomic capabilities him- or herself, whilst a cloud environment provides these features in a fashion that allows the user to leave such management to the system.
Metering of any kind of resource and service consumption is essential in order to offer elastic
pricing, charging and billing. It is therefore a pre-condition for the elasticity of clouds.
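A minimal sketch of what such metering might feed into is shown below: per-tenant usage records are aggregated into a pay-per-use bill. The rates and record format are invented for illustration only.

# Toy metering/billing aggregation; rates and record layout are illustrative.
from collections import defaultdict

RATES = {'cpu_hours': 0.10, 'storage_gb_month': 0.15, 'bandwidth_gb': 0.12}

def bill(usage_records):
    """usage_records: iterable of (tenant, resource, amount) tuples."""
    totals = defaultdict(float)
    for tenant, resource, amount in usage_records:
        totals[tenant] += amount * RATES[resource]
    return dict(totals)

# bill([('acme', 'cpu_hours', 40), ('acme', 'storage_gb_month', 200)])
# -> {'acme': 34.0}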
Tools are generally necessary to support development, adaptation and usage of cloud services.
C. RELATED AREAS
It has been noted that the cloud concept is strongly related to many other initiatives in the area of the Future Internet, such as Software as a Service and Service Oriented Architecture. New concepts and terminologies often bear the risk of seemingly superseding preceding work and thus requiring a fresh start, where plenty of the existing results are lost and essential work is repeated unnecessarily. In order to reduce this risk, this section provides a quick summary of the main related areas and their potential impact on further cloud developments.
1. INTERNET OF SERVICES
Service based application provisioning is part of the Future Internet as such and therefore a similar
statement applies to cloud and Internet of Services as to cloud and Future Internet. Whilst the cloud
concept foresees essential support for service provisioning (making them scalable, providing a simple
API for development etc.), its main focus does not primarily rest on service provisioning. As detailed
in section II.A.1 cloud systems are particularly concerned with providing an infrastructure on which
any type of service can be executed with enhanced features.
Clouds can therefore be regarded as an enabler for enhanced features of large-scale service provisioning. Much research has been vested into providing base capabilities for service provisioning; accordingly, capabilities that overlap with cloud system features can be easily exploited for cloud infrastructures.
2. INTERNET OF THINGS
It is up for debate whether the Internet of Things is related to cloud systems at all: whilst the Internet of Things will certainly have to deal with issues related to elasticity, reliability, data management etc., there is an implicit assumption that resources in cloud computing are of a type that can host and / or process data, in particular storage and processors that can form a computational unit (a virtual processing platform).
However, specialised clouds may e.g. integrate dedicated sensors to provide enhanced capabilities, and the issues related to reliability of data streams etc. are principally independent of the type of data source. Though sensors do not yet pose essential scalability issues, metering of resources will already require some degree of sensor information integration into the cloud.
Clouds may furthermore offer vital support to the internet of things, in order to deal with a flexible
amount of data originating from the diversity of sensors and things. Similarly, cloud concepts for
scalability and elasticity may be of interest for the internet of things in order to better cope with
dynamically scaling data streams.
Overall, the Internet of Things may profit from cloud systems, but there is no direct relationship
between the two areas. There are however contact points that should not be disregarded. Data
management and interfaces between sensors and cloud systems therefore show commonalities.
3. THE GRID
There is an on-going confusion about the relationship between Grids and Clouds [17], sometimes seeing Grids as on top of Clouds, vice versa, or even as identical. More surprisingly, even elaborate comparisons (such as [18][19][20]) still have different views on what the Grid is in the first instance, thus making the comparison cumbersome. Indeed most ambiguities can be quickly resolved if the underlying concept of Grids is examined first: just like Clouds, Grid is primarily a concept rather than a technology, thus leading to many potential misunderstandings between individual communities.
With respect to research carried out in the Grid over the last years, it is therefore advisable to distinguish (at least) between (1) Resource Grids, including in particular Grid Computing, and (2) eBusiness Grids, which centre mainly on distributed Virtual Organizations and are more closely related to Service Oriented Architectures (see below). Note that there may be combinations of the two, e.g. when capabilities of the eBusiness Grids are applied for commercial resource provisioning, but this has little impact on the assessment below.
Resource Grids try to make resources - such as computational devices and storage - locally available in a fashion that is transparent to the user. The main focus thereby lies on availability rather than scalability, in particular rather than dynamic scalability. In this context we may have to distinguish between HPC Grids, such as EGEE, which select and provide access to (single) HPC resources, as opposed to distributed computing Grids (cf. Service Oriented Architecture below), which also include P2P-like scalability - in other words, the more resources are available, the more code instances are deployed and executed. Replication capabilities may be applied to ensure reliability, though this is not an intrinsic capability of computational Grids in particular. Even though such Grid middleware offers manageability interfaces, it typically acts on a layer on top of the actual resources and thus rarely virtualises the hardware itself, but rather the computing resource as a whole (i.e. not on the IaaS level).
Overall, Resource Grids do address issues similar to those of Cloud Systems, yet typically on a different layer and with a different focus - as such, Grids generally do not cater for horizontal and vertical elasticity. What is more important, though, is the strong conceptual overlap between the issues addressed by Grids and Clouds, which allows re-use of concepts and architectures, but also of parts of the technology (see also SOA below).
Specific shared concepts:
Virtualisation of computation resources, respectively of hardware
Scalability of amount of resources versus of hardware, code and data
Reliability through replication and check-pointing
Interoperability
Security and Authentication
eBusiness Grids share their essential goals with Service Oriented Architecture, though their specific focus rests on integration of existing services so as to build up new functionalities, and on enhancing these services with business-specific capabilities. The eBusiness (or here Virtual Organization) approach derives in particular from the distributed computing aspect of Grids, where parts of the overall logic are located at different sites. The typical Grid middleware thereby focuses mostly on achieving reliability in the overall execution through on-the-fly replacement and (re)integration. But eBusiness Grids also explore the specific requirements for commercial employment of service consumption and provisioning - even though this is generally considered an aspect more related to Service Oriented Architectures than to Grids.
Again, eBusiness Grids and Cloud Systems share common concepts and thus basic technological
approaches. In particular with the underlying SOA based structure, capabilities may be exposed and
integrated as stand-alone services, thus supporting the re-use aspect.
Specific shared concepts:
Pay-per-use / Payment models
Quality of Service
Metering
Availability through self-management
It is worth noting that the comparison here is with deployed Grids. The original Grids concept had a
vision of elasticity, virtualization and accessibility [48] [49] not unlike that claimed for the Clouds
vision.
4. SERVICE ORIENTED ARCHITECTURES
There is a strong relationship between the Grid and Service Oriented Architectures, often leading to confusion where the two terms are either used interchangeably, or the one as building on top of the other. This arises mostly from the fact that both concepts tend to cover a comparatively wide scope of issues, i.e. the terms are used a bit ambiguously.
Service Oriented Architecture, however, typically focuses predominantly on ways of developing, publishing and integrating application logic and / or resources as services. Aspects related to enhancing the provisioning model, e.g. through secure communication channels, QoS-guaranteed maintenance of services etc., are secondary in this definition. Again it must be stressed, though, that the aspects of eBusiness Grids and SOA are used almost interchangeably - in particular since the advent of Web Service technologies such as the .NET Framework and Globus Toolkit 4, where GT4 is typically regarded as Grid-related and .NET as a Web Service / SOA framework (even though they share the same main capabilities).
Though providing cloud-hosted applications as a service is an implicit aspect of Cloud SaaS provisioning, the cloud concept is principally technology agnostic, though it is generally recommended to build on service-oriented principles. However, in particular with the resource virtualization aspect of cloud systems, most technological aspects will have to be addressed at a lower level than the service layer. Service Oriented Architectures are therefore of primary interest for the type of applications and services hosted on cloud systems.
SEVEN TECHNICAL SECURITY BENEFITS OF THE CLOUD:

1. CENTRALIZED DATA:
Reduced Data Leakage: this is the benefit I hear most from Cloud providers - and in my view they are right. How many laptops do we need to lose before we get this? How many backup tapes? The data landmines of today could be greatly reduced by the Cloud as thin client technology becomes prevalent. Small, temporary caches on handheld devices or Netbook computers pose less risk than transporting data buckets in the form of laptops. Ask the CISO of any large company if all laptops have company-mandated controls consistently applied, e.g. full disk encryption. You'll see the answer by looking at the whites of their eyes. Despite best efforts around asset management and endpoint security, we continue to see embarrassing and disturbing misses. And what about SMBs? How many use encryption for sensitive data, or even have a data classification policy in place?
Monitoring benefits: central storage is easier to control and monitor. The flipside is
the nightmare scenario of comprehensive data theft. However, I would rather spend my time
as a security professional figuring out smart ways to protect and monitor access to data
stored in one place (with the benefit of situational advantage) than trying to figure out
all the places where the company data resides across a myriad of thick clients! You can get
the benefits of Thin Clients today but Cloud Storage provides a way to centralize the data
faster and potentially cheaper. The logistical challenge today is getting Terabytes of data to
the Cloud in the first place.
2. INCIDENT RESPONSE / FORENSICS:
Forensic readiness: with Infrastructure as a Service (IaaS) providers, I can build a dedicated forensic server in the same Cloud as my company and keep it offline, ready for use when needed. I would only need to pay for storage until an incident happens and I need to bring it online. I don't need to call someone to bring it online or install some kind of remote boot software - I just click a button in the Cloud Provider's web interface. If I have multiple incident responders, I can give them each a copy of the VM so we can distribute the forensic workload based on the job at hand or as new sources of evidence arise and need analysis. To fully realise this benefit, commercial forensic software vendors would need to move away from archaic, physical dongle based licensing schemes to a network licensing model.
Decrease evidence acquisition time: if a server in the Cloud gets compromised (i.e. broken into), I can now clone that server at the click of a mouse and make the cloned disks instantly available to my Cloud Forensics server. I didn't need to find storage or have it ready, waiting and unused - it's just there.
Eliminate or reduce service downtime: note that in the above scenario I didn't have to go tell the COO that the system needs to be taken offline for hours whilst I dig around in the RAID array hoping that my physical acquisition toolkit is compatible (and that the version of RAID firmware is supported by my forensic software). Abstracting the hardware removes a barrier to even doing forensics in some situations.
Decrease evidence transfer time: in the same Cloud, bit-for-bit copies are super fast - made faster by that replicated, distributed file system my Cloud provider engineered for me. From a network traffic perspective, it may even be free to make the copy in the same Cloud. Without the Cloud, I would have to do a lot of time-consuming and expensive provisioning of physical devices. I only pay for the storage as long as I need the evidence.
Eliminate forensic image verification time: some Cloud Storage implementations expose a cryptographic checksum or hash. For example, Amazon S3 generates an MD5 hash automatically when you store an object. In theory you no longer need to generate time-consuming MD5 checksums using external tools - it's already there.
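A quick sketch of that check with boto3 follows; the bucket and key names are placeholders, and note that the S3 ETag equals the MD5 only for objects uploaded in a single (non-multipart) part.

# Sketch: compare a local MD5 with the checksum S3 already stores.
import hashlib

import boto3

s3 = boto3.client('s3')

def matches_s3_md5(local_path, bucket, key):
    # Hash the local copy of the evidence file.
    with open(local_path, 'rb') as f:
        local_md5 = hashlib.md5(f.read()).hexdigest()
    # S3 returns the stored checksum as the quoted ETag header.
    remote_etag = s3.head_object(Bucket=bucket, Key=key)['ETag'].strip('"')
    return local_md5 == remote_etag

# matches_s3_md5('evidence.dd', 'example-evidence-bucket', 'case42/evidence.dd')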
Decrease time to access protected documents: immense CPU power opens some doors. Did the suspect password-protect a document that is relevant to the investigation? You can now test a wider range of candidate passwords in less time to speed investigations.
3. PASSWORD ASSURANCE TESTING (AKA CRACKING):
Decrease password cracking time: if your organization regularly tests password strength by running
password crackers you can use Cloud Compute to decrease crack time and you only pay for what you
use. Ironically, your cracking costs go up as people choose better passwords ;-).
Keep cracking activities to dedicated machines: if today you use a distributed password cracker to spread the load across non-production machines, you can now put those agents in dedicated Compute instances - and thus stop mixing sensitive credentials with other workloads.
4. LOGGING:
Unlimited, pay per drink storage: logging is often an afterthought; consequently, insufficient disk space is allocated and logging is either non-existent or minimal. Cloud Storage changes all this - no more guessing how much storage you need for standard logs.
Improve log indexing and search: with your logs in the Cloud you can leverage Cloud Compute to
index those logs in real-time and get the benefit of instant search results. What is different here? The
Compute instances can be plumbed in and scale as needed based on the logging load - meaning a true
real-time view.
Getting compliant with Extended logging: most modern operating systems offer extended logging in
the form of a C2 audit trail. This is rarely enabled for fear of performance degradation and log size.
Now you can opt-in easily - if you are willing to pay for the enhanced logging, you can do so.
Granular logging makes compliance and investigations easier.
5. IMPROVE THE STATE OF SECURITY SOFTWARE (PERFORMANCE):
Drive vendors to create more efficient security software: Billable CPU cycles get noticed. More
attention will be paid to inefficient processes; e.g. poorly tuned security agents. Process accounting
will make a comeback as customers target expensive processes. Security vendors that understand
how to squeeze the most performance from their software will win.
6. SECURE BUILDS:
Pre-hardened, change control builds: this is primarily a benefit of virtualization based Cloud
Computing. Now you get a chance to start secure (by your own definition) - you create your Gold
Image VM and clone away. There are ways to do this today with bare-metal OS installs but frequently
these require additional 3rd party tools, are time consuming to clone or add yet another agent to each
endpoint.
Reduce exposure through patching offline: Gold images can be securely kept up to date. Offline VMs can be conveniently patched off the network.
Easier to test impact of security changes: this is a big one. Spin up a copy of your production
environment, implement a security change and test the impact at low cost, with minimal startup time.
This is a big deal and removes a major barrier to doing security in production environments.
7. SECURITY TESTING:
Reduce cost of testing security: a SaaS provider only passes on a portion of their security testing costs. By sharing the same application as a service, you don't foot the bill for the expensive security code review and/or penetration test. Even with Platform as a Service (PaaS), where your developers get to write code, there are potential cost economies of scale (particularly around use of code scanning tools that sweep source code for security weaknesses).
Adoption fears and strategic innovation opportunities
Adoption fears:
Security: Many IT executives make decisions based on the perceived security risk instead of the real security risk. IT has traditionally feared the loss of control for SaaS deployments, based on an assumption that if you cannot control something it must be unsecured. I recall the anxiety around web services deployment, where people got really worked up about the security of web services because users could invoke an internal business process from outside the firewall.
IT will have to get used to the idea of software being delivered from outside the firewall that gets mashed up with on-premise software before it reaches the end user. The intranet, extranet, DMZ, and internet boundaries have started to blur, and this indeed imposes some serious security challenges, such as relying on a cloud vendor for the physical and logical security of the data, or authenticating users across firewalls by relying on the vendor's authentication schemes. But treating these challenges as fears is not a smart strategy.
Latency: Just because something runs in a cloud does not mean it has high latency. My opinion is quite the opposite: cloud computing, done properly, has opportunities to reduce latency thanks to architectural advantages such as massively parallel processing capabilities and distributed computing. Web-based applications went through the same perception issues in their early days, and now people don't worry about latency while shopping at Amazon.com or editing a document on Google Docs served to them over a cloud. The cloud is going to get better and better, and IT has no strategic advantage in owning and maintaining data centers. In fact, the data centers are easy to shut down but the applications are not, and CIOs should take any and all opportunities they get to move the data centers away if they can.
SLA: The recent Amazon EC2 meltdown and RIM's network outage created a debate around the availability of highly centralized infrastructure and its SLAs. The real problem is not a bad SLA but the lack of one. IT needs a phone number it can call in an unexpected event, and an up-front estimate of the downtime, in order to manage expectations. Maybe I am simplifying it too much, but this is the crux of the situation. The fear is not so much about 24x7 availability, since an on-premise system hardly promises that; what bothers IT the most is the inability to quantify the business impact of a system being unavailable and to set and manage expectations upstream and downstream. The non-existent SLA is a real issue, and I believe there is a great service innovation opportunity for ISVs and partners to help CIOs with the adoption of cloud computing by providing a rock-solid SLA and transparency into the defect resolution process.
Strategic innovation opportunities
Seamless infrastructure virtualization:
If you have ever attempted to connect to Second Life from behind a firewall, you would know that it requires punching a few holes into the firewall to let certain unique transports pass through, and that's not a viable option in many cases. This is an intra-infrastructure communication challenge. I am glad to see IBM's attempt to create a virtual cloud inside the firewall to deploy some of the regions of Second Life with seamless navigation in and out of the firewall. This is a great example of a single sign-on that extends beyond network and hardware virtualization to form infrastructure virtualization with seamless security.
Hybrid systems: The IBM example also illustrates the potential of a hybrid system that combines an on-premise system with remote infrastructure to support seamless cloud computing. This could be a great start for the many organizations that are at the bottom of the S-curve of cloud computing adoption. Organizations should consider pushing non-critical applications onto a cloud with loose integration with on-premise systems to begin the cloud computing journey; as the cloud infrastructure matures and some concerns are alleviated, IT can consider pushing more and more applications onto the cloud. Google App Engine is a good example of a way to start creating applications on-premise that can eventually run on Google's cloud, and Amazon's AMI catalogue is expanding day by day to let people push their applications onto Amazon's cloud. Here is a quick comparison of Google and Amazon in their cloud computing efforts. Elastra's solution for deploying EnterpriseDB on the cloud is also a good example of how organizations can outsource IT to the cloud.
BENEFITS:
Cloud computing infrastructures can allow enterprises to achieve more efficient use of their IT hardware and software investments. They do this by breaking down the physical barriers inherent in isolated systems and automating the management of the group of systems as a single entity.
Cloud computing is an example of an ultimately virtualized system, and a natural evolution for data centers that employ automated systems management, workload balancing, and virtualization technologies. A cloud infrastructure can be a cost-efficient model for delivering information services.
Application:
A cloud application leverages cloud computing in software architecture, often eliminating the need
to install and run the application on the customer's own computer, thus alleviating the burden of
software maintenance, ongoing operation, and support. For example:
Peer-to-peer / volunteer computing (BOINC, Skype)
Web applications (Webmail, Facebook, Twitter, YouTube, Yammer)
Security as a service (MessageLabs, Purewire, ScanSafe, Zscaler)
Software as a service (Google Apps, Salesforce, Nivio, Learn.com, Zoho, BigGyan.com)
Software plus services (Microsoft Online Services)
Storage [Distributed]
Content distribution (BitTorrent, Amazon CloudFront)
8. CONCLUSION:
In my view, there are some strong technical security arguments in favour of Cloud Computing - assuming we can find ways to manage the risks. With this new paradigm come challenges and opportunities. The challenges are getting plenty of attention - I'm regularly afforded the opportunity to comment on them, plus obviously I cover them on this blog. However, let's not lose sight of the potential upside.
Some benefits depend on the Cloud service used and therefore do not apply across the board. For example, I see no solid forensic benefits with SaaS. Also, for space reasons, I'm purposely not including the flip side to these benefits; however, if you read this blog regularly you should recognise some.
We believe the Cloud offers Small and Medium Businesses major potential security benefits.
Frequently SMBs struggle with limited or non-existent in-house INFOSEC resources and budgets. The
caveat is that the Cloud market is still very new - security offerings are somewhat foggy - making
selection tricky. Clearly, not all Cloud providers will offer the same security.
Cloud computing also:
Increases business responsiveness
Accelerates creation of new services via rapid prototyping capabilities
Reduces acquisition complexity via a service-oriented approach
Uses IT resources efficiently via sharing and higher system utilization
Reduces energy consumption
Handles new and emerging workloads
Scales to extreme workloads quickly and easily
Simplifies IT management
Provides a platform for collaboration and innovation
Cultivates skills for the next-generation workforce
9. REFERENCES:
Web Guild - http://www.webguild.org/
HowStuffWorks - http://communication.howstuffworks.com/
Cloud Security - http://cloudsecurity.org
IBM developerWorks - http://www.ibm.com/developerworks/websphere/zones/hipods/
Google Suggest - http://www.google.com/webhp?complete=1&hl=en