Cloud Computing
The concept of cloud computing was first expounded in the 1960s, when the idea was
conceived of providing the general public with computing facilities available individually yet
simultaneously, just like a utility service such as gas or electricity (Parkhill, 1966). The term
“cloud” refers to the provisioning of services that are highly elastic in terms of networking
scope, computing power, storage facilities or application range. Cloud providers offer hardware
(CPUs, network infrastructure, hard drives) and use virtualization technologies to manage
the cloud platform at the level above physical infrastructure (Gagliardi, 1975). Through the
provision of transparent interfaces, the user can access the platform without needing an intimate
knowledge of the underlying infrastructure or hardware devices. In the 1990s, ATM networks
were described in terms that used various concepts of the cloud. In 2006, the term cloud gained
popularity after Eric Schmidt, CEO of Google, used the term to describe the business model of
providing services over the Internet. From that time, the cloud computing term has been used
for marketing purposes to represent many different ideas (Zhang, Cheng and Boutaba, 2010).
The US National Institute of Standards and Technology (NIST) proposed a definition which is
widely accepted, “Cloud computing is a model for enabling ubiquitous, convenient, on-demand
network access to a shared pool of configurable computing resources (e.g., networks, servers,
storage, applications, and services) that can be rapidly provisioned and released with minimal
management effort or service provider interaction.” (Mell and Grance, 2011; NIST, 2013).
To clarify the NIST definition further: cloud computing improves network accessibility by
making ready-to-use resources available over an easily and rapidly accessible network. In
order to serve all users, cloud computing resources are combined into a single pool. Resource
demand at any given time is matched rapidly by fast provisioning facilities that operate
those resources. When demand increases, computing power automatically responds to match
that demand. Rapidly released resources remain in an active state so that they can be
reassigned quickly (Brian, 2008; Mell and Grance, 2011).
Cloud computing delivers utility services or a collection of services that meet customer
demands for such facilities as storage, business processes, collaboration infrastructure,
application and computational power. The main significance of cloud computing is its
elasticity. As user demand increases, application, storage and computational resources can
also be increased; conversely, they can be released when the services are no longer
needed (Hill et al., 2013). Billing is applied using a pay-per-use policy. This makes
user access to infrastructure more flexible, simpler and cheaper.
To provide a single service, other information processing technologies are also used by cloud
computing. In fact, cloud computing uses most of the information technologies used by other
computing services, though some are chosen solely to run the service environment.
Cloud computing realizes the long-held dream of computing as a utility, with the capacity to
transform a large part of the IT industry by making software more attractive and user friendly
when provided as a service. An additional aim is to reshape the way IT hardware is designed
and purchased (Armbrust et al., 2010).
The following are commonly used cloud computing definitions:
“A Cloud is a type of parallel and distributed system consisting of a collection of inter-
connected and virtualized computers that are dynamically provisioned and presented as one or
more unified computing resource(s) based on service-level agreements established through
negotiation between the service provider and consumers.” (Buyya et al., 2009).
“Cloud computing is both a user experience (UX) and a business model. It is an emerging style
of computing in which applications, data and ICT resources are provided to users as services
delivered over the network. It enables self-service, economies of scale and flexible sourcing
options…an infrastructure management methodology - a way of managing large numbers of
highly virtualized resources, which can reside in multiple locations…” (Rochwerger et al.,
2009).
“Clouds are a large pool of easily usable and accessible virtualized resources (such as hardware,
development platforms, and/or services). These resources can be dynamically reconfigured to
adjust to a variable load (scale), allowing also for an optimum resource utilization. This pool
of resources is typically exploited by a pay-per-use model in which guarantees are offered by
the infrastructure provider by means of customized SLAs.” (Vaquero et al., 2008).
Cloud computing is being adopted by industry, research establishments and governments to
solve problems of storage and computation in an Internet-driven world. Factors that are making
cloud computing relevant today are: 1) the decreasing cost of hardware and the increasing
capacity of storage and computing power; 2) the exponential increase in data due to scientific
simulation/instrumentation, archiving and Internet publishing; and 3) the widespread adoption
of Services Computing and Web 2.0 applications (Foster et al., 2008).
Cloud Computing Characteristics
NIST has defined five characteristics of cloud computing (see Figure 1) (Mell and Grance,
2011; NIST, 2013).
Figure 1: Cloud Computing Characteristics
Broad Network Access:
A personal computer is not necessary to access the cloud-computing environment. Users can
use laptops, smartphones, workstations, PDAs or tablets over the network. Data transfer rates
between the cloud and all such users are fast. Cloud computing features allow broad network
access using standardized interfaces that operate with several types of client devices.
On-demand Self-service:
Whether clients are charged for cloud space or services, or receive them free of cost, depends
on the individual cloud provider. Either way, customers do not need to deal with a cloud
provider’s personnel in order to purchase or use the services. Cloud computing resources can
be accessed by users without requiring human interaction, mostly through a web-based,
self-service portal (management console). Users can allocate and use resources whenever they
are required. From the provider’s viewpoint, this reduces the cost of employing personnel.
This feature also offers cloud users the flexibility of accessing cloud resources at any time
over the Internet. Cloud users are able to store and use data and software in the cloud not
only from within their organization but from anywhere with Internet access.
Resource Pooling:
Virtualization techniques are used to pool cloud resources, which are then allocated to user
requests, with the possibility of resource re-allocation. This means the same physical resources
are used to service multiple customers, securely separating the resources at logical level. In
other words, users can share cloud resources simultaneously according to the service they
require. Resources are assigned to and released from multiple customers dynamically, at the
logical rather than the physical level. Customers generally do not know the exact physical
location of resources, but if they need to meet legal requirements they may be able to
restrict where their resources are located.
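As a minimal illustration of pooling, the following Python sketch models a shared pool from which several customers draw capacity and to which they return it. The class name ResourcePool, the tenant names and the unit counts are assumptions made for this example; the sketch shows logical bookkeeping only and does not represent any provider's implementation.

```python
class ResourcePool:
    """A shared pool of identical capacity units (hypothetical model)."""

    def __init__(self, total_units: int) -> None:
        self.total_units = total_units
        self._allocated = {}  # tenant name -> units currently held

    @property
    def free_units(self) -> int:
        return self.total_units - sum(self._allocated.values())

    def usage(self, tenant: str) -> int:
        return self._allocated.get(tenant, 0)

    def allocate(self, tenant: str, units: int) -> bool:
        """Grant the request only if the pool still has enough free capacity."""
        if units <= 0 or units > self.free_units:
            return False
        self._allocated[tenant] = self.usage(tenant) + units
        return True

    def release(self, tenant: str) -> int:
        """Return everything the tenant holds to the pool for reassignment."""
        return self._allocated.pop(tenant, 0)


pool = ResourcePool(total_units=10)
pool.allocate("tenant-a", 6)   # granted
pool.allocate("tenant-b", 6)   # refused: only 4 units are free
pool.release("tenant-a")       # tenant-a's 6 units return to the pool
pool.allocate("tenant-b", 6)   # now granted from the same physical capacity
```

The same ten units serve both tenants at different times, which is the sense in which allocation is dynamic rather than physical.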
Rapid Elasticity:
When providing services, cloud computing can flexibly match the resources supplied to the
demands of customers. Allocation of resources depends on the
current need. Rapid elasticity enables changes to these allocations to be effected within very
short time intervals. Power usage is thereby improved, both in how power is supplied and in
the conservation of unused resources.
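The idea of matching allocation to current need can be sketched in a few lines of Python. The function name instances_needed, the per-instance capacity of 100 requests per second and the demand figures are all hypothetical; real autoscalers also account for start-up delay and smoothing, which this sketch ignores.

```python
import math


def instances_needed(demand: float, capacity_per_instance: float,
                     min_instances: int = 1, max_instances: int = 100) -> int:
    """Smallest instance count that covers the demand, kept within fixed bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    raw = math.ceil(demand / capacity_per_instance)
    return max(min_instances, min(max_instances, raw))


# Hypothetical demand trace (requests per second) over five time intervals.
demand_trace = [50, 400, 1200, 300, 0]
allocations = [instances_needed(d, capacity_per_instance=100) for d in demand_trace]
print(allocations)
```

The allocation rises with the peak and falls again afterwards, so capacity that is no longer needed is handed back rather than left idle.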
Measured Service:
Cloud computing stresses management of resources as well as measurement of operational
activity to ensure effective computing is in place. Various dashboards are utilized to provide
metrics for users as well as for providers. Monitoring, reporting, and measuring of resource
usage is transparently based on utilization. Service charges for cloud users are based on the
resources they use. For example, if a cloud user needs 20 computers at the same time for one
hour’s computation, the user is charged for exactly those 20 computer-hours. Afterwards, the
computers are released for immediate use by other users.
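The charging example can be worked through with a short Python sketch. The UsageRecord type and the price per machine-hour are assumptions introduced for illustration; actual providers use their own metering granularity and price lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One metered interval: how many machines ran, and for how long."""
    machines: int
    hours: float


def machine_hours(record: UsageRecord) -> float:
    return record.machines * record.hours


def charge(records, rate_per_machine_hour: float) -> float:
    """Pay-per-use: the bill depends only on the machine-hours consumed."""
    return sum(machine_hours(r) for r in records) * rate_per_machine_hour


RATE = 0.5  # hypothetical price per machine-hour, in arbitrary currency units

burst = UsageRecord(machines=20, hours=1.0)    # the case described in the text
steady = UsageRecord(machines=1, hours=20.0)   # same work done on one machine

print(charge([burst], RATE), charge([steady], RATE))
```

Under this simple model, 20 machines for one hour cost the same as one machine for 20 hours, which is why a short burst of parallel capacity carries no price penalty.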
Cloud Computing Deployment Models
As Mell and Grance (2011) state, the implementation of cloud models can be categorized in
several different ways, namely, by considering the owner of the infrastructure, the management
team, the location, or the end user. Four main cloud deployment models (see Figure 2) are
described below (Mell and Grance, 2011; NIST, 2013).
Figure 2: Cloud Computing Deployment Models
Public clouds: The public cloud is an environment that can be used and shared by everyone.
It is available through public networks, namely the Internet. In such a type of cloud, the cloud
environment and the management of its activities are under the full control of the provider.
Hardware and software infrastructure are also located on the provider’s premises. The biggest
problem in the deployment of a public cloud is that customers do not acquire any sort of
contractual agreement. In such conditions, the user is considered untrusted, which heightens
the need
for security and privacy controls. Amazon Elastic Compute Cloud (Amazon EC2) is an
example of a public cloud. Academic institutions are also providers of public cloud services.
Private clouds: The private cloud computing environment is dedicated to a single organization
or company and operated solely for its benefit or aims. Private networks are used to access the
cloud services provided and the cloud consumer is a member of the same organization. The
organization or company may be the provider itself, or it may have a provision agreement with
a separate provider of cloud facilities. In this situation, third parties are considered as trusted.
Community clouds: A community cloud is managed by a central organization. This model
resembles that of the private cloud. The community cloud is shared by several organizations
using shared resources. Data centres and infrastructure may belong to one or more community
organizations, or to a third party, and may be located on or off site. A government cloud is an
example of a community cloud where cloud services can be accessed by different governmental
organizations.
Hybrid clouds: Hybrid clouds combine the afore-mentioned environments. They contain
public, private, and community cloud characteristics. Standardized methods are used to provide
mechanisms for service portability. With hybrid clouds comprising different types of providers,
there is the risk of lack of common standards in communicating data and services. Users are
identified as either trusted or not trusted by this sort of cloud, and permissions will vary
accordingly. Trusted users are allowed to control, manage, and monitor their resources and
data flow. But users of a public cloud are not allowed to manage and monitor data flow. The
location of a hybrid cloud could be an organization or a trusted third-party site.
Cloud providers offer a few schemes that do not follow the previously mentioned models. For
example, Amazon offers a public cloud that enables users to connect to private cloud
resources and services (Bugiotti et al., 2012).
Cloud Computing Service Models
End-user service models are defined as three main types (see Figure 3). In ascending layers,
above the hardware and networking infrastructure, they are named as: Infrastructure as a
Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). All three types
of service are generally available to customer end-users through a web interface or a language-
interfacing Application Programming Interface (API) (Buyya et al., 2009; Mell and Grance,
2011; NIST, 2013). Details of each service model follow.
Figure 3: Cloud Computing Service Models
Software-as-a-Service (SaaS): In the SaaS model, cloud providers are used to deliver the
application services. SaaS defines a new model for software deployment in which
application(s) and hosting are provided to the customer as a service across the Internet, without
need for customers to install and run the software on their devices. Management of
applications, infrastructure and operating systems is the responsibility of the cloud provider.
Customers exercise control over application configuration settings as well as use of resources.
They need not know how the services run on top of the cloud. SaaS is restrictive compared
with IaaS as it forces customers to use the provided services. SaaS is a pay-as-you-go model,
which many software companies have adopted in place of licensing. The model is a cheaper way
of using software initially, but tool costs increase for the customer if usage becomes long
term. Examples of SaaS offerings and providers are Microsoft Office 365, Adobe Photoshop
Express, Google Docs, NetSuite, Marketo, CloudTran, AccelOps, and Apprenda.
Platform-as-a-Service (PaaS): With PaaS, the cloud provider furnishes the platforms for the
purpose of developing software applications and also provides the means to perform
development operations. The latter include tools and programming languages, APIs for
building and delivering web applications, and services such as design, development, testing,
deployment, hosting and maintenance. In short, the customer develops an application using
tools and programming languages provided by the cloud host and runs development over the cloud
host infrastructure. In this model, the customer does not have full control over cloud
components such as operating systems, network, platforms and servers. However, the customer
does have full control over the deployed applications as well as the configuration-setting
environment. Examples of PaaS providers are Google App Engine, Heroku, Microsoft Azure,
HPE Helion Stackato, Appcelerator, AppFog, and Engine Yard. HPE Helion Stackato, for
instance, is an open, scalable, flexible cloud application platform and ecosystem that supports
a variety of runtimes, frameworks and multi-cloud deployments.
Infrastructure-as-a-Service (IaaS): In the IaaS model, infrastructure services are completely
outsourced to the provider. Infrastructure elements such as raw data storage, network capacity,
and processing power are made available to the customer for use. IaaS gives the customer
control over the operating system on a provider’s infrastructure but does not allow management
of the infrastructure itself. IaaS characterizes the concept of virtualization by enabling the
customer to run their guest operating system over the provider’s virtualization software. This
model is more flexible for customers than the previously described models as it gives customers
higher control. From SaaS to IaaS, customers progressively gain more control as providers give
up control. IaaS providers also provision data centres, renting out, managing, and maintaining
hardware facilities for customers. Among IaaS providers are Amazon AWS, Windows Azure,
Google Compute Engine, IBM SmartCloud Enterprise, Rackspace Open Cloud, and HP
Enterprise Converged Infrastructure.
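The shift of control from provider to customer across the three service models can be summarized in a small lookup table. The Python sketch below is a simplified reading of the three descriptions above: the layer names are chosen for illustration, and placing the platform layer under customer control in IaaS is an assumption that follows from the customer controlling the operating system.

```python
# Layers of a hypothetical stack, from bottom to top.
STACK = ("hardware", "virtualization", "operating_system", "platform", "application")

# Layers the customer controls under each model. Under SaaS the customer only
# adjusts configuration settings, so no whole layer is listed.
CUSTOMER_CONTROLS = {
    "SaaS": frozenset(),
    "PaaS": frozenset({"application"}),
    "IaaS": frozenset({"operating_system", "platform", "application"}),
}


def controller(model: str, layer: str) -> str:
    """Return 'customer' or 'provider' for the given service model and layer."""
    if model not in CUSTOMER_CONTROLS:
        raise ValueError(f"unknown service model: {model}")
    if layer not in STACK:
        raise ValueError(f"unknown layer: {layer}")
    return "customer" if layer in CUSTOMER_CONTROLS[model] else "provider"


for model in ("SaaS", "PaaS", "IaaS"):
    print(model, [controller(model, layer) for layer in STACK])
```

In every model the hardware and virtualization layers stay with the provider; only the boundary above them moves.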
In choosing an appropriate type of cloud service, consideration should be given to the need for
space and the range of computing facilities. The computing needs of business, and the cloud
service models that can provide them, will differ from the needs and service provisions
associated with home or personal use. Pay-as-you-go services enable cloud space to be
expanded or contracted as needed, with charges made only for the services consumed.
Cloud Computing Architecture
The four layers of cloud computing architecture are: the hardware or fabric layer, the
infrastructure or unified resource layer, the platform layer, and the application layer, as shown
in Figure 4 (Kim, 2013; Jadeja and Modi, 2012; Kim, 2011; Zhang, Cheng and Boutaba, 2010;
Foster et al., 2008). The layer architecture of cloud computing enables systems to be
maintainable, testable, fault tolerant and scalable. Each layer represents a particular level of
functionality. This layered structure enhances the ability to ensure that each layer above
remains stable and robust. Clearly, cloud computing architecture needs to be firmly resistant
to the risks of operating system or hardware failure. Each of the four layers is described below
in detail.
Figure 4: Cloud Computing Architecture
The hardware layer: This layer contains the physical resources of the cloud and also is
responsible for managing cloud resources such as power switches, routers, server devices, and
cooling systems. The hardware layer, due to its hardware-oriented features, also includes data
centre implementations. A data centre contains a number of servers organized in racks and
connected to each other through routers and switches. Issues which involve the hardware layer
include configuration, traffic management, fault tolerance, cooling, and power controls. The
hardware layer provides services like the CPU, NIC, RAM, and Disk, and is also responsible
for logic and host hardware control, short term memory (STM), network connectivity (NC),
and long-term storage (LTS).
The infrastructure layer: Another name for the infrastructure layer is the virtualization layer.
It is the most essential component of cloud computing. This layer creates a resource pool for
storage and other activities by partitioning the physical resources using such virtualization
technologies as Xen, KVM, and VMware. It is the virtualization technologies of the
infrastructure layer that are used to deliver dynamic resource assignment.
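To make the partitioning of physical resources concrete, the following Python sketch places virtual machines on physical hosts using a first-fit rule. It is an illustrative model only: the Host class, the host names and the sizes are hypothetical, and hypervisors such as Xen or KVM and their management tools use considerably more elaborate placement policies.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Host:
    """A physical server with a fixed CPU and RAM capacity."""
    name: str
    cpus: int
    ram_gb: int
    vms: Dict[str, Tuple[int, int]] = field(default_factory=dict)  # vm -> (cpus, ram)

    def free(self) -> Tuple[int, int]:
        used_cpus = sum(c for c, _ in self.vms.values())
        used_ram = sum(r for _, r in self.vms.values())
        return self.cpus - used_cpus, self.ram_gb - used_ram


def place_vm(hosts: List[Host], vm_name: str, cpus: int, ram_gb: int) -> Optional[str]:
    """First fit: put the VM on the first host with enough free CPU and RAM."""
    for host in hosts:
        free_cpus, free_ram = host.free()
        if cpus <= free_cpus and ram_gb <= free_ram:
            host.vms[vm_name] = (cpus, ram_gb)
            return host.name
    return None  # no host in the pool can take this VM


hosts = [Host("host-a", cpus=8, ram_gb=32), Host("host-b", cpus=8, ram_gb=32)]
for name, cpus, ram in [("vm1", 4, 16), ("vm2", 6, 8), ("vm3", 4, 16), ("vm4", 4, 4)]:
    print(name, "->", place_vm(hosts, name, cpus, ram))
```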
The platform layer: The platform layer contains application frameworks and operating systems.
Its purpose is to reduce the burden of deploying applications directly into Virtual Machine (VM)
containers. For example, at the platform layer Google App Engine provides storage support for
databases and web application processing. The software framework resources provided at
platform level include APIs and the protocols, tools and databases needed to build any cloud-
based software applications.
The application layer: This is the most visible layer of the cloud environment. The
application layer is at the top of the hierarchical structure of cloud computing layers.
Automatic-scaling features can be leveraged by cloud computing applications for the purpose
of achieving higher performance, availability, and lower cost. All kinds of cloud-based
applications can be run at the application layer and provided on demand to the user.
Cloud computing architecture is modular. This means that a layer is loosely coupled with the
layer below and above, but each layer is also able to work independently. It has similarities in
design to the Open Systems Interconnection (OSI) network protocol model.
Associated Technologies
Following the structural concepts defined above, some theoretical background is now given on
Service-Oriented Architecture (SOA) and Virtualization.
Service-Oriented Architecture
The term service-oriented architecture (SOA) defines the process of delivering software
components as a service over the Internet and using messaging over well-defined interfaces for
communications. Most applications that involve the principles of SOA today are developed as
web-based services. SOAP and REST are web-based technologies that are used for the
implementation of SOA. SOA supports the development of applications by delivering
essential components. Use of SOA by developers can improve application performance and
shorten time to market. POST, GET, PUT and DELETE are the HTTP methods used in the REST
design model to map the basic create, read, update, and delete operations. From the
end-consumer perspective, most applications delivered as cloud services are now based on
REST-style web services, as any browser can support basic REST operations.
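The correspondence between HTTP methods and basic data operations can be sketched without a web server. In the Python example below, the InMemoryCollection class and its handle method are hypothetical names; the sketch dispatches on the method name and returns conventional HTTP status codes, and it is not the API of any particular framework.

```python
from typing import Dict, Optional, Tuple

# Conventional mapping of HTTP methods onto basic data operations.
METHOD_TO_OPERATION = {
    "POST": "create",
    "GET": "read",
    "PUT": "update",
    "DELETE": "delete",
}


class InMemoryCollection:
    """A toy resource collection addressed by integer id."""

    def __init__(self) -> None:
        self._items: Dict[int, dict] = {}
        self._next_id = 1

    def handle(self, method: str, item_id: Optional[int] = None,
               body: Optional[dict] = None) -> Tuple[int, Optional[dict]]:
        """Return (status_code, payload) for one request."""
        operation = METHOD_TO_OPERATION.get(method.upper())
        if operation is None:
            return 405, None                      # method not allowed
        if operation == "create":
            new_id = self._next_id
            self._next_id += 1
            self._items[new_id] = dict(body or {})
            return 201, {"id": new_id, **self._items[new_id]}
        if item_id not in self._items:
            return 404, None                      # no such resource
        if operation == "read":
            return 200, {"id": item_id, **self._items[item_id]}
        if operation == "update":
            self._items[item_id] = dict(body or {})
            return 200, {"id": item_id, **self._items[item_id]}
        del self._items[item_id]                  # operation == "delete"
        return 204, None


servers = InMemoryCollection()
print(servers.handle("POST", body={"name": "vm-1"}))
print(servers.handle("PUT", 1, {"name": "vm-2"}))
print(servers.handle("GET", 1))
print(servers.handle("DELETE", 1))
```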
Virtualization
A major trend in the IT industry is virtualization, with large enterprises benefiting from this
technology. In 1974, Goldberg and Popek proposed a definition of a Virtual Machine (VM) as
an “efficient, isolated duplicate of the real machine” (Popek and Goldberg, 1974). Many tasks
are implied by virtualization, a term which indicates abstraction of resources. Virtualization
principally applies to the OS/platform or application frameworks. VMware President and CEO
Diane Greene, in 2008, said: “Evaluation of virtualization began with users deploying virtual
machines for testing and development and then easing into server consolidations for production
environments”. In workflow applications, which are data intensive, providers such as Amazon
can offer virtualization of hardware. For building and deploying workflow applications over
the cloud, PaaS enables a higher-level development and runtime environment. SaaS providers
in the cloud stack offer end-users standardized software solutions that can be integrated into
existing workflows.
Cloud Computing Platforms
For both business enterprises and the consumer, cloud computing is shifting toward Software
as a Service (SaaS) and Platform as a Service (PaaS) models for the provision of on-demand,
location-neutral resources. This has resulted in the rapid increase of cloud platforms.
Amazon Elastic Compute Cloud (EC2)
Amazon EC2 offers an environment based on virtual computing that allows the
user to run applications over it. It provides the user with the facility to create a new Amazon
Machine Image (AMI), which contains the associated configuration, data, libraries and
applications. It also enables the user to upload selected or newly created AMIs to the Amazon
Simple Storage Service (Amazon S3) (Buyya et al., 2009).
Google App Engine
Google App Engine is a PaaS hosting platform that enables the user to develop and run web
applications created in popular programming languages such as Python, Java, or PHP. App
Engine automatically scales resources to meet customer demand and offers basic Google
account subscribers free limited storage, page views and hosted applications before charges
apply. Data storage is dependent on the unique App Engine file system and developers are
restricted to using supported APIs, languages, and frameworks. Applications can be executed
only by invoking HTTP requests (Buyya et al., 2009).
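An application that runs only in response to HTTP requests can be illustrated with Python's standard WSGI interface. The sketch below is generic: the function names application and call_app are chosen for this example, nothing in it is specific to App Engine, and the request is simulated in-process rather than sent over a network.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI application: all work happens inside a request handler."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path):
    """Invoke a WSGI app with a simulated GET request and collect the response."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application, "/status"))
```

Because the application holds no state between calls, a hosting platform is free to run as many copies as the request volume requires.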
Microsoft Azure
The Microsoft Azure platform represents both a PaaS and a SaaS model. It supports
applications that are developed in .NET languages, such as C#, J#, and VB.NET, to run on the
cloud environment managed by the Azure operating system. Azure makes available a
collection of tools and a number of protocols for application development, for example,
Microsoft SQL Services, Microsoft .Net Services, Live Services, Microsoft Dynamics CRM
Services, and Microsoft SharePoint (Buyya et al., 2009). Some functions are also accessible to
the user for integration in an application. However, the user is not allowed control over the
underlying Windows Azure operating system. Azure services provide such functionalities as
automatic load balancing, application life cycle management, geo-replication, Windows
SharePoint Services, SQL Services and .NET services.
Aneka
Aneka is a hybrid resource management platform that is .NET-based and service-oriented. It was
designed to support multiple application models, security solutions, communication protocols,
and persistence mechanisms, with the selection of services able to be changed at any time
without affecting the existing Aneka ecosystem. In order to create a system based on the Aneka
cloud, an instance of the Aneka container is started on each selected desktop computer. The
purpose of the Aneka container is to initialize services and to act as a single point of
interaction with the rest of the Aneka cloud. Service Level Agreement support is provided by
Aneka to meet user-specified QoS in such terms as budget (maximum cost level set by user) and
deadline (maximum time period for application completion) (Buyya et al., 2009).
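How a budget and a deadline jointly constrain resource allocation can be shown with a small Python sketch. It is not Aneka's scheduling algorithm or API: the function plan_nodes is hypothetical, the workload is assumed to be perfectly parallel, and billing is assumed to round up to whole hours per node.

```python
import math
from typing import Optional


def plan_nodes(work_node_hours: float, deadline_hours: float, budget: float,
               price_per_node_hour: float, max_nodes: int) -> Optional[int]:
    """Smallest node count that meets both the deadline and the budget, if any."""
    for nodes in range(1, max_nodes + 1):
        runtime = work_node_hours / nodes          # assumes perfect parallelism
        billed_hours = math.ceil(runtime)          # assumes whole-hour billing
        cost = nodes * billed_hours * price_per_node_hour
        if runtime <= deadline_hours and cost <= budget:
            return nodes
    return None  # the QoS terms cannot be met with the nodes available


# Hypothetical job: 10 node-hours of work, 3-hour deadline, 1 unit per node-hour.
print(plan_nodes(10, deadline_hours=3, budget=12, price_per_node_hour=1, max_nodes=8))
print(plan_nodes(10, deadline_hours=3, budget=11, price_per_node_hour=1, max_nodes=8))
print(plan_nodes(10, deadline_hours=0.5, budget=100, price_per_node_hour=1, max_nodes=8))
```

Under these assumptions a tighter budget can call for more nodes rather than fewer, because a shorter run wastes less of the final billed hour.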
Eucalyptus
Eucalyptus is an open source software framework that implements an IaaS environment.
Eucalyptus supports the deployment of both public and private clouds. It gives the user
control over entire virtual machine instances as they are deployed across
physical resources. Eucalyptus is compatible with the Amazon EC2 interface, so
Eucalyptus and Amazon instances can be controlled by the same tools. The primary purpose
of Eucalyptus is to provide an application testing platform before moving to Amazon’s
infrastructure. It also functions to control and manage distributed resources. The project
supports the integration of Amazon’s API into Linux distributions.
OpenNebula
OpenNebula is a hybrid, open source cloud platform. It offers an IaaS model for enterprise
cloud computing that aims to enhance existing client solutions for networking, storage,
virtualization, and management. OpenNebula only provides services that are based on open
source licensing so users are not restricted by a commitment to proprietary products. This
service offers full platform independence, dynamic virtual data centres as isolated
infrastructure environments, and interoperable cloud support for portability to other platforms.
OpenStack
OpenStack is a free cloud computing platform that deploys open source software. Managed by
a non-profit foundation, it is primarily used as an IaaS cloud solution with support from both
large companies and the global development community. Users can deploy virtual machines
and task-handling software across multiple instances on OpenStack scaling infrastructure. The
technology involves pools of processing, storage, and networking resources controlled by
interrelated projects, which are accessed via a data centre using a web-based user interface,
command-line tools, or a RESTful API. The open source environment means anyone can
access source codes at OpenStack to change, modify, or enhance them for sharing with the
development community at large.
Nova is a cloud computing framework controller, which is the central command function of an
IaaS system. It hosts and manages cloud computing systems as a collection of open source,
massively scalable technologies. It manages and automates pools of computer resources in an
expandable component-based structure and integrates with a wide range of virtualization
technologies, including hard disk virtualization and high performance computing
configurations. Data storage uses an SQL database. APIs are compatible with systems like
Amazon EC2.
Swift is essentially a scalable redundant storage system. Typically, static object data, such
as documents and images, are copied to multiple disk drives widely distributed among servers
in the data centre. OpenStack software is used to replicate data at least twice, ensure
integrity across the cluster, and retrieve data on request. Swift Object Storage offers data
storage on a long-term basis, which can be rapidly scaled by adding new nodes/servers.
Horizon provides administrators and users with a web-based graphical interface to access,
provision, and automate cloud-based resources. It is the authorized implementation of
OpenStack Dashboard and provides a user interface to other OpenStack services such as Nova,
Swift, and Keystone.
Keystone delivers a centralized directory of authorized users coordinated with OpenStack
services they are allowed to access. It acts as a collective authentication and high-level
authorization system for OpenStack and can integrate with other directory services like LDAP.
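The principle behind Swift's redundant storage, copying each object to several distinct servers, can be illustrated with a short Python sketch. The technique shown is rendezvous (highest-random-weight) hashing, chosen here for brevity; it is not Swift's own placement mechanism, and the node names and the function place_replicas are hypothetical.

```python
import hashlib
from typing import List


def _score(node: str, object_name: str) -> int:
    """Deterministic pseudo-random weight for a (node, object) pair."""
    digest = hashlib.sha256(f"{node}:{object_name}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place_replicas(object_name: str, nodes: List[str], replicas: int = 3) -> List[str]:
    """Choose `replicas` distinct nodes for an object by ranking node scores."""
    distinct = sorted(set(nodes))
    if replicas > len(distinct):
        raise ValueError("not enough distinct nodes for the requested replica count")
    ranked = sorted(distinct, key=lambda n: _score(n, object_name), reverse=True)
    return ranked[:replicas]


storage_nodes = ["node-1", "node-2", "node-3", "node-4", "node-5"]
print(place_replicas("reports/2013/summary.pdf", storage_nodes))
```

Because placement depends only on the object name and the node list, any server can recompute where the copies live without consulting a central index.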
Using Distributed Computing Infrastructures (DCIs), such as clouds, requires substantial
proficiency and understanding of the active components of such infrastructures. Scientists
working in unrelated fields cannot be expected to have the skills needed to design the operative
intricacies of such systems. To take full advantage of DCIs, they require a high-level,
domain-specific user interface that conceals the underlying infrastructure while presenting
the science-related features of the applications to be executed using DCIs. Science gateways
answer these requirements. Typically, they present a web interface that can be accessed from
anywhere. They require no installation on personal desktops or mobile devices as applications
are run by web-access to the clouds. Such gateways are being built by increasing numbers of
scientific communities that recognize how they can simplify their use of DCIs.