Cloud computing: Architecting in the cloud
anna.ruokonen@tut.fi
Anna Ruokonen, OHJ, TTY
Outline
Cloud computing
What is it?
Levels of cloud computing: IaaS, PaaS, SaaS
Moving to the cloud?
Architecting in the cloud
Best practices and solutions
Amazon Web Services
Building a cloud application - Example
What is cloud computing?
In principle, cloud computing implementations offer a
seemingly infinite pool of computing resources over the
network
Users can start, stop, and scale (up and down) that capacity at will
Comes close to the idea of utility computing:
Ideally, computing is provided the same way as e.g. water or electricity:
available in every home and charged based on consumption,
outsourcing all the hardware and getting charged by use.
What is cloud computing?
Three criteria for a cloud service:
(1) The service is accessible via a web browser or web services
API (no installation needed)
(2) No capital investment is needed to get started
(3) You pay only for what you use
[CloudArchitectures]
What is cloud computing?
Stefan Tai, Karlsruhe Institute of Technology:
cloud computing provides scalable, network-centric, abstracted IT
infrastructure, platforms, and applications as on-demand services that
are billed by consumption.
Three important viewpoints
Business opportunities
Internet scale service computing
Efficient management and utilization of systems
What's new and what's old?
Cloud computing combines features of
cluster computing and
grid computing
...with the help of virtualization.
[Figure: VM 1, VM 2 and VM 3 run on a virtualization layer on top of the host operating system and hardware, and are reached over the Internet]
Levels of cloud computing
Infrastructure as a Service (IaaS)
Offers a computer infrastructure, often a virtual hardware infrastructure, that is immediately accessible and ready to use
Platform as a Service (PaaS)
Offers a computing platform and/or software stack as a service
Often consuming IaaS and sustaining SaaS cloud applications
Software as a Service (SaaS)
While IaaS and PaaS are aimed at software developers, SaaS is often aimed
directly at the end user
Accessible via a browser and/or API (SOA services)
Infrastructure as a Service (IaaS)
The service provider offers a computer infrastructure
including storage, hardware, servers and networking
components often as a virtual hardware infrastructure
The service provider owns the equipment and is responsible
for housing, running and maintaining it.
IaaS: Some reasons to use it
Scalability: the pay-as-you-go model allows you to scale up or down
Error recovery: your hardware and data are located at your IaaS provider and
housed in (hopefully) secure data centers
Time back: you can focus on value-added tasks
Efficient payment model: no hardware investments
IaaS: Business model
Charging based on the resources and services used
time,
bandwidth,
transactions,
storage
etc.
Custom units and different measuring methods make the
comparison of the provider prices harder.
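As a rough illustration of usage-based charging, the sketch below adds up a bill from metered quantities. The unit names and rates are invented for this example and do not correspond to any real provider's price list; the function name monthly_bill is likewise hypothetical.

```python
# Hypothetical unit prices; real providers define their own units and rates.
RATES = {
    "instance_hours": 0.10,  # per hour of a running virtual machine
    "gb_transferred": 0.15,  # per GB of bandwidth
    "requests_10k": 0.01,    # per 10,000 transactions
    "gb_stored": 0.12,       # per GB of storage per month
}


def monthly_bill(usage, rates=RATES):
    """Return the total charge for a dict mapping unit name to amount used."""
    return sum(rates[unit] * amount for unit, amount in usage.items())


# Example usage for one month (made-up numbers).
usage = {
    "instance_hours": 720,
    "gb_transferred": 50,
    "requests_10k": 200,
    "gb_stored": 100,
}
print(round(monthly_bill(usage), 2))
```

Because each provider meters different units, comparing two offers means mapping the same usage profile onto each provider's own rate table.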
Platform as a Service (PaaS)
A PaaS service provides the hosting infrastructure, and tools
for development and deployment.
Sandboxed, more locked-in, but also more tasks handled by
the service provider (automation, load balancing, billing etc.)
Payment e.g. based on
Outgoing bandwidth
Incoming bandwidth
CPU time
Data storage space used
Recipients emailed
PaaS: Support for
Application design
Application development
Testing
Deployment
Hosting
Team collaboration
Web service integration
Database integration
Security
Scalability
Storage
Persistence
State management
Application versioning
Application instrumentation
PaaS: Some reasons to use it
Lower investment
Jump start development
No maintenance cost
Lower risk factor
If your project fails, just free the reserved resources and pay the
usage bill
Business
Provides a marketplace (e.g. Google Apps Marketplace) and/or a
customer pool (e.g. Facebook)
SaaS: Software as a Service
Software is provided and used through a web browser (or API)
A one-to-many model
Activities managed from central locations
As many business models as there are companies
Deployment models
Public cloud
Community cloud
Private cloud
Hybrid cloud
Positioning some players
[Figure: players positioned across four layers: cloud technology providers, Infrastructure as a Service, Platform as a Service, Software as a Service]
Players shown: HP, Google App Engine, Cordys, IBM, Facebook, Zynga, RackSpace, Force.com, OpenStack, Eucalyptus, VMWare, GoGrid, Microsoft Azure, Terremark, WaveMaker, Amazon Web Services, Oracle, Hosting.com, Bungee Connect, Techila, Joyent, LongJump, SalesForce, Dropbox, Animoto, Arch Red
[Huhtanen2010]
Positioning some players
[Figure: the same positioning of players across cloud technology providers, IaaS, PaaS and SaaS, annotated with two trends]
Vendor lock-in tightens, but opportunities for innovation, business models and market entry increase, need for venture capital decreases
Market entry costs and capital investment needs increase, need for venture capital increases
[Huhtanen2010]
Moving to the cloud?
It's about the architecture...
Transactional Web application architecture
Separation into presentation, business logic, and data storage
[Figure: Internet, load balancer, application server, database cluster]
Grid application architecture
Separation of the core application from its data processing nodes
[Figure: jobs are pushed to a job queue; processing nodes get jobs from the queue and publish results to a data manager, from which results are read]
[CloudArchitectures]
Moving to the cloud?
Options for IT infrastructure
                          Internal      Managed services   The Cloud
Capital investment        Significant   Moderate           Negligible
Ongoing costs             Moderate      Significant        Based on usage
Provisioning time         Significant   Moderate           None
Flexibility               Limited       Moderate           Flexible
Staff expertise required  Significant   Limited            Moderate
Reliability               Varies        High               Moderate to high
[CloudArchitectures]
Cloud best practices
1. Design for failure
2. Loose coupling
3. Implement Elasticity
4. Think Parallel
5. Build security in every layer - Design with security in mind
6. Don't fear constraints - Re-think architectural constraints
7. Leverage different storage options - One DOES NOT fit all
[CloudBestPractices]
Cloud best practices: Design for failure
No single points of failure
Replication, monitoring, load balancing, backups, snapshots...
Geographical redundancy with master-slave replication
[Figure: Internet, CloudFront, Region1 with Zone1, Region2 with Zone2, DB master, DB slave, permanent storage]
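One way to picture "no single points of failure" is a read path that falls back to a replica when the master copy is unreachable. The sketch below is an in-memory illustration only: the Replica class, the Unavailable exception and read_with_failover are hypothetical names, not part of any AWS API, and real replication and health checking are far more involved.

```python
class Unavailable(Exception):
    """Raised when a copy of the data cannot serve a request."""


class Replica:
    """In-memory stand-in for a database copy in one zone."""

    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = dict(data)
        self.healthy = healthy

    def read(self, key):
        if not self.healthy:
            raise Unavailable(self.name)
        return self.data[key]


def read_with_failover(replicas, key):
    """Try each copy in order and return the first successful read."""
    for replica in replicas:
        try:
            return replica.read(key)
        except Unavailable:
            continue  # this copy is down, try the next one
    raise Unavailable("all replicas failed")


# The master in zone 1 is down, so the read is served by the slave in zone 2.
master = Replica("zone1-master", {"user:1": "anna"}, healthy=False)
slave = Replica("zone2-slave", {"user:1": "anna"})
print(read_with_failover([master, slave], "user:1"))
```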
Cloud best practices: Decouple your components
Loose coupling using message queues for communication (isolating, buffering)
Component design
As stateless as possible
[Figure: tight coupling: Component 1, Component 2 and Component 3 call each other directly]
[Figure: loose coupling: Component 1, Component 2 and Component 3 communicate through Queue 1, Queue 2 and Queue 3]
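The buffering and isolation that a message queue provides can be sketched with Python's standard queue module standing in for a hosted queue service. The producer only puts messages on the queue and never calls the worker directly; the worker function and the upper-casing "work" are placeholders chosen for this example.

```python
import queue
import threading


def worker(jobs, results):
    """Consume jobs until a None sentinel arrives; knows nothing about the producer."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.put(job.upper())  # stand-in for real processing


jobs = queue.Queue()
results = queue.Queue()

consumer = threading.Thread(target=worker, args=(jobs, results))
consumer.start()

# The producer only talks to the queue, so it keeps working even if the
# consumer is slow or temporarily stopped (messages are buffered).
for name in ["a.png", "b.png"]:
    jobs.put(name)
jobs.put(None)  # tell the worker to stop

consumer.join()
processed = [results.get() for _ in range(results.qsize())]
print(processed)
```

Keeping the worker stateless means any number of identical workers could read from the same queue.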
Cloud best practices: Elasticity
Scaling (e.g. machine configurations, storage, computing capacity)
Monitor system metrics
Use load balancing tools
Automate
Scale based on variability in usage
[Figure: manual scaling (up and down): small instance, medium instance, large instance, medium instance]
[Figure: automatic scaling (out and in): one instance, two instances, four instances, two instances]
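A scaling rule driven by a monitored metric might look like the sketch below. The thresholds, the doubling and halving policy and the function name desired_instances are assumptions made for illustration; they are not how any particular auto-scaling service is configured.

```python
def desired_instances(current, cpu_percent, low=30.0, high=70.0,
                      minimum=1, maximum=8):
    """Double the fleet above `high` CPU load, halve it below `low`."""
    if cpu_percent > high:
        target = current * 2
    elif cpu_percent < low:
        target = current // 2
    else:
        target = current
    # Never go below the minimum or above the maximum fleet size.
    return max(minimum, min(maximum, target))


# Replay a series of monitored CPU readings against the rule.
fleet = 1
history = []
for load in [85, 90, 50, 20]:
    fleet = desired_instances(fleet, load)
    history.append(fleet)
print(history)
```

With these made-up readings the fleet grows from one to two to four instances and then shrinks back to two, the same out-and-in pattern as in the slide's figure.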
Cloud best practices: Parallel and distributed computing
Create job flows using MapReduce
Designed for scalable processing of large amounts of data
Automatic distribution of workload
Two simple programs, map(key, value) and reduce(key, values), are
distributed across several machines for parallel computation
[Figure: Input, several parallel Map tasks, Combine, parallel Reduce tasks, Output]
Cloud best practices: MapReduce cont.
1. Map: run on each key-value pair of the input; produces a list of intermediate values
2. Sort/combine: intermediate values are grouped and sorted by key
3. Reduce: run on the list of values for each key; produces a list of final values
map: <key_input, value_input> -> list(<key_output, value_intermediate>)
reduce: <key_output, list(value_intermediate)> -> <key_output, list(value_output)>
[MapReduce]
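The three phases can be imitated on a single machine to see how the pieces fit. This is a minimal sketch using word counting as the job; run_mapreduce, map_fn and reduce_fn are names chosen for this example, and nothing here is distributed or tied to Hadoop.

```python
from collections import defaultdict


def map_fn(key, value):
    """key: document name, value: document text. Emit (word, 1) for each word."""
    return [(word, 1) for word in value.split()]


def reduce_fn(key, values):
    """key: a word, values: all counts emitted for it. Return a list of final values."""
    return [sum(values)]


def run_mapreduce(inputs, map_fn, reduce_fn):
    # Map phase: run map_fn on every input key-value pair.
    grouped = defaultdict(list)
    for key, value in inputs:
        for out_key, intermediate in map_fn(key, value):
            # Sort/combine phase: collect intermediate values per key.
            grouped[out_key].append(intermediate)
    # Reduce phase: run reduce_fn once per key, in sorted key order.
    return {key: reduce_fn(key, grouped[key]) for key in sorted(grouped)}


docs = [("d1", "cloud queue cloud"), ("d2", "queue storage")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
print(counts)
```

In a real framework the map calls and the per-key reduce calls run on different machines; the programmer still only writes the two functions.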
Amazon Web Services (AWS)
Elastic Compute Cloud (EC2)
Virtual machines
Simple Queue Service (SQS)
Message queue
Simple Storage Service (S3)
Persistent storage
Files are held in buckets
A file is identified by a key and URI
SimpleDB (SDB)
No predefined schemas
domain:item:attribute, UTF-8 string
Attributes can be added dynamically and they can have multiple values
Elastic MapReduce (EMR)
Implements Google's MapReduce architecture
Uses the Hadoop implementation on top of EC2 instances and S3
[AWS]
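The schema-less domain:item:attribute model can be pictured as nested dictionaries in which every attribute holds a set of string values. The sketch below is an in-memory illustration only; put_attribute and get_attributes are hypothetical helpers, not the SimpleDB API, and the domain and item names are made up.

```python
from collections import defaultdict

# domain -> item -> attribute -> set of string values
store = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))


def put_attribute(domain, item, attribute, value):
    """Add a value; new attributes need no schema change and may hold many values."""
    store[domain][item][attribute].add(str(value))


def get_attributes(domain, item):
    """Return all attributes of an item with their values in sorted order."""
    return {name: sorted(values) for name, values in store[domain][item].items()}


put_attribute("images", "img-001", "keyword", "finland")
put_attribute("images", "img-001", "keyword", "snow")  # second value, same attribute
put_attribute("images", "img-001", "width", 640)       # new attribute, stored as a string
attrs = get_attributes("images", "img-001")
print(attrs)
```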
Amazon Machine Image (AMI)
Used to instantiate a virtual machine
Bundles the operating system (VM), application software and
associated configuration settings
Provides an API for configuration and management
[Figure: example AMIs running on Amazon EC2, each stacking your code, libraries and frameworks (e.g. J2EE) and an app server (e.g. Tomcat) on top of an OS (e.g. Linux)]
Building an AWS application
[Figure: building blocks of an AWS application]
Your Application
SOAP and REST APIs, command line tools, Admin Console
Amazon Elastic MapReduce
Auto Scaling, Elastic LB, CloudWatch
SDB (domains), SQS (queues)
Amazon S3 (objects and buckets), Amazon CloudFront
Amazon EC2 instances
Amazon worldwide physical infrastructure
(geographical regions, availability zones, edge locations)
Example: Image search service
An example of building a SaaS application using AWS
Implemented with Python, Boto, and Django
1. Create an SQS queue and fill it with website URIs
2. Process pages from SQS: extract images and keywords
3. Store images in S3
4. Store keywords and S3 URIs in SDB
5. Use EMR to find search suggestions (find common keyword relations)
6. Store search suggestions in SDB
[OHJ-5202]
[Figure: a list of websites feeds SQS; an image processing EC2 instance reads from SQS and writes to image storage (S3) and KeywordDB (SDB); MapReduce runs on EMR (EC2, S3); a WebApp EC2 instance serves search and image downloads]
Example: Find search suggestions
Fig_1: finland, snow
Fig_2: finland, lapland, snow
Fig_3: finland, winter
Map: <finland.snow, 1>, <finland.lapland, 1>, <finland.snow, 1>, ...
Reduce: <finland.snow, 2>, <finland.lapland, 1>, ...
Suggest: <finland, snow>
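The keyword example can be worked through in plain Python. This sketch assumes that the map step emits one pair per unordered keyword combination of an image and that a suggestion is the most frequent co-occurring keyword; both are interpretations of the slide, and map_pairs, reduce_counts and suggest are names invented here. It runs locally rather than on EMR.

```python
from collections import Counter
from itertools import combinations

figures = {
    "Fig_1": ["finland", "snow"],
    "Fig_2": ["finland", "lapland", "snow"],
    "Fig_3": ["finland", "winter"],
}


def map_pairs(keywords):
    """Emit <a.b, 1> for every unordered keyword pair of one image."""
    return [(".".join(pair), 1) for pair in combinations(sorted(keywords), 2)]


def reduce_counts(pairs):
    """Sum the counts emitted for each keyword pair."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def suggest(totals, keyword):
    """Return the keyword that co-occurs most often with `keyword`, or None."""
    best, best_count = None, 0
    for key, count in totals.items():
        a, b = key.split(".")
        if keyword in (a, b) and count > best_count:
            best, best_count = (b if a == keyword else a), count
    return best


emitted = [pair for keywords in figures.values() for pair in map_pairs(keywords)]
totals = reduce_counts(emitted)
print(totals["finland.snow"], totals["finland.lapland"])
print(suggest(totals, "finland"))
```

Sorting the keywords before pairing makes "finland.snow" and "snow.finland" the same key, so the counts from different images add up.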
Other tools..
Google App Engine: http://code.google.com/appengine/
Windows Azure: http://www.windowsazure.com/en-us/
[WindowsAzure]
References
[AWS] Amazon Web Services: http://aws.amazon.com/
[CloudArchitectures] Cloud Application Architectures: Building Applications and Infrastructure in the Cloud, George Reese, O'Reilly Media, 2009.
[CloudBestPractices] Architecting for the Cloud: Best Practices: http://media.amazonwebservices.com/AWS_Cloud_Best_Practices.pdf
[Hadoop] Open Source MapReduce, Apache Hadoop: http://hadoop.apache.org/
[Huhtanen2010] Karri Huhtanen: Cloud computing business models, 2010: http://www.cs.tut.fi/~tsysta/cloud-computing-business-models.pdf
[MapReduce] MapReduce: Simplified Data Processing on Large Clusters: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/mapreduce-osdi04.pdf
[OHJ-5202] OHJ-5202 Palvelupohjaiset järjestelmät: http://www.cs.tut.fi/~palpo/