Building Private Cloud with Open Source
Software for Scientific Environment
Zoran Pantic & M. Ali Babar
IT University of Copenhagen, Denmark
Nordic Symposium on Cloud Computing &
Internet Technologies (NordiCloud)
August 21st & 22nd, 2012, Helsinki, Finland
Zoran Pantic
Infrastructure Architect & Systems Specialist
Corporate IT @ University of Copenhagen
E-mail: zopa@itu.dk & zoran@pantic.dk
Academic profile: http://itu.academia.edu/ZoranPantic
Blog: http://zoranpantic.wordpress.com
LinkedIn: http://www.linkedin.com/in/zoranpantic
M. Ali Babar
Agenda
Non-technical part:
Why Private Cloud?
Why OSS?
Technical part:
Reflections on diverse IT-infrastructure aspects
OSS Private Cloud solutions:
UEC/Eucalyptus
OpenNebula
OpenStack
Conclusion
Questions? (also during the session)
Tutorial Goals
Understand the role and use of private cloud in specific
environments, e.g., scientific & academic
Gain knowledge of the technologies for setting up a private
cloud with open source software
Learn about the process for designing & implementing a
private cloud solution
Appreciate the socio-technical & technical challenges
involved and some potential strategies
Cloud Computing
“Cloud computing is a model for enabling convenient, on-demand network
access to a shared pool of configurable computing resources (e.g., networks,
servers, storage, applications, and services) that can be rapidly provisioned
and released with minimal management effort or service provider
interaction.”
(A definition by the US National Institute of Standards and Technology - NIST)
Main Aspects of a Cloud System
Reproduced from Figure 1 of The Future of Cloud Computing: Opportunities for European Cloud Computing beyond 2010.
Commercial Efforts & NFRs
From The Future of Cloud Computing: Opportunities for European Cloud Computing beyond 2010
Service & Deployment Models
Service Models
Software as a Service (SaaS): Google Apps, Zoho, Salesforce CRM, Microsoft Cloud Services
Platform as a Service (PaaS): Google App Engine, Microsoft Azure, Force.com, Yahoo Open Strategy
Infrastructure as a Service (IaaS): Amazon EC2, Eucalyptus, IBM Computing on Demand (CoD), VMware vSphere
Deployment Models
Public Clouds, Private Clouds, Community Clouds, Virtual Private Clouds, Hybrid Clouds
Private Cloud
Private cloud has different meanings to different people
But basically, it’s a cloud infrastructure set up, managed,
and upgraded by an organization or community for their
own use
Commercial vendors are entering this domain quickly,
and Open Source providers are also there:
Eucalyptus, OpenNebula, OpenStack
Steps for Setting Up Private Cloud
Adopt a machine virtualization strategy
Profile application compute, memory, and storage usage
and performance requirements
Design a virtual machine development consultancy
Accounting and recharge policies adapted to self-service
Architect a deployment and deploy a private cloud
Source: Five Steps to Enterprise cloud computing, a White paper of Eucalyptus Systems, Inc.
Why Private Cloud? 1/2
Usually, the budget is low, and the project should start as soon as
possible
Growing strongly:
The need for processing large data volumes
The need to conserve power by optimizing server utilization
Non-standard highly-adaptable solution needed
Analyzing large amounts of data to get results
Many different research projects in one organization
Why Private Cloud? 2/2
Private clouds:
Have higher ROI than traditional infrastructure
Are more customizable
Can quickly respond to changes in demands
Support rapid deployment
Have increased security
Focus on an organization’s core business
Require steadily decreasing effort to run
Why OSS?
In general:
Lowering the costs (i.e. no licensing headaches!) – the budgets
aren’t growing - but the demands are
Interchangeability & portability (general, avoiding vendor lock-in)
Socio-organizational reasons
Energy efficiency
Examples: UEC/Eucalyptus, OpenNebula, OpenStack, Joyent
SmartOS
Private Cloud Challenges
Challenges:
Socio-technical
Technical
Socio-technical Challenges
Socio-technical challenges: mostly political and economic:
Existing structures oppose implementation of private cloud
Weak transparency of who is in charge of systems and economy,
Research cannot be cost-effective on market terms,
Administrators de facto in charge - instead of scientific groups
Tendency of implementing things because they are interesting
and “fun”, while maybe there is no need for those systems.
Technical Challenges
Private cloud maturity,
Problems porting program code,
IT departments should be big enough, with enough
expertise,
OSS: community cannot fix all your problems.
Implementing Cloud Solutions
Determine the needs and their nature – extensive interaction
with all the major stakeholders, e.g., project leader
Top-down steering of the process
Design and implement a test case
End users also thoroughly test the solution - free of charge,
Make sure that implementation succeeds first time!
Get a very clear picture of what services are to be offered,
who will use them, what they will use them for, and how!
Private Cloud in Scientific Environment
Based on Open Source Software (OSS)
Focus on the logistical and technical challenges, and
strategies of setting up a private cloud for scientific
environment
General scenarios:
Local DIY
OSS Private Cloud
Enterprise Private Cloud (with mgmt solution)
Virtual Private Cloud
... or just going Public Cloud
Focus on Scientific Environment
Difference in implementing for “infantry” and “supply troops”
“Infantry” - to support research, scientific computing and
High Performance Computing (HPC)
“Supply” - to support daily operational systems and tasks
i.e. joint administration
Bookkeeping, administration, communications (telephony, e-mail, messaging)
“Infantry” – stateless instances vs. “Supply” – stateful
instances
Scientific Environment: “Infantry” 1/2
Uses non-standard & advanced research instruments
Applicable in research, scientific computing and HPC, i.e.:
Generally if users need VMs that they administer themselves
(root access) - more appropriate to supply them with machines
from private cloud, than giving access to virtual hosts behind
firewall
Organizations like ITU (Denmark): for numerous different projects
Organizations like DCSC (Denmark): 1/3 of the jobs would be
runnable on private cloud
in HPC: Only in low end, for low memory and low core
number jobs
Scientific Environment: “Infantry” 2/2
Summarized suggestions
Have social psychology in mind as important factor
Consult the professor in charge of money for the project
Implement an open source solution – OpenStack, OpenNebula,
UEC based on Eucalyptus, Joyent SmartOS, ...
Scientific environment: “Supply”
Needs a stable and supported solution
Summarized suggestions
Have social psychology in mind as important factor
Consult the system owner in charge of money for the project
Implement a proprietary solution from reputable provider
Microsoft Hyper-V, VMware Virtual Infrastructure, …
Sign a support agreement & negotiate a good SLA
CPU and Memory
Processor architecture:
Intel & AMD
Definitely 64-bit – for performance reasons
Multiprocessor, multicore, hyper threading
Virtual Extensions enabled hardware is a must
Intel VT-x or AMD-V virtualization extensions – virtualization
enabled hardware (check for the vmx or svm flag in /proc/cpuinfo)
Host’s RAM minimum 4 GB
Enable KSM (Kernel SamePage Merging)
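The two host checks above can be sketched in a few lines of Python. This is a minimal sketch, not an official tool: the function names are made up for illustration, and it assumes the usual Linux conventions that the `vmx` flag marks Intel VT-x, the `svm` flag marks AMD-V, and that `/sys/kernel/mm/ksm/run` contains `1` while KSM is running.

```python
# CPU flags that signal hardware virtualization support in /proc/cpuinfo.
VIRT_FLAGS = {"vmx": "Intel VT-x", "svm": "AMD-V"}


def virtualization_support(cpuinfo_text):
    """Return the name of the virtualization extension found, or None."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            for flag in value.split():
                if flag in VIRT_FLAGS:
                    return VIRT_FLAGS[flag]
    return None


def ksm_running(run_file_text):
    """Return True when the text of /sys/kernel/mm/ksm/run is '1'."""
    return run_file_text.strip() == "1"
```

On a Linux host, pass in the contents of `/proc/cpuinfo` and `/sys/kernel/mm/ksm/run` respectively; working on text rather than on the files keeps the functions easy to test on any machine.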
Storage Options
Disk interfaces: parallel & serial
Contemporary disk types:
SATA
SCSI
SAS
SSD
Hybrid drives
Storage Types 1/3
Local vs Remote storage:
Local storage:
disks in the host itself
DAS – attached directly to the host
Remote storage:
NAS - File Level Storage (NFS, SMB/CIFS)
Also distributed file systems (e.g. MooseFS and GlusterFS)
SAN - Block Level Storage (FC/FCoE, iSCSI)
OSS/free SAN/NAS appliance example: Napp-it , based on
ZFS/Nexenta
Storage Types 2/3
Storage levels:
Block – bits stored sequentially in a block of fixed size; read & write raw
data blocks; for file systems or DBMSs
File – maintains physical location of files, mapping them to blocks (i.e.
inode number / pointers)
Object – data organized in flexible sized containers, objects, consisting
of data (sequence of bytes) and metadata (extensible attributes
describing the object); for static data; distributed storage spread across
multiple disk drives and servers; no „central brain“ or „master point“ –
scalable, redundant, durable
Partitioning in Linux using Logical Volume Manager (LVM)
Physical Volume (PV): a disk or partition handed over to LVM
Volume Group (VG): one or more PVs pooled into one VG
Logical Volume (LV): one or more LVs carved out of a VG
Storage Types 3/3
Disk configuration:
Independent disks (JBoD)
RAID:
Multiple drives comprising one logical unit
Can be based on software, hardware or firmware
Some of the RAID levels:
0 – block-level striping without parity or mirroring
1 – mirroring without parity or striping
5 – block-level striping with distributed parity
6 – block-level striping with double distributed parity
01 (0+1) – striped sets in a mirrored set
10 (1+0) – mirrored sets in a striped set
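To make the trade-off between the RAID levels concrete, the usable capacity can be worked out from the number of disks. The sketch below assumes identical disks and two-way mirrors for RAID 10; the function name and the minimum disk counts are illustrative choices, not taken from the slides.

```python
def usable_capacity(level, disks, disk_size_gb):
    """Usable capacity in GB for identical disks at a given RAID level."""
    minimum = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == "0":
        return disks * disk_size_gb        # striping only, no redundancy
    if level == "1":
        return disk_size_gb                # every disk holds the same data
    if level == "5":
        return (disks - 1) * disk_size_gb  # one disk's worth of parity
    if level == "6":
        return (disks - 2) * disk_size_gb  # two disks' worth of parity
    # RAID 10: a striped set of two-way mirrors, so half the raw capacity.
    if disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return disks // 2 * disk_size_gb
```

For example, four 1000 GB disks give 3000 GB in RAID 5 but only 2000 GB in RAID 10, which buys better write performance and simpler rebuilds in exchange.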
Virtualization 1/2
Different types of virtualization:
Hardware
Storage
Network
Memory
Application
Desktop
...
Virtualization 2/2
Hardware virtualization:
Full virtualization: guest unmodified, unaware
HW-assisted virtualization: hw architecture supports virtualization
Partial virtualization: partially simulates the physical hardware
of a machine; i.e. each guest has independent address space
Paravirtualization: guest is aware that it’s not „alone“; guest
modification required (drivers)
OS-level virtualization (Container-based virtualization):
physical server virtualized at OS-level, enabling multiple
isolated and secure virtualized servers to run on a single
physical server; guest and host share the same OS
Types of Hypervisors
Types of hypervisors:
Native / bare metal – run directly on the host’s hardware
Hosted – run within OS
Major virtualization vendors & technologies used in hypervisor layer:
http://www.cloudcomputeinfo.com/virtualization (source: Paul Morse)
Today's most used hypervisors:
KVM/QEMU
Xen
VirtualBox
VMware
Hyper-V
SmartOS
KVM
KVM - „Kernel-based Virtual Machine“, http://www.linux-kvm.org
Linux kernel module that allows a user space program to utilize the
hardware virtualization features of various processors (Intel and AMD
processors - x86 and x86_64, PPC 440, PPC 970, S/390)
KVM is included in the kernel: a more recent kernel gives updated KVM
features, but is less tested
virtualization solution that can run multiple virtual machines running
unmodified Linux or Windows guests
Supports .raw, .qcow2 and .vmdk disk image formats
Available as an integrated part of the mainline Linux kernel since version 2.6.20
Components:
loadable kernel module „kvm.ko“ that provides the core virtualization
processor specific module „kvm-intel.ko“ or „kvm-amd.ko“
KVM is only an interface that is called through a special system file,
and requires QEMU to be a full virtualization environment
QEMU
QEMU – Quick Emulator - http://wiki.qemu.org
generic open source machine emulator and virtualizer:
Emulator: runs OSes made for one machine on a different machine
Virtualizer:
executes guest code directly on the host CPU
Executed under Xen hypervisor, or using the KVM kernel module
Xen
Open Source virtualisation technology - http://www.xen.org
Started as XenoServer project at Cambridge University
Used as standalone hypervisor, or as hypervisor component in
other cloud infrastructure frameworks
Supports .raw and .vmdk disk image formats
VirtualBox
Oracle VirtualBox - https://www.virtualbox.org
Free software released under GNU GPL
An x86 virtualization platform, created by Innotek, purchased by
Sun, and now owned by Oracle
Installed on a host OS as an application
VMware
VMware - http://www.vmware.com
Different hypervisors:
ESX – mainline product; commercial license
ESXi – mainline product, free (not OSS); boot from flash cards supported
Server – free (not OSS), installs on Linux & Windows
Workstation/Player – virtualization on user PCs
Supports .vmdk disk image format
Hyper-V
Microsoft Hyper-V - http://www.microsoft.com/en-us/server-cloud/windows-server/hyper-v.aspx
Released in 2008, new 2012 release expected in November
Virtualization platform that is integral part of Windows Server
Only for x86-64
Can boot from a flash card on the server's motherboard
Variants:
Stand-alone product, free, limited to command line interface
As Hyper-V role inside Windows Server
Supports .vhd disk image format
SmartOS
Joyent SmartOS - http://smartos.org
Free, open-sourced in August 2011, descended from
OpenSolaris via illumos
Hypervisor powering Joyent’s SmartDataCenter, can run
private, public and hybrid cloud
Enables HW-level and OS-level virtualization in a single OS
Features: KVM, Zones, DTrace, ZFS
Networking Services 1/3
Providing basic network services (DNS, GW, NAT, ...) is a
good idea
Physical & virtual networks
Physical network:
Implementing private cloud using 2 or 3 networks: WAN,
Cloud public & Cloud private
Firewall: OSS based pfSense - to make the whole environment
independent of the network infrastructure / environment
where it will be “plugged in”
Networking Services 2/3
Virtual networks: (e.g. Nicira, Xsigo)
Independence from network HW
Reproduction of the physical network
Operating model of computing virtualization
Compatibility with different hypervisors
Isolation between virtual and physical network, and control
layer
Cloud-like scaling & performance
Programmatic provisioning & control
Networking Services 3/3
Network virtualization (example: Nicira)
Redundancy
Automatic/manual failover/failback
Clusters: active/active, active/passive (quorum)
Private Cloud
Some HA features, but local to every provider
Work in progress: Corosync + Pacemaker
„Corosync“ – Open Source cluster solution
„Pacemaker“ – Open Source HA cluster resource manager
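The quorum mentioned for clusters is usually a strict majority of the configured nodes. A minimal sketch of that rule follows; it illustrates the general principle only and is not the Corosync or Pacemaker API, and both function names are hypothetical.

```python
def quorum_size(total_nodes):
    """Smallest number of votes that forms a strict majority."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def has_quorum(total_nodes, reachable_nodes):
    """True when the reachable nodes may keep running cluster resources."""
    return reachable_nodes >= quorum_size(total_nodes)
```

Under this rule a two-node cluster loses quorum as soon as one node fails, which is one reason odd node counts are commonly preferred.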
Private Cloud Offerings
List of OSS Private Cloud offerings: (source: Paul Morse)
http://www.cloudcomputeinfo.com/private-clouds
Covered:
Eucalyptus (Ubuntu Enterprise Cloud, UEC)
OpenNebula
OpenStack
Eucalyptus
Was bundled with Ubuntu (UEC); now „only“ supported
(Ubuntu is bundling OpenStack from 11.10)
UEC/Eucalyptus is an on-premise private cloud OSS based
platform, sponsored by Eucalyptus Systems
Started as research project in 2007 @ UCSB
Linux based – RHEL, CentOS, Ubuntu
Support for VMware
For scalable private and hybrid clouds
Hybrid clouds achieved by API compatibility with Amazon’s
EC2, S3, and IAM services
New feature since UEC: Eucalyptus HA
All figures taken from http://www.eucalyptus.com
Requirements
All components must be on physical machines (no VMs!)
Processor Intel or AMD with 2 cores of 2 GHz
Min 4 GB RAM
Storage: min 30 GB for each machine, 100-250 GB and more for
SC & NC recommended
Network: min 1 Gbps NICs, bridges configured on NCs
Linux – if Ubuntu, choose LTS (Long Term Support) version
Hypervisors: (Xen, KVM, VMware)
RHEL & CentOS must have Xen
Ubuntu must have KVM
VMware
SSH connectivity between machines
Components
Designed as a distributed system with a set of 5 (6) elements:
Cloud Controller (CLC)
Walrus Storage Controller (WS3)
Cluster Controller (CC)
Storage Controller (SC)
Node Controller (NC)
VMware Broker (Broker or VB) - optional
Architectural Layers
Three levels:
1. Cloud level
Cloud Controller (CLC)
Walrus Storage Controller (WS3)
2. Cluster level
Cluster Controller (CC)
Storage Controller (SC)
VMware Broker (Broker or VB)
3. Computing level
Node Controller (NC)
Cloud Controller (CLC)
Entry point to Eucalyptus cloud
web interfaces for administering the infrastructure
web services interface (EC2/S3 compliant) for end users / client tools
Frontend for managing the entire UEC infrastructure
Gathers info on usage and availability of the resources in the
cloud
Arbitrates the available resources, dispatching the load to the
clusters
Walrus Storage Controller (WS3)
Equivalent to Amazon’s S3
Bucket based storage system with put/get storage model
WS3 is storing the machine images and snapshots
Persistent simple storage service, storing and serving files
Cluster Controller (CC)
Entry point to a cluster
Manages NCs and instances running on them
Controls the virtual network available to the instances
Collects information on NCs, reporting it to CLC
One or several per cloud
Storage Controller (SC)
Allows creation of block storage similar to Amazon’s Elastic
Block Storage (EBS)
Provides the persistent storage for instances on the cluster
level, in form of block level storage volumes
Supports creation of storage volumes, attaching, detaching
and creation of snapshots
Works with storage volumes that can be attached by a VM or
used as a raw block device (no sharing though)
Works with different storage systems (local, SAN, NAS,
DAS)
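The volume lifecycle described above (create, attach, detach, snapshot, no sharing between VMs) can be modelled as a small state holder. This is an illustrative sketch only; the class, its methods and the ID format are hypothetical and do not mirror the Eucalyptus code or API.

```python
class Volume:
    """Toy model of a block storage volume with exclusive attachment."""

    def __init__(self, volume_id, size_gb):
        self.volume_id = volume_id
        self.size_gb = size_gb
        self.attached_to = None  # instance ID, or None when detached
        self.snapshots = []

    def attach(self, instance_id):
        # A volume cannot be shared: refuse a second attachment.
        if self.attached_to is not None:
            raise RuntimeError(
                f"{self.volume_id} is already attached to {self.attached_to}")
        self.attached_to = instance_id

    def detach(self):
        if self.attached_to is None:
            raise RuntimeError(f"{self.volume_id} is not attached")
        self.attached_to = None

    def snapshot(self):
        """Record a snapshot and return its (made-up) identifier."""
        snapshot_id = f"{self.volume_id}-snap-{len(self.snapshots) + 1}"
        self.snapshots.append(snapshot_id)
        return snapshot_id
```

The point of the sketch is the exclusivity check in `attach`: a volume has to be detached from one instance before another instance can use it.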
VMware Broker (Broker or VB)
Optional component for Eucalyptus subscribers
Enables deploying VMs on VMware infrastructure
Responsible for arbitrating interactions between CC and
ESX/ESXi hypervisors
located with CC
Node Controller (NC)
Compute node (“work horse”), runs and controls the instances
Supported hypervisors:
KVM (preferred, open source version)
Xen (open source version)
VMware (ESX/ESXi, for subscribers)
Communicating with both OS and the hypervisor running on the
node, and Cluster Controller
Gathers the data about physical resource availability on the node
and their utilization, and data about instances running on that
node, reporting it to CC
One or several per cluster
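The reporting chain (NC reports to CC, CC reports to CLC, CLC dispatches load to clusters) can be sketched as follows. The selection policy shown, picking the cluster with the most free cores, is an assumption made for illustration and not Eucalyptus' actual scheduling logic; all names are hypothetical.

```python
def cluster_free_cores(node_reports):
    """What a CC might report upward: total free cores across its NCs."""
    return sum(node_reports.values())


def pick_cluster(clusters, cores_needed):
    """Pick a cluster for a new instance, or None if nothing fits.

    clusters maps a cluster name to {node name: free cores}. A cluster
    qualifies only if a single node can host the whole instance.
    """
    candidates = {
        name: cluster_free_cores(nodes)
        for name, nodes in clusters.items()
        if max(nodes.values(), default=0) >= cores_needed
    }
    if not candidates:
        return None
    # Sorting first makes ties resolve to the alphabetically first name.
    return max(sorted(candidates), key=candidates.get)
```

Note that the summary a CC passes upward loses detail: a cluster with many free cores spread thinly across nodes may still be unable to place a large instance, which is why the sketch also checks the largest single node.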
Plan Installation
Integration with LDAP or AD
Support for remote storage (SAN/NAS – check supported devices)
Choose a layout between “NC on one server and all other components on
another” and “each component on a separate server”
Trade-off between simplicity and performance & HA
Installation
Using Ubuntu+Eucalyptus bundled installation (not available
in new versions of Ubuntu, since version 11.10 Ubuntu includes
OpenStack instead)
Manually:
Install OS
Verify network (connectivity, FW, VLAN, DNS...)
Install hypervisor
Configure bridges, NTP and MTA
Install Eucalyptus
Configure Eucalyptus (network modes, hypervisors, runtime
environment)
Optionally configure HA
Scale-out Possibilities
2 physical servers:
Server 1: CLC/WS3/CC/SC
Server 2: NC
3 physical servers:
Server 1: CLC/WS3
Server 2: CC/SC
Server 3: NC
4 physical servers:
Server 1: CLC
Server 2: WS3
Server 3: CC/SC
Server 4: NC
5 physical servers:
Server 1: CLC/WS3
Server 2: CC1/SC1
Server 3: NC1
Server 4: CC2/SC2
Server 5: NC2
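The scale-out layouts on this slide can be written down as data and checked for completeness. The sketch below encodes the 2-server and 5-server variants and verifies that each mandatory component type is placed somewhere; the dictionary layout and the function name are illustrative assumptions.

```python
# Mandatory component types; the VMware Broker is optional and left out.
REQUIRED = ("CLC", "WS3", "CC", "SC", "NC")

LAYOUTS = {
    2: {"server1": ["CLC", "WS3", "CC", "SC"], "server2": ["NC"]},
    5: {
        "server1": ["CLC", "WS3"],
        "server2": ["CC1", "SC1"],
        "server3": ["NC1"],
        "server4": ["CC2", "SC2"],
        "server5": ["NC2"],
    },
}


def missing_components(layout):
    """Return the required component types that no server hosts."""
    placed = [name for components in layout.values() for name in components]
    # Prefix matching lets numbered names such as "CC1" count as "CC".
    return [req for req in REQUIRED
            if not any(name.startswith(req) for name in placed)]
```

An empty result means the layout covers every component type; it says nothing about performance or HA, which is the trade-off the different server counts address.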
Scaling Out
[Figure: one cloud scaled out to three clusters (Cluster 1, Cluster 2, Cluster 3), each with its own Node Controllers (NC)]
Networking
Networking modes offering different levels of security and flexibility:
Managed
Managed No VLAN
System
Static
High Availability
Redundancy - Eucalyptus HA:
By configuring HA, primary and secondary cloud and cluster
components are introduced
Hot-swappable components: CLC, Walrus, CC, SC, and VB
Must have 3 NICs if fearing network hardware failure
For HA SCs, supported SANs needed
NCs are not redundant
Externally accessible components (cloud level) must have DNS
Round-Robin support
Arbitrator service uses ICMP messages to test reachability
If all arbitrators fail to reach some component, failover is initiated
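The arbitrator rule can be expressed as a small decision function. This is a sketch of the rule as stated on the slide, not Eucalyptus code: `probe` is a hypothetical callable standing in for the ICMP reachability test, and the handling of an empty arbitrator list is an assumption.

```python
def should_fail_over(arbitrator_results):
    """Decide on failover from {arbitrator name: reached component?}."""
    if not arbitrator_results:
        return False  # assumption: no arbitrators means no evidence of failure
    # Failover only when every arbitrator failed to reach the component.
    return not any(arbitrator_results.values())


def check_component(component, arbitrators, probe):
    """Probe a component from every arbitrator and apply the rule."""
    results = {name: probe(name, component) for name in arbitrators}
    return should_fail_over(results)
```

Requiring all arbitrators to agree avoids a failover caused by a single broken network path.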
WebGUI
GUI using HybridFox
OpenNebula
An Open Source project aiming at implementing the
industry standard for building and managing virtualized data
centres and cloud infrastructure (IaaS)
Sponsors:
EU through various programs (via DSA, RESERVOIR, 4CaaSt,
StratusLab, BonFIRE)
National grants
C12G Labs
Microsoft
All figures taken from http://opennebula.org
History
Characteristics 1/3
Doesn’t have specific infrastructure requirements, making it easy
to fit in the existing environment
Try it on your laptop! Does not require any special hardware or
software configuration (single server + distro of your choice)
Supports implementations as Private, Hybrid (with both Bursting
and Federation) and Public Cloud
Provides Storage system (storing disk images in datastores; images
can be OS installations, or data blocks), Template Repository
(registering VM definitions), Virtual Networking & Management
(CLI & Sunstone GUI, features live and cold migration, stop,
resume, cancel)
Characteristics 2/3
Has great modularity, which eases the integration with other
solutions
Implemented on a plugin model, making it easy to customize
different aspects (virtualization, storage, authentication &
authorization, ...)
Any action is performed by a bash script
Doesn’t implement a „default“ hypervisor
The core of OpenNebula written in C++, making it robust
and scalable
Monitoring: Configurations of VM’s and all monitoring
information is stored in a (SQL) database
Characteristics 3/3
Uses common open industrial standards – i.e. Amazon EC2
API and Open Cloud Computing Interface (OCCI)
OpenNebula’s native cloud API:
available as Java, Ruby, and XML-RPC API
gives access to all the functions
enables integration of own procedures
Security at high level: host communication using SSH (RSA)
and SSL
Quality: relies on Community and own QA
Making VNC sessions to running VMs supported
Main components
Main features (v3.6)
User Security & Multitenancy using Group Management
Virtual Data Centers
Control & Monitoring of Physical & Virtual Infrastructure
Supports multiple hypervisors, data stores, network
integrations, datacenter monitoring (Ganglia)
Distributed Resource Optimization
High Availability
Hybrid Cloud & Bursting
Self-service provisioning portal
Internal Architecture 1/4
The three layers of the internal architecture:
Internal Architecture 2/4
Drivers communicate directly to the OS
Transfer driver: manage the disk images on the storage
system, that could be NFS or iSCSI, or copying using SSH
Virtual Machine driver: specific to the hypervisor
implemented; manage the VM’s running on the hosts
Information driver: specific to the hypervisor implemented;
showing the current status of hosts and VMs
Internal Architecture 3/4
Set of components to control and monitor VMs, VNs, storage &
hosts:
Request Manager: handles client requests
Virtual Machine Manager: manages & monitors VMs
Virtual Network Manager: manages virtual networks
Host Manager: manages & monitors physical resources
Database: persistent storage (state)
Internal Architecture 4/4
CLI: manual manipulation of the virtual infrastructure
Scheduler: invokes actions on VMs (using XML-RPC interface)
Other: 3rd party tools (using XML-RPC interface or OpenNebula Cloud
API)
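To give an idea of what a third-party tool sends over the XML-RPC interface, the sketch below builds a request with Python's standard `xmlrpc.client` module without contacting a server. The endpoint URL, the session string and the method names `one.hostpool.info` and `one.vm.info` are assumptions based on the OpenNebula XML-RPC documentation and should be checked against the installed version.

```python
import xmlrpc.client

# Assumed front-end endpoint and a placeholder "user:password" session string.
ENDPOINT = "http://localhost:2633/RPC2"
SESSION = "oneadmin:password"


def build_call(method, *params):
    """Serialize an XML-RPC request; the session string goes first."""
    return xmlrpc.client.dumps((SESSION,) + params, methodname=method)


# To actually send a request against a running front end, one could use:
#   xmlrpc.client.ServerProxy(ENDPOINT).one.hostpool.info(SESSION)
```

Calling `build_call("one.vm.info", 7)` returns the XML document for a request about the VM with ID 7, which is useful for inspecting the wire format before pointing a tool at a real front end.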
OpenNebula – hypervisors
Xen
KVM/QEMU
VMware
OpenNebula – hardware
Processor requirement: CPU with virtualization support
Memory:
Host: minimum 4 GB
Guest: 256 MB for smallest instance
Storage based on RAID: local disk for PoC, SAN for
production systems
Network: gigabit network card(s), optionally bonding
several cards together (performance & redundancy)
OpenNebula – system components
Frontend
Hosts
Image Repository
Physical network
OpenNebula - networking
A dedicated network is recommended for the Service
Network
VM’s network interface is connected to a bridge in
the host (i.e. a host with two NICs, public and private, should
have two bridges)
Create bridges with the same name in all the hosts
Drivers that may be associated with each host: Dummy, Fw, 802.1Q, Ebtables, Ovswitch, VMware
Driver compatibility by hypervisor:
Hypervisor   Fw    Ovswitch   802.1Q   Ebtables   VMware
KVM          Yes   Yes        Yes      Yes        No
Xen          Yes   Yes        Yes      Yes        No
VMware       No    No         No       No         Yes
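The rule that bridges must carry the same name on every host lends itself to a quick consistency check. The sketch assumes the bridge names per host have already been collected, for example by an inventory script; the function name and data shape are hypothetical.

```python
def missing_bridges(host_bridges):
    """Return {host: sorted bridge names that other hosts have but it lacks}.

    host_bridges maps a host name to the set of bridge names found on it.
    """
    all_bridges = set().union(*host_bridges.values())
    report = {}
    for host, bridges in host_bridges.items():
        missing = all_bridges - set(bridges)
        if missing:
            report[host] = sorted(missing)
    return report
```

An empty report means every host exposes the same set of bridge names, so a VM can be deployed or migrated to any host without its network definition changing.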
Installation 1/7
Installation steps:
Planning and preparing the installation
Installing OS
Installing the OpenNebula software
Configuring the OpenNebula components
Installation 2/7
Planning & preparing: OpenNebula is a simple setup consisting
of front end(s) and hosts (cluster nodes).
Basic components:
Front end
Host
Datastores
Service Network
VM networks
Installation 3/7
Storage types: shared & non-shared
Non-shared storage:
Simple to configure
Initial start of a VM will be slower as image is copied to the host
Shared storage:
Any host has access to the image repository
Any operation on a VM goes quicker because there is direct access
to the images, no copying needed
In smaller environments or PoCs, implemented on front end
In bigger environments, implemented on NAS/SAN
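The start-up penalty of non-shared storage can be estimated with simple arithmetic: image size divided by effective network throughput. In the sketch below the `efficiency` factor (share of the nominal link speed actually achieved) is an assumed parameter, and sizes are decimal gigabytes.

```python
def copy_time_seconds(image_size_gb, link_gbps, efficiency=0.8):
    """Rough time to copy a disk image to a host over the network."""
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive, efficiency in (0, 1]")
    size_gigabits = image_size_gb * 8  # bytes to bits
    return size_gigabits / (link_gbps * efficiency)
```

A 10 GB image over an ideal 1 Gbps link takes 80 seconds; at the assumed 80 % efficiency it takes roughly 100 seconds, a delay that shared storage avoids on every VM start.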
Installation 4/7
OS installation:
Choose Linux distribution (e.g. Ubuntu)
Choose installation media: .iso or network
Use default installation steps, except possibly for partitioning
Partitioning:
If HW raid exists, it will appear as single disk; if SW raid should be
configured, can be done after creating partitions
Partitions for system, user and swap files
Default user creation (oneadmin)
The same account and group needed on both Front end & host
All the accounts need the same UID and GID (user & group IDs)
Installation 5/7
Front end:
Install OpenNebula software
Requirement:
Needs access to storage (direct or via network)
Needs access to each host
SSH to hosts using SSH keys (without passwords, auto-add to known hosts)
Ruby (≥ v1.8.7)
Hosts:
No OpenNebula software needed
Different hypervisors on different distros inside a cluster possible
Requirements:
Hypervisor
SSH server
Ruby (≥ v1.8.7)
Host should be registered in OpenNebula (onehost)
Installation 6/7
Configuring the OpenNebula components:
Hypervisor: KVM by default (and easiest), but other drivers can be
selected / modified
Host monitoring
Storage: shared filesystem used by default, can be changed
Networking
Users & Groups (admins, regular, public & service users; integration
with LDAP infrastructure possible)
Sunstone (Web GUI with same functionality as CLI)
Accounting & Statistics (info on usage, accounting, graphs)
Zones (oZone server, managing Zones and VDCs)
Hybrid clouds (for peak resource usages)
Public clouds (using public interfaces, EC2 query and OCCI)
Installation 7/7
Management tasks after the installation:
Check if daemons are running
Check passwordless inter-host connectivity
Check / enable KSM
Managing hosts:
Registering (adding a host to OpenNebula)
Deleting (deleting a host, i.e. dismissing a host)
Enabling/disabling (no monitoring nor launch of new instances)
Hybrid cloud
AWS EC2 or compatible
Public cloud
Giving access to the „outside world“ using:
EC2 Query interface using Amazon EC2 Query API
Open Cloud Computing Interface (OCCI)
Centralized management using oZone
Zones: several physical hosts with same or different
hypervisors, controlled by one front end
VDCs (Virtual Data Centers): several hosts from the same
zone logically grouped
Redundancy
Redundant frontends, but no automatic failover
Use separate MySQL backend (though oZones currently
supports only SQLite)
Sunstone can be deployed on a separate machine (not
necessarily on front end)
OpenStack 1/4
IaaS platform for building cloud solutions using any of the
deployment models
Open Source, released under Apache license
Co-founded by NASA and Rackspace in 2010 in a joint open
source project, with NASA delivering cloud compute code
(„Nebula“), and Rackspace delivering cloud object storage
(„Cloud Files“)
First release to public in October 2010 – „Austin“
Backed by e.g. HP, Cisco, IBM, RedHat, Dell, Citrix,
Canonical, ...
All figures taken from http://www.openstack.org
OpenStack 2/4
Has considerable take-off in use ...
...though NASA reported moving a part of its infrastructure to
Amazon, saving $1 million/yr
(http://www.wired.com/wiredenterprise/2012/06/nasa-web-services-openstack/)
Some contributors left NASA going to the private sector
(Nebula, Piston Cloud Computing, Rackspace, ...)
Active community:
http://forums.openstack.org
http://wiki.openstack.org
http://docs.openstack.org
OpenStack 3/4
Supported distros: Ubuntu, Debian, RHEL, CentOS, Fedora,
SUSE, Piston Enterprise OpenStack, SwiftStack,
Cloudscaling & StackOps
Releases: Austin (2010), Bexar, Cactus, Diablo (2011), Essex
(2012, current stable), Folsom (under development)
Hypervisors: KVM, Xen, ESXi
OS-level virtualization also supported, e.g. LXC
Networking modes: Flat (bridging), VLAN (vlan-switch)
Trying it (one or multiple servers):
on free „sandbox“ hosted environment (trystack.org), or
locally using a documented script (devstack.org)
OpenStack 4/4
Written in Python
Consists of: Compute, Networking, Storage, Shared Services
Managed through a dashboard
Runs on standard hardware; ARM is supported
Compute
Provides on-demand computing resources by provisioning VMs
Access through APIs and web GUIs
Scales horizontally (scale-out)
Some features:
Manage CPU, memory, disk, network
Distributed and asynchronous architecture
Live VM management
Floating IP
Security groups & RBAC (Role Based Access Control)
API with rate limiting and authentication
Resource utilization: allocating, tracking, limiting
VM image management & caching
Storage
Supports both Object Storage and Block Storage:
Object Storage – distributed, API-accessible, scale-out storage used
by applications, for backup, archiving and data retention (static data)
Block Storage - enables block storage to be used by VMs; supports
integration with enterprise storage solutions (e.g. NetApp, Nexenta, ...)
Some features:
Vertical and horizontal scalability
Huge & flat namespace
Built-in replication
RAID not required
Snapshot & Backup API
Networking
Managing networks and IP addresses
Pluggable, scalable and API-driven system
Flat networks & VLANs
Static IPs, DHCP & Floating IP
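The floating IP idea, a pool of addresses that can be associated with and released from instances at runtime, can be illustrated with Python's standard `ipaddress` module. This is a conceptual sketch only; the class and method names are hypothetical and are not the OpenStack API.

```python
import ipaddress


class FloatingIpPool:
    """Toy pool of addresses handed out to instances on request."""

    def __init__(self, cidr):
        # hosts() skips the network and broadcast addresses.
        self._free = list(ipaddress.ip_network(cidr).hosts())
        self._assigned = {}  # address -> instance ID

    def associate(self, instance_id):
        if not self._free:
            raise RuntimeError("floating IP pool exhausted")
        address = self._free.pop(0)
        self._assigned[address] = instance_id
        return str(address)

    def disassociate(self, ip):
        address = ipaddress.ip_address(ip)
        del self._assigned[address]  # KeyError if the address is not in use
        self._free.append(address)
```

Unlike a static IP, a floating address outlives the instance it is attached to: after `disassociate` it returns to the pool and can be given to another instance.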
Shared services
Dashboard:
GUI for admins and users, brandable
Pluggable / 3rd party: billing, monitoring, additional management
Identity Service:
Central directory of users mapped to services they can access
Queryable list of all of the services deployed
Image Service:
provides discovery, registration and delivery services for disk and
server images
Stores images, snapshots, templates in OpenStack Object Storage
Supports following image formats: raw, AMI, VHD, VDI, qcow2,
VMDK, OVF
Service families
Nova - Compute Service
Swift – Object Storage Service
Glance – Image Registry & Delivery Service
Horizon – User Interface Service, „Dashboard“
Keystone – Identity Service
Quantum (in development) – Virtual Network Service
Nova
Main part – cloud computing fabric controller
One of 1st projects, descends from NASA’s Nebula
provides API to dynamically request and configure VMs
Two major components: messaging queue (RabbitMQ) and database,
enabling asynchronous orchestration of complex tasks through message
passing and information sharing
Components: Database, Web Dashboard, API, Auth Mgr, ObjectStore,
Scheduler, Volume Worker, Network Worker, Compute Worker
all of its major components can be run on multiple servers (designed as
distributed application)
Supported virtualization: KVM, Xen, Citrix Xen, ESX/ESXi, Hyper-V,
QEMU, Linux User Mode & Containers
Uses a SQL-based central database (in future, for larger deployments,
aggregated multiple data stores are planned)
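The asynchronous, message-passing style described for Nova can be illustrated with an in-process queue. In this sketch Python's `queue.Queue` stands in for RabbitMQ and a thread stands in for a compute worker; the message fields and function names are invented for illustration and are not Nova's.

```python
import queue
import threading


def compute_worker(requests, results):
    """Consume launch requests until a None sentinel arrives."""
    while True:
        message = requests.get()
        if message is None:
            break
        # A real worker would talk to the hypervisor here.
        results.put({"instance": message["name"], "state": "running"})


def launch_instances(names):
    """Queue one request per instance and collect the worker's replies."""
    requests, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=compute_worker, args=(requests, results))
    worker.start()
    for name in names:
        # put() returns immediately; the caller does not wait for the launch.
        requests.put({"action": "run_instance", "name": name})
    requests.put(None)  # tell the worker to stop
    worker.join()
    return [results.get() for _ in names]
```

Because the API side only enqueues messages, workers can be added on other servers without changing the caller, which is the property that lets the major components run distributed.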
Swift
Object/blob storage
One of 1st projects, descends from Rackspace’s Cloud Files
Components: Proxy Server, Ring, Object Server, Container
Server, Account Server, Replication, Updaters, Auditors
Can be clustered using Proxy nodes and Storage nodes
Files cannot be accessed through filesystem, but via API client
Scalability and redundancy: writing multiple copies of each object
to multiple storage servers within separate zones
zone: isolated storage server groups
Isolation levels: different servers, racks, sections of a datacenter,
datacenters
Best practice: write 3 replicas across 5 zones (distributed
writes/reads)
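The "3 replicas across 5 zones" practice can be sketched as a placement function that never puts two copies of an object in the same zone. This is a deliberately simplified stand-in: Swift's real ring uses partitions and device weights, and the hashing scheme and names below are assumptions for illustration.

```python
import hashlib


def replica_zones(object_name, zones, replicas=3):
    """Choose distinct zones for an object's replicas, deterministically."""
    if replicas > len(zones):
        raise ValueError("need at least as many zones as replicas")
    digest = hashlib.sha256(object_name.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(zones)
    # Walk the zone list from the hashed start, wrapping around at the end.
    return [zones[(start + i) % len(zones)] for i in range(replicas)]
```

With five zones and three replicas, losing any two zones still leaves at least one copy of every object, and the extra zones spread reads and writes.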
Glance
Discovers, registers and retrieves VM images
Uses RESTful API for querying & retrieval
Supports various back-end storage solutions: VM image can
be stored on simple file systems and object storage systems
(Swift)
Components: Glance API server, Registry Server, Store
Adapter
Supported disk formats: raw, VHD, VMDK, qcow2, VDI,
ISO, AMI, ARI, AKI
Supported container formats: OVF, AMI, ARI, AKI
Horizon
Keystone
Cloud identity service
provides Identity, Token, Catalog and Policy services
implements OpenStack Identity API
Quantum
Virtual network service („Networking as a Service“)
Still under development, to be released with the release of
„Folsom“ on 27 September 2012
Provides API to dynamically request and configure virtual
networks
Quantum API supports extensions providing advanced
networking (e.g. Monitoring, QoS, ACLs, ...)
Plugins for Open vSwitch, Cisco, Linux Bridge, Nicira NVP,
Ryu OpenFlow, NEC OpenFlow, MidoNet
Advanced setup
Redundancy
Coming with „Folsom“
„Corosync“ – open source cluster
Have multiple Swift and Nova servers
Cloud controller – single point of failure (nova-api, nova-
network):
Run multiple instances on multiple hosts (state is saved in DB)
Use Nova's multi-host configuration (--multi_host)
Recommendations
Although the technology is still at an early stage (easier to install,
but still hard for a regular admin to manage and maintain, with a
steep learning curve for admins & users), implementation is
suggested at an affordable, smaller scale
Implement on current/modern hardware
Keep the knowledge updated
Keep software platform and hardware updated if possible
Monitor & analyze costs, available features and complexity,
compared to budget, needs and internal resources available
Assess the implementation possibilities based on the analyses
Sources of Further Material 1/5
http://www.openstack.org/
http://opennebula.org/
http://www.ubuntu.com/
http://www.eucalyptus.com
http://www.napp-it.org/index_en.html
http://www.cloudcomputeinfo.com/virtualization
http://www.cloudcomputeinfo.com/private-clouds
http://www.linux-kvm.org
http://wiki.qemu.org
http://www.xen.org
https://www.virtualbox.org
http://www.vmware.com
http://www.microsoft.com/en-us/server-cloud/windows-server/hyper-v.aspx
http://smartos.org
Sources of Further Material 2/5
Armbrust, M., et al., 2010, A View of Cloud Computing,
Communications of the ACM, 53(4), pp. 50-58.
Zhang, Q., Cheng, L., Boutaba, R., Cloud computing: state-of-the-art
and research challenges, Journal of Internet Services and
Applications, 2010, 1:7-18.
The Future of Cloud Computing: Opportunities for European
Cloud Computing Beyond 2010.
Chapman et al., 2010, Software architecture definition for
on-demand cloud provisioning, in Proceedings of the 19th ACM
International Symposium on High Performance Distributed Computing
(HPDC '10).
Ali Babar, M.; Chauhan M.A.; , A tale of migration to cloud
computing for sharing experiences and observations, SECLOUD
'11, ACM.
Sources of Further Material 3/5
http://nicira.com/
http://www.xsigo.com/
http://www.reservoir-fp7.eu/
http://www.c12g.com/
http://dsa-research.org/
http://portal.ucm.es/en/web/en-ucm
http://occi-wg.org/
http://opennebula.org/documentation:rel3.6:ganglia
http://www.nasa.gov/
http://www.rackspace.com/
http://www.nebula.com/
http://www.pistoncloud.com/
Sources of Further Material 4/5
http://www.pistoncloud.com/openstack-cloud-software
http://swiftstack.com/
http://www.cloudscaling.com/
http://www.stackops.com/
http://www.rabbitmq.com/
http://www.corosync.org/
http://www.clusterlabs.org/
“OpenNebula 3 Cloud Computing”, Giovanni Toraldo, Packt Publishing,
May 2012
Eucalyptus Guides, Eucalyptus Systems, Jun 2012
“Deploying OpenStack”, Ken Pepple, O'Reilly, July 2011
OpenStack Manuals, docs.openstack.org, May 2012
Sources of Further Material 5/5
“Ubuntu Enterprise Cloud Architecture”, Technical White Paper, Simon
Wardley, Etienne Goyer & Nick Barcet – August 2009
“Building a Private Cloud with Ubuntu Server 10.04 Enterprise Cloud
(Eucalyptus)”, OSCON 2010
“Eucalyptus Beginner's Guide”, UEC Edition, 23 Dec 2010, Johnson D,
KiranMurari, Murthy Raju, Suseendran RB,Yogesh Girikumar
“Dell releases Ubuntu-powered cloud servers”, Joab Jackson, IDG News
Service, NetworkWorld
Interview at Danish Center for Scientific Computing (DCSC), 30th
March 2011
White Paper “Ubuntu - An Introduction to Cloud Computing”
Deployment Guide - Ubuntu Enterprise Cloud on Dell Servers SE
“Practical Cloud Evaluation from a Nordic eScience User Perspective”,
Edlund, Koopmans, November 2011.
Questions
?
Thank you!
Thank you for your attention!
Still having questions?
maba@itu.dk
zopa@itu.dk