
RNS Institute of Technology, Bangalore – 98

BIS613D – Cloud Computing

For all VI semester Students

Module – 2

Virtual Machines and Virtualization of Clusters and Data Centers
RNSIT Vision and Mission
Vision
Building RNSIT into a World Class Institution
Mission
To impart high quality education in Engineering, Technology and Management
with a difference, enabling students to excel in their career by
 Attracting quality Students and preparing them with a strong foundation in fundamentals
so as to achieve distinctions in various walks of life leading to outstanding contributions
 Imparting value based, need based, choice based and skill based professional education
to the aspiring youth and carving them into disciplined, World class Professionals with
social responsibility
 Promoting excellence in Teaching, Research and Consultancy that galvanizes academic
consciousness among Faculty and Students
 Exposing Students to emerging frontiers of knowledge in various domains and making
them suitable for Industry, Entrepreneurship, Higher studies, and Research &
Development
 Providing freedom of action and choice for all the Stakeholders with better visibility

Department of ISE
Vision
Building Information Technology Professionals by Imparting Quality Education and Inculcating
Key Competencies

Mission
 Provide strong fundamentals through learner centric approach
 Instil technical, interpersonal, interdisciplinary skills and logical thinking for holistic
development
 Train to excel in higher education, research, and innovation with global perspective
 Develop leadership and entrepreneurship qualities with societal responsibilities
Syllabus
Virtual Machines and Virtualization of Clusters and Data Centers:
Implementation Levels of Virtualization, Virtualization Structure/Tools and Mechanisms,
Virtualization of CPU/Memory and I/O devices, Virtual Clusters and Resource Management,
Virtualization for Data Center Automation.
Textbook 1: Chapter 3: 3.1 to 3.5
Module – 02

Virtual Machines and Virtualization of Clusters and Data Centers

Implementation Levels of Virtualization

Virtualization:

Virtualization allows multiple virtual machines (VMs) to run on the same physical hardware,
improving resource sharing, performance, and flexibility.

It enhances system efficiency by separating hardware from software.

It has gained importance in distributed and cloud computing.

Levels of Virtualization Implementation

Virtualization can be implemented at various operational layers of the system, including:


Instruction Set Architecture (ISA) Level:

o Virtualizes the instruction set of the host machine to emulate a different processor
architecture (e.g., running MIPS code on an x86 machine).
o Uses code interpretation (simple but slow) or dynamic binary translation (faster); a
minimal interpreter sketch follows this list of levels.

Hardware Abstraction Level:

o Virtualizes hardware resources like CPU, memory, and I/O devices to allow multiple users to
utilize the hardware concurrently.
o Historical example: IBM VM/370, modern example: Xen hypervisor for x86 machines.

Operating System (OS) Level:


o Creates isolated containers on a single server to allocate hardware resources among users.

o Commonly used in virtual hosting environments and server consolidation.


Library Support Level:

o Virtualizes the communication link between applications and the OS through API hooks.
o Examples include WINE (Windows applications on UNIX) and vCUDA (GPU acceleration
within VMs).

User-Application Level:

o Virtualizes individual applications or processes, often called process-level virtualization.


o Examples include the Java Virtual Machine (JVM) and Microsoft .NET CLR.
o Other approaches include application isolation, sandboxing, and application streaming, where
the application is isolated from the host OS for easier distribution and removal.
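To make the ISA-level idea concrete, here is a minimal sketch of an instruction interpreter: every guest instruction is decoded and emulated in host software, which is why plain interpretation is slow and why dynamic binary translation is used in practice. The three-instruction toy ISA and register names below are invented for illustration.

```python
# Toy interpreter illustrating ISA-level virtualization: each guest
# instruction is decoded and emulated in software on the host.
# The 3-instruction ISA (LOAD, ADD, HALT) is invented for illustration.

def interpret(program):
    regs = {"r0": 0, "r1": 0}   # emulated guest registers
    pc = 0                      # emulated program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "LOAD":        # LOAD reg, immediate
            regs[args[0]] = args[1]
        elif op == "ADD":       # ADD dst, src
            regs[args[0]] += regs[args[1]]
        elif op == "HALT":
            break
        pc += 1                 # every guest instruction costs host work
    return regs

print(interpret([("LOAD", "r0", 2), ("LOAD", "r1", 3),
                 ("ADD", "r0", "r1"), ("HALT",)]))  # {'r0': 5, 'r1': 3}
```

Because each guest instruction costs many host instructions here, real emulators translate whole blocks of guest code into host code and cache the result.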

Purpose and Applications:

• Virtualization improves resource utilization, enables running different OS and applications


on the same machine, and simplifies the management of distributed systems.
• It plays a key role in enhancing distributed computing, cloud environments, and legacy
software support.

Relative Merits of Virtualization Approaches

Comparison Factors: Higher performance, application flexibility, implementation complexity,
and application isolation.

Merit Representation: Number of X’s in a table (5X = best, 1X = worst).

Performance Considerations:

 Hardware- and OS-level virtualization → Highest performance, but expensive to implement.

 User-level virtualization → Highest implementation complexity, but the best application isolation.

 ISA-level virtualization → Best application flexibility.

VMM Design Requirements and Providers


Definition & Role of VMM:

• VMM (Virtual Machine Monitor) is a layer between hardware and the operating system.
• Manages hardware resources and captures program interactions with hardware.
• Enables multiple OS instances to run on a single set of hardware.

Requirements of a VMM:

1. Identical Execution Environment:


o Programs should run as if they are on a real machine.

2. Minimal Performance Overhead:


o Should not significantly slow down program execution.

3. Full Control Over System Resources:


o Programs should only access explicitly allocated resources.

Performance Considerations:

VMs share hardware, leading to resource contention.

Timing dependencies and resource availability may cause minor performance differences.

Traditional emulators/simulators offer flexibility but are too slow for real-world use.

Efficiency is ensured by executing most virtual processor instructions directly on hardware.

Resource Control by VMM:

Allocates hardware resources to programs.

Restricts unauthorized access to unallocated resources.

Can reclaim allocated resources under certain conditions.

Challenges in VMM Implementation:

Some processors (e.g., x86) lack full virtualization support.


Solution: Hardware-assisted virtualization (e.g., Intel VT-x, AMD-V) adds processor support so that the VMM requirements can be met.

Virtualization Support at the OS Level

Role of OS-Level Virtualization in Cloud Computing

Cloud computing relies on virtualization to shift hardware and management costs to third-party
providers.

Two major challenges:

1. Dynamic resource allocation – Scaling CPU resources based on demand.


2. Slow VM instantiation – Fresh VM boots take time and lack awareness of the
application state.

Why OS-Level Virtualization?

• Hardware-level virtualization is slow and inefficient due to redundant VM image storage


and performance overhead.
• OS-level virtualization creates multiple isolated Virtual Execution Environments (VEs) or
Containers within a single OS kernel.
• VEs function like real servers with their own processes, file system, user accounts, and
network configurations but share the same OS kernel.
• Also known as single-OS image virtualization.

Advantages of OS-Level Virtualization

1. Fast start-up/shutdown, low resource use, high scalability.


2. State synchronization between VMs and the host OS – Allows better application state
awareness.
3. Efficiency through resource sharing – VEs can access most host resources without modifying
them.
4. Overcomes slow VM initialization and application state unawareness in cloud computing.
Disadvantages of OS-Level Virtualization

1. Same OS requirement – All containers on a single host share one kernel, so they must belong
to the same OS family (e.g., Windows-based containers cannot run on a Linux host).
2. User preference issues – Some cloud users require different OS types, limiting flexibility.
3. Resource duplication problem – If each VM has a full copy of system resources, it leads to
high storage and performance costs.

Implementation Considerations

Creating virtual root directories:

1. Duplicate resources for each VM (higher cost).

2. Share most resources and create private copies on demand (the preferred approach; a copy-on-write sketch follows).
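A minimal sketch of the preferred "private copies on demand" approach, i.e., copy-on-write at the file level. The CowRoot class and directory layout are invented for illustration and are far simpler than what OpenVZ or any real container runtime does.

```python
import shutil
from pathlib import Path

class CowRoot:
    """Virtual root directory that shares a read-only host directory and
    creates a private copy of a file only when the VE writes to it."""
    def __init__(self, shared: Path, private: Path):
        self.shared, self.private = shared, private
        private.mkdir(parents=True, exist_ok=True)

    def open_read(self, name: str):
        # Reads are served from the private copy if one exists,
        # otherwise from the shared host resources.
        p = self.private / name
        return open(p if p.exists() else self.shared / name, "rb")

    def open_write(self, name: str):
        # First write triggers the private copy (copy-on-write);
        # the shared original is never modified.
        p = self.private / name
        if not p.exists() and (self.shared / name).exists():
            shutil.copy(self.shared / name, p)
        return open(p, "ab")
```

This is why most resources can be shared among VEs at low cost: only files that a VE actually modifies consume extra storage.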

OS-level virtualization is often a second choice due to its limitations compared to hardware-
assisted virtualization.
Virtualization on Linux and Windows Platforms

Linux vs. Windows Virtualization

• Linux-based OS-level virtualization is well-developed, while Windows-based OS-level


virtualization is still in research.
• The Linux kernel provides an abstraction layer for handling hardware, often requiring
patches for new hardware support.
• Most Linux platforms are not tied to a specific kernel, allowing multiple VMs to run on the
same host.
• Windows OS virtualization tools are still experimental, with FVM being an example for the
Windows NT platform.

Virtualization Support on the Linux Platform: OpenVZ

OpenVZ is an open-source container-based virtualization tool for Linux.

It modifies the Linux kernel to support:

1. Virtual environments (VPS) – Each VPS functions like an independent Linux server with its
own processes, users, and virtual devices.
2. Resource management – Controls CPU, disk space, and memory allocation.
3. Checkpointing and live migration – Saves VM state to a file for quick transfer and
restoration on another machine.
Resource Management in OpenVZ:

o Two-level disk allocation:
 First level: The OpenVZ admin allocates disk space to each VM.
 Second level: The VM admin assigns disk space to its users.
o Two-level CPU scheduling (see the sketch after this list):
 First level: OpenVZ decides which VM runs, based on VM priority.
 Second level: The standard Linux CPU scheduler manages tasks within the VM.
o 20+ resource control parameters ensure optimized VM usage.
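A toy sketch of the two-level CPU scheduling idea, under stated assumptions: the first level picks a VPS in proportion to an assigned CPU weight, and the second level round-robins among that VPS's tasks (standing in for the standard Linux scheduler). The VPS names, weights, and tasks are illustrative; real OpenVZ implements fair-share scheduling inside the kernel, not in Python.

```python
import itertools, random

# First level: pick a VPS according to its CPU weight (priority).
# Second level: round-robin among that VPS's runnable tasks, standing
# in for the standard Linux scheduler. All names/weights are illustrative.
vps_tasks = {
    "vps101": itertools.cycle(["db", "cron"]),
    "vps102": itertools.cycle(["web"]),
}
vps_weight = {"vps101": 3, "vps102": 1}  # vps101 gets ~3x the CPU time

def next_task():
    vps = random.choices(list(vps_weight), weights=vps_weight.values())[0]
    return vps, next(vps_tasks[vps])

for _ in range(5):
    print(next_task())
```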

Middleware Support for Virtualization (Library-Level Virtualization)

Library-level virtualization is also known as user-level Application Binary Interface (ABI) or API
emulation.

This type of virtualization creates execution environments for running alien programs on a
platform, rather than creating a VM to run an entire operating system.

API call interception and remapping are the key functions performed (a minimal sketch follows).
Examples of library-level virtualization systems include the Windows Application Binary
Interface (WABI), lxrun, WINE, Visual MainWin, and vCUDA, which are summarized in Table 3.4
of the textbook.
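A minimal Python sketch of the API-hook idea behind library-level virtualization: a call against an alien API is intercepted and remapped to an equivalent host implementation. The functions below are invented for illustration and are far simpler than WINE or vCUDA.

```python
import math

# Pretend the "guest" application was written against this alien API,
# which does not exist on the host platform.
def alien_sqrt(x):
    raise NotImplementedError("not available on this host")

# The virtualization library interposes on the call and remaps it
# to the host's equivalent implementation (API call remapping).
def hook(alien_fn, host_fn):
    def wrapper(*args, **kwargs):
        print(f"intercepted {alien_fn.__name__}, remapping to host")
        return host_fn(*args, **kwargs)
    return wrapper

alien_sqrt = hook(alien_sqrt, math.sqrt)
print(alien_sqrt(16.0))  # 4.0, served by the host library
```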
Virtualization Structure/Tools and Mechanisms

VM Architecture Classes

After virtualization, a virtualization layer is inserted between the hardware and OS, converting
real hardware into virtual hardware.
This allows multiple OSes (Linux, Windows, etc.) to run simultaneously on a single machine.
There are three main classes of VM architecture:

1. Hypervisor-based virtualization (VMM – Virtual Machine Monitor)


2. Para-virtualization
3. Host-based virtualization

Hypervisor and Xen Architecture

The hypervisor enables hardware-level virtualization by running directly on bare-metal
hardware (CPU, memory, disk, network interfaces).

It acts as an interface between physical hardware and guest OSes.

Types of Hypervisors:

Micro-kernel hypervisor (e.g., Microsoft Hyper-V):

o Includes only core functions (memory management, processor scheduling).

o Device drivers and other components are external.

o Smaller hypervisor size.

Monolithic hypervisor (e.g., VMware ESX):

o Implements all functions, including device drivers.

o Larger hypervisor size, but with better performance and control.

The Xen Hypervisor Architecture

• Xen is an open-source micro-kernel hypervisor developed at Cambridge University.


• Separates policy (handled by Domain 0) from mechanism (handled by Xen).
• No native device drivers → Xen provides a mechanism for guest OSes to access physical devices directly.
• Provides a virtual environment located between the hardware and the OS.
Components of Xen

Xen Domain Structure

Domain 0 (Dom0):

o Privileged guest OS with direct hardware access.

o Manages guest OS instances (Domain U).

o Controls resource allocation and device management.

Domain U (DomU):
o Unprivileged guest OS instances running under Xen.

o Cannot access hardware directly.

Security Considerations

• Domain 0 is the most critical component. If compromised, the attacker gains full control
over all VMs.
• Security policies are required to protect Domain 0.
VM State Management and Rollback

Unlike traditional machines (which follow a linear execution path), VM execution follows a tree
structure where multiple instances can be created at different states.

Benefits of VM state rollback:

o Error recovery (rollback to a previous working state).


o Efficient system distribution (duplicate VMs for dynamic content).

o Live migration (moving running VMs between hosts).

Challenges:

o Security risks in handling VM snapshots and rollbacks.

o Need for strict access control and auditing.

Binary Translation with Full Virtualization

Depending on implementation technologies, hardware virtualization can be classified into two
categories: full virtualization and host-based virtualization.

Full virtualization does not need to modify the guest OS. It relies on binary translation to trap
and to virtualize the execution of certain sensitive, nonvirtualizable instructions.

The guest OSes and their applications consist of noncritical and critical instructions. In a host-
based system, both a host OS and a guest OS are used.

A virtualization software layer is built between the host OS and guest OS. These two classes of
VM architecture are introduced next.

Full Virtualization

With full virtualization, noncritical instructions run on the hardware directly while critical
instructions are discovered and replaced with traps into the VMM to be emulated by software.
Both the hypervisor and VMM approaches are considered full virtualization. Why are only
critical instructions trapped into the VMM? This is because binary translation can incur a large
performance overhead.

Noncritical instructions do not control hardware or threaten the security of the system, but
critical instructions do.

Therefore, running noncritical instructions on hardware not only can promote efficiency, but
also can ensure system security.

Binary Translation Using VMM (Virtual Machine Monitor)

• Implemented by VMware and other vendors.


• VMM is placed at Ring 0 (privileged mode).
• Guest OS runs at Ring 1, unaware that it is virtualized.
• VMM scans and translates privileged instructions before execution.
• Code caching helps optimize performance but increases memory usage; a toy scan-and-translate sketch follows.
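A toy sketch of the scan-and-translate idea with a code cache, under stated assumptions: guest "instructions" are strings, privileged ones are rewritten into traps to the VMM, and each translated block is cached so it is only translated once. This illustrates the mechanism, not VMware's actual implementation.

```python
# Toy binary translator: privileged guest instructions are replaced
# with traps into the VMM; translated blocks are kept in a code cache.
PRIVILEGED = {"cli", "hlt", "mov_to_cr3"}   # illustrative instruction names
code_cache = {}                              # block id -> translated code

def translate(block_id, instructions):
    if block_id in code_cache:               # cache hit: reuse translation
        return code_cache[block_id]
    translated = [f"trap_to_vmm({i})" if i in PRIVILEGED else i
                  for i in instructions]
    code_cache[block_id] = translated
    return translated

print(translate(0, ["add", "cli", "sub"]))
# ['add', 'trap_to_vmm(cli)', 'sub'] — noncritical instructions run natively
```

The cache is the trade-off the bullet above describes: repeated blocks skip retranslation, at the cost of extra memory.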

Host-Based Virtualization

Runs on a host OS rather than directly on hardware.

The virtualization layer sits between the host OS and guest OS.

• Guest OSes and applications can run inside VMs, while other applications can run
directly on the host OS.
Advantages:
o Easier deployment (no need to modify the host OS).
o Simplified design (relies on the host OS for device drivers).
o Works on various hardware configurations.
Disadvantages:
o Lower performance due to multiple layers of hardware access.
o Requires binary translation if guest OS and host hardware have different ISAs.
o High overhead, making it less efficient in practice.
Para-Virtualization with Compiler Support
Overview of Para-Virtualization
• Requires modification of the guest OS kernel to support virtualization.
• Provides special APIs (hypercalls) to replace non-virtualizable OS instructions.

• Reduces virtualization overhead, improving performance compared to full virtualization.


• Unlike full virtualization, which relies on binary translation, para-virtualization requires OS
kernel modifications.

Para-Virtualization Architecture
A virtualization layer is inserted between hardware and OS.

In the x86 architecture, the OS typically runs at Ring 0 for privileged operations, while
applications run at Ring 3.

Para-virtualization modifies the guest OS to:

o Run at Ring 1 instead of Ring 0.

o Replace non-virtualizable instructions with hypercalls to the hypervisor/VMM.

Advantages of Para-Virtualization

• Improves performance by eliminating the need for complex binary translation.


• More efficient than full virtualization, especially for workloads with frequent privileged
instructions.
• Used by popular hypervisors like Xen, KVM, and VMware ESX.

Challenges of Para-Virtualization

• Requires modifying the OS kernel, making it less compatible with unmodified OSes.
• Maintaining para-virtualized OS versions is costly, as OS updates require modifications.
• Performance benefits depend on workload types—some workloads benefit greatly,
while others do not.

KVM (Kernel-Based VM)

This is a Linux para-virtualization system—a part of the Linux version 2.6.20 kernel. Memory
management and scheduling activities are carried out by the existing Linux kernel.

The KVM does the rest, which makes it simpler than the hypervisor that controls the entire
machine.

KVM is a hardware-assisted para-virtualization tool, which improves performance and supports
unmodified guest OSes such as Windows, Linux, Solaris, and other UNIX variants.

Para-Virtualization with Compiler Support

Unlike full virtualization, which traps privileged instructions at runtime, para-virtualization
modifies instructions at compile time.

The OS kernel replaces privileged instructions with hypercalls before execution.

Xen follows this architecture, where:

 The guest OS runs at Ring 1 instead of Ring 0.


 Privileged instructions are replaced with hypercalls to the hypervisor.
 Hypercalls function similarly to system calls in UNIX (using service routines).
Virtualization of CPU/Memory and I/O devices

Introduction to Virtualization Support in Hardware

Modern processors (e.g., x86) use hardware-assisted virtualization to support virtual machines
efficiently.
The Virtual Machine Monitor (VMM) and guest OS operate in separate modes, ensuring security
and isolation.

Sensitive instructions of the guest OS are trapped in the VMM, preventing unauthorized
hardware access.

Hardware Support for Virtualization

Processors have two main execution modes:

o User Mode: Runs applications with limited access to hardware.

o Supervisor Mode (Privileged Mode): Runs the OS kernel and handles critical system
operations.

Virtualization complicates execution because multiple OSes run on a single machine.


Examples of hardware-assisted virtualization tools:

o VMware Workstation (host-based virtualization).


o Xen (hypervisor that modifies Linux as the lowest privileged layer).
o KVM (uses Intel VT-x or AMD-V for efficient virtualization).
o VirtIO (provides virtualized I/O devices like Ethernet, disk, memory ballooning, and VGA).

CPU Virtualization

VMs execute most instructions in native mode for efficiency, except critical instructions.

Critical instructions are classified into three categories:

1. Privileged Instructions: Only execute in privileged mode (Ring 0).


2. Control-Sensitive Instructions: Modify system settings or resources.
3. Behavior-Sensitive Instructions: Depend on system configuration (e.g., memory
access).

CPU virtualization requires trapping privileged instructions so that the VMM can handle them
securely.

RISC architectures are naturally virtualizable, as all sensitive instructions are privileged.

x86 architecture is not naturally virtualizable because some sensitive instructions (e.g., SGDT,
SMSW) are not privileged and cannot be trapped by the VMM.

Example: System Calls in UNIX and Xen

• In UNIX systems, system calls trigger the 0x80 interrupt, passing control to the kernel.
• In Xen (a para-virtualization system), system calls trigger both 0x80 (guest OS) and 0x82
(hypervisor).
• The hypervisor processes privileged operations before returning control to the guest OS; a toy dispatch sketch follows.
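A toy model of this trap routing: an interrupt vector number selects a service routine, the way 0x80 reaches the guest kernel's system-call handler and 0x82 reaches the hypervisor's hypercall handler in Xen. The handler functions are invented stand-ins, not real Xen code.

```python
# Toy trap dispatch: interrupt vectors map to service routines.
def guest_syscall(n):   return f"guest kernel handles syscall {n}"
def hypercall(n):       return f"hypervisor handles hypercall {n}"

VECTOR_TABLE = {0x80: guest_syscall, 0x82: hypercall}

def trap(vector, arg):
    # Hardware consults the vector table and jumps to the handler.
    return VECTOR_TABLE[vector](arg)

print(trap(0x80, 4))   # e.g., an ordinary system call
print(trap(0x82, 2))   # e.g., a privileged operation via hypercall
```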

Hardware-Assisted CPU Virtualization

Intel and AMD introduced an additional privilege mode (Ring -1) for virtualization.

Now, the hypervisor runs at Ring -1, while the guest OS runs at Ring 0.

This eliminates the need for complex binary translation (used in full virtualization).

Benefits:

o Simplifies virtualization implementation.

o Allows OSes to run in VMs without modification.

o Traps all privileged instructions in the hypervisor automatically.


Memory Virtualization

Virtual Memory Mapping in Traditional Systems

o The OS maps virtual memory to machine memory using page tables (one-stage mapping).
o Modern x86 CPUs use an MMU (Memory Management Unit) and TLB (Translation Lookaside
Buffer) to optimize memory performance.
Memory Virtualization in Virtualized Environments

o Physical RAM is shared and dynamically allocated among Virtual Machines (VMs).
o A two-stage mapping is required:
 Guest OS: Maps virtual memory to guest physical memory.
 VMM (Hypervisor): Maps guest physical memory to actual machine memory.

Shadow Page Tables & Nested Paging

o Each guest OS page table has a corresponding shadow page table maintained by the VMM.

o This additional layer leads to performance overhead and high memory costs.
o Nested Paging (Hardware-Assisted Virtualization):
 Reduces the overhead of shadow page tables.
 Introduced by AMD's Barcelona processor (2007).
A two-stage translation sketch follows.
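A minimal sketch of the two-stage mapping and of how a shadow page table collapses it into a single lookup. The page and frame numbers are invented, and real MMUs walk multi-level page tables in hardware rather than Python dictionaries.

```python
# Stage 1 (guest OS): guest-virtual page -> guest-physical page.
# Stage 2 (VMM):      guest-physical page -> machine frame.
guest_pt = {0: 5, 1: 7}     # illustrative guest page table
vmm_pt   = {5: 42, 7: 13}   # illustrative VMM mapping

def translate(vpage):
    return vmm_pt[guest_pt[vpage]]          # two lookups per access

# The VMM maintains a shadow page table: a direct virtual -> machine
# mapping, rebuilt whenever the guest modifies its own page table.
shadow_pt = {v: vmm_pt[p] for v, p in guest_pt.items()}

assert translate(0) == shadow_pt[0] == 42   # same answer, one lookup
print(shadow_pt)                            # {0: 42, 1: 13}
```

Keeping the shadow table consistent on every guest page-table update is exactly the overhead that nested paging removes in hardware.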

Optimizing Virtual Memory Performance

o VMware uses shadow page tables for address translation.


o TLB hardware enables direct mapping from virtual memory to machine memory, reducing
translation overhead.
I/O Virtualization & Multi-Core Virtualization
I/O Virtualization

I/O virtualization manages routing of I/O requests between virtual devices and shared physical
hardware. There are three main approaches:

Full Device Emulation

o Emulates real-world devices in software within the VMM (hypervisor).

o The guest OS interacts with virtual devices, and the VMM handles I/O operations.

o Drawback: High overhead and lower performance compared to real hardware.

Para-Virtualization (Split Driver Model – Used in Xen)

o Uses frontend and backend drivers to handle I/O:


 Frontend driver: Manages I/O requests in the guest OS.

 Backend driver: Runs in the privileged domain (Domain 0) and manages real
I/O devices.
o Pros: Better performance than full emulation.
o Cons: Higher CPU overhead.

Direct I/O Virtualization

o Allows VMs to directly access physical devices.


o Pros: Close-to-native performance, lower CPU cost.
o Cons: Limited support for commodity hardware, potential system crashes during workload
migration.

Hardware-Assisted I/O Virtualization

Intel VT-d helps remap I/O DMA transfers and device interrupts, allowing direct device access
for VMs.

Self-Virtualized I/O (SV-IO) uses multi-core processors to virtualize I/O devices, providing an
efficient API for virtualized systems.
Multi-Core Virtualization

Virtualizing multi-core processors is more complex than uni-core processors due to:

Parallelization Challenges:

o Applications must be explicitly parallelized to utilize all cores efficiently.


o New programming models, languages, and libraries are needed.

Task Scheduling Complexity:

o Scheduling algorithms and resource management policies must optimize performance while
handling core assignments.

Dynamic Heterogeneity

• New architectures mix fat CPU cores and thin GPU cores on the same chip.
• Hardware reliability issues and increased complexity in transistor management make
resource allocation more difficult.

Physical vs. Virtual Processor Cores

Virtual CPU (VCPU) Migration:

 Wells et al. proposed a method where VCPUs can move between cores dynamically.
 Reduces inefficiencies in managing processor cores by software.
 Located below the ISA, making it transparent to OS and hypervisors.

Virtual Hierarchy in Many-Core Processors

Virtual Hierarchy

• Many-core chip multiprocessors (CMPs) enable space-sharing, where different jobs are
assigned to separate groups of cores for long intervals.
• Virtual hierarchy is a dynamic cache hierarchy that adapts to workload demands, unlike
static physical cache hierarchies.
• Proposed by Marty and Hill, this method optimizes performance isolation and cache
coherence.
How Virtual Hierarchy Works

Many-core CMPs typically use physical cache hierarchies (L1, L2) with static allocation.

A virtual hierarchy dynamically adapts cache levels to workload needs, improving
access speed and reducing interference. Benefits:

1. Locates data blocks close to cores for faster access.


2. Establishes shared-cache domains to minimize data transfer delays.
3. Reduces performance interference between different workloads.

Space-Sharing Workload Assignment

Workloads are grouped into virtual clusters of cores, each assigned to different virtual machines
(VMs):

o VM0 & VM3 → Database workload

o VM1 & VM2 → Web server workload

o VM4–VM7 → Middleware workload

o Each VM operates in isolation, minimizing cache misses and ensuring efficient resource
allocation.
Two-Level Virtual Coherence & Caching Hierarchy

First level:

o Each VM cluster operates in isolation

o Reduces cache miss access time.

o Prevents resource contention between workloads.

Second level:
o Maintains a globally shared memory for all VMs.

o Enables dynamic resource repartitioning without cache flushes.

o Supports content-based page sharing for efficient memory management.

Use Cases & Advantages

• Optimizes multiprogramming & server consolidation workloads.


• Improves system adaptability without major OS or hypervisor changes.
• Ensures performance scalability in many-core processors.

Virtual Clusters and Resource Management

Definition and Overview

• Physical Clusters: A collection of physical servers interconnected via a physical network


(e.g., LAN).
• Virtual Clusters: A group of VMs (Virtual Machines) running on distributed physical servers,
interconnected through a virtual network.

Design Issues in Virtual Clusters

Live Migration of VMs

o Moving VMs between physical machines without downtime.

o Ensures load balancing, fault tolerance, and resource optimization.

Memory and File Migrations

o Efficient movement of VM memory and file data across different servers.

o Avoids performance bottlenecks and ensures seamless transitions.

Dynamic Deployment of Virtual Clusters

o Enables scalable and on-demand provisioning of VMs.


o Adjusts resources dynamically to avoid overloading or underutilization.

Virtual Cluster Configuration and Management

Traditional VM Setup:

o Requires manual configuration by an administrator.


o Poor configurations may lead to performance issues (e.g., overloading, underutilization).

Cloud-based Virtualization (Example: Amazon EC2)

o Elastic Computing: Allows users to dynamically create, manage, and scale VMs.

o User Account Management: Customers can control VM resources over time.

Virtualization Platforms & Bridging Mode

o Platforms like XenServer and VMware ESX Server support bridging mode.

o In bridging mode, all VMs appear as individual network hosts.

o VMs can freely communicate over the virtual network interface and self-configure.

Physical vs. Virtual Clusters

Physical Clusters vs. Virtual Clusters

Physical Clusters: Comprise multiple physical servers interconnected via physical networks.

Virtual Clusters: Comprise VMs distributed across multiple physical servers and connected
through a virtual network.
Properties of Virtual Clusters

Flexible Node Configuration

o Virtual clusters can include both physical and virtual machines.

o A single physical machine can host multiple VMs with different Operating Systems (OSes).

Guest vs. Host OS

o Each VM runs a guest OS, which may differ from the host OS of the physical machine.

Resource Consolidation & Utilization

o VMs help in consolidating multiple applications on the same physical server.

o Enhances server utilization and improves application flexibility.

Distributed Parallelism & Fault Tolerance

o VMs can be replicated across multiple servers for better fault tolerance and disaster
recovery.
o If a physical node fails, only the VMs running on that node are affected.
o A VM failure does not impact the host system.

Scalability & Dynamic Allocation

o The number of nodes in a virtual cluster can increase or decrease dynamically, similar to
P2P networks.

Virtual Cluster Management & Storage

Efficient VM Deployment & Monitoring: Requires techniques like resource scheduling, load
balancing, server consolidation, and fault tolerance.

VM Image Storage:

o A large number of VM images must be stored efficiently.


o Template VMs can be used to pre-install common software, reducing redundancy.

o Users can customize OS instances by adding specific libraries and applications.

Dynamic Cluster Boundaries

Virtual clusters are flexible:

o VMs can be moved, added, or removed dynamically.

o They can span multiple physical clusters and adapt to changing workloads.

Fast Deployment & Scheduling

• Fast deployment involves quickly setting up OS, libraries, and applications on physical
nodes.
• VM runtime environments should switch efficiently between different users to optimize
resources.
• Green computing aims to minimize energy consumption across the cluster, not just on
single nodes.
• Live VM migration shifts workloads between nodes but can introduce overhead affecting
performance.
• Load balancing improves resource utilization and system response times.
High-Performance Virtual Storage

• VMs use template images (pre-installed OS and software) to reduce setup time.
• Copy-on-Write (COW) technique minimizes disk space usage by creating small, efficient
backup files.
• Storage management should reduce duplicate blocks to optimize disk usage in virtual
clusters.

Live VM Migration Steps and Performance Effects

Overview of VM Migration

• In mixed host-guest clusters, physical nodes run tasks directly, while VMs serve as failover
replacements.
• VM failover is more flexible than traditional physical failover but depends on the host’s
availability.
• Live VM migration enables a running VM to move between hosts without service
interruption.
Steps of Live VM Migration

1. Start Migration (Steps 0 & 1)


o Identify the VM to migrate and select the destination host.

o Migration is triggered automatically (e.g., for load balancing or server consolidation).


2. Memory Transfer (Step 2)
o The entire VM memory is copied to the new host in multiple rounds.
o Changed (dirty) memory pages are recopied iteratively until only a small portion remains
(see the precopy sketch after these steps).
3. Suspend & Final Data Copy (Step 3)
o The VM is temporarily stopped while transferring the final memory pages, CPU, and
network states.
o This causes downtime, which should be minimized for user experience.
4. Commit & Activate VM (Steps 4 & 5)

o The VM is restored on the new host, resuming execution.


o The network is redirected to the new VM, and the old one is removed.
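A toy simulation of the iterative precopy loop (steps 2 and 3), under stated assumptions: a fixed number of pages, a shrinking set of pages dirtied while each round is in flight, and a stop-and-copy phase once the dirty set is small. The page counts and threshold are invented for illustration.

```python
import random

PAGES = 1000                      # illustrative VM memory size in pages
STOP_AND_COPY_THRESHOLD = 20      # suspend VM when this few pages remain

def precopy_migrate():
    dirty = set(range(PAGES))     # round 1: copy every page
    rounds = 0
    while len(dirty) > STOP_AND_COPY_THRESHOLD:
        rounds += 1
        transferred = len(dirty)
        # The VM keeps running, so some pages are dirtied again while
        # this round's copy is in flight (assumed ~25% here).
        dirty = {random.randrange(PAGES)
                 for _ in range(transferred // 4)}
        print(f"round {rounds}: sent {transferred} pages, "
              f"{len(dirty)} dirtied again")
    print(f"suspend VM, copy final {len(dirty)} pages + CPU/network state")

precopy_migrate()
```

If the workload dirties pages faster than the network can copy them, this loop never shrinks, which is the convergence problem discussed later.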

Performance Effects of Migration

Data Throughput Impact:

o Before migration: 870 MB/s


o First memory pre-copy (63 sec): 765 MB/s

o Further iterations (9.8 sec): 694 MB/s

Downtime: Only 165 milliseconds, ensuring minimal disruption.

Minimal Migration Overhead: Critical for dynamic cluster reconfiguration and disaster recovery,
especially in cloud computing.

Virtual Cluster Management Approaches

1. Guest-Based Manager: Runs within VMs (e.g., openMosix on Xen, Solaris cluster on
VMware).
2. Host-Based Manager: Runs on physical hosts and can restart VMs after failure (e.g.,
VMware HA).
3. Independent Cluster Manager: Manages both host and guest systems, increasing
complexity.
4. Integrated Cluster Management: Differentiates between virtual and physical resources
for optimal efficiency.

Role of Virtual Clusters

• Used in cloud computing, HPC, and computational grids.


• Enables dynamic resource allocation, quick failover, and efficient workload management.
• Live migration helps maintain uptime, reduce network congestion, and support flexible
resource scaling.

Migration of Memory, Files, and Network Resources

Introduction

Due to the high initial cost of clusters—including space, power, and cooling—leasing or sharing
clusters is a cost-effective approach.

Shared clusters improve resource utilization through multiplexing and economies of scale.

Early configuration and management systems help define service-specific clusters and allocate
physical nodes accordingly.

When migrating a system to another physical node, several key considerations must be
addressed.

Memory Migration

Memory migration is a critical aspect of VM migration, as moving a VM’s memory from one
host to another must be done efficiently.

Memory transfer sizes typically range from hundreds of megabytes to several gigabytes.
Internet Suspend-Resume (ISR) Technique

The ISR technique exploits temporal locality, meaning that memory states in the suspended and
resumed VM instances are largely similar.

Mechanism:

o Each file is represented as a tree of small subfiles.

o Both the suspended and resumed VM instances contain a copy of this tree.

o Only changed files are transmitted, reducing data transfer.

Limitations:

o ISR is useful when live migration is not required.

o Downtime is relatively high compared to live migration techniques.

File System Migration

For seamless VM migration, a system must ensure a consistent, location-independent file
system across all hosts.

Approaches to File System Migration

1. Virtual Disk Mapping

• Each VM is assigned a virtual disk that contains the file system.


• The entire disk is moved along with the VM state.
• Challenges: High-capacity disks make full disk migration impractical due to network
overhead.

2. Global Distributed File System

• A network-accessible file system eliminates the need for file transfers.


• This reduces migration time and enhances accessibility.
3. ISR-Based Distributed File System

• ISR uses a distributed file system as a transport mechanism for VM state transfer.
• The actual file systems are not mapped directly onto the distributed system.
• Instead, relevant files are copied into and out of the local file system during suspend and
resume operations.
• Advantages:
o Simplifies implementation by avoiding direct dependency on distributed file
system semantics.
• Challenges:
o The VMM must store VM virtual disk contents locally, which must be moved with
the VM state.

Network Migration in Virtual Machine (VM) Environments

VM migration involves maintaining network connectivity for the VM without relying on
forwarding or redirection mechanisms from the original host. This is essential for
uninterrupted service during migration.

Some Key Concepts:

Virtual IP and MAC Addresses

• Each VM is assigned a virtual IP address and MAC address, which are distinct from the
host machine’s address.

• These addresses must be maintained during migration for network communication.

• The Virtual Machine Monitor (VMM) maintains a mapping between the virtual IP/MAC
addresses and the VM.

Network Migration Mechanism

o When a VM migrates to a new host, the migration must include all protocol states and
the IP address of the VM.
o On a switched LAN, the migrating host sends an unsolicited (gratuitous) ARP reply, informing
other devices that the VM's IP has moved. This allows peers to update their network
configurations and route future packets to the VM's new location (a sketch follows this list).
o If the VM maintains its original Ethernet MAC address, the network switch can
automatically detect the migration to a new port without requiring further network
configuration.
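A minimal sketch of the unsolicited (gratuitous) ARP reply the destination host can broadcast after migration so that LAN peers update their mappings. It assumes the third-party scapy package and root privileges; the IP address, MAC address, and interface name are placeholders, not values from the source.

```python
# Requires: pip install scapy, and root privileges to send raw frames.
from scapy.all import ARP, Ether, sendp

VM_IP  = "192.168.1.50"        # placeholder: the migrating VM's IP
VM_MAC = "02:00:00:aa:bb:cc"   # placeholder: the VM's virtual MAC

# Gratuitous ARP reply (op=2): "VM_IP is at VM_MAC", broadcast to all,
# so switches and peers learn the VM's new location after migration.
frame = Ether(dst="ff:ff:ff:ff:ff:ff", src=VM_MAC) / ARP(
    op=2, hwsrc=VM_MAC, psrc=VM_IP,
    hwdst="ff:ff:ff:ff:ff:ff", pdst=VM_IP)

sendp(frame, iface="eth0")     # placeholder interface name
```

Because the VM keeps its virtual MAC, the switch also relearns the port from the frame itself, which is the automatic-detection case noted above.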

Live Migration of Virtual Machines

Live migration refers to the process of moving a VM from one physical node to another without
interrupting the VM’s operating system or applications.

This is essential for various enterprise workloads such as load balancing, system maintenance,
and proactive fault tolerance.

Live Migration Mechanism:

The precopy approach is widely used in live migration, where:

o Memory pages are transferred to the target node in multiple iterations.


o The first round transfers all memory pages, followed by subsequent rounds that only
transfer modified (dirty) pages.
o The VM remains online during migration, though there is a performance degradation due
to the network bandwidth consumption.

Precopy Migration Challenges:

Performance Degradation:

o The migration daemon consumes network bandwidth to transfer dirty pages, leading to
performance degradation.
o Rate limiting can mitigate performance hits, but this prolongs the migration process.

Convergence Issues:
o Some applications may not have small writable working sets, causing difficulties in
convergence, which might require additional migration iterations.

Memory Transfer Volume:

o The large volume of data transferred during the migration process is a key limitation in
precopy-based migration.

Checkpointing and Trace/Replay (CR/TR-Motion) Migration

To address the limitations of precopy migration, an alternative approach using checkpointing
and trace/replay (CR/TR-Motion) has been proposed.

• CR/TR-Motion transfers an execution trace file rather than the dirty memory pages,
significantly reducing the amount of transferred data.
• Advantages:
o Drastically reduces total migration time and downtime.
o Log files (execution traces) are much smaller than dirty pages, leading to a more
efficient migration.
• Limitations:
o The approach is effective only if the log replay rate exceeds the log growth rate.
o Differences between the source and target nodes may limit its effectiveness
in some scenarios.

Postcopy Migration

Postcopy migration transfers each memory page only once over the whole migration, reducing
the baseline total migration time. However, it introduces higher downtime, because memory
pages must be fetched from the source node on demand before the VM can make full progress
on the target node.

Advantages:

o Reduced total migration time, as each memory page is transferred only once.

Challenges:

o Higher downtime compared to precopy, due to the latency of fetching pages from the
source node.

Compression-Based Optimization for Memory Migration

With advancements in multicore and many-core machines, it is possible to compress
memory pages to reduce the data transferred during migration.

Compression Algorithms:

o Memory compression reduces the amount of data transferred during migration (a small
sketch follows).

o Decompression is fast and doesn't require significant memory, thus improving
migration efficiency.
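A small sketch of compressing a dirty page before transfer, assuming zlib and 4 KiB pages; actual compression ratios depend entirely on page contents, and production systems choose algorithms tuned for memory data rather than general-purpose zlib.

```python
import zlib

PAGE_SIZE = 4096  # bytes; typical x86 page size

# A zeroed or repetitive page (common in VM memory) compresses well.
dirty_page = b"\x00" * PAGE_SIZE

compressed = zlib.compress(dirty_page, level=1)  # fast, low CPU cost
print(f"{PAGE_SIZE} -> {len(compressed)} bytes on the wire")

# Receiver side: decompression is fast and needs little memory.
assert zlib.decompress(compressed) == dirty_page
```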

Live Migration Using Xen Hypervisor

Xen is a widely used Virtual Machine Monitor (VMM) that supports live migration by utilizing a
send/recv model to transfer VM states between source and target hosts.

• Dom0 (the control domain) manages the migration process, including the creation,
termination, or migration of VMs across hosts.
Virtual Cluster Research and Dynamic Deployment

Several virtual cluster research projects have focused on dynamic deployment to improve the
flexibility and resource allocation of VMs across clusters:

1. Cellular Disco at Stanford:

o A shared-memory multiprocessor system designed to handle dynamic VM migrations.

2. INRIA Virtual Cluster:

o Created to test parallel algorithm performance across VMs.

3. COD and VIOLIN Clusters:

o Other clusters studied for performance and dynamic VM management.

These projects demonstrate the potential benefits of dynamic VM migration within virtual
clusters, ensuring resources are efficiently allocated based on demand.

Virtualization for Data Center Automation

Growth of Data Centers


Major IT companies like Google, Amazon, Microsoft, HP, Apple, and IBM are investing in data
centers.

Billions of dollars are spent on data-center construction and automation.

Automation enables dynamic resource allocation for millions of users with QoS and cost-
efficiency.

Role of Virtualization in Data-Center Automation

Virtualization and cloud computing drive automation.

Market growth (2006-2011):

o 2006: $1.044 billion market share, dominated by production consolidation and software
development.
o 2011 (projected): $3.2 billion market share, expanding into high availability (HA), utility
computing, and workload balancing.

Virtualization reduces planned downtime and enhances mobility and scalability.

Developments in Virtualization

High availability (HA) ensures system uptime.

Backup services improve disaster recovery.

Workload balancing optimizes resource allocation.

Service-oriented automation and policy-based management enhance efficiency.

Server Consolidation in Data Centers

Workloads are categorized into:

o Chatty workloads: Fluctuating demand (e.g., video streaming at night).


o Noninteractive workloads: Consistent demand (e.g., high-performance computing).
Problem: Underutilized servers waste resources (hardware, space, power, management costs).

Solution: Virtualization-based server consolidation optimizes resource utilization.

Benefits of Server Virtualization

Improves hardware utilization by consolidating multiple servers into fewer machines.

Enhances disaster recovery and backup services.

Reduces Total Cost of Ownership (TCO):

 Fewer servers needed → lower maintenance, power, and cooling costs.

Improves availability and business continuity:

 Guest OS crashes do not affect the host OS or other VMs.

 VMs can be migrated between servers seamlessly.

Automated Resource Management in Virtualized Data Centers

Key factors:

o Resource scheduling and power management.

o Performance optimization through analytical models.

Scheduling levels:

o VM-level, server-level, and data-center-level scheduling.

Dynamic CPU allocation:

o Based on VM utilization and application QoS metrics.


o Two-level resource management using local (VM-level) and global (server-level) controllers.

Challenges and Future Scope


• Optimizing memory management in multicore processors (CMPs).
• Power budgeting strategies to balance power saving and data-center performance.
• Inter-VM communication and memory access protocols to enhance efficiency.

Server Consolidation in Data Centers

Types of Workloads in Data Centers

Chatty workloads: Burst at peak times and remain idle otherwise (e.g., web video services).

Noninteractive workloads: Do not require human intervention after submission (e.g., high-
performance computing).

Challenge: Workloads have different resource demands, leading to underutilized servers when
resources are allocated for peak demand.

Problems with Static Resource Allocation

Servers are often underutilized, wasting:

o Hardware resources
o Space and power
o Management costs

Resource optimization is needed at the level of CPU, memory, and network interfaces.

Virtualization-Based Server Consolidation

Key approach: Reduces the number of physical servers while optimizing resource use.

More effective than other consolidation techniques (e.g., centralized and physical consolidation).

Virtualization allows fine-grained resource allocation, improving flexibility.

Benefits of Server Virtualization


• Enhances hardware utilization: Underutilized servers are consolidated into fewer machines.
• Facilitates backup and disaster recovery.
• Improves agility in resource provisioning: VM images can be cloned and deployed quickly.
• Reduces Total Cost of Ownership (TCO):
o Fewer servers → lower maintenance, power, cooling, and cabling costs.
• Improves availability and business continuity:
o Guest OS crashes do not affect host OS or other VMs.
o Virtual servers can be migrated seamlessly between physical machines.

Challenges in Virtualized Data Centers

Resource scheduling complexity: Needs optimization at multiple levels:

o VM level

o Server level

o Data-center level

Dynamic CPU allocation:

o Based on VM utilization and application-level QoS metrics.


o Adjusts resources automatically for varying workloads.
Two-level resource management:

o A local controller (VM level) and a global controller (server level) work together
for autonomic resource allocation (a minimal sketch follows).
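A toy sketch of this two-level scheme, under stated assumptions: each local controller turns its VM's measured utilization into a CPU-share request with a QoS headroom factor, and the global controller scales requests down proportionally when the server is oversubscribed. All names and numbers are illustrative, not any product's algorithm.

```python
# Local controllers: each VM requests CPU shares from its utilization
# plus a QoS headroom. Global controller: fit requests to host capacity.
HEADROOM = 1.2           # illustrative QoS safety margin
CAPACITY = 100.0         # host CPU capacity in "shares"

def local_request(utilization):
    return utilization * HEADROOM

def global_allocate(requests):
    total = sum(requests.values())
    scale = min(1.0, CAPACITY / total)      # shrink proportionally if over
    return {vm: r * scale for vm, r in requests.items()}

utilization = {"vm1": 50.0, "vm2": 30.0, "vm3": 40.0}   # measured usage
requests = {vm: local_request(u) for vm, u in utilization.items()}
print(global_allocate(requests))   # shares that sum to <= CAPACITY
```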

Multicore Processors and Virtualization

Challenges with CMP (Chip Multiprocessing):

o Memory systems are not fully optimized for virtualization.


o Need to reduce memory access time and minimize inter-VM interference.
o VM-aware power budgeting required for balancing power savings and
performance.
Future Considerations for Optimized Virtualized Data Centers
Improving inter-VM communication protocols.
Enhancing memory sharing and management in CMP-based servers.
Integrating power management policies with VM-aware scheduling.
Addressing heterogeneity in workloads for better performance and efficiency.
Virtual Storage Management

Storage Virtualization in System Virtualization

Traditional Storage Virtualization: Aggregation and repartitioning of physical disks for use by
physical machines.

In System Virtualization: Virtual storage includes the storage managed by Virtual
Machine Monitors (VMMs) and guest OSes, with the data classified into two categories:

• VM images (specific to virtual environments)


• Application data (similar to traditional OS environments).

Challenges in Virtual Storage

Encapsulation and Isolation:

o VMs provide isolation between guest OSes, allowing multiple VMs to run on a
physical machine.
o Storage systems struggle to keep up with system and CPU advancements, becoming
the bottleneck in VM deployment.

Storage Management Issues:

o Guest OS storage operations behave as though accessing a real hard disk, but they
cannot directly access the physical disk.
o Multiple guest OSes may compete for disk resources when running on the same
machine.
o The storage management layer of the underlying VMM is much more complex than
traditional guest OS management.
VM Storage Primitives:

o Operations like remapping volumes across hosts and checkpointing disks are
complicated and sometimes unavailable.

Problems in Virtual Storage Management

Flooded VM images: Large numbers of VMs in data centers create excessive VM images,
consuming significant storage space.

Storage Management Complexity: Current storage management techniques don't cater to
virtualization needs, making it difficult to manage and optimize storage.

Solutions in Virtual Storage Management

Parallax (Distributed Storage System for Virtualization):

o A customized storage solution designed for virtual environments, aimed at
simplifying management and improving performance.
o Content Addressable Storage (CAS): Reduces the total size of VM images, supporting
large VM-based systems in data centers (a hash-based deduplication sketch follows this list).
o Federated Storage VMs: Moves traditional storage features into a federation of
storage VMs that share physical hosts with the VMs they serve, improving
management and performance.
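A minimal sketch of the content-addressable storage idea: blocks are keyed by a hash of their contents, so identical blocks shared by many VM images are stored only once. SHA-256 and the block size are assumptions for illustration, not Parallax's actual design.

```python
import hashlib

BLOCK = 4096                      # illustrative block size
store = {}                        # hash -> block data (stored once)

def put_image(image: bytes):
    """Store an image as a list of block hashes; dedups identical blocks."""
    recipe = []
    for i in range(0, len(image), BLOCK):
        blk = image[i:i + BLOCK]
        h = hashlib.sha256(blk).hexdigest()
        store.setdefault(h, blk)  # identical blocks stored only once
        recipe.append(h)
    return recipe

def get_image(recipe):
    return b"".join(store[h] for h in recipe)

img1 = b"A" * BLOCK * 3           # two images sharing most blocks
img2 = b"A" * BLOCK * 2 + b"B" * BLOCK
r1, r2 = put_image(img1), put_image(img2)
print(len(store), "unique blocks for 6 logical blocks")   # 2
assert get_image(r1) == img1 and get_image(r2) == img2
```

Since cloned template images differ only in a few blocks, this is why CAS sharply reduces the storage footprint of "flooded" VM images.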

Parallax System Architecture

Storage Appliance VM:

o Acts as a block virtualization layer between the VMs and physical storage devices.
o Provides a virtual disk for each VM on the same physical machine.
o Supports various system virtualization techniques, such as paravirtualization and full
virtualization.
Benefits of Parallax

• Simplifies storage management in virtualized data centers.


• Reduces the storage footprint by reducing the size of VM images.
• Improves performance by handling virtual storage operations more efficiently.
Cloud OS for Virtualized Data Centers

To serve as cloud providers, data centers must be virtualized.

Several Virtual Infrastructure (VI) managers and Cloud OSes are designed for managing
virtualized data centers efficiently.
Functions of VI Managers:

Create and manage VMs

Aggregate VMs into virtual clusters to provide elastic computing resources

Virtual networks support (Nimbus, Eucalyptus)

Dynamic resource provisioning and advance reservations (OpenNebula)

Virtual storage & data protection (vSphere 4)


Trust Management in Virtualized Data Centers

Role of the Virtual Machine Monitor (VMM) in Security:

The VMM (Hypervisor) creates and manages VMs, acting as an interface between OS and
hardware.

A VM is fully encapsulated, meaning its entire state can be copied, moved, and deleted like a
file.

The VMM is the foundation of security in a virtual system, controlling how VMs access
hardware resources.

Typically, one management VM is privileged to create, suspend, resume, or delete other VMs.

Security Risks in VMM-based Virtualized Environments:

VMM or Management VM Compromise:

o If an attacker gains control over the VMM or management VM, all VMs and the entire
system are at risk.

Random Number Reuse Issue:

o VMs can be rolled back to a previous state, causing old random numbers to be reused.
o This weakens session-key security in cryptographic protocols.
o TCP hijacking attacks can occur due to reuse of initial sequence numbers.

VM-Based Intrusion Detection Systems (IDS)

Intrusion: Unauthorized access to a system via local or network-based attacks.

Intrusion Detection System (IDS): Detects and recognizes these unauthorized actions.

IDS can be classified into:


o Host-based IDS (HIDS): Runs on individual VMs, but can be compromised if the VM is
attacked.
o Network-based IDS (NIDS): Monitors network traffic but may miss sophisticated attacks.
Advantages of VM-Based IDS in Virtualization:

Isolation: Even if a VM is compromised, other VMs remain unaffected.

VMM Security Monitoring: The VMM can monitor and audit access requests for hardware and
system software.

Combination of HIDS & NIDS:

o Provides system-level monitoring (HIDS).


o Offers network-level detection (NIDS).

Two Implementation Methods for VM-Based IDS:

Independent IDS Process in Each VM or a High-Privileged Management VM:


o IDS runs separately in every VM.

o A privileged VM manages IDS functions.


IDS Integrated into the VMM (Full Privilege Access):
o IDS is built into the hypervisor, giving direct hardware access.
o More effective but also increases the risk if compromised.
Additional Intrusion Prevention Methods:

IDS Logs:

o Analyzing attack patterns is critical.

o Logs are used for security monitoring but must be protected from tampering.

Honeypots & Honeynets:

o Honeypots simulate vulnerable systems to lure attackers.


o Honeynets use multiple honeypots to analyze attack behaviors.
o Virtual honeypots use VMs as decoy targets to track attackers.
