Support
Administration (ECA)
Course Guide
Copyright
Copyright 2022 Nutanix, Inc.
Nutanix, Inc.
1740 Technology Drive, Suite 150
San Jose, CA 95110
All rights reserved. This product is protected by U.S. and international copyright and intellectual
property laws. Nutanix and the Nutanix logo are registered trademarks of Nutanix, Inc. in the
United States and/or other jurisdictions. All other brand and product names mentioned herein
are for identification purposes only and may be trademarks of their respective holders.
License
The provision of this software to you does not grant any licenses or other rights under any
Microsoft patents with respect to anything other than the file server implementation portion of
the binaries for this software, including no licenses or any other rights in any hardware or any
devices or software that are used to communicate with or in connection with this software.
Conventions
Convention        Description
variable_value    The action depends on a value that is unique to your environment.
ncli> command     The commands are executed in the Nutanix nCLI.
Version A
Last modified: November 21, 2022
Do not replicate or distribute without written consent. Copyright Nutanix Inc. 2022 | 2
Contents
Copyright
License
Conventions
Version
Segmenting Networks
Segmented and Unsegmented Networks
Investigating Network Issues
Module Summary
Creating a Storage Policy
Updating a Storage Policy
Automating Common Administrative Tasks with Playbooks
Creating and Running a Playbook
Exporting a VM as an OVA File
Enabling VM High Availability
Module 6 Summary
What is Nutanix Move?
Downloading and Installing Nutanix Move
The Nutanix Move Dashboard
Migrating VMs Using Nutanix Move
Upgrading Nutanix Move
Viewing and Downloading Move Logs
Module 9 Summary
Putting a Node into Maintenance Mode
Removing a Node from Maintenance Mode
Starting and Stopping Nodes and Clusters
Shutting Down a Node in an AHV Cluster
Starting a Node in an AHV Cluster
Shutting Down an AHV Cluster
Modifying a Cluster
Expanding a Cluster
Removing a Node from a Cluster
Module 13 Summary
Getting Started with a Nutanix Cluster
Module
1
GETTING STARTED WITH A NUTANIX CLUSTER
Module 1 Overview
This module will introduce you to the concept of HCI, briefly discuss Nutanix cloud solution
packages, and introduce Prism Element and Prism Central.
You will learn the basic layout and elements of the Home dashboards of Prism Element and
Prism Central, and will be able to interact with key elements of these interfaces to view, explore,
and identify key cluster information, locate various types of performance data, and identify alert
and event messaging.
Traditional infrastructure creates silos, which have become a barrier to change and progress.
Every step of the acquisition, deployment, and management process is affected by these silos:
new initiatives require approval from multiple teams, IT needs must be predicted three to five
years in advance, and lock-in and licensing costs are stretching budgets to their breaking point.
As a result, enterprise IT teams are looking for ways to deliver their on-premises services to their
internal customers with the speed and operational efficiency of public cloud services such as
Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).
It is important to note that the cloud companies listed above, as well as some of the world's
largest web companies, faced all of the limitations that come with traditional infrastructure long
before the wider market, and developed distributed systems technologies to meet their needs
for scalability, reliability, and operational efficiency.
Then, engineers from several of these companies realized that their internal technology solution
had applications for IT at large, and this resulted in the birth of Hyperconverged Infrastructure
(HCI).
HCI is both hardware and software. The underlying hardware, although important, is
interchangeable, as long as its general capabilities can support whatever workloads you
intend to operate. This hardware is powered and managed by a distributed software layer
that eliminates the common pain points associated with three-tier infrastructure. The software
running on each node distributes all operating functions across the cluster for superior
performance and resilience.
At a fundamental level, this allows for complex, expensive, legacy infrastructure to be replaced
by a distributed platform running industry-standard commodity servers. Platform hardware
configurations can be made available to fit any workload by scaling the resources of an
individual node (CPU, RAM, storage). Nodes can also be provisioned with or without GPUs for
graphics acceleration. All nodes include flash to optimize storage performance, and all-flash
nodes are available to deliver maximum I/O throughput with minimum latency for all enterprise
applications. This enables enterprises to size their workloads precisely and scale as flexibly as
needed.
In addition to the distributed storage and compute platform, HCI solutions also include a
management pane that allows HCI resource administration and management to be performed
from a single interface. This eliminates the need for separate management solutions for servers,
storage, storage networks, and virtualization.
HCI effectively eliminates the need for forklift upgrades every three to five years, in which you
buy new storage and then migrate all your applications and data from your old storage to your
new storage.
With HCI, when the time comes to retire nodes and replace them with newer ones, you simply
add the new node to your cluster, let it integrate, and then tell the cluster’s admin console that
an old node is being taken out of service. The cluster control software evacuates that node and
severs its connection to the cluster. After that, you remove the node from its rack and take it
out of service.
What makes this approach remarkable is that no manual migration is needed, which means no
service disruption and no application downtime. The software layer does all the hard work.
Similarly, when you need more capacity or computing power, you just add nodes.
From an operational perspective, next-generation HCI allows you to implement what amounts
to a fractional consumption model, allowing you to scale on-demand without requiring a
massive capital investment each time your needs change. You buy what you need for right now
and when your needs grow, you add another node.
Agile Operations
HCI is fast and easy to install due to a well-packaged architecture with an integrated full-stack
installer. Not only does this allow you to deploy new HCI clusters quickly, it also automates the
deployment at multiple geographically distributed sites simultaneously.
HCI allows you to scale linearly and predictably due to its core architecture that automatically
redistributes data as new nodes are added. This increases storage performance along with
storage capacity.
With HCI, you can run different workloads on a single platform while optimizing
performance, resource efficiency, and cost. You can also manage your global
infrastructure as a unified cloud platform. HCI integrates the formerly separate server,
storage, storage network, and virtualization layers into a single infrastructure platform
that is managed through a single user interface.
Sustained Innovation
The next major benefit of HCI is how it enables your organization to focus on sustained
innovation. It helps maximize your team’s impact on the organization by enabling them to shift
more of their focus from maintenance and manual processes to innovation.
Innovation is a key component of driving business growth, and it depends on people dedicating
time and energy to innovation projects. The streamlined nature of HCI makes it possible to
easily automate many of the processes that have traditionally required manual intervention.
HCI makes it easy to implement hybrid cloud without needing to compromise on the benefits of
either on-prem infrastructure or the public cloud.
Optimized Economics
One of the most important benefits of HCI is how it enables you to optimize your IT
investments.
A key financial advantage of Hyperconverged Infrastructure is its ability to scale dynamically,
so you can easily change course as business needs change.
HCI not only makes it easy to integrate with public clouds, but it also brings cloud economics
to your private cloud. One piece of that is subscription-based licensing, enabling you to allocate
opex budget for datacenter infrastructure. Subscription licenses give you the flexibility to pay in
the way that works best for your business, without locking yourself into a particular solution or
vendor.
An Introduction to Nutanix
Nutanix provides a single platform that unifies hybrid multicloud management. The Nutanix
cloud platform integrates compute, virtualization, storage, networking, security, and containers,
and simplifies day-to-day management of a company’s IT environment. The Nutanix cloud
platform also extends from private to public clouds, currently available on Amazon Web
Services, with support for Microsoft Azure under development.
By running the Nutanix software on industry-standard servers, a business can gain all the
benefits of the Nutanix solution while also starting with a relatively small deployment and
scaling one node (i.e., server) at a time, as needed. Each node includes Intel-powered x86
or IBM Power hardware with flash SSDs and HDDs. Nutanix software running on each node
distributes all operating functions across the cluster for performance and resilience.
A single Nutanix cluster can scale as large as the hypervisor cluster it is on. Different hardware
platforms are available to address varying workload needs for compute and storage. Nutanix
software is hardware agnostic, running on hardware from vendors such as Dell, Lenovo, Cisco
UCS, HPE ProLiant, and more.
• Delivers a consistent operating model across public, private, and hybrid clouds
• Supports any application and workload in any location
• Offers choice and flexibility so that businesses can implement the right cloud operating
model for them
Nutanix solutions are available in five solution packages that bring together the infrastructure
management, operations, security, and business continuity technologies that Nutanix is known
for into easy-to-consume packages that provide customers with complete sets of capabilities
for their needs.
- Provides a complete software solution including virtual compute, storage and networking
for virtual machines and containers, that can be deployed in private data centers on
the hardware of your choice or in public clouds with built-in resilience, self-healing,
performance, disaster recovery capabilities, and security.
- Solution package includes the following products:
- Delivers distributed, software-defined storage for multiple protocols (volumes, files,
objects) to support a variety of workloads deployed anywhere – private, public, or hybrid
cloud – with license portability in between.
- A single point of management for all storage resources eliminates complexity of multiple
interfaces and enables non-storage experts to handle most day-to-day storage and data
management tasks.
- Intelligent analytics integrated into the solution provide data visibility and deep insights
for governance and security of data.
- Solution package includes the following products:
o Files Storage. Allows you to centrally manage, scale and adapt to changing file-storage
needs from on-premises to multiple clouds.
o Volumes Block Storage. Bridges physical and virtual infrastructure, combining them
into one unified platform with the simplicity that enterprises have grown to rely on.
o Objects Storage. Delivers secure S3-compatible object storage at massive scale to
hybrid cloud environments.
o Mine Integrated Backup. Minimizes downtime and consolidates your backup operations
into one turnkey solution that is simple, scalable and natively integrated.
• Nutanix Database Service (NDB)
o Nutanix Database Service. Easily operate fleets of Microsoft SQL Server, MongoDB,
MySQL, Oracle, and PostgreSQL databases at scale on-premises and in the cloud.
• Nutanix End User Computing Solutions (EUC)
- Enables the delivery of virtual apps and desktops to users worldwide from public, private,
and hybrid cloud infrastructure.
- Provides a per-user licensing option for NCI that simplifies capacity planning by matching
the infrastructure cost model to that of the end user computing platform.
- Includes a simple, fast, and flexible Desktop-as-a-Service (DaaS) platform that can run
end user workloads on Nutanix Cloud Infrastructure (NCI), on public clouds or on hybrid
clouds.
- Solution package includes the following products:
o Virtual Desktop Infrastructure (VDI). A complete software stack to unify your hybrid
cloud infrastructure including compute, storage, network, and hypervisors, in public or
private clouds.
o Frame (Desktop-as-a-Service). Delivers virtual apps and desktops to users worldwide,
either in the public cloud with AWS, Azure, or GCP, or on-prem with Nutanix AHV.
Three products form the core of the Nutanix solution, so much so that they are part of every
license tier for the Nutanix Cloud Infrastructure (NCI) solution package. These three products
are: AOS Storage, AHV Virtualization, and Prism.
AOS Storage
AOS Storage is a scalable, resilient, high performance, distributed storage solution that
you can use for all of your workloads. It provides enterprise-grade storage services for
applications, while eliminating the need for traditional SAN and NAS solutions. It also includes a
comprehensive set of capabilities for performance acceleration, data reduction, data protection,
and more. Among the benefits of AOS Storage are:
• High performance storage. Distributed data processing and local access to modern high
performance storage media like NVMe and Optane drive high bandwidth and low latency
that won’t degrade over time.
• Resilient and secure storage. Utilizing advanced distributed software algorithms, AOS
Storage protects data against everything from bit rot and hardware failure to physical theft
and total site failure.
• Flexible and scalable cloud infrastructure. Easily size and deploy infrastructure at any scale,
for any workload, and scale out quickly. Mix and match hardware configurations seamlessly,
so you can adapt over time.
AHV Virtualization
Prism
Prism lets you manage your entire environment from a single console. Simplify monitoring and
remediation with an end-to-end, application-centric view of your network—from every node
in the cluster to VM-specific details. Maintain control over resources with role-based access
control (RBAC), and automate workflows with Prism’s comprehensive REST APIs. Among the
benefits of Prism are:
A node is an x86 server with compute and storage resources. A single Nutanix cluster
running AHV can have a maximum of 32 nodes. A single (non-mixed) cluster running ESXi
can have a maximum of 48 nodes, while a single Hyper-V cluster can have a maximum of
16 nodes. Different hardware platforms are available to address varying workload needs for
compute and storage.
For more information, see the Configuration Maximums section of the Nutanix Support Portal.
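The per-hypervisor limits quoted above can be captured in a small lookup for use in planning scripts. This is a minimal sketch: the dictionary keys and the function name `fits_in_one_cluster` are hypothetical, and the numbers are simply those stated in this guide, so they should be checked against the Configuration Maximums page for your release.

```python
# Maximum nodes per single (non-mixed) cluster, as quoted in this guide.
# The keys and the function name are illustrative, not part of any Nutanix tool.
MAX_NODES = {"AHV": 32, "ESXi": 48, "Hyper-V": 16}


def fits_in_one_cluster(hypervisor: str, node_count: int) -> bool:
    """Return True if node_count is within the quoted limit for the hypervisor."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_count <= MAX_NODES[hypervisor]
```

For example, `fits_in_one_cluster("Hyper-V", 17)` is false under these figures, which would suggest splitting the nodes across more than one cluster.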
There are three types of node configurations for AHV: HCI, compute, and storage.
• An HCI node is the most common type of node. It includes CPU, memory, and storage in a
single physical chassis.
• A storage-only (SO) node has the bare minimum CPU capacity, but a significant amount of
onboard storage.
• A compute-only (CO) node has the bare minimum onboard storage, but a significant amount
of CPU and memory.
Note: An HCI node can run any supported hypervisor: AHV, VMware ESXi, or Hyper-V.
However, SO and CO nodes support AHV only.
For more information, see the Storage-Only Node Configuration and Compute-Only Node
Configuration pages of the Prism Web Console Guide, on the Nutanix Support Portal.
Block
In a typical Nutanix cluster, a block is a chassis that holds one to four nodes, and contains
power, cooling, and the backplane for the nodes. The number of nodes and drives depends on
the hardware chosen for the solution.
Cluster
A Nutanix cluster is a logical grouping of physical and logical components. A single Nutanix
cluster can consist of one, two, three, four, or more nodes, and these nodes can be housed in
one or more blocks. Since a cluster is both a physical and a logical grouping, it is possible for
nodes in a single block to belong to different clusters.
Joining multiple nodes in a cluster allows for resources to be pooled. As an example, all storage
hardware in a cluster (that is, all SSDs and HDDs) is presented as a single storage pool.
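The relationship between nodes, blocks, clusters, and the pooled storage can be sketched as a small data model. Everything here is an assumption made for illustration: the `Node` class, the node and cluster names, and the capacity figures are hypothetical, and the sum is raw capacity only, ignoring any replication or reservation overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    block: str      # chassis that physically houses the node
    cluster: str    # logical cluster the node belongs to
    ssd_tb: float   # hypothetical raw SSD capacity in TB
    hdd_tb: float   # hypothetical raw HDD capacity in TB


def pool_capacity_tb(nodes: list[Node], cluster: str) -> float:
    """Sum the raw SSD and HDD capacity of every node in the given cluster."""
    return sum(n.ssd_tb + n.hdd_tb for n in nodes if n.cluster == cluster)


# node-a and node-b share a block but belong to different clusters.
nodes = [
    Node("node-a", block="block-1", cluster="prod", ssd_tb=4.0, hdd_tb=16.0),
    Node("node-b", block="block-1", cluster="test", ssd_tb=4.0, hdd_tb=16.0),
    Node("node-c", block="block-2", cluster="prod", ssd_tb=8.0, hdd_tb=0.0),
]
print(pool_capacity_tb(nodes, "prod"))  # 28.0
```

The block a node sits in plays no part in the calculation, which mirrors the point that cluster membership is logical rather than tied to a chassis.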
All components run on multiple nodes in the cluster and depend on connectivity between their
peers. Most components also depend on other components for information.
The following figure is a detailed list of all cluster components. It highlights the role that each
component plays (for example, Prism is responsible for the UI and APIs; Genesis handles
component management; Arithmos is responsible for statistics; and so on) and the connections
and relationships between components.
Some of the more important components, which are useful for you to know when administering
a Nutanix cluster, are:
Acropolis
An Acropolis follower runs on every CVM, and one CVM is elected as the Acropolis leader. The
follower is responsible for stat collection and publishing, and provides VNC proxy capabilities.
The leader is responsible for stat collection and publishing, task scheduling and
execution, VM placement and scheduling, the network controller, and the VNC proxy.
Genesis
Genesis is a process that runs on each node and is responsible for service interactions
(start, stop, and so on) as well as for the initial configuration. Genesis runs
independently of the cluster and does not require the cluster to be configured or running. The
only requirement for Genesis to be running is that Zookeeper is up and running.
Zookeeper
Zookeeper stores information about all cluster components (both hardware and software),
including their IP addresses, capacities, and data replication rules, in the cluster configuration.
Zookeeper is active on either three or five nodes, depending on the redundancy factor
(number of data block copies) applied to the cluster. Zookeeper uses multiple nodes to prevent
stale data from being returned to other components. An odd number provides a method for
breaking ties if two nodes have different information.
Of these nodes, Zookeeper elects one node as the leader. The leader receives all requests for
information and confers with its follower nodes. If the leader stops responding, a new leader is
selected automatically.
Zookeeper has no dependencies, meaning that it can start even if no other cluster components
are running.
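The tie-breaking role of an odd node count can be illustrated with simple majority arithmetic. This is a minimal sketch that assumes a strict-majority rule, which is how Apache ZooKeeper ensembles generally reach agreement; the function names are hypothetical and nothing here reflects Nutanix-specific internals.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of nodes that can fail while a majority can still be formed."""
    return ensemble_size - majority(ensemble_size)


# Print ensemble size, majority size, and tolerated failures for each case.
for size in (3, 4, 5):
    print(size, majority(size), tolerated_failures(size))
```

Under this rule, three nodes tolerate one failure and five tolerate two, while four nodes tolerate no more failures than three do, which is one reason an even count adds little.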
Zeus
Zeus is an interface to access the information stored within Zookeeper and is the Nutanix
library that all other components use to access the cluster configuration.
A key element of a distributed system is a method for all nodes to store and update the
cluster's configuration. This configuration includes details about the physical components in the
cluster, such as hosts and disks, and logical components, like storage containers.
Medusa
Distributed systems that store data for other systems (for example, a hypervisor that hosts
virtual machines) must have a way to keep track of where that data is. In the case of a Nutanix
cluster, it is also important to track where the replicas of that data are stored.
Medusa is a Nutanix abstraction layer that sits in front of the database that holds metadata.
The database is distributed in a ring topology across multiple nodes in the cluster for resiliency,
using a modified form of Apache Cassandra.
Cassandra
Nutanix's implementation of Cassandra uses a version of Apache Cassandra that has been
modified for high performance and automatic, on-demand scaling. Cassandra stores all
metadata about the guest VM data in a Nutanix storage container.
Cassandra runs on all nodes of the cluster. These nodes communicate with each other once
a second using the Gossip protocol, ensuring that the state of the database is current on all
nodes. Cassandra depends on Zeus to gather information about the cluster configuration.
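To make the idea of a ring topology more concrete, the sketch below places each node at a position on a ring and stores copies of a key on the next nodes clockwise. It is a generic illustration of ring-based placement and should not be read as the algorithm used by the modified Cassandra in a Nutanix cluster; the hash choice, the 32-bit token space, and the names `token` and `replicas` are all assumptions.

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    """Map a string to a ring position (hypothetical 32-bit token space)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def replicas(key: str, nodes: list[str], copies: int) -> list[str]:
    """Pick distinct nodes by walking clockwise from the key's ring position."""
    ring = sorted(nodes, key=token)          # nodes ordered around the ring
    tokens = [token(n) for n in ring]
    start = bisect_right(tokens, token(key)) % len(ring)  # wrap past the end
    count = min(copies, len(ring))           # cannot exceed the node count
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

A call such as `replicas("vdisk-42", ["cvm-1", "cvm-2", "cvm-3", "cvm-4"], 3)` returns three distinct node names, and the same key always maps to the same nodes while the node list is unchanged.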
Stargate
A distributed system that presents storage to other systems (such as a hypervisor) needs a
unified component for receiving and processing data. The Nutanix cluster has a
software component called Stargate that manages this responsibility.
All read and write requests from a hypervisor are sent across an internal vSwitch to the
Stargate process running on the same node.
Stargate depends on Medusa to gather metadata and Zeus to gather cluster configuration data.
From the perspective of the hypervisor, Stargate is the main point of contact for the Nutanix
cluster.
Curator
A Curator leader node periodically scans the metadata database and identifies cleanup and
optimization tasks that Stargate should perform. Curator shares analyzed metadata across
other Curator nodes.
Curator depends on Zeus to learn which nodes are available, and Medusa to gather metadata.
Based on that analysis, it sends commands to Stargate.
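The dependencies described in this section can be collected into a graph and ordered so that each component appears after the ones it relies on. This is a reading aid under stated assumptions, not the real service startup sequence: the Zeus-to-Zookeeper and Medusa-to-Cassandra edges are inferred from the "interface to" and "sits in front of" descriptions, and the `DEPENDS_ON` name is hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# component -> components it depends on, as read from the descriptions above
DEPENDS_ON = {
    "Zookeeper": set(),              # has no dependencies
    "Genesis": {"Zookeeper"},
    "Zeus": {"Zookeeper"},           # inferred: Zeus is the interface to Zookeeper
    "Cassandra": {"Zeus"},
    "Medusa": {"Cassandra"},         # inferred: Medusa fronts the metadata database
    "Stargate": {"Medusa", "Zeus"},
    "Curator": {"Medusa", "Zeus"},
}

# static_order() yields every component after all of its dependencies.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)
```

Several valid orderings exist, so the printed list may vary between components that do not depend on each other; only the relative order along each dependency chain is guaranteed.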
Why are we starting with Nutanix Prism? When you use the Nutanix Cloud Platform, you will
use Prism to perform most (if not all) of your monitoring, administrative, and management
tasks. As a result, this course has been built and oriented around Prism. That is, you will
understand key concepts first, and then learn where they can be found, used, or accessed in
Prism.
This section serves as an introduction to Nutanix Prism and its two components: Prism Element
and Prism Central. In this section, you will learn:
• What Nutanix Prism and its components - Prism Element and Prism Central - are.
Prism is a part of every Nutanix deployment and has two core components:
• Prism Element
- Is a service built into the platform for every Nutanix cluster deployed.
- Provides the ability to fully configure, manage, and monitor Nutanix clusters running any
hypervisor (for example, AHV, vSphere, or Hyper-V).
- Only manages the cluster that it is a part of. This means in a deployment with multiple
Nutanix clusters, each cluster has a unique instance of Prism Element for the management
of that specific cluster.
• Prism Central
- Allows you to manage different clusters on one screen. These clusters can be in a single
physical location or in multiple physical locations.
- Offers an organizational view into a distributed Nutanix environment.
- Offers the ability to enable a variety of services, such as Calm, Karbon, Files, and so on.
- Is essentially used to consolidate multiple instances of Prism Element and manage them in
a single interface.
Prism Element
Prism Element provides you with several different management capabilities for the
administration of a single cluster. Some of these capabilities are:
Cluster Management
You can manage a single cluster using Prism Element. This involves configuring and monitoring
the entities within the cluster, such as virtual machines, storage containers, hardware
components, and so on.
Storage Management
Prism Element helps you monitor storage usage across the cluster. It lets you
create storage containers and volume groups, and configure a threshold warning for the
storage capacity available in the cluster. You can also reserve storage capacity for rebuilding
failed nodes, blocks, or racks.
Network Management
Prism Element allows you to track and record networking statistics for a cluster. On AHV
clusters, a network visualizer is provided that displays a consolidated graphical representation
of the network formed by the VMs and hosts in a Nutanix cluster. You can use the visualizer to
monitor the network and to obtain information that helps you identify network issues.
Before we can explore Prism Element and understand its capabilities, we need to actually
access the web console.
Note: Most of the steps in the following procedure are only applicable to your
very first Prism Element login. On each subsequent login, entering your username
and password will take you directly to the Home dashboard with no additional
actions typically required. Additional actions may be required after an upgrade is
performed.
1. Open a supported browser, such as Firefox, Chrome, Safari, Internet Explorer version 11, or
Microsoft Edge. Type http://management_ip_addr in the address bar, and then press Enter or
click the right arrow icon.
• This IP address can be either the cluster virtual IP address or the IP address of
any Nutanix Controller VM (CVM) in the cluster.
2. The browser will redirect to the encrypted port (9440) and may display an SSL certificate
warning. Acknowledge the warning and proceed to the site.
3. Enter your username and password, and press Enter. If you are logging in as an
administrator for the first time, you will be required to change the default password.
4. After you change your password, you will be prompted to accept the End User License
Agreement (EULA). On this page, read the license agreement, update the required
information, accept the terms and conditions, and click the Accept button.
• You will only see the EULA if you are logging in for the first time, or if the EULA has
changed since the last login.
5. The next screen will inform you that Pulse will be enabled. Click the Continue button.
• Like the EULA screen, you will only see this screen if you are logging in for the first time,
or after an upgrade. Pulse is a feature that alerts Nutanix customer support if your cluster
experiences health issues and is discussed in the next section.
6. If Pulse is enabled, or after a cluster is upgraded, the next screen will prompt you to enable
enhanced cluster health monitoring. Click the Yes button.
• This feature acts in addition to Pulse and provides Nutanix customer support with more
detailed information that allows them to monitor the health of your cluster more
effectively. It is recommended to enable enhanced cluster health monitoring unless
providing cluster information to Nutanix customer support violates your security policies.
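The redirect in step 2 and the conditional screens in steps 3 to 6 can be summarized in a short Python sketch. The function names, screen labels, and the sample IP address are hypothetical and only model the behavior described above; they are not part of any Nutanix API. The sketch also assumes that Pulse is enabled through the screen in step 5.

```python
def prism_urls(management_ip: str) -> tuple[str, str]:
    """Return the URL typed in step 1 and the HTTPS URL reached after the redirect."""
    typed = f"http://{management_ip}"
    redirected = f"https://{management_ip}:9440"  # encrypted port named in step 2
    return typed, redirected


def login_screens(first_login: bool, eula_changed: bool = False,
                  after_upgrade: bool = False) -> list[str]:
    """List the screens a user passes through, following steps 3 to 6."""
    screens = ["login"]
    if first_login:
        # Administrators must change the default password on first login.
        screens.append("change default password")
    if first_login or eula_changed:
        screens.append("EULA")
    if first_login or after_upgrade:
        # Assumption: Pulse is enabled here, which triggers the step 6 prompt.
        screens.append("Pulse")
        screens.append("enhanced cluster health monitoring")
    screens.append("Home dashboard")
    return screens


if __name__ == "__main__":
    print(prism_urls("10.0.0.10"))           # placeholder address
    print(login_screens(first_login=True))
    print(login_screens(first_login=False))
```

On a routine login with no EULA change and no upgrade, the model returns only the login screen followed by the Home dashboard, which matches the note at the start of this procedure.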
Before we look at Prism Element's Home dashboard, let's take a moment to discuss Pulse.
Pulse is a feature that provides diagnostic system data to Nutanix customer support, allowing
proactive, context-aware support to be delivered for Nutanix solutions. This diagnostic data is
collected unobtrusively in the background with little to no effect on system performance.
When logging into Prism for the first time or after an upgrade, the system will check whether
or not Pulse is enabled. If it is not, you will be prompted to enable Pulse. When Pulse is enabled,
it sends a summary email of the cluster configuration to a Nutanix Support server daily by
default. These messages are sent through ports 80/8443/443 (by default) or through your mail
server (if configured). It is recommended to enable Pulse unless providing cluster information to
Nutanix customer support violates your security policies.
• System alerts.
• Current Nutanix software version.
• Nutanix processes and CVM information.
• Hypervisor information, such as type and version.
Important data, such as system-level statistics and configuration information, is collected more
frequently so that issues can be detected automatically, and troubleshooting can be made
easier.
Note: For details about the information collected by Pulse, see KB 2332 on the
Nutanix Support Portal.
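Because Pulse relies on outbound connectivity over ports 80, 8443, and 443 by default, a quick reachability test can help when Pulse messages do not seem to arrive. The following Python sketch is a generic TCP check, not a Nutanix tool. The text above does not name the destination host, so you must supply it yourself; the helper names and the injectable `connect` parameter are illustrative assumptions.

```python
import socket

PULSE_PORTS = (80, 8443, 443)  # default ports named in the text above


def port_open(host: str, port: int, timeout: float = 3.0,
              connect=socket.create_connection) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False
    conn.close()
    return True


def check_pulse_ports(host: str, ports=PULSE_PORTS, timeout: float = 3.0,
                      connect=socket.create_connection) -> dict[int, bool]:
    """Map each port to whether it is reachable on the given host."""
    return {port: port_open(host, port, timeout, connect) for port in ports}
```

For example, `check_pulse_ports("pulse-endpoint.example.com")` (a placeholder host name) returns a dictionary such as `{80: ..., 8443: ..., 443: ...}` that you can compare against your firewall rules.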
By default, the Home dashboard of Prism Element provides a high-level overview of nearly
every major facet of a single cluster. As a result, if you are responsible for monitoring and
reporting on a single cluster, the Home dashboard of Prism Element is a good place to start
looking for information.
From the Home dashboard alone, you can view the status of VMs, hosts, storage, and Prism
Central; a handful of basic performance metrics; the overall health and data resiliency status of
your cluster; as well as alerts and events grouped by severity level.
Having this information readily accessible means that you can quickly scan through the Home
dashboard, identify areas that need further investigation or exploration, and then either use the
Home dashboard widgets themselves or a different dashboard in Prism Element to drill down and
obtain further details.
Each of these different dashboards in Prism Element allows you to perform different
administrative tasks on different entities in a cluster - from hardware and networks, to VMs
and protection domains. If we consider managing VMs as an example, with Prism Element you
can create, update, delete, clone, power on/off, and migrate VMs. You can also take snapshots,
launch the console, and manage Nutanix Guest Tools.
If you are just starting out with Nutanix and want to familiarize yourself with Nutanix
management features and capabilities, Prism Element is a good place to start. However,
once you are familiar with Prism Element, for a larger variety of options and a richer, deeper
administrative feature set, it is recommended to use Prism Central.
In this section, you will be introduced to the various elements of Prism Element's Home
dashboard and how to gather information for monitoring and reporting purposes.
After you log into Prism Element, this is what you will typically see:
The Prism Element Home dashboard consists of a number of default widgets that give you a
quick overview of the components, performance, health, and overall status of your cluster as
soon as you log in. The elements of the Home dashboard are:
The main menu bar is visible at the top of every screen in Prism Element. It comprises
multiple icons that provide you with information, such as the cluster name and details, the
overall health of the cluster, the name of the current dashboard, and so on. It also provides
shortcuts to different information screens. Clicking an icon takes you directly to the
corresponding dashboard or menu. For example, clicking the heart icon opens the Health dashboard, the gear icon
opens the Settings menu, the question mark icon opens a drop-down menu with a list of help
items, and so on.
Note: In a typical Nutanix cluster, a block is a chassis that holds one to four nodes,
and contains power, cooling, and the backplane for the nodes. Nodes, blocks, and
clusters will be discussed in more detail in the Hardware module.
Performance Widgets
By default, Prism Element displays a number of performance charts and statistics on the Home
dashboard, designed to help you assess cluster performance at a glance.
Moving the mouse cursor over any one of the three charts will display a button at the top
left that will allow you to add a temporary, custom chart to the Analysis dashboard in Prism
Element. You will be prompted to save the chart on the Analysis dashboard for future viewing.
On the Home dashboard, moving your mouse cursor over the graph itself will display details for
a specific point in time.
Moving your mouse cursor over the Cluster CPU Usage or Cluster Memory Usage widgets will
display an Analyze button. Click this button to add a temporary, custom chart to the Analysis
dashboard.
Health Widgets
The Health widget displays the health status for the cluster as a whole, as well as cluster
entities. The three possible health states are good, warning, and critical, indicated by green,
yellow, and red heart icons respectively. Scrolling down in the widget will display the health
status of hosts, VMs, cluster services, disks, storage containers, and storage pools. Clicking the
widget will take you to the Health dashboard, where you can view more details about each or
all of these entities.
The Data Resiliency Status widget summarizes the number of failures that a cluster can
withstand. Moving your mouse cursor over the question marks next to Failure Domain and
Fault Tolerance will display a small popup with more information about each item. Clicking the
widget will display a Data Resiliency Status details page, which shows how many individual
component failures the cluster can withstand based on the currently configured failure domain.
The Alerts widgets display the most recent unresolved alerts on the cluster. Each set of alerts
is displayed in a separate widget, with one widget each for critical, warning, and informational
alerts.
The Events widget displays the total number of events that have occurred on the cluster, as
well as how recently the last event occurred.
In Prism Element, clicking the cluster name in the main menu bar displays the Cluster Details
window as shown in the following figure.
The Cluster Details window contains the cluster UUID, cluster ID, cluster incarnation ID, cluster
subnet, cluster name, FQDN, virtual IP, virtual IPv6, iSCSI data services IP, a checkbox for the
retention of deleted VMs, and the cluster encryption state.
With the exception of the cluster UUID, ID, incarnation ID, subnet, and encryption state, all other
details can be configured directly in the Cluster Details window.
Prism Central
Like Prism Element, Prism Central is a web-based management console that can be used
to perform a variety of monitoring, management, and administrative tasks. However, there are
some important differences to note.
Unlike Prism Element, which is built into every cluster, Prism Central is an application that must
first be deployed as a VM before it can be used. A single instance of Prism Central can be
used to manage just one cluster, or several clusters. The choice of whether or not to use Prism
Central depends on two factors:
Even if you only have a single cluster to manage - whether it is a large or small deployment
- management via Prism Central gives you access to features that are not present in Prism
Element, such as advanced reporting capabilities; a customizable main dashboard and the
ability to create custom dashboards; tools to analyze system activity, plan for resource needs,
create usage reports, and automate routine administrative tasks; and more.
Note: The features in Prism Central as compared to Prism Element are discussed
later in this module.
In addition to the Prism Element capabilities, Prism Central includes a host of other advanced
features (based on your licensing) that help in managing and monitoring your clusters. Some
of these are:
Cost Management
Provides access to a cost governance and security compliance SaaS product offering, called Xi
Beam. This helps gain visibility into cloud spend and security compliance status across multiple
cloud environments.
Resource Planning
You can use the planning dashboard to review and analyze current and potential resource
needs. The Capacity Runway tab allows you to view current resource runway information
across the registered Nutanix and non-Nutanix clusters.
Task Automation
The X-Play feature allows you to automate routine administrative tasks, and auto-remediate
issues that may occur in your system. This automation is achieved by creating playbooks.
You can enable selected services through Prism Central, such as Calm, Karbon, Files, Objects,
and more.
1. Open a web browser, enter http://management_ip_addr and press Enter. When logging into
Prism Central, you must provide the IP address of the Prism Central VM.
2. At the login screen, enter your username and password and press Enter or click the right
arrow icon.
3. If you are logging in to Prism Central for the first time, you will see several screens that are
similar to the first-time login screens for Prism Element. These are:
a. A prompt to change the default admin password. If prompted, enter a new password,
retype to confirm, and press Enter or click the right arrow icon.
b. A license agreement screen. Read the license agreement, enter the required information,
agree to the terms and conditions, and click the Accept button.
After you have successfully logged in, you should see the Main Dashboard of Prism Central.
Note: You can also access Prism Central from the Prism Element Home dashboard
if the cluster is registered to a Prism Central instance. Click Register to register the
cluster to a Prism Central instance. Once registered, you can click Launch to launch
the Prism Central instance in a new tab of your browser.
Much like Prism Element, Prism Central also provides a high-level overview of nearly every
major facet of one or more clusters. However, Prism Central offers additional, more advanced
functionality when compared to Prism Element.
The Main dashboard of Prism Central is customizable. While a number of widgets are available
on the Main dashboard out-of-the-box, you can remove these widgets and replace them with
others as needed.
In addition to customizing the Main dashboard, based on your Prism Central license level, you
can also create multiple, custom dashboards with widgets that provide information about a
specific resource or usage topic. A list of custom widgets is available in the Prism Central Guide
on the Nutanix Support Portal.
So, the home dashboards of Prism Element and Prism Central serve different needs for cluster
monitoring. Prism Element provides a robust, out-of-the-box monitoring experience that
provides you with easy access to summarized and detailed information. Prism Central provides
all of the functionality of Prism Element, while also offering a highly customizable monitoring
experience that can be tailored to suit the specific needs of your business. And, of the two
interfaces, Prism Central is the only one that allows you to monitor multiple clusters from a
single location.
Prism Central also includes more management and administrative features when compared
to Prism Element. If we consider VM management as an example, in addition to all of
Prism Element's functionality, Prism Central also allows you to enable or disable efficiency
measurement, enable or disable anomaly detection, add VMs to catalogs, run playbooks,
manage categories, and more. As a result, when managing even one cluster, it is recommended
to use Prism Central, so that you have access to its advanced feature set.
In this section, you will learn more about the Main dashboard of Prism Central, and how you
can use it to quickly and easily obtain both basic and detailed status, health, and performance
information for one or more clusters.
When you log into Prism Central, you will see the Main Dashboard. It is fully customizable
and contains a number of default widgets that provide an overview of key information for all
clusters that are connected to this particular instance of Prism Central. The following figure
shows a sample, default Prism Central dashboard with two connected test clusters.
The Prism Central Main Dashboard can be divided into three sections:
Prism Central's main menu provides you with different elements, such as the entities menu
which gives you access to the different dashboards and screens, the search field, the alerts
(bell) icon, the settings (gear) icon, and much more. In the next section we will discuss the
entities menu. This is what you will use to access the different dashboards and screens to
manage various aspects of one or more clusters.
The first option on the entities menu is Dashboard. Click this to return to the Main Dashboard
from anywhere in Prism Central at any time.
The Bookmarks option shown in the figure above may not be visible if you are using or working
with a brand new cluster, or a cluster that has no bookmarked pages. When you search for an
entity in the search field of the main menu bar, you can click the star icon to the right of the
field to bookmark that particular search result. That bookmarked result (or page) will then be
displayed in the bookmarks category of the entities menu.
You can also bookmark a sub-page or dashboard by clicking the star icon next to its name in
the entities menu.
Next, you will see eight category options. Each category contains a collection of dashboards
that can be used to monitor and manage different aspects of the cluster. For example, the
Compute & Storage category contains individual dashboards for VMs, Templates, OVAs,
Images, Catalog Items, Volume Groups, and Storage Policies. The categories and their
associated dashboards are:
• Compute & Storage. Has pages for VMs, Templates, OVAs, Images, Catalog Items, Volume
Groups, and Storage Policies.
• Network & Security. Has pages for Subnets, Virtual Private Clouds, Floating IPs, Connectivity,
and Security Policies.
• Data Protection. Has pages for Protection Summary, Protection Policies, Recovery Plans, and
VM Recovery Policies.
• Hardware. Has pages for Clusters, Hosts, Disks, and GPUs.
• Activity. Has pages for Alerts, Events, Tasks, and Audits.
• Operations. Has pages for Analysis, Cost Management, Discovery, Monitoring Configurations,
Operations Policies, Planning, Playbooks, and Reports.
• Administration. Has pages for Categories, Users, Roles, Projects, Availability Zones, and LCM.
• Services. Has pages for Calm, Files, Foundation Central, Karbon, and Objects.
Note: Depending on your licensing tier (Starter, Pro, or Ultimate) and the solution
package that you own, you may see more or fewer options in the entities menu of
your cluster.
The Dashboards menu bar is directly under the main menu bar and is only visible when viewing
the Main Dashboard in Prism Central. The options available are:
• Main Dashboard. Click this option to return to the main dashboard from a custom dashboard
that you have created. If you do not have custom dashboards or are currently viewing the
Main Dashboard, clicking this option will have no effect.
• Manage Dashboard. Click this option to create a custom dashboard.
• Reset Dashboard. If you add custom widgets to your main dashboard or remove some
of the default widgets, you can click this option to restore the default Prism Central main
dashboard.
• Add Widgets. Click this to add custom widgets to either your main dashboard or a custom
dashboard.
• Data Density. This affects the display of data within a widget. The light, medium, and dense
options adjust the amount of spacing between text, charts, and other data in each widget. Medium
density is the default, and was chosen based on user research by Nutanix.
This section includes a number of widgets which provide information that you can use to
monitor the clusters registered to this instance of Prism Central. Some of these widgets are:
• Alerts Widget: Displays the number of critical, warning, and informational alerts on all
monitored clusters within a selected time period.
• Cluster Quick Access Widget: Displays the clusters managed by this instance of Prism
Central. Each of the cluster names is clickable; doing so will open that cluster's Prism
Element.
• Cluster Storage Widget: Displays the amount of free and used storage space, and data
resiliency for all clusters managed by this instance of Prism Central.
• Cluster Performance Widgets: These widgets display key performance metrics for all clusters
managed by this instance of Prism Central. The four performance widgets are cluster
latency, cluster memory usage, cluster CPU usage, and controller IOPS.
• Cluster Runway Widget: Displays the available capacity for all clusters managed by this
instance of Prism Central.
• VM Efficiency Widget: Displays a summary of problematic VMs and provides a link to view all
inefficient VMs on a filtered version of Prism Central's VMs dashboard.
• Plays Widget: A playbook is an automation tool in Prism Central that allows you to define a
trigger, which will result in the automated occurrence of an action or a series of actions. A
play is an execution instance of a playbook. The Plays widget displays the number of plays
that have been completed, failed, or paused in the last 24 hours. Each of the three numbers
is clickable.
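The relationship between playbooks, plays, and the Plays widget can be illustrated with a small data model. The Python sketch below is hypothetical: the class names, field names, and sample values are invented for illustration and do not represent the X-Play API. It only shows that a playbook pairs a trigger with one or more actions, that each play is one execution of a playbook, and that the widget counts plays by status over the last 24 hours.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Playbook:
    name: str
    trigger: str          # the event that starts the playbook
    actions: list[str]    # one or more actions that run in response


@dataclass
class Play:
    playbook: Playbook    # a play is one execution instance of a playbook
    status: str           # "completed", "failed", or "paused"
    updated_at: datetime  # when the play reached its current status


def plays_widget(plays: list[Play], now: datetime) -> dict[str, int]:
    """Count plays per status for the 24 hours before `now`."""
    cutoff = now - timedelta(hours=24)
    counts = Counter(p.status for p in plays if p.updated_at >= cutoff)
    return {status: counts.get(status, 0)
            for status in ("completed", "failed", "paused")}


if __name__ == "__main__":
    now = datetime(2022, 11, 21, 12, 0)
    playbook = Playbook("sample-playbook", "sample alert", ["sample action"])
    plays = [
        Play(playbook, "completed", now - timedelta(hours=1)),
        Play(playbook, "failed", now - timedelta(hours=2)),
        Play(playbook, "completed", now - timedelta(hours=30)),  # too old to count
    ]
    print(plays_widget(plays, now))
```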
Prism Central allows you to view both basic and detailed cluster information for all clusters that
it manages. To access this information, you need to:
3. By default, you should see the List view of the Clusters dashboard. In the Name column, click
the name of a managed cluster to view the Cluster Details page.
You will see a page similar to the one in the figure below.
Each cluster details page contains a wealth of information that can be used to quickly and
easily assess the performance and current state of a cluster. Information is presented in 8 tabs:
• Summary. Allows you to launch Prism Element for that cluster, upgrade software, configure
fault tolerance, manage categories, run playbooks, enable encryption, and more.
• Alerts. Lists all alerts for that specific cluster only.
Note: Certain cluster specific details - such as the cluster virtual IP and the
iSCSI data services IP - cannot be configured in Prism Central. They can only be
configured in the Cluster Details window of Prism Element.
As seen in the image above, expanding the entities menu in Prism Central provides multiple
options to manage, administer, and monitor your clusters, and gives you access to other
feature sets, such as creating categories, roles, projects, and much more. These features are not
available in the Prism Element console. As a result, in general, it is recommended to use Prism
Central to monitor, manage, and administer the clusters in your environment, and to use Prism
Element specifically when certain tasks require that it be used.
The following table provides a non-exhaustive list of Prism Central and Prism Element
features in four categories (hardware, storage, network, and VMs). You can use this table as
a quick reference guide to decide when to use Prism Central versus Prism Element. For more
information, see the Prism Central Guide and the Prism Web Console Guide respectively.
Virtual machines
Prism Central: Create, update, clone, delete, launch console, migrate, power off/on, hard power
off, soft shutdown, enable/disable efficiency measurement, enable/disable anomaly detection,
protect/unprotect, create recovery point, install NGT, manage NGT, upgrade NGT, configure host
affinity, add to catalog, export as OVA, run playbook, manage categories, manage affinity
policies, manage NGT policies.
Prism Element: Create, update, clone, delete, launch console, migrate, take snapshot, power
on/off, manage guest tools.
As you can see, while Prism Element does have a strong set of features, the overwhelming
majority of monitoring, management, administration, and operations features are present
in Prism Central. As a result, this course focuses primarily on monitoring and certain basic
administrative tasks in Prism Central.
Regardless, so that you are familiar with these procedures and key documents on the Nutanix
Support Portal, this section briefly describes initial cluster setup tasks that need to be
performed on a new cluster, along with useful references to step-by-step instructions for each.
Running Nutanix Cluster Check (NCC)
Nutanix Cluster Check (NCC) is cluster-resident software that can help diagnose cluster
health and identify configurations qualified and recommended by Nutanix. NCC continuously
and proactively runs hundreds of checks and takes actions as necessary to resolve issues.
Depending on the issue identified, NCC will raise an alert or automatically create a Nutanix
Support case. NCC can be run provided that the individual nodes are up, regardless of cluster
state.
While NCC runs regularly, it is typically recommended to run NCC both before and after cluster
upgrades, before and after making changes to cluster hardware, and after deploying a new
cluster or completing the Foundation process on an existing one.
For step-by-step instructions, see Running NCC on Prism Central and Running NCC on Prism
Element. Note that Prism Element allows you to run NCC via the GUI, while Prism Central
requires that NCC be used via the command line.
For more information about NCC, see the Nutanix Cluster Check Guide on the Nutanix Support
Portal.
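Because Prism Central requires that NCC be run from the command line, a scripted run can be convenient before and after upgrades. The Python sketch below wraps the `ncc health_checks run_all` command. It assumes that it runs on a CVM or Prism Central VM where the `ncc` command is available on the path; the function name and the injectable `runner` parameter are illustrative, and the sketch makes no assumption about what the exit code or output means.

```python
import subprocess

# Command that runs the full set of NCC health checks.
NCC_RUN_ALL = ["ncc", "health_checks", "run_all"]


def run_ncc(runner=subprocess.run) -> tuple[int, str]:
    """Run all NCC health checks and return (exit code, captured output)."""
    result = runner(NCC_RUN_ALL, capture_output=True, text=True)
    return result.returncode, result.stdout


if __name__ == "__main__":
    code, output = run_ncc()
    print(f"ncc exited with code {code}")
    print(output)
```

Review the captured output against the Nutanix Cluster Check Guide rather than relying on the exit code alone.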
Network Time Protocol (NTP) is used for clock synchronization between computers. Hosts and
CVMs in a Nutanix cluster must be configured to synchronize their system clocks with a list of
stable NTP servers.
Accurate time synchronization between Nutanix clusters paired in Disaster Recovery (DR)
configurations also ensures that snapshots do not expire too quickly or too late.
Graphs in the Prism interface rely on CVM time, and incorrect time can skew these graphs,
especially in relation to other monitoring platforms such as vCenter, which rely on other clock
sources.
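To make clock synchronization more concrete, the sketch below computes the clock offset and round-trip delay from the four timestamps of a single NTP exchange, using the standard NTP arithmetic. It is a generic illustration of the protocol, not a description of how Nutanix hosts or CVMs implement it, and the function name is hypothetical.

```python
def ntp_offset_and_delay(t0: float, t1: float, t2: float, t3: float) -> tuple[float, float]:
    """Compute (offset, delay) in seconds for one NTP request/response pair.

    t0: client clock when the request is sent
    t1: server clock when the request is received
    t2: server clock when the response is sent
    t3: client clock when the response is received
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2   # how far the client clock is behind the server
    delay = (t3 - t0) - (t2 - t1)          # network round trip, excluding server processing
    return offset, delay


if __name__ == "__main__":
    # Client clock is 5 s behind the server; each network leg takes 0.1 s
    # and the server spends 0.05 s processing the request.
    offset, delay = ntp_offset_and_delay(100.0, 105.1, 105.15, 100.25)
    print(f"offset={offset:.3f}s delay={delay:.3f}s")
```

A positive offset means the local clock is behind the NTP server, which is the kind of skew that distorts graphs when compared with other monitoring platforms.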
Note: Nutanix CVMs and the Prism Central VM use the UTC time zone by default. And, in
all versions from AOS 5.18 and Prism Central pc.2020.8 onwards, timestamps in
Nutanix logs are expressed in UTC irrespective of the configured cluster timezone.
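Because log timestamps are expressed in UTC, you may need to convert them when correlating log entries with a system that displays local time. The sketch below assumes a timestamp format of `YYYY-MM-DD HH:MM:SS` and a fixed UTC offset. Both are illustrative assumptions, not the documented Nutanix log format, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def utc_log_time_to_local(stamp: str, utc_offset_hours: float) -> datetime:
    """Interpret `stamp` as UTC and convert it to a fixed-offset local time."""
    utc_time = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return utc_time.astimezone(local_zone)


if __name__ == "__main__":
    # A log entry written at 14:30 UTC, viewed from a site that is 8 hours behind UTC.
    print(utc_log_time_to_local("2022-11-21 14:30:00", -8).isoformat())
```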
For a list of guidelines and recommendations, see the Recommendations for Time
Synchronization section of the Prism Web Console Guide on the Nutanix Support Portal.
For step-by-step instructions, see Configuring NTP Servers in the Prism Central Guide and
Configuring NTP Servers in the Prism Web Console Guide.
Simple Mail Transfer Protocol (SMTP) is an Internet standard protocol for electronic mail
transmission across Internet Protocol (IP) networks. Nutanix systems (such as Prism Central
and Prism Element) use SMTP to send Alert emails, and to exchange emails with Nutanix
Support.
For step-by-step instructions on setting up an SMTP server, see Configuring an SMTP Server in
the Prism Central Guide or Configuring an SMTP Server in the Prism Web Console Guide.
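To show what an SMTP client does when it sends an alert, the following Python sketch builds a message and hands it to an SMTP server by using the standard library. The addresses, cluster name, subject format, and default port are placeholders chosen for illustration; they do not reflect the format of the alert emails that Prism actually sends.

```python
import smtplib
from email.message import EmailMessage


def build_alert_email(sender: str, recipient: str, cluster: str, alert: str) -> EmailMessage:
    """Create a plain-text alert message (illustrative format only)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[{cluster}] Alert: {alert}"
    msg.set_content(f"Cluster {cluster} raised the alert: {alert}")
    return msg


def send_email(msg: EmailMessage, host: str, port: int = 25) -> None:
    """Deliver the message through the SMTP server at host:port."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


if __name__ == "__main__":
    message = build_alert_email(
        "prism@example.com", "admin@example.com", "cluster01", "sample alert"
    )
    print(message["Subject"])
    # send_email(message, "smtp.example.com")  # requires a reachable SMTP server
```

Sending a test message like this from a workstation on the same network can help confirm that the SMTP server you plan to configure accepts mail before you enter it in Prism.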
• SAML authentication.
• Local user authentication.
• Active Directory authentication.
These features are discussed in more detail in the Securing a Nutanix Cluster module. However,
if you need to configure Active Directory as part of your initial cluster setup process, see the
Adding an Authentication Directory section of the Prism Central Guide on the Nutanix Support
Portal.
Changing UI Settings
Both Prism Central and Prism Element allow you to modify UI settings, including login screen
animations, and timeout settings for admin and non-admin users.
To modify these UI settings, see Modifying UI Settings in the Prism Central Guide and Modifying
UI Settings in the Prism Web Console Guide.
Prism Central allows you to configure a welcome banner, which is the first thing users will
see when they attempt to log in. The banner can include both a custom message and custom
graphics.
For step-by-step instructions, see the Configuring a Banner Page section of the Prism Central
Guide.
Module 1 Summary
In this module, you learned:
• The concept of hyperconverged infrastructure.
• How HCI works.
• What the Nutanix Cloud Platform is.
• What Nutanix cloud solution packages are.
• What Prism Element and Prism Central are.
• How to log into Prism Element.
• About the various components of Prism Element's Home dashboard.
• How to log into Prism Central.
• About the various components of Prism Central's Main dashboard.
• When to use Prism Central and when to use Prism Element.
Securing a Nutanix Cluster
Module
2
SECURING A NUTANIX CLUSTER
Module 2 Overview
This module will introduce you to various security features that Nutanix offers, including
authentication, user account creation, role-based access control, cluster lockdown, and data-at-
rest encryption.
By the end of this module, you will have learned:
• How security is built into the Nutanix platform from the ground up.
• How STIGs and SCMA are used and automated to self-heal a cluster using security baselines.
• How to configure authentication in Prism Central via a directory service or a SAML-based
identity provider.
• How to enable and configure client (that is, two-way) authentication.
• How to create and manage local user accounts in Prism Central.
• What role-based access control is, how it works, and how you can use both built-in and
custom roles to assign permissions to your users.
• What cluster lockdown is.
• What Flow Network Security is, how security policies work, and what types of security
policies are available.
• How Nutanix implements both hardware and software-based data-at-rest encryption.
The strong pervasive culture and processes built around security harden the Nutanix Enterprise
Cloud and eliminate zero-day vulnerabilities. Efficient one-click operations and self-healing
security models easily enable automation to maintain security in an always-on hyperconverged
solution.
Nutanix conforms to RHEL 7 Security Technical Implementation Guides (STIGs) that use
machine-readable code to automate compliance against rigorous common standards. With
Nutanix Security Configuration Management Automation (SCMA), you can quickly and
continually assess and remediate your platform to ensure that it meets or exceeds all regulatory
requirements.
Nutanix has standardized the security profile of the Controller VM to a security compliance
baseline that meets or exceeds the standard high-governance requirements. The most
commonly used references in the United States for security-related technical requirements are:
• The National Institute of Standards and Technology Special Publication Security and
Privacy Controls for Federal Information Systems and Organizations (NIST SP 800-53)
• The US Department of Defense's Defense Information Systems Agency (DISA) Security Technical
Implementation Guides (STIG)
The Nutanix Controller VM supports STIG compliance with the RHEL 7 STIG as published by
DISA. STIG rules are capable of securing the boot loader, packages, file system, booting and
service control, file ownership, authentication, kernel, and logging.
SaltStack and SCMA are used to self-heal any deviation from the security baseline configuration
of the operating system and hypervisor to remain in compliance. If any component is non-
compliant, then the component is set back to the supported security settings without the need
for manual, administrative intervention.
SCMA Implementation
The Nutanix platform and all products leverage the Security Configuration Management
Automation (SCMA) framework to ensure that services are constantly inspected for variance
to the security policy. SCMA checks multiple security entities for both Nutanix storage and
AHV, automatically reports and logs inconsistencies, and then reverts entities to the baseline as
needed.
With SCMA, you can schedule the STIG to run hourly, daily, weekly, or monthly. Running the
STIG does not affect system performance since it has the lowest system priority within the
virtual storage controller, allowing you to run checks as frequently as your company policies
require.
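The check, log, and revert cycle described above can be pictured as a comparison between a baseline and the current configuration. The Python sketch below is a toy model of that idea only. It is not how SCMA or SaltStack is implemented, and the setting names and values are invented for illustration.

```python
def find_drift(baseline: dict[str, str], current: dict[str, str]) -> dict[str, tuple]:
    """Return {setting: (found value, expected value)} for every deviation."""
    return {key: (current.get(key), expected)
            for key, expected in baseline.items()
            if current.get(key) != expected}


def self_heal(baseline: dict[str, str], current: dict[str, str],
              log: list[str]) -> dict[str, str]:
    """Log each deviation and return a configuration reverted to the baseline."""
    drift = find_drift(baseline, current)
    for key, (found, expected) in drift.items():
        log.append(f"{key}: found {found!r}, reverting to {expected!r}")
    healed = dict(current)                        # settings outside the baseline are kept
    healed.update({key: baseline[key] for key in drift})
    return healed


if __name__ == "__main__":
    baseline = {"sample_setting_a": "disabled", "sample_setting_b": "15"}
    current = {"sample_setting_a": "enabled", "sample_setting_b": "15"}
    log: list[str] = []
    print(self_heal(baseline, current, log))
    print(log)
```

Running a function like this on a schedule corresponds to the hourly, daily, weekly, or monthly intervals mentioned above.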
Security Updates
Nutanix provides continuous fixes and updates to address threats and vulnerabilities. Nutanix
Security Advisories provide detailed information on the available security fixes and updates,
including the vulnerability description and affected product/version.
To see the list of security advisories or search for a specific advisory, log on to the Support
Portal and select Documentation, and then select Security Advisories.
Note:
• For more information about security features that are offered out-of-the-box,
see the Nutanix Security Landscape section of the Security Guide on the Support
Portal.
• For non-US students, refer to this link.
• Active Directory or OpenLDAP. Users can authenticate using their Active Directory or
OpenLDAP credentials when support is enabled for Prism Central.
• SAML. Users can authenticate through a supported identity provider when SAML support is
enabled for Prism Central.
• Local user authentication. Users can authenticate if they have a local Prism Central account.
All authentication options - adding a directory service as well as adding local users - can be
accessed from the Settings menu in Prism Central, in the Users and Roles category.
To add a new directory service, in the Users and Roles category, click Authentication. When
the Authentication Configuration dialog box appears, in the Directories section, click + New
Directory.
First, you will be prompted to choose a directory type. The options in the remainder of the
dialog box will change based on your selection. For example, if you select Active Directory, you
will need to provide a name, domain, directory URL, search type, and service account, as shown
in the following figure.
On the other hand, if you select OpenLDAP, in addition to the information listed above, you will
also be prompted to provide the user object class, user search base, username attribute, group
object class, group search base, group member attribute, and group member attribute value.
Identity and Access Management (IAM) is an authentication and authorization feature that can
be enabled and managed in Prism Central. It uses attribute-based access control (ABAC) and is
disabled by default.
If IAM is not enabled, the only supported IDP for single sign-on is Active Directory Federation
Services (ADFS). If IAM is enabled, more options are available, including Azure ADFS, Okta,
PingOne, Shibboleth, and Keycloak.
• Highly Scalable Architecture. IAM is based on Kubernetes, and uses independent pods for
authentication, authorization, and IAM data storage and replication.
• Secure by Design. Mutual TLS authentication (mTLS) secures IAM component
communication. The Micro Services infrastructure on Prism Central provisions certificates for
mTLS.
• More SAML identity providers. As described earlier, enabling IAM allows you to use IDPs
beyond just ADFS.
Note: For more information about IAM, see the Security Management Using Identity
and Access Management (Prism Central) section of the Security Guide on the
Nutanix Support Portal.
Much like adding an authentication directory, a SAML-based identity provider is also added
from the Users and Roles category of the Settings menu in Prism Central. In the Authentication
Configuration dialog box, click the + New IDP button.
To configure an IDP, you need to provide a configuration name, an optional group attribute
name, an optional group attribute delimiter, and upload the metadata file that contains the
identity provider's information. This metadata file is typically an XML file that is obtained via the
identity provider's website, which can be downloaded and then uploaded to Prism Central.
Doing this completes the IDP configuration in Prism Central, but you must also configure the
callback URL for Prism Central on the identity provider. To do this, you need to download
a metadata file that describes Prism Central from the Authentication Configuration dialog
box, and then upload the XML file to the identity provider.
When you configure an IDP, this is what you will typically see:
When you need to download the metadata file with Prism Central information to
configure the callback URL, you can do so from the Authentication Configuration dialog box, as
highlighted in the following figure.
Client authentication enables secure access to Prism through exchange of a digital certificate,
which is then validated by Prism against your organization's trusted signed certificate.
In a typical, one-way authentication process, the server provides a certificate so that the user
can verify the identity of the server. Client authentication makes this a two-way system, in
which the server also verifies the authenticity of the user.
Client authentication ensures that a Nutanix cluster receives a valid certificate from a user. The
user provides this certificate when accessing Prism either by installing the certificate on their
local machine or by providing it through a smart card reader.
Prism also supports the use of a Common Access Card (CAC) for client authentication. A CAC is
a smart card, roughly the size of a credit card. After you insert a CAC into a reader and enter
your PIN, your personal certificate is extracted and forwarded to the server. The server then
performs the following steps:
• Validate that the certificate has been signed by your organization’s trusted signing
certificate.
• Extract the Electronic Data Interchange Personal Identifier (EDIPI) from the certificate and
use the EDIPI to check the validity of an account within Active Directory. The security
context from the EDIPI is used for your Prism session.
Prism Central supports both certificate authentication and basic authentication, so that users
can log in to Prism Central with a certificate while the REST API continues to use basic
authentication. The REST API cannot use CAC certificates. As a result, if a certificate is
present during Prism Central login, certificate authentication is used. However, if no
certificate is present, basic authentication is enforced and used.
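The selection rule described above can be summarized in a few lines. This is a minimal sketch of the decision only, assuming a hypothetical choose_auth_method function; it does not reflect how Prism Central is implemented internally.

```python
from typing import Optional


def choose_auth_method(client_certificate: Optional[bytes], is_rest_api_call: bool) -> str:
    """Return which authentication path applies to a request."""
    # The REST API cannot use CAC certificates, so it always uses basic auth.
    if is_rest_api_call:
        return "basic"
    # Interactive login: use the certificate when one is presented.
    if client_certificate:
        return "certificate"
    # No certificate presented: basic authentication is enforced.
    return "basic"
```

For example, a browser login that presents a certificate resolves to "certificate", while the same login without one falls back to "basic".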
To configure client authentication in Prism Central, you need to access the Settings menu
and scroll down to the Users and Roles category. Then, click Authentication and in the
Authentication Configuration dialog box, click the Client tab. This is what you will typically see
when configuring client authentication for the first time:
Any user accounts that you create can have their name, email, password, language, and roles
updated at any time. Custom user accounts can also be deleted after creation.
The default 'admin' user cannot be deleted and the roles associated with this account cannot be
changed. However, some information - first and last name, email, password, and language - can
still be updated or modified.
In this section, you will learn how to use Prism Central to create, update, and delete local user
accounts, and reset local user passwords.
To create a local user account in Prism Central, you need to access the Settings page, scroll
down to the Users and Roles category, and click Local User Management. When you click +
New User, you will be presented with the dialog box shown in the following figure.
Creating a new user requires that you provide a username, first name, last name, and password,
and select a language (English, Simplified Chinese, Japanese, or Korean).
Note: When creating a username, only some types of special characters are
allowed, such as dashes or underscores. Therefore, john-smith and john_smith are
valid usernames but john.smith is not. If you use an invalid special character in a
username, you will be prompted to change it.
Two roles are available for assignment. Assigning an account the User Admin role will allow
that user to view information in Prism Central, perform any administrative task, and create and
modify user accounts. Assigning an account the Prism Central Admin role will allow that user
to view information in Prism Central and perform any administrative task. However, a Prism
Central Admin cannot create and modify user accounts.
It is also possible to assign no role at all to a user account. In this case (that is, if both the User
Admin and Prism Central Admin options are left unchecked), the user will be able to log into
and view information in Prism Central but will not be able to perform administrative tasks or
manage user accounts.
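The three outcomes described above (User Admin, Prism Central Admin, or no role) can be written as a small truth table. The sketch below is illustrative only: the Capabilities type and the capabilities_for function are hypothetical names, not part of any Nutanix API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    can_view: bool          # log in and view information in Prism Central
    can_administer: bool    # perform administrative tasks
    can_manage_users: bool  # create and modify user accounts


def capabilities_for(user_admin: bool, prism_central_admin: bool) -> Capabilities:
    """Map the two role checkboxes to what the account is able to do."""
    return Capabilities(
        can_view=True,  # every local account can log in and view
        can_administer=user_admin or prism_central_admin,
        can_manage_users=user_admin,  # only User Admin manages accounts
    )
```

Leaving both checkboxes cleared yields a view-only account, which matches the last case in the text.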
Note: For step-by-step instructions on creating user accounts in Prism Central, see
Managing Local User Accounts in the Security Guide on the Nutanix Support Portal.
Local user accounts can also be created in Prism Element. The procedure is the same (Settings
> Users and Roles > Local User Management > + New User) and you will be prompted to
provide the same information that is required when creating a new local user in Prism Central.
However, the roles available in Prism Element are User Administrator, Cluster Administrator,
and Backup Admin. For step-by-step instructions on creating user accounts in Prism Element,
see Creating a User Account in the Security Guide on the Nutanix Support Portal.
After you create a local user account, you can disable, update, or delete it using the options
shown in the following figure.
Clicking the link in the Enabled column will allow you to either enable or disable a user account.
By default, as soon as you create a user account, it will be enabled and the column will display
Yes. Clicking Yes will disable the account and the column will display No. The default 'admin'
account cannot be disabled.
Clicking the Edit (pencil) icon will allow you to edit a user account. You can change most of the
information that you provided when creating the account - first name, last name, email, role,
and password. The username will be locked, however, and cannot be modified. The only way
to change a username is to delete the user account and create a new one. The default 'admin'
account can also have most of its information changed, except for the username and role.
Note: There are two ways to change a user's password. The first method, described
above, is by editing the user's account in Prism. The other method is by using the
command line interface (CLI). For step-by-step instructions, see the Resetting
Password (CLI) section of the Security Guide on the Nutanix Support Portal.
Clicking the X icon will allow you to delete a user account. You will be prompted for
confirmation first, and Prism will display a reminder that user accounts can be disabled if
desired.
Note: For step-by-step instructions, see the Updating a User Account section of
the Security Guide on the Nutanix Support Portal.
Here, you can view all available roles, the number of users that have these roles assigned,
and a brief description of each role. Clicking the name of a role will open a details page, as
shown in the following figure. From the role details page, you can view a summary of the role,
a description, actions that the role is allowed to perform, and assigned users and user groups.
You can also manage roles from the roles details page. Built-in roles can be assigned and
duplicated. Custom roles can be assigned, duplicated, updated, and deleted.
In this section, you will learn how to create and modify custom roles, and how to perform both
general and precise role mapping.
To create a custom role in Prism Central, on the Roles dashboard, click Create Role. The Roles
page will be displayed, as shown in the following figure.
The Roles page has two major sections. In the General Settings section, you need to provide a
name and a description for the custom role. A name is required, but a description is optional.
In the second half of the page, you can assign specific permissions to users across a variety of
categories. You can use the Filter Entities field to display specific permission categories. For
example, typing VM in the Filter Entities field will display two categories: VM and VM Recovery
Point.
You can use each of the categories on the page to assign permissions to your custom role.
Permissions can be assigned from one or more categories, based on the type of custom role
that you want to create.
Let's take the VM category as an example. If you want to create a custom role that only allows a
user to work with VMs, you need to click the VM category in the second half of the Roles page.
You will then see several options.
By default, like all of the other categories, VMs will be set to No Access. You can change this
to view, basic, edit, or full access. Each of the available options has a small tooltip next to the
name, represented by a question mark icon. Moving your mouse cursor over the question mark
icon will display the permissions associated with that type of access.
For example, the View Access option includes the Access Console VM and View VM
permissions. Basic Access includes the two permissions available to View Access, and
additionally includes the Update VM Power State permission.
The checkbox at the bottom of the VM category allows you to grant VM creation permission
to the role. However, in some cases, there may be dependencies with other categories. In this
example, to create VMs, the role may also need View Cluster and View Subnet permissions. You
can check if such dependencies exist, by moving your mouse cursor over the question mark
icon next to the Allow VM Creation option.
If you want more granular control over permissions assignments, you can select the Set
custom permissions option and click the Change link. This will display a list of all available VM
permissions that can be assigned to the role, as shown in the following figure.
After you make your selections and click Save in the Custom VM Permissions dialog box, you
can either save the role itself to complete the role creation process, or add more permissions
from a different category.
Note: For step-by-step instructions, see the Creating a Custom Role section of
the Security Guide on the Nutanix Support Portal.
Managing Roles
Whether you use built-in roles or create custom ones, several management actions are available
to you. You can assign roles to users or groups of users; duplicate roles; delete roles; or update
them. Depending on the type of role, some actions may not be available. For example, built-in
roles can only be assigned or duplicated, but not updated or deleted. All four actions can be
performed on custom roles, as shown in the following figure. You can easily distinguish built-
in roles from custom ones, because built-in roles display System beside the role name in the
Name column.
To access role management options, select the checkbox next to the role name and click the
Actions drop-down list at the top of the screen.
Clicking Update Role will display the Roles page once more. You can change the name, the
description, and add or remove permissions as needed.
Clicking Delete will display a confirmation prompt, because this action cannot be undone.
Clicking Duplicate will display the Roles page again. You will have access to the same options
as when creating or updating a role, and can make modifications as needed.
Clicking Manage Assignment will display the Role Assignment page. The two ways in which
you can assign both built-in and custom roles to users are described in the following section.
After you configure authentication, users and directories are not assigned permissions or
roles by default. Permissions must be explicitly assigned to users, authorized directories, or
organizational units.
There are two ways in which you can assign roles to users or groups of users: Role Mapping
and Role Assignment.
Role Mapping allows you to use pre-defined roles in Prism Central - User Admin, Cluster Admin,
and Viewer - and assign those roles to users, groups, and organizational units. This feature is
accessed via the Settings menu, in the Users and Roles category. Click Role Mapping to create
new mappings, or modify or delete existing ones.
Role Mapping is typically used to create basic role maps. If you are using custom roles in
Prism Central, then you can assign those roles to users or groups with the Role Assignment
feature. To do this, select a role from the Roles dashboard, click the Actions drop-down list, and
select Role Assignment.
On the Role Assignment page, you can select users, user groups, or organizational units to
which the selected role will be applied. In addition, you can also select entities. The list of
available entities will depend on the role that is currently being assigned. For example, in the
following figure, since a custom VM Admin role has been selected, the entities to which it can
be assigned are all related to VMs - AHV VMs, clusters, subnets, or all selectable entity types.
Note: For more information and step-by-step instructions, see the following
sections of the Security Guide on the Nutanix Support Portal:
• Assigning a role
Cluster RBAC allows you to give users or user groups access to one or more clusters that have
been registered with Prism Central. This feature will allow the user to take action on entities
such as VMs, containers, and hosts only on the clusters to which they have access. Up to 15
clusters can be assigned to any one user or user group.
Consider an example in which you have three clusters that are managed by Prism Central - one
in Bangalore, India; one in Raleigh, North Carolina; and one in San Jose, California. The admins
in Bangalore need access only to the Bangalore cluster. Similarly, the admins in Raleigh need
access only to the Raleigh cluster. However, the admins in San Jose need access to all three.
You can use cluster RBAC to provide each group of admins with access to clusters in their
region (or across regions, in the case of San Jose) as needed. After you configure cluster RBAC,
when a Bangalore or Raleigh admin logs into Prism Central, they will only be able to view and
manage their assigned clusters. On the other hand, a San Jose admin will have access to all
three clusters.
All dashboards in Prism Central will update to reflect the clusters that each group of admins
has access to, and the actions they will be able to perform will depend on which role they were
assigned.
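The three-site example above can be expressed as a simple assignment table. This is a sketch under stated assumptions: the group names, the assign_clusters and visible_clusters helpers, and the table itself are hypothetical, and only the 15-cluster limit comes from the text.

```python
MAX_CLUSTERS_PER_PRINCIPAL = 15  # limit per user or user group, as stated above

ALL_CLUSTERS = {"Bangalore", "Raleigh", "San Jose"}

# Hypothetical table: admin group -> clusters that group may view and manage.
assignments = {
    "bangalore-admins": {"Bangalore"},
    "raleigh-admins": {"Raleigh"},
    "sanjose-admins": set(ALL_CLUSTERS),
}


def assign_clusters(table, group, clusters):
    """Assign clusters to a group, enforcing the per-principal limit."""
    clusters = set(clusters)
    if len(clusters) > MAX_CLUSTERS_PER_PRINCIPAL:
        raise ValueError(
            f"{group}: {len(clusters)} clusters exceeds the limit of "
            f"{MAX_CLUSTERS_PER_PRINCIPAL}"
        )
    table[group] = clusters


def visible_clusters(table, group):
    """Return the clusters a group can see; unknown groups see none."""
    return sorted(table.get(group, set()))
```

With this table, visible_clusters(assignments, "raleigh-admins") returns only the Raleigh cluster, while the San Jose group sees all three.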
Note: Cluster RBAC is supported only on AHV and ESXi clusters. For more
information, see the Cluster Role Based Access Control (RBAC) section of the
Security Guide on the Nutanix Support Portal.
You can create one or more key pairs and add the public keys to enable key-based SSH access.
However, when site security requirements do not allow this, you can remove all public keys to
prevent SSH access.
Note: For step-by-step instructions, see the Controlling Cluster Access section of
the Security Guide on the Nutanix Support Portal.
Virtualized networks also regularly change both their network configurations and hosts as
VMs start, stop, and migrate frequently. Manually enforcing security policies using traditional
firewalls cannot keep up with these changes because such firewalls rely on network configuration
information to inspect traffic.
To address these points of vulnerability, Nutanix offers Flow Network Security. Flow uses a
policy-driven security framework that inspects traffic that originates and terminates within the
datacenter, eliminating the need for additional firewalls.
Flow also uses a workload-centric approach instead of a network-centric one. Security policies
are applied to categories. Through categories, policies inspect traffic to and from groups of
VMs without the policies having to be applied to the VMs themselves. This means that any
number of VMs can be assigned to a category and have the security policy inspect their traffic
without administrative intervention, at any scale.
This also allows Flow to examine network traffic to and from VMs irrespective of network
configuration and host changes.
As we learned a few moments earlier, entities within a Flow security policy are identified by the
categories to which they belong. After a VM is associated with a category, and that category
is associated with a policy, traffic continues to be monitored even if the VM migrates to a
different network or changes its IP address. Network attributes, such as IP address and VLAN,
play no role in traffic monitoring.
This is possible because Flow's security policies use an application-centric policy language. This
simply means that you first need to specify the VMs that belong to the application you want to
protect, and then identify the networks or entities with which you want to allow those VMs to
communicate (in both the inbound and outbound directions).
More granular policies can be defined by also specifying which ports and protocols can be used
for communication.
However, it is important to note that you cannot specify categories and subnets that you want
to block. This is because the number of these entities is typically much larger than the list of
allowed entities, and tends to grow at a much faster rate as well. Instead, once you specify what
categories and subnets are allowed to communicate, everything else is blocked. This results in
a smaller, tighter policy configuration that can be modified, monitored, and controlled more
easily.
Policies can also be enforced in two ways. A policy can be set to run in Monitor mode, which
will allow all traffic, including traffic that is not allowed by the policy. This allows you to get
a better understanding of the policy's impact and refine it further as needed. Once a policy
and its impact have been determined, then you can switch it to Apply mode, which effectively
enables the policy, blocking all traffic that is not allowed by its configuration.
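The allowlist behavior and the two enforcement modes can be sketched as follows. The Policy class, the category strings, and the idea of recording violations in a list are assumptions made for illustration; they are not Flow's data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    MONITOR = "monitor"  # allow everything, observe what the policy would block
    APPLY = "apply"      # block everything that is not explicitly allowed


@dataclass
class Policy:
    # Categories or subnets allowed to communicate with the protected application.
    allowed_sources: set
    mode: Mode = Mode.MONITOR
    violations: list = field(default_factory=list)

    def permits(self, source: str) -> bool:
        """Return True if traffic from the source is let through."""
        if source in self.allowed_sources:
            return True
        # Not on the allowlist: record it, then decide based on the mode.
        self.violations.append(source)
        return self.mode is Mode.MONITOR
```

There is no block list in this model. Anything absent from allowed_sources is denied once the policy is switched to Mode.APPLY, which mirrors the reasoning in the text.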
An Application Security Policy allows you to define a multitiered application and control traffic
to and from as well as within the application tiers.
An Isolation Environment Policy allows you to block all traffic between two groups of VMs. For
example, you may want all of your production VMs to be incapable of communication with
your Dev/Test or Sandbox VMs. In this scenario, if you have two categories (one for production
and one for sandbox), you can define a policy that will prevent VMs in those two groups from
communicating with each other. However, communication between VMs within a category will
still be allowed. So, all VMs within the production category will still be able to communicate with
each other. Similarly, all VMs within the Sandbox category will also be able to communicate with
each other.
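The production and sandbox example reduces to a single comparison. In this sketch, the isolation_allows function and the category names are hypothetical and only illustrate the rule described above.

```python
def isolation_allows(src_category: str, dst_category: str, isolated_pair) -> bool:
    """Return False only for traffic that crosses between the two isolated categories."""
    first, second = isolated_pair
    # Traffic is blocked when its endpoints are exactly the two isolated
    # categories. Traffic that stays inside one category is unaffected.
    return {src_category, dst_category} != {first, second}


PAIR = ("Production", "Sandbox")
```

Traffic from Production to Production is allowed because both endpoints fall in the same category, while Production to Sandbox (in either direction) is blocked.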
A Quarantine Policy, like the name suggests, is meant to be used when you want to isolate an
infected VM.
And finally, a VDI Policy can be used to secure a VDI environment and allows you to create
security policies based on users and groups in an Active Directory domain. When users log
into their virtual desktops, Flow's ID firewall will place VMs in certain predefined categories
automatically, ensuring that security policies are automatically enforced.
Nutanix provides two options that you can use to secure your data:
• DARE using Self-Encrypting Drives (SEDs). This involves using a combination of SEDs and an
external Key Management Server (KMS) to secure data while it is at rest.
• DARE using Software-only Encryption. When software-only encryption is enabled, AOS
uses the AES-256 encryption standard to encrypt your data. You can use either the Nutanix
Native Key Manager (both local and remote) or an external KMS to secure your keys.
- In this scenario, data is encrypted as a low priority background task that is designed not
to interfere with other workloads that are running on the cluster.
- Once the task to encrypt a cluster begins, the operation cannot be cancelled. Even if you
stop and restart the cluster, the system will resume the operation.
• You can change the encryption from SED-based DARE to software-only encryption.
Note: For a detailed list of considerations, see the Data-at-Rest Encryption section
of the Security Guide on the Nutanix Support Portal.
SED-based DARE
To accomplish this, Nutanix uses a data security configuration that uses SEDs with keys that are
maintained through a separate key management device. The encryption workflow is as follows:
1. SEDs are installed for all data drives in a cluster. These drives are FIPS 140-2 validated and
use FIPS 140-2 validated cryptographic modules. This is easily done for a brand new cluster.
An existing cluster can be converted by replacing existing drives with SEDs.
2. Data on the drives is encrypted but read and write access is open. By default, the built-in
manufacturer key is used to protect access to data. However, when data protection for the
cluster is enabled, the CVM must provide a proper key.
3. A symmetric data encryption key (DEK), such as AES 256, is applied to all data being
written or read. The key is known only to the drive controller and never leaves the physical
subsystem, so there is no way to access the data directly from the drive itself.
4. Another key, known as a key encryption key (KEK), is used to encrypt/decrypt the DEK
and authenticate to the drive. You may have heard this referred to as an authentication key
or PIN by other vendors. Each drive has a separate KEK, generated by the FIPS-compliant
random number generator in the drive controller. Although the KEK is generated on the
node, it is not stored locally; all KEKs are sent to the KMS for secure storage and retrieval.
6. Keys are stored in a KMS that is separate from the cluster. Each node maintains a set of
certificates and keys to establish a secure connection with the KMS. The CVM communicates
with the key management server using the Key Management Interoperability Protocol (KMIP)
to upload and retrieve drive keys.
7. While only one KMS is required, multiple devices are recommended so that the KMS does
not become a single point of failure. KMSes should be configured in clustered mode, so
that they can be added to the Nutanix cluster as a single, resilient entity that is capable of
withstanding a single failure.
8. When a node is fully powered off and then on again, the CVM retrieves the drive keys from
the KMS and uses them to unlock the drives.
When SED-based DARE is set up and configured correctly, the CVM cannot access data on
the drives if it cannot get the correct keys from the KMS. If a drive is stolen, data is inaccessible
because the KEK cannot be obtained from the drive itself. If a node is stolen, the KMS can
revoke the node's certificates so that they cannot be used to access data on the drives.
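The relationship between the DEK, the KEK, and the KMS can be modeled with a toy example. This is a conceptual sketch only: ToyDrive and ToyKMS are hypothetical classes, the XOR "wrapping" is a stand-in and not real cryptography, and no KMIP traffic is involved.

```python
import hashlib
import hmac
import secrets


def _xor(a: bytes, b: bytes) -> bytes:
    """Toy key wrap. Real drives use proper encryption, not XOR."""
    return bytes(x ^ y for x, y in zip(a, b))


class ToyDrive:
    """Stand-in for an SED: keeps only the wrapped DEK and a digest of the KEK."""

    def provision(self) -> bytes:
        dek = secrets.token_bytes(32)  # data encryption key, never leaves the drive
        kek = secrets.token_bytes(32)  # key encryption key, handed to the KMS
        self._wrapped_dek = _xor(dek, kek)
        self._kek_digest = hashlib.sha256(kek).digest()
        return kek  # the caller stores this in the KMS, not locally

    def unlock(self, kek: bytes) -> bool:
        """Authenticate the KEK and, if valid, recover the DEK inside the drive."""
        if not hmac.compare_digest(hashlib.sha256(kek).digest(), self._kek_digest):
            return False
        self._dek = _xor(self._wrapped_dek, kek)
        return True


class ToyKMS:
    """Stand-in for an external key management server."""

    def __init__(self):
        self._keys = {}
        self._revoked = set()

    def store(self, drive_id: str, kek: bytes) -> None:
        self._keys[drive_id] = kek

    def revoke(self, node_cert: str) -> None:
        self._revoked.add(node_cert)

    def fetch(self, node_cert: str, drive_id: str) -> bytes:
        if node_cert in self._revoked:
            raise PermissionError("node certificate has been revoked")
        return self._keys[drive_id]


kms = ToyKMS()
drive = ToyDrive()
kms.store("drive-0", drive.provision())
unlocked = drive.unlock(kms.fetch("node-a-cert", "drive-0"))
```

A stolen drive holds only the wrapped DEK, so it cannot be unlocked without the KEK from the KMS. A stolen node can be cut off by calling kms.revoke on its certificate, after which fetch raises an error.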
Note: For more information and step-by-step instructions, see the Data-at-Rest
Encryption (SEDs) section of the Security Guide on the Nutanix Support Portal.
Software-based DARE
To accomplish this, Nutanix uses a data security configuration that uses open standards (KMIP
protocols), AOS functionality, and a cluster's native KMS or an external KMS.
For software-based encryption, data protection must be enabled for the cluster before data can
be encrypted. A symmetric data encryption key (DEK) such as AES 256 is applied to all data
being written or read. The key is known only to AOS, so data cannot be accessed directly from
the drive.
If an external KMS is used, each node maintains a set of certificates and keys to establish a
secure connection with the KMS. And while only one KMS is required, multiple devices are
recommended so that the KMS does not become a single point of failure. KMSes should be
configured in clustered mode, so that they can be added to the Nutanix cluster as a single,
resilient entity that is capable of withstanding a single failure.
Note: For more information and step-by-step instructions, see the Data-at-Rest
Encryption (Software Only) section of the Security Guide on the Nutanix Support
Portal.
Module 2 Summary
In this module, you learned:
• How security is built into the Nutanix platform from the ground up.
• How STIGs and SCMA are used and automated to self-heal a cluster using security baselines.
• How to configure authentication in Prism Central via a directory service or a SAML-based
identity provider.
• How to enable and configure client (that is, two-way) authentication.
• How to create and manage local user accounts in Prism Central.
• What role-based access control is, how it works, and how you can use both built-in and
custom roles to assign permissions to your users.
• What cluster lockdown is.
• What Flow Network Security is, how security policies work, and what types of security
policies are available.
• How Nutanix implements both hardware and software-based data-at-rest encryption.
Configuring Cluster Networking
Module
3
CONFIGURING CLUSTER NETWORKING
Module Overview
This module will help you understand how to configure and manage virtual networks in an AHV
cluster.
By the end of this module, you will have learned:
A VS defines a collection of AHV nodes and the uplink ports on each node. It’s an aggregation
of the same OVS bridge on all the compute nodes in a cluster. For example, vs0 is the default
virtual switch and is an aggregation of the br0 bridges and br0-up uplinks from all the
nodes. Nutanix designed the VS configuration to provide flexibility in configuring virtual bridge
connections.
Bridges
A bridge acts as a virtual switch to manage traffic between physical and virtual network
interfaces. The default AHV configuration includes an OVS bridge called br0 and a native Linux
bridge called virbr0. The virbr0 Linux bridge carries management and storage communication
between the CVM and AHV host. All other storage, host, and VM network traffic flows through
the br0 OVS bridge. The AHV host, VMs, and physical interfaces use ports for connectivity to
the bridge.
Ports
Ports are logical constructs created in a bridge that represent connectivity to the virtual switch.
Nutanix uses different port types, such as:
• An internal port: Acts as the AHV host management interface. It usually has the same
name as the default bridge (br0).
• Tap ports: Connect VM virtual NICs (vNICs) to the bridge.
• VXLAN ports: Used only for the IP address management (IPAM) functionality provided
by AHV.
• Bonded ports: Provide NIC teaming for the physical interfaces of the AHV host.
Bonds
Bonded ports aggregate the physical interfaces on the AHV host for fault tolerance and load
balancing. By default, the system creates a bond named br0-up in bridge br0 containing all
physical interfaces. OVS bonds allow for several load-balancing modes to distribute traffic,
including active-backup, balance-slb, and balance-tcp. Administrators can also activate
LACP for a bond to negotiate link aggregation with a physical switch. During installation, the
bond_mode defaults to active-backup, which is the configuration Nutanix recommends for ease
of use.
• Active-Backup: The recommended and default bond mode is active-backup, where one
interface in the bond is randomly selected at boot to carry traffic and other interfaces
in the bond are used only when the active link fails. Active-backup is the simplest bond
mode, easily allowing connections to multiple upstream switches without additional switch
configuration.
• Active-Active with MAC pinning/Balance-SLB: The balance-slb bond mode in OVS takes
advantage of all links in a bond and uses measured traffic load to rebalance VM traffic from
highly used to less-used interfaces. When the configurable bond-rebalance interval expires,
OVS uses the measured load for each interface and the load for each source MAC hash to
spread traffic evenly among links in the bond.
• LACP with Balance-TCP: Taking full advantage of bandwidth provided by multiple links to
upstream switches, from a single VM, requires dynamically negotiated link aggregation and
load balancing using balance-tcp. Nutanix recommends dynamic link aggregation with LACP
instead of static link aggregation due to improved failure detection and recovery.
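To make the difference between the three bond modes concrete, the following Python sketch models how each mode might pick an uplink. It is a simplified illustration and not how OVS is implemented: the function names, the use of CRC32 as the hash, and the interface names eth0 and eth1 are assumptions made for this example, and the load-based rebalancing that balance-slb performs is left out.

```python
import zlib

def active_backup(active, uplinks, link_up):
    """Keep using the active uplink; fail over to another healthy one only if it is down."""
    if link_up.get(active, False):
        return active
    for nic in uplinks:
        if link_up.get(nic, False):
            return nic
    return None  # no healthy uplink is left

def balance_slb(uplinks, src_mac):
    """Pin each source MAC address to one uplink by hashing the MAC."""
    return uplinks[zlib.crc32(src_mac.encode()) % len(uplinks)]

def balance_tcp(uplinks, src_ip, dst_ip, src_port, dst_port):
    """Spread individual flows across uplinks by hashing IP addresses and ports."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return uplinks[zlib.crc32(key) % len(uplinks)]

uplinks = ["eth0", "eth1"]  # hypothetical physical interfaces in the bond
print(active_backup("eth0", uplinks, {"eth0": False, "eth1": True}))
print(balance_slb(uplinks, "50:6b:8d:00:00:01"))
print(balance_tcp(uplinks, "10.0.0.5", "10.0.0.9", 40000, 443))
```

In this model, a single VM (one source MAC) always lands on one uplink with balance_slb, whereas balance_tcp can place different flows from the same VM on different uplinks, which is why that mode needs link aggregation negotiated with the upstream switch.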
Bridge Chaining
In newer AOS versions, all AHV hosts use a bridge chain (multiple OVS bridges connected in a
line) as the backend for features like microsegmentation. Each bridge in the chain performs a
specific set of functions. Physical interfaces connect to bridge brX, and VMs connect to bridge
brX.local.
AHV supports the use of VLANs for the CVM, AHV host, and user VMs. You can easily create
and manage virtual networks for user VM vNICs using the Prism GUI, the Acropolis CLI (aCLI),
or REST without any additional AHV host configuration. Each virtual network in AHV maps to
a single VLAN and bridge. Each VLAN and virtual network that you create in AHV must also
exist on the physical top-of-rack switches, although integration between AHV and the physical
switch can automate this provisioning.
IPAM enables AHV to assign IP addresses automatically to VMs using the Dynamic
Host Configuration Protocol (DHCP). You can configure each virtual network and
associated VLAN with a specific IP subnet, associated domain settings, and group
of IP address pools available for assignment. AOS uses VXLAN and OpenFlow rules in OVS to
intercept DHCP requests from user VMs so that the configured IP address pools and settings
are used. A managed network refers to an AHV network in which IPAM has been enabled,
whereas an unmanaged network refers to an AHV network in which IPAM has not been
enabled.
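As an illustration of what a managed network does with its address pool, here is a minimal Python sketch of pool-based assignment. It is a toy model under stated assumptions, not the AOS implementation: the class name ManagedNetwork, the single-pool design, and the use of the vNIC MAC address as the lease key are all hypothetical choices for this example.

```python
import ipaddress

class ManagedNetwork:
    """Toy model of a managed (IPAM-enabled) network with one address pool."""

    def __init__(self, subnet, pool_start, pool_end):
        self.subnet = ipaddress.ip_network(subnet)
        start = ipaddress.ip_address(pool_start)
        end = ipaddress.ip_address(pool_end)
        if start not in self.subnet or end not in self.subnet or start > end:
            raise ValueError("pool must be an ordered range inside the subnet")
        # Free addresses, lowest first.
        self._free = [ipaddress.ip_address(i) for i in range(int(start), int(end) + 1)]
        self.leases = {}  # MAC address -> assigned IP address

    def assign(self, mac):
        """Return the existing lease for a MAC, or hand out the next free address."""
        if mac in self.leases:
            return self.leases[mac]
        if not self._free:
            raise RuntimeError("address pool exhausted")
        self.leases[mac] = self._free.pop(0)
        return self.leases[mac]

net = ManagedNetwork("10.10.20.0/24", "10.10.20.100", "10.10.20.101")
print(net.assign("50:6b:8d:00:00:01"))
print(net.assign("50:6b:8d:00:00:02"))
```

An unmanaged network has no equivalent of this pool: addresses for its VMs come from an external DHCP server or static configuration.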
Availability Zones
Availability Zones are distinct locations within a region that are engineered to be
isolated from failures in other Availability Zones and to provide inexpensive, low-latency
network connectivity. Each availability zone contains one or more data centers. A region holds one
or more availability zones.
Availability zones are implemented such that normal failures (such as a power plant failure)
in one zone will not affect another. Natural and manmade disasters such as catastrophic
earthquakes and nuclear strikes may disable more than one availability zone in a region.
This table lists key AHV networking terms that are important to remember, as well as the
equivalent term in ESXi or Hyper-V. If you have prior experience with either VMware or
Hyper-V, some of these terms may be familiar to you.
Bridges act as virtual switches to manage network traffic between physical and virtual network
interfaces. The default AHV configuration includes an OVS bridge called br0 and a native Linux
bridge called virbr0. The virbr0 Linux bridge carries management and storage communication
between the CVM and AHV host. All other storage, host, and VM network traffic flows through
the br0 OVS bridge. The AHV host, VMs, and physical interfaces use ports for connectivity to
the bridge.
By default, a bond named br0-up is created in bridge br0. After the node imaging process,
all interfaces are placed in a single bond. OVS bonds allow for several load-balancing modes,
including active-backup, balance-slb, and balance-tcp. LACP can also be activated for a bond.
In recent AOS versions, all AHV hosts use a bridge chain (multiple OVS bridges connected in
a line) as the backend for features like microsegmentation. Each bridge in the chain performs a
specific set of functions.
Traffic from VMs enters the bridge chain and flows through the chain. The bridge either
forwards the traffic to the physical network or sends it back through the chain to reach another
VM. All VM traffic must flow through the bridge chain, which applies microsegmentation
and network functions. The management of the bridge chain is automated, and no user
configuration of the chain is required or supported.
Several of these capabilities are advanced networking concepts, and are discussed in other
Nutanix training programs. Access to these features also depends on your Nutanix Cloud
Infrastructure license tier, and whether or not you have Flow Network Security configured on
your clusters.
However, irrespective of your license level, certain fundamental aspects of network monitoring
will always be accessible to you. As a result, in this module, you will learn about the Subnets
dashboard, and how to use it to view, monitor, and configure networks, LAN interfaces, and
virtual switches.
1. From the Entities menu, expand the Network & Security category.
2. Click the Subnets option. Note that you can click the star icon to the right of the option
name to bookmark the dashboard for easy access.
By default, the List view will be displayed, as shown in the following figure.
The Subnets dashboard displays all virtual networks that have been configured on the clusters
that are managed by Prism Central. The table displays each network's name, external connectivity
status, VLAN ID, associated VPCs, associated virtual switches, IP prefix, the cluster on which
the network resides, and the hypervisor of that cluster. You can also use the Create Subnet
button at the top of the page to create new virtual networks.
Clicking the name of a network in the table will display a details page, as shown in the following
figure. In addition to network summary information, if the network has IP address management
configured, you will be able to view the IP address pool here. If the network does not have IPAM
configured (that is, if it is an unmanaged network), no IP address pool information will be shown
here.
Finally, from the Subnets dashboard, you can view additional network details by clicking the
Network Configuration button at the top of the page. The Network Configuration dialog box
has three tabs: Subnets, Internal Interfaces, and Virtual Switch, as shown in the following figure.
The Subnets tab displays the names of all configured subnets, associated virtual switches,
VLAN IDs, and the number of free and used IP addresses, and allows you to either edit or delete
a subnet. The Internal Interfaces tab displays the name of the interface, the subnet that
the interface belongs to, available features, and the interface designation (such as eth0 or
eth1). The Virtual Switch tab lists all virtual switches that are available, associated
bridges, the MTU in bytes, and the bond type associated with the virtual switch.
Creating a Subnet
To create a subnet in Prism Central, click the Entities menu, expand the Network
& Security category, and select Subnets. Then, on the Subnets dashboard, click the Create
Subnet button to view the Create Subnet dialog box.
Note: You can also create a subnet by clicking the Network Config button and then
selecting a cluster for the network. In the Network Configuration dialog box, on the
Subnets tab, click the + Create Subnet button.
In the Create Subnet dialog box, provide the subnet name and VLAN ID, select the
cluster and virtual switch, and optionally select the IP Address Management check box to
enable IPAM. Selecting IPAM creates a managed network; leaving it unchecked creates an
unmanaged network.
If you need to define a domain, select the DHCP Settings checkbox and provide DNS servers
and domain information. Optionally, you can also create an IP address pool to define a range of
addresses for automatic assignment to virtual NICs.
Then, configure a DHCP server by entering an IP address in the DHCP Server IP Address field
and click Save.
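The fields collected by the Create Subnet dialog box can be pictured as a small record with a few sanity checks. The Python sketch below is only an illustration: SubnetSpec and its field names are hypothetical and do not correspond to any Nutanix API, and the VLAN ID check uses the full 12-bit IEEE 802.1Q range, which may be wider than what Prism accepts.

```python
from dataclasses import dataclass

@dataclass
class SubnetSpec:
    """Hypothetical container for the Create Subnet fields described above."""
    name: str
    vlan_id: int
    cluster: str
    virtual_switch: str = "vs0"
    ipam_enabled: bool = False

    def validate(self):
        """Raise ValueError if a required field is missing or out of range."""
        if not self.name:
            raise ValueError("subnet name is required")
        # VLAN IDs are 12-bit values, so 0-4095 is the widest possible range.
        if not 0 <= self.vlan_id <= 4095:
            raise ValueError("VLAN ID must be between 0 and 4095")

    @property
    def kind(self):
        """Managed networks have IPAM enabled; unmanaged networks do not."""
        return "managed" if self.ipam_enabled else "unmanaged"

spec = SubnetSpec(name="prod-vlan100", vlan_id=100, cluster="cluster-a", ipam_enabled=True)
spec.validate()
print(spec.kind)
```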
In Prism Central, click the Entities menu, expand the Network & Security category, and select
Subnets. On the Subnets dashboard, click the Network Config button to view the Network
Configuration dialog box.
The default view in the Network Configuration dialog box is the Subnets tab. It displays a list of
all the available networks. Two options are available under the Actions section: Edit and
Delete. Use Edit to make changes to a network and Delete to remove a
network.
As mentioned earlier, the edit option allows you to make changes to the network configuration,
but what actually happens when you click the Edit button? It opens the Update Subnet dialog
box, which includes the same fields as the Create Subnet dialog box. You can update most of
the fields that are available when you create a subnet. This includes changing the subnet name,
selecting a different virtual switch, updating the VLAN ID, updating DHCP settings, and creating
or editing the IP address pool. If IPAM was previously selected, you cannot disable it when editing
a network. However, when updating an unmanaged network, you can enable IPAM to convert it
to a managed network. Once you update the required fields, click Save to complete the update.
Note: You can also update the configuration directly on the Subnets dashboard.
Select the required network from the dashboard and select Update from the
Actions menu. This opens the Update Subnet window, where you can make
changes to the configuration. Most of the fields in the window are editable; however,
you cannot change the VLAN ID or the IPAM setting.
Deleting a Network
If a network needs to be removed, select that network in the Subnets tab and click Delete. In
the confirmation pop-up, click OK to confirm the deletion.
Note: You can also delete a network directly from the Subnets dashboard. Select
the required network from the dashboard and select Delete from the Actions menu.
AHV uses a virtual switch to connect the Controller VM, the hypervisor, and the guest VMs
to each other and to the physical network. A virtual switch is configured by default on each AHV
node, and virtual switch services start automatically when you start a node. Virtual switch
configuration is designed to provide flexibility in configuring virtual bridge connections.
A virtual switch defines a collection of AHV compute nodes and the uplink ports on each node.
It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example,
vs0 is the default virtual switch and is an aggregation of the br0 bridge of all the nodes.
The default virtual switch, vs0, is created by the system. It connects the default bridge, br0, on
all the hosts in the cluster during installation of or when upgrading to the compatible versions
of AOS and AHV. The default virtual switch, vs0, has the following characteristics:
• It cannot be deleted.
• The default bridges br0 on all the nodes in the cluster map to vs0. Thus, vs0 is not empty. It
has at least one uplink configured.
• The default management connectivity to a node is mapped to default bridge br0 that is
mapped to vs0.
• The default parameter values of vs0 (Name, Description, MTU, and Bond Type) can be
modified, subject to the preceding characteristics.
• The default virtual switch is configured with the Active-Backup uplink bond type.
The virtual switch aggregates the same bridges on all nodes in the cluster. The bridge (br1)
connects to the physical port such as eth3 (Ethernet port) through the corresponding
uplink (br1-up). The uplink ports of the bridges are connected to the same physical network.
For example, the following illustration shows that vs0 is mapped to the br0 bridge, in turn
connected via uplink br0-up to various (physical) Ethernet ports on different nodes.
To create a virtual switch in Prism Central, you need to click the Entities menu, expand the
Network & Security category, and select Subnets. Then, on the Subnets dashboard, click the
Network Config button, select a cluster and navigate to the Virtual Switch tab. On the Virtual
Switch tab, click Create VS (to create a new virtual switch) or click the pencil icon (to update
an existing virtual switch) to open the Create Virtual Switch page.
On the Create Virtual Switch page, you will notice two tabs: the General tab and Uplink
Configuration tab.
On the General page, provide a name for the switch, optionally provide a
description, set the MTU value, and select the configuration method. The configuration
method can be either Standard, which ensures minimal disruption to the workloads and is the
recommended method, or Quick, which can briefly interrupt the workloads on the cluster.
Clicking Next opens the Uplink Configuration page.
On the Uplink Configuration page, you need to provide information such as the bond
type, hosts, uplink ports, port speeds, and the host port (this is based on your selection in the
Select Uplink Ports section). Verify the information and then click Create.
The corresponding SNMP settings on the first-hop network switch need to be configured
before you can use the Prism web console to configure network switch information.
To configure one or more network switches for statistics collection, navigate to the Settings
menu and select Network Switch. On the Network Switch Configuration page, you will notice
two tabs: Switch Configuration and SNMP Profile.
Click the + Add Switch Configuration button to open the switch configuration dialog box.
Update the fields, such as the Switch Management IP Address, Host IP Addresses or Host
Name, SNMP Profile and version, SNMP Community Name, and the SNMP Username. If you
select SNMP v3, instead of the community name field you need to update the SNMP Security
Level field.
Click the SNMP Profile tab and then click the + Add SNMP profile button to open the profile
configuration dialog box. Update the fields in the dialog box, such as the profile name, SNMP
Version, SNMP Community Name and the SNMP Username. Then click Save to save the
information.
Modifying Switch Information
The process to update or change either the switch configuration information or the SNMP
profile is quite straightforward. On the Network Switch Configuration page, based on the
information you want to edit, select either the Switch Configuration tab or the SNMP Profile
tab. Since the process is similar, we'll only cover how to modify the switch configuration
information. On the Switch Configuration tab, select the configuration to edit, and click
the pencil icon. This opens the edit switch configuration dialog box. This is similar to the
create switch configuration dialog box. All the fields are the same in both the create and edit
dialog boxes. You can update all fields except for the Switch Management IP Address field. In
the figure listed above, notice that the Switch Management IP Address field is greyed out, and
the SNMP Profile field has been updated. Once all the changes are complete, click Save.
Note: When updating the SNMP Profile, all fields are editable except for the Profile
Name field.
If you use the Active-Active NIC-teaming policy, you must enable LAG and LACP on the
corresponding ToR switch for each node in the cluster one after the other. You enable LAG
and LACP in an Active-Active configuration, to avoid the network traffic being dropped
by the switch. This might occur because in an Active-Active NIC-teaming configuration,
network traffic is balanced among the members of the team based on source and destination
IP addresses and TCP and UDP ports. With link aggregation negotiated by LACP, the uplinks
might appear as a single layer-2 link so a VM MAC address might appear on multiple links and
use the bandwidth of all uplinks.
To enable LAG and LACP on a ToR switch after enabling the Active-Active policy, put the
node in maintenance mode. This is in addition to the previous maintenance mode that enabled
Active-Active on the node. Then enable LAG and LACP on the ToR switch connected to that
node. After LAG and LACP are successfully enabled, exit maintenance mode. Repeat this
process for all nodes in the cluster.
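The one-node-at-a-time procedure above can be sketched as a simple loop. This Python example is a hedged outline of the ordering only: the three callables are hypothetical placeholders for whatever tooling you use to enter maintenance mode, change the switch, and exit maintenance mode, and none of them is a Nutanix or switch-vendor API.

```python
def enable_lacp_rolling(nodes, enter_maintenance, enable_on_tor, exit_maintenance):
    """Apply the per-node steps one node at a time and return the nodes completed.

    If enabling LAG and LACP fails for a node, the exception propagates and that
    node is left in maintenance mode so an operator can investigate.
    """
    completed = []
    for node in nodes:
        enter_maintenance(node)   # step 1: put the node in maintenance mode
        enable_on_tor(node)       # step 2: enable LAG and LACP on its ToR switch ports
        exit_maintenance(node)    # step 3: exit maintenance mode only after success
        completed.append(node)
    return completed

# Example run with stand-in callables that only record what would happen.
log = []
done = enable_lacp_rolling(
    ["node-1", "node-2"],
    enter_maintenance=lambda n: log.append(("enter", n)),
    enable_on_tor=lambda n: log.append(("lacp", n)),
    exit_maintenance=lambda n: log.append(("exit", n)),
)
print(done)
print(log)
```

Processing nodes strictly in sequence means that at most one node is in maintenance mode at any time.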
The network visualizer is a consolidated graphical representation of the network formed by the
VMs and hosts in a Nutanix cluster and first-hop switches.
The network visualizer can be used to monitor the network and obtain information that helps
with identifying network issues. You can use the visualizer to view the physical and logical
network topology, a summary of the number and types of devices in the network, the network
configuration of the devices in the topology and of components such as physical and virtual
NICs, as well as real-time usage graphs of physical and virtual interfaces.
Prerequisites
- This is if you want the network visualizer to display the switch and switch interface
statistics.
The network visualizer not only displays interactive visual elements for networked devices
and network components such as physical and logical interfaces, but also provides filter and
grouping capabilities, allowing you to customize the display and view only a specific set of
networked devices and connections. For example, the image above displays the view when the
grouping is "by Host".
The network visualizer can be divided into two panes: the virtual networks pane and the
topology view pane.
• Virtual Networks Pane: The pane on the left lists the virtual switches (VSs) configured on
the Nutanix cluster. Selecting a VS includes the VMs on that VS in the topology view. If you
deselect a check box, the associated VMs are removed from the topology view. You can
show up to five VSs at a time. The check box titled Other corresponds to VMs that are not
connected to any VS. At a minimum, this option is typically associated with the Controller
VMs in the cluster.
• The Topology View: The topology view displays:
- Virtual Switch: VSs are configured on the cluster. The visualizer displays a different color
for each VS. It shows the VSs to which a VM or the VMs in a VM group belong. It also
shows which VSs are configured on a first-hop switch.
- VMs: VMs associated with the VSs that have been selected in the virtual networks pane.
You can customize the topology view by using the filter and group-by options. VMs can
be filtered or grouped by power state, host, and VM type.
- Hosts: Hosts (that is, nodes) in the Nutanix cluster. The filter above the hosts allows you
to specify which hosts you want to display in the topology view. You can choose to
display information for some, all, or none.
- Switches: First-hop switches and the VSs configured on each of them. The filter above the
switches allows you to specify which switches you want to display in the topology view.
• By Virtual Switches (VS). You can select the virtual switches and the networks in each virtual
switch that you want displayed in the topology view and deselect those that you do not
want displayed.
- For VMs that are not on any VLANs, you can either select or clear Other if you want to
include or exclude the VMs.
• By VMs. You can select the VMs to be displayed using:
- Search option: Enter a string in the search filter field to filter VMs that match the search
string.
- Group-by option: Display VMs according to their grouping
o Power State. Groups VMs by states such as On and Off. By default, the VMs are
grouped by power state.
o Host. Groups VMs by the host on which they are running.
o VM Type. Groups VMs into guest VMs and Controller VMs.
Note: If the group-by and filter operations result in a VM group, you can click
the VM group to view the VMs within the group.
• Hosts: You can specify which Nutanix hosts you want displayed in the topology view. You
select the hosts that you want to show and clear those that you want to exclude.
• Switches: You can specify which switches you want displayed in the topology view. You
select the switches that you want to show and clear those that you want to exclude.
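The search and group-by options for VMs behave like a filter followed by a grouping step. The following Python sketch shows that idea on sample data. It is an illustration only: the function group_vms, the dictionary keys, and the VM names are hypothetical and are not taken from the visualizer itself.

```python
from collections import defaultdict

def group_vms(vms, group_by="power_state", search=""):
    """Filter VMs by a case-insensitive name match, then group them by one attribute."""
    groups = defaultdict(list)
    for vm in vms:
        if search.lower() in vm["name"].lower():
            groups[vm[group_by]].append(vm["name"])
    return dict(groups)

# Hypothetical inventory used only for this example.
vms = [
    {"name": "web-01", "power_state": "On", "host": "host-a", "vm_type": "guest"},
    {"name": "web-02", "power_state": "Off", "host": "host-b", "vm_type": "guest"},
    {"name": "cvm-a", "power_state": "On", "host": "host-a", "vm_type": "controller"},
]

print(group_vms(vms))                                  # default grouping by power state
print(group_vms(vms, group_by="host", search="web"))   # only VMs whose name contains "web"
```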
In the following sections we will discuss what information can be viewed for each entity.
VM Information
You can use the visualizer to view the settings and real-time statistics of a virtual NIC.
You can use the group-by and filtering capabilities of the visualizer to display VMs. Clicking
the name of a VM brings up the network information window for that VM. In the figure above,
the network information window displays VM NIC 1 information, such as its MAC address, Port
Name, MTU, and graphs showing real-time statistics. If you need additional details, click the Go
to VM Detail button to open the VM dashboard. To view information about a different NIC,
click the NIC drop-down menu and select a different NIC.
Clicking a network component, such as a bridge or bond, allows you to view the settings and
statistics of that component. For example, in the image below, selecting the bridge displays the
bridge details on the upper right of the pane.
Switch Information
If you click the name of a switch, the switch information window opens. This window
includes switch-level information, such as the switch name, vendor, and management IP
address. To view interface-level information, click a switch interface to open the switch port
information window. The window shows network settings, such as the interface name, physical
address, and interface type, and also displays statistics for each interface.
Extending Subnets
A subnet can be extended between on-prem local and remote clusters or sites (Availability
Zones or AZs) to support seamless application migration between these clusters or sites.
With Layer 2 subnet extension, you can migrate a set of applications to the remote AZ while
retaining their network bindings such as IP address, MAC address, and default gateway.
Since the subnet extension mechanism allows VMs to communicate over the same broadcast
domain, it eliminates the need to re-architect the network topology, which could otherwise
result in downtime. The Layer 2 extension assumes that underlying Layer 3
connectivity already exists between the Availability Zones.
• VPN: This can be used to extend the subnet across two Nutanix AZs.
• VTEP: This can be used to extend the subnet across two Nutanix AZs as well as between a
Nutanix AZ and one or more non-Nutanix datacenters.
Supported Configurations
• Ensure that the Prism Central versions support Layer 2 virtual subnet extension.
• Ensure that you pair the Prism Central at the local AZ with the Prism Central at the remote
AZ to use the Create Subnet Extension wizard to extend a subnet across the AZs and
facilitate bidirectional communication between these clusters or sites.
• Ensure that you set up a default static route with 0.0.0.0/0 prefix and the External Network
next hop for the VPC you use for any subnet extension. This allows NTP and DNS access for
the Network Gateway appliance.
Best Practices
Nutanix recommends the following configurations to allow IP address retention for VMs on
extended subnets.
• When using Nutanix IPAM, ensure the address ranges in the paired subnets are unique to
avoid conflict between VM IP addresses across extended subnets.
• If the source and target sites use third-party IPAM, ensure that there are no conflicting IP
address assignments across the two sites.
• If connectivity between sites already provides encryption, consider using VTEP-only subnet
extension to reduce encryption overhead.
• Use the Subnet Extension to a Third Party Data-Center workflow to:
- Extend a subnet to more than one other AZ. This is also known as point to multi-point.
- Extend subnets between clusters managed by the same Prism Central.
Note: For additional information, refer to the Layer 2 Virtual Network Extension section of
the Flow Networking Guide on the Nutanix Support Portal.
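The best practice of keeping IPAM address ranges unique across paired subnets can be checked with a simple interval comparison. The Python sketch below is one way to do that check before extending a subnet. The function name ranges_overlap and the sample addresses are assumptions made for this example.

```python
import ipaddress

def ranges_overlap(range_a, range_b):
    """Return True if two inclusive (start, end) IP address ranges share any address."""
    a_start, a_end = (int(ipaddress.ip_address(x)) for x in range_a)
    b_start, b_end = (int(ipaddress.ip_address(x)) for x in range_b)
    return a_start <= b_end and b_start <= a_end

# Hypothetical pools on the local and remote sides of an extended subnet.
local_pool = ("10.20.30.10", "10.20.30.99")
remote_pool = ("10.20.30.100", "10.20.30.199")

print(ranges_overlap(local_pool, remote_pool))                       # the pools are disjoint
print(ranges_overlap(local_pool, ("10.20.30.90", "10.20.30.120")))   # the pools share addresses
```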
Segmenting Networks
Network segmentation enhances security, resilience, and cluster performance by isolating a
subset of traffic to its own network.
You can separate management traffic from storage replication (or backplane) traffic by
creating a separate network segment (LAN) for storage replication. You can segment the
network on an existing cluster by using the Prism web console. You must configure a separate
VLAN for the backplane network to achieve logical segmentation. The network segmentation
process creates a separate network for backplane communications on the existing default
virtual switch and places the eth2 interfaces (that are created on the CVMs during upgrade)
and the host interfaces on the newly created network. This method allows you to achieve
logical segmentation of traffic over the selected VLAN. From the specified subnet, assign IP
addresses to each new interface. You, therefore, need two IP addresses per node. When you
specify the VLAN ID, AHV places the newly created interfaces on the specified VLAN.
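Because backplane segmentation needs two IP addresses per node, it helps to confirm that the chosen backplane subnet is large enough before you start. The Python sketch below does that arithmetic. It is a rough check under stated assumptions: the function names are hypothetical, the subnet is IPv4, and the network and broadcast addresses are treated as unusable.

```python
import ipaddress

def backplane_addresses_needed(node_count):
    """Two addresses per node, one for each new interface on the backplane network."""
    return 2 * node_count

def subnet_is_large_enough(cidr, node_count):
    """Check whether an IPv4 subnet has enough usable addresses for the backplane."""
    network = ipaddress.ip_network(cidr)
    usable = network.num_addresses - 2  # exclude the network and broadcast addresses
    return usable >= backplane_addresses_needed(node_count)

print(backplane_addresses_needed(4))                # 8 addresses for a 4-node cluster
print(subnet_is_large_enough("192.168.5.0/28", 4))  # a /28 has 14 usable addresses
print(subnet_is_large_enough("192.168.5.0/28", 8))  # 16 addresses needed, only 14 usable
```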
You can physically isolate the backplane traffic (intra-cluster traffic) from the management
traffic (Prism, SSH, SNMP) by moving it to a separate vNIC on the CVM and using a dedicated
virtual network that has its own physical NICs. You can use Prism to configure the vNIC on the CVM
and configure the backplane traffic to communicate over the dedicated virtual network.
However, you must first manually configure the virtual network on the hosts and associate it
with the physical NICs that it requires for true traffic isolation.
You can also secure traffic associated with a service (for example, Nutanix Volumes) by
confining its traffic to a separate vNIC on the CVM and using a dedicated virtual network that
has its own physical NICs. This type of segmentation offers true physical separation for service-
specific traffic. You can use Prism to create the vNIC on the CVM and configure the service to
communicate over the dedicated virtual network. But first, you must manually configure the
virtual network on the hosts and associate it with the physical NICs that it requires for true
traffic isolation. You need one virtual network for each service you want to isolate.
Network segmentation for Volumes also requires you to migrate iSCSI client connections to the
new segmented network. If you no longer require segmentation for Volumes traffic, you must
also migrate connections back to eth0 after disabling the vNIC used for Volumes traffic. You
can create two different networks for Nutanix Volumes with different IP pools, VLANs, and data
services IP addresses. For example, you can create two iSCSI networks for production and non-
production traffic on the same Nutanix cluster.
Some Nutanix platforms support remote direct memory access (RDMA) for Stargate-to-
Stargate service communication. You can create a separate virtual network for RDMA-enabled
network interface cards. If a node has RDMA-enabled NICs, Foundation passes the NICs
through to the CVMs during imaging. The CVMs use only the first of the two RDMA-enabled
NICs for Stargate-to-Stargate communications. The virtual NIC on the CVM is named rdma0.
Foundation does not configure the RDMA LAN. After creating a cluster, you need to enable
RDMA by creating an RDMA LAN from the Prism web console.
Unsegmented Network
In the default unsegmented network in a Nutanix cluster (AHV), the Controller VM has two
virtual network interfaces—eth0 and eth1. Interface eth0 is connected to the default external
virtual switch, which is in turn connected to the external network through a bond or NIC team
that contains the host physical uplinks. Interface eth1 is connected to an internal network that
enables the CVM to communicate with the hypervisor.
In an unsegmented network all external CVM traffic, whether backplane or management traffic,
uses interface eth0. These interfaces are on the default VLAN on the default virtual switch.
In AHV, VM live migration traffic is also backplane, and uses the AHV backplane interface,
VLAN, and virtual switch when configured. If you further isolate service-specific traffic,
additional vNICs are created on the CVM. Each service requiring isolation is assigned a
dedicated virtual NIC on the CVM. The NICs are named ntnx0, ntnx1, and so on. Each service-
specific NIC is placed on a configurable existing or new virtual network (vSwitch or bridge) and
a VLAN and IP subnet are specified.
In a segmented network, management traffic uses CVM interface eth0 and additional services
can be isolated to different VLANs or virtual switches. In backplane segmentation, the
backplane traffic uses interface eth2. The backplane network uses either the default VLAN
or, optionally, a separate VLAN that you specify when segmenting the network. In AHV this
internal interface is created automatically in the selected virtual switch. For physical separation
of the backplane network, select the desired virtual switch in the AHV GUI.
If you want to isolate service-specific traffic such as Volumes or Disaster Recovery as well as
backplane traffic, then additional vNICs are needed on the CVM, but no new vmkernel adapters
or internal interfaces are required. AOS creates additional vNICs on the CVM. Each service that
requires isolation is assigned a dedicated vNIC on the CVM. The NICs are named ntnx0, ntnx1,
and so on. Each service-specific NIC is placed on a configurable existing or new virtual network
(vSwitch or bridge) and a VLAN and IP subnet are specified.
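The interface assignments described above can be summarized in a short Python sketch. It is an illustrative model only, not a Nutanix API. The function name cvm_interface_for and the traffic labels are hypothetical, and numbering the ntnx NICs by list position is an assumption made for the example.

```python
from typing import Sequence


def cvm_interface_for(traffic: str,
                      backplane_segmented: bool = False,
                      isolated_services: Sequence[str] = ()) -> str:
    """Return the CVM interface expected to carry the given traffic type."""
    # eth1 is the internal link between the CVM and the hypervisor.
    if traffic == "internal":
        return "eth1"
    # Each isolated service gets a dedicated vNIC named ntnx0, ntnx1, ...
    # Numbering by list position is an assumption made for this sketch.
    if traffic in isolated_services:
        return f"ntnx{list(isolated_services).index(traffic)}"
    # Once the backplane is segmented, its traffic moves to eth2.
    if traffic == "backplane" and backplane_segmented:
        return "eth2"
    # Everything else (management, unsegmented backplane) stays on eth0.
    return "eth0"


print(cvm_interface_for("backplane"))
print(cvm_interface_for("backplane", backplane_segmented=True))
print(cvm_interface_for("volumes", isolated_services=["volumes", "dr"]))
```

The sketch treats any service that has not been isolated as sharing eth0, which follows the unsegmented default described earlier.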
You can choose to perform backplane segmentation alone, with no other forms of
segmentation. You can also choose to use one or more types of service specific segmentation
with or without backplane segmentation. In all of these cases, you can choose to segment any
service to either the existing, or a new virtual switch for further physical traffic isolation. The
combination selected is driven by the security and networking requirements of the deployment.
Note: In the above figure the interface name 'br0-bp' is read as 'br0-backplane'.
Note: For additional information, refer to the Securing Traffic Through Network
Segmentation section of the Security Guide on the Nutanix Support Portal.
VM NICs Tab
This tab can be accessed from the VMs tab. The VM NICs tab displays information
about the virtual NICs in a selected VM. (This tab appears only when a VM is selected.)
Each line represents a virtual NIC, and you can view information for each NIC, such
as the virtual NIC identification number, the adaptor type defined for the virtual NIC, the
virtual NIC MAC address, the virtual NIC IPv4 address(es), and more.
Switch Properties
To track and record networking statistics for a cluster, the cluster requires information about
the first-hop network switches and the switch ports being used. This information can be viewed
on the Switch tab, which can be accessed from the Hardware dashboard.
Clicking the Switch tab displays information about the physical switches used by the host
NICs to support traffic through the virtual NICs. The information is displayed in a table. The
top section of the screen displays information about the switches, and the lower portion of the
screen displays additional information when a switch is selected. You can configure any number
of switches, but only the switches that are actually being used for virtual NIC traffic appear in
this table.
Another location to view information is the network visualizer screen. You can view information
about the switch in the Topology view. Selecting a switch displays the switch port data.
For information on switch properties, refer to the Switch tab section of the Hardware
dashboard of the Prism Web Console guide.
Switch Status
To verify that a switch has been configured correctly, check its configuration. To view it,
navigate to the Settings page and select Network Switch from
the menu. On the Network Switch Configuration page, you can view and edit the configuration
information and SNMP profile.
For information on switch configurations, refer to the Configuring Network Switch Information
section of the Prism Web Console guide.
As discussed earlier in this module, the network visualization page not only allows you to view
most of the network configuration information but also view detailed information about the
network configurations, components, and the connected entities. For information on using the
visualizer, refer to the Network Visualization section of the Prism Web Console guide.
Health Checks
Prism Element provides you with the option to run health checks. These checks help monitor
the cluster health and identify issues with the cluster. The cluster health checks cover a range of
entities including AOS, hypervisor, and hardware components. A set of checks is enabled by
default, but you can run, disable, or reconfigure any of the checks at any time.
These checks can be run directly from Prism Element. For details on configuring and running
the checks, refer to the Running Checks by Using Web Console section of the Prism Web
Console guide.
Using Charts
You can create custom charts to monitor different metrics of your cluster. For details on the
metrics that can be monitored, refer to the Chart Metrics section of the Prism Web Console
guide.
Additional Resources
You can use the following links to access additional information on how you can identify issues
with your network.
• Host Network Management
• List of OVS Commands
• Investigating Network Issues Identified in Network Visualizer
Module Summary
In this module, you learned:
Managing Images
Module
4
MANAGING IMAGES
Module 4 Overview
This module will introduce you to the Image Service, which is available in both Prism Central
and Prism Element.
By the end of this module, you will have learned:
ISO and disk image files can be imported using either Prism Central or Prism Element. These
images can then be used to install operating systems or applications. With Prism Central you can
clone an existing VM disk and add it to the image list.
Images that are uploaded to Prism Element reside in and can be managed by only that Prism
Element. In essence, they are only available for use on a single cluster.
However, if you are using Prism Central, you can migrate images manually to Prism Central and
manage all of the images for one or more clusters from a single location. An image migrated
to Prism Central in this way remains on Prism Element, but can only be updated, modified, and
managed from Prism Central.
Note: The image service uses port 2007, so it must be open. For a complete list
of required ports, see the Port Reference document in the Software Documentation
section of the Nutanix Support Portal.
1. Click the Entities menu and expand the Compute & Storage category.
2. Click the Images option. Note that you can click the star icon to the right of the option name
to bookmark the dashboard for easy access.
By default, the List view will be displayed, as shown in the following figure.
• List. The List tab displays a table with the name, description, type, size, and creator of all
images managed by Prism Central. Clicking the name of an image will display a details page
for that image. From this tab, you can add, import, delete, and update images; add images
to catalogs; and manage categories. You can also filter the page by name, description,
category, and type.
• Policies. The Policies tab has two options:
- Placement Policies. Prism Central enables you to configure policies that govern which
clusters receive the images that you upload. These policies, called image placement
policies, map images to target clusters using categories associated with both those
entities. The Placement Policies page lists all image placement policies that have been
configured, the images to which they have been applied, and whether or not an image is
compliant with the policy. You can also create new placement policies directly from this
page.
- Bandwidth Throttling Policies. A bandwidth throttling policy limits the bandwidth
consumed during image creation using the URL option in the specific clusters (this is
described in more detail later in this module). Without a bandwidth throttling policy,
image creation from a remote server consumes as much bandwidth as is available.
This limits the bandwidth availability for the other cluster operations. From this page, you
can view, create, update, and delete bandwidth throttling policies.
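A rough calculation shows what a bandwidth throttling policy trades off. The Python sketch below estimates how long an image transfer takes at a given rate. The function name, the image size, and the rates are illustrative assumptions and are not values taken from Prism Central.

```python
GIB = 1024 ** 3  # bytes in one gibibyte
MIB = 1024 ** 2  # bytes in one mebibyte


def transfer_seconds(image_size_bytes: int, rate_bytes_per_sec: float) -> float:
    """Estimate the transfer time for an image at a fixed transfer rate."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return image_size_bytes / rate_bytes_per_sec


# Hypothetical 10 GiB image: unthrottled link versus a throttled policy.
unthrottled = transfer_seconds(10 * GIB, 100 * MIB)
throttled = transfer_seconds(10 * GIB, 20 * MIB)
print(f"unthrottled: {unthrottled:.1f} s, throttled: {throttled:.1f} s")
```

In this example the throttled transfer takes five times longer, but it leaves the remaining bandwidth free for other cluster operations.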
The image details page displays basic image information (name, description, type, size, creator,
and owner cluster UUID) as shown in the following figure.
The image details page also displays the cluster on which the image is located, along with basic
cluster information (name, AOS version, hypervisor, number of hosts, and number of VMs), as
well as the policies that apply to the selected image.
From this page, you can also delete and update an image; add the image to a catalog; and
manage categories.
1. Log into Prism Central, click the Entities menu, expand the Compute & Storage category,
and click Images.
2. On the Images dashboard, click the Add Image button at the top left corner.
4. Select a location for the image and click Save. This will first create your image, and then
upload your selected image file. You can monitor the upload progress by clicking the Tasks
icon near the top right of the page, or from the Tasks dashboard.
While the process is quite simple, several options are available that provide flexibility in how
you upload your images, and on which clusters you can place them.
Selecting an Image
When selecting an image, you will see three options, as shown in the following figure.
• Image File. This option allows you to upload files directly from the workstation on which you
are accessing Prism Central. You can select multiple images and upload them as part of a
single operation.
• URL. Most modern browsers impose file size limitations that affect the Image File method
of upload. Also, the browser type, and CPU and RAM utilization on the workstation limit
the number of concurrent uploads. Concurrent uploads exceeding the default limit of the
browser are queued or throttled by the browser and can take more time. Large file uploads
and high CPU and memory utilization can slow down the browser. In these situations,
or if you need to upload an image that is 2GB or larger, uploading from a remote server
is preferred. Note that you can also specify URLs to multiple images as part of a single
operation.
• VM Disk. Allows you to select a VM that is managed by Prism Central and clone a VM disk for
use as an image.
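The guidance in the list above can be expressed as a small decision helper. This is a sketch only: the function name is hypothetical, and treating 2GB as 2 x 1024^3 bytes is an assumption, since the guide does not say whether decimal or binary units are meant.

```python
from typing import Optional

TWO_GB = 2 * 1024 ** 3  # assumed binary interpretation of "2GB"


def suggested_image_source(size_bytes: Optional[int] = None,
                           from_vm_disk: bool = False) -> str:
    """Suggest one of the three image sources described in the list above."""
    if from_vm_disk:
        # Cloning a disk of a VM that Prism Central already manages.
        return "VM Disk"
    if size_bytes is not None and size_bytes >= TWO_GB:
        # Large images are better fetched from a remote server.
        return "URL"
    # Small files can be uploaded straight from the workstation browser.
    return "Image File"


print(suggested_image_source(size_bytes=700 * 1024 ** 2))
print(suggested_image_source(size_bytes=5 * 1024 ** 3))
```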
After you select an image (either from an image file, by providing a URL, or by selecting a VM
disk), details of the selected image will be displayed, as shown in the following figure.
The Name field will contain the filename of the image, and can be edited. The Type field will
display the type of uploaded image, and can be either ISO or disk. You can (optionally) also
provide a description of the image and select a hashing algorithm (SHA-1 or SHA-256).
Note: The option to select a hashing algorithm is available only for images
uploaded from a workstation. The Checksum field will not appear for images
uploaded via a remote server or created using a VM disk.
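Before uploading from a workstation, it can be useful to compute the checksum of the image file locally so that it can be compared with the value you expect. The following Python sketch uses the standard hashlib module. The function name and the chunk size are illustrative choices, and how Prism Central itself computes or validates the checksum is not described here.

```python
import hashlib


def file_checksum(path: str, algorithm: str = "sha256",
                  chunk_size: int = 1024 * 1024) -> str:
    """Return the hex digest of a file using SHA-1 or SHA-256."""
    if algorithm not in ("sha1", "sha256"):
        raise ValueError("algorithm must be 'sha1' or 'sha256'")
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image_file:
        # Read in chunks so that large images are not loaded into memory.
        for chunk in iter(lambda: image_file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, file_checksum("virtio.iso", "sha256") returns the SHA-256 digest of a local file named virtio.iso (a hypothetical file name).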
Selecting a Location
After you have selected your image and entered relevant details, you will be prompted to select
a location for the image, as shown in the following figure.
There are two ways you can determine where the image will be placed. You can:
• Place image directly on clusters. This option is recommended for smaller environments. You
can use this option to select one or more clusters as shown in the figure above; the uploaded
image will be made available on all selected clusters.
• Place image using image placement policies. This option is recommended for larger
environments. To use this option, you need to set up image placement policies, which
involves associating categories with clusters, and then assigning categories to images. Once
this is done, simply selecting a category while uploading an image (as shown in the following
figure) will place the image on the appropriate cluster.
After you have selected a location for the image and clicked Save, Prism Central will first create
the image and then upload the associated image file. You can view the status and progress of
both these tasks by clicking the Tasks icon near the top right of the page.
You can also monitor progress from the Tasks dashboard, which can be accessed by clicking
the View All Tasks link.
For step-by-step instructions, see the Creating an Image section of the Prism
Central Guide on the Nutanix Support Portal.
Modifying an Image
After you upload an image to Prism Central, several options become available to you. To access
these options, on the Images dashboard, select the checkbox next to the name of an image and
click the Actions drop-down list as shown in the following figure.
The four options listed here (delete, update, add to catalog, and manage categories) are also
available on the image details page, at the top right as shown in the following figure.
Deleting an Image
Clicking Delete will allow you to permanently remove the image from Prism Central and any
clusters that the image may have been made available on. Image deletion cannot be undone; if
you need to use the image again, you will need to reupload it and make it available on different
clusters once more.
Updating an Image
Clicking Update Image will open the Edit Image page. Here, you can rename the image, change
the description (or add one, if a description was not provided earlier), and change the image
type (disk or ISO). No other changes can be made.
The Add Image to Catalog option is useful if you are using Prism Self Service and empowering
your end users to create their own VMs. Clicking Add Image to Catalog adds a copy of that image
to the self-service catalog, where it is available to users who have permissions to create VMs.
Since a copy is added to the catalog, you can safely delete the original image without affecting
the catalog copy. Similarly, deleting the image from the catalog will have no impact on the
original image.
Clicking Manage Categories allows you to assign categories to images. This, in turn, allows
policies to be applied automatically to images, along with all the other benefits that come with
categorizing various entities in Prism Central.
Note: For step-by-step instructions, see the Modifying an Image section of the
Prism Central Guide on the Nutanix Support Portal.
To meet these requirements without constant, excessive manual intervention, Prism Central
allows you to define image placement policies that determine which clusters will receive the
images that you upload. These policies automatically assign images to target clusters based on
categories that are applied to images as well as clusters.
You can also define how strictly these policies are enforced. 'Soft' enforcement allows clusters
that are not specified in the policy to use uploaded images. 'Hard' enforcement prevents this
entirely.
Note: For a detailed list of potential scenarios and related image policy
configurations, see the Sample Scenarios and Configurations section of the Prism
Central Guide on the Nutanix Support Portal.
1. Create categories. You will need to create categories for images and for clusters.
Creating a Category
A category, in the simplest of terms, is a key-value pair. As an example, a category could be
"Department" with the associated values being "Marketing", "Sales", "Finance", and "Education".
Prism Central offers several built-in categories, and allows you to create custom ones as well.
Typically, you will assign entities to a category, and then apply policies to that category - which
in turn will apply the policy to all associated entities.
Consider a backup policy as an example. You may have a company policy that requires all
VMs to be backed up, but with different frequencies for different departments. For example,
Marketing and Sales VMs may be backed up only once every day, but Finance VMs need to be
backed up every hour. In this scenario, you can create a Department category with Marketing,
Sales, and Finance as three of its values. Then, you can assign VMs to that category, and apply
custom backup policies for different departments.
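The backup example above can be modeled in a few lines of Python. This is a conceptual sketch, not Prism Central code. The VM names and the dictionary layout are hypothetical, and each VM is assumed to carry at most one Department value.

```python
from typing import Dict, Optional

# The policy is attached to category values, not to individual VMs.
BACKUP_INTERVAL_HOURS: Dict[str, int] = {
    "Marketing": 24,  # once every day
    "Sales": 24,      # once every day
    "Finance": 1,     # every hour
}


def backup_interval_hours(vm_categories: Dict[str, str]) -> Optional[int]:
    """Look up the backup interval from the VM's Department category."""
    department = vm_categories.get("Department")
    return BACKUP_INTERVAL_HOURS.get(department)


# Hypothetical VMs and their category assignments.
vms = {
    "mkt-web-01": {"Department": "Marketing"},
    "fin-db-01": {"Department": "Finance"},
    "lab-tmp-01": {},  # no category assigned, so no policy applies
}
schedule = {name: backup_interval_hours(cats) for name, cats in vms.items()}
print(schedule)
```

Because the policy is tied to the category value, moving a VM to a different department changes its backup frequency without any change to the policy itself.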
VMs are not the only type of entity that can be assigned to categories. In this module, we will
discuss how to create and apply categories to images.
To create a category, you need to navigate to the Categories dashboard in Prism Central. To do
this, click the Entities menu, expand the Administration category, and then select Categories.
By default, the Categories dashboard will display a list of system-defined categories, a button
to create new ones, and will present options to modify or assign a category if a specific
category is selected.
Clicking New Category will display the Create Category page. Here, you need to provide a
name, a purpose (optional), and values for the category, and then save your changes. In the
example shown in the following figure, we are creating a custom NTNXDepartment category,
with values for Education, Sales, Marketing, CreativeServices, and CustomerSucess. Note that
when creating category values, spaces cannot be used, which is why we use CreativeServices
instead of Creative Services.
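If you prepare category values in a script before entering them, the no-spaces rule can be applied with a small helper. The function name below is hypothetical, and the sketch only removes whitespace; any other naming restrictions that Prism Central may enforce are not covered.

```python
def to_category_value(label: str) -> str:
    """Join the words of a label so that the result contains no spaces."""
    return "".join(label.split())


print(to_category_value("Creative Services"))
```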
A category cannot be associated with entities from the Categories dashboard in Prism Central.
To assign categories, you need to navigate to the dashboard of the specific entity that you
want to associate. In this example, we're going to associate a category with an image, so we
need to navigate to the Images dashboard.
On the Images dashboard, if we select our VirtIO image, click Actions, and select Manage
Categories, the Manage Image Categories page will be displayed. On this page, you can use
the Search field to find a specific category, click the category to select it, and then click the +
icon next to the search bar to assign the category to the image.
In this example, we have assigned the NTNXDepartment: Education category to our VirtIO
image, but more categories can be assigned if needed. After you have selected all required
categories, click Save.
Now, if we return to the details page of our NTNXDepartment category on the Categories
dashboard, you will see that the Education value has one image associated with it.
On the Create Image Placement Policy page, you can name your policy, provide a description
(optional), and use categories to select images and the clusters to which they will be assigned.
You can also choose Hard or Soft enforcement.
In this example, we want the images that are used in our lab environment to be made available
only on our test clusters. So, our policy is named Education Lab Image Assignment, and all
images tagged with NTNXDepartment: Education will only be assigned to clusters that are
tagged Environment: Testing. There will be no exceptions to this policy, since we have selected
Hard enforcement.
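The effect of Hard and Soft enforcement in this example can be sketched as a simple matching rule. The Python model below is an illustration of the concept, not the Prism Central implementation. The class and function names are hypothetical, each entity is assumed to hold one value per category key, and images that the policy does not match are assumed to be unrestricted.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PlacementPolicy:
    name: str
    image_category: Tuple[str, str]    # (key, value) that selects images
    cluster_category: Tuple[str, str]  # (key, value) that selects clusters
    enforcement: str                   # "hard" or "soft"


def cluster_may_use_image(policy: PlacementPolicy,
                          image_categories: Dict[str, str],
                          cluster_categories: Dict[str, str]) -> bool:
    """Decide whether a cluster may use an image under a single policy."""
    image_key, image_value = policy.image_category
    if image_categories.get(image_key) != image_value:
        return True  # policy does not apply to this image (assumption)
    cluster_key, cluster_value = policy.cluster_category
    if cluster_categories.get(cluster_key) == cluster_value:
        return True  # cluster is a target of the policy
    # Clusters outside the policy may use the image only under Soft enforcement.
    return policy.enforcement == "soft"


policy = PlacementPolicy(
    name="Education Lab Image Assignment",
    image_category=("NTNXDepartment", "Education"),
    cluster_category=("Environment", "Testing"),
    enforcement="hard",
)
image = {"NTNXDepartment": "Education"}
print(cluster_may_use_image(policy, image, {"Environment": "Testing"}))
print(cluster_may_use_image(policy, image, {"Environment": "Production"}))
```

The Environment: Production value is a hypothetical second cluster category used only to show a cluster that the policy excludes.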
Note: For more information and step-by-step instructions, see the Image
Placement Policies section of the Prism Central Guide on the Nutanix Support
Portal.
After your image has been uploaded, if you want to manage it centrally, you can import the
image to Prism Central.
1. Log into Prism Element, open the Settings page, and in the General category, click Image
Configuration.
3. In the Create Image dialog box, name your image, select an image type (disk or ISO), select
a storage container, and choose an image source (from URL or upload a file from your
workstation).
4. Review the information in the dialog box and, if no changes are required, click Save.
The image import process allows you to be as granular as required. If you choose to import
All images, every image on every cluster connected to Prism Central will be imported. If you
choose to import images only from a specific cluster, you have two additional options. First,
you can import all images on the selected cluster, as shown in the following figure. To do this,
select Images on a cluster, then select the checkbox next to the name of a cluster, and click
Save.
If you only want to import specific images, you can click Select Images. A dialog box will be
displayed that lists all images available on the selected cluster (in our example, the name of
one of our test clusters - CDev9 - can be seen at the top of the following figure). You can then
choose to import some or all of the available images. After you select the images you want to
import, click Next, and in the Import Images dialog box, click Save.
Module 4 Summary
In this module, you learned:
Creating Virtual Machines
Module
5
CREATING VIRTUAL MACHINES
Module 5 Overview
This module will help you understand how VMs are created using Prism Central on an AHV
cluster, both as an administrative user and as a self-service user.
By the end of this module, you will have learned:
In addition to all of the functionality of Prism Element, Prism Central also allows you to
manage affinity and NGT policies, view filtered VM-specific lists of alerts and events, create
VM templates, enable/disable efficiency measurement and anomaly detection, run playbooks,
manage categories, add VMs to catalogs, manage VM ownership, and more. Prism Central also
includes a larger collection of VM metrics, available out-of-the-box.
As a result, even if you have a relatively small deployment with a single cluster, it is
recommended to use Prism Central when creating and managing VMs. This gives you access
to Prism Central's broader feature set and makes more options available for simpler, more
effective VM administration.
1. Click the Entities menu and expand the Compute & Storage category.
2. Click the VMs option. Note that you can click the star icon to the right of the option name to
bookmark the dashboard for easy access.
By default, the List view will be displayed, as shown in the following figure.
By default, the VMs dashboard has six tabs: Summary, List, Policies, Alerts, Events, and
Metrics. You can use these tabs to get an overview of all VMs on clusters that are managed by
Prism Central, view information specific to each VM, view and acknowledge VM-specific alerts
and events, view and create policies, and view metrics.
If you click the name of a VM in the List view of the VM dashboard (or in the Highlighted
Entities widget on the Summary view) a VM details page will be displayed, as shown in the
following figure.
The top menu bar displays information specific to the selected VM, such as: the name, the
cluster it belongs to, the host, and the VM. It also allows you to select and view other clusters
and hosts.
VM Details Tabs
Directly below the top menu is a series of tabs that can be used to view specific VM related
information. The tabs available are:
• Summary: This is the default view for every VM details page. It is divided into two
sections: action buttons and widgets. The action buttons allow you to perform various
administrative actions on VMs, and the widgets are used to view information about the VM,
such as efficiency, IP addresses, IOPS, IO latency, anomalies, and so on.
• Console: Displays the VM console screen.
• Recovery Points: Displays a list of recovery points (backup snapshots) when backups are
enabled.
• Snapshots: This tab displays the backup snapshots of the VM that were taken using Prism
Element.
• Alerts: The Alerts tab displays a filtered list of unresolved alerts specifically for the selected
VM.
• Events: The Events tab displays a filtered list of events specifically for the selected VM.
• Metrics: This tab displays a page with 23 built-in charts that can be used to monitor the VM's
performance. The charts are: CPU Usage (%), CPU Ready Time, Memory Usage (%), and so
on.
• NICs: This tab displays information about the virtual NICs in the VM, such as the VLAN ID,
Subnet, Network Connection State, VPC, Virtual Switch, MAC Address, and so on.
• Disks: The Disks tab displays information about the virtual disks in the VM. It provides
information on the Disk Address and Capacity.
• Categories: The Categories tab displays the categories and image placement policies
associated with the VM.
• Apps & Relationships: This is a drop-down menu that displays two options: Discovered Apps
and App Relationships.
Network
As we discussed earlier in this course, Nutanix allows you to configure network connections
in Prism. Each VM network interface is bound to a virtual network, and each virtual network
is bound to a single VLAN. You can view information about available virtual networks in the
Network Configuration page.
Adding an existing network (that is, assigning a previously-created network to a VM) is a step
in the VM creation process.
Images
As discussed earlier in this course, you can import and configure operating system ISO and disk
image files on AHV clusters using either Prism Central or Prism Element. These images can then
be used for operating system or application installation. For example, you can attach a disk or
CD-ROM to a VM by cloning an existing image from the image library.
VirtIO is available in two formats. If you are performing a fresh installation of Windows on an
AHV VM, use the VirtIO ISO. If you are updating VirtIO for Windows on an existing VM, use the
VirtIO MSI installer file.
Supported versions of Windows operating systems are: Microsoft Windows Server 2008 R2 or
later, and Microsoft Windows 7 or later.
Note: On Windows 7 and Windows Server 2008 R2, install Microsoft KB3033929 or
update the operating system with the latest Windows Update to enable support for
SHA2 certificates.
The latest VirtIO drivers can be downloaded from the Nutanix Support Portal. To download the
latest versions:
1. Access the Downloads section on the Support Portal, and select AHV.
2. Select VirtIO in the drop-down menu and download the required VirtIO package.
Categories
As we discussed earlier in this course, a category is a key-value pair. As an example, a category
could be "Department" with the associated values being "Marketing", "Sales", "Finance", and
"Education". Prism Central offers several built-in categories, and allows you to create custom ones
as well. Typically, you will assign entities to a category, and then apply policies to that category
- which in turn will apply the policy to all associated entities.
Consider a backup policy as an example. You may have a company policy that requires
all VMs to be backed up, but with different frequencies for different departments. For example,
Marketing and Sales VMs may be backed up only once every day, but Finance VMs need to be
backed up every hour. In this scenario, you can create a Department category with Marketing,
Sales, and Finance as three of its values. Then, you can assign VMs to that category, and apply
custom backup policies for different departments.
Irrespective of the size of your deployment, categories are a useful feature for managing VMs.
They allow you to quickly and easily assign policies, create custom reports so you can view
specific datasets related to VMs that belong to a specific category, and so on.
As a result, before you create a VM, it is recommended that you have categories prepared for
use. If, however, you choose to create or modify your categories later, you can assign VMs
to your new/updated categories (or change category assignment) even after a VM has been
created.
To do this, on the VMs Dashboard in Prism Central, select one or more VMs, click the Actions
drop-down menu and select Manage Categories, as shown in the following figure.
UEFI (Unified Extensible Firmware Interface) firmware is a successor to the BIOS firmware
and replaces the BIOS on PCs. Like the BIOS, UEFI is low-level software and boots before your
operating system, but is more modern and supports larger hard drives, offers faster boot times,
and provides more security features. One of these features is Secure Boot.
The BIOS cannot distinguish between a boot loader and malware that may be designed
to replace a boot loader - the BIOS will simply boot whatever software it detects. UEFI
eliminates this vulnerability by checking the boot loader before it launches it, using policies
present in the firmware along with certificates, to ensure that only properly signed and
authenticated components are allowed to execute. If the boot loader has been tampered with,
UEFI will prevent it from booting, which in turn prevents malware from concealing itself in your
operating system.
• Nutanix does not support converting a VM that uses IDE disks or legacy BIOS to a VM that
uses Secure Boot.
• The minimum supported version of the Nutanix VirtIO package for Secure Boot-enabled VMs
is 1.1.6.
A generalized image can be used in combination with an answer file to customize installations
of Windows on multiple VMs. An answer file is an XML file that contains various definitions
and values to be used during the Windows Setup process, and can be used to perform an
unattended installation (that is, an installation with no manual intervention) of Windows. An
answer file specifies details such as how to partition disks, which product key to apply, and
which features, such as Windows Media Player, to disable.
Note: For more information and step-by-step instructions, see the Customization
of Windows Virtual Machines with System Preparation section of the Prism Web
Console Guide on the Nutanix Support Portal.
Creating a VM using Prism - whether you are using Prism Central or Prism Element - is as simple
as filling out a form and clicking a few buttons. To create a VM in Prism Central, you need to
click the Entities menu, expand the Compute & Storage category, and select VMs. Then, on the
VMs dashboard, click the Create VM button to view the Create VM page.
The first thing to take note of is that the Create VM page has four tabs: Configuration,
Resources, Management, and Review. These are the categories in which you provide
specifications for a VM.
On the Configuration page, you need to provide basic descriptive and configuration
information for your VM. This includes a name, description, the cluster on which you want this
VM to reside, the number of VMs you want to create using these configuration specifications,
vCPUs, cores per vCPU, and memory.
Of particular note is the ability to create multiple VMs using a single set of configuration
specifications. For example, if you choose to create 5 VMs using the information displayed in
the following figure, they will all share the same VM properties (1 vCPU, 2 cores per vCPU, and
8 GB of RAM). They will also all be hosted on the same cluster (CDev9 in our example) at the
time of their creation. Their names will also be suffixed with sequential numbers automatically
as they are created.
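The effect of creating several VMs from one set of specifications can be sketched in a few lines of Python. This is not how Prism is implemented. The VMSpec class, the function name, and in particular the "-1", "-2" suffix format are assumptions, because the text only states that names receive sequential numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMSpec:
    name: str
    cluster: str
    vcpus: int
    cores_per_vcpu: int
    memory_gb: int


def expand_vm_specs(base_name, count, cluster, vcpus, cores_per_vcpu, memory_gb):
    """Return `count` identical specs whose names differ only by a numeric suffix."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [
        # Suffix format is a guess; only the sequential numbering comes from the text.
        VMSpec(f"{base_name}-{index}", cluster, vcpus, cores_per_vcpu, memory_gb)
        for index in range(1, count + 1)
    ]


# Mirrors the example in the text: 5 VMs, 1 vCPU, 2 cores per vCPU, 8 GB of RAM, cluster CDev9.
specs = expand_vm_specs("web", 5, "CDev9", vcpus=1, cores_per_vcpu=2, memory_gb=8)
```

Every entry in specs shares the same cluster and sizing, and only the name changes from one VM to the next.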
On the Resources page, you need to attach disks and networks, and specify the boot configuration.
When attaching disks, two storage device options are available: CD-ROM and disk. When
choosing CD-ROM, you can either attach an empty CD-ROM, or attach a CD-ROM with an
image. The latter is useful for software installations - you can, for example, attach two CD-
ROMs, one with a Windows image and another with a VirtIO image so that you can install
Windows as soon as the VM has been created. When attaching a disk, you can specify which
storage container the disk will be created on and what the storage capacity of that disk will be
in GB. Multiple CD-ROMs and disks can be attached to a single VM if needed.
To attach a network, you will need either a managed or an unmanaged network on the cluster
on which the VM will be created. If no subnets have been created previously, then you will not
be able to take any action here. However, you can still create and test your VM normally, and
attach a network after the VM has been created.
Next, you need to select the boot mode - either BIOS or UEFI. If you select BIOS, you can also
choose the boot priority. If you select UEFI, you will receive a prompt indicating that the disk
image format is different between BIOS and UEFI, and will be asked to confirm your selection.
If you choose UEFI, additional security options become available in the Shield VM Security
Settings section of this page. You can choose to use Secure Boot for the VMs you are creating,
with Windows Defender Credential Guard if you are using a Windows image that supports this
functionality.
Finally, if you have a GPU-enabled AHV cluster, you will also be able to add a GPU here. Either
vGPU or Passthrough options are available with further choices depending on your selection
(NVIDIA Virtual GPU License and vGPU profile for vGPU, and a simple radio button selection if
you chose Passthrough).
On the Management page, you can assign categories, select a timezone for the VM, choose
to use the VM as an agent VM, and provide either sysprep or cloud-init scripts for VM
customization.
Both system-defined and custom categories can be assigned here. If you chose to create
multiple VMs on the Configuration page, the categories that you select and specify here will be
applied to all VMs.
If you choose to create an agent VM, the VM you create will always be powered on first on a
host (before all other VMs) and powered off last (after all other VMs).
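The ordering rule for agent VMs can be illustrated with a short Python sketch. The dictionary layout and function names are hypothetical. The sketch only shows the rule stated above and says nothing about how AHV schedules power operations internally.

```python
def power_on_order(vms):
    """Agent VMs first, then all other VMs; the original order is kept within each group."""
    # sorted() is stable, and False sorts before True, so agent VMs move to the front.
    return sorted(vms, key=lambda vm: not vm["agent"])


def power_off_order(vms):
    """All other VMs first, agent VMs last."""
    return sorted(vms, key=lambda vm: vm["agent"])
```

For example, given an application VM, an agent VM, and a database VM on the same host, power_on_order places the agent VM first and power_off_order places it last.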
Finally, on the Review page, you will see a complete breakdown of your VM with all of the
settings that you selected on previous pages. All of these settings are editable. You can either
click the Edit link in each section or click the Back button to return to a previous page and
make your changes there.
Once you have reviewed all of your configuration information, if no changes are required, click
Create VM to complete the process.
Prism Self Service, as the name suggests, is a feature that allows IT administrators to extend
a measure of self-sufficiency and autonomy to consumers of IT infrastructure. Individual users
or teams that need to create VMs with any degree of regularity can be empowered to do so on
their own, without the need for administrative intervention.
It is important to note, however, that empowering users in this way does not give them
complete, unrestricted access to a cluster or its resources. Using a Self Service feature known
as Projects, an IT administrator can give both users and self-service administrators access to
specific groups of resources and specific entities (such as VM templates or images), and can
configure roles for project members.
In this section, you will learn how to enable and configure Prism Self Service, and how to add
VM templates to a catalog, which your end-users can deploy to create VMs of their own.
A self-service administrator can create projects, add users or groups to projects, configure
roles for each project member (which grant those members access to entities and to actions
on entities), publish VM templates and images to a catalog, monitor project resource usage, and
adjust quotas as needed.
Although self-service administrators do not have administrative access to Prism Central, they
do have full access to all VMs running on a cluster, including VMs that are not part of a project.
Also, after a user or group has been assigned the self-service administrator role, the Prism
Central administrator cannot limit the user or group's privileges.
As a result, always exercise caution when assigning the self-service administrator role to
user groups. Ensure that these groups do not include users who must not have administrative
privileges. Nutanix recommends that you create a separate Active Directory group that
includes only self-service administrators and assign roles accordingly.
A project user is a user and consumer of resources. They are added to projects by either a
Prism Central Administrator or a self-service administrator, and can perform actions or work
with entities based on the role and permissions that have been assigned to them. When project
users log in, they see a custom GUI that displays only what their permissions allow them to see
or interact with.
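The relationship between projects, roles, and what a project user can do can be modeled with a small Python sketch. The role names and action strings below are placeholders invented for this example. They are not the roles that ship with Prism Central, and the classes are not part of any Nutanix SDK.

```python
from dataclasses import dataclass, field

# Hypothetical roles and the actions each one grants.
ROLE_ACTIONS = {
    "vm_creator": {"vm:create", "vm:update", "vm:power"},
    "vm_operator": {"vm:power"},
}


@dataclass
class Project:
    name: str
    members: dict = field(default_factory=dict)  # user name -> role name

    def add_member(self, user, role):
        """Add a user to the project with one of the known roles."""
        if role not in ROLE_ACTIONS:
            raise ValueError(f"unknown role: {role}")
        self.members[user] = role

    def is_allowed(self, user, action):
        """A user may perform an action only if their project role grants it."""
        role = self.members.get(user)
        return role is not None and action in ROLE_ACTIONS[role]


def visible_actions(project, user):
    """Actions a user would see; users outside the project see none."""
    role = project.members.get(user)
    return sorted(ROLE_ACTIONS.get(role, set()))
```

The visible_actions helper reflects the idea that the interface shows a project user only what their permissions allow, rather than showing everything and rejecting requests afterward.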
Administrator Tasks
Before you can configure Prism Self Service, you need to configure Active Directory with a pool
of self-service users. Once that pool has been created, click the Settings (gear) icon in Prism
Central to begin configuring Prism Self Service. On the Settings page, scroll down to the Setup
category and click Self-Service Admin Management as shown in the following figure.
On the Self-Service Admin Management page, you need to first connect to Active Directory
and then configure self service.
On the Connect to AD page, you need to select the Active Directory that contains your self
service users. You also need to provide credentials to a service account, which will be used to
query the Active Directory. When correctly configured, your Connect to AD page should look
something like this:
On the Configure Self Service page, you need to add self-service administrators. To do this,
click the + Add Admins link and select either a user or a group to receive admin privileges.
After you configure an admin, you can edit your changes if desired, or delete the admins and
start over. This can also be done at any time after you set up and configure Prism Self Service
if you want to change who your self service admins are. After you make the required updates,
click Save. Your Configure Self Service page should look something like this:
After this, as an administrator, you need to configure custom permissions for your project users
(if they require them), and publish VMs and images to the catalog so that self-service users
have access to them.
Adding a VM to a Catalog
Since self service users can only create VMs based on pre-defined templates that are available
in Prism's catalog, an administrator (either the Prism Central Admin or the Self Service
Admin) must make these VM templates available.
To do this, navigate to the VMs dashboard in Prism Central, select any VM using the checkbox
next to the name, and click the Actions drop-down menu. From the list of options, select Add
to Catalog as shown in the following figure.
This will display the Add VM to Catalog page. Here, you can name the VM, add a description,
and choose whether or not you want to attach a sysprep or cloud-init script to perform VM
customizations. Making changes here does not affect the original VM in any way. Once you
have made your changes, click Save.
In the example shown in the following figure, we have taken our HMN Test 1 VM, renamed it to
HMN-Template-VM, and added it to the catalog with no further changes or modifications. This
means that our self-service users will be able to deploy a VM that has been preconfigured
with 4 vCPUs, 6 GB of RAM, and a connection to our existing managed network.
As an administrator, you do not need to take any further action after this step. As we will see
in the next section, simply adding a VM template to the catalog is enough to enable your self-
service users to deploy VMs on their own.
When a self service user logs into Prism Central, they will see a very different view from what
the Prism Central Administrator (or even a Self Service Administrator) will typically see.
As the following figure shows, a self service user will see the VMs dashboard as their home
page. Clicking the Prism logo at the top middle of the screen (which usually displays Prism
Central's Main Dashboard) will bring a self service user to their custom version of the VMs
dashboard instead. On this custom version of the VMs dashboard, a self service user will only
see VMs that they have created. In this case, for example, our ss_user1 has created one VM in
the past (RAN-VM), which they can select and perform actions on.
Other administrative options, such as the ability to view alerts, events, and metrics, as well as
the ability to switch between summary and list views, are not available to a self service user.
Selecting our RAN-VM and clicking the Actions button displays a much shorter list of actions
compared to those available to an administrator. While some actions - such as update, delete,
clone, power on/off - are common, certain others are noticeably absent, such as the ability to
enable/disable anomaly detection and efficiency measurement.
Clicking the name of the VM will display a details page, but a different version from what an
administrator may be used to. Four tabs are available - Summary, Console, Disks, and NICs
- which provide a detailed breakdown of the VM's configuration, performance metrics, disk
addresses and capacity, and network information such as IP addresses, subnets, connection
status, and so on.
Actions that can be performed on the VM using the Actions menu (such as update, delete,
clone, and install NGT) on the previous page are available at the top right of this VM details
page.
Finally, a quick look at the Entities menu in Prism Central reveals that our self service user only
has access to (limited versions of) Prism Central dashboards that are required for VM creation.
As the following figure shows, our self service user can access images, catalog items, OVAs,
subnets, and reports.
To create a VM as a self service user, you need to log into Prism Central. The VMs dashboard
will be displayed by default. Click the Create VM button at the top left. The Create VM
page will be displayed as shown in the following figure, which requires information in three
categories: Source, catalog, and deployment.
On the Select Source page, you can choose to create a VM from either a template or a disk
image. In the Adding a VM to a Catalog section of this course guide, we added a template VM
to the catalog specifically for this purpose.
On the Browse Catalog page, you can choose from all available templates for your VM. In this
example, since we have our HMN-Template-VM, that's the one we will select. Each template
available on this page displays a brief summary with basic configuration information - CPU,
storage, and memory - as shown in the following figure.
On the Deploy VM page, you have several options in multiple categories: Deployment settings,
boot configuration, disks, guest customization, network, categories, and advanced settings.
On this page, you can name your VM, choose a target project, set the timezone, choose the
disk to boot from, select a subnet to connect to, perform guest customizations (if enabled by
the Prism Central admin or the self service admin), assign the VM to categories, and override
the template's settings for CPU, cores per CPU, and memory with your own custom settings if
required.
After you make your changes, click Save to deploy your VM.
• The NGT installer, which allows you to install NGT in a guest VM.
• The Nutanix Guest Agent (NGA) service, which maintains communication between the CVM
and guest VMs.
• The Nutanix VirtIO package, which includes VM mobility drivers that enable VM migration
between AHV and ESXi, in-place hypervisor conversion, and cross-hypervisor disaster
recovery.
If you install NGT in a guest VM, you can use the following features:
• Self Service File Recovery: Performs self-service file-level recovery using VM snapshots.
• Nutanix VM Mobility: Facilitates VM migration between ESXi and AHV, in-place hypervisor
conversion, and cross-hypervisor disaster recovery.
• VSS requestor and hardware provider for Windows VMs: Enables application-consistent
snapshots of AHV or ESXi Windows VMs.
• Application-consistent snapshot for Linux VMs: Supports application-consistent snapshots
for Linux VMs by running specific scripts on VM quiesce.
Note: For a detailed list of prerequisites and limitations related to NGT installation
and management in:
• Prism Element, see the Nutanix Guest Tools Requirements and Limitations
section of the Prism Web Console Guide on the Nutanix Support Portal.
• Prism Central, see the NGT Management in Prism Central Requirements section
of the Prism Central Guide on the Nutanix Support Portal.
For example, in Prism Element, you need to first enable NGT on a VM, then log into the VM
and install NGT. While this is feasible for smaller deployments, it is less so for larger ones.
Prism Central, on the other hand, allows you to both enable and install NGT directly from the
management console without having to launch the VM itself. This is recommended for larger
deployments, and is also likely to be preferable for smaller ones due to the simplicity of the
process.
To enable NGT in Prism Central, click the Entities menu, expand the Compute & Storage
category, and select the VMs dashboard. On the VMs dashboard, select one or more VMs on
which NGT has not been enabled. If a VM is powered off or disabled, you will not be able to
enable NGT, and will be notified about this in Prism Central, as shown in the following figure.
However, if the VMs are powered on, the option will not be greyed out. As shown in the
following figure, we have turned two VMs on, selected them, and are going to install NGT on
both. To do this, after you select your VMs, click the Actions drop-down menu, scroll down the
list of available options, and select Install NGT.
You will see the Install NGT dialog box as shown in the following figure. Here, you can choose
which NGT features to enable (Self Service Restore and Volume Snapshot Service). Since a VM
restart is required, you can also choose to skip the restart, restart as soon as the installation is
complete, or restart at a specific date and time. Since NGT is only used on guest VMs and NGT
can be installed while a VM is in use, the third option is the recommended one if you are not
installing NGT during a maintenance window. After you make your selections, click Confirm &
Enter Password.
Before you confirm, you can also click the Review button to view a list of VMs that you are
installing NGT on and which features you have chosen to install on them.
Clicking Confirm & Enter Password will install NGT on the selected VMs. The installation
process is fully automated, requires no further administrative intervention, and does not require
you to log in to the VM.
Note: For more information, see the NGT Management in Prism Central section of
the Prism Central Guide on the Nutanix Support Portal.
If you are using Prism Element to install NGT, it is important to note that you can only do this
on one VM at a time, and that NGT must first be enabled on the guest VM before it can be
installed. To enable NGT on a guest VM, navigate to the Table view of the VM dashboard in
Prism Element, select a VM, and click Manage Guest Tools from the row of links below the
table.
In the Manage VM Guest Tools dialog box, you can choose to enable NGT, as well as the Self
Service Restore and Volume Snapshot Service features that could be selected in Prism Central.
If the VM has an empty CD-ROM drive, you can also mount NGT, so that you can launch the VM
and complete the installation. After you make your changes, click Submit.
The next step is to install NGT on the selected guest VM. You can perform a normal installation
on Windows and Linux VMs, or a silent installation specifically on Windows VMs.
Note: For step-by-step instructions for both Windows and Linux VMs, see the NGT
Installation section of the Prism Web Console Guide on the Nutanix Support Portal.
Module 5 Summary
In this module, you learned:
• The prerequisites for VM creation, including networks, images, VirtIO drivers, Sysprep, and
cloud-init.
• How to create a VM as a Prism Central Administrator.
• How to enable Prism Self Service and add VM templates to the catalog, to enable self-service
users to create their own VMs.
• How to create a VM as a self-service user.
• How to enable and install Nutanix Guest Tools, for access to advanced VM management
capabilities.
Module 6
MANAGING VIRTUAL MACHINES
Module 6 Overview
This module will introduce you to a variety of administrative tasks that you can perform on
VMs, ranging from performing simple configuration updates and updating VM placement rules,
to automating basic management tasks.
By the end of this module, you will have learned:
In Prism Element, after you create a VM, you can update, edit, delete, migrate, power on/off,
take a snapshot, and clone the VM. You can also enable and manage Nutanix Guest Tools, view
a summary of the VM, and view VM statistics for performance, virtual disks, VM NICs, VM
snapshots, VM tasks, and IO metrics.
All of these options are accessible directly from the VM dashboard in Prism Element. You need
to select the VM you want to manage and choose the relevant option from the list of links that
is displayed below the table.
In Prism Central, you have all of the options that are available in Prism Element, as well as some
additional capabilities. In Prism Central, you can also create templates, protect/unprotect,
enable/disable efficiency measurement and anomaly detection, run playbooks, manage
categories, manage ownership, apply storage policies, and so on.
All of these options are accessed directly from the VMs dashboard in Prism Central. Unlike
Prism Element, you can perform these actions on more than one VM at a time. To access these
options, you need to select one or more VMs, click the Actions drop-down menu and choose
the relevant option from the list.
Since Prism Central has a larger set of management capabilities, in this module we will discuss
how you can manage and perform administrative actions on VMs using Prism Central.
Updating a VM
To update a VM's configuration, select it from the VMs dashboard using the checkbox next to
the VM's name, click Actions, and then select Update.
The Update VM page should be instantly familiar, because it is identical to the Create VM
page. Again, you can provide information in four major sections: Configuration, Resources,
Management, and Review. However, it is important to note that not all of the information that
you provide when creating a VM can be changed when you update it.
For example, as the following figure shows, when you update a VM, on the Configuration page,
you can change the name, description, CPU, cores per CPU, and memory. However, the cluster
on which the VM was created cannot be changed here. If you want to move the VM between
clusters, you will need to use a separate tool like Nutanix Move (which is covered later in this
course). If you want to move the VM between nodes in the same cluster, you need to use the
Migrate option from the Actions menu instead.
• The cluster on which the VM resides. You need to migrate the VM to make this change.
• The boot configuration. While you can enable and disable Secure Boot when updating a VM,
the choice of BIOS or UEFI cannot be changed.
Note: For more information and step-by-step instructions, see the Managing a VM
(AHV) section of the Prism Central Guide on the Nutanix Support Portal.
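The update restrictions described above (the cluster and the BIOS or UEFI choice are fixed, while Secure Boot and sizing can change) can be sketched as a simple validation step in Python. The field names and the dictionary representation of a VM are assumptions made for this example and do not correspond to a Prism API.

```python
# Fields that, according to the text, cannot be changed from the Update VM page.
IMMUTABLE_ON_UPDATE = ("cluster", "boot_mode")


def blocked_fields(vm, changes):
    """Return the requested changes that an update is not allowed to make."""
    return [
        name for name in IMMUTABLE_ON_UPDATE
        if name in changes and changes[name] != vm[name]
    ]


def apply_update(vm, changes):
    """Return an updated copy of `vm`, or raise if an immutable field would change."""
    blocked = blocked_fields(vm, changes)
    if blocked:
        raise ValueError("cannot change on update: " + ", ".join(blocked))
    return {**vm, **changes}


vm = {"name": "app-1", "cluster": "CDev9", "boot_mode": "UEFI",
      "secure_boot": False, "vcpus": 1, "memory_gb": 8}
# Memory and Secure Boot may change; the cluster and boot mode stay as they were.
updated = apply_update(vm, {"memory_gb": 16, "secure_boot": True})
```

Moving a VM to another cluster or host is a separate operation, so this sketch rejects a cluster change instead of attempting it.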
The Clone VM dialog box contains most of the same fields as the Create VMs page, and inherits
most of its configuration information from the source VM. However, it is still possible for you to
make configuration changes before you clone your VM without affecting the source VM.
• General Configuration, which allows you to specify the number of clones you want to create
and the name of the clones.
• Compute Details, which includes vCPUs, cores per vCPU, and memory.
• Network Adapters, which allows you to add a new NIC or remove an existing one.
However, as the following figure shows, when you are cloning a VM, you cannot change the
boot configuration, and you cannot add or remove disks or volume groups.
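As a rough model of the cloning behavior described above, the following Python sketch copies a source VM, applies only the overrides that the Clone VM dialog box allows, and leaves the source untouched. The field names, the numeric name suffix, and the dictionary representation are assumptions made for illustration.

```python
import copy

# Settings the text says can be changed while cloning (besides the clone name and count).
CLONE_OVERRIDABLE = {"vcpus", "cores_per_vcpu", "memory_gb", "nics"}


def blocked_overrides(overrides):
    """Return requested overrides that cloning does not allow, such as boot mode or disks."""
    return sorted(set(overrides) - CLONE_OVERRIDABLE)


def clone_vm(source, name_prefix, count, overrides=None):
    """Return `count` independent copies of `source` with the permitted overrides applied."""
    overrides = overrides or {}
    blocked = blocked_overrides(overrides)
    if blocked:
        raise ValueError("cannot be changed while cloning: " + ", ".join(blocked))
    clones = []
    for index in range(1, count + 1):
        clone = copy.deepcopy(source)           # the source VM is never modified
        clone.update(copy.deepcopy(overrides))  # each clone gets its own copy of the overrides
        clone["name"] = f"{name_prefix}-{index}"
        clones.append(clone)
    return clones
```

Deep copies are used so that editing one clone's NIC list later does not silently change the source VM or the other clones.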
Customizing a VM
When cloning a VM, you can use Sysprep or cloud-init to perform guest customizations and
to facilitate unattended provisioning. For unattended provisioning, you can specify an answer
file for Sysprep and a user data file for cloud-init. To enable Sysprep or cloud-init to access the
script, AOS creates a temporary ISO image that includes the script and attaches the ISO image
to the VM when you power on the VM.
You can also specify source paths to the files or directories that you want to copy to the VM,
and the target directories for those files. This is particularly useful if you need to copy software
that is needed at start time, such as software libraries and device drivers. For Linux VMs, AOS
can copy files to the VM. For Windows VMs, AOS can copy files to the ISO image that it creates
for the answer file.
To enable these customizations when cloning a VM, select the Custom Script checkbox in the
Clone VM dialog box and provide information as required.
You can provide custom scripts by providing a file location, uploading the file directly to Prism,
or by pasting the script in the space provided. If you have files that need to be copied to the
temporary ISO image, upload the files to a storage container on the cluster.
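For Linux VMs, the script you paste or upload is cloud-init user data. The following Python sketch assembles a small cloud-config document of that kind. The "#cloud-config" header and the hostname, packages, and runcmd keys are standard cloud-init, but the function name and all values are made up for this example, so treat it as a starting point rather than a recommended configuration.

```python
import json


def build_user_data(hostname, packages=(), run_commands=()):
    """Assemble a minimal cloud-config document as text (illustrative values only)."""
    lines = ["#cloud-config", f"hostname: {hostname}"]
    if packages:
        lines.append("packages:")
        lines.extend(f"  - {package}" for package in packages)
    if run_commands:
        lines.append("runcmd:")
        # json.dumps yields a double-quoted string, which is also a valid YAML scalar.
        lines.extend(f"  - {json.dumps(command)}" for command in run_commands)
    return "\n".join(lines) + "\n"


user_data = build_user_data(
    "web-clone-1",
    packages=["nginx"],
    run_commands=["systemctl enable --now nginx"],
)
```

The text in user_data is what you would paste into the space provided for the custom script when you select the Custom Script checkbox.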
When live migration occurs automatically, it can be due to the Acropolis Dynamic Scheduler
(ADS) monitoring a cluster for hotspots and contentions and making adjustments as needed,
or due to a node being placed in maintenance mode. To perform live migration manually, an
admin needs to navigate to the VM dashboard and use the Migrate option on a VM.
Automatic VM Migration with Acropolis Dynamic Scheduler
Acropolis Dynamic Scheduling (ADS) proactively monitors your cluster for any compute
and storage I/O contentions or hotspots over a period of time. If ADS detects a problem, it
eliminates hotspots within a cluster by migrating VMs from one node to another.
ADS runs every 15 minutes and analyzes resource usage for that period of time. If the resource
utilization of an AHV node remains above 85% for 15 minutes, migration tasks are triggered to
remove the hotspot. For a storage hotspot, ADS looks at the last 40 minutes of data and uses a
smoothing algorithm to use the most recent data. For a CPU hotspot, ADS looks at the average
CPU usage over the last 10 minutes. ADS does not monitor memory and network usage.
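The CPU part of this check can be approximated in Python as follows. This is a simplified reading of the description above, not the actual ADS algorithm. It assumes one utilization sample per minute, applies the 85% threshold to the 10-minute average, and ignores the storage smoothing logic entirely.

```python
CPU_THRESHOLD_PCT = 85.0   # threshold quoted in the text
CPU_WINDOW_MINUTES = 10    # averaging window quoted in the text for CPU hotspots


def cpu_hotspot(samples_pct):
    """Return True if the average of the most recent window exceeds the threshold.

    `samples_pct` holds one CPU utilization percentage per minute, oldest first.
    The per-minute sampling interval is an assumption of this sketch.
    """
    window = samples_pct[-CPU_WINDOW_MINUTES:]
    if len(window) < CPU_WINDOW_MINUTES:
        return False  # not enough history to judge
    return sum(window) / len(window) > CPU_THRESHOLD_PCT


def hot_nodes(cpu_by_node):
    """Names of nodes whose recent CPU usage qualifies as a hotspot, sorted by name."""
    return sorted(node for node, samples in cpu_by_node.items() if cpu_hotspot(samples))
```

In a fuller model, each node returned by hot_nodes would become a candidate source for VM migrations toward less loaded nodes.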
A list of all ADS migrations is available in Prism Central on the Audits dashboard. To access it,
click the Entities menu, expand the Activity category, and select Audits. Then, click the Filter
button at the top right and from the Operation Type category, select Migrate.
Note: For more information, including instructions on how to enable and disable
ADS via the CLI, see the Acropolis Dynamic Scheduling in AHV section of the AHV
Administration Guide on the Nutanix Support Portal.
Another scenario in which automated VM live migration can occur is if a node is put into
maintenance mode. When a node is put in maintenance mode, it is marked as unschedulable so
that no new VMs can be created on it. If VMs are already running on that node, an attempt is
made to evacuate them.
The evacuation process simply involves moving all eligible VMs to different nodes in the cluster.
Upon exiting maintenance mode, all VMs are automatically returned to the original node, so no
manual intervention is needed.
Some exceptions to this automated movement are agent VMs and VMs with GPUs, CPU
passthrough, PCI passthrough, or host affinity policies. GPU-enabled VMs can be live
migrated, but this must be done manually by an administrator. Agent VMs are always shut
down if a node is put into maintenance mode, and are powered on once maintenance mode is
exited.
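The maintenance mode rules above can be summarized as a sorting exercise. The Python sketch below places each VM on a node into one of three buckets. The flag names are hypothetical, and the "needs_admin" bucket is an assumption: the text only says that these VMs are exceptions to automated movement, and that GPU-enabled VMs must be migrated manually.

```python
def plan_evacuation(vms):
    """Group the VMs on a node by what happens when it enters maintenance mode."""
    plan = {"migrate": [], "shut_down": [], "needs_admin": []}
    for vm in vms:
        if vm.get("agent"):
            # Agent VMs are shut down, then powered on again after maintenance mode is exited.
            plan["shut_down"].append(vm["name"])
        elif vm.get("gpu") or vm.get("passthrough") or vm.get("host_affinity"):
            # Listed in the text as exceptions to automated movement.
            plan["needs_admin"].append(vm["name"])
        else:
            plan["migrate"].append(vm["name"])
    return plan
```

Reviewing a plan like this before you start maintenance shows in advance which VMs will move on their own and which ones you need to handle yourself.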
Manual VM Migration
An administrator can manually migrate a VM from one host to another from the VMs dashboard
in Prism Central. To do this, navigate to the VMs dashboard, select your VM, click the Actions
drop-down menu and select Migrate. As the following figure shows, since this is a live migration,
the VM can be powered on when the migration process is initiated.
In the Migrate VM dialog box, you can choose to either migrate within or outside the cluster. If
you choose to migrate within the cluster, you can allow a node to be selected automatically, or
manually select a node as shown in the following figure.
Note: For step-by-step instructions, see the Managing a VM (AHV) section of the
Prism Central Guide on the Nutanix Support Portal.
• The VM-host affinity policy controls the placement of a VM. It is used to specify which nodes
a VM is allowed to run on. Every time you power on or migrate a VM to which this policy is
applied, the policy is checked and enforced.
• The VM-VM anti-affinity policy is meant to keep specified virtual machines apart. That is,
the VMs specified in this type of policy will not be allowed to run on the same node. As a
result, if a problem occurs with a node, causing one of the VMs in the anti-affinity policy
to go down, the other VMs specified in the policy will not be affected. However, this is a
preferential policy and can be ignored by ADS if resource constraints require VMs to be
temporarily placed on the same node.
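Both policy types can be expressed as checks against a mapping of VMs to nodes. The Python sketch below reports where a given placement conflicts with the policies. The data shapes and the function name are assumptions, and the sketch only detects conflicts. It does not model ADS temporarily relaxing an anti-affinity policy.

```python
def placement_violations(placement, host_affinity, anti_affinity_groups):
    """Return a list of policy violations for the current VM placement.

    placement: VM name -> node name
    host_affinity: VM name -> set of nodes the VM is allowed to run on
    anti_affinity_groups: list of sets of VM names that should run on different nodes
    """
    violations = []
    for vm, allowed_nodes in host_affinity.items():
        node = placement.get(vm)
        if node is not None and node not in allowed_nodes:
            violations.append(("vm-host", vm, node))
    for group in anti_affinity_groups:
        vms_by_node = {}
        for vm in sorted(group):
            node = placement.get(vm)
            if node is not None:
                vms_by_node.setdefault(node, []).append(vm)
        for node, members in sorted(vms_by_node.items()):
            if len(members) > 1:
                violations.append(("vm-vm", tuple(members), node))
    return violations
```

A VM-host result means a VM is running on a node it is not allowed to use. A VM-VM result means that two or more VMs that should be kept apart currently share a node.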
Prism Central can be used to define VM-host affinity policies using categories, which allows you
to easily manage affinities for larger numbers of VMs. However, VM-VM anti-affinity policies
cannot be defined in Prism Central.
Prism Element can be used to create both VM-host affinity policies and VM-VM anti-affinity
policies. However, they can only be applied to one VM at a time, either when the VM is being
created or when it is being updated.
VM-host affinity policies in Prism Central are accessed from the VMs dashboard. On the VMs
dashboard, at the top of the screen, click Policies and select Affinity Policies.
Affinity Policies have a dedicated page in Prism Central, from which you can view a list of
existing policies, the number of VMs and hosts to which they apply, their status (compliant
or non-compliant), and when and by whom the policy was last modified. You can also create,
update, delete, and re-enforce policies from this page.
Clicking the name of a policy will display a details page with two tabs. On the Summary tab,
you can view details about the policy; the VMs, hosts, and categories that the policy has been
associated with; and a simple chart that displays the number of VMs that are compliant or non-
compliant with the policy. You can also update, delete, and re-enforce an affinity policy from
this tab.
Clicking the Entities tab displays a table with two views, VMs and Hosts, which list the
entities that the selected affinity policy applies to. The VMs view displays the name of each
VM to which the policy applies, the host that the VM resides on, the cluster to which that host
belongs, the categories used to create the policy, and the compliance status of the VM. The VMs
view also includes a filter button that you can use to filter VMs by their compliance status.
The Hosts view displays only the name of each host, and the cluster and categories to which the
host belongs.
Prism Central's affinity policies are category based, which means that you need to have either
defined and used custom categories or applied the default categories before your policy can
take effect. Even if a category does not have VMs or hosts associated with it, you can still
create a VM-host affinity policy, and assign categories later. Once you assign categories to the
correct entities, you can re-enforce the policy to move VMs onto their designated hosts.
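As a rough illustration of how a category-based policy resolves to concrete VMs and hosts, the sketch below computes which VMs are non-compliant. The data layout, category strings, and VM and host names are all hypothetical; this does not reflect how Prism Central stores or evaluates policies.

```python
def non_compliant_vms(vm_categories, host_categories, policy, vm_host):
    """Return VMs covered by the policy that run on a host outside it."""
    policy_hosts = {h for h, cats in host_categories.items()
                    if cats & policy["host_categories"]}
    policy_vms = {v for v, cats in vm_categories.items()
                  if cats & policy["vm_categories"]}
    return sorted(v for v in policy_vms if vm_host[v] not in policy_hosts)


# Hypothetical example: Education VMs should run only on the Education host.
policy = {"vm_categories": {"Dept:Education"},
          "host_categories": {"Dept:Education"}}
vms = {"edu-vm1": {"Dept:Education"},
       "edu-vm2": {"Dept:Education"},
       "hr-vm": {"Dept:HR"}}
hosts = {"host-1": {"Dept:Education"}, "host-2": set()}
placement = {"edu-vm1": "host-1", "edu-vm2": "host-2", "hr-vm": "host-2"}

violations = non_compliant_vms(vms, hosts, policy, placement)
```

Here `edu-vm2` is the only VM that re-enforcing the policy would need to move, because `hr-vm` is not covered by the policy at all.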
To create a new VM-host affinity policy, on the Affinity Policies dashboard, click the Create
button. On the Create Affinity Policy page, you can name your policy, add a description, and
select VM and host categories to determine on which hosts your selected VMs should run. After
you make your selections, click Create.
After you select your host categories, the page will automatically display the number of hosts
associated with that category. As you can see in our example, we have a single host that is
used to run Education VMs. The policy in the following figure has been defined to ensure that
Education VMs run only on the designated Education host.
When you create an affinity policy, it is recommended to always include a category with
multiple hosts, or multiple categories that include multiple hosts. If you create an affinity policy
with just one host, the VMs that are part of that policy will not be moved to and powered on
another host if the original host fails.
If you include just one host in a policy, Prism Central will offer a warning before allowing you to
complete the creation process.
Note: For step-by-step instructions, see the Affinity Policies Defined in Prism
Central section of the AHV Administration Guide on the Nutanix Support Portal.
Prism Element allows you to define both affinity and anti-affinity policies. Affinity policies can
be defined when a VM is created or updated in Prism Element. Anti-affinity policies are created
and defined via the command line.
When you create or update a VM, Prism Element allows you to 'pin' the VM to one or more
hosts. This simply means defining the hosts on which a VM is allowed to run, with the
recommendation being to pin a VM to multiple hosts to account for potential failures.
To set VM-host affinity, either create a new VM or select a VM from the VM dashboard and click
Update. In the dialog box, scroll down to the VM Host Affinity section. In our example, we are
updating a VM that had no affinity policy defined when it was created.
Clicking the Set Affinity button will display a dialog box with all hosts that are part of the
cluster. We can then select one or more hosts to pin the VM to, which will allow the VM to be
restarted on another host if the host it is currently on should fail.
VM-VM anti-affinity policies can only be defined using the command line. To do this, you must
first SSH into the CVM and create a group. After you create the group, you need to add VMs to
it and then apply anti-affinity to the group.
After you power on the VMs, all VMs that are included in the group will be started on different
hosts if there are enough resources available on different hosts to do so. However, since VM-
VM anti-affinity is a preferential policy, ADS will move VMs as needed to resolve resource
constraints, even if those movements are in violation of the anti-affinity policy.
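The three steps above (create a group, add VMs, apply anti-affinity) can be sketched as a helper that only builds the command strings to run on a CVM. The `acli vm_group.*` command names are given here as an assumption based on the AHV Administration Guide and should be verified there before use; the group and VM names are hypothetical.

```python
def anti_affinity_commands(group, vms):
    """Build the aCLI commands for a VM-VM anti-affinity group (not executed)."""
    return [
        f"acli vm_group.create {group}",                          # 1. create the group
        f"acli vm_group.add_vms {group} vm_list={','.join(vms)}",  # 2. add VMs to it
        f"acli vm_group.antiaffinity_set {group}",                 # 3. apply anti-affinity
    ]


for command in anti_affinity_commands("web-tier", ["web1", "web2"]):
    print(command)
```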
Note: For step-by-step instructions, see the Affinity Policies Defined in Prism
Element section of the AHV Administration Guide on the Nutanix Support Portal.
Storage policies can be created in Prism Central, are supported only on AHV, and allow you to
define and manage encryption, compression, and quality of service (QoS) attributes for specific
VMs. Storage policies can be accessed from the dedicated Storage Policies dashboard in Prism
Central. To access it, click the Entities menu, expand the Compute & Storage category, and
select Storage Policies.
If you have no existing storage policies and have not accessed this dashboard before, Prism
Central will display a brief tutorial about the feature, prompt you to create a new storage
policy, and present a link to help topics that will get you started with the feature. To view
these help topics, click the storage policy model link. To create a policy, click the Create
Storage Policy button.
There are several considerations involved in creating storage policies. For example:
• Every storage policy must have a unique name.
• When you create a storage policy, it must have at least one non-default value. That is, at
least one setting (encryption, compression, or QoS) must have a value that is not inherited
from the cluster.
• Storage policies are supported only on AHV. They cannot be applied to entities running on
ESXi or Hyper-V.
• Categories are required for storage policies to be implemented. A storage policy can only
be associated with a category, and cannot be directly associated with a VM.
• Multiple categories can be associated with a single storage policy. However, a single
category cannot be associated with multiple storage policies.
• After encryption is enabled using a storage policy, it cannot be disabled. However, if the
storage policy is deleted, any new data that is written to entities will not be encrypted. Old
data, which was written while the policy was still in effect, will remain encrypted.
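The category rules in this list (many categories per policy, but only one policy per category) can be sketched as a simple lookup table. This is an illustrative model only; the function, policy names, and category strings are hypothetical.

```python
def associate(policy_by_category, policy, categories):
    """Attach categories to a storage policy, allowing one policy per category."""
    clashes = {c: policy_by_category[c] for c in categories
               if policy_by_category.get(c, policy) != policy}
    if clashes:
        raise ValueError(f"already associated with another policy: {clashes}")
    for c in categories:
        policy_by_category[c] = policy
    return policy_by_category


# One policy may cover several categories...
mapping = associate({}, "gold", ["Env:Prod", "App:DB"])
# ...but associate(mapping, "silver", ["Env:Prod"]) would raise ValueError.
```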
Note: For a full list of all requirements, limitations, and considerations, see the
Storage Policy Management section of the Prism Central Guide on the Nutanix
Support Portal.
On the Configuration page, you must enter a unique name and configure encryption,
compression, and QoS. By default, encryption and compression will be set to Inherit from
Cluster. You can choose to either enable encryption or leave the default cluster setting. For
compression, you can choose to inherit the cluster setting or turn compression on or off. If
you choose to enable compression, you can also choose between inline and post-process
compression.
The QoS fields are blank by default, and the quality metric and throttled value have to be
specified. The options for quality metric are IOPS and Throughput. Throttled value requires a
numeric entry and, if IOPS is selected, that value must be greater than 99 or equal to -1.
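The configuration rules described above can be summarized as a small validation sketch. It is a model of the stated rules, not Prism Central's own validation; the function, its parameter names, and the `"inherit"` placeholder are assumptions, and the unique-name check is omitted because it needs the list of existing policies.

```python
INHERIT = "inherit"  # stands in for the "Inherit from Cluster" setting


def validate_storage_policy(name, encryption=INHERIT, compression=INHERIT,
                            qos_metric=None, throttled_value=None):
    """Return a list of problems; an empty list means the settings pass."""
    problems = []
    if not name:
        problems.append("name is required")
    if encryption == INHERIT and compression == INHERIT and qos_metric is None:
        problems.append("at least one setting must differ from the cluster default")
    if qos_metric is not None:
        if qos_metric not in ("IOPS", "Throughput"):
            problems.append("quality metric must be IOPS or Throughput")
        elif throttled_value is None:
            problems.append("throttled value is required")
        elif qos_metric == "IOPS" and not (throttled_value > 99 or throttled_value == -1):
            problems.append("IOPS throttled value must be greater than 99 or equal to -1")
    return problems
```

For example, a policy that only sets an IOPS throttle of 50 would be rejected, while a throttle of 100 or -1 would pass.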
After you click Next, you will see the Associations page. For a storage policy to take effect, it
must be associated with at least one category. You can use the field provided to search for a
category. When you select one, the category will display the number of VMs, hosts, and images
that are associated with it - one of each in our example, as shown in the following figure.
Although multiple entities are associated with the category, since storage policies apply to VMs
only, the host and image that are associated with our selected category will be unaffected.
After you add all of the categories that you want to be associated with your storage policy,
click Save.
Alternatively, you can click the name of the policy to view the details page and, on the
Summary tab, click Update.
Updating a storage policy is similar to creating one. You can change the storage policy's
configuration and change the categories with which the policy is associated. However, it is
important to note that when updating a storage policy, only the Name, Compression, and QoS
fields can be changed. If encryption was enabled, it cannot be disabled. However, if you chose
to inherit the encryption setting from the cluster, you can choose to enable it when updating
the storage policy.
In addition to updating a policy's configuration, you can also manage associations separately.
To do this, click either the Categories tab or the Entities tab on the Storage Policy details
page, and click the Manage Associations button. This will allow you to add categories to, or
remove categories from, the storage policy.
Note: For more information and step-by-step instructions, see the Creating or
Updating a Storage Policy and the Managing Associations sections of the Prism
Central Guide on the Nutanix Support Portal.
Using a simple interface with a collection of built-in actions, you can drag and drop actions in
a sequence and customize them to minimize or eliminate the amount of manual intervention
needed for simple, routine, administrative tasks.
In this section, in order to understand how you can automate some VM management tasks, we
will first look at a playbook that is available out of the box in Prism, and then create a custom
playbook of our own.
Then, from the list of available alerts, we can search for the one that corresponds to a VM's
memory being constrained. After we select the alert, we can refine it further. If the alert has
multiple severity ratings (info, warning, or critical) we can specify the severity at which the
playbook will take action. Since this specific alert only has one severity, we will leave the default
selected. We can also specify which VMs the playbook can act on: all VMs with constrained
memory, VMs in specific categories, or specific VMs.
With our trigger defined, now we need to list the actions that the playbook will take. To do this,
click the Add Action link and select the appropriate action from the list that is available. Our
first action is to acknowledge the alert.
Next, we want to send a Slack message to a channel that has been set up specifically to track
all automated playbook executions. For this, we can add a Slack action, provide a token, a
channel name, and a description of the message that will be sent. You can click the Parameters
link to add field values to your message, for the trigger, time, description, source entity, and so
on.
Next, before we make any changes to the VM itself, we want to make sure we have a snapshot
of our VM's current state to ensure that there is no data loss if an issue inadvertently occurs. To
do this, we will add a VM Recovery Point action, with a time to live of 15 days.
Now, we want to add memory to our VM. As we discussed earlier in this module, memory
can be hot-plugged into a VM. This means that, unlike the system-defined alert that we saw
earlier, there is no need for us to power off the VM in order to add memory to it. However, it is
important to note that not all operating systems support hot-plug. In addition, the memory of
ESXi VMs cannot be increased to more than 3GB because of a vSphere restriction. More information
on both of these topics is available in the AHV Administration Guide. This information will be
presented on-screen when we add the VM Add Memory action to the playbook.
Here, we specify that the playbook is to add 1GB and the maximum amount of memory allowed
on the VM (since we are running an AHV cluster) is 8GB.
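The effect of these two values can be sketched as follows. How the real VM Add Memory action behaves once the VM reaches the maximum is an assumption here (the sketch simply stops adding); the function name is hypothetical.

```python
def memory_after_add(current_gb, add_gb=1, max_gb=8):
    """Return the VM's memory after one playbook run, capped at max_gb."""
    if current_gb >= max_gb:
        return current_gb  # already at the ceiling, so nothing is added
    return min(current_gb + add_gb, max_gb)
```

With the values from our example, a 4GB VM grows to 5GB on the first run, and repeated runs never take it past 8GB.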
Next, we want to send another Slack message to the same channel as before, indicating that the
playbook executed successfully. We also want to specify what the VM's memory is after the
increase, and we want to indicate that the alert generated originally (which acted as the trigger
for this playbook) will be resolved. So, we will add another Slack action to our playbook.
Finally, we want to resolve the alert that we originally defined as a trigger for this playbook. To
do this, we will add a Resolve Alert action to our playbook.
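The playbook we have assembled is, conceptually, a trigger followed by an ordered list of actions. The sketch below models that sequence in plain Python; it does not call any Prism API, the Slack steps are placeholders that send nothing, and all names are hypothetical.

```python
def run_playbook(alert, actions):
    """Run actions in order with a shared context; an exception aborts the rest."""
    context = {"alert": alert, "log": []}
    for name, action in actions:
        action(context)
        context["log"].append(name)
    return context


actions = [
    ("acknowledge_alert", lambda ctx: ctx["alert"].update(acknowledged=True)),
    ("slack_start", lambda ctx: None),  # placeholder: no message is sent
    ("vm_recovery_point", lambda ctx: ctx.update(recovery_point_ttl_days=15)),
    ("vm_add_memory",
     lambda ctx: ctx.update(memory_gb=min(ctx["alert"]["memory_gb"] + 1, 8))),
    ("slack_done", lambda ctx: None),   # placeholder: no message is sent
    ("resolve_alert", lambda ctx: ctx["alert"].update(resolved=True)),
]

result = run_playbook({"vm": "example-vm", "memory_gb": 4}, actions)
```

The ordering matters: the recovery point is taken before the memory change, and the alert is resolved only after every earlier action has completed.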
Now that our playbook has been completely defined, clicking Save & Close will make the
playbook available for use. When we save the playbook, we can name it, add a description, and
choose to enable it or disable it. Before saving the playbook, you can also configure autopilot.
This will allow you to define how frequently the playbook attempts to run automatically in order
to resolve memory constraints before administrator intervention is needed.
To ensure that the playbook runs, simply enable it. Then, whenever an alert that matches the
playbook's trigger is generated, the playbook will run automatically and attempt to resolve the
issue.
Note: For more information and step-by-step instructions, see the Task Automation
section of the Prism Central Guide on the Nutanix Support Portal.
In the Export as OVA dialog box, you need to name the OVA and select the format in which
you want the VM's disks to be made available. The options are VMDK and QCOW2. After you
make your selections, click Export to save your VM as an OVA.
Prism Central has a separate OVA dashboard from which you can view, manage, deploy, and
export all of your OVA files. To access it, click the Entities menu, expand the Compute &
Storage category, and select OVAs. Here, you will see the new OVA that we just created (HMN-
OVA), other OVAs that were created previously, the source VMs from which they were created,
the disk formats that have been used, and when these OVAs were created and by whom.
Selecting an OVA and clicking the Actions menu will allow you to deploy the OVA as a
VM, download the OVA file so that you can share it with another user for deployment, rename
the OVA, or delete it.
Note: For more information and step-by-step instructions, see the OVA
Management section of the Prism Central Guide on the Nutanix Support Portal.
VM HA also respects affinity and anti-affinity rules. For example, consider a cluster with four
hosts, A, B, C, and D, and a VM that has been pinned to hosts A and B using an affinity rule. If
host A goes down, VM HA will attempt to restart the VM on host B. However, if both hosts A
and B go down, the VM will not be restarted on C or D since they have not been included in the
affinity rule.
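The four-host example above can be expressed as a short sketch. It ignores resource availability and is not the VM HA implementation; the function name is hypothetical.

```python
def ha_restart_host(affinity_hosts, failed_hosts, all_hosts):
    """Return a host for VM HA to restart the VM on, or None if none qualifies."""
    # An empty affinity set means the VM is not pinned and may run anywhere.
    allowed = affinity_hosts if affinity_hosts else all_hosts
    for host in all_hosts:
        if host in allowed and host not in failed_hosts:
            return host
    return None


hosts = ["A", "B", "C", "D"]
pinned = {"A", "B"}
after_a_fails = ha_restart_host(pinned, {"A"}, hosts)         # "B"
after_ab_fail = ha_restart_host(pinned, {"A", "B"}, hosts)    # None
```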
Two VM HA modes are available, and can be enabled either in Prism or by using the command
line. They are:
• Default. This mode requires no configuration and is included by default when installing an
AHV-based Nutanix cluster. When an AHV host becomes unavailable, the VMs that were
running on that AHV host restart on the remaining hosts if the necessary resources are
available. If the remaining hosts do not have sufficient resources, some of the failed VMs may
not restart.
• Guarantee. This configuration reserves space throughout the AHV hosts in the cluster to
guarantee that all VMs can restart on other hosts in the cluster when an AHV host becomes
unavailable. To enable Guarantee mode, select the Enable HA Reservation check box, as shown
in the following figure. A message then displays the amount of memory reserved and how many
AHV host failures the system can tolerate.
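To see why Default mode may leave some VMs powered off, consider the following simplified model of a best-effort restart. It considers memory only, and the order in which VMs are placed is an assumption; it does not describe how AHV actually schedules restarts or sizes the Guarantee-mode reservation.

```python
def restart_best_effort(failed_vms, free_memory_gb):
    """Default-mode sketch: restart VMs while surviving hosts have memory left.

    failed_vms:     {vm_name: memory_gb} for the VMs on the failed host
    free_memory_gb: {host_name: free_gb} on the surviving hosts
    """
    free = dict(free_memory_gb)
    restarted, stranded = {}, []
    # Place the largest VMs first (an assumption made for this sketch).
    for vm, need in sorted(failed_vms.items(), key=lambda kv: -kv[1]):
        host = next((h for h, gb in free.items() if gb >= need), None)
        if host is None:
            stranded.append(vm)
        else:
            free[host] -= need
            restarted[vm] = host
    return restarted, stranded


restarted, stranded = restart_best_effort(
    {"vm1": 8, "vm2": 4, "vm3": 4}, {"B": 8, "C": 4})
```

In this example `vm3` cannot restart because no surviving host has 4GB free. Guarantee mode avoids this outcome by reserving capacity ahead of time.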
To enable VM HA, navigate to the Settings menu in Prism Element and scroll down to the Data
Resiliency section. Click Manage VM High Availability and, on the page that is displayed, select
the Enable HA Reservation checkbox.
Upon selection, before you save your changes, the dialog box will display how much memory
will be reserved to account for failures. This value can change and will either be increased or
decreased based on changes in cluster resource utilization.
Note: For more information, see the Virtual Machine High Availability Tech Note in
the Solutions Documentation section of the Nutanix Support Portal.
Module 6 Summary
In this module, you learned:
• About Acropolis Dynamic Scheduler and how it makes VM placement decisions to resolve
resource contentions.
• What affinity and anti-affinity policies are and how to create these policies.
• What storage policies are and how to create and manage them.
• How to automate common VM management tasks using Playbooks.
• How to export a VM as an OVA file so that it can be shared with users and used as a
template file.
• What VM HA is and how to enable it.
Module 7
PROTECTING VIRTUAL MACHINES AND THEIR DATA
Module 7 Overview
In this module, you will learn about the options that are available in Prism Element and Prism
Central to protect virtual machines and their data.
• Per-VM Backup: This involves designating specific VMs for backup to a different site. It is
especially useful for remote office and branch office (ROBO) environments, since only a
subset of VMs will typically need to be backed up regularly to a central site.
• Selective, Bi-directional Replication: It is not enough to simply replicate VMs and data to a
secondary, passive, backup site which can be activated in the event of a disaster. A flexible
replication solution requires that replication be bi-directional, from the primary site to the
secondary site, and back to the primary site when required.
• Synchronous Datastore Replication: Datastores can be spanned across two sites to provide
seamless protection in the event of a site disaster.
With these three strategies, Nutanix supports a range of topologies to meet different
requirements. Four of these topologies are one-to-many, many-to-one, many-to-many, and
two-way mirroring.
One-to-many
In this scenario, there is one central site with multiple remote locations - tier 1 workloads run at
site 1, while sites 2 and 3 serve as backup locations. The workloads on site 1 can be replicated
to sites 2 and 3. In the event of a disaster, workloads can be restarted on either of the remote
sites, providing a high level of VM availability.
A one-to-many topology can also be designed to optimize bandwidth. For example, if sites 1
and 2 are in the US, but site 3 is in Asia, workloads can be distributed so that larger VMs are
replicated between sites 1 and 2, while smaller VMs are replicated to site 3.
Many-to-one
Many-to-one is also called a hub-and-spoke architecture, in which workloads running on
multiple sites are replicated to a single, central site. Centralizing replication to a single site may
improve operational efficiency for geographically dispersed environments. ROBO is a classic
example of this topology.
Many-to-many
In this topology, there is neither a single primary site nor a single backup site. Instead, multiple
sites exist and all serve as replication targets for each other. This topology allows for the most
flexibility, and gives IT departments the maximum amount of control to ensure application and
service level continuity.
Two-way mirroring
In a two-way mirroring topology, there are two sites, each of which serves as a replication
target for the other. There are active workloads running on both sites simultaneously, and there
are no idle resources in either location. Utilizing storage, compute and networking resources at
both locations has a significant advantage over traditional data protection strategies in which
servers sit idle in anticipation of a future data disaster event.
Consistency Group
A consistency group is a subset of the entities in a protection domain. Consistency groups
become operational when you create a protection domain. The system captures snapshots for all
the entities within a consistency group for that protection domain in a crash-consistent manner,
generating a snapshot for each VM in the consistency group at the same time.
Primary Site
The primary site is the local cluster or clusters that host the entities that require protection
and whose snapshots must be replicated to another cluster or a remote site.
Protection Domain (Async DR)
A standard (Async DR) protection domain is a defined group of entities (virtual machines,
volume groups, or storage containers) that must be protected. These entities are backed up
locally on a cluster and optionally replicated to one or more remote sites.
Protection Domain (Metro Availability)
A metro availability protection domain consists of a specified (active) storage container in the
local cluster. This active storage container is linked to a (standby) container on a remote site.
Recovery Point
A recovery point is a copy of the state of a system at a particular point in time. A recovery
point refers to a snapshot.
Recovery Plan
A configurable policy that orchestrates the recovery of VMs at the recovery site.
Recovery Point Objective (RPO)
A time interval that refers to the acceptable data loss if a failure occurs. For example, if the
RPO is 1 hour, the system creates a recovery point every hour. In the event of a recovery, you
can recover VMs with data as of up to 1 hour ago.
Recovery Time Objective (RTO)
The time period from a failure event to restored service. For example, an RTO of 30 minutes
means that VMs should be back up and running within 30 minutes of a failure event.
Remote Site
A remote site is a separate cluster used as a target location to replicate and store data
asynchronously, or a standby storage container for metro availability. A remote site may also
be another cluster in the same physical site.
Replication
Replication is the process of asynchronously copying snapshots from one cluster to one or
more remote sites. Nutanix supports several replication scenarios, as discussed in the previous
section.
Snapshot
A snapshot is a read-only copy of the state and data of a VM or volume group at a point in time.
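As a worked illustration of the RPO definition, the sketch below computes the data-loss window for a failure and checks it against an RPO. The function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point, failure_time):
    """Data written after the last recovery point is lost in a recovery."""
    return failure_time - last_recovery_point


def meets_rpo(last_recovery_point, failure_time, rpo):
    """True if the data-loss window does not exceed the RPO."""
    return data_loss_window(last_recovery_point, failure_time) <= rpo


last_point = datetime(2022, 11, 21, 9, 0)
failure = datetime(2022, 11, 21, 9, 45)
window = data_loss_window(last_point, failure)              # 45 minutes
within_rpo = meets_rpo(last_point, failure, timedelta(hours=1))
```

With hourly recovery points, a failure 45 minutes after the last one loses 45 minutes of data, which is within a 1-hour RPO.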
This workflow will help you determine which of Nutanix's DR solutions is appropriate for your
environment. In this module, we will discuss protection domain-based DR and on-prem to
on-prem backup using Leap.
By default, like most dashboards in Prism Element, the Data Protection dashboard has two
views: Overview and Table. The Overview tab is meant primarily for monitoring and displays
a series of widgets that summarize ongoing and pending replication, the last successful
replication, bandwidth used, and data protection-specific alerts and events.
The Table view provides more detailed information. It has two tabs, allowing you to view
protection domains and remote sites separately. Taking the Async DR tab as an example,
it allows you to view the protection domains that have been set up, their associated remote
sites, the number of entities that are protected by each protection domain, the time of the
next snapshot, the space utilized by snapshots, the bandwidth utilized, and any ongoing and
pending replication.
If you select a protection domain, details will be displayed in the second half of the page,
including a summary of the protection domain's configuration, and detailed information about
replication, protected entities, local and remote snapshots, metrics, alerts, and events.
Note: For more information, see the Data Protection Dashboard section of the Data
Protection and Recovery with Prism Element guide on the Nutanix Support Portal.
On the Entities tab, you need to select the entities that you want to protect using the
protection domain. All unprotected entities will be listed; if an entity (such as a VM or a volume
group) is already part of a protection domain, it will not be listed here. In our test cluster, as the
following figure shows, there are 11 unprotected entities.
Note that, by default, Auto protect related entities is selected. This means, for example, if we
choose to protect certain VMs, any volume groups associated with those VMs will automatically
be added to the protection domain. To illustrate this, let's select three VMs. Note that, at the
bottom of the dialog box, it shows that we have chosen to protect only 3 entities.
We can also choose to add our VMs to a consistency group. This simply means that all VMs that
are part of the consistency group will have crash-consistent snapshots taken.
A snapshot is crash-consistent if it captures all of the data components (write order consistent)
at the instant of the crash. VM snapshots are crash-consistent by default, which means that the
vDisks that the snapshot captures are consistent with a single point in time. Crash-consistent
snapshots are better suited to non-database operating systems and applications that may
not support quiescence (freezing) and un-quiescence (thawing), such as file servers, DHCP
servers, and print servers.
To actually add these entities to our protection domain, we need to click the Protect Selected
Entities (3) button at the bottom of the dialog box. However, as the following screenshot
shows, five items have been added. This is because one of our VMs (HMN Test 1) has two
volume groups (VG1 and VG2) linked to it. This is illustrated by the link icon next to the VM and
volume group names.
Also note that our HMN Test 1 VM and the two volume groups are part of the same consistency
group, which means crash-consistent snapshots will be taken together for these three entities.
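The jump from three selected entities to five protected entities can be sketched as follows. This is an illustrative model of the Auto protect related entities option, not Prism Element's implementation; "VM 2" and "VM 3" are hypothetical names for the other two selected VMs.

```python
def entities_to_protect(selected_vms, attached_volume_groups, auto_protect=True):
    """Expand a VM selection with the volume groups attached to those VMs."""
    entities = list(selected_vms)
    if auto_protect:
        for vm in selected_vms:
            for vg in attached_volume_groups.get(vm, []):
                if vg not in entities:
                    entities.append(vg)
    return entities


selected = ["HMN Test 1", "VM 2", "VM 3"]
attached = {"HMN Test 1": ["VG1", "VG2"]}
protected = entities_to_protect(selected, attached)
```

The result contains the three VMs plus VG1 and VG2, which matches the five entities shown in the dialog box.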
After we click Next, we need to define our schedule. By default, no schedules are available
out of the box in Prism Element, as shown in the following figure. However, schedules can
be created as needed and are highly customizable. To create one, we need to click the New
Schedule button at the top right of the dialog box.
On the Schedule tab, we can configure a replication frequency (in minutes, hours, days, specific
days of the week, weekly, monthly, or on specific days of the month). You can also define a
retention policy, which will determine how many recent snapshots will be retained. If a remote
site has been configured, you can choose to replicate your snapshots to either the local site or
the remote site.
In this sample schedule, we have scheduled our backup to run every week on Wednesday, have
specified a start date and time, are storing snapshots locally since no remote site has been
defined, and are keeping two snapshots as part of our retention policy.
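A retention policy of two snapshots means that only the two most recent snapshots are kept. The sketch below models that rule with ISO date strings for three consecutive Wednesdays; the function name and dates are hypothetical.

```python
def apply_retention(snapshot_times, keep=2):
    """Return (kept, expired), retaining the `keep` most recent snapshots."""
    ordered = sorted(snapshot_times, reverse=True)
    return ordered[:keep], ordered[keep:]


kept, expired = apply_retention(["2022-11-02", "2022-11-09", "2022-11-16"])
```

After the third weekly snapshot is taken, the oldest one (2022-11-02) falls outside the retention policy.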
To complete the process, we need to click Create Schedule on the Schedule tab, and then click
Close. This will create your protection domain, which can be viewed on the Async DR tab of the
Data Protection dashboard's Table view.
Note that the Schedule tab of the Protection Domain configuration process allows you to use
various Nutanix capabilities such as NearSync and Async DR. These features are defined by
their RPO; that is, NearSync has an RPO of 1 to 15 minutes, while Async DR has an RPO of 60
minutes or more. To use these features, simply choose either minutes or hours respectively and
provide the corresponding information.
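The RPO ranges given above can be captured in a small lookup. The text does not say what applies to values between 16 and 59 minutes, so the sketch deliberately returns `None` for them rather than guessing; the function name is hypothetical.

```python
def replication_feature(rpo_minutes):
    """Map an RPO in minutes to the feature described for that range."""
    if 1 <= rpo_minutes <= 15:
        return "NearSync"
    if rpo_minutes >= 60:
        return "Async DR"
    return None  # the text does not describe values outside these ranges
```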
In the next sections, we will discuss specific Nutanix data protection and DR features in more
detail.
To configure a remote site, in Prism Element, navigate to the Data Protection dashboard
and click + Remote Site at the top right corner of the page. From the drop-down menu that
appears, choose Physical Cluster.
The Remote Site dialog box has two tabs: Details and Settings.
On the Details tab, first, you need to enter a name for the remote site. The maximum length for
a name is 75 characters. Upper- and lowercase Latin characters (a to z and A to Z) and decimal
digits (0 to 9) are allowed in the name. The only special characters allowed are dots, hyphens,
and underscores.
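These naming rules translate directly into a regular expression, which can be handy for checking names before entering them. The minimum length of one character is an assumption; the function name is hypothetical.

```python
import re

# Letters, digits, dots, hyphens, and underscores; 1 to 75 characters.
_REMOTE_SITE_NAME = re.compile(r"[A-Za-z0-9._-]{1,75}")


def is_valid_remote_site_name(name):
    """Check a candidate remote site name against the rules above."""
    return _REMOTE_SITE_NAME.fullmatch(name) is not None
```

For example, `dr-site_01.east` passes, while a name containing a space or one longer than 75 characters does not.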
Next, you can choose whether or not to enable proxy. Enabling this field allows the specified
IP addresses—remote CVM IP addresses or remote Virtual IP address—to be used as a proxy
to communicate with a Nutanix cluster on the remote site. In this case, the source cluster
communicates with only one of the remote proxy IP addresses (remote CVM IP addresses or
remote Virtual IP address), which forwards the requests to the appropriate remote CVMs. The
proxy setting on the remote site limits the replication traffic to the defined destination remote
site IP address (many to one) rather than to each individual CVM IP address in the destination
cluster (many to many). The many-to-one replication approach can be used to simplify firewall
rules configuration.
Then, you can designate the site as either a backup or a disaster recovery site. Backup allows
the remote site cluster to be used as a backup replication target. This means data can be
backed up to this remote site cluster and snapshots can be retrieved from the site to restore
locally, but failover protection (that is, running failover VMs directly from the remote site) is not
enabled. However, Disaster recovery allows the remote site to be used both as a backup target
and as a source for dynamic recovery, which means that failover VMs can be run directly from
the remote site.
Finally, you need to enter the remote cluster's virtual IP address and click the Add Site button.
On the Settings tab, several options are available for configuration, including bandwidth
throttling, network mapping, and vStore mapping.
Bandwidth throttling allows you to define a policy that enforces a limit for the network
bandwidth, based on utilization of your network. For example, you can define a policy that a
Nutanix cluster should replicate data from one site to another at less than 10 MBps between 9
a.m. and 5 p.m. on weekdays because there might be other critical traffic between the two sites
then. If you do not want to define a bandwidth policy, you can enter the maximum bandwidth
allowed (in MBps) that should be used in the Default Bandwidth Limit field instead.
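The example policy above can be sketched as a function that returns the limit in effect at a given time. This is an illustration of the logic only, not how the cluster enforces the policy; the function name and the default limit of 100 MBps are hypothetical.

```python
from datetime import datetime

BUSINESS_HOURS_LIMIT_MBPS = 10  # limit from the example policy above
DEFAULT_LIMIT_MBPS = 100        # hypothetical Default Bandwidth Limit value


def bandwidth_limit_mbps(when: datetime) -> int:
    """Return the replication bandwidth limit that applies at a given time."""
    is_weekday = when.weekday() < 5          # Monday is 0, Sunday is 6
    in_business_hours = 9 <= when.hour < 17  # 9 a.m. up to, but not including, 5 p.m.
    if is_weekday and in_business_hours:
        return BUSINESS_HOURS_LIMIT_MBPS
    return DEFAULT_LIMIT_MBPS
```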
You can also enable network compression by using the Compress On Wire toggle. However,
note that enabling this option will consume more CPU resources.
On the second half of the Settings tab, you need to configure network and vStore mapping.
All the VLANs that are available on the source and destination clusters are mapped in the
drop-down menu. In the following figure, we have two VLANs configured on the source and
destination sites respectively, Managed Network and vlan812. When we map them to each
other, in a failover scenario, a VM running on the Managed Network in the source cluster will
automatically be brought up on vlan812 in the destination cluster.
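Conceptually, a network mapping behaves like a lookup table from source network to destination network. The sketch below uses the two network names from the example; the function and variable names are hypothetical.

```python
# Source network name -> destination network name, as in the example above.
NETWORK_MAPPING = {"Managed Network": "vlan812"}


def recovery_network(source_network: str, mapping: dict[str, str]) -> str:
    """Return the destination network for a VM that fails over."""
    try:
        return mapping[source_network]
    except KeyError:
        raise LookupError(f"no mapping configured for {source_network!r}") from None
```

A network that has no entry in the table raises an error here, which reflects why all networks in use should be mapped before the remote site is saved.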
Similarly, each vStore entry is associated with a storage container on the source site and a
storage container on the destination site. In our example below, the default storage container
on our source cluster has been mapped to the default storage container on the destination
cluster.
If you have multiple networks and multiple storage containers, clicking the + icon will allow you
to add as many mappings as needed until all networks and storage containers on the source and
destination clusters have been mapped to each other.
After the mappings are complete, clicking Save will create the remote site.
Note: For step-by-step instructions, see the Remote Site Configuration section of
the Data Protection and Recovery with Prism Element guide on the Nutanix
Support Portal.
NearSync allows you to protect your data with an RPO of 1 to 15 minutes. NearSync is useful
for protecting mission-critical applications, securing data with minimal loss, providing granular
control during the restore process, and allowing for the resolution of a disaster event within
minutes.
This is possible because NearSync uses a Nutanix technology called lightweight snapshots
(LWS). These snapshots are generated at the metadata level only and continuously replicate
incoming data generated by workloads running on the active cluster.
When you configure NearSync on a protection domain, snapshots will first be generated on
an hourly basis and replicated in order to seed the remote site with data. Once the required
snapshots have been generated and replicated for the purpose of data seeding, LWS snapshots
will be generated and replicated instead based on the RPO defined in the protection domain.
It is also possible for replication to automatically and repeatedly transition into and out of
NearSync. This could be because:
However, despite these repeated transitions, the retention policy defined will continue to be
honored. For example, if you want an RPO of 1 minute and want to retain snapshots for 5 days:
• Every 1 minute, a snapshot is created and retained for a maximum of 15 minutes.
• Every hour, a snapshot is created and retained for 6 hours.
• Every day, one snapshot is created and retained for 5 days.
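The retention example above implies a rough upper bound on how many snapshots of each kind exist at any moment. The following sketch only restates that arithmetic; the function name is hypothetical, and applying the 15-minute and 6-hour windows to RPO values other than 1 minute is an assumption.

```python
def nearsync_snapshot_counts(rpo_minutes: int, retention_days: int) -> dict[str, int]:
    """Estimate the maximum number of snapshots of each kind kept at one time."""
    if not 1 <= rpo_minutes <= 15:
        raise ValueError("NearSync RPO must be between 1 and 15 minutes")
    return {
        "per_rpo": 15 // rpo_minutes,  # retained for a maximum of 15 minutes
        "hourly": 6,                   # one per hour, retained for 6 hours
        "daily": retention_days,       # one per day, retained for the full period
    }
```

For an RPO of 1 minute and 5 days of retention, this gives at most 15 per-minute, 6 hourly, and 5 daily snapshots.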
Similarly, you can also define a retention policy in weeks or months. For example, if you want
snapshots retained for a period of 3 months:
Configuring NearSync
NearSync is configured on the Schedules tab of the protection domain configuration dialog
box. If you choose an RPO of 1 to 15 minutes, as shown in the following figure, it means that a
NearSync schedule has been configured.
In the following example, we have defined a schedule that will begin on May 18, with an RPO of 1
minute, with snapshots to be retained for 5 days.
Asynchronous replication refers to schedules that have been configured with an RPO of 60
minutes or more. These schedules involve full snapshots, unlike the lightweight snapshots that
are used for NearSync.
As with NearSync, the schedule you define when creating a protection domain will determine
whether or not async replication is used. In the following example, we have defined a schedule
that will begin on May 18, with an RPO of 90 minutes, with snapshots to be retained for 5 days.
Since the RPO is greater than 60 minutes, async replication will be used.
Cloud Connect is a Nutanix data protection capability that facilitates asynchronous replication
to AWS.
When AWS is configured as a remote site for backup, a single node cluster with a Nutanix CVM
is created on an AWS cloud in the region of your choice. This single node cluster has a 30 TB
disk attached to it, with a usable capacity of 20 TB. Once configured, the cluster on the remote
site can be managed in Prism Element just like any other remote site.
Amazon S3 is used to store data (extents) and Amazon Elastic Block Store (EBS) is used to
store metadata. You can then use the Amazon management tools to manage and monitor
billing and related usage. You will be charged only for used capacity, not the full capacity of the
single node cluster.
When the AWS Remote feature replicates snapshot data to AWS, the Nutanix Controller
VM on AWS creates a bucket on S3 storage. The bucket name is
ntnx-cluster_id-cluster_incarnation_id-disk_id.
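If you need to locate the bucket in the Amazon management tools, the naming pattern can be assembled as follows. This is a minimal sketch; the function name is hypothetical, and the IDs used in the tests and in any call are placeholders rather than real values.

```python
def s3_bucket_name(cluster_id: str, cluster_incarnation_id: str, disk_id: str) -> str:
    """Build a bucket name that follows the pattern described above."""
    return f"ntnx-{cluster_id}-{cluster_incarnation_id}-{disk_id}"
```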
In Prism Element, navigate to the Data Protection dashboard and click + Remote Site at the
top right corner of the page. From the drop-down menu that appears, choose Cloud. If AWS
accounts have been configured previously in Prism Element, they will appear here. If no existing
accounts are available, you can configure a new one. To do this, on the Credentials tab, click
Add New Key and enter a name, an access key ID, and a secret access key, as shown in the
following figure.
Click Save. Click the radio button next to the account that was just added, and click Next.
On the Remote Site Settings tab, you need to name the remote site, choose a region, select
your connectivity method, and map source and destination vStores. When complete, a typical
configuration should look similar to the following figure. After you provide the required
configuration information, click Create to complete the process.
Note: For more information and step-by-step instructions, see the Asynchronous
Replication Using Cloud Connect section of the Data Protection and Recovery with
Prism Element guide on the Nutanix Support Portal.
Metro availability is a policy applied on a storage container, which effectively spans the
storage container across two sites. This is accomplished by pairing a storage container on the
local cluster with one on a remote site and then synchronously replicating data between the
local (active) and remote (standby) storage containers. When metro availability is enabled,
everything in the active storage container is replicated synchronously to the remote storage
container. Metro availability configurations can include VMs, but they cannot include volume
groups.
Metro availability policies apply per storage container (not cluster), so a cluster can be active
for one datastore and standby for another. For example, consider a cluster configuration
with an Oracle datastore (Datastore A) in storage container 1 and an Exchange datastore
(Datastore B) in storage container 2. Cluster A can be active for Datastore A with Cluster B as
the standby, while Cluster B can be active for Datastore B with Cluster A as the standby. In
addition, metro availability storage containers can co-exist with regular storage containers in
the same cluster.
Metro availability is supported on clusters running ESXi or Hyper-V only. Metro is supported on
ESXi clusters that include storage-only nodes (which run AHV) but not Hyper-V clusters with
storage-only nodes.
When configuring metro availability, you can also configure a Witness VM. A Witness VM is a
special VM that monitors the metro availability deployment's configuration health. The Witness
VM resides in a separate failure domain to provide an outside view that can distinguish a site
failure from a network interruption between the metro availability sites. The goal of the Witness
option is to automate failovers in case of site failures or inter-site network failures.
Note: For more information about and step-by-step instructions for configuring a
Witness, see the Witness Option section of the Data Protection and Recovery with
Prism Element guide on the Nutanix Support Portal.
Leap protects your guest VMs and orchestrates DR to other Nutanix clusters (or Xi Cloud
Services) when events causing service disruption occur at the primary site. Protection policies
with Asynchronous, NearSync, or Synchronous replication schedules generate and replicate
recovery points to other on-prem or Xi Cloud sites. Recovery plans orchestrate DR from the
replicated recovery points to other Nutanix clusters at the same or different on-prem sites.
There are two variants of this Nutanix data protection capability. Leap is used to facilitate
VM, data, and application protection and DR to other Nutanix clusters at the same physical
site or different physical sites. Xi Leap is an extension of Leap in which VM, data, and
application protection extends from an on-prem site to Xi Cloud Services or from Xi Cloud
Services to an on-prem site.
Leap offers an entity-centric automated approach to protect and recover applications. It uses
categories to group guest VMs and automate the protection of guest VMs as applications scale.
Application recovery is flexible with network mappings, an enforceable VM start sequence, and
inter-stage delays. Application recovery can also be validated and tested without affecting your
production workloads. Asynchronous, NearSync, and Synchronous replication schedules ensure
that an application and its configuration details synchronize to one or more recovery locations
for a smoother recovery.
Leap works with sets of physically isolated locations called availability zones. An instance of
Prism Central represents an availability zone. One availability zone serves as the primary site for
an application while one or more paired availability zones serve as the recovery sites.
When paired, the primary site replicates the entities (protection policies, recovery plans,
and recovery points) to recovery sites in the specified time intervals (RPO). This approach
helps application recovery at any of the recovery sites when there is a service disruption at
the primary site (due to natural disasters or scheduled maintenance, for example). Entities
are replicated back to the primary site when the primary site is up and running to ensure
application high availability. The entities you create or update synchronize continuously
between the primary and recovery sites. This reverse synchronization enables you to create or
update entities (protection policies, recovery plans, or guest VMs) at either the primary or the
recovery sites.
The simplest representation of this relationship between primary and secondary sites is
shown in the following figure. However, it is possible for one primary site to replicate to and
synchronize with:
• Up to two on-prem recovery clusters in the same physical location as the primary site.
• Two on-prem recovery sites in different physical locations.
• One on-prem recovery site in a different physical location and Xi Cloud Services.
• Xi Cloud Services only.
Some important Leap terms that you should be familiar with are:
Availability Zone: A zone that can have one or more independent datacenters interconnected
by low latency links. An availability zone can either be in your office premises (on-prem) or in
Xi Cloud Services. Availability zones are physically isolated from each other to ensure that a
disaster at one availability zone does not affect another availability zone. An instance of Prism
Central represents an on-prem availability zone.
Network Mapping: A mapping between two virtual networks in paired sites. A network mapping
specifies a recovery network for all guest VMs of the source network. When you perform a
failover or failback, the guest VMs in the source network recover in the corresponding (mapped)
recovery network.
Primary Availability Zone: A site that initially hosts guest VMs you want to protect.
Recovery Availability Zone: A site where you can recover the protected guest VMs when a
planned or an unplanned event occurs at the primary site causing its downtime. You can
configure at most two recovery sites for a guest VM.
Recoverable Entity: A guest VM that you can recover from a recovery point.
Recovery Plan: A configurable policy that orchestrates the recovery of protected guest VMs at
the recovery site.
Recovery Virtual Network: The virtual network to which guest VMs migrate during a failover or
failback operation.
Source Virtual Network: The virtual network from which guest VMs migrate during a failover
or failback.
Note: For more information, see the Leap Terminology section of the Leap
Administration Guide on the Nutanix Support Portal.
As we saw earlier in this module, these are the same schedules supported by Prism
Element. However, while protection domains are used to enforce these schedules in Prism
Element, Prism Central uses protection policies. Next, we will discuss how to create and
configure protection policies using Leap in Prism Central.
If your primary and secondary clusters are at the same site and managed by a single Prism
Central, Leap only needs to be enabled once to apply to both clusters.
To enable Leap, in Prism Central, click the Settings (gear) icon and scroll down to the Setup
category. Click Enable Leap in the left pane to display the Leap dialog box. Here you will see
two tabs: Enablement and How to Setup.
• Prism Element that is hosting Prism Central needs to be registered with Prism Central.
• The iSCSI data services IP must be configured on the Prism Element hosting Prism Central
and must be reachable on port 3260.
• Calm enablement must not be in progress.
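The port requirement in the list above can be spot-checked from a workstation before you run the prechecks. The following is a rough sketch, assuming a plain TCP connection test is a reasonable stand-in; it is not the check that Prism Central performs, and the function name is hypothetical.

```python
import socket

ISCSI_PORT = 3260  # port named in the requirement above


def is_port_reachable(host: str, port: int = ISCSI_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

You would call it with the iSCSI data services IP address of the Prism Element cluster that hosts Prism Central.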
If issues are detected, after you resolve them, you can return to this dialog box to run the
prechecks again. If the prechecks are successful, you will be able to enable Leap, as shown in
the following figure.
If you click Enable, Prism Central will display a dialog box that indicates how much additional
memory will be allocated to each Prism Central VM to facilitate Leap's functions. It will also
warn you that once enabled, Leap cannot be disabled.
It typically takes a few minutes to enable Leap. You can monitor progress from the Tasks page
in Prism Central. Once the task is complete, the Enablement tab will display a How to Setup
link as shown in the following figure.
On the How to Setup tab, no further action can be taken. Instead, a series of instructions are
displayed that will help you understand the next steps involved in using Leap and protection
policies in Prism Central. These include connecting to an availability zone, creating protection
policies for VM protection, and setting up networks and recovery plans for DR.
Note: For step-by-step instructions, see the Enabling Leap section of the Leap
Administration Guide on the Nutanix Support Portal.
To pair two sites with each other, in Prism Central, click the Entities menu, scroll down to the
Administration category and click Availability Zones.
On the Availability Zones dashboard, you will see a table with a list of all configured
zones, their name, region, type, and connectivity status. A Local AZ will be available by default.
To create a new pairing, click the Connect to Availability Zone button at the top left of the
page.
In the Connect to Availability Zone dialog box, you need to first select the type of availability
zone you want to connect to: Xi Cloud Services or a Physical Location. In this example,
we've selected Physical Location and provided the details of one of our on-prem Education test
clusters.
After entering the required information and clicking Connect, Leap will attempt to connect
to the cluster using the IP address and credentials provided. If the process fails, you will be
required to correct or re-enter the information. If the process succeeds, you will see a new
availability zone on the dashboard, as shown in the following figure.
Note: For step-by-step instructions, see the Pairing Availability Zones (Leap)
section of the Leap Administration Guide on the Nutanix Support Portal.
Protection policies can be automated using the three supported SLAs: Asynchronous (1 hour
or greater RPO), NearSync (1 to 15 minute RPO), and Synchronous replication (0 RPO). As with
protection domains, each of these three options is configured in a schedule for a protection
policy.
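The RPO ranges above determine which replication type a schedule uses. The sketch below restates those ranges as a lookup; the function name is hypothetical, and treating values between 16 and 59 minutes as unsupported is an assumption based only on the ranges listed here.

```python
def replication_type(rpo_minutes: int) -> str:
    """Map an RPO in minutes to the replication type described above."""
    if rpo_minutes == 0:
        return "Synchronous"
    if 1 <= rpo_minutes <= 15:
        return "NearSync"
    if rpo_minutes >= 60:
        return "Asynchronous"
    # Assumption: values outside the three documented ranges are rejected.
    raise ValueError(f"RPO of {rpo_minutes} minutes is outside the listed ranges")
```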
To create a new protection policy, in Prism Central, click the Entities menu, expand the Data
Protection category, and select Protection Policies.
No policies are available by default on the Protection Policies dashboard. To create a new
policy, click the Create Protection Policy button at the top left corner of the page.
On the Create Protection Policy page, you need to configure the policy by adding a
primary and recovery location, and configuring a schedule. Then, you need to select the entities
to protect.
When configuring the primary location, the default availability zone (Local AZ) will be selected
by default. You can change this if needed, and select the clusters that you want to replicate
from. In this instance, we have two clusters but will only be selecting one of them - CDev7.
After clicking Save in the Primary Location section, we will be prompted to add a Recovery
Location. In this case, we will be selecting the Prism Central availability zone that we paired
earlier in this section. Since that Prism Central has only one cluster - CDev4 - we will select that
cluster as our recovery cluster.
When selecting clusters as a recovery location, you will see an Auto option in the drop-down
list. Selecting Auto is not recommended, since it could result in full replication to clusters that
you do not actually want to use as replication targets.
Also note that Leap needs to be enabled on both the primary and the recovery locations. If
Leap is not enabled, you will receive a prompt in Prism Central indicating that Leap must be
enabled, as shown in the following figure. In order to proceed, and for the protection policy to
function, ensure that you enable Leap on both the primary and recovery locations before creating
a protection policy.
After you click Save in the Recovery Location section, click Add Schedule to set up automated
replication.
In the Add Schedule dialog box, you can choose either Asynchronous or Synchronous replication.
If you choose Asynchronous, you can set the snapshot schedule to be in minutes (NearSync),
hours (Async), days, or weeks.
You can also choose between Linear and Rollup for retention. Linear will keep the defined
number of snapshots on the clusters. As shown in the following figure, we have selected Linear
and have chosen to retain two recovery points on our primary and recovery availability zones.
On the other hand, Rollup will maintain a rolling window of snapshots for every schedule. If you
select Rollup, for example, and the defined retention period is 1 week, as shown in the following
figure, 24 hourly, 7 daily, and 1 weekly recovery point will be retained. Similarly, if the retention
period is set to 2 years, 24 hourly, 7 daily, 4 weekly, 12 monthly, and 2 yearly recovery points will
be retained.
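The two Rollup examples above follow a pattern: each level below the retention unit keeps a fixed number of recovery points, and the top level keeps the number you entered. The sketch below generalizes from those two examples only. The function name is hypothetical, it assumes an hourly schedule, and the results for days and months are an extrapolation rather than documented behavior.

```python
# Rollup levels from finest to coarsest, with the number of recovery points
# kept at each level when a coarser level also exists (from the examples above).
_LEVELS = [("hourly", 24), ("daily", 7), ("weekly", 4), ("monthly", 12)]
_UNIT_TO_LEVEL = {
    "days": "daily",
    "weeks": "weekly",
    "months": "monthly",
    "years": "yearly",
}


def rollup_recovery_points(retention: int, unit: str) -> dict[str, int]:
    """Return the recovery points kept per level for a Rollup retention period."""
    top_level = _UNIT_TO_LEVEL[unit]
    points: dict[str, int] = {}
    for level, full_count in _LEVELS:
        if level == top_level:
            break
        points[level] = full_count
    # The coarsest level keeps as many recovery points as the retention value.
    points[top_level] = retention
    return points
```

The description that Prism Central displays below the retention field remains the authoritative source for the actual counts.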
Every time you change the value in the Retention on fields, Prism Central will provide a short
description below the field to indicate how many recovery points of each retention period will
be saved.
When defining a schedule, you also have the option of reversing retention for VMs at the
recovery location. This maintains the number of recovery points even after failover to a
recovery site and failback to the primary site.
For example, consider a schedule with two recovery points at the primary site, and three at the
recovery site. When failover occurs, the recovery site will have two recovery points, while the
primary site will have three. If you enable reverse retention, when failback to the primary site
occurs, the primary site will have two recovery points while the recovery site will have three
(as they did originally). If you do not enable reverse retention, when failback to the primary site
occurs, the primary site will have three recovery points while the recovery site will have two.
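The example above can be traced step by step. This sketch models only the counts described in the example and follows its wording literally; the function name is hypothetical.

```python
def recovery_points_after_failback(
    primary: int, recovery: int, reverse_retention: bool
) -> tuple[int, int]:
    """Return (primary site, recovery site) counts after failover and failback."""
    # Failover: the two sites exchange their recovery point counts.
    primary, recovery = recovery, primary
    # Failback with reverse retention restores the original counts;
    # without it, the exchanged counts remain in place.
    if reverse_retention:
        primary, recovery = recovery, primary
    return primary, recovery
```

Starting from two recovery points at the primary site and three at the recovery site, the function returns (2, 3) with reverse retention enabled and (3, 2) without it.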
For this example, we will use Linear retention, with an hourly schedule and save the two most
recent snapshots on both our primary and recovery locations. To proceed, click Save Schedule.
Before proceeding to the next step, you can choose when the protection policy will come into
effect. By default, protection will begin immediately. However, you can click the Immediately
link and set a specific time at which protection will begin. After reviewing the information on
the Configure Schedule page, click Next.
On the Add Entities page, you can select categories from the left pane to assign those
categories and the VMs associated with them to the protection policy. In this example, we have
selected the NTNXDepartment: Education category, which we created earlier in this course
and which has VMs associated with it. To add selected categories to the protection policy, click
Add.
A completed protection policy will typically appear as shown in the following figure. Review the
configuration information once and then, to complete the process, click Create.
Module 7 Summary
In this module, you learned:
Module 8
CONFIGURING AND MANAGING CLUSTER STORAGE
Module 8 Overview
This module will introduce you to AOS Distributed Storage, describe various storage
constructs, and help you understand how to create storage containers and apply a variety of
capacity optimization features to a storage container.
AOS Distributed Storage simplifies storage and data management for virtual environments. By
pooling flash and hard disk drive storage across a Nutanix cluster and exporting it as a data
store to the virtualization layer as iSCSI, NFS, and SMB shares, AOS eliminates the need for SAN
and NAS solutions.
AOS stores user data (VM disk/files) across storage tiers (SSDs, Hard Disks, Cloud) on multiple
nodes. AOS also supports instant snapshots, clones of VM disks, and other advanced features
such as deduplication, compression and erasure coding.
AOS Distributed Storage presents all the storage devices in the cluster to the hypervisor as a
pool of storage that provides cluster-wide storage services, such as snapshots, clones, HA, DR,
deduplication, compression, and erasure coding.
The Controller VMs (CVMs) running on each node combine to form an interconnected
network within the cluster, where every node in the cluster has access to data from shared
SSD, HDD, and cloud resources. The CVMs allow for cluster-wide operations on VM-centric
software-defined services: snapshots, clones, high availability, disaster recovery, deduplication,
compression, erasure coding, and so on.
Hypervisors (AHV, ESXi, Hyper-V) and AOS communicate using the industry-standard
protocols iSCSI, NFS, and SMB3.
AOS Distributed Storage has several capabilities that improve performance, including:
Intelligent Tiering
AOS Distributed Storage continually monitors data access patterns and optimizes data
placement on either the SSD or HDD tier, achieving the best performance without administrator
intervention. The SSD tier provides maximum performance for hot data and random I/O, while
the HDD tier provides maximum capacity and economy for cold data and sequential I/O.
Data Locality
AOS Distributed Storage ensures that as much of a VM’s data as possible is stored on the
node where the VM is running. This negates the need for read I/O to go through the network.
Keeping data local optimizes performance and minimizes network congestion. Every VM’s data
is served locally from the CVM and stored preferentially on local storage. When a VM is moved
from one node to another using vMotion or live migration (or during an HA event), the migrated
VM’s data automatically follows the VM in the background based on read patterns.
AOS Distributed Storage is capable of responding to different workloads and allows different
node types (for example, compute-heavy or storage-heavy nodes) to be mixed in a single
cluster. However, when mixing nodes in a cluster, it is important to ensure uniform distribution
of data, taking into account the different storage capacities of different nodes. The native
disk balancing feature ensures that data is distributed uniformly among nodes once storage
utilization on a node crosses a particular threshold. Movement of data is always done within the
same tier (that is, SSD or HDD) when disk balancing is performed. Disk balancing will not move
data between storage tiers.
Data path redundancy ensures high availability in the event a Nutanix Controller VM (CVM)
becomes unavailable or needs to be brought down for upgrade. If a CVM becomes unavailable
for any reason, Nutanix CVM auto pathing automatically reroutes requests to a “healthy” CVM
on another node. This failover is fully transparent to the hypervisor and applications.
This redirection continues until the local CVM failure issue is resolved. Because the cluster has
a global namespace and access to replicas for all the data on that node, it services requests
immediately. This structure provides a high degree of fault tolerance and failover capability for
all VMs in a Nutanix cluster. If the node’s CVM remains unavailable for a prolonged period, data
automatically replicates again to maintain the necessary replication factor.
A storage pool is a group of physical storage devices including PCIe SSD, SSD, and HDD
devices for the cluster. The storage pool can span multiple Nutanix nodes and is expanded as
the cluster scales. In most configurations, only a single storage pool is leveraged.
Storage Container
A container is a logical segmentation of the storage pool and contains a group of VMs or files
(vDisks). Some configuration options (for example, RF) are configured at the container level but
are applied at the individual VM/file level. Containers typically have a 1-to-1 mapping with a
datastore (in the case of NFS/SMB).
vDisk
A vDisk is a subset of available storage within a storage container that provides storage to
virtual machines. A vDisk is any file over 512 KB on DSF, including VMDKs and VM disks. vDisks
are broken up into extents, which are grouped and stored on physical disk as an extent group.
The Nutanix platform now allows you to migrate a vDisk from one storage container to another
while it is attached to a guest VM, without needing to shut down or delete that VM.
Volume Group
A volume group is a collection of logically related virtual disks or volumes. It is attached to one
or more execution contexts (VMs or other iSCSI initiators) that share the disks in the volume
group. You can manage volume groups as a single unit.
Each volume group contains a UUID, a name, and iSCSI target name. Each disk in the volume
group also has a UUID and a LUN number that specifies ordering within the volume group. You
can include volume groups in protection domains configured for asynchronous data replication
(Async DR) either exclusively or with VMs.
Volume groups cannot be included in a protection domain configured for Metro Availability, in
a protected VStore, or in a consistency group for which application consistent snapshotting is
enabled.
vBlock
A vBlock is a 1MB chunk of virtual address space composing a vDisk. For example, a vDisk
of 100MB will have 100 x 1MB vBlocks: vBlock 0 covers 0-1MB, vBlock 1 covers
1-2MB, and so forth. These vBlocks map to extents, which are stored as files on disk as extent
groups.
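The relationship between a byte offset in a vDisk and its vBlock is simple integer arithmetic. The following sketch assumes 1MB means 1,048,576 bytes; the function names are hypothetical.

```python
VBLOCK_SIZE = 1024 * 1024  # 1MB, assumed here to mean 1,048,576 bytes


def vblock_index(offset_bytes: int) -> int:
    """Return the index of the vBlock that contains the given byte offset."""
    return offset_bytes // VBLOCK_SIZE


def vblock_count(vdisk_size_bytes: int) -> int:
    """Return the number of vBlocks needed to cover a vDisk of the given size."""
    # Ceiling division, so a partially filled final vBlock is still counted.
    return -(-vdisk_size_bytes // VBLOCK_SIZE)
```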
Extent
An extent is a 1MB piece of logically contiguous data which consists of n number of contiguous
blocks (varies depending on guest OS block size). Extents are written/read/modified on a sub-
extent basis (aka slice) for granularity and efficiency. An extent’s slice may be trimmed when
moving into the cache depending on the amount of data being read/cached.
Extent Group
An extent group is a 1MB or 4MB piece of physically contiguous stored data. This data is stored
as a file on the storage device owned by the CVM. Extents are dynamically distributed among
extent groups to provide data striping across nodes/disks to improve performance. NOTE: as of
4.0, extent groups can now be either 1MB or 4MB depending on deduplication.
Extent Store
The Extent Store is the persistent bulk storage of DSF and spans all device tiers (Optane SSD,
PCIe SSD, SATA SSD, HDD) and is extensible to facilitate additional devices/tiers. Data entering
the extent store is either being drained from the OpLog or is sequential/sustained in nature and
has bypassed the OpLog directly. Nutanix Intelligent Lifecycle Manager (ILM) will determine tier
placement dynamically based upon I/O patterns, number of accesses of data and weight given
to individual tiers and will move data between tiers.
OpLog
The OpLog is similar to a filesystem journal and is built as a staging area to handle bursts of
random writes, coalesce them, and then sequentially drain the data to the extent store. For
sequential workloads, the OpLog is bypassed and writes go directly to the extent store. If data
is still in the OpLog and has not been drained, read requests for that data are fulfilled
directly from the OpLog. After the data has been drained, reads are served by the
extent store/unified cache.
Unified Cache
The Unified Cache is a read cache which is used for data, metadata, and deduplication, and is
stored in the CVM’s memory and solid-state disk. Upon a read request of data not in the cache
(or based upon a particular fingerprint), the data will be read from the extent store and will also
be placed into the single-touch pool of the Unified Cache which completely sits in memory,
where it will use LRU (least recently used) until it is evicted from the cache. Any subsequent
read request will “move” (no data is actually moved, just cache metadata) the data into the
multi-touch pool. Any read request for data in the multi-touch pool will cause the data to go to
the peak of the multi-touch pool where it will be given a new LRU counter.
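The single-touch and multi-touch behavior described above can be modeled as two LRU pools. The following Python sketch is a toy model for study purposes, not the Unified Cache implementation: the class name, the pool capacities, and the choice to simply drop entries evicted from either pool are all assumptions.

```python
from collections import OrderedDict


class TwoPoolCache:
    """Toy model of a read cache with single-touch and multi-touch LRU pools."""

    def __init__(self, single_capacity: int, multi_capacity: int):
        self.single = OrderedDict()  # data that has been read once
        self.multi = OrderedDict()   # data that has been read more than once
        self.single_capacity = single_capacity
        self.multi_capacity = multi_capacity

    def read(self, key, fetch):
        if key in self.multi:
            # Hit in the multi-touch pool: move the entry back to the top.
            self.multi.move_to_end(key)
            return self.multi[key]
        if key in self.single:
            # Second read: promote the entry to the multi-touch pool.
            value = self.single.pop(key)
            self.multi[key] = value
            if len(self.multi) > self.multi_capacity:
                self.multi.popitem(last=False)  # evict least recently used
            return value
        # Miss: fetch from backing storage and place in the single-touch pool.
        value = fetch(key)
        self.single[key] = value
        if len(self.single) > self.single_capacity:
            self.single.popitem(last=False)  # evict least recently used
        return value
```

Here `fetch` stands in for a read from the extent store; any callable that takes a key and returns the data will do.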
The Autonomous Extent Store (AES) is a different method for writing/storing data in the
Extent Store. It leverages a mix of primarily local + global metadata allowing for much
more efficient sustained performance due to metadata locality. Sustained random write
workloads bypass the OpLog and are written directly to the Extent Store using AES.
Bursty random workloads take the typical OpLog I/O path and then drain to the
Extent Store using AES where possible. As of AOS 5.20, AES is enabled by default for new
containers on all-flash clusters, and as of AOS 6.1, if requirements are met, AES is also
enabled on new containers created on hybrid (SSD+HDD) clusters.
1. Click the Entities menu and expand the Compute & Storage category.
2. Click the Storage Containers option. Note that you can click the star icon to the right of the
option name to bookmark the dashboard for easy access.
By default, the List view will be displayed, as shown in the following figure.
By default, the Storage Containers dashboard has five tabs: Summary, List, Alerts, Events, and
Metrics. You can use these tabs to get an overview of all storage managed by Prism Central,
view storage container information, view and acknowledge storage container-specific alerts
and events, and review metrics. The dashboard also includes action buttons that can be used
to create, update, and delete storage containers.
If you click the name of a storage container in the List view of the Storage Containers
dashboard (or in the highlighted entities widget on the Summary view) a storage container
details page will be displayed, as shown in the following figure.
The top menu bar displays the name of the selected container as well as a series of tabs that
can be used to view specific types of storage container-related information. They are:
• Summary. This is the default view for every storage container details page. It contains two
action buttons (Update and Delete), and five widgets (Properties, Usage, Optimization &
Performance, Alerts, and Anomalies) that display storage container information.
• Alerts. The Alerts tab displays a filtered list of unresolved alerts specifically for the selected
storage container.
• Events. The Events tab displays a filtered list of events specifically for the selected storage
container.
• Metrics. The Metrics tab displays a page with 9 built-in charts (storage controller IOPS,
storage controller read IOPS, storage controller write IOPS, and so on) that can be used to
monitor storage container performance.
• Storage Usage. The Storage Usage tab displays a page with 2 charts by default: usage
summary and tier-wise usage. Both these charts are specific to the selected storage
container.
Directly below the storage container details tabs are two action buttons, which are used to
update or delete a storage container.
Note: The Storage dashboard, in Prism Element, is used to view information as well
as manage the storage configuration in a cluster. For details refer to the Storage
Dashboard section of the Prism Web Console Guide on the Nutanix Support portal.
The Volume Groups dashboard provides both detailed and summarized information about
volume groups that can be managed by this instance of Prism Central.
1. Click the Entities menu and expand the Compute & Storage category.
2. Click the Volume Groups option. Note that you can click the star icon to the right of the
option name to bookmark the dashboard for easy access.
By default, the List view will be displayed, as shown in the following figure.
By default, the Volume Groups dashboard has four tabs: Summary, List, Alerts, and Metrics. You
can use these tabs to get an overview of all volume groups managed by Prism Central, view
volume group information, view and acknowledge volume group-specific alerts, and review
metrics. The dashboard also includes action buttons that can be used to create, update, and
delete volume groups as well as manage connections.
If you click the name of a volume group in the List view of the Volume Groups dashboard (or in the
highlighted entities widget on the Summary view) a volume group details page will be
displayed, as shown in the following figure.
The top menu bar displays the name of the selected volume group as well as a series of tabs
that can be used to view specific types of information. They are:
• Summary. This is the default view for every volume group details page. It contains three
action buttons (Update, Manage Connections, and Delete), and three widgets (Properties,
Usage and Performance, and Alerts) that display volume group information.
• Virtual Disks. Lists the virtual disks that are part of the selected volume group. From this
page, you can add a new virtual disk to the volume group, and update or delete an existing
one.
• Connections. Lists the client connections to the selected volume group. From this page, you
can add a new connection, and update or delete an existing one.
• Recovery Points. If the selected group has any snapshots that can be restored, they will
appear on the recovery points page.
• Alerts. The Alerts tab displays a filtered list of unresolved alerts specifically for the selected
volume group.
• Metrics. The Metrics tab displays a page with 10 built-in charts (controller IOPS, controller
read IOPS, controller write IOPS, and so on) that can be used to monitor volume group
performance.
Action Buttons
You will have access to different action buttons depending on which tab of the details page you
are viewing.
• On the Summary tab, you can update or delete a volume group, and manage connections.
• On the Virtual Disks tab, you can add a new virtual disk, and update or delete an existing
one.
• On the Connections tab, you can add a new connection, and update or delete an existing
one.
Replication Factor
Replication factor refers to the number of copies of data and metadata that will be maintained
on a cluster. A replication factor of 2 means that 2 copies of data will be available (1 original and
1 copy), while replication factor 3 means that 3 copies of data will be available (1 original and 2
copies).
While replication factor 1 is available (only the original data will be maintained, with no copies)
it is not recommended unless your cluster is running applications that provide their own data
protection or high availability. As only one copy of data is maintained, replication factor 1 does
not guarantee data availability if a node or disk failure occurs. This feature was introduced
specifically to improve performance and reduce capacity needed for workloads which do not
require data resiliency (such as temporary data for analytics) or when the data resiliency is
handled at the application level. For more information, see the Replication Factor 1 Overview
page of the Prism Web Console Guide on the Nutanix Support Portal.
As we discussed at the beginning of this module, the OpLog acts as a staging area to absorb
incoming writes onto the low-latency SSD tier. When data is written to a local OpLog, it is
synchronously replicated to the OpLogs of one or two other CVMs (one other OpLog for
RF2 and two other OpLogs for RF3) before the write is acknowledged to the host as
successful. This ensures that the data exists in at least two or three independent locations
and is fault tolerant.
It is important to note that replication factor is handled differently for data and metadata. For
data RF2, there will be two copies of data and three copies of metadata. For data RF3, there
will be three copies of data and five copies of metadata. Metadata replication factor cannot be
set or configured independently of data replication factor, and is dependent on data replication
factor and the cluster's redundancy factor.
Note: For more information, see the Metadata section of The Nutanix Bible.
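The copy counts given in this section can be collected into a small lookup table. The Python sketch below only restates the numbers from the text (RF2 and RF3); the names `COPIES_BY_DATA_RF`, `copies_for`, and `remote_oplog_replicas` are hypothetical.

```python
# Copies maintained for each data replication factor, as stated in the text.
COPIES_BY_DATA_RF = {
    2: {"data": 2, "metadata": 3},
    3: {"data": 3, "metadata": 5},
}


def copies_for(data_rf: int) -> dict:
    """Return the number of data and metadata copies for a data replication factor."""
    if data_rf not in COPIES_BY_DATA_RF:
        raise ValueError(f"no copy counts listed for replication factor {data_rf}")
    return dict(COPIES_BY_DATA_RF[data_rf])


def remote_oplog_replicas(data_rf: int) -> int:
    """Return how many other CVM OpLogs receive a write before it is acknowledged."""
    return copies_for(data_rf)["data"] - 1  # the local OpLog holds the first copy


print(copies_for(3))             # {'data': 3, 'metadata': 5}
print(remote_oplog_replicas(2))  # 1
```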
To view the currently configured replication factor for all storage containers managed by Prism
Central, click the entities menu, expand the Compute & Storage category, and click Storage
Containers. The table on the Storage Containers dashboard displays the replication factor in a
separate column, as shown in the following figure.
Replication factor is configured when creating a storage container. The option to set replication
factor to either 2 or 3 can be found under the Advanced Settings section of the Create Storage
Container window as shown in the following figure.
The replication factor and redundancy factor features are very closely connected. As you can
see in the figure above, Replication Factor is set to 2 and is greyed out. This is because our
test cluster's redundancy factor is 2. For replication factor to be changed to 3, the cluster's
redundancy factor must be set to 3 first.
Remember that:
• Redundancy factor 2 supports replication factor 2 only.
• Redundancy factor 3 supports replication factor 2 and 3.
Note: For replication factor 3 to be enabled, the cluster must have a redundancy
factor of 3.
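The relationship between the two settings can be expressed as a lookup. This is a minimal Python sketch of the two rules listed above; it deliberately leaves out replication factor 1, which is enabled separately, and the names are hypothetical.

```python
# Replication factors that each cluster redundancy factor supports.
SUPPORTED_REPLICATION_FACTORS = {
    2: {2},
    3: {2, 3},
}


def replication_factor_allowed(redundancy_factor: int, replication_factor: int) -> bool:
    """Return True if a container may use this replication factor on the cluster."""
    allowed = SUPPORTED_REPLICATION_FACTORS.get(redundancy_factor, set())
    return replication_factor in allowed


print(replication_factor_allowed(2, 3))  # False
print(replication_factor_allowed(3, 3))  # True
```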
Redundancy Factor
Redundancy factor refers to the number of failures (such as a node failure or a disk failure) that
a cluster is able to withstand while still continuing to operate. By default, Nutanix clusters (with
the recommended minimum of 3 nodes) have a redundancy factor of 2. This means they can
tolerate the failure of a single drive or node at a time.
• A cluster must have at least five nodes, blocks, or racks for redundancy factor 3 to be enabled.
• For guest VMs to tolerate the simultaneous failure of two nodes or drives in different blocks,
the data must be stored on storage containers with replication factor 3. (Replication factor is
the next topic in this section.)
Note: This procedure can only be performed in Prism Element; you cannot view or
change redundancy state in Prism Central.
In Prism Element, click the Settings (gear) icon, scroll down to the Data Resiliency section, and
click Redundancy State.
In the Redundancy Factor Readiness dialog box that appears, you will see the cluster's
currently configured redundancy factor, a drop-down menu to make changes if the current
number of nodes in the cluster is sufficient, and an option to enable Replication Factor 1.
Redundancy Factor 2 is the default setting and, if you do not have at least five nodes in the
cluster, the drop-down menu to change it will be greyed out. Additionally, you will also see a
message above the Redundancy Factor Readiness dialog box if your cluster does not have
enough nodes to change the redundancy factor from 2 to 3.
To change the cluster's redundancy factor, click the Desired Redundancy Factor drop-down
menu, select 3, and then click Save.
Note: Redundancy Factor cannot be reverted and you cannot reduce the
redundancy factor. For example, you cannot change redundancy factor from 3 to 2.
Attempting to do so will result in an error. For more information, see the Increasing
the Cluster Fault Tolerance Level section of the Prism Web Console Guide on the
Nutanix Support Portal.
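The conditions stated in this section for changing the redundancy factor (at least five nodes for redundancy factor 3, and no reduction) can be summarized as a pre-check. The Python sketch below is a study aid under those stated rules only; it is not how Prism validates the change, and the function name is hypothetical.

```python
MIN_NODES_FOR_REDUNDANCY_FACTOR_3 = 5


def redundancy_change_problems(current: int, desired: int, node_count: int) -> list:
    """Return the reasons a redundancy factor change is blocked (empty if none)."""
    problems = []
    if desired not in (2, 3):
        problems.append("redundancy factor must be 2 or 3")
    if desired < current:
        problems.append("redundancy factor cannot be reduced")
    if desired == 3 and node_count < MIN_NODES_FOR_REDUNDANCY_FACTOR_3:
        problems.append("redundancy factor 3 needs at least five nodes")
    return problems


print(redundancy_change_problems(current=2, desired=3, node_count=4))
print(redundancy_change_problems(current=2, desired=3, node_count=5))  # []
```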
The Redundancy Factor Readiness window in the figure above also contains several important
pieces of information.
• If your cluster is not large enough (that is, it does not have enough nodes, like our test
cluster) to support redundancy factor 3, a message will appear above the window indicating
this.
• You can also read through a brief explanation of what redundancy factor is, your current
redundancy factor, how many failures can be tolerated, and what the current data replication
factor is.
• You can also enable replication factor 1, which is not recommended by Nutanix except in
very specific scenarios. (Replication factor is the next topic in this section.)
It is expected that a cluster be capable of withstanding the failure of one or more physical
disks. In addition, Nutanix also offers node, block, and rack awareness, which allows for a cluster
to account for and accommodate failures of a node, a block (consisting of multiple nodes), or a
rack (consisting of multiple blocks).
In this section, we will discuss how a Nutanix cluster handles failures at different levels - disk,
node, block, and rack.
Drive Failure
Drives in a Nutanix node store four primary types of data: persistent data (hot-tier and cold-
tier); storage metadata; OpLog; and CVM boot files.
Cold-tier persistent data is stored on the capacity tier of the node. Storage metadata, OpLog,
hot-tier persistent data, and CVM boot files are stored in the performance tier. SSDs in a dual
SSD system are used for storage metadata, OpLog, and hot-tier persistent data according to the
replication factor of the system, and in a RAID-1 configuration for CVM files. In all-flash nodes,
data of all types is stored on the SATA SSDs.
Each CVM boots from a SATA-SSD. During cluster operation, this drive also holds component
logs and related files.
A boot drive failure will eventually cause the associated CVM to fail. The host does not access
the boot drive directly, so other guest VMs can continue to run. Data Path Redundancy
redirects the storage path to another CVM.
Note: The CVM might restart under certain rare conditions on dual SSD nodes
if a boot drive fails, or if you unmount a boot drive without marking the drive for
removal and the data has not successfully migrated.
Cassandra uses up to 4 SSDs to store the database providing read and write access for cluster
metadata.
When a metadata drive fails, the local Cassandra process will no longer be able to access its
share of the database and will begin a persistent cycle of restarts until its data is available. If
Cassandra cannot restart, the Stargate process on that CVM will crash as well. Failure of both
processes results in automatic IO redirection using data path redundancy.
During the switching process, the host with the failed SSD may report that the shared storage is
unavailable. Guest VM IO on this host will pause until the storage path is restored.
After redirection occurs, VMs can resume read and write I/O. Performance may decrease
slightly, because the I/O now travels across the physical network to another CVM rather than
across the host's internal network. Because this traffic goes across the 10 GbE network, most
workloads will not degrade in a way that is perceptible to users.
Multiple drive failures in a single selected domain (node, block, or rack) are also tolerated.
If Cassandra remains in a failed state for more than thirty minutes, the surviving Cassandra
nodes detach the failed node from the Cassandra database so that the unavailable metadata
can be replicated to the remaining cluster nodes. The process of healing the database takes
about 30-40 minutes.
If the Cassandra process restarts and remains running for five minutes, the procedure to
detach the node is canceled. If the process resumes and is stable after the healing procedure
is complete, the node will be automatically added back to the ring. A node can be manually
added to the database using the nCLI command.
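The two timers described above (thirty minutes before detach, five minutes of stable running to cancel) can be read as a simple decision rule. The Python sketch below is only a way to reason about those thresholds; the function name and return values are hypothetical, and the actual detach logic is internal to the cluster.

```python
DETACH_AFTER_MINUTES = 30        # failed for more than this: node is detached
CANCEL_AFTER_STABLE_MINUTES = 5  # running this long after restart: detach canceled


def metadata_node_action(minutes_failed: float,
                         minutes_running_since_restart: float = 0.0) -> str:
    """Return the expected action for a CVM whose Cassandra process has failed."""
    if minutes_running_since_restart >= CANCEL_AFTER_STABLE_MINUTES:
        return "cancel_detach"
    if minutes_failed > DETACH_AFTER_MINUTES:
        return "detach_node"
    return "wait"


print(metadata_node_action(10))       # wait
print(metadata_node_action(31))       # detach_node
print(metadata_node_action(20, 5.0))  # cancel_detach
```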
Each node contributes its local storage devices to the cluster storage pool. Cold-tier data is
stored in HDDs, while hot-tier data is stored in SSDs for faster performance. Data is replicated
across the cluster, so a single data drive failure does not result in data loss. Nodes that
contain only SSDs have only a hot tier.
When a data drive (HDD/SSD) fails, the cluster receives an alert from the host and immediately
begins working to create a second replica of any guest VM data that was stored on the
drive. For a brief period of time, guest VMs with files on the failed data drive will need to read
across the network.
In a cluster with a replication factor 2, losing a second drive in a different domain (node, block,
or rack) before the cluster heals can result in some VM data loss to both replicas. Although a
single drive failure does not have the same impact as a host failure, it is important to replace the
failed drive as soon as possible.
Node Failure
A Nutanix node consists of a physical host and a CVM. Either of these components can fail
without impacting the rest of the cluster.
CVM Failure
A CVM failure may include a user powering down the CVM, a CVM rolling upgrade, or any
other event that brings down the CVM. In any of these cases, the storage traffic is served by
another CVM in the cluster. The hypervisor and CVM communicate using a private network on a
dedicated virtual switch. This means that all storage traffic is routed through an internal
IP address on the CVM. The external IP address of the CVM is used for remote replication and
for CVM-to-CVM communication.
In the event of a local CVM failure, the local addresses previously used by the local CVM
become unavailable. In this case, AOS Distributed Storage automatically detects the outage
and redirects storage traffic to another CVM in the cluster over the network. The re-routing is
done transparently to the hypervisor and to the VMs running on the host. So even if a CVM is
powered-off, VMs continue to perform I/O operations.
AOS Distributed Storage is also self-healing, which means that it detects when a CVM is
powered-off and automatically reboots the local CVM. Once the local CVM is available, the
traffic is seamlessly transferred back to be served by the local CVM.
AOS Distributed Storage uses replication factor and checksum to ensure data redundancy and
availability in the case of a node or disk failure or corruption. In the case of a node or disk failure
the data is then re-replicated among all nodes in the cluster to maintain the RF, which is called
re-protection. Re-protection may result after a CVM is down.
Host Failure
The built-in data redundancy in a Nutanix cluster supports high availability provided by the
hypervisor. If a node fails, all HA-protected VMs can be automatically restarted on other nodes
in the cluster.
Two Acropolis Services, Curator and Stargate, respond to two types of issues that arise from
host failure. First, when the guest VM begins reading across the network, Stargate begins
migrating those extents to the new host. This improves performance for the guest VM. Second,
Curator will notice that there is a missing replica of those extents, and instruct Stargate to
begin creating a second replica.
In the case of host failure, there may be a slight disruption to users running guest VMs. Users
who are accessing HA-protected VMs will notice that their VMs are unavailable while they
are being restarted on the new host. Any VMs that are not protected by HA will need to be
manually restarted.
The scenario described above accounts for the failure of a single host. Depending on how
loaded the cluster is, a second host failure could leave the remaining hosts with insufficient
processing power to restart the VMs from the second failed host.
However, even in a lightly loaded cluster, the bigger concern is additional risk to guest VM data.
With two sets of inaccessible physical disks, there is a chance that some VM data extents will be
missing entirely, and I/O requests will not be served.
Nutanix offers block fault tolerance as either an opt-in procedure or a best-effort procedure.
The opt-in block fault tolerance feature offers guaranteed data resiliency when the required
conditions are met. In best-effort mode, data copies remain on the same block when there is
insufficient space across all blocks.
With block fault tolerance enabled, guest VMs can continue to run after a block failure because
redundant copies of guest VM data and metadata exist on other blocks.
• Every storage tier in the cluster contains at least one drive on each block.
• Every storage container in the cluster has replication factor of at least two.
• For replication factor 2, there are a minimum of three blocks in the cluster.
• If the replication factor of storage containers in the cluster is two, then at least two blocks
require free space. If the replication factor is three, then at least three blocks require free
space.
• A minimum of four blocks for RF2 or six blocks for RF3 is required to maintain block
awareness if erasure coding is enabled on any storage container.
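Several of the conditions in the list above are numeric and can be checked mechanically. The Python sketch below covers only the block-count and free-space conditions as stated; it omits the per-tier drive check, and because the list gives no minimum block count for replication factor 3 without erasure coding, that case is not checked. The function name is hypothetical.

```python
def block_awareness_issues(replication_factor: int, block_count: int,
                           blocks_with_free_space: int,
                           erasure_coding: bool = False) -> list:
    """Return the listed block fault tolerance conditions that are not met."""
    issues = []
    if replication_factor < 2:
        issues.append("replication factor must be at least 2")
        return issues
    if erasure_coding:
        required_blocks = {2: 4, 3: 6}.get(replication_factor)
    else:
        # Only the RF2 minimum is stated in the list; RF3 is left unchecked.
        required_blocks = {2: 3}.get(replication_factor)
    if required_blocks is not None and block_count < required_blocks:
        issues.append(f"need at least {required_blocks} blocks, found {block_count}")
    if blocks_with_free_space < replication_factor:
        issues.append(f"need free space on at least {replication_factor} blocks")
    return issues


print(block_awareness_issues(2, block_count=3, blocks_with_free_space=2))  # []
print(block_awareness_issues(3, 5, 3, erasure_coding=True))
```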
Since block fault tolerance is configured at a cluster level, it must be accessed from Prism
Element and not Prism Central. To configure block fault tolerance, in Prism Element, click the
Settings (gear) icon, scroll down to the Setup section, and click Rack Configuration as shown
in the following figure.
On the Fault Tolerance Level page, select Block and click Next. On the Block Assignment page,
you can either use existing racks or add a new one as needed.
Note: For more information, see the Block Fault Tolerance section of the Prism
Web Console Guide on the Nutanix Support Portal.
To enable rack fault tolerance, you must specify the mapping of the blocks to the racks based
on the actual placement of the blocks in the datacenter. The minimum cluster requirements are:
As with block tolerance, rack tolerance must be accessed from Prism Element. To configure
rack fault tolerance, in Prism Element, click the Settings (gear) icon, scroll down to the Setup
section, and click Rack Configuration.
Then, on the Fault Tolerance Level page, select Rack and click Next. On the Block Assignment
page, you can either use existing racks or add a new one as needed. Note that a Pro or higher
license is required in order to configure rack awareness. If your cluster's current licensing level
does not support this feature, Prism will display a message as shown in the following figure.
Note: For more information, see the Rack Fault Tolerance section of the Prism Web
Console Guide on the Nutanix Support Portal.
In addition to the default-container, you can also create additional, custom storage containers if
required. A common use case is if you intend to apply different types of capacity optimization
features to your storage, but do not intend to apply all optimization features on a single storage
container.
Note: In the next section, we will discuss reservation and optimization features in
detail.
To create a new storage container, in Prism Central, click the Entities menu, expand the
Compute & Storage category, and select Storage Containers. On the Storage Containers
dashboard, click the Create Storage Container button at the top left of the page.
By default, the Create Storage Container dialog box presents options to name the storage
container and select the cluster that it will be created on, as shown in the following figure. The
dialog box also displays the maximum available capacity on the selected cluster.
Clicking Advanced Settings will display a number of additional options, including the ability to
configure the replication factor, set reserved and advertised capacity, and enable or disable
various storage optimization features such as compression, deduplication, and erasure coding.
By default, compression will be selected, although deduplication and erasure coding will
not. For the purpose of this example, we are going to create a storage container with no
optimization features enabled, and will update these settings later as we discuss individual
storage optimization features.
To complete the storage container creation process, we only need to click Create. In later
sections, we will update this storage container to include different capacity optimization
features.
After you create a storage container, you can modify some configuration options by selecting it
from the Storage Containers dashboard, clicking Actions, and then clicking Update.
As the following figure shows, in the Update Storage Container dialog box, both the name
and cluster fields are greyed out. This is because a storage container cannot be renamed via
this process, and a storage container cannot be moved to a different cluster after it has been
created.
If we look at the advanced options, we still have access to options such as reserved and
advertised capacity, compression, and deduplication, because these can be enabled or modified
after a storage container has been created.
• You cannot rename a storage container through the Update Storage Container dialog box.
• You also cannot rename a storage container if it contains vDisks.
• You cannot change the replication factor when updating a storage container in Prism. This
can be done only via the command-line interface.
• If the compression policy is changed from compressed to uncompressed (or vice versa), the
data in the storage container will be uncompressed or compressed as a background process
when the next data scan detects data that needs this change.
Capacity reservation allows you to reserve a minimum amount of space in a storage pool
for a specific storage container, which prevents this space from being used by other storage
containers. Reserved capacity can be set when creating a new storage container or when
updating an existing one.
To set reserved and advertised capacity, we're going to update the Edu_Storage_Container
we created earlier. Our intention is to use this storage container exclusively to store images,
which we will use to create VMs for lab development and testing. As a result, we will not need a
significant amount of space on the cluster, since this container will have no workloads running
on it.
First, we will set our reserved capacity to 15 GB. This is the minimum amount of space that will
be reserved on the cluster for use by this storage container. Then, we will set our advertised
capacity to 20 GB, which is the maximum amount of space that this container is allowed to use.
Note that advertised capacity must always be larger than reserved capacity. To finalize our
changes, we need to click Save.
• Reserve capacity for a storage container only if the storage pool has multiple storage
containers. Unless there is a specific reason to have multiple storage containers, Nutanix
recommends having a single storage pool with a single storage container.
• In total, reserve no more than 90% of the space in the storage pool.
• When setting advertised capacity for a storage container, remember to allocate some extra
space beyond the projected size of any VMs placed in the container. This will ensure that
there is room for data that has not yet been garbage collected, which can be substantial
depending on the workload (10% or more of the storage capacity in some cases).
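Two of the capacity rules in this section are easy to check before saving a change: advertised capacity must be larger than reserved capacity, and total reservations should not exceed 90% of the storage pool. The Python sketch below applies them to the 15 GB and 20 GB values used in the example; the pool size of 1000 GB and the function name are assumptions for illustration.

```python
MAX_RESERVED_FRACTION = 0.90  # reserve no more than 90% of the storage pool


def capacity_issues(reserved_gb: float, advertised_gb: float,
                    other_reserved_gb: float, pool_gb: float) -> list:
    """Return the capacity guidelines that the proposed settings violate."""
    issues = []
    if advertised_gb <= reserved_gb:
        issues.append("advertised capacity must be larger than reserved capacity")
    total_reserved = reserved_gb + other_reserved_gb
    if total_reserved > MAX_RESERVED_FRACTION * pool_gb:
        issues.append("total reserved capacity exceeds 90% of the storage pool")
    return issues


# Values from the example, with a hypothetical 1000 GB storage pool.
print(capacity_issues(reserved_gb=15, advertised_gb=20,
                      other_reserved_gb=0, pool_gb=1000))  # []
```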
AOS Distributed Storage handles inline compression on random data slightly differently.
Initially, the data is written uncompressed to the OpLog. Then, after it is coalesced, it is
compressed in memory and written out to the extent store. Post-process compression sees the
data written uncompressed to disk before the Nutanix MapReduce framework compresses the
data cluster-wide.
Administrators should decide what form of compression to use and only enable it once, either
at the container level or in the application.
Selecting inline will result in the Delay field being set to 0 minutes, since data is compressed
as it is written. Selecting post process will set the delay to 60 minutes. 60 minutes is also the
Nutanix-recommended delay for post process compression. In our case, since our container is
meant only for image storage, we can select either option, so we will select post process, as
shown in the following figure. To finalize our changes, we will click Save.
Note: For more information, see the Compression (Prism Central) section of the
Prism Web Console Guide on the Nutanix Support Portal.
Deduplication
In addition, to enable deduplication, the CVMs in the cluster need to be configured with
additional RAM. Each CVM in a cluster needs 24 GB of RAM for cache deduplication, and 32 GB
of RAM for capacity deduplication.
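As a rough illustration of the RAM requirement, the sketch below flags CVMs that fall short of the minimum for a given deduplication type. The CVM names and the inventory dictionary are hypothetical; in practice, you would read CVM memory from Prism.

```python
# Minimum CVM RAM (GB) per deduplication type, as stated above.
MIN_CVM_RAM_GB = {"cache": 24, "capacity": 32}


def cvms_below_minimum(cvm_ram_gb, dedup_type):
    """Return the names of CVMs with less RAM than the dedup type requires."""
    required = MIN_CVM_RAM_GB[dedup_type]
    return sorted(name for name, ram in cvm_ram_gb.items() if ram < required)


# Hypothetical inventory, for illustration only.
cvms = {"cvm-1": 32, "cvm-2": 24, "cvm-3": 20}
print(cvms_below_minimum(cvms, "cache"))
print(cvms_below_minimum(cvms, "capacity"))
```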
Deduplication is also not recommended for all workload types. It is recommended to enable
deduplication for full clones, physical-to-virtual migration, and persistent desktops. It is not
recommended to enable deduplication for the following workloads:
• Linked clones or Nutanix VAAI clones: Duplicate data is managed efficiently by DSF so
deduplication has no additional benefit.
• Server workloads: Redundant data is minimal so there may be no significant benefit from
deduplication.
As the following figure shows, our license level does not support this option (our test cluster
does not have a Pro license) and Prism informs us that this option is not available.
The only option we can select is Inline Deduplication of Read Caches, so we will select it and
click Save.
Note: For more information, see the Deduplication (Prism Central) section of the
Prism Web Console Guide on the Nutanix Support Portal.
Erasure Coding
As cluster membership sizes increase, we can change the replication factor from 2 to 3 to
handle additional failures. This increase, of course, reduces usable disk space by creating
additional redundant copies of the data.
Nutanix addresses this drawback with the Erasure Coding (EC-X) feature, which increases
usable disk space while maintaining the same cluster resiliency by striping individual data
blocks and associated parity blocks across nodes rather than disks, forming an erasure strip.
In the event of a failure, the system uses the parity block along with the remaining blocks
in the erasure strip to recalculate the missing data onto a new node. All blocks associated
with erasure coding strips are stored on separate nodes. Each node can then take part in
subsequent rebuilds, which reduces potential rebuild time.
EC-X works best on cold data, archives, and backups. Containers with applications that incur
numerous overwrites, such as log file analysis or sensor data, require a longer delay than the
one-hour EC-X post-processing default.
Consider a 6-node cluster with replication factor 2 that contains four data blocks (a, b, c,
and d). The black text in the following figure represents the data blocks and the orange text
represents the copies.
When the data becomes cold, the erasure coding engine calculates the value of parity (P) by
performing an exclusive OR operation. Once parity has been calculated, the copies are removed
and replaced by the parity information as shown in the following figure.
The presence of parity ensures redundancy (because data can be rebuilt in the event of a
failure), and simultaneously provides space savings, because data is now (a + b + c + d + P)
instead of 2 x (a + b + c + d).
In the event of a failure, as shown in the following figure, the missing data block is rebuilt using
the rest of the erasure coded stripe (a, b, d, and P). The restored block (block c, in this example)
is then placed on a node that does not have any other members of this erasure coded stripe.
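The parity calculation and rebuild described above can be illustrated with a toy example. This sketch only demonstrates the exclusive OR idea on four small byte strings. It is not the EC-X implementation, and the block contents and function name are assumptions.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Four data blocks of equal size (illustrative contents).
a, b, c, d = b"AAAA", b"BBBB", b"CCCC", b"DDDD"

# Parity is the XOR of all data blocks in the stripe.
parity = xor_blocks([a, b, c, d])

# Simulate losing block c: XOR the surviving blocks with the parity block.
rebuilt_c = xor_blocks([a, b, d, parity])
print(rebuilt_c == c)

# Space used, counted in blocks: replication factor 2 versus one parity block.
replicated_blocks = 2 * 4
erasure_coded_blocks = 4 + 1
print(replicated_blocks, erasure_coded_blocks)
```

Because XOR is its own inverse, any single missing block in the stripe can be recovered the same way from the remaining blocks and the parity block.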
Note: When a cluster is configured for redundancy factor 3, two parity blocks
are maintained so that the erasure coded data has the same resiliency as the
replicated data. An erasure coded stripe with two parity blocks can withstand the
failure of two nodes.
As with compression and deduplication, you can enable erasure coding when creating a new
storage container or updating an existing one. In the Advanced Settings section of the dialog
box, Erasure Coding is available as a checkbox, similar to Compression and Deduplication.
However, unlike compression and deduplication, erasure coding has specific cluster size
requirements.
As the following figure shows, a minimum of 4 nodes are needed to enable erasure coding in a
cluster with RF2, while at least 6 nodes are needed in a cluster with RF3. If the number of nodes
in the cluster is fewer than the required number, erasure coding cannot be enabled.
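The cluster size rule can be written as a simple lookup. This is a minimal sketch that uses only the two thresholds stated above; the function name is hypothetical.

```python
# Minimum node counts for erasure coding, keyed by replication factor.
MIN_NODES_FOR_EC = {2: 4, 3: 6}


def can_enable_erasure_coding(node_count, replication_factor):
    """Return True if the cluster has enough nodes for erasure coding."""
    return node_count >= MIN_NODES_FOR_EC[replication_factor]


print(can_enable_erasure_coding(4, 2))
print(can_enable_erasure_coding(5, 3))
```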
Note: Before enabling erasure coding, see the Erasure Coding Best Practices and
Requirements section of the Prism Central Guide on the Nutanix Support Portal.
Module 8 Summary
In this module, you learned:
Module
9
MIGRATING WORKLOADS WITH NUTANIX
MOVE
Module 9 Overview
Nutanix Move is a cross-hypervisor mobility solution to move VMs with minimal downtime.
By the end of this module, you will have learned:
Move supports migration from the following sources to targets, where the first platform is the
source and the second platform is the target.
The Nutanix Move VM is typically hosted on the target AHV cluster, and its software services
can be grouped into three major components.
• Management server. The management server maintains source and target cluster
information, as well as migration plan details and current status. It also allows APIs and the UI
to create and manage migration plans.
• Agents for source and target. The source agent is a platform specific (ESXi, Hyper-V, or
cloud) software component that schedules migration copy requests through disk readers.
It collects source cluster and VM details and helps the user select the VMs to migrate using
the management server UI. The target agent collects and keeps inventory information for
the target cluster, allowing you to create migration plans. It also mounts the container in the
target to prepare the disk writer to copy the data. At cutover, the target agent converts disk
formats to support AHV.
• Disk readers and writers. Disk reader processes use source-specific APIs to read data and
coordinate with disk writer processes to complete outstanding copy operations. The disk
reader checkpoints copy operations to handle any failures and resume operations as needed.
Note: Visit this link to download Nutanix Move from the Nutanix Support Portal.
To install Nutanix Move on AHV, you need to create a new VM and mount the QCOW2
image so that you can log into the Move UI. To do this, first log into Prism and upload the Move
QCOW2 image to the Image Service. Then, when creating the VM:
• Configure the VM to have at least 2 cores, 2 vCPUs per core, and 8 GB of memory.
• Ensure that the network that the VM is connected to can reach both the source
vCenter Server and the target AHV cluster.
After the Move VM has been created and powered on, you can log into the Move UI, access the
Move dashboard, and perform VM migrations.
Note: For step-by-step instructions, see the Move Deployment section of the Move
User Guide on the Nutanix Support Portal.
Note: For more information and additional links on each major task, see the Move
Dashboard section of the Move User Guide on the Nutanix Support Portal.
Note: Move does not currently support all hypervisor environments as migration
sources or non-AHV environments as targets, so heterogeneous source and target
environments currently require other migration methods.
While details of specific steps vary based on the source and target clusters that are selected for
migration, the overall migration workflow is as follows:
1. Log in to Move.
When performing a migration, Move checks to ensure that the target environment has enough
compute and storage resources to support the VMs added to a migration plan. Move can sort
VMs by whether you can migrate them and provides a summary to indicate why you cannot
migrate certain VMs (for example, if the VM does not have VMware tools installed or meet
virtual hardware version level minimums). Move supports the same virtual guest operating
systems that AHV supports.
For ESXi sources, Move uses VMware vSphere Storage APIs: Data Protection (VADP) to
manage the replication process, so you do not need to have agents installed in the VMs or the
ESXi hosts. You have the option to allow Move to connect to VMs directly in order to install the
VirtIO drivers compatible with AHV and to capture network settings to carry over to the target
environment. You can specify the credentials to connect to the VMs selected in a plan either
for all VMs at once or individually as needed. You can also specify network mappings to match
the source and destination networks for the VMs. By defining a migration schedule, you can set
data seeding to start in a predetermined window.
Once you have configured the options described, the migration can begin, using VADP to
seed the data to the AHV cluster. This process involves creating ESXi-based snapshots for
each VM, then replicating the virtual disks to the specified AHV container. You can pause or
abort migrations in progress at any time. Move stores the VMDK files for the migrating VMs in
a temporary folder and incrementally uses changed-block tracking (CBT) APIs and continued
snapshot operations to keep them up to date.
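To make the seeding and incremental update idea concrete, the following toy model treats a disk as a list of blocks. It does not call VADP or any Move API; the function names and block values are assumptions used only to show why copying changed blocks is cheaper than recopying the whole disk.

```python
def seed(source_disk):
    """Initial full copy of every block (the data seeding pass)."""
    return list(source_disk)


def incremental_sync(source_disk, target_disk, changed_blocks):
    """Copy only the blocks reported as changed since the last sync."""
    for index in sorted(changed_blocks):
        target_disk[index] = source_disk[index]
    return len(changed_blocks)


# Hypothetical four-block disk.
source = ["b0", "b1", "b2", "b3"]
target = seed(source)

# The guest keeps writing after the seed; the tracker records which blocks changed.
changed = set()
source[1] = "b1-new"
changed.add(1)
source[3] = "b3-new"
changed.add(3)

copied = incremental_sync(source, target, changed)
changed.clear()
print(copied, target == source)
```

Only two of the four blocks are copied in the incremental pass, which is why the final synchronization at cutover can be short.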
When it is time to cut over and complete the migration, Move powers off the source VMs and
disconnects the vNICs. Incremental data then synchronizes over to the AHV cluster. Once all
data replication is complete, Move uses the AHV image service to convert the VMDK files to
the native RAW format used by AHV. Because the disk formats are the same, conversion from
VMDK to RAW is extremely fast—each disk converts in just a few seconds, limiting downtime.
Move also provides an estimated cutover time, so you can determine any maintenance window
in advance.
You can choose to cut over VMs in a plan either together or separately. To complete the
migration, Move powers the VMs on in the target environment and removes all temporary
VMDK files and converted images in the AHV image service. Although the source VMs are now
powered off and disconnected from their networks, they persist in case you need them for any
reason.
Note: For step-by-step instructions involving various types of source and target
environments, see the Move User Guide on the Nutanix Support Portal.
In the Upgrade dialog box, there are two ways to perform an upgrade. If a connection to the
internet is available, the latest available version will be displayed, and you can click Upgrade to
begin the process. If no internet connection is available, you can download the Offline upgrade
package from the Nutanix Support Portal and upload it here. After the upload is complete, click
Upgrade to begin the process.
Note: For step-by-step instructions, see the Move Upgrade Management section
of the Move User Guide on the Nutanix Support Portal. For information about other
management tasks, such as changing passwords and uninstalling Move, see the
Move Administration section.
A list of logs will be displayed in a new tab or window. On the logs page, you can filter logs by
component, choose a specific log, or use your browser's search feature to find specific items or
logs.
You can also download a collection of logs and make them available to Nutanix Support if you
need assistance with troubleshooting Nutanix Move. The bundle includes Move logs, agent logs
in the case of Hyper-V, and ping statistics for source ESXi and target AHV hosts.
To download the bundle, click the Settings (gear) icon and select Download Support
Bundle. The file is a .gz archive whose filename contains move-support-bundle and the date of
the download.
Note: For step-by-step instructions on investigating issues with Nutanix Move, see
the Move Troubleshooting section of the Move User Guide.
Module 9 Summary
In this module, you learned:
Module
10
MONITORING CLUSTER PERFORMANCE
Module 10 Overview
This module will introduce you to Prism Central's performance monitoring capabilities.
The Analysis dashboard allows you to create custom charts or use built-in ones to analyze
various aspects of cluster performance. The Sessions dashboard allows you to create custom
groupings of charts and data, and switch between them quickly and easily for analysis.
The Reports dashboard allows you to create, customize, and export reports about your
infrastructure resources and deliver them directly to a mailbox.
And, for cluster monitoring at a glance, Prism Central also allows you to customize the main
dashboard and add widgets that cater specifically to your monitoring needs.
So, in this module, we will explore five key elements of performance monitoring in Prism
Central:
To access the Analysis dashboard in Prism Central, click the Entities menu, expand the
Operations category, and select Analysis.
When you open the Analysis dashboard, it will display the last session that you were working
on. If you are accessing the Analysis dashboard for the first time, you will see a default, system-
generated session, named Analysis Session.
A session is simply a grouping of charts and metrics, which you can toggle between at will,
so you can examine different infrastructure and entity metrics without having to create new
or custom charts multiple times. The default, system-generated session does not contain any
performance data. However, as shown in the following figure, it displays a chart of alerts and
events over time called the Alerts and Events monitor. Moving your mouse cursor over an item
in the chart will display how many critical alerts, warning alerts, and events were generated at that
point in time.
On the Analysis dashboard, you can perform a number of different actions. In addition to
creating and switching between sessions (which we will discuss in the next section), you can
also:
Clicking the Alerts and Events link at the top right of the page will display a pane with two tabs:
Alerts, and Events. On the Alerts tab, the tab itself displays the number of critical and warning
alerts, color coded in red and yellow respectively. As the following figure shows, our test cluster
has 20 critical alerts and 28 warning alerts. These alerts are presented in chronological order,
with the most recent alert appearing first in the list. By default, alerts are categorized by Entity
Type (Prism Central VM, Cluster, Host, VM, Remote Site, and Volume Group as shown in the
following figure), but you can use the Group By drop-down list to sort the list by either Impact
or Severity.
Each group of alerts can be expanded or collapsed using the Show and Hide links in each row.
Each alert entry is also clickable. So, for example, if we were to click a single alert in the Prism
Central VM group, the corresponding alert details page would be displayed. From here, as
shown in the following figure, you can view more information about the alert, examine possible
causes and resolutions, and acknowledge or resolve the alert as needed. Closing this alert
details page will display the Analysis dashboard once more.
Like the Alerts tab, the Events tab also displays the total number of cluster events on the tab
itself - over 99 for our test cluster. Clicking the Events tab displays events in three categories:
User Action, Behavioral Anomaly, and System Action. Each group can be expanded or
collapsed, and each event is clickable, so selecting an event will allow you to view more details
on the event details page.
Finally, you can also customize the date and time range for which the Analysis dashboard
displays data. Clicking the Range drop-down list at the top right of the page will display
options for 3 hours, 6 hours, 1 day, 1 week, 1 month, and 3 months, as shown in the following
figure. Selecting any of the options will update all charts on the page to display information for
the selected range.
Selecting Custom Range from the drop-down list will display a dialog box that allows you to
select both a start and end date as well as a start and end time for the charts displayed on the
Analysis dashboard.
Note: For more information, see the Analysis Dashboard section of the Prism
Central Guide on the Nutanix Support Portal.
The Sessions dashboard is accessed from the Analysis dashboard. To view it, click the
Switch Session link at the top left of the screen and select View All Sessions.
The Sessions dashboard displays a list of both system-defined and custom sessions, and
includes a number of options for filtering, creating, and managing sessions. As the following
figure shows, only one system-defined session is available by default: the Analysis session.
We also have a custom session available that was created to track and monitor cluster
performance.
From the Sessions dashboard, you can filter sessions by keywords in the name or description,
add columns to the table using the View by option, edit a session's details, or delete the
session. Note that only custom sessions can be deleted; system-defined sessions cannot be
deleted. However, both system-defined and custom sessions can be edited.
Note: For more information, see the Sessions Dashboard section of the Prism
Central Guide on the Nutanix Support Portal.
Creating a Session
When you click the Create Session button at the top left of the Sessions dashboard, you will
see what appears to be a copy of a blank Analysis dashboard page as shown in the following
figure. This page will contain the Alerts and Events monitor, and all of the options that are
available on the system-generated Analysis session - Switch Session, Add Charts, Alerts and
Events, and Actions.
Navigating away from this page without making any customizations - such as changing the
name or adding a chart - will not save the session.
The first task when creating a new session is to update the name and (optionally) add a
description. There are two ways to update the name. You can click the Edit (pencil) icon at the
top left of the screen, next to the name, and change the name directly.
Alternatively, you can click the Actions link at the top right of the screen and select Edit
Session Details. This will display a dialog box where you can update the name and add a
description, as shown in the following figure.
From the Actions menu, you can also delete the session that you are currently viewing, or
close it. Clicking Delete will prompt you for confirmation. Clicking Close will return you to the
Sessions dashboard.
Note: For step-by-step instructions, see the Creating a New Session section of the
Prism Central Guide on the Nutanix Support Portal.
A metric chart allows you to monitor the performance of a single metric for one or more
entities of the same entity type. For example, if you were to create a single chart that monitors
CPU utilization for all CVMs in a cluster, that would be a metric chart.
An entity chart allows you to monitor multiple metrics for a single entity. For example, if you
were to create a single chart that monitored the CPU and memory for a single VM, that would
be an entity chart.
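The difference between the two chart types comes down to which dimension is fixed. The sketch below models this with a small data class. It is not a Prism Central API; the class, the VM names, and the "unsupported" fallback are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chart:
    title: str
    entities: tuple  # names of the entities being charted
    metrics: tuple   # names of the metrics being charted

    def kind(self):
        """Classify the chart using the definitions given above."""
        if len(self.metrics) == 1 and len(self.entities) >= 1:
            return "metric chart"   # one metric, one or more entities
        if len(self.entities) == 1 and len(self.metrics) > 1:
            return "entity chart"   # several metrics, a single entity
        return "unsupported"


# Hypothetical entity names.
cvm_cpu = Chart("CVM CPU", ("cvm-1", "cvm-2", "cvm-3"), ("CPU usage (%)",))
vm_overview = Chart("VM overview", ("vm-1",), ("CPU usage (%)", "Memory Usage (%)"))
print(cvm_cpu.kind())
print(vm_overview.kind())
```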
In this section, we will create a pair of metric charts and add them to our sample session.
Let's start with the VM CPU Utilization session that we just created. To add a new chart to it, we
need to click the + Add Chart link at the top right of the page, or the + Add Chart button in the
middle of the page. Clicking either will display the Add Chart dialog box.
This dialog box allows you to provide specifications for multiple charts (if necessary) and add them
to your session with just a few clicks. For our first chart, as shown in the following figure, we are
going to select VM as the entity type, and then select all CVMs that are available. Our metric,
since we are going to monitor CPU utilization, will be CPU usage (%).
Clicking the + Add Another Chart link in the dialog box will allow us to define a second chart.
In this second chart, as shown in the following figure, we will select VM as our entity type again
but, this time, we will select all of the guest VMs on the clusters managed by Prism Central
instead.
Clicking the Add button will make both of these charts available on the dashboard, as shown in
the following figure.
Once a chart has been added to the Session, you can use the checkboxes below the chart to
refine the data that you are looking at. For example, if we wanted to view CPU utilization for
only NTNX-CDEV7-3-CVM and NTNX-CDEV7-1-CVM, we could clear the other checkboxes and
view data only for the entities that we are interested in.
Now that our two charts have been created, for the purposes of illustration, we are going to
power on a few VMs that are currently off, and then wait for a little while to generate some
utilization data. As you can see in the following figure, a series of new events have been
generated.
Clicking the corresponding bar on the Alerts and Events monitor displays the Events tab, where
we can see a series of memory-related anomalies as shown in the following figure.
The one we are interested in is the KA-VM-CLONE anomaly, since that is one of the VMs that
we are monitoring. However, both of our charts are for CPU utilization, so they currently display
no relevant data. There are two potential ways to address this: we can either create a new chart
for memory utilization, or change our guest VM CPU usage chart to display memory utilization
data instead.
To change our existing chart to display new data, we need to click the options icon on the
second VMs CPU Usage (%) chart and select Edit, as shown in the following figure.
In the Edit Chart dialog box, we need to change the metric to Memory Usage (%) and rename
the chart to VMs Memory Usage (%).
Clicking Update will change the chart on our current session. Now, moving the mouse cursor
over the chart at the time corresponding to the event, we can see an increase in memory
utilization by our KA-VM-CLONE. The change in memory utilization is not high enough to
warrant an alert, but is significant enough that the Analysis dashboard in Prism will alert you to
its occurrence.
So, in this way, by creating and updating charts, and comparing their data with alerts and
events generated by Prism, you can quickly and easily identify, understand, and (when
necessary) remediate potential issues as and when they may arise on a Nutanix cluster.
Note: For more information and step-by-step instructions, see the Chart
Management section of the Prism Central Guide on the Nutanix Support Portal.
The Reports dashboard will be displayed, with a list of all system-defined and custom reports.
Two reports are available out-of-the-box: Environment Summary and Cluster Efficiency
Summary. These can be identified by the System tag next to their names.
The dashboard displays the name of the report, the owner, whether or not a schedule has
been defined for the report, the frequency with which the report will be automatically run (if a
schedule has been defined), and the date and time when the report was last updated.
Clicking the name of a report on the dashboard will display a report details page
with information about how many times the report has been run, with additional options to run,
edit, clone, or delete the report. Selecting an instance of the report will allow you to resend the
report (if the ability to email reports has been configured) or delete that particular instance
from the list. You can also download a PDF of all reports that have been successfully generated.
Clicking the PDF link in the instances table will first display the report in your browser. From
here, you can view the PDF, click through sections using the table of contents hyperlinks in the
left pane, download the file to your local computer, and prepare the document for printing.
Note: For more information, see the Reports Management section of the Prism
Central Guide on the Nutanix Support Portal.
The first step in creating a custom report is to name it. To rename the report, click the Edit
(pencil) icon that appears at the top middle of the page, enter the name in the field provided,
and click OK to save your changes.
Next, we need to add views. Both custom and predefined views are available, on two different
tabs of the New Report page. Predefined views include licensing information, alerts details,
runway information, and information about inactive, constrained, overprovisioned, or bully VMs.
Custom views allow you to create custom charts and specify the type of data that you want to
see. For example, if this were a report about hosts in the cluster, we could include a bar chart
with information about physical resource consumption on all hosts in the cluster, as shown in
the following figure.
However, in this example report, we are interested in two types of data: the amount of runway
we have for CPU, memory, and storage, and the amount of CPU, memory, and storage that can
be reclaimed to save space. These are predefined views. To add them to your report, simply
click the name of the view in the left pane.
We are going to add six predefined views to our Runway and Reclamation Report: CPU runway,
memory runway, storage runway, potential CPU reclaim, potential memory reclaim, potential
storage reclaim. As the following figure shows, once the views have been added, a preview will
appear that shows how the data will look. Our runway views, for example, will appear as line
charts.
After you add your views, you can also change the sequence in which they are displayed. Each
block can be dragged and dropped so, in our sample report, we can adjust our reclamation
views so that they all appear in a single row.
Next, we have several options. Now that the report has been named and views have been
added, we can save the report, save and run the report, add a schedule, or customize the
report's appearance and settings. Report scheduling and customization are both optional.
However, in the next sections, we will take a look at some of the options that are available, and
generate a sample report using the default report settings.
Note: For step-by-step instructions, see the Creating a New Report section of the
Prism Central Guide on the Nutanix Support Portal.
Customizing these settings is optional. Even if you keep all of the default settings, you will still
be able to run reports in Prism Central, view them online, and download them to your local
computer if necessary. However, if you do want to make customizations, click the Report
Settings button.
On the Appearance tab, you can upload a custom logo, change the color of the background on
the cover page and the header on content pages, and add a copyright statement.
On the Email Settings tab, you can configure recipient email IDs and the subject and message
of the email itself. If an SMTP server has been configured correctly, Prism will indicate this at the
top of the dialog box.
The text in the Subject and Body fields cannot be edited directly. However, in the two fields
provided, you can enter text that will be added before the subject line and after the default
body text as shown in the following figure.
On the Report Retention tab, you can define how many report instances will be saved or for
how long they will be saved. Selecting Number of Instances will allow you to specify how many
unique reports you want Prism Central to retain. A maximum of 25 reports can be saved. If you
already have 25 saved instances and generate a new report, the oldest instance will be deleted.
Selecting Time Duration, on the other hand, will allow you to specify (in days) how long a
report instance should be saved for. Reports can be saved for a maximum of 3 months.
Finally, you can choose to save no reports at all using the Do not retain reports checkbox.
If you choose not to define a custom retention policy, 10 instances of the report will be saved
by default.
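The instance-count retention behavior resembles a fixed-size queue that drops its oldest item when full. The following sketch models that behavior with Python's collections.deque. It illustrates the rule described above, not Prism Central's implementation, and the report names are hypothetical.

```python
from collections import deque

MAX_INSTANCES = 25      # upper limit stated above
DEFAULT_INSTANCES = 10  # used when no custom retention policy is defined


def make_retention(instances=DEFAULT_INSTANCES):
    """Create a store that keeps only the most recent report instances."""
    if not 1 <= instances <= MAX_INSTANCES:
        raise ValueError(f"instances must be between 1 and {MAX_INSTANCES}")
    return deque(maxlen=instances)


saved = make_retention(25)
for run in range(1, 27):  # generate 26 report instances
    saved.append(f"report-run-{run}")

# The first instance was dropped when the 26th was saved.
print(len(saved), saved[0], saved[-1])
```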
On the Report Format tab, you can choose to generate a PDF or CSV report using the
checkboxes provided. PDF is selected by default. However, these options are not mutually
exclusive. As shown in the following figure, you can generate both a PDF and a CSV output for
each report that you create.
Finally, on the Time Zone tab, you can choose the time zone for the report. The default is UTC,
but all common time zone options are available.
Once you have made your changes, you can click Save to save your custom settings. If you
want to discard your changes, you can either click Cancel or click the Reset to Default button,
which will restore the settings on all tabs to their default values.
Note: For more information and step-by-step instructions, see the Configuring
Report Settings section of the Prism Central Guide on the Nutanix Support Portal.
Scheduling a Report
Scheduling a report is only necessary if you want Prism Central to generate the report
automatically on a regular basis. To define a schedule for a report, when creating a new report
or editing an existing one, click the Create Schedule button at the top right of the page.
You can choose to set a daily, weekly, monthly, or yearly schedule for the report. If you choose
to run the report daily, you can also specify the time at which to run the report. Weekly reports
allow you to select a day of the week and a time. When scheduling a monthly report, you can
specify the day of the month and the time. If you choose yearly, you can choose the month,
date, and time.
In addition, you can specify the duration for which data should be presented in the report.
The options available are last 24 hours, last week, last month, or a custom date range. If you
select the Email Report checkbox (and have configured an SMTP server) you can specify email
recipients as well.
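To make the weekly option concrete, the following Python sketch computes when a weekly schedule would next fire. It is an illustration only, assuming the schedule is defined by a weekday and a time of day; the function name next_weekly_run is hypothetical and does not correspond to any Prism Central API.

```python
from datetime import datetime, timedelta


def next_weekly_run(now, weekday, hour, minute=0):
    """Return the next datetime strictly after `now` that falls on `weekday`
    (Monday=0 ... Sunday=6) at the given time of day."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # The slot for this week has already passed, so move to next week.
        candidate += timedelta(days=7)
    return candidate


# Example: a report scheduled for every Sunday at 10 PM,
# evaluated on Monday, November 21, 2022 at 9 AM.
print(next_weekly_run(datetime(2022, 11, 21, 9, 0), weekday=6, hour=22))
```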
A sample weekly schedule, with the report set to run every Sunday at 10 PM, will look like this.
Note: For step-by-step instructions, see the Scheduling a Report section of the
Prism Central Guide on the Nutanix Support Portal.
Prism Central's main dashboard comes with a collection of out-of-the-box widgets that provide
insight into the health, performance, and overall status of your cluster. This dashboard is
highly customizable. Widgets can be added, removed, or rearranged to make infrastructure
monitoring both simple and comfortable.
As an example, the Plays widget appears in the third row of the main dashboard by default, as
shown in the following figure.
However, if we wanted it to appear in the first row, we could simply click and drag the widget
to whatever vertical and horizontal position is required. Note that moving widgets will result
in widgets being swapped, and not pushed horizontally. So, for example, if we were to move
our Plays widget from the third row to the first row, in the place of the Cluster Quick Access
widget, the Cluster Quick Access widget will move down to the third row. No other widgets on
the page will be affected.
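The swap behavior can be sketched in a few lines of Python. This is a simplified model that treats the dashboard as a grid of rows and columns; the function name move_widget and all widget names other than Plays and Cluster Quick Access are placeholders.

```python
def move_widget(grid, src, dst):
    """Swap the widgets at positions src and dst; each is (row, column)."""
    (r1, c1), (r2, c2) = src, dst
    grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]
    return grid


# Hypothetical three-row layout with Plays in the third row.
dashboard = [
    ["Cluster Quick Access", "Widget B", "Widget C"],
    ["Widget D", "Widget E", "Widget F"],
    ["Plays", "Widget H", "Widget I"],
]

# Drag Plays from the third row to the first row.
move_widget(dashboard, src=(2, 0), dst=(0, 0))
for row in dashboard:
    print(row)
```

Only the two positions involved change; every other entry in the grid stays where it was.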
To delete a widget, move your mouse cursor over the top right of the widget, click the x button
that appears there, and provide confirmation when prompted.
Deleting a widget from a particular column will move all widgets in that column up by one
position. No other widgets on the page will be affected. There will also be no horizontal
movement of widgets, so a blank space may appear on the page, as shown in the following
figure.
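The effect of a deletion can be modeled by treating each dashboard column as an independent list. The sketch below is an assumption-based illustration, not Prism Central code; the function names and widget names are hypothetical.

```python
def delete_widget(columns, col, name):
    """Remove `name` from one column; widgets below it move up by one."""
    columns[col].remove(name)
    return columns


def as_rows(columns):
    """Render the columns as rows, using None for blank spaces."""
    height = max(len(c) for c in columns)
    return [[c[r] if r < len(c) else None for c in columns] for r in range(height)]


# Hypothetical layout: three columns of three widgets each.
columns = [["A", "D", "G"], ["B", "E", "H"], ["C", "F", "I"]]
delete_widget(columns, col=1, name="E")
for row in as_rows(columns):
    print(row)
```

Because only the affected column shifts, the blank space ends up at the bottom of that column while the other columns are unchanged.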
In addition to moving and deleting existing widgets, you can also add new ones to Prism
Central's main dashboard. To do this, click the + Add Widgets button at the top right of the
page. On the Add Widgets page, you will see several options to add widgets from different
groups. You can add:
Selecting a widget will display a preview in the middle of the page. The pane on the right will
indicate whether or not customizations are possible based on the widget you have selected.
After you select a widget, you can click the Add to Dashboard button to make the widget
available without leaving the Add Widgets page. This will allow you to continue to add as many
widgets as you require. Alternatively, you can click the Add & Return to Dashboard link to add the
widget and close the Add Widgets page.
Adding a new widget will include it on the dashboard after the last widget. As you can see in
the following figure, any blank spaces that are on the page as a result of deleting or moving
widgets will be ignored, and the new widget will appear at the very end of the page.
You can reset Prism Central's main dashboard at any time by clicking Reset Dashboard at the
top right of the page. This will remove any custom widgets that you have added, restore any
deleted widgets, and undo any customizations in widget position.
Note: For more information and step-by-step instructions, see the Main Dashboard
section of the Prism Central Guide on the Nutanix Support Portal.
To create a custom dashboard, on the main dashboard of Prism Central, click Manage
Dashboards.
If no custom dashboards are currently available, the Manage Dashboards dialog box will
indicate this and prompt you to create a new one. Click the New Dashboard button.
As a first step, we will be prompted to name our new dashboard. Since this one will be used
exclusively to monitor the Prism Central VM, we will name it PC VM Performance and click
Save.
After the dashboard has been saved, Prism Central will display the new, blank dashboard. The
name of the custom dashboard will appear at the top of the page between the Main Dashboard
and Manage Dashboards buttons. By default, when a custom dashboard is created, it will
contain no widgets. You will need to add new widgets manually. To do this, click the Add
Widgets link in the middle of the page or the + Add Widgets button at the top right.
Clicking Add Widgets will display the same page that we saw in the previous section. Here,
we will add three Custom Chart widgets. As shown in the following figure, our first chart is
for Hypervisor CPU usage. The entity type is virtual machine. The VM we select is our Prism
Central VM, PC-1. And our metric is Hypervisor CPU Usage (%). To add the chart to our new
dashboard, we need to click Add to Dashboard.
Next, we are going to repeat the process and add two more charts: one for memory usage and
another for disk usage. Both charts use the same entity type (virtual machine) and the same VM
(PC-1). After we add our three charts, if we return to our custom dashboard, it will look like this.
As with the main dashboard, you can add more widgets, remove existing ones, or change the
position of widgets on the page to further personalize and customize your viewing experience.
Note: For step-by-step instructions, see the Creating a New Dashboard section of
the Prism Central Guide on the Nutanix Support Portal.
Module 10 Summary
In this module, you learned:
Module 11
MONITORING CLUSTER HEALTH
Module 11 Overview
This module describes how to use Prism Central and Prism Element to monitor the health of a
Nutanix cluster and its various components.
By the end of this module, you will have learned:
• How to find summarized and detailed cluster and entity health information in Prism Central.
• How to obtain summarized and detailed cluster and entity health information in Prism
Element.
• What Nutanix Cluster Check (NCC) is, how to run NCC, and how to schedule health checks in
Prism Element.
• Some of the common log types that are available in Prism Element.
• How to collect logs directly from Prism Element web console.
• How to configure a syslog server in Prism Central for log forwarding.
Additionally, since you can customize the main dashboard or create your own dashboard, you
can also add widgets that provide additional health information. For example, we can add a
cluster information widget, as shown in the following figure. This widget provides a summary of
cluster anomalies in the last 24 hours, the resource runway that is available, and the number of
inefficient VMs on the cluster.
Each of the numbers in the widget is clickable. So, for example, if we wanted to investigate
the anomalies in more detail, we would click the number 44 in the widget. This will display a
filtered version of the Events dashboard (with filters for event type, time, and cluster), with a list
of behavioral anomalies that we can review and remediate as needed.
To access summarized health information for all entities in all clusters managed by Prism
Central, in the search field at the top left of the screen, type health and press Enter. Prism
Central will display a page with a list of all entities, their name, health status (good, warning,
critical, unknown), latency information, and critical and warning alert counts.
Entities are displayed in different groups, as shown in the following figure. The groups available
are clusters, hosts, VMs, storage containers, and disks.
However, this page only contains summarized information. If we want details, we need to use
the View All links at the bottom of each group of information. For example, our CDEV7 cluster
has a health status of Critical, which warrants investigation and remediation.
To address this, we can click the View All 2 Clusters link, which will display the Clusters
dashboard. From here, we can navigate to alerts and events pages, view metrics, and view the
details page specifically for CDEV7 in order to make configuration adjustments as needed.
However, Prism Element has a dedicated Health dashboard, which provides detailed
information about all entities in a single cluster, and allows health checks to be run from directly
within the web console. As a result, in this module, we will focus on monitoring cluster health
using Prism Element.
The Health widget displays the health status for the cluster as a whole, as well as cluster
entities. The three possible health states are good, warning, and critical, indicated by green,
yellow, and red heart icons respectively. Scrolling down in the widget will display the health
status of hosts, VMs, cluster services, disks, storage containers, and storage pools. Clicking the
widget will take you to the Health dashboard, where you can view more details about each or
all of these entities.
The Data Resiliency Status widget summarizes the number of failures that a cluster can
withstand. In the following figure, our cluster can withstand the failure of a single node.
Moving your mouse cursor over the question marks next to Failure Domain and Fault Tolerance
will display a small popup with more information about each item.
Clicking the widget will display a Data Resiliency Status details page, which shows how many
individual component failures the cluster can withstand based on the currently configured
failure domain, as shown in the following figure.
Finally, there are four different alerts and events widgets in the rightmost column of the Home
dashboard. The Alerts widgets display the most recent unresolved alerts on the cluster. Each
set of alerts is displayed in a separate widget, with one widget each for critical, warning, and
informational alerts. Clicking an alert message in any of the widgets will open the Details page
for that specific alert on the Alerts dashboard.
The Events widget displays the total number of events that have occurred on the cluster, as
well as how recently the last event occurred. Clicking the Events widget will open the Events
view of the Alerts dashboard.
Note: For more information, see the Home Dashboard section of the Prism Web
Console Guide on the Nutanix Support Portal.
To view the Health dashboard, click the main menu’s drop-down list and select Health.
• The left pane consists of tabs for each entity type that the cluster monitors.
• The middle pane consists of a total count of entities monitored at the top, and the same
breakdown of the entities that are in the left pane but with more detailed information.
• The right pane provides access to Nutanix Cluster Check, and allows you to view information
about checks that have been run, manage health checks, and download logs.
By default, the left pane displays a summary of entities that are healthy or in a critical or
warning state in a total of 8 categories. These categories are VMs, hosts, disks, storage pools,
storage containers, protection domains, remote sites, and cluster services. Each row in the left
pane is clickable, but all rows are collapsed by default.
If we were to click a row (VMs, for example), the left pane would expand to show additional
options for groups, and the middle pane would also expand based on our selection in the left
pane.
Upon selecting an entity type, the left pane expands to show additional groupings of entities.
In this example, we can view health information for VMs grouped by host, memory, reserved
memory, disk capacity, and so on.
The middle pane also shows the same groupings, but with more details. Also note that the
number of entities displayed above the pane, in the Currently Watching row, has updated
to reflect our selections. While there are a total of 49 entities in the cluster, we are currently
viewing health information for 14.
In the middle pane, you can move your mouse cursor over an entity name or a graphic
representation of data to learn more about each item in the table. For example, if we were to
look at the Host row, we would see that on host CDEV7-1, there are 6 VMs. Three of them are
critical, one has a status of warning, and two are healthy.
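The per-host breakdown in the Host row example can be reproduced with a small Python sketch. The counting mirrors what the table shows; reporting the worst state present is an assumption made for illustration, and the function name summarize is hypothetical.

```python
from collections import Counter

# Ordered from best to worst, so a higher index means a worse state.
STATES = ["good", "warning", "critical"]


def summarize(vm_states):
    """Count VMs per health state and report the worst state present."""
    counts = Counter(vm_states)
    # Assumption: the group is flagged with its worst member's state.
    worst = max(counts, key=STATES.index) if counts else None
    return counts, worst


# The six VMs on host CDEV7-1 from the example above.
counts, worst = summarize(["critical"] * 3 + ["warning"] + ["good"] * 2)
print(dict(counts), worst)
```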
Each group in the left pane and the middle pane is clickable. As you can see in the previous
figure, moving the mouse cursor over the Host row in the middle pane highlights the host
group in blue in the left pane. Clicking a row in either the left or middle pane will display more
information. For example, clicking the Memory group displays a custom page with our 14 VMs
separated into 5 groups based on their memory configuration. This page also includes a list of
detailed filters, so you can further refine your view by health, VM type, power state, or number
of CPUs.
As the previous figure shows, only one group of VMs appears to be problematic - the one
with memory configured in the 4 to 8 GB range. If we wanted to focus our attention on these
specific VMs, clicking the See all in this group link will display the following page.
All of the filters can be applied here as well and there is a row of circular icons below the
graphic health display that corresponds to each of the five problematic VMs. Moving your
mouse cursor over any one of these icons will display the name of the VM and its configuration
so that you can review the VM and consider making adjustments as needed.
Note: For more information, see the Health Dashboard section of the Prism Web
Console Guide on the Nutanix Support Portal.
In the next sections, we will discuss the right pane of the Health dashboard, and how you can
use it to perform health checks, manage health checks, and collect logs.
In Prism Element, NCC can be run from both the web console and the command line. In Prism
Central, NCC can be run from the command line only. Irrespective of where they are run from,
NCC tests can complete with one of the following status types:
• PASS. This means that the tested aspect of the cluster is healthy and no further action is
required. A check can also return a PASS status if it is not applicable.
• FAIL. This means that the tested aspect of the cluster is not healthy and must be addressed.
This message requires an immediate action. If you do not take immediate action, the cluster
might become unavailable or require intervention by Nutanix Support.
• WARN. This means that the plugin returned an unexpected value that you must investigate.
This message requires user intervention, which you should provide as soon as possible to help
maintain cluster health.
• INFO. This means that the plugin returned an expected value that cannot, however, be
evaluated as PASS/FAIL. The plugin returns information about the tested cluster item.
In some cases, the message might indicate a recommendation from Nutanix that you
implement as soon as possible.
• ERR. This means that the plugin failed to execute. This message represents an error with the
check execution and not necessarily an error with the cluster entity. It states that the check
cannot confirm a PASS/INFO/WARN/FAIL status.
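If you export or record NCC results, the five status types above lend themselves to a simple triage step. The following Python sketch is one possible way to do that; the check names are placeholders, and the choice to treat ERR as needing follow-up is an assumption rather than something NCC prescribes.

```python
from collections import Counter
from enum import Enum


class NccStatus(Enum):
    PASS = "PASS"
    INFO = "INFO"
    WARN = "WARN"
    FAIL = "FAIL"
    ERR = "ERR"


# Statuses treated here as requiring follow-up (ERR is included by assumption,
# because the check could not confirm any other status).
NEEDS_ATTENTION = {NccStatus.WARN, NccStatus.FAIL, NccStatus.ERR}


def triage(results):
    """`results` maps a check name to a status string such as 'FAIL'."""
    parsed = {name: NccStatus(status) for name, status in results.items()}
    summary = Counter(status.name for status in parsed.values())
    follow_up = sorted(n for n, s in parsed.items() if s in NEEDS_ATTENTION)
    return summary, follow_up


# Hypothetical results; real NCC check names will differ.
summary, follow_up = triage(
    {"check_a": "PASS", "check_b": "FAIL", "check_c": "WARN", "check_d": "INFO"}
)
print(dict(summary), follow_up)
```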
In Prism Element, these results are displayed in the right pane of the Health dashboard, on the
Summary tab, as shown in the following figure.
Additionally, the Summary pane also displays the total number of NCC checks that are
available, whether any of the checks have been disabled (the Off row), and the number
of checks that have been scheduled to run automatically, the number that need to be run
manually, and the number that are triggered automatically by events in the cluster.
Moving your mouse cursor over any of the rows will display a popup with more details. For
example, if we were to look at the 15 failed checks, we would see that 8 of those failed checks
were scheduled, 4 were run manually, and 3 were triggered by some cluster event.
Clicking any of the rows will display a filtered list of checks corresponding to that particular
check status. For example, if we want to see which specific checks have failed, we can click
the Failed row. This will switch us to the Checks tab, as shown in the following figure, and will
display a detailed list of each failed check in all of the different check groups.
To drill down even further, we can click one of the failed checks. This will display a details page
for that specific check with a description of the check itself, a brief history of the check, the
cause of failure, and potential resolutions. From this page, you can also turn the check off and
update the alert policy associated with the check. The left pane of this details page contains a
list of all failed checks, so that you can switch between different check details as needed.
Note: For more information, see the Nutanix Cluster Check section of the Nutanix
Cluster Check Guide on the Nutanix Support Portal.
To run a health check, navigate to the Health dashboard and, in the right pane, click Actions.
From the drop-down menu, select Run NCC Checks, as shown in the following figure. This will
display the Run Checks dialog box.
In the Run Checks dialog box, you have three options to choose from. By default, All Checks
will be selected. This will run every available health check; the total number of checks that will
be run is displayed in brackets next to the option.
Alternatively, you can run only the checks that previously had a status of Failed or Warning. This
is useful after an error has been discovered and you have taken some action to remediate
the issue. Re-running failed or warning checks will allow you to determine if remediation was
successful, or if further action is required. For this option as well, Prism will display the number
of checks that will be run - 18 in the case of our test cluster, as shown in the previous figure.
Finally, you can also choose to run specific checks, if there are any specific aspects of cluster
health that you want to investigate. To do this, select Specific Checks and then search for the
check that you want to run in the field provided. You can select multiple checks if required.
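The three options in the Run Checks dialog box amount to three ways of selecting from the full list of checks. The Python sketch below models that selection under stated assumptions: the mode names, the function select_checks, and the check names are all hypothetical and are not part of NCC or Prism.

```python
def select_checks(all_checks, last_status, mode="all", names=()):
    """Return the checks to run for one of the three dialog options."""
    if mode == "all":
        return list(all_checks)
    if mode == "failed_or_warning":
        # Re-run only the checks whose previous result was FAIL or WARN.
        return [c for c in all_checks if last_status.get(c) in {"FAIL", "WARN"}]
    if mode == "specific":
        wanted = set(names)
        return [c for c in all_checks if c in wanted]
    raise ValueError(f"unknown mode: {mode}")


# Placeholder check names and previous results.
checks = ["check_a", "check_b", "check_c", "check_d"]
previous = {"check_a": "PASS", "check_b": "FAIL", "check_c": "WARN", "check_d": "PASS"}

print(select_checks(checks, previous, mode="failed_or_warning"))
print(select_checks(checks, previous, mode="specific", names=["check_d"]))
```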
When you run NCC checks, you can also send a report to specific recipients via email. If
recipients have not been configured previously, when attempting to select the checkbox, you
will see a popup prompting you to do so.
Note: For step-by-step instructions, including how to run NCC via the command
line on Prism Central and Prism Element, see the Run NCC Checks section of the
Nutanix Cluster Check Guide on the Nutanix Support Portal.
In the Set NCC Frequency dialog box, you can define a frequency of every four hours, every
day, or every week. If you choose Every Day, you will also be prompted to select a start time. If
you choose Every Week, you will be prompted to select a day of the week, in addition to a start
time.
If Alert emails have been configured for your cluster, you can also choose to email the results of
these automated checks to a list of recipients.
Note: For step-by-step instructions, see the Scheduling and Automatically Emailing
NCC Results section of the Nutanix Cluster Check Guide on the Nutanix Support
Portal.
The /home/nutanix/data/logs directory stores Nutanix logs. This location contains all the
Nutanix process logs at the INFO, WARNING, ERROR, and FATAL levels. It also contains the
directories for the system stats (sysstats), and Cassandra system logs (cassandra).
Each node monitors itself by running several Linux tools every few minutes, including ping,
iostat, sar, and df.
• The ping command is used to check if a network is available and if a host is reachable. It
allows you to determine if a server is running and can be used to troubleshoot connectivity
issues.
• The iostat command is used to monitor both physical and logical input/output devices by
observing the time for which these devices are active.
• The sar command stands for system activity report, and is used to monitor various aspects
of system performance such as CPU usage, memory usage, disk usage, process and thread
allocation, and so on.
• The df command stands for disk filesystem (or disk free) and is used to check the amount of
free and used disk space.
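To get a feel for the kind of information df reports, the following Python sketch reads the same totals through the standard library. It is only an analogy that can be run on any machine; it is not how the node's self-monitoring is implemented, and the function name disk_report is hypothetical.

```python
import shutil


def disk_report(path="/"):
    """Return total, used, and free space for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        "total_gib": round(usage.total / gib, 2),
        "used_gib": round(usage.used / gib, 2),
        "free_gib": round(usage.free / gib, 2),
        "used_pct": round(100 * usage.used / usage.total, 1) if usage.total else 0.0,
    }


print(disk_report("/"))
```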
The results of these checks, and others, are available in the self-monitoring logs.
The data/logs/consolidated_audit.log file stores the audit logs. The audit log allows you
to view a list of actions performed across the clusters.
Cassandra Logs
Genesis Logs
Stargate Logs
Zookeeper Logs
Note: For more information, including how to analyze and interpret the contents of
these logs, see the Logs section of the Acropolis Advanced Administration Guide on
the Nutanix Support Portal.
Collecting Logs
You can collect logs for Controller VMs, file server, hardware, alerts, hypervisor, and other
components of your cluster. Log collection includes logs and configuration information for one
or more Controller VMs; configuration information for hypervisors; logs generated by sysstats
utilities; and information about alerts.
Prism Element allows you to collect logs directly from the web console. To do this, navigate to
the Health dashboard in Prism Element and, in the right pane, click Actions. From the drop-down
menu, select Collect Logs.
On the Collect Logs page, you will need to provide three major pieces of information: the nodes
for which you want to collect logs, log settings, and output preferences. On the Node Selection
page, click the Select Nodes button.
The Select Nodes dialog box will display a list of all available nodes in the cluster. In our test
cluster, we have three nodes. We will select two of those nodes and click Done.
The nodes we chose will be displayed on the Node Selection page. It is possible to add
additional nodes or remove the ones listed on this page, if you want to change your selections.
To proceed, click Next.
On the Log Settings page, we can choose to collect all available logs, or only specific logs using
tags. For this example, we are going to collect all logs, and click Next.
On the Output Preferences page, we need to select a duration for which logs will be collected,
specify a date and time before or after which logs must be collected, and select a destination.
You can specify the duration in hours or days, and the maximum duration for which logs can
be collected is 4 days. Entering a number larger than this will display an error in the associated
field.
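The duration limit can be expressed as a small validation rule. The Python sketch below assumes only what is stated above (hours or days, with a 4-day maximum); the function name validate_duration is hypothetical and the error text is not the message that Prism displays.

```python
MAX_HOURS = 4 * 24  # the 4-day limit stated above, expressed in hours


def validate_duration(value, unit):
    """Return the duration in hours, or raise ValueError if it is not allowed."""
    if unit not in ("hours", "days"):
        raise ValueError("unit must be 'hours' or 'days'")
    hours = value * 24 if unit == "days" else value
    if hours <= 0 or hours > MAX_HOURS:
        raise ValueError(f"duration must be greater than 0 and at most {MAX_HOURS} hours")
    return hours


print(validate_duration(4, "days"))
print(validate_duration(12, "hours"))
```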
When choosing a destination, you can download files locally, upload them to a custom server,
or upload them to a Nutanix Support server if these logs are meant to assist with troubleshooting
a ticket you have raised.
You can also choose to anonymize your logs. This will mask sensitive information, such as IP
addresses. After you make your selections, click Collect.
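As a rough picture of what masking IP addresses involves, the following Python sketch replaces IPv4 addresses in a line of text with a placeholder. It is an illustration only: the placeholder format, the function name mask_ips, and the sample log line are assumptions, and the anonymization that Prism actually performs may differ.

```python
import re

# Matches dotted-quad IPv4 addresses with each octet in the range 0-255.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
IPV4 = re.compile(rf"\b{_OCTET}(?:\.{_OCTET}){{3}}\b")


def mask_ips(line, placeholder="x.x.x.x"):
    """Replace every IPv4 address in `line` with `placeholder`."""
    return IPV4.sub(placeholder, line)


# Hypothetical log line.
print(mask_ips("connection from 10.0.0.15 to 192.168.1.1 established"))
```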
Note: For step-by-step instructions, including how to collect logs via the command
line, see the Log Collection section of the Nutanix Cluster Check Guide on the
Nutanix Support Portal.
You can configure multiple remote syslog servers if needed, and can also configure log
forwarding so that separate log modules are sent to different servers.
To configure syslog monitoring in Prism Central, click the Settings (gear) icon, scroll down
to the Alerts and Notifications category, and select Syslog Server. If no servers have been
configured, you will be prompted to add a new one. If syslog servers have already been
configured, you can add additional servers as needed.
On the Syslog Servers page, you need to provide information in two groups: Syslog Server and
Data Sources.
On the Syslog Server tab, you need to provide a server name, an IP address, and the
destination port. You can also choose either TCP or UDP as the transport protocol and
(optionally) enable RELP (Reliable Event Logging Protocol).
On the Data Sources tab, select log modules from the four available groups (API Audit,
Audit, Security Policy Hitlogs, and Flow Service Logs). Finally, select the Severity Level and click
Save.
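Before adding a server, it can be useful to confirm that it accepts messages on the IP address, port, and transport protocol you plan to enter. The Python sketch below sends a single test message from any workstation using the standard library's SysLogHandler. It is independent of Prism Central, does not cover RELP, and the function name send_test_message and the address in the usage comment are placeholders.

```python
import logging
import logging.handlers
import socket


def send_test_message(host, port, protocol="UDP", message="syslog connectivity test"):
    """Send one INFO-level message to a syslog server over TCP or UDP."""
    socktype = socket.SOCK_STREAM if protocol.upper() == "TCP" else socket.SOCK_DGRAM
    handler = logging.handlers.SysLogHandler(address=(host, port), socktype=socktype)
    logger = logging.getLogger("syslog-connectivity-test")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the test message out of other handlers
    logger.addHandler(handler)
    try:
        logger.info(message)
    finally:
        logger.removeHandler(handler)
        handler.close()


# Usage (placeholder address and port; replace with your own server):
# send_test_message("192.0.2.10", 514, "UDP")
```

With UDP there is no delivery confirmation, so check the server's own log to confirm that the message arrived.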
Note: For step-by-step instructions, including additional details about log modules
and severity levels, see the Configuring Syslog Monitoring section of the Prism
Central Guide on the Nutanix Support Portal. For step-by-step instructions on using
the command line to send logs to a remote syslog server, see the Send Logs to
Remote Syslog Server section of the Acropolis Advanced Administration Guide on
the Nutanix Support Portal.
Module 11 Summary
In this module, you learned:
• How to find summarized and detailed cluster and entity health information in Prism Central.
• How to obtain summarized and detailed cluster and entity health information in Prism
Element.
• What Nutanix Cluster Check (NCC) is, how to run NCC, and how to schedule health checks in
Prism Element.
• Some of the common log types that are available in Prism Element.
• How to collect logs directly from Prism Element web console.
• How to configure a syslog server in Prism Central for log forwarding.
Module 12
INVESTIGATING AND REMEDIATING PERFORMANCE ISSUES
Module Overview
This module will help you understand how to identify potential and existing performance
issues in your cluster, and some of the steps that you can take to resolve them.
In addition, Prism Central includes dedicated Alerts and Events dashboards, which
aggregate all alerts and events that have occurred on all clusters managed by Prism Central.
From these dedicated dashboards, you can read alert and event messages, identify underlying
causes, read through possible remediation steps, and acknowledge and resolve issues.
• The Alerts dashboard displays a list of alerts that provide details of cluster, impact type,
severity, and status. You can further investigate an alert from the details view page.
• The Events dashboard displays a list of events that have occurred in your cluster. The list
consists of actions performed by the user or the system, and any abnormal behavior detected
in your cluster. Events are different from alerts in that they may represent minor occurrences
of potentially problematic behavior, but have not necessarily been persistent or significant
enough to generate an alert.
In the next section, we will walk you through the Alerts dashboard.
The Alerts dashboard displays the List tab by default, and an Alert Policies drop-down menu
that allows you to access different views for different types of alert policies.
The List tab displays a list of alerts that are grouped by source entity, impact type, severity,
status, cluster, the time when the alert was first created, and the most recent occurrence of the
alert if more than one instance has occurred.
What you see on the List tab is the default page view, but you can create custom views if you
are only interested in specific information.
For example, you may want to monitor how quickly alerts that occur on the cluster are being
attended to. To do this, you can click the View By button above the table, and select Add
Custom. Then, in the Alert Columns dialog box, select Acknowledged At, Acknowledged By,
Resolved At, and Resolved By. Enter a name for your custom view, and click Save.
This will display a custom view of the list dashboard as shown in the following figure, with
only the four columns that you selected, plus the Name column, which always appears in
custom views.
You can switch back and forth between the default view and any custom views you have
created using the View By button.
In addition to creating custom views, you can also change the grouping of the information
displayed in the table on the List tab. If you click the Group By option, you will see that the
default selection is None.
If you click Severity, for example, you will see the table reorganized into three tabs, as shown in
the following figure: Warning, Info, and Critical.
Clicking each of these tabs will display alerts only of that severity, allowing you to focus your
attention on cluster issues that require immediate or urgent resolution.
In addition to grouping alerts, you can also apply filters to the alerts page to view information
with additional granularity.
The Filters menu allows you to view alert messages for specific clusters, specific severities or
impact types, created times, by title or keyword, and more.
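To make the difference between grouping and filtering concrete, here is a minimal Python sketch. It is illustrative only: the alert records, field names, and alert titles are made up for this example and do not represent a Prism Central API (the cluster names are borrowed from the test environment used in this guide). Grouping splits the full list into one bucket per value, while filtering keeps only the records that match every criterion.

```python
from collections import defaultdict

# Hypothetical alert records; field names and titles are assumptions for illustration.
alerts = [
    {"title": "Example warning alert", "severity": "Warning", "cluster": "CDEV7"},
    {"title": "Example critical alert A", "severity": "Critical", "cluster": "CDEV9"},
    {"title": "Example info alert", "severity": "Info", "cluster": "CDEV7"},
    {"title": "Example critical alert B", "severity": "Critical", "cluster": "CDEV7"},
]


def group_by(records, key):
    """Split records into one list per distinct value of `key` (like Group By)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


def apply_filters(records, **criteria):
    """Keep only records whose fields equal every given criterion (like Filters)."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


by_severity = group_by(alerts, "severity")
critical_on_cdev7 = apply_filters(alerts, severity="Critical", cluster="CDEV7")

print(sorted(by_severity))
print([a["title"] for a in critical_on_cdev7])
```

Grouping never hides data; it only reorganizes it. Filtering narrows the list, so combining several filters quickly reduces a long alert table to the few entries that matter.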
Note: For more information, see Alerts Summary View section of the Prism Central
Guide on the Nutanix Support Portal.
Clicking the name of an alert in the table will display an alert details page. Here, you will see
information about the alert in the left pane such as source entity, severity, impact, date and
time, and more. The left pane also includes a link to a relevant Nutanix Knowledge Base article,
if one is available for this particular alert type.
The remainder of the page displays a summary of the alert, a description, and potential
remediation steps. You can use the buttons at the top of the page to Acknowledge the alert if
you intend to take action, and to Resolve the alert if the issue has been remediated.
Finally, clicking the Help (question mark) icon next to the Resolve button will display
documentation for the Alert details page on the Nutanix Support Portal.
Note: For more information, see Alert Details section of the Prism Central Guide on
the Nutanix Support Portal.
Alert Policies
The Alert Policies tab has three views that you can choose from: user defined, system defined,
and external defined.
• The User Defined view displays all custom alert policies that are user defined.
• The System Defined view displays all default alert policies that are system defined.
• The External Defined view displays all alert policies that are defined by external entities,
typically an application through an API.
Note: The Alerts dashboard in Prism Element is similar to the Alerts dashboard in
Prism Central, with a few differences. In Prism Element, you can add custom views
and apply filters. However, grouping alerts by cluster, severity, and impact type is
unique to Prism Central.
Another difference is that Prism Element has a single dashboard with multiple tabs
for Alerts, Alert Policies and Events. However, Prism Central has one dedicated
dashboard for Alerts, and another dashboard for Events, with Alert Policies
accessible from the Alerts dashboard.
Prism Central allows you to create custom alert policies, and modify existing, system-defined
policies.
In this section, we will discuss how to modify a system-defined policy, and then create a custom
policy of our own.
We have a total of 1,445 alert policies in Prism Central. And, as the following figure shows, 812 of
those policies are system-defined.
To modify one of these policies, click the name of the policy in the table. This will display the
Update Policy window.
Different system-defined policies will allow you to modify different options. For example, in this
“/home partition usage on a file server VM higher than threshold” policy, we can:
On the other hand, if we look at the “Advanced Networking Controller is not Healthy” policy,
we see slight differences. This policy can only generate critical alerts and, unlike our previous
example, can also be set to auto-resolve.
Note that auto-resolving an alert will only resolve the alert message itself. Selecting the auto-
resolve option will not enable the cluster to remediate the issue. Resolution may require the
creation and deployment of a playbook, administrator intervention, or may be addressed by the
cluster’s built-in self-healing capabilities.
• Uncheck any or all of the Severity level checkboxes to disable alerts for the selected alert
policy.
• Check or uncheck the Auto Resolve these alerts checkbox.
• Add one or more exceptions to exclude a cluster from the global rule.
Note: For step-by-step instructions, see Modifying System Alert Policies section of
the Prism Central Guide on the Nutanix Support Portal.
If you want to create your own alert policies in addition to (or instead of) modifying system-
defined ones, click the Alert Policies pull-down menu and select User Defined.
On the User Defined view, click the Create Alert Policy button. You can create alert policies for
entity types VM, cluster, or host.
You can define a single alert policy for either all VMs, all hosts, or all clusters that share
some common criteria. You can create, update, delete, enable, and disable the alert policies.
In this example, we will create an alert policy that will monitor the CPU usage of all VMs in
all clusters managed by Prism Central. If CPU utilization is between 0 and 70%, the system
will take no action. If CPU usage exceeds 70%, the system will generate a behavioral anomaly
warning. And if CPU usage exceeds 90% for 10 minutes or more, a critical alert will be
generated.
The policy name (VM CPU Usage) will be generated by default based on our selections. While
this can be edited, we’re going to keep the default name.
Next, we will leave the Description field blank (a description is optional) and set the Impact
Type. The Impact Type field is a descriptor, meant to provide additional, contextual information
to an administrator working to resolve an alert. In this case, since CPU usage exceeding 90%
represents a VM performance issue, we will select Performance as the Impact Type.
Then, in the left pane, we can choose to enable the policy as soon as it is created, and choose
to auto resolve alerts that are generated by this policy. We will select the Enable policy
checkbox, because we want this policy to be active.
However, we will not select the Auto resolve alerts option, because we want any VMs with CPU
anomalies to be investigated first, and then for the alert to be resolved only after the issue has
been addressed.
Next, we will define the parameters that will result in alerts being generated. In the Behavioral
Anomaly section, for the Every time there is an anomaly, alert option, select Warning.
Then, for the Ignore all anomalies between option, set the minimum percentage to 0 and the
maximum percentage to 70.
Finally, in the Static Threshold section, set the Alert Critical If option to > 90%.
Note that additional customization options are possible. We can specify an upper and lower
limit for critical alerts. We can also generate a warning at a different upper and lower limit.
However, for our specific alert policy, these are the only settings we need to configure. Review
all settings, and then click Save.
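The way these settings interact can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Prism Central evaluates policies internally: the CpuSample type, the one-sample-per-minute cadence, and the is_anomaly flag (standing in for the anomaly detection module) are all hypothetical. It only mirrors the three rules from our example: ignore anomalies between 0 and 70%, raise a warning for any other anomaly, and raise a critical alert when usage stays above 90% for 10 minutes.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    minute: int        # minutes since monitoring started (assumed cadence)
    usage_pct: float   # VM CPU usage in percent
    is_anomaly: bool   # stand-in for the anomaly detection result

IGNORE_MIN, IGNORE_MAX = 0.0, 70.0  # "Ignore all anomalies between" 0 and 70
CRITICAL_ABOVE = 90.0               # "Alert Critical If" > 90%
CRITICAL_MINUTES = 10               # sustained duration from the example


def evaluate(samples):
    """Return (minute, severity) pairs for the alerts this policy would raise."""
    alerts = []
    streak_start = None       # minute at which usage first went above 90%
    critical_raised = False   # avoid repeating the alert within one streak
    for s in samples:
        if s.usage_pct > CRITICAL_ABOVE:
            if streak_start is None:
                streak_start = s.minute
            if not critical_raised and s.minute - streak_start >= CRITICAL_MINUTES:
                alerts.append((s.minute, "Critical"))
                critical_raised = True
        else:
            streak_start = None
            critical_raised = False
        if s.is_anomaly and not (IGNORE_MIN <= s.usage_pct <= IGNORE_MAX):
            alerts.append((s.minute, "Warning"))
    return alerts


samples = [CpuSample(0, 40.0, True), CpuSample(1, 75.0, True)]
samples += [CpuSample(m, 95.0, False) for m in range(2, 13)]
print(evaluate(samples))
```

In this run, the anomaly at 40% is ignored because it falls inside the 0 to 70% range, the anomaly at 75% produces a warning, and the critical alert appears only once usage has stayed above 90% for the full 10 minutes.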
Note: For step-by-step instructions, see Creating Custom Alert Policies section of
the Prism Central Guide on the Nutanix Support Portal.
Note:
For details on overlapping policies, modifying custom alert policies, and deleting
custom alert policies, see Adding Custom Alert Policies section of the Prism Central
Guide on the Nutanix Support Portal.
Nutanix provides an API that applications can use to define alert policies. External alert policies
are created by applications using this API.
The External Defined tab displays a list of the application-defined alerts. You cannot modify an
externally defined policy, but you can view it by clicking the policy name in the list.
Stopping Alerts
Prism Central also allows you to stop one or more alerts from being generated. You can stop a
single alert via the List tab, or stop multiple alerts via the command line.
To stop an alert from the List tab of the Alerts dashboard, click the name of an alert to view the
Alert details page. Then, in the left pane, click the link to the alert policy.
In the Update Policy window, uncheck the Global Rule settings and save the policy. This will
disable all alert messages for the associated policy.
In the following figure, unchecking the Warning option in the Global Rule section will prevent
an alert from being generated even if the system detects a high time difference between Prism
Central and Prism Element.
In Prism Central you can stop multiple alerts by using the nuclei command line interface. This
option is helpful when you are about to engage in planned maintenance, and do not want
related alerts to be generated during this time frame.
First, log on to Prism Central and create a specifications file. Specify the values for the following
parameters:
• Under scope_entity_list, specify scope of the entities for which you want to stop the alerts.
• Under schedule_list, define the schedule during which the alerts should be stopped.
After you have specified the values, create a configuration to stop the alerts. Run the following
command:
$ nuclei --username admin --password password blackout.create spec_file=filepath
Replace password with the password of the admin user account and filepath with the path of
the specification file that you created.
Then, list all the configurations that stop alerts. Run the following command:
$ nuclei --username admin --password password blackout.list
The output of this command lists the UUID of each configuration that you have created.
After the UUID of each configuration is listed, update the configuration. Run the following
command:
$ nuclei --username admin --password password blackout.put uuid spec_file=filepath
Replace password with the password of the admin user account, filepath with the path of the
specification file that you created, and uuid with the UUID of the configuration that you want to
update.
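If you script a maintenance window, it can help to assemble these commands in one place. The following Python sketch only builds the argument lists for the three commands shown above; it does not run anything. The blackout_command helper is hypothetical, and admin, password, uuid, and filepath are the same placeholders used in this section.

```python
def blackout_command(action, username="admin", password="password",
                     uuid=None, spec_file=None):
    """Build the argument list for one of the nuclei blackout commands above."""
    if action not in ("create", "list", "put"):
        raise ValueError(f"unsupported action: {action}")
    args = ["nuclei", "--username", username, "--password", password,
            f"blackout.{action}"]
    if action == "put":
        if uuid is None:
            raise ValueError("blackout.put needs the UUID of the configuration")
        args.append(uuid)
    if action in ("create", "put"):
        if spec_file is None:
            raise ValueError(f"blackout.{action} needs a specification file path")
        args.append(f"spec_file={spec_file}")
    return args


# Print the three commands in the order used in this section.
print(" ".join(blackout_command("create", spec_file="filepath")))
print(" ".join(blackout_command("list")))
print(" ".join(blackout_command("put", uuid="uuid", spec_file="filepath")))
```

Keeping the arguments as a list rather than a single string makes it straightforward to hand them to a process launcher later without extra quoting.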
Replace password with the password of the admin user account and uuid with the UUID of the
configuration that you want to get.
Prism Central allows you to email a notification to one or more recipients whenever an alert is
generated. To configure alert emails, you can either:
• Click the Settings (gear) icon, scroll down to the Alerts and Notifications section, and click
the Alert Email Configuration button, or
• Navigate to the Alerts dashboard and click the Email Configuration button that is displayed
just above the alerts table.
In the Alert Email Configuration window, you will see a message which indicates that any
emails you configure in Prism Central will be sent in addition to emails that may have already
been configured in Prism Element.
That means, if you have already set up email notifications in Prism Element, and also set up
email notifications in Prism Central, your recipients will receive multiple emails every time an
alert is generated.
Note:
• Alert emails sent by Prism Central are in addition to any alert emails you might
have configured on individual clusters through the Prism Element web console.
You will receive email from both entities in this case. Prism Central alert emailing
is not enabled by default; you must explicitly enable it and specify the recipients
(Nutanix customer support and/or supplied email addresses). If you enable alerts
through Prism Central and do not want to receive double email notifications
for the same alert, disable customer email notification for those alerts on the
individual clusters through Prism Element (but keep email notification for Nutanix
customer support enabled).
• Prism Central requires an SMTP server to send alert email messages (see
Configuring an SMTP Server (Prism Central)).
On the Settings tab, you can choose how frequently you want to send email notifications. The
options available are to send an email for every alert, to aggregate these alerts and send them
in a daily summary, and to skip the daily summary email if no alerts have been generated. All
three of these options are selected by default. You can also specify recipients by adding email
addresses, as shown in the following figure.
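The three frequency options combine in a simple way, which the following Python sketch illustrates. It is a simplified model and not Prism Central code: the emails_for_day function, its parameter names, and the subject strings are all hypothetical.

```python
def emails_for_day(alert_titles, every_alert=True, daily_digest=True,
                   skip_empty_digest=True):
    """Return the subjects of the emails that one day of alerts would produce."""
    subjects = []
    if every_alert:
        # One email per alert as it is generated.
        subjects.extend(f"Alert: {title}" for title in alert_titles)
    if daily_digest and (alert_titles or not skip_empty_digest):
        # One summary email per day, optionally skipped on quiet days.
        subjects.append(f"Daily digest: {len(alert_titles)} alert(s)")
    return subjects


print(emails_for_day(["Example alert A", "Example alert B"]))
print(emails_for_day([]))
print(emails_for_day([], skip_empty_digest=False))
```

With all three options selected, a day with two alerts produces three emails, while a day with no alerts produces none.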
The Tunnel Connection section of this tab displays mail transport status. If you have SMTP
and other email capabilities configured on your cluster, you should see a status of Enabled
here. Since this is our test cluster, certain email capabilities are intentionally disabled, which is
why the Tunnel Connection is also disabled.
On the Rules tab, you can configure which types of alert emails will be sent to whom. For
example, consider the two test clusters in our environment: CDEV7 and CDEV9.
We can configure rules so that critical performance-related alerts on CDEV9 go to one set of
recipients, while alerts for CDEV7 go to another set of recipients.
To create a new rule, click the + New Rule button. Then, select the alert severity, impact type,
cluster, and key phrases to determine the criteria by which alert notifications will be filtered and
emailed to different recipients. Enter recipient email IDs to specify who will receive notifications
based on the criteria that were defined in the When section of the Rules tab.
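Conceptually, each rule is a set of match criteria plus a recipient list. The Python sketch below models that idea for the CDEV7 and CDEV9 example. It is an illustration under assumptions: the EmailRule class, the alert field names, and the email addresses are hypothetical, and the matching logic (exact match on severity, impact type, and cluster; substring match on key phrases) is our simplification rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EmailRule:
    recipients: List[str]
    severity: Optional[str] = None      # None means "any"
    impact_type: Optional[str] = None
    cluster: Optional[str] = None
    key_phrases: List[str] = field(default_factory=list)

    def matches(self, alert):
        """True when the alert satisfies every criterion set on this rule."""
        for name in ("severity", "impact_type", "cluster"):
            wanted = getattr(self, name)
            if wanted is not None and alert.get(name) != wanted:
                return False
        if self.key_phrases:
            title = alert.get("title", "").lower()
            return any(p.lower() in title for p in self.key_phrases)
        return True


def recipients_for(alert, rules):
    """Collect recipients from every matching rule, without duplicates."""
    found = []
    for rule in rules:
        if rule.matches(alert):
            found.extend(r for r in rule.recipients if r not in found)
    return found


rules = [
    EmailRule(["perf-team@example.com"], severity="Critical",
              impact_type="Performance", cluster="CDEV9"),
    EmailRule(["ops-team@example.com"], cluster="CDEV7"),
]
alert = {"title": "Example alert", "severity": "Critical",
         "impact_type": "Performance", "cluster": "CDEV9"}
print(recipients_for(alert, rules))
```

An alert that matches no rule returns an empty list here, which is a reminder to check that your rules cover every cluster and severity you care about.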
On the Email Content tab, you can define an email template for alert messages. By default,
the email subject contains the name of the alert, and the body contains a one- or two-sentence
summary with basic information about the condition or situation that triggered the alert.
You can enter information in the Prepend Subject and Append Body fields. This will add
additional information before the subject line or after the text of the body.
This is an optional step and can be skipped if you do not want to customize the contents of the
email.
Note:
For step-by-step instructions, see Configuring Alert Emails (Prism Central) section
of the Prism Central Guide on the Nutanix Support Portal.
Events Dashboard
The Events dashboard contains messages that describe cluster-related activities performed
by the user and the system. Event messages are typically generated when logging in or out of
Prism; creating or updating entities such as storage containers, VMs, and snapshots; or when
configurations are updated for various features such as email notifications or Pulse.
To access the Events dashboard, click the Entities menu, select Activity and then click Events.
The Events dashboard displays event messages that are grouped by source entity, event type
(user action, system action, behavioral anomaly), cluster, and event creation date and time.
As with the Alerts dashboard, you can create custom event views as well, and switch back and
forth between custom and default views as needed.
You can also filter event messages by type, cluster, creation time, or message title.
Note: For more information, see Event Summary View section of the Prism Central
Guide on the Nutanix Support Portal.
Types of Events
In this section, we will walk through three types of events that the system generates.
• User action
• Behavioral anomaly
• System action
Clicking an event name will display the Event Details page. The Event Details page displays
event details on the left pane and an event summary on the right. In this example, a user has
updated an alert email configuration in Prism Central and an event is generated to indicate the
user action.
These types of events, when combined with the Audits feature in Prism Central, provide you
with a detailed view of the activity that occurs on a cluster. You can very easily identify which
entities have been created, changed, or modified, and which users are responsible for which
actions and activity on a cluster.
A behavioral anomaly, as the name suggests, is a sudden unexpected change in some aspect
of system or entity performance. These anomalies can be related to the network, storage, CPU,
memory, and so on, and can apply to various entities – VMs, hosts, CVMs, storage entities, or
the cluster itself.
However, repeated occurrences of the same or similar behavioral anomalies on the same
entities are often worth investigating. Patterns of behavioral anomalies – for example, increases
in controller IO bandwidth that occur somewhat frequently or regularly – may indicate an
underlying problem that could lead to a cluster issue in the near future.
Monitoring, analyzing, and investigating behavioral anomalies can allow you to take preventive
action, addressing potential issues before they begin to impact your clusters and workloads.
Behavioral anomalies are based on deviations from a normal behavior band, which the system
defines for various metrics based on historical data. The anomaly detection module monitors
these predefined metrics on a daily basis and publishes baseline values for each of them.
The anomaly detection module then measures usage every five minutes and compares that
usage with the expected values. If the observed value is outside the band, it flags that value as
an anomaly.
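The comparison step can be sketched in Python as follows. This is a minimal illustration, assuming a fixed band for the whole period; the find_anomalies function and the sample numbers are hypothetical, and the sketch does not attempt to show how the system learns the band from historical data.

```python
def find_anomalies(observations, band):
    """Return the (minute, value) observations that fall outside the band.

    observations: list of (minute, value) pairs, one every five minutes.
    band: (lower, upper) expected range published as the baseline.
    """
    lower, upper = band
    return [(minute, value) for minute, value in observations
            if value < lower or value > upper]


# Hypothetical baseline band and five-minute measurements for one metric.
band = (20.0, 60.0)
observations = [(0, 35.0), (5, 58.0), (10, 72.5), (15, 41.0), (20, 12.0)]

anomalies = find_anomalies(observations, band)
print(anomalies)
```

Note that a value can be anomalous because it is unusually low as well as unusually high; both directions fall outside the band.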
Note: For more information, see Behavioral Learning Tools section of the Prism
Central Guide on the Nutanix Support Portal.
If you want to investigate a particular anomaly further, you can use the Actions pull-down menu
on the top right of the event details page, which is unique to the Behavioral Anomaly event type.
The Add to Analysis option will add the controller IO bandwidth chart to the Analysis
dashboard. On the Analysis dashboard, you can monitor performance measures.
The Select Analysis Session option will open the Select a Session window, where you can select a
session that you would like to add this chart to. A session helps you correlate the metrics with
the alerts and events for troubleshooting.
The Alert Setting option will open the Create Alert Policy window, where you can edit the behavioral
anomaly or static threshold settings that trigger an alert and event if an anomaly is detected.
Events created by system action are only informational and there is no user intervention
required.
For example, as the following figure shows, the system has generated an event for VMs that
have restarted due to HA failover. Since failover has occurred successfully and VMs have
restarted, no manual action is required, and the message is simply provided for information and
audit purposes.
Note: For more information, see Event Details section of the Prism Central Guide on
the Nutanix Support Portal.
As you can see, the VM Efficiency widget lists four types of inefficient VMs. VMs are assigned to
one of those four categories if they meet the following criteria:
• Bully VM: A bully VM consumes too many resources and causes other VMs to starve. A VM is
considered a bully when it exhibits one or more of the following conditions for over an hour:
• Constrained VM: A constrained VM does not have enough resources and can lead to
performance bottlenecks. A VM is considered constrained when it exhibits one or more of
the following baseline values, based on the past 21 days:
• Over-provisioned VM: A VM is considered over-provisioned when it exhibits one or more of the
following baseline values, based on the past 21 days:
- CPU usage < 50% (moderate) or < 20% (severe) and CPU ready time < 5%
- Memory usage < 50% (moderate) or < 20% (severe) and memory swap rate = 0 Kbps
- Dead VM: A VM is considered dead when it has been powered off for at least 21 days.
- Zombie VM: A VM is considered a zombie when it is powered on but does fewer than 30
read or write I/Os (total) and receives or transfers fewer than 1000 bytes per day for the
past 21 days.
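The criteria that are spelled out above (over-provisioned, dead, and zombie) can be expressed as a short Python sketch. It is an illustration only: the VmStats type and its field names are hypothetical, and we assume the zombie limits apply to the busiest single day in the 21-day window. The bully and constrained categories are left out because their thresholds are not listed here.

```python
from dataclasses import dataclass


@dataclass
class VmStats:
    powered_on: bool
    powered_off_days: int = 0       # consecutive days powered off (0 if on)
    cpu_usage_pct: float = 0.0      # 21-day baseline values
    cpu_ready_pct: float = 0.0
    mem_usage_pct: float = 0.0
    mem_swap_kbps: float = 0.0
    max_daily_io_ops: int = 0       # busiest day in 21 days, reads plus writes
    max_daily_net_bytes: int = 0    # busiest day in 21 days, bytes in or out


def over_provisioned_level(vm):
    """Return 'severe', 'moderate', or None using the thresholds listed above."""
    cpu_not_contended = vm.cpu_ready_pct < 5
    mem_not_swapping = vm.mem_swap_kbps == 0
    if (cpu_not_contended and vm.cpu_usage_pct < 20) or \
            (mem_not_swapping and vm.mem_usage_pct < 20):
        return "severe"
    if (cpu_not_contended and vm.cpu_usage_pct < 50) or \
            (mem_not_swapping and vm.mem_usage_pct < 50):
        return "moderate"
    return None


def classify(vm):
    """Return the inefficiency labels that apply to one VM."""
    if not vm.powered_on:
        return ["dead"] if vm.powered_off_days >= 21 else []
    labels = []
    if vm.max_daily_io_ops < 30 and vm.max_daily_net_bytes < 1000:
        labels.append("zombie")
    level = over_provisioned_level(vm)
    if level:
        labels.append(f"over-provisioned ({level})")
    return labels


print(classify(VmStats(powered_on=False, powered_off_days=30)))
```

A VM can carry more than one label in this sketch. For example, a powered-on VM with almost no I/O will usually also show very low CPU and memory usage.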
As we saw earlier, the VM Efficiency widget on the Main dashboard displays the number
of VMs that are considered inefficient.
Clicking the View All Inefficient VMs link will display a filtered version of the VM dashboard with
only the inefficient VMs displayed. From here, you can click each VM to view details and
determine what sorts of adjustments are needed to resolve the issue at hand.
Alternatively, you can visit the VMs dashboard in Prism Central, click the Summary tab, and
view the Anomalies widget. The Anomalies widget will display unusual patterns of VM behavior,
and will allow you to identify and take action on VMs whose resource consumption patterns
have suddenly deviated from expected behavior.
Next, you can use the Alerts dashboard to view VM-related alerts and can create alert policies
to monitor VMs for specific criteria.
Once you have identified problematic VMs, you can add either built-in or custom charts to the
Analysis dashboard, in order to investigate the performance of those VMs, if further exploration
is warranted.
The Health dashboard in Prism Element helps you check the health of your cluster by running
multiple checks. You can run Nutanix Cluster Check (NCC) to investigate resource utilization
and inefficiencies for entities such as VMs, hosts, storage pools, storage capacity and many
more entities.
While NCC helps with the cluster as a whole, you can also check specific aspects of your
cluster, such as CVM memory usage, CVM CPU utilization, node average load, and many more,
as shown in the following figure.
Note: For more information, see Running Checks by Using Web Console section of
the Prism Web Console Guide on the Nutanix Support Portal.
Note:
• You can run similar checks via the command line to check for resource utilization
and inefficiencies in Prism Central. For more information, see KB article 2733.
• For details on running NCC via the command line, see Running NCC (Prism
Central) section of the Nutanix Cluster Check Guide on the Nutanix Support
Portal.
After you have identified inefficient VMs, you will need to add or remove resources to resolve
issues.
There are multiple ways in which you can address inefficient VMs in a cluster.
• You can use Capacity Runway to view resource runway information and optimize resources
that help you improve your cluster's resource allocation or capacity. Based on the current or
historical CPU, memory, and storage usage demand, the system may recommend you to:
The Storage dashboard in Prism Element displays dynamically updated information about
the storage configuration in a cluster. To investigate storage optimization, view the Capacity
Optimization widget on the Storage dashboard. The widget displays data reduction ratio, data
reductions savings, and overall efficiency.
To increase the effective storage capacity of a cluster, configure storage containers with
storage optimization technologies such as compression, deduplication, and erasure coding.
Note: For more information, see Storage Management section of the Prism Web
Console Guide on the Nutanix Support Portal.
Module Summary
By the end of this module, you learned:
Performing Cluster Maintenance
Module
13
PERFORMING CLUSTER MAINTENANCE
Module 13 Overview
This module will introduce you to a variety of maintenance activities that you can perform on
your clusters.
By the end of this module, you will have learned:
You can perform most administrative functions of a Nutanix cluster through the Prism web
console or REST API. Nutanix recommends using these interfaces whenever possible and
disabling CVM SSH access with password or key authentication. Some functions, however,
require logging on to a CVM with SSH. Exercise caution whenever connecting directly to a CVM
as it increases the risk of causing cluster issues.
Before you perform operations such as restarting a CVM, restarting an AHV host, or putting an
AHV host into maintenance mode, you will need to perform a cluster health check to determine if
the cluster can tolerate a single-node failure.
In Prism Element, navigate to the Alerts dashboard. Before running a health check, review and
resolve any critical alerts. This will ensure that performance issues have been addressed, and
will help prevent unexpected downtime.
You can review, acknowledge, and resolve alerts as described in a previous module. In addition,
you can also review alerts via the command line.
After you have reviewed and resolved any critical alerts, verify if the cluster can tolerate a
single-node failure. To do this, in Prism Element, on the Home dashboard, check the status of
the Data Resiliency Status widget.
Verify that the status is OK. If the status is anything other than OK, resolve any indicated issues
before performing maintenance activities.
Alternatively, you can log on to a Controller VM (CVM) with SSH and check the fault tolerance
status of the cluster. To check fault tolerance, run the following command:
nutanix@cvm$ ncli cluster get-domain-fault-tolerance-status type=node
The value of the Current Fault Tolerance row in the output must be at least 1 for all the nodes
in the cluster.
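If you capture the output of this command in a script, the check can be automated along the following lines. This Python sketch is an assumption-heavy illustration: the layout of sample_output is a simplified guess at the command output and may not match your AOS version, and the function names are hypothetical. The only rule taken from this guide is that every Current Fault Tolerance value must be at least 1.

```python
import re


def min_fault_tolerance(ncli_output):
    """Return the lowest 'Current Fault Tolerance' value found in the text."""
    values = [int(v) for v in
              re.findall(r"Current Fault Tolerance\s*:\s*(\d+)", ncli_output)]
    if not values:
        raise ValueError("no 'Current Fault Tolerance' rows found")
    return min(values)


def can_tolerate_node_failure(ncli_output):
    """True when every reported fault tolerance value is at least 1."""
    return min_fault_tolerance(ncli_output) >= 1


# Assumed, simplified output layout for illustration only.
sample_output = """
    Domain Type               : NODE
    Component Type            : STATIC_CONFIGURATION
    Current Fault Tolerance   : 1

    Domain Type               : NODE
    Component Type            : ZOOKEEPER
    Current Fault Tolerance   : 1
"""

print(can_tolerate_node_failure(sample_output))
```

Taking the minimum across all rows is deliberate: a single component reporting 0 is enough to make maintenance unsafe, even if every other component reports 1.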
Note: For more information, see Verifying the Cluster Health section of the AHV
Administration Guide on the Nutanix Support Portal.
Once you have verified cluster health, you can perform maintenance tasks. In the next section,
we will talk about node maintenance.
Node Maintenance
Performing CVM maintenance or any other maintenance operations such as making changes to
the network configuration of a node, performing manual firmware upgrades, or replacements
requires you to gracefully place a node into maintenance mode or a non-operational state.
You can only place one node at a time in maintenance mode for each cluster. When a host is in
maintenance mode, the CVM is placed in maintenance mode as part of the node maintenance
operation and any associated RF1 VMs are powered off. The cluster marks the host as
unschedulable so that no new VM instances are created on it.
Note: When a node is placed in the maintenance mode from the Prism web
console, an attempt is made to evacuate VMs from the host. If the evacuation
attempt fails, the host remains in the "entering maintenance mode" state, where it is
marked unschedulable, waiting for user remediation.
Non-migratable VMs (for example, pinned or RF1 VMs which have affinity towards a specific
node) are powered-off while live migratable or high availability (HA) VMs are moved from the
original host to other hosts in the cluster. After exiting maintenance mode, all non-migratable
guest VMs are powered on again and live migrated VMs are automatically restored on their
original host.
Note: VMs with CPU passthrough or PCI passthrough, pinned VMs (with host
affinity policies), and RF1 VMs are not migrated to other hosts in the cluster when a
node undergoes maintenance. Click the View these VMs link to view the list of VMs that
cannot be live-migrated.
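The split between VMs that migrate and VMs that are powered off can be pictured with a short Python sketch. It is a conceptual model only: the plan_evacuation function, the dictionary keys, and the VM names are hypothetical, and the real placement decisions are made by the cluster, not by a script like this.

```python
def plan_evacuation(vms):
    """Split the VMs on a host into those to live migrate and those to power off.

    A VM is treated as non-migratable when it is pinned (host affinity),
    is an RF1 VM, or uses CPU or PCI passthrough, as described above.
    """
    migrate, power_off = [], []
    for vm in vms:
        non_migratable = (vm.get("pinned", False) or vm.get("rf1", False)
                          or vm.get("passthrough", False))
        if non_migratable:
            power_off.append(vm["name"])
        else:
            migrate.append(vm["name"])
    return {"migrate": migrate, "power_off": power_off}


# Hypothetical VMs running on the host that is about to enter maintenance mode.
host_vms = [
    {"name": "web-01"},
    {"name": "db-01", "pinned": True},
    {"name": "scratch-01", "rf1": True},
    {"name": "gpu-01", "passthrough": True},
]

plan = plan_evacuation(host_vms)
print(plan)
```

Reviewing the power-off list before you start is the scripted equivalent of checking which VMs cannot be live-migrated, because those workloads will be unavailable for the duration of the maintenance.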
In the next section, we will walk through the process of, first, putting a node into maintenance
mode and then exiting maintenance mode.
When a node enters the maintenance mode, the following high-level tasks are performed
internally.
When Enter Maintenance Mode is initiated, the Host Maintenance dialog box is displayed.
You will need to select the Power off VMs that can not migrate checkbox and then click Enter
Maintenance Mode.
Note: VMs with CPU passthrough, PCI passthrough, pinned VMs (with host affinity
policies), and RF1 VMs are not migrated to other hosts in the cluster when a node
undergoes maintenance. Click the View these VMs link to view the list of VMs that
cannot be live-migrated.
Next, a revolving icon appears as a tool tip beside the selected node and also in the Host
Details view. This indicates that the host is entering the maintenance mode. Then, the revolving
icon disappears and the Exit Maintenance Mode option is enabled after the node completely
enters maintenance mode. As with other actions and tasks in the cluster, you can monitor the
progress by clicking Tasks.
Note: For more information, see Putting a Node into Maintenance Mode (AHV)
section of the Prism Web Console Guide on the Nutanix Support Portal.
After the host exits maintenance mode, RF1 VMs continue to be powered on and VMs migrate
to restore host locality.
To remove a node from maintenance mode, log on to Prism Element, navigate to the Hardware
dashboard, and click Table view. On this page, you will see a list of hosts in your cluster. Select
the host you want to remove from maintenance mode and click Exit Maintenance Mode as
shown in the following figure.
When Exit Maintenance Mode is initiated, the Host Maintenance dialog box is displayed. Click
the Exit Maintenance Mode button to confirm the exit process.
Next, a revolving icon appears as a tool tip beside the selected node and also in the Host
Details view. This indicates that the host is exiting the maintenance mode. Then, the revolving
icon disappears and the Enter Maintenance Mode option is enabled after the node completely
exits maintenance mode.
Note: For more information, see the Exiting a Node from the Maintenance Mode (AHV)
section of the Prism Web Console Guide on the Nutanix Support Portal.
Note: Verify the data resiliency status of your cluster. If the cluster only has
replication factor 2 (RF2), you can only shut down one node for each cluster. If
an RF2 cluster would have more than one node shut down, shut down the entire
cluster.
To shut down a node, use SSH to log on to the Controller VM (CVM) on the host you want to
shut down. Run the following command to shut down the CVM:
nutanix@cvm$ cvm_shutdown -P now
Note: Once the CVM is put into maintenance mode, it might take a couple of
minutes before you can shut down the CVM. It is recommended to wait for one to
two minutes before running the command to shut down the CVM.
After the CVM has shut down, log on to the AHV host with SSH. Run the following command to
shut down the host:
root@ahv# shutdown -h now
Note: For step-by-step instructions, see the Shutting Down a Node in a Cluster (AHV)
section of the AHV Administration Guide on the Nutanix Support Portal.
In the next section, we will talk about how to start a node in a cluster.
Next, log on to another CVM in the Nutanix cluster with SSH and verify that all services on all
the CVMs are running. Run the following command:
nutanix@cvm$ cluster status
Note: For step-by-step instructions, see the Starting a Node in a Cluster (AHV) section
of the AHV Administration Guide on the Nutanix Support Portal.
2. Log on to a Controller VM (CVM) with SSH and run a complete NCC health check:
nutanix@cvm$ ncc health_checks run_all
If you receive any failure or error messages, resolve those issues by referring to the KB articles
indicated in the output of the NCC check results. If you are unable to resolve these issues,
contact Nutanix Support.
To shut down an AHV cluster, first shut down the services or VMs associated with the AOS
features or Nutanix products. For example, shut down all the Nutanix file server VMs (FSVMs).
Next, shut down all guest VMs in the cluster using Prism Element. If the cluster has a large
number of running VMs, shut down the VMs using Prism Central. You can also use acli to shut
down a large number of running VMs.
Then, stop the Nutanix cluster and shut down all the CVMs in the cluster. After all the CVMs are
shut down, shut down each node in the cluster.
Perform the maintenance activity and start all the nodes in the cluster, and then start the
cluster.
Note: For step-by-step instructions, see the Shutting Down an AHV Cluster section of
the AHV Administration Guide on the Nutanix Support Portal.
Modifying a Cluster
In this section, we will talk about adding nodes to a cluster and removing nodes from a cluster.
Expanding a Cluster
You can add new nodes to a cluster at any time, after physically installing and connecting them
to the network on the same subnet as the cluster. The cluster expansion process compares the
AOS version on the existing and new nodes and performs any upgrades necessary for all nodes
to have the same AOS version.
Before you add one or more nodes to an existing cluster, you will need to:
1. Review the relevant sections in Prerequisites and Requirements before attempting to add a
node to the cluster.
2. Check the Health Dashboard to see if any health checks are failing.
4. Ensure that all nodes are in the correct metadata state by checking the Hardware
dashboard.
To add a node, you can either navigate to the Settings page or the Hardware dashboard and
click the Expand Cluster button. On the Expand Cluster window, you have the option to either
Expand Cluster or Prepare Now and Expand Later. In this example, we will see how to expand
a cluster.
After you have selected Expand Cluster operation, on the Select Host tab you will see a
graphical list of the discovered blocks and nodes. Some hosts cannot be automatically
discovered and require manual entry of hypervisor IP.
For manual host discovery, click +Add Host and enter the host IP. Then, when the list of hosts
is complete, click the Discover and Add Hosts button.
After the hosts have been discovered, you will need to select the hosts that you want to add to
your cluster and configure their network addresses, and then click Next.
Then, in the Choose Node Type tab select the node type as either HCI Node or Storage-only
node and configure the host network in the Host Networking tab.
Finally, on the Configure Host tab, specify the hypervisor image and allowlist for nodes
that require imaging. Run checks to verify that the nodes are ready. When the checks pass
successfully, click the Expand Cluster button.
Note: For step-by-step instructions, see the Expanding a Cluster section of the Prism
Web Console Guide on the Nutanix Support Portal.
Note: Ensure that you migrate the VMs to an ESXi cluster before removing a host
or node. Verify that the cluster has enough available compute capacity before
actually migrating the VMs. If you remove a node or host without first migrating the
VMs, the VMs' data may be migrated without notice, resulting in loss of service.
To remove a node, in Prism Element, navigate to the Hardware dashboard and select the node
you want to remove either from the Diagram page or Table page. Then, click the Remove Host
link on the right of the Summary line.
Note: The Remove Host link on the right of the Summary line does not appear for
the removal of a host from a three-node cluster.
Note: If the node that you are trying to remove is unreachable or powered off,
a notification is triggered in the Prism UI alerting you that the storage utilization
for this node could not be calculated and also suggesting the possible impact of
removing this node. If you still want to go ahead, you can use the force option to
forcefully mark this node for removal.
Note: For more information, see the Removing a Node from a Cluster section of the
Prism Web Console Guide on the Nutanix Support Portal.
Module 13 Summary
By the end of this module, you will have learned:
Upgrading Licenses, Software, and Firmware
Module 14
UPGRADING LICENSES, SOFTWARE, AND FIRMWARE
Module 14 Overview
This module will introduce you to Life Cycle Manager (LCM) and upgrading software and
firmware through LCM in Prism Central and Prism Element. You will also learn about licenses.
• What Life Cycle Manager (LCM) is and how it simplifies the upgrade process.
• How to explore the LCM dashboard.
• How LCM works.
• What the recommended upgrade order is.
• How to upgrade clusters with and without internet access.
• How to license a cluster.
• How to manage your licenses.
LCM consists of a framework and a set of modules for inventory and updates. It tracks software
and firmware versions of all entities in the cluster.
All communication between the cluster and LCM modules goes through the LCM framework.
The LCM framework is accessible through the Prism interface. It acts as a download manager
for LCM modules, validating and downloading module content. LCM modules are independent
of AOS. They contain libraries and images, as well as metadata and checksums for security.
LCM supports software updates for all platforms that use Nutanix software. However, it
supports firmware updates only for Nutanix (NX), Dell XC / XC Core, Lenovo HX / HX Ready,
HPE DX, Fujitsu XF, Intel DCB, HPE DL (G10), and Inspur InMerge.
LCM performs two functions: taking inventory of the software and firmware currently installed
on a cluster, and performing upgrades. LCM updates are not reversible.
LCM supports one-click upgrades across multiple qualified hardware manufacturers and
configurations, so IT teams have the flexibility to deploy the best hardware for each use case –
and still benefit from centralized upgrade capabilities.
Nutanix LCM manages the inventory and updating of Nutanix software and hardware firmware,
including:
• Self-Managing: Automatically detects and self-updates the LCM orchestrator logic and new
module metadata when framework components are made available.
• Dark Site Support: Can be configured to fetch LCM updates from a local source for
datacenters without external Internet access.
• Upgrade Pre-Checks: Performs a comprehensive range of cluster pre-checks for health,
capacity and version control.
• Secure By Design: The LCM framework, modules, and compatibility matrix metadata are
signed using public-key cryptography so that binaries can be verified as genuine.
LCM methodically and automatically works through each cluster node, updating all patches
selected by the administrator:
• Software Updates: LCM can automatically complete required upgrades for Nutanix software:
- AOS, AHV, Prism, Files, Objects, Calm, Karbon, NCC, and Foundation.
• Batched Processing: To save time and avoid repeating processes, LCM can collate hardware
component updates together and perform any required pre- and post-update actions just
once per host:
- BIOS, BMC, Boot Drive, HDD, SSD, NIC, HBA, and Expander.
• Multi-Hypervisor Flexibility: LCM can deploy updates for the following hypervisors:
Prism Central
To view this dashboard:
2. Click the LCM option. Note that you can click the star icon to the right of the option name to
bookmark the dashboard for easy access.
By default, the Best Practices tab is displayed, as shown in the following figure.
Note: The LCM dashboard in Prism Element is similar to the LCM dashboard in Prism
Central and provides the same information, with a few differences. By default, the
LCM dashboard in Prism Element has five tabs. Four of those tabs (Best Practices,
Inventory, Updates, and Settings) also appear in Prism Central; Direct Uploads is
unique to Prism Element. If you are at a dark site, you can use the Direct Uploads
tab to fetch an update bundle and upload it directly to Prism, without having to set
up a local web server.
Another difference is that you can run an NCC check from the Updates tab on the
Prism Element dashboard; this option is not available in Prism Central.
All LCM updates follow the procedure shown in the following flowchart.
1. If updates for the LCM framework are available, LCM first auto-updates its own framework,
then continues with the operation.
4. Next, LCM chooses a node and performs any necessary pre-update actions.
5. Then, LCM performs the update. The update process varies by component.
Note: For details, refer to the Update Actions for Each Component in the Effects
of Updates on the Cluster section in the Life Cycle Manager Guide on the Nutanix
Support Portal.
6. LCM performs any necessary post-update actions and brings the node back up.
7. When cluster data resiliency is back to normal, LCM moves on to the next node.
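The per-node part of this flow (steps 4 through 7) can be pictured as a simple loop. The following Python is an illustrative model only, not LCM code: the function name rolling_update, its callback parameters, and the polling limits are all hypothetical, and only the steps listed above are modeled.

```python
import time
from typing import Callable, Iterable, List


def rolling_update(
    nodes: Iterable[str],
    pre_update: Callable[[str], None],
    update: Callable[[str], None],
    post_update: Callable[[str], None],
    resiliency_ok: Callable[[], bool],
    max_polls: int = 10,
    poll_interval: float = 0.0,
) -> List[str]:
    """Update nodes one at a time and return them in the order completed.

    All callbacks are hypothetical stand-ins for work that LCM does internally.
    """
    completed: List[str] = []
    for node in nodes:
        pre_update(node)   # step 4: pre-update actions on the chosen node
        update(node)       # step 5: the component-specific update
        post_update(node)  # step 6: post-update actions, node comes back up
        # Step 7: do not touch the next node until resiliency is back to normal.
        for _ in range(max_polls):
            if resiliency_ok():
                break
            time.sleep(poll_interval)
        else:
            raise RuntimeError(
                f"data resiliency not restored after updating {node}"
            )
        completed.append(node)
    return completed
```

The point of the sketch is the ordering: a node is fully updated and the cluster is confirmed healthy before the loop advances, so at most one node is out of service at any time.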
You can access the recommended upgrade order from the Recommended Upgrade Order
section of the Acropolis Upgrade Guide on the Nutanix Support Portal. Alternatively, in Prism
on the Best Practices page within LCM, click the View More link under FAQs. The KB article
compiles common questions about LCM, covering firmware upgrades, prerequisites, LCM
downtime, hardware platform support, upgrade order, and more, to help you understand the
product workflow.
a. Upgrade LCM framework. Do not upgrade any other software component except LCM.
a. Upgrade LCM framework. Do not upgrade any other software component except LCM.
c. Upgrade Foundation.
e. Upgrade AOS.
g. If you are running AHV, continue with the following steps. Otherwise, skip to step h.
i. Upgrade AHV.
ii. Perform LCM inventory again and upgrade any available firmware.
h. Perform these steps if you are running a hypervisor other than AHV. Otherwise, skip to
step i.
ii. Perform LCM inventory again and upgrade any available software or firmware.
3. After performing core upgrades on Prism Central and Prism Element, upgrade Services
software such as Calm, Files, Karbon, Objects, and other Nutanix software installed on Prism
Central.
In the next section, we will talk about performing an inventory and performing software
upgrades.
Performing Inventory
Performing an inventory allows you to view installed software and firmware versions, along with
their last updated time, as well as available updates, if any.
Note: Inventory information for a given node is persistent as long as the node
remains in the chassis. When you remove a node from a chassis, LCM does not
retain inventory information for that node. When you return the node to the
chassis, you must perform the inventory operation again to restore the inventory
information.
To perform inventory, navigate to the LCM dashboard in Prism Central and click the Inventory
tab. If you are accessing the Inventory tab for the first time, auto-update will not be enabled.
If a new version of the LCM framework is available, you will see a warning message as shown
in the following figure.
Click the Perform Inventory button to automatically update the LCM framework before
performing any updates. The performing inventory dialog box is displayed. The dialog box
describes what the inventory process does, approximately how long inventory will take, and
helps you understand what the auto-inventory is used for. Nutanix recommends that you check
the Enable LCM Auto Inventory box before you proceed to ensure that LCM is automatically
updated in the background.
Note:
You can enable LCM auto inventory from this dialog box. However, if you choose
not to enable it, you can enable it later at any time by clicking the Settings tab on
the LCM page and selecting the checkbox.
Click Proceed to perform inventory. This can take several minutes depending on the size of
your cluster. You can monitor the inventory progress by clicking the Tasks icon or navigating to
the Tasks dashboard in Prism Central.
When the inventory process is complete you will see a list of all installed software and their
versions, as well as their last updated time. You can also see the number of clusters these
products are installed on.
You can click the View By button to select Host (default) or Component. This page displays
the installed versions of components in Prism Central. You can also click Export to export the
inventory as a spreadsheet.
After you have examined all the installed software in your cluster, and if you want to perform
an update, click the Software tab. On the Software tab, you will see a list of all software
that can be upgraded and the associated versions. You can also click View Release Notes in the
Release Notes column to read through the changes that have been made to the latest versions
of specific products. In this example, the Software page displays available version and current
version for NCC and Prism Central.
For both NCC and Prism Central the latest versions are selected by default. You can choose to
select the version that applies to your environment by clicking the current available version. In
this example, Prism Central has two available versions that you can choose from. You can apply
a filter to view all versions or only the enabled versions. Before you select a version for your
environment, you can view the release notes for each version and decide whether to apply the
updates or defer them.
Note: For step-by-step instructions, see the Performing Inventory With the Life Cycle
Manager section of the Life Cycle Manager Guide on the Nutanix Support Portal.
Once the inventory is complete, then you can move forward with the software upgrade. In the
next section, we will talk about how to perform a software upgrade after inventory is complete.
Performing an Update
Before you perform an update, Nutanix recommends that you run NCC checks on Prism
Central to ensure that there are no cluster health issues. On Prism Central, NCC cannot be
run from the web console; it can be run only from the command line.
Note: For more information, see the Running NCC (Prism Central) section of
the Nutanix Cluster Check Guide on the Nutanix Support Portal.
To perform an update, navigate to the LCM dashboard in Prism Central and click the Software
tab.
Note: You can upgrade both software and firmware in Prism Element. However, in
Prism Central, you can perform software upgrades only.
After you have reviewed the release notes, you can select the software that you want to update
and click the View Update Plan button.
In this example, both NCC and Prism Central latest available versions have been selected. If
you have other versions of the software available, you may choose to select the version that is
suitable for your environment. To view the other versions that are available, click the available
version of a software.
After you have selected the software you want to update, click the View Update Plan button. In
the Review Update Plan page, Prism will tell you the impact of updates.
In this scenario, the NCC update will not affect user workloads and workload migrations will not
be necessary. For the Prism Central update, each CVM/PCVM for AOS/PC will be rebooted, one
CVM/PCVM at a time and the user workloads will not be affected.
After you have reviewed the update plan, click the Apply Updates button to perform the
update.
Note: When performing upgrades in Prism Element, ensure that Prism Central
has been upgraded first to a compatible version. Refer to the Software Product
Interoperability page on the Nutanix portal.
After the updates are complete, run an NCC check again to ensure that there are no cluster
health issues.
Note: For more information and step-by-step instructions, see the Performing Updates
with the Life Cycle Manager section of the Life Cycle Manager Guide on the Nutanix
Support Portal.
The Updates tab on the LCM dashboard in Prism Element will give you an option to either
select Software or Firmware updates. However, in Prism Central you will only see updates to
Software.
Note: By default, the LCM dashboard in Prism Element has five tabs. Four of those
tabs (Best Practices, Inventory, Updates, and Settings) also appear in Prism Central;
Direct Uploads is unique to Prism Element. The Direct Uploads tab is covered in the
Upgrading Software on Clusters at a Dark Site section later in this module.
In this example, we have a list of installed software and firmware, and their latest available
versions.
Additionally, you can perform an NCC check directly in Prism Element by clicking the NCC Check
button in the Updates tab on the LCM dashboard. Alternatively, you can run the NCC checks
from the Health dashboard of the Prism web console.
After you have run NCC checks, you can proceed with the update. The procedure is similar
to the one in Prism Central: select the software that you want to update and click the
View Update Plan button. After you have reviewed the update plan, click the Apply Updates
button to perform the update.
To decide which software or firmware to upgrade and in which order, refer to the
recommended upgrade order section in this module and then finally perform the upgrade.
The first step is to set up a local web server that is reachable by all your Nutanix clusters. For
Windows, LCM supports Internet Information Services (IIS) servers only. LCM does not currently
support HFS servers. For IIS information and examples, see the Microsoft documentation.
Note: If you use HTTPS, it is your responsibility to make sure that the certificates
are correct. LCM only accepts certificates issued by a trusted certificate authority
(CA). LCM does not accept certificates from a custom CA.
Note: If you have a proxy setup in your environment, make sure that you add the
web server IP address to your proxy allow-list in Prism.
To set up a local web server, create a virtual directory called release. When you download your
dark site bundles, make sure that you extract the dark site bundles into the release directory.
LCM must be able to look up the manifest at
http://web_server_IP_address/release/master_manifest.tgz.sign.
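Before pointing LCM at the web server, it can help to confirm that the manifest is reachable at the expected path. The following Python sketch builds the manifest URL from the format given above and sends an HTTP HEAD request to it. The function names, the timeout value, and the example address are illustrative assumptions and are not part of any Nutanix tooling.

```python
import urllib.request


def manifest_url(web_server: str) -> str:
    """Build the manifest URL from a web server address (IP or host name)."""
    return f"http://{web_server}/release/master_manifest.tgz.sign"


def manifest_reachable(web_server: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request for the manifest answers with HTTP 200."""
    request = urllib.request.Request(manifest_url(web_server), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```

For example, manifest_url("10.0.0.5"), where 10.0.0.5 is a made-up address, yields http://10.0.0.5/release/master_manifest.tgz.sign. Run the reachability check from a machine on the same network as the clusters, because a result from elsewhere says little about what LCM itself can reach.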
In the release directory, create or modify the MIME types. For example, create a MIME type
called '.sign' with the type set as text/plain. Make sure that MIME types with the following
extensions (.BD, .bin, .csv, and so on) exist in the release directory. Some are present by default.
For any that are not present by default, create and set the type as application/x-iso9660-image.
IIS has request filtering disabled by default. If you have enabled request filtering, make sure that
you configure exceptions for these MIME types.
Finally, set permissions on the release directory and make sure that all filenames on your
web server are in lowercase. Windows is case-insensitive, but the LCM framework ignores
files with uppercase names.
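Because the LCM framework skips files with uppercase names, a quick scan of the release directory can catch problems before an inventory is run. The sketch below is a minimal helper written for this guide, not a Nutanix utility. The function name is hypothetical, and the sketch assumes that any uppercase character in a filename counts as a problem.

```python
import os
from typing import List


def files_with_uppercase(release_dir: str) -> List[str]:
    """Return paths (relative to release_dir) of files whose names are not lowercase.

    Only file names are checked, not directory names.
    """
    offenders: List[str] = []
    for root, _dirs, files in os.walk(release_dir):
        for name in files:
            if name != name.lower():
                full_path = os.path.join(root, name)
                offenders.append(os.path.relpath(full_path, release_dir))
    return sorted(offenders)
```

An empty result means every file name under the directory is already lowercase. Any path that is returned should be renamed before LCM is pointed at the server.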
Note: For step-by-step instructions, see the Setting Up a Local Web Server (Windows)
section or, for Linux machines, the Setting Up a Local Web Server (Linux) section of the
Life Cycle Manager Dark Site Guide on the Nutanix Support Portal.
After you have set up a local web server, you will need to update the LCM framework. In the
next section, we will talk about updating the LCM framework using a web server.
Before you update any other component in LCM, you will need to update the LCM framework.
Navigate to the Downloads section of the Nutanix Support Portal and select LCM. Then, click
Download next to the LCM Framework Bundle. This will download the latest LCM framework
tar file. For example, lcm_dark_site_bundle_version.tar.gz.
After you have downloaded the LCM framework, transfer the framework tar file to your local
web server. Then, extract the framework tar file into the release directory.
Note: Do not use the 7zip tool to extract the tar file. Use WinZip version 22 or
later (disable the TAR file smart CR/LF option), or else extract the tar file manually
with tar -xvf. If you have extraction issues, see Troubleshooting File Extraction for a
Web Server. The local web server requires that all filenames must be in lower case.
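Where a scripted equivalent of tar -xvf is convenient, the extraction can be sketched with Python's standard tarfile module, which copies file contents byte for byte. This is an assumption-laden illustration: tarfile is not one of the tools named in the note, the function name is hypothetical, and the result should be validated against the documented procedure before relying on it.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_bundle(bundle_path: str, release_dir: str) -> List[str]:
    """Extract a tar bundle into release_dir and return the file names it contained."""
    target = Path(release_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Use the "data" extraction filter on Python versions that provide it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    # "r:*" lets tarfile detect the compression (for example gzip) on its own.
    with tarfile.open(bundle_path, "r:*") as archive:
        members = [m.name for m in archive.getmembers() if m.isfile()]
        archive.extractall(target, **extra)
    return sorted(members)
```

The returned list makes it easy to review what landed in the release directory, including whether every extracted name is lowercase.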
After the files are extracted to the release directory, log on to Prism and open the LCM page
and click Settings.
On the Settings page, select Local web server in the Source field. In the URL field, enter the
path to the directory where you extracted the tar file on your local server, and at the end of
the URL add /release as shown in the image above.
After you have entered the required details, click Save. Then, on the LCM dashboard, select
the Inventory tab and click Perform Inventory. The inventory operation will first update the LCM
framework and then perform an inventory of software and firmware installed in your clusters
and their latest available versions.
Note: For step-by-step instructions, see the Updating the LCM Framework Using
a Web Server section of the Life Cycle Manager Dark Site Guide on the Nutanix
Support Portal.
To fetch and download the software dark site update bundle, navigate to the Downloads
section of the Nutanix Support Portal and select the Software bundle that you want to
download. In this example, we have downloaded the AHV LCM bundle lcm_ahv_version.tar.gz.
Note: The list contains several AHV LCM bundles. Check the version number to
make sure that you fetch the bundle for the AHV version you want.
Then, transfer the files to your local web server and extract the files into the release directory.
In the release folder, replace the existing compatibility and signature files with the files from the
compatibility bundle that you downloaded earlier.
After you have replaced the existing compatibility and signature files, on each Nutanix cluster,
perform an inventory. The inventory list of software and firmware updates will now include AHV
as an available update for your cluster.
Note: For step-by-step instructions, see the Fetching LCM Software Update Bundles
Using a Web Server section of the Life Cycle Manager Dark Site Guide on the
Nutanix Support Portal.
After you have fetched the dark site software bundle, you can proceed to perform updates. As
a first step, navigate to the LCM dashboard and click the Updates tab and perform the NCC
prechecks.
Next, select the software that you want to update and click Apply Updates. Repeat these steps
on each of your clusters.
Note: For step-by-step instructions, see the Updating Software at a Dark Site section
of the Life Cycle Manager Dark Site Guide on the Nutanix Support Portal.
Performing upgrades in Prism Element is similar to performing them in Prism Central. You
will need to prepare your cluster, update the LCM framework, fetch the software or firmware
bundle, and perform updates. The only exception is the Preparing Your Cluster step, in which
you upload software bundles directly to Prism Element.
Note: Currently LCM only supports direct upload for Prism Element. For Prism
Central, use a local web server.
• Components that have bundles that support either direct upload or a web server use the
filename format lcm_component_version.tar.gz.
• Components that have bundles that support only a web server use the filename format
lcm_darksite_component_version.tar.gz.
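The two naming formats above are enough to tell, from the filename alone, how a bundle can be delivered. The following Python sketch encodes just that rule. The function name and the returned labels are hypothetical, and the check relies only on the lcm_ and lcm_darksite_ prefixes described in the bullets.

```python
from typing import Set


def upload_methods(bundle_filename: str) -> Set[str]:
    """Return the delivery methods implied by an LCM bundle filename."""
    if not (bundle_filename.startswith("lcm_")
            and bundle_filename.endswith(".tar.gz")):
        raise ValueError(f"not an LCM bundle filename: {bundle_filename!r}")
    if bundle_filename.startswith("lcm_darksite_"):
        # lcm_darksite_component_version.tar.gz: web server only.
        return {"web server"}
    # lcm_component_version.tar.gz: direct upload or web server.
    return {"direct upload", "web server"}
```

For example, upload_methods("lcm_ahv_version.tar.gz") returns both methods, while a name beginning with lcm_darksite_ returns only the web server. Note that the framework bundle name lcm_dark_site_bundle_version.tar.gz does not match the lcm_darksite_ prefix, so this sketch reports both methods for it.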
Before you update any other component in LCM, you will need to update the LCM framework.
Navigate to the Downloads section of the Nutanix Support Portal and select LCM. Then, click
Download next to the LCM Framework Bundle. This will download the latest LCM framework
tar file. For example, lcm_dark_site_bundle_version.tar.gz.
After you have downloaded the LCM framework, transfer the framework tar file to your
local system. Next, log on to Prism and open the LCM page and click Settings.
On the Settings page, you will see details about Direct Upload. To upload the LCM framework
bundle that you have downloaded earlier, select Dark Site (Direct Upload) in the Source field
and click Upload Bundle. You will be taken to the Direct Uploads tab, where you will need to
click the + Upload Bundle button to upload your LCM framework bundle.
If you need to manually upload binaries that are not yet approved for automatic LCM
detection, use the Direct Upload method. Upload only LCM bundles that you have
obtained from https://portal.nutanix.com/page/downloads or other binaries approved by
Nutanix Support.
Note: The Upload Bundle option is not supported when you access Prism Element from
Prism Central. You will need to log on to Prism Element directly to perform the
uploads.
In the next section, we will talk about fetching software or firmware bundles and performing
updates.
The process of fetching a software or firmware update bundle is similar to the process we saw
in fetching LCM software bundles using a web server. The only difference is that, after you have
downloaded a software or firmware bundle and the Nutanix compatibility bundle, you need to
select Dark Site (Direct Upload) in the Settings page of the LCM dashboard.
Note: The Dark Site (Direct Upload) option is only available if you have uploaded
the LCM framework bundle. See Updating the LCM Framework with Direct Upload.
Next, click the Upload Bundle button and you will be directed to the Direct Uploads tab. On this
tab, click + Upload Bundle and specify the compatibility bundle. Then, click + Upload Bundle
again to specify the software or firmware update bundle. When the upload is complete, LCM
shows the available bundles.
Finally, select the Inventory tab and click Perform Inventory. The software or firmware bundles are
now listed in the available updates.
The process of performing a software or firmware update in Prism Element is similar to the
process in Prism Central. The only difference is that you can perform both software and
firmware updates in Prism Element, whereas in Prism Central you can perform only software
updates.
Before you update a software or firmware at a dark site, you will need to fetch the dark site
update bundle. Then, navigate to the LCM dashboard and click the Updates tab and perform
the NCC prechecks.
Next, select the software or firmware that you want to update and click Apply Updates. Repeat
these steps to each of your cluster.
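Because the same steps are repeated on every cluster, it can help to write them out as a per-cluster runbook before starting. The Python sketch below is a hypothetical planning aid, not a Nutanix tool: the step strings paraphrase the text above, and the function and cluster names are assumptions made for illustration.

```python
# Update types each console can apply, as stated in the text above.
SUPPORTED_UPDATES = {
    "Prism Element": {"software", "firmware"},
    "Prism Central": {"software"},
}

# Per-cluster steps paraphrased from the text (labels only, not commands).
UPDATE_STEPS = (
    "fetch the dark site update bundle",
    "open the LCM dashboard and click the Updates tab",
    "perform the NCC prechecks",
    "select the software or firmware to update",
    "click Apply Updates",
)


def build_update_plan(console, update_type, clusters):
    """Return (cluster, step) pairs in the order they should be carried out."""
    if update_type not in SUPPORTED_UPDATES.get(console, set()):
        raise ValueError(f"{console} cannot apply {update_type} updates")
    return [(cluster, step) for cluster in clusters for step in UPDATE_STEPS]


if __name__ == "__main__":
    # Hypothetical cluster names, used only to show the output shape.
    for cluster, step in build_update_plan(
        "Prism Element", "firmware", ["cluster-a", "cluster-b"]
    ):
        print(f"{cluster}: {step}")
```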
Licensing a Cluster
License Manager provides Licensing as a Service (LaaS) by integrating the Nutanix Support
portal Licensing page with licensing management and agent software residing on Prism
Element and Prism Central clusters. Unlike previous license schemes and workflows that
were dependent on specific Nutanix software releases, License Manager is an independent
software service residing in your cluster software. You can update it independently from
Nutanix software such as AOS, Prism Central, Nutanix Cluster Check, and so on.
To license your cluster, use the Update License procedure. This procedure applies if a cluster is
newly deployed or previously unlicensed. Before you begin, keep two browser windows open:
• One browser window for the Prism Element or Prism Central web console
• One browser window on an Internet-connected machine for the Nutanix Support Portal
Log on to Prism Central, and click the Settings (gear) icon. On the Settings page, select
Licensing from the left pane. On the Licensing page you will see all licenses that have been
applied to your clusters.
On the Licensing page, click the Update License button to download the cluster summary file. To
license your cluster, follow the steps shown in the image below.
After you have downloaded the cluster summary file, go to the Licenses page on the Support
Portal and click Manage Licenses.
On the Manage Licenses page, you will see three tabs: Cluster Information, Select Licenses,
and Review and Finish.
On the Cluster Information tab, select Clusters With Internet and click the License Clusters
button. Then, choose a license tag for your license and upload the cluster summary file
that you downloaded earlier.
Then, on the Select Licenses page, select all the product licenses that you want to apply to
your environment. On the Review and Finish page, review all product licenses and click Finish.
This will automatically download the license summary file.
Finally, on the Licensing Settings pane, upload the license summary file that was downloaded
from the Support Portal and click Apply. The Licensing page will display all the applied licenses
in your environment.
Note: The process of licensing a cluster in Prism Element is similar to the process
in Prism Central. The only difference is that the Licensing page in Prism Element
displays licenses applied to a single cluster, whereas in Prism Central you can view
licenses applied to all your clusters.
Managing Licenses
License Manager provides more than one way to manage your licenses, depending on your
preference and cluster deployment. For 3-step licensing and 1-click licensing, your clusters will
need to be connected to the internet.
3-step Licensing
After you license your cluster you can use Update License in the web console to change
your license tier. The Licensing page allows you to manage your license tier by upgrading,
downgrading, or otherwise updating a license. For example, you can upgrade from AOS Pro to
AOS Ultimate or downgrade from Prism Ultimate to Prism Pro.
The 3-step licensing procedure is a manual licensing process, where you will need to:
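1. In Prism Element or Prism Central, download a cluster summary file and upload the file to the
Nutanix Support Portal.
2. On the Nutanix Support Portal, select the licenses to apply and download the license
summary file.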
3. In Prism Element or Prism Central web console, apply the license summary file to the
clusters.
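The hand-off of files between the web console and the Support Portal can be sketched as a small model. This is an illustrative example only, assuming (as described in Licensing a Cluster) that the Support Portal produces the license summary file after the cluster summary file is uploaded. The class, method names, and file names are hypothetical, and no Nutanix API is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThreeStepLicensing:
    """Tracks which file each manual step produces (hypothetical model)."""

    cluster: str
    cluster_summary: Optional[str] = None
    license_summary: Optional[str] = None
    licensed: bool = False

    def download_cluster_summary(self) -> str:
        # Step 1: done in the Prism Element or Prism Central web console.
        self.cluster_summary = f"{self.cluster}-cluster-summary"
        return self.cluster_summary

    def generate_license_summary(self) -> str:
        # Step 2: done on the Support Portal; needs the uploaded cluster summary.
        if self.cluster_summary is None:
            raise RuntimeError("upload a cluster summary file first")
        self.license_summary = f"{self.cluster}-license-summary"
        return self.license_summary

    def apply_license_summary(self) -> None:
        # Step 3: done back in the web console; needs the license summary.
        if self.license_summary is None:
            raise RuntimeError("download a license summary file first")
        self.licensed = True
```

The guards make the dependency explicit: each step consumes the file produced by the previous one, which is why the procedure cannot be performed out of order.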
You can manage your licensing actions with the Update License (3-step licensing) process. The
licensing actions include rebalancing licenses after a cluster capacity change, reclaiming
licenses by unlicensing, extending your license term, applying an upgrade license, and
upgrading or downgrading license tiers.
Note: For more information, see the Manage Licenses with Update License (3-Step
Licensing) section of the License Manager Guide on the Nutanix Support Portal.
1-Click Licensing
1-click licensing helps simplify licensing by integrating the workflow into a single interface in
the web console. 1-click licensing is disabled by default; you will need to enable and configure it
before you can use it.
Note: For more information, see the Enable 1-Click Licensing section of the License
Manager Guide on the Nutanix Support Portal.
Note: The 1-click licensing feature is not available for dark site clusters.
When you enable 1-click licensing, you can perform most tasks from the Prism web console
without needing to explicitly log on to the Nutanix Support Portal. When you open Licensing
from the Prism Element web console for a licensed cluster, each license tile includes a drop-
down menu so you can manage your licenses without leaving the web console. 1-click licensing
communicates with the Nutanix Support Portal to detect any changes or updates to your
cluster license status.
You can manage your licensing actions with the 1-click licensing process. The licensing actions
include rebalancing licenses after a cluster capacity change, reclaiming licenses, extending a
license term, applying an upgrade license, and upgrading or downgrading license tiers.
Note: For more information, see the Manage Licenses with 1-Click Licensing section of
the License Manager Guide on the Nutanix Support Portal.
Module 14 Summary
In this module, you learned:
• What Life Cycle Manager (LCM) is and how it simplifies the upgrade process.
• How to explore the LCM dashboard.
• How LCM works.
• What the recommended upgrade order is.
• How to upgrade clusters with and without internet access.
• How to license a cluster.
• How to manage your licenses.