NetBackup102 DeployGuide Kubernetes Clusters
Release 10.2
NetBackup™ Deployment Guide for Kubernetes
Clusters
Legal Notice
Copyright © 2023 Veritas Technologies LLC. All rights reserved.
Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies
LLC or its affiliates in the U.S. and other countries. Other names may be trademarks of their
respective owners.
This product may contain third-party software for which Veritas is required to provide attribution
to the third party (“Third-party Programs”). Some of the Third-party Programs are available
under open source or free software licenses. The License Agreement accompanying the
Software does not alter any rights or obligations you may have under those open source or
free software licenses. Refer to the Third-party Legal Notices document accompanying this
Veritas product or available at:
https://www.veritas.com/about/legal/license-agreements
The product described in this document is distributed under licenses restricting its use, copying,
distribution, and decompilation/reverse engineering. No part of this document may be
reproduced in any form by any means without prior written authorization of Veritas Technologies
LLC and its licensors, if any.
The Licensed Software and Documentation are deemed to be commercial computer software
as defined in FAR 12.212 and subject to restricted rights as defined in FAR Section 52.227-19
"Commercial Computer Software - Restricted Rights" and DFARS 227.7202, et seq.
"Commercial Computer Software and Commercial Computer Software Documentation," as
applicable, and any successor regulations, whether delivered by Veritas as on premises or
hosted services. Any use, modification, reproduction release, performance, display or disclosure
of the Licensed Software and Documentation by the U.S. Government shall be solely in
accordance with the terms of this Agreement.
Veritas Technologies LLC
2625 Augustine Drive.
Santa Clara, CA 95054
http://www.veritas.com
Technical Support
Technical Support maintains support centers globally. Technical Support’s primary
role is to respond to specific queries about product features and functionality. The
Technical Support group also creates content for our online Knowledge Base. The
Technical Support group works collaboratively with the other functional areas within
the company to answer your questions in a timely fashion.
Our support offerings include the following:
■ A range of support options that give you the flexibility to select the right amount
of service for any size organization
■ Telephone and/or Web-based support that provides rapid response and
up-to-the-minute information
■ Upgrade assurance that delivers software upgrades
■ Global support purchased on a regional business hours or 24 hours a day, 7
days a week basis
■ Premium service offerings that include Account Management Services
For information about our support offerings, you can visit our website at the following
URL:
www.veritas.com/support
All support services will be delivered in accordance with your support agreement
and the then-current enterprise technical support policy.
Customer service
Customer service information is available at the following URL:
www.veritas.com/support
Customer Service is available to assist with non-technical questions, such as the
following types of issues:
■ Questions regarding product licensing or serialization
■ Product registration updates, such as address or name changes
■ General product information (features, language availability, local dealers)
■ Latest information about product updates and upgrades
■ Information about upgrade assurance and support contracts
■ Advice about technical support options
■ Nontechnical presales questions
■ Issues that are related to CD-ROMs, DVDs, or manuals
Support agreement resources
If you want to contact us regarding an existing support agreement, please contact
the support agreement administration team for your region as follows:
Japan CustomerCare_Japan@veritas.com
Contents
■ Required terminology
Required terminology
The table describes the important terms for NetBackup deployment on a Kubernetes
cluster. For more information, see the Kubernetes documentation.
Term Description
Pod A Pod is a group of one or more containers, with shared storage and
network resources, and a specification for how to run the containers.
For more information on Pods, see Kubernetes Documentation.
Job Kubernetes jobs ensure that one or more pods execute their
commands and exit successfully. For more information on Jobs, see
Kubernetes Documentation.
Persistent Volume A PersistentVolume (PV) is a piece of storage in the cluster that has
been provisioned by an administrator or dynamically provisioned using
storage classes. For more information on Persistent Volumes, see
Kubernetes Documentation.
Custom Resource A Custom Resource (CR) is an extension of the Kubernetes API that
is not necessarily available in a default Kubernetes installation. For
more information on Custom Resources, see Kubernetes
Documentation.
Custom Resource Definition The CustomResourceDefinition (CRD) API resource lets you
define custom resources. For more information on
CustomResourceDefinitions, see Kubernetes Documentation.
ServiceAccount A service account provides an identity for processes that run in a Pod.
For more information on configuring the service accounts for Pods,
see Kubernetes Documentation.
■ Appropriate roles and Kubernetes cluster-specific permissions are set on the
cluster at the time of cluster creation.
■ After successful deployment of the primary and media servers, the operator
creates a custom Kubernetes role named ResourceName-admin, where
ResourceName is the name given in the primary server or media server CR
specification. The following permissions are provided in the respective namespaces:
Introduction 17
User roles and permissions
This role can be assigned to the NetBackup Administrator to view the pods that
were created, and to execute into them. For more information on the access
control, see Kubernetes Access Control Documentation.
Note: A single role is created only if the primary and media servers are in the same
namespace and have the same resource name prefix.
■ (AKS-specific only) Your AKS cluster must have RBAC enabled. To view
the permissions set for the AKS cluster, use one of the following methods and
verify that enableRBAC is set to true:
■ Run the following command:
az resource show -g <resource group name> -n <cluster name> --resource-type Microsoft.ContainerService/ManagedClusters --query properties.enableRBAC
Table 1-2
Resource Name API Group Allowed Operations
PersistentVolume ■ Delete
■ Get
■ List
■ Patch
■ Update
■ Watch
■ Config-Checker utility
Config-Checker utility
This section describes the working, execution and status details of the
Config-Checker utility.
■ MinimumVolumeSize check:
This check verifies that the PVC storage capacity meets the minimum required
volume size for each volume in the CR. The check fails if any of the volume
capacity sizes does not meet the requirements.
Following are the minimum volume size requirements:
■ Primary server:
■ Data volume size: 30Gi
■ Catalog volume size: 100Gi
■ Log volume size: 30Gi
■ Media server:
■ Data volume size: 50Gi
■ Log volume size: 30Gi
■ Provisioner check:
EKS-specific only
■ Primary server: This will verify that the storage type provided is Amazon
Elastic Block Store (Amazon EBS) for data and log volume. If any other
driver type is used, the Config-Checker fails.
■ Media server: This will verify that the storage type provided is Amazon Elastic
Block Store (Amazon EBS) for data and log volume. Config-Checker fails if
this requirement is not met for media server.
AKS-specific only
■ This check verifies that the provisioner type used in defining the storage
class for the Media server volumes (that is, the data and log volumes) is
Azure disk and not Azure files. If this requirement is not met, the
Config-Checker fails.
(EKS-specific only) This check verifies if the AWS Autoscaler add-on is installed
in the cluster. For more information, refer to AWS Autoscaling documentation.
■ Volume expansion check:
This check verifies that the storage class given for the Primary server data and
log volumes and for the Media server data and log volumes has
AllowVolumeExpansion = true. If this check fails, the Config-Checker issues a
warning message and continues with the deployment of the NetBackup media
servers.
■ Following are the Config-Checker modes that can be specified in the Primary
and Media CR:
■ Default: This mode executes the Config-Checker. If the execution is
successful, the Primary and Media CRs deployment is started.
■ Dryrun: This mode only executes the Config-Checker to verify the
configuration requirements but does not start the CR deployment.
■ Skip: This mode skips the Config-Checker execution and directly starts the
deployment of the respective CR.
■ Status of the Config-Checker can be retrieved from the primary server and media
server CRs by using the kubectl describe <PrimaryServer/MediaServer>
<CR name> -n <namespace> command.
For example, kubectl describe primaryservers environment-sample -n
test
■ Success: Indicates that all the mandatory config checks have successfully
passed.
■ Failed: Indicates that some of the config checks have failed.
■ Running: Indicates that the Config-Checker execution is in progress.
■ Skip: Indicates that the Config-Checker is not executed because the
configcheckmode specified in the CR is skip.
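The MinimumVolumeSize check described earlier in this section can be approximated locally before a CR is applied. The following is a minimal sketch and not the Config-Checker's actual implementation: the function names min_size_gi and check_volume_size are hypothetical, and the sketch assumes capacities are written as whole numbers with a Gi suffix, as in the minimum sizes listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the documented minimum size (in Gi) for a
# server type and volume, or fails if the combination is not listed.
min_size_gi() {
  case "$1/$2" in
    primary/data)    echo 30 ;;
    primary/catalog) echo 100 ;;
    primary/log)     echo 30 ;;
    media/data)      echo 50 ;;
    media/log)       echo 30 ;;
    *)               return 1 ;;
  esac
}

# check_volume_size <primary|media> <volume> <capacity, for example 100Gi>
# Returns 0 when the capacity meets the documented minimum, 1 when it does
# not, and 2 when the input cannot be interpreted.
check_volume_size() {
  local min size
  min=$(min_size_gi "$1" "$2") || { echo "unknown volume: $1/$2" >&2; return 2; }
  size=${3%Gi}                       # strip the Gi suffix (assumed format)
  case "$size" in
    ''|*[!0-9]*) echo "unsupported capacity format: $3" >&2; return 2 ;;
  esac
  [ "$size" -ge "$min" ]
}

if check_volume_size primary catalog 100Gi; then
  echo "primary catalog volume: OK"
fi
```

Running the same function against each volume size in a CR before applying it gives an early indication of whether the Config-Checker is likely to report a failure for this check.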
■ Apply the CR again. Add the required data that was deleted earlier at the
correct location, save the file, and apply the YAML using the kubectl apply -f
<environment.yaml> command.
■ The migration job transfers the Primary server’s file system data from Azure
disks to Azure premium files for existing NetBackup deployments.
■ If you are deploying NetBackup for the first time, it is considered a fresh
installation and you can directly use Azure premium files for the Primary
server’s catalog volume. The Primary server log and data volumes support Azure
disks only.
■ For existing NetBackup deployments, the migration job copies the Primary server’s
old Azure disk catalog volume to the new Azure files volume, except for the nbdb
data, which is copied to the new Azure disk based data volume. Logs can be
migrated to a new Azure disk log volume.
■ To invoke the migration job, the Azure premium files storage class must be
provided in the environment.yaml file for the catalog volume. A new Azure disk
based data volume must also be provided in environment.yaml. Optionally, you
can provide a new Azure disk storage class for the log volume.
■ The migration status is updated to Success in the primary server CR after
successful data migration.
Note: The time that the migration takes increases with the catalog data size.
■ Status of the data migration can be retrieved from the primary server CR by
using the following command:
kubectl describe <PrimaryServer> <CR name> -n <netbackup-environment-namespace>
Prerequisites for Kubernetes cluster configuration 29
Webhooks validation for EKS
■ If the Data migration execution status is failed, you can check the migration job
logs using the following command:
kubectl logs <migration-pod-name> -n <netbackup-environment-namespace>
Review the error codes and error messages pertaining to the failure and update
the primary server CR with the correct configuration details to resolve the errors.
For more information about the error codes, refer to NetBackup™ Status Codes
Reference Guide.
■ Validate CSI driver: Verifies that the PV created is provisioned using
the efs.csi.aws.com driver, that is, AWS Elastic file system (EFS), for
the catalog volume. If any other driver type is used, the webhook fails.
■ Validate AWS Elastic file system (EFS) controller add-on: Verifies that the AWS
Elastic file system (EFS) controller add-on is installed on the cluster. This
controller is required to use EFS as persistent storage for the pods that run
on the cluster. The webhook checks that the EFS controller add-on is installed
and running properly. If it is not, a validation error is displayed.
■ AWS Load Balancer Controller add-on check: Verifies that the AWS load
balancer controller add-on is installed on the cluster. This load balancer
controller is required to use a load balancer in the cluster. The webhook checks
that the load balancer controller add-on is installed and running properly. If
it is not, a validation error is displayed.
■ The webhook validates each check in sequence. If any validation fails,
a validation error is displayed and the execution stops.
■ The error must be fixed and the environment.yaml file must be applied again so
that the next validation check is performed.
■ The environment is created only after all webhook validations pass.
Chapter 3
Deployment with environment operators
This chapter includes the following topics:
■ Manual deployment
Prerequisites
Ensure that the following prerequisites are met before proceeding with the
deployment.
■ Taints and tolerations allow you to mark (taint) a node so that no pods can
be scheduled onto it unless a pod explicitly tolerates the taint. Marking nodes instead
Deployment with environment operators 32
About deployment with the environment operator
■ Add a taint with the same key and value that are used for the label in the
above step, with the effect NoSchedule.
For example, key = nbpool, value = nbnodes, effect = NoSchedule
■ Install Cert-Manager. You can use the following command to install the
Cert-Manager:
$ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.6.0/cert-manager.yaml
For details, see https://cert-manager.io/docs/installation/
■ A workstation or VM running Linux with the following:
■ Configure kubectl to access the cluster.
■ Install Azure/AWS CLI to access Azure/AWS resources.
■ Configure docker to be able to push images to the container registry.
■ Free space of approximately 8.5GB on the location where you copy and
extract the product installation TAR package file. If using docker locally, there
should be approximately 8GB available on the /var/lib/docker location
so that the images can be loaded to the docker cache, before being pushed
to the container registry.
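The label and taint prerequisite above uses the same key and value for both the node label and the taint. The following minimal sketch illustrates that pairing with kubectl, using the example key and value from this section (nbpool=nbnodes). The node name, the run helper, and the prepare_node function are hypothetical, and on AKS and EKS taints are normally set on the node pool or node group at creation time, so treat this only as an illustration of the key, value, and effect. By default the sketch prints the commands instead of running them. Setting DRY_RUN=0 would run them against the cluster that kubectl is configured for.

```shell
#!/usr/bin/env bash
# Label and taint a node with the same key and value.
# The node name used below is an illustrative placeholder.
DRY_RUN=${DRY_RUN:-1}

run() {
  # Print the command in dry-run mode; execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

prepare_node() {
  local node=$1 key=$2 value=$3
  run kubectl label nodes "$node" "${key}=${value}"
  run kubectl taint nodes "$node" "${key}=${value}:NoSchedule"
}

prepare_node example-node-1 nbpool nbnodes
```

Pods that must run on the tainted nodes then need a matching toleration and node selector, as in the operator patch example later in this chapter.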
AKS-specific
■ A Kubernetes cluster in Azure Kubernetes Service (AKS) with multiple nodes.
Using a separate node pool is recommended for the NetBackup servers, MSDP
Scaleout deployments, and for different media server objects. A separate node
pool is required for the Snapshot Manager data plane.
■ Taints are set on the node pool while creating the node pool in the cluster.
Tolerations are set on the pods.
■ Define storage classes of Azure Files and Azure managed disks for the primary
server, and of Azure managed disks for the media servers and MSDPX.
■ Enable AKS Uptime SLA. AKS Uptime SLA is recommended for better
resiliency. For information about AKS Uptime SLA and how to enable it, see Azure
Kubernetes Service (AKS) Uptime SLA.
■ Access to a container registry that the Kubernetes cluster can access, such as
Azure Container Registry.
EKS-specific
■ A Kubernetes cluster in Amazon Elastic Kubernetes Service (EKS) with multiple
nodes. Using a separate node group is recommended for the NetBackup servers,
MSDP Scaleout deployments, and for different media server objects. A separate
node group is required for the Snapshot Manager data plane.
■ Taints are set on the node group while creating the node group in the cluster.
Tolerations are set on the pods.
■ Access to a container registry that the Kubernetes cluster can access, such as
Amazon Elastic Container Registry (ECR).
■ The AWS network load balancer controller add-on must be installed to use network
load balancer capabilities.
■ The AWS EFS-CSI driver must be installed for statically provisioning the PV or PVC
in EFS for the primary server.
For more information on installing the load balancer controller add-on and the EFS-CSI
driver, see “About the Load Balancer service” on page 162.
Item                                                Description
OCI images in the /images directory                 These are the docker image files that are loaded and
                                                    then copied to the container registry to run in
                                                    Kubernetes. They include the NetBackup and MSDP
                                                    Scaleout application images and the operator images.
MSDP kubectl plug-in at /bin/kubectl-msdp           Used to deploy and manage the MSDP Scaleout operator
                                                    tasks.
Configuration (.yaml) files at /operator directory  You can edit these to suit your configuration
                                                    requirements before installation.
Sample product (.yaml) files at /samples directory  You can use these as templates to define your
                                                    NetBackup environment.
Known limitations
Here are some known limitations.
■ Changes to the CorePattern, which specifies the path used for storing core dump
files in case of a crash, are not supported. CorePattern can only be set during
initial deployment.
■ Changes to MSDP Scaleout credential autoDelete, which allows automatic
deletion of the credential after use, are not supported. The autoDelete value can
only be set during initial deployment.
Manual deployment
Deploying the operators
To perform these steps, log on to the Linux workstation or VM where you have
extracted the TAR file.
To deploy the operators
1 Install the MSDP kubectl plug-in at a location that is included in the PATH
environment variable of your shell. For example, copy the file kubectl-msdp
to /usr/local/bin/.
2 Run the following commands to load each of the product images to the local
docker instance.
$ docker load -i netbackup-main-10.2.tar.gz
$ docker load -i netbackup-flexsnap-${SNAPSHOT_MANAGER_VERSION}.tar.gz
Run the command docker image ls to confirm that the product images are
loaded properly to the docker cache.
3 Run the following commands to re-tag the images to associate them with your
container registry. Keep the image name and version the same as the original:
(AKS-specific) $ REGISTRY=<example.azurecr.io> (Replace with your own container registry name)
(EKS-specific) $ REGISTRY=<AccountID>.dkr.ecr.<region>.amazonaws.com
$ docker tag netbackup/main:10.2 ${REGISTRY}/netbackup/main:10.2
$ docker tag veritas/flexsnap-datamover:${SNAPSHOT_MANAGER_VERSION} ${REGISTRY}/veritas/flexsnap-datamover:${SNAPSHOT_MANAGER_VERSION}
If the repository is not created, then create the repository using the following
command:
aws ecr create-repository --repository-name <image-name> --region <region-name>
5 Run the following commands to push the images to the container registry.
$ docker push ${REGISTRY}/netbackup/main:10.2
$ docker push ${REGISTRY}/veritas/flexsnap-certauth:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-rabbitmq:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-fluentd:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-datamover:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-nginx:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-mongodb:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-core:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-deploy:${SNAPSHOT_MANAGER_VERSION}
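Because the tag and push commands differ only in the image name, they can be generated in a loop. The following is a minimal sketch, assuming only the image names that appear in the push commands above. Your release may include further images that are not listed in this section. The default values for REGISTRY and SNAPSHOT_MANAGER_VERSION are placeholders, and the run and tag_and_push helpers are hypothetical names. The sketch prints the docker commands by default (DRY_RUN=1) so that they can be reviewed before anything is pushed.

```shell
#!/usr/bin/env bash
REGISTRY=${REGISTRY:-example.azurecr.io}                      # placeholder
SNAPSHOT_MANAGER_VERSION=${SNAPSHOT_MANAGER_VERSION:-x.y.z}   # placeholder
DRY_RUN=${DRY_RUN:-1}

run() {
  # Print the command in dry-run mode; execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Re-tag one image for the registry and push it, keeping name and version.
tag_and_push() {
  local image=$1
  run docker tag "$image" "${REGISTRY}/${image}"
  run docker push "${REGISTRY}/${image}"
}

tag_and_push "netbackup/main:10.2"
for name in certauth rabbitmq fluentd datamover nginx mongodb core deploy; do
  tag_and_push "veritas/flexsnap-${name}:${SNAPSHOT_MANAGER_VERSION}"
done
```

With DRY_RUN=0 the same script runs docker tag and docker push for each image, which requires that the images were loaded in the earlier step and that docker is logged in to the registry.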
7 Install the MSDP Scaleout operator in the created namespace, using this
command. To run this command you must define a full image name in step 3,
define a storage class for storing logs from the MSDP operator, and define
node selector labels (optional) for scheduling the MSDP operator pod on specific
nodes. See “Prerequisites” on page 31.
$ kubectl msdp init --image ${REGISTRY}/msdp-operator:18.0 --storageclass x --namespace netbackup-operator-system -l key1=value1
images:
- name: netbackupoperator
  newName: example.com/netbackup/operator
  newTag: 'SNAPSHOT_MANAGER_VERSION'
nodeSelector:
  nbpool: nbnodes
# Support node taints by adding pod tolerations equal to the specified nodeSelectors
# For Toleration NODE_SELECTOR_KEY used as a key and NODE_SELECTOR_VALUE as a value.
tolerations:
- key: nbpool
  operator: "Equal"
  value: nbnodes
10 Configure the namespace, image name, and node selector to use for the NetBackup
Snapshot Manager operator image by editing the provided configuration YAML
files. Edit the operator/kustomization.yaml file and change newName and
newTag. Also change the Snapshot Manager operator’s node selector and
toleration (CONTROL_NODE_KEY and CONTROL_NODE_VALUE).
The values of CONTROL_NODE_KEY and CONTROL_NODE_VALUE should
match the values of the fields listed in operator/patches/operator_patch.yaml
> nodeSelector (labelKey, labelValue) and tolerations (key, value), so that the
Snapshot Manager operator also runs on the same node as the NetBackup
operator. For example:
images:
- name: cloudpointoperator
  newName: example.com/veritas/flexsnap-deploy
  newTag: 'SNAPSHOT_MANAGER_VERSION'
patches:
- target:
    kind: Deployment
    name: flexsnap-operator
  patch: |
    - op: replace
      path: /spec/template/spec/tolerations/0/key
      value: nbu-control-pool
    - op: replace
      path: /spec/template/spec/tolerations/0/value
      value: nbupool
    - op: replace
      path: /spec/template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/0/key
      value: nbu-control-pool
    - op: replace
      path: /spec/template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/0/values/0
      value: nbupool
11 To install the NetBackup and Snapshot Manager operator, run the following
command from the installer's root directory:
$ kubectl apply -k operator
Here, nb-example is the name of the namespace. The Primary, Media, and
MSDP Scaleout application namespace must be different from the one used
by the operators. It is recommended to use two namespaces: one for the
operators and a second one for the applications.
2 Create a secret to hold the primary server credentials. Those credentials are
configured in the NetBackup primary server, and other resources in the
NetBackup environment use them to communicate with and configure the
primary server. The secret must include fields for `username` and `password`.
If you are creating the secret by YAML, the type should be opaque or basic-auth.
For example:
apiVersion: v1
kind: Secret
metadata:
  name: primary-credentials
  namespace: nb-example
type: kubernetes.io/basic-auth
stringData:
  username: nbuser
  password: p@ssw0rd
3 Create a KMS DB secret to hold Host Master Key ID (`HMKID`), Host Master
Key passphrase (`HMKpassphrase`), Key Protection Key ID (`KPKID`), and
Key Protection Key passphrase (`KPKpassphrase`) for NetBackup Key
Management Service. If creating the secret by YAML, the type should be
_opaque_. For example:
apiVersion: v1
kind: Secret
metadata:
  name: example-key-secret
  namespace: nb-example
type: Opaque
stringData:
  HMKID: HMKID
  HMKpassphrase: HMKpassphrase
  KPKID: KPKID
  KPKpassphrase: KPKpassphrase
You can also create a secret using kubectl from the command line:
$ kubectl create secret generic example-key-secret --namespace nb-example --from-literal=HMKID="HMKID" --from-literal=HMKpassphrase="HMKpassphrase" --from-literal=KPKID="KPKID" --from-literal=KPKpassphrase="KPKpassphrase"
4 Create a secret to hold the MSDP Scaleout credentials for the storage server.
The secret must include fields for `username` and `password` and must be
located in the same namespace as the Environment resource. If creating the
secret by YAML, the type should be _opaque_ or _basic-auth_. For example:
apiVersion: v1
kind: Secret
metadata:
  name: msdp-secret1
  namespace: nb-example
type: kubernetes.io/basic-auth
stringData:
  username: nbuser
  password: p@ssw0rd
You can also create a secret using kubectl from the command line:
$ kubectl create secret generic msdp-secret1 --namespace nb-example --from-literal=username='nbuser' --from-literal=password='p@ssw0rd'
Note: You can use the same secret for the primary server credentials (from
step 2) and the MSDP Scaleout credentials, so the following step is optional.
However, to use the primary server secret in an MSDP Scaleout, you must set
the `credential.autoDelete` property to false. The sample file includes an
example of setting the property. The default value is true, in which case the
secret may be deleted before all parts of the environment have finished using
it.
5 (Optional) Create a secret to hold the KMS key details. Specify the KMS Key only
if the KMS Key Group does not already exist and you need to create it.
Note: When reusing storage from previous deployment, the KMS Key Group
and KMS Key may already exist. In this case, provide KMS Key Group only.
If creating the secret by YAML, the type should be _opaque_. For example:
apiVersion: v1
kind: Secret
metadata:
  name: example-key-secret
  namespace: nb-example
type: Opaque
stringData:
  username: nbuser
  passphrase: 'test passphrase'
You can also create a secret using kubectl from the command line:
$ kubectl create secret generic example-key-secret --namespace nb-example --from-literal=username="nbuser" --from-literal=passphrase="test passphrase"
You may need this key for future data recovery. After you have successfully
deployed and saved the key details, it is recommended that you delete this
secret and the corresponding key info secret.
6 (Optional for AKS-specific) Create a secret to hold the MSDP S3 root credentials
if you need MSDP S3 service. The secret must include accessKey and
secretKey, and must be located in the same namespace as the Environment
resource.
■ accessKey must match the regex pattern ^[\w]+$ and have a length in
the range [16, 128].
■ secretKey must match the regex pattern ^[\w+\/]+$ and have a length
in the range [32, 128].
It is recommended that you generate random S3 root credentials. Run the
following command:
$ kubectl msdp generate-s3-secret --namespace nb-example --s3secret s3-secret1
Save the generated S3 root credentials at a secure place for later use.
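If you supply your own S3 root credentials instead of generating them, the accessKey and secretKey rules above can be checked before the secret is created. The following is a minimal sketch and not part of the product: the function names valid_access_key and valid_secret_key are hypothetical, and \w is assumed to mean ASCII letters, digits, and underscore, which is written out explicitly because grep -E does not portably support \w inside a bracket expression.

```shell
#!/usr/bin/env bash
# accessKey rule: ^[\w]+$ with a length in the range [16, 128].
valid_access_key() {
  case "$1" in *$'\n'*) return 1 ;; esac   # reject multi-line values
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9_]{16,128}$'
}

# secretKey rule: ^[\w+\/]+$ with a length in the range [32, 128].
valid_secret_key() {
  case "$1" in *$'\n'*) return 1 ;; esac   # reject multi-line values
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9_+/]{32,128}$'
}

# Illustrative value only; do not use it as a real credential.
if valid_access_key "example_access_key_01"; then
  echo "accessKey format: OK"
else
  echo "accessKey format: invalid"
fi
```

A value that passes both functions satisfies the documented pattern and length rules. It says nothing about how random the credentials are, so generating them with the kubectl msdp command above remains the recommended approach.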
Use this command to verify the new environment resource in your cluster:
$ kubectl get --namespace nb-example environments
NAME AGE
environment-sample 2m
After a few minutes, NetBackup finishes starting up on the primary server, and
then the media servers and MSDP Scaleout storage servers you configured
in the environment resource start appearing. Run:
$ kubectl get --namespace nb-example all,environments,primaryservers,mediaservers,msdpscaleouts
NAME STATUS
environment.netbackup.veritas.com/environment-sample Success
9 To start using your newly deployed environment, sign in to the NetBackup web UI.
Open a web browser and navigate to the https://<primaryserver>/webui/login
URL.
Here, primaryserver is the host name or IP address of the NetBackup primary
server.
You can retrieve the primary server's hostname by using the command:
$ kubectl describe primaryserver.netbackup.veritas.com/<primary server CR name> --namespace <namespace_name>
Here, nb-example is the name of the namespace. The Primary, Media, and
Snapshot Manager application namespace must be different from the one used
by the operators. It is recommended to use two namespaces: one for the
operators and a second one for the applications.
2 Create a secret to hold the Snapshot Manager credentials. The secret must
include fields for username and password. If you are creating the secret by
YAML, the type should be opaque or basic-auth.
For example:
apiVersion: v1
stringData:
  password: p@ssw0rd
  username: cpuser
kind: Secret
metadata:
  name: cp-creds
  namespace: nb-example
type: Opaque
Use this command to verify the new environment resource in your cluster:
kubectl get --namespace nb-example environments
After a few minutes, NetBackup finishes starting up the primary server, media
servers and Snapshot Manager servers in the sequence that you configured
in the environment resource. Snapshot Manager is registered with NetBackup
and cloud provider is configured automatically.
Run the following command:
kubectl get --namespace netbackup-environment all,environments,primaryservers,cpservers,mediaservers
pod/flexsnap-listener-664674-phjnd  1/1  Running  0  43m
pod/flexsnap-mongodb-f6b744df5-p4hfv  1/1  Running  0  43m
pod/flexsnap-nginx-8647f57db8-rzkt5  1/1  Running  0  43m
pod/flexsnap-notification-7db95868f5-dpx7z  1/1  Running  0  43m
pod/flexsnap-rabbitmq-0  1/1  Running  0  43m
pod/flexsnap-scheduler-68d8b75d75-5q4fk  1/1  Running  0  43m
pod/nbux-marketplace-10-239-207-44.vxindia.veritas.com  1/1  Running  0  72m
pod/nbux-marketplace-10-239-207-45.vxindia.veritas.com  2/2  Running  0  68m
pod/nbux-marketplace-10-239-207-46.vxindia.veritas.com  2/2  Running  0  64m
pod/nbux-marketplace-10-239-207-47.vxindia.veritas.com  2/2  Running  0  60m
pod/nbux-eks-dedupe1-uss-agent-56r7h  1/1  Running  0  73m
pod/nbux-eks-dedupe1-uss-agent-hvfw7  1/1  Running  0  73m
pod/nbux-eks-dedupe1-uss-agent-jx46x  1/1  Running  0  73m
pod/nbux-eks-dedupe1-uss-agent-pz7w8  1/1  Running  0  73m
pod/nbux-eks-dedupe1-uss-agent-r2kk  1/1  Running  0  73m
pod/nbux-eks-dedupe1-uss-agent-vx8gc  1/1  Running  0  73m
pod/nbux-eks-dedupe1-uss-controller-  1/1  Running  0  73m
pod/nbux-eks-dedupe1-uss-controller-1  1/1  Running  0  72m
pod/nbux-eks-dedupe1-uss-mds-1  1/1  Running  0  75m
pod/nbux-eks-dedupe1-uss-mds-2  1/1  Running  0  74m
pod/nbux-eks-dedupe1-uss-mds-3  1/1  Running  0  73m
pod/nbux-eks-media1-media-0
AGE
service/flexsnap-api-gateway  ClusterIP  172.20.92.70  <none>  8472/TCP  43m
service/flexsnap-certauth  ClusterIP  172.20.222.22  <none>  9000/TCP  43m
service/flexsnap-fluentd-service  ClusterIP  172.20.141.61  <none>  24224/TCP  43m
service/flexsnap-mongodb  ClusterIP  172.20.157.102  <none>  27017/TCP  43m
service/flexsnap-nginx  LoadBalancer  172.20.187.1  nbux-eks-cpserver-1-2b677a3f6b6ffe48.elb.us-west-2.amazonaws.com  443:31318/TCP,5671:31902/TCP  43m
service/flexsnap-rabbitmq  ClusterIP  172.20.33.99  <none>  5671/TCP  43m
service/ip-10-239-207-44-host-nbux-marketplace-10-239-207-44-vxindia-ve  LoadBalancer  172.20.186.216  k8s-ns155-ip102392-8d0152d6a0-8d77c9a84d1dd6a0.elb.us-west-2.amazonaws.com  10082:30397/TCP,10102:31873/TCP,10086:32374/TCP,443:30732/TCP,111:30721/TCP,662:32206/TCP,875:32361/TCP,892:31540/TCP,2049:30676/TCP,45209:31944/TCP,58329:30149/TCP,139:31587/TCP,445:31252/TCP  73m
service/ip-10-239-207-45-host-nbux-marketplace-10-239-207-45-vxindia-ve  LoadBalancer  172.20.23.36  k8s-ns155-ip102392-410db5113c-791da4601d58039f.elb.us-west-2.amazonaws.com  10082:31116/TCP,10102:31904/TCP,10086:32468/TCP,443:32693/TCP,111:32658/TCP,662:31151/TCP,875:30175/TCP,892:31126/TCP,2049:31632/TCP,45209:32602/TCP,58329:31082/TCP,139:31800/TCP,445:30795/TCP  73m
service/ip-10-239-207-46-host-nbux-marketplace-10-239-207-46-vxindia-ve  LoadBalancer  172.20.6.179  k8s-ns155-ip102392-1c160eed54-c975ee450ede1c20.elb.us-west-2.amazonaws.com  10082:31927/TCP,10102:31309/TCP,10086:30285/TCP,443:32648/TCP,111:32348/TCP,662:32170/TCP,875:31854/TCP,892:30842/TCP,2049:31357/TCP,45209:30002/TCP,58329:32408/TCP,139:30882/TCP,445:32017/TCP  73m
service/ip-10-239-207-47-host-nbux-marketplace-10-239-207-47-vxindia-ve  LoadBalancer  172.20.221.124  k8s-ns155-ip102392-bb1f5b1cbe-22e4275c1af33239.elb.us-west-2.amazonaws.com  10082:30137/TCP,10102:30727/TCP,10086:32649/TCP,443:30474/TCP,111:32690/TCP,662:31740/TCP,875:30437/TCP,892:32532/TCP,2049:32641/TCP,45209:31259/TCP,58329:31070/TCP,139:32393/TCP,445:30296/TCP  73m
service/nbux-eks-dedupe1-uss-controller  ClusterIP  172.20.197.231  <none>  10100/TCP  74m
service/nbux-eks-dedupe1-uss-mds  ClusterIP  None  <none>  2379/TCP,2380/TCP  75m
service/nbux-eks-dedupe1-uss-mds-client  ClusterIP  172.20.226.45  <none>  2379/TCP  75m
service/nbux-eks-media1-media-0  LoadBalancer  172.20.146.140  nbux-eks-media1-media-0-1a525c3f2587c8cf.elb.us-west-2.amazonaws.com  13782:31414/TCP,1556:31562/TCP  56m
service/nbux-eks-primary  LoadBalancer  172.20.108.104  nbux-eks-primary-c0f9de7ce7e231cd.elb.us-west-2.amazonaws.com  13781:30252/TCP,13782:31446/TCP,1556:31033/TCP,443:31517/TCP,8443:32237/TCP,22:30436/TCP  101m
NAME                                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR         AGE
daemonset.apps/flexsnap-fluentd              5         5         5       5            5           <none>                43m
daemonset.apps/nbux-eks-dedupe1-uss-agent    6         6         6       6            6           agentpool=msdpxpool   73m
NAME                                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/flexsnap-agent                                    1/1     1            1           43m
deployment.apps/flexsnap-agent-33e649abd383410ea618751f7b2eb8ae   1/1     1            1           28m
deployment.apps/flexsnap-api-gateway                              1/1     1            1           43m
deployment.apps/flexsnap-certauth                                 1/1     1            1           44m
deployment.apps/flexsnap-coordinator                              1/1     1            1           43m
deployment.apps/flexsnap-fluentd-collector                        1/1     1            1           43m
deployment.apps/flexsnap-listener                                 1/1     1            1           43m
deployment.apps/flexsnap-mongodb                                  1/1     1            1           43m
deployment.apps/flexsnap-nginx                                    1/1     1            1           43m
deployment.apps/flexsnap-notification                             1/1     1            1           43m
deployment.apps/flexsnap-scheduler                                1/1     1            1           43m
NAME                                                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/flexsnap-agent-33e649abd383410ea618751f7b2eb8ae-598b8b747   1         1         1       28m
replicaset.apps/flexsnap-agent-688d478bc8                                   1         1         1       43m
replicaset.apps/flexsnap-api-gateway-69cbbfc844                             1         1         1       43m
replicaset.apps/flexsnap-certauth-6f65894b69                                1         1         1       44m
replicaset.apps/flexsnap-coordinator-749649c7                               1         1         1       43m
replicaset.apps/flexsnap-fluentd-collector-7445f6fb9f                       1         1         1       43m
replicaset.apps/flexsnap-listener-664674                                    1         1         1       43m
replicaset.apps/flexsnap-mongodb-f6b744df5                                  1         1         1       43m
replicaset.apps/flexsnap-nginx-8647f57db8                                   1         1         1       43m
replicaset.apps/flexsnap-notification-7db95868f5                            1         1         1       43m
replicaset.apps/flexsnap-scheduler-68d8b75d75                               1         1         1       43m
NAME                                               READY   AGE
statefulset.apps/flexsnap-rabbitmq                 1/1     43m
statefulset.apps/nbux-eks-dedupe1-uss-controller   2/2     73m
statefulset.apps/nbux-eks-media1-media             1/1     55m
statefulset.apps/nbux-eks-primary                  1/1     101m
NAME                                         READY   AGE    STATUS
environment.netbackup.veritas.com/nbux-eks   4/4     102m   Success
NAME                                           TAG           AGE    STATUS
primaryserver.netbackup.veritas.com/nbux-eks   10.1.1.0085   102m   Success
NAME                                                TAG           AGE   PRIMARY SERVER                                        STATUS
mediaserver.netbackup.veritas.com/nbux-eks-media1   10.1.1.0085   56m   nbux-marketplace-10-239-207-42.vxindia.veritas.com   Success
NAME                                                 TAG             AGE   STATUS
cpserver.netbackup.veritas.com/nbux-eks-cpserver-1   10.1.1.0.1073   44m   Success
NAME                                             AGE   TAG         SIZE   READY
msdpscaleout.msdp.veritas.com/nbux-eks-dedupe1   75m   17.1.0085   4      4
Configuring the environment.yaml file
Parameter Description
namespace: example-ns
    Specify the namespace where all the NetBackup resources are managed. If not specified
    here, the namespace that is current when you run kubectl apply -f on this file is used.
(AKS-specific) containerRegistry: example.azurecr.io
(EKS-specific) containerRegistry: example.dkr.ecr.us-east-2.amazonaws.com/exampleReg
    Specify a container registry that the cluster has access to. NetBackup images are
    pushed to this registry.
tag: 10.2
    This tag is used for all images in the environment. Specifying a `tag` value on a
    sub-resource affects the images for that sub-resource only. For example, if you
    apply an EEB that affects only primary servers, you might set the `primary.tag`
    to the custom tag of that EEB. The primary server runs with that image, but the
    media servers and MSDP scaleouts continue to run images tagged `10.2`. Be aware
    that YAML treats values that look like numbers as numbers, even though this field
    must be a string; quote the value to avoid misinterpretation.
licenseKeys:
    List the license keys that are shared among all the sub-resources. Licenses
    specified in a sub-resource are appended to this list and applied only to that
    sub-resource.
paused: false
    Specify whether the NetBackup operator attempts to reconcile the differences
    between this YAML specification and the current Kubernetes cluster state. Set it
    to true only during maintenance.
configCheckMode: default
    This controls whether certain configuration restrictions are checked or enforced
    during setup. Other allowed values are skip and dryrun.
corePattern: /corefiles/core.%e.%p.%t
    Specify the path to use for storing core files in case of a crash.
(AKS-specific) loadBalancerAnnotations:
  service.beta.kubernetes.io/azure-load-balancer-internal-subnet: example-subnet
(EKS-specific) loadBalancerAnnotations:
  service.beta.kubernetes.io/aws-load-balancer-subnets: example-subnet1 name
    Specify the annotations to be added for the network load balancer service.
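The quoting caveat for `tag` in the table above can be illustrated with a small shell sketch. It is not part of the product: the helper name `yaml_tag_line` is hypothetical, and it only shows one way to emit the tag as a quoted YAML string so that a value such as 10.20 is not read as the number 10.2.

```shell
#!/bin/sh
# Hypothetical helper: print a YAML "tag" line with the value always quoted,
# so that a YAML parser keeps it as a string rather than a number.
yaml_tag_line() {
    printf 'tag: "%s"\n' "$1"
}

yaml_tag_line 10.2     # prints: tag: "10.2"
yaml_tag_line 10.20    # unquoted, a YAML parser would read 10.20 as the number 10.2
```

The same quoting applies to a `tag` set on a sub-resource.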
The following section describes Snapshot Manager related parameters. You may
also deploy without any Snapshot Manager. In that case, remove the cpServer
section entirely from the configuration file.
Parameter Description
containerRegistry (Optional)
    Specify a container registry that the cluster has access to. Snapshot Manager images
    are pushed to this registry, which overrides the one defined in the Common environment
    parameters table above.
log.storageClassName
    Storage class for the log volume. It must be an EFS-based storage class.
Parameter Description
proxySettings.vx_http_proxy:
    Address to be used as the proxy for all HTTP connections. For example,
    "http://proxy.example.com:8080/"
proxySettings.vx_https_proxy:
    Address to be used as the proxy for all HTTPS connections. For example,
    "http://proxy.example.com:8080/"
proxySettings.vx_no_proxy:
    Addresses that are allowed to bypass the proxy server. You can specify host names,
    IP addresses and domain names in this parameter. For example,
    "localhost,mycompany.com,169.254.169.254"
The following configurations apply to the primary server. The values specified in
the following table can override the values specified in the table above.
Parameter Description
tag: 10.2-special
    To use a different image tag specifically for the primary server, uncomment this
    value and provide the desired tag. This overrides the tag specified in the common
    section.
nodeSelector:
  labelKey: kubernetes.io/os
  labelValue: linux
    Specify a key and value that identifies nodes where the primary server pod runs.
    Note: This labelKey and labelValue must be the same label key:value pair that was
    used during cloud node creation; it is also used as a toleration for the primary
    server.
credSecretName: primary-credential-secret
    This determines the credentials for the primary server. Media servers use these
    credentials to register themselves with the primary server.
kmsDBSecret: kms-secret
    Secret name which contains the Host Master Key ID (HMKID), Host Master Key
    passphrase (HMKpassphrase), Key Protection Key ID (KPKID) and Key Protection Key
    passphrase (KPKpassphrase) for NetBackup Key Management Service. The secret should
    be of type 'Opaque', and can be created either using a YAML or the following
    example command:
    kubectl create secret generic kms-secret --namespace nb-namespace \
      --from-literal=HMKID="HMK@ID" \
      --from-literal=HMKpassphrase="HMK@passphrase" \
      --from-literal=KPKID="KPK@ID" \
      --from-literal=KPKpassphrase="KPK@passphrase"
capacity: 30Gi
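The kmsDBSecret row above mentions that the secret can also be created from YAML. A minimal sketch of such a manifest, written from a shell heredoc, might look like the following. The key names and placeholder values are taken from the example command; the function name `write_kms_secret` and the output path are hypothetical, and using `stringData` (so the values need not be base64-encoded by hand) is an assumption rather than something this guide prescribes.

```shell
#!/bin/sh
# Hypothetical helper: write an Opaque Secret manifest for the NetBackup KMS keys.
# $1 = namespace, $2 = output file. Placeholder values mirror the example command.
write_kms_secret() {
    cat > "$2" <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: kms-secret
  namespace: $1
type: Opaque
stringData:
  HMKID: "HMK@ID"
  HMKpassphrase: "HMK@passphrase"
  KPKID: "KPK@ID"
  KPKpassphrase: "KPK@passphrase"
EOF
}

out="${TMPDIR:-/tmp}/kms-secret.yaml"
write_kms_secret nb-namespace "$out"
# The manifest could then be applied with: kubectl apply -f "$out"
```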
The following section describes the media server configurations. If you do not have
a media server, either remove this section from the configuration file entirely or
define it as an empty list.
Parameters Description
tag: 10.2-special To use a different image tag specifically for the media
servers, uncomment this value and provide the desired
tag. This overrides the tag specified above in the common
table.
capacity: 50Gi The minimum data size for a media server is 50 Gi.
(AKS-specific) storageClassName:
managed-premium-nbux
(AKS-specific) storageClassName:
managed-premium-nbux
Parameters Description
ipAddr: 4.3.2.2
fqdn: media1-1.example.com
ipAddr: 4.3.2.3
fqdn: media1-2.example.com
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
allowVolumeExpansion: true
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
The following section describes MSDP-related parameters. You may also deploy
without any MSDP scaleouts. In that case, remove the msdpScaleouts section
entirely from the configuration file.
Parameter Description
tag: '18.0'
    This tag overrides the one defined in Table 1-3. It is necessary because the MSDP
    Scaleout images are shipped with tags different from the NetBackup primary and
    media images.
serviceIPFQDNs:
  - ipAddr: 1.2.3.4
    fqdn: dedupe1-1.example.com
  - ipAddr: 1.2.3.5
    fqdn: dedupe1-2.example.com
  - ipAddr: 1.2.3.6
    fqdn: dedupe1-3.example.com
  - ipAddr: 1.2.3.7
    fqdn: dedupe1-4.example.com
    These are the IP addresses and host names of the MSDP Scaleout servers. The number
    of entries should match the number of replicas specified above.
kms:
  keyGroup: example-key-group
    Specifies the initial key group and key secret to be used for KMS encryption. When
    reusing storage from a previous deployment, the key group and key secret may
    already exist. In this case, provide the keyGroup only.
  keySecret: example-key-secret
    Specify keySecret only if the key group does not already exist and needs to be
    created. The secret type should be Opaque, and you can create the secret either
    using a YAML or the following command:
(AKS-specific) loadBalancerAnnotations:
  service.beta.kubernetes.io/azure-load-balancer-internal: true
    For MSDP scaleouts, the default value for this annotation is `false`, which may
    cause the MSDP Scaleout services in this Environment to be accessible publicly.
credential:
  secretName: msdp-secret1
    This defines the credentials for the MSDP Scaleout server. It refers to a secret in
    the same namespace as this environment resource. The secret can be either of type
    'Basic-auth' or 'Opaque'. You can create secrets using a YAML or by using the
    following command:
    kubectl create secret generic <msdp-secret1> \
      --namespace <nb-namespace> \
      --from-literal=username=<"devuser"> \
      --from-literal=password=<"Y@123abCdEf">
autoDelete: false
    Optional parameter. Default value is true. When set to true, the MSDP Scaleout
    operator deletes the MSDP secret after using it. In that case, the MSDP and primary
    secrets must be distinct. To use the same secret for both MSDP scaleouts and the
    primary server, set autoDelete to false.
Parameter Description
dataVolumes:
  capacity: 5Gi
  (AKS-specific) storageClassName: standard
  (EKS-specific) storageClassName: gp2
    This specifies the data storage for this MSDP Scaleout resource. You may increase
    the size of a volume or add more volumes to the end of the list, but do not remove
    or re-order volumes. A maximum of 16 volumes is allowed. Appending new data volumes
    or expanding existing ones causes a short downtime of the Engines. The recommended
    volume size is 5Gi-32Ti.
nodeSelector:
  labelKey: kubernetes.io/os
  labelValue: linux
    Specify a key and value that identifies nodes where MSDP Scaleout pods will run.
secretName: s3-secret1
    Defines the MSDP S3 root credentials for the MSDP Scaleout server. It refers to a
    secret in the same namespace as this environment resource. If the parameter is not
    specified, the MSDP S3 feature is unavailable.
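The serviceIPFQDNs row above says that the number of entries should match the number of replicas. A rough pre-flight check could count the `ipAddr` entries in a fragment and compare the count with the replica count. The function name `fqdn_count_matches` and the sample fragment are hypothetical, and the text match is only a sketch; a real check would parse environment.yaml with a YAML-aware tool.

```shell
#!/bin/sh
# Hypothetical check: count "ipAddr:" entries on stdin and compare the count
# with the expected replica count given as $1. Returns 0 when they match.
fqdn_count_matches() {
    count=$(grep -c 'ipAddr:')
    [ "$count" -eq "$1" ]
}

# Sample fragment with two entries (illustrative values from this guide).
fragment='serviceIPFQDNs:
- ipAddr: 1.2.3.4
  fqdn: dedupe1-1.example.com
- ipAddr: 1.2.3.5
  fqdn: dedupe1-2.example.com'

if printf '%s\n' "$fragment" | fqdn_count_matches 2; then
    echo "entry count matches replicas"
else
    echo "entry count does not match replicas"
fi
```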
Parameter Description
name
    Specifies the prefix name for the primary, media, and MSDP Scaleout server resources.
(AKS-specific) ipAddr, fqdn and loadBalancerAnnotations
    The values of the ipAddr, fqdn and loadBalancerAnnotations fields should not be
    changed after the initial deployment. This applies to the primary, media, and MSDP
    Scaleout servers. For example:
    - The loadBalancerAnnotations:
      loadBalancerAnnotations:
        service.beta.kubernetes.io/azure-load-balancer-internal-subnet: example-subnet
        service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    - The IP and FQDN values defined for Primary, Media and MSDPScaleout:
      ipList:
      - ipAddr: 4.3.2.1
        fqdn: primary.example.com
      ipList:
      - ipAddr: 4.3.2.2
        fqdn: media1-1.example.com
      - ipAddr: 4.3.2.3
        fqdn: media1-2.example.com
      serviceIPFQDNs:
      - ipAddr: 1.2.3.4
        fqdn: dedupe1-1.example.com
      - ipAddr: 1.2.3.5
        fqdn: dedupe1-2.example.com
      - ipAddr: 1.2.3.6
        fqdn: dedupe1-3.example.com
      - ipAddr: 1.2.3.7
        fqdn: dedupe1-4.example.com
(EKS-specific) ipAddr, fqdn and loadBalancerAnnotations
    The values of the ipAddr, fqdn and loadBalancerAnnotations fields should not be
    changed after the initial deployment. This applies to the primary, media, and MSDP
    Scaleout servers. For example:
    - The loadBalancerAnnotations:
      loadBalancerAnnotations:
        service.beta.kubernetes.io/aws-load-balancer-internal-subnet: example-subnet
        service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    - The IP and FQDN values defined for Primary, Media and MSDPScaleout:
      ipList:
      - ipAddr: 4.3.2.1
        fqdn: primary.example.com
      ipList:
      - ipAddr: 4.3.2.2
        fqdn: media1-1.example.com
      - ipAddr: 4.3.2.3
        fqdn: media1-2.example.com
      serviceIPFQDNs:
      - ipAddr: 1.2.3.4
        fqdn: dedupe1-1.example.com
      - ipAddr: 1.2.3.5
        fqdn: dedupe1-2.example.com
      - ipAddr: 1.2.3.6
        fqdn: dedupe1-3.example.com
      - ipAddr: 1.2.3.7
        fqdn: dedupe1-4.example.com
parameters Description
parameters Description
ipList:
ipAddr: 4.3.2.2
fqdn: media1-1.example.com
ipAddr: 4.3.2.3
cpServer: media1-2.example.com
Uninstalling NetBackup environment and the operators
parameters Description
Storage class:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: <EFS ID>
  directoryPerms: "700"
reclaimPolicy: Retain
volumeBindingMode: Immediate
Note: Replace the environment custom resource names as per your configuration
in the steps below.
2 Wait for all the pods, services and resources to be terminated. To confirm, run:
$ kubectl get --namespace <namespace_name> \
    all,environments,primaryservers,mediaservers,msdpscaleouts
You should get a message that no resources were found in the namespace
(nb-example in this example).
3 To identify and delete any outstanding persistent volume claims, run the
following:
$ kubectl get pvc --namespace <namespace_name>
4 To locate and delete any persistent volumes created by the deployment, run:
$ kubectl get pv
Note: Certain storage drivers may cause persistent volumes to get stuck in the
terminating state. To resolve this issue, remove the finalizer, using the
command:
$ kubectl patch pv <pv-name> -p '{"metadata":{"finalizers":null}}'
Note: (EKS-specific) Navigate to mounted EFS directory and delete the content
from primary_catalog folder by running the rm -rf /efs/ command.
Note: Do not remove the MSDP Scaleout operator first as it may corrupt the
NetBackup operator.
For more information on uninstalling the Snapshot Manager, refer to the following
section:
See “Uninstalling Snapshot Manager from Kubernetes cluster” on page 213.
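The wait in step 2 of the uninstall procedure above could be scripted roughly as follows. This is a sketch under assumptions, not a supported tool: the function names are hypothetical, and the loop relies on kubectl printing a "No resources found" message for an empty namespace, as described in that step. The loop has no timeout, so it would run indefinitely if kubectl cannot reach the cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the text on stdin contains the
# "No resources found" message that kubectl prints for an empty namespace.
namespace_is_empty() {
    grep -q 'No resources found'
}

# Hypothetical polling loop (defined here, not executed). $1 = namespace.
wait_for_cleanup() {
    until kubectl get --namespace "$1" \
            all,environments,primaryservers,mediaservers,msdpscaleouts 2>&1 |
            namespace_is_empty; do
        sleep 10
    done
}

# Demonstrate the check on a sample message.
echo 'No resources found in nb-example namespace.' | namespace_is_empty &&
    echo "namespace is clean"
```

A call such as `wait_for_cleanup nb-example` would then block until the namespace is empty.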
6 Return to the deploy script and when prompted, enter yes to tag and push the
images. Wait for the images to be pushed, and then the script will pause to
ask another question. The remaining questions are not required, so press
Ctrl+c to exit the deploy script.
Applying security patches
The command prints the name of the image and includes the SHA-256 hash
identifying the image. For example:
(AKS-specific) exampleacr.azurecr.io/netbackup/operator@sha256:59d4d46d82024a1ab635333774c8e19eb5691f3fe988d86ae16a0c5fb636e30c
(EKS-specific) example.dkr.ecr.us-east-2.amazonaws.com/
2 To restart the NetBackup operator, run:
pod=$(kubectl get pod -n netbackup-operator-system \
  -l nb-control-plane=nb-controller-manager \
  -o jsonpath --template '{.items[*].metadata.name}')
3 Re-run the kubectl command from earlier to get the image ID of the NetBackup
operator. Confirm that it's different from what it was before the update.
3 Re-run the kubectl command from earlier to get the image ID of the MSDP
Scaleout operator. Confirm that it's different from what it was before the update.
2 Get the image ID of the existing NetBackup container and record it for later.
Run:
kubectl get pods -n nb-example $pod -o jsonpath --template \
  "{.status.containerStatuses[*].imageID}{'\n'}"
3 Look at the list of StatefulSets in the application namespace and identify the
one that corresponds to the pod or pods to be updated. The name is typically
the same as the pod, but without the number at the end. For example, a pod
named nb-primary-0 is associated with statefulset nb-primary. Hereafter the
statefulset will be referred to as $set. Run:
kubectl get statefulsets -n nb-example
The pod or pods associated with the statefulset are terminated and
re-created. It may take several minutes to reach the "Running" state.
5 Once the pods are running, re-run the kubectl command from step 2 to get the
image ID of the new NetBackup container. Confirm that it's different from what
it was before the update.
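Two parts of the procedure above lend themselves to a small sketch: deriving the statefulset name from a pod name (step 3) and confirming that the image ID changed (step 5). The helper names `statefulset_for_pod` and `image_updated` are hypothetical, and the suffix rule simply follows the "pod name without the number at the end" convention described in step 3.

```shell
#!/bin/sh
# Hypothetical helper: derive the StatefulSet name from a pod name by
# removing the trailing "-<ordinal>" suffix (nb-primary-0 -> nb-primary).
statefulset_for_pod() {
    printf '%s\n' "${1%-*}"
}

# Hypothetical helper: succeed when both image IDs are non-empty and differ,
# which indicates that the pod now runs a different image.
# $1 = image ID recorded before the update, $2 = image ID after the update
image_updated() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}

set_name=$(statefulset_for_pod nb-primary-0)
echo "$set_name"    # prints: nb-primary
```

The two image IDs would come from the output of the kubectl command in step 2, captured once before and once after the update.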
2 Get the image IDs of the existing MSDP Scaleout containers and record them
for later. All the MDS pods use the same image, and all the engine pods use
the same image, so it's only necessary to get three image IDs, one for each
type of pod.
kubectl get pods -n nb-example $engine $controller $mds -o jsonpath --template \
  "{range .items[*]}{.status.containerStatuses[*].imageID}{'\n'}{end}"
...
spec:
  ...
  msdpScaleouts:
  - ...
    tag: "17.0-update1"
4 Save the file and close the editor. The MSDP Scaleout pods are terminated
and re-created. It may take several minutes for all the pods to reach the
"Running" state.
5 Run kubectl get pods to check the list of pods and note the new name of
the uss-controller pod. Then, once the pods are all ready, re-run the kubectl
command above to get the image IDs of the new MSDP Scaleout containers.
Confirm that they're different from what they were before the update.
Chapter 4
Deploying NetBackup
This chapter includes the following topics:
AKS-specific requirements
Use the following checklist to prepare the AKS cluster for installation.
■ Your Azure Kubernetes cluster must be created with appropriate network and
configuration settings.
The supported Kubernetes cluster version is 1.21.x and later.
■ While creating the cluster, assign appropriate roles and permissions.
For more information, see Concepts - Access and identity in Azure Kubernetes
Services (AKS) - Azure Kubernetes Service | Microsoft Docs.
Preparing the environment for NetBackup installation on Kubernetes cluster
■ Use an existing Azure container registry or create a new one. Your Kubernetes
cluster must be able to access this registry to pull the images from the container
registry. For more information on the Azure container registry, see Azure
Container Registry documentation.
■ It is recommended to create a separate node pool for Media server installation
and select the Scale method as Autoscale. The autoscaling feature allows
your node pool to scale dynamically by provisioning and de-provisioning the
nodes as required automatically.
■ A dedicated node pool for Primary server must be created in Azure Kubernetes
cluster.
The following table lists the node configuration for the primary and media servers.
vCPU 16
RAM 64 GiB
Number of disks/node 1
Medium (8 nodes) 8 TB
■ Another dedicated node pool must be created for Snapshot Manager (if it has
to be deployed) with auto scaling enabled.
Following is the minimum configuration required for Snapshot Manager data
plane node pool:
RAM 8 GB
Following are the different scenarios showing how the NetBackup Snapshot Manager
calculates the number of jobs that can run at a given point in time, based on
the above-mentioned formula:
■ For a node configuration of 2 CPUs and 8 GB RAM:
RAM 8 GB
RAM 16 GB
■ All the nodes in the node pool must be running the Linux operating system.
■ Taints and tolerations allow you to mark (taint) a node so that no pods can
schedule onto it unless a pod explicitly tolerates the taint. Marking nodes instead
of pods (as in node affinity/anti-affinity) is particularly useful for situations where
most pods in the cluster must avoid scheduling onto the node.
Taints are set on the node pool while creating the node pool in the cluster.
Tolerations are set on the pods.
To use this functionality, you must create the node pool with the following details:
■ Add a label with a certain key and value. For example, key = nbpool, value =
nbnodes
■ Add a taint with the same key and value which is used for label in above
step with effect as NoSchedule.
For example, key = nbpool, value = nbnodes, effect = NoSchedule
Provide these details in the operator YAML as follows. To update the toleration
and node selector for the operator pod, edit the
operator/patch/operator_patch.yaml file. Provide the same label key:value in
the node selector section and in the toleration sections. For example,
nodeSelector:
  nbpool: nbnodes
# Support node taints by adding pod tolerations equal to the specified nodeSelectors
# For Toleration, NODE_SELECTOR_KEY is used as a key and NODE_SELECTOR_VALUE as a value.
tolerations:
- key: nbpool
  operator: "Equal"
  value: nbnodes
■ If you want to use static public IPs, private IPs and fully qualified domain names
for the load balancer service, the public IP addresses, private IP addresses and
FQDNs must be created in AKS before deployment.
■ If you want to bind the load balancer service IPs to a specific subnet, the subnet
must be created in AKS and its name must be updated in the annotations key
in the networkLoadBalancer section of the custom resource (CR).
For more information on the network configuration for a load balancer service,
refer to the How-to-Guide section of the Azure documentation.
For more information on managing the load balancer service, see “About the
Load Balancer service” on page 162.
■ Create a storage class of the Azure file storage type that uses the
file.csi.azure.com provisioner and allows volume expansion. It must be in the
LRS category with Premium SSD. It is recommended that the storage class has the
Retain reclaim policy. Such a storage class can be used for the primary server,
which supports only Azure premium files storage for the catalog volume.
For more information on Azure premium files, see Azure Files CSI driver.
For example,
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: {{ custom-storage-class-name }}
provisioner: file.csi.azure.com
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  storageaccounttype: Premium_LRS
  protocol: nfs
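Returning to the node pool label and taint described in the checklist above, the following sketch prints an Azure CLI command that creates a node pool whose label and NoSchedule taint share the same key and value. It assumes the Azure CLI is used to create the pool; the helper name `nodepool_cmd`, the resource group, the cluster name and the pool name are all hypothetical placeholders. The command is only printed, not run.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an Azure CLI command that creates a node
# pool whose label and NoSchedule taint use the same key and value.
# $1 = resource group, $2 = cluster name, $3 = pool name, $4 = key, $5 = value
nodepool_cmd() {
    printf 'az aks nodepool add --resource-group %s --cluster-name %s --name %s --labels %s=%s --node-taints %s=%s:NoSchedule\n' \
        "$1" "$2" "$3" "$4" "$5" "$4" "$5"
}

# Placeholder names; nbpool/nbnodes match the example key and value above.
nodepool_cmd example-rg example-aks nbpool1 nbpool nbnodes
```

Printing the command first makes it easy to review the label and taint before any node pool is created.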
Note: All the nodes in node group must be running on the Linux operating
system.
■ AmazonEKSServicePolicy
2 Use an existing AWS Elastic Container Registry or create a new one and
ensure that the EKS cluster has full access to pull images from the elastic
container registry.
3 It is recommended to create a separate node pool for Media server installation
with the autoscaler add-on installed in the cluster. The autoscaling feature allows
your node pool to scale dynamically by provisioning and de-provisioning the
nodes as required automatically.
4 A dedicated node pool for Primary server must be created in Amazon Elastic
Kubernetes Services cluster.
The following table lists the node configuration for the primary and media
servers.
vCPU 16
RAM 64 GiB
Number of disks/node 1
Medium (8 nodes) 8 TB
5 Another dedicated node pool must be created for Snapshot Manager (if it has
to be deployed) with auto scaling enabled.
Following is the minimum configuration required for Snapshot Manager data
plane node pool:
RAM 8 GB
Following are the different scenarios showing how the NetBackup Snapshot Manager
calculates the number of jobs that can run at a given point in time, based on
the above-mentioned formula:
■ For a node configuration of 2 CPUs and 8 GB RAM:
RAM 8 GB
RAM 16 GB
6 Taints and tolerations allow you to mark (taint) a node so that no pods can
schedule onto it unless a pod explicitly tolerates the taint. Marking nodes instead
of pods (as in node affinity/anti-affinity) is particularly useful for situations where
most pods in the cluster must avoid scheduling onto the node.
Taints are set on the node group while creating the node group in the cluster.
Tolerations are set on the pods.
To use this functionality, you must create the node group with the following
details:
■ Add a label with a certain key and value. For example, key = nbpool, value =
nbnodes
■ Add a taint with the same key and value which is used for label in above
step with effect as NoSchedule.
For example, key = nbpool, value = nbnodes, effect = NoSchedule
Provide these details in the operator YAML as follows. To update the
toleration and node selector for the operator pod, edit the
operator/patch/operator_patch.yaml file. Provide the same label key:value in
the node selector section and in the toleration sections. For example,
nodeSelector:
  nbpool: nbnodes
# Support node taints by adding pod tolerations equal to the specified nodeSelectors
# For Toleration, NODE_SELECTOR_KEY is used as a key and NODE_SELECTOR_VALUE as a value.
tolerations:
- key: nbpool
  operator: "Equal"
  value: nbnodes
9 The FQDN that will be provided in primary server CR and media server CR
specifications in networkLoadBalancer section must be DNS resolvable to the
provided IP address.
10 Amazon Elastic File System (Amazon EFS) is required for shared persistent storage.
To create EFS for primary server, see Create your Amazon EFS file system.
The EFS configuration can be as follows; you can update the Throughput mode as
required:
Performance mode: General Purpose
Throughput mode: Provisioned (256 MiB/s)
Availability zone: Regional
Note: To install the add-on in the cluster, ensure that you install the Amazon
EFS CSI driver. For more information on installing the Amazon EFS CSI driver,
see Amazon EFS CSI driver.
11 If the NetBackup client is outside the VPC, or if you want to access the web UI
from outside the VPC, then the NetBackup client CIDR must be added with all
NetBackup ports in the security group inbound rule of the cluster. For more
information on NetBackup ports, see “About the Load Balancer service” on page 162.
■ To obtain the cluster security group, run the following command:
aws eks describe-cluster --name <my-cluster> --query cluster.resourcesVpcConfig.clusterSecurityGroupId
■ The following link helps to add inbound rule to the security group:
Add rules to a security group
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: ebs-csi-storage-class
parameters:
  fsType: ext4
  type: gp2
provisioner: ebs.csi.aws.com
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
Note: Ensure that you install the Amazon EBS CSI driver to install the add-on
in the cluster. For more information on installing the Amazon EBS CSI driver,
see Managing the Amazon EBS CSI driver as an Amazon EKS add-on and
Amazon EBS CSI driver.
13 The EFS based PV must be specified for Primary server catalog volume with
ReclaimPolicy=Retain.
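For the inbound rule described in step 11 above, the following sketch prints one AWS CLI command per TCP port for a given security group and client CIDR. It is illustrative only: the helper name `ingress_cmds`, the security group ID and the CIDR are hypothetical placeholders, and the two ports shown are examples rather than the full NetBackup port list, which is covered in the Load Balancer service section. The commands are printed, not run.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) one authorize-security-group-ingress
# command per TCP port. $1 = security group ID, $2 = client CIDR, rest = ports.
ingress_cmds() {
    sg=$1
    cidr=$2
    shift 2
    for port in "$@"; do
        printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol tcp --port %s --cidr %s\n' \
            "$sg" "$port" "$cidr"
    done
}

# Placeholder security group ID and CIDR; the ports are illustrative only.
ingress_cmds sg-0123456789abcdef0 203.0.113.0/24 443 1556
```

The security group ID would come from the describe-cluster command shown in step 11.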
Host-specific requirements
Use the following checklist to address the prerequisites on the system that you want
to use as a NetBackup host that connects to the AKS/EKS cluster.
AKS-specific
■ Linux operating system: For a complete list of compatible Linux operating
systems, refer to the Software Compatibility List (SCL) at:
NetBackup Compatibility List for all Versions
https://sort.veritas.com/netbackup
■ Install Docker on the host to install NetBackup container images through tar,
and start the container service.
https://docs.docker.com/engine/install/
■ Prepare the host to manage the AKS cluster.
EKS-specific
■ Install AWS CLI.
For more information on installing the AWS CLI, see Installing or updating the
latest version of the AWS CLI.
■ Install Kubectl CLI.
For more information on installing the Kubectl CLI, see Installing kubectl.
■ Configure Docker to enable the push of the container images to the container
registry.
■ Create the OIDC provider for the AWS EKS cluster.
For more information on creating the OIDC provider, see Create an IAM OIDC
provider for your cluster.
■ Create an IAM service account for the AWS EKS cluster.
For more information on creating an IAM service account, see Amazon EFS
CSI driver.
■ If an IAM role needs access to the EKS cluster, run the following command
from a system that already has access to the EKS cluster:
kubectl edit -n kube-system configmap/aws-auth
For more information on creating an IAM role, see Enabling IAM user and role
access to your cluster.
■ Log in to the AWS environment to access the Kubernetes cluster by running the
following command on AWS CLI:
aws eks --region <region_name> update-kubeconfig --name <cluster_name>
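The Docker configuration item in the checklist above can be sketched for Amazon ECR as follows. This is an assumption about one common way to authenticate, not a step taken from this guide: the helper names are hypothetical, the registry host is the example value used elsewhere in this guide, and the login pipeline is printed rather than run.

```shell
#!/bin/sh
# Hypothetical helper: extract the region from an ECR registry host name of the
# form <account>.dkr.ecr.<region>.amazonaws.com (the region is the fourth field).
ecr_region() {
    printf '%s\n' "$1" | cut -d. -f4
}

# Hypothetical helper: print (not run) the login pipeline for that registry.
ecr_login_cmd() {
    printf 'aws ecr get-login-password --region %s | docker login --username AWS --password-stdin %s\n' \
        "$(ecr_region "$1")" "$1"
}

ecr_login_cmd example.dkr.ecr.us-east-2.amazonaws.com
```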
Recommendations of NetBackup deployment on Kubernetes cluster
■ Free space of approximately 8.5GB on the location where you copy and extract
the product installation TAR package file. If using docker locally, there should
be approximately 8GB available on the /var/lib/docker location so that the
images can be loaded to the docker cache, before being pushed to the container
registry.
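The free-space requirement above can be checked before extracting the package. The sketch below is a minimal illustration: the helper name `has_free_kib` is hypothetical, and the threshold converts the approximate 8.5GB figure to KiB as an assumption about how strictly the figure should be read.

```shell
#!/bin/sh
# Hypothetical helper: succeed when directory $1 has at least $2 KiB available.
# df -Pk prints one POSIX-format line per file system; column 4 is "Available".
has_free_kib() {
    avail=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail" -ge "$2" ]
}

# 8.5 GB is roughly 8.5 * 1024 * 1024 = 8912896 KiB.
required_kib=8912896
if has_free_kib . "$required_kib"; then
    echo "enough space for the TAR package"
else
    echo "not enough space for the TAR package"
fi
```

The same check could be pointed at /var/lib/docker with a threshold of about 8GB when Docker is used locally.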
■ AWS EFS-CSI driver should be installed for static PV/PVC creation of primary
catalog volume.
■ Deploy primary server custom resource and media server custom resource in
same namespace.
■ It is recommended to have a separate node pool for Media server installation with
Autoscaler enabled/installed.
■ Ensure that you follow the symbolic link and edit the actual persisted version of
the file, if you want to edit a file having a symbolic link in the primary server or
media server.
■ If the nbdeployutil utility does not perform well on volumes based on the
following storage types, specify a different block storage based volume to obtain
good performance:
(AKS-specific): Azure premium files
(EKS-specific): Amazon elastic files
■ Duplication job configuration recommendation:
While configuring destination storage unit, manually select media servers that
are always up, running and would never scale in (by the media server autoscaler).
Limitations of NetBackup deployment on Kubernetes cluster
The number of media servers that are always up and running is the same as
the value of the minimumReplicas field in the CR.
When upgrading to NetBackup 10.2 from an older version, ensure after the upgrade
that you manually select the media servers covered by the minimumReplicas field
in the CR. If the value of minimumReplicas is not specified, it is set
to the value specified for the replicas field.
■ Adjust the value of minimumReplicas field based on the backup environment
and requirements.
■ (AKS-specific)
■ Use Azure Premium storage for data volume in media server CR.
■ Use Azure Standard storage for log volume in media server CR.
■ For primary server catalog volume, use Azure premium files as storage
type and for media server volumes, use managed-disk as storage type.
■ In case of upgrade and during migration, do not delete the Azure premium
files/Azure disk volume linked to the old PV which is used in primary
server CR deployment until the migration is completed successfully. Else
this leads to data loss.
■ Do not skip the Config-Checker utility execution during NetBackup upgrade
or data migration.
(EKS-specific)
■ Use AWS Premium storage for data volume in media server CR.
■ Use AWS Standard storage for log volume in media server CR.
■ For primary server volume (catalog), use Amazon EFS as storage type. For
media server, primary server volumes, log and data volumes use Amazon
EBS as storage type.
■ In case of upgrade and during migration, do not delete the Amazon elastic
files linked to the old PV which is used in primary server CR deployment
until the migration is completed successfully. Else this leads to data loss.
EKS-specific
■ (Applicable only for media servers) A storage class that has the storage type
as EFS is not supported. When the Config-Checker runs the validation for
checking the storage type, the Config-Checker job fails if it detects the storage
type as EFS. But if the Config-Checker is skipped then this validation is not run,
and there can be issues in the deployment. There is no workaround available
for this limitation. You must clean up the PVCs and CRs and reapply the CRs.
Note: After deployment, you cannot change the Name in primary server and
media server CR.
■ Before the CRs can be deployed, the utility called Config-Checker is executed
that performs checks on the environment to ensure that it meets the basic
deployment requirements. The config-check is done according to the
configCheckMode and paused values provided in the custom resource YAML.
See “How does the Config-Checker utility work” on page 24.
■ You can deploy the primary server and media server CRs in same namespace.
■ (AKS-specific) Use the storage class that has the storage type as Azure premium
files for the catalog volumes in the primary server CR, and the storage type
as Azure disk for the data and log volumes in the media server CR and primary
server CR.
(EKS-specific) Use the storage class that has the storage type as Amazon
elastic files for the catalog volume in the primary server CR. For data and
log volumes in the media server use the storage type as EBS.
■ During fresh installation of the NetBackup servers, the value for Keep logs up
to under log retention configuration is set based on the log storage capacity
provided in the primary server CR inputs. You may change this value if required.
To update the log retention configuration, refer to the steps mentioned in the
NetBackup™ Logging Reference Guide.
■ The NetBackup deployment sets the value as per the following formula:
Size of logs PVC/PV * 0.8 = Keep logs up to value
By default, the value is set to 24GB.
For example: If the user configures the storage size in the CR as 40GB
(instead of the default 30GB), then the default value for that option becomes
32GB automatically based on the formula.
Note: This value will get automatically updated to the value of bp.conf file
on volume expansion.
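The formula can be checked with a short calculation. The following is a minimal Python sketch for illustration only; it is not part of NetBackup, the function name keep_logs_up_to_gb is hypothetical, and it assumes that the log volume size is given in GB.

```python
def keep_logs_up_to_gb(log_volume_size_gb):
    """Return the 'Keep logs up to' value for a given log volume size.

    Applies the documented formula: size of logs PVC/PV * 0.8.
    The function name is illustrative and not a NetBackup API.
    """
    if log_volume_size_gb <= 0:
        raise ValueError("log volume size must be positive")
    # Round to two decimals to hide floating-point noise.
    return round(log_volume_size_gb * 0.8, 2)


if __name__ == "__main__":
    # 30GB is the default log volume size mentioned above; 40GB is the example.
    for size_gb in (30, 40):
        print(f"{size_gb}GB log volume: keep logs up to {keep_logs_up_to_gb(size_gb)}GB")
```

With the default 30GB log volume the sketch yields 24, and with 40GB it yields 32, which matches the values stated above.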
■ Deployment details of primary server and media server can be observed from
the operator pod logs using the following command:
kubectl logs <operator-pod-name> -c netbackup-operator -n <operator-namespace>
■ Initially, the pod is in the not ready state (0/1) while the installation is in
progress in the background. Check the pod logs for installation progress using
the following command:
kubectl logs <media-pod-name> -n <namespace>
The media server can be considered successfully installed and running when the
media server pod’s state is ready (1/1), and the StatefulSet is ready (1/1), for
each replica count.
■ Details of media server name for each replica can be obtained from media server
CR status by running the following command:
kubectl describe MediaServer <MediaServer_cr_name> -n <namespace>
Fields Description
ActiveReplicas: Indicates the actual number of replicas that must be running to
complete the ongoing operations on the media servers. Default value is 0. It is
0 if the media server autoscaler is disabled.
Note: If the autoscaler was enabled and is later turned off, the value is set
to the value of minimumReplicas. It remains minimumReplicas even though the
media server autoscaler is disabled.
NextIterationTime: Indicates the next iteration time of the media server
autoscaler; that is, the media server autoscaler runs only after
NextIterationTime. Default value is empty.
NextCertificateRenewalTime: Next time to scale up all registered media servers
for certificate renewal.
Configuration parameters
■ ConfigMap
A new ConfigMap with name nbu-media-autoscaler-configmap is created
during deployment and the key-value pairs would be consumed for tuning the
media server autoscaler. This ConfigMap is common to all the media server CR
objects and supports the following keys:
Parameters Description
Note: Media server autoscaler can scale out or scale in multiple pods at the same time.
The time taken for media server scale depends on the value of
scaling-interval-in-seconds configuration parameter. During this interval, the
jobs would be served by existing media server replicas based on NetBackup
throttling parameters. For example, Maximum concurrent jobs in storage unit,
Number of jobs per client, and so on.
Cluster's native autoscaler takes some time as per scale-down-unneeded-time
attribute, which decides on the time a node should be unneeded before it is eligible
to be scaled down. By default this is 10 minutes. To change this parameter, edit
the cluster-autoscaler’s current deployment settings using the following commands
and then edit the existing value:
■ AKS: az aks update -g $RESOURCE_GROUP_NAME -n
■ EKS: kubectl -n kube-system edit deployment cluster-autoscaler
Note the following:
■ For scaled in media servers, certain resources and configurations are retained
to avoid reconfiguration during subsequent scale out.
Primary and media server CR
The following table describes the primary server CR and media server CR status fields:
Table 4-1
Section | Field / Value | Description
Primary Server Details | Host Name | Name of the primary server that should be used to access the web UI.
Following tables describe the specs that can be edited for each CR.
Spec Description
(AKS-specific) capacity Catalog, log and data volume storage capacity can be
updated.
Spec Description
(AKS-specific) capacity Catalog, log and data volume storage capacity can be
updated.
If you edit any other fields, the deployment can go into an inconsistent state.
Additional steps
■ Delete the Load Balancer service created for the media server by running the
following commands:
$ kubectl get service --namespace <namespace_name>
$ kubectl delete service <service-name> --namespace <namespace_name>
■ Identify and delete any outstanding persistent volume claims for the media server
by running the following commands:
$ kubectl get pvc --namespace <namespace_name>
$ kubectl delete pvc <pvc-name> --namespace <namespace_name>
■ Locate and delete any persistent volumes created for the media server by running
the following commands:
$ kubectl get pv
$ kubectl delete pv <pv-name> --grace-period=0 --force
Notes:
■ Deleting a CR deletes all its child resources, such as the pod, StatefulSet,
services, configmaps, config checker job, and config checker pod.
■ Deleting the operator with kubectl delete -k <operator_folder_path> deletes
the CRs and their resources, except the PVCs.
■ The persistent volume claim (PVC) is not deleted when a CR is deleted, so that
the data is retained in the volumes. If you then create a new CR with the same
name as the deleted one, the existing PVC with that same name is
automatically linked to the newly created pods.
■ Do not delete the /mnt/nbdata, /mnt/nblogs and /mnt/nbdb folders manually from
the primary server and media pods. Doing so puts the NetBackup deployment into
an inconsistent state and also results in data loss.
Configuring NetBackup IT Analytics for NetBackup deployment
Note: From NetBackup version 10.2, cloudscale release data collector on primary
server pod is supported.
itanalyticsportal.<yourdomain>
itanalyticsagent.<yourdomain>
aptareportal.<yourdomain>
aptareagent.<yourdomain>
cd "/mnt/nbdata/"
mkdir analyticscollector
PROXY_USERNAME=
PROXY_PASSWORD=
PROXY_EXCLUDE=
■ Run the ./dc_installer.sh -c /usr/openv/analyticscollector/installer/responsefile.sample
command to connect the data collector with the IT Analytics portal.
10 Check the data collector services status by running the following command
and ensure that the following data collector services are up and running:
/usr/openv/analyticscollector/mbs/bin/aptare_agent status
For more information about IT Analytics data collector policy, see NetBackup IT
Analytics User Guide.
3 Create and copy NetBackup API key from NetBackup web UI.
Configuring the primary server with NetBackup IT Analytics tools is supported only
once from primary server CR.
For more information about IT Analytics data collector policy, see Add a Veritas
NetBackup Data Collector policy and for more information about adding NetBackup
Primary Servers within the Data Collector policy, see Add/Edit NetBackup Master
Servers within the Data Collector policy.
Managing NetBackup deployment using VxUpdate
3 Restart the sshd service using the systemctl restart sshd command.
After adding the VxUpdate package to nbrepo, this package is persisted even
after pod restarts.
2 Change the node selector labelKey and labelValue to new values for
primary/media server.
3 Save the environment CR.
This sets the StatefulSet replica count of the respective NetBackup server to 0,
which terminates the pods. After successful migration, the StatefulSet
replicas are set back to the original value.
Chapter 5
Deploying NetBackup using Helm charts
This chapter includes the following topics:
■ Overview
Overview
Helm is an open source packaging tool for deploying and managing the life cycle
of complex Kubernetes applications using the helm command-line tool. A Helm chart
helps define, install, and upgrade the constituents of a Kubernetes application.
The steps involved in deploying the Kubernetes application are automated
through a Helm chart for the application.
The Helm chart package is also in an OCI compliant format, which allows it to be
pushed to and pulled from an OCI compliant registry such as the Azure or AWS
container registries. Therefore, the same deployment methodology (pull images
and install) that is used for container images can be used for Helm charts. This
allows the container software and the deployment tool to be packaged, shipped,
and installed together.
Installing NetBackup using Helm charts
■ Edit the values.yaml file to fill-in the required input values similar to the
environment.yaml file used for manual deployment.
■ Install the Helm chart specifying the values.yaml file using the following
command:
# helm install <release-name> netbackup-cloudscale-10.2.tar.gz -f values.yaml
4 Using the following commands, verify that all the pods in the
netbackup-operator-system and NetBackup namespaces go into the ready
state:
# kubectl get pods -n netbackup-operator-system
# kubectl get pods -n netbackup
Uninstalling NetBackup using Helm charts
3 The pending global objects cluster roles and role bindings must be cleared up
using the following kubectl commands:
Step 1 | Install the docker images and binaries. | See “Installing the docker images and binaries” on page 116.
For information about AKS Uptime SLA and to enable it, see Azure
Kubernetes Service (AKS) Uptime SLA.
If the NetBackup servers are on Azure cloud, besides the NetBackup configuration
requirements, the following settings are recommended. They are not MSDP-specific
requirements; they just help your NetBackup environment run smoothly on Azure
cloud.
■ Add the following in /usr/openv/netbackup/bp.conf
HOST_HAS_NAT_ENDPOINTS = YES
net.ipv4.tcp_keepalive_time=120
net.core.somaxconn = 1024
Tune the max open files to 1048576 if you run concurrent jobs.
You must have a dedicated node group for MSDP Scaleout created. The node
group should not cross availability zone.
The AWS Auto Scaling allows your node group to scale dynamically as required.
If AWS Auto Scaling is not enabled, ensure the node number is not less than
MSDP Scaleout size.
It is recommended that you set the minimum node number to 1 or more to bypass
some limitations in EKS.
■ Client machine to access EKS cluster
■ A separate computer that can access and manage your EKS cluster and
ECR.
■ It must have Linux operating system.
■ It must have Docker daemon, the Kubernetes command-line tool (kubectl),
and AWS CLI installed.
The Docker storage size must be more than 6 GB. The version of kubectl
must be v1.19.x or later. The version of AWS CLI must meet the EKS cluster
requirements.
■ If EKS is a private cluster, see Creating a private Amazon EKS cluster.
■ If the internal IPs are used, reserve N internal IPs and make sure they are not
used. N matches the MSDP-X cluster size which is to be configured.
These IPs are used for network load balancer services. For the private IPs,
please do not use the same subnet with the node group to avoid IP conflict with
the secondary private IPs used in the node group.
For the DNS name, you can use the private IP DNS name that Amazon provides,
or you can create DNS and Reverse DNS entries under Route53.
HOST_HAS_NAT_ENDPOINTS = YES
net.ipv4.tcp_keepalive_time=120
net.core.somaxconn = 1024
Tune the max open files to 1048576 if you run concurrent jobs.
3 Copy MSDP kubectl plugin to a directory from where you access AKS or EKS
host. This directory can be configured in the PATH environment variable so
that kubectl can load kubectl-msdp as a plugin automatically.
For example,
cp ./VRTSpddek-*/bin/kubectl-msdp /usr/local/bin/
4 Push the docker images to the ACR. Keep the image name and version same
as original.
3 Copy MSDP kubectl plugin to a directory from where you access AKS or EKS
host. This directory can be configured in the PATH environment variable so
that kubectl can load kubectl-msdp as a plugin automatically.
For example,
cp ./VRTSpddek-*/bin/kubectl-msdp /usr/local/bin/
■ Create a repository.
See AWS documentation Creating a private repository
■ Push the docker images to ECR. Keep the image name and version same
as original.
■ For EKS
kubectl msdp init -i <ecr-url>/msdp-operator:<version> -s <storage-class-name> [-l agentpool=<nodegroup-name>]
Option Description
■ AKS: agentpool=<nodepool-name>
■ EKS: agentpool=<nodegroup-name>
Range: 1-365
Default value: 28
Range: 1-20
Default value: 20
environment. MSDP operator runs with Deployment Kubernetes workload type with
single replica size in the default namespace msdp-operator-system.
MSDP operator also exposes the following services:
■ Webhook service
The webhook service is consumed by Kubernetes api-server to mutate and
validate the user inputs and changes of the MSDP CR for the MSDP Scaleout
configuration.
■ Metrics service
AKS: The metric service is consumed by Kubernetes/AKS for Azure Container
Insight integration.
EKS: The metric service is consumed by Kubernetes/EKS for Amazon
CloudWatch integration.
You can deploy only one MSDP operator instance in a Kubernetes cluster.
Run the following command to check the MSDP operator status.
kubectl -n msdp-operator-system get pods -o wide
In the STATUS column, if the readiness state for the controller, MDS and
engine pods are all Running, it means that the configuration has completed
successfully.
In the READINESS GATES column for engines, 1/1 indicates that the engine
configuration has completed successfully.
9 If you specified spec.autoRegisterOST.enabled: true in the CR, when the
MSDP engines are configured, the MSDP operator automatically registers the
storage server, a default disk pool, and a default storage unit in the NetBackup
primary server.
A field ostAutoRegisterStatus in the Status section indicates the registration
status. If ostAutoRegisterStatus.registered is True, it means that the
registration has completed successfully.
You can run the following command to check the status:
kubectl get msdpscaleouts.msdp.veritas.com -n <sample-namespace>
You can find the storage server, the default disk pool, and storage unit on the
Web UI of the NetBackup primary server.
If the command output is true, S3 service is configured and ready for use.
Otherwise wait for the flag to be true. The flag changes to true automatically
after all MSDP Scaleout resources are ready.
2 Use the following URL to access S3 service in MSDP Scaleout:
https://<first-MSDP-engine-FQDN>:8443
Limitations:
■ S3 service in MSDP Scaleout only supports NBCA certificates.
You can use the CA certificate in NetBackup’s primary server (the CA certificate
file path is /usr/openv/var/webtruststore/cacert.pem) to bypass SSL warnings
when accessing S3 service.
For example, when using AWS CLI you can use the --ca-bundle parameter to
specify the CA certificate file path to bypass SSL warnings.
■ The access point of S3 service is at the first MSDP Scaleout engine. If the first
engine is down, S3 service is unavailable until MSDP Scaleout operator restarts
it.
■ The S3 service process s3srv resides in the engines when S3 service is being
configured. If more engines are added after S3 service is configured, the process
s3srv does not run on the newly added engines.
■ If you have configured Disaster Recovery for cloud LSU and enabled S3 service,
the IAM users in the cloud LSU need to be recreated using the S3 root user.
Ensure that you save the S3 credential at a secure place after it is generated
for later use.
2 Input S3 credential field in existing CR resources.
■ If the MSDP Scaleout is deployed with environment YAML, run the following
command to update the spec.msdpScaleouts[<index>].s3Credential
field in the existing CR resources:
$ kubectl edit environment <environmentCR_name> -n <sample-namespace>
■ If the MSDP Scaleout is deployed with MSDP Scaleout YAML, run the
following command to update the spec.s3Credential field in the existing
CR resources:
$ kubectl edit msdpscaleout <MSDP Scaleout CR name> -n <sample-namespace>
If the command output is true, S3 service is configured and ready for use.
Chapter 7
Deploying Snapshot Manager
This chapter includes the following topics:
■ Prerequisites
Prerequisites
A working Azure Kubernetes cluster (AKS cluster)
■ Azure Kubernetes cluster
■ Your Azure Kubernetes cluster must be created with appropriate network
and configuration settings.
Supported Azure Kubernetes cluster version is 1.21.x and later.
■ Availability zone for AKS cluster must be disabled.
■ Two storage classes with the following configurations are required:
$ docker tag veritas/flexsnap-datamover:${SNAPSHOT_MANAGER_VERSION} ${REGISTRY}/veritas/flexsnap-datamover:${SNAPSHOT_MANAGER_VERSION}
Note: Ensure that you use the same tag as that of Snapshot Manager image
version. Custom tag cannot be used.
Installing the docker images
$ docker push ${REGISTRY}/veritas/flexsnap-rabbitmq:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-fluentd:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-datamover:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-nginx:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-mongodb:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-core:${SNAPSHOT_MANAGER_VERSION}
$ docker push ${REGISTRY}/veritas/flexsnap-deploy:${SNAPSHOT_MANAGER_VERSION}
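The tag and push commands follow the same pattern for every image, so they can be generated rather than typed one by one. The following Python sketch only builds the command strings; it does not run docker. The image list is taken from the push commands shown above and may not be the complete set for your release. The function name and the registry and version values in the usage lines are placeholders.

```python
# Image names taken from the docker push commands shown above.
# The list may not be complete for every release.
SNAPSHOT_MANAGER_IMAGES = [
    "flexsnap-rabbitmq",
    "flexsnap-fluentd",
    "flexsnap-datamover",
    "flexsnap-nginx",
    "flexsnap-mongodb",
    "flexsnap-core",
    "flexsnap-deploy",
]


def tag_and_push_commands(registry, version, images=SNAPSHOT_MANAGER_IMAGES):
    """Build the docker tag and docker push command for each image.

    The same version is used as the tag on both sides, because a custom
    tag cannot be used.
    """
    commands = []
    for image in images:
        local_ref = f"veritas/{image}:{version}"
        remote_ref = f"{registry}/veritas/{image}:{version}"
        commands.append(f"docker tag {local_ref} {remote_ref}")
        commands.append(f"docker push {remote_ref}")
    return commands


if __name__ == "__main__":
    # Placeholder registry and version; substitute your own values.
    for command in tag_and_push_commands("registry.example.com", "X.Y.Z"):
        print(command)
```

Printing the commands first makes it easy to review them before running them in a shell.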
3 Edit the cpServer CR section of the environment.yaml file in the text editor.
See “Configuring the environment.yaml file” on page 56.
4 Apply the CR file to the Kubernetes cluster:
kubectl apply -f <sample-cr-yaml>
Note: If Snapshot Manager is uninstalled and installed again with same NetBackup
primary server, then generate reissue token and edit the Snapshot Manager after
enabling it.
■ Telemetry reporting
Table 8-1
Action | Description | Probe name | Primary server (seconds) | Media server (seconds)
Health probes are run using the nbu-health command. If you want to manually run
the nbu-health command, the following options are available:
■ Disable
This option disables the health check, which marks the pod as not ready (0/1).
■ Enable
This option enables a previously disabled health check in the pod. This marks
the pod as ready (1/1) again if all the NetBackup health checks pass.
■ Deactivate
This option deactivates the health probe functionality in the pod. The pod
remains in the ready state (1/1). This avoids pod restarts caused by health
probe failures, such as liveness or readiness probe failures. This is a
temporary step and is not recommended for normal use.
■ Activate
This option activates the health probe functionality that was deactivated
earlier using the deactivate option.
You can manually disable or enable the probes if required. For example, if for any
reason you need to exec into the pod and restart the NetBackup services, the health
probes should be disabled before restarting the services, and then they should be
enabled again after successfully restarting the NetBackup services. If you do not
disable the health probes during this process, the pod may restart due to the failed
health probes.
You can check pod events in case of probe failures to get more details using
the kubectl describe pod <primary/media-pod-name> -n <namespace>
command.
Telemetry reporting
Telemetry reporting entries for the NetBackup deployment on AKS/EKS are indicated
with the AKS/EKS based deployments text.
■ By default, the telemetry data is saved at the /var/veritas/nbtelemetry/
location. The default location is not persisted across pod restarts.
■ If you want to save telemetry data to persisted location, then execute the kubectl
exec -it -n <namespace> <primary/media_server_pod_name> -- /bin/bash
command in the pod using the and execute telemetry command using
About NetBackup operator logs
■ NetBackup operator provides different log levels that can be changed before
deployment of NetBackup operator.
The following log levels are provided:
■ -1 - Debug
■ 0 - Info
■ 1 - Warn
■ 2 - Error
By default, the log level is 0.
It is recommended to use 0, 1, or 2 log level depending on your requirement.
Before you deploy NetBackup operator, you can change the log levels using
operator_patch.yaml.
If you change the operator log level after deployment, perform the following
steps for the change to take effect:
■ Apply the operator changes using the kubectl apply -k
<operator-folder> command.
■ Restart the operator pod. Delete the pod using the kubectl delete
pod/<netbackup-operator-pod-name> -n <namespace> command.
Kubernetes recreates the NetBackup operator pod after deletion.
■ Config-Checker jobs that run before the deployment of the primary server and
media server create pods. The logs for Config-Checker executions can be checked
using the kubectl logs <configchecker-pod-name> -n
<netbackup-operator-namespace> command.
■ Installation logs of NetBackup primary server and media server can be retrieved
using any of the following methods:
■ Run the kubectl logs <PrimaryServer/MediaServer-Pod-Name> -n
<PrimaryServer/MediaServer-namespace> command.
■ Execute the following command in the primary server/media server pod and
check the /mnt/nblogs/setup-server.log file:
kubectl exec -it <PrimaryServer/MediaServer-Pod-Name> -n <PrimaryServer/MediaServer-namespace> -- bash
■ (AKS-specific) Data migration jobs that run before the deployment of the
primary server create pods. The logs for data migration execution can be checked
using the following command:
kubectl logs <migration-pod-name> -n <netbackup-environment-namespace>
■ Execute the following respective commands to check the event logs that show
deployment logs for PrimaryServer and MediaServer:
■ For primary server: kubectl describe PrimaryServer <PrimaryServer
name> -n <PrimaryServer-namespace>
Expanding storage volumes
PVC will expand as per the new size and it will be available to volume mounts in
primaryServer pod.
To expand volume of data and log volumes for primary and media server
Note: (EKS-specific) Amazon EFS is an elastic file system; it does not enforce any
file system capacity limits. The actual storage capacity value in persistent volumes
and persistent volume claims is not used when creating the file system. However,
because storage capacity is a required field in Kubernetes, you must specify a valid
value. This value does not limit the size of your Amazon EFS file system.
1 Edit the environment custom resource using the kubectl edit Environment
<environmentCR_name> -n <namespace> command.
2 To pause the reconciler of the particular custom resource, change the paused:
false value to paused: true in the primaryServer or mediaServer section and
save the changes. In case of multiple media server objects, change the paused
value to true for the respective media server object only.
3 Edit the StatefulSet of the primary server or the particular media server
object using the kubectl edit statefulset <statefulset name> -n <namespace>
command, change the replica count to 0, and wait for all pods of the particular
CR object to terminate.
4 Update every persistent volume claim that requires a capacity resize with the
kubectl edit pvc <pvcName> -n <namespace> command. For a
particular media server object, resize the respective PVCs to the expected
storage capacity for all its replicas.
5 Update the respective custom resource section using the kubectl edit
Environment <environmentCR_name> -n <namespace> command with the updated
storage capacity for the respective volume and change paused to false. Save the
updated custom resource.
To update the storage details for the respective volume, add a storage section
with the specific volume and its capacity in the respective primaryServer or
mediaServer section in the environment CR.
The previously terminated pods and StatefulSet are recreated and should run
successfully. Each pod should be linked to its respective persistent volume
claim, and the data should be persisted.
6 Run the kubectl get pvc -n <namespace> command and check the capacity
column in the result to verify that the persistent volume claim storage capacity
is expanded.
7 (Optional) Update the log retention configuration for NetBackup depending on
the updated storage capacity.
For more information, refer to the NetBackup™ Administrator's Guide,
Volume I.
Allocating static PV for Primary and Media pods
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: managed-premium-retain
provisioner: disk.csi.azure.com
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: Immediate
parameters:
  storageaccounttype: Premium_LRS
  kind: Managed
Example 1: If user wants to deploy a media server with replica count 3.
For this scenario, you must create total 8 disks, 8 PV and 8 PVCs.
For data:
■ data-testmedia-media-0
■ data-testmedia-media-1
■ data-testmedia-media-2
For log:
■ logs-testmedia-media-0
■ logs-testmedia-media-1
■ logs-testmedia-media-2
Example 2: If user wants to deploy a media server with replica count 5.
For this scenario, you must create 12 disks, 12 PV and 12 PVCs.
For data:
■ data-testmedia-media-0
■ data-testmedia-media-1
■ data-testmedia-media-2
■ data-testmedia-media-3
■ data-testmedia-media-4
For log:
■ logs-testmedia-media-0
■ logs-testmedia-media-1
■ logs-testmedia-media-2
■ logs-testmedia-media-3
■ logs-testmedia-media-4
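The PVC names in the examples follow a fixed pattern, so the list for any replica count can be generated. The following Python sketch is illustrative only. The function names are hypothetical, and the pattern <volume>-<resourceNamePrefix>-media-<index> is taken from the examples above. The count of two primary server volumes is an assumption inferred from the totals in the examples (8 for 3 replicas, 12 for 5 replicas).

```python
def media_pvc_names(resource_name_prefix, replicas):
    """List the data and log PVC names for every media server replica."""
    names = []
    for volume in ("data", "logs"):
        for index in range(replicas):
            names.append(f"{volume}-{resource_name_prefix}-media-{index}")
    return names


def total_pvc_count(replicas, primary_pvcs=2):
    """Count the PVCs for the media replicas plus the primary server.

    primary_pvcs=2 is an assumption inferred from the example totals.
    """
    return 2 * replicas + primary_pvcs


if __name__ == "__main__":
    for name in media_pvc_names("testmedia", 3):
        print(name)
    print("Total disks, PVs and PVCs:", total_pvc_count(3))
```

The same count applies to the disks and PVs, because each PVC is backed by one PV and one disk.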
3 Create required number of Azure disks and save the ID of newly created disk.
For more information, see Azure Disk - Static
4 Create PVs for each disk and link the PVCs to respective PVs.
To create the PVs, specify the created storage class and diskURI (ID of the
disk received in step 3). The PV must be created using the claimRef field and
provide PVC name for its corresponding namespace.
For example, if you are creating PV for catalog volume, storage required is
128GB, diskName is primary_catalog_pv and namespace is test. PVC named
catalog-testprimary-primary-0 is linked to this PV when PVC is created in
the namespace test.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: catalog
spec:
  capacity:
    storage: 128Gi
  accessModes:
    - ReadWriteOnce
  azureDisk:
    kind: Managed
    diskName: primary_catalog_pv
    diskURI: /subscriptions/3247febe-4e28-467d-a65c-10ca69bcd74b/resourcegroups/MC_NBU-k8s-network_xxxxxx_eastus/providers/Microsoft.Compute/disks/deepak_s_pv
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: catalog-testprimary-primary-0
    namespace: test
5 Create PVC with correct PVC name (step 2), storage class and storage.
For example,
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: catalog-testprimary-primary-0
  namespace: test
spec:
  storageClassName: "managed-premium-retain"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 128Gi
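Before applying a statically provisioned PV and its PVC, it can help to confirm that the two manifests agree. The following Python sketch compares the fields used in the examples above. It is illustrative only: the function name is hypothetical, the manifests are represented as plain dictionaries that hold only the fields being compared, and the strict equality check on capacity is a simplification chosen to match the example.

```python
def binding_mismatches(pv, pvc):
    """Return a list of problems found between a PV and its intended PVC."""
    problems = []
    claim_ref = pv["spec"].get("claimRef", {})
    if claim_ref.get("name") != pvc["metadata"]["name"]:
        problems.append("claimRef.name does not match the PVC name")
    if claim_ref.get("namespace") != pvc["metadata"].get("namespace"):
        problems.append("claimRef.namespace does not match the PVC namespace")
    # Kubernetes only needs the PV to be at least as large as the request;
    # equality is checked here to mirror the example.
    requested = pvc["spec"]["resources"]["requests"]["storage"]
    if pv["spec"]["capacity"]["storage"] != requested:
        problems.append("PV capacity differs from the PVC storage request")
    if not set(pvc["spec"]["accessModes"]) <= set(pv["spec"]["accessModes"]):
        problems.append("PVC requests an access mode the PV does not offer")
    return problems


# Values copied from the example manifests above.
EXAMPLE_PV = {
    "spec": {
        "capacity": {"storage": "128Gi"},
        "accessModes": ["ReadWriteOnce"],
        "claimRef": {"name": "catalog-testprimary-primary-0", "namespace": "test"},
    }
}
EXAMPLE_PVC = {
    "metadata": {"name": "catalog-testprimary-primary-0", "namespace": "test"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "128Gi"}},
    },
}

if __name__ == "__main__":
    print(binding_mismatches(EXAMPLE_PV, EXAMPLE_PVC) or "No mismatches found")
```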
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-reclaim
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: Immediate
parameters:
  fsType: ext4
  type: gp2
Example 1: If user wants to deploy a media server with replica count 3.
Name of the Media PVC assuming resourceNamePrefix_of_media is testmedia.
For this scenario, you must create total 8 disks, 8 PV and 8 PVCs.
6 disks, 6 PV and 6 PVCs for media server.
For data of PrimaryServer:
For logs:
■ logs-testprimary-primary-0
For data:
■ data-testmedia-media-0
■ data-testmedia-media-1
■ data-testmedia-media-2
For log:
■ logs-testmedia-media-0
■ logs-testmedia-media-1
■ logs-testmedia-media-2
Example 2: If user wants to deploy a media server with replica count 5.
For this scenario, you must create 12 disks, 12 PV and 12 PVCs.
For data:
■ data-testmedia-media-0
■ data-testmedia-media-1
■ data-testmedia-media-2
■ data-testmedia-media-3
■ data-testmedia-media-4
For log:
■ logs-testmedia-media-0
■ logs-testmedia-media-1
■ logs-testmedia-media-2
■ logs-testmedia-media-3
■ logs-testmedia-media-4
3 Create the required number of AWS EBS volumes and save the VolumeId of
newly created volumes.
For more information on creating EBS volumes, see EBS volumes.
(For Primary Server volumes) Create the required number of EFS. User can
use single EFS to mount catalog of primary. For example, VolumeHandle in
PersistentVolume spec will be as follows:
<file_system_id>:/catalog
apiVersion: v1
kind: PersistentVolume
metadata:
  name: catalog
spec:
  accessModes:
    - ReadWriteMany
  awsElasticBlockStore:
    fsType: xfs
    volumeID: aws://us-east-2c/vol-xxxxxxxxxxxxxxxxx
  capacity:
    storage: 128Gi
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-retain
  volumeMode: Filesystem
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: catalog-testprimary-primary-0
    namespace: test
5 Create PVC with correct PVC name (step 2), storage class and storage.
For example,
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: catalog-testprimary-primary-0
  namespace: test
spec:
  storageClassName: gp2-retain
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 128Gi
AKS:
{
"controllers": [
{
"apiVersions": [
"1.0"
],
"name": "msdp-aks-demo-uss-controller",
"nodeName": "aks-nodepool1-25250377-vmss000002",
"productVersion": "15.1-0159",
"pvc": [
{
"pvcName": "msdp-aks-demo-uss-controller-log",
"stats": {
"availableBytes": "10125.98Mi",
"capacityBytes": "10230.00Mi",
"percentageUsed": "1.02%",
"usedBytes": "104.02Mi"
}
}
],
"ready": "True"
}
],
"engines": [
{
"ip": "x.x.x.x",
"name": "msdppods1.westus2.cloudapp.azure.com",
"nodeName": "aks-nodepool1-25250377-vmss000003",
"pvc": [
{
"pvcName": "msdppods1.westus2.cloudapp.azure.com-catalog",
"stats": {
"availableBytes": "20293.80Mi",
"capacityBytes": "20470.00Mi",
"percentageUsed": "0.86%",
"usedBytes": "176.20Mi"
}
},
{
"pvcName": "msdppods1.westus2.cloudapp.azure.com-data-0",
"stats": {
"availableBytes": "30457.65Mi",
"capacityBytes": "30705.00Mi",
"percentageUsed": "0.81%",
"usedBytes": "247.35Mi"
}
}
],
"ready": "True"
},
......
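The per-PVC statistics in the status output can be scanned programmatically. The sketch below assumes the status has been saved as JSON with the field names shown in the example above (controllers, engines, pvc, stats, percentageUsed). The function name pvcs_over and the threshold are hypothetical.

```python
import json


def pvcs_over(status: dict, threshold_percent: float) -> list[tuple[str, float]]:
    """Return (pvcName, percentageUsed) for every PVC at or above the threshold."""
    flagged = []
    for group in ("controllers", "engines"):
        for component in status.get(group, []):
            for pvc in component.get("pvc", []):
                # percentageUsed is reported as a string such as "1.02%"
                used = float(pvc["stats"]["percentageUsed"].rstrip("%"))
                if used >= threshold_percent:
                    flagged.append((pvc["pvcName"], used))
    return flagged


# Trimmed sample that reuses values from the AKS example output.
SAMPLE_STATUS = json.loads("""
{
  "controllers": [
    {"name": "msdp-aks-demo-uss-controller",
     "pvc": [{"pvcName": "msdp-aks-demo-uss-controller-log",
              "stats": {"percentageUsed": "1.02%"}}]}
  ],
  "engines": [
    {"name": "msdppods1.westus2.cloudapp.azure.com",
     "pvc": [{"pvcName": "msdppods1.westus2.cloudapp.azure.com-catalog",
              "stats": {"percentageUsed": "0.86%"}}]}
  ]
}
""")

print(pvcs_over(SAMPLE_STATUS, 1.0))
```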
EKS:
"capacityBytes": "9951.27Mi",
"percentageUsed": "0.58%",
"usedBytes": "57.27Mi"
}
}
],
"ready": "True"
}
],
"engines": [
{
"ip": "x.x.x.x",
"name": "ip-x-x-x-x.ec2.internal",
"nodeName": "ip-x-x-x-x.ec2.internal",
"pvc": [
{
"pvcName": "ip-x-x-x-x.ec2.internal-catalog",
"stats": {
"availableBytes": "604539.68Mi",
"capacityBytes": "604629.16Mi",
"percentageUsed": "0.01%",
"usedBytes": "73.48Mi"
}
},
{
"pvcName": "ip-x-x-x-x.ec2.internal-data-0",
"stats": {
"availableBytes": "4160957.62Mi",
"capacityBytes": "4161107.91Mi",
"percentageUsed": "0.00%",
"usedBytes": "134.29Mi"
}
}
],
"ready": "True"
},
name: prometheus-cwagentconfig
namespace: amazon-cloudwatch
---
# create configmap for prometheus scrape config
apiVersion: v1
data:
  # prometheus config
  prometheus.yaml: |
    global:
      scrape_interval: 1m
      scrape_timeout: 10s
    scrape_configs:
    - job_name: 'msdpoperator-metrics'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: NameSpace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: PodName
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: amazon-cloudwatch
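The __address__ relabel rule in the scrape configuration is the least obvious part. Prometheus joins the source labels with a semicolon and requires the regex to match the whole joined string. The Python sketch below imitates that behavior to show the effect of the rule. The function name rewrite_address is hypothetical and the sample addresses are made up.

```python
import re

# Same pattern as the relabel rule: host, optional existing port, ";", annotated port.
ADDRESS_RULE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def rewrite_address(address: str, annotated_port: str) -> str:
    """Imitate the 'replace' relabel action for __address__."""
    joined = f"{address};{annotated_port}"  # source labels joined with ';'
    match = ADDRESS_RULE.fullmatch(joined)  # the rule must match the whole string
    if match is None:
        return address  # no match: the label is left unchanged
    return f"{match.group(1)}:{match.group(2)}"  # replacement $1:$2


print(rewrite_address("10.0.0.5:8080", "9443"))
print(rewrite_address("10.0.0.5:8080", ""))
```

The first call swaps the pod port for the port from the annotation. The second call leaves the address unchanged because the pod has no port annotation.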
Table 9-1 lists the Prometheus metrics that MSDP Scaleout supports.
4 Apply the YAML file.
kubectl apply -f Prometheus-eks.yaml
If multiple MSDP scaleout clusters are deployed in the same EKS cluster, use
the filter to search the results. For example, search the MSDP engines with
the free space size lower than 1GB in the namespace sample-cr-namespace.
Log query:
prometheus-data-collection-settings: |-
  [prometheus_data_collection_settings.cluster]
    interval = "1m"
    fieldpass = ["msdpoperator_reconcile_total",
      "msdpoperator_reconcile_failed",
      "msdpoperator_operator_run",
      "msdpoperator_diskFreeLess5GBEngines_total",
      "msdpoperator_diskFreeMiBytesInEngine",
      "msdpoperator_diskFreeLess10GBClusters_total",
      "msdpoperator_totalDiskFreePercentInCluster",
      "msdpoperator_diskFreePercentInEngine",
      "msdpoperator_pvcFreePercentInCluster",
      "msdpoperator_unhealthyEngines_total",
      "msdpoperator_createdPods_total"]
    monitor_kubernetes_pods = true
    monitor_kubernetes_pods_namespaces = ["msdp-operator-system"]
Table 9-2 lists the Prometheus metrics that MSDP Scaleout supports.
Monitoring MSDP Scaleout 157
Monitoring with Azure Container insights
The configuration change takes a few minutes and all omsagent pods in the
cluster restart.
The default namespace of prometheus metrics is prometheus.
5 Add alert rules for the integrated metrics.
Add the related log query, add a new alert rule for the selected query, and
add an alert group/action for it.
For example,
If the free space size of the MSDP Scaleout engines was lower than 1 GB in the
past 5 minutes, alert the users.
Log query:
InsightsMetrics
If multiple MSDP Scaleouts are deployed in the same AKS cluster, use the
filter to search the results. For example, search the MSDP engines with the
free space size lower than 1GB in the namespace sample-cr-namespace
Log query:
InsightsMetrics
| where Name == "msdpoperator_diskFreeMiBytesInEngine"
| where Namespace == "prometheus"
| where TimeGenerated > ago(5m)
| where Val <= 1000000
| where Val > 0
| extend Tags = parse_json(Tags)
| where Tags.msdpscalout_ns == "sample-cr-namespace"
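To make the filter logic of the log query explicit, the following Python sketch applies the same conditions to records held in memory. It is only an illustration: the record layout mirrors the column names used in the query, the function name low_space_engines is hypothetical, and the real query runs in Azure Log Analytics rather than in Python.

```python
from datetime import datetime, timedelta, timezone


def low_space_engines(records, cr_namespace, now=None):
    """Apply the same filters as the log query to a list of record dictionaries."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=5)  # TimeGenerated > ago(5m)
    return [
        r for r in records
        if r["Name"] == "msdpoperator_diskFreeMiBytesInEngine"
        and r["Namespace"] == "prometheus"
        and r["TimeGenerated"] > cutoff
        and 0 < r["Val"] <= 1000000  # same bounds as the query
        and r["Tags"].get("msdpscalout_ns") == cr_namespace
    ]
```

Passing a fixed value for now makes the five-minute window reproducible when the filter is tried against saved records.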
■ Run the following command to find the Kubernetes cluster level resources that
belong to the CR:
kubectl api-resources --verbs=list --namespaced=false -o name |
xargs -n 1 -i bash -c 'kubectl get --show-kind --show-labels
--ignore-not-found {} | grep -E "msdp-operator|<cr-name>"'
Chapter 10
Monitoring Snapshot Manager
This chapter includes the following topics:
■ Overview
■ Configuration parameters
Overview
The status of Snapshot Manager deployment can be verified by using the following
command:
kubectl describe cpserver -n $ENVIRONMENT_NAMESPACE
Status Description
You can find the Snapshot Manager log files under /cloudpoint/logs/ folder.
Configuration parameters
■ Any configuration related parameter that must be added in
/cloudpoint/flexsnap.conf file can be added in flexsnap-conf configmap
by editing it as follows:
kubectl edit configmap flexsnap-conf -n $ENVIRONMENT_NAMESPACE
For example, for changing the log level from info to debug, add the following:
[logging]
level = debug
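The fragment above uses an INI-style layout. Assuming flexsnap.conf follows standard INI rules (an assumption based only on this example), a setting can be read back with the Python standard library to confirm that an edit has the intended shape. The function name log_level is hypothetical.

```python
import configparser

# The example fragment from this section.
FRAGMENT = """
[logging]
level = debug
"""


def log_level(conf_text: str, default: str = "info") -> str:
    """Return the [logging] level, or the default when it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.get("logging", "level", fallback=default)


print(log_level(FRAGMENT))
```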
service.beta.kubernetes.io/azure-load-balancer-internal: "true".
In this case, by default internal load balancer is selected for deployment.
■ If the networkLoadBalancer section is not defined, an internal load
balancer with dynamic IP address allocation is used by default. In this case,
DNS names for the services can be obtained from HostName in the CR status
using the kubectl describe <CR name> -n <namespace> command.
■ Whenever HostName in the CR status is not in FQDN format, you must add
an entry for the hostname and its corresponding IP address to the hosts file
of the computer that accesses the primary server. The hosts file is present
at the following location:
■ For Linux: /etc/hosts
■ For Windows: c:\Windows\System32\Drivers\etc\hosts
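A hosts file entry is one line containing the IP address followed by the hostname. The Python sketch below builds such a line and applies a rough FQDN test. Both helper names are hypothetical, and the "contains a dot" heuristic is an assumption, not the check that NetBackup performs.

```python
import ipaddress


def looks_like_fqdn(hostname: str) -> bool:
    """Rough heuristic: a fully qualified name has at least one inner dot."""
    return "." in hostname.strip(".")


def hosts_line(ip: str, hostname: str) -> str:
    """Return a hosts file entry, rejecting malformed IP addresses."""
    ipaddress.ip_address(ip)  # raises ValueError if the address is malformed
    return f"{ip}\t{hostname}"


if not looks_like_fqdn("primary"):
    print(hosts_line("40.123.45.123", "primary"))
```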
networkLoadBalancer:
  type: Public
  annotations:
    - service.beta.kubernetes.io/azure-load-balancer-resource-group: <name of network resource-group>
  ipList:
    - fqdn: primary.eastus.cloudapp.azure.com
      ipAddr: 40.123.45.123

networkLoadBalancer:
  annotations:
    - service.beta.kubernetes.io/azure-load-balancer-resource-group: "<name of network resource-group>"
  ipList:
    - fqdn: media-1.eastus.cloudapp.azure.com
      ipAddr: 40.123.45.123
    - fqdn: media-2.eastus.cloudapp.azure.com
      ipAddr: 40.123.45.124
■ (EKS-specific)
■ NetBackup supports the network load balancer with AWS Load Balancer
scheme as internet-facing.
■ FQDN must be created before being used. Refer to the sections below for the
different annotations allowed in the CR spec.
■ User must add the following annotations:
service.beta.kubernetes.io/aws-load-balancer-subnets: <subnet1 name>
In addition to the above annotations, the user can add more annotations
supported by AWS if required. For more information, see AWS Load
Balancer Controller Annotations.
Example: CR spec in primary server,
networkLoadBalancer:
  type: Private
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-subnets: <subnet1 name>
  ipList:
    "10.244.33.27: abc.vxindia.veritas.com"

networkLoadBalancer:
  type: Private
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-subnets: <subnet1 name>
  ipList:
    "10.244.33.28: pqr.vxindia.veritas.com"
    "10.244.33.29: xyz.vxindia.veritas.com"
Note: The subnet provided here should be same as the one given in node
pool used for primary server and media server.
If the NetBackup client is outside the VPC, or if the Web UI is accessed from outside
the VPC, then the client CIDR must be added with all NetBackup ports to the security
group rules of the cluster.
Run the following command, to obtain the cluster security group:
aws eks describe-cluster --name <my-cluster> --query
cluster.resourcesVpcConfig.clusterSecurityGroupId
For more information on cluster security group, see Amazon EKS security group
requirements and considerations.
Add inbound rule to security group. For more information, see Add rules to a security
group.
■ 1556
Used as bidirectional port. Primary server to/from media servers and primary
server to/from client require this TCP port for communication.
■ 8443
Used for inbound connections to the Java nbwmc on the primary server.
■ 443
Used for inbound connections to the vnet proxy tunnel on the primary server.
This port is also used for Nutanix workloads and for communication from the
primary server to the deduplication media server.
■ 13781
The MQBroker listens on TCP port 13781. NetBackup client hosts, typically
located behind a NAT gateway, must be able to connect to the message
queue broker (MQBroker) on the primary server.
■ 13782
Used by primary server for bpcd process.
■ 22
Used by NetBackup IT Analytics data collector for data collection.
■ Media server:
■ 1556
Used as bidirectional port. Primary server to/from media servers and primary
server to/from client require this TCP port for communication.
■ 13782
Used by media server for bpcd process.
■ 443
The Snapshot Manager user interface uses this port as the default HTTPS
port.
■ 5671
The Snapshot Manager RabbitMQ server uses this port for internal service
communications. This port must be open to support multiple agents,
extensions, backup from snapshot, and restore from backup jobs.
■ (EKS-specific) 2049
It is used for Amazon EFS access.
For more information, see Source ports for working with EFS.
Note: Add the NFS rule that allows traffic on port 2049 directly to the cluster
security group. The security group attached to EFS must also allow traffic
on port 2049.
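The port list above can be turned into inbound rules for a client CIDR. The Python sketch below only builds the rule data in the IpPermissions shape used by the EC2 security group APIs and does not call AWS. The grouping in NETBACKUP_PORTS is one reading of the list above, and the function name ingress_permissions is hypothetical. Port 2049 is left out because the note above says to add the NFS rule directly to the cluster security group.

```python
import ipaddress

# One reading of the port list in this section; adjust it to your deployment.
NETBACKUP_PORTS = {
    "primary": [1556, 8443, 443, 13781, 13782, 22],
    "media": [1556, 13782],
    "snapshot_manager": [443, 5671],
}


def ingress_permissions(cidr: str, ports=NETBACKUP_PORTS) -> list[dict]:
    """Build one TCP inbound rule per distinct port for the given client CIDR."""
    ipaddress.ip_network(cidr)  # raises ValueError if the CIDR is malformed
    unique_ports = sorted({p for group in ports.values() for p in group})
    return [
        {"IpProtocol": "tcp", "FromPort": p, "ToPort": p,
         "IpRanges": [{"CidrIp": cidr}]}
        for p in unique_ports
    ]


for rule in ingress_permissions("203.0.113.0/24"):
    print(rule["FromPort"], rule["IpRanges"][0]["CidrIp"])
```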
Managing the Load Balancer service 167
Notes for Load Balancer service
Note: Be cautious while performing this step, this may lead to data loss.
■ Before using the DNS and its respective IP address in CR yaml, you can verify
the IP address and its DNS resolution using nslookup.
■ In case of media server scaleout, ensure that the number of IP addresses
mentioned in IPList in networkLoadBalancer section matches the replica count.
■ If nslookup is done for loadbalancer IP inside the container, it returns the DNS
in the form of <svc name>.<namespace_name>.svc.cluster.local. This is
Kubernetes behavior. Outside the pod, the loadbalancer service IP address is
resolved to the configured DNS. The nbbptestconnection command inside
the pods can provide a mismatch in DNS names, which can be ignored.
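The first two checks in the notes above (DNS resolution, and an ipList length that matches the replica count) can be sketched in Python as follows. The function name ip_list_problems is hypothetical, the entries are assumed to use the fqdn and ipAddr keys shown in the AKS example, and the resolver is injectable so that the check can be tried without network access.

```python
import socket


def ip_list_problems(ip_list, replicas, resolve=socket.gethostbyname):
    """Return a list of human-readable problems found in an ipList."""
    problems = []
    if len(ip_list) != replicas:
        problems.append(
            f"ipList has {len(ip_list)} entries but the replica count is {replicas}"
        )
    for entry in ip_list:
        try:
            resolved = resolve(entry["fqdn"])
        except OSError as error:  # socket.gaierror is a subclass of OSError
            problems.append(f"{entry['fqdn']} does not resolve: {error}")
            continue
        if resolved != entry["ipAddr"]:
            problems.append(
                f"{entry['fqdn']} resolves to {resolved}, not {entry['ipAddr']}"
            )
    return problems
```

Called without the resolve argument, the function performs real DNS lookups, which is comparable to running nslookup for each entry.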
For example:
■ For primary server load balancer service:
■ Service name starts with Name of primary server like <Name>-primary.
Edit the service with the kubectl edit service <Name>-primary -n
<namespace> command.
Note: The load balancer service name (Name) used in the primary server and
media server specifications must be unique.
3 Add an entry for the new port in the ports array in the specification field of
the service. For example, to add port 111, add the following entry in the ports
array in the specification field.
name: custom-111
port: 111
protocol: TCP
targetPort: 111
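When several ports are added, the entry can be generated rather than typed. The Python sketch below follows the shape of the example entry above (name custom-<port>, same port and targetPort). The function name custom_port_entry is hypothetical.

```python
def custom_port_entry(port: int, protocol: str = "TCP") -> dict:
    """Return a ports-array entry shaped like the example in this step."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return {
        "name": f"custom-{port}",
        "port": port,
        "protocol": protocol,
        "targetPort": port,
    }


print(custom_port_entry(111))
```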
The MSDP Scaleout services are not interrupted when MSDP engines are added.
Note: Due to some Kubernetes restrictions, the MSDP operator restarts the engine
pods to attach the existing and new volumes, which can cause a short
downtime of the services.
Managing MSDP Scaleout 171
Expanding existing data or catalog volumes
To expand the data or catalog volumes using the kubectl command directly
◆ Run the following command to increase the requested storage size in the
spec.dataVolumes field or in the spec.catalogVolume field.
kubectl -n <sample-namespace> edit msdpscaleout <your-cr-name>
[-o json | yaml]
Sometimes the Azure disk or Amazon EBS CSI driver may not respond to the volume
expansion request promptly. In this case, the operator retries the request by adding
1 byte to the requested volume size to trigger the volume expansion again. If it is
successful, the actual volume capacity could be slightly larger than the requested
size.
Due to a limitation of the Azure disk or Amazon CSI storage driver, the engine pods
need to be restarted to resize the existing volumes. This can cause a short
downtime of the services.
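The effect of the 1-byte retry on the requested size can be shown with simple arithmetic. The Python sketch below converts a Kubernetes binary quantity to bytes and adds one byte. It illustrates the arithmetic only and is not the operator's implementation. The helper names and the limited set of suffixes handled are assumptions.

```python
# Binary suffixes handled by this sketch (a subset of Kubernetes quantities).
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as '128Gi' to a number of bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count without a suffix


def retry_size(requested: str) -> int:
    """Size in bytes after adding 1 byte to the requested size."""
    return to_bytes(requested) + 1


print(to_bytes("128Gi"), retry_size("128Gi"))
```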
MSDP Scaleout does not support the following:
■ Cannot shrink the volume size.
■ Cannot change the existing data volumes other than for storage expansion.
■ Cannot expand the log volume size. You can do it manually. See “Manual storage
expansion” on page 171.
■ Cannot expand the data volume size for MDS pods. You can do it manually.
See “Manual storage expansion” on page 171.
Note: If you add new MSDP Engines later, the new Engines will respect the CR
specification only. Your manual changes would not be respected by the new Engines.
■ After scaling up, the memory and CPU of the existing node pool may not meet
the performance requirements anymore. In this case, you can add more memory
and CPU by upgrading to the higher instance type to improve the existing node
pool performance or create another node pool with higher instance type and
update the node-selector for the CR accordingly. If you create another node
pool, the new node-selector does not take effect until you manually delete the
pods and deployments from the old node pool, or delete the old node pool
directly to have the pods re-scheduled to the new node pool.
■ Ensure that each AKS or EKS node supports mounting a number of data disks
equal to the number of data volumes plus 5.
For example, if you have 16 data volumes for each engine, then each of your AKS
or EKS nodes should support mounting at least 21 data disks. The additional 5
data disks are for the potential MDS pod, Controller pod, or MSDP operator pod
that may run on the same node as the MSDP engine.
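The disk requirement in the last item reduces to a small calculation, sketched here in Python. The names are hypothetical. The constant 5 comes from the text above.

```python
RESERVED_DISKS = 5  # for a potential MDS, Controller, or MSDP operator pod


def min_attachable_disks(data_volumes_per_engine: int) -> int:
    """Minimum number of data disks that each node must be able to mount."""
    return data_volumes_per_engine + RESERVED_DISKS


def node_is_sufficient(node_max_disks: int, data_volumes_per_engine: int) -> bool:
    """True when the node can mount the engine volumes plus the reserve."""
    return node_max_disks >= min_attachable_disks(data_volumes_per_engine)


print(min_attachable_disks(16))
```

The maximum number of data disks that a node can attach depends on the instance type, so the value passed as node_max_disks must come from the cloud provider documentation.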
Credentials, bucket name, and sub bucket name must be the same as the
recovered Cloud LSU configuration in the previous MSDP Scaleout deployment.
Configuration file template:
If the LSU cloud alias does not exist, you can use the following command to
add it.
/usr/openv/netbackup/bin/admincmd/csconfig cldinstance -as -in
<instance-name> -sts <storage-server-name> -lsu_name <lsu-name>
3 On the first MSDP Engine of MSDP Scaleout, run the following command for
each cloud LSU:
sudo -E -u msdpsvc /usr/openv/pdde/pdcr/bin/cacontrol --catalog
clouddr <LSUNAME>
Managing MSDP Scaleout 175
MSDP Cloud backup and disaster recovery
Option 2: Stop MSDP services in each MSDP engine pod. MSDP service starts
automatically.
kubectl exec <sample-engine-pod> -n <sample-cr-namespace> -c
uss-engine -- /usr/openv/pdde/pdconfigure/pdde stop
Scenario 2: MSDP Scaleout and its data is lost and the NetBackup primary
server was destroyed and is re-installed
1 Redeploy MSDP Scaleout on a cluster by using the same CR parameters and
new NetBackup token.
2 When MSDP Scaleout is up and running, reuse the cloud LSU on NetBackup
primary server.
/usr/openv/netbackup/bin/admincmd/nbdevconfig -setconfig
-storage_server <STORAGESERVERNAME> -stype PureDisk -configlist
<configuration file>
Credentials, bucket name, and sub bucket name must be the same as the
recovered Cloud LSU configuration in previous MSDP Scaleout deployment.
Configuration file template:
If KMS is enabled, setup KMS server and import the KMS keys.
If the LSU cloud alias does not exist, you can use the following command to
add it.
/usr/openv/netbackup/bin/admincmd/csconfig cldinstance -as -in
<instance-name> -sts <storage-server-name> -lsu_name <lsu-name>
3 On the first MSDP Engine of MSDP Scaleout, run the following command for
each cloud LSU:
sudo -E -u msdpsvc /usr/openv/pdde/pdcr/bin/cacontrol --catalog
clouddr <LSUNAME>
You can configure Auto Image Replication in NetBackup, which is using MSDP
Scaleout storage servers.
To configure Auto Image Replication
1 Log on to the NetBackup Web UI of both the replication source and target domains.
2 In each domain, add the other domain's NetBackup primary server as a trusted
primary server.
For more information, see the NetBackup Web UI Administrator’s Guide.
3 In the replication source domain, get the MSDP_SERVER name from the
NetBackup Web UI.
Navigate to Storage > Storage configuration > Storage servers.
4 Add MSDP_SERVER in the primary server of the replication target domain. Log in
to the target primary server and run the following command:
echo "MSDP_SERVER = <Source MSDP server name>" >>
/usr/openv/netbackup/bp.conf
5 Get the token from the target domain NetBackup Web UI.
Navigate to Security > Token. In the Create token window, enter the token
name and other required details. Click Create.
For more information, see the NetBackup Web UI Administrator’s Guide.
6 Add replication targets for the disk pool in replication source domain.
In the Disk pools tab, click on the disk pool link.
Click Add to add the replication target.
7 In the Add replication targets window:
■ Select the replication target primary server.
■ Provide the target domain token.
■ Select the target volume.
■ Provide the target storage credentials.
Click Add.
Option Description
Available options:
■ Backing up a catalog
■ Restoring a catalog
Backing up a catalog
You can back up a catalog.
To back up a catalog
1 Exec into the primary server pod using the following command:
kubectl exec -it -n <namespace> <primary-pod-name> -- /bin/bash
6 Exec into the primary server pod using the following command:
kubectl exec -it -n <namespace> <primaryserver pod name> -- bash
Restoring a catalog
You can restore a catalog.
To restore a catalog
1 Copy DRPackages files (packages) located at /mnt/nblogs/DRPackages/
from the pod to the host machine from where Kubernetes Service cluster is
accessed.
Run the kubectl cp
<primary-pod-namespace>/<primary-pod-name>:/mnt/nblogs/DRPackages
<Path_where_to_copy_on_host_machine> command.
7 Delete the PV linked to primary server PVC using the kubectl delete pv
<pv-name> command.
8 (EKS-specific) Navigate to mounted EFS directory and delete the content from
primary_catalog folder by running the rm -rf /efs/* command.
10 After the primary server pod is in ready state, change the CR spec from paused:
false to paused: true in the primary server section of the environment object using
the following command:
kubectl edit <environment_CR_name> -n <namespace>
■ Change ownership of the DRPackages folder to service user using the chown
nbsvcusr:nbsvcusr /mnt/nblogs/DRPackages command.
16 Restart the NetBackup services in primary server pod and external media
server.
■ Execute the following command in the primary server pod:
kubectl exec -it -n <namespace> <primary-pod-name> -- /bin/bash
17 Configure a storage unit on external media server that is used during catalog
backup.
18 Perform catalog recovery from NetBackup Administration Console.
For more information, refer to the NetBackup Troubleshooting Guide.
19 Execute the kubectl exec -it -n <namespace> <primary-pod-name> --
/bin/bash command in the primary server pod.
Run MSDP commands with non-root user msdpsvc after logging in to an engine
pod.
For example, sudo -E -u msdpsvc <command>
The MSDP Scaleout services in the engine pods run as the non-root user
msdpsvc. If you run the MSDP Scaleout services or commands as the root user,
MSDP Scaleout may stop working due to file permission issues.
3 If the reclaim policy of the storage class is Retain, run the following command
to restart the existing MSDP Scaleout. MSDP Scaleout starts with the existing
data/metadata.
kubectl apply -f <your-cr-yaml>
Note: All affected pods or other Kubernetes workload objects must be restarted
for the change to take effect.
4 After the CR YAML file update, existing pods are terminated and restarted one
at a time, and the pods are re-scheduled for the new node pool automatically.
Note: Controller pods are temporarily unavailable when the MDS pod restarts.
Do not delete pods manually.
5 Run the following command to change MSDP Scaleout operator to the new
node pool:
AKS: kubectl msdp init -i <your-acr-url>/msdp-operator:<version>
-s <storage-class-name> -l agentpool=<new-nodepool-name>
6 If node selector does not match any existing nodes at the time of change, you
see the message on the console.
If auto scaling for node is enabled, it may resolve automatically as the new
nodes are made available to the cluster. If invalid node selector is provided,
pods may go in the pending state after the update. In that case, run the
command above again.
Do not delete the pods manually.
Chapter 15
Upgrading
This chapter includes the following topics:
■ Upgrading NetBackup
Upgrading NetBackup
Preparing for NetBackup upgrade
Note: Ensure that you go through this section carefully before starting with the
upgrade procedure.
5 Preserve the operator directory that was used to deploy the NetBackup operator,
and preserve the environment CR object using the following command:
kubectl -n <namespace> get environment.netbackup.veritas.com
<environment name> -o yaml > environment.yaml
4 To upgrade the operator, apply the new image changes using the following
command:
kubectl apply -k <operator folder name>
After applying the changes, new NetBackup operator pod will start in operator
namespace and run successfully.
Kind: Environment
...
Spec:
primary:
tag: "newtag"
mediaServers:
tag: "newtag"
Primary server and media server pods start with the new container images
respectively.
Note: Upgrade the PrimaryServer first and then change the tag for MediaServer
to upgrade it. If this sequence is not followed, the deployment may go into an
inconsistent state.
2 At the time of upgrade, primary server and media server status would be
changed to Running. Once upgrade is completed, the status would be changed
to Success again.
Perform the following if upgrade fails in between for primary server or media
server
1 Check the installation logs using the following command:
kubectl logs <PrimaryServer-pod-name/MediaServer-pod-name> -n
<PrimaryServer/MediaServer-CR-namespace>
2 If required, check the NetBackup logs by performing exec into the pod using
the following command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace>
<PrimaryServer/MediaServer-pod-name> -- bash
3 Fix the issue and restart the pod by deleting the respective pod with the
following command:
kubectl delete pod <PrimaryServer/MediaServer-pod-name> -n
<PrimaryServer/MediaServer-CR-namespace>
4 A new pod is created and the upgrade process restarts for the
respective NetBackup server.
5 Data migration jobs create pods that run before the deployment of the primary
server. A data migration pod remains for one hour after migration only if the data
migration job failed. The logs for data migration execution can be checked
using the following command:
kubectl logs <migration-pod-name> -n
<netbackup-environment-namespace>
User can copy the logs to retain them even after job pod deletion using the
following command:
kubectl logs <migration-pod-name> -n
<netbackup-environment-namespace> > jobpod.log
Note: Downgrade of NetBackup servers is not supported. If this is done, there are
chances of inconsistent state of NetBackup deployment.
EKS-specific
Ensure that all the steps mentioned for data migration in the following section are
performed before upgrading to, or installing, the latest NetBackup:
See “Preparing the environment for NetBackup installation on Kubernetes cluster”
on page 77.
■ User must have deployed NetBackup on AWS with EBS as its storage class.
While upgrading to the latest NetBackup, the existing catalog data of the primary
server is migrated (copied) from EBS to Amazon Elastic File System (EFS).
■ Fresh NetBackup deployment: If the user is deploying NetBackup for the first time,
then Amazon EFS is used for the primary server's catalog volume
for any backup and restore operations.
Perform the following steps to create EFS when upgrading NetBackup from
version 10.0.0.1
1 To create EFS for primary server, see Create your Amazon EFS file system.
EFS configuration can be as follows, and the user can update the Throughput mode
as required:
Performance mode: General Purpose
Throughput mode: Provisioned (256 MiB/s)
Availability zone: Regional
After changing the existing storage class from EBS to EFS for data migration,
manually create PVC and PV with EFS volume handle and update the yaml file as
described in the following procedure:
1. Create new PVC and PV with EFS volume handle.
■ PVC
CatlogPVC.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: catalog
  namespace: ns-155
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi
  volumeName: environment-pv-primary-catalog
■ PV
catalogPV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: environment-pv-primary-catalog
  labels:
    topology.kubernetes.io/region: us-east-2
    # Give the region as per the configuration of your cluster
    topology.kubernetes.io/zone: us-east-2c
    # Give the zone of your node instance. You can also check the
    # subnet zone in which your node instance resides.
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - iam
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-07a82a46b4a7d87f8:/nbdata
    # The EFS ID must be changed to the ID of the EFS that you created
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: catalog # catalog PVC name to which the data is to be copied
    namespace: ns-155
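The volumeHandle value combines the EFS file system ID and a path, separated by a colon. The Python sketch below splits and rebuilds that two-part form. The helper names are hypothetical, and only the <file_system_id>:<path> form used in this guide is handled.

```python
def split_volume_handle(handle: str) -> tuple[str, str]:
    """Split '<file_system_id>:<path>' into its two parts."""
    fs_id, _, path = handle.partition(":")
    return fs_id, path or "/"  # a bare file system ID means the root path


def make_volume_handle(fs_id: str, path: str) -> str:
    """Build a volumeHandle from an EFS ID and an absolute path."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    return f"{fs_id}:{path}"


print(split_volume_handle("fs-07a82a46b4a7d87f8:/nbdata"))
```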
PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-<Primary name>-primary-0
  namespace: ns-155
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: <Storageclass name>
  resources:
    requests:
      storage: 30Gi
2. Edit the environment.yaml file and change the value of paused to true in
primary section and apply the yaml.
3. Scale down the primary server using the following commands:
■ To get statefulset name: kubectl get sts -n < namespace in
environment cr (ns-155)>
■ To scale down the STS: kubectl scale sts --replicas=0 < STS name
> -n < Namespace >
catalogMigration.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: rsync-data
namespace: ns-155
spec:
template:
spec:
volumes:
- name: source-pvc
persistentVolumeClaim:
# SOURCE PVC
claimName: <EBS PVC name of catalog> # catalog-environment-migrate1-primary-0
volumeMounts:
- name: source-pvc
mountPath: /srcPvc
- name: destination-pvc
mountPath: /destPvc
restartPolicy: Never
dataMigration.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: rsync-data2
namespace: ns-155
spec:
template:
spec:
volumes:
- name: source-pvc
persistentVolumeClaim:
# SOURCE PVC
claimName: <EBS PVC name of catalog> # catalog-environment-migrate1-primary-0
volumeMounts:
- name: source-pvc
mountPath: /srcPvc
- name: destination-pvc
mountPath: /destPvc
restartPolicy: Never
5. Delete the migration job once the pods are in the completed state.
6. For the primary server, delete the old PVC (EBS) of the catalog volume that
was attached to the primary server, for example
catalog-<Name_of_primary>-primary-0, and create a new PVC with the same
name as the deleted PVC.
■ Follow the naming conventions of static PV and PVC to consume for Primary
Server Deployment.
catalog-<Name_of_primary>-primary-0
data-<Name_of_primary>-primary-0
Example:
catalog-test-env-primary-0
data-test-env-primary-0
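The primary server naming convention can also be generated from the primary server name. This Python sketch covers only the two names listed above. The function name primary_pvc_names is hypothetical.

```python
def primary_pvc_names(primary_name: str) -> list[str]:
    """Return the catalog and data PVC names for the given primary server name."""
    return [f"{volume}-{primary_name}-primary-0" for volume in ("catalog", "data")]


print(primary_pvc_names("test-env"))
```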
environment.yaml
apiVersion: netbackup.veritas.com/v2
kind: Environment
metadata:
  name: test-env
  namespace: ns-155
spec:
  ...
  primary:
    # Set name to control the name of the primary server.
    # The default value is the same as the Environment's metadata.name.
    name: test-env
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: catalog-test-env-primary-0
  namespace: ns-155
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi
  volumeName: environment-pv-primary-catalog
7. Edit the PV (mounted on EFS) and replace the name, resource version, and uid
with those of the newly created PVC to meet the naming convention.
Get the PV and PVC details using the following commands:
■ To get PVC details: kubectl get pvc -n <Namespace>
■ Use the edit command to get PVC details: kubectl edit pvc <New PVC (old
name) name> -n <Namespace>
8. Upgrade the MSDP with the new build and image tag. Run the following command
for MSDP:
./kubectl-msdp init --image <Image name:Tag> --storageclass
<Storage Class Name> --namespace <Namespace>
9. Apply the operator from the new build with the new image tag and node
selector using the following command:
kubectl apply -k operator/
10. Edit the environment.yaml file from new build and perform the following
changes:
■ Add the tag: <new_tag_of_upgrade_image> entry separately under the primary
section.
■ Provide the EFS ID for storageClassName of the catalog volume under the primary
section. Set paused: false under the primary section.
The EFS ID must be the same as the one used in the Create new PVC and PV with
EFS volume handle step in the above section.
■ Change the storage class for data and logs using storageClassName,
then apply the environment.yaml file using the following command and
ensure that the primary server is upgraded successfully:
kubectl apply -f environment.yaml
■ Edit the environment.yaml file and update the image tag for Media Server
in mediaServer section.
■ Apply environment.yaml file using the following command and ensure
that the Media Server is deployed successfully:
kubectl apply -f environment.yaml
2 Upgrade the MSDP with new build and image tag. Apply the following command
to MSDP:
./kubectl-msdp init --image <Image name:Tag> --storageclass
<Storage Class Name> --namespace <Namespace>
3 Edit the sample/environment.yaml file from new build and perform the
following changes:
■ Add the tag: <new_tag_of_upgrade_image> tag separately under primary
sections.
■ Provide the EFS ID for storageClassName of catalog volume in primary
section.
■ Use the following command to retrieve the previously used EFS ID from
PV and PVC:
kubectl get pvc -n <namespace>
From the output, copy the name of catalog PVC which is of the following
format:
catalog-<resource name prefix>-primary-0
■ Edit the environment.yaml file and update the image tag for Media
Server in mediaServer section.
■ Apply environment.yaml file using the following command and ensure
that the Media Server is deployed successfully:
kubectl apply -f environment.yaml
Note: The rollback procedure in this section can be performed only if a catalog
backup was taken before performing the upgrade.
Perform the following steps to roll back from an upgrade failure and install the
NetBackup version prior to the upgrade
1 Delete the environment CR object using the following command and wait until
all the underlying resources are cleaned up:
kubectl delete environment.netbackup.veritas.com <environment
name> -n <namespace>
For example, primary server CR, media server CR, MSDP CR and their
underlying resources.
2 Delete the new operator which is deployed during upgrade using the following
command:
kubectl delete -k <new-operator-directory>
3 Apply the NetBackup operator directory which was preserved (the directory
which was used to install operator before upgrade) using the following
command:
kubectl apply -k <operator_directory>
4 Get names of PV attached to primary server PVC (data, catalog and log) using
the following command:
kubectl get pvc -n <namespace> -o wide
5 Delete the primary server PVC (data, catalog and log) using the following
command:
kubectl delete pvc <pvc-name> -n <namespace>
6 Delete the PV linked to primary server PVC using the following command:
kubectl delete pv <pv-name>
9 After the primary server pod is in ready state (1/1), change the CR spec from
paused: false to paused: true in the environment object using the following
command:
kubectl edit environment.netbackup.veritas.com <environment_CR_name> -n <namespace>
10 Exec into the primary server pod using the following command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace>
<primary-pod-name> -- /bin/bash
After stopping all the services, restart the NetBackup services using the
following command:
/usr/openv/netbackup/bin/bp.start_all
16 Configure a storage unit on external media server that is used during catalog
backup.
17 Perform catalog recovery from NetBackup Administration Console.
For more information, refer to the Veritas™ NetBackup Troubleshooting Guide.
18 Exec into the primary server pod using the following command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace>
<primary-pod-name> -- /bin/bash
19 Restart the NetBackup operator pod by deleting the pod using the following
command:
kubectl delete pod <operator-pod-name> -n <namespace>
All the options except -i option must be same as earlier when the operator
was deployed initially.
3 Run the following command to change the spec.version in the existing CR
resources.
kubectl edit msdpscaleout <cr-name>
Wait for a few minutes. MSDP operator upgrades all the pods and other MSDP
Scaleout resources automatically.
Note: If you use the environment operator for the MSDP Scaleout deployment,
change the version string for MSDP Scaleout in the environment operator CR
only. Do not change the version string in the MSDP Scaleout CR.
4 Upgrade process restarts the pods. The NetBackup jobs are interrupted during
the process.
4 To upgrade the operator, apply the new image changes using the following
command:
kubectl apply -k <operator folder name>
After the changes are applied, a new Snapshot Manager operator pod starts in the
operator namespace and runs successfully.
NB_VERSION=10.1.0
OPERATOR_NAMESPACE="netbackup-operator-system"
ENVIRONMENT_NAMESPACE="ns-155"
NB_DIR=/home/azureuser/VRTSk8s-netbackup-${NB_VERSION}/
Post-migration tasks
After migration, if the name is changed to Snapshot Manager, renew the on-host
agents for Linux and Windows as described below, and then perform the
plugin-level discovery:
For Linux:
■ Edit the /etc/flexsnap.conf file for migrated Snapshot Manager.
For example,
[agent]
id = agent.c2ec74c967e043aaae5818e50a939556
■ Perform the Linux on-host agent renew using the following command:
/opt/VRTScloudpoint/bin/flexsnap-agent --renew --token <auth_token>
For Windows:
■ Edit the \etc\flexsnap.conf file for migrated Snapshot Manager.
For example,
[global]
target = nbuxqa-alphaqa-10-250-172-172.vxindia.veritas.com
hostid = azure-vm-427a67a0-6f91-4a35-abb0-635e099fe9ad
[agent]
id = agent.3e2de0bf17d54ed0b54d4b33530594d8
■ Perform the Windows on-host agent renew using the following command:
"c:\ProgramFiles\Veritas\CloudPoint\flexsnap-agent.exe" --renew --token <auth_token>
Chapter 16
Uninstalling
This chapter includes the following topics:
When an MSDP Scaleout CR is deleted, the critical MSDP data and metadata
are not deleted. You must delete them manually. If you delete the CR without
cleaning up the data and metadata, you can re-apply the same CR YAML file to
restart MSDP Scaleout by reusing the existing data.
2 If your storage class uses the Retain policy, you must write down the PVs
that are associated with the CR PVCs so that you can delete them at the
Kubernetes cluster level.
kubectl get pod,svc,deploy,rs,ds,pvc,secrets,certificates,issuers,cm,sa,role,rolebinding -n <sample-namespace> -o wide
4 If your storage class uses the Retain policy, you must delete the Azure disks
using the Azure portal or delete the EBS volumes using the Amazon console. You
can also use the Azure or AWS CLI.
AKS: az disk delete -g $RESOURCE_GROUP --name $AZURE_DISK --yes
EKS: aws ec2 delete-volume --volume-id <value>
See “Deploying MSDP Scaleout” on page 111.
See “Reinstalling MSDP Scaleout operator” on page 187.
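Recording the PVs that use the Retain policy can be sketched with kubectl custom columns. This is a minimal sketch: the function name retained_pvs is hypothetical, and it assumes that the PVs of interest are still bound to PVCs in the given namespace when the command runs.

```shell
# List PVs that use the Retain reclaim policy and are bound to PVCs in a
# namespace. Prints one "<pv-name> <pvc-name>" pair per line.
# Arguments: <namespace>
retained_pvs() {
    kubectl get pv --no-headers \
        -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy,NAMESPACE:.spec.claimRef.namespace,CLAIM:.spec.claimRef.name |
        awk -v ns="$1" '$2 == "Retain" && $3 == ns { print $1, $4 }'
}

# Example (requires access to the cluster), run before deleting the CR:
#   retained_pvs <sample-namespace> > retained-pvs.txt
```

Saving the list before the CR is deleted keeps the PV names available for the later cleanup of PVs and cloud disks.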
■ -k: Delete all resources of MSDP Scaleout operator except the namespace.
3 If your storage class uses the Retain policy, you must delete the Azure disks
using the Azure portal or delete the EBS volumes using the Amazon console. You
can also use the Azure or AWS CLI.
AKS: az disk delete -g $RESOURCE_GROUP --name $AZURE_DISK --yes
EKS: aws ec2 delete-volume --volume-id <value>
See “Deploying MSDP Scaleout” on page 111.
See “Reinstalling MSDP Scaleout operator” on page 187.
The following commands can be used to remove and disable Snapshot Manager
from NetBackup:
ENVIRONMENT_NAMESPACE="netbackup-environment"
# Make sure the flexsnap-operator pod is running and ready.
# Comment out or remove the cpServer part from environment.yaml, then apply it.
kubectl apply -f environment.yaml -n $ENVIRONMENT_NAMESPACE
sleep 10s
Uninstalling 214
Uninstalling Snapshot Manager from Kubernetes cluster
Troubleshooting AKS and EKS issues
NAME                                                         READY   STATUS    RESTARTS   AGE
pod/flexsnap-operator-7d45568767-n9g27                       1/1     Running   0          18h
pod/msdp-operator-controller-manager-0                       2/2     Running   0          43m
pod/msdp-operator-controller-manager-1                       2/2     Running   0          44m
pod/netbackup-operator-controller-manager-6cbf85694f-p97sw   2/2     Running   0          42m

NAME                                                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/msdp-operator-controller-manager-metrics-service        ClusterIP   10.96.144.99   <none>        8443/TCP   3h6m
service/msdp-operator-webhook-service                           ClusterIP   10.96.74.75    <none>        443/TCP    3h6m
service/netbackup-operator-controller-manager-metrics-service   ClusterIP   10.96.104.94   <none>        8443/TCP   93m
service/netbackup-operator-webhook-service                      ClusterIP   10.96.210.26   <none>        443/TCP    93m

NAME                                                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/msdp-operator-controller-manager                                    1/1     1            1           3h6m
deployment.apps/netbackup-operator-controller-manager-operator-controller-manager   1/1     1            1           93m

NAME                                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/msdp-operator-controller-manager-65d8fd7c4d        1         1         1       3h6m
replicaset.apps/netbackup-operator-controller-manager-55d6bf59c8   1         1         1       93m
Verify that both pods display Running in the Status column and both deployments
display 2/2 in the Ready column.
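The check of the Status and Ready columns can be automated with a small filter. This is a minimal sketch: the function name not_ready_pods is hypothetical, and it assumes the default column layout of kubectl get pods (NAME, READY, STATUS).

```shell
# Print the names of pods that are not Running or whose READY count is not
# complete (for example 1/2). Prints nothing when all pods are healthy.
# Arguments: <namespace>
not_ready_pods() {
    kubectl get pods -n "$1" --no-headers |
        awk '{ split($2, ready, "/"); if ($3 != "Running" || ready[1] != ready[2]) print $1 }'
}

# Example (requires access to the cluster):
#   not_ready_pods netbackup-operator-system
```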
pod/dedupe1-uss-controller-79d554f8cc-598pr   1/1   Running   0   68m
pod/dedupe1-uss-mds-1                         1/1   Running   0   75m
pod/dedupe1-uss-mds-2                         1/1   Running   0   74m
pod/dedupe1-uss-mds-3                         1/1   Running   0   71m
pod/media1-media-0                            1/1   Running   0   53m
pod/environment-sample-primary-0              1/1   Running   0   86m
pod/x10-240-0-12.veritas.internal             1/1   Running   0   68m
pod/x10-240-0-13.veritas.internal             2/2   Running   0   64m
pod/x10-240-0-14.veritas.internal             2/2   Running   0   61m
pod/x10-240-0-15.veritas.internal             2/2   Running   0   59m
1556:30248/TCP 54m
service/environment-sample-primary      LoadBalancer   10.1.206.39    13781:30246/TCP,13782:30498/TCP,1556:31872/TCP,443:30049/TCP,8443:32032/TCP,22:31511/TCP   87m
service/x10-240-0-12-veritas-internal   LoadBalancer   10.1.44.188    10082:31199/TCP                   68m
service/x10-240-0-13-veritas-internal   LoadBalancer   10.1.21.176    10082:32439/TCP,10102:30284/TCP   68m
service/x10-240-0-14-veritas-internal   LoadBalancer   10.1.25.99     10082:31810/TCP,10102:31755/TCP   68m
service/x10-240-0-15-veritas-internal   LoadBalancer   10.1.185.135   10082:31664/TCP,10102:31811/TCP   68m
Once in the primary server shell prompt, to see the list of logs, run:
ls /usr/openv/logs/
To resolve this issue, update the sysctl.conf values for NetBackup servers
deployed on the Kubernetes cluster.
The NetBackup image sets the following values in sysctl.conf during Kubernetes
deployment:
■ net.ipv4.tcp_keepalive_time = 180
■ net.ipv4.tcp_keepalive_intvl = 10
■ net.ipv4.tcp_keepalive_probes = 20
■ net.ipv4.ip_local_port_range = 14000 65535
These settings are persisted at the location /mnt/nbdata/etc/sysctl.conf.
Modify the values in /mnt/nbdata/etc/sysctl.conf and restart the pod. The new
values are reflected after the pod restart.
If external media servers are used, perform the steps in the following order:
1. Add the following in /usr/openv/netbackup/bp.conf:
HOST_HAS_NAT_ENDPOINTS = YES
2. Add the following sysctl configuration values in /etc/sysctl.conf on external
media servers to avoid any socket connection issues:
■ net.ipv4.tcp_keepalive_time = 180
■ net.ipv4.tcp_keepalive_intvl = 10
■ net.ipv4.tcp_keepalive_probes = 20
■ net.ipv4.ip_local_port_range = 14000 65535
■ net.core.somaxconn = 4096
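Adding the values above to an external media server can be sketched as follows. This is a minimal sketch: the function names are hypothetical, and the helper only appends keys that are missing. It does not change a key that already exists with a different value, so review such keys manually.

```shell
# Print the sysctl settings listed above for external media servers.
nb_sysctl_settings() {
    cat <<'EOF'
net.ipv4.tcp_keepalive_time = 180
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 20
net.ipv4.ip_local_port_range = 14000 65535
net.core.somaxconn = 4096
EOF
}

# Append each setting to the target file unless its key is already present.
# Arguments: <path-to-sysctl.conf>
append_missing_sysctl() {
    sysctl_target="$1"
    nb_sysctl_settings | while IFS= read -r line; do
        key=${line%% =*}
        grep -q "^${key}[[:space:]]*=" "$sysctl_target" 2>/dev/null ||
            printf '%s\n' "$line" >> "$sysctl_target"
    done
}

# Example (run as root on the external media server):
#   append_missing_sysctl /etc/sysctl.conf && sysctl -p
```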
When you deploy NetBackup for the first time, perform the steps for primary CR
and media CR.
To resolve an invalid license key issue for Primary CR
1 Get the configmap name created for primary CR or media CR using the
following command:
kubectl get configmap -n <namespace>
2 Edit the license key stored in configmap using the following command:
kubectl edit configmap <primary-configmap-name> -n <namespace>
3 Update the value of the ENV_NB_LICKEY key in the configmap with the correct
license key and save.
4 Delete the respective primary or media pod using the following command:
kubectl delete pod <primary-pod-name> -n <namespace>
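The license key steps can also be scripted. This is a minimal sketch that swaps the interactive kubectl edit for a non-interactive kubectl patch; the function name fix_license_key is hypothetical, and it assumes that ENV_NB_LICKEY is stored under the data section of the configmap and that the key contains no double quotes or backslashes.

```shell
# Update ENV_NB_LICKEY in the configmap, then delete the pod so that it is
# re-created with the corrected key.
# Arguments: <namespace> <configmap-name> <pod-name> <license-key>
fix_license_key() {
    lic_namespace="$1"
    lic_configmap="$2"
    lic_pod="$3"
    lic_key="$4"
    kubectl patch configmap "$lic_configmap" -n "$lic_namespace" --type merge \
        -p "{\"data\":{\"ENV_NB_LICKEY\":\"${lic_key}\"}}" &&
        kubectl delete pod "$lic_pod" -n "$lic_namespace"
}

# Example (requires access to the cluster):
#   fix_license_key <namespace> <primary-configmap-name> <primary-pod-name> <license-key>
```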
Note: Ensure that you copy spec information of the media server CR. The spec
information is used to reapply the media server CR.
3 Depending on the output of the command and the reason for the issue, perform
the required steps and update the environment CR to resolve the issue.
Resolving the issue where the NetBackup server pod is not scheduled
for a long time
The NetBackup server (primary server and media server) pods are stuck in the
Pending state. The issue can occur for one of the following reasons:
■ Insufficient resource allocation.
■ Persistent volume claims are not bound to a persistent volume.
■ NetBackup server pods have the anti-affinity rule added.
As a result, primary server and media server pods are scheduled on different nodes.
If no nodes are available, the pod remains in the Pending state. If auto scaling is
configured in the cluster, the event logs indicate that nodes are scaling up.
To resolve the issue where the NetBackup server pod is not scheduled for a
long time
1 Check the pod event details for more information about the error using the
kubectl describe pod <PrimaryServer/MediaServer_Pod_Name> -n <namespace>
command.
2 Depending on the output of the command and the reason for the issue, perform
the required steps and update the environment CR to resolve the issue.
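Gathering the details for pods stuck in the Pending state can be sketched as follows. This is a minimal sketch: the function name describe_pending_pods is hypothetical, and the PVC listing at the end is added only to help spot claims that are not bound.

```shell
# Describe every Pending pod in a namespace, then list the PVCs so that
# unbound claims are visible.
# Arguments: <namespace>
describe_pending_pods() {
    pend_namespace="$1"
    kubectl get pods -n "$pend_namespace" --field-selector=status.phase=Pending \
        --no-headers -o custom-columns=NAME:.metadata.name |
        while IFS= read -r pod; do
            [ -n "$pod" ] || continue
            kubectl describe pod "$pod" -n "$pend_namespace"
        done
    kubectl get pvc -n "$pend_namespace"
}

# Example (requires access to the cluster):
#   describe_pending_pods <namespace>
```

The Events section of each description typically shows whether scheduling failed because of resources, unbound volumes, or affinity rules.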
Error: ERROR Storage class with the <storageClassName> name does not exist.
After fixing this error, the primary server or media server CR does not require any
changes. In this case, the NetBackup operator reconciler loop is invoked every
10 hours. To apply the changes and invoke the NetBackup operator reconciler loop
immediately, delete and reapply the primary server or media server CR.
Note: To reuse the mediaServer section information, you must save it and
apply the yaml again with the new changes using the kubectl apply -f
<environment.yaml> command.
To retain the logs after the job pod is deleted, copy them using the following
command:
kubectl logs <migration-pod-name> -n <netbackup-environment-namespace> > jobpod.log
2 Check pod events for obtaining more details for probe failure using the following
command:
kubectl describe pod/<podname> -n <namespace>
Kubernetes will automatically try to resolve the issue by restarting the pod after
liveness probe times out.
3 Depending on the error in the pod logs, perform the required steps or contact
technical support.
NetBackup media server and NetBackup primary server were in running state.
Media server persistent volume claim or media server pod is deleted. In this case,
reinstallation of respective media server can cause the issue.
To resolve token issues
1 Open the NetBackup web UI using primary server hostname given in the primary
server CR status.
2 Navigate to Security > Host Mappings.
3 Click Actions > Allow auto reissue certificate for the respective media server
name.
4 Delete data and logs PVC for respective media server only using the kubectl
delete pvc <pvc-name> -n <namespace> command.
New media server pod and new PVCs for the same media server are created.
2 For media server CR: Delete the media server CR by removing the
mediaServer section in the environment.yaml and save the changes.
Note: Ensure that you copy spec information of the media server CR. The spec
information is used to reapply the media server CR.
To resolve this issue, execute the following command in the primary server pod:
kubectl exec -it -n <namespace> <primary-server-pod-name> -- /bin/bash
To configure KMS manually, refer to the NetBackup Security and Encryption Guide.
For other troubleshooting issues related to KMS, refer to the NetBackup
Troubleshooting Guide.
pod/netbackup-operator-controller-manager-5df6f58b9b-6ftt9 1/2 ImagePullBackOff 0 13s
■ Check if the user is authorized and has permissions to access the Azure/AWS
container registry.
4 Run the kubectl get pv command and verify that the status of the PVs is Available.
5 For the PV to be claimed by a specific PVC, add the claimRef spec field with
PVC name and namespace using the kubectl patch pv <pv-name> -p
'{"spec":{"claimRef": {"apiVersion": "v1", "kind":
"PersistentVolumeClaim", "name": "<Name of claim i.e. PVC name>",
"namespace": "<namespace of pvc>"}}}' command.
For example,
kubectl patch pv <pv-name> -p '{"spec":{"claimRef": {"apiVersion":
"v1", "kind": "PersistentVolumeClaim", "name":
"data-testmedia-media-0", "namespace": "test"}}}'
When adding claimRef, use the correct PVC name and namespace for each PV.
The mapping must be the same as it was before the namespace or the PVC was
deleted.
6 Deploy environment CR that deploys the primary server and media server CR
internally.
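The claimRef patch in step 5 can be wrapped in a small helper so that the JSON is built the same way for each PV. This is a minimal sketch: the function name bind_pv_to_pvc is hypothetical, and the PV, PVC, and namespace values must come from the mapping that existed before the deletion.

```shell
# Patch a PV so that it can be claimed only by the given PVC.
# Arguments: <pv-name> <pvc-name> <pvc-namespace>
bind_pv_to_pvc() {
    bind_pv="$1"
    bind_patch=$(printf '{"spec":{"claimRef":{"apiVersion":"v1","kind":"PersistentVolumeClaim","name":"%s","namespace":"%s"}}}' "$2" "$3")
    kubectl patch pv "$bind_pv" -p "$bind_patch"
}

# Example (requires access to the cluster):
#   bind_pv_to_pvc <pv-name> data-testmedia-media-0 test
```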
If the output shows STATUS as Failed as in the example above, check the primary
pod log for errors with the command:
$ kubectl logs pod/environment-sample-primary-0 -n <namespace>
pod/netbackup-operator-controller-manager-6c9dc8d87f-pq8mr 0/2 Pending 0 15s
io/master: }, that the pod didn't tolerate, 3 node(s) didn't match Pod's node affinity/selector.
Sample output:
Sample output:
"sha256:353d2bd50105cbc3c61540e10cf32a152432d5173bb6318b8e"
2 Run:
$ docker image ls | grep msdp-operator
Sample output:
(AKS-specific):
(EKS-specific):
17.0: digest: sha256:d294f260813599562eb5ace9e0acd91d61b7dbc53c3 size: 2622
Sample output:
(AKS-specific):
[
"testregistry.azurecr.io/msdp-operator@sha256:d294f260813599562eb5ace9e0acd91d61b7dbc53c3"
]
(EKS-specific):
[
"testregistry.<account id>.dkr.ecr.<region>.amazonaws.com/<registry>:<tag>.io/msdp-operator@sha256:d294f260813599562eb5ace9e0acd91d61b7dbc53c3"
]
Sample output:
(AKS-specific):
[
"msdp-operator",
]
(EKS-specific):
"repositories": [
{
"repositoryArn": "arn:aws:ecr:us-east-2:046777922665:repository/veritas/main_test1",
"registryId": "046777922665",
"repositoryName": "veritas/main_test1",
"repositoryUri": "046777922665.dkr.ecr.us-east-2.amazonaws.com/veritas/main_test1",
"createdAt": "2022-04-13T07:27:52+00:00",
"imageTagMutability": "MUTABLE",
"imageScanningConfiguration": {
"scanOnPush": false
},
"encryptionConfiguration": {
"encryptionType": "AES256"
}
}
]
Sample output:
(AKS-specific):
{
"changeableAttributes": {
"deleteEnabled": true,
"listEnabled": true,
"readEnabled": true,
"writeEnabled": true
},
"createdTime": "2022-02-01T13:43:26.6809388Z",
"digest": "sha256:d294f260813599562eb5ace9e0acd91d61b7dbc53c3",
"lastUpdateTime": "2022-02-01T13:43:26.6809388Z",
"name": "17.0",
"signed": false
}
(EKS-specific):
"imageDetails": [
{
"registryId": "046777922665",
"repositoryName": "veritas/main_test1",
"imageDigest":
"sha256:d0095074286a50c6bca3daeddbaf264cf4006a92fa3a074daa4739cc995b36f8",
"imageTags": [
"latestTest5"
],
"imageSizeInBytes": 38995046,
"imagePushedAt": "2022-04-13T15:56:07+00:00",
"imageManifestMediaType": "application/vnd.docker.distribution.manifest.v2+json",
"artifactMediaType": "application/vnd.docker.container.image.v1+json"
}
]
The third copy is located on a Kubernetes node running the container after it is
pulled from the registry. To check this copy, perform the following:
1 Run:
$ kubectl get nodes -o wide
(AKS-specific):
(EKS-specific):
3 You can interact with the node session from the privileged container:
chroot /host
Sample output:
(AKS-specific):
(EKS-specific):
Sample output
"sha256:353d2bd50105cbc3c61540e10cf32a152432d5173bb6318b8e"
null
Sample output
(AKS-specific):
[
"testregistry.azurecr.io/msdp-operator@sha256:d294f260813599562eb5ace9e0acd91d61b7dbc53c3"
]
null
(EKS-specific):
[
"<account id>.dkr.ecr.<region>.amazonaws.com/msdp-operator@sha256:d294f260813599562eb5ace9e0acd91d61b7dbc53c3"
]
null
How to make sure that you are running the correct image
Use the steps given above to identify image ID and Digest and compare with values
obtained from the registry and the Kubernetes node running the container.
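The digest comparison can be sketched with two small helpers that work on the strings printed by the commands in this section. This is a minimal sketch: the function names digest_of and same_image are hypothetical, and the helpers only compare text; they do not query the registry or the node.

```shell
# Extract the sha256 digest from a reference such as
# <registry>/<image>@sha256:<hex>, or from a bare sha256:<hex> value.
digest_of() {
    printf '%s\n' "$1" | sed -n 's/.*\(sha256:[0-9a-f]*\).*/\1/p'
}

# Succeed only when both references carry the same non-empty digest.
same_image() {
    digest_a=$(digest_of "$1")
    digest_b=$(digest_of "$2")
    [ -n "$digest_a" ] && [ "$digest_a" = "$digest_b" ]
}

# Example:
#   same_image "<repo digest from the node>" "<digest from the registry>" \
#       && echo "digests match" || echo "digests differ"
```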
Sample output:
(AKS-specific):
(EKS-specific):
Alternatively, if the nbbuilder script is not available, you can view the installed
EEBs by executing the following command:
$ docker run --rm <image_name>:<image_tag> cat /usr/openv/pack/pack.summary
Sample output:
EEB_NetBackup_10.1Beta6_PET3980928_SET3992004_EEB1
EEB_NetBackup_10.1Beta6_PET3980928_SET3992021_EEB1
EEB_NetBackup_10.1Beta6_PET3980928_SET3992022_EEB1
EEB_NetBackup_10.1Beta6_PET3980928_SET3992023_EEB1
EEB_NetBackup_10.1Beta6_PET3992020_SET3992019_EEB2
EEB_NetBackup_10.1Beta6_PET3980928_SET3992009_EEB2
EEB_NetBackup_10.1Beta6_PET3980928_SET3992016_EEB1
EEB_NetBackup_10.1Beta6_PET3980928_SET3992017_EEB1
Note: The pack directory may be in a different location in each of the uss-*
containers. For example: /uss-controller/pack, /uss-mds/pack,
/uss-proxy/pack.
ERROR controller-runtime.manager.controller.environment
Error defining desired resource {"reconciler group": "netbackup.veritas.com",
"reconciler kind": "Environment", "name": "test-delete", "namespace": "netbackup-environment",
"Type": "MSDPScaleout", "Resource": "dedupe1", "error": "Unable to get primary host UUID:
Get \"https://nbux-10-244-33-24.vxindia.veritas.com:1556/netbackup/config/hosts\":
x509: certificate signed by unknown authority (possibly because of \"crypto/rsa:
verification error\" while trying to verify candidate authority certificate \"nbatd\")"}
To resolve this issue, restart the NetBackup operator by deleting the NetBackup
operator pod using the following command:
kubectl delete pod <Netbackup-operator-pod-name> -n <namespace>
When there is an issue with a container related to a full disk, CPU, or memory
pressure, the liveness probe gets timed out because of no response from the health
script. As a result, the Pod does not restart.
To resolve this issue, restart the Pod manually. Delete the Pod using the kubectl
delete pod/<podname> -n <namespace> command.
If the Primary Server pod gets restarted then the user must perform the same
above steps to increase the values of total_time and sleep_duration, as these
values will not get persisted after pod restart.
with some different pod and which causes conflict. The host mapping entries are in
the form of "::ffff:<ip address>".
To resolve the issue of host mapping conflict in NetBackup
1 To resolve the conflict issue, refer to Mappings for Approval tab section of the
NetBackup Security and Encryption Guide.
2 To remove the entries that are not valid, refer to Removing host ID to host
name mappings section of the NetBackup Security and Encryption Guide.
[NBDEPLOYUTIL_INCREMENTAL]
PARENTDIR=/mnt/nbdb/<FOLDER_NAME>
Primary and media servers are referred to by multiple IPs inside the pod (pod
IP/LoadBalancer IP). With reverse name lookup of IP enabled, NetBackup treats
the local connection as a remote insecure connection.
To resolve the audit events issue, disable the reverse name lookup of the primary
and media Load Balancer IPs.
2 If it is allocated to same node then create new node with same node selector
given in CR for primary server.
3 Delete the Primary pod which is in pending state.
The newly created Primary pod must not be in pending state now.
Then perform the following steps manually to refresh the Instant Access capability
in NetBackup:
1. Log in to the NetBackup primary server.
2. Execute the following commands to refresh the MSDP capabilities on
NetBackup primary server:
nbdevconfig -getconfig
nbdevconfig -setconfig
For example,
/usr/openv/netbackup/bin/admincmd/nbdevconfig -getconfig -stype
PureDisk -storage_server [storage server] >
/tmp/tmp_pd_config_file
/usr/openv/netbackup/bin/admincmd/nbdevconfig -setconfig
-storage_server [storage server] -stype PureDisk -configlist
/tmp/tmp_pd_config_file
/usr/openv/netbackup/bin/nbwmc start
1. Obtain the pending pod's toleration and affinity status using the following
command:
kubectl get pods <pod name>
If all the above fields are correct and matching and the control pool pod is still
in pending state, the issue may be that all the nodes in the node pool are running
at maximum capacity and cannot accommodate new pods. In that case, the node pool
must be scaled appropriately.
1. Obtain the pending pod's toleration and affinity status using the following
command:
kubectl get pods <pod name>
If all the above fields are correct and matching and the control pool pod is still
in pending state, the issue may be that all the nodes in the node pool are running
at maximum capacity and cannot accommodate new pods. In that case, the node pool
must be scaled appropriately.
If all the above fields are correct and matching and the control pool pod is still
in pending state, the issue may be that all the nodes in the node pool are running
at maximum capacity and cannot accommodate new pods. In that case, the node pool
must be scaled appropriately.
■ The flexsnap operator is running and is already processing the event (Update,
Upgrade, Create, Delete).
■ To check logs of running operator, use the following command:
kubectl logs -f -n $OPERATOR_NAMESPACE $(kubectl get pods -n $OPERATOR_NAMESPACE |
grep flexsnap-operator | awk '{printf $1" " }')
■ If you still want to go ahead with new action, you can stop the processing of
the current event so that the new events are processed. To do so delete the
flexsnap operator pod using the following command:
kubectl delete pod -n $OPERATOR_NAMESPACE $(kubectl get pods -n $OPERATOR_NAMESPACE |
grep flexsnap-operator | awk '{printf $1" " }')
This will re-create the flexsnap-operator pod which will be ready to serve
new events.
Note: The newly created pod might have missed the event which was
performed before re-creation of pod. In this case you may have to reapply
environment.yaml.
2023-03-01T08:14:56.470Z INFO
controller-runtime.manager.controller.mediaserver Running
jobs 0: on Media Server nbux-10-244-33-77.vxindia.veritas.com.
{"reconciler group": "netbackup.veritas.com",
"reconciler kind": "MediaServer", "name": "media1", "namespace":
"netbackup-environment", "Media Server":
"nbux-10-244-33-77.vxindia.veritas.com"}
2023-03-01T08:14:56.646Z INFO
controller-runtime.manager.controller.mediaserver bpps
processes running status. false: on Media Server
nbux-10-244-33-77.vxindia.veritas.com. {"reconciler group":
"netbackup.veritas.com", "reconciler kind": "MediaServer",
"name": "media1", "namespace": "netbackup-environment", "Media
Server": "nbux-10-244-33-77.vxindia.veritas.com"}
To find out which bpps processes are running and preventing the media server
pod from scaling in, perform the following:
■ Exec into the media server pod that has running bpps processes.
■ Refer to the /mnt/nblogs/nbprocesscheck log to get the list of running bpps
processes. Wait until the processes listed under Extra bpps process
exit.
■ Identify and delete any outstanding persistent volume claims for the media
server by running the following commands:
$ kubectl get pvc --namespace <namespace_name>
$ kubectl delete pvc <pvc-name> --namespace <namespace_name>
■ Locate and delete any persistent volumes created for the media server by
running the following commands:
$ kubectl get pv
$ kubectl delete pv <pv-name> --grace-period=0 --force
media servers are not removed from NetBackup primary server and hence if
those media servers are used for any operation, connectivity issue is observed.
Workaround:
It is recommended to use media servers that are always up and running and that
never scale in (by the media server autoscaler). The number of media servers
that are always up and running is the same as the value specified in the
minimumReplicas field in the CR.
Note: If the reconciler is called while a migration PVC exists, the invocation
fails. If a migration job is running, wait for it to complete. You can monitor
the migration job pods to check for issues with the migration job. To resolve
problems encountered by an existing migration job pod, you can delete the
migration job pod manually. If the migration job pod does not exist, you can
delete the migration PVC.
Troubleshooting 251
Troubleshooting AKS-specific issues
To resolve this issue, delete the corrupted database and correct symlink as follows:
1. Exec into primary pod by running the following command:
kubectl exec -it <primary_pod_name> -n <namespace> -- bash
# /opt/veritas/vxapp-manage/nbu-health disable
# bp.kill_all
# mv -f /mnt/nbdata/usr/openv/netbackup/db/rb.db /mnt/nbdb/usr/openv/netbackup/db/rb.db
Troubleshooting 252
Troubleshooting EKS-specific issues
"Get \"https://abc.xyz.com:*/netbackup/security/cacert\":
■ From the output, copy the name of catalog PVC which is of the following
format:
catalog-<resource name prefix>-primary-0
2 Depending on the following appropriate scenario, fix the error from the output
under the Event section:
■ If the event log has an error related to incorrect EFS ID or incorrect format,
then update the environment.yaml file with the correct EFS ID and perform
the below steps.
Or
■ If the event log has an error other than the error related to incorrect EFS
ID, then analyze and fix the error and perform the below steps.
3 After fixing the error, clean the environment using the following command:
kubectl delete -k operator/
4 Delete PV and PVC created for primary server only by using the following
command:
kubectl get pvc -n <namespace>
Describe the PVC for primary server which has the following format and obtain
the corresponding PV name:
Delete the PVC and PV using the following commands:
■ PVC: kubectl delete pvc <pvc name> -n <namespace>
■ PV: kubectl delete pv <pv name>
5 Deploy NetBackup operator again and then apply the environment.yaml file.
This issue can be resolved by creating the PV and applying the environment.yaml file again.
Appendix A
CR template
This appendix includes the following topics:
■ Secret
■ MSDP Scaleout CR
Secret
The Secret is the Kubernetes security component that stores the MSDP credentials
that are required by the CR YAML.
stringData:
# Please follow MSDP guide for the credential characters and length.
# https://www.veritas.com/content/support/en_US/article.100048511
# The pattern is "^[\\w!$+\\-,.:;=?@[\\]`{}\\|~]{1,62}$"
username: xxxx
password: xxxxxx
MSDP Scaleout CR
■ The CR name must be fewer than 40 characters.
■ The MSDP credentials stored in the Secret must match MSDP credential rules.
See Deduplication Engine credentials for NetBackup
■ MSDP CR cannot be deployed in the namespace of MSDP operator. It must be
in a separate namespace.
■ You cannot reorder the IP/FQDN list. You can update the list by appending the
information.
■ You cannot change the storage class name.
The storage class must be backed with:
■ AKS: Azure disk CSI storage driver "disk.csi.azure.com"
■ EKS: Amazon EBS CSI driver "ebs.csi.aws.com"
■ You cannot change the data volume list other than for storage expansion. It is
append-only and storage expansion only. Up to 16 data volumes are supported.
■ Like the data volumes, the catalog volume can be changed for storage expansion
only.
■ You cannot change or expand the size of the log volume by changing the MSDP
CR.
■ You cannot enable NBCA after the configuration.
■ Once KMS and the OST registration parameters are set, you cannot change them.
■ You cannot change the core pattern.
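Some of the limits stated in this appendix can be checked before the CR is applied. This is a minimal sketch: the function name validate_msdp_cr is hypothetical, and it checks only three limits (a CR name of fewer than 40 characters, a size between 1 and 16, and a version string of 1 to 64 characters, the last two taken from the comments in the CR template). It does not replace the validation that the MSDP operator webhook performs.

```shell
# Check a few MSDP Scaleout CR values against the documented limits.
# Prints "ok" on success, or a message and a non-zero status on failure.
# Arguments: <cr-name> <size> <version>
validate_msdp_cr() {
    cr_name="$1"
    cr_size="$2"
    cr_version="$3"
    [ "${#cr_name}" -ge 1 ] && [ "${#cr_name}" -lt 40 ] ||
        { echo "name must be 1-39 characters"; return 1; }
    case "$cr_size" in
        ''|*[!0-9]*) echo "size must be an integer"; return 1 ;;
    esac
    [ "$cr_size" -ge 1 ] && [ "$cr_size" -le 16 ] ||
        { echo "size must be between 1 and 16"; return 1; }
    [ "${#cr_version}" -ge 1 ] && [ "${#cr_version}" -le 64 ] ||
        { echo "version must be 1-64 characters"; return 1; }
    echo "ok"
}

# Example:
#   validate_msdp_cr sample-app 4 "sample-version-string"
```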
fqdn: "sample-fqdn1"
- ipAddr: "sample-ip2"
fqdn: "sample-fqdn2"
- ipAddr: "sample-ip3"
fqdn: "sample-fqdn3"
- ipAddr: "sample-ip4"
fqdn: "sample-fqdn4"
#
# Optional annotations to be added in the LoadBalancer services for the Engine IPs.
# In case we run the Engines on private IPs, we need to add some customized annotations to the LoadBalancer services.
# See https://docs.microsoft.com/en-us/azure/aks/internal-lb
# It's optional. It's not needed in most cases if we're with public IPs.
# loadBalancerAnnotations:
# service.beta.kubernetes.io/azure-load-balancer-internal: "true"
#
# SecretName is the name of the secret which stores the MSDP credential.
# AutoDelete, when true, will automatically delete the secret specified by SecretName after the
# initial configuration. If unspecified, AutoDelete defaults to true.
# When true, SkipPrecheck will skip webhook validation of the MSDP credential. It is only used in data re-use
# scenario (delete CR and re-apply with pre-existing data) as the secret will not take effect in this scenario. It
# can't be used in other scenarios. If unspecified, SkipPrecheck defaults to false.
credential:
# The secret should be pre-created in the same namespace which has the MSDP credential stored.
# The secret should have a "username" and a "password" key-pairs with the corresponding username and password values.
# Please follow MSDP guide for the rules of the credential.
# https://www.veritas.com/content/support/en_US/article.100048511
# A secret can be created directly via kubectl command or with the equivalent YAML file:
# kubectl create secret generic sample-secret --namespace sample-namespace \
# --from-literal=username=<username> --from-literal=password=<password>
secretName: sample-secret
# Optional
# Default is true
autoDelete: true
# Optional
# Default is false.
# Should be specified only in data re-use scenario (aka delete and re-apply CR with pre-existing data)
skipPrecheck: false
#
# s3Credential:
# secretName: s3-secret
# # Optional
# # Default is true
# autoDelete: true
# # Optional
# # Default is false.
# skipPrecheck: false
# Paused is used for maintenance only. In most cases you don't need to specify it.
# When it's specified, MSDP operator stops reconciling the corresponding MSDP-X (aka the CR).
# Optional.
# Default is false
# paused: false
#
# The storage classes for logVolume, catalogVolume and dataVolumes should be:
# - Backed with Azure disk CSI driver "disk.csi.azure.com" with the managed disks, and allow volume
# expansion.
# - The Azure in-tree storage driver "kubernetes.io/azure-disk" is not supported. You need to explicitly
# enable the Azure disk CSI driver when configuring your AKS cluster, or use k8s version v1.21.x which
# has the Azure disk CSI driver built-in.
# - In LRS category.
# - At least Standard SSD for dev/test, and Premium SSD or Ultra Disk for production.
# - The same storage class can be used for all the volumes.
# -
#
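  # A minimal sketch of a StorageClass that would meet the requirements listed
  # above. The choice of Premium_LRS and the reuse of the sample name
  # "sample-azure-disk-sc1" are illustrative assumptions, not settings taken
  # from this guide. A StorageClass is a separate cluster-scoped resource and
  # is not part of this CR.
  #   apiVersion: storage.k8s.io/v1
  #   kind: StorageClass
  #   metadata:
  #     name: sample-azure-disk-sc1
  #   provisioner: disk.csi.azure.com
  #   parameters:
  #     skuName: Premium_LRS
  #   allowVolumeExpansion: true
  #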
  # LogVolume is the volume specification used to provision a volume for an MDS
  # or Controller Pod to store the log files and core dump files.
  # It cannot be changed.
  # In most cases, 5-10 GiB of capacity is enough for one MDS or Controller Pod.
  logVolume:
    storageClassName: sample-azure-disk-sc1
    resources:
      requests:
        storage: 5Gi
  #
  # CatalogVolume is the volume specification used to provision a volume for an
  # MDS or Engine Pod to store the catalog and metadata. It cannot be changed
  # except for capacity expansion.
  # Expanding the existing catalog volumes requires a short downtime of the
  # Engines.
  # Please note that the MDS Pods don't respect the storage request in
  # CatalogVolume; instead they provision the
# The MSDP credential stored in the Secret should match the MSDP credential rules
# defined in https://www.veritas.com/content/support/en_US/article.100048511
apiVersion: msdp.veritas.com/v1
kind: MSDPScaleout
metadata:
  # The CR name should not be longer than 40 characters.
  name: sample-app
  # The namespace must already exist before the CR is created in it.
  # The CR must not be deployed in the same namespace as the MSDP operator.
  namespace: sample-namespace
spec:
  # Your container registry (ECR for AWS EKS) URL from which the docker images
  # can be pulled by the k8s cluster on demand.
  # The allowed length is in range 1-255
  # It is optional for BYO. The code does not check its presence or validity.
  # The user needs to specify it correctly if it is needed.
  containerRegistry: sample.url
  #
  # The MSDP version string. It is the tag of the MSDP docker images.
  # The allowed length is in range 1-64
  version: "sample-version-string"
  #
  # Size defines the number of Engine instances in the MSDP-X cluster.
  # The allowed size is between 1-16
  size: 4
  #
  # The IP and FQDN pairs are used by the Engine Pods to expose the MSDP
  # services.
  # The IP and FQDN in one pair should match each other correctly.
  # They must be pre-allocated.
  # The number of items should match the number of Engine instances.
  # They are not allowed to be changed or re-ordered. New items can be appended
  # for scaling out.
  # The first FQDN is used to configure the storage server in NetBackup,
  # automatically if autoRegisterOST is enabled, or manually by the user if not.
  serviceIPFQDNs:
    # The pattern is IPv4 or IPv6 format
    - ipAddr: "sample-ip1"
      # The pattern is FQDN format.
      fqdn: "sample-fqdn1"
    - ipAddr: "sample-ip2"
      fqdn: "sample-fqdn2"
    - ipAddr: "sample-ip3"
      fqdn: "sample-fqdn3"
    - ipAddr: "sample-ip4"
      fqdn: "sample-fqdn4"
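    # A sketch of scaling out from 4 to 5 Engines, based on the rules stated
    # above: the existing items stay unchanged and in order, one new
    # pre-allocated pair is appended, and size is raised to match the item
    # count. "sample-ip5" and "sample-fqdn5" are hypothetical placeholders.
    #   size: 5
    #   serviceIPFQDNs:
    #     ... the four existing items, unchanged ...
    #     - ipAddr: "sample-ip5"
    #       fqdn: "sample-fqdn5"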
  #
  # Optional annotations to be added in the LoadBalancer services for the
  # Engine IPs.
  # In case we run the Engines on private IPs, we need to add some customized
  # annotations to the LoadBalancer services.
  # loadBalancerAnnotations:
  #   # If it's an EKS environment, specify the following annotation to use the
  #   # internal IPs.
  #   # see https://docs.microsoft.com/en-us/amazon/aws/internal-lb
  #   service.beta.kubernetes.io/aws-load-balancer: "true"
  #   # If the internal IPs are in a different subnet from the EKS cluster, the
  #   # following annotation should be specified as well. The subnet specified
  #   # must be in the same virtual network as the EKS cluster.
  #   service.beta.kubernetes.io/aws-load-balancer-internal-subnet: "apps-subnet"
  #
  #   # If your cluster is EKS, the following annotation item is required.
  #   # The subnet specified must be in the same VPC as your EKS.
  #   service.beta.kubernetes.io/aws-load-balancer-subnets: "subnet-04c4728ec4d0ecb90"
  #
  # SecretName is the name of the secret which stores the MSDP credential.
  # AutoDelete, when true, will automatically delete the secret specified by
  # SecretName after the initial configuration. If unspecified, AutoDelete
  # defaults to true.
  # When true, SkipPrecheck will skip webhook validation of the MSDP credential.
  # It is only used in the data re-use scenario (delete the CR and re-apply it
  # with pre-existing data), because the secret does not take effect in that
  # scenario. It cannot be used in other scenarios. If unspecified,
  # SkipPrecheck defaults to false.
  credential:
    # The secret that stores the MSDP credential should be pre-created in the
    # same namespace as the CR.
    # The secret should have a "username" and a "password" key-pairs with
  dataVolumes:
    - storageClassName: sample-aws-disk-sc3
      resources:
        requests:
          storage: xxTi
    - storageClassName: sample-aws-disk-sc3
      resources:
        requests:
          storage: xxTi
  #
  # NodeSelector is used to schedule the MSDPScaleout Pods on the specified
  # nodes.
  # Optional.
  # Default is empty (aka all available nodes)
  nodeSelector:
    # e.g.
    # agentpool: nodegroup2
    sample-node-label1: sample-label-value1
    sample-node-label2: sample-label-value2
  #
  # NBCA is the specification for the MSDP-X cluster to enable NBCA SecComm for
  # the Engines.
  # Optional.
  nbca:
    # The master server name
    # The allowed length is in range 1-255
    masterServer: sample-master-server-name
    # The CA SHA256 fingerprint
    # The allowed length is 95
    cafp: sample-ca-fp
    # The NBCA authentication/reissue token
    # The allowed length is 16
    # For security reasons, a token with a maximum of 1 user allowed and valid
    # for 1 day should be sufficient.
    token: sample-auth-token
  #
  # KMS includes the parameters to enable KMS for the Engines.
  # Enabling KMS is supported during initial configuration or post
  # configuration.
  # Changing the parameters is not supported once they have been set.
  # Optional.
  kms:
    # As either the NetBackup KMS or external KMS (EKMS) is configured or
    # registered on NetBackup master server, then used by
  # tcpKeepAliveTime: 120
  #
  # TCPIdleTimeout is used to change the default value for AWS Load Balancer
  # rules and Inbound NAT rules.
  # It is in minutes.
  # The minimal allowed value is 4 and the maximum allowed value is 30.
  # A default value of 30 minutes is used if not specified. Set it to 0 to
  # disable the option.
  # It cannot be changed unless in maintenance mode (paused=true), and the
  # change does not apply until the Engine Pods and the LoadBalancer services
  # are recreated.
  # For EKS deployment in 10.1 release, please leave it unspecified or specify
  # it with a value larger than 4.
  # tcpIdleTimeout: 30