Azure Batch

Azure Batch is a cloud service that efficiently runs large-scale parallel and high-performance computing jobs without the need for infrastructure management. It allows users to create and manage pools of virtual machines to execute compute-intensive tasks, supporting both intrinsically parallel and tightly coupled workloads. Users only pay for the underlying resources consumed, and the service integrates with various APIs and tools for job management and monitoring.

Azure Batch documentation


Azure Batch runs large-scale applications efficiently in the cloud. Schedule compute-intensive
tasks and dynamically adjust resources for your solution without managing infrastructure.

About Azure Batch

OVERVIEW
What is Azure Batch?

TRAINING
Free Pluralsight video training

Get started

QUICKSTART
Azure CLI
Azure Portal
.NET API
Python API

CONCEPT
Workflow and feature overview
Jobs and tasks
Nodes and pools

Step-by-step guides

TUTORIAL
Parallel file processing - .NET
Parallel file processing - Python
Python scripts with Data Factory
OCR with Batch and Functions

Set up and manage resources

HOW-TO GUIDE
Create a Batch account
Create and manage pools
Run tasks concurrently
Use application packages
Run container applications

CONCEPT
Supported VM sizes
Quotas and limits
Best practices

API reference

REFERENCE
Azure CLI
Azure PowerShell
.NET
Java
Node.js
Python
REST
Batch API lifecycle


What is Azure Batch?
Article • 03/14/2025

Use Azure Batch to run large-scale parallel and high-performance computing (HPC)
batch jobs efficiently in Azure. Azure Batch creates and manages a pool of compute
nodes (virtual machines), installs the applications you want to run, and schedules jobs to
run on the nodes. There's no cluster or job scheduler software to install, manage, or
scale. Instead, you use Batch APIs and tools, command-line scripts, or the Azure portal
to configure, manage, and monitor your jobs.

Developers can use Batch as a platform service to build SaaS applications or client apps
where large-scale execution is required. For example, you can build a service with Batch
to run a Monte Carlo risk simulation for a financial services company, or a service to
process many images.

There is no additional charge for using Batch. You only pay for the underlying resources
consumed, such as the virtual machines, storage, and networking.

For a comparison between Batch and other HPC solution options in Azure, see High
Performance Computing (HPC) on Azure.

Run parallel workloads


Batch works well with intrinsically parallel (also known as "embarrassingly parallel")
workloads. These workloads have applications which can run independently, with each
instance completing part of the work. When the applications are executing, they might
access some common data, but they don't communicate with other instances of the
application. Intrinsically parallel workloads can therefore run at a large scale, determined
by the amount of compute resources available to run applications simultaneously.
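
As a rough local analogy (this is not Batch itself), the following sketch starts several worker processes that each complete part of the work and never communicate with one another. The function names process_chunk and run_parallel are hypothetical, and the summation is a stand-in for real work such as rendering one frame or transcoding one file.

```bash
# Each "instance" handles one chunk of work and never talks to the others.
# Here the stand-in work is summing the integers 1..(id * 1000).
process_chunk() {
  local id="$1"
  awk -v id="$id" 'BEGIN { s = 0; for (i = 1; i <= id * 1000; i++) s += i; print id, s }'
}

# Start n independent instances in the background, then wait for all of them.
run_parallel() {
  local n="$1"
  local i=1
  while [ "$i" -le "$n" ]; do
    process_chunk "$i" &
    i=$((i + 1))
  done
  wait
}
```

Because the instances share nothing, the order in which their result lines arrive is not defined. On Batch, the pool's compute nodes play the role of the background processes.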

Some examples of intrinsically parallel workloads you can bring to Batch:

Financial risk modeling using Monte Carlo simulations
VFX and 3D image rendering
Image analysis and processing
Media transcoding
Genetic sequence analysis
Optical character recognition (OCR)
Data ingestion, processing, and ETL operations
Software test execution

You can also use Batch to run tightly coupled workloads, where the applications you run
need to communicate with each other, rather than running independently. Tightly
coupled applications normally use the Message Passing Interface (MPI) API. You can run
your tightly coupled workloads with Batch using Microsoft MPI or Intel MPI. Improve
application performance with specialized HPC and GPU-optimized VM sizes.

Some examples of tightly coupled workloads:

Finite element analysis
Fluid dynamics
Multi-node AI training

Many tightly coupled jobs can be run in parallel using Batch. For example, you can
perform multiple simulations of a liquid flowing through a pipe with varying pipe
widths.

Additional Batch capabilities


Batch supports large-scale rendering workloads with rendering tools including Autodesk
Maya, 3ds Max, Arnold, and V-Ray.

You can also run Batch jobs as part of a larger Azure workflow to transform data,
managed by tools such as Azure Data Factory.

How it works
A common scenario for Batch involves scaling out intrinsically parallel work, such as the
rendering of images for 3D scenes, on a pool of compute nodes. This pool can be your
"render farm" that provides tens, hundreds, or even thousands of cores to your
rendering job.

The following diagram shows steps in a common Batch workflow, with a client
application or hosted service using Batch to run a parallel workload.

Step 1. Upload input files and the applications to process those files to your Azure
Storage account.
Description: The input files can be any data that your application processes, such as
financial modeling data, or video files to be transcoded. The application files can
include scripts or applications that process the data, such as a media transcoder.

Step 2. Create a Batch pool of compute nodes in your Batch account, a job to run the
workload on the pool, and tasks in the job.
Description: Compute nodes are the VMs that execute your tasks. Specify properties for
your pool, such as the number and size of the nodes, a Windows or Linux VM image, and
an application to install when the nodes join the pool. Manage the cost and size of the
pool by using Azure Spot VMs or by automatically scaling the number of nodes as the
workload changes.
When you add tasks to a job, the Batch service automatically schedules the tasks for
execution on the compute nodes in the pool. Each task uses the application that you
uploaded to process the input files.

Step 3. Download input files and the applications to Batch.
Description: Before each task executes, it can download the input data that it will
process to the assigned node. If the application isn't already installed on the pool
nodes, it can be downloaded here instead. When the downloads from Azure Storage
complete, the task executes on the assigned node.

Step 4. Monitor task execution.
Description: As the tasks run, query Batch to monitor the progress of the job and its
tasks. Your client application or service communicates with the Batch service over
HTTPS. Because you may be monitoring thousands of tasks running on thousands of compute
nodes, be sure to query the Batch service efficiently.

Step 5. Upload task output.
Description: As the tasks complete, they can upload their result data to Azure Storage.
You can also retrieve files directly from the file system on a compute node.

Step 6. Download output files.
Description: When your monitoring detects that the tasks in your job have completed,
your client application or service can download the output data for further processing.
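
One way to keep monitoring queries small, as step 4 advises, is to let the service filter and trim the results instead of listing every task in full. The sketch below assumes the --filter and --select options of az batch task list and an authenticated Batch account context; the function name list_tasks_in_state is hypothetical.

```bash
# Print only the IDs of tasks in a given state (default: completed).
# --filter restricts which tasks the service returns;
# --select restricts which properties are returned for each task.
list_tasks_in_state() {
  local job_id="$1"
  local state="${2:-completed}"
  az batch task list \
    --job-id "$job_id" \
    --filter "state eq '$state'" \
    --select "id,state" \
    --query "[].id" \
    --output tsv
}
```

For example, list_tasks_in_state myJob active would print the IDs of tasks that are still queued.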

Keep in mind that the workflow described above is just one way to use Batch, and there
are many other features and options. For example, you can execute multiple tasks in
parallel on each compute node. Or you can use job preparation and completion tasks to
prepare the nodes for your jobs, then clean up afterward.

See Batch service workflow and resources for an overview of features such as pools,
nodes, jobs, and tasks. Also see the latest Batch service updates .

In-region data residency


Azure Batch does not move or store customer data out of the region in which it is
deployed.

Next steps
Get started with Azure Batch with one of these quickstarts:

Run your first Batch job with the Azure CLI
Run your first Batch job with the Azure portal
Run your first Batch job using the .NET API
Run your first Batch job using the Python API
Create a Batch account using ARM templates


Quickstart: Use the Azure CLI to create a Batch account and run a job
Article • 04/06/2025

This quickstart shows you how to get started with Azure Batch by using Azure CLI commands
and scripts to create and manage Batch resources. You create a Batch account that has a pool
of virtual machines, or compute nodes. You then create and run a job with tasks that run on the
pool nodes.

After you complete this quickstart, you understand the key concepts of the Batch service and
are ready to use Batch with more realistic, larger scale workloads.

Prerequisites
If you don't have an Azure subscription, create an Azure free account before you begin.

Azure Cloud Shell or Azure CLI.

You can run the Azure CLI commands in this quickstart interactively in Azure Cloud Shell.
To run the commands in the Cloud Shell, select Open Cloudshell at the upper-right
corner of a code block. Select Copy to copy the code, and paste it into Cloud Shell to run
it. You can also run Cloud Shell from within the Azure portal . Cloud Shell always uses
the latest version of the Azure CLI.

Alternatively, you can install Azure CLI locally to run the commands. The steps in this
article require Azure CLI version 2.0.20 or later. Run az version to see your installed
version and dependent libraries, and run az upgrade to upgrade. If you use a local
installation, sign in to Azure by using the az login command.

Note

For some regions and subscription types, quota restrictions might cause Batch account or
node creation to fail or not complete. In this situation, you can request a quota increase at
no charge. For more information, see Batch service quotas and limits.

Create a resource group


Run the following az group create command to create an Azure resource group. The resource
group is a logical container that holds the Azure resources for this quickstart.
Azure CLI

export RANDOM_SUFFIX=$(openssl rand -hex 3)
export REGION="canadacentral"
export RESOURCE_GROUP="qsBatch$RANDOM_SUFFIX"

az group create \
    --name "$RESOURCE_GROUP" \
    --location "$REGION"

Results:

JSON

{
  "id": "/subscriptions/xxxxx/resourceGroups/qsBatchxxx",
  "location": "canadacentral",
  "managedBy": null,
  "name": "qsBatchxxx",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}

Create a storage account


Use the az storage account create command to create an Azure Storage account to link to your
Batch account. Although this quickstart doesn't use the storage account, most real-world Batch
workloads use a linked storage account to deploy applications and store input and output
data.

Run the following command to create a Standard_LRS SKU storage account in your resource
group:

Azure CLI

export STORAGE_ACCOUNT="mybatchstorage$RANDOM_SUFFIX"

az storage account create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$STORAGE_ACCOUNT" \
    --location "$REGION" \
    --sku Standard_LRS

Create a Batch account
Run the following az batch account create command to create a Batch account in your
resource group and link it with the storage account.

Azure CLI

export BATCH_ACCOUNT="mybatchaccount$RANDOM_SUFFIX"

az batch account create \
    --name "$BATCH_ACCOUNT" \
    --storage-account "$STORAGE_ACCOUNT" \
    --resource-group "$RESOURCE_GROUP" \
    --location "$REGION"

Sign in to the new Batch account by running the az batch account login command. Once you
authenticate your account with Batch, subsequent az batch commands in this session use this
account context.

Azure CLI

az batch account login \
    --name "$BATCH_ACCOUNT" \
    --resource-group "$RESOURCE_GROUP" \
    --shared-key-auth

Create a pool of compute nodes


Run the az batch pool create command to create a pool of Linux compute nodes in your Batch
account. The following example creates a pool that consists of two Standard_A1_v2 size VMs
running Ubuntu 20.04 LTS OS. This node size offers a good balance of performance versus cost
for this quickstart example.

Azure CLI

export POOL_ID="myPool$RANDOM_SUFFIX"

az batch pool create \
    --id "$POOL_ID" \
    --image canonical:0001-com-ubuntu-server-focal:20_04-lts \
    --node-agent-sku-id "batch.node.ubuntu 20.04" \
    --target-dedicated-nodes 2 \
    --vm-size Standard_A1_v2

Batch creates the pool immediately, but takes a few minutes to allocate and start the compute
nodes. To see the pool status, use the az batch pool show command. This command shows all
the properties of the pool, and you can query for specific properties. The following command
queries for the pool allocation state:

Azure CLI

az batch pool show --pool-id "$POOL_ID" \
    --query "{allocationState: allocationState}"

Results:

JSON

{
"allocationState": "resizing"
}

While Batch allocates and starts the nodes, the pool is in the resizing state. You can create a
job and tasks while the pool state is still resizing. The pool is ready to run tasks when the
allocation state is steady and all the nodes are running.
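
If a script needs to block until allocation finishes, a polling loop along the following lines is one option. This is a minimal sketch: the function name wait_for_pool_steady and the MAX_ATTEMPTS and POLL_SECONDS variables are hypothetical knobs, not CLI settings, and the loop checks only the allocation state, not whether every node has started.

```bash
# Poll the pool until its allocationState is "steady" or the attempt budget runs out.
wait_for_pool_steady() {
  local pool_id="$1"
  local max_attempts="${MAX_ATTEMPTS:-60}"   # hypothetical knob: number of checks
  local poll_seconds="${POLL_SECONDS:-10}"   # hypothetical knob: delay between checks
  local attempt=1
  local state
  while [ "$attempt" -le "$max_attempts" ]; do
    state=$(az batch pool show --pool-id "$pool_id" --query "allocationState" -o tsv)
    if [ "$state" = "steady" ]; then
      echo "Pool $pool_id is steady after $attempt check(s)."
      return 0
    fi
    sleep "$poll_seconds"
    attempt=$((attempt + 1))
  done
  echo "Pool $pool_id did not reach the steady state in $max_attempts checks." >&2
  return 1
}
```

A call such as wait_for_pool_steady "$POOL_ID" returns a nonzero status on timeout, so a script can stop instead of waiting forever.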

Create a job
Use the az batch job create command to create a Batch job to run on your pool. A Batch job is
a logical group of one or more tasks. The job includes settings common to the tasks, such as
the pool to run on. The following example creates a job that initially has no tasks.

Azure CLI

export JOB_ID="myJob$RANDOM_SUFFIX"

az batch job create \
    --id "$JOB_ID" \
    --pool-id "$POOL_ID"

Create job tasks


Batch provides several ways to deploy apps and scripts to compute nodes. Use the az batch
task create command to create tasks to run in the job. Each task has a command line that
specifies an app or script.

The following Bash script creates four identical, parallel tasks called myTask1 through myTask4.
The task command line displays the Batch environment variables on the compute node, and
then waits 90 seconds.

Azure CLI

for i in {1..4}
do
    az batch task create \
        --task-id "myTask$i" \
        --job-id "$JOB_ID" \
        --command-line "/bin/bash -c 'printenv | grep AZ_BATCH; sleep 90s'"
done

Batch distributes the tasks to the compute nodes.

View task status


After you create the tasks, Batch queues them to run on the pool. Once a node is available, a
task runs on the node.

Use the az batch task show command to view the status of Batch tasks. The following example
shows details about the status of myTask1:

Azure CLI

az batch task show \
    --job-id "$JOB_ID" \
    --task-id myTask1

The command output includes many details. For example, an exitCode of 0 indicates that the
task command completed successfully. The nodeId shows the name of the pool node that ran
the task.
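
To pull out just those fields instead of reading the full output, a query like the one below may help. It is a sketch that assumes exitCode sits under executionInfo and nodeId under nodeInfo in the command's JSON output; the function name task_summary is hypothetical.

```bash
# Show only the state, exit code, and node ID of one task.
# Assumes the JSON output nests exitCode under executionInfo and nodeId under nodeInfo.
task_summary() {
  local job_id="$1"
  local task_id="$2"
  az batch task show \
    --job-id "$job_id" \
    --task-id "$task_id" \
    --query "{state: state, exitCode: executionInfo.exitCode, nodeId: nodeInfo.nodeId}" \
    --output json
}
```

For a task that hasn't finished yet, the exit code is expected to be empty, so check the state first.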

View task output


Use the az batch task file list command to list the files a task created on a node. The following
command lists the files that myTask1 created:

Azure CLI

# Wait for the task to complete before listing its output files
echo "Waiting for task to complete..."
while true; do
    STATUS=$(az batch task show --job-id "$JOB_ID" --task-id myTask1 --query "state" -o tsv)
    if [ "$STATUS" == "completed" ]; then
        break
    fi
    sleep 10
done

az batch task file list --job-id "$JOB_ID" --task-id myTask1 --output table

Results are similar to the following output:

Output

Name        URL                                                                                       Is Directory    Content Length
----------  ----------------------------------------------------------------------------------------  --------------  ----------------
stdout.txt  https://mybatchaccount.eastus2.batch.azure.com/jobs/myJob/tasks/myTask1/files/stdout.txt  False           695
certs       https://mybatchaccount.eastus2.batch.azure.com/jobs/myJob/tasks/myTask1/files/certs       True
wd          https://mybatchaccount.eastus2.batch.azure.com/jobs/myJob/tasks/myTask1/files/wd          True
stderr.txt  https://mybatchaccount.eastus2.batch.azure.com/jobs/myJob/tasks/myTask1/files/stderr.txt  False           0

The az batch task file download command downloads output files to a local directory. Run the
following example to download the stdout.txt file:

Azure CLI

az batch task file download \
    --job-id "$JOB_ID" \
    --task-id myTask1 \
    --file-path stdout.txt \
    --destination ./stdout.txt

You can view the contents of the standard output file in a text editor. The following example
shows a typical stdout.txt file. The standard output from this task shows the Azure Batch
environment variables that are set on the node. You can refer to these environment variables in
your Batch job task command lines, and in the apps and scripts the command lines run.

text

AZ_BATCH_TASK_DIR=/mnt/batch/tasks/workitems/myJob/job-1/myTask1
AZ_BATCH_NODE_STARTUP_DIR=/mnt/batch/tasks/startup
AZ_BATCH_CERTIFICATES_DIR=/mnt/batch/tasks/workitems/myJob/job-1/myTask1/certs
AZ_BATCH_ACCOUNT_URL=https://mybatchaccount.eastus2.batch.azure.com/
AZ_BATCH_TASK_WORKING_DIR=/mnt/batch/tasks/workitems/myJob/job-1/myTask1/wd
AZ_BATCH_NODE_SHARED_DIR=/mnt/batch/tasks/shared
AZ_BATCH_TASK_USER=_azbatch
AZ_BATCH_NODE_ROOT_DIR=/mnt/batch/tasks
AZ_BATCH_JOB_ID=myJob
AZ_BATCH_NODE_IS_DEDICATED=true
AZ_BATCH_NODE_ID=tvm-257509324_2-20180703t215033z
AZ_BATCH_POOL_ID=myPool
AZ_BATCH_TASK_ID=myTask1
AZ_BATCH_ACCOUNT_NAME=mybatchaccount
AZ_BATCH_TASK_USER_IDENTITY=PoolNonAdmin
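
As an illustration of using these variables from a task script, the sketch below writes a small result file into the task working directory. The function name write_task_result, the result file name, and the local fallback values are hypothetical; the fallbacks exist only so the script also runs outside Batch, where the AZ_BATCH_* variables are unset.

```bash
# Write a one-line result file into the task working directory and print its path.
# On a Batch node the AZ_BATCH_* variables are set by the service;
# the ":-" fallbacks are illustrative defaults for running locally.
write_task_result() {
  local work_dir="${AZ_BATCH_TASK_WORKING_DIR:-$PWD}"
  local task_id="${AZ_BATCH_TASK_ID:-local-task}"
  local job_id="${AZ_BATCH_JOB_ID:-local-job}"
  local out_file="$work_dir/result-$task_id.txt"
  echo "job=$job_id task=$task_id" > "$out_file"
  echo "$out_file"
}
```

Files written under the working directory appear under wd in the task's file listing.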

Next steps
In this quickstart, you created a Batch account and pool, created and ran a Batch job and tasks,
and viewed task output from the nodes. Now that you understand the key concepts of the
Batch service, you're ready to use Batch with more realistic, larger scale workloads. To learn
more about Azure Batch, continue to the Azure Batch tutorials.

Tutorial: Run a parallel workload with Azure Batch


Quickstart: Use the Azure portal to create a Batch account and run a job
Article • 03/14/2025

This quickstart shows you how to get started with Azure Batch by using the Azure portal.
You create a Batch account that has a pool of virtual machines (VMs), or compute nodes.
You then create and run a job with tasks that run on the pool nodes.

After you complete this quickstart, you understand the key concepts of the Batch service
and are ready to use Batch with more realistic, larger scale workloads.

Prerequisites
If you don't have an Azure subscription, create an Azure free account before you
begin.

Note

For some regions and subscription types, quota restrictions might cause Batch
account or node creation to fail or not complete. In this situation, you can request a
quota increase at no charge. For more information, see Batch service quotas and
limits.

Create a Batch account and Azure Storage account
You need a Batch account to create pools and jobs. The following steps create an
example Batch account. You also create an Azure Storage account to link to your Batch
account. Although this quickstart doesn't use the storage account, most real-world
Batch workloads use a linked storage account to deploy applications and store input
and output data.

1. Sign in to the Azure portal, and search for and select batch accounts.

2. On the Batch accounts page, select Create.

3. On the New Batch account page, enter or select the following values:

   Under Resource group, select Create new, enter the name qsBatch, and then
   select OK. The resource group is a logical container that holds the Azure
   resources for this quickstart.

   For Account name, enter the name mybatchaccount. The Batch account name
   must be unique within the Azure region you select, can contain only
   lowercase letters and numbers, and must be 3 to 24 characters long.

   For Location, select East US.

   Under Storage account, select the link to Select a storage account.

4. On the Create storage account page, under Name, enter mybatchstorage. Leave
the other settings at their defaults, and select OK.

5. Select Review + create at the bottom of the New Batch account page, and when
validation passes, select Create.

6. When the Deployment succeeded message appears, select Go to resource to go
to the Batch account that you created.
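
The character and length rules for the account name in step 3 can be checked locally before you open the portal. The following is a minimal sketch with a hypothetical function name; it cannot check the remaining rule, that the name is unique within the Azure region.

```bash
# Succeed only if the name is 3 to 24 characters of lowercase letters and digits.
# LC_ALL=C keeps the [a-z] range from matching uppercase letters in some locales.
is_valid_batch_account_name() {
  local name="$1"
  printf '%s\n' "$name" | LC_ALL=C grep -Eqx '[a-z0-9]{3,24}'
}
```

For example, is_valid_batch_account_name mybatchaccount succeeds, while a name such as my-batch fails because of the hyphen.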

Create a pool of compute nodes


Next, create a pool of Windows compute nodes in your Batch account. The following
steps create a pool that consists of two Standard_A1_v2 size VMs running Windows
Server 2019. This node size offers a good balance of performance versus cost for this
quickstart.

1. On your Batch account page, select Pools from the left navigation.

2. On the Pools page, select Add.


3. On the Add pool page, for Name, enter myPool.

4. Under Operating System, select the following settings:

   Publisher: Select microsoftwindowsserver.
   Sku: Select 2019-datacenter-core-smalldisk.

5. Under OS disk storage account type, select Standard LRS.

6. Scroll down to Node size, and for VM size, select Standard_A1_v2.

7. Under Scale, for Target dedicated nodes, enter 2.

8. Accept the defaults for the remaining settings, and select OK at the bottom of the
page.

Batch creates the pool immediately, but takes a few minutes to allocate and start the
compute nodes. On the Pools page, you can select myPool to go to the myPool page
and see the pool status of Resizing under Essentials > Allocation state. You can
proceed to create a job and tasks while the pool state is still Resizing or Starting.

After a few minutes, the Allocation state changes to Steady, and the nodes start. To
check the state of the nodes, select Nodes in the myPool page left navigation. When a
node's state is Idle, it's ready to run tasks.

Create a job
Now create a job to run on the pool. A Batch job is a logical group of one or more tasks.
The job includes settings common to the tasks, such as priority and the pool to run tasks
on. The job doesn't have tasks until you create them.

1. On the mybatchaccount page, select Jobs from the left navigation.

2. On the Jobs page, select Add.

3. On the Add job page, for Job ID, enter myJob.

4. Select Select pool, and on the Select pool page, select myPool, and then select
Select.

5. On the Add job page, select OK. Batch creates the job and lists it on the Jobs
page.

Create tasks
Jobs can contain multiple tasks that Batch queues and distributes to run on the compute
nodes. Batch provides several ways to deploy apps and scripts to compute nodes. When
you create a task, you specify your app or script in a command line.

The following procedure creates and runs two identical tasks in your job. Each task runs
a command line that displays the Batch environment variables on the compute node,
and then waits 90 seconds.

1. On the Jobs page, select myJob.

2. On the Tasks page, select Add.

3. On the Add task page, for Task ID, enter myTask1.

4. In Command line, enter cmd /c "set AZ_BATCH & timeout /t 90 > NUL".

5. Accept the defaults for the remaining settings, and select Submit.

6. Repeat the preceding steps to create a second task, but enter myTask2 for Task ID.

After you create each task, Batch queues it to run on the pool. Once a node is available,
the task runs on the node. In the quickstart example, if the first task is still running on
one node, Batch starts the second task on the other node in the pool.

View task output


The tasks should complete in a couple of minutes. To update task status, select Refresh
at the top of the Tasks page.

To view the output of a completed task, you can select the task from the Tasks page. On
the myTask1 page, select the stdout.txt file to view the standard output of the task.
The contents of the stdout.txt file are similar to the following example:

The standard output for this task shows the Azure Batch environment variables that are
set on the node. As long as this node exists, you can refer to these environment
variables in Batch job task command lines, and in the apps and scripts the command
lines run.

Clean up resources
If you want to continue with Batch tutorials and samples, you can use the Batch account
and linked storage account that you created in this quickstart. There's no charge for the
Batch account itself.

Pools and nodes incur charges while the nodes are running, even if they aren't running
jobs. When you no longer need a pool, delete it.

To delete a pool:

1. On your Batch account page, select Pools from the left navigation.
2. On the Pools page, select the pool to delete, and then select Delete.
3. On the Delete pool screen, enter the name of the pool, and then select Delete.

Deleting a pool deletes all task output on the nodes, and the nodes themselves.

When you no longer need any of the resources you created for this quickstart, you can
delete the resource group and all its resources, including the storage account, Batch
account, and node pools. To delete the resource group, select Delete resource group at
the top of the qsBatch resource group page. On the Delete a resource group screen,
enter the resource group name qsBatch, and then select Delete.
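
The same cleanup can be scripted with the Azure CLI rather than the portal. The sketch below wraps az batch pool delete and az group delete in two hypothetical helper functions; it assumes you are signed in and, for the pool deletion, that the session is already authenticated against the Batch account. The --yes option skips the confirmation prompt, so use it with care.

```bash
# Delete one pool; this also removes its nodes and any task output stored on them.
delete_pool() {
  local pool_id="$1"
  az batch pool delete --pool-id "$pool_id" --yes
}

# Delete the resource group and everything in it without waiting for completion.
delete_resource_group() {
  local resource_group="$1"
  az group delete --name "$resource_group" --yes --no-wait
}
```

With the names used in this quickstart, the calls would be delete_pool myPool and delete_resource_group qsBatch.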

Next steps
In this quickstart, you created a Batch account and pool, and created and ran a Batch job
and tasks. You monitored node and task status, and viewed task output from the nodes.

Now that you understand the key concepts of the Batch service, you're ready to use
Batch with more realistic, larger scale workloads. To learn more about Azure Batch,
continue to the Azure Batch tutorials.

Azure Batch tutorials



Quickstart: Create a Batch account by using a Bicep file
Article • 04/02/2025

Get started with Azure Batch by using a Bicep file to create a Batch account, including
storage. You need a Batch account to create compute resources (pools of compute
nodes) and Batch jobs. You can link an Azure Storage account with your Batch account,
which is useful to deploy applications and store input and output data for most real-
world workloads.

After completing this quickstart, you'll understand the key concepts of the Batch service
and be ready to try Batch with more realistic workloads at larger scale.

Bicep is a domain-specific language (DSL) that uses declarative syntax to deploy Azure
resources. It provides concise syntax, reliable type safety, and support for code reuse.
Bicep offers the best authoring experience for your infrastructure-as-code solutions in
Azure.

Prerequisites
You must have an active Azure subscription.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Review the Bicep file


The Bicep file used in this quickstart is from Azure Quickstart Templates.

Bicep

@description('Batch Account Name')
param batchAccountName string = '${toLower(uniqueString(resourceGroup().id))}batch'

@description('Storage Account type')
@allowed([
  'Standard_LRS'
  'Standard_GRS'
  'Standard_ZRS'
  'Premium_LRS'
])
param storageAccountsku string = 'Standard_LRS'

@description('Location for all resources.')
param location string = resourceGroup().location

var storageAccountName = '${uniqueString(resourceGroup().id)}storage'

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: storageAccountName
  location: location
  sku: {
    name: storageAccountsku
  }
  kind: 'StorageV2'
  tags: {
    ObjectName: storageAccountName
  }
  properties: {
    minimumTlsVersion: 'TLS1_2'
    allowBlobPublicAccess: false
    networkAcls: {
      defaultAction: 'Deny'
    }
    supportsHttpsTrafficOnly: true
  }
}

resource batchAccount 'Microsoft.Batch/batchAccounts@2024-02-01' = {
  name: batchAccountName
  location: location
  tags: {
    ObjectName: batchAccountName
  }
  properties: {
    autoStorage: {
      storageAccountId: storageAccount.id
    }
  }
}

output storageAccountName string = storageAccount.name
output batchAccountName string = batchAccount.name
output location string = location
output resourceGroupName string = resourceGroup().name
output resourceId string = batchAccount.id

Two Azure resources are defined in the Bicep file:

Microsoft.Storage/storageAccounts: Creates a storage account.
Microsoft.Batch/batchAccounts: Creates a Batch account.

Deploy the Bicep file


1. Save the Bicep file as main.bicep to your local computer.

2. Deploy the Bicep file using either Azure CLI or Azure PowerShell.

Azure CLI

az group create --name exampleRG --location eastus
az deployment group create --resource-group exampleRG --template-file main.bicep

When the deployment finishes, you should see a message indicating the
deployment succeeded.
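
The Bicep file also accepts a storageAccountsku parameter, which can be overridden at deployment time. The following sketch wraps the deployment in a hypothetical helper function, deploy_batch_bicep, that rejects values outside the template's allowed list before calling Azure and then prints the batchAccountName output. It assumes main.bicep is in the current directory and that the resource group already exists.

```bash
# Deploy main.bicep with a chosen storage SKU and print the Batch account name.
deploy_batch_bicep() {
  local resource_group="$1"
  local sku="${2:-Standard_LRS}"
  # Mirror the @allowed list from the Bicep file so bad input fails fast.
  case "$sku" in
    Standard_LRS|Standard_GRS|Standard_ZRS|Premium_LRS) ;;
    *) echo "Unsupported storageAccountsku: $sku" >&2; return 2 ;;
  esac
  az deployment group create \
    --resource-group "$resource_group" \
    --template-file main.bicep \
    --parameters storageAccountsku="$sku" \
    --query "properties.outputs.batchAccountName.value" \
    --output tsv
}
```

For example, deploy_batch_bicep exampleRG Standard_GRS would deploy the same two resources with geo-redundant storage.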

Validate the deployment


Use the Azure portal, Azure CLI, or Azure PowerShell to list the deployed resources in
the resource group.

Azure CLI

az resource list --resource-group exampleRG

Clean up resources
If you plan to continue on with more of our tutorials, you may want to leave these
resources in place. When no longer needed, use the Azure portal, Azure CLI, or Azure
PowerShell to delete the resource group and all of its resources.

Azure CLI

az group delete --name exampleRG


Next steps
In this quickstart, you created a Batch account and a storage account using Bicep. To
learn more about Azure Batch, continue to the Azure Batch tutorials.

Azure Batch tutorials



Quickstart: Create a Batch account by using ARM template
Article • 06/13/2024

Get started with Azure Batch by using an Azure Resource Manager template (ARM
template) to create a Batch account, including storage. You need a Batch account to
create compute resources (pools of compute nodes) and Batch jobs. You can link an
Azure Storage account with your Batch account, which is useful to deploy applications
and store input and output data for most real-world workloads.

After completing this quickstart, you'll understand the key concepts of the Batch service
and be ready to try Batch with more realistic workloads at larger scale.

An Azure Resource Manager template is a JavaScript Object Notation (JSON) file that
defines the infrastructure and configuration for your project. The template uses
declarative syntax. You describe your intended deployment without writing the
sequence of programming commands to create the deployment.

If your environment meets the prerequisites and you're familiar with using ARM
templates, select the Deploy to Azure button. The template will open in the Azure
portal.

Prerequisites
You must have an active Azure subscription.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Review the template


The template used in this quickstart is from Azure Quickstart Templates.

JSON

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "metadata": {
    "_generator": {
      "name": "bicep",
      "version": "0.26.54.24096",
      "templateHash": "5620168434409602803"
    }
  },
  "parameters": {
    "batchAccountName": {
      "type": "string",
      "defaultValue": "[format('{0}batch', toLower(uniqueString(resourceGroup().id)))]",
      "metadata": {
        "description": "Batch Account Name"
      }
    },
    "storageAccountsku": {
      "type": "string",
      "defaultValue": "Standard_LRS",
      "allowedValues": [
        "Standard_LRS",
        "Standard_GRS",
        "Standard_ZRS",
        "Premium_LRS"
      ],
      "metadata": {
        "description": "Storage Account type"
      }
    },
    "location": {
      "type": "string",
      "defaultValue": "[resourceGroup().location]",
      "metadata": {
        "description": "Location for all resources."
      }
    }
  },
  "variables": {
    "storageAccountName": "[format('{0}storage', uniqueString(resourceGroup().id))]"
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2023-01-01",
      "name": "[variables('storageAccountName')]",
      "location": "[parameters('location')]",
      "sku": {
        "name": "[parameters('storageAccountsku')]"
      },
      "kind": "StorageV2",
      "tags": {
        "ObjectName": "[variables('storageAccountName')]"
      },
      "properties": {
        "minimumTlsVersion": "TLS1_2",
        "allowBlobPublicAccess": false,
        "networkAcls": {
          "defaultAction": "Deny"
        },
        "supportsHttpsTrafficOnly": true
      }
    },
    {
      "type": "Microsoft.Batch/batchAccounts",
      "apiVersion": "2024-02-01",
      "name": "[parameters('batchAccountName')]",
      "location": "[parameters('location')]",
      "tags": {
        "ObjectName": "[parameters('batchAccountName')]"
      },
      "properties": {
        "autoStorage": {
          "storageAccountId": "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]"
        }
      },
      "dependsOn": [
        "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]"
      ]
    }
  ],
  "outputs": {
    "storageAccountName": {
      "type": "string",
      "value": "[variables('storageAccountName')]"
    },
    "batchAccountName": {
      "type": "string",
      "value": "[parameters('batchAccountName')]"
    },
    "location": {
      "type": "string",
      "value": "[parameters('location')]"
    },
    "resourceGroupName": {
      "type": "string",
      "value": "[resourceGroup().name]"
    },
    "resourceId": {
      "type": "string",
      "value": "[resourceId('Microsoft.Batch/batchAccounts', parameters('batchAccountName'))]"
    }
  }
}

Two Azure resources are defined in the template:

Microsoft.Storage/storageAccounts: Creates a storage account.


Microsoft.Batch/batchAccounts: Creates a Batch account.
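
Because an ARM template is plain JSON, the two resources and the dependency between them can be inspected with a few lines of code. The following is a minimal Python sketch, not part of the quickstart: `TEMPLATE_TEXT` is a trimmed, hypothetical stand-in for the template, and `summarize_resources` is an illustrative helper name.

```python
import json

# Trimmed, hypothetical stand-in for the quickstart template.
TEMPLATE_TEXT = """
{
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "name": "[variables('storageAccountName')]"
    },
    {
      "type": "Microsoft.Batch/batchAccounts",
      "name": "[parameters('batchAccountName')]",
      "dependsOn": [
        "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]"
      ]
    }
  ]
}
"""


def summarize_resources(template_text):
    """Return (resource type, number of dependsOn entries) for each resource."""
    template = json.loads(template_text)
    return [
        (resource["type"], len(resource.get("dependsOn", [])))
        for resource in template.get("resources", [])
    ]


for resource_type, dependency_count in summarize_resources(TEMPLATE_TEXT):
    print(f"{resource_type}: {dependency_count} explicit dependencies")
```

The Batch account is the only resource with a dependsOn entry, which reflects that its autoStorage setting refers to the storage account.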

Deploy the template


1. Select the following image to sign in to Azure and open a template. The template
creates an Azure Batch account and a storage account.

2. Select or enter the following values.

Subscription: select an Azure subscription.


Resource group: select Create new, enter a unique name for the resource
group, and then click OK.
Location (of the resource group): select a location. For example, Central US.
Batch Account Name: Leave the default value.
Storage Accountsku: select a storage account type. For example,
Standard_LRS.
Location: Leave the default so that the resources will be in the same location
as your resource group.

3. Select Review + create, then select Create.

After a few minutes, you should see a notification that the Batch account was
successfully created.

In this example, the Azure portal is used to deploy the template. You can also deploy
templates by using Azure PowerShell, the Azure CLI, or the REST API. To learn about other
deployment methods, see Deploy templates.
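
As a rough illustration of the Azure CLI route, the following Python sketch assembles an `az deployment group create` command for a locally saved copy of the template. The resource group name, the file name `azuredeploy.json`, and the helper `build_deploy_command` are assumptions made for this example; the sketch only builds and prints the command and doesn't run it.

```python
import shlex


def build_deploy_command(resource_group, template_file, parameters=None):
    """Build an `az deployment group create` argument list (not executed here)."""
    command = [
        "az", "deployment", "group", "create",
        "--resource-group", resource_group,
        "--template-file", template_file,
    ]
    if parameters:
        command.append("--parameters")
        command.extend(f"{key}={value}" for key, value in parameters.items())
    return command


# Hypothetical resource group and template file name.
command = build_deploy_command(
    "myResourceGroup",
    "azuredeploy.json",
    {"storageAccountsku": "Standard_LRS"},
)
print(shlex.join(command))
```

After you sign in with `az login`, the same list could be passed to `subprocess.run(command, check=True)` to start the deployment.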

Validate the deployment


You can validate the deployment in the Azure portal by navigating to the resource
group you created. In the Overview screen, confirm that the Batch account and the
storage account are present.

Clean up resources
If you plan to continue with more tutorials, you might want to leave these
resources in place. If you no longer need them, delete the resource group,
which also deletes the Batch account and the storage account that you created.

Next steps
In this quickstart, you created a Batch account and a storage account. To learn more
about Azure Batch, continue to the Azure Batch tutorials.

Azure Batch tutorials



Quickstart: Create an Azure Batch
account using Terraform
Article • 04/02/2025

Get started with Azure Batch by using Terraform to create a Batch account, including
storage. You need a Batch account to create compute resources (pools of compute
nodes) and Batch jobs. You can link an Azure Storage account with your Batch account.
This pairing is useful to deploy applications and store input and output data for most
real-world workloads.

After completing this quickstart, you'll understand the key concepts of the Batch service
and be ready to try Batch with more realistic workloads at larger scale.

Terraform enables the definition, preview, and deployment of cloud infrastructure.


Using Terraform, you create configuration files using HCL syntax. The HCL syntax
allows you to specify the cloud provider - such as Azure - and the elements that make
up your cloud infrastructure. After you create your configuration files, you create an
execution plan that allows you to preview your infrastructure changes before they're
deployed. Once you verify the changes, you apply the execution plan to deploy the
infrastructure.

In this article, you learn how to:

- Create a random value for the Azure resource group name using random_pet
- Create an Azure resource group using azurerm_resource_group
- Create a random value using random_string
- Create an Azure Storage account using azurerm_storage_account
- Create an Azure Batch account using azurerm_batch_account

Prerequisites
Install and configure Terraform

Implement the Terraform code

Note

The sample code for this article is located in the Azure Terraform GitHub repo.
You can view the log file containing the test results from current and previous
versions of Terraform.

See more articles and sample code showing how to use Terraform to manage
Azure resources.

1. Create a directory in which to test and run the sample Terraform code and make it
the current directory.

2. Create a file named providers.tf and insert the following code:

Terraform

terraform {
  required_version = ">=1.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~>3.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~>3.0"
    }
  }
}

provider "azurerm" {
  features {}
}

3. Create a file named main.tf and insert the following code:

Terraform

resource "random_pet" "rg_name" {
  prefix = var.resource_group_name_prefix
}

resource "azurerm_resource_group" "rg" {
  name     = random_pet.rg_name.id
  location = var.resource_group_location
}

resource "random_string" "azurerm_storage_account_name" {
  length  = 13
  lower   = true
  numeric = false
  special = false
  upper   = false
}

resource "random_string" "azurerm_batch_account_name" {
  length  = 13
  lower   = true
  numeric = false
  special = false
  upper   = false
}

resource "azurerm_storage_account" "storage" {
  name                     = "storage${random_string.azurerm_storage_account_name.result}"
  resource_group_name      = azurerm_resource_group.rg.name
  location                 = azurerm_resource_group.rg.location
  account_tier             = element(split("_", var.storage_account_type), 0)
  account_replication_type = element(split("_", var.storage_account_type), 1)
}

resource "azurerm_batch_account" "batch" {
  name                                = "batch${random_string.azurerm_batch_account_name.result}"
  resource_group_name                 = azurerm_resource_group.rg.name
  location                            = azurerm_resource_group.rg.location
  storage_account_id                  = azurerm_storage_account.storage.id
  storage_account_authentication_mode = "StorageKeys"
}
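
Two details of main.tf are easy to miss: the account names are a fixed prefix plus 13 random lowercase letters, and the single storage_account_type value is split on the underscore into a tier and a replication type. The following Python sketch mirrors that logic for illustration only. The helper names are hypothetical, and the generated names differ from what Terraform produces.

```python
import random
import string


def random_lowercase(length=13):
    """Mimic the random_string settings in main.tf: lowercase letters only."""
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


def split_account_type(storage_account_type):
    """Split a value such as 'Standard_LRS' into (tier, replication type)."""
    tier, replication = storage_account_type.split("_")
    return tier, replication


storage_name = "storage" + random_lowercase()
batch_name = "batch" + random_lowercase()
print(storage_name, len(storage_name))  # 7 + 13 = 20 characters
print(batch_name, len(batch_name))      # 5 + 13 = 18 characters
print(split_account_type("Standard_LRS"))
```

At 20 characters, the generated storage account name stays within the 3 to 24 character limit for storage account names, and it uses only lowercase letters.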

4. Create a file named variables.tf and insert the following code:

Terraform

variable "resource_group_location" {
  type        = string
  default     = "eastus"
  description = "Location for all resources."
}

variable "resource_group_name_prefix" {
  type        = string
  default     = "rg"
  description = "Prefix of the resource group name that's combined with a random ID so name is unique in your Azure subscription."
}

variable "storage_account_type" {
  type        = string
  default     = "Standard_LRS"
  description = "Azure Storage account type."
  validation {
    condition     = contains(["Premium_LRS", "Premium_ZRS", "Standard_GRS", "Standard_GZRS", "Standard_LRS", "Standard_RAGRS", "Standard_RAGZRS", "Standard_ZRS"], var.storage_account_type)
    error_message = "Invalid storage account type. The value should be one of the following: 'Premium_LRS','Premium_ZRS','Standard_GRS','Standard_GZRS','Standard_LRS','Standard_RAGRS','Standard_RAGZRS','Standard_ZRS'."
  }
}
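
The validation block rejects any storage_account_type that isn't in a fixed list before Terraform plans anything. The same membership check is sketched in Python here for readers who want to see the rule in isolation. The function name and the exact error wording are illustrative assumptions, not Terraform behavior.

```python
ALLOWED_STORAGE_ACCOUNT_TYPES = (
    "Premium_LRS", "Premium_ZRS", "Standard_GRS", "Standard_GZRS",
    "Standard_LRS", "Standard_RAGRS", "Standard_RAGZRS", "Standard_ZRS",
)


def validate_storage_account_type(value):
    """Return the value if it is allowed; otherwise raise ValueError."""
    if value not in ALLOWED_STORAGE_ACCOUNT_TYPES:
        allowed = ", ".join(ALLOWED_STORAGE_ACCOUNT_TYPES)
        raise ValueError(
            f"Invalid storage account type {value!r}. Expected one of: {allowed}."
        )
    return value


print(validate_storage_account_type("Standard_LRS"))
```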

5. Create a file named outputs.tf and insert the following code:

Terraform

output "resource_group_name" {
  value = azurerm_resource_group.rg.name
}

output "batch_name" {
  value = azurerm_batch_account.batch.name
}

output "storage_name" {
  value = azurerm_storage_account.storage.name
}

Initialize Terraform
Run terraform init to initialize the Terraform deployment. This command downloads
the Azure provider required to manage your Azure resources.

Console

terraform init -upgrade

Key points:

The -upgrade parameter upgrades the necessary provider plugins to the newest
version that complies with the configuration's version constraints.

Create a Terraform execution plan


Run terraform plan to create an execution plan.

Console
terraform plan -out main.tfplan

Key points:

The terraform plan command creates an execution plan, but doesn't execute it.
Instead, it determines what actions are necessary to create the configuration
specified in your configuration files. This pattern allows you to verify whether the
execution plan matches your expectations before making any changes to actual
resources.
The optional -out parameter allows you to specify an output file for the plan.
Using the -out parameter ensures that the plan you reviewed is exactly what is
applied.

Apply a Terraform execution plan


Run terraform apply to apply the execution plan to your cloud infrastructure.

Console

terraform apply main.tfplan

Key points:

The example terraform apply command assumes you previously ran terraform
plan -out main.tfplan .

If you specified a different filename for the -out parameter, use that same
filename in the call to terraform apply .
If you didn't use the -out parameter, call terraform apply without any parameters.
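
The init, plan, and apply steps always run in the same order, and the file passed to -out must match the file passed to apply. A small Python sketch of that sequence follows. `terraform_workflow` is a hypothetical helper that only builds the command lists and doesn't run Terraform.

```python
def terraform_workflow(plan_file="main.tfplan"):
    """Return the init, plan, and apply commands in the order they should run."""
    return [
        ["terraform", "init", "-upgrade"],
        ["terraform", "plan", "-out", plan_file],
        ["terraform", "apply", plan_file],
    ]


for command in terraform_workflow():
    print(" ".join(command))
```

To run the steps, each list could be passed to `subprocess.run(command, check=True)`, which stops the sequence at the first command that fails.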

Verify the results


Azure CLI

1. Get the Azure resource group name.

Console

resource_group_name=$(terraform output -raw resource_group_name)


2. Get the Batch account name.

Console

batch_name=$(terraform output -raw batch_name)

3. Run az batch account show to display information about the new Batch
account.

Azure CLI

az batch account show \
    --resource-group $resource_group_name \
    --name $batch_name
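
If you prefer to read all outputs at once, `terraform output -json` prints every output as a JSON object whose entries carry a `value` field. The following Python sketch parses such a document. `SAMPLE_OUTPUT` is made-up data for illustration, and `output_values` is a hypothetical helper name.

```python
import json

# Hypothetical sample of what `terraform output -json` might print for this
# configuration; the names and values are invented for the example.
SAMPLE_OUTPUT = """
{
  "batch_name": {"sensitive": false, "type": "string", "value": "batchabcdefghijklm"},
  "resource_group_name": {"sensitive": false, "type": "string", "value": "rg-example-pet"},
  "storage_name": {"sensitive": false, "type": "string", "value": "storageabcdefghijklm"}
}
"""


def output_values(output_json):
    """Map each Terraform output name to its plain value."""
    return {name: entry["value"] for name, entry in json.loads(output_json).items()}


values = output_values(SAMPLE_OUTPUT)
print(values["resource_group_name"], values["batch_name"])
```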

Clean up resources
When you no longer need the resources created via Terraform, do the following steps:

1. Run terraform plan and specify the destroy flag.

Console

terraform plan -destroy -out main.destroy.tfplan

Key points:

The terraform plan command creates an execution plan, but doesn't execute
it. Because the -destroy flag is specified, the plan lists the actions that
are necessary to destroy the resources managed by your configuration files.
This pattern allows you to verify whether the execution plan matches your
expectations before making any changes to actual resources.
The optional -out parameter allows you to specify an output file for the plan.
Using the -out parameter ensures that the plan you reviewed is exactly what
is applied.

2. Run terraform apply to apply the execution plan.

Console

terraform apply main.destroy.tfplan


Troubleshoot Terraform on Azure
Troubleshoot common problems when using Terraform on Azure

Next steps
Run your first Batch job with the Azure CLI

Note: The author created this article with assistance from AI.



Quickstart: Use .NET to create a Batch
pool and run a job
Article • 04/02/2025

This quickstart shows you how to get started with Azure Batch by running a C# app that
uses the Azure Batch .NET API. The .NET app:

- Uploads several input data files to an Azure Storage blob container to use for Batch
  task processing.
- Creates a pool of two virtual machines (VMs), or compute nodes, running Windows
  Server.
- Creates a job that runs tasks on the nodes to process each input file by using a
  Windows command line.
- Displays the output files that the tasks return.

After you complete this quickstart, you understand the key concepts of the Batch service
and are ready to use Batch with more realistic, larger scale workloads.

Prerequisites
An Azure account with an active subscription. If you don't have one, create an
account for free.

A Batch account with a linked Azure Storage account. You can create the accounts
by using any of the following methods: Azure CLI | Azure portal | Bicep | ARM
template | Terraform.

Visual Studio 2019 or later, or .NET 6.0 or later, for Linux or Windows.

Run the app


To complete this quickstart, you download or clone the app, provide your account
values, build and run the app, and verify the output.

Download or clone the app


Download or clone the Azure Batch .NET Quickstart app from GitHub. Use the
following command to clone the app repo with a Git client:

Windows Command Prompt


git clone https://github.com/Azure-Samples/batch-dotnet-quickstart.git

Provide your account information


The app needs to use your Batch and Storage account names, account key values, and
Batch account endpoint. You can get this information from the Azure portal, Azure APIs,
or command-line tools.

To get your account information from the Azure portal:

1. From the Azure Search bar, search for and select your Batch account name.
2. On your Batch account page, select Keys from the left navigation.
3. On the Keys page, copy the following values:

Batch account
Account endpoint
Primary access key
Storage account name
Key1
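
The same values can also be read with command-line tools. As a hedged sketch, the following Python snippet builds Azure CLI commands that should return the Batch account endpoint, the primary Batch key, and the first storage key. The account and resource group names are hypothetical, `account_info_commands` is an illustrative helper, and nothing is executed. The endpoint returned by the CLI might not include the `https://` prefix that the portal shows.

```python
def account_info_commands(resource_group, batch_account, storage_account):
    """Build Azure CLI argument lists for the endpoint and keys (not executed)."""
    tsv = ["--output", "tsv"]
    return {
        "batch_endpoint": [
            "az", "batch", "account", "show",
            "--name", batch_account, "--resource-group", resource_group,
            "--query", "accountEndpoint", *tsv,
        ],
        "batch_key": [
            "az", "batch", "account", "keys", "list",
            "--name", batch_account, "--resource-group", resource_group,
            "--query", "primary", *tsv,
        ],
        "storage_key": [
            "az", "storage", "account", "keys", "list",
            "--account-name", storage_account, "--resource-group", resource_group,
            "--query", "[0].value", *tsv,
        ],
    }


# Hypothetical names.
commands = account_info_commands("myResourceGroup", "mybatch", "mystorage")
for label, command in commands.items():
    print(label, "->", " ".join(command))
```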

Navigate to your downloaded batch-dotnet-quickstart folder and edit the credential
strings in Program.cs to provide the values you copied:

C#

// Batch account credentials
private const string BatchAccountName = "<batch account>";
private const string BatchAccountKey = "<primary access key>";
private const string BatchAccountUrl = "<account endpoint>";

// Storage account credentials
private const string StorageAccountName = "<storage account name>";
private const string StorageAccountKey = "<key1>";

Important

Exposing account keys in the app source isn't recommended for production use.
You should restrict access to credentials and refer to them in your code by using
variables or a configuration file. It's best to store Batch and Storage account keys in
Azure Key Vault.

Build and run the app and view output


To see the Batch workflow in action, build and run the application in Visual Studio. You
can also use the command line dotnet build and dotnet run commands.

In Visual Studio:

1. Open the BatchDotNetQuickstart.sln file, right-click the solution in Solution
Explorer, and select Build. If prompted, use NuGet Package Manager to update
or restore NuGet packages.

2. Once the build completes, select BatchDotNetQuickstart in the top menu bar to
run the app.

Typical run time with the default configuration is approximately five minutes. Initial pool
node setup takes the most time. To rerun the job, delete the job from the previous run,
but don't delete the pool. On a preconfigured pool, the job completes in a few seconds.

The app returns output similar to the following example:

Output

Sample start: 11/16/2022 4:02:54 PM

Container [input] created.


Uploading file taskdata0.txt to container [input]...
Uploading file taskdata1.txt to container [input]...
Uploading file taskdata2.txt to container [input]...
Creating pool [DotNetQuickstartPool]...
Creating job [DotNetQuickstartJob]...
Adding 3 tasks to job [DotNetQuickstartJob]...
Monitoring all tasks for 'Completed' state, timeout in 00:30:00...

There's a pause at Monitoring all tasks for 'Completed' state, timeout in
00:30:00... while the pool's compute nodes start. As tasks are created, Batch queues
them to run on the pool. As soon as the first compute node is available, the first task
runs on the node. You can monitor node, task, and job status from your Batch account
page in the Azure portal.

After each task completes, you see output similar to the following example:

Output

Printing task output.


Task: Task0
Node: tvm-2850684224_3-20171205t000401z
Standard out:
Batch processing began with mainframe computers and punch cards. Today it
still plays a central role...
stderr:
...

Review the code


Review the code to understand the steps in the Azure Batch .NET Quickstart.

Create service clients and upload resource files


1. To interact with the storage account, the app uses the Azure Storage Blobs client
library for .NET to create a BlobServiceClient.

C#

var sharedKeyCredential = new StorageSharedKeyCredential(storageAccountName, storageAccountKey);
string blobUri = "https://" + storageAccountName + ".blob.core.windows.net";

var blobServiceClient = new BlobServiceClient(new Uri(blobUri), sharedKeyCredential);
return blobServiceClient;

2. The app uses the blobServiceClient reference to create a container in the storage
account and upload data files to the container. The files in storage are defined as
Batch ResourceFile objects that Batch can later download to the compute nodes.

C#

List<string> inputFilePaths = new()
{
    "taskdata0.txt",
    "taskdata1.txt",
    "taskdata2.txt"
};

var inputFiles = new List<ResourceFile>();

foreach (var filePath in inputFilePaths)
{
    inputFiles.Add(UploadFileToContainer(containerClient, inputContainerName, filePath));
}

3. The app creates a BatchClient object to create and manage Batch pools, jobs, and
tasks. The Batch client uses shared key authentication. Batch also supports
Microsoft Entra authentication.

C#

var cred = new BatchSharedKeyCredentials(BatchAccountUrl, BatchAccountName, BatchAccountKey);

using BatchClient batchClient = BatchClient.Open(cred);
...

Create a pool of compute nodes


To create a Batch pool, the app uses the BatchClient.PoolOperations.CreatePool method
to set the number of nodes, VM size, and pool configuration. The following
VirtualMachineConfiguration object specifies an ImageReference to a Windows Server
Marketplace image. Batch supports a wide range of Windows Server and Linux
Marketplace OS images, and also supports custom VM images.

PoolNodeCount and PoolVMSize are defined constants that set the number of nodes and the
VM size. The app creates a pool of two Standard_A1_v2 nodes. This size offers a good
balance of performance versus cost for this quickstart.

The Commit method submits the pool to the Batch service.

C#

private static VirtualMachineConfiguration CreateVirtualMachineConfiguration(ImageReference imageReference)
{
    return new VirtualMachineConfiguration(
        imageReference: imageReference,
        nodeAgentSkuId: "batch.node.windows amd64");
}

private static ImageReference CreateImageReference()
{
    return new ImageReference(
        publisher: "MicrosoftWindowsServer",
        offer: "WindowsServer",
        sku: "2016-datacenter-smalldisk",
        version: "latest");
}

private static void CreateBatchPool(BatchClient batchClient, VirtualMachineConfiguration vmConfiguration)
{
    try
    {
        CloudPool pool = batchClient.PoolOperations.CreatePool(
            poolId: PoolId,
            targetDedicatedComputeNodes: PoolNodeCount,
            virtualMachineSize: PoolVMSize,
            virtualMachineConfiguration: vmConfiguration);

        pool.Commit();
    }
...

Create a Batch job


A Batch job is a logical grouping of one or more tasks. The job includes settings
common to the tasks, such as priority and the pool to run tasks on.

The app uses the BatchClient.JobOperations.CreateJob method to create a job on your
pool. The Commit method submits the job to the Batch service. Initially the job has no
tasks.

C#

try
{
    CloudJob job = batchClient.JobOperations.CreateJob();
    job.Id = JobId;
    job.PoolInformation = new PoolInformation { PoolId = PoolId };

    job.Commit();
}
...

Create tasks
Batch provides several ways to deploy apps and scripts to compute nodes. This app
creates a list of CloudTask input ResourceFile objects. Each task processes an input file
by using a CommandLine property. The Batch command line is where you specify your
app or script.

The command line in the following code runs the Windows type command to display
the input files. Then, the app adds each task to the job with the AddTask method, which
queues the task to run on the compute nodes.

C#
// Collect one task per input file (declared here so the excerpt is complete).
var tasks = new List<CloudTask>();

for (int i = 0; i < inputFiles.Count; i++)
{
    string taskId = String.Format("Task{0}", i);
    string inputFilename = inputFiles[i].FilePath;
    string taskCommandLine = String.Format("cmd /c type {0}", inputFilename);

    var task = new CloudTask(taskId, taskCommandLine)
    {
        ResourceFiles = new List<ResourceFile> { inputFiles[i] }
    };
    tasks.Add(task);
}

// Submit all tasks to the job in a single call.
batchClient.JobOperations.AddTask(JobId, tasks);

View task output


The app creates a TaskStateMonitor to monitor the tasks and make sure they complete.
When each task runs successfully, its output writes to stdout.txt. The app then uses the
CloudTask.ComputeNodeInformation property to display the stdout.txt file for each
completed task.

C#

foreach (CloudTask task in completedtasks)
{
    string nodeId = task.ComputeNodeInformation.ComputeNodeId;
    Console.WriteLine("Task: {0}", task.Id);
    Console.WriteLine("Node: {0}", nodeId);
    Console.WriteLine("Standard out:");
    Console.WriteLine(task.GetNodeFile(Constants.StandardOutFileName).ReadAsString());
}

Clean up resources
The app automatically deletes the storage container it creates, and gives you the option
to delete the Batch pool and job. Pools and nodes incur charges while the nodes are
running, even if they aren't running jobs. If you no longer need the pool, delete it.

When you no longer need your Batch account and storage account, you can delete the
resource group that contains them. In the Azure portal, select Delete resource group at
the top of the resource group page. On the Delete a resource group screen, enter the
resource group name, and then select Delete.

Next steps
In this quickstart, you ran an app that uses the Batch .NET API to create a Batch pool,
nodes, job, and tasks. The job uploaded resource files to a storage container, ran tasks
on the nodes, and displayed output from the nodes.

Now that you understand the key concepts of the Batch service, you're ready to use
Batch with more realistic, larger scale workloads. To learn more about Azure Batch and
walk through a parallel workload with a real-world application, continue to the Batch
.NET tutorial.

Process a parallel workload with .NET



Quickstart: Use Python to create a Batch
pool and run a job
Article • 03/21/2025

This quickstart shows you how to get started with Azure Batch by running an app that
uses the Azure Batch libraries for Python. The Python app:

- Uploads several input data files to an Azure Storage blob container to use for Batch
  task processing.
- Creates a pool of two virtual machines (VMs), or compute nodes, running Ubuntu
  22.04 LTS OS.
- Creates a job and three tasks to run on the nodes. Each task processes one of the
  input files by using a Bash shell command line.
- Displays the output files that the tasks return.

After you complete this quickstart, you understand the key concepts of the Batch service
and are ready to use Batch with more realistic, larger scale workloads.

Prerequisites
An Azure account with an active subscription. If you don't have one, create an
account for free.

A Batch account with a linked Azure Storage account. You can create the accounts
by using any of the following methods: Azure CLI | Azure portal | Bicep | ARM
template | Terraform.

Python version 3.8 or later, which includes the pip package manager.

Run the app


To complete this quickstart, you download or clone the Python app, provide your
account values, run the app, and verify the output.

Download or clone the app


1. Download or clone the Azure Batch Python Quickstart app from GitHub. Use the
following command to clone the app repo with a Git client:

Bash
git clone https://github.com/Azure-Samples/batch-python-quickstart.git

2. Switch to the batch-python-quickstart/src folder, and install the required packages
by using pip.

Bash

pip install -r requirements.txt

Provide your account information


The Python app needs to use your Batch and Storage account names, account key
values, and Batch account endpoint. You can get this information from the Azure portal,
Azure APIs, or command-line tools.

To get your account information from the Azure portal:

1. From the Azure Search bar, search for and select your Batch account name.
2. On your Batch account page, select Keys from the left navigation.
3. On the Keys page, copy the following values:

Batch account
Account endpoint
Primary access key
Storage account name
Key1

In your downloaded Python app, edit the following strings in the config.py file to supply
the values you copied.

Python

BATCH_ACCOUNT_NAME = '<batch account>'
BATCH_ACCOUNT_KEY = '<primary access key>'
BATCH_ACCOUNT_URL = '<account endpoint>'
STORAGE_ACCOUNT_NAME = '<storage account name>'
STORAGE_ACCOUNT_KEY = '<key1>'

Important

Exposing account keys in the app source isn't recommended for production use.
You should restrict access to credentials and refer to them in your code by using
variables or a configuration file. It's best to store Batch and Storage account keys in
Azure Key Vault.
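
One simple way to keep keys out of source files is to read them from environment variables at startup. The following sketch assumes environment variable names that match the config.py constants. Those names and the `load_settings` helper are illustrative choices, not part of the quickstart app, and the demo uses placeholder values rather than real credentials.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
REQUIRED_SETTINGS = (
    "BATCH_ACCOUNT_NAME",
    "BATCH_ACCOUNT_KEY",
    "BATCH_ACCOUNT_URL",
    "STORAGE_ACCOUNT_NAME",
    "STORAGE_ACCOUNT_KEY",
)


def load_settings(environ=os.environ):
    """Read all required settings, failing early if any are missing."""
    missing = [name for name in REQUIRED_SETTINGS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_SETTINGS}


# Demonstration with placeholder values instead of real credentials.
demo_environ = {name: f"<{name.lower()}>" for name in REQUIRED_SETTINGS}
settings = load_settings(demo_environ)
print(sorted(settings))
```

Failing early with the list of missing names makes a misconfigured environment obvious before any Azure call is attempted.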

Run the app and view output


Run the app to see the Batch workflow in action.

Bash

python python_quickstart_client.py

Typical run time is approximately three minutes. Initial pool node setup takes the most
time.

The app returns output similar to the following example:

Output

Sample start: 11/26/2012 4:02:54 PM

Uploading file taskdata0.txt to container [input]...


Uploading file taskdata1.txt to container [input]...
Uploading file taskdata2.txt to container [input]...
Creating pool [PythonQuickstartPool]...
Creating job [PythonQuickstartJob]...
Adding 3 tasks to job [PythonQuickstartJob]...
Monitoring all tasks for 'Completed' state, timeout in 00:30:00...

There's a pause at Monitoring all tasks for 'Completed' state, timeout in
00:30:00... while the pool's compute nodes start. As tasks are created, Batch queues
them to run on the pool. As soon as the first compute node is available, the first task
runs on the node. You can monitor node, task, and job status from your Batch account
page in the Azure portal.
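
The waiting step is a polling loop with a deadline. The following is a minimal, library-free sketch of that pattern, assuming a hypothetical `get_states` callable that returns the current state string of every task. In the real app, the states come from the Batch service. The fake data here finishes on the third poll.

```python
import time


def wait_for_completion(get_states, timeout_seconds, poll_seconds=1.0):
    """Poll until every task reports 'completed' or the timeout expires.

    get_states is a hypothetical callable returning a list of state strings.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        states = get_states()
        if states and all(state == "completed" for state in states):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)


# Fake task states that finish on the third poll.
snapshots = iter([
    ["active", "active", "active"],
    ["running", "completed", "active"],
    ["completed", "completed", "completed"],
])
finished = wait_for_completion(
    lambda: next(snapshots), timeout_seconds=5, poll_seconds=0.01
)
print("All tasks completed:", finished)
```

Using a monotonic clock for the deadline keeps the timeout correct even if the system clock changes while the loop is waiting.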

After each task completes, you see output similar to the following example:

Output

Printing task output...


Task: Task0
Node: tvm-2850684224_3-20171205t000401z
Standard output:
Batch processing began with mainframe computers and punch cards. Today it
still plays a central role...
Review the code
Review the code to understand the steps in the Azure Batch Python Quickstart.

Create service clients and upload resource files


1. The app creates a BlobServiceClient object to interact with the Storage account.

Python

blob_service_client = BlobServiceClient(
    account_url=f"https://{config.STORAGE_ACCOUNT_NAME}.{config.STORAGE_ACCOUNT_DOMAIN}/",
    credential=config.STORAGE_ACCOUNT_KEY
)
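
The account URL above is the base address for everything in the Storage account; each blob is addressed under it by container name and blob name. The following sketch shows how such a URL is composed, assuming the storage domain is `blob.core.windows.net`. The account name and the `blob_url` helper are hypothetical.

```python
from urllib.parse import quote


def blob_url(account_name, container_name, blob_name, domain="blob.core.windows.net"):
    """Compose the HTTPS URL of a blob from its account, container, and name."""
    return f"https://{account_name}.{domain}/{container_name}/{quote(blob_name)}"


# Hypothetical account name.
print(blob_url("mystorageaccount", "input", "taskdata0.txt"))
```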

2. The app uses the blob_service_client reference to create a container in the
Storage account and upload data files to the container. The files in storage are
defined as Batch ResourceFile objects that Batch can later download to compute
nodes.

Python

input_file_paths = [os.path.join(sys.path[0], 'taskdata0.txt'),
                    os.path.join(sys.path[0], 'taskdata1.txt'),
                    os.path.join(sys.path[0], 'taskdata2.txt')]

input_files = [
    upload_file_to_container(blob_service_client, input_container_name, file_path)
    for file_path in input_file_paths]

3. The app creates a BatchServiceClient object to create and manage pools, jobs, and
tasks in the Batch account. The Batch client uses shared key authentication. Batch
also supports Microsoft Entra authentication.

Python

credentials = SharedKeyCredentials(config.BATCH_ACCOUNT_NAME,
                                   config.BATCH_ACCOUNT_KEY)

batch_client = BatchServiceClient(
    credentials,
    batch_url=config.BATCH_ACCOUNT_URL)
Create a pool of compute nodes
To create a Batch pool, the app uses the PoolAddParameter class to set the number of
nodes, VM size, and pool configuration. The following VirtualMachineConfiguration
object specifies an ImageReference to an Ubuntu Server 22.04 LTS Azure Marketplace
image. Batch supports a wide range of Linux and Windows Server Marketplace images,
and also supports custom VM images.

POOL_NODE_COUNT and POOL_VM_SIZE are defined constants. The app creates a pool of
two Standard_DS1_v2 nodes. This size offers a good balance of performance versus
cost for this quickstart.

The pool.add method submits the pool to the Batch service.

Python

new_pool = batchmodels.PoolAddParameter(
    id=pool_id,
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(
            publisher="canonical",
            # The 22.04 LTS images are published under the "jammy" offer.
            offer="0001-com-ubuntu-server-jammy",
            sku="22_04-lts",
            version="latest"
        ),
        node_agent_sku_id="batch.node.ubuntu 22.04"),
    vm_size=config.POOL_VM_SIZE,
    target_dedicated_nodes=config.POOL_NODE_COUNT
)
batch_service_client.pool.add(new_pool)

Create a Batch job


A Batch job is a logical grouping of one or more tasks. The job includes settings
common to the tasks, such as priority and the pool to run tasks on.

The app uses the JobAddParameter class to create a job on the pool. The job.add
method adds the job to the specified Batch account. Initially the job has no tasks.

Python

job = batchmodels.JobAddParameter(
    id=job_id,
    pool_info=batchmodels.PoolInformation(pool_id=pool_id))
batch_service_client.job.add(job)

Create tasks
Batch provides several ways to deploy apps and scripts to compute nodes. This app
creates a list of task objects by using the TaskAddParameter class. Each task processes
an input file by using a command_line parameter to specify an app or script.

The following script processes the input resource_files objects by running the Bash
shell cat command to display the text files. The app then uses the task.add_collection
method to add each task to the job, which queues the tasks to run on the compute
nodes.

Python

tasks = []

for idx, input_file in enumerate(resource_input_files):
    command = f"/bin/bash -c \"cat {input_file.file_path}\""
    tasks.append(batchmodels.TaskAddParameter(
        id=f'Task{idx}',
        command_line=command,
        resource_files=[input_file]
    ))

batch_service_client.task.add_collection(job_id, tasks)
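
The command line in the loop above embeds the file path directly, which works for simple names such as taskdata0.txt. For paths that might contain spaces, one option is shell-style quoting with the standard library, as in the following sketch. `build_task_commands` is a hypothetical helper. This document doesn't confirm that Batch handles every quoting form the same way a shell does, so test the result with your own file names.

```python
import shlex


def build_task_commands(file_paths):
    """Return (task id, command line) pairs, one per input file."""
    commands = []
    for idx, file_path in enumerate(file_paths):
        # Quote the path for the inner command, then quote the whole inner
        # command so that bash -c receives it as a single argument.
        inner = f"cat {shlex.quote(file_path)}"
        commands.append((f"Task{idx}", f"/bin/bash -c {shlex.quote(inner)}"))
    return commands


for task_id, command_line in build_task_commands(["taskdata0.txt", "my data.txt"]):
    print(task_id, command_line)
```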

View task output


The app monitors task state to make sure the tasks complete. When each task runs
successfully, the task command output writes to the stdout.txt file. The app then displays
the stdout.txt file for each completed task.

Python

tasks = batch_service_client.task.list(job_id)

for task in tasks:

    node_id = batch_service_client.task.get(job_id, task.id).node_info.node_id
    print(f"Task: {task.id}")
    print(f"Node: {node_id}")

    stream = batch_service_client.file.get_from_task(
        job_id, task.id, config.STANDARD_OUT_FILE_NAME)

    file_text = _read_stream_as_string(
        stream,
        text_encoding)

    if text_encoding is None:
        text_encoding = DEFAULT_ENCODING

    sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding = text_encoding)
    sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding = text_encoding)

    print("Standard output:")
    print(file_text)
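
The excerpt calls a helper named _read_stream_as_string without showing it. The following is a plausible minimal sketch of such a helper, assuming the stream is an iterable of byte chunks and that UTF-8 is the default encoding. Both are assumptions, and the sample's actual implementation may differ.

```python
import io

DEFAULT_ENCODING = "utf-8"  # assumed default; the sample may use another value


def read_stream_as_string(stream, encoding=None):
    """Join an iterable of byte chunks and decode the result to text."""
    if encoding is None:
        encoding = DEFAULT_ENCODING
    buffer = io.BytesIO()
    for chunk in stream:
        buffer.write(chunk)
    return buffer.getvalue().decode(encoding)


# Simulated download stream: the file content arrives in arbitrary chunks.
chunks = [b"Batch processing began with ", b"mainframe computers ", b"and punch cards."]
print(read_stream_as_string(chunks))
```

Collecting all chunks before decoding avoids errors when a multibyte character happens to be split across two chunks.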

Clean up resources
The app automatically deletes the storage container it creates, and gives you the option
to delete the Batch pool and job. Pools and nodes incur charges while the nodes are
running, even if they aren't running jobs. If you no longer need the pool, delete it.

When you no longer need your Batch resources, you can delete the resource group that
contains them. In the Azure portal, select Delete resource group at the top of the
resource group page. On the Delete a resource group screen, enter the resource group
name, and then select Delete.

Next steps
In this quickstart, you ran an app that uses the Batch Python API to create a Batch pool,
nodes, job, and tasks. The job uploaded resource files to a storage container, ran tasks
on the nodes, and displayed output from the nodes.

Now that you understand the key concepts of the Batch service, you're ready to use
Batch with more realistic, larger scale workloads. To learn more about Azure Batch and
walk through a parallel workload with a real-world application, continue to the Batch
Python tutorial.

Process a parallel workload with Python



Deploy an Azure Batch account and two pools - Terraform
Article • 10/24/2024

In this quickstart, you create an Azure Batch account, an Azure Storage account, and two
Batch pools using Terraform. Batch is a cloud-based job scheduling service that
parallelizes and distributes the processing of large volumes of data across many
computers. It's typically used for parametric sweeps, Monte Carlo simulations, financial
risk modeling, and other high-performance computing applications. A Batch account is
the top-level resource in the Batch service that provides access to pools, jobs, and tasks.
The Storage account is used to store and manage all the files that are used and
generated by the Batch service, while the two Batch pools are collections of compute
nodes that execute the tasks.

Terraform enables the definition, preview, and deployment of cloud infrastructure.
Using Terraform, you create configuration files using HCL syntax. The HCL syntax
allows you to specify the cloud provider, such as Azure, and the elements that make
up your cloud infrastructure. After you create your configuration files, you create an
execution plan that allows you to preview your infrastructure changes before they're
deployed. Once you verify the changes, you apply the execution plan to deploy the
infrastructure.

- Specify the required version of Terraform and the required providers.
- Define the Azure provider with no additional features.
- Define variables for the location of the resource group and the prefix of the
  resource group name.
- Generate a random name for the resource group using the provided prefix.
- Create an Azure resource group with the generated name at the specified location.
- Generate a random string to be used as the name for the Storage account.
- Create a Storage account with the generated name in the created resource group,
  at the same location, and with a standard account tier and locally redundant
  Storage replication type.
- Generate another random string to be used as the name for the Batch account.
- Create a Batch account with the generated name in the created resource group, at
  the same location, and link it to the created Storage account with Storage keys
  authentication mode.
- Generate a random name for the Batch pool with a "pool" prefix.
- Create a Batch pool with a fixed scale using the generated name in the created
  resource group, linked to the created Batch account, with a standard A1 virtual
  machine (VM) size, Ubuntu 22.04 node agent SKU, and a start task that echoes
  'Hello World from $env' with a maximum of one retry and waits for success.
- Create another Batch pool with auto scale, using the same generated name, in the
  created resource group, linked to the created Batch account, with a standard A1 VM
  size, Ubuntu 22.04 node agent SKU, and an autoscale formula.
- Output the names of the created resource group, Storage account, Batch account,
  and both Batch pools.

Prerequisites
Create an Azure account with an active subscription. You can create an account for
free.
Install and configure Terraform.

Implement the Terraform code

Note

The sample code for this article is located in the Azure Terraform GitHub repo.
You can view the log file containing the test results from current and previous
versions of Terraform.

See more articles and sample code showing how to use Terraform to manage
Azure resources.

1. Create a directory in which to test and run the sample Terraform code, and make it
the current directory.

2. Create a file named main.tf, and insert the following code:

Terraform

resource "random_pet" "rg_name" {
  prefix = var.resource_group_name_prefix
}

resource "azurerm_resource_group" "rg" {
  location = var.resource_group_location
  name     = random_pet.rg_name.id
}

resource "random_string" "storage_account_name" {
  length  = 8
  lower   = true
  numeric = false
  special = false
  upper   = false
}

resource "azurerm_storage_account" "example" {
  name                     = random_string.storage_account_name.result
  resource_group_name      = azurerm_resource_group.rg.name
  location                 = azurerm_resource_group.rg.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

resource "random_string" "batch_account_name" {
  length  = 8
  lower   = true
  numeric = false
  special = false
  upper   = false
}

resource "azurerm_batch_account" "example" {
  name                                = random_string.batch_account_name.result
  resource_group_name                 = azurerm_resource_group.rg.name
  location                            = azurerm_resource_group.rg.location
  storage_account_id                  = azurerm_storage_account.example.id
  storage_account_authentication_mode = "StorageKeys"
}

resource "random_pet" "azurerm_batch_pool_name" {
  prefix = "pool"
}

resource "azurerm_batch_pool" "fixed" {
  name                = "${random_pet.azurerm_batch_pool_name.id}-fixed-pool"
  resource_group_name = azurerm_resource_group.rg.name
  account_name        = azurerm_batch_account.example.name
  display_name        = "Fixed Scale Pool"
  vm_size             = "Standard_A1"
  node_agent_sku_id   = "batch.node.ubuntu 22.04"

  fixed_scale {
    target_dedicated_nodes = 2
    resize_timeout         = "PT15M"
  }

  storage_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }

  start_task {
    command_line       = "echo 'Hello World from $env'"
    task_retry_maximum = 1
    wait_for_success   = true

    common_environment_properties = {
      env = "TEST"
    }

    user_identity {
      auto_user {
        elevation_level = "NonAdmin"
        scope           = "Task"
      }
    }
  }

  metadata = {
    "tagName" = "Example tag"
  }
}

resource "azurerm_batch_pool" "autopool" {
  name                = "${random_pet.azurerm_batch_pool_name.id}-autoscale-pool"
  resource_group_name = azurerm_resource_group.rg.name
  account_name        = azurerm_batch_account.example.name
  display_name        = "Auto Scale Pool"
  vm_size             = "Standard_A1"
  node_agent_sku_id   = "batch.node.ubuntu 22.04"

  auto_scale {
    evaluation_interval = "PT15M"

    formula = <<EOF
      startingNumberOfVMs = 1;
      maxNumberofVMs = 25;
      pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(180 * TimeInterval_Second);
      pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(180 * TimeInterval_Second));
      $TargetDedicatedNodes=min(maxNumberofVMs, pendingTaskSamples);
EOF
  }

  storage_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}

3. Create a file named outputs.tf, and insert the following code:

Terraform

output "resource_group_name" {
  value = azurerm_resource_group.rg.name
}

output "storage_account_name" {
  value = azurerm_storage_account.example.name
}

output "batch_account_name" {
  value = azurerm_batch_account.example.name
}

output "batch_pool_fixed_name" {
  value = azurerm_batch_pool.fixed.name
}

output "batch_pool_autopool_name" {
  value = azurerm_batch_pool.autopool.name
}

4. Create a file named providers.tf, and insert the following code:

Terraform

terraform {
  required_version = ">=1.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~>3.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~>3.0"
    }
  }
}

provider "azurerm" {
  features {}
}

5. Create a file named variables.tf, and insert the following code:

Terraform

variable "resource_group_location" {
  type        = string
  default     = "eastus"
  description = "Location of the resource group."
}

variable "resource_group_name_prefix" {
  type        = string
  default     = "rg"
  description = "Prefix of the resource group name that's combined with a random ID so name is unique in your Azure subscription."
}
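
The autoscale formula in main.tf can be easier to follow when it's restated as ordinary code. The following Python sketch mirrors one reading of that formula: if fewer than 70 percent of the expected pending-task samples were collected over the last 180 seconds, the pool falls back to the starting size. Otherwise it targets the average number of pending tasks, capped at the maximum. The function and parameter names are hypothetical, and the sketch isn't how the Batch service evaluates formulas.

```python
from statistics import mean

STARTING_NUMBER_OF_VMS = 1   # startingNumberOfVMs in the formula
MAX_NUMBER_OF_VMS = 25       # maxNumberofVMs in the formula


def target_dedicated_nodes(sample_percent, pending_task_samples):
    """Mirror the decision made by the autoscale formula in main.tf."""
    if sample_percent < 70:
        # Too few samples were collected, so fall back to the starting size.
        pending = STARTING_NUMBER_OF_VMS
    else:
        # Enough samples: use the average number of pending tasks.
        pending = mean(pending_task_samples)
    # Never ask for more nodes than the configured maximum.
    return min(MAX_NUMBER_OF_VMS, pending)
```

For example, with all samples present and an average of 40 pending tasks, the sketch returns the cap of 25 nodes.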

Initialize Terraform
Run terraform init to initialize the Terraform deployment. This command downloads
the Azure provider required to manage your Azure resources.

Console

terraform init -upgrade

Key points:

The -upgrade parameter upgrades the necessary provider plugins to the newest
version that complies with the configuration's version constraints.

Create a Terraform execution plan


Run terraform plan to create an execution plan.

Console

terraform plan -out main.tfplan

Key points:

The terraform plan command creates an execution plan, but doesn't execute it.
Instead, it determines what actions are necessary to create the configuration
specified in your configuration files. This pattern allows you to verify whether the
execution plan matches your expectations before making any changes to actual
resources.
The optional -out parameter allows you to specify an output file for the plan.
Using the -out parameter ensures that the plan you reviewed is exactly what is
applied.

Apply a Terraform execution plan


Run terraform apply to apply the execution plan to your cloud infrastructure.

Console

terraform apply main.tfplan

Key points:

The example terraform apply command assumes you previously ran terraform
plan -out main.tfplan .

If you specified a different filename for the -out parameter, use that same
filename in the call to terraform apply .
If you didn't use the -out parameter, call terraform apply without any parameters.
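
If you script these steps, the main thing to preserve is that terraform apply receives the same plan file that terraform plan wrote. The following Python sketch shows one way to keep the file name in a single place. The helper names terraform_steps and run_steps are hypothetical. Only the terraform commands themselves come from this article.

```python
import subprocess


def terraform_steps(plan_file="main.tfplan"):
    """Return the init, plan, and apply commands, sharing one plan file name."""
    return [
        ["terraform", "init", "-upgrade"],
        ["terraform", "plan", "-out", plan_file],
        ["terraform", "apply", plan_file],
    ]


def run_steps(steps):
    """Run each command in order and stop at the first failure."""
    for command in steps:
        # check=True raises CalledProcessError if a command exits with an error.
        subprocess.run(command, check=True)
```

Calling run_steps(terraform_steps()) in the directory that contains your configuration files would run the three commands in sequence.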

Verify the results


Azure CLI

Run az batch account show to view the Batch account.

Azure CLI

az batch account show --name <batch_account_name> --resource-group <resource_group_name>

Replace <batch_account_name> with the name of your Batch account and
<resource_group_name> with the name of your resource group.

Clean up resources
When you no longer need the resources created via Terraform, do the following steps:
1. Run terraform plan and specify the destroy flag.

Console

terraform plan -destroy -out main.destroy.tfplan

Key points:

The terraform plan command creates an execution plan, but doesn't execute
it. With the -destroy flag, the plan lists the actions needed to remove the
resources that Terraform manages for this configuration. This pattern allows
you to verify whether the execution plan matches your expectations before
making any changes to actual resources.
The optional -out parameter allows you to specify an output file for the plan.
Using the -out parameter ensures that the plan you reviewed is exactly what
is applied.

2. Run terraform apply to apply the execution plan.

Console

terraform apply main.destroy.tfplan

Troubleshoot Terraform on Azure


Troubleshoot common problems when using Terraform on Azure.

Next steps
See more articles about Batch accounts.

Note: The author created this article with assistance from AI.



Deploy an Azure Batch account and two pools with a start task - Terraform
Article • 10/24/2024

In this quickstart, you create an Azure Batch account, an Azure Storage account, and two
Batch pools using Terraform. Batch is a cloud-based job scheduling service that
parallelizes and distributes the processing of large volumes of data across many
computers. It's typically used for tasks like rendering 3D graphics, analyzing large
datasets, or processing video. In this case, the resources created include a Batch account
(which is the central organizing entity for distributed processing tasks), a Storage
account for holding the data to be processed, and two Batch pools, which are groups of
virtual machines that execute the tasks.

Terraform enables the definition, preview, and deployment of cloud infrastructure.
Using Terraform, you create configuration files using HCL syntax. The HCL syntax
allows you to specify the cloud provider, such as Azure, and the elements that make
up your cloud infrastructure. After you create your configuration files, you create an
execution plan that allows you to preview your infrastructure changes before they're
deployed. Once you verify the changes, you apply the execution plan to deploy the
infrastructure.

- Specify the required version of Terraform and the required providers.
- Define the Azure provider with no additional features.
- Define variables for the resource group location and name prefix.
- Generate a random name for the Azure resource group.
- Create a resource group with the generated name at a specified location.
- Generate a random string for the Storage account name.
- Create a Storage account with the generated name in the created resource group.
- Generate a random string for the Batch account name.
- Create a Batch account with the generated name in the created resource group and
  linked to the created Storage account.
- Generate a random name for the Batch pool.
- Create a Batch pool with a fixed scale in the created resource group and linked to
  the created Batch account.
- Create a Batch pool with autoscale in the created resource group and linked to the
  created Batch account.
- Output the names of the created resource group, Storage account, Batch account,
  and both Batch pools.

Prerequisites
Create an Azure account with an active subscription. You can create an account for
free.
Install and configure Terraform.

Implement the Terraform code

Note

The sample code for this article is located in the Azure Terraform GitHub repo.
You can view the log file containing the test results from current and previous
versions of Terraform.

See more articles and sample code showing how to use Terraform to manage
Azure resources.

1. Create a directory in which to test and run the sample Terraform code, and make it
the current directory.

2. Create a file named main.tf, and insert the following code:

Terraform

resource "random_pet" "rg_name" {
  prefix = var.resource_group_name_prefix
}

resource "azurerm_resource_group" "rg" {
  location = var.resource_group_location
  name     = random_pet.rg_name.id
}

resource "random_string" "storage_account_name" {
  length  = 8
  lower   = true
  numeric = false
  special = false
  upper   = false
}

resource "azurerm_storage_account" "example" {
  name                     = random_string.storage_account_name.result
  resource_group_name      = azurerm_resource_group.rg.name
  location                 = azurerm_resource_group.rg.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

resource "random_string" "batch_account_name" {
  length  = 8
  lower   = true
  numeric = false
  special = false
  upper   = false
}

resource "azurerm_batch_account" "example" {
  name                                = random_string.batch_account_name.result
  resource_group_name                 = azurerm_resource_group.rg.name
  location                            = azurerm_resource_group.rg.location
  storage_account_id                  = azurerm_storage_account.example.id
  storage_account_authentication_mode = "StorageKeys"
}

resource "random_pet" "azurerm_batch_pool_name" {
  prefix = "pool"
}

resource "azurerm_batch_pool" "fixed" {
  name                = "${random_pet.azurerm_batch_pool_name.id}-fixed-pool"
  resource_group_name = azurerm_resource_group.rg.name
  account_name        = azurerm_batch_account.example.name
  display_name        = "Fixed Scale Pool"
  vm_size             = "Standard_D4_v3"
  node_agent_sku_id   = "batch.node.ubuntu 22.04"

  fixed_scale {
    target_dedicated_nodes = 2
    resize_timeout         = "PT15M"
  }

  storage_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }

  start_task {
    command_line       = "echo 'Hello World from $env'"
    task_retry_maximum = 1
    wait_for_success   = true

    common_environment_properties = {
      env = "TEST"
    }

    user_identity {
      auto_user {
        elevation_level = "NonAdmin"
        scope           = "Task"
      }
    }
  }

  metadata = {
    "tagName" = "Example tag"
  }
}

resource "azurerm_batch_pool" "autopool" {
  name                = "${random_pet.azurerm_batch_pool_name.id}-autoscale-pool"
  resource_group_name = azurerm_resource_group.rg.name
  account_name        = azurerm_batch_account.example.name
  display_name        = "Auto Scale Pool"
  vm_size             = "Standard_D4_v3"
  node_agent_sku_id   = "batch.node.ubuntu 22.04"

  auto_scale {
    evaluation_interval = "PT15M"

    formula = <<EOF
      startingNumberOfVMs = 1;
      maxNumberofVMs = 25;
      pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(180 * TimeInterval_Second);
      pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(180 * TimeInterval_Second));
      $TargetDedicatedNodes=min(maxNumberofVMs, pendingTaskSamples);
EOF
  }

  storage_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}

3. Create a file named outputs.tf, and insert the following code:

Terraform

output "resource_group_name" {
  value = azurerm_resource_group.rg.name
}

output "storage_account_name" {
  value = azurerm_storage_account.example.name
}

output "batch_account_name" {
  value = azurerm_batch_account.example.name
}

output "batch_pool_fixed_name" {
  value = azurerm_batch_pool.fixed.name
}

output "batch_pool_autopool_name" {
  value = azurerm_batch_pool.autopool.name
}

4. Create a file named providers.tf, and insert the following code:

Terraform

terraform {
  required_version = ">=1.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~>3.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~>3.0"
    }
  }
}

provider "azurerm" {
  features {}
}

5. Create a file named variables.tf, and insert the following code:

Terraform

variable "resource_group_location" {
  type        = string
  default     = "eastus"
  description = "Location of the resource group."
}

variable "resource_group_name_prefix" {
  type        = string
  default     = "rg"
  description = "Prefix of the resource group name that's combined with a random ID so name is unique in your Azure subscription."
}
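
The resize_timeout and evaluation_interval values in main.tf ("PT15M") are ISO 8601 durations, here meaning 15 minutes. If you want to sanity-check such values before you put them in a configuration file, a small parser can help. The following Python sketch handles only the time-only form (hours, minutes, and whole seconds). The function name is hypothetical, and the sketch doesn't cover the full ISO 8601 grammar.

```python
import re
from datetime import timedelta

# Matches time-only durations such as PT15M, PT1H30M, or PT45S.
_DURATION = re.compile(
    r"^PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?$"
)


def parse_time_duration(text):
    """Parse a time-only ISO 8601 duration such as 'PT15M' into a timedelta."""
    match = _DURATION.match(text)
    if match is None or text == "PT":
        raise ValueError(f"unsupported duration: {text!r}")
    # Keep only the components that were present in the input.
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    return timedelta(**parts)
```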

Initialize Terraform
Run terraform init to initialize the Terraform deployment. This command downloads
the Azure provider required to manage your Azure resources.

Console

terraform init -upgrade

Key points:

The -upgrade parameter upgrades the necessary provider plugins to the newest
version that complies with the configuration's version constraints.

Create a Terraform execution plan


Run terraform plan to create an execution plan.

Console

terraform plan -out main.tfplan

Key points:

The terraform plan command creates an execution plan, but doesn't execute it.
Instead, it determines what actions are necessary to create the configuration
specified in your configuration files. This pattern allows you to verify whether the
execution plan matches your expectations before making any changes to actual
resources.
The optional -out parameter allows you to specify an output file for the plan.
Using the -out parameter ensures that the plan you reviewed is exactly what is
applied.

Apply a Terraform execution plan


Run terraform apply to apply the execution plan to your cloud infrastructure.
Console

terraform apply main.tfplan

Key points:

The example terraform apply command assumes you previously ran terraform
plan -out main.tfplan .

If you specified a different filename for the -out parameter, use that same
filename in the call to terraform apply .
If you didn't use the -out parameter, call terraform apply without any parameters.

Verify the results


Azure CLI

Run az batch account show to view the Batch account.

Azure CLI

az batch account show --name <batch_account_name> --resource-group <resource_group_name>

In the above command, replace <batch_account_name> with the name of your Batch
account and <resource_group_name> with the name of your resource group.
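
Because outputs.tf exposes batch_account_name and resource_group_name, you don't have to copy the generated names by hand. The following Python sketch builds the az batch account show arguments from the text produced by terraform output -json. It assumes that each output appears as an object with a value field. The helper name and the sample values are hypothetical.

```python
import json


def az_show_command(terraform_output_json):
    """Build `az batch account show` arguments from `terraform output -json` text."""
    outputs = json.loads(terraform_output_json)
    return [
        "az", "batch", "account", "show",
        "--name", outputs["batch_account_name"]["value"],
        "--resource-group", outputs["resource_group_name"]["value"],
    ]


# Hypothetical values, shaped like the result of `terraform output -json`.
sample = json.dumps({
    "batch_account_name": {"sensitive": False, "type": "string", "value": "abcdefgh"},
    "resource_group_name": {"sensitive": False, "type": "string", "value": "rg-example-pet"},
})
print(" ".join(az_show_command(sample)))
```

You could pass the returned list to a process runner of your choice, or print it as shown to copy the command into a shell.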

Clean up resources
When you no longer need the resources created via Terraform, do the following steps:

1. Run terraform plan and specify the destroy flag.

Console

terraform plan -destroy -out main.destroy.tfplan

Key points:

The terraform plan command creates an execution plan, but doesn't execute
it. With the -destroy flag, the plan lists the actions needed to remove the
resources that Terraform manages for this configuration. This pattern allows
you to verify whether the execution plan matches your expectations before
making any changes to actual resources.
The optional -out parameter allows you to specify an output file for the plan.
Using the -out parameter ensures that the plan you reviewed is exactly what
is applied.

2. Run terraform apply to apply the execution plan.

Console

terraform apply main.destroy.tfplan

Troubleshoot Terraform on Azure


Troubleshoot common problems when using Terraform on Azure.

Next steps
See more articles about Batch accounts.

Note: The author created this article with assistance from AI.



Tutorial: Run a parallel workload with Azure Batch using the .NET API
Article • 04/02/2025

Use Azure Batch to run large-scale parallel and high-performance computing (HPC)
batch jobs efficiently in Azure. This tutorial walks through a C# example of running a
parallel workload using Batch. You learn a common Batch application workflow and how
to interact programmatically with Batch and Storage resources.

- Add an application package to your Batch account.
- Authenticate with Batch and Storage accounts.
- Upload input files to Storage.
- Create a pool of compute nodes to run an application.
- Create a job and tasks to process input files.
- Monitor task execution.
- Retrieve output files.

In this tutorial, you convert MP4 media files to MP3 format, in parallel, by using the
ffmpeg open-source tool.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Prerequisites
Visual Studio 2017 or later, or .NET Core SDK for Linux, macOS, or Windows.

A Batch account and a linked Azure Storage account. To create these accounts, see
the Batch quickstart guides for the Azure portal or Azure CLI.

Download the appropriate version of ffmpeg for your use case to your local
computer. This tutorial and the related sample app use the Windows 64-bit full-build
version of ffmpeg 4.3.1. For this tutorial, you only need the zip file. You do
not need to unzip the file or install it locally.

Sign in to Azure
Sign in to the Azure portal.

Add an application package
Use the Azure portal to add ffmpeg to your Batch account as an application package.
Application packages help you manage task applications and their deployment to the
compute nodes in your pool.

1. In the Azure portal, click More services > Batch accounts, and select the name of
your Batch account.

2. Click Applications > Add.

3. Enter ffmpeg in the Application Id field, and a package version of 4.3.1 in the
Version field. Select the ffmpeg zip file that you downloaded, and then select
Submit. The ffmpeg application package is added to your Batch account.

Get account credentials


For this example, you need to provide credentials for your Batch and Storage accounts.
A straightforward way to get the necessary credentials is in the Azure portal. (You can
also get these credentials using the Azure APIs or command-line tools.)
1. Select All services > Batch accounts, and then select the name of your Batch
account.

2. To see the Batch credentials, select Keys. Copy the values of Batch account, URL,
and Primary access key to a text editor.

3. To see the Storage account name and keys, select Storage account. Copy the
values of Storage account name and Key1 to a text editor.

Download and run the sample app

Download the sample app


Download or clone the sample app from GitHub. To clone the sample app repo with a
Git client, use the following command:

git clone https://github.com/Azure-Samples/batch-dotnet-ffmpeg-tutorial.git

Navigate to the directory that contains the Visual Studio solution file
BatchDotNetFfmpegTutorial.sln.

Also, make sure that the ffmpeg application package reference in the solution matches
the identifier and version of the ffmpeg package that you uploaded to your Batch
account. For example, ffmpeg and 4.3.1.

C#

const string appPackageId = "ffmpeg";
const string appPackageVersion = "4.3.1";

Build and run the sample project


Build and run the application in Visual Studio, or at the command line with the dotnet
build and dotnet run commands. After running the application, review the code to
learn what each part of the application does. For example, in Visual Studio:

1. Right-click the solution in Solution Explorer and select Build Solution.

2. Confirm the restoration of any NuGet packages, if you're prompted. If you need to
download missing packages, ensure the NuGet Package Manager is installed.
3. Run the solution. When you run the sample application, the console output is
similar to the following. During execution, you experience a pause at Monitoring
all tasks for 'Completed' state, timeout in 00:30:00... while the pool's
compute nodes are started.

Sample start: 11/19/2018 3:20:21 PM

Container [input] created.
Container [output] created.
Uploading file LowPriVMs-1.mp4 to container [input]...
Uploading file LowPriVMs-2.mp4 to container [input]...
Uploading file LowPriVMs-3.mp4 to container [input]...
Uploading file LowPriVMs-4.mp4 to container [input]...
Uploading file LowPriVMs-5.mp4 to container [input]...
Creating pool [WinFFmpegPool]...
Creating job [WinFFmpegJob]...
Adding 5 tasks to job [WinFFmpegJob]...
Monitoring all tasks for 'Completed' state, timeout in 00:30:00...
Success! All tasks completed successfully within the specified timeout period.
Deleting container [input]...

Sample end: 11/19/2018 3:29:36 PM
Elapsed time: 00:09:14.3418742
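
The "Monitoring all tasks for 'Completed' state, timeout in 00:30:00..." line reflects a wait-with-timeout pattern. The following Python sketch illustrates that general pattern only, and it isn't the sample's C# implementation. The get_states callable, the state strings, and the polling interval are all assumptions made for illustration.

```python
import time


def wait_for_tasks(get_states, timeout_seconds, poll_seconds=5.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll until every task reports 'completed' or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while True:
        states = get_states()
        if states and all(state == "completed" for state in states):
            return True
        if clock() >= deadline:
            # Timed out before all tasks completed.
            return False
        sleep(poll_seconds)
```

A 30-minute timeout, as in the console output, would correspond to timeout_seconds=1800.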

Go to your Batch account in the Azure portal to monitor the pool, compute nodes, job,
and tasks. For example, to see a heat map of the compute nodes in your pool, click
Pools > WinFFmpegPool.

When tasks are running, the heat map is similar to the following:

Typical execution time is approximately 10 minutes when you run the application in its
default configuration. Pool creation takes the most time.
Retrieve output files
You can use the Azure portal to download the output MP3 files generated by the
ffmpeg tasks.

1. Click All services > Storage accounts, and then click the name of your storage
account.
2. Click Blobs > output.
3. Right-click one of the output MP3 files and then click Download. Follow the
prompts in your browser to open or save the file.

Although not shown in this sample, you can also download the files programmatically
from the compute nodes or from the storage container.

Review the code


The following sections break down the sample application into the steps that it performs
to process a workload in the Batch service. Refer to the file Program.cs in the solution
while you read the rest of this article, since not every line of code in the sample is
discussed.

Authenticate Blob and Batch clients


To interact with the linked storage account, the app uses the Azure.Storage.Blobs library
for .NET. It creates a BlobServiceClient object, which takes the account URI and a token
credential such as DefaultAzureCredential.

C#
// TODO: Replace <storage-account-name> with your actual storage account name
Uri accountUri = new Uri("https://<storage-account-name>.blob.core.windows.net/");
BlobServiceClient blobClient = new BlobServiceClient(accountUri, new DefaultAzureCredential());

The app creates a reference to the BatchAccountResource via the Azure Resource Manager
ArmClient to create the pool in the Batch service. The ARM client in the sample uses
DefaultAzureCredential authentication.

C#

ArmClient _armClient = new ArmClient(new DefaultAzureCredential());
var batchAccountIdentifier = ResourceIdentifier.Parse(BatchAccountResourceID);
BatchAccountResource batchAccount = await _armClient.GetBatchAccountResource(batchAccountIdentifier).GetAsync();

The app creates a BatchClient object to create jobs and tasks in the Batch service.
The Batch client in the sample uses DefaultAzureCredential authentication.

C#

// TODO: Replace <batch-account-name> with your actual Batch account name
Uri batchUri = new Uri("https://<batch-account-name>.eastus.batch.azure.com");
BatchClient _batchClient = new BatchClient(batchUri, new DefaultAzureCredential());
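
Both clients are constructed from an endpoint URI that follows a predictable shape. The following Python sketch shows how those URIs are composed, assuming the public Azure cloud domains that appear in the snippets above. The function names and the example account names are hypothetical.

```python
def blob_endpoint(storage_account_name):
    """Return the Blob service endpoint for a storage account (public cloud)."""
    return f"https://{storage_account_name}.blob.core.windows.net/"


def batch_endpoint(batch_account_name, region):
    """Return the Batch account endpoint for an account in the given region."""
    return f"https://{batch_account_name}.{region}.batch.azure.com"


# Hypothetical account names, for illustration only.
print(blob_endpoint("mystorageaccount"))
print(batch_endpoint("mybatchaccount", "eastus"))
```

The region segment matters: an account created outside eastus needs its own region in the Batch URI.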

Upload input files


The app passes the blobClient object to the CreateContainerIfNotExist method
to create a storage container for the input files (MP4 format) and a container for the
task output.

C#

CreateContainerIfNotExist(blobClient, inputContainerName);
CreateContainerIfNotExist(blobClient, outputContainerName);

Then, files are uploaded to the input container from the local InputFiles folder. The files
in storage are defined as Batch ResourceFile objects that Batch can later download to
compute nodes.

Two methods in Program.cs are involved in uploading the files:

UploadFilesToContainerAsync: Returns a collection of ResourceFile objects and
internally calls UploadResourceFileToContainerAsync to upload each file that is
passed in the inputFilePaths parameter.
UploadResourceFileToContainerAsync: Uploads each file as a blob to the input
container. After uploading the file, it obtains a shared access signature (SAS) for
the blob and returns a ResourceFile object to represent it.

C#

string inputPath = Path.Combine(Environment.CurrentDirectory, "InputFiles");

List<string> inputFilePaths = new List<string>(
    Directory.GetFileSystemEntries(inputPath, "*.mp4", SearchOption.TopDirectoryOnly));

List<ResourceFile> inputFiles = await UploadFilesToContainerAsync(
    blobClient,
    inputContainerName,
    inputFilePaths);
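
If you're more familiar with Python, the file-discovery step above can be sketched as follows. This sketch covers only the search for MP4 files in the top level of a folder and leaves out the upload. The function name is hypothetical, and unlike Directory.GetFileSystemEntries, the sketch skips directories that happen to match the pattern.

```python
from pathlib import Path


def find_input_files(folder, pattern="*.mp4"):
    """Return matching files directly inside folder, sorted by name."""
    # Path.glob with a plain pattern doesn't descend into subfolders.
    return sorted(path for path in Path(folder).glob(pattern) if path.is_file())
```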

For details about uploading files as blobs to a storage account with .NET, see Upload,
download, and list blobs using .NET.

Create a pool of compute nodes


Next, the sample creates a pool of compute nodes in the Batch account with a call to
CreatePoolIfNotExistAsync. This defined method uses the
BatchAccountResource.GetBatchAccountPools().CreateOrUpdateAsync method to set
the number of nodes, VM size, and a pool configuration. Here, a BatchVmConfiguration
object specifies a BatchImageReference to a Windows Server image published in the
Azure Marketplace. Batch supports a wide range of VM images in the Azure
Marketplace, as well as custom VM images.

The number of nodes and VM size are set using defined constants. Batch supports
dedicated nodes and Spot nodes, and you can use either or both in your pools.
Dedicated nodes are reserved for your pool. Spot nodes are offered at a reduced price
from surplus VM capacity in Azure. Spot nodes become unavailable if Azure does not
have enough capacity. By default, the sample creates a pool containing only five Spot
nodes. The pool definition shown in this section sets the VM size to Standard_DS1_v2.

Note

Be sure you check your node quotas. See Batch service quotas and limits for
instructions on how to create a quota request.

The ffmpeg application is deployed to the compute nodes by adding an
ApplicationPackageReference to the pool configuration.

C#

var credential = new DefaultAzureCredential();
ArmClient _armClient = new ArmClient(credential);

var batchAccountIdentifier = ResourceIdentifier.Parse(BatchAccountResourceID);
BatchAccountResource batchAccount = await _armClient.GetBatchAccountResource(batchAccountIdentifier).GetAsync();

BatchAccountPoolCollection collection = batchAccount.GetBatchAccountPools();

// Create the pool only if it doesn't already exist.
if (collection.Exists(poolId) == false)
{
    var poolName = poolId;
    var imageReference = new BatchImageReference()
    {
        Publisher = "MicrosoftWindowsServer",
        Offer = "WindowsServer",
        Sku = "2019-datacenter-smalldisk",
        Version = "latest"
    };
    string nodeAgentSku = "batch.node.windows amd64";

    ArmOperation<BatchAccountPoolResource> armOperation = await batchAccount.GetBatchAccountPools().CreateOrUpdateAsync(
        WaitUntil.Completed, poolName, new BatchAccountPoolData()
        {
            VmSize = "Standard_DS1_v2",
            DeploymentConfiguration = new BatchDeploymentConfiguration()
            {
                VmConfiguration = new BatchVmConfiguration(imageReference, nodeAgentSku)
            },
            ScaleSettings = new BatchAccountPoolScaleSettings()
            {
                FixedScale = new BatchAccountFixedScaleSettings()
                {
                    TargetDedicatedNodes = DedicatedNodeCount,
                    TargetLowPriorityNodes = LowPriorityNodeCount
                }
            },
            Identity = new ManagedServiceIdentity(ManagedServiceIdentityType.UserAssigned)
            {
                UserAssignedIdentities =
                {
                    [new ResourceIdentifier(ManagedIdentityId)] = new Azure.ResourceManager.Models.UserAssignedIdentity(),
                },
            },
            ApplicationPackages =
            {
                new Azure.ResourceManager.Batch.Models.BatchApplicationPackageReference(new ResourceIdentifier(appPackageResourceID))
                {
                    Version = appPackageVersion,
                }
            },
        });
    BatchAccountPoolResource pool = armOperation.Value;
}

Create a job
A Batch job specifies a pool to run tasks on and optional settings such as a priority and
schedule for the work. The sample creates a job with a call to CreateJobAsync . This
defined method uses the BatchClient.CreateJobAsync method to create a job on your
pool.

C#

BatchJobCreateContent batchJobCreateContent = new BatchJobCreateContent(jobId, new BatchPoolInfo { PoolId = poolId });
await batchClient.CreateJobAsync(batchJobCreateContent);

Create tasks
The sample creates tasks in the job with a call to the AddTasksAsync method, which
creates a list of BatchTask objects. Each BatchTask runs ffmpeg to process an input
ResourceFile object using a CommandLine property. ffmpeg was previously installed on
each node when the pool was created. Here, the command line runs ffmpeg to convert
each input MP4 (video) file to an MP3 (audio) file.

The sample creates an OutputFile object for the MP3 file after running the command
line. Each task's output files (one, in this case) are uploaded to a container in the linked
storage account, using the task's OutputFiles property. Note the conditions set on the
outputFile object. An output file from a task is only uploaded to the container after the
task has successfully completed ( OutputFileUploadCondition.TaskSuccess ). See the full
code sample on GitHub for further implementation details.

Then, the sample adds the tasks to the job with the CreateTaskCollectionAsync method,
which queues them to run on the compute nodes.

Replace the executable's file path with the name of the version that you downloaded.
This sample code uses the example ffmpeg-4.3.1-2020-11-08-full_build .

C#

// Create a collection to hold the tasks added to the job:
List<BatchTaskCreateContent> tasks = new List<BatchTaskCreateContent>();

for (int i = 0; i < inputFiles.Count; i++)
{
    // Assign a task ID for each iteration
    string taskId = String.Format("Task{0}", i);

    // Define task command line to convert the video format from MP4 to MP3 using ffmpeg.
    // Note that ffmpeg syntax specifies the format as the file extension of the input file
    // and the output file respectively. In this case inputs are MP4.
    string appPath = String.Format("%AZ_BATCH_APP_PACKAGE_{0}#{1}%", appPackageId, appPackageVersion);
    string inputMediaFile = inputFiles[i].StorageContainerUrl;
    string outputMediaFile = String.Format("{0}{1}",
        System.IO.Path.GetFileNameWithoutExtension(inputMediaFile),
        ".mp3");
    string taskCommandLine = String.Format("cmd /c {0}\\ffmpeg-4.3.1-2020-11-08-full_build\\bin\\ffmpeg.exe -i {1} {2}", appPath, inputMediaFile, outputMediaFile);

    // Create a batch task (with the task ID and command line) and add it to the task list
    BatchTaskCreateContent batchTaskCreateContent = new BatchTaskCreateContent(taskId, taskCommandLine);
    batchTaskCreateContent.ResourceFiles.Add(inputFiles[i]);

    // Task output file will be uploaded to the output container in Storage.
    // TODO: Replace <storage-account-name> with your actual storage account name
    OutputFileBlobContainerDestination outputContainer = new OutputFileBlobContainerDestination("https://<storage-account-name>.blob.core.windows.net/output/" + outputMediaFile)
    {
        IdentityReference = inputFiles[i].IdentityReference,
    };

    OutputFile outputFile = new OutputFile(outputMediaFile,
        new OutputFileDestination() { Container = outputContainer },
        new OutputFileUploadConfig(OutputFileUploadCondition.TaskSuccess));
    batchTaskCreateContent.OutputFiles.Add(outputFile);

    tasks.Add(batchTaskCreateContent);
}

// Call BatchClient.CreateTaskCollectionAsync() to add the tasks as a collection rather than making a
// separate call for each. Bulk task submission helps to ensure efficient underlying API
// calls to the Batch service.
await batchClient.CreateTaskCollectionAsync(jobId, new BatchTaskGroup(tasks));

Clean up resources
After it runs the tasks, the app automatically deletes the input storage container it
created, and gives you the option to delete the Batch pool and job. The BatchClient
provides the DeleteJobAsync method to delete a job and the DeletePoolAsync method to
delete a pool. Both are called if you confirm deletion. Although you're not charged for
jobs and tasks themselves, you are charged for compute nodes. Thus, we recommend that
you allocate pools only as needed. When you delete the pool, all task output on the
nodes is deleted. However, the output files remain in the storage account.

When no longer needed, delete the resource group, Batch account, and storage
account. To do so in the Azure portal, select the resource group for the Batch account
and click Delete resource group.

Next steps
In this tutorial, you learned how to:

Add an application package to your Batch account.
Authenticate with Batch and Storage accounts.
Upload input files to Storage.
Create a pool of compute nodes to run an application.
Create a job and tasks to process input files.
Monitor task execution.
Retrieve output files.

For more examples of using the .NET API to schedule and process Batch workloads, see
the Batch C# samples on GitHub.



Tutorial: Run a parallel workload with
Azure Batch using the Python API
Article • 03/01/2024

Use Azure Batch to run large-scale parallel and high-performance computing (HPC)
batch jobs efficiently in Azure. This tutorial walks through a Python example of running a
parallel workload using Batch. You learn a common Batch application workflow and how
to interact programmatically with Batch and Storage resources.

Authenticate with Batch and Storage accounts.
Upload input files to Storage.
Create a pool of compute nodes to run an application.
Create a job and tasks to process input files.
Monitor task execution.
Retrieve output files.

In this tutorial, you convert MP4 media files to MP3 format, in parallel, by using the
ffmpeg open-source tool.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Prerequisites
Python version 3.8 or later

pip package manager

An Azure Batch account and a linked Azure Storage account. To create these
accounts, see the Batch quickstart guides for Azure portal or Azure CLI.

Sign in to Azure
Sign in to the Azure portal .

Get account credentials


For this example, you need to provide credentials for your Batch and Storage accounts.
A straightforward way to get the necessary credentials is in the Azure portal. (You can
also get these credentials using the Azure APIs or command-line tools.)
1. Select All services > Batch accounts, and then select the name of your Batch
account.

2. To see the Batch credentials, select Keys. Copy the values of Batch account, URL,
and Primary access key to a text editor.

3. To see the Storage account name and keys, select Storage account. Copy the
values of Storage account name and Key1 to a text editor.

Download and run the sample app

Download the sample app


Download or clone the sample app from GitHub. To clone the sample app repo with a
Git client, use the following command:

Bash

git clone https://github.com/Azure-Samples/batch-python-ffmpeg-tutorial.git

Navigate to the directory that contains the file batch_python_tutorial_ffmpeg.py.

In your Python environment, install the required packages using pip .

Bash

pip install -r requirements.txt

Use a code editor to open the file config.py. Update the Batch and storage account
credential strings with the values unique to your accounts. For example:

Python

_BATCH_ACCOUNT_NAME = 'yourbatchaccount'
_BATCH_ACCOUNT_KEY = 'xxxxxxxxxxxxxxxxE+yXrRvJAqT9BlXwwo1CwF+SwAYOxxxxxxxxxxxxxxxx43pXi/gdiATkvbpLRl3x14pcEQ=='
_BATCH_ACCOUNT_URL = 'https://yourbatchaccount.yourbatchregion.batch.azure.com'
_STORAGE_ACCOUNT_NAME = 'mystorageaccount'
_STORAGE_ACCOUNT_KEY = 'xxxxxxxxxxxxxxxxy4/xxxxxxxxxxxxxxxxfwpbIC5aAWA8wDu+AFXZB827Mt9lybZB1nUcQbQiUrkPtilK5BQ=='
Run the app
To run the script:

Bash

python batch_python_tutorial_ffmpeg.py

When you run the sample application, the console output is similar to the following.
During execution, you experience a pause at Monitoring all tasks for 'Completed'
state, timeout in 00:30:00... while the pool's compute nodes are started.

Sample start: 11/28/2018 3:20:21 PM

Container [input] created.
Container [output] created.
Uploading file LowPriVMs-1.mp4 to container [input]...
Uploading file LowPriVMs-2.mp4 to container [input]...
Uploading file LowPriVMs-3.mp4 to container [input]...
Uploading file LowPriVMs-4.mp4 to container [input]...
Uploading file LowPriVMs-5.mp4 to container [input]...
Creating pool [LinuxFFmpegPool]...
Creating job [LinuxFFmpegJob]...
Adding 5 tasks to job [LinuxFFmpegJob]...
Monitoring all tasks for 'Completed' state, timeout in 00:30:00...
Success! All tasks completed successfully within the specified timeout period.
Deleting container [input]....

Sample end: 11/28/2018 3:29:36 PM
Elapsed time: 00:09:14.3418742

Go to your Batch account in the Azure portal to monitor the pool, compute nodes, job,
and tasks. For example, to see a heat map of the compute nodes in your pool, select
Pools > LinuxFFmpegPool.

When tasks are running, the heat map shows which compute nodes are busy.

Typical execution time is approximately 5 minutes when you run the application in its
default configuration. Pool creation takes the most time.

Retrieve output files


You can use the Azure portal to download the output MP3 files generated by the
ffmpeg tasks.

1. Click All services > Storage accounts, and then click the name of your storage
account.
2. Click Blobs > output.
3. Right-click one of the output MP3 files and then click Download. Follow the
prompts in your browser to open or save the file.

Although not shown in this sample, you can also download the files programmatically
from the compute nodes or from the storage container.

Review the code
The following sections break down the sample application into the steps that it performs
to process a workload in the Batch service. Refer to the Python code while you read the
rest of this article, since not every line of code in the sample is discussed.

Authenticate Blob and Batch clients


To interact with a storage account, the app uses the azure-storage-blob package to
create a BlockBlobService object.

Python

blob_client = azureblob.BlockBlobService(
    account_name=_STORAGE_ACCOUNT_NAME,
    account_key=_STORAGE_ACCOUNT_KEY)

The app creates a BatchServiceClient object to create and manage pools, jobs, and tasks
in the Batch service. The Batch client in the sample uses shared key authentication. Batch
also supports authentication through Microsoft Entra ID, to authenticate individual users
or an unattended application.

Python

credentials = batchauth.SharedKeyCredentials(_BATCH_ACCOUNT_NAME,
                                             _BATCH_ACCOUNT_KEY)

batch_client = batch.BatchServiceClient(
    credentials,
    base_url=_BATCH_ACCOUNT_URL)

Upload input files


The app uses the blob_client reference to create a storage container for the input MP4
files and a container for the task output. Then, it calls the upload_file_to_container
function to upload MP4 files in the local InputFiles directory to the container. The files in
storage are defined as Batch ResourceFile objects that Batch can later download to
compute nodes.

Python

blob_client.create_container(input_container_name, fail_on_exist=False)
blob_client.create_container(output_container_name, fail_on_exist=False)
input_file_paths = []

# Collect the absolute path of every MP4 file under the local InputFiles directory.
for folder, subs, files in os.walk(os.path.join(sys.path[0], './InputFiles/')):
    for filename in files:
        if filename.endswith(".mp4"):
            input_file_paths.append(os.path.abspath(
                os.path.join(folder, filename)))

# Upload the input files. This is the collection of files that are to be
# processed by the tasks.
input_files = [
    upload_file_to_container(blob_client, input_container_name, file_path)
    for file_path in input_file_paths]
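
The file-discovery step above can also be written with pathlib. The following is a minimal sketch, not the sample's code. The find_input_files helper is a hypothetical name, and it only collects paths; uploading is left to the sample's own function.

```python
from pathlib import Path


def find_input_files(root, suffix=".mp4"):
    """Return sorted absolute paths of files under root whose names end with suffix."""
    return sorted(
        str(path.resolve())
        for path in Path(root).rglob("*")
        if path.is_file() and path.name.endswith(suffix)
    )
```

Sorting the result makes the task numbering reproducible between runs, which os.walk alone does not guarantee.
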

Create a pool of compute nodes


Next, the sample creates a pool of compute nodes in the Batch account with a call to
create_pool. This defined function uses the Batch PoolAddParameter class to set the
number of nodes, VM size, and a pool configuration. Here, a
VirtualMachineConfiguration object specifies an ImageReference to an Ubuntu Server
20.04 LTS image published in the Azure Marketplace. Batch supports a wide range of VM
images in the Azure Marketplace, as well as custom VM images.

The number of nodes and VM size are set using defined constants. Batch supports
dedicated nodes and Spot nodes, and you can use either or both in your pools.
Dedicated nodes are reserved for your pool. Spot nodes are offered at a reduced price
from surplus VM capacity in Azure. Spot nodes become unavailable if Azure doesn't
have enough capacity. The sample by default creates a pool containing only five Spot
nodes in size Standard_A1_v2.

In addition to physical node properties, this pool configuration includes a StartTask
object. The StartTask executes on each node as that node joins the pool, and each time a
node is restarted. In this example, the StartTask runs Bash shell commands to install the
ffmpeg package and dependencies on the nodes.

The pool.add method submits the pool to the Batch service.

Python

new_pool = batch.models.PoolAddParameter(
    id=pool_id,
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(
            publisher="Canonical",
            offer="UbuntuServer",
            sku="20.04-LTS",
            version="latest"
        ),
        node_agent_sku_id="batch.node.ubuntu 20.04"),
    vm_size=_POOL_VM_SIZE,
    target_dedicated_nodes=_DEDICATED_POOL_NODE_COUNT,
    target_low_priority_nodes=_LOW_PRIORITY_POOL_NODE_COUNT,
    start_task=batchmodels.StartTask(
        command_line="/bin/bash -c \"apt-get update && apt-get install -y ffmpeg\"",
        wait_for_success=True,
        user_identity=batchmodels.UserIdentity(
            auto_user=batchmodels.AutoUserSpecification(
                scope=batchmodels.AutoUserScope.pool,
                elevation_level=batchmodels.ElevationLevel.admin)),
    )
)
batch_service_client.pool.add(new_pool)
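
The start task's command line is a set of shell commands joined with && and wrapped in /bin/bash -c. The following is a minimal sketch of building that string from a list. The wrap_in_bash helper is a hypothetical name, not part of the sample or the Batch SDK. It assumes the commands contain no double quotes and rejects any that do, rather than guessing at escaping rules.

```python
def wrap_in_bash(commands):
    """Join shell commands with '&&' and wrap them for /bin/bash -c."""
    for command in commands:
        if '"' in command:
            # Quoting rules are out of scope for this sketch.
            raise ValueError("commands with double quotes need manual escaping")
    return '/bin/bash -c "{}"'.format(" && ".join(commands))


start_task_command = wrap_in_bash(["apt-get update", "apt-get install -y ffmpeg"])
```

With the two commands shown, start_task_command matches the command_line value used in the pool definition above.
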

Create a job
A Batch job specifies a pool to run tasks on and optional settings such as a priority and
schedule for the work. The sample creates a job with a call to create_job. This defined
function uses the JobAddParameter class to create a job on your pool. The job.add
method submits the job to the Batch service. Initially the job has no tasks.

Python

job = batch.models.JobAddParameter(
    id=job_id,
    pool_info=batch.models.PoolInformation(pool_id=pool_id))

batch_service_client.job.add(job)

Create tasks
The app creates tasks in the job with a call to add_tasks . This defined function creates a
list of task objects using the TaskAddParameter class. Each task runs ffmpeg to process
an input resource_files object using a command_line parameter. ffmpeg was previously
installed on each node when the pool was created. Here, the command line runs ffmpeg
to convert each input MP4 (video) file to an MP3 (audio) file.

The sample creates an OutputFile object for the MP3 file after running the command
line. Each task's output files (one, in this case) are uploaded to a container in the linked
storage account, using the task's output_files property.

Then, the app adds tasks to the job with the task.add_collection method, which queues
them to run on the compute nodes.

Python

tasks = list()

for idx, input_file in enumerate(input_files):
    input_file_path = input_file.file_path
    output_file_path = "".join((input_file_path).split('.')[:-1]) + '.mp3'
    command = "/bin/bash -c \"ffmpeg -i {} {} \"".format(
        input_file_path, output_file_path)
    tasks.append(batch.models.TaskAddParameter(
        id='Task{}'.format(idx),
        command_line=command,
        resource_files=[input_file],
        output_files=[batchmodels.OutputFile(
            file_pattern=output_file_path,
            destination=batchmodels.OutputFileDestination(
                container=batchmodels.OutputFileBlobContainerDestination(
                    container_url=output_container_sas_url)),
            upload_options=batchmodels.OutputFileUploadOptions(
                upload_condition=batchmodels.OutputFileUploadCondition.task_success))]
    )
    )
batch_service_client.task.add_collection(job_id, tasks)
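
The sample derives the output name by splitting the input name on every dot and joining all but the last piece. That works for names like LowPriVMs-1.mp4, but it drops any other dots in the name (clip.v2.mp4 would become clipv2.mp3). The following is a minimal sketch of an alternative that uses os.path.splitext. The helper names are hypothetical and aren't part of the sample.

```python
import os


def to_mp3_name(input_file_path):
    """Swap the final extension for .mp3, keeping any other dots in the name."""
    stem, _extension = os.path.splitext(input_file_path)
    return stem + ".mp3"


def build_ffmpeg_command(input_file_path):
    """Build a task command line in the same shape as the sample's."""
    output_file_path = to_mp3_name(input_file_path)
    return '/bin/bash -c "ffmpeg -i {} {}"'.format(input_file_path, output_file_path)
```

The same output name would then be used for the OutputFile file_pattern, so that the uploaded file matches what ffmpeg wrote.
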

Monitor tasks
When tasks are added to a job, Batch automatically queues and schedules them for
execution on compute nodes in the associated pool. Based on the settings you specify,
Batch handles all task queuing, scheduling, retrying, and other task administration
duties.

There are many approaches to monitoring task execution. The
wait_for_tasks_to_complete function in this example uses the TaskState object to
monitor tasks for a certain state, in this case the completed state, within a time limit.

Python

while datetime.datetime.now() < timeout_expiration:
    print('.', end='')
    sys.stdout.flush()
    tasks = batch_service_client.task.list(job_id)

    incomplete_tasks = [task for task in tasks if
                        task.state != batchmodels.TaskState.completed]
    if not incomplete_tasks:
        print()
        return True
    else:
        time.sleep(1)
...
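
The polling pattern in this loop can be isolated from the Batch client so it is easier to test. The following is a minimal sketch, not the sample's implementation. The wait_until helper and its parameters are hypothetical, and it uses a monotonic clock instead of datetime.now so that system clock changes don't affect the timeout.

```python
import time


def wait_until(all_done, timeout_seconds, poll_seconds=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll all_done() until it returns True or the timeout elapses.

    Returns True if all_done() succeeded in time, otherwise False.
    """
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if all_done():
            return True
        sleep(poll_seconds)
    return False
```

In the tutorial's setting, all_done would wrap the task listing and state check shown above. Here it is any callable that returns a Boolean.
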

Clean up resources
After it runs the tasks, the app automatically deletes the input storage container it
created, and gives you the option to delete the Batch pool and job. The BatchClient's
JobOperations and PoolOperations classes both have delete methods, which are called if
you confirm deletion. Although you're not charged for jobs and tasks themselves, you
are charged for compute nodes. Thus, we recommend that you allocate pools only as
needed. When you delete the pool, all task output on the nodes is deleted. However,
the input and output files remain in the storage account.

When no longer needed, delete the resource group, Batch account, and storage
account. To do so in the Azure portal, select the resource group for the Batch account
and choose Delete resource group.

Next steps
In this tutorial, you learned how to:

Authenticate with Batch and Storage accounts.
Upload input files to Storage.
Create a pool of compute nodes to run an application.
Create a job and tasks to process input files.
Monitor task execution.
Retrieve output files.

For more examples of using the Python API to schedule and process Batch workloads,
see the Batch Python samples on GitHub.

Tutorial: Trigger a Batch job using Azure
Functions
Article • 05/04/2023

In this tutorial, you learn how to trigger a Batch job using Azure Functions. This article
walks through an example that takes documents added to an Azure Storage blob
container and applies optical character recognition (OCR) by using Azure Batch. To
streamline the OCR processing, this example configures an Azure function that runs a
Batch OCR job each time a file is added to the blob container. You learn how to:

Use the Azure portal to create pools and jobs.


Create blob containers and a shared access signature (SAS).
Create a blob-triggered Azure Function.
Upload input files to Storage.
Monitor task execution.
Retrieve output files.

Prerequisites
An Azure account with an active subscription. Create an account for free .
An Azure Batch account and a linked Azure Storage account. For more information
on how to create and link accounts, see Create a Batch account.

Sign in to Azure
Sign in to the Azure portal .

Create a Batch pool and Batch job using the


Azure portal
In this section, you use the Azure portal to create the Batch pool and Batch job that runs
OCR tasks.

Create a pool
1. Sign in to the Azure portal using your Azure credentials.
2. Create a pool by selecting Pools on the left side navigation, and then select the
Add button above the search form.

a. Enter a Pool ID. This example names the pool ocr-pool .


b. Select canonical as the Publisher.
c. Select 0001-com-ubuntu-server-jammy as the Offer.
d. Select 22_04-lts as the Sku.
e. Choose Standard_F2s_v2 - 2 vCPUs, 2 GB Memory as the VM size in the Node
Size section.
f. Set the Mode in the Scale section to Fixed, and enter 3 for the Target dedicated
nodes.
g. Set Start task to Enabled, and enter the command /bin/bash -c
"sudo update-locale LC_ALL=C.UTF-8 LANG=C.UTF-8; sudo apt-get update; sudo
apt-get -y install ocrmypdf" in Command line. Be sure to set the Elevation
level as Pool autouser, Admin, which allows start tasks to include commands
with sudo.
h. Select OK.

Create a job
1. Create a job on the pool by selecting Jobs in the left side navigation, and then
choose the Add button above the search form.
a. Enter a Job ID. This example uses ocr-job .
b. Select ocr-pool for Current pool, or whatever name you chose for your pool.
c. Select OK.

Create blob containers


Here you create blob containers that store your input and output files for the OCR Batch
job. In this example, the input container is named input and is where all documents
without OCR are initially uploaded for processing. The output container is named
output and is where the Batch job writes processed documents with OCR.

1. Search for and select Storage accounts in the Azure portal.

2. Choose your storage account linked to your Batch account.

3. Select Containers from the left side navigation, and create two blob containers
(one for input files, one for output files) by following the steps at Create a blob
container.

4. Create a shared access signature for your output container by selecting the output
container, and on the Shared access tokens page, select Write in the Permissions
drop down. No other permissions are necessary.

5. Select Generate SAS token and URL, and copy the Blob SAS URL to use later for
your function.
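
A Blob SAS URL carries its grant in the query string, where the sp parameter lists the signed permissions. The following is a minimal sketch for checking that a copied URL grants write access before you paste it into the function. The sas_permissions helper is a hypothetical name, and the example URL uses a made-up account name and signature.

```python
from urllib.parse import parse_qs, urlsplit


def sas_permissions(sas_url):
    """Return the permission letters from the 'sp' query parameter, or ''."""
    query = parse_qs(urlsplit(sas_url).query)
    return query.get("sp", [""])[0]


# Hypothetical URL for illustration only; the signature is not real.
example_sas_url = (
    "https://mystorageaccount.blob.core.windows.net/output"
    "?sp=w&se=2030-01-01T00:00:00Z&sv=2022-11-02&sr=c&sig=FAKE"
)
has_write_access = "w" in sas_permissions(example_sas_url)
```

This only inspects the URL text. It doesn't confirm that the token is valid or unexpired.
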
Create an Azure Function
In this section, you create the Azure Function that triggers the OCR Batch job whenever
a file is uploaded to your input container.

1. Follow the steps in Create a function triggered by Azure Blob storage to create a
function.
a. For runtime stack, choose .NET. This example function uses C# to take
advantage of the Batch .NET SDK.
b. On the Storage page, use the same storage account that you linked to your
Batch account.
c. Select Review + Create > Create.

The following screenshot shows the Create Function App page on the Basics tab using
example information.

2. In your function, select Functions from the left side navigation and select Create.

3. In the Create function pane, select Azure Blob Storage trigger.

4. Enter a name for your function in New Function. In this example, the name is
OcrTrigger. Enter the path as input/{name}, where input is the name of your Blob
container.
5. Select Create.

6. Once the blob-triggered function is created, select Code + Test. Use the run.csx
and function.proj from GitHub in the Function. function.proj doesn't exist by
default, so select the Upload button to upload it into your development
workspace.

run.csx is run when a new blob is added to your input blob container.
function.proj lists the external libraries in your Function code, for example,
the Batch .NET SDK.


7. Change the placeholder values of the variables in the Run() function of the
run.csx file to reflect your Batch and storage credentials. You can find your Batch
and storage account credentials in the Azure portal in the Keys section of your
Batch and storage account.

Trigger the function and retrieve results


Upload any or all of the scanned files from the input_files directory on GitHub to your
input container.

You can test your function from Azure portal on the Code + Test page of your function.

1. Select Test/run on the Code + Test page.


2. Enter the path for your input container in Body on the Input tab.
3. Select Run.

After a few seconds, the file with OCR applied is added to the output container. Log
information outputs to the bottom window. The file is then visible and retrievable on
Storage Explorer.

Alternatively, you can find the log information on the Monitor page:

Console

2019-05-29T19:45:25.846 [Information] Creating job...
2019-05-29T19:45:25.847 [Information] Accessing input container <inputContainer>...
2019-05-29T19:45:25.847 [Information] Adding <fileName> as a resource file...
2019-05-29T19:45:25.848 [Information] Name of output text file: <outputTxtFile>
2019-05-29T19:45:25.848 [Information] Name of output PDF file: <outputPdfFile>
2019-05-29T19:45:26.200 [Information] Adding OCR task <taskID> for <fileName> <size of fileName>...
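
If you copy log lines like these out of the portal, a small parser can split each one into a timestamp, a level, and a message. The following is a minimal sketch that assumes every line follows the timestamp [Level] message layout shown above. The parse_log_line helper is a hypothetical name.

```python
import re
from datetime import datetime

# Assumed layout: "<ISO timestamp> [<Level>] <message>", as in the sample output.
LOG_LINE = re.compile(r"^(?P<stamp>\S+) \[(?P<level>\w+)\] (?P<message>.*)$")


def parse_log_line(line):
    """Return (timestamp, level, message), or None if the line doesn't match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%dT%H:%M:%S.%f")
    return stamp, match.group("level"), match.group("message")
```

Subtracting the first parsed timestamp from the last gives a rough measure of how long the function took to submit the task.
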

To download the output files to your local machine, go to the output container in your
storage account. Select more options on the file you want, and then select Download.

Tip

The downloaded files are searchable if opened in a PDF reader.


Clean up resources
You're charged for the pool while the nodes are running, even if no jobs are scheduled.
When you no longer need the pool, delete it with the following steps:

1. From the Pools page of your Batch account, select more options on your pool.
2. Select Delete.

When you delete the pool, all task output on the nodes is deleted. However, the output
files remain in the storage account. When no longer needed, you can also delete the
Batch account and the storage account.

Next steps
For more examples of using the .NET API to schedule and process Batch workloads, see
the samples on GitHub.

Batch C# samples

Tutorial: Run a Batch job through Data
Factory with Batch Explorer, Storage
Explorer, and Python
Article • 04/02/2025

This tutorial walks you through creating and running an Azure Data Factory pipeline that
runs an Azure Batch workload. A Python script runs on the Batch nodes to get comma-
separated value (CSV) input from an Azure Blob Storage container, manipulate the data,
and write the output to a different storage container. You use Batch Explorer to create a
Batch pool and nodes, and Azure Storage Explorer to work with storage containers and
files.

In this tutorial, you learn how to:

Use Batch Explorer to create a Batch pool and nodes.
Use Storage Explorer to create storage containers and upload input files.
Develop a Python script to manipulate input data and produce output.
Create a Data Factory pipeline that runs the Batch workload.
Use Batch Explorer to look at the output log files.

Prerequisites
An Azure account with an active subscription. If you don't have one, create a free
account .
A Batch account with a linked Azure Storage account. You can create the accounts
by using any of the following methods: Azure portal | Azure CLI | Bicep | ARM
template | Terraform.
A Data Factory instance. To create the data factory, follow the instructions in Create
a data factory.
Batch Explorer downloaded and installed.
Storage Explorer downloaded and installed.
Python 3.8 or above, with the azure-storage-blob package installed by using
pip.

The iris.csv input dataset downloaded from GitHub.

Use Batch Explorer to create a Batch pool and


nodes
Use Batch Explorer to create a pool of compute nodes to run your workload.

1. Sign in to Batch Explorer with your Azure credentials.

2. Select your Batch account.

3. Select Pools on the left sidebar, and then select the + icon to add a pool.

4. Complete the Add a pool to the account form as follows:

Under ID, enter custom-activity-pool.


Under Dedicated nodes, enter 2.
For Select an operating system configuration, select the Data science tab,
and then select Dsvm Win 2019.
For Choose a virtual machine size, select Standard_F2s_v2.
For Start Task, select Add a start task. On the start task screen, under
Command line, enter cmd /c "pip install azure-storage-blob pandas", and
then select Select. This command installs the azure-storage-blob and pandas
packages on each node as it starts up.

5. Select Save and close.

Use Storage Explorer to create blob containers


Use Storage Explorer to create blob containers to store input and output files, and then
upload your input files.

1. Sign in to Storage Explorer with your Azure credentials.


2. In the left sidebar, locate and expand the storage account that's linked to your
Batch account.
3. Right-click Blob Containers, and select Create Blob Container, or select Create
Blob Container from Actions at the bottom of the sidebar.
4. Enter input in the entry field.
5. Create another blob container named output.
6. Select the input container, and then select Upload > Upload files in the right pane.
7. On the Upload files screen, under Selected files, select the ellipsis ... next to the
entry field.
8. Browse to the location of your downloaded iris.csv file, select Open, and then
select Upload.

Develop a Python script


The following Python script loads the iris.csv dataset file from your Storage Explorer
input container, manipulates the data, and saves the results to the output container.

The script needs to use the connection string for the Azure Storage account that's linked
to your Batch account. To get the connection string:

1. In the Azure portal , search for and select the name of the storage account that's
linked to your Batch account.
2. On the page for the storage account, select Access keys from the left navigation
under Security + networking.
3. Under key1, select Show next to Connection string, and then select the Copy icon
to copy the connection string.
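
A storage connection string is a semicolon-separated list of Key=Value pairs, such as AccountName and AccountKey. The following is a minimal sketch for sanity-checking a copied string before you use it. The parse_connection_string helper is a hypothetical name, and the example value uses a made-up account name and key.

```python
def parse_connection_string(connection_string):
    """Split 'Key=Value;Key=Value' into a dict.

    Each pair is split on the first '=' only, because account keys are
    base64 text that can end with '=' padding.
    """
    settings = {}
    for part in connection_string.split(";"):
        if not part.strip():
            continue
        key, _, value = part.partition("=")
        settings[key.strip()] = value
    return settings


# Hypothetical value for illustration only; the key is not real.
example = ("DefaultEndpointsProtocol=https;AccountName=mystorageaccount;"
           "AccountKey=abc123==;EndpointSuffix=core.windows.net")
account_name = parse_connection_string(example)["AccountName"]
```

Checking that AccountName matches the storage account linked to your Batch account is a quick way to catch a string copied from the wrong resource.
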

Paste the connection string into the following script, replacing the <storage-account-
connection-string> placeholder. Save the script as a file named main.py.

Important

Exposing account keys in the app source isn't recommended for Production usage.
You should restrict access to credentials and refer to them in your code by using
variables or a configuration file. It's best to store Batch and Storage account keys in
Azure Key Vault.

Python

# Load libraries
from azure.storage.blob import BlobClient
from azure.storage.blob import BlobServiceClient
import pandas as pd
import io

# Define parameters
connectionString = "<storage-account-connection-string>"
# Input names follow the container and file used earlier in this tutorial.
inputContainerName = "input"
inputBlobName = "iris.csv"
containerName = "output"
outputBlobName = "iris_setosa.csv"

# Establish connection with the output blob in the storage account
blob = BlobClient.from_connection_string(conn_str=connectionString,
    container_name=containerName, blob_name=outputBlobName)

# Initialize the BlobServiceClient. This initializes a connection to
# Azure Blob Storage so the script can download the content of the 'iris.csv'
# file and load it into a pandas DataFrame for further processing.
blob_service_client = BlobServiceClient.from_connection_string(conn_str=connectionString)
blob_client = blob_service_client.get_blob_client(container=inputContainerName,
    blob=inputBlobName)

# Download the blob content
blob_data = blob_client.download_blob().readall()

# Load iris dataset from the downloaded bytes
# (alternative: df = pd.read_csv("iris.csv") reads a local copy on the task node)
df = pd.read_csv(io.BytesIO(blob_data))

# Take a subset of the records
df = df[df['Species'] == "setosa"]

# Save the subset of the iris dataframe locally in the task node
df.to_csv(outputBlobName, index = False)

# Upload the local result file to the output container
with open(outputBlobName, "rb") as data:
    blob.upload_blob(data, overwrite=True)

For more information on working with Azure Blob Storage, refer to the Azure Blob
Storage documentation.

Run the script locally to test and validate functionality.

Bash

python main.py

The script should produce an output file named iris_setosa.csv that contains only the
data records that have Species = setosa. After you verify that it works correctly, upload
the main.py script file to your Storage Explorer input container.
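
To check the filtering logic without pandas or a storage account, the same row selection can be reproduced with the standard csv module. The following is a minimal sketch, not part of main.py. The filter_rows helper is a hypothetical name, and apart from Species the column names in any test data are assumptions.

```python
import csv
import io


def filter_rows(csv_text, column, value):
    """Return CSV text with the header plus only rows where column == value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[column] == value:
            writer.writerow(row)
    return output.getvalue()
```

Calling filter_rows with the contents of iris.csv, the column Species, and the value setosa should yield the same rows that the script writes to iris_setosa.csv.
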

Set up a Data Factory pipeline


Create and validate a Data Factory pipeline that uses your Python script.

Get account information


The Data Factory pipeline uses your Batch and Storage account names, account key
values, and Batch account endpoint. To get this information from the Azure portal :

1. From the Azure Search bar, search for and select your Batch account name.

2. On your Batch account page, select Keys from the left navigation.

3. On the Keys page, copy the following values:

Batch account
Account endpoint
Primary access key
Storage account name
Key1

Create and run the pipeline


1. If Azure Data Factory Studio isn't already running, select Launch studio on your
Data Factory page in the Azure portal.

2. In Data Factory Studio, select the Author pencil icon in the left navigation.

3. Under Factory Resources, select the + icon, and then select Pipeline.

4. In the Properties pane on the right, change the name of the pipeline to Run
Python.

5. In the Activities pane, expand Batch Service, and drag the Custom activity to the
pipeline designer surface.

6. Below the designer canvas, on the General tab, enter testPipeline under Name.
7. Select the Azure Batch tab, and then select New.

8. Complete the New linked service form as follows:

Name: Enter a name for the linked service, such as AzureBatch1.


Access key: Enter the primary access key you copied from your Batch
account.
Account name: Enter your Batch account name.
Batch URL: Enter the account endpoint you copied from your Batch account,
such as https://batchdotnet.eastus.batch.azure.com.
Pool name: Enter custom-activity-pool, the pool you created in Batch
Explorer.
Storage account linked service name: Select New. On the next screen, enter
a Name for the linked storage service, such as AzureBlobStorage1, select your
Azure subscription and linked storage account, and then select Create.

9. At the bottom of the Batch New linked service screen, select Test connection.
When the connection is successful, select Create.

10. Select the Settings tab, and enter or select the following settings:

Command: Enter cmd /C python main.py.


Resource linked service: Select the linked storage service you created, such
as AzureBlobStorage1, and test the connection to make sure it's successful.
Folder path: Select the folder icon, and then select the input container and
select OK. The files from this folder download from the container to the pool
nodes before the Python script runs.

11. Select Validate on the pipeline toolbar to validate the pipeline.

12. Select Debug to test the pipeline and ensure it works correctly.

13. Select Publish all to publish the pipeline.

14. Select Add trigger, and then select Trigger now to run the pipeline, or New/Edit to
schedule it.

Use Batch Explorer to view log files


If running your pipeline produces warnings or errors, you can use Batch Explorer to look
at the stdout.txt and stderr.txt output files for more information.

1. In Batch Explorer, select Jobs from the left sidebar.


2. Select the adfv2-custom-activity-pool job.
3. Select a task that had a failure exit code.
4. View the stdout.txt and stderr.txt files to investigate and diagnose your problem.

Clean up resources
Batch accounts, jobs, and tasks are free, but compute nodes incur charges even when
they're not running jobs. It's best to allocate node pools only as needed, and delete the
pools when you're done with them. Deleting pools deletes all task output on the nodes,
and the nodes themselves.

Input and output files remain in the storage account and can incur charges. When you
no longer need the files, you can delete the files or containers. When you no longer
need your Batch account or linked storage account, you can delete them.

Next steps
In this tutorial, you learned how to use a Python script with Batch Explorer, Storage
Explorer, and Data Factory to run a Batch workload. For more information about Data
Factory, see What is Azure Data Factory?

Note: The author created this article with assistance from AI.



CLI example: Create a Batch account in Batch service mode
Article • 04/02/2025

This script creates an Azure Batch account in Batch service mode and shows how to
query or update various properties of the account. When you create a Batch account in
the default Batch service mode, its compute nodes are assigned internally by the Batch
service. Allocated compute nodes are subject to a separate vCPU (core) quota and the
account can be authenticated either via shared key credentials or a Microsoft Entra
token.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Get
started with Azure Cloud Shell.

If you prefer to run CLI reference commands locally, install the Azure CLI. If you're
running on Windows or macOS, consider running Azure CLI in a Docker container.
For more information, see How to run the Azure CLI in a Docker container.

If you're using a local installation, sign in to the Azure CLI by using the az login
command. To finish the authentication process, follow the steps displayed in
your terminal. For other sign-in options, see Authenticate to Azure using Azure
CLI.

When you're prompted, install the Azure CLI extension on first use. For more
information about extensions, see Use and manage extensions with the Azure
CLI.

Run az version to find the version and dependent libraries that are installed. To
upgrade to the latest version, run az upgrade.

Sample script
Launch Azure Cloud Shell
The Azure Cloud Shell is a free interactive shell that you can use to run the steps in this
article. It has common Azure tools preinstalled and configured to use with your account.

To open the Cloud Shell, just select Try it from the upper right corner of a code block.
You can also launch Cloud Shell in a separate browser tab by going to
https://shell.azure.com .

When Cloud Shell opens, verify that Bash is selected for your environment. Subsequent
sessions will use Azure CLI in a Bash environment. Select Copy to copy a block of
code, paste it into Cloud Shell, and press Enter to run it.

Sign in to Azure
Cloud Shell is automatically authenticated under the account you initially signed in
with. Use the following script to sign in using a different subscription, replacing
subscriptionId with your Azure subscription ID.


Azure CLI

subscription="subscriptionId" # Set Azure subscription ID here

az account set -s $subscription # ...or use 'az login'

For more information, see set active subscription or log in interactively.

Run the script


Azure CLI

# Create a Batch account in Batch service mode

# Variable block
let "randomIdentifier=$RANDOM*$RANDOM"
location="East US"
[[ "$RESOURCE_GROUP" == '' ]] && resourceGroup="msdocs-batch-rg-$randomIdentifier" || resourceGroup="${RESOURCE_GROUP}"
tag="create-account"
batchAccount="msdocsbatch$randomIdentifier"
storageAccount="msdocsstorage$randomIdentifier"

# Create a resource group
echo "Creating $resourceGroup in $location..."
az group create --name $resourceGroup --location "$location" --tag $tag

# Create a Batch account
echo "Creating $batchAccount"
az batch account create --resource-group $resourceGroup --name $batchAccount --location "$location"

# Display the details of the created account.
az batch account show --resource-group $resourceGroup --name $batchAccount

# Add a storage account reference to the Batch account for use as 'auto-storage'
# for applications. Start by creating the storage account.
echo "Creating $storageAccount"
az storage account create --resource-group $resourceGroup --name $storageAccount --location "$location" --sku Standard_LRS

# Update the Batch account with either the name (if they exist in
# the same resource group) or the full resource ID of the storage account.
echo "Adding $storageAccount to $batchAccount"
az batch account set --resource-group $resourceGroup --name $batchAccount --storage-account $storageAccount

# View the access keys to the Batch account for future client authentication.
az batch account keys list --resource-group $resourceGroup --name $batchAccount

# Authenticate against the account directly for further CLI interaction.
az batch account login --resource-group $resourceGroup --name $batchAccount --shared-key-auth
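
The variable block in the script builds resource names by appending the product of two $RANDOM values to a fixed prefix, and reuses RESOURCE_GROUP when that variable is already set. The Python sketch below mirrors that logic so the naming can be checked offline. It is an illustration only: the helper names make_names and is_valid_account_name are hypothetical, and the validity check encodes the storage account naming rule (3 to 24 characters, lowercase letters and digits).

```python
import os
import random
import re


def make_names(env=os.environ, rng=random):
    """Mirror the script's variable block: prefix + product of two 15-bit random numbers."""
    # Bash's $RANDOM yields an integer from 0 to 32767
    identifier = rng.randint(0, 32767) * rng.randint(0, 32767)
    return {
        # Use RESOURCE_GROUP if it is set and non-empty, otherwise generate a name
        "resourceGroup": env.get("RESOURCE_GROUP") or f"msdocs-batch-rg-{identifier}",
        "batchAccount": f"msdocsbatch{identifier}",
        "storageAccount": f"msdocsstorage{identifier}",
    }


def is_valid_account_name(name: str) -> bool:
    """Check the storage account rule: 3-24 characters, lowercase letters and digits."""
    return re.fullmatch(r"[a-z0-9]{3,24}", name) is not None


names = make_names(env={})
print(names, is_valid_account_name(names["storageAccount"]))
```

Because the largest possible identifier has 10 digits, the generated storage account name stays within the 24-character limit.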

Clean up resources
Unless you have an ongoing need for these resources, use the az group delete command
to remove the resource group and all resources associated with it. Some of these
resources may take a while to create, as well as to delete.

Azure CLI

az group delete --name $resourceGroup

Sample reference
This script uses the following commands. Each command in the table links to command-
specific documentation.

Command                      Notes
az group create              Creates a resource group in which all resources are stored.
az batch account create      Creates the Batch account.
az storage account create    Creates a storage account.
az batch account set         Updates properties of the Batch account.
az batch account show        Retrieves details of the specified Batch account.
az batch account keys list   Retrieves the access keys of the specified Batch account.
az batch account login       Authenticates against the specified Batch account for further CLI interaction.
az group delete              Deletes a resource group including all nested resources.

Next steps
For more information on the Azure CLI, see Azure CLI documentation.



CLI example: Create a Batch account in user subscription mode
Article • 06/24/2024

This script creates an Azure Batch account in user subscription mode. An account that
allocates compute nodes into your subscription must be authenticated via a Microsoft
Entra token. The compute nodes allocated count toward your subscription's vCPU (core)
quota.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see
Quickstart for Bash in Azure Cloud Shell.

If you prefer to run CLI reference commands locally, install the Azure CLI. If you're
running on Windows or macOS, consider running Azure CLI in a Docker container.
For more information, see How to run the Azure CLI in a Docker container.

If you're using a local installation, sign in to the Azure CLI by using the az login
command. To finish the authentication process, follow the steps displayed in
your terminal. For other sign-in options, see Sign in with the Azure CLI.

When you're prompted, install the Azure CLI extension on first use. For more
information about extensions, see Use extensions with the Azure CLI.

Run az version to find the version and dependent libraries that are installed. To
upgrade to the latest version, run az upgrade.

Sample script

Launch Azure Cloud Shell


The Azure Cloud Shell is a free interactive shell that you can use to run the steps in this
article. It has common Azure tools preinstalled and configured to use with your account.
To open the Cloud Shell, just select Try it from the upper right corner of a code block.
You can also launch Cloud Shell in a separate browser tab by going to
https://shell.azure.com .

When Cloud Shell opens, verify that Bash is selected for your environment. Subsequent
sessions will use Azure CLI in a Bash environment. Select Copy to copy a block of
code, paste it into Cloud Shell, and press Enter to run it.

Sign in to Azure
Cloud Shell is automatically authenticated under the account you initially signed in
with. Use the following script to sign in using a different subscription, replacing
<subscriptionId> with your Azure subscription ID.

Azure CLI

subscription="<subscriptionId>" # add subscription here

az account set -s $subscription # ...or use 'az login'

For more information, see set active subscription or log in interactively.

Run the script


Azure CLI

# Create a Batch account in user subscription mode

# Variable block
let "randomIdentifier=$RANDOM*$RANDOM"
location="East US"
[[ "$RESOURCE_GROUP" == '' ]] && resourceGroup="msdocs-batch-rg-$randomIdentifier" || resourceGroup="${RESOURCE_GROUP}"
tag="create-account-user-subscription"
keyVault="msdocskeyvault$randomIdentifier"
batchAccount="msdocsbatch$randomIdentifier"

# Allow Azure Batch to access the subscription (one-time operation).
az role assignment create --assignee ddbf3205-c6bd-46ae-8127-60eb93363864 --role contributor

# Create a resource group
echo "Creating $resourceGroup in $location..."
az group create --name $resourceGroup --location "$location" --tag $tag

# Create an Azure Key Vault. A Batch account that allocates pools in the user's subscription
# must be configured with a Key Vault located in the same region.
echo "Creating $keyVault"
az keyvault create --resource-group $resourceGroup --name $keyVault --location "$location" --enabled-for-deployment true --enabled-for-disk-encryption true --enabled-for-template-deployment true

# Add an access policy to the Key Vault to allow access by the Batch service.
az keyvault set-policy --resource-group $resourceGroup --name $keyVault --spn ddbf3205-c6bd-46ae-8127-60eb93363864 --key-permissions all --secret-permissions all

# Create the Batch account, referencing the Key Vault either by name (if they
# exist in the same resource group) or by its full resource ID.
echo "Creating $batchAccount"
az batch account create --resource-group $resourceGroup --name $batchAccount --location "$location" --keyvault $keyVault

# Authenticate directly against the account for further CLI interaction.
# Batch accounts that allocate pools in the user's subscription must be
# authenticated via a Microsoft Entra token.
az batch account login -g $resourceGroup -n $batchAccount

Clean up resources
Unless you have an ongoing need for these resources, use the az group delete command
to remove the resource group and all resources associated with it. Some of these
resources may take a while to create, as well as to delete.

Azure CLI

az group delete --name $resourceGroup

Sample reference
This script uses the following commands. Each command in the table links to command-
specific documentation.

Command                     Notes
az role assignment create   Creates a new role assignment for a user, group, or service principal.
az group create             Creates a resource group in which all resources are stored.
az keyvault create          Creates a key vault.
az keyvault set-policy      Updates the security policy of the specified key vault.
az batch account create     Creates the Batch account.
az batch account login      Authenticates against the specified Batch account for further CLI interaction.
az group delete             Deletes a resource group including all nested resources.

Next steps
For more information on the Azure CLI, see Azure CLI documentation.



CLI example: Add an application to an Azure Batch account
Article • 04/02/2025

This script demonstrates how to add an application for use with an Azure Batch pool or
task. To set up an application to add to your Batch account, package your executable,
together with any dependencies, into a zip file.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Get
started with Azure Cloud Shell.

If you prefer to run CLI reference commands locally, install the Azure CLI. If you're
running on Windows or macOS, consider running Azure CLI in a Docker container.
For more information, see How to run the Azure CLI in a Docker container.

If you're using a local installation, sign in to the Azure CLI by using the az login
command. To finish the authentication process, follow the steps displayed in
your terminal. For other sign-in options, see Authenticate to Azure using Azure
CLI.

When you're prompted, install the Azure CLI extension on first use. For more
information about extensions, see Use and manage extensions with the Azure
CLI.

Run az version to find the version and dependent libraries that are installed. To
upgrade to the latest version, run az upgrade.

Sample script

Launch Azure Cloud Shell


The Azure Cloud Shell is a free interactive shell that you can use to run the steps in this
article. It has common Azure tools preinstalled and configured to use with your account.
To open the Cloud Shell, just select Try it from the upper right corner of a code block.
You can also launch Cloud Shell in a separate browser tab by going to
https://shell.azure.com .

When Cloud Shell opens, verify that Bash is selected for your environment. Subsequent
sessions will use Azure CLI in a Bash environment. Select Copy to copy a block of
code, paste it into Cloud Shell, and press Enter to run it.

Sign in to Azure
Cloud Shell is automatically authenticated under the account you initially signed in
with. Use the following script to sign in using a different subscription, replacing
subscriptionId with your Azure subscription ID.


Azure CLI

subscription="subscriptionId" # Set Azure subscription ID here

az account set -s $subscription # ...or use 'az login'

For more information, see set active subscription or log in interactively.

Create batch account and new application


Azure CLI

# Add an application to an Azure Batch account

# Variable block
let "randomIdentifier=$RANDOM*$RANDOM"
location="East US"
[[ "$RESOURCE_GROUP" == '' ]] && resourceGroup="msdocs-batch-rg-$randomIdentifier" || resourceGroup="${RESOURCE_GROUP}"
tag="add-application"
storageAccount="msdocsstorage$randomIdentifier"
batchAccount="msdocsbatch$randomIdentifier"

# Create a resource group.
echo "Creating $resourceGroup in $location..."
az group create --name $resourceGroup --location "$location" --tag $tag

# Create a general-purpose storage account in your resource group.
echo "Creating $storageAccount"
az storage account create --resource-group $resourceGroup --name $storageAccount --location "$location" --sku Standard_LRS

# Create a Batch account.
echo "Creating $batchAccount"
az batch account create --name $batchAccount --storage-account $storageAccount --resource-group $resourceGroup --location "$location"

# Authenticate against the account directly for further CLI interaction.
az batch account login --name $batchAccount --resource-group $resourceGroup --shared-key-auth

# Create a new application.
az batch application create --resource-group $resourceGroup --name $batchAccount --application-name "MyApplication"

Create batch application package


An application can reference multiple application executable packages of different
versions. The executables and any dependencies need to be zipped up for the package.
Once uploaded, the CLI attempts to activate the package so that it's ready for use.

Azure CLI

az batch application package create \
    --resource-group $resourceGroup \
    --name $batchAccount \
    --application-name "MyApplication" \
    --package-file my-application-exe.zip \
    --version-name 1.0
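
The package file passed to --package-file is a zip archive of the executable and its dependencies. As a rough sketch of how such an archive might be built with Python's standard library, the hypothetical helper build_package below zips every file under a folder, keeping paths relative to that folder. The folder name and the helper are assumptions for illustration; the archive should be written outside the source folder so it doesn't include itself.

```python
import zipfile
from pathlib import Path


def build_package(source_dir: str, package_file: str) -> list:
    """Zip every file under source_dir, storing paths relative to source_dir."""
    root = Path(source_dir)
    added = []
    with zipfile.ZipFile(package_file, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(root).as_posix()
                archive.write(path, arcname)
                added.append(arcname)
    return added


# Example call (hypothetical folder name):
# build_package("my-application", "my-application-exe.zip")
```

The returned list of archive entries can be used to confirm that the executable and every dependency made it into the package before uploading.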

Update the application


Update the application to assign the newly added application package as the default
version.

Azure CLI

az batch application set \
    --resource-group $resourceGroup \
    --name $batchAccount \
    --application-name "MyApplication" \
    --default-version 1.0

Clean up resources
Unless you have an ongoing need for these resources, use the az group delete command
to remove the resource group and all resources associated with it. Some of these
resources may take a while to create, as well as to delete.

Azure CLI

az group delete --name $resourceGroup

Sample reference
This script uses the following commands. Each command in the table links to command-
specific documentation.

Command                               Notes
az group create                       Creates a resource group in which all resources are stored.
az storage account create             Creates a storage account.
az batch account create               Creates the Batch account.
az batch account login                Authenticates against the specified Batch account for further CLI interaction.
az batch application create           Creates an application.
az batch application package create   Adds an application package to the specified application.
az batch application set              Updates properties of an application.
az group delete                       Deletes a resource group including all nested resources.

Next steps
For more information on the Azure CLI, see Azure CLI documentation.

CLI example: Create and manage a Linux pool in Azure Batch
Article • 04/02/2025

This script demonstrates some of the commands available in the Azure CLI to create and
manage a pool of Linux compute nodes in Azure Batch.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Get
started with Azure Cloud Shell.

If you prefer to run CLI reference commands locally, install the Azure CLI. If you're
running on Windows or macOS, consider running Azure CLI in a Docker container.
For more information, see How to run the Azure CLI in a Docker container.

If you're using a local installation, sign in to the Azure CLI by using the az login
command. To finish the authentication process, follow the steps displayed in
your terminal. For other sign-in options, see Authenticate to Azure using Azure
CLI.

When you're prompted, install the Azure CLI extension on first use. For more
information about extensions, see Use and manage extensions with the Azure
CLI.

Run az version to find the version and dependent libraries that are installed. To
upgrade to the latest version, run az upgrade.

Sample script

Launch Azure Cloud Shell


The Azure Cloud Shell is a free interactive shell that you can use to run the steps in this
article. It has common Azure tools preinstalled and configured to use with your account.
To open the Cloud Shell, just select Try it from the upper right corner of a code block.
You can also launch Cloud Shell in a separate browser tab by going to
https://shell.azure.com .

When Cloud Shell opens, verify that Bash is selected for your environment. Subsequent
sessions will use Azure CLI in a Bash environment. Select Copy to copy a block of
code, paste it into Cloud Shell, and press Enter to run it.

Sign in to Azure
Cloud Shell is automatically authenticated under the account you initially signed in
with. Use the following script to sign in using a different subscription, replacing
subscriptionId with your Azure subscription ID.


Azure CLI

subscription="subscriptionId" # Set Azure subscription ID here

az account set -s $subscription # ...or use 'az login'

For more information, see set active subscription or log in interactively.

To create a Linux pool in Azure Batch


Azure CLI

# Create and manage a Linux pool in Azure Batch

# Variable block
let "randomIdentifier=$RANDOM*$RANDOM"
location="East US"
[[ "$RESOURCE_GROUP" == '' ]] && resourceGroup="msdocs-batch-rg-$randomIdentifier" || resourceGroup="${RESOURCE_GROUP}"
tag="manage-pool-linux"
batchAccount="msdocsbatch$randomIdentifier"

# Create a resource group.
echo "Creating $resourceGroup in $location..."
az group create --name $resourceGroup --location "$location" --tag $tag

# Create a Batch account.
echo "Creating $batchAccount"
az batch account create --resource-group $resourceGroup --name $batchAccount --location "$location"

# Authenticate Batch account CLI session.
az batch account login --resource-group $resourceGroup --name $batchAccount --shared-key-auth

# Retrieve a list of available images and node agent SKUs.
az batch pool supported-images list --query "[?contains(imageReference.offer,'ubuntuserver') && imageReference.publisher == 'canonical'].{Offer:imageReference.offer, Publisher:imageReference.publisher, Sku:imageReference.sku, nodeAgentSkuId:nodeAgentSkuId}[-1]" --output tsv

# Create a new Linux pool with a virtual machine configuration. The image reference
# and node agent SKU ID can be selected from the output of the above list command.
# The image reference is in the format: {publisher}:{offer}:{sku}:{version}
# where {version} is optional and defaults to 'latest'.
az batch pool create --id mypool-linux --vm-size Standard_A1 --image canonical:ubuntuserver:18_04-lts-gen2 --node-agent-sku-id "batch.node.ubuntu 18.04"

# Resize the pool to start some VMs.
az batch pool resize --pool-id mypool-linux --target-dedicated 5

# Check the status of the pool to see when it has finished resizing.
az batch pool show --pool-id mypool-linux

# List the compute nodes running in a pool.
az batch node list --pool-id mypool-linux

# returns [] if no compute nodes are running

To reboot a batch node


If a particular node in the pool is having issues, it can be rebooted or reimaged. The ID
of the node can be retrieved with the list command above. A typical node ID is in the
format tvm-xxxxxxxxxx_1-<timestamp>.

Azure CLI

az batch node reboot \
    --pool-id mypool-linux \
    --node-id tvm-123_1-20170316t000000z
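
When scripting around node IDs, it can help to check that a value looks like a node ID before passing it to a command. The sketch below is an assumption-laden illustration: the pattern and timestamp layout are inferred only from the example ID tvm-123_1-20170316t000000z shown above, the field names are hypothetical, and real node IDs may differ, so treat IDs as opaque strings wherever possible.

```python
import re
from datetime import datetime

# Pattern inferred from the example ID in this article (an assumption, not a spec)
NODE_ID = re.compile(r"tvm-(?P<number>\d+)_(?P<index>\d+)-(?P<stamp>\d{8}t\d{6}z)")


def parse_node_id(node_id: str):
    """Split an ID shaped like tvm-<number>_<index>-<timestamp>; return None otherwise."""
    match = NODE_ID.fullmatch(node_id)
    if match is None:
        return None
    return {
        "number": match["number"],
        "index": int(match["index"]),
        "timestamp": datetime.strptime(match["stamp"], "%Y%m%dt%H%M%Sz"),
    }


print(parse_node_id("tvm-123_1-20170316t000000z"))
```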

To delete a batch node


One or more compute nodes can be deleted from the pool, and any work already
assigned to them can be re-allocated to another node.

Azure CLI

az batch node delete \
    --pool-id mypool-linux \
    --node-list tvm-123_1-20170316t000000z tvm-123_2-20170316t000000z \
    --node-deallocation-option requeue

Clean up resources
Unless you have an ongoing need for these resources, use the az group delete command
to remove the resource group and all resources associated with it. Some of these
resources may take a while to create, as well as to delete.

Azure CLI

az group delete --name $resourceGroup

Sample reference
This script uses the following commands. Each command in the table links to command-
specific documentation.

Command                               Notes
az group create                       Creates a resource group in which all resources are stored.
az batch account create               Creates the Batch account.
az batch account login                Authenticates against the specified Batch account for further CLI interaction.
az batch pool supported-images list   Lists available images and node agent SKUs.
az batch pool create                  Creates a pool of compute nodes.
az batch pool resize                  Resizes the number of running VMs in the specified pool.
az batch pool show                    Displays the properties of a pool.
az batch node list                    Lists all the compute nodes in the specified pool.
az batch node reboot                  Reboots the specified compute node.
az batch node delete                  Deletes the listed nodes from the specified pool.
az group delete                       Deletes a resource group including all nested resources.

Next steps
For more information on the Azure CLI, see Azure CLI documentation.



CLI example: Create and manage a Windows pool in Azure Batch
Article • 06/24/2024

This script demonstrates some of the commands available in the Azure CLI to create and
manage a pool of Windows compute nodes in Azure Batch. A Windows pool can be
configured in two ways, with either a Cloud Services configuration or a Virtual Machine
configuration. This example shows how to create a Windows pool with the Cloud
Services configuration.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see
Quickstart for Bash in Azure Cloud Shell.

If you prefer to run CLI reference commands locally, install the Azure CLI. If you're
running on Windows or macOS, consider running Azure CLI in a Docker container.
For more information, see How to run the Azure CLI in a Docker container.

If you're using a local installation, sign in to the Azure CLI by using the az login
command. To finish the authentication process, follow the steps displayed in
your terminal. For other sign-in options, see Sign in with the Azure CLI.

When you're prompted, install the Azure CLI extension on first use. For more
information about extensions, see Use extensions with the Azure CLI.

Run az version to find the version and dependent libraries that are installed. To
upgrade to the latest version, run az upgrade.

Sample script

Launch Azure Cloud Shell


The Azure Cloud Shell is a free interactive shell that you can use to run the steps in this
article. It has common Azure tools preinstalled and configured to use with your account.
To open the Cloud Shell, just select Try it from the upper right corner of a code block.
You can also launch Cloud Shell in a separate browser tab by going to
https://shell.azure.com .

When Cloud Shell opens, verify that Bash is selected for your environment. Subsequent
sessions will use Azure CLI in a Bash environment. Select Copy to copy a block of
code, paste it into Cloud Shell, and press Enter to run it.

Sign in to Azure
Cloud Shell is automatically authenticated under the account you initially signed in
with. Use the following script to sign in using a different subscription, replacing
<subscriptionId> with your Azure subscription ID.

Azure CLI

subscription="<subscriptionId>" # add subscription here

az account set -s $subscription # ...or use 'az login'

For more information, see set active subscription or log in interactively.

Run the script


Azure CLI

# Create and manage a Windows pool in Azure Batch

# Variable block
let "randomIdentifier=$RANDOM*$RANDOM"
location="East US"
[[ "$RESOURCE_GROUP" == '' ]] && resourceGroup="msdocs-batch-rg-$randomIdentifier" || resourceGroup="${RESOURCE_GROUP}"
tag="manage-pool-windows"
storageAccount="msdocsstorage$randomIdentifier"
batchAccount="msdocsbatch$randomIdentifier"

# Create a resource group.
echo "Creating $resourceGroup in $location..."
az group create --name $resourceGroup --location "$location" --tag $tag

# Create a general-purpose storage account in your resource group.
echo "Creating $storageAccount"
az storage account create --resource-group $resourceGroup --name $storageAccount --location "$location" --sku Standard_LRS

# Create a Batch account.
echo "Creating $batchAccount"
az batch account create --name $batchAccount --storage-account $storageAccount --resource-group $resourceGroup --location "$location"

# Authenticate Batch account CLI session.
az batch account login --resource-group $resourceGroup --name $batchAccount --shared-key-auth

# Create a new Windows cloud service platform pool with 3 Standard A1 VMs.
# The pool has a start task that runs a basic shell command. Typically a
# start task copies application files to the pool nodes.
az batch pool create --id mypool-windows --os-family 4 --target-dedicated 3 --vm-size small --start-task-command-line "cmd /c dir /s" --start-task-wait-for-success

# --application-package-references myapp
# You can specify an application package reference when the pool is created
# or you can add it later.
# https://docs.microsoft.com/azure/batch/batch-application-packages.

# Add some metadata to the pool.
az batch pool set --pool-id mypool-windows --metadata IsWindows=true VMSize=StandardA1

# Change the pool to enable automatic scaling of compute nodes.
# This autoscale formula specifies that the number of nodes should be adjusted according
# to the number of active tasks, up to a maximum of 10 compute nodes.
az batch pool autoscale enable --pool-id mypool-windows --auto-scale-formula '$averageActiveTaskCount = avg($ActiveTasks.GetSample(TimeInterval_Minute * 15));$TargetDedicated = min(10, $averageActiveTaskCount);'

# Monitor the resizing of the pool.
az batch pool show --pool-id mypool-windows

# Disable autoscaling when we no longer require the pool to automatically scale.
az batch pool autoscale disable --pool-id mypool-windows
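
The autoscale formula in the script averages the active task count sampled over the last 15 minutes and caps the result at 10 nodes. The arithmetic can be sketched in Python as follows. This only illustrates the min/avg logic: the function name target_dedicated and the sample values are hypothetical, and the sketch doesn't model how the Batch service collects samples or rounds the result.

```python
def target_dedicated(active_task_samples, max_nodes=10):
    """Average the sampled active-task counts, then cap the result at max_nodes."""
    if not active_task_samples:
        raise ValueError("at least one sample is required")
    average_active_tasks = sum(active_task_samples) / len(active_task_samples)
    return min(max_nodes, average_active_tasks)


# Hypothetical samples: a light load stays below the cap, a heavy load hits it
print(target_dedicated([4, 6, 8]))
print(target_dedicated([20, 30]))
```

With few queued tasks the target follows the average, and once the average exceeds 10 the cap takes over.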

Clean up resources
Unless you have an ongoing need for these resources, use the az group delete command
to remove the resource group and all resources associated with it. Some of these
resources may take a while to create, as well as to delete.

Azure CLI

az group delete --name $resourceGroup


Sample reference
This script uses the following commands. Each command in the table links to command-
specific documentation.

Command | Notes
az group create | Creates a resource group in which all resources are stored.
az batch account create | Creates the Batch account.
az batch account login | Authenticates against the specified Batch account for further CLI interaction.
az batch pool create | Creates a pool of compute nodes.
az batch pool set | Updates the properties of a pool.
az batch pool autoscale enable | Enables auto-scaling on a pool and applies a formula.
az batch pool show | Displays the properties of a pool.
az batch pool autoscale disable | Disables auto-scaling on a pool.
az group delete | Deletes a resource group including all nested resources.

Next steps
For more information on the Azure CLI, see Azure CLI documentation.



CLI example: Run a job and tasks with Azure Batch
Article • 04/02/2025

This script creates a Batch job and adds a series of tasks to the job. It also demonstrates
how to monitor a job and its tasks.

If you don't have an Azure subscription, create an Azure free account before you
begin.

Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Get
started with Azure Cloud Shell.

If you prefer to run CLI reference commands locally, install the Azure CLI. If you're
running on Windows or macOS, consider running Azure CLI in a Docker container.
For more information, see How to run the Azure CLI in a Docker container.

If you're using a local installation, sign in to the Azure CLI by using the az login
command. To finish the authentication process, follow the steps displayed in
your terminal. For other sign-in options, see Authenticate to Azure using Azure
CLI.

When you're prompted, install the Azure CLI extension on first use. For more
information about extensions, see Use and manage extensions with the Azure
CLI.

Run az version to find the version and dependent libraries that are installed. To
upgrade to the latest version, run az upgrade.

Sample script

Launch Azure Cloud Shell


The Azure Cloud Shell is a free interactive shell that you can use to run the steps in this
article. It has common Azure tools preinstalled and configured to use with your account.
To open the Cloud Shell, just select Try it from the upper right corner of a code block.
You can also launch Cloud Shell in a separate browser tab by going to
https://shell.azure.com .

When Cloud Shell opens, verify that Bash is selected for your environment. Subsequent
sessions will use Azure CLI in a Bash environment. Select Copy to copy the blocks of
code, paste them into the Cloud Shell, and press Enter to run them.

Sign in to Azure
Cloud Shell is automatically authenticated under the account you initially signed in with.
To work in a different subscription, use the following script, replacing subscriptionId
with your Azure subscription ID.


Azure CLI

subscription="subscriptionId" # Set Azure subscription ID here

az account set -s $subscription # ...or use 'az login'

For more information, see set active subscription or log in interactively.

Create a Batch account in Batch service mode


Azure CLI

# Run a job and tasks with Azure Batch

# Variable block
let "randomIdentifier=$RANDOM*$RANDOM"
location="East US"
[[ "$RESOURCE_GROUP" == '' ]] && resourceGroup="msdocs-batch-rg-$randomIdentifier" || resourceGroup="${RESOURCE_GROUP}"
tag="run-job"
storageAccount="msdocsstorage$randomIdentifier"
batchAccount="msdocsbatch$randomIdentifier"

# Create a resource group.
echo "Creating $resourceGroup in $location..."
az group create --name $resourceGroup --location "$location" --tag $tag

# Create a general-purpose storage account in your resource group.
echo "Creating $storageAccount"
az storage account create --resource-group $resourceGroup \
    --name $storageAccount --location "$location" --sku Standard_LRS

# Create a Batch account.
echo "Creating $batchAccount"
az batch account create --name $batchAccount \
    --storage-account $storageAccount \
    --resource-group $resourceGroup --location "$location"

# Authenticate against the account directly for further CLI interaction.
az batch account login --name $batchAccount --resource-group $resourceGroup \
    --shared-key-auth

# Create a new Linux pool with a virtual machine configuration.
az batch pool create --id mypool --vm-size Standard_A1 --target-dedicated 2 \
    --image canonical:ubuntuserver:18_04-lts-gen2 \
    --node-agent-sku-id "batch.node.ubuntu 18.04"

# Create a new job to encapsulate the tasks that are added.
az batch job create --id myjob --pool-id mypool

# Add tasks to the job. Here the task is a basic shell command.
az batch task create --job-id myjob --task-id task1 \
    --command-line "/bin/bash -c 'printenv AZ_BATCH_TASK_WORKING_DIR'"

To add many tasks at once


To add many tasks at once, specify the tasks in a JSON file, and pass it to the command.
For the format, see
https://github.com/Azure/azure-docs-cli-python-samples/blob/master/batch/run-job/tasks.json.
Provide the absolute path to the JSON file. For an example JSON file, see
https://github.com/Azure-Samples/azure-cli-samples/blob/master/batch/run-job/tasks.json.

Azure CLI

az batch task create \
    --job-id myjob \
    --json-file tasks.json
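
As a rough illustration of what such a file can look like, the sketch below generates a small task list. It assumes that --json-file accepts a JSON array of task objects carrying at least an id and a commandLine, mirroring the Batch REST request body; check the linked sample for the authoritative format. The function name generate_tasks_json and the bulk-task IDs are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a JSON array of N simple tasks to stdout.
generate_tasks_json() {
  local count="$1" i
  echo '['
  for ((i = 1; i <= count; i++)); do
    # Every element except the last needs a trailing comma.
    local comma=','
    ((i == count)) && comma=''
    cat <<EOF
  {
    "id": "bulk-task$i",
    "commandLine": "/bin/bash -c 'printenv AZ_BATCH_TASK_WORKING_DIR'"
  }$comma
EOF
  done
  echo ']'
}

# Write three tasks to a file and keep its absolute path for --json-file.
tasks_file="${TMPDIR:-/tmp}/tasks.json"
generate_tasks_json 3 > "$tasks_file"
echo "Wrote $tasks_file"

# Then, in an authenticated Batch CLI session:
# az batch task create --job-id myjob --json-file "$tasks_file"
```

Generating the file keeps task IDs unique and makes the task count a single parameter instead of hand-edited JSON.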

To update the job


Update the job so that it is automatically marked as completed once all the tasks are
finished.

Azure CLI

az batch job set \
    --job-id myjob \
    --on-all-tasks-complete terminatejob

To monitor the status of the job


Azure CLI

az batch job show --job-id myjob

To monitor the status of a task


Azure CLI

az batch task show \
    --job-id myjob \
    --task-id task1
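
The full task JSON is verbose, so a script often needs only the outcome. The sketch below narrows the output to two properties. It assumes the task JSON exposes state (reading completed when the task has finished) and executionInfo.exitCode. The helper name task_succeeded and its return codes are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a task finished with exit code 0.
# Returns 0 on success, 1 if the task completed with a non-zero exit code,
# and 2 if the task has not completed yet.
task_succeeded() {
  local job_id="$1" task_id="$2" state exit_code
  state="$(az batch task show --job-id "$job_id" --task-id "$task_id" \
    --query state --output tsv)"
  if [[ "$state" != "completed" ]]; then
    echo "Task $task_id is still '$state'"
    return 2
  fi
  exit_code="$(az batch task show --job-id "$job_id" --task-id "$task_id" \
    --query executionInfo.exitCode --output tsv)"
  echo "Task $task_id completed with exit code $exit_code"
  [[ "$exit_code" == "0" ]]
}

# Example call (requires an authenticated Batch CLI session):
# task_succeeded myjob task1
```

A completed state only means the task stopped running, which is why the exit code is checked separately.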

Clean up resources
Unless you have an ongoing need for these resources, use the az group delete command to
remove the resource group and all resources associated with it. Some of these resources
may take a while to create, as well as to delete.

Azure CLI

az group delete --name $resourceGroup

Sample reference
This script uses the following commands. Each command in the table links to command-
specific documentation.

Command | Notes
az group create | Creates a resource group in which all resources are stored.
az batch account create | Creates the Batch account.
az batch account login | Authenticates against the specified Batch account for further CLI interaction.
az batch pool create | Creates a pool of compute nodes.
az batch job create | Creates a Batch job.
az batch task create | Adds a task to the specified Batch job.
az batch job set | Updates properties of a Batch job.
az batch job show | Retrieves details of a specified Batch job.
az batch task show | Retrieves the details of a task from the specified Batch job.
az group delete | Deletes a resource group including all nested resources.

Next steps
For more information on the Azure CLI, see Azure CLI documentation.



Azure Policy Regulatory Compliance controls for Azure Batch
Article • 04/29/2024

Regulatory Compliance in Azure Policy provides Microsoft created and managed
initiative definitions, known as built-ins, for the compliance domains and security
controls related to different compliance standards. This page lists the compliance
domains and security controls for Azure Batch. You can assign the built-ins for a
security control individually to help make your Azure resources compliant with the
specific standard.

The title of each built-in policy definition links to the policy definition in the Azure
portal. Use the link in the Policy Version column to view the source on the Azure Policy
GitHub repo .

Important

Each control is associated with one or more Azure Policy definitions. These policies
might help you assess compliance with the control. However, there often isn't a
one-to-one or complete match between a control and one or more policies. As
such, Compliant in Azure Policy refers only to the policies themselves. This doesn't
ensure that you're fully compliant with all requirements of a control. In addition, the
compliance standard includes controls that aren't addressed by any Azure Policy
definitions at this time. Therefore, compliance in Azure Policy is only a partial view
of your overall compliance status. The associations between controls and Azure
Policy Regulatory Compliance definitions for these compliance standards can
change over time.

CIS Microsoft Azure Foundations Benchmark 1.3.0
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - CIS Microsoft Azure
Foundations Benchmark 1.3.0. For more information about this compliance standard,
see CIS Microsoft Azure Foundations Benchmark.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
5 Logging and Monitoring | 5.3 | Ensure that Diagnostic Logs are enabled for all services which support it. | Resource logs in Batch accounts should be enabled | 5.0.0

CIS Microsoft Azure Foundations Benchmark 1.4.0
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance details for CIS v1.4.0. For
more information about this compliance standard, see CIS Microsoft Azure Foundations
Benchmark.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
5 Logging and Monitoring | 5.3 | Ensure that Diagnostic Logs Are Enabled for All Services that Support it. | Resource logs in Batch accounts should be enabled | 5.0.0

CIS Microsoft Azure Foundations Benchmark 2.0.0
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance details for CIS v2.0.0. For
more information about this compliance standard, see CIS Microsoft Azure Foundations
Benchmark.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
5 | 5.4 | Ensure that Azure Monitor Resource Logging is Enabled for All Services that Support it | Resource logs in Batch accounts should be enabled | 5.0.0

FedRAMP High
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - FedRAMP High. For
more information about this compliance standard, see FedRAMP High.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
Audit And Accountability | AU-6 (4) | Central Review And Analysis | Resource logs in Batch accounts should be enabled | 5.0.0
Audit And Accountability | AU-6 (5) | Integration / Scanning And Monitoring Capabilities | Resource logs in Batch accounts should be enabled | 5.0.0
Audit And Accountability | AU-12 | Audit Generation | Resource logs in Batch accounts should be enabled | 5.0.0
Audit And Accountability | AU-12 (1) | System-Wide / Time-Correlated Audit Trail | Resource logs in Batch accounts should be enabled | 5.0.0
System And Communications Protection | SC-12 | Cryptographic Key Establishment And Management | Azure Batch account should use customer-managed keys to encrypt data | 1.0.1

FedRAMP Moderate
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - FedRAMP Moderate.
For more information about this compliance standard, see FedRAMP Moderate.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
Audit And Accountability | AU-12 | Audit Generation | Resource logs in Batch accounts should be enabled | 5.0.0
System And Communications Protection | SC-12 | Cryptographic Key Establishment And Management | Azure Batch account should use customer-managed keys to encrypt data | 1.0.1

HIPAA HITRUST 9.2
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - HIPAA HITRUST 9.2. For
more information about this compliance standard, see HIPAA HITRUST 9.2.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
12 Audit Logging & Monitoring | 1205.09aa2System.1-09.aa | 1205.09aa2System.1-09.aa 09.10 Monitoring | Resource logs in Batch accounts should be enabled | 5.0.0

Microsoft cloud security benchmark
The Microsoft cloud security benchmark provides recommendations on how you can
secure your cloud solutions on Azure. To see how this service completely maps to the
Microsoft cloud security benchmark, see the Azure Security Benchmark mapping files.

To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - Microsoft cloud security
benchmark.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
Logging and Threat Detection | LT-3 | Enable logging for security investigation | Resource logs in Batch accounts should be enabled | 5.0.0

NIST SP 800-171 R2
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - NIST SP 800-171 R2. For
more information about this compliance standard, see NIST SP 800-171 R2.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
System and Communications Protection | 3.13.10 | Establish and manage cryptographic keys for cryptography employed in organizational systems. | Azure Batch account should use customer-managed keys to encrypt data | 1.0.1
Audit and Accountability | 3.3.1 | Create and retain system audit logs and records to the extent needed to enable the monitoring, analysis, investigation, and reporting of unlawful or unauthorized system activity | Resource logs in Batch accounts should be enabled | 5.0.0
Audit and Accountability | 3.3.2 | Ensure that the actions of individual system users can be uniquely traced to those users, so they can be held accountable for their actions. | Resource logs in Batch accounts should be enabled | 5.0.0

NIST SP 800-53 Rev. 4
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - NIST SP 800-53 Rev. 4.
For more information about this compliance standard, see NIST SP 800-53 Rev. 4.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
Audit And Accountability | AU-6 (4) | Central Review And Analysis | Resource logs in Batch accounts should be enabled | 5.0.0
Audit And Accountability | AU-6 (5) | Integration / Scanning And Monitoring Capabilities | Resource logs in Batch accounts should be enabled | 5.0.0
Audit And Accountability | AU-12 | Audit Generation | Resource logs in Batch accounts should be enabled | 5.0.0
Audit And Accountability | AU-12 (1) | System-Wide / Time-Correlated Audit Trail | Resource logs in Batch accounts should be enabled | 5.0.0
System And Communications Protection | SC-12 | Cryptographic Key Establishment And Management | Azure Batch account should use customer-managed keys to encrypt data | 1.0.1

NIST SP 800-53 Rev. 5
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - NIST SP 800-53 Rev. 5.
For more information about this compliance standard, see NIST SP 800-53 Rev. 5.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
Audit and Accountability | AU-6 (4) | Central Review and Analysis | Resource logs in Batch accounts should be enabled | 5.0.0
Audit and Accountability | AU-6 (5) | Integrated Analysis of Audit Records | Resource logs in Batch accounts should be enabled | 5.0.0
Audit and Accountability | AU-12 | Audit Record Generation | Resource logs in Batch accounts should be enabled | 5.0.0
Audit and Accountability | AU-12 (1) | System-wide and Time-correlated Audit Trail | Resource logs in Batch accounts should be enabled | 5.0.0
System and Communications Protection | SC-12 | Cryptographic Key Establishment and Management | Azure Batch account should use customer-managed keys to encrypt data | 1.0.1

NL BIO Cloud Theme
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance details for NL BIO Cloud
Theme. For more information about this compliance standard, see Baseline Information
Security Government Cybersecurity - Digital Government (digitaleoverheid.nl).

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
U.05.1 Data protection - Cryptographic measures | U.05.1 | Data transport is secured with cryptography where key management is carried out by the CSC itself if possible. | Azure Batch pools should have disk encryption enabled | 1.0.0
U.05.2 Data protection - Cryptographic measures | U.05.2 | Data stored in the cloud service shall be protected to the latest state of the art. | Azure Batch account should use customer-managed keys to encrypt data | 1.0.1
U.05.2 Data protection - Cryptographic measures | U.05.2 | Data stored in the cloud service shall be protected to the latest state of the art. | Azure Batch pools should have disk encryption enabled | 1.0.0
U.11.3 Cryptoservices - Encrypted | U.11.3 | Sensitive data is always encrypted, with private keys managed by the CSC. | Azure Batch account should use customer-managed keys to encrypt data | 1.0.1
U.11.3 Cryptoservices - Encrypted | U.11.3 | Sensitive data is always encrypted, with private keys managed by the CSC. | Azure Batch pools should have disk encryption enabled | 1.0.0
U.15.1 Logging and monitoring - Events logged | U.15.1 | The violation of the policy rules is recorded by the CSP and the CSC. | Resource logs in Batch accounts should be enabled | 5.0.0

RMIT Malaysia
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance - RMIT Malaysia. For
more information about this compliance standard, see RMIT Malaysia.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
Security of Digital Services | 10.66 | Security of Digital Services - 10.66 | Deploy Diagnostic Settings for Batch Account to Event Hub | 2.0.0
Security of Digital Services | 10.66 | Security of Digital Services - 10.66 | Deploy Diagnostic Settings for Batch Account to Log Analytics workspace | 1.0.0

SWIFT CSP-CSCF v2021
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance details for SWIFT CSP-CSCF
v2021. For more information about this compliance standard, see SWIFT CSP CSCF
v2021.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
Detect Anomalous Activity to Systems or Transaction Records | 6.4 | Logging and Monitoring | Resource logs in Batch accounts should be enabled | 5.0.0

SWIFT CSP-CSCF v2022
To review how the available Azure Policy built-ins for all Azure services map to this
compliance standard, see Azure Policy Regulatory Compliance details for SWIFT CSP-CSCF
v2022. For more information about this compliance standard, see SWIFT CSP CSCF
v2022.

Domain | Control ID | Control title | Policy (Azure portal) | Policy version (GitHub)
6. Detect Anomalous Activity to Systems or Transaction Records | 6.4 | Record security events and detect anomalous actions and operations within the local SWIFT environment. | Resource logs in Batch accounts should be enabled | 5.0.0

Next steps
Learn more about Azure Policy Regulatory Compliance.
See the built-ins on the Azure Policy GitHub repo .
Azure security baseline for Batch
Article • 02/25/2025

This security baseline applies guidance from the Microsoft cloud security benchmark
version 1.0 to Batch. The Microsoft cloud security benchmark provides
recommendations on how you can secure your cloud solutions on Azure. The content is
grouped by the security controls defined by the Microsoft cloud security benchmark and
the related guidance applicable to Batch.

You can monitor this security baseline and its recommendations using Microsoft
Defender for Cloud. Azure Policy definitions will be listed in the Regulatory Compliance
section of the Microsoft Defender for Cloud portal page.

When a feature has relevant Azure Policy Definitions, they are listed in this baseline to
help you measure compliance with the Microsoft cloud security benchmark controls and
recommendations. Some recommendations may require a paid Microsoft Defender plan
to enable certain security scenarios.

Note

Features not applicable to Batch have been excluded. To see how Batch completely
maps to the Microsoft cloud security benchmark, see the full Batch security
baseline mapping file .

Security profile
The security profile summarizes high-impact behaviors of Batch, which may result in
increased security considerations.

Service Behavior Attribute | Value
Product Category | Compute
Customer can access HOST / OS | Read Only
Service can be deployed into customer's virtual network | True
Stores customer content at rest | False


Network security
For more information, see the Microsoft cloud security benchmark: Network security.

NS-1: Establish network segmentation boundaries

Features

Virtual Network Integration

Description: Service supports deployment into customer's private Virtual Network
(VNet). Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Customer

Configuration Guidance: Deploy Azure Batch pools within a virtual network. Consider
provisioning the pool without public IP addresses to restrict access to nodes in the
private network and to reduce the discoverability of the nodes from the internet.

Reference: Create an Azure Batch pool in a virtual network

Network Security Group Support

Description: Service network traffic respects Network Security Groups rule assignment
on its subnets. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | True | Microsoft

Feature notes: By default, Batch adds network security groups (NSGs) at the network
interface (NIC) level for the NICs attached to compute nodes.

Configuration Guidance: No additional configurations are required as this is enabled on
a default deployment.

Reference: Create an Azure Batch pool in a virtual network


NS-2: Secure cloud services with network controls

Features

Azure Private Link

Description: Service native IP filtering capability for filtering network traffic (not to be
confused with NSG or Azure Firewall). Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Customer

Configuration Guidance: Deploy private endpoints for Azure Batch accounts. This
restricts access to the Batch accounts to the virtual network where they reside or to any
peered virtual network.

Reference: Use private endpoints with Azure Batch accounts

Disable Public Network Access

Description: Service supports disabling public network access either through using
service-level IP ACL filtering rule (not NSG or Azure Firewall) or using a 'Disable Public
Network Access' toggle switch. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Customer

Configuration Guidance: Disable public network access to Batch accounts by setting the
'Public network access' setting to disabled.

Reference: Disable public network access
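
From the command line, this setting could be changed roughly as sketched below. The sketch assumes a recent Azure CLI in which az batch account set accepts --public-network-access, and that the account JSON exposes a publicNetworkAccess property. Confirm both with az batch account set --help before relying on it. The helper name disable_public_access is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn off public network access for a Batch account,
# then read the setting back so the change can be confirmed.
disable_public_access() {
  local account="$1" group="$2"
  # Assumption: --public-network-access is available on 'az batch account set'.
  az batch account set --name "$account" --resource-group "$group" \
    --public-network-access Disabled --output none
  az batch account show --name "$account" --resource-group "$group" \
    --query publicNetworkAccess --output tsv
}

# Example call (requires 'az login' and permission on the account):
# disable_public_access "$batchAccount" "$resourceGroup"
```

After this change the account is typically reachable only through private endpoints, so plan the private connectivity first.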

Identity management
For more information, see the Microsoft cloud security benchmark: Identity management.
IM-1: Use centralized identity and authentication system

Features

Azure AD Authentication Required for Data Plane Access

Description: Service supports using Azure AD authentication for data plane access.
Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Customer

Configuration Guidance: Use Azure Active Directory (Azure AD) as the default
authentication method to control your data plane access instead of using Shared Keys.

Reference: Authenticate with Azure AD
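
In the Azure CLI, the difference between the two methods comes down to one flag. As far as the CLI documentation describes it, az batch account login uses Azure AD unless --shared-key-auth is passed, which is what the sample scripts in this documentation do. The wrapper below is a sketch that makes the choice explicit. The function name batch_login and its mode argument are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: log in to a Batch account with Azure AD by default,
# and use the shared key only when explicitly requested.
batch_login() {
  local account="$1" group="$2" mode="${3:-aad}"
  local args=(batch account login --name "$account" --resource-group "$group")
  if [[ "$mode" == "shared-key" ]]; then
    # Shared key authentication is the local method the baseline advises against.
    args+=(--shared-key-auth)
  fi
  az "${args[@]}"
}

# Example calls (require 'az login'):
# batch_login "$batchAccount" "$resourceGroup"             # Azure AD
# batch_login "$batchAccount" "$resourceGroup" shared-key  # account key
```

Making Azure AD the default keeps account keys out of routine scripts.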

Local Authentication Methods for Data Plane Access

Description: Local authentication methods supported for data plane access, such as a
local username and password. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Customer

Feature notes: Avoid using local authentication methods or accounts, and disable them
wherever possible. Instead, use Azure AD to authenticate where possible.

Configuration Guidance: Restrict the use of local authentication methods for data plane
access. Instead, use Azure Active Directory (Azure AD) as the default authentication
method to control your data plane access.

Reference: Authentication via Shared Key


IM-3: Manage application identities securely and
automatically

Features

Managed Identities

Description: Data plane actions support authentication using managed identities. Learn
more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Shared

Configuration Guidance: Use Azure managed identities instead of service principals
when possible. Managed identities can authenticate to Azure services and resources that
support Azure Active Directory (Azure AD) authentication. Managed identity credentials
are fully managed, rotated, and protected by the platform, avoiding hard-coded
credentials in source code or configuration files.

Reference: Configure managed identities in Batch pools

Service Principals

Description: Data plane supports authentication using service principals. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Customer

Additional Guidance: To authenticate an application that runs unattended, you may use
a service principal. After you've registered your application, make the appropriate
configurations in the Azure Portal for the service principal, such as requesting a secret
for the application and assigning Azure RBAC roles.

Reference: Authenticate Batch service solutions with Azure Active Directory

IM-7: Restrict resource access based on conditions


Features

Conditional Access for Data Plane

Description: Data plane access can be controlled using Azure AD Conditional Access
Policies. Learn more.

Supported | Enabled By Default | Configuration Responsibility
False | Not Applicable | Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

IM-8: Restrict the exposure of credential and secrets

Features

Service Credential and Secrets Support Integration and Storage in Azure Key Vault

Description: Data plane supports native use of Azure Key Vault for credential and secrets
store. Learn more.

Supported | Enabled By Default | Configuration Responsibility
False | Not Applicable | Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Privileged access
For more information, see the Microsoft cloud security benchmark: Privileged access.

PA-7: Follow just enough administration (least privilege) principle

Features

Azure RBAC for Data Plane

Description: Azure Role-Based Access Control (Azure RBAC) can be used to manage
access to service's data plane actions. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Customer

Configuration Guidance: Use Azure role-based access control (Azure RBAC) to manage
Azure resource access through built-in role assignments. Azure Batch supports Azure
RBAC for managing access to these resource types: Accounts, Jobs, Tasks, and Pools.

Reference: Assign Azure RBAC to your application
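
To keep an assignment narrow, the scope can be the Batch account itself instead of the resource group or subscription. The sketch below builds that scope from the standard Azure Resource Manager ID layout and passes it to az role assignment create. The helper names, the placeholder principal ID, and the choice of the built-in Reader role are illustrative assumptions. Pick the role that matches the access your application needs.

```shell
#!/usr/bin/env bash
# Build the ARM resource ID of a Batch account, usable as a role assignment scope.
batch_account_scope() {
  local subscription="$1" group="$2" account="$3"
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Batch/batchAccounts/%s\n' \
    "$subscription" "$group" "$account"
}

# Hypothetical wrapper: grant a principal a built-in role on one Batch account only.
grant_batch_role() {
  local assignee="$1" role="$2" scope="$3"
  az role assignment create --assignee "$assignee" --role "$role" --scope "$scope"
}

# Example (placeholders, not real IDs; requires 'az login'):
# scope="$(batch_account_scope "$subscription" "$resourceGroup" "$batchAccount")"
# grant_batch_role "<principal-object-id>" "Reader" "$scope"
```

Scoping to a single account means the principal gains nothing on sibling resources in the same resource group.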

Data protection
For more information, see the Microsoft cloud security benchmark: Data protection.

DP-2: Monitor anomalies and threats targeting sensitive data

Features

Data Leakage/Loss Prevention

Description: Service supports DLP solution to monitor sensitive data movement (in
customer's content). Learn more.

Supported | Enabled By Default | Configuration Responsibility
False | Not Applicable | Not Applicable

Configuration Guidance: This feature is not supported to secure this service.


DP-3: Encrypt sensitive data in transit

Features

Data in Transit Encryption

Description: Service supports data in-transit encryption for data plane. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | True | Microsoft

Configuration Guidance: No additional configurations are required as this is enabled on
a default deployment.

DP-4: Enable data at rest encryption by default

Features

Data at Rest Encryption Using Platform Keys

Description: Data at-rest encryption using platform keys is supported; any customer
content at rest is encrypted with these Microsoft managed keys. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | True | Microsoft

Feature notes: Some of the information specified in Batch APIs, such as account
certificates, job and task metadata, and task command lines, is automatically encrypted
when stored by the Batch service. By default, this data is encrypted using Azure Batch
platform-managed keys unique to each Batch account.

You can also encrypt this data using customer-managed keys. Azure Key Vault is used to
generate and store the key, with the key identifier registered with your Batch account.

Configuration Guidance: No additional configurations are required as this is enabled on
a default deployment.
DP-5: Use customer-managed key option in data at rest
encryption when required

Features

Data at Rest Encryption Using CMK

Description: Data at-rest encryption using customer-managed keys is supported for
customer content stored by the service. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Customer

Configuration Guidance: If required for regulatory compliance, define the use case and
service scope where encryption using customer-managed keys is needed. Enable and
implement data at rest encryption using a customer-managed key for those services.

Reference: Configure customer-managed keys
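
As a rough sketch of the registration step, the key identifier from Key Vault can be passed to the Batch account through the CLI. The sketch assumes az batch account set accepts --encryption-key-source and --encryption-key-identifier, and that the account already has a managed identity with access to the key, as the referenced article describes. The helper name use_customer_managed_key and the loose URL check are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point a Batch account at a customer-managed key in Key Vault.
# key_id is the full key identifier URL copied from Key Vault.
use_customer_managed_key() {
  local account="$1" group="$2" key_id="$3"
  # Loose sanity check only: a key identifier is an https URL containing /keys/.
  if [[ "$key_id" != https://*/keys/* ]]; then
    echo "Expected a Key Vault key identifier URL, got: $key_id" >&2
    return 1
  fi
  az batch account set --name "$account" --resource-group "$group" \
    --encryption-key-source Microsoft.KeyVault \
    --encryption-key-identifier "$key_id"
}

# Example call (placeholder URL; requires 'az login'):
# use_customer_managed_key "$batchAccount" "$resourceGroup" \
#   "https://<vault-name>.vault.azure.net/keys/<key-name>/<key-version>"
```

Rejecting an obviously malformed identifier before calling the service avoids a slower failure on the server side.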

DP-6: Use a secure key management process

Features

Key Management in Azure Key Vault

Description: The service supports Azure Key Vault integration for any customer keys,
secrets, or certificates. Learn more.

Supported | Enabled By Default | Configuration Responsibility
True | False | Shared

Configuration Guidance: Use Azure Key Vault to create and control the life cycle of your
encryption keys, including key generation, distribution, and storage. Rotate and revoke
your keys in Azure Key Vault and your service based on a defined schedule or when
there is a key retirement or compromise. When there is a need to use customer-
managed key (CMK) in the workload, service, or application level, ensure you follow the
best practices for key management: Use a key hierarchy to generate a separate data
encryption key (DEK) with your key encryption key (KEK) in your key vault. Ensure keys
are registered with Azure Key Vault and referenced via key IDs from the service or
application. If you need to bring your own key (BYOK) to the service (such as importing
HSM-protected keys from your on-premises HSMs into Azure Key Vault), follow
recommended guidelines to perform initial key generation and key transfer.

Note: Customers must opt in to use customer-managed keys; otherwise, the service uses
platform keys managed by Microsoft by default.

Reference: Configure customer-managed keys for your Azure Batch account with Azure
Key Vault and Managed Identity

DP-7: Use a secure certificate management process

Features

Certificate Management in Azure Key Vault

Description: The service supports Azure Key Vault integration for any customer
certificates. Learn more.

Supported    Enabled By Default    Configuration Responsibility
True         False                 Shared

Configuration Guidance: Use Azure Key Vault to create and control the certificate
lifecycle, including creation, importing, rotation, revocation, storage, and purging of the
certificate. Ensure the certificate generation follows defined standards without using any
insecure properties, such as insufficient key size, an overly long validity period, or insecure
cryptography. Set up automatic rotation of the certificate in Azure Key Vault and the
Azure service (if supported) based on a defined schedule or when a certificate
expires. If automatic rotation is not supported in the application, ensure certificates are still
rotated using manual methods in Azure Key Vault and the application.

Reference: Use certificates and securely access Azure Key Vault with Batch

Asset management
For more information, see the Microsoft cloud security benchmark: Asset management.

AM-2: Use only approved services

Features

Azure Policy Support

Description: Service configurations can be monitored and enforced via Azure Policy.
Learn more.

Supported    Enabled By Default    Configuration Responsibility
True         False                 Customer

Configuration Guidance: Use Microsoft Defender for Cloud to configure Azure Policy to
audit and enforce configurations of your Azure resources. Use Azure Monitor to create
alerts when there is a configuration deviation detected on the resources. Use Azure
Policy [deny] and [deploy if not exists] effects to enforce a secure configuration across
Azure resources.

For any scenarios where built-in policy definitions don't exist, you can use Azure Policy
aliases in the "Microsoft.Batch" namespace to create custom policies.

Reference: Azure Policy built-in definitions for Azure Batch

AM-5: Use only approved applications in virtual machine

Features

Microsoft Defender for Cloud - Adaptive Application Controls

Description: Service can limit what customer applications run on the virtual machine
using Adaptive Application Controls in Microsoft Defender for Cloud. Learn more.


Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Logging and threat detection


For more information, see the Microsoft cloud security benchmark: Logging and threat
detection.

LT-1: Enable threat detection capabilities

Features

Microsoft Defender for Service / Product Offering

Description: Service has an offering-specific Microsoft Defender solution to monitor and
alert on security issues. Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

LT-4: Enable logging for security investigation

Features

Azure Resource Logs

Description: Service produces resource logs that can provide enhanced service-specific
metrics and logging. The customer can configure these resource logs and send them to
their own data sink like a storage account or log analytics workspace. Learn more.


Supported    Enabled By Default    Configuration Responsibility
True         False                 Customer

Configuration Guidance: Enable Azure resource logs for Azure Batch for the following
log types: ServiceLog and AllMetrics.

Reference: Batch metrics, alerts, and logs for diagnostic evaluation and monitoring
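
As a rough illustration of the guidance above, a diagnostic setting that routes the two categories to a Log Analytics workspace might carry a request body like the one built below. This is a sketch that assumes the general Azure Monitor diagnostic settings schema; the workspace resource ID is a placeholder and the helper name is hypothetical.

```python
import json

# Placeholder Log Analytics workspace resource ID (hypothetical values).
WORKSPACE_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"
)


def build_diagnostic_setting(workspace_id: str) -> dict:
    """Return a diagnostic-settings body enabling ServiceLog and AllMetrics."""
    return {
        "properties": {
            "workspaceId": workspace_id,
            # Resource log category named in the configuration guidance.
            "logs": [{"category": "ServiceLog", "enabled": True}],
            # Metric category named in the configuration guidance.
            "metrics": [{"category": "AllMetrics", "enabled": True}],
        }
    }


diagnostic_setting = build_diagnostic_setting(WORKSPACE_ID)
print(json.dumps(diagnostic_setting, indent=2))
```

The same body could target a storage account instead of a workspace; which sink to use depends on your retention and query needs.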

Posture and vulnerability management


For more information, see the Microsoft cloud security benchmark: Posture and
vulnerability management.

PV-3: Define and establish secure configurations for compute resources

Features

Azure Automation State Configuration

Description: Azure Automation State Configuration can be used to maintain the security
configuration of the operating system. Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Azure Policy Guest Configuration Agent

Description: Azure Policy guest configuration agent can be installed or deployed as an
extension to compute resources. Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Custom VM Images

Description: Service supports using user-supplied VM images or pre-built images from
the marketplace with certain baseline configurations pre-applied. Learn more.

Supported    Enabled By Default    Configuration Responsibility
True         False                 Shared

Configuration Guidance: When possible, use a pre-configured hardened image from a
trusted supplier such as Microsoft or build a desired secure configuration baseline into
the VM image template.

Customers may also use custom operating system images for Azure Batch. When using
the virtual machine configuration for your Azure Batch, ensure custom images are
hardened to your organization's needs. For lifecycle management, the pools store the
images in a shared image gallery. You can set up a secure image build process using
Azure automation tools, such as Azure Image Builder.

Reference: Use a managed image to create a custom image pool

Custom Container Images

Description: Service supports using user-supplied container images or pre-built images
from the marketplace with certain baseline configurations pre-applied. Learn more.

Supported    Enabled By Default    Configuration Responsibility
True         False                 Shared

Configuration Guidance: If you use a Batch pool to run tasks in Docker-compatible
containers on the nodes, use pre-configured hardened container images from a trusted
supplier such as Microsoft or build the desired secure configuration baseline into the
container image template.

Reference: Run container applications on Azure Batch

PV-5: Perform vulnerability assessments

Features

Vulnerability Assessment using Microsoft Defender

Description: Service can be scanned for vulnerabilities using Microsoft Defender for
Cloud or the embedded vulnerability assessment capability of other Microsoft Defender
services (including Microsoft Defender for server, container registry, App Service, SQL,
and DNS). Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

PV-6: Rapidly and automatically remediate vulnerabilities

Features

Azure Automation Update Management

Description: Service can use Azure Automation Update Management to deploy patches
and updates automatically. Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.


Endpoint security
For more information, see the Microsoft cloud security benchmark: Endpoint security.

ES-1: Use Endpoint Detection and Response (EDR)

Features

EDR Solution

Description: An Endpoint Detection and Response (EDR) feature, such as Azure Defender for
servers, can be deployed to the endpoint. Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

ES-2: Use modern anti-malware software

Features

Anti-Malware Solution

Description: An anti-malware feature, such as Microsoft Defender Antivirus or Microsoft
Defender for Endpoint, can be deployed on the endpoint. Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

ES-3: Ensure anti-malware software and signatures are updated

Features

Anti-Malware Solution Health Monitoring

Description: Anti-malware solution provides health status monitoring for platform,
engine, and automatic signature updates. Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Backup and recovery


For more information, see the Microsoft cloud security benchmark: Backup and recovery.

BR-1: Ensure regular automated backups

Features

Azure Backup

Description: The service can be backed up by the Azure Backup service. Learn more.

Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Service Native Backup Capability

Description: Service supports its own native backup capability (if not using Azure
Backup). Learn more.


Supported    Enabled By Default    Configuration Responsibility
False        Not Applicable        Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Next steps
See the Microsoft cloud security benchmark overview
Learn more about Azure security baselines



Batch security and compliance best practices
Article • 11/21/2024

This article provides guidance and best practices for enhancing security when using
Azure Batch.

By default, Azure Batch accounts have a public endpoint and are publicly accessible.
When an Azure Batch pool is created, the pool is provisioned in a specified subnet of an
Azure virtual network. Virtual machines in the Batch pool are accessed, by default,
through public IP addresses that Batch creates. Compute nodes in a pool can
communicate with each other when needed, such as to run multi-instance tasks, but
nodes in a pool can't communicate with virtual machines outside of the pool.

Many features are available to help you create a more secure Azure Batch deployment.
You can restrict access to nodes and reduce the discoverability of the nodes from the
internet by provisioning the pool without public IP addresses. The compute nodes can
securely communicate with other virtual machines or with an on-premises network by
provisioning the pool in a subnet of an Azure virtual network. You can also enable
private access from virtual networks through a service powered by Azure Private Link.
General security-related best practices

Pool configuration
Pools can be configured in one of two node communication modes, classic or simplified.
In the classic node communication model, the Batch service initiates communication to
the compute nodes, and the compute nodes must also communicate with Azure Storage.
In the simplified node communication model, compute nodes initiate communication
with the Batch service. Due to the reduced scope of inbound/outbound connections
required, and not requiring Azure Storage outbound access for baseline operation, the
recommendation is to use the simplified node communication model. The classic node
communication model will be retired on March 31, 2026.

Pools should also be configured with enhanced security settings, including Trusted
Launch (requires Gen2 VM images and a compatible VM size), enabling secure boot,
vTPM, and encryption at host (requires a compatible VM size).
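
The fragment below sketches how these settings might appear in a pool definition. It assumes property names as I understand them from the Batch service REST API (targetNodeCommunicationMode, securityProfile, uefiSettings); treat both the names and the value casing as assumptions and confirm them against the API version you target.

```python
import json


def secure_pool_settings() -> dict:
    """Build the security-related part of a pool definition (illustrative only)."""
    return {
        # Simplified mode: compute nodes initiate communication with Batch.
        "targetNodeCommunicationMode": "simplified",
        "virtualMachineConfiguration": {
            "securityProfile": {
                # Trusted Launch requires a Gen2 image and a compatible VM size.
                "securityType": "trustedLaunch",
                # Encryption at host also requires a compatible VM size.
                "encryptionAtHost": True,
                "uefiSettings": {
                    "secureBootEnabled": True,
                    "vTpmEnabled": True,
                },
            }
        },
    }


pool_settings = secure_pool_settings()
print(json.dumps(pool_settings, indent=2))
```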

Batch account authentication


Batch account access supports two methods of authentication: Shared Key and
Microsoft Entra ID.

We strongly recommend using Microsoft Entra ID for Batch account authentication.
Some Batch capabilities require this method of authentication, including many of the
security-related features discussed here. The service API authentication mechanism for a
Batch account can be restricted to only Microsoft Entra ID using the
allowedAuthenticationModes property. When this property is set, API calls using Shared
Key authentication are rejected.
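
A minimal sketch of an account update body that leaves Shared Key out of allowedAuthenticationModes is shown below. The mode strings ("AAD", "TaskAuthenticationToken") reflect my understanding of the Batch management API and should be treated as assumptions; the helper name is hypothetical.

```python
import json


def entra_only_update_body(allow_task_token: bool = True) -> dict:
    """Return an account update body that omits Shared Key authentication."""
    # "AAD" is assumed to be the API value for Microsoft Entra ID.
    modes = ["AAD"]
    if allow_task_token:
        # Assumed value that lets tasks keep using the task authentication token.
        modes.append("TaskAuthenticationToken")
    return {"properties": {"allowedAuthenticationModes": modes}}


update_body = entra_only_update_body()
print(json.dumps(update_body, indent=2))
```

Because "SharedKey" is absent from the list, requests signed with an account key would be rejected once the update is applied.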

Batch account pool allocation mode


When creating a Batch account, you can choose between two pool allocation modes:

Batch service: The default option, where the underlying Virtual Machine Scale Set
resources used to allocate and manage pool nodes are created on Batch-owned
subscriptions, and aren't directly visible in the Azure portal. Only the Batch pools
and nodes are visible.
User subscription: The underlying Virtual Machine Scale Set resources are created
in the same subscription as the Batch account. These resources are therefore
visible in the subscription, in addition to the corresponding Batch resources.

With user subscription mode, Batch VMs and other resources are created directly in your
subscription when a pool is created. User subscription mode is required if you want to
create Batch pools using Azure Reserved VM Instances, use Azure Policy on Virtual
Machine Scale Set resources, and/or manage the core quota on the subscription (shared
across all Batch accounts in the subscription). To create a Batch account in user
subscription mode, you must also register your subscription with Azure Batch, and
associate the account with an Azure Key Vault.

Restrict network endpoint access

Batch network endpoints


By default, endpoints with public IP addresses are used to communicate with Batch
accounts, Batch pools, and pool nodes.

Batch account API


When a Batch account is created, a public endpoint is created that is used to invoke
most operations for the account using a REST API. The account endpoint has a base URL
using the format https://{account-name}.{region-id}.batch.azure.com . Access to the
Batch account is secured, with communication to the account endpoint being encrypted
using HTTPS, and each request authenticated using either shared key or Microsoft Entra
authentication.
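
The URL format above can be composed and checked in a few lines. The sketch below uses only the Python standard library; the account name and region are made-up examples.

```python
from urllib.parse import urlparse


def batch_account_endpoint(account_name: str, region_id: str) -> str:
    """Compose the account endpoint from the documented URL format."""
    return f"https://{account_name}.{region_id}.batch.azure.com"


def require_https(url: str) -> str:
    """Reject any endpoint that does not use HTTPS."""
    if urlparse(url).scheme != "https":
        raise ValueError(f"Batch endpoints must use https://, got: {url}")
    return url


# Hypothetical account name and region, for illustration only.
endpoint = require_https(batch_account_endpoint("mybatchaccount", "westus2"))
print(endpoint)
```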

Azure Resource Manager


In addition to operations specific to a Batch account, management operations apply to
single and multiple Batch accounts. These management operations are accessed via
Azure Resource Manager.

Batch management operations via Azure Resource Manager are encrypted using HTTPS,
and each request is authenticated using Microsoft Entra authentication.

Batch pool compute nodes


The Batch service communicates with a Batch node agent that runs on each node in the
pool. For example, the service instructs the node agent to run a task, stop a task, or get
the files for a task. Communication with the node agent is enabled by one or more load
balancers, the number of which depends on the number of nodes in a pool. The load
balancer forwards the communication to the desired node, with each node being
addressed by a unique port number. By default, load balancers have public IP addresses
associated with them. You can also remotely access pool nodes via RDP or SSH; see
Configure remote access to compute nodes in an Azure Batch pool.

Batch compute node OS


Batch supports both Linux and Windows operating systems. Batch supports Linux with
an aligned node agent for a subset of Linux OS distributions. It's recommended that the
operating system is kept up-to-date with the latest patches provided by the OS
publisher.

It's recommended to enable Auto OS upgrade for Batch pools, which allows the
underlying Azure infrastructure to coordinate updates across the pool. This option can
be configured to be nondisrupting for task execution. Automatic OS upgrade doesn't
support all operating systems that Batch supports. For more information, see the Virtual
Machine Scale Sets Auto OS upgrade Support Matrix. For Windows operating systems,
ensure that you aren't enabling the property
virtualMachineConfiguration.windowsConfiguration.enableAutomaticUpdates when using
Auto OS upgrade on the Batch pool.
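
A small pre-flight check can catch this combination before a pool is created. The sketch below works on a plain dictionary shaped like a pool definition; the upgradePolicy.mode key is an assumption about how Auto OS upgrade is expressed, and the function name is hypothetical.

```python
def has_update_conflict(pool: dict, missing_means_enabled: bool = False) -> bool:
    """Return True when Auto OS upgrade and Windows automatic updates are both on."""
    # Assumption: Auto OS upgrade is expressed as upgradePolicy.mode == "automatic".
    upgrade_mode = str(pool.get("upgradePolicy", {}).get("mode", "")).lower()
    if upgrade_mode != "automatic":
        return False
    windows = pool.get("virtualMachineConfiguration", {}).get("windowsConfiguration")
    if windows is None:
        return False  # Not a Windows pool, so the property doesn't apply.
    setting = windows.get("enableAutomaticUpdates")
    if setting is None:
        # The caller decides how to treat an omitted property.
        return missing_means_enabled
    return bool(setting)


conflicting_pool = {
    "upgradePolicy": {"mode": "automatic"},
    "virtualMachineConfiguration": {
        "windowsConfiguration": {"enableAutomaticUpdates": True}
    },
}
safe_pool = {
    "upgradePolicy": {"mode": "automatic"},
    "virtualMachineConfiguration": {
        "windowsConfiguration": {"enableAutomaticUpdates": False}
    },
}
print(has_update_conflict(conflicting_pool), has_update_conflict(safe_pool))
```

The missing_means_enabled flag exists because a service may default an omitted property to enabled; check the API reference before relying on either behavior.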

Batch support for images and node agents phase out over time, typically aligned with
publisher support timelines. It's recommended to avoid using images with impending
end-of-life (EOL) dates or images that are past their EOL date. It's your responsibility to
periodically refresh your view of the EOL dates pertinent to your pools and migrate your
workloads before the EOL date occurs. If you're using a custom image with a specified
node agent, ensure that you follow the Batch support end-of-life dates for the image from
which your custom image is derived or with which it is aligned. An image without a specified
batchSupportEndOfLife date indicates that such a date hasn't been determined yet by
the Batch service. Absence of a date doesn't indicate that the respective image will be
supported indefinitely. An EOL date may be added or updated in the future at any time.
EOL dates can be discovered via the ListSupportedImages API, PowerShell, or Azure CLI.
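
One way to act on this guidance is to periodically scan the supported-image list for approaching EOL dates. The sketch below assumes each entry is a dictionary with a nodeAgentSKUId and an optional ISO 8601 batchSupportEndOfLife value; the sample entries and SKU names are invented for illustration and were not returned by the service.

```python
from datetime import datetime, timedelta


def parse_eol(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def images_nearing_eol(images: list, now: datetime, within_days: int = 90) -> list:
    """Return node agent SKU IDs whose EOL date falls on or before the cutoff."""
    cutoff = now + timedelta(days=within_days)
    flagged = []
    for image in images:
        eol = image.get("batchSupportEndOfLife")
        if eol is None:
            # No date published yet; this is not a promise of indefinite support.
            continue
        if parse_eol(eol) <= cutoff:
            flagged.append(image["nodeAgentSKUId"])
    return flagged


# Invented sample data standing in for a ListSupportedImages response.
sample_images = [
    {"nodeAgentSKUId": "batch.node.example-a", "batchSupportEndOfLife": "2025-03-01T00:00:00Z"},
    {"nodeAgentSKUId": "batch.node.example-b", "batchSupportEndOfLife": "2027-01-01T00:00:00Z"},
    {"nodeAgentSKUId": "batch.node.example-c"},
]
now = parse_eol("2025-01-01T00:00:00Z")
flagged = images_nearing_eol(sample_images, now)
print(flagged)
```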

Windows OS Transport Layer Security (TLS)


The Batch node agent doesn't modify operating system level defaults for SSL/TLS
versions or cipher suite ordering. In Windows, SSL/TLS versions and cipher suite order are
controlled at the operating system level, and therefore the Batch node agent adopts the
settings set by the image used by each compute node. Although the Batch node agent
attempts to utilize the most secure settings available when possible, it can still be
limited by operating system level settings. We recommend that you review your OS level
defaults and set them appropriately for the most secure mode that is amenable for your
workflow and organizational requirements. For more information, please visit Manage
TLS for cipher suite order enforcement and TLS registry settings for SSL/TLS version
control for Schannel SSP. Note that some setting changes require a reboot to take
effect. Utilizing a newer operating system with modern security defaults or a custom
image with modified settings is recommended instead of application of such settings
with a Batch start task.

Restricting access to Batch endpoints


Several capabilities are available to limit access to the various Batch endpoints,
especially when the solution uses a virtual network.

Use private endpoints


Azure Private Link enables access to Azure PaaS Services and Azure hosted customer-
owned/partner services over a private endpoint in your virtual network. You can use
Private Link to restrict access to a Batch account from within the virtual network or from
any peered virtual network. Resources mapped to Private Link are also accessible on-
premises over private peering through VPN or Azure ExpressRoute.

To use private endpoints, a Batch account needs to be configured appropriately when
created; public network access configuration must be disabled. Once the account is created,
private endpoints can be created and associated with the Batch account. For more information,
see Use private endpoints with Azure Batch accounts.

Create pools in virtual networks

Compute nodes in a Batch pool can communicate with each other, such as to run multi-
instance tasks, without requiring a virtual network (VNET). However, by default, nodes in
a pool can't communicate with virtual machines that are outside of the pool on a virtual
network and have private IP addresses, such as license servers or file servers.

To allow compute nodes to communicate securely with other virtual machines, or with
an on-premises network, you can configure a pool to be in a subnet of an Azure VNET.

When the pools have public IP endpoints, the subnet must allow inbound
communication from the Batch service to be able to schedule tasks and perform other
operations on the compute nodes, and outbound communication to communicate with
Azure Storage or other resources as needed by your workload. For pools in the Virtual
Machine configuration, Batch adds network security groups (NSGs) at the network
interface level attached to compute nodes. These NSGs have rules to enable:
Inbound TCP traffic from Batch service IP addresses
Inbound TCP traffic for remote access
Outbound traffic on any port to the virtual network (may be amended per subnet-
level NSG rules)
Outbound traffic on any port to the internet (may be amended per subnet-level
NSG rules)

You don't have to specify NSGs at the virtual network subnet level, because Batch
configures its own NSGs. If you have an NSG associated with the subnet where Batch
compute nodes are deployed, or if you would like to apply custom NSG rules to override
the defaults applied, you must configure this NSG with at least the inbound and
outbound security rules in order to allow Batch service communication to the pool
nodes and pool node communication to Azure Storage.

For more information, see Create an Azure Batch pool in a virtual network.

Create pools with static public IP addresses


By default, the public IP addresses associated with pools are dynamic; they're created
when a pool is created and IP addresses can be added or removed when a pool is
resized. When the task applications running on pool nodes need to access external
services, access to those services may need to be restricted to specific IPs. In this case,
having dynamic IP addresses won't be manageable.

You can create static public IP address resources in the same subscription as the Batch
account before pool creation. You can then specify these addresses when creating your
pool.

For more information, see Create an Azure Batch pool with specified public IP addresses.

Create pools without public IP addresses


By default, all the compute nodes in an Azure Batch virtual machine configuration pool
are assigned one or more public IP addresses. These endpoints are used by the Batch
service to schedule tasks and for communication with compute nodes, including
outbound access to the internet.

To restrict access to these nodes and reduce the discoverability of these nodes from the
internet, you can provision the pool without public IP addresses.

For more information, see Create a pool without public IP addresses.
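
The three public IP options described in the preceding sections (Batch-managed dynamic addresses, addresses you supply, and no public addresses) can be expressed as a small configuration fragment. The sketch below assumes the publicIPAddressConfiguration shape and provision values as I understand them from the Batch REST API; verify them for your API version. The resource ID is a placeholder.

```python
def public_ip_config(ip_address_ids=None, no_public_ips: bool = False) -> dict:
    """Build a publicIPAddressConfiguration fragment (illustrative only)."""
    if no_public_ips:
        if ip_address_ids:
            raise ValueError("Cannot supply IP addresses for a pool without public IPs")
        return {"provision": "nopublicipaddresses"}
    if ip_address_ids:
        # Static addresses created in advance in the Batch account's subscription.
        return {"provision": "usermanaged", "ipAddressIds": list(ip_address_ids)}
    # Default: Batch creates and manages dynamic public IP addresses.
    return {"provision": "batchmanaged"}


# Placeholder resource ID for a pre-created static public IP address.
STATIC_IP_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.Network/publicIPAddresses/<ip-name>"
)

static_config = public_ip_config([STATIC_IP_ID])
private_config = public_ip_config(no_public_ips=True)
print(static_config, private_config)
```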


Limit remote access to pool nodes
For pools created with an API version earlier than 2024-07-01 , Batch by default permits a
node user with network connectivity to externally connect to a compute node in a Batch
pool using RDP or SSH.

To limit remote access, create your pools using an API version 2024-07-01 or later.

To limit remote access to nodes in pools created by API with version earlier than 2024-
07-01 , use one of the following methods:

Configure the PoolEndpointConfiguration to deny access. The appropriate network
security group (NSG) will be associated with the pool.
Create your pool without public IP addresses. By default, these pools can't be
accessed outside of the VNet.
Associate an NSG with the VNet to deny access to the RDP or SSH ports.
Don't create any users on the node. Without any node users, remote access won't
be possible.
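
The first method in the list above can be sketched as an endpoint configuration whose inbound NAT pool carries a single deny rule. The field names, the frontend port range, and the rule priority below are assumptions based on my reading of the Batch REST API and its published examples; confirm the allowed ranges before using them.

```python
def deny_remote_access(name: str, backend_port: int) -> dict:
    """Build an endpoint configuration that denies inbound access to one port."""
    return {
        "inboundNATPools": [
            {
                "name": name,
                "protocol": "tcp",
                "backendPort": backend_port,
                # Illustrative frontend range; the service reserves some ports.
                "frontendPortRangeStart": 60000,
                "frontendPortRangeEnd": 60099,
                "networkSecurityGroupRules": [
                    # Deny traffic from every source address.
                    {"priority": 150, "access": "deny", "sourceAddressPrefix": "*"}
                ],
            }
        ]
    }


deny_ssh = deny_remote_access("SSH", 22)      # Linux nodes
deny_rdp = deny_remote_access("RDP", 3389)    # Windows nodes
print(deny_ssh["inboundNATPools"][0]["networkSecurityGroupRules"])
```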

Encrypt data

Encrypt data in transit


All communication to the Batch account endpoint (or via Azure Resource Manager) must
use HTTPS. You must use https:// in the Batch account URLs specified in APIs when
connecting to the Batch service.

Clients communicating with the Batch service should be configured to use Transport
Layer Security (TLS) 1.2.
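
For clients written in Python, one way to enforce this is to set a minimum protocol version on the SSL context the HTTP client uses. The sketch below uses only the standard library ssl module; how the context is passed to a particular HTTP or SDK client is outside its scope.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Create a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    # Certificate verification stays enabled; only the protocol floor changes.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


tls_context = make_tls12_context()
print(tls_context.minimum_version)
```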

Encrypt Batch data at rest


Some of the information specified in Batch APIs, such as account certificates, job and
task metadata, and task command lines, is automatically encrypted when stored by the
Batch service. By default, this data is encrypted using Azure Batch platform-managed
keys unique to each Batch account.

You can also encrypt this data using customer-managed keys. Azure Key Vault is used to
generate and store the key, with the key identifier registered with your Batch account.
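
As an illustration, registering a key identifier with a Batch account might involve an encryption block like the one built below. The keySource value and keyVaultProperties shape are assumptions drawn from my understanding of the Batch management API; the vault and key names are placeholders.

```python
import json


def cmk_encryption_properties(key_identifier: str) -> dict:
    """Build the encryption block that points a Batch account at a Key Vault key."""
    if not key_identifier.startswith("https://"):
        raise ValueError("Key identifiers are expected to be https:// URLs")
    return {
        "encryption": {
            # Assumed value selecting customer-managed keys over platform keys.
            "keySource": "Microsoft.KeyVault",
            "keyVaultProperties": {"keyIdentifier": key_identifier},
        }
    }


# Placeholder key identifier; not a real vault or key.
KEY_ID = "https://<vault-name>.vault.azure.net/keys/<key-name>/<key-version>"
encryption_block = cmk_encryption_properties(KEY_ID)
print(json.dumps(encryption_block, indent=2))
```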

Encrypt compute node disks


Batch compute nodes have two disks by default: an OS disk and the local temporary
SSD. Files and directories managed by Batch are located on the temporary SSD, which is
the default location for files such as task output files. Batch task applications can use the
default location on the SSD or the OS disk.

For extra security, encrypt these disks using one of these Azure disk encryption
capabilities:

Managed disk encryption at rest with platform-managed keys
Encryption at host using a platform-managed key
Azure Disk Encryption

Securely access services from compute nodes


Use Pool managed identities with the appropriate access permissions configured for the
user-assigned managed identity to access Azure services that support managed identity,
including Azure Key Vault. If you need to provision certificates on Batch nodes, utilize
the available Azure Key Vault VM extension with pool Managed Identity to install and
manage certificates on your Batch pool. For more information on deploying certificates
from Azure Key Vault with Managed Identity on Batch pools, see Enable automatic
certificate rotation in a Batch pool.

Governance and compliance

Compliance
To help customers meet their own compliance obligations across regulated industries
and markets worldwide, Azure maintains a large portfolio of compliance offerings.

These offerings are based on various types of assurances, including formal certifications,
attestations, validations, authorizations, and assessments produced by independent
third-party auditing firms, as well as contractual amendments, self-assessments, and
customer guidance documents produced by Microsoft. Review the comprehensive
overview of compliance offerings to determine which ones may be relevant to your
Batch solutions.

Azure Policy
Azure Policy helps to enforce organizational standards and to assess compliance at
scale. Common use cases for Azure Policy include implementing governance for
resource consistency, regulatory compliance, security, cost, and management.

Depending on your pool allocation mode and the resources to which a policy should
apply, use Azure Policy with Batch in one of the following ways:

Directly, using the Microsoft.Batch/batchAccounts resource. A subset of the
properties for a Batch account can be used. For example, your policy can include
valid Batch account regions, allowed pool allocation mode, and whether a public
network is enabled for accounts.
Indirectly, using the Microsoft.Compute/virtualMachineScaleSets resource. Batch
accounts with user subscription pool allocation mode can have policy set on the
Virtual Machine Scale Set resources that are created in the Batch account
subscription. For example, your policy can restrict the allowed VM sizes and ensure that
certain extensions are run on each pool node.
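
To make the direct approach concrete, the sketch below builds a policy rule that denies Batch accounts outside an approved set of regions. It follows the general Azure Policy rule structure (an if condition and a then effect); the region names are arbitrary examples and the helper name is hypothetical.

```python
import json


def allowed_regions_policy(allowed_regions) -> dict:
    """Build a policy rule that denies Batch accounts outside the given regions."""
    return {
        "if": {
            "allOf": [
                {"field": "type", "equals": "Microsoft.Batch/batchAccounts"},
                {"field": "location", "notIn": list(allowed_regions)},
            ]
        },
        "then": {"effect": "deny"},
    }


policy_rule = allowed_regions_policy(["westeurope", "northeurope"])
print(json.dumps(policy_rule, indent=2))
```

A rule targeting Microsoft.Compute/virtualMachineScaleSets would follow the same pattern for the indirect approach, with conditions on the scale set properties instead.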

Next steps
Review the Azure security baseline for Batch.
Read more best practices for Azure Batch.



Batch service workflow and resources
Article • 04/02/2025

In this overview of the core components of the Azure Batch service, we discuss the high-
level workflow that Batch developers can use to build large-scale parallel compute
solutions, along with the primary service resources that are used.

Whether you're developing a distributed computational application or service that
issues direct REST API calls or you're using one of the Batch SDKs, you'll use
many of the resources and features discussed here.

Tip

For a higher-level introduction to the Batch service, see What is Azure Batch?. Also
see the latest Batch service updates.

Basic workflow
The following high-level workflow is typical of nearly all applications and services that
use the Batch service for processing parallel workloads:

1. Upload the data files that you want to process to an Azure Storage account. Batch
includes built-in support for accessing Azure Blob storage, and your tasks can
download these files to compute nodes when the tasks are run.
2. Upload the application files that your tasks will run. These files can be binaries or
scripts and their dependencies, and are executed by the tasks in your jobs. Your
tasks can download these files from your Storage account, or you can use the
application packages feature of Batch for application management and
deployment.
3. Create a pool of compute nodes. When you create a pool, you specify the number
of compute nodes for the pool, their size, and the operating system. When each
task in your job runs, it's assigned to execute on one of the nodes in your pool.
4. Create a job. A job manages a collection of tasks. You associate each job to a
specific pool where that job's tasks will run.
5. Add tasks to the job. Each task runs the application or script that you uploaded to
process the data files it downloads from your Storage account. As each task
completes, it can upload its output to Azure Storage.
6. Monitor job progress and retrieve the task output from Azure Storage.
Note

You need a Batch account to use the Batch service. Most Batch solutions also use
an associated Azure Storage account for file storage and retrieval.
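
The relationship between pools, jobs, and tasks in steps 3 through 5 can be pictured with a toy in-memory model. The code below does not call the Batch service or any Batch SDK; every class and name in it is hypothetical, and the round-robin assignment is only a stand-in for the service's actual scheduling.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import List, Optional


@dataclass
class Task:
    task_id: str
    command_line: str
    node: Optional[str] = None  # Filled in when the task is scheduled.


@dataclass
class Job:
    job_id: str
    pool_id: str  # Each job is associated with one pool.
    tasks: List[Task] = field(default_factory=list)


def schedule(job: Job, nodes: List[str]) -> None:
    """Assign each task to a node in turn (a simplification of real scheduling)."""
    for task, node in zip(job.tasks, cycle(nodes)):
        task.node = node


# Step 3: a pool with two nodes. Step 4: a job bound to that pool.
pool_nodes = ["node-0", "node-1"]
job = Job(job_id="demo-job", pool_id="demo-pool")

# Step 5: one task per input file.
for index, blob in enumerate(["input1.txt", "input2.txt", "input3.txt"]):
    job.tasks.append(Task(f"task-{index}", f"process {blob}"))

schedule(job, pool_nodes)
assignments = {task.task_id: task.node for task in job.tasks}
print(assignments)
```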

Batch service resources


The following topics discuss the resources of Batch that enable your distributed
computational scenarios.

Batch accounts and storage accounts
Nodes and pools
Jobs and tasks
Files and directories

Next steps
Learn about the Batch APIs and tools available for building Batch solutions.
Learn the basics of developing a Batch-enabled application using the Batch .NET
client library or Python. These quickstarts guide you through a sample application
that uses the Batch service to execute a workload on multiple compute nodes, and
includes using Azure Storage for workload file staging and retrieval.
Download and install Batch Explorer for use while you develop your Batch
solutions. Use Batch Explorer to help create, debug, and monitor Azure Batch
applications.
See community resources including Stack Overflow, the Batch Community
repo, and the Azure Batch forum.



Batch accounts and Azure Storage accounts
Article • 04/02/2025

An Azure Batch account is a uniquely identified entity within the Batch service. Many
Batch solutions use Azure Storage for storing resource files and output files, so each
Batch account can be optionally associated with a corresponding storage account.

Batch accounts
All processing and resources, such as tasks, jobs, and Batch pools, are associated with a
Batch account. When your application makes a request against the Batch service, it
authenticates the request using the Azure Batch account name, the account URL, and
either an access key or a Microsoft Entra token.

You can run multiple Batch workloads in a single Batch account. You can also distribute
your workloads among Batch accounts that are in the same subscription but located in
different Azure regions.

You can create a Batch account using the Azure portal or programmatically, such as with
the Batch Management .NET library. When creating the account, you can associate an
Azure storage account for storing job-related input and output data or applications.

When you create a Batch account, you can choose between user subscription and Batch
service pool allocation modes. For most cases, you should use the default Batch service
pool allocation mode. In Batch service mode, compute and virtual machine (VM)-related
resources for pools are allocated on Batch service managed Azure subscriptions.

In user subscription pool allocation mode, compute and VM-related resources for pools
are created directly in the Batch account subscription when a pool is created. In
scenarios where you create a Batch pool in a virtual network that you specify, certain
networking related resources are created in the subscription of the virtual network.

To create a Batch account in user subscription pool allocation mode, you must also
register your subscription with Azure Batch, and associate the account with Azure Key
Vault. For more information about requirements for user subscription pool allocation
mode, see Configure user subscription mode.

Azure Storage accounts


Most Batch solutions use Azure Storage for storing resource files and output files. For
example, your Batch tasks (including standard tasks, start tasks, job preparation tasks,
and job release tasks) typically specify resource files that reside in a storage account.
Storage accounts also store the data that is processed and any output data that is
generated.

Batch supports the following types of Azure Storage accounts:

General-purpose v2 (GPv2) accounts


General-purpose v1 (GPv1) accounts
Blob storage accounts (currently supported for pools in the Virtual Machine
configuration)

Important

You can't use the Application Packages or Azure storage-based virtual file system
mount features with Azure Storage accounts configured with firewall rules, or with
Hierarchical namespace set to Enabled.

For more information about storage accounts, see Azure storage account overview.

You can associate a storage account with your Batch account when you create the Batch
account, or later. Consider your cost and performance requirements when choosing a
storage account. For example, the GPv2 and blob storage account options support
greater capacity and scalability limits compared with GPv1. (Contact Azure Support to
request an increase in a storage limit.) These account options can improve the
performance of Batch solutions that contain a large number of parallel tasks that read
from or write to the storage account.

When a storage account is linked to a Batch account, it becomes the autostorage
account. An autostorage account is necessary if you intend to use the application
packages capability, as it stores the application package .zip files. It can also be used for
task resource files. Linking Batch accounts to autostorage can avoid the need for shared
access signature (SAS) URLs to access the resource files.

Note

Batch nodes automatically unzip application package .zip files when they are pulled
down from a linked storage account. This can cause the compute node local
storage to fill up. For more information, see Manage Batch application package.

Next steps
Learn about Nodes and pools.
Learn how to create and manage Batch accounts using the Azure portal or Batch
Management .NET.
Learn how to use private endpoints with Azure Batch accounts.



Nodes and pools in Azure Batch
Article • 04/17/2025

In an Azure Batch workflow, a compute node (or node) is a virtual machine that processes a
portion of your application's workload. A pool is a collection of these nodes for your
application to run on. This article explains more about nodes and pools, along with
considerations when creating and using them in an Azure Batch workflow.

Nodes
A node is an Azure virtual machine (VM) or cloud service VM that is dedicated to processing a
portion of your application's workload. The size of a node determines the number of CPU
cores, memory capacity, and local file system size that is allocated to the node.

You can create pools of Windows or Linux nodes by using Azure Cloud Services, images from
the Azure Virtual Machines Marketplace, or custom images that you prepare.

Nodes can run any executable or script supported by the operating system environment of the
node. Executables or scripts include *.exe, *.cmd, *.bat, and PowerShell scripts (for Windows)
and binaries, shell, and Python scripts (for Linux).

All compute nodes in Batch also include:

A standard folder structure and associated environment variables that are available for
reference by tasks.
Firewall settings that are configured to control access.
Remote access to both Windows (Remote Desktop Protocol (RDP)) and Linux (Secure
Shell (SSH)) nodes (unless you create your pool with remote access disabled).

By default, nodes can communicate with each other, but they can't communicate with virtual
machines that aren't part of the same pool. To allow nodes to communicate securely with other
virtual machines, or with an on-premises network, you can provision the pool in a subnet of an
Azure virtual network (VNet). When you do so, your nodes can be accessed through public IP
addresses. Batch creates these public IP addresses, and they may change over the lifetime of
the pool. You can also create a pool with static public IP addresses that you control, which
ensures that they don't change unexpectedly.

Pools
A pool is the collection of nodes that your application runs on.

Azure Batch pools build on top of the core Azure compute platform. They provide large-scale
allocation, application installation, data distribution, health monitoring, and flexible adjustment
(scaling) of the number of compute nodes within a pool.

Every node that is added to a pool is assigned a unique name and IP address. When a node is
removed from a pool, any changes that are made to the operating system or files are lost, and
its name and IP address are released for future use. When a node leaves a pool, its lifetime is
over.

A pool can only be used by the Batch account in which it was created. A Batch account can
create multiple pools to meet the resource requirements of the applications that need to run.

The pool can be created manually, or automatically by the Batch service when you specify the
work to be done. When you create a pool, you can specify the following attributes:

Operating system and version
Configurations
Virtual Machine Configuration
Node Agent SKUs
Custom images for Virtual Machine pools
Container support in Virtual Machine pools
Node type and target
Node size
Automatic scaling policy
Task scheduling policy
Communication status
Start tasks
Application packages
Virtual network (VNet) and firewall configuration
VNet requirements
Pool and compute node lifetime
Autopools
Security with certificates

Important

Batch accounts have a default quota that limits the number of cores in a Batch account.
The number of cores corresponds to the number of compute nodes. You can find the
default quotas and instructions on how to increase a quota in Quotas and limits for the
Azure Batch service. If your pool isn't achieving its target number of nodes, the core
quota might be the reason.

Operating system and version


When you create a Batch pool, you specify the Azure virtual machine configuration and the
type of operating system you want to run on each compute node in the pool.

Configurations

Virtual Machine Configuration


The Virtual Machine Configuration specifies that the pool is composed of Azure virtual
machines. These VMs may be created from either Linux or Windows images.

The Batch node agent is a program that runs on each node in the pool and provides the
command-and-control interface between the node and the Batch service. There are different
implementations of the node agent, known as SKUs, for different operating systems. When you
create a pool based on the Virtual Machine Configuration, you must specify not only the size of
the nodes and the source of the images used to create them, but also the virtual machine
image reference and the Batch node agent SKU to be installed on the nodes. For more
information about specifying these pool properties, see Provision Linux compute nodes in
Azure Batch pools. You can optionally attach one or more empty data disks to pool VMs
created from Marketplace images, or include data disks in custom images used to create the
VMs. When including data disks, you need to mount and format the disks from within a VM to
use them.

Node Agent SKUs


When you create a pool, you need to select the appropriate nodeAgentSkuId, depending on
the OS of the base image of your VHD. You can get a mapping of available node agent SKU IDs
to their OS Image references by calling the List Supported Node Agent SKUs operation.

Custom images for Virtual Machine pools


To learn how to create a pool with custom images, see Use the Azure Compute Gallery to
create a custom pool.

Container support in Virtual Machine pools
When creating a Virtual Machine Configuration pool using the Batch APIs, you can set up the
pool to run tasks in Docker containers. Currently, you must create the pool using an image that
supports Docker containers. Use the Windows Server 2016 Datacenter with Containers image
from the Azure Marketplace, or supply a custom VM image that includes Docker Community
Edition or Enterprise Edition and any required drivers. The pool settings must include a
container configuration that copies container images to the VMs when the pool is created.
Tasks that run on the pool can then reference the container images and container run options.

For more information, see Run Docker container applications on Azure Batch.

Node type and target


When you create a pool, you can specify which types of nodes you want and the target number
for each. The two types of nodes are:

Dedicated nodes. Dedicated compute nodes are reserved for your workloads. They're
typically more expensive than Spot nodes, but they're guaranteed to never be preempted.
Spot nodes. Spot nodes take advantage of surplus capacity in Azure to run your Batch
workloads. Spot nodes are less expensive per hour than dedicated nodes, and enable
workloads requiring significant compute power. For more information, see Use Spot VMs
with Batch.

Spot nodes may be preempted when Azure has insufficient surplus capacity. If a node is
preempted while running tasks, the tasks are requeued and run again once a compute node
becomes available again. Spot nodes are a good option for workloads where the job
completion time is flexible and the work is distributed across many nodes. Before you decide to
use Spot nodes for your scenario, make sure that any work lost due to preemption is minimal
and easy to resume or recreate.

You can have both Spot and dedicated compute nodes in the same pool. Each type of node
has its own target setting, for which you can specify the desired number of nodes.

The number of compute nodes is referred to as a target because, in some situations, your pool
might not reach the desired number of nodes. For example, a pool might not achieve the
target if it reaches the core quota for your Batch account first. Or, the pool might not achieve
the target if you applied an automatic scaling formula to the pool that limits the maximum
number of nodes.
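
As a rough illustration of why a pool might fall short of its target, the following Python sketch caps a requested node count by a core quota. The function name, its parameters, and the assumption that every node consumes the same number of cores are illustrative only and are not part of the Batch API.

```python
def achievable_node_count(target_nodes: int, cores_per_node: int,
                          core_quota: int, cores_in_use: int = 0) -> int:
    """Return how many nodes fit under a core quota, up to the target."""
    if cores_per_node < 1:
        raise ValueError("cores_per_node must be at least 1")
    # Cores still available under the (hypothetical) account quota.
    available_cores = max(core_quota - cores_in_use, 0)
    # Whole nodes that fit in the remaining cores.
    nodes_that_fit = available_cores // cores_per_node
    return min(max(target_nodes, 0), nodes_that_fit)
```

With a target of 20 four-core nodes and a 50-core quota, this sketch yields 12 nodes, so the pool would stay below its target until the quota is raised or the target is lowered.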

Note

When Batch Spot compute nodes are preempted, they first transition to the unusable state.
After some time, these compute nodes transition to the preempted state.
Batch automatically enables Try & restore behavior to restore evicted Spot instances with
a best-effort goal to maintain target instance counts.

For pricing information for both Spot and dedicated nodes, see Batch Pricing.

Node size
When you create an Azure Batch pool, you can choose from among almost all the VM families
and sizes available in Azure. Azure offers a range of VM sizes for different workloads, including
specialized HPC or GPU-enabled VM sizes. Node VM sizes can only be chosen at the time a
pool is created. In other words, once a pool is created, its VM size can't be changed.

For more information, see Choose a VM size for compute nodes in an Azure Batch pool.

Automatic scaling policy


For dynamic workloads, you can apply an automatic scaling policy to a pool. The Batch service
periodically evaluates your formula and dynamically adjusts the number of nodes within the
pool according to the current workload and resource usage of your compute scenario. This
allows you to lower the overall cost of running your application by using only the resources
you need, and releasing those you don't need.

You enable automatic scaling by writing an automatic scaling formula and associating that
formula with a pool. The Batch service uses the formula to determine the target number of
nodes in the pool for the next scaling interval (an interval that you can configure). You can
specify the automatic scaling settings for a pool when you create it, or enable scaling on a pool
later. You can also update the scaling settings on a scaling-enabled pool.

As an example, perhaps a job requires that you submit a large number of tasks to be executed.
You can assign a scaling formula to the pool that adjusts the number of nodes in the pool
based on the current number of queued tasks and the completion rate of the tasks in the job.
The Batch service periodically evaluates the formula and resizes the pool, based on workload
and your other formula settings. The service adds nodes as needed when there are a large
number of queued tasks, and removes nodes when there are no queued or running tasks.
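
The decision that such a formula encodes can be approximated in a few lines. The following is a minimal Python sketch of that logic, not the Batch autoscale formula language itself; the function name `target_node_count`, the `tasks_per_node` parameter, and the `max_nodes` cap are assumptions made for illustration.

```python
import math


def target_node_count(queued_tasks: int, running_tasks: int,
                      tasks_per_node: int = 1, max_nodes: int = 10) -> int:
    """Return a hypothetical target node count for the next scaling interval."""
    if tasks_per_node < 1 or max_nodes < 0:
        raise ValueError("tasks_per_node must be >= 1 and max_nodes >= 0")
    pending = max(queued_tasks, 0) + max(running_tasks, 0)
    # No queued or running tasks: release every node.
    if pending == 0:
        return 0
    # Enough nodes to hold all pending tasks, capped at max_nodes.
    needed = math.ceil(pending / tasks_per_node)
    return min(needed, max_nodes)
```

A real formula would typically work from sampled metrics over a time window rather than a single instantaneous count, so treat this only as a picture of the add-when-busy, remove-when-idle behavior.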

A scaling formula can be based on the following metrics:

Time metrics are based on statistics collected every five minutes in the specified number
of hours.
Resource metrics are based on CPU usage, bandwidth usage, memory usage, and
number of nodes.
Task metrics are based on task state, such as Active (queued), Running, or Completed.

When automatic scaling decreases the number of compute nodes in a pool, you must consider
how to handle tasks that are running at the time of the decrease operation. To accommodate
this, Batch provides a node deallocation option that you can include in your formulas. For
example, you can specify that running tasks are stopped immediately and then requeued for
execution on another node, or allowed to finish before the node is removed from the pool.
Setting the node deallocation option to taskcompletion or retaineddata prevents pool resize
operations until all tasks complete or until all task retention periods expire, respectively.

For more information about automatically scaling an application, see Automatically scale
compute nodes in an Azure Batch pool.

Tip

To maximize compute resource utilization, set the target number of nodes to zero at the
end of a job, but allow running tasks to finish.

Task scheduling policy


The max tasks per node configuration option determines the maximum number of tasks that
can be run in parallel on each compute node within the pool.

The default configuration specifies that one task at a time runs on a node, but there are
scenarios where it's beneficial to have two or more tasks executed on a node simultaneously.
See the example scenario in the concurrent node tasks article on how you can potentially
benefit from multiple tasks per node.

You can also specify a fill type, which determines whether Batch spreads the tasks evenly across
all nodes in a pool, or packs each node with the maximum number of tasks before assigning
tasks to another node.
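
To make the difference between the two fill types concrete, here is a small Python simulation. It is a simplified model under stated assumptions: the function `assign_tasks` and its parameters are hypothetical, the labels "pack" and "spread" are used only as names for the two behaviors, and the real scheduler considers factors this sketch ignores.

```python
from typing import Dict, List


def assign_tasks(task_ids: List[str], node_ids: List[str],
                 max_tasks_per_node: int, fill_type: str) -> Dict[str, List[str]]:
    """Assign tasks to nodes using a simplified 'pack' or 'spread' policy.

    Tasks that do not fit are left unassigned (they would stay queued).
    """
    if fill_type not in ("pack", "spread"):
        raise ValueError("fill_type must be 'pack' or 'spread'")
    assignment: Dict[str, List[str]] = {node: [] for node in node_ids}
    for task in task_ids:
        free = [n for n in node_ids if len(assignment[n]) < max_tasks_per_node]
        if not free:
            break  # every node is full; remaining tasks wait in the queue
        if fill_type == "pack":
            # Fill the first node that still has room before moving on.
            chosen = free[0]
        else:
            # Pick the least-loaded node so tasks are distributed evenly.
            chosen = min(free, key=lambda n: len(assignment[n]))
        assignment[chosen].append(task)
    return assignment
```

With four tasks, three nodes, and a limit of two tasks per node, packing uses only the first two nodes, while spreading places at least one task on every node.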

Communication status
In most scenarios, tasks operate independently and don't need to communicate with one
another. However, there are some applications in which tasks must communicate, like MPI
scenarios.

You can configure a pool to allow internode communication so that nodes within a pool can
communicate at runtime. When internode communication is enabled, nodes in Cloud Services
Configuration pools can communicate with each other on ports greater than 1100, and Virtual
Machine Configuration pools don't restrict traffic on any port.

Enabling internode communication also impacts the placement of the nodes within clusters
and might limit the maximum number of nodes in a pool because of deployment restrictions. If
your application doesn't require communication between nodes, the Batch service can allocate
a potentially large number of nodes to the pool from many different clusters and data centers
to enable increased parallel processing power.

Start tasks
If desired, you can add a start task that executes on each node as that node joins the pool, and
each time a node is restarted or reimaged. The start task is especially useful for preparing
compute nodes for the execution of tasks, like installing the applications that your tasks run on
the compute nodes.

Application packages
You can specify application packages to deploy to the compute nodes in the pool. Application
packages provide simplified deployment and versioning of the applications that your tasks run.
Application packages that you specify for a pool are installed on every node that joins that
pool, and every time a node is rebooted or reimaged.

For more information about using application packages to deploy your applications to your
Batch nodes, see Deploy applications to compute nodes with Batch application packages.

Virtual network (VNet) and firewall configuration


When you provision a pool of compute nodes in Batch, you can associate the pool with a
subnet of an Azure virtual network (VNet). To use an Azure VNet, the Batch client API must use
Microsoft Entra authentication. Azure Batch support for Microsoft Entra ID is documented in
Authenticate Batch service solutions with Active Directory.

VNet requirements
For more information about setting up a Batch pool in a VNet, see Create a pool of virtual
machines with your virtual network.

Tip

To ensure that the public IP addresses used to access nodes don't change, you can create
a pool with specified public IP addresses that you control.

Pool and compute node lifetime


When you design your Azure Batch solution, you must specify how and when pools are
created, and how long compute nodes within those pools are kept available.

On one end of the spectrum, you can create a pool for each job that you submit, and delete
the pool as soon as its tasks finish execution. This maximizes utilization because the nodes are
only allocated when needed, and they're shut down once they're idle. While this means that
the job must wait for the nodes to be allocated, it's important to note that tasks are scheduled
for execution as soon as nodes are individually allocated and the start task completes, if
specified to wait for start task completion. Batch doesn't wait until all nodes within a pool are
available before assigning tasks to the nodes. This ensures maximum utilization of all available
nodes.

At the other end of the spectrum, if having jobs start immediately is the highest priority, you
can create a pool ahead of time and make its nodes available before jobs are submitted. In this
scenario, tasks can start immediately, but nodes might sit idle while waiting for them to be
assigned.

A combined approach is typically used for handling a variable but ongoing load. You can have
a pool in which multiple jobs are submitted, and can scale the number of nodes up or down
according to the job load. You can do this reactively, based on current load, or proactively, if
load can be predicted. For more information, see Automatic scaling policy.

Autopools
An autopool is a pool that the Batch service creates when a job is submitted, rather than being
created explicitly before the jobs that will run in the pool. The Batch service manages the
lifetime of an autopool according to the characteristics that you specify. Most often, these
pools are also set to delete automatically after their jobs complete.

Security with certificates


You typically need to use certificates when you encrypt or decrypt sensitive information for
tasks, like the key for an Azure Storage account. To support this, you can install certificates on
nodes. Encrypted secrets are passed to tasks via command-line parameters or embedded in
one of the task resources, and the installed certificates can be used to decrypt them.

You use the Add certificate operation (Batch REST) or CertificateOperations.CreateCertificate
method (Batch .NET) to add a certificate to a Batch account. You can then associate the
certificate with a new or existing pool.

When a certificate is associated with a pool, the Batch service installs the certificate on each
node in the pool. The Batch service installs the appropriate certificates when the node starts
up, before launching any tasks (including the start task and job manager task).

If you add a certificate to an existing pool, you must reboot its compute nodes in order for the
certificate to be applied to the nodes.

Next steps
Learn about jobs and tasks.
Learn how to detect and avoid failures in pool and node background operations.

Jobs and tasks in Azure Batch
Article • 03/21/2025

In Azure Batch, a task represents a unit of computation. A job is a collection of these
tasks. This article describes jobs and tasks and how they are used in an Azure Batch
workflow.

Jobs
A job is a collection of tasks. It manages how computation is performed by its tasks on
the compute nodes in a pool.

A job specifies the pool in which the work is to be run. You can create a new pool for
each job, or use one pool for many jobs. You can create a pool for each job that is
associated with a job schedule, or one pool for all jobs that are associated with a job
schedule.

Job priority
You can assign an optional job priority to jobs that you create. The Batch service uses
the priority value of the job to determine the order of scheduling (for all tasks within the
job) within each pool.

To update the priority of a job, call the Update the properties of a job operation (Batch
REST), or modify the CloudJob.Priority property (Batch .NET). Priority values range from
-1000 (lowest priority) to +1000 (highest priority).

Within the same pool, higher-priority jobs have scheduling precedence over lower-
priority jobs. Tasks in lower-priority jobs that are already running won't be preempted by
tasks in a higher-priority job. Jobs with the same priority level have an equal chance of
being scheduled, and ordering of task execution is not defined.

A job with a high-priority value running in one pool won't impact scheduling of jobs
running in a separate pool or in a different Batch account. Job priority doesn't apply to
autopools, which are created when the job is submitted.
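
The priority rules above can be pictured with a short Python sketch. This is an illustrative model only: the `Job` class, the default priority of 0, and the `scheduling_order` helper are assumptions for the example and do not represent the Batch SDK.

```python
from dataclasses import dataclass
from typing import List

MIN_PRIORITY = -1000
MAX_PRIORITY = 1000


@dataclass
class Job:
    job_id: str
    pool_id: str
    priority: int = 0

    def __post_init__(self) -> None:
        # Reject values outside the documented -1000 to +1000 range.
        if not MIN_PRIORITY <= self.priority <= MAX_PRIORITY:
            raise ValueError(f"priority must be in [{MIN_PRIORITY}, {MAX_PRIORITY}]")


def scheduling_order(jobs: List[Job], pool_id: str) -> List[str]:
    """Return job IDs for one pool, highest priority first.

    Python's sort is stable, so equal-priority jobs keep their input order here;
    Batch itself does not define an order among equal-priority jobs.
    """
    in_pool = [job for job in jobs if job.pool_id == pool_id]
    return [job.job_id for job in sorted(in_pool, key=lambda j: -j.priority)]
```

Because the helper filters by pool first, a high-priority job in another pool never affects the ordering, which matches the statement that priority is evaluated within each pool.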

Job constraints
You can use job constraints to specify certain limits for your jobs:
You can set a maximum wallclock time, so that if a job runs for longer than the
maximum wallclock time that is specified, the job and all of its tasks are
terminated.
You can specify the maximum number of task retries as a constraint, including
whether a task is always retried or never retried. Retrying a task means that if the
task fails, it will be requeued to run again.
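
The retry constraint can be sketched as a simple loop. The following Python example is a hedged model, not Batch code: the function name is hypothetical, and it assumes the convention that a retry count of 0 means never retry, a positive value N allows up to N retries after the first attempt, and -1 means retry without limit.

```python
from typing import Callable


def run_with_retries(attempt: Callable[[], int], max_task_retry_count: int) -> int:
    """Run `attempt` until it returns exit code 0 or retries are exhausted.

    Returns the number of attempts made. Semantics assumed here:
    0 = never retry, N > 0 = up to N retries, -1 = retry without limit.
    """
    if max_task_retry_count < -1:
        raise ValueError("max_task_retry_count must be -1 or greater")
    attempts = 0
    while True:
        attempts += 1
        if attempt() == 0:
            return attempts  # success
        retries_used = attempts - 1
        if max_task_retry_count != -1 and retries_used >= max_task_retry_count:
            return attempts  # failed and out of retries
```

Under these assumptions, a task that always fails with a retry count of 3 runs four times in total: the original attempt plus three retries.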

Job manager tasks and automatic termination


Your client application can add tasks to a job, or you can specify a job manager task. A
job manager task contains the information that is necessary to create the required tasks
for a job, with the job manager task being run on one of the compute nodes in the pool.
The job manager task is handled specifically by Batch; it is queued as soon as the job is
created and is restarted if it fails. A job manager task is required for jobs that are created
by a job schedule, because it is the only way to define the tasks before the job is
instantiated.

By default, jobs remain in the active state when all tasks within the job are complete. You
can change this behavior so that the job is automatically terminated when all tasks in
the job are complete. Set the job's onAllTasksComplete property (OnAllTasksComplete
in Batch .NET) to terminatejob to automatically terminate the job when all of its tasks
are in the completed state.

The Batch service considers a job with no tasks to have all of its tasks completed.
Therefore, this option is most commonly used with a job manager task. If you want to
use automatic job termination without a job manager, you should initially set a new
job's onAllTasksComplete property to noaction, then set it to terminatejob only after
you've finished adding tasks to the job.
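
The reason for this two-step pattern is easier to see in a small simulation. The following Python sketch is an in-memory model, not the Batch service or SDK; the `JobSim` class and its methods are hypothetical, and only the two property values come from the text above.

```python
class JobSim:
    """Tiny in-memory model of a job's onAllTasksComplete behavior."""

    def __init__(self, on_all_tasks_complete: str = "noaction") -> None:
        self.on_all_tasks_complete = on_all_tasks_complete
        self.tasks = {}  # task ID -> True once the task has completed
        self.state = "active"
        self._evaluate()

    def _evaluate(self) -> None:
        # all() of an empty collection is True, which mirrors the rule that
        # a job with no tasks counts as having all of its tasks completed.
        if (self.state == "active"
                and self.on_all_tasks_complete == "terminatejob"
                and all(self.tasks.values())):
            self.state = "completed"

    def add_task(self, task_id: str) -> None:
        if self.state != "active":
            raise RuntimeError("cannot add tasks to a job that is not active")
        self.tasks[task_id] = False

    def complete_task(self, task_id: str) -> None:
        if task_id not in self.tasks:
            raise KeyError(task_id)
        self.tasks[task_id] = True
        self._evaluate()

    def set_on_all_tasks_complete(self, value: str) -> None:
        if value not in ("noaction", "terminatejob"):
            raise ValueError("value must be 'noaction' or 'terminatejob'")
        self.on_all_tasks_complete = value
        self._evaluate()
```

In this model, a job created with terminatejob and no tasks completes immediately and rejects new tasks, whereas a job created with noaction accepts its tasks first and terminates only after the property is switched and the last task finishes.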

Scheduled jobs
Job schedules enable you to create recurring jobs within the Batch service. A job
schedule specifies when to run jobs and includes the specifications for the jobs to be
run. You can specify the duration of the schedule (how long and when the schedule is in
effect) and how frequently jobs are created during the scheduled period.

Tasks
A task is a unit of computation that is associated with a job. It runs on a node. Tasks are
assigned to a node for execution, or are queued until a node becomes free. Put simply, a
task runs one or more programs or scripts on a compute node to perform the work you
need done.

When you create a task, you can specify:

The command line for the task. This is the command line that runs your
application or script on the compute node.

It is important to note that the command line does not run under a shell.
Therefore, it cannot natively take advantage of shell features like environment
variable expansion (this includes the PATH). To take advantage of such features,
you must invoke the shell in the command line, such as by launching cmd.exe on
Windows nodes or /bin/sh on Linux:

cmd /c MyTaskApplication.exe %MY_ENV_VAR%

/bin/sh -c 'MyTaskApplication $MY_ENV_VAR'

If your tasks need to run an application or script that is not in the node's PATH, or
that references environment variables, invoke the shell explicitly in the task command
line.

Resource files that contain the data to be processed. These files are automatically
copied to the node from Blob storage in an Azure Storage account before the
task's command line is executed. For more information, see Start task and Files and
directories.

The environment variables that are required by your application. For more
information, see Environment settings for tasks.

The constraints under which the task should execute. For example, constraints
include the maximum time that the task is allowed to run, the maximum number of
times a failed task should be retried, and the maximum time that files in the task's
working directory are retained.

Application packages to deploy to the compute node on which the task is
scheduled to run. Application packages provide simplified deployment and
versioning of the applications that your tasks run. Task-level application packages
are especially useful in shared-pool environments, where different jobs are run on
one pool, and the pool is not deleted when a job is completed. If your job has
fewer tasks than nodes in the pool, task application packages can minimize data
transfer since your application is deployed only to the nodes that run tasks.

A container image reference in Docker Hub or a private registry and additional
settings to create a Docker container in which the task runs on the node. You only
specify this information if the pool is set up with a container configuration.
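
The guidance about invoking a shell in the task command line can be captured in a small helper. The following Python sketch is one possible approach under stated assumptions: `wrap_in_shell` and its `os_family` parameter are hypothetical names, and the Linux branch quotes the command with `shlex.quote` so that `/bin/sh` receives it as a single `-c` argument.

```python
import shlex


def wrap_in_shell(command: str, os_family: str) -> str:
    """Wrap a command so it runs under a shell on the target node OS."""
    if os_family == "windows":
        # cmd.exe expands %VAR% references and searches PATH.
        return f"cmd /c {command}"
    if os_family == "linux":
        # Quote the command so /bin/sh receives it as one -c argument.
        return f"/bin/sh -c {shlex.quote(command)}"
    raise ValueError("os_family must be 'windows' or 'linux'")
```

Quoting matters on Linux because `sh -c` treats only the first argument after `-c` as the command string; without quotes, anything after the program name would not be part of the command that the shell expands and runs.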

Note

The maximum lifetime of a task, from when it is added to the job to when it
completes, is 180 days. Completed tasks persist for 7 days; data for tasks not
completed within the maximum lifetime is not accessible.

In addition to tasks you define to perform computation on a node, several special tasks
are also provided by the Batch service:

Start task
Job manager task
Job preparation and release tasks
Multi-instance tasks
Task dependencies

Start task
By associating a start task with a pool, you can prepare the operating environment of its
nodes. For example, you can perform actions such as installing the applications that
your tasks run, or starting background processes. The start task runs every time a node
starts, for as long as it remains in the pool. This includes when the node is first added to
the pool and when it is restarted or reimaged.

A primary benefit of the start task is that it can contain all the information necessary to
configure a compute node and install the applications required for task execution.
Therefore, increasing the number of nodes in a pool is as simple as specifying the new
target node count. The start task provides the information needed for the Batch service
to configure the new nodes and get them ready for accepting tasks.

As with any Azure Batch task, you can specify a list of resource files in Azure Storage, in
addition to a command line to be executed. The Batch service first copies the resource
files to the node from Azure Storage, and then runs the command line. For a pool start
task, the file list typically contains the task application and its dependencies.

However, the start task could also include reference data to be used by all tasks that are
running on the compute node. For example, a start task's command line could perform
a robocopy operation to copy application files (which were specified as resource files
and downloaded to the node) from the start task's working directory to the shared
folder, and then run an MSI or setup.exe.

Usually, you'll want the Batch service to wait for the start task to complete before
considering the node ready to be assigned tasks. However, you can configure this
differently as needed.

If a start task fails on a compute node, then the state of the node is updated to reflect
the failure, and the node is not assigned any tasks. A start task can fail if there is an issue
copying its resource files from storage, or if the process executed by its command line
returns a nonzero exit code.

If you add or update the start task for an existing pool, you must reboot its compute
nodes for the start task to be applied to the nodes.

Note

Batch limits the total size of a start task, which includes resource files and
environment variables. If you need to reduce the size of a start task, you can use
one of two approaches:

1. You can use application packages to distribute applications or data across
each node in your Batch pool. For more information about application
packages, see Deploy applications to compute nodes with Batch application
packages.

2. You can manually create a zipped archive containing your application files.
Upload your zipped archive to Azure Storage as a blob. Specify the zipped
archive as a resource file for your start task. In the start task's command line,
unzip the archive before running the rest of the command.

To unzip the archive, you can use the archiving tool of your choice. You will
need to include the tool that you use to unzip the archive as a resource file for
the start task.

Job manager task


You typically use a job manager task to control and/or monitor job execution. For
example, job manager tasks are often used to create and submit the tasks for a job,
determine additional tasks to run, and determine when work is complete.

However, a job manager task is not restricted to these activities. It is a full-fledged task
that can perform any actions that are required for the job. For example, a job manager
task might download a file that is specified as a parameter, analyze the contents of that
file, and submit additional tasks based on those contents.

A job manager task is started before all other tasks. It provides the following features:

It is automatically submitted as a task by the Batch service when the job is created.
It is scheduled to execute before the other tasks in a job.
Its associated node is the last to be removed from a pool when the pool is being
downsized.
Its termination can be tied to the termination of all tasks in the job.
A job manager task is given the highest priority when it needs to be restarted. If an
idle node is not available, the Batch service might terminate one of the other
running tasks in the pool to make room for the job manager task to run.
A job manager task in one job does not have priority over the tasks of other jobs.
Across jobs, only job-level priorities are observed.

Job preparation and release tasks


Batch provides job preparation tasks for pre-job execution setup, and job release tasks
for post-job maintenance or cleanup.

A job preparation task runs on all compute nodes that are scheduled to run tasks,
before any of the other job tasks are executed. For example, you can use a job
preparation task to copy data that is shared by all tasks, but is unique to the job.

When a job has completed, a job release task runs on each node in the pool that
executed at least one task. For example, a job release task can delete data that was
copied by the job preparation task, or it can compress and upload diagnostic log data.

Both job preparation and release tasks allow you to specify a command line to run when
the task is invoked. They offer features like file download, elevated execution, custom
environment variables, maximum execution duration, retry count, and file retention time.

For more information on job preparation and release tasks, see Run job preparation and
completion tasks on Azure Batch compute nodes.

Multi-instance task
A multi-instance task is a task that is configured to run on more than one compute node
simultaneously. With multi-instance tasks, you can enable high-performance computing
scenarios that require a group of compute nodes that are allocated together to process
a single workload, such as Message Passing Interface (MPI).

For a detailed discussion on running MPI jobs in Batch by using the Batch .NET library,
check out Use multi-instance tasks to run Message Passing Interface (MPI) applications
in Azure Batch.

Task dependencies
Task dependencies, as the name implies, allow you to specify that a task depends on the
completion of other tasks before its execution. This feature provides support for
situations in which a "downstream" task consumes the output of an "upstream" task, or
when an upstream task performs some initialization that is required by a downstream
task.

To use this feature, you must first enable task dependencies on your Batch job. Then, for
each task that depends on another (or many others), you specify the tasks on which it
depends.

With task dependencies, you can configure scenarios like the following:

taskB depends on taskA (taskB will not begin execution until taskA has completed).
taskC depends on both taskA and taskB.
taskD depends on a range of tasks, such as tasks 1 through 10, before it executes.

For more information, see Task dependencies in Azure Batch and the
TaskDependencies code sample in the azure-batch-samples GitHub repository.
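The scenarios above can be sketched as plain request bodies. This is a minimal illustration, not the SDK's own object model: the field names `usesTaskDependencies`, `dependsOn`, `taskIds`, and `taskIdRanges` are assumed from the Batch REST API and should be checked against the current reference, while the job ID, task IDs, command lines, and the helper functions `depends_on` and `task` are hypothetical. The job body is deliberately partial (a real job also needs pool information).

```python
import json


def depends_on(task_ids=(), id_ranges=()):
    """Build a dependsOn fragment from task IDs and inclusive (start, end) ranges."""
    fragment = {}
    if task_ids:
        fragment["taskIds"] = list(task_ids)
    if id_ranges:
        fragment["taskIdRanges"] = [{"start": s, "end": e} for s, e in id_ranges]
    return fragment


def task(task_id, command_line, **dependencies):
    """Build a task body; dependsOn is attached only when dependencies are given."""
    body = {"id": task_id, "commandLine": command_line}
    fragment = depends_on(**dependencies)
    if fragment:
        body["dependsOn"] = fragment
    return body


# The job opts in to dependencies before any task declares them (assumed field name).
job = {"id": "example-job", "usesTaskDependencies": True}

tasks = [
    task("taskA", "cmd /c echo A"),
    task("taskB", "cmd /c echo B", task_ids=["taskA"]),
    task("taskC", "cmd /c echo C", task_ids=["taskA", "taskB"]),
    # A range dependency refers to tasks whose IDs are the integers 1 through 10.
    task("taskD", "cmd /c echo D", id_ranges=[(1, 10)]),
]

if __name__ == "__main__":
    print(json.dumps(tasks, indent=2))
```

Keeping the dependency fragment in one helper means a task with no dependencies carries no `dependsOn` key at all.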

Environment settings for tasks


Each task executed by the Batch service has access to environment variables that it sets
on compute nodes. This includes environment variables defined by the Batch service
and custom environment variables that you can define for your tasks. Applications and
scripts that your tasks execute have access to these environment variables during
execution.

You can set custom environment variables at the task or job level by populating the
environment settings property for these entities. For more information, see the Add a
task to a job operation (Batch REST), or the CloudTask.EnvironmentSettings and
CloudJob.CommonEnvironmentSettings properties in Batch .NET.

Your client application or service can obtain a task's environment variables, both service-
defined and custom, by using the Get information about a task operation (Batch REST)
or by accessing the CloudTask.EnvironmentSettings property (Batch .NET). Processes
executing on a compute node can access these and other environment variables on the
node, for example, by using the familiar %VARIABLE_NAME% (Windows) or $VARIABLE_NAME
(Linux) syntax.

You can find a list of all service-defined environment variables in Compute node
environment variables.
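As a rough sketch of how a task script might read these variables, the following Python uses only the standard library. It assumes that service-defined variables share the `AZ_BATCH_` prefix (as the variable names in this documentation suggest); the helper names and the custom variable `INPUT_CONTAINER` are hypothetical.

```python
import os


def batch_environment(env=None):
    """Return the variables whose names start with AZ_BATCH_, sorted by name."""
    env = os.environ if env is None else env
    return {name: env[name] for name in sorted(env) if name.startswith("AZ_BATCH_")}


def custom_setting(name, default=None, env=None):
    """Read a custom environment setting, falling back to a default when unset."""
    env = os.environ if env is None else env
    return env.get(name, default)


if __name__ == "__main__":
    # Outside a Batch node this loop simply prints nothing.
    for name, value in batch_environment().items():
        print(f"{name}={value}")
    print("input container:", custom_setting("INPUT_CONTAINER", "not set"))
```

Passing `env` explicitly makes the helpers easy to exercise on a development machine where no Batch variables exist.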

Next steps
Learn about files and directories.



Files and directories in Azure Batch
Article • 04/28/2025

In Azure Batch, each task has a working directory under which it can create files and directories.
This working directory can be used for storing the program that is run by the task, the data
that it processes, and the output of the processing it performs. All files and directories of a task
are owned by the task user.

The Batch service exposes a portion of the file system on a node as the root directory. This root
directory is located on the temporary storage drive of the VM, not directly on the OS drive.

Tasks can access the root directory by referencing the AZ_BATCH_NODE_ROOT_DIR environment
variable. For more information about using environment variables, see Environment settings for
tasks.

Root directory structure


The root directory contains the following directory structure:

applications: Contains information about the details of application packages installed on
the compute node. Tasks can access this directory by referencing the
AZ_BATCH_APP_PACKAGE environment variable.

fsmounts: The directory contains any file systems that are mounted on a compute node.
Tasks can access this directory by referencing the AZ_BATCH_NODE_MOUNTS_DIR environment
variable. For more information, see Mount a virtual file system on a Batch pool.

shared: This directory provides read/write access to all tasks that run on a node. Any task
that runs on the node can create, read, update, and delete files in this directory. Tasks can
access this directory by referencing the AZ_BATCH_NODE_SHARED_DIR environment variable.
startup: This directory is used by a start task as its working directory. All of the files that
are downloaded to the node by the start task are stored here. The start task can create,
read, update, and delete files under this directory. Tasks can access this directory by
referencing the AZ_BATCH_NODE_STARTUP_DIR environment variable.

volatile: This directory is for internal purposes. There's no guarantee that any files in this
directory or that the directory itself will exist in the future.

workitems: This directory contains the directories for jobs and their tasks on the compute
node.

Within the workitems directory, a Tasks directory is created for each task that runs on the
node. This directory can be accessed by referencing the AZ_BATCH_TASK_DIR environment
variable.

Within each Tasks directory, the Batch service creates a working directory ( wd ) whose
unique path is specified by the AZ_BATCH_TASK_WORKING_DIR environment variable. This
directory provides read/write access to the task. The task can create, read, update, and
delete files under this directory. This directory is retained based on the RetentionTime
constraint that is specified for the task.

The stdout.txt and stderr.txt files are written to the Tasks folder during the execution
of the task.

Important

When a node is removed from the pool, all of the files that are stored on the node are
removed.

Batch root directory location


The value of the AZ_BATCH_NODE_ROOT_DIR compute node environment variable will be
determined by the VM size and the presence of a local temporary disk.

Local temporary disk present    Operating system type    AZ_BATCH_NODE_ROOT_DIR value
No                              Linux                    /opt/batch/data
Yes                             Linux                    /mnt/batch or /mnt/resource/batch
No                              Windows                  C:\batch\data
Yes                             Windows                  D:\batch

These environment variable values are implementation details and should not be considered
immutable. As these values may change at any time, the use of environment variables instead
of hardcoding the value is recommended.
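One way to follow this recommendation in a task script is to resolve every Batch directory through its environment variable and fail loudly when the variable is missing. The sketch below is illustrative only; the helper names `batch_dir` and `shared_cache_file` are hypothetical.

```python
import os
from pathlib import Path


def batch_dir(variable, env=None):
    """Resolve a Batch directory from its environment variable, never a literal path."""
    env = os.environ if env is None else env
    value = env.get(variable)
    if not value:
        raise RuntimeError(f"{variable} is not set; is this running as a Batch task?")
    return Path(value)


def shared_cache_file(file_name, env=None):
    """Return the path of a file inside the node's shared directory."""
    return batch_dir("AZ_BATCH_NODE_SHARED_DIR", env) / file_name
```

Because no path is hardcoded, the same script keeps working whether the root directory sits on a temporary disk or on the OS disk.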

Next steps
Learn about error handling and detection in Azure Batch.

Overview of Batch APIs and tools
Article • 04/02/2025

Processing parallel workloads with Azure Batch is typically done programmatically by
using one of the Batch APIs. Your client application or service can use the Batch APIs to
communicate with the Batch service. With the Batch APIs, you can create and manage
pools of compute nodes, either virtual machines or cloud services. You can then
schedule jobs and tasks to run on those nodes.

You can efficiently process large-scale workloads for your organization, or provide a
service front end to your customers so that they can run jobs and tasks—on demand, or
on a schedule—on one, hundreds, or even thousands of nodes. You can also use Azure
Batch as part of a larger workflow, managed by tools such as Azure Data Factory.

Tip

To learn more about the features and workflow used in Azure Batch, see Batch
service workflow and resources.

Azure accounts for Batch development


When you develop Batch solutions, you use the following accounts in your Azure
subscription:

Batch account: Azure Batch resources, including pools, compute nodes, jobs, and
tasks, are associated with an Azure Batch account. When your application makes a
request against the Batch service, it authenticates the request using the Azure
Batch account name, the URL of the account, and either an access key or a
Microsoft Entra token. You can create a Batch account in the Azure portal or
programmatically.
Storage account: Batch includes built-in support for working with files in Azure
Storage. Nearly every Batch scenario uses Azure Blob storage for staging the
programs that your tasks run and the data that they process, and for the storage of
output data that they generate. Each Batch account is usually associated with a
corresponding storage account.

Service-level and management-level APIs


Azure Batch has two sets of APIs, one for the service level and one for the management
level. The naming is often similar, but they return different results.

Only actions from the management APIs are tracked in the activity log. Service level APIs
bypass the Azure Resource Management layer (management.azure.com) and are not
logged.

For example, the Batch service API to delete a pool targets the Batch account
directly: DELETE {batchUrl}/pools/{poolId}

In contrast, the Batch management API to delete a pool targets the
management.azure.com layer: DELETE
https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Batch/batchAccounts/{accountName}/pools/{poolName}
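To make the difference between the two URL shapes concrete, the following sketch builds both from their templates. It only formats strings. Real requests also carry an `api-version` query parameter and authentication, both omitted here. The function names are hypothetical.

```python
from urllib.parse import quote

MANAGEMENT_HOST = "https://management.azure.com"


def service_pool_url(batch_url, pool_id):
    """URL of a pool on the Batch service API (addressed to the account endpoint)."""
    return f"{batch_url.rstrip('/')}/pools/{quote(pool_id, safe='')}"


def management_pool_url(subscription_id, resource_group, account_name, pool_name):
    """URL of a pool on the Batch management API (addressed to Azure Resource Manager)."""
    return (
        f"{MANAGEMENT_HOST}/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        f"/providers/Microsoft.Batch/batchAccounts/{quote(account_name, safe='')}"
        f"/pools/{quote(pool_name, safe='')}"
    )
```

Only a DELETE sent to the second URL would appear in the activity log, because only that request passes through the management layer.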

Batch Service APIs


Your applications and services can issue direct REST API calls or use one or more of the
following client libraries to run and manage your Azure Batch workloads.

API               API reference                      Download  Tutorial  Code samples  More info
Batch REST        Azure REST API - Docs              N/A       -         -             Supported versions
Batch .NET        Azure SDK for .NET - Docs          NuGet     Tutorial  GitHub        Release notes
Batch Python      Azure SDK for Python - Docs        PyPI      Tutorial  GitHub        Readme
Batch JavaScript  Azure SDK for JavaScript - Docs    npm       Tutorial  -             Readme
Batch Java        Azure SDK for Java - Docs          Maven     -         GitHub        Readme

Batch Management APIs


The Azure Resource Manager APIs for Batch provide programmatic access to Batch
accounts. Using these APIs, you can programmatically manage Batch accounts, quotas,
application packages, and other resources through the Microsoft.Batch provider.

API                          API reference                      Download  Tutorial  Code samples
Batch Management REST        Azure REST API - Docs              -         -         GitHub
Batch Management .NET        Azure SDK for .NET - Docs          NuGet     Tutorial  GitHub
Batch Management Python      Azure SDK for Python - Docs        PyPI      -         -
Batch Management JavaScript  Azure SDK for JavaScript - Docs    npm       -         -
Batch Management Java        Azure SDK for Java - Docs          Maven     -         -

Batch command-line tools


These command-line tools provide the same functionality as the Batch service and Batch
Management APIs:

Batch PowerShell cmdlets: The Azure Batch cmdlets in the Azure PowerShell
module enable you to manage Batch resources with PowerShell.
Azure CLI: The Azure CLI is a cross-platform toolset that provides shell commands
for interacting with many Azure services, including the Batch service and Batch
Management service. For more information, see Manage Batch resources with
Azure CLI.

Other tools for application development


These additional tools may be helpful for building and debugging your Batch
applications and services.

Azure portal : You can create, monitor, and delete Batch pools, jobs, and tasks in
the Azure portal. You can view status information for these and other resources
while you run your jobs, and even download files from the compute nodes in your
pools. For example, you can download a failed task's stderr.txt while
troubleshooting. You can also download Remote Desktop (RDP) files that you can
use to log in to compute nodes.
Azure Batch Explorer : Batch Explorer is a free, rich-featured, standalone client
tool to help create, debug, and monitor Azure Batch applications. Download an
installation package for Mac, Linux, or Windows.
Azure Storage Explorer : While not strictly an Azure Batch tool, the Storage
Explorer can be helpful when developing and debugging your Batch solutions.

Additional resources
To learn about logging events from your Batch application, see Batch metrics,
alerts, and logs for diagnostic evaluation and monitoring.
For reference information on events raised by the Batch service, see Batch
Analytics.
For information about environment variables for compute nodes, see Azure Batch
runtime environment variables.

Next steps
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
Get started with the Azure Batch library for .NET to learn how to use C# and the
Batch .NET library to execute a simple workload using a common Batch workflow.
A Python version and a JavaScript tutorial are also available.
Download the code samples on GitHub to see how both C# and Python can
interface with Batch to schedule and process sample workloads.



Error handling and detection in Azure
Batch
Article • 04/13/2023

At times, you might need to handle task and application failures in your Azure Batch
solution. This article explains different types of Batch errors, and how to resolve
common problems.

Error codes
Some general types of errors that you might see in Batch are:

Networking failures for requests that never reached Batch, or networking failures
when the Batch response didn't reach the client in time.
Internal server errors. These errors have a standard 5xx status code HTTP
response.
Throttling-related errors. These errors include 429 or 503 status code HTTP
responses with the Retry-after header.
4xx errors such as AlreadyExists and InvalidOperation. These errors indicate that
the resource isn't in the correct state for the state transition.

For detailed information about specific error codes, see Batch status and error codes.
This reference includes error codes for REST API, Batch service, and for job tasks and
scheduling.
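A client might sort responses into the categories above before deciding what to do next. The following is a simplified sketch: the category labels and function names are hypothetical, and the `Retry-After` parsing handles only the delay-in-seconds form, falling back to a default for anything else.

```python
def classify_status(status_code):
    """Map an HTTP status code onto the broad error categories described above."""
    if status_code in (429, 503):
        return "throttled"
    if 500 <= status_code <= 599:
        return "server-error"
    if 400 <= status_code <= 499:
        return "client-error"
    return "not-an-error"


def retry_after_seconds(headers, default=1.0):
    """Read Retry-After as a number of seconds; use the default if absent or not numeric."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default
    return default
```

Throttled and server-error responses are usually worth retrying after a pause, while a client-error response generally means the request or the resource state has to change first.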

Application failures
During execution, an application might produce diagnostic output. You can use this
output to troubleshoot issues. The Batch service writes standard output and standard
error output to the stdout.txt and stderr.txt files in the task directory on the compute
node. For more information, see Files and directories in Batch.

To download these output files, use the Azure portal or one of the Batch SDKs. For
example, to retrieve files for troubleshooting purposes, use ComputeNode.GetNodeFile
and CloudTask.GetNodeFile in the Batch .NET library.

Task errors
Task errors fall into several categories.
Pre-processing errors
If a task fails to start, a pre-processing error is set for the task. Pre-processing errors can
occur if:

The task's resource files have moved.
The storage account is no longer available.
Another issue happened that prevented the successful copying of files to the node.

File upload errors


If files that you specified for a task fail to upload for any reason, a file upload error is set
for the task. File upload errors can occur if:

The shared access signature (SAS) token supplied for accessing Azure Storage is
invalid.
The SAS token doesn't provide write permissions.
The storage account is no longer available.
Another issue happened that prevented the successful copying of files from the
node.

Application errors
The process specified by the task's command line can also fail. For more information,
see Task exit codes.

For application errors, configure Batch to automatically retry the task up to a specified
number of times.

Constraint errors
To specify the maximum execution duration for a job or task, set the maxWallClockTime
constraint. Use this setting to terminate tasks that fail to progress.

When the task exceeds the maximum time:

The task is marked as completed.
The exit code is set to 0xC000013A.
The schedulingError field is marked as { category: "ServerError", code: "TaskEnded" }.

Task exit codes
When a task executes a process, Batch populates the task's exit code property with the
return code of the process. If the process returns a nonzero exit code, the Batch service
marks the task as failed.

The Batch service doesn't determine a task's exit code. The process itself, or the
operating system on which the process executes, determines the exit code.
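The rules for exit codes and constraint errors can be combined into a small decision function. This is a sketch under assumptions: the task result is represented as a bare exit code plus an optional dictionary shaped like the schedulingError value shown earlier, which is a simplification rather than the SDK's actual object model, and the outcome labels are hypothetical.

```python
def summarize_task(exit_code, scheduling_error=None):
    """Reduce a completed task's result to one outcome label."""
    # A constraint violation is reported through the scheduling error, so check it first.
    if scheduling_error and scheduling_error.get("code") == "TaskEnded":
        return "timed-out"
    if exit_code is None:
        return "no-exit-code"
    # Any nonzero return code from the process marks the task as failed.
    if exit_code != 0:
        return "failed"
    return "succeeded"
```

The scheduling error is checked before the exit code because a task that exceeded its time limit also carries a nonzero exit code.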

Task failures or interruptions


Tasks might occasionally fail or be interrupted. For example:

The task application itself might fail.
The node on which the task is running might reboot.
A resize operation might remove the node from the pool. This action might
happen if the pool's deallocation policy removes nodes immediately without
waiting for tasks to finish.

In all cases, Batch can automatically requeue the task for execution on another node.

It's also possible for an intermittent issue to cause a task to stop responding or take too
long to execute. You can set a maximum execution interval for a task. If a task exceeds
the interval, the Batch service interrupts the task application.

Connect to compute nodes


You can perform debugging and troubleshooting by signing in to a compute node
remotely. Use the Azure portal to download a Remote Desktop Protocol (RDP) file for
Windows nodes, and obtain Secure Shell (SSH) connection information for Linux nodes.
You can also download this information using the Batch .NET or Batch Python APIs.

To connect to a node via RDP or SSH, first create a user on the node. Use one of the
following methods:

The Azure portal
Batch REST API: adduser
Batch .NET API: ComputeNode.CreateComputeNodeUser
Batch Python module: add_user

If necessary, configure or disable access to compute nodes.


Troubleshoot problem nodes
Your Batch client application or service can examine the metadata of failed tasks to
identify a problem node. Each node in a pool has a unique ID. Task metadata includes
the node where a task runs. After you find the problem node, try the following methods
to resolve the failure.

Reboot node
Restarting a node sometimes fixes latent issues, such as stuck or crashed processes. If
your pool uses a start task, or your job uses a job preparation task, a node restart
executes these tasks.

Batch REST API: reboot
Batch .NET API: ComputeNode.Reboot

Reimage node
Reimaging a node reinstalls the operating system. Start tasks and job preparation tasks
rerun after the reimaging happens.

Batch REST API: reimage
Batch .NET API: ComputeNode.Reimage

Remove node from pool


Removing the node from the pool is sometimes necessary.

Batch REST API: removenodes
Batch .NET API: PoolOperations

Disable task scheduling on node


Disabling task scheduling on a node effectively takes the node offline. Batch assigns no
further tasks to the node. However, the node continues running in the pool. You can
then further investigate the failures without losing the failed task's data. The node also
won't cause more task failures.

For example, disable task scheduling on the node. Then, sign in to the node remotely.
Examine the event logs, and do other troubleshooting. After you solve the problems,
enable task scheduling again to bring the node back online.
Batch REST API: enablescheduling
Batch .NET API: ComputeNode.EnableScheduling

You can use these actions to specify how Batch handles tasks currently running on the node.
For example, when you disable task scheduling with the Batch .NET API, you can specify
an enum value for DisableComputeNodeSchedulingOption. You can choose to:

Terminate running tasks: Terminate
Requeue tasks for scheduling on other nodes: Requeue
Allow running tasks to complete before performing the action: TaskCompletion

Retry after errors


The Batch APIs notify you about failures. You can retry all APIs using the built-in global
retry handler. It's a best practice to use this option.

After a failure, wait several seconds before retrying. If you retry too frequently or too
quickly, the retry handler throttles requests.
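For calls that are not covered by a built-in handler, the same idea can be sketched as a generic wrapper. This is not the SDK's retry handler. It is a hypothetical helper that waits a few seconds after the first failure and doubles the wait after each further one, with some random jitter. The parameter names and default values are illustrative assumptions.

```python
import random
import time


def call_with_retries(operation, is_retryable, max_attempts=5, base_delay=2.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call operation(), retrying retryable failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:
            # Give up on the last attempt or when the caller says the error is permanent.
            if attempt == max_attempts or not is_retryable(error):
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out clients that failed at the same moment.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists so that the wrapper can be exercised without real waiting. In production code the default `time.sleep` applies.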

Next steps
Check for Batch pool and node errors
Check for Batch job and task errors

Azure Batch best practices
Article • 02/28/2025

This article discusses best practices and useful tips for using the Azure Batch service
effectively. These tips can help you enhance performance and avoid design pitfalls in
your Batch solutions.

Tip

For guidance about security in Azure Batch, see Batch security and compliance
best practices.

Pools
Pools are the compute resources for executing jobs on the Batch service. The following
sections provide recommendations for working with Batch pools.

Pool configuration and naming


Pool allocation mode: When creating a Batch account, you can choose between
two pool allocation modes: Batch service or user subscription. For most cases, you
should use the default Batch service mode, in which pools are allocated behind the
scenes in Batch-managed subscriptions. In the alternative user subscription mode,
Batch VMs and other resources are created directly in your subscription when a
pool is created. User subscription accounts are primarily used to enable a small but
important subset of scenarios. For more information, see configuration for user
subscription mode.

Classic or simplified node communication mode: Pools can be configured in
one of two node communication modes, classic or simplified. In the classic node
communication model, the Batch service initiates communication to the compute
nodes, and compute nodes also need to communicate with Azure Storage. In the
simplified node communication model, compute nodes initiate communication
with the Batch service. Because the simplified model reduces the scope of
inbound/outbound connections required and doesn't need outbound access to
Azure Storage for baseline operation, it is the recommended model. Some
future improvements to the Batch service will also require the simplified node
communication model. The classic node communication model will be retired on
March 31, 2026.

Job and task run time considerations: If you have jobs composed primarily of
short-running tasks, and the expected total task counts are small, so that the
overall expected run time of the job isn't long, don't allocate a new pool for each
job. Node allocation time would then account for a disproportionate share of the
job's overall run time.

Multiple compute nodes: Individual nodes aren't guaranteed to always be
available. While uncommon, hardware failures, operating system updates, and a
host of other issues can cause individual nodes to be offline. If your Batch
workload requires deterministic, guaranteed progress, you should allocate pools
with multiple nodes.

Images with impending end-of-life (EOL) dates: It's strongly recommended to
avoid images with impending Batch support end of life (EOL) dates. These dates
can be discovered via the ListSupportedImages API, PowerShell, or Azure CLI. It's
your responsibility to periodically refresh your view of the EOL dates pertinent to
your pools and migrate your workloads before the EOL date occurs. If you're using
a custom image with a specified node agent, ensure that you follow the Batch support
end-of-life dates for the image from which your custom image is derived or with
which it is aligned. An image without a specified batchSupportEndOfLife date indicates
that such a date hasn't been determined yet by the Batch service. Absence of a date
doesn't indicate that the respective image will be supported indefinitely. An EOL date
may be added or updated in the future at any time.

VM SKUs with impending end-of-life (EOL) dates: As with VM images, VM SKUs
or families may also reach Batch support end of life (EOL). These dates can be
discovered via the ListSupportedVirtualMachineSkus API, PowerShell, or Azure CLI.
Plan for the migration of your workload to a non-EOL VM SKU by creating a new
pool with an appropriate supported VM SKU. Absence of an associated
batchSupportEndOfLife date for a VM SKU doesn't indicate that particular VM SKU
will be supported indefinitely. An EOL date may be added or updated in the future
at any time.

Unique resource names: Batch resources (jobs, pools, etc.) often come and go
over time. For example, you may create a pool on Monday, delete it on Tuesday,
and then create another similar pool on Thursday. Each new resource you create
should be given a unique name that you haven't used before. You can create
uniqueness by using a GUID (either as the entire resource name, or as a part of it)
or by embedding the date and time that the resource was created in the resource
name. Batch supports DisplayName, which can give a resource a more readable
name even if the actual resource ID is something that isn't human-friendly. Using
unique names makes it easier for you to differentiate which particular resource did
something in logs and metrics. It also removes ambiguity if you ever have to file a
support case for a resource.

Continuity during pool maintenance and failure: It's best to have your jobs use
pools dynamically. If your jobs use the same pool for everything, there's a chance
that jobs won't run if something goes wrong with the pool. This principle is
especially important for time-sensitive workloads. For example, select or create a
pool dynamically when you schedule each job, or have a way to override the pool
name so that you can bypass an unhealthy pool.

Business continuity during pool maintenance and failure: There are many reasons
why a pool may not grow to the size you desire, such as internal errors or capacity
constraints. Make sure you can retarget jobs at a different pool (possibly with a
different VM size using UpdateJob) if necessary. Avoid relying on a static pool ID
with the expectation that it will never be deleted and never change.
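The "Unique resource names" recommendation above can be sketched as a small ID generator that embeds both a UTC timestamp and part of a GUID. The validation rule used here (at most 64 characters drawn from letters, digits, hyphens, and underscores) is an assumption about Batch ID constraints and should be checked against the service limits. The function name and prefix are hypothetical.

```python
import re
import uuid
from datetime import datetime, timezone

# Assumed ID rule: 1 to 64 characters, alphanumeric plus hyphen and underscore.
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def unique_resource_id(prefix, now=None):
    """Return prefix-<UTC timestamp>-<8 hex digits>, never reusing an earlier ID."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    suffix = uuid.uuid4().hex[:8]
    resource_id = f"{prefix}-{stamp}-{suffix}"
    if not _ID_PATTERN.fullmatch(resource_id):
        raise ValueError(f"invalid resource ID: {resource_id!r}")
    return resource_id


if __name__ == "__main__":
    print(unique_resource_id("render-pool"))
```

The timestamp keeps IDs sortable and readable in logs, while the random suffix separates resources created within the same second. A friendlier label can still go in the resource's display name.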

Pool security

Isolation boundary
For the purposes of isolation, if your scenario requires isolating jobs or tasks from each
other, do so by having them in separate pools. A pool is the security isolation boundary
in Batch, and by default, two pools aren't visible or able to communicate with each
other. Avoid using separate Batch accounts as a means of security isolation unless the
larger environment in which the Batch account operates requires isolation.

If desired, proper access control must be applied on the Batch account and APIs to
prevent access to all pools under the Batch account. It's recommended to disable shared
key access and only allow Entra-based authentication to enable role-based access
control.

Batch Node Agent updates

Batch node agents aren't automatically upgraded for pools that have nonzero compute
nodes. To ensure your Batch pools receive the latest security fixes and updates to the
Batch node agent, you need to either resize the pool to zero compute nodes or recreate
the pool. It's recommended to monitor the Batch Node Agent release notes to
understand changes to new Batch node agent versions. Regularly checking for newly
released updates enables you to plan upgrades to the latest agent version.

Before you recreate or resize your pool, you should download any node agent logs for
debugging purposes if you're experiencing issues with your Batch pool or compute
nodes. This process is further discussed in the Nodes section.

Note

For general guidance about security in Azure Batch, see Batch security and
compliance best practices.

Operating system updates

It's recommended that the VM image selected for a Batch pool should be up-to-date
with the latest publisher provided security updates. Some images may perform
automatic package updates upon boot (or shortly thereafter), which may interfere with
certain user directed actions such as retrieving package repository updates (for example,
apt update ) or installing packages during actions such as a StartTask.

It's recommended to enable Auto OS upgrade for Batch pools, which allows the
underlying Azure infrastructure to coordinate updates across the pool. This option can
be configured to be nondisrupting for task execution. Automatic OS upgrade doesn't
support all operating systems that Batch supports. For more information, see the Virtual
Machine Scale Sets Auto OS upgrade Support Matrix. For Windows operating systems,
ensure that you aren't enabling the property
virtualMachineConfiguration.windowsConfiguration.enableAutomaticUpdates when using
Auto OS upgrade on the Batch pool.

Azure Batch doesn't verify or guarantee that images allowed for use with the service
have the latest security updates. Updates to images are under the purview of the
publisher of the image, and not that of Azure Batch. For certain images published under
microsoft-azure-batch, there's no guarantee that these images are kept up-to-date with
their upstream derived image.

Pool lifetime and billing


Pool lifetime can vary depending upon the method of allocation and options applied to
the pool configuration. Pools can have an arbitrary lifetime and a varying number of
compute nodes at any point in time. It's your responsibility to manage the compute
nodes in the pool either explicitly, or through features provided by the service
(autoscale or autopool).

Pool recreation: Avoid deleting and recreating pools on a daily basis. Instead,
create a new pool and then update your existing jobs to point to the new pool.
Once all of the tasks have been moved to the new pool, then delete the old pool.
Pool efficiency and billing: Batch itself incurs no extra charges. However, you do
incur charges for Azure resources utilized, such as compute, storage, networking,
and any other resources that may be required for your Batch workload. You're
billed for every compute node in the pool, regardless of the state it's in. For more
information, see Cost analysis and budgets for Azure Batch.

Ephemeral OS disks: Virtual Machine Configuration pools can use ephemeral OS
disks, which create the OS disk on the VM cache or temporary SSD, to avoid extra
costs associated with managed disks.

Pool allocation failures


Pool allocation failures can happen at any point during first allocation or subsequent
resizes. These failures can be due to temporary capacity exhaustion in a region or
failures in other Azure services that Batch relies on. Your core quota isn't a guarantee
but rather a limit.

Unplanned downtime
It's possible for Batch pools to experience downtime events in Azure. Understand
that problems can arise, and develop your workflow to be resilient to re-
executions. If nodes fail, Batch automatically attempts to recover these compute nodes
on your behalf. This recovery may trigger rescheduling any running task on the node
that is restored or on a different, available node. To learn more about interrupted tasks,
see Designing for retries.

Custom image pools


When you create an Azure Batch pool using the Virtual Machine Configuration, you
specify a VM image that provides the operating system for each compute node in the
pool. You can create the pool with a supported Azure Marketplace image, or you can
create a custom image with an Azure Compute Gallery image. While you can also use a
managed image to create a custom image pool, we recommend creating custom
images using the Azure Compute Gallery whenever possible. Using the Azure Compute
Gallery helps you provision pools faster, scale larger quantities of VMs, and improves
reliability when provisioning VMs.

Third-party images
Pools can be created using third-party images published to Azure Marketplace. With
user subscription mode Batch accounts, you may see the error "Allocation failed due to
marketplace purchase eligibility check" when creating a pool with certain third-party
images. To resolve this error, accept the terms set by the publisher of the image. You can
do so by using Azure PowerShell or Azure CLI.

Container pools
When you create a Batch pool with a virtual network, there can be interaction side
effects between the specified virtual network and the default Docker bridge. Docker, by
default, will create a network bridge with a subnet specification of 172.17.0.0/16.
Ensure that there are no conflicting IP ranges between the Docker network bridge and
your virtual network.
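
One way to check for such a conflict before you create the pool is to compare the two address ranges. The following is a minimal sketch that uses Python's standard ipaddress module. The function name and the sample address ranges are illustrative assumptions; the bridge range is the Docker default mentioned above.

```python
import ipaddress

# Default Docker bridge subnet, as stated in the text above.
DOCKER_BRIDGE = ipaddress.ip_network("172.17.0.0/16")


def conflicts_with_docker_bridge(vnet_prefix: str) -> bool:
    """Return True if the given address range overlaps the Docker bridge subnet."""
    return ipaddress.ip_network(vnet_prefix).overlaps(DOCKER_BRIDGE)


# Hypothetical virtual network address ranges, used only for illustration.
for prefix in ("172.17.4.0/24", "10.0.0.0/16"):
    print(prefix, conflicts_with_docker_bridge(prefix))
```

If the check reports a conflict, choose a different address range for the virtual network or change the Docker bridge configuration on the nodes.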

Docker Hub limits the number of image pulls. Ensure that your workload doesn't exceed
published rate limits for Docker Hub-based images. It's recommended to use Azure
Container Registry directly or leverage Artifact cache in ACR.

Azure region dependency


You shouldn't rely on a single Azure region if you have a time-sensitive or production
workload. While rare, there are issues that can affect an entire region. For example, if
your processing needs to start at a specific time, consider scaling up the pool in your
primary region well before your start time. If that pool scale fails, you can fall back to
scaling up a pool in a backup region (or regions).

Pools across multiple accounts in different regions provide a ready, easily accessible
backup if something goes wrong with another pool. For more information, see Design
your application for high availability.
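
The fallback pattern described above can be sketched as a simple ordered loop. This is a hedged illustration, not a Batch SDK call: try_scale is a hypothetical callable that you would implement to resize the pool in a given region and report whether it reached the target size, and the region names are examples only.

```python
from typing import Callable, Optional, Sequence


def scale_with_fallback(
    regions: Sequence[str],
    try_scale: Callable[[str], bool],
) -> Optional[str]:
    """Try to scale up a pool in each region in order.

    Returns the first region where scaling succeeded, or None if all failed.
    """
    for region in regions:
        if try_scale(region):
            return region
    return None


# Example with a stand-in for the real scaling logic: the primary region fails.
chosen = scale_with_fallback(
    ["eastus", "westus2"],
    try_scale=lambda region: region != "eastus",
)
print(chosen)
```

Start the attempt in the primary region well before the processing deadline so that there is still time to scale up in a backup region.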

Jobs
A job is a container designed to contain hundreds, thousands, or even millions of tasks.
Follow these guidelines when creating jobs.

Fewer jobs, more tasks


Using a job to run a single task is inefficient. For example, it's more efficient to use a
single job containing 1,000 tasks rather than creating 100 jobs that contain 10 tasks
each. Using 1,000 jobs, each with a single task, would be the least efficient,
slowest, and most expensive approach.

Avoid designing a Batch solution that requires thousands of simultaneously active jobs.
There's no quota for tasks, so executing many tasks under as few jobs as possible
efficiently uses your job and job schedule quotas.

Job lifetime
A Batch job has an indefinite lifetime until it's deleted from the system. Its state
designates whether it can accept more tasks for scheduling or not.

A job doesn't automatically move to completed state unless explicitly terminated. This
action can be automatically triggered through the onAllTasksComplete property or
maxWallClockTime.

There's a default active job and job schedule quota. Jobs and job schedules in
completed state don't count towards this quota.

Delete jobs when they're no longer needed, even if in completed state. Although
completed jobs don't count towards active job quota, it's beneficial to periodically clean
up completed jobs. For example, listing jobs will be more efficient when the total
number of jobs is a smaller set (even if proper filters are applied to the request).

Tasks
Tasks are individual units of work that comprise a job. Tasks are submitted by the user
and scheduled by Batch on to compute nodes. The following sections provide
suggestions for designing your tasks to handle issues and perform efficiently.

Save task data


Compute nodes are by their nature ephemeral. Batch features such as autopool and
autoscale can make it easy for nodes to disappear. When nodes leave a pool (due to a
resize or a pool delete), all the files on those nodes are also deleted. Because of this
behavior, a task should move its output off of the node it's running on, and to a durable
store before it completes. Similarly, if a task fails, it should move logs required to
diagnose the failure to a durable store.

Batch has integrated support for Azure Storage to upload data via OutputFiles, and for
various shared file systems, or you can perform the upload yourself in your tasks.

Manage task lifetime


Delete tasks when they're no longer needed, or set a retentionTime task constraint. If a
retentionTime is set, Batch automatically cleans up the disk space used by the task
when the retentionTime expires.

Deleting tasks accomplishes two things:

Ensures that you don't have a build-up of tasks in the job. A build-up makes it
harder to find the task you're interested in, because you have to filter
through the completed tasks.
Cleans up the corresponding task data on the node (provided retentionTime
hasn't already been hit). This action helps ensure that your nodes don't fill up with
task data and run out of disk space.

Note

For tasks just submitted to Batch, the DeleteTask API call takes up to 10 minutes to
take effect. Until it takes effect, other tasks might be prevented from being
scheduled, because the Batch scheduler still tries to schedule the tasks that were
just deleted. To delete a task shortly after it's submitted, terminate the
task instead (the terminate task request takes effect immediately), and
then delete the task 10 minutes later.

Submit large numbers of tasks in collection


Tasks can be submitted on an individual basis or in collections. Submit tasks in
collections of up to 100 at a time when doing bulk submission of tasks to reduce
overhead and submission time.
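
Splitting a large task list into collections is straightforward. The following is a minimal sketch in plain Python; the helper name is hypothetical, and the actual submission call depends on the Batch SDK or API you use, so it's left out.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Collection size taken from the guidance above.
MAX_TASKS_PER_COLLECTION = 100


def in_collections(
    tasks: Sequence[T], size: int = MAX_TASKS_PER_COLLECTION
) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` tasks."""
    for start in range(0, len(tasks), size):
        yield list(tasks[start:start + size])


task_ids = [f"task-{i}" for i in range(250)]
print([len(batch) for batch in in_collections(task_ids)])  # prints [100, 100, 50]
```

Each yielded list can then be passed to a single bulk submission request.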

Set max tasks per node appropriately


Batch supports oversubscribing tasks on nodes (running more tasks than a node has
cores). It's up to you to ensure that your tasks are right-sized for the nodes in your pool.
For example, you may have a degraded experience if you attempt to schedule eight
tasks that each consume 25% CPU usage onto one node (in a pool with
taskSlotsPerNode = 8).
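
The arithmetic behind this example can be sketched as follows. This is a minimal illustration that assumes each percentage is a share of the whole node's CPU capacity; the function name is hypothetical and isn't part of any Batch SDK.

```python
def cpu_demand_ratio(task_slots_per_node: int, cpu_fraction_per_task: float) -> float:
    """Total CPU demand, as a multiple of node capacity, when every slot is busy."""
    return task_slots_per_node * cpu_fraction_per_task


ratio = cpu_demand_ratio(task_slots_per_node=8, cpu_fraction_per_task=0.25)
print(ratio)  # prints 2.0: the tasks ask for twice the CPU the node has
```

A ratio above 1.0 means the tasks compete for CPU time when all slots are occupied, which is the degraded experience described above.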

Design for retries and re-execution


Tasks can be automatically retried by Batch. There are two types of retries: user-
controlled and internal. User-controlled retries are specified by the task's
maxTaskRetryCount. When a program specified in the task exits with a nonzero exit
code, the task is retried up to the value of the maxTaskRetryCount.

Although rare, a task can be retried internally due to failures on the compute node, such
as not being able to update internal state or a failure on the node while the task is
running. The task will be retried on the same compute node, if possible, up to an
internal limit before giving up on the task and deferring the task to be rescheduled by
Batch, potentially on a different compute node.

There are no design differences when executing your tasks on dedicated or Spot nodes.
Whether a task is preempted while running on a Spot node or interrupted due to a
failure on a dedicated node, both situations are mitigated by designing the task to
withstand failure.

Build durable tasks


Tasks should be designed to withstand failure and accommodate retry. This principle is
especially important for long running tasks. Ensure that your tasks generate the same,
single result even if they're run more than once. One way to achieve this outcome is to
make your tasks "goal seeking." Another way is to make sure your tasks are idempotent
(tasks will have the same outcome no matter how many times they're run).

A common example is a task to copy files to a compute node. A simple approach is a
task that copies all the specified files every time it runs, which is inefficient and isn't built
to withstand failure. Instead, create a task to ensure the files are on the compute node; a
task that doesn't recopy files that are already present. In this way, the task picks up
where it left off if it was interrupted.
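
A goal-seeking copy of this kind might look like the following sketch. It uses only the Python standard library; the function name is hypothetical, and it assumes that a file name already present in the destination is a complete copy, which holds here because each file is written under a temporary name and renamed only after the copy finishes.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List


def ensure_files_present(sources: Iterable[Path], dest_dir: Path) -> List[Path]:
    """Copy only the files missing from dest_dir; return the files copied on this run."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sources:
        target = dest_dir / src.name
        if target.exists():
            continue  # already copied by an earlier, possibly interrupted, run
        # Copy under a temporary name, then rename, so that an interrupted
        # copy never leaves a partial file under the final name.
        fd, tmp_name = tempfile.mkstemp(dir=dest_dir, suffix=".partial")
        os.close(fd)
        shutil.copyfile(src, tmp_name)
        os.replace(tmp_name, target)
        copied.append(target)
    return copied
```

Running the function a second time with the same arguments copies nothing, so a retried task does only the work that remains.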

Avoid short execution time


Tasks that only run for one to two seconds aren't ideal. Try to do a significant amount of
work in an individual task (10 second minimum, going up to hours or days). If each task
is executing for one minute (or more), then the scheduling overhead as a fraction of
overall compute time is small.

Use pool scope for short tasks on Windows nodes


When scheduling a task on Batch nodes, you can choose whether to run it with task
scope or pool scope. If the task will only run for a short time, task scope can be
inefficient due to the resources needed to create the autouser account for that task. For
greater efficiency, consider setting these tasks to pool scope. For more information, see
Run a task as an autouser with pool scope.

Nodes
A compute node is an Azure virtual machine (VM) or cloud service VM that is dedicated
to processing a portion of your application's workload. Follow these guidelines when
working with nodes.

Start tasks: lifetime and idempotency


As with other tasks, the node start task should be idempotent. Start tasks are rerun
when the compute node restarts or when the Batch agent restarts. An idempotent task
is simply one that produces a consistent result when run multiple times.

Start tasks shouldn't be long-running or be coupled to the lifetime of the compute
node. If you need to start programs that are services or service-like in nature, construct
a start task that enables these programs to be started and managed by operating
system facilities such as systemd on Linux or Windows Services. The start task should still
be constructed as idempotent such that subsequent execution of the start task is
handled properly if these programs were previously installed as services.

 Tip

When Batch reruns your start task, it will attempt to delete the start task directory
and create it again. If Batch fails to recreate the start task directory, then the
compute node will fail to launch the start task.

These services must not take file locks on any files in Batch-managed directories on the
node, because otherwise Batch is unable to delete those directories due to the file locks.
For example, instead of configuring launch of the service directly from the start task
working directory, copy the files elsewhere in an idempotent fashion. Then install the
service from that location using the operating system facilities.

Isolated nodes
Consider using isolated VM sizes for workloads with compliance or regulatory
requirements. Supported isolated sizes in virtual machine configuration mode include
Standard_E80ids_v4, Standard_M128ms, Standard_F72s_v2, Standard_G5, Standard_GS5,
and Standard_E64i_v3. For more information about isolated VM sizes, see Virtual
machine isolation in Azure.

Avoid creating directory junctions in Windows


Directory junctions, sometimes called directory hard-links, are difficult to deal with
during task and job cleanup. Use symlinks (soft-links) rather than hard-links.

Temporary disks and AZ_BATCH_NODE_ROOT_DIR


Batch relies on VM temporary disks, for Batch-compatible VM sizes, to store metadata
related to task execution along with any artifacts of each task execution. Examples of
these temporary disk mount points or directories are:
/mnt/batch, /mnt/resource/batch, and D:\batch\tasks. Replacing, remounting,
junctioning, symlinking, or otherwise redirecting these mount points and directories or
any of the parent directories isn't supported and can lead to instability. If you require
more disk space, consider using a VM size or family that has temporary disk space that
meets your requirements or attaching data disks. For more information, see the next
section about attaching and preparing data disks for compute nodes.

Attaching and preparing data disks


Each individual compute node has the exact same data disk specification attached if
specified as part of the Batch pool instance. Only new data disks may be attached to
Batch pools. These data disks attached to compute nodes aren't automatically
partitioned, formatted, or mounted. It's your responsibility to perform these operations
as part of your start task. These start tasks must be crafted to be idempotent. Re-
execution of the start tasks on compute nodes is possible. If the start task isn't
idempotent, potential data loss can occur on the data disks.

 Tip

When mounting a data disk in Linux, if you nest the disk mount point under the
Azure temporary mount points such as /mnt or /mnt/resource, take care that
no dependency races are introduced. For example, if these mounts
are automatically performed by the OS, there can be a race between the temporary
disk being mounted and your data disk(s) being mounted under the parent. Either
enforce the appropriate dependencies with facilities such as systemd, or defer
mounting of the data disk to the start task as
part of your idempotent data disk preparation script.

Preparing data disks in Linux Batch pools

Azure data disks in Linux are presented as block devices and assigned a typical sd[X]
identifier. You shouldn't rely on static sd[X] assignments as these labels are dynamically
assigned at boot time and aren't guaranteed to be consistent between the first and any
subsequent boots. You should identify your attached disks through the mappings
presented in /dev/disk/azure/scsi1/. For example, if you specified LUN 0 for your data
disk in the AddPool API, then this disk would manifest as /dev/disk/azure/scsi1/lun0.
As an example, if you were to list this directory, you could potentially see:

user@host:~$ ls -l /dev/disk/azure/scsi1/
total 0
lrwxrwxrwx 1 root root 12 Oct 31 15:16 lun0 -> ../../../sdc

There's no need to translate the reference back to the sd[X] mapping in your
preparation script; instead, refer to the device directly. In this example, this device would
be /dev/disk/azure/scsi1/lun0. You could provide this ID directly to fdisk, mkfs, and
any other tooling required for your workflow. Alternatively, you can use lsblk with
blkid to map the UUID for the disk.
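
A preparation script can build the stable device path from the LUN instead of hard-coding an sd[X] name. The following is a minimal sketch using the Python standard library; the function names are hypothetical, and the base_dir parameter exists only so that the logic can be exercised outside an Azure VM.

```python
import os
from pathlib import Path

# Directory named in the text above.
AZURE_SCSI_DIR = Path("/dev/disk/azure/scsi1")


def data_disk_device(lun: int, base_dir: Path = AZURE_SCSI_DIR) -> Path:
    """Return the stable device path for the data disk attached at the given LUN."""
    device = base_dir / f"lun{lun}"
    if not device.exists():
        raise FileNotFoundError(f"No data disk found at {device}")
    return device


def resolved_device(lun: int, base_dir: Path = AZURE_SCSI_DIR) -> str:
    """Resolve the symlink, for logging only; the target can differ between boots."""
    return os.path.realpath(data_disk_device(lun, base_dir))
```

Pass the path returned by data_disk_device to your partitioning and formatting tools, and treat the resolved sd[X] name as informational.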

For more information about Azure data disks in Linux, including alternate methods of
locating data disks and /etc/fstab options, see this article. Ensure that there are no
dependencies or races as described by the Tip note before promoting your method into
production use.

Preparing data disks in Windows Batch pools

Azure data disks attached to Batch Windows compute nodes are presented
unpartitioned and unformatted. In your start task, you need to enumerate the disks
that have a RAW partition style so that you can act on them. This information can be
retrieved using the Get-Disk PowerShell cmdlet. As an example, you could potentially see:

PS C:\Windows\system32> Get-Disk

Number Friendly Name Serial Number HealthStatus OperationalStatus Total Size Partition Style
------ ------------- ------------- ------------ ----------------- ---------- ---------------
0      Virtual HD                  Healthy      Online                 30 GB MBR
1      Virtual HD                  Healthy      Online                 32 GB MBR
2      Msft Virtu...               Healthy      Online                 64 GB RAW

Here, disk number 2 is the uninitialized data disk attached to this compute node. These
disks can then be initialized, partitioned, and formatted as required for your workflow.

For more information about Azure data disks in Windows, including sample PowerShell
scripts, see this article. Ensure any sample scripts are validated for idempotency before
promotion into production use.

Collect Batch agent logs


If you notice a problem involving the behavior of a node or tasks running on a node,
collect the Batch agent logs prior to deallocating the nodes in question. The Batch agent
logs can be collected using the Upload Batch service logs API. These logs can be
supplied as part of a support ticket to Microsoft and will help with issue troubleshooting
and resolution.

Batch API

Timeout Failures
Timeout failures don't necessarily indicate that the service failed to process the request.
When a timeout failure occurs, you should either retry the operation or retrieve the state
of the resource, as appropriate for the situation, to verify whether the
operation succeeded or failed.
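
The "retrieve the state, then decide" approach can be sketched as follows. This is an illustration under stated assumptions: submit and exists are hypothetical callables that you would implement with your Batch client, and the sketch assumes that a timeout surfaces as Python's built-in TimeoutError.

```python
from typing import Callable


def submit_with_verification(
    submit: Callable[[], None],
    exists: Callable[[], bool],
    max_attempts: int = 3,
) -> bool:
    """Call submit(); after a timeout, check the resource state before retrying."""
    for _ in range(max_attempts):
        try:
            submit()
            return True
        except TimeoutError:
            # The service may have processed the request even though the
            # response never arrived, so look before submitting again.
            if exists():
                return True
    return False
```

Checking the state first avoids creating a duplicate resource when the original request actually succeeded.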

Connectivity
Review the following guidance related to connectivity in your Batch solutions.

Network Security Groups (NSGs) and User Defined Routes (UDRs)
When provisioning Batch pools in a virtual network, ensure that you're closely following
the guidelines regarding the use of the BatchNodeManagement.region service tag,
ports, protocols, and direction of the rule. Use of the service tag is highly recommended;
don't use underlying Batch service IP addresses as they can change over time. Using
Batch service IP addresses directly can cause instability, interruptions, or outages for
your Batch pools.

For User Defined Routes (UDRs), it's recommended to use
BatchNodeManagement.region service tags instead of Batch service IP addresses, because
the IP addresses can change over time.

Honoring DNS
Ensure that your systems honor DNS Time-to-Live (TTL) for your Batch account service
URL. Additionally, ensure that your Batch service clients and other connectivity
mechanisms to the Batch service don't rely on IP addresses.

Any HTTP response with a 5xx-level status code along with a "Connection: close" header
requires adjusting your Batch service client behavior. Your Batch service
client should observe the recommendation by closing the existing connection,
re-resolving DNS for the Batch account service URL, and attempting subsequent requests on a
new connection.
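
The condition described above can be expressed as a small check on the response. This is a minimal sketch in plain Python; the function name is hypothetical, and how you close the connection and re-resolve DNS depends on the HTTP client library you use.

```python
from typing import Mapping


def should_reconnect(status_code: int, headers: Mapping[str, str]) -> bool:
    """Return True for a 5xx response that carries a 'Connection: close' header."""
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":  # header names are case-insensitive
            connection = value
    tokens = {token.strip().lower() for token in connection.split(",")}
    return 500 <= status_code <= 599 and "close" in tokens


print(should_reconnect(503, {"Connection": "close"}))
print(should_reconnect(503, {"Connection": "keep-alive"}))
```

When the check returns True, discard the pooled connection rather than reusing it, so that the next request resolves the Batch account service URL again.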

Retry requests automatically


Ensure that your Batch service clients have appropriate retry policies in place to
automatically retry your requests, even during normal operation and not exclusively
during any service maintenance time periods. These retry policies should span an
interval of at least 5 minutes. Automatic retry capabilities are provided with various
Batch SDKs, such as the .NET RetryPolicyProvider class.
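
If your client doesn't come with a built-in retry policy, a time-bounded retry loop is one way to meet the five-minute guidance. The following is a hedged sketch in plain Python, not the behavior of any Batch SDK. The starting delay and the delay cap are assumed values, and a real client should retry only on errors it knows to be transient rather than on every exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_for(
    operation: Callable[[], T],
    total_seconds: float = 300.0,  # at least 5 minutes, per the guidance above
    first_delay: float = 1.0,      # assumed starting delay
    max_delay: float = 30.0,       # assumed cap on a single delay
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Retry operation with exponential backoff until it succeeds or time runs out."""
    deadline = clock() + total_seconds
    delay = first_delay
    while True:
        try:
            return operation()
        except Exception:
            remaining = deadline - clock()
            if remaining <= 0:
                raise  # retry window exhausted; surface the last error
            sleep(min(delay, remaining))
            delay = min(delay * 2, max_delay)
```

The sleep and clock parameters are injectable so that the policy can be tested without waiting in real time.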

Static public IP addresses


Typically, virtual machines in a Batch pool are accessed through public IP addresses that
can change over the lifetime of the pool. This dynamic nature can make it difficult to
interact with a database or other external service that limits access to certain IP
addresses. To address this concern, you can create a pool using a set of static public IP
addresses that you control. For more information, see Create an Azure Batch pool with
specified public IP addresses.

Batch node underlying dependencies


Consider the following dependencies and restrictions when designing your Batch
solutions.

System-created resources
Azure Batch creates and manages a set of users and groups on the VM, which shouldn't
be altered:

Windows:

A user named PoolNonAdmin
A user group named WATaskCommon

Linux:

A user named _azbatch

 Tip

The names of these users and groups are implementation artifacts and are subject to
change at any time.

File cleanup
Batch actively tries to clean up the working directory that tasks are run in, once their
retention time expires. Any files written outside of this directory are your responsibility
to clean up to avoid filling up disk space.

The automated cleanup for the working directory will be blocked if you run a service on
Windows from the start task working directory, due to the folder still being in use. This
action will lead to degraded performance. To fix this issue, change the directory for that
service to a separate directory that isn't managed by Batch.

Next steps
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
Learn about default Azure Batch quotas, limits, and constraints, and how to request
quota increases.
Learn how to detect and avoid failures in pool and node background operations.


Batch service quotas and limits
Article • 06/05/2024

As with other Azure services, there are limits on certain resources associated with Azure
Batch. For example, if your pool doesn't reach your target number of compute nodes,
you might have reached the core quota limit for your Batch account. Many limits are
default quotas, which Azure applies at the subscription or account level.

Keep these quotas in mind as you design and scale up your Batch workloads. You can
run multiple Batch workloads in a single Batch account. Or, you can distribute your
workloads among Batch accounts in the same subscription but different Azure regions.
If you plan to run production workloads in Batch, you might need to increase one or
more of the quotas above the default. To raise a quota, request a quota increase at no
charge.

Resource quotas
A quota is a limit, not a capacity guarantee. If you have large-scale capacity needs,
contact Azure support.

Also note that quotas aren't guaranteed values. Quotas can vary based on changes from
the Batch service or a user request to change a quota value.

Resource                                            Default limit   Maximum limit
Azure Batch accounts per region per subscription    1-3             50
Dedicated cores per Batch account                   0-900 ¹         Contact support
Low-priority cores per Batch account                0-100 ¹         Contact support
Active jobs and job schedules per Batch account     100-300         1,000 ²
  (completed jobs have no limit)
Pools per Batch account                             0-100 ¹         500 ²
Private endpoint connections per Batch account      100             100

¹ For capacity management purposes, the default quotas for new Batch accounts in
some regions and for some subscription types have been reduced from the above range
of values. In some cases, these limits have been reduced to zero. When you create a new
Batch account, check your quotas and request an appropriate core or service quota
increase, if necessary. Alternatively, consider reusing Batch accounts that already have
sufficient quota or user subscription pool allocation Batch accounts to maintain core
and VM family quota across all Batch accounts on the subscription. Service quotas like
active jobs or pools apply to each distinct Batch account even for user subscription pool
allocation Batch accounts.

² To request an increase beyond this limit, contact Azure Support.

Note

Default limits vary depending on the type of subscription you use to create a Batch
account. Cores quotas shown are for Batch accounts in Batch service mode. View
the quotas in your Batch account.

Core quotas

Core quotas in Batch service mode


Core quotas exist for each virtual machine (VM) series supported by Batch. These core
quotas are displayed on the Quotas page in the Azure portal. To update VM series
quota limits, open a support request.

For dedicated nodes, Batch enforces a core quota limit for each VM series, and a
total core quota limit for the entire Batch account.
For Spot nodes, Batch enforces only a total core quota for the Batch account
without any distinction between different VM series.

Core quotas in user subscription mode


If you created a Batch account with pool allocation mode set to user subscription, Batch
VMs and other resources are created directly in your subscription when a pool is created
or resized. The Azure Batch core quotas don't apply and the quotas in your subscription
for regional compute cores, per-series compute cores, and other resources are used and
enforced.
To learn more about these quotas, see Azure subscription and service limits, quotas, and
constraints.

Pool size limits


Pool size limits are set by the Batch service. Unlike resource quotas, these values can't be
changed. Only pools with inter-node communication and custom images have
restrictions different from the standard quota.

Resource                                                          Maximum limit
Compute nodes in inter-node communication enabled pool
  Batch service pool allocation mode                              100
  Batch subscription pool allocation mode                         80
Compute nodes in pool created with a managed image resource ¹
  Dedicated nodes                                                 2000
  Spot nodes                                                      1000

¹ For pools that aren't inter-node communication enabled.

Other limits
The Batch service sets the following other limits. Unlike resource quotas, it's not possible
to change these values.

Resource                               Maximum limit
Concurrent tasks per compute node      4 x number of node cores
Applications per Batch account         200
Application packages per application   40
Application packages per pool          10
Maximum task lifetime                  180 days ¹
Mounts per compute node                10
Certificates per pool                  12

¹ The maximum lifetime of a task, from when it's added to the job to when it completes,
is 180 days. By default, data is retained for completed tasks for seven days if the
compute node where it ran is still available. Data for tasks not completed within the
maximum lifetime isn't accessible. Completed task data retention times are configurable
on a per task basis.
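
The "Concurrent tasks per compute node" limit is a simple multiple of the core count, so you can validate a pool's task slot setting before you create the pool. The following is a minimal sketch; the function names are hypothetical and not part of any Batch SDK.

```python
def max_task_slots(node_cores: int) -> int:
    """Upper limit on concurrent tasks per node: 4 x number of node cores."""
    return 4 * node_cores


def validate_task_slots(task_slots_per_node: int, node_cores: int) -> None:
    """Raise ValueError if the requested slot count is outside the allowed range."""
    limit = max_task_slots(node_cores)
    if not 1 <= task_slots_per_node <= limit:
        raise ValueError(
            f"task slots per node must be between 1 and {limit} "
            f"for a node with {node_cores} cores"
        )


print(max_task_slots(8))  # prints 32
```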

View Batch quotas


To view your Batch account quotas in the Azure portal:

1. Sign in to the Azure portal.

2. Select or search for Batch accounts.

3. On the Batch accounts page, select the Batch account that you want to review.

4. On the Batch account's menu, under Settings, select Quotas.

5. Review the quotas currently applied to the Batch account.


Increase a quota
You can request a quota increase for your Batch account or your subscription using the
Azure portal or by using the Azure Quota REST API.

The type of quota increase depends on the pool allocation mode of your Batch account.
To request a quota increase, you must include the VM series for which you would like to
increase the quota. When the quota increase is applied, it's applied to all series of VMs.

Once you've submitted your support request, Azure support will contact you. Quota
requests may be completed within a few minutes or up to two business days.

Quota types
You can select from two quota types when you create your support request.

Select Per Batch account to request quota increases for a single Batch account. These
quota increases can include dedicated and Spot cores, and the number of jobs and
pools. If you select this option, specify the Batch account to which this request applies.
Then, select the quota(s) you'd like to update. Provide the new limit you're requesting
for each resource. The Spot quota is a single value across all VM series. If you need
constrained SKUs, select Spot cores and include the VM families to request.

Select All accounts in this region to request quota increases that apply to all Batch
accounts in a region. For example, use this option to increase the number of Batch
accounts per region per subscription.

Request in Azure portal


To request a quota increase using the Azure portal, first open a support request:

1. Sign in to the Azure portal.

2. Select or search for Quotas.

3. On the Quotas page, select Increase my quotas.

You can also open the support request as follows:

1. Sign in to the Azure portal.

2. Select or search for Help + support in the Azure portal. Or, select the question
mark icon (?) in the portal menu. Then, in the Support + troubleshooting pane,
select Help + support.

3. On the New support request page, select Create a support request.

Next, fill out your support request.

1. On the Basics tab:

a. For Summary, enter a description of your issue.

b. For Issue Type, select Service and subscription limits (quotas).

c. For Subscription, select the Azure subscription where your Batch account is.

d. For Quota type, select Batch.

e. Select Next: Solutions to continue. The Solutions tab is skipped.


2. On the Details tab:

a. Under Problem details, select Enter details.

b. On the Quota details pane, for Location, enter the Azure region where you
want to increase the quota.

c. For Quota type, select your quota type. If you're not sure which option to select,
see the explanation of quota types.

d. If applicable, for Batch account, select the Batch account to update.

e. If applicable, for Select Quotas to Update, select which specific quotas to
increase.

f. Under Advanced diagnostic information, choose whether to allow collection of
advanced diagnostic information.

g. Under Support method, select the appropriate severity level for your business
situation. Also select your preferred contact method and support language.

h. Under Contact information, enter and verify the required contact details.

i. Select Next: Review + create to continue.

3. Select Create to submit the support request.

Request through Azure Quota REST API


You can use the Azure Quota REST API to request a quota increase at the subscription
level or at the Batch account level.

For details and examples, see Request a quota increase using the Azure Support REST
API.

Related quotas for VM pools


Batch pools in a VM configuration deployed in an Azure virtual network automatically
allocate more Azure networking resources. These resources are created in the
subscription that contains the virtual network supplied when creating the Batch pool.
The following resources are created for each 100 pool nodes in a virtual network:

One network security group
One public IP address
One load balancer

These resources are limited by the subscription's resource quotas. If you plan large pool
deployments in a virtual network, you may need to request a quota increase for one or
more of these resources.
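
To estimate how many of each networking resource a pool needs, you can divide the node count by 100. The following is a rough sketch that assumes a partially filled group of 100 nodes still consumes a full set of resources; that rounding rule is an assumption for planning purposes, and the function name is hypothetical.

```python
import math

# One network security group, one public IP address, and one load balancer
# are created for each 100 pool nodes, per the text above.
NODES_PER_RESOURCE_SET = 100


def network_resource_sets(pool_nodes: int) -> int:
    """Estimate how many sets of networking resources a pool of this size needs."""
    return math.ceil(pool_nodes / NODES_PER_RESOURCE_SET)


print(network_resource_sets(250))  # prints 3
```

Compare the estimate with the subscription's current quotas for each resource type before you deploy a large pool.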

Next steps
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
Learn about Azure subscription and service limits, quotas, and constraints.

Choose a VM size and image for compute nodes in an Azure Batch pool
Article • 04/28/2025

When you select a node size for an Azure Batch pool, you can choose from almost all the VM
sizes available in Azure. Azure offers a range of sizes for Linux and Windows VMs for different
workloads.

Supported VM series and sizes

Pools in Virtual Machine configuration


Batch pools in the Virtual Machine configuration support almost all VM sizes available in Azure.
The supported VM sizes in a region can be obtained via the Batch Management API. You can
use one of the following methods to return a list of VM sizes supported by Batch in a region:

PowerShell: Get-AzBatchSupportedVirtualMachineSku
Azure CLI: az batch location list-skus
Batch Management APIs: List Supported Virtual Machine SKUs

For example, using the Azure CLI, you can obtain the list of SKUs for a particular Azure region
with the following command:

Azure CLI

az batch location list-skus --location <azure-region>

 Tip

Avoid VM SKUs/families with impending Batch support end of life (EOL) dates. These dates
can be discovered via the ListSupportedVirtualMachineSkus API, PowerShell, or Azure
CLI. For more information, see the Batch best practices guide regarding Batch pool VM
SKU selection.

Using Generation 2 VM Images


Some VM series, such as FX and Mv2, can only be used with generation 2 VM images.
Generation 2 VM images are specified like any VM image, using the sku property of the
imageReference configuration; the sku strings have a suffix such as -g2 or -gen2. To get a list
of VM images supported by Batch, including generation 2 images, use the 'list supported
images' API, PowerShell, or Azure CLI.

Size considerations
Application requirements - Consider the characteristics and requirements of the
application run on the nodes. Aspects like whether the application is multithreaded and
how much memory it consumes can help determine the most suitable and cost-effective
node size. For multi-instance MPI workloads or CUDA applications, consider specialized
HPC or GPU-enabled VM sizes, respectively. For more information, see Use RDMA-
capable or GPU-enabled instances in Batch pools.

Tasks per node - It's typical to select a node size assuming one task runs on a node at a
time. However, it might be advantageous to have multiple tasks (and therefore multiple
application instances) run in parallel on compute nodes during job execution. In this case,
it's common to choose a multicore node size to accommodate the increased demand of
parallel task execution.

Load levels for different tasks - All of the nodes in a pool are the same size. If you intend
to run applications with differing system requirements and/or load levels, we recommend
that you use separate pools.

Region availability - A VM series or size might not be available in the regions where you
create your Batch accounts. To check that a size is available, see Products available by
region.

Quotas - The cores quotas in your Batch account can limit the number of nodes of a
given size you can add to a Batch pool. When needed, you can request a quota increase.

Supported VM images
Use one of the following APIs to return a list of Windows and Linux VM images currently
supported by Batch, including the node agent SKU IDs for each image:

PowerShell: Get-AzBatchSupportedImage
Azure CLI: az batch pool supported-images
Batch Service APIs: List Supported Images

For example, using the Azure CLI, you can obtain the list of supported VM images with the
following command:

Azure CLI
az batch pool supported-images list

Images that have a verificationType of verified undergo regular interoperability validation
testing with the Batch service by the Azure Batch team. The verified designation doesn't
mean that every possible application or usage scenario is validated, but that functionality
exposed by the Batch API, such as executing tasks and mounting a supported virtual filesystem,
is regularly tested as part of release processes. Images that have a verificationType of
unverified don't undergo regular validation testing but were initially verified to boot on Azure
Batch compute nodes and transition to an idle compute node state. Support for unverified
images isn't guaranteed.
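The selection rules described in this article can be sketched as a small filter over the supported-images list. This is a minimal illustration, assuming each record carries an imageReference object with a sku string and a verificationType string, as named in the text. The sample records and the select_images helper are hypothetical, and the suffix check is only a heuristic because the text says generation 2 SKUs have a suffix "such as" -g2 or -gen2.

```python
def select_images(images, verified_only=True, gen2_only=False):
    """Return the SKU strings of image records that pass the requested filters."""
    selected = []
    for image in images:
        sku = image["imageReference"]["sku"]
        # Skip images that Batch does not regularly validate.
        if verified_only and image.get("verificationType") != "verified":
            continue
        # Heuristic only: generation 2 SKUs typically end in -g2 or -gen2.
        if gen2_only and not sku.endswith(("-g2", "-gen2")):
            continue
        selected.append(sku)
    return selected


# Hypothetical records shaped like entries of the supported-images list.
sample_images = [
    {"imageReference": {"sku": "example-lts"}, "verificationType": "verified"},
    {"imageReference": {"sku": "example-lts-gen2"}, "verificationType": "verified"},
    {"imageReference": {"sku": "example-old-g2"}, "verificationType": "unverified"},
]

print(select_images(sample_images, gen2_only=True))
```

In practice, the records would come from the JSON output of the supported-images command shown above rather than from a hand-written list.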

Tip

Avoid images with impending Batch support end of life (EOL) dates. These dates can be
discovered via the ListSupportedImages API, PowerShell, or Azure CLI. For more
information, see the Batch best practices guide regarding Batch pool VM image selection.

Tip

The value of the AZ_BATCH_NODE_ROOT_DIR compute node environment variable
depends on whether the VM has a local temporary disk. See Batch root directory
location for more information.

Next steps
Learn about the Batch service workflow and primary resources such as pools, nodes, jobs,
and tasks.
Learn about using specialized VM sizes with RDMA-capable or GPU-enabled instances in
Batch pools.
Reliability in Azure Batch
Article • 08/22/2024

This article describes reliability support in Azure Batch and covers both intra-regional
resiliency with availability zones and links to information on cross-region recovery and
business continuity.

Availability zone support


Availability zones are physically separate groups of datacenters within each Azure
region. When one zone fails, services can fail over to one of the remaining zones.

For more information on availability zones in Azure, see What are availability zones?

Batch maintains parity with Azure on supporting availability zones.

Prerequisites
For user subscription mode Batch accounts, make sure that the subscription in
which you're creating your pool doesn't have a zone offer restriction on the
requested VM SKU. To check whether your subscription has any restrictions, call
the Resource Skus List API and check the ResourceSkuRestrictions . If a zone
restriction exists, you can submit a support ticket to remove the zone restriction.

Because InfiniBand doesn't support inter-zone communication, you can't create a
pool with a zonal policy if it has inter-node communication enabled and uses a VM
SKU that supports InfiniBand.

To use the zonal option, your pool must be created in an Azure region with
availability zone support.

To allocate your Batch pool across availability zones, the Azure region in which the
pool was created must support the requested VM SKU in more than one zone. To
validate that the region supports the requested VM SKU in more than one zone,
call the Resource Skus List API and check the locationInfo field of resourceSku .
Ensure that more than one zone is supported for the requested VM SKU. You can
also use the Azure CLI to list all available Resource SKUs with the following
command:

Azure CLI
az vm list-skus
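The zone check described above can be sketched as follows. This is a minimal illustration, assuming the SKU list has been parsed from JSON into a list of dictionaries in which each entry has a name and a locationInfo list whose items carry location and zones fields. The sample data, the SKU names, and the helper functions are hypothetical.

```python
def zones_for_sku(skus, sku_name, region):
    """Collect the zones in which a VM SKU is offered for one region."""
    zones = set()
    for sku in skus:
        if sku.get("name") != sku_name:
            continue
        for info in sku.get("locationInfo", []):
            # Compare region names case-insensitively.
            if info.get("location", "").lower() == region.lower():
                zones.update(info.get("zones") or [])
    return sorted(zones)


def supports_zonal_pool(skus, sku_name, region):
    """A zonal pool needs the SKU to be available in more than one zone."""
    return len(zones_for_sku(skus, sku_name, region)) > 1


# Hypothetical entries shaped like parsed SKU records.
sample_skus = [
    {"name": "Standard_Example_v1",
     "locationInfo": [{"location": "exampleregion", "zones": ["1", "2", "3"]}]},
    {"name": "Standard_Example_v2",
     "locationInfo": [{"location": "exampleregion", "zones": []}]},
]

print(supports_zonal_pool(sample_skus, "Standard_Example_v1", "exampleregion"))
```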

Create an Azure Batch pool across availability zones


For examples on how to create a Batch pool across availability zones, see Create an
Azure Batch pool across availability zones.

Learn more about creating Batch accounts with the Azure portal, the Azure CLI,
PowerShell, or the Batch management API.

Zone down experience


During a zone down outage, the nodes within that zone become unavailable. Any nodes
within that same node pool from other zone(s) aren't impacted and continue to be
available.

Azure Batch doesn't reallocate or create new nodes to compensate for nodes
that have gone down due to the outage. You must add more nodes to the
node pool; they're then allocated from other healthy zone(s).

Fault tolerance
To prepare for a possible availability zone failure, you should over-provision capacity of
service to ensure that the solution can tolerate 1/3 loss of capacity and continue to
function without degraded performance during zone-wide outages. Since the platform
spreads VMs across three zones and you need to account for at least the failure of one
zone, multiply peak workload instance count by a factor of zones/(zones-1), or 3/2. For
example, if your typical peak workload requires four instances, you should provision six
instances: (2/3 * 6 instances) = 4 instances.
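The over-provisioning rule above can be expressed as a short calculation. This is a minimal sketch, assuming that exactly one zone fails and that nodes are spread evenly across zones; the function name is hypothetical, and the result is rounded up because a fraction of an instance can't be provisioned.

```python
def instances_to_provision(peak_instances, zones=3):
    """Scale the peak instance count by zones / (zones - 1), rounding up."""
    if zones < 2:
        raise ValueError("at least two zones are needed to tolerate a zone failure")
    numerator = peak_instances * zones
    # Ceiling division using integers only, to avoid floating-point rounding.
    return -(-numerator // (zones - 1))


print(instances_to_provision(4))
```

With a peak of four instances and three zones, the function returns six, which matches the example in the text.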

Availability zone migration


You can't migrate an existing Batch pool to availability zone support. If you wish to
recreate your Batch pool across availability zones, see Create an Azure Batch pool across
availability zones.
Cross-region disaster recovery and business
continuity
Azure Batch is available in all Azure regions. However, when a Batch account is created,
it must be associated with one specific region. All subsequent operations for that Batch
account only apply to that region. For example, pools and associated virtual machines
(VMs) are created in the same region as the Batch account.

When designing an application that uses Batch, you must consider the possibility that
Batch may not be available in a region. It's possible to encounter a rare situation where
there's a problem with the region as a whole, the entire Batch service in the region, or
your specific Batch account.

If the application or solution using Batch must always be available, then it should be
designed to either failover to another region or always have the workload split between
two or more regions. Both approaches require at least two Batch accounts, with each
account located in a different region.

You're responsible for setting up cross-region disaster recovery with Azure Batch. If you
run multiple Batch accounts across specific regions and take advantage of availability
zones, your application can meet your disaster recovery objectives when one of your
Batch accounts becomes unavailable.

When providing the ability to failover to an alternate region, all components in a
solution must be considered; it's not sufficient to simply have a second Batch account.
For example, in most Batch applications, an Azure storage account is required. The
storage account and Batch account must be in the same region for acceptable
performance.

Consider the following points when designing a solution that can failover:

Precreate all required services in each region, such as the Batch account and the
storage account. There's often no charge for having accounts created, and charges
accrue only when the account is used or when data is stored.

Make sure ahead of time that the appropriate quotas are set for all user
subscription Batch accounts, to allocate the required number of cores using the
Batch account.

Use templates and/or scripts to automate the deployment of the application in a
region.

Keep application binaries and reference data up to date in all regions. Staying up
to date will ensure that the region can be brought online quickly without having to
wait for the upload and deployment of files. For example, consider the case where
a custom application to install on pool nodes is stored and referenced using Batch
application packages. When an update of the application is released, it should be
uploaded to each Batch account and referenced by the pool configuration (or
make the latest version the default version).

In the application calling Batch, storage, and any other services, make it easy to
switch over clients or the load to different regions.

Consider frequently switching over to an alternate region as part of normal
operation. For example, with two deployments in separate regions, switch over to
the alternate region every month.

The duration of time to recover from a disaster depends on the setup you choose. Batch
itself is agnostic regarding whether you're using multiple accounts or a single account.
In active-active configurations, where two Batch instances are receiving traffic
simultaneously, disaster recovery is faster than for an active-passive configuration.
Which configuration you choose should be based on business needs (different regions,
latency requirements) and technical considerations.

Single-region disaster recovery


How you implement disaster recovery in Batch is the same, whether you're working in a
single-region or multi-region geography. The only differences are which SKU you use
for storage, and whether you intend to use the same or different storage account across
all regions.

Disaster recovery testing


You should perform your own disaster recovery testing of your Batch-enabled solution.
It's considered a best practice to enable easy switching between client and service load
across different regions.

Testing your disaster recovery plan for Batch can be as simple as alternating Batch
accounts. For example, you could rely on a single Batch account in a specific region for
one operational day. Then, on the next day, you could switch to a second Batch account
in a different region. Disaster recovery is primarily managed on the client side. This
multiple-account approach to disaster recovery takes care of RTO and RPO expectations
in either single-region or multiple-region geographies.
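The alternating-account test described above can be sketched on the client side. This is a minimal illustration, assuming the client chooses its Batch account once per operational day; the account URLs and the account_for_day helper are hypothetical placeholders, not real endpoints.

```python
from datetime import date, timedelta

# Hypothetical Batch account endpoints in two different regions.
ACCOUNT_URLS = [
    "https://primary-account.example.invalid",
    "https://secondary-account.example.invalid",
]


def account_for_day(day, account_urls=ACCOUNT_URLS):
    """Rotate through the configured accounts, one account per calendar day."""
    return account_urls[day.toordinal() % len(account_urls)]


today = date.today()
print(account_for_day(today))
print(account_for_day(today + timedelta(days=1)))
```

Deriving the choice from the date keeps every client instance on the same account for a given day without any shared state.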
Capacity and proactive disaster recovery resiliency
Microsoft and its customers operate under the Shared Responsibility model. Microsoft is
responsible for platform and infrastructural resiliency. You are responsible for addressing
disaster recovery for any specific service you deploy and control. To ensure that recovery
is proactive:

You should always predeploy secondaries. The predeployment of secondaries is
necessary because there's no guarantee of capacity at time of impact for those
who haven't preallocated such resources.

Precreate all required services in each region, such as your Batch accounts and
associated storage accounts. There's no charge for creating new accounts; charges
accrue only when the account is used or when data is stored.

Make sure appropriate quotas are set on all subscriptions ahead of time, so you
can allocate the required number of cores using the Batch account. As with other
Azure services, there are limits on certain resources associated with the Batch
service. Many of these limits are default quotas applied by Azure at the
subscription or account level. Keep these quotas in mind as you design and scale
up your Batch workloads.

Note

If you plan to run production workloads in Batch, you may need to increase one or
more of the quotas above the default. To raise a quota, you can request a quota
increase at no charge. For more information, see Request a quota increase.

Storage
You must configure Batch storage to ensure data is backed up across regions; by default, this
is your responsibility. Most Batch solutions use Azure Storage for storing resource
files and output files. For example, your Batch tasks (including standard tasks, start tasks,
job preparation tasks, and job release tasks) typically specify resource files that reside in
a storage account. Storage accounts also store data that is processed and any output
data that is generated. Understanding possible data loss across the regions of your
service operations is an important consideration. You must also confirm whether data is
rewritable or read-only.

Batch supports the following types of Azure Storage accounts:

General-purpose v2 (GPv2) accounts
General-purpose v1 (GPv1) accounts
Blob storage accounts (currently supported for pools in the Virtual Machine
configuration)

For more information about storage accounts, see Azure storage account overview.

You can associate a storage account with your Batch account when you create the
account or do this step later.

If you're setting up a separate storage account for each region your service is available
in, you must use zone-redundant storage (ZRS) accounts. Use geo-zone-redundant
storage (GZRS) accounts if you're using the same storage account across multiple paired
regions. For geographies that contain a single region, you must create a zone-
redundant storage (ZRS) account because GZRS isn't available.
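The redundancy guidance above reduces to a small decision rule, sketched here. The function and parameter names are hypothetical; the sketch only encodes the three cases stated in the text and isn't a complete storage-planning tool.

```python
def recommended_redundancy(shared_across_paired_regions, single_region_geography=False):
    """Return the storage redundancy option suggested by the guidance above."""
    if single_region_geography:
        # GZRS isn't available in geographies that contain a single region.
        return "ZRS"
    if shared_across_paired_regions:
        # One storage account serves multiple paired regions.
        return "GZRS"
    # A separate storage account exists for each region.
    return "ZRS"


print(recommended_redundancy(shared_across_paired_regions=True))
```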

Capacity planning is another important consideration with storage and should be
addressed proactively. Consider your cost and performance requirements when
choosing a storage account. For example, the GPv2 and blob storage account options
support greater capacity and scalability limits compared with GPv1. (Contact Azure
Support to request an increase in a storage limit.) These account options can improve
the performance of Batch solutions that contain a large number of parallel tasks that
read from or write to the storage account.

When a storage account is linked to a Batch account, think of it as the autostorage
account. An autostorage account is required if you plan to use the application packages
capability, as it's used to store the application package .zip files. An autostorage account
can also be used for task resource files; since the autostorage account is already linked
to the Batch account, this avoids the need for shared access signature (SAS) URLs to
access the resource files.

Next steps
Reliability in Azure



Azure Batch runtime environment variables
Article • 04/28/2025

The Azure Batch service sets the following environment variables on compute nodes. You can reference these environment variables in task
command lines, and in the programs and scripts run by the command lines.

For more information about using environment variables with Batch, see Environment settings for tasks.

Environment variable visibility


These environment variables are visible only in the context of the task user, which is the user account on the node under which a task is executed.
You won't see these variables when connecting remotely to a compute node via Remote Desktop Protocol (RDP) or Secure Shell (SSH) and listing
environment variables. This is because the user account that is used for remote connection is not the same as the account that is used by the task.

To get the current value of an environment variable, launch cmd.exe on a Windows compute node or /bin/sh on a Linux node:

cmd /c set <ENV_VARIABLE_NAME>

/bin/sh -c "printenv <ENV_VARIABLE_NAME>"
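Because these variables are visible only to the task user, one way to inspect them is to have the task itself run a short script. The following sketch assumes a Python interpreter is available on the node, which Batch doesn't guarantee. The batch_variables helper is hypothetical, and the token value is redacted so that it doesn't end up in task output.

```python
import os


def batch_variables(environ=None):
    """Return the Batch-provided variables (AZ_BATCH_* and CCP_NODES)."""
    environ = os.environ if environ is None else environ
    return {
        name: value
        for name, value in sorted(environ.items())
        if name.startswith("AZ_BATCH_") or name == "CCP_NODES"
    }


if __name__ == "__main__":
    for name, value in batch_variables().items():
        # Avoid writing the authentication token into stdout.txt.
        if name == "AZ_BATCH_AUTHENTICATION_TOKEN":
            value = "<redacted>"
        print(f"{name}={value}")
```

Run outside a Batch task, the script prints nothing, which is consistent with the visibility rule described above.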

Command-line expansion of environment variables


The command lines executed by tasks on compute nodes don't run under a shell. This means that these command lines can't natively use shell
features such as environment variable expansion (including the PATH ). To use such features, you must invoke the shell in the command line. For
example, launch cmd.exe on Windows compute nodes or /bin/sh on Linux nodes:

cmd /c MyTaskApplication.exe %MY_ENV_VAR%

/bin/sh -c "MyTaskApplication $MY_ENV_VAR"
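When task command lines are built in client code, the shell wrapping shown above can be applied by a small helper. This is a minimal sketch: the wrap_in_shell function is hypothetical, and the escaping of embedded double quotes and backslashes on Linux is an assumption about how the command line is parsed, not documented Batch behavior.

```python
def wrap_in_shell(command, windows):
    """Wrap a command so that the node's shell expands environment variables."""
    if windows:
        return f"cmd /c {command}"
    # Assumption: escape backslashes and double quotes so that the command
    # survives being placed inside the double-quoted /bin/sh -c argument.
    escaped = command.replace("\\", "\\\\").replace('"', '\\"')
    return f'/bin/sh -c "{escaped}"'


print(wrap_in_shell("MyTaskApplication.exe %MY_ENV_VAR%", windows=True))
print(wrap_in_shell("MyTaskApplication $MY_ENV_VAR", windows=False))
```

The two calls reproduce the Windows and Linux command lines shown above.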

Environment variables
Note

AZ_BATCH_AUTHENTICATION_TOKEN is deprecated and will be retired on September 30, 2024. See the announcement for details and alternative
implementation.


AZ_BATCH_ACCOUNT_NAME
Description: The name of the Batch account that the task belongs to.
Availability: All tasks.
Example: mybatchaccount

AZ_BATCH_ACCOUNT_URL
Description: The URL of the Batch account.
Availability: All tasks.
Example: https://myaccount.westus.batch.azu

AZ_BATCH_APP_PACKAGE
Description: A prefix of all the app package environment variables. For example, if Application "FOO" version "1" is installed onto a pool, the
environment variable is AZ_BATCH_APP_PACKAGE_FOO_1 (on Linux) or AZ_BATCH_APP_PACKAGE_FOO#1 (on Windows). AZ_BATCH_APP_PACKAGE_FOO_1
points to the location that the package was downloaded (a folder). When using the default version of the app package, use the
AZ_BATCH_APP_PACKAGE environment variable without the version numbers. On Linux, if the application package name is "Agent-linux-x64" and
the version is "1.1.46.0", the environment name is actually AZ_BATCH_APP_PACKAGE_agent_linux_x64_1_1_46_0, using underscores and lower
case. For more information, see Execute the installed applications.
Availability: Any task with an associated app package. Also available for all tasks if the node itself has application packages.
Example: AZ_BATCH_APP_PACKAGE_FOO_1 (Linux) or AZ_BATCH_APP_PACKAGE_FOO#1 (Windows)

AZ_BATCH_AUTHENTICATION_TOKEN
Description: An authentication token that grants access to a limited set of Batch service operations. This environment variable is only
present if the authenticationTokenSettings are set when the task is added. The token value is used in the Batch APIs as credentials to
create a Batch client, such as in the BatchClient.Open() .NET API. The token doesn't support private networking.
Availability: All tasks.
Example: OAuth2 access token

AZ_BATCH_CERTIFICATES_DIR
Description: A directory within the task working directory in which certificates are stored for Linux compute nodes. This environment
variable does not apply to Windows compute nodes.
Availability: All tasks.
Example: /mnt/batch/tasks/workitems/batchjo 1/task001/certs

AZ_BATCH_HOST_LIST
Description: The list of nodes that are allocated to a multi-instance task in the format nodeIP,nodeIP .
Availability: Multi-instance primary and subtasks.
Example: 10.0.0.4,10.0.0.5

AZ_BATCH_IS_CURRENT_NODE_MASTER
Description: Specifies whether the current node is the master node for a multi-instance task. Possible values are true and false .
Availability: Multi-instance primary and subtasks.
Example: true

AZ_BATCH_JOB_ID
Description: The ID of the job that the task belongs to.
Availability: All tasks except start task.
Example: batchjob001

AZ_BATCH_JOB_PREP_DIR
Description: The full path of the job preparation task directory on the node.
Availability: All tasks except start task and job preparation task. Only available if the job is configured with a job preparation task.
Example: AZ_BATCH_JOB_PREP_DIR

AZ_BATCH_JOB_PREP_WORKING_DIR
Description: The full path of the job preparation task working directory on the node.
Availability: All tasks except start task and job preparation task. Only available if the job is configured with a job preparation task.
Example: AZ_BATCH_JOB_PREP_WORKING_DIR

AZ_BATCH_MASTER_NODE
Description: The IP address and port of the compute node on which the primary task of a multi-instance task runs. Do not use the port
specified here for MPI or NCCL communication - it is reserved for the Azure Batch service. Use the variable MASTER_PORT instead, either by
setting it with a value passed in through command line argument (port 6105 is a good default choice), or using the value AML sets if it
does so.
Availability: Multi-instance primary and subtasks.
Example: 10.0.0.4:6000

AZ_BATCH_NODE_ID
Description: The ID of the node that the task is assigned to.
Availability: All tasks.
Example: tvm-1219235766_3-20160919t17271

AZ_BATCH_NODE_IS_DEDICATED
Description: If true , the current node is a dedicated node. If false , it is an Azure Spot node.
Availability: All tasks.
Example: true

AZ_BATCH_NODE_LIST
Description: The list of nodes that are allocated to a multi-instance task in the format nodeIP;nodeIP .
Availability: Multi-instance primary and subtasks.
Example: 10.0.0.4;10.0.0.5

AZ_BATCH_NODE_MOUNTS_DIR
Description: The full path of the node level file system mount location where all mount directories reside. Windows file shares use a
drive letter, so for Windows, the mount drive is part of devices and drives.
Availability: All tasks including start task have access to the user, given the user is aware of the mount permissions for the mounted
directory.
Example: AZ_BATCH_NODE_MOUNTS_DIR

AZ_BATCH_NODE_ROOT_DIR
Description: The full path of the root of all Batch directories on the node.
Availability: All tasks.
Example: AZ_BATCH_NODE_ROOT_DIR

AZ_BATCH_NODE_SHARED_DIR
Description: The full path of the shared directory on the node. All tasks that execute on a node have read/write access to this
directory. Tasks that execute on other nodes do not have remote access to this directory (it is not a "shared" network directory).
Availability: All tasks.
Example: AZ_BATCH_NODE_SHARED_DIR

AZ_BATCH_NODE_STARTUP_DIR
Description: The full path of the start task directory on the node.
Availability: All tasks.
Example: AZ_BATCH_NODE_STARTUP_DIR

AZ_BATCH_POOL_ID
Description: The ID of the pool that the task is running on.
Availability: All tasks.
Example: batchpool001

AZ_BATCH_TASK_DIR
Description: The full path of the task directory on the node. This directory contains the stdout.txt and stderr.txt for the task, and
the AZ_BATCH_TASK_WORKING_DIR.
Availability: All tasks.
Example: AZ_BATCH_TASK_DIR

AZ_BATCH_TASK_ID
Description: The ID of the current task.
Availability: All tasks except start task.
Example: task001

AZ_BATCH_TASK_SHARED_DIR
Description: A directory path that is identical for the primary task and every subtask of a multi-instance task. The path exists on every
node on which the multi-instance task runs, and is read/write accessible to the task commands running on that node (both the coordination
command and the application command). Subtasks or a primary task that execute on other nodes do not have remote access to this directory
(it is not a "shared" network directory).
Availability: Multi-instance primary and subtasks.
Example: AZ_BATCH_TASK_SHARED_DIR

AZ_BATCH_TASK_WORKING_DIR
Description: The full path of the task working directory on the node. The currently running task has read/write access to this directory.
Availability: All tasks.
Example: AZ_BATCH_TASK_WORKING_DIR

AZ_BATCH_TASK_RESERVED_EPHEMERAL_DISK_SPACE_BYTES
Description: The current threshold for disk space upon which the VM will be marked as DiskFull .
Availability: All tasks.
Example: 1000000

CCP_NODES
Description: The list of nodes and number of cores per node that are allocated to a multi-instance task. Nodes and cores are listed in the
format numNodes<space>node1IP<space>node1Cores<space>node2IP<space>node2Cores<space> ... , where the number of nodes is followed by one
or more node IP addresses and the number of cores for each.
Availability: Multi-instance primary and subtasks.
Example: 2 10.0.0.4 1 10.0.0.5 1
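The list-valued variables above use three different formats: commas for AZ_BATCH_HOST_LIST, semicolons for AZ_BATCH_NODE_LIST, and a space-separated count followed by IP and core pairs for CCP_NODES. The following sketch parses each format using the example values from the table; the function names are hypothetical.

```python
def parse_host_list(value):
    """Parse AZ_BATCH_HOST_LIST, for example '10.0.0.4,10.0.0.5'."""
    return [ip for ip in value.split(",") if ip]


def parse_node_list(value):
    """Parse AZ_BATCH_NODE_LIST, for example '10.0.0.4;10.0.0.5'."""
    return [ip for ip in value.split(";") if ip]


def parse_ccp_nodes(value):
    """Parse CCP_NODES into a list of (node IP, core count) tuples."""
    fields = value.split()
    count = int(fields[0])
    pairs = fields[1:]
    # The leading number must match the number of IP and core pairs.
    if len(pairs) != 2 * count:
        raise ValueError("CCP_NODES node count does not match its entries")
    return [(pairs[i], int(pairs[i + 1])) for i in range(0, len(pairs), 2)]


print(parse_host_list("10.0.0.4,10.0.0.5"))
print(parse_node_list("10.0.0.4;10.0.0.5"))
print(parse_ccp_nodes("2 10.0.0.4 1 10.0.0.5 1"))
```

In a multi-instance task, the input strings would be read from os.environ instead of being hard-coded.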

Important

Exact values for paths for Environment Variables are considered implementation details and are subject to change. Use the Batch provided
Environment Variables instead of attempting to construct raw path representations.

Environment variables related to directory location


The following table lists the directory postfix that each environment variable's value appends to the AZ_BATCH_NODE_ROOT_DIR value. For more
information, see AZ_BATCH_NODE_ROOT_DIR.


Environment Variable Name Environment Variable Value Directory Postfix

AZ_BATCH_NODE_STARTUP_DIR startup

AZ_BATCH_NODE_SHARED_DIR shared

AZ_BATCH_NODE_MOUNTS_DIR fsmounts

Task environment variables related to directory location


Job directories differ between a single-run job and a job schedule. The following table lists the job directory for each job type.


Job Type Job Directory Value Postfix after AZ_BATCH_NODE_ROOT_DIR

Job workitems\{job name}\job-1

Job Schedule workitems\{job schedule name}\{job name}

The following table specifies the values of each environment variable value postfix after the job directory.


Environment Variable Name Environment Variable Value Directory Postfix After Job Directory

AZ_BATCH_TASK_WORKING_DIR {task name}\wd

AZ_BATCH_TASK_DIR {task name}

AZ_BATCH_TASK_SHARED_DIR {task name}

AZ_BATCH_JOB_PREP_DIR {job preparation task name}

AZ_BATCH_JOB_PREP_WORKING_DIR {job preparation task name}\wd

Next steps
Learn how to use environment variables with Batch.
Learn more about files and directories in Batch.
Learn about multi-instance tasks.
Migrate Azure Batch custom image pools
to Azure Compute Gallery
Article • 04/25/2025

To improve reliability, scale, and align with modern Azure offerings, Azure Batch will retire
custom image Batch pools specified from virtual hard disk (VHD) blobs in Azure Storage and
Azure Managed Images on March 31, 2026. Learn how to migrate your Azure Batch custom
image pools using Azure Compute Gallery.

Feature end of support


When you create an Azure Batch pool using the Virtual Machine Configuration, you specify an
image reference that provides the operating system for each compute node in the pool. You
can create a pool of virtual machines either with a supported Azure Marketplace image or with
a custom image. Custom images from VHD blobs and managed images are either legacy
offerings or non-scalable solutions for Azure Batch. To ensure reliable infrastructure
provisioning at scale, all custom image sources other than Azure Compute Gallery will be
retired on March 31, 2026.

Alternative: Use Azure Compute Gallery references
for Batch custom image pools
When you use the Azure Compute Gallery (formerly known as Shared Image Gallery) for your
custom image, you have control over the operating system type and configuration, and the
type of data disks. Your shared image can include applications and reference data that become
available on all the Batch pool nodes as soon as they're provisioned. You can also have multiple
versions of an image as needed for your environment. When you use an image version to
create a VM, the image version is used to create new disks for the VM.

Using a shared image saves time in preparing your pool's compute nodes to run your Batch
workload. It's possible to use an Azure Marketplace image and install software on each
compute node after allocation. However, using a shared image can bring compute nodes to a
ready state faster and make workloads more reproducible. Additionally, you can specify
multiple replicas for the shared image so when you create pools with many compute nodes,
provisioning latencies can be lower.

Migrate your eligible pools


To migrate your Batch custom image pools from managed image to shared image, review the
Azure Batch guide on using Azure Compute Gallery to create a custom image pool.

If you have either a VHD blob or a managed image, you can convert them directly to a
Compute Gallery image that can be used with Azure Batch custom image pools. When you're
creating a VM image definition for a Compute Gallery, on the Version tab, you can select a
source option to migrate from, including types being retired for Batch custom image pools:


Source: Managed image
Other fields: Select the Source image from the drop-down. The managed image must be in the
same region that you chose in Instance details.

Source: VHD in a storage account
Other fields: Select Browse to choose the storage account for the VHD.

For more information about this process, see creating an image definition and version for
Compute Gallery.

FAQs
How can I create an Azure Compute Gallery?

See the guide for Compute Gallery creation.

How do I create a Pool with a Compute Gallery image?

See the guide for creating a Pool with a Compute Gallery image.

What considerations are there for Compute Gallery image based Pools?

See the considerations for large pools.

Can I use Azure Compute Gallery images in different subscriptions or in different
Microsoft Entra tenants?

If the Shared Image isn't in the same subscription as the Batch account, you must register
the Microsoft.Batch resource provider for that subscription. The two subscriptions must
be in the same Microsoft Entra tenant. The image can be in a different region as long as it
has replicas in the same region as your Batch account.

Next steps
For more information, see Azure Compute Gallery.
Migrate Batch low-priority VMs to Spot
VMs
04/09/2025

The ability to allocate low-priority compute nodes in Azure Batch pools is being retired on
September 30, 2025. Learn how to migrate your Batch pools with low-priority compute nodes
to compute nodes based on Spot instances.

About the feature


Currently, as part of a Batch pool configuration, you can specify a target number of low-priority
compute nodes in Batch managed pool allocation Batch accounts. In user subscription pool
allocation Batch accounts, you can specify a target number of spot compute nodes. In both
cases, these compute resources are allocated from spare capacity and offered at a discount
compared to dedicated, on-demand VMs.

The amount of unused capacity that's available varies depending on factors such as VM family,
VM size, region, and time of day. Unlike dedicated capacity, these low-priority or spot VMs can
be reclaimed at any time by Azure. Therefore, low-priority and spot VMs are typically viable for
Batch workloads that are amenable to interruption or don't require strict completion
timeframes to potentially lower costs.

Feature end of support


Only low-priority compute nodes in Batch are being retired. Spot compute nodes will continue
to be supported, are a GA offering, and aren't affected by this deprecation. On September 30,
2025, we'll retire low-priority compute nodes. After that date, existing low-priority pools in
Batch may no longer be usable, attempts to seek back to target low-priority node counts will
fail, and you'll no longer be able to provision new pools with low-priority compute nodes.

Alternative: Use Azure Spot-based compute nodes
in Batch pools
In December 2021, Azure Batch began offering Spot-based compute nodes. Like
low-priority VMs, you can use spot instances to obtain spare capacity at a discounted price in
exchange for the possibility that the VM will be preempted. If a preemption occurs, the spot
compute node will be evicted and all work that wasn't appropriately checkpointed will be lost.
Checkpointing is optional and is up to the Batch end-user to implement. The running Batch
task that was interrupted due to preemption will be automatically requeued for execution by a
different compute node. Additionally, Azure Batch will automatically attempt to seek back to
the target Spot node count as specified on the pool.

See the detailed breakdown between the low-priority and spot offering in Batch.

Migrate a Batch pool with low-priority compute
nodes or create a Batch pool with Spot instances
1. Ensure that you're using a user subscription pool allocation mode Batch account.

2. In the Azure portal, select the Batch account and view an existing pool or create a new
pool.

3. Under Scale, select either Target dedicated nodes or Target Spot/low-priority nodes.

4. For an existing pool, select the pool, and then select Scale to update the number of spot
nodes required based on the job scheduled.

5. Select Save.

To verify that the migration is correctly applied, run:

Azure CLI

az batch pool show \
    --account-name <your-batch-account-name> \
    --account-endpoint "https://<your-batch-account-name>.<region>.batch.azure.com" \
    --pool-id <your-pool-id> \
    --query "{PoolID:id, VMSize:vmSize, SpotNodes:scaleSettings.targetLowPriorityNodes}"

FAQs
How do I create a user subscription pool allocation Batch account?

See the quickstart to create a new Batch account in user subscription pool allocation
mode.

Are Spot VMs available in Batch managed pool allocation accounts?

No. Spot VMs are available only in user subscription pool allocation Batch accounts.

Are spot instances available for CloudServiceConfiguration Pools?

No. Spot instances are only available for VirtualMachineConfiguration pools.
CloudServiceConfiguration pools will be retired before low-priority pools. We
recommend that you migrate to VirtualMachineConfiguration pools and user
subscription pool allocation Batch accounts before then.

What is the pricing and eviction policy of spot instances? Can I view pricing history and
eviction rates?

Yes. In the Azure portal, you can see historical pricing and eviction rates per size in a
region.

For more information about using spot VMs, see Spot Virtual Machines.

Can I transfer my quotas between Batch accounts?

Currently you can't transfer any quotas between Batch accounts.

Next steps
See the Batch Spot compute instance guide for further details on the differences
between offerings, limitations, and deployment examples.
Migrate Azure Batch pools to the simplified
compute node communication model
Article • 04/02/2025

To improve security, simplify the user experience, and enable key future improvements, Azure
Batch will retire the classic compute node communication model on March 31, 2026. Learn how
to migrate your Batch pools to use the simplified compute node communication model.

About the feature


An Azure Batch pool contains one or more compute nodes, which execute user-specified
workloads in the form of Batch tasks. To enable Batch functionality and Batch pool
infrastructure management, compute nodes must communicate with the Azure Batch service.
In the classic compute node communication model, the Batch service initiates communication
to the compute nodes and compute nodes must be able to communicate with Azure Storage
for baseline operations. In the simplified compute node communication model, Batch pools
only require outbound access to the Batch service for baseline operations.

Feature end of support


The simplified compute node communication model will replace the classic compute node
communication model after March 31, 2026. The change is introduced in two phases:

From now until September 30, 2024, the default node communication mode for newly
created Batch pools with virtual networks will remain as classic.
After September 30, 2024, the default node communication mode for newly created Batch
pools with virtual networks will switch to simplified.

After March 31, 2026, the option to use classic compute node communication mode will no
longer be honored. Batch pools without user-specified virtual networks are generally
unaffected by this change and the Batch service controls the default communication mode.
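The two phases and the retirement date above can be written down as date comparisons. This is a sketch under stated assumptions: the function names and the "classic"/"simplified" labels are illustrative, and the logic covers only pools created with user-specified virtual networks, as the text describes.

```python
from datetime import date

DEFAULT_SWITCH_DATE = date(2024, 9, 30)      # last day the default stays classic
CLASSIC_RETIREMENT_DATE = date(2026, 3, 31)  # last day classic is still honored


def default_mode_for_vnet_pool(created_on: date) -> str:
    """Default communication mode for a newly created pool with a virtual network."""
    return "classic" if created_on <= DEFAULT_SWITCH_DATE else "simplified"


def classic_mode_honored(on: date) -> bool:
    """Whether an explicit request for classic mode is still honored on the given date."""
    return on <= CLASSIC_RETIREMENT_DATE


print(default_mode_for_vnet_pool(date(2025, 1, 15)))
print(classic_mode_honored(date(2026, 4, 1)))
```
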

Alternative: Use simplified compute node
communication model
The simplified compute node communication mode streamlines the way Batch pool
infrastructure is managed on behalf of users. This communication mode reduces the
complexity and scope of inbound and outbound networking connections required in the
baseline operations.
The simplified model also provides more fine-grained data exfiltration control, since outbound
communication to Storage.region is no longer required. You can explicitly lock down outbound
communication to Azure Storage if necessary for your workflow. For example, autostorage
accounts for AppPackages and other storage accounts for resource files or output files can be
scoped appropriately.

Migrate your eligible pools


To migrate your Batch pools from classic to the simplified compute node communication
model, follow the guidance in the section titled Potential impact between classic and
simplified communication modes. You can either create new pools or update existing pools
with simplified compute node communication.

FAQs
Are public IP addresses still required for my pools?

By default, a public IP address is still needed to initiate the outbound connection to the
Azure Batch service from compute nodes. If you want to eliminate the need for public IP
addresses from compute nodes entirely, see the guide to create a simplified node
communication pool without public IP addresses.

How can I connect to my nodes for diagnostic purposes?

RDP or SSH connectivity to the node is unaffected – load balancer(s) are still created
which can route those requests through to the node when provisioned with a public IP
address.

Are there any differences in billing?

There should be no cost or billing implications for the new model.

Are there any changes to Azure Batch agents on the compute node?

An extra agent on compute nodes is invoked in simplified compute node communication
mode for both Linux and Windows: Microsoft.BatchClusters.Agent and
Microsoft.BatchClusters.Agent.exe, respectively.

Are there any changes to how my linked resources from Azure Storage in Batch pools and
tasks are downloaded?

This behavior is unaffected. All requests for user-specified resources that require Azure
Storage, such as resource files, output files, or application packages, are made from the
compute node directly to Azure Storage. You need to ensure your networking configuration
allows these flows.

Next steps
For more information, see Simplified compute node communication.
Create a Batch account in the Azure
portal
Article • 04/02/2025

This article shows how to use the Azure portal to create an Azure Batch account that has
account properties to fit your compute scenario. You see how to view account
properties like access keys and account URLs. You also learn how to configure and
create user subscription mode Batch accounts.

For background information about Batch accounts and scenarios, see Batch service
workflow and resources.

Create a Batch account


When you create a Batch account, you can choose between user subscription and Batch
service pool allocation modes. For most cases, you should use the default Batch service
pool allocation mode. In Batch service mode, compute and virtual machine (VM)-related
resources for pools are allocated on Batch service managed Azure subscriptions.

In user subscription pool allocation mode, compute and VM-related resources for pools
are created directly in the Batch account subscription when a pool is created. In
scenarios where you create a Batch pool in a virtual network that you specify, certain
networking related resources are created in the subscription of the virtual network.

To create a Batch account in user subscription pool allocation mode, you must also
register your subscription with Azure Batch, and associate the account with Azure Key
Vault. For more information about requirements for user subscription pool allocation
mode, see Configure user subscription mode.

To create a Batch account in the default Batch service mode:

1. Sign in to the Azure portal .

2. In the Azure Search box, enter and then select batch accounts.

3. On the Batch accounts page, select Create.

4. On the New Batch account page, enter or select the following details.

Subscription: Select the subscription to use if not already selected.


Resource group: Select the resource group for the Batch account, or create a
new one.

Account name: Enter a name for the Batch account. The name must be
unique within the Azure region, can contain only lowercase characters or
numbers, and must be 3-24 characters long.

Note

The Batch account name is part of its ID and can't be changed after
creation.

Location: Select the Azure region for the Batch account if not already
selected.

Storage account: Optionally, select Select a storage account to associate an
Azure Storage account with the Batch account.

On the Choose storage account screen, select an existing storage account or
select Create new to create a new one. A general-purpose v2 storage account
is recommended for the best performance.

5. Optionally, select Next: Advanced or the Advanced tab to specify Identity type,
Pool allocation mode, and Authentication mode. The default options work for
most scenarios. To create the account in User subscription mode, see Configure
user subscription mode.

6. Optionally, select Next: Networking or the Networking tab to configure public
network access for your Batch account.

7. Select Review + create, and when validation passes, select Create to create the
Batch account.
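The account name rule from step 4 can be checked before you submit the form. The following is a minimal sketch; the function name is hypothetical, it reads "lowercase characters or numbers" as the ASCII letters a-z and digits 0-9, and it can't check the uniqueness requirement, which only the service can verify.

```python
import re

# 3-24 characters, lowercase letters or digits only, per the rule in step 4.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_batch_account_name(name: str) -> bool:
    """Check the length and character rules for a Batch account name."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


for candidate in ("mybatch01", "My-Batch", "ab"):
    print(candidate, is_valid_batch_account_name(candidate))
```
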
View Batch account properties
Once the account is created, select Go to resource to access its settings and properties.
Or search for and select batch accounts in the portal Search box, and select your account
from the list on the Batch accounts page.

On your Batch account page, you can access all account settings and properties from
the left navigation menu.

When you develop an application by using the Batch APIs, you use an account URL
and key to access your Batch resources. To view the Batch account access
information, select Keys.
Batch also supports Microsoft Entra authentication. User subscription mode Batch
accounts must be accessed by using Microsoft Entra ID. For more information, see
Authenticate Azure Batch services with Microsoft Entra ID.

To view the name and keys of the storage account associated with your Batch
account, select Storage account.

To view the resource quotas that apply to the Batch account, select Quotas.

Configure user subscription mode


You must take several steps before you can create a Batch account in user subscription
mode.

Important

To create a Batch account in user subscription mode, you must have Contributor or
Owner role in the subscription.

Accept legal terms


You must accept the legal terms for the image before you use a subscription with a
Batch account in user subscription mode. If you haven't done this action, you might get
the error Allocation failed due to marketplace purchase eligibility when you try to
allocate Batch nodes.

To accept the legal terms, run the commands Get-AzMarketplaceTerms and
Set-AzMarketplaceTerms in PowerShell. Set the following parameters based on your Batch
pool's configuration:

Publisher : The image's publisher


Product : The image offer

Name : The offer SKU

For example:

PowerShell

Get-AzMarketplaceTerms -Publisher 'microsoft-azure-batch' -Product 'ubuntu-server-container' -Name '20-04-lts' | Set-AzMarketplaceTerms -Accept

Important

If you've enabled Private Azure Marketplace, you must follow the steps in Add new
collection to add a new collection to allow the selected image.

Allow Batch to access the subscription


When you create the first user subscription mode Batch account in an Azure
subscription, you must register your subscription with the Batch resource provider, and
assign the Azure Batch Service Orchestration Role to the Microsoft Azure Batch service
principal. You need to do this configuration only once per subscription.

Important

You need Owner permissions in the subscription to take this action.

1. In the Azure portal , search for and select subscriptions.

2. On the Subscriptions page, select the subscription you want to use for the Batch
account.

3. On the Subscription page, select Resource providers from the left navigation.
4. On the Resource providers page, search for Microsoft.Batch. If the Microsoft.Batch
resource provider appears as NotRegistered, select it and then select Register at
the top of the screen.

5. Return to the Subscription page and select Access control (IAM) from the left
navigation.

6. At the top of the Access control (IAM) page, select Add > Add role assignment.

7. On the Role tab, search for and select Azure Batch Service Orchestration Role,
and then select Next.

8. On the Members tab, select Select members. On the Select members screen,
search for and select Microsoft Azure Batch, and then select Select.

9. Select Review + assign to go to the Review + assign tab, and select Review + assign
again to apply role assignment changes.

For detailed steps, see Assign Azure roles by using the Azure portal.

Create a key vault


User subscription mode requires Azure Key Vault. The key vault must be in the same
subscription and region as the Batch account.

To create a new key vault:

1. Search for and select key vaults from the Azure Search box, and then select Create
on the Key vaults page.
2. On the Create a key vault page, enter a name for the key vault, and choose an
existing resource group or create a new one in the same region as your Batch
account.
3. On the Access configuration tab, select either Azure role-based access control or
Vault access policy under Permission model, and under Resource access, check all
3 checkboxes for Azure Virtual Machine for deployment, Azure Resource
Manager for template deployment and Azure Disk Encryption for volume
encryption.
4. Leave the remaining settings at default values, select Review + create, and then
select Create.
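The placement requirement above (same subscription and same region) could be checked from resource IDs before creating the account. This is a sketch under assumptions: the function names are hypothetical, the sample IDs are made up, and it relies only on the standard Azure Resource Manager ID layout that starts with /subscriptions/<subscription-id>/. Region names are compared after removing spaces and case, which is an illustrative normalization.

```python
def subscription_of(resource_id: str) -> str:
    """Extract the subscription ID from an Azure Resource Manager resource ID."""
    parts = resource_id.strip("/").split("/")
    if len(parts) < 2 or parts[0].lower() != "subscriptions":
        raise ValueError(f"not a subscription-scoped resource ID: {resource_id!r}")
    return parts[1].lower()


def normalize_location(location: str) -> str:
    """Treat display names such as 'East US' and short names such as 'eastus' alike."""
    return location.replace(" ", "").lower()


def key_vault_matches_batch_account(
    batch_id: str, batch_location: str, vault_id: str, vault_location: str
) -> bool:
    """True when the vault is in the same subscription and region as the Batch account."""
    same_subscription = subscription_of(batch_id) == subscription_of(vault_id)
    same_region = normalize_location(batch_location) == normalize_location(vault_location)
    return same_subscription and same_region
```
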

Create a Batch account in user subscription mode


To create a Batch account in user subscription mode:

1. Follow the preceding instructions to create a Batch account, but select User
subscription for Pool allocation mode on the Advanced tab of the New Batch
account page.
2. You must then select Select a key vault to select an existing key vault or create a
new one.
3. After you select the key vault, select the checkbox next to I agree to grant Azure
Batch access to this key vault.
4. Select Review + create, and then select Create to create the Batch account.

Create a Batch account with designated authentication
mode
To create a Batch account with authentication mode settings:

1. Follow the preceding instructions to create a Batch account, but select Batch
Service for Authentication mode on the Advanced tab of the New Batch account
page.

2. You must then select Authentication mode to define, by authentication mode property
key, which authentication modes the Batch account can use.

3. You can select any of the three authentication modes (Microsoft Entra ID, Shared Key,
or Task Authentication Token) for the Batch account to support, or leave the settings
at default values.
4. Leave the remaining settings at default values, select Review + create, and then
select Create.

 Tip

For enhanced security, it is advised to confine the authentication mode of the Batch
account solely to Microsoft Entra ID. This measure mitigates the risk of shared key
exposure and introduces additional RBAC controls. For more details, see Batch
security best practices.

Warning

The Task Authentication Token will retire on September 30, 2024. Should you
require this feature, it is recommended to use User assigned managed identity in
the Batch pool as an alternative.

Grant access to the key vault manually


To grant access to the key vault manually in the Azure portal, you need to assign the Key
Vault Secrets Officer role for Batch:

1. Select Access control (IAM) from the left navigation of the key vault page.
2. At the top of the Access control (IAM) page, select Add > Add role assignment.
3. On the Add role assignment screen, under Role tab, under Job function roles sub
tab, search and select Key Vault Secrets Officer role for the Batch account, and
then select Next.
4. On the Members tab, select Select members. On the Select members screen,
search for and select Microsoft Azure Batch, and then select Select.
5. Select the Review + create button on the bottom to go to Review + assign tab,
and select the Review + create button on the bottom again.

For detailed steps, see Assign Azure roles by using the Azure portal.

Note

A KeyVaultNotFound error is returned during Batch account creation if the RBAC role
isn't assigned for Batch in the referenced key vault.

If the Key Vault permission model is Vault access policy, you also need to configure the
Access policies:

1. Select Access policies from the left navigation of the key vault page.

2. On the Access policies page, select Create.

3. On the Create an access policy screen, select a minimum of Get, List, Set, Delete,
and Recover permissions under Secret permissions.
4. Select Next.

5. On the Principal tab, search for and select Microsoft Azure Batch.

6. Select the Review + create tab, and then select Create.
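The minimum secret permissions named in step 3 can be compared against an existing policy. The sketch below is illustrative: the function name and the sample list of granted permissions are assumptions, and in practice the granted list would come from your own inventory of the vault's access policies.

```python
# Minimum secret permissions from step 3 above.
REQUIRED_SECRET_PERMISSIONS = {"get", "list", "set", "delete", "recover"}


def missing_secret_permissions(granted: list[str]) -> list[str]:
    """Return the required secret permissions that the policy doesn't grant, sorted."""
    granted_normalized = {permission.lower() for permission in granted}
    return sorted(REQUIRED_SECRET_PERMISSIONS - granted_normalized)


# Hypothetical policy that lacks two of the required permissions.
print(missing_secret_permissions(["Get", "List", "Set"]))
```

An empty result means the policy meets the minimum; extra permissions are ignored.
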

Configure subscription quotas


For user subscription Batch accounts, core quotas must be set manually. Standard Batch
core quotas don't apply to accounts in user subscription mode. The quotas in your
subscription for regional compute cores, per-series compute cores, and other resources
are used and enforced.

To view and configure the core quotas associated with your Batch account:

1. In the Azure portal , select your user subscription mode Batch account.
2. From the left menu, select Quotas.

Other Batch account management options


You can also create and manage Batch accounts by using the following tools:

Batch PowerShell cmdlets


Azure CLI
Batch Management .NET

Next steps
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
Learn the basics of developing a Batch-enabled application by using the Batch
.NET client library or Python. These quickstarts guide you through a sample
application that uses the Batch service to execute a workload on multiple compute
nodes, using Azure Storage for workload file staging and retrieval.

Manage Batch accounts and quotas with
the Batch Management client library for
.NET
Article • 05/09/2025

You can lower maintenance overhead in your Azure Batch applications by using the Batch
Management .NET library to automate Batch account creation, deletion, key management, and
quota discovery.

Create and delete Batch accounts within any region. If, as an independent software
vendor (ISV) for example, you provide a service for your clients in which each is assigned
a separate Batch account for billing purposes, you can add account creation and deletion
capabilities to your customer portal.
Retrieve and regenerate account keys programmatically for any of your Batch accounts.
This can help you comply with security policies that enforce periodic rollover or expiry of
account keys. When you have several Batch accounts in various Azure regions,
automation of this rollover process increases your solution's efficiency.
Check account quotas and take the trial-and-error guesswork out of determining which
Batch accounts have what limits. By checking your account quotas before starting jobs,
creating pools, or adding compute nodes, you can proactively adjust where or when
these compute resources are created. You can determine which accounts require quota
increases before allocating additional resources in those accounts.
Combine features of other Azure services for a full-featured management experience by
using Batch Management .NET, Microsoft Entra ID, and the Azure Resource Manager
together in the same application. By using these features and their APIs, you can provide
a frictionless authentication experience, the ability to create and delete resource groups,
and the capabilities that are described above for an end-to-end management solution.

Note

While this article focuses on the programmatic management of your Batch accounts, keys,
and quotas, you can also perform many of these activities by using the Azure portal.

Create and delete Batch accounts


One of the primary features of the Batch Management API is to create and delete Batch
accounts in an Azure region. To do so, use BatchAccountCollection.CreateOrUpdate and Delete,
or their asynchronous counterparts.
The following code snippet creates an account, obtains the newly created account from the
Batch service, and then deletes it.

C#

using Azure;
using Azure.Core;
using Azure.Identity;
using Azure.ResourceManager;
using Azure.ResourceManager.Batch;
using Azure.ResourceManager.Batch.Models;
using Azure.ResourceManager.Resources;

string subscriptionId = "Your SubscriptionID";
string resourceGroupName = "Your ResourceGroup name";

var credential = new DefaultAzureCredential();
ArmClient _armClient = new ArmClient(credential);

ResourceIdentifier resourceGroupResourceId =
    ResourceGroupResource.CreateResourceIdentifier(subscriptionId, resourceGroupName);
ResourceGroupResource resourceGroupResource =
    _armClient.GetResourceGroupResource(resourceGroupResourceId);

var data = new BatchAccountCreateOrUpdateContent(AzureLocation.EastUS);

// Create a new batch account
resourceGroupResource.GetBatchAccounts().CreateOrUpdate(WaitUntil.Completed,
    "Your BatchAccount name", data);

// Get an existing batch account
BatchAccountResource batchAccount =
    resourceGroupResource.GetBatchAccount("Your BatchAccount name");

// Delete the batch account
batchAccount.Delete(WaitUntil.Completed);

Note

Applications that use the Batch Management .NET library require service administrator or
coadministrator access to the subscription that owns the Batch account to be managed.
For more information, see the Microsoft Entra ID section and the AccountManagement
code sample.

Retrieve and regenerate account keys


Obtain primary and secondary account keys from any Batch account within your subscription
by using GetKeys. You can regenerate those keys by using RegenerateKey.

C#

string subscriptionId = "Your SubscriptionID";
string resourceGroupName = "Your ResourceGroup name";

var credential = new DefaultAzureCredential();
ArmClient _armClient = new ArmClient(credential);

ResourceIdentifier resourceGroupResourceId =
    ResourceGroupResource.CreateResourceIdentifier(subscriptionId, resourceGroupName);
ResourceGroupResource resourceGroupResource =
    _armClient.GetResourceGroupResource(resourceGroupResourceId);

// Get an existing batch account
BatchAccountResource batchAccount =
    resourceGroupResource.GetBatchAccount("Your BatchAccount name");

// Get and print the primary and secondary keys
BatchAccountKeys accountKeys = batchAccount.GetKeys();
Console.WriteLine("Primary key: {0}", accountKeys.Primary);
Console.WriteLine("Secondary key: {0}", accountKeys.Secondary);

// Regenerate the primary key
BatchAccountRegenerateKeyContent regenerateKeyContent =
    new BatchAccountRegenerateKeyContent(BatchAccountKeyType.Primary);
batchAccount.RegenerateKey(regenerateKeyContent);

 Tip

You can create a streamlined connection workflow for your management applications.
First, obtain an account key for the Batch account you wish to manage with GetKeys.
Then, use this key when initializing the Batch .NET library's BatchSharedKeyCredentials
class, which is used when initializing BatchClient.

Check Azure subscription and Batch account
quotas
Azure subscriptions and the individual Azure services like Batch all have default quotas that
limit the number of certain entities within them. For the default quotas for Azure subscriptions,
see Azure subscription and service limits, quotas, and constraints. For the default quotas of the
Batch service, see Quotas and limits for the Azure Batch service. By using the Batch
Management .NET library, you can check these quotas in your applications. This enables you to
make allocation decisions before you add accounts or compute resources like pools and
compute nodes.

Check an Azure subscription for Batch account quotas


Before creating a Batch account in a region, you can check your Azure subscription to see
whether you are able to add an account in that region.

In the code snippet below, we first use GetBatchAccounts to get a collection of all Batch
accounts that are within a subscription. Once we've obtained this collection, we determine how
many accounts are in the target region. Then we use GetBatchQuotas to obtain the Batch
account quota and determine how many accounts (if any) can be created in that region.

C#

string subscriptionId = "Your SubscriptionID";

ArmClient _armClient = new ArmClient(new DefaultAzureCredential());

ResourceIdentifier subscriptionResourceId =
    SubscriptionResource.CreateResourceIdentifier(subscriptionId);
SubscriptionResource subscriptionResource =
    _armClient.GetSubscriptionResource(subscriptionResourceId);

// Get a collection of all Batch accounts within the subscription
var batchAccounts = subscriptionResource.GetBatchAccounts();
Console.WriteLine("Total number of Batch accounts under subscription id {0}: {1}",
    subscriptionId, batchAccounts.Count());

// Get a count of all accounts within the target region
string region = "eastus";
int accountsInRegion = batchAccounts.Count(o => o.Data.Location == region);

// Get the account quota for the specified region
BatchLocationQuota batchLocationQuota =
    subscriptionResource.GetBatchQuotas(AzureLocation.EastUS);
Console.WriteLine("Account quota for {0} region: {1}", region,
    batchLocationQuota.AccountQuota);

// Determine how many accounts can be created in the target region
Console.WriteLine("Accounts in {0}: {1}", region, accountsInRegion);
Console.WriteLine("You can create {0} accounts in the {1} region.",
    batchLocationQuota.AccountQuota - accountsInRegion, region);

In the snippet above, the ArmClient authenticates by using DefaultAzureCredential. To see
an example of authenticating in a complete application, see the AccountManagement code
sample on GitHub.
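The arithmetic in the snippet above is independent of the SDK, so it can be sketched on its own. In this hypothetical example the list of account regions and the quota value are made-up inputs standing in for what GetBatchAccounts and GetBatchQuotas return; the function name is an assumption.

```python
def remaining_account_slots(account_regions: list[str], region: str, account_quota: int) -> int:
    """How many more Batch accounts fit in a region, given the regional account quota."""
    accounts_in_region = sum(
        1 for location in account_regions if location.lower() == region.lower()
    )
    # Clamp at zero so an already exceeded quota doesn't yield a negative count.
    return max(account_quota - accounts_in_region, 0)


# Made-up data: three existing accounts, two of them in eastus, quota of 3 per region.
print(remaining_account_slots(["eastus", "westus", "eastus"], "eastus", 3))
```

Clamping at zero is a design choice of this sketch; the C# snippet above prints the raw difference.
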

Check a Batch account for compute resource quotas


Before increasing compute resources in your Batch solution, you can check to ensure the
resources you want to allocate won't exceed the account's quotas. In the code snippet below,
we print the quota information for an existing Batch account. In your own
application, you could use such information to determine whether the account can handle the
additional resources to be created.
C#

string subscriptionId = "Your SubscriptionID";
string resourceGroupName = "Your ResourceGroup name";

var credential = new DefaultAzureCredential();
ArmClient _armClient = new ArmClient(credential);

ResourceIdentifier resourceGroupResourceId =
    ResourceGroupResource.CreateResourceIdentifier(subscriptionId, resourceGroupName);
ResourceGroupResource resourceGroupResource =
    _armClient.GetResourceGroupResource(resourceGroupResourceId);

// Get an existing batch account
BatchAccountResource batchAccount =
    resourceGroupResource.GetBatchAccount("Your BatchAccount name");

// Now print the compute resource quotas for the account
Console.WriteLine("Core quota: {0}", batchAccount.Data.DedicatedCoreQuota);
Console.WriteLine("Pool quota: {0}", batchAccount.Data.PoolQuota);
Console.WriteLine("Active job and job schedule quota: {0}",
    batchAccount.Data.ActiveJobAndJobScheduleQuota);
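Once the quota values are known, deciding whether a planned allocation fits is simple arithmetic. The following sketch assumes you track current core usage yourself; the function name and all numbers are hypothetical, and only the idea of comparing against the dedicated core quota comes from the text above.

```python
def fits_core_quota(core_quota: int, cores_in_use: int, cores_per_node: int, node_count: int) -> bool:
    """True when adding node_count nodes keeps total cores within the dedicated core quota."""
    requested_cores = cores_per_node * node_count
    return cores_in_use + requested_cores <= core_quota


# Hypothetical numbers: quota of 100 cores, 60 in use, ten 4-core nodes requested.
print(fits_core_quota(core_quota=100, cores_in_use=60, cores_per_node=4, node_count=10))
```
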

Important

While there are default quotas for Azure subscriptions and services, many of these limits
can be raised by requesting a quota increase in the Azure portal.

Use Microsoft Entra ID with Batch Management
.NET
The Batch Management .NET library is an Azure resource provider client, and is used together
with Azure Resource Manager to manage account resources programmatically. Microsoft Entra
ID is required to authenticate requests made through any Azure resource provider client,
including the Batch Management .NET library, and through Azure Resource Manager. For
information about using Microsoft Entra ID with the Batch Management .NET library, see Use
Microsoft Entra ID to authenticate Batch solutions.

Sample project on GitHub


To see Batch Management .NET in action, check out the AccountManagement sample project
on GitHub. The AccountManagement sample application demonstrates the following
operations:
1. Acquire a security token from Microsoft Entra ID by using the Microsoft Authentication
Library (MSAL), as described in Acquire and cache tokens using the Microsoft
Authentication Library (MSAL). If the user is not already signed in, they are
prompted for their Azure credentials.
2. With the security token obtained from Microsoft Entra ID, create a SubscriptionClient to
query Azure for a list of subscriptions associated with the account. The user can select a
subscription from the list if it contains more than one subscription.
3. Get credentials associated with the selected subscription.
4. Create a ResourceManagementClient object by using the credentials.
5. Use a ResourceManagementClient object to create a resource group.
6. Use a BatchManagementClient object to perform several Batch account operations:

Create a Batch account in the new resource group.


Get the newly created account from the Batch service.
Print the account keys for the new account.
Regenerate a new primary key for the account.
Print the quota information for the account.
Print the quota information for the subscription.
Print all accounts within the subscription.
Delete the newly created account.

7. Delete the resource group.

To run the sample application successfully, you must first register it with your Microsoft Entra
tenant in the Azure portal and grant permissions to the Azure Resource Manager API. Follow
the steps provided in Authenticate Batch Management solutions with Active Directory.

Next steps
Learn about the Batch service workflow and primary resources such as pools, nodes, jobs,
and tasks.
Learn the basics of developing a Batch-enabled application using the Batch .NET client
library or Python. These quickstarts guide you through a sample application that uses the
Batch service to execute a workload on multiple compute nodes, using Azure Storage for
workload file staging and retrieval.
Plan to manage costs for Azure Batch
Article • 03/27/2025

This article describes how you plan for and manage costs for Azure Batch. Before you
deploy the service, you can use the Azure pricing calculator to estimate costs for
Azure Batch. Later, as you deploy Azure resources, review the estimated costs.

After you start running Batch workloads, use Cost Management features to set budgets
and monitor costs. You can also review forecasted costs and identify spending trends to
identify areas where you might want to act. Costs for Azure Batch are only a portion of
the monthly costs in your Azure bill. Although this article explains how to plan for and
manage costs for Azure Batch, you're billed for all Azure services and resources used in
your Azure subscription, including the third-party services.

Prerequisites
Cost analysis in Cost Management supports most Azure account types, but not all of
them. To view the full list of supported account types, see Understand Cost
Management data. To view cost data, you need at least read access for an Azure
account. For information about assigning access to Microsoft Cost Management data,
see Assign access to data.

Estimate costs before using Azure Batch


Use the Azure pricing calculator to estimate costs before you add virtual machines.

1. On the Products tab, go to the Compute section or search for Batch in the search
bar. On the Batch tile, select Add to estimate and scroll down to the Your Estimate
section.

2. Notice that Azure Batch is a free service and that the costs associated with Azure
Batch are for the underlying resources that run your workloads. When adding
Azure Batch to your estimate, the pricing calculator automatically creates a
selection for Cloud Services and Virtual machines. You can read more about Azure
Cloud Services and Azure Virtual Machines (VMs) in each product's documentation.
What you need to know for estimating the cost of Azure Batch is that virtual
machines are the most significant resource.

Select options from the drop-downs. There are various options available to choose
from. The options that have the largest impact on your estimate total are your
virtual machine's operating system, the operating system license if applicable, the
VM size you select under INSTANCE, the number of instances you choose, and the
amount of time each month that your instances run.

Notice that the total estimate changes as you select different options. The estimate
appears in the upper corner and the bottom of the Your Estimate section.

You can learn more about the cost of running virtual machines from the Plan to
manage costs for virtual machines documentation.
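The main drivers listed above (VM size, number of instances, and running time) combine multiplicatively, which the following sketch illustrates. The hourly rate is a made-up placeholder, not an Azure price, and the function ignores licenses, storage, and networking; use the pricing calculator for real figures.

```python
def estimate_monthly_vm_cost(hourly_rate: float, instances: int, hours_per_month: float) -> float:
    """Rough compute-only estimate: rate per VM-hour x instance count x hours run."""
    return round(hourly_rate * instances * hours_per_month, 2)


# Placeholder rate of 0.10 per VM-hour, 10 instances, 200 hours in the month.
print(estimate_monthly_vm_cost(hourly_rate=0.10, instances=10, hours_per_month=200))
```

Because the pool can be scaled down when idle, hours_per_month is often the easiest input to reduce.
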

Understand the full billing model for Azure
Batch
Azure Batch runs on Azure infrastructure that accrues costs when you deploy new
resources. It's important to understand that other infrastructure costs might also
accrue.

How you're charged for Azure Batch


Azure Batch is a free service. There are no costs for Batch itself. However, there can be
charges for the underlying compute resources and software licenses used to run Batch
workloads. Costs may be incurred from virtual machines in a pool, data transfer from the
VM, or any input or output data stored in the cloud.

Costs that might accrue with Azure Batch


Although Batch itself is a free service, many of the underlying resources that run your
workloads aren't. These include:

Virtual Machines
To learn more about the costs associated with virtual machines, see the How
you're charged for virtual machines section of Plan to manage costs for virtual
machines.
Each VM in a pool created with Virtual Machine Configuration has an associated
OS disk that uses Azure-managed disks. Azure-managed disks have an
additional cost, and other disk performance tiers have different costs as well.
Storage
When applications are deployed to Batch node virtual machines using
application packages, you're billed for the Azure Storage resources that your
application packages consume. You're also billed for the storage of any input or
output files, such as resource files and other log data.
In general, the cost of storing data associated with Batch is much lower than
the cost of compute resources.
In some cases, a load balancer
Networking resources
For Virtual Machine Configuration pools, standard load balancers are used,
which require static IP addresses. The load balancers used by Batch are visible
for accounts configured in user subscription mode, but not those in Batch
service mode.
Standard load balancers incur charges for all data passed to and from Batch
pool VMs. Select Batch APIs that retrieve data from pool nodes (such as Get
Task/Node File), task application packages, resource/output files, and container
images also incur charges.
Virtual Network
Depending on what services you use, your Batch solution may incur additional
fees. Services commonly used with Batch that may have associated costs include:
Application Insights
Data Factory
Azure Monitor

Costs might accrue after resource deletion


After you delete Azure Batch resources, the following resources might continue to exist.
They continue to accrue costs until you delete them.

Virtual machine
Any disks deployed other than the OS and local disks
By default, the OS disk is deleted with the VM, but it can be set not to during
the VM's creation
Virtual network
Your virtual NIC and public IP, if applicable, can be set to delete along with your
virtual machine
Bandwidth
Load balancer

For virtual networks, one virtual network is billed per subscription and per region. Virtual
networks cannot span regions or subscriptions. Setting up private endpoints in vNet
setups may also incur charges.

Bandwidth is charged by usage; the more data transferred, the more you're charged.

Using Azure Prepayment with Azure Batch


While Azure Batch is a free service, you can pay for underlying resource charges with
your Azure Prepayment credit. However, you can't use Azure Prepayment credit to pay
for charges for third party products and services including those from the Azure
Marketplace.

View cost analysis and create budgets


As you use Azure resources with Azure Batch, you incur costs. Azure resource usage unit
costs vary by time intervals (seconds, minutes, hours, and days) or by unit usage (bytes,
megabytes, and so on.) As soon as Azure resource use starts, costs are incurred, and you
can see the costs in cost analysis. Microsoft Cost Management lets you plan, analyze
and reduce your spending to maximize your cloud investment. You can view and filter
Batch costs, forecast future costs, and set spending limits with alerts when those
limits are reached.

In the Azure portal, you can create budgets and spending alerts for your Batch pools or
Batch accounts. Budgets and alerts are useful for notifying stakeholders of any risks of
overspending, although it's possible for there to be a delay in spending alerts and to
slightly exceed a budget.

The following screenshot shows an example of the Cost analysis view for a subscription,
filtered to only display the accumulated costs associated with all Batch accounts. The
lower charts show how the total cost for the period selected can be categorized by
consumed service, location, and meter. While this is an example and is not meant to be
reflective of costs you may see for your subscriptions, it is typical in that the largest cost
is for the virtual machines that are allocated for Batch pool nodes.

A further level of cost analysis detail can be obtained by specifying a Resource filter. For
Batch accounts, these values are the Batch account name plus pool name. This allows
you to view costs for a specific pool, multiple pools, or one or more accounts.

View cost analysis for a Batch pool

Batch service pool allocation mode

For Batch accounts created with the Batch service pool allocation mode:

1. In the Azure portal, type in or select Cost Management + Billing .


2. Select your subscription in the Billing scopes section.
3. Under Cost Management, select Cost analysis.
4. Select Add Filter. In the first drop-down, select Resource.
5. In the second drop-down, select the Batch pool. When the pool is selected, you
see the cost analysis for your pool. The screenshot below shows example data.

The resulting cost analysis shows the cost of the pool as well as the resources that
contribute to this cost. In this example, the VMs used in the pool are the most costly
resource.

Note

The pool in this example uses Virtual Machine Configuration and is charged
based on the Virtual Machines pricing structure. Pools that use Cloud Services
Configuration are charged based on the Cloud Services pricing structure.

Tags can be associated with Batch accounts, allowing tags to be used for further cost
filtering. For example, tags can be used to associate project, user, or group information
with a Batch account. Tags cannot currently be associated with Batch pools.

User subscription pool allocation mode

For Batch accounts created with the user subscription pool allocation mode:

1. In the Azure portal, type in or select Cost Management + Billing .


2. Select your subscription in the Billing scopes section.
3. Under Cost Management, select Cost analysis.
4. Select Add Filter. In the first drop-down, select Tag.
5. In the second drop-down, select poolname.
6. In the third drop-down, select the Batch pool. When the pool is selected, you see
the cost analysis for your pool. The screenshot below shows example data.

Note that if you're interested in viewing cost data for all pools in a user subscription
Batch account, you can select batchaccountname in the second drop-down and the
name of your Batch account in the third drop-down.
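
The same grouping can be reproduced on exported cost data. This is a minimal sketch under stated assumptions: the tag names poolname and batchaccountname come from the steps above, but the record shape (a dict with "cost" and "tags") is hypothetical and is not the Cost Management export format.

```python
from collections import defaultdict


def total_cost_by_tag(records, tag_name):
    """Sum cost per value of the given tag; records without the tag are skipped."""
    totals = defaultdict(float)
    for record in records:
        tag_value = record.get("tags", {}).get(tag_name)
        if tag_value is not None:
            totals[tag_value] += record["cost"]
    return dict(totals)


# Made-up records for illustration only.
records = [
    {"cost": 12.5, "tags": {"poolname": "pool-a", "batchaccountname": "mybatch"}},
    {"cost": 7.5, "tags": {"poolname": "pool-b", "batchaccountname": "mybatch"}},
    {"cost": 2.0, "tags": {"poolname": "pool-a", "batchaccountname": "mybatch"}},
    {"cost": 1.0, "tags": {}},
]
print(total_cost_by_tag(records, "poolname"))
print(total_cost_by_tag(records, "batchaccountname"))
```

Grouping by poolname gives per-pool totals, while grouping by batchaccountname rolls all pools in the account into one figure.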

Note

Pools created by user subscription Batch accounts don't show up under the
Resource filter, though their usage still shows up when filtering for "virtual
machines" under service name.

Create a budget for a Batch pool


Budgets can be created and cost alerts issued when various percentages of a budget are
reached, such as 60%, 80%, and 100%. The budgets can specify one or more filters, so
you can monitor and alert on Batch account costs at various granularities.

1. From the Cost analysis page, select Budget: none.


2. Select Create new budget >.
3. Use the resulting window to configure a budget specifically for your pool. For
more information, see Tutorial: Create and manage Azure budgets.
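
To make the alert percentages concrete, the sketch below reports which thresholds a given spend has reached. It only illustrates the arithmetic: the function name is hypothetical, and the actual evaluation is performed by Cost Management, possibly with a delay as noted earlier.

```python
def crossed_thresholds(spend, budget, thresholds=(60, 80, 100)):
    """Return the alert percentages that the current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    # Compare spend * 100 with threshold * budget to avoid a division.
    return [t for t in thresholds if spend * 100 >= t * budget]


# Hypothetical figures: 425 spent against a budget of 500.
print(crossed_thresholds(spend=425.0, budget=500.0))
```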

Minimize costs associated with Azure Batch


Depending on your scenario, you may want to reduce costs as much as possible.
Consider using one or more of these strategies to maximize the efficiency of your
workloads and reduce potential costs.

Reduce pool node use

The largest costs associated with using Batch are typically from the virtual machines
allocated for pool nodes. For Virtual Machine configuration pools, the associated
managed disks used for the VM OS disks can also contribute significantly to costs.

Evaluate your Batch application to determine if pool nodes are being well utilized by job
tasks, or if pool nodes are idle for more than the expected time. It may be possible to
reduce the number of pool nodes that are allocated, reduce the rate of pool node scale-
up, or increase the rate of scale-down to increase utilization.

In addition to custom monitoring, Batch metrics can help to identify nodes that are
allocated but in an idle state. You can select a metric for most pool node states to view
by using Batch monitoring metrics in the Azure portal. The 'Idle Node Count' and
'Running Node Count' could be viewed to give an indication of how well the pool nodes
are utilized, for example.
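
As a rough illustration of how the two metrics can be combined, the following sketch computes a utilization ratio from sampled node counts. The function name and the sample values are hypothetical; the counts would come from the 'Running Node Count' and 'Idle Node Count' metrics.

```python
def pool_utilization(running_node_count, idle_node_count):
    """Fraction of running-or-idle nodes that are running tasks."""
    total = running_node_count + idle_node_count
    if total == 0:
        return 0.0
    return running_node_count / total


# Hypothetical (running, idle) samples taken at different times.
samples = [(8, 2), (3, 7), (0, 10)]
for running, idle in samples:
    ratio = pool_utilization(running, idle)
    print(f"running={running} idle={idle} utilization={ratio:.0%}")
```

A ratio that stays low over time suggests that the pool is larger than the workload needs or that scale-down is too slow.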

Ensure pool nodes are able to run tasks


Allocated nodes that are listed for a pool normally incur costs, but it is possible for pool
nodes to be in a state where they can't run tasks, such as 'unusable' or 'starttaskfailed'. Batch
APIs or metrics can be used to monitor for and detect this category of VM. The reason
for these states can then be determined and corrective action taken to reduce or
eliminate these unhealthy nodes.
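
A simple filter over a node listing is often enough to surface this category. The sketch below assumes each node is a dict with hypothetical "id" and "state" keys; the two state names are the ones mentioned above, and how you obtain the listing (Batch APIs or metrics) is left out.

```python
UNHEALTHY_STATES = {"unusable", "starttaskfailed"}


def find_unhealthy_nodes(nodes):
    """Return the IDs of nodes whose state means they can't run tasks."""
    return [node["id"] for node in nodes
            if node["state"].lower() in UNHEALTHY_STATES]


# Made-up node listing for illustration only.
nodes = [
    {"id": "node-1", "state": "idle"},
    {"id": "node-2", "state": "starttaskfailed"},
    {"id": "node-3", "state": "running"},
    {"id": "node-4", "state": "unusable"},
]
print(find_unhealthy_nodes(nodes))
```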

Use the right pool node VM size


Ensure the appropriate VM size is being used, so that VMs are utilized well when
running tasks while providing the performance necessary to complete your job tasks in
the required time. Pool node VMs can be underutilized in some situations, such as low
CPU usage. Costs can be saved by choosing a VM size with a lower price.

To determine VM utilization, you can log in to a node when running tasks to view
performance data or use monitoring capabilities, such as Application Insights, to obtain
performance data from pool nodes.

Use pool slots to reduce node requirements


Multiple task slots can be specified for a pool, so that the corresponding number of
tasks can be run in parallel on each node. Pool task slots can be used to reduce the
number of nodes used in a pool by choosing larger VM sizes and running multiple tasks
in parallel on the node to ensure the node is well utilized. If nodes are underutilized,
slots can be used to increase utilization. For example, for a single-threaded task
application, one slot per core could be configured. It is also possible to have more slots
than cores, for example when the application spends significant time blocked while
waiting for calls to external services to return.

Setting taskSchedulingPolicy to pack helps ensure VMs are utilized as much as possible,
with scaling more easily able to remove nodes not running any tasks.
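
The effect of task slots on pool size is simple ceiling division. The following is a minimal sketch with a hypothetical function name; it assumes every task occupies exactly one slot.

```python
def nodes_required(concurrent_tasks, task_slots_per_node):
    """Number of nodes needed to run the given tasks in parallel."""
    if concurrent_tasks < 0:
        raise ValueError("concurrent_tasks must be non-negative")
    if task_slots_per_node < 1:
        raise ValueError("task_slots_per_node must be at least 1")
    # Integer ceiling division.
    return (concurrent_tasks + task_slots_per_node - 1) // task_slots_per_node


print(nodes_required(64, 1))  # one task per node
print(nodes_required(64, 8))  # for example, one slot per core on an 8-core VM
```

Fewer, larger nodes are only cheaper if the larger VM size costs less than the equivalent number of small ones, so compare prices before changing the size.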

Use Azure Spot virtual machines


Azure Spot VMs reduce the cost of Batch workloads by taking advantage of surplus
computing capacity in Azure. When you specify Spot VMs in your pools, Batch uses this
surplus to run your workload. There can be substantial cost savings when you use Spot
VMs instead of dedicated VMs. Keep in mind that Spot VMs are not suitable for all
workloads, since there may not be available capacity to allocate, or they may get
preempted.

Use ephemeral OS disks


By default, pool nodes use managed disks, which incur costs. Virtual Machine
Configuration pools in some VM sizes can use ephemeral OS disks, which create the OS
disk on the VM cache or temporary SSD, to avoid extra costs associated with managed
disks.

Purchase reservations for virtual machine instances


If you intend to use Batch for a long period of time, you can reduce the cost of VMs by
using Azure Reservations for your workloads. A reservation rate is considerably lower
than a pay-as-you-go rate. Virtual machine instances used without a reservation are
charged at the pay-as-you-go rate. When you purchase a reservation, the reservation
discount is applied. When you commit to one-year or three-year plans for VM instances,
significant discounts are applied to VM usage, including VMs consumed via Batch pools.

It is important to note that the reservation discount is "use-it-or-lose-it." If no
matching resources are used for an hour, you lose the reservation quantity for that
hour. Unused reserved hours can't be carried forward, and are therefore lost if not used.
Batch workloads often scale the number of allocated VMs according to load and have
varying load, including periods where there is no load. Care therefore needs to be taken
determining the reservation amount, given that reserved hours are lost if Batch VMs are
scaled down below the reservation quantity.
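
The "use-it-or-lose-it" behavior can be quantified for a given load profile. The sketch below is an illustration only: the function names and the hourly VM counts are hypothetical, and real reservation matching also depends on VM size and scope.

```python
def unused_reserved_hours(hourly_vm_counts, reserved_quantity):
    """Reserved VM-hours lost because fewer VMs ran than were reserved."""
    return sum(max(reserved_quantity - used, 0) for used in hourly_vm_counts)


def covered_hours(hourly_vm_counts, reserved_quantity):
    """VM-hours that received the reservation discount."""
    return sum(min(used, reserved_quantity) for used in hourly_vm_counts)


# Hypothetical pool sizes over six hours, with a reservation for 4 VMs.
usage = [10, 6, 4, 2, 0, 0]
print("covered:", covered_hours(usage, 4))
print("lost:", unused_reserved_hours(usage, 4))
```

Running this for several candidate reservation quantities shows where the discount gained on busy hours stops outweighing the hours lost while the pool is scaled down.
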

Use automatic scaling

Automatic scaling dynamically scales the number of VMs in your Batch pool based on
the demands of the current job. When you scale the pool based on the lifetime of a job,
automatic scaling ensures that VMs are scaled up and used only when there is a job to
perform. When the job is complete, or when there are no jobs, the VMs are
automatically scaled down to save compute resources. Scaling allows you to lower the
overall cost of your Batch solution by using only the resources you need.

Next steps
Learn more about Microsoft Cost Management + Billing.
Learn about using Azure Spot VMs with Batch.



Configure public network access with Azure Batch accounts
Article • 11/21/2024

By default, Azure Batch accounts have public endpoints and are publicly accessible. This
article shows how to configure your Batch account to allow access from only specific
public IP addresses or IP address ranges.

IP network rules are configured on the public endpoints. IP network rules don't apply to
private endpoints configured with Private Link.

Each endpoint supports a maximum of 200 IP network rules.

Batch account public endpoints


Batch accounts have two public endpoints:

The Account endpoint is the endpoint for Batch Service REST API (data plane). Use
this endpoint for managing pools, compute nodes, jobs, tasks, etc.
The Node management endpoint is used by Batch pool nodes to access the Batch
node management service. This endpoint is only applicable when using simplified
compute node communication.

You can check both endpoints in account properties when you query the Batch account
with the Batch Management REST API. You can also check them in the overview for your
Batch account in the Azure portal.

You can configure public network access to Batch account endpoints with the following
options:
All networks: allow public network access with no restriction.
Selected networks: allow public network access with allowed network rules.
Disabled: disable public network access, and private endpoints are required to
access Batch account endpoints.

Access from selected public networks


1. In the portal, navigate to your Batch account.
2. Under Settings, select Networking.
3. On the Public access tab, select to allow public access from Selected networks.
4. Under access for each endpoint, enter a public IP address or address range in CIDR
notation one by one.
5. Select Save.
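
Before entering rules in the portal, you might want to check them locally. The following sketch uses Python's standard ipaddress module to validate entries and test whether a client address would match. It is a client-side pre-check under stated assumptions, not a description of how Batch evaluates rules; the function names are hypothetical and the addresses are documentation ranges.

```python
import ipaddress

MAX_RULES_PER_ENDPOINT = 200  # limit stated earlier in this article


def parse_rules(entries):
    """Parse IP addresses or CIDR ranges; raises ValueError on bad input."""
    if len(entries) > MAX_RULES_PER_ENDPOINT:
        raise ValueError("too many IP network rules for one endpoint")
    # strict=True rejects ranges with host bits set, such as 203.0.113.5/24.
    return [ipaddress.ip_network(entry, strict=True) for entry in entries]


def is_allowed(client_ip, rules):
    """Return True if the client address falls inside any rule."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in rules)


rules = parse_rules(["203.0.113.0/24", "198.51.100.7"])
print(is_allowed("203.0.113.42", rules))
print(is_allowed("192.0.2.1", rules))
```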

Note

After adding a rule, it takes a few minutes for the rule to take effect.

 Tip

To configure IP network rules for the node management endpoint, you need to
know the public IP addresses or address ranges used by the Batch pool's outbound
internet access. These can typically be determined for Batch pools created in a
virtual network or with specified public IP addresses.

Disable public network access


Optionally, disable public network access to Batch account endpoints. Disabling the
public network access overrides all IP network rules configurations. For example, you
might want to disable public access to a Batch account secured in a virtual network
using Private Link.

1. In the portal, navigate to your Batch account and select Settings > Networking.
2. On the Public access tab, select Disabled.
3. Select Save.

Restore public network access


To re-enable public network access, update the networking settings to allow public
access. Enabling public access overrides all IP network rule configurations and
allows access from any IP address.

1. In the portal, navigate to your Batch account and select Settings > Networking.
2. On the Public access tab, select All networks.
3. Select Save.

Next steps
Learn how to use private endpoints with Batch accounts.
Learn how to use simplified compute node communication.
Learn more about creating pools in a virtual network.



Use private endpoints with Azure Batch accounts
Article • 08/14/2023

By default, Azure Batch accounts have public endpoints and are publicly accessible. The
Batch service offers the ability to create private endpoints for Batch accounts, allowing
private network access to the Batch service.

By using Azure Private Link, you can connect to an Azure Batch account via a private
endpoint. The private endpoint is a set of private IP addresses in a subnet within your
virtual network. You can then limit access to an Azure Batch account over private IP
addresses.

Private Link allows users to access an Azure Batch account from within the virtual
network or from any peered virtual network. Resources mapped to Private Link are also
accessible on-premises over private peering through VPN or Azure ExpressRoute. You
can connect to an Azure Batch account configured with Private Link by using the
automatic or manual approval method.

This article describes the steps to create a private endpoint to access Batch account
endpoints.

Private endpoint sub-resources supported for Batch account

A Batch account resource has two endpoints that can be accessed through private endpoints:

Account endpoint (sub-resource: batchAccount): this endpoint is used for
accessing the Batch Service REST API (data plane), for example managing pools,
compute nodes, jobs, tasks, etc.

Node management endpoint (sub-resource: nodeManagement): used by Batch
pool nodes to access the Batch node management service. This endpoint is only
applicable when using simplified compute node communication.

 Tip

You can create a private endpoint for one or both of them within your virtual
network, depending on how you use your Batch account. For example, if you
run a Batch pool within the virtual network but call the Batch service REST API from
somewhere else, you only need to create the nodeManagement private
endpoint in the virtual network.

Azure portal
Use the following steps to create a private endpoint with your Batch account using the
Azure portal:

1. Go to your Batch account in the Azure portal.


2. In Settings, select Networking and go to the tab Private Access. Then, select +
Private endpoint.
3. In the Basics pane, enter or select the subscription, resource group, private
endpoint resource name and region details, then select Next: Resource.

4. In the Resource pane, set the Resource type to Microsoft.Batch/batchAccounts.


Select the Batch account you want to access, select the target sub-resource, then
select Next: Configuration.

5. In the Configuration pane, enter or select this information:

For Virtual network, select your virtual network.


For Subnet, select your subnet.
For Private IP configuration, select the default Dynamically allocate IP
address.
For Integrate with private DNS zone, select Yes. To connect privately with
your private endpoint, you need a DNS record. We recommend that you
integrate your private endpoint with a private DNS zone. You can also use
your own DNS servers or create DNS records by using the host files on your
virtual machines.
For Private DNS Zone, select privatelink.batch.azure.com. The private DNS
zone is determined automatically. You can't change this setting by using the
Azure portal.

Important

If you have existing private endpoints created with the previous private DNS zone
privatelink.<region>.batch.azure.com , please follow Migration with existing
Batch account private endpoints.

If you've selected private DNS zone integration, make sure the private DNS
zone is linked to your virtual network successfully. It's possible that the Azure
portal lets you choose an existing private DNS zone, which might not be linked
to your virtual network, in which case you'll need to manually add the virtual
network link.

6. Select Review + create, then wait for Azure to validate your configuration.
7. When you see the Validation passed message, select Create.

 Tip

You can also create the private endpoint from Private Link Center in Azure portal,
or create a new resource by searching private endpoint.

Use the private endpoint


After the private endpoint is provisioned, you can access the Batch account using the
private IP address within the virtual network:

Private endpoint for batchAccount: can access Batch account data plane to
manage pools/jobs/tasks.

Private endpoint for nodeManagement: Batch pool's compute nodes can connect
to and be managed by Batch node management service.

 Tip

It's recommended to also disable the public network access with your Batch
account when you're using private endpoints, which will restrict the access to
private network only.

Important

If public network access is disabled for the Batch account, performing account
operations (for example pools, jobs) outside of the virtual network where the
private endpoint is provisioned will result in an "AuthorizationFailure" message for
the Batch account in the Azure portal.

To view the IP addresses for the private endpoint from the Azure portal:

1. Select All resources.


2. Search for the private endpoint that you created earlier.
3. Select the DNS Configuration tab to see the DNS settings and IP addresses.

Configure DNS zones

Use a private DNS zone within the subnet where you've created the private endpoint.
Configure the endpoints so that each private IP address is mapped to a DNS entry.

When you're creating the private endpoint, you can integrate it with a private DNS zone
in Azure. If you choose to instead use a custom domain, you must configure it to add
DNS records for all private IP addresses reserved for the private endpoint.

Migration with existing Batch account private endpoints

With the introduction of the new private endpoint sub-resource nodeManagement for
the Batch node management endpoint, the default private DNS zone for Batch accounts is
simplified from privatelink.<region>.batch.azure.com to privatelink.batch.azure.com .
To keep backward compatibility with the previously used private DNS zone, for a Batch
account with any approved batchAccount private endpoint, its account endpoint's DNS
CNAME mappings contain both zones (with the previous zone first), for
example:

myaccount.east.batch.azure.com CNAME myaccount.privatelink.east.batch.azure.com
myaccount.privatelink.east.batch.azure.com CNAME myaccount.east.privatelink.batch.azure.com
myaccount.east.privatelink.batch.azure.com CNAME <Batch API public FQDN>
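
The order of the chain matters because a resolver encounters the previous zone's name before the new one. The sketch below follows the example mappings with a plain dictionary. It performs no real DNS lookups, the function name is hypothetical, and the final mapping to the public FQDN is omitted because the source shows it only as a placeholder.

```python
def follow_cnames(name, cname_records, max_hops=10):
    """Return the list of names visited while following CNAME records."""
    chain = [name]
    while name in cname_records:
        if len(chain) > max_hops:
            raise RuntimeError("CNAME chain is too long or cyclic")
        name = cname_records[name]
        chain.append(name)
    return chain


# The first two mappings from the example above.
records = {
    "myaccount.east.batch.azure.com":
        "myaccount.privatelink.east.batch.azure.com",
    "myaccount.privatelink.east.batch.azure.com":
        "myaccount.east.privatelink.batch.azure.com",
}
for hop in follow_cnames("myaccount.east.batch.azure.com", records):
    print(hop)
```

The second name printed belongs to the previous zone, so a virtual network linked to that zone answers from it first.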

Continue to use previous private DNS zone


If you've already used the previous DNS zone privatelink.<region>.batch.azure.com
with your virtual network, you should continue to use it for existing and new
batchAccount private endpoints, and no action is needed.

Important

If you already use the previous private DNS zone, keep using it even for
newly created private endpoints. Do not use the new zone with your DNS
integration solution until you can migrate to the new zone.

Create a new batchAccount private endpoint with DNS integration in Azure portal

If you manually create a new batchAccount private endpoint using the Azure portal with
automatic DNS integration enabled, it uses the new private DNS zone
privatelink.batch.azure.com for the DNS integration: it creates the private DNS zone,
links it to your virtual network, and configures a DNS A record in the zone for your
private endpoint.

However, if your virtual network has already been linked to the previous private DNS
zone privatelink.<region>.batch.azure.com , it will break the DNS resolution for your
batch account in your virtual network, because the DNS A record for your new private
endpoint is added into the new zone but DNS resolution checks the previous zone first
for backward-compatibility support.

You can mitigate this issue with the following options:

If you don't need the previous private DNS zone anymore, unlink it from your
virtual network. No further action is needed.

Otherwise, after the new private endpoint is created:

1. Make sure the automatic private DNS integration has created a DNS A record
in the new private DNS zone privatelink.batch.azure.com . For example,
myaccount.<region> A <IPv4 address> .
2. Go to the previous private DNS zone privatelink.<region>.batch.azure.com .
3. Manually add a DNS CNAME record. For example, myaccount CNAME =>
myaccount.<region>.privatelink.batch.azure.com .

Important

This manual mitigation is only needed when you create a new batchAccount
private endpoint with private DNS integration in the same virtual network that
has already been linked to the previous private DNS zone.

Migrating previous private DNS zone to the new zone


Although you can keep using the previous private DNS zone with your existing
deployment process, it's recommended to migrate it to the new zone for simplicity of
DNS configuration management:

With the new private DNS zone privatelink.batch.azure.com , you won't need to
configure and manage different zones for each region with your Batch accounts.
When you start to use the new nodeManagement private endpoint that also uses
the new private DNS zone, you'll only need to manage one single private DNS
zone for both types of private endpoints.

You can migrate the previous private DNS zone with the following steps:

1. Create and link the new private DNS zone privatelink.batch.azure.com to your
virtual network.
2. Copy all DNS A records from the previous private DNS zone to the new zone:

From zone "privatelink.<region>.batch.azure.com":
myaccount A <ip>
To zone "privatelink.batch.azure.com":
myaccount.<region> A <ip>

3. Unlink the previous private DNS zone from your virtual network.
4. Verify DNS resolution within your virtual network, and the Batch account DNS
name should continue to be resolved to the private endpoint IP address:
nslookup myaccount.<region>.batch.azure.com

5. Start to use the new private DNS zone with your deployment process for new
private endpoints.
6. Delete the previous private DNS zone after the migration is completed.
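
The record copy in step 2 is a mechanical rename: each record name gains the region as a suffix. The following sketch shows that mapping only. The function name, region value, and private IP are hypothetical, and creating the records in the new zone is not shown.

```python
def migrate_a_records(region, old_zone_records):
    """Map A records from privatelink.<region>.batch.azure.com to the
    names used in privatelink.batch.azure.com."""
    return {f"{name}.{region}": ip for name, ip in old_zone_records.items()}


# Hypothetical record set from the previous zone for the "eastus" region.
old_records = {"myaccount": "10.0.0.4"}
print(migrate_a_records("eastus", old_records))
```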

Pricing
For details on costs related to private endpoints, see Azure Private Link pricing .

Current limitations and best practices


When creating a private endpoint with your Batch account, keep in mind the following:

Private endpoint resources can be created in a different subscription from the Batch
account, but the subscription must be registered with the Microsoft.Batch resource
provider.
Resource movement isn't supported for private endpoints with Batch accounts.
If a Batch account resource is moved to a different resource group or subscription,
the private endpoints can still work, but the association to the Batch account
breaks. If you delete the private endpoint resource, its associated private endpoint
connection still exists in your Batch account. You can manually remove the connection
from your Batch account.
To delete the private connection, either delete the private endpoint resource, or
delete the private connection in the Batch account (this action disconnects the
related private endpoint resource).
DNS records in the private DNS zone aren't removed automatically when you
delete a private endpoint connection from the Batch account. You must manually
remove the DNS records before adding a new private endpoint linked to this
private DNS zone. If you don't clean up the DNS records, unexpected access issues
might happen.
When a private endpoint is enabled for the Batch account, the task authentication
token for Batch tasks is not supported. The workaround is to use a Batch pool with
managed identities.

Next steps
Learn how to create Batch pools in virtual networks.
Learn how to create Batch pools without public IP addresses.
Learn how to configure public network access for Batch accounts.
Learn how to manage private endpoint connections for Batch accounts.
Learn about Azure Private Link.


Manage private endpoint connections with Azure Batch accounts
Article • 06/24/2024

You can query and manage all existing private endpoint connections for your Batch
account. Supported management operations include:

Approve a pending connection.
Reject a connection (either in pending or approved state).
Remove a connection, which removes the connection from the Batch account and
marks the associated private endpoint resource as Disconnected.

Azure portal
1. Go to your Batch account in Azure portal.

2. In Settings, select Networking and go to tab Private Access.

3. Select the private connection, then perform the Approve/Reject/Remove operation.

Az PowerShell module
Examples using Az PowerShell module Az.Network:

PowerShell

$accountResourceId = "/subscriptions/<subscription>/resourceGroups/<rg>/providers/Microsoft.Batch/batchAccounts/<account>"
$pecResourceId = "$accountResourceId/privateEndpointConnections/<pe-connection-name>"

# List all private endpoint connections for Batch account
Get-AzPrivateEndpointConnection -PrivateLinkResourceId $accountResourceId

# Show the specified private endpoint connection
Get-AzPrivateEndpointConnection -ResourceId $pecResourceId

# Approve connection
Approve-AzPrivateEndpointConnection -Description "Approved!" -ResourceId $pecResourceId

# Reject connection
Deny-AzPrivateEndpointConnection -Description "Rejected!" -ResourceId $pecResourceId

# Remove connection
Remove-AzPrivateEndpointConnection -ResourceId $pecResourceId

Azure CLI
Examples using Azure CLI (az network private-endpoint-connection):

sh

accountResourceId="/subscriptions/<subscription>/resourceGroups/<rg>/providers/Microsoft.Batch/batchAccounts/<account>"
pecResourceId="$accountResourceId/privateEndpointConnections/<pe-connection-name>"

# List all private endpoint connections for Batch account
az network private-endpoint-connection list --id "$accountResourceId"

# Show the specified private endpoint connection
az network private-endpoint-connection show --id "$pecResourceId"

# Approve connection
az network private-endpoint-connection approve --description "Approved!" --id "$pecResourceId"

# Reject connection
az network private-endpoint-connection reject --description "Rejected!" --id "$pecResourceId"

# Remove connection
az network private-endpoint-connection delete --id "$pecResourceId"
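
If you script these operations from another language, the two resource IDs can be assembled with plain string formatting. The sketch below follows the ID layout used in the examples above; the function names and all values (subscription, resource group, account, and connection name) are hypothetical placeholders.

```python
def batch_account_resource_id(subscription, resource_group, account):
    """Build the resource ID of a Batch account."""
    return (
        f"/subscriptions/{subscription}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Batch/batchAccounts/{account}"
    )


def private_endpoint_connection_id(account_resource_id, connection_name):
    """Build the resource ID of a private endpoint connection on the account."""
    return f"{account_resource_id}/privateEndpointConnections/{connection_name}"


account_id = batch_account_resource_id(
    "00000000-0000-0000-0000-000000000000", "myResourceGroup", "mybatchaccount"
)
print(account_id)
print(private_endpoint_connection_id(account_id, "my-pe-connection"))
```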

Configure customer-managed keys for your Azure Batch account with Azure Key Vault and Managed Identity
07/01/2025

By default, Azure Batch uses platform-managed keys to encrypt all the customer data stored in
the Azure Batch service, such as certificates and job/task metadata. Optionally, you can use
your own keys, that is, customer-managed keys, to encrypt data stored in Azure Batch.

The keys you provide must be generated in Azure Key Vault, and they must be accessed with
managed identities for Azure resources.

There are two types of managed identities: system-assigned and user-assigned.

You can either create your Batch account with system-assigned managed identity, or create a
separate user-assigned managed identity that has access to the customer-managed keys.
Review the comparison table to understand the differences and consider which option works
best for your solution. For example, if you want to use the same managed identity to access
multiple Azure resources, a user-assigned managed identity is needed. If not, a system-
assigned managed identity associated with your Batch account may be sufficient. Using a user-
assigned managed identity also gives you the option to enforce customer-managed keys at
Batch account creation, as shown next.

Create a Batch account with system-assigned managed identity

If you don't need a separate user-assigned managed identity, you can enable system-assigned
managed identity when you create your Batch account.

Important

A system-assigned managed identity created for a Batch account for customer data
encryption as described in this document cannot be used as a user-assigned managed
identity on a Batch pool. If you wish to use the same managed identity on both the Batch
account and Batch pool, then use a common user-assigned managed identity instead.

Azure portal
In the Azure portal, when you create a Batch account, pick System assigned in the identity
type under the Advanced tab.

After the account is created, you can find a unique GUID in the Identity principal Id field under
the Properties section. The Identity Type will show System assigned .

You need this value in order to grant this Batch account access to the Key Vault.

Azure CLI
When you create a new Batch account, specify SystemAssigned for the --identity parameter.

Azure CLI

resourceGroupName='myResourceGroup'
accountName='mybatchaccount'

az batch account create \
    --name $accountName \
    --resource-group $resourceGroupName \
    --locations regionName='West US 2' \
    --identity 'SystemAssigned'

After the account is created, you can verify that system-assigned managed identity has been
enabled on this account. Be sure to note the PrincipalId, as this value is needed to grant this
Batch account access to the Key Vault.

Azure CLI

az batch account show \
    --name $accountName \
    --resource-group $resourceGroupName \
    --query identity

Note

The system-assigned managed identity created in a Batch account is only used for
retrieving customer-managed keys from the Key Vault. This identity is not available on
Batch pools. To use a user-assigned managed identity in a pool, see Configure managed
identities in Batch pools.
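
If you script this step, you need to pull the principal ID out of the JSON that `az batch account show --query identity` prints. The following Python sketch shows one way to do that. The function name is hypothetical, and the JSON shape (a `type` and a `principalId` field) is an assumption based on the identity object described above; the GUID in the sample is a placeholder.

```python
import json


def principal_id_from_identity(identity_json):
    """Extract principalId from the JSON printed by '--query identity'."""
    identity = json.loads(identity_json)
    # The query yields null when no identity is configured on the account.
    if identity is None or identity.get("type") != "SystemAssigned":
        raise ValueError("system-assigned managed identity is not enabled")
    return identity["principalId"]


if __name__ == "__main__":
    # Placeholder output; real values come from the CLI.
    sample = json.dumps({
        "type": "SystemAssigned",
        "principalId": "00000000-0000-0000-0000-000000000000",
    })
    print(principal_id_from_identity(sample))
```

The returned value is what you enter when granting the Batch account access to the Key Vault.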

Create a user-assigned managed identity


If you prefer, you can create a user-assigned managed identity that can be used to access your
customer-managed keys.

You need the Client ID value of this identity in order for it to access the Key Vault.

Configure your Azure Key Vault instance


The Azure Key Vault in which your keys are generated must be created in the same tenant as
your Batch account. It doesn't need to be in the same resource group or even in the same
subscription.

Create an Azure Key Vault


When creating an Azure Key Vault instance with customer-managed keys for Azure Batch,
make sure that Soft Delete and Purge Protection are both enabled.
Add an access policy to your Azure Key Vault instance
In the Azure portal, after the Key Vault is created, go to Access Policy under Settings and add
access for the Batch account by using its managed identity. Under Key Permissions, select Get,
Wrap Key, and Unwrap Key.

In the Select field under Principal, fill in one of the following:

For system-assigned managed identity: Enter the principalId that you previously
retrieved or the name of the Batch account.
For user-assigned managed identity: Enter the Client ID that you previously retrieved or
the name of the user-assigned managed identity.
Generate a key in Azure Key Vault
In the Azure portal, go to the Key Vault instance. In the Keys section, select Generate/Import.
Set the Key Type to RSA and the RSA Key Size to at least 2048 bits. EC key types are
currently not supported as a customer-managed key on a Batch account.

After the key is created, select the newly created key and its current version, then copy the Key
Identifier under the Properties section. Be sure that under Permitted Operations, Wrap Key and
Unwrap Key are both checked.
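
As a quick pre-flight check, the key requirements above can be expressed as a small validation function. This is a sketch only: the function name is hypothetical, the list of sizes combines the 2048-bit minimum with the 3072 and 4096 options mentioned in the FAQ later in this article, and the operation names `wrapKey` and `unwrapKey` are assumed to match how Key Vault reports permitted operations.

```python
SUPPORTED_RSA_SIZES = {2048, 3072, 4096}
REQUIRED_OPERATIONS = {"wrapKey", "unwrapKey"}


def check_key_for_batch(key_type, key_size, permitted_operations):
    """Return a list of problems; an empty list means the key looks usable."""
    problems = []
    if key_type != "RSA":
        problems.append(f"key type {key_type} is not supported; use RSA")
    elif key_size not in SUPPORTED_RSA_SIZES:
        problems.append(
            f"RSA key size {key_size} is not one of {sorted(SUPPORTED_RSA_SIZES)}"
        )
    missing = REQUIRED_OPERATIONS - set(permitted_operations)
    if missing:
        problems.append(f"missing permitted operations: {sorted(missing)}")
    return problems


if __name__ == "__main__":
    print(check_key_for_batch("RSA", 2048, ["wrapKey", "unwrapKey"]))
    print(check_key_for_batch("EC", 256, ["sign", "verify"]))
```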
Enable customer-managed keys on a Batch account
Now that the prerequisites are in place, you can enable customer-managed keys on your Batch
account.

Azure portal
In the Azure portal, go to the Batch account page. Under the Encryption section, enable
Customer-managed key. You can enter the Key Identifier directly, or you can choose Select a key
vault and key and pick the vault and key from there.

Azure CLI
After the Batch account is created with system-assigned managed identity and the access to
Key Vault is granted, update the Batch account with the {Key Identifier} URL by using the
--encryption-key-identifier parameter. Also set --encryption-key-source to Microsoft.KeyVault.

Azure CLI

az batch account set \
    --name $accountName \
    --resource-group $resourceGroupName \
    --encryption-key-source Microsoft.KeyVault \
    --encryption-key-identifier {YourKeyIdentifier}

Create a Batch account with user-assigned managed identity and customer-managed keys
As an example using the Batch management .NET client, you can create a Batch account that
has a user-assigned managed identity and customer-managed keys.

c#

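using System;
using Azure;
using Azure.Core;
using Azure.Identity;
using Azure.ResourceManager;
using Azure.ResourceManager.Batch;
using Azure.ResourceManager.Batch.Models;
using Azure.ResourceManager.Models;
using Azure.ResourceManager.Resources;

string subscriptionId = "Your SubscriptionID";
string resourceGroupName = "Your ResourceGroup name";

// Authenticate with whichever credential DefaultAzureCredential resolves in this environment.
var credential = new DefaultAzureCredential();
ArmClient _armClient = new ArmClient(credential);

// Look up the resource group that will contain the Batch account.
ResourceIdentifier resourceGroupResourceId =
    ResourceGroupResource.CreateResourceIdentifier(subscriptionId, resourceGroupName);
ResourceGroupResource resourceGroupResource =
    _armClient.GetResourceGroupResource(resourceGroupResourceId);

var data = new BatchAccountCreateOrUpdateContent(AzureLocation.EastUS)
{
    // Encrypt account data with a customer-managed key from Key Vault.
    Encryption = new BatchAccountEncryptionConfiguration()
    {
        KeySource = BatchAccountKeySource.MicrosoftKeyVault,
        KeyIdentifier = new Uri("Your Key Azure Resource Manager Resource ID"),
    },

    // Attach the user-assigned identity that has access to the key.
    Identity = new ManagedServiceIdentity(ManagedServiceIdentityType.UserAssigned)
    {
        UserAssignedIdentities = {
            [new ResourceIdentifier("Your Identity Azure Resource Manager ResourceId")] = new UserAssignedIdentity(),
        },
    }
};

// Create the account and wait for the operation to finish.
var lro = resourceGroupResource.GetBatchAccounts().CreateOrUpdate(WaitUntil.Completed, "Your BatchAccount name", data);
BatchAccountResource batchAccount = lro.Value;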

Update the customer-managed key version


When you create a new version of a key, update the Batch account to use the new version.
Follow these steps:

1. Navigate to your Batch account in Azure portal and display the Encryption settings.
2. Enter the URI for the new key version. Alternately, you can select the Key Vault and the
key again to update the version.
3. Save your changes.

You can also use Azure CLI to update the version.

Azure CLI

az batch account set \
    --name $accountName \
    --resource-group $resourceGroupName \
    --encryption-key-identifier {YourKeyIdentifierWithNewVersion}

Tip

You can have your keys automatically rotate by creating a key rotation policy within Key
Vault. When specifying a Key Identifier for the Batch account, use the versionless key
identifier to enable autorotation with a valid rotation policy. For more information, see
how to configure key rotation in Key Vault.
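
To make the difference between a versioned and a versionless key identifier concrete, the following Python sketch splits a Key Vault key URL into its parts and strips the version segment. It assumes the usual `https://<vault-host>/keys/<key-name>/<version>` layout; the function names and the sample vault, key, and version are invented for illustration.

```python
from urllib.parse import urlparse


def split_key_identifier(key_identifier):
    """Return (vault_host, key_name, version); version is None when versionless."""
    parsed = urlparse(key_identifier)
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.scheme != "https" or len(parts) not in (2, 3) or parts[0] != "keys":
        raise ValueError(f"not a Key Vault key identifier: {key_identifier!r}")
    version = parts[2] if len(parts) == 3 else None
    return parsed.netloc, parts[1], version


def to_versionless(key_identifier):
    """Drop the version segment so the identifier follows key rotation."""
    vault_host, key_name, _ = split_key_identifier(key_identifier)
    return f"https://{vault_host}/keys/{key_name}"


if __name__ == "__main__":
    # Hypothetical identifier, for illustration only.
    sample = "https://myvault.vault.azure.net/keys/batch-cmk/0123456789abcdef"
    print(split_key_identifier(sample))
    print(to_versionless(sample))
```

A versioned identifier pins the account to one key version, which is why it has to be updated by hand after each rotation.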

Use a different key for Batch encryption


To change the key used for Batch encryption, follow these steps:

1. Navigate to your Batch account and display the Encryption settings.
2. Enter the URI for the new key. Alternately, you can select the Key Vault and choose a new
key.
3. Save your changes.

You can also use Azure CLI to use a different key.

Azure CLI

az batch account set \
    --name $accountName \
    --resource-group $resourceGroupName \
    --encryption-key-identifier {YourNewKeyIdentifier}

Frequently asked questions


Are customer-managed keys supported for existing Batch accounts? No. Customer-
managed keys are only supported for new Batch accounts.
Can I select RSA key sizes larger than 2048 bits? Yes, RSA key sizes of 3072 and 4096
bits are also supported.
What operations are available after a customer-managed key is revoked? The only
operation allowed is account deletion if Batch loses access to the customer-managed key.
How should I restore access to my Batch account if I accidentally delete the Key Vault
key? Since purge protection and soft delete are enabled, you could restore the existing
keys. For more information, see Recover an Azure Key Vault.
Can I disable customer-managed keys? You can set the encryption type of the Batch
Account back to "Microsoft managed key" at any time. You're free to delete or change the
key afterwards.
How can I rotate my keys? Customer-managed keys aren't automatically rotated unless
the key is versionless with an appropriate key rotation policy set within Key Vault. To
manually rotate the key, update the Key Identifier that the account is associated with.
After I restore access how long will it take for the Batch account to work again? It can
take up to 10 minutes for the account to be accessible again once access is restored.
While the Batch Account is unavailable what happens to my resources? Any pools that
are active when Batch access to the customer-managed key is lost will continue to run.
However, the nodes in these pools will transition into an unavailable state, and tasks will
stop running (and be requeued). Once access is restored, nodes become available again,
and tasks are restarted.
Does this encryption mechanism apply to VM disks in a Batch pool? No. For Cloud
Services Configuration pools (which are deprecated), no encryption is applied for the
OS and temporary disk. For Virtual Machine Configuration pools, the OS and any
specified data disks are encrypted with a Microsoft platform managed key by default.
Currently, you can't specify your own key for these disks. To encrypt the temporary disk of
VMs for a Batch pool with a Microsoft platform managed key, you must enable the
diskEncryptionConfiguration property in your Virtual Machine Configuration Pool. For
highly sensitive environments, we recommend enabling temporary disk encryption and
avoiding storing sensitive data on OS and data disks. For more information, see Create a
pool with disk encryption enabled.
Is the system-assigned managed identity on the Batch account available on the
compute nodes? No. The system-assigned managed identity is currently used only for
accessing the Azure Key Vault for the customer-managed key. To use a user-assigned
managed identity on compute nodes, see Configure managed identities in Batch pools.

Next steps
Learn more about security best practices in Azure Batch.
Learn more about Azure Key Vault.
Move an Azure Batch account to another region
Article • 04/25/2025

There are scenarios where you might want to move an existing Azure Batch account from one
region to another. For example, you might want to move for disaster recovery planning. This
article explains how to move a Batch account between regions using the Azure portal.

Moving Batch accounts directly from one region to another isn't possible. You can use an Azure
Resource Manager template (ARM template) to export the existing configuration of your Batch
account instead. Then, stage the resource in another region. First, export the Batch account to
a template. Next, modify the parameters to match the destination region. Deploy the modified
template to the new region. Last, recreate jobs and other features in the account.

For more information on Resource Manager and templates, see Quickstart: Create and deploy
Azure Resource Manager templates by using the Azure portal.

Prerequisites
Make sure that the services and features that your Batch account uses are supported in
the new target region.
It's recommended to move any Azure resources associated with your Batch account to
the new target region. For example, follow the steps in Move an Azure Storage account to
another region to move an associated autostorage account. If you prefer, you can leave
resources in the original region, however, performance is typically better when your Batch
account is in the same region as your other Azure resources used by your workload. This
article assumes you've already migrated your storage account or any other regional Azure
resources to be aligned with your Batch account.

Prepare the template


To get started, you need to export and then modify an ARM template.

Export a template
Export an ARM template that contains settings and information for your Batch account.

1. Sign in to the Azure portal.

2. Select All resources and then select your Batch account.


3. Select Automation > Export template.

4. Choose Download in the Export template pane.

5. Locate the .zip file that you downloaded from the portal. Unzip that file into a folder of
your choice.

This zip file contains the .json files that make up the template. The file also includes
scripts to deploy the template.

Modify the template


Load and modify the template so you can create a new Batch account in the target region.

1. In the Azure portal, select Create a resource.

2. In Search the Marketplace, type template deployment, and then press ENTER.

3. Select Template deployment (deploy using custom templates).

4. Select Create.

5. Select Build your own template in the editor.

6. Select Load file, and then select the template.json file that you downloaded in the last
section.

7. In the uploaded template.json file, name the target Batch account by entering a new
defaultValue for the Batch account name. This example sets the defaultValue of the Batch
account name to mytargetaccount.

JSON

{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "batchAccounts_mysourceaccount_name": {
            "defaultValue": "mytargetaccount",
            "type": "String"
        }
    },

8. Next, update the defaultValue of the storage account with your migrated storage
account's resource ID. To get this value, navigate to the storage account in the Azure
portal, select JSON View near the top of the screen, and then copy the value shown
under Resource ID. This example uses the resource ID for a storage account named
mytargetstorageaccount in the resource group mytargetresourcegroup .

JSON

        "storageAccounts_mysourcestorageaccount_externalid": {
            "defaultValue": "/subscriptions/{subscriptionID}/resourceGroups/mytargetresourcegroup/providers/Microsoft.Storage/storageAccounts/mytargetstorageaccount",
            "type": "String"
        }
    },

9. Finally, edit the location property to use your target region. This example sets the target
region to centralus .

JSON

{
    "resources": [
        {
            "type": "Microsoft.Batch/batchAccounts",
            "apiVersion": "2021-01-01",
            "name": "[parameters('batchAccounts_mysourceaccount_name')]",
            "location": "centralus",

To obtain region location codes, see Azure Locations. The code for a region is the region
name with no spaces. For example, Central US = centralus.
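
If you prefer to script these three edits before loading the file into the portal editor, a Python sketch along the following lines could apply them. It assumes the exported parameter names follow the `batchAccounts_<name>_name` and `storageAccounts_<name>_externalid` pattern shown in the examples above; the function names and the sample template are hypothetical.

```python
import copy
import json


def region_code(display_name):
    """Convert a region display name to its code: 'Central US' -> 'centralus'."""
    return display_name.replace(" ", "").lower()


def retarget_template(template, account_name, storage_id, region):
    """Return a copy of an exported template pointed at the target region."""
    result = copy.deepcopy(template)
    for name, param in result.get("parameters", {}).items():
        if name.startswith("batchAccounts_") and name.endswith("_name"):
            param["defaultValue"] = account_name
        elif name.startswith("storageAccounts_") and name.endswith("_externalid"):
            param["defaultValue"] = storage_id
    for resource in result.get("resources", []):
        if resource.get("type") == "Microsoft.Batch/batchAccounts":
            resource["location"] = region_code(region)
    return result


if __name__ == "__main__":
    # Minimal stand-in for an exported template.json.
    exported = {
        "parameters": {
            "batchAccounts_mysourceaccount_name": {
                "defaultValue": "mysourceaccount", "type": "String"},
            "storageAccounts_mysourcestorageaccount_externalid": {
                "defaultValue": "<source storage resource ID>", "type": "String"},
        },
        "resources": [
            {"type": "Microsoft.Batch/batchAccounts", "location": "westus2"},
        ],
    }
    updated = retarget_template(
        exported, "mytargetaccount", "<target storage resource ID>", "Central US")
    print(json.dumps(updated, indent=4))
```

The function works on a copy, so the exported template stays available for reference.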

Move the account


Deploy the template to create a new Batch account in the target region.

1. Now that you've made your modifications, select Save below the template.json file.

2. Enter or select the property values:

Subscription: Select an Azure subscription.

Resource group: Select the resource group that you created when moving the
associated storage account.

Region: Select the Azure region where you want to move the account.

3. Select Review and create, then select Create.


Configure the new Batch account
Some features don't export to a template, so you have to recreate them in the new Batch
account. These features include:

Jobs (and tasks)
Job schedules
Certificates
Application packages

Be sure to configure features in the new account as needed. You can look at how you've
configured these features in your source Batch account for reference.

Important

New Batch accounts are entirely separate from any prior existing Batch accounts, even
within the same region. These newly created Batch accounts will have default service and
core quotas associated with them. For User Subscription pool allocation mode Batch
accounts, core quotas from the subscription will apply. You will need to ensure that these
new Batch accounts have sufficient quota before migrating your workload.

Discard or clean up
Confirm that your new Batch account is successfully working in the new region. Also make sure
to restore the necessary features. Then, you can delete the source Batch account.

1. In the Azure portal, expand the menu on the left side to open the menu of services, and
choose Batch accounts.

2. Locate the Batch account to delete, and right-click the More button (...) on the right side
of the listing. Be sure that you're selecting the original source Batch account, not the new
one you created.

3. Select Delete, then confirm.

Next steps
Learn more about moving resources to a new resource group or subscription.
Batch account shared key credential rotation
07/01/2025

Batch accounts can be authenticated in one of two ways, either via shared key or Microsoft
Entra ID. Batch accounts with shared key authentication enabled have two keys associated with
them to allow for key rotation scenarios.

Tip

It's highly recommended to avoid using shared key authentication with Batch accounts.
The preferred authentication mechanism is through Microsoft Entra ID. You can disable
shared key authentication during account creation or you can update allowed
Authentication Modes for an active account.

Batch shared key rotation procedure


Azure Batch accounts have two shared keys, primary and secondary. It's important not to
regenerate both keys at the same time, and instead regenerate them one at a time to avoid
potential downtime.

Warning

Once a key has been regenerated, the prior key is no longer valid and cannot be
recovered for use. Ensure that your application update process follows the recommended
key rotation procedure to prevent losing access to your Batch account.

The typical key rotation procedure is as follows:

1. Normalize your application code to use either the primary or secondary key. If you're
using both keys in your application simultaneously, then any rotation procedure leads to
authentication errors. The following steps assume that you're using the primary key in
your application.
2. Regenerate the secondary key.
3. Update your application code to utilize the newly regenerated secondary key. Deploy
these changes and ensure that everything is working as expected.
4. Regenerate the primary key.
5. Optionally update your application code to use the primary key and deploy. This step
isn't strictly necessary as long as you're tracking which key is used in your application and
deployed.
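
The reason this ordering avoids downtime can be seen in a toy model. The following Python sketch is not the Batch API: `SharedKeys` is a hypothetical in-memory stand-in for an account's two keys, and `rotate` walks through the steps above while checking that the application's key stays valid at every point.

```python
import secrets


class SharedKeys:
    """Toy stand-in for a Batch account's primary and secondary shared keys."""

    def __init__(self):
        self.keys = {
            "primary": secrets.token_urlsafe(32),
            "secondary": secrets.token_urlsafe(32),
        }

    def regenerate(self, which):
        """Replace one key; the previous value stops being valid."""
        self.keys[which] = secrets.token_urlsafe(32)
        return self.keys[which]

    def is_valid(self, key):
        return key in self.keys.values()


def rotate(account, app_key):
    """Follow the rotation steps; return the key the app ends up using."""
    assert account.is_valid(app_key)                 # step 1: app uses the primary key
    new_secondary = account.regenerate("secondary")  # step 2
    assert account.is_valid(app_key)                 # app is unaffected so far
    app_key = new_secondary                          # step 3: deploy with secondary
    account.regenerate("primary")                    # step 4
    assert account.is_valid(app_key)                 # secondary still works
    return app_key


if __name__ == "__main__":
    account = SharedKeys()
    old_primary = account.keys["primary"]
    current = rotate(account, old_primary)
    print("app key still valid:", account.is_valid(current))
    print("old primary still valid:", account.is_valid(old_primary))
```

Regenerating both keys in one step would leave the application holding a key that no longer matches either slot.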

Rotation in Azure portal


First, sign in to the Azure portal. Then, navigate to the Keys blade of your Batch account
under Settings, and select either Regenerate primary or Regenerate secondary to create a
new key.

See also
Learn more about Batch accounts.
Learn how to authenticate with Batch Service APIs or Batch Management APIs with
Microsoft Entra ID.
Associate Azure Batch accounts with network security perimeter
Article • 03/19/2025

The network security perimeter (NSP) provided by Azure networking serves as a
comprehensive tool for customers to ensure optimal security when utilizing PaaS
resources. It allows customers to establish logical boundaries for network isolation and
collectively manage public access controls for numerous PaaS resources.

With a network security perimeter:

PaaS resources associated with a specific perimeter are, by default, only able to
communicate with other PaaS resources within the same perimeter.
Explicit access rules can actively permit external inbound and outbound
communication.
Diagnostic Logs are enabled for PaaS resources within perimeter for Audit and
Compliance.

Important

Network security perimeter rules do not govern the private link with the private
endpoint.

Network security perimeter scenarios in Batch service
Azure Batch service is designed to support various scenarios that necessitate access to
other PaaS resources:

Application packages require communication with Azure Storage. For more
information, see batch-application-packages.
Customer-managed keys require communication with Azure Key Vault. For more
information, see batch-customer-managed-key.

Network administrators can use the network security perimeter feature to create an
isolation boundary for their PaaS services. This security perimeter permits the setting up
of public access controls for various PaaS resources, providing a consistent user
experience and a uniform API. To set up a network security perimeter for the PaaS
communications that Batch supports, see Network security perimeter in Azure
Storage and Network security perimeter in Azure Key Vault.

Network security perimeter provides several methods to enable Batch to interact with
other PaaS services if the target PaaS service is in network security perimeter:

Associate the Batch account with the same perimeter as the target resource and
assign the necessary permissions to the Managed Identity used across these
resources.
Create the profile with appropriate inbound access rules (for example, creating an
inbound access rule for the Batch account's fully qualified domain name) and apply
it to the target PaaS resource. This profile is used to evaluate inbound traffic that
Batch sends from outside the perimeter.

Batch users can also use the network security perimeter to secure inbound traffic, not
just the outbound traffic scenarios with Azure Storage and Azure Key Vault.

Note

Network security perimeters do not regulate nodes within Batch pools. To ensure
network isolation for the pool, you may still need to create a nodeManagement
private endpoint for the Batch pool without public IP addresses. To enable a node
to access Azure Storage and other PaaS resources associated with a network
security perimeter, ensure that relevant access rules are added to the target PaaS
resource's profile. These access rules grant the node the permissions it needs to
reach the resource.

Configure network security perimeter for Azure Batch account

Prerequisite
1. Set up your Batch account by using a user-assigned managed identity.

2. It's optional but recommended to change the public network access of your Batch
account to SecuredByPerimeter .

This public network access value guarantees that the resource's inbound and
outbound connectivity is restricted to resources within the same perimeter. The
associated perimeter profile sets the rules that control public access.
This Batch account modification can be made using the Batch management
Account API or SDK BatchPublicNetworkAccess Enum value.

3. Make sure your Batch account operates only with the simplified node
communication pool.

Create a network security perimeter


Create your own network security perimeter resource by using the Azure portal, PowerShell,
or Azure CLI.

Associate Batch account with the network security perimeter

Using Azure portal

1. Navigate to your network security perimeter resource in the Azure portal, where
you establish a profile for your Batch account to associate with. If you haven't
created a profile yet, go to Settings > Profiles and create one first.

2. In Overview, select the third option, Associate resources to your profile.

3. Associate resources with a new profile or with an existing profile.

Using PowerShell
1. Create a new profile for your network security perimeter

Azure PowerShell

# Create a new profile


$nspProfile = @{
Name = '<ProfileName>'
ResourceGroupName = '<ResourceGroupName>'
SecurityPerimeterName = '<NetworkSecurityPerimeterName>'
}

$profile = New-AzNetworkSecurityPerimeterProfile @nspProfile


2. Associate the Batch account with the network security perimeter profile

Azure PowerShell

# Associate the PaaS resource with the above created profile


$nspAssociation = @{
AssociationName = '<AssociationName>'
ResourceGroupName = '<ResourceGroupName>'
SecurityPerimeterName = '<NetworkSecurityPerimeterName>'
AccessMode = 'Learning'
ProfileId = '<NetworkSecurityPerimeterProfileId>'
PrivateLinkResourceId = '<BatchAccountResourceId>'
}

New-AzNetworkSecurityPerimeterAssociation @nspAssociation | Format-List

Using Azure CLI

1. Create a new profile for your network security perimeter with the following
command:

Azure CLI

# Create a new profile


az network perimeter profile create \
--name <ProfileName> \
--resource-group <ResourceGroupName> \
--perimeter-name <NetworkSecurityPerimeterName>

2. Associate the Batch account (PaaS resource) with the network security perimeter
profile with the following commands.

Azure CLI

# Get the profile id


az network perimeter profile show \
--name <ProfileName> \
--resource-group <ResourceGroupName> \
--perimeter-name <NetworkSecurityPerimeterName>

# Associate the Batch account with the network security perimeter profile
# Replace <PaaSArmId> and <NetworkSecurityPerimeterProfileId> with the
# values for your Batch account resource ID and profile ID
az network perimeter association create \
--name <NetworkSecurityPerimeterAssociationName> \
--perimeter-name <NetworkSecurityPerimeterName> \
--resource-group <ResourceGroupName> \
--access-mode Learning \
--private-link-resource "{id:<PaaSArmId>}" \
--profile "{id:<NetworkSecurityPerimeterProfileId>}"
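
The two placeholder IDs in the command above are Azure Resource Manager resource IDs. As an illustration of their shape, the following Python sketch assembles them from their parts. The helper names are hypothetical, and the profile path assumes the `Microsoft.Network/networkSecurityPerimeters` resource type; the authoritative values are the `id` fields returned by the corresponding `show` commands.

```python
def batch_account_id(subscription_id, resource_group, account_name):
    """Resource ID of a Batch account (the <PaaSArmId> placeholder)."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Batch/batchAccounts/{account_name}"
    )


def nsp_profile_id(subscription_id, resource_group, perimeter_name, profile_name):
    """Assumed resource ID layout of a network security perimeter profile."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/networkSecurityPerimeters/{perimeter_name}"
        f"/profiles/{profile_name}"
    )


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    sub = "00000000-0000-0000-0000-000000000000"
    print(batch_account_id(sub, "myResourceGroup", "mybatchaccount"))
    print(nsp_profile_id(sub, "myResourceGroup", "myPerimeter", "myProfile"))
```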

Next steps
Learn more about Security Best Practices in Azure Batch.
Learn more about Network Security Perimeter Concepts.
Learn more about Network Security Perimeter Diagnostic Logs.
Learn more about Network Security Perimeter Role Based Access Control.
Learn more about Network Security Perimeter Transition.



Authenticate Azure Batch services with Microsoft Entra ID
Article • 04/02/2025

Azure Batch supports authentication with Microsoft Entra ID, Microsoft's multitenant,
cloud-based directory and identity management service. Azure uses Microsoft Entra ID
to authenticate its own customers, service administrators, and organizational users.

This article describes two ways to use Microsoft Entra authentication with Azure Batch:

Integrated authentication authenticates a user who's interacting with an
application. The application gathers a user's credentials and uses those credentials
to authenticate access to Batch resources.

A service principal authenticates an unattended application. The service principal
defines the policy and permissions for the application and represents the
application to access Batch resources at runtime.

For more information about Microsoft Entra ID, see the Microsoft Entra documentation.

Gather endpoints for authentication


To authenticate Batch applications with Microsoft Entra ID, you need to include the
Microsoft Entra endpoint and Batch resource endpoint in your code.

Microsoft Entra endpoint


The base Microsoft Entra authority endpoint is https://login.microsoftonline.com/ . To
authenticate with Microsoft Entra ID, use this endpoint with the tenant ID that identifies
the Microsoft Entra tenant to use for authentication:

https://login.microsoftonline.com/<tenant-id>

You can get your tenant ID from the main Microsoft Entra ID page in the Azure portal.
You can also select Properties in the left navigation and see the Tenant ID on the
Properties page.
Important

The tenant-specific Microsoft Entra endpoint is required when you
authenticate by using a service principal.

When you authenticate by using integrated authentication, the tenant-specific
endpoint is recommended, but optional. You can also use the Microsoft Entra
common endpoint to provide a generic credential gathering interface when a
specific tenant isn't provided. The common endpoint is
https://login.microsoftonline.com/common.

For more information about Microsoft Entra endpoints, see Authentication vs.
authorization.
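
The endpoint rules above can be summarized in a few lines of Python. This is a sketch with a hypothetical helper name: it returns the tenant-specific authority when a tenant ID is supplied, falls back to the common endpoint for integrated authentication, and rejects a missing tenant ID for a service principal.

```python
ENTRA_AUTHORITY_BASE = "https://login.microsoftonline.com/"


def authority_url(tenant_id=None, *, service_principal=False):
    """Pick the Microsoft Entra authority endpoint for an authentication flow."""
    if tenant_id:
        return ENTRA_AUTHORITY_BASE + tenant_id
    if service_principal:
        # Service principals must authenticate against a specific tenant.
        raise ValueError("a tenant ID is required for service principal authentication")
    return ENTRA_AUTHORITY_BASE + "common"


if __name__ == "__main__":
    print(authority_url("<tenant-id>"))
    print(authority_url())
```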

Batch resource endpoint


Use the Batch resource endpoint https://batch.core.windows.net/ to acquire a token
for authenticating requests to the Batch service.

Register your application with a tenant


The first step in using Microsoft Entra authentication is to register your application in a
Microsoft Entra tenant. Once you register your application, you can call the Microsoft
Authentication Library (MSAL) from your code. The MSAL provides an API for
authenticating with Microsoft Entra ID from your application. Registering your
application is required whether you use integrated authentication or a service principal.

When you register your application, you supply information about your application to
Microsoft Entra ID. Microsoft Entra ID then provides an application ID, also called a client
ID, that you use to associate your application with Microsoft Entra ID at runtime. For
more information about the application ID, see Application and service principal objects
in Microsoft Entra ID.

To register your Batch application, follow the steps at Register an application.

After you register your application, you can see the Application (client) ID on the
application's Overview page.

Configure integrated authentication


To authenticate with integrated authentication, you need to grant your application
permission to connect to the Batch service API. This step enables your application to use
Microsoft Entra ID to authenticate calls to the Batch service API.

After you register your application, follow these steps to grant the application access to
the Batch service:

1. In the Azure portal, search for and select app registrations.


2. On the App registrations page, select your application.
3. On your application's page, select API permissions from the left navigation.
4. On the API permissions page, select Add a permission.
5. On the Request API permissions page, select Azure Batch.
6. On the Azure Batch page, under Select permissions, select the checkbox next to
user_impersonation, and then select Add permissions.

The API permissions page now shows that your Microsoft Entra application has access
to both Microsoft Graph and Azure Batch. Permissions are granted to Microsoft Graph
automatically when you register an app with Microsoft Entra ID.

Configure a service principal


To authenticate an application that runs unattended, you use a service principal. When
your application authenticates by using a service principal, it sends both the application
ID and a secret key to Microsoft Entra ID.
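
For orientation, the exchange that an authentication library performs on your behalf resembles an OAuth 2.0 client-credentials request. The following Python sketch only builds the request and sends nothing. The function name is hypothetical, and the default scope, formed by appending `.default` to the Batch resource endpoint, is an assumption; in practice, let MSAL construct and send this request rather than doing it by hand.

```python
from urllib.parse import urlencode


def client_credentials_request(tenant_id, client_id, client_secret,
                               scope="https://batch.core.windows.net/.default"):
    """Return (url, form_body) for an OAuth 2.0 client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,          # the application (client) ID
        "client_secret": client_secret,  # the secret created for the app
        "scope": scope,
    })
    return url, body


if __name__ == "__main__":
    # Placeholder values; never hard-code a real secret.
    url, body = client_credentials_request("<tenant-id>", "<application-id>", "<secret>")
    print(url)
    print(body)
```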

After you register your application, follow these steps in the Azure portal to configure a
service principal:

1. Request a secret for your application.


2. Assign Azure role-based access control (Azure RBAC) to your application.

Request a secret for your application


Follow these steps to create and copy the secret key to use in your code:

1. In the Azure portal, search for and select app registrations.


2. On the App registrations page, select your application.
3. On your application's page, select Certificates & secrets from the left navigation.
4. On the Certificates & secrets page, select New client secret.
5. On the Add a client secret page, enter a description and select an expiration
period for the secret.
6. Select Add to create the secret and display it on the Certificates & secrets page.
7. Copy the secret Value to a safe place, because you won't be able to access it again
after you leave this page. If you lose access to your key, you can generate a new
one.

Assign Azure RBAC to your application


Follow these steps to assign an Azure RBAC role to your application. For more
information, see Role-based access control for Azure Batch service.

1. In the Azure portal, navigate to the Batch account your application uses.
2. Select Access control (IAM) from the left navigation.
3. On the Access control (IAM) page, select Add role assignment.
4. On the Add role assignment page, select the Role tab, and then select one of
the Azure Batch built-in RBAC roles for your app.
5. Select the Members tab, and select Select members under Members.
6. On the Select members screen, search for and select your application, and then
select Select.
7. Select Review + assign on the Add role assignment page.

Your application should now appear on the Role assignments tab of the Batch account's
Access control (IAM) page.

Code examples
The code examples in this section show how to authenticate with Microsoft Entra ID by
using integrated authentication or with a service principal. The code examples use .NET
and Python, but the concepts are similar for other languages.

Note

A Microsoft Entra authentication token expires after one hour. When you use a
long-lived BatchClient object, it's best to get a token from MSAL on every request
to ensure that you always have a valid token.

To do this in .NET, write a method that retrieves the token from Microsoft Entra ID,
and pass that method to a BatchTokenCredentials object as a delegate. Every
request to the Batch service calls the delegate method to ensure that a valid token
is provided. By default, MSAL caches tokens, so a new token is retrieved from
Microsoft Entra ID only when necessary. For more information about tokens in
Microsoft Entra ID, see Security tokens.
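
The "ask on every request, fetch only when needed" pattern can be sketched in Python as follows. This illustrates the idea only and is not MSAL's implementation: `CachedToken`, its `fetch` callback, and the five-minute refresh margin are all hypothetical, and a real application should rely on the caching that MSAL already provides.

```python
import time


class CachedToken:
    """Callable that refetches a token only when it is missing or near expiry."""

    def __init__(self, fetch, refresh_margin=300, clock=time.time):
        self._fetch = fetch            # returns (token, expires_at_epoch_seconds)
        self._margin = refresh_margin  # refresh this many seconds before expiry
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def __call__(self):
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._fetch()
        return self._token


if __name__ == "__main__":
    def fake_fetch():
        # Stand-in for a call to the identity provider; tokens last one hour.
        return "token-value", time.time() + 3600

    get_token = CachedToken(fake_fetch)
    print(get_token() == get_token())  # second call is served from the cache
```

Passing the callable itself, rather than a token string, is what lets a long-lived client pick up a fresh token without being recreated.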

Code example: Use Microsoft Entra integrated authentication with Batch .NET
To authenticate with integrated authentication from Batch .NET:

1. Install the Azure Batch .NET and the MSAL NuGet packages.

2. Declare the following using statements in your code:

C#
using Microsoft.Azure.Batch;
using Microsoft.Azure.Batch.Auth;
using Microsoft.Identity.Client;

3. Reference the Microsoft Entra endpoint, including the tenant ID. You can get your
tenant ID from the Microsoft Entra ID Overview page in the Azure portal.

C#

private const string AuthorityUri = "https://login.microsoftonline.com/<tenant-id>";

4. Reference the Batch service resource endpoint:

C#

private const string BatchResourceUri = "https://batch.core.windows.net/";

5. Reference your Batch account:

C#

private const string BatchAccountUrl = "https://<myaccount>.<mylocation>.batch.azure.com";

6. Specify the application (client) ID for your application. You can get the application
ID from your application's Overview page in the Azure portal.

C#

private const string ClientId = "<application-id>";

7. Specify the redirect URI that you provided when you registered the application.

C#

private const string RedirectUri = "https://<redirect-uri>";

8. Write a callback method to acquire the authentication token from Microsoft Entra
ID. The following example calls MSAL to authenticate a user who's interacting with
the application. The MSAL
IConfidentialClientApplication.AcquireTokenByAuthorizationCode method redeems an
authorization code for a token, so the user must already have signed in and the
application must have received the code.

The authorizationCode parameter is the authorization code obtained from the
authorization server after the user authenticates. WithRedirectUri specifies the
redirect URI that the authorization server redirects the user to after authentication.

C#

public static async Task<string> GetTokenUsingAuthorizationCode(
    string authorizationCode, string redirectUri, string[] scopes)
{
    var app = ConfidentialClientApplicationBuilder.Create(ClientId)
        .WithAuthority(AuthorityUri)
        .WithRedirectUri(RedirectUri)
        .Build();

    var authResult = await app.AcquireTokenByAuthorizationCode(scopes, authorizationCode)
        .ExecuteAsync();
    return authResult.AccessToken;
}

9. Call this method with the following code, replacing <authorization-code> with the
authorization code obtained from the authorization server. The .default scope
ensures that the user has permission to access all the scopes for the resource.

C#

var token = await GetTokenUsingAuthorizationCode(
    "<authorization-code>",
    RedirectUri,
    new string[] { $"{BatchResourceUri}/.default" });

10. Construct a BatchTokenCredentials object that takes the delegate as a parameter.
Use those credentials to open a BatchClient object. Then use the BatchClient
object for subsequent operations against the Batch service:

C#

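public static void PerformBatchOperations()
{
    // The delegate must pass the same arguments that the token method expects.
    Func<Task<string>> tokenProvider = () =>
        GetTokenUsingAuthorizationCode(
            "<authorization-code>",
            RedirectUri,
            new string[] { $"{BatchResourceUri}/.default" });

    using (var client = BatchClient.Open(
        new BatchTokenCredentials(BatchAccountUrl, tokenProvider)))
    {
        client.JobOperations.ListJobs();
    }
}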
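The steps above assume that the application already holds an authorization code. As a rough sketch of where that code comes from, the following Python builds the sign-in URL for the Microsoft identity platform v2.0 authorize endpoint and reads the code back from the redirect. The function names, the state value, and the sample redirect are assumptions for illustration; in a real web app, a framework or MSAL would normally handle these steps.

```python
from typing import Sequence
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorize_url(
    tenant_id: str,
    client_id: str,
    redirect_uri: str,
    scopes: Sequence[str],
    state: str,
) -> str:
    """Build the URL that the user's browser visits to sign in."""
    query = urlencode(
        {
            "client_id": client_id,
            "response_type": "code",  # request an authorization code
            "redirect_uri": redirect_uri,
            "response_mode": "query",
            "scope": " ".join(scopes),
            "state": state,  # echoed back so the app can match the response
        }
    )
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/authorize?{query}"


def extract_authorization_code(redirected_url: str, expected_state: str) -> str:
    """Read the authorization code from the URL the user was redirected to."""
    params = parse_qs(urlsplit(redirected_url).query)
    if "error" in params:
        raise ValueError("authorization failed: " + params["error"][0])
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; ignoring the response")
    return params["code"][0]


if __name__ == "__main__":
    # Mirrors the BatchResourceUri constant and scope format used above.
    batch_resource_uri = "https://batch.core.windows.net/"
    print(
        build_authorize_url(
            "<tenant-id>",
            "<application-id>",
            "https://<redirect-uri>",
            [f"{batch_resource_uri}/.default"],
            state="example-state",
        )
    )
```

The returned code is the value that replaces the `<authorization-code>` placeholder in the C# call.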

Code example: Use a Microsoft Entra service principal with Batch .NET
To authenticate with a service principal from Batch .NET:

1. Install the Azure Batch .NET and the MSAL NuGet packages.

2. Declare the following using statements in your code:

C#

using Microsoft.Azure.Batch;
using Microsoft.Azure.Batch.Auth;
using Microsoft.Identity.Client;

3. Reference the Microsoft Entra endpoint, including the tenant ID. When you use a
service principal, you must provide a tenant-specific endpoint. You can get your
tenant ID from the Microsoft Entra ID Overview page in the Azure portal.

C#

private const string AuthorityUri = "https://login.microsoftonline.com/<tenant-id>";

4. Reference the Batch service resource endpoint:

C#

private const string BatchResourceUri = "https://batch.core.windows.net/";

5. Reference your Batch account:

C#

private const string BatchAccountUrl = "https://<myaccount>.<mylocation>.batch.azure.com";

6. Specify the application (client) ID for your application. You can get the application
ID from your application's Overview page in the Azure portal.

C#

private const string ClientId = "<application-id>";

7. Specify the secret key that you copied from the Azure portal.

C#

private const string ClientKey = "<secret-key>";

8. Write a callback method to acquire the authentication token from Microsoft Entra
ID. The following ConfidentialClientApplicationBuilder.Create method calls MSAL
for unattended authentication.

C#

public static async Task<string> GetAccessToken(string[] scopes)
{
    // Use the ClientId constant declared in step 6.
    var app = ConfidentialClientApplicationBuilder.Create(ClientId)
        .WithClientSecret(ClientKey)
        .WithAuthority(new Uri(AuthorityUri))
        .Build();

    var result = await app.AcquireTokenForClient(scopes).ExecuteAsync();
    return result.AccessToken;
}

9. Call this method by using the following code. The .default scope ensures that the
application has permission to access all the scopes for the resource.

C#

var token = await GetAccessToken(new string[] { $"{BatchResourceUri}/.default" });

10. Construct a BatchTokenCredentials object that takes the delegate as a parameter.
Use those credentials to open a BatchClient object. Then use the BatchClient
object for subsequent operations against the Batch service:

C#

public static void PerformBatchOperations()
{
    // The delegate must supply the scopes that GetAccessToken expects.
    Func<Task<string>> tokenProvider = () =>
        GetAccessToken(new string[] { $"{BatchResourceUri}/.default" });

    using (var client = BatchClient.Open(
        new BatchTokenCredentials(BatchAccountUrl, tokenProvider)))
    {
        client.JobOperations.ListJobs();
    }
}

Code example: Use a Microsoft Entra service principal with Batch Python
To authenticate with a service principal from Batch Python:

1. Install the azure-batch and azure-common Python modules.

2. Reference the modules:

Python

from azure.batch import BatchServiceClient
from azure.common.credentials import ServicePrincipalCredentials

3. To use a service principal, provide a tenant-specific endpoint. You can get your
tenant ID from the Microsoft Entra ID Overview page or Properties page in the
Azure portal.

Python

TENANT_ID = "<tenant-id>"

4. Reference the Batch service resource endpoint:

Python

RESOURCE = "https://batch.core.windows.net/"

5. Reference your Batch account:

Python

BATCH_ACCOUNT_URL = "https://<myaccount>.<mylocation>.batch.azure.com"

6. Specify the application (client) ID for your application. You can get the application
ID from your application's Overview page in the Azure portal.

Python

CLIENT_ID = "<application-id>"

7. Specify the secret key that you copied from the Azure portal:

Python

SECRET = "<secret-key>"

8. Create a ServicePrincipalCredentials object:

Python

credentials = ServicePrincipalCredentials(
    client_id=CLIENT_ID,
    secret=SECRET,
    tenant=TENANT_ID,
    resource=RESOURCE
)

9. Use the service principal credentials to open a BatchServiceClient object. Then use
the BatchServiceClient object for subsequent operations against the Batch service.

Python

batch_client = BatchServiceClient(
    credentials,
    batch_url=BATCH_ACCOUNT_URL
)

For a Python example of how to create a Batch client authenticated by using a Microsoft
Entra token, see the Deploying Azure Batch Custom Image with a Python Script
sample.
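To make the service principal flow less opaque, the sketch below builds (but doesn't send) the kind of OAuth 2.0 client credentials request that an identity library issues on the application's behalf. It targets the Microsoft identity platform v2.0 token endpoint, which takes a scope rather than the resource parameter used above. The function name and the placeholder values are assumptions for the example; production code should rely on a maintained library instead of hand-built requests.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_client_credentials_request(
    tenant_id: str, client_id: str, client_secret: str, scope: str
) -> Request:
    """Build the token request for unattended (service principal) sign-in."""
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,
        }
    ).encode("ascii")
    return Request(
        f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    resource = "https://batch.core.windows.net/"
    request = build_client_credentials_request(
        "<tenant-id>", "<application-id>", "<secret-key>", f"{resource}/.default"
    )
    print(request.get_method(), request.full_url)
    # Show the form field names without echoing the secret value.
    print(sorted(parse_qs(request.data.decode("ascii"))))
```

The tenant-specific URL is why a service principal needs a tenant ID: there's no interactive sign-in to discover which tenant the caller belongs to.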

Next steps
Authenticate Batch Management solutions with Active Directory
Client credential flows in MSAL.NET
Using MSAL.NET to get tokens by authorization code (for web sites)
Application and service principal objects in Microsoft Entra ID
How to create a Microsoft Entra application and service principal that can access
resources
Microsoft identity platform code samples


Authenticate Batch Management solutions with Microsoft Entra ID
Article • 06/24/2024

Applications that call the Azure Batch Management service authenticate with Microsoft
Entra ID by using the Microsoft Authentication Library (MSAL). Microsoft Entra ID is
Microsoft's multitenant, cloud-based directory and identity management service. Azure
itself uses Microsoft Entra ID for the authentication of its customers, service
administrators, and organizational users.

The Batch Management .NET library exposes types for working with Batch accounts,
account keys, applications, and application packages. The Batch Management .NET
library is an Azure resource provider client, and is used together with Azure Resource
Manager to manage these resources programmatically. Microsoft Entra ID is required to
authenticate requests made through any Azure resource provider client, including the
Batch Management .NET library, and through Azure Resource Manager.

In this article, we explore using Microsoft Entra ID to authenticate from applications that
use the Batch Management .NET library. We show how to use Microsoft Entra ID to
authenticate a subscription administrator or co-administrator, using integrated
authentication. We use the AccountManagement sample project, available on GitHub,
to walk through using Microsoft Entra ID with the Batch Management .NET library.

To learn more about using the Batch Management .NET library and the
AccountManagement sample, see Manage Batch accounts and quotas with the Batch
Management client library for .NET.

Register your application with Microsoft Entra ID
The Microsoft Authentication Library (MSAL) provides a programmatic interface to
Microsoft Entra ID for use within your applications. To call MSAL from your application,
you must register your application in a Microsoft Entra tenant. When you register your
application, you supply Microsoft Entra ID with information about your application,
including a name for it within the Microsoft Entra tenant. Microsoft Entra ID then
provides an application ID that you use to associate your application with Microsoft
Entra ID at runtime. To learn more about the application ID, see Application and service
principal objects in Microsoft Entra ID.

To register the AccountManagement sample application, follow the steps in the Adding
an Application section in Integrating applications with Microsoft Entra ID. Specify Native
Client Application for the type of application. The industry standard OAuth 2.0 URI for
the Redirect URI is urn:ietf:wg:oauth:2.0:oob . However, you can specify any valid URI
(such as http://myaccountmanagementsample ) for the Redirect URI, as it does not need to
be a real endpoint.

Once you complete the registration process, you'll see the application ID and the object
(service principal) ID listed for your application.

Grant the Azure Resource Manager API access to your application
Next, you'll need to delegate access to your application to the Azure Resource Manager
API. The Microsoft Entra identifier for the Resource Manager API is Windows Azure
Service Management API.

Follow these steps in the Azure portal:

1. In the left-hand navigation pane of the Azure portal, choose All services, click App
Registrations, and click Add.

2. Search for the name of your application in the list of app registrations.

3. Display the Settings blade. In the API Access section, select Required permissions.

4. Click Add to add a new required permission.

5. In step 1, enter Windows Azure Service Management API, select that API from the
list of results, and click the Select button.

6. In step 2, select the check box next to Access Azure classic deployment model as
organization users, and click the Select button.

7. Click the Done button.

The Required Permissions blade now shows that permissions to your application are
granted to both the MSAL and Resource Manager APIs. Permissions are granted to
MSAL by default when you first register your app with Microsoft Entra ID.

Microsoft Entra endpoints
To authenticate your Batch Management solutions with Microsoft Entra ID, you'll need
two well-known endpoints.

The Microsoft Entra common endpoint provides a generic credential gathering
interface when a specific tenant is not provided, as in the case of integrated
authentication:

https://login.microsoftonline.com/common

The Azure Resource Manager endpoint is used to acquire a token for
authenticating requests to the Batch management service:

https://management.core.windows.net/

The AccountManagement sample application defines constants for these endpoints.
Leave these constants unchanged:

C#

// Azure Active Directory "common" endpoint.
private const string AuthorityUri = "https://login.microsoftonline.com/common";
// Azure Resource Manager endpoint
private const string ResourceUri = "https://management.core.windows.net/";

Reference your application ID


Your client application uses the application ID (also referred to as the client ID) to access
Microsoft Entra ID at runtime. Once you've registered your application in the Azure
portal, update your code to use the application ID provided by Microsoft Entra ID for
your registered application. In the AccountManagement sample application, copy your
application ID from the Azure portal to the appropriate constant:

C#

// Specify the unique identifier (the "Client ID") for your application. This is
// required so that your native client application (i.e. this sample) can access
// the Microsoft Graph API. For information about registering an application in
// Azure Active Directory, please see "Register an application with the Microsoft
// identity platform" here:
// https://learn.microsoft.com/azure/active-directory/develop/quickstart-register-app
private const string ClientId = "<application-id>";

Also copy the redirect URI that you specified during the registration process. The
redirect URI specified in your code must match the redirect URI that you provided when
you registered the application.

C#

// The URI to which Azure AD will redirect in response to an OAuth 2.0 request.
// This value is specified by you when you register an application with AAD (see
// ClientId comment). It does not need to be a real endpoint, but must be a valid
// URI (e.g. https://accountmgmtsampleapp).
private const string RedirectUri = "http://myaccountmanagementsample";

Acquire a Microsoft Entra authentication token


After you register the AccountManagement sample in the Microsoft Entra tenant and
update the sample source code with your values, the sample is ready to authenticate
using Microsoft Entra ID. When you run the sample, the MSAL attempts to acquire an
authentication token. At this step, it prompts you for your Microsoft credentials:

C#

// Obtain an access token using the "common" AAD resource. This allows the application
// to query AAD for information that lies outside the application's tenant (such as for
// querying subscription information in your Azure account).
AuthenticationContext authContext = new AuthenticationContext(AuthorityUri);
AuthenticationResult authResult = authContext.AcquireToken(ResourceUri,
                                                           ClientId,
                                                           new Uri(RedirectUri),
                                                           PromptBehavior.Auto);

After you provide your credentials, the sample application can proceed to issue
authenticated requests to the Batch management service.
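When a request fails after sign-in, it can help to confirm that the token targets the Resource Manager endpoint and hasn't expired. The Python sketch below decodes the claims of a JWT-formatted token and compares the aud and exp claims. It assumes that the access token is a JWT and that its audience equals the resource URI, and it does not validate the signature, so treat it as a diagnostic aid only. The function names are made up for this example.

```python
import base64
import json
import time
from typing import Optional


def decode_claims(access_token: str) -> dict:
    """Decode the payload of a JWT without verifying its signature."""
    parts = access_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token with three dot-separated parts")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped Base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def token_is_usable(
    access_token: str,
    expected_audience: str,
    now: Optional[float] = None,
    skew_seconds: float = 60.0,
) -> bool:
    """Return True if the token targets the audience and is not about to expire."""
    claims = decode_claims(access_token)
    current = time.time() if now is None else now
    return (
        claims.get("aud") == expected_audience
        and claims.get("exp", 0) - skew_seconds > current
    )
```

For the sample in this article, the expected audience would be the ResourceUri constant, `https://management.core.windows.net/`.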

Next steps
For more information on running the AccountManagement sample application,
see Manage Batch accounts and quotas with the Batch Management client library
for .NET.
To learn more about Microsoft Entra ID, see the Microsoft Entra Documentation.
In-depth examples showing how to use MSAL are available in the Azure Code
Samples library.
To authenticate Batch service applications using Microsoft Entra ID, see
Authenticate Batch service solutions with Active Directory.



Use certificates to securely access Azure Key Vault with Batch
07/01/2025

Warning

Batch account certificates as detailed in this article are deprecated. To securely access
Azure Key Vault, simply use Pool managed identities with the appropriate access
permissions configured for the user-assigned managed identity to access your Key Vault. If
you need to provision certificates on Batch nodes, please utilize the available Azure Key
Vault VM extension in conjunction with pool Managed Identity to install and manage
certificates on your Batch pool. For more information on deploying certificates from Azure
Key Vault with Managed Identity on Batch pools, see Enable automatic certificate
rotation in a Batch pool.

In this article, you'll learn how to set up Batch nodes with certificates to securely access
credentials stored in Azure Key Vault.

To authenticate to Azure Key Vault from a Batch node, you need:

A Microsoft Entra credential
A certificate
A Batch account
A Batch pool with at least one node

Obtain a certificate
If you don't already have a certificate, use the PowerShell cmdlet New-SelfSignedCertificate to
make a new self-signed certificate.

Create a service principal


Access to Key Vault is granted to either a user or a service principal. To access Key Vault
programmatically, use a service principal with the certificate you created in the previous step.
The service principal must be in the same Microsoft Entra tenant as the Key Vault.

PowerShell

$now = [System.DateTime]::Parse("2020-02-10")
# Set this to the expiration date of the certificate
$expirationDate = [System.DateTime]::Parse("2021-02-10")

# Point the script at the cer file you created
$cerCertificateFilePath = 'c:\temp\batchcertificate.cer'

# Load the certificate into memory
$cer = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2
$cer.Import($cerCertificateFilePath)
$credValue = [System.Convert]::ToBase64String($cer.GetRawCertData())

# Create a new AAD application that uses this certificate
$newADApplication = New-AzureRmADApplication -DisplayName "Batch Key Vault Access" `
    -HomePage "https://batch.mydomain.com" -IdentifierUris "https://batch.mydomain.com" `
    -CertValue $credValue -StartDate $now -EndDate $expirationDate

# Create new AAD service principal that uses this application
$newAzureAdPrincipal = New-AzureRmADServicePrincipal `
    -ApplicationId $newADApplication.ApplicationId

The URLs for the application aren't important, since we're only using them for Key Vault access.

Grant rights to Key Vault


The service principal created in the previous step needs permission to retrieve the secrets from
Key Vault. Permission can be granted either through the Azure portal or with the PowerShell
command below.

PowerShell

Set-AzureRmKeyVaultAccessPolicy -VaultName 'BatchVault' `
    -ServicePrincipalName 'https://batch.mydomain.com' -PermissionsToSecrets 'Get'

Assign a certificate to a Batch account


Create a Batch pool, then go to the certificate tab in the pool and assign the certificate you
created. The certificate is now on all Batch nodes.

Next, assign the certificate to the Batch account. Assigning the certificate to the account lets
Batch assign it to the pools and then to the nodes. The easiest way to do this is to go to your
Batch account in the portal, navigate to Certificates, and select Add. Upload the .pfx file you
generated earlier and supply the password. Once complete, the certificate is added to the list
and you can verify the thumbprint.
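A certificate thumbprint is the SHA-1 hash of the DER-encoded certificate, written as hexadecimal. If you want to check the value shown in the portal against your local file, a small Python sketch such as the following can compute it from the public certificate. The function names are illustrative, and the sketch assumes the file is either binary DER or Base64 (PEM) text.

```python
import hashlib
import ssl


def to_der(cert_bytes: bytes) -> bytes:
    """Accept DER bytes or Base64 (PEM) text and return DER bytes."""
    if cert_bytes.lstrip().startswith(b"-----BEGIN CERTIFICATE-----"):
        return ssl.PEM_cert_to_DER_cert(cert_bytes.decode("ascii").strip())
    return cert_bytes


def certificate_thumbprint(der_bytes: bytes) -> str:
    """Return the SHA-1 thumbprint as uppercase hexadecimal."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def normalize_thumbprint(text: str) -> str:
    """Strip spaces and separators so two displayed thumbprints can be compared."""
    return "".join(ch for ch in text if ch.isalnum()).upper()


def thumbprint_of_file(path: str) -> str:
    """Compute the thumbprint of a .cer file on disk."""
    with open(path, "rb") as handle:
        return certificate_thumbprint(to_der(handle.read()))
```

For example, `thumbprint_of_file(r"c:\temp\batchcertificate.cer")` should match the portal value once both are passed through `normalize_thumbprint`.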

Now when you create a Batch pool, you can navigate to Certificates within the pool and assign
the certificate you created to that pool. When you do so, ensure you select LocalMachine for
the store location. The certificate is loaded on all Batch nodes in the pool.

Install Azure PowerShell
If you plan on accessing Key Vault using PowerShell scripts on your nodes, then you need the
Azure PowerShell library installed. If your nodes have Windows Management Framework
(WMF) 5 installed, you can use the install-module command to download it. If you're using
nodes that don’t have WMF 5, the easiest way to install it is to bundle up the Azure PowerShell
.msi file with your Batch files, and then call the installer as the first part of your Batch startup
script. See this example for details:

PowerShell

$psModuleCheck = Get-Module -ListAvailable -Name Azure -Refresh
if ($psModuleCheck.count -eq 0) {
    # Install the bundled Azure PowerShell .msi silently and wait for it to finish
    $psInstallerPath = Join-Path $downloadPath "azure-powershell.3.4.0.msi"
    Start-Process msiexec.exe -ArgumentList /i, $psInstallerPath, /quiet -Wait
}

Access Key Vault


Now you're ready to access Key Vault in scripts running on your Batch nodes. To access Key
Vault from a script, all you need is for your script to authenticate against Microsoft Entra ID
using the certificate. To do this in PowerShell, use the following example commands. Specify
the appropriate GUID for Thumbprint, App ID (the ID of your service principal), and Tenant ID
(the tenant where your service principal exists).

PowerShell

Add-AzureRmAccount -ServicePrincipal -CertificateThumbprint <thumbprint> -ApplicationId <application-id> -TenantId <tenant-id>

Once authenticated, access Key Vault as you normally would.

PowerShell

$adminPassword=Get-AzureKeyVaultSecret -VaultName BatchVault -Name batchAdminPass

These are the credentials to use in your script.

Next steps
Learn more about Azure Key Vault.
Review the Azure Security Baseline for Batch.
Learn about Batch features such as configuring access to compute nodes, using Linux
compute nodes, and using private endpoints.

Role-based access control for Azure Batch service
08/12/2025

The Azure Batch service supports a set of built-in Azure roles that provide different levels of
permissions to an Azure Batch account. By using Azure role-based access control (Azure RBAC), an
authorization system for managing individual access to Azure resources, you can assign
specific permissions to users, service principals, or other identities that need to interact with
your Batch account. You can also assign custom roles with fine-grained permissions
that fit your specific scenario.

Note

All RBAC (both built-in and custom) roles are for users authenticated by Microsoft Entra
ID, not for the Batch shared key credentials. The Batch shared key credentials give full
permission to the Batch account.

Assign Azure RBAC


Follow these steps to assign an Azure RBAC role to a user, group, service principal, or managed
identity. For detailed steps, see Assign Azure roles by using the Azure portal.

1. In the Azure portal, navigate to your specific Batch account.

 Tip

You can also set up Azure RBAC for whole resource groups, subscriptions, or
management groups. Do this by selecting the desired scope level and then
navigating to the desired item. For example, selecting Resource groups and then
navigating to a specific resource group.

2. Select Access control (IAM) from the left navigation.

3. On the Access control (IAM) page, select Add role assignment.

4. On the Add role assignment page, select the Role tab, and then select one of Azure
Batch built-in RBAC roles.

5. Select the Members tab, and select Select members under Members.
6. On the Select members screen, search for and select a user, group, service principal, or
managed identity, and then select Select.

Note

When you configure an application to authenticate to Azure Batch with a service
principal, search for and select your application here to configure its access and
permissions to the Azure Batch account.

7. Select Review + assign on the Add role assignment page.

The target identity should now appear on the Role assignments tab of the Batch account's
Access control (IAM) page.
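The same assignment can also be made programmatically. As a hedged sketch, the Python below assembles the URL and body for an Azure Resource Manager role assignment request scoped to a Batch account. The api-version value, the subscription-qualified role definition ID, and the helper names are assumptions for this example, so check the current Azure REST API reference before relying on them. Note that principalId is the object ID of the user or service principal, not the application (client) ID.

```python
import uuid
from typing import Optional, Tuple

# Built-in role IDs as published for the Azure Batch built-in roles.
BUILT_IN_ROLE_IDS = {
    "Azure Batch Account Contributor": "29fe4964-1e60-436b-bd3a-77fd4c178b3c",
    "Azure Batch Account Reader": "11076f67-66f6-4be0-8f6b-f0609fd05cc9",
    "Azure Batch Data Contributor": "6aaa78f1-f7de-44ca-8722-c64a23943cae",
    "Azure Batch Job Submitter": "48e5e92e-a480-4e71-aa9c-2778f4c13781",
}


def batch_account_scope(subscription_id: str, resource_group: str, account: str) -> str:
    """Return the resource ID of a Batch account, used as the assignment scope."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Batch/batchAccounts/{account}"
    )


def build_role_assignment_request(
    scope: str,
    subscription_id: str,
    role_name: str,
    principal_object_id: str,
    assignment_name: Optional[str] = None,
) -> Tuple[str, dict]:
    """Return (url, body) for a PUT request that creates the role assignment."""
    name = assignment_name or str(uuid.uuid4())  # each assignment needs a GUID name
    url = (
        f"https://management.azure.com{scope}"
        f"/providers/Microsoft.Authorization/roleAssignments/{name}"
        "?api-version=2022-04-01"  # assumed API version
    )
    role_definition_id = (
        f"/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Authorization/roleDefinitions/{BUILT_IN_ROLE_IDS[role_name]}"
    )
    body = {
        "properties": {
            "roleDefinitionId": role_definition_id,
            "principalId": principal_object_id,
        }
    }
    return url, body
```

Narrowing the scope to a single Batch account, rather than a resource group or subscription, keeps the assignment as limited as the portal steps above.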

Azure Batch built-in RBAC roles


Azure Batch provides predefined roles for common scenarios, so that you can efficiently
assign each identity the level of access to a Batch account that its duties require.

| Built-in role | Description | ID |
| --- | --- | --- |
| Azure Batch Account Contributor | Grants full access to manage all Batch resources, including Batch accounts, pools, and jobs. | 29fe4964-1e60-436b-bd3a-77fd4c178b3c |
| Azure Batch Account Reader | Lets you view all resources including pools and jobs in the Batch account. | 11076f67-66f6-4be0-8f6b-f0609fd05cc9 |
| Azure Batch Data Contributor | Grants permissions to manage Batch pools and jobs but not to modify accounts. | 6aaa78f1-f7de-44ca-8722-c64a23943cae |
| Azure Batch Job Submitter | Lets you submit and manage jobs in the Batch account. | 48e5e92e-a480-4e71-aa9c-2778f4c13781 |

| Permissions | Azure Batch Account Contributor | Azure Batch Account Reader | Azure Batch Data Contributor | Azure Batch Job Submitter |
| --- | --- | --- | --- | --- |
| List Batch accounts or view properties of a Batch account | ✓ | ✓ | ✓ | |
| Create, update or delete a Batch account | ✓ | | | |
| List access keys for a Batch account | ✓ | | | |
| Regenerate access keys for a Batch account | ✓ | | | |
| List or view properties of applications and application packages on a Batch account | ✓ | ✓ | ✓ | ✓ |
| Create, update or delete applications and application packages on a Batch account | ✓ | | ✓ | |
| List or view properties of certificates on a Batch account | ✓ | ✓ | ✓ | |
| Create, update or delete certificates on a Batch account | ✓ | | ✓ | |
| List or view properties of pools on a Batch account | ✓ | ✓ | ✓ | ✓ |
| Create, update or delete pools on a Batch account | ✓ | | ✓ | |
| List or view properties of jobs on a Batch account | ✓ | ✓ | ✓ | ✓ |
| Create, update or delete jobs on a Batch account | ✓ | | ✓ | ✓ |
| List or view properties of job schedules on a Batch account | ✓ | ✓ | ✓ | ✓ |
| Create, update or delete job schedules on a Batch account | ✓ | | ✓ | ✓ |

Warning

The Batch account certificate feature has been retired.

Azure Batch Account Contributor


Grants full access to manage all Batch resources, including Batch accounts, pools, and jobs.

| Actions | Description |
| --- | --- |
| Microsoft.Authorization/*/read | Read roles and role assignments. |
| Microsoft.Insights/alertRules/* | Create and manage a classic metric alert. |
| Microsoft.Resources/deployments/* | Create and manage a deployment. |
| Microsoft.Resources/subscriptions/resourceGroups/read | Gets or lists resource groups. |
| Microsoft.Batch/* | |
| NotActions | |
| none | |
| DataActions | |
| Microsoft.Batch/* | |
| NotDataActions | |
| none | |

JSON

{
"assignableScopes": [
"/"
],
"description": "Grants full access to manage all Batch resources, including Batch accounts, pools and jobs.",
"id": "/providers/Microsoft.Authorization/roleDefinitions/29fe4964-1e60-436b-bd3a-77fd4c178b3c",
"permissions": [
{
"actions": [
"Microsoft.Authorization/*/read",
"Microsoft.Batch/*",
"Microsoft.Insights/alertRules/*",
"Microsoft.Resources/deployments/*",
"Microsoft.Resources/subscriptions/resourceGroups/read"
],
"dataActions": [
"Microsoft.Batch/*"
],
"notActions": [],
"notDataActions": []
}
],
"roleName": "Azure Batch Account Contributor",
"roleType": "BuiltInRole",
"type": "Microsoft.Authorization/roleDefinitions"
}

Azure Batch Account Reader


Lets you view all resources including pools and jobs in the Batch account.

| Actions | Description |
| --- | --- |
| Microsoft.Batch/*/read | View all resources in Batch account. |
| Microsoft.Resources/subscriptions/resourceGroups/read | Gets or lists resource groups. |
| NotActions | |
| none | |
| DataActions | |
| Microsoft.Batch/*/read | View all resources in Batch account. |
| NotDataActions | |
| none | |

JSON

{
"assignableScopes": [
"/"
],
"description": "Lets you view all resources including pools and jobs in the Batch account.",
"id": "/providers/Microsoft.Authorization/roleDefinitions/11076f67-66f6-4be0-8f6b-f0609fd05cc9",
"permissions": [
{
"actions": [
"Microsoft.Batch/*/read",
"Microsoft.Resources/subscriptions/resourceGroups/read"
],
"dataActions": [
"Microsoft.Batch/*/read"
],
"notActions": [],
"notDataActions": []
}
],
"roleName": "Azure Batch Account Reader",
"roleType": "BuiltInRole",
"type": "Microsoft.Authorization/roleDefinitions"
}

Azure Batch Data Contributor


Grants permissions to manage Batch pools and jobs but not to modify accounts.

| Actions | Description |
| --- | --- |
| Microsoft.Authorization/*/read | Read roles and role assignments. |
| Microsoft.Batch/batchAccounts/read | Lists Batch accounts or gets the properties of a Batch account. |
| Microsoft.Batch/batchAccounts/applications/* | Create and manage applications and application packages on a Batch account. |
| Microsoft.Batch/batchAccounts/certificates/* | Create and manage certificates on a Batch account. (Warning: Certificate feature was retired) |
| Microsoft.Batch/batchAccounts/certificateOperationResults/* | Gets the results of a long running certificate operation on a Batch account. (Warning: Certificate feature was retired) |
| Microsoft.Batch/batchAccounts/pools/* | Create and manage pools on a Batch account. |
| Microsoft.Batch/batchAccounts/poolOperationResults/* | Gets the results of a long running pool operation on a Batch account. |
| Microsoft.Batch/locations/*/read | Get Batch account operation result/Batch quota/supported VM size at the given location. |
| Microsoft.Insights/alertRules/* | Create and manage a classic metric alert. |
| Microsoft.Resources/deployments/* | Create and manage a deployment. |
| Microsoft.Resources/subscriptions/resourceGroups/read | Gets or lists resource groups. |
| NotActions | |
| none | |
| DataActions | |
| Microsoft.Batch/batchAccounts/jobSchedules/* | Create and manage job schedules on a Batch account. |
| Microsoft.Batch/batchAccounts/jobs/* | Create and manage jobs on a Batch account. |
| NotDataActions | |
| none | |

JSON

{
"assignableScopes": [
"/"
],
"description": "Grants permissions to manage Batch pools and jobs but not to modify accounts.",
"id": "/providers/Microsoft.Authorization/roleDefinitions/6aaa78f1-f7de-44ca-8722-c64a23943cae",
"permissions": [
{
"actions": [
"Microsoft.Authorization/*/read",
"Microsoft.Batch/batchAccounts/read",
"Microsoft.Batch/batchAccounts/applications/*",
"Microsoft.Batch/batchAccounts/certificates/*",
"Microsoft.Batch/batchAccounts/certificateOperationResults/*",
"Microsoft.Batch/batchAccounts/pools/*",
"Microsoft.Batch/batchAccounts/poolOperationResults/*",
"Microsoft.Batch/locations/*/read",
"Microsoft.Insights/alertRules/*",
"Microsoft.Resources/deployments/*",
"Microsoft.Resources/subscriptions/resourceGroups/read"
],
"dataActions": [
"Microsoft.Batch/batchAccounts/jobSchedules/*",
"Microsoft.Batch/batchAccounts/jobs/*"
],
"notActions": [],
"notDataActions": []
}
],
"roleName": "Azure Batch Data Contributor",
"roleType": "BuiltInRole",
"type": "Microsoft.Authorization/roleDefinitions"
}

Azure Batch Job Submitter


Lets you submit and manage jobs in the Batch account.

| Actions | Description |
| --- | --- |
| Microsoft.Batch/batchAccounts/applications/read | Lists applications or gets the properties of an application. |
| Microsoft.Batch/batchAccounts/applications/versions/read | Gets the properties of an application package. |
| Microsoft.Batch/batchAccounts/pools/read | Lists pools on a Batch account or gets the properties of a pool. |
| Microsoft.Insights/alertRules/* | Create and manage a classic metric alert. |
| Microsoft.Resources/subscriptions/resourceGroups/read | Gets or lists resource groups. |
| NotActions | |
| none | |
| DataActions | |
| Microsoft.Batch/batchAccounts/jobSchedules/* | Create and manage job schedules on a Batch account. |
| Microsoft.Batch/batchAccounts/jobs/* | Create and manage jobs on a Batch account. |
| NotDataActions | |
| none | |

JSON

{
"assignableScopes": [
"/"
],
"description": "Lets you submit and manage jobs in the Batch account.",
"id": "/providers/Microsoft.Authorization/roleDefinitions/48e5e92e-a480-4e71-aa9c-2778f4c13781",
"permissions": [
{
"actions": [
"Microsoft.Batch/batchAccounts/applications/read",
"Microsoft.Batch/batchAccounts/applications/versions/read",
"Microsoft.Batch/batchAccounts/pools/read",
"Microsoft.Insights/alertRules/*",
"Microsoft.Resources/subscriptions/resourceGroups/read"
],
"dataActions": [
"Microsoft.Batch/batchAccounts/jobSchedules/*",
"Microsoft.Batch/batchAccounts/jobs/*"
],
"notActions": [],
"notDataActions": []
}
],
"roleName": "Azure Batch Job Submitter",
"roleType": "BuiltInRole",
"type": "Microsoft.Authorization/roleDefinitions"
}
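To reason about what a role definition permits, it can help to model how the action lists combine: an operation is allowed when it matches an entry in actions (or dataActions for data-plane operations) and matches nothing in the corresponding "not" list. The Python sketch below is a simplified model of that rule that uses shell-style wildcard matching. It is not the Azure authorization engine, and it ignores scope, deny assignments, and conditions. The function names are illustrative.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Permissions copied from the Azure Batch Job Submitter role definition.
JOB_SUBMITTER = {
    "actions": [
        "Microsoft.Batch/batchAccounts/applications/read",
        "Microsoft.Batch/batchAccounts/applications/versions/read",
        "Microsoft.Batch/batchAccounts/pools/read",
        "Microsoft.Insights/alertRules/*",
        "Microsoft.Resources/subscriptions/resourceGroups/read",
    ],
    "dataActions": [
        "Microsoft.Batch/batchAccounts/jobSchedules/*",
        "Microsoft.Batch/batchAccounts/jobs/*",
    ],
    "notActions": [],
    "notDataActions": [],
}


def _matches(operation: str, patterns: Iterable[str]) -> bool:
    """Case-insensitive wildcard match of one operation against many patterns."""
    op = operation.lower()
    return any(fnmatchcase(op, pattern.lower()) for pattern in patterns)


def is_allowed(permissions: dict, operation: str, data_plane: bool = False) -> bool:
    """Return True if the operation is granted and not excluded."""
    allow_key, deny_key = (
        ("dataActions", "notDataActions") if data_plane else ("actions", "notActions")
    )
    return _matches(operation, permissions.get(allow_key, [])) and not _matches(
        operation, permissions.get(deny_key, [])
    )


if __name__ == "__main__":
    print(is_allowed(JOB_SUBMITTER, "Microsoft.Batch/batchAccounts/jobs/write", data_plane=True))
    print(is_allowed(JOB_SUBMITTER, "Microsoft.Batch/batchAccounts/pools/write"))
```

Under this model, a Job Submitter can write jobs but can't create pools, which is consistent with the requirement that autopool jobs need pool-level write permissions from another role.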

Assign a custom role


If the Azure Batch built-in roles don't meet your needs, you can use Azure custom roles to grant
granular permissions to a user for submitting jobs, tasks, and more. You can use a custom role
to grant or deny permissions to a Microsoft Entra identity for the following Azure Batch RBAC
operations.

Microsoft.Batch/batchAccounts/pools/write
Microsoft.Batch/batchAccounts/pools/delete
Microsoft.Batch/batchAccounts/pools/read
Microsoft.Batch/batchAccounts/jobSchedules/write
Microsoft.Batch/batchAccounts/jobSchedules/delete
Microsoft.Batch/batchAccounts/jobSchedules/read
Microsoft.Batch/batchAccounts/jobs/write
Microsoft.Batch/batchAccounts/jobs/delete
Microsoft.Batch/batchAccounts/jobs/read
Microsoft.Batch/batchAccounts/certificates/write
Microsoft.Batch/batchAccounts/certificates/delete (Warning: Certificate feature was retired)
Microsoft.Batch/batchAccounts/certificates/read (Warning: Certificate feature was retired)
Microsoft.Batch/batchAccounts/applications/write
Microsoft.Batch/batchAccounts/applications/delete
Microsoft.Batch/batchAccounts/applications/read
Microsoft.Batch/batchAccounts/applications/versions/write
Microsoft.Batch/batchAccounts/applications/versions/delete
Microsoft.Batch/batchAccounts/applications/versions/read
Microsoft.Batch/batchAccounts/read, for any read operation
Microsoft.Batch/batchAccounts/listKeys/action, for any operation

Tip

Jobs that use autopool require pool-level write permissions.

Note

Some permissions must be specified in the actions field, whereas others must be
specified in the dataActions field. Examine both actions and dataActions
to understand the full scope of capabilities assigned to a role. For more information, see
Azure resource provider operations.

The following example shows an Azure Batch custom role definition:


JSON

{
    "properties": {
        "roleName": "Azure Batch Custom Job Submitter",
        "type": "CustomRole",
        "description": "Allows a user to submit autopool jobs to Azure Batch",
        "assignableScopes": [
            "/subscriptions/aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.Batch/*/read",
                    "Microsoft.Batch/batchAccounts/pools/write",
                    "Microsoft.Batch/batchAccounts/pools/delete",
                    "Microsoft.Authorization/*/read",
                    "Microsoft.Resources/subscriptions/resourceGroups/read",
                    "Microsoft.Support/*",
                    "Microsoft.Insights/alertRules/*"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.Batch/batchAccounts/jobs/*",
                    "Microsoft.Batch/batchAccounts/jobSchedules/*"
                ],
                "notDataActions": []
            }
        ]
    }
}
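
Because a role splits its permissions between actions and dataActions, it can help to check a definition against a specific operation before assigning it. The following Python sketch is a simplified model, not the Azure RBAC evaluation engine: it assumes that `*` matches any run of characters, that matching is case-insensitive, and it ignores scopes, deny assignments, and the combination of multiple role assignments. The names `CUSTOM_ROLE` and `is_allowed` are hypothetical.

```python
import re

# Abbreviated copy of the custom role definition shown in this article.
CUSTOM_ROLE = {
    "properties": {
        "permissions": [
            {
                "actions": [
                    "Microsoft.Batch/*/read",
                    "Microsoft.Batch/batchAccounts/pools/write",
                    "Microsoft.Batch/batchAccounts/pools/delete",
                    "Microsoft.Authorization/*/read",
                    "Microsoft.Resources/subscriptions/resourceGroups/read",
                    "Microsoft.Support/*",
                    "Microsoft.Insights/alertRules/*",
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.Batch/batchAccounts/jobs/*",
                    "Microsoft.Batch/batchAccounts/jobSchedules/*",
                ],
                "notDataActions": [],
            }
        ]
    }
}


def _matches(pattern: str, operation: str) -> bool:
    """Return True if the pattern, with '*' as a wildcard, covers the operation."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, operation, flags=re.IGNORECASE) is not None


def is_allowed(role: dict, operation: str, data_plane: bool = False) -> bool:
    """Check one operation against actions (default) or dataActions."""
    allow_key = "dataActions" if data_plane else "actions"
    deny_key = "notDataActions" if data_plane else "notActions"
    for permission in role["properties"]["permissions"]:
        allowed = any(_matches(p, operation) for p in permission.get(allow_key, []))
        denied = any(_matches(p, operation) for p in permission.get(deny_key, []))
        if allowed and not denied:
            return True
    return False


print(is_allowed(CUSTOM_ROLE, "Microsoft.Batch/batchAccounts/pools/write"))
print(is_allowed(CUSTOM_ROLE, "Microsoft.Batch/batchAccounts/jobs/write"))
print(is_allowed(CUSTOM_ROLE, "Microsoft.Batch/batchAccounts/jobs/write", data_plane=True))
```

In this model, the job operation is covered only when it is checked against dataActions, which is why both lists need to be read together.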

Next steps
Create a Batch account in the Azure portal
Authenticate Batch Management solutions with Microsoft Entra ID
Authenticate Azure Batch services with Microsoft Entra ID

Copy applications and data to pool nodes
07/01/2025

Azure Batch supports several ways for getting data and applications onto compute nodes so
that they're available for use by tasks.

The method you choose may depend on the scope of your file or application. Data and
applications may be required to run the entire job, and so need to be installed on every node.
Some files or applications may be required only for a specific task. Others may need to be
installed for the job, but don't need to be on every node. Batch has tools for each of these
scenarios.

Determine the scope required of a file


Determine the scope of each file: is it required for a pool, a job, or a task? Files
that are scoped to the pool should use pool application packages or a start task. Files scoped
to the job should use a job preparation task. Applications are a good example of files scoped at
the pool or job level. Files scoped to the task should use task resource files.

Pool start task resource files


For applications or data that need to be installed on every node in the pool, use pool start task
resource files. Use this method along with either an application package or the start task's
resource file collection in order to perform an install command.

For example, you can use the start task command line to move or install applications. You can
also specify a list of files or containers in an Azure storage account. For more information, see
Add#ResourceFile in REST documentation.

If every job that runs on the pool runs an application (.exe) that must first be installed with a
.msi file, you'll need to set the start task's wait for success property to true. For more
information, see Add#StartTask in REST documentation.

Application package references


For applications or data that need to be installed on every node in the pool, consider using
application packages. There is no install command associated with an application package, but
you can use a start task to run any install command. If your application doesn't require
installation, or consists of a large number of files, you can use this method.

Application packages are useful when you have a large number of files, because they can
combine many file references into a small payload. If you try to include more than 100 separate
resource files in one task, the Batch service might come up against internal system limitations
for a single task. Application packages are also useful when you have many different versions
of the same application and need to choose between them.

Extensions
Extensions are small applications that facilitate post-provisioning configuration and setup on
Batch compute nodes. When you create a pool, you can select a supported extension to be
installed on the compute nodes as they are provisioned. After that, the extension can perform
its intended operation.

Job preparation task resource files


For applications or data that must be installed for the job to run, but don't need to be installed
on the entire pool, consider using job preparation task resource files.

For example, if your pool has many different types of jobs, and only one job type needs an .msi
file in order to run, it makes sense to put the installation step into a job preparation task.

Task resource files


Task resource files are appropriate when your application or data is relevant only to an
individual task.

For example, you might have five tasks, each processing a different file and then writing the
output to blob storage. In this case, the input file should be specified in the task's resource
files collection, because each task has its own input file.

Additional ways to get data onto nodes


Because you have control over Azure Batch nodes, and can run custom executables, you can
pull data from any number of custom sources. Make sure the Batch node has connectivity to
the target and that you have credentials to that source on the node.

A few examples of ways to transfer data to Batch nodes are:

Downloading data from SQL
Downloading data from other web services/custom locations
Mapping a network share

Azure storage

Keep in mind that blob storage has download scalability targets. Azure storage file share
scalability targets are the same as for a single blob. The size will impact the number of nodes
and pools you need.

Next steps
Learn about using application packages with Batch.
Learn more about working with nodes and pools.

Deploy applications to compute nodes with Batch application packages
Article • 04/25/2025

Application packages can simplify the code in your Azure Batch solution and make it easier to
manage the applications that your tasks run. With application packages, you can upload and
manage multiple versions of the applications your tasks run, including their supporting files.
You can then automatically deploy one or more of these applications to the compute nodes in
your pool.

The APIs for creating and managing application packages are part of the Batch Management
.NET library. The APIs for installing application packages on a compute node are part of the
Batch .NET library. Comparable features are in the available Batch APIs for other programming
languages.

This article explains how to upload and manage application packages in the Azure portal. It
also shows how to install them on a pool's compute nodes with the Batch .NET library.

Application package requirements


To use application packages, you need to link an Azure Storage account to your Batch account.

There are restrictions on the number of applications and application packages within a Batch
account and on the maximum application package size. For more information, see Batch
service quotas and limits.

Note

Batch pools created prior to July 5, 2017 do not support application packages (unless they
were created after March 10, 2016 by using Cloud Services Configuration). The application
packages feature described here supersedes the Batch Apps feature available in previous
versions of the service.

Understand applications and application packages


Within Azure Batch, an application refers to a set of versioned binaries that can be
automatically downloaded to the compute nodes in your pool. An application contains one or
more application packages, which represent different versions of the application.
Each application package is a .zip file that contains the application binaries and any supporting
files. Only the .zip format is supported.
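
Since each application package is a single .zip file, a build step usually bundles the binaries and supporting files before upload. The following Python sketch shows one way to do that with the standard library. The folder layout, file names, and the `build_package` helper are hypothetical, and the sketch does not upload anything to Batch.

```python
import tempfile
import zipfile
from pathlib import Path


def build_package(source_dir: Path, zip_path: Path) -> int:
    """Zip every file under source_dir and return the number of files added."""
    count = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the application folder.
                archive.write(path, arcname=path.relative_to(source_dir).as_posix())
                count += 1
    return count


# Demo with placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    app_dir = Path(tmp) / "litware"
    (app_dir / "data").mkdir(parents=True)
    (app_dir / "litware.exe").write_bytes(b"placeholder binary")
    (app_dir / "data" / "settings.json").write_text("{}")

    package = Path(tmp) / "litware-1.1001.2b.zip"
    added = build_package(app_dir, package)
    with zipfile.ZipFile(package) as archive:
        names = sorted(archive.namelist())

print(added, names)
```

The archive is written outside the source folder so that it doesn't include itself.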

You can specify application packages at the pool or task level.

Pool application packages are deployed to every node in the pool. Applications are
deployed when a node joins a pool and when it's rebooted or reimaged.

Pool application packages are appropriate when all nodes in a pool run a job's tasks. You
can specify one or more application packages to deploy when you create a pool. You can
also add or update an existing pool's packages. To install a new package to an existing
pool, you must restart its nodes.

Task application packages are deployed only to a compute node scheduled to run a task,
just before running the task's command line. If the specified application package and
version is already on the node, it isn't redeployed and the existing package is used.

Task application packages are useful in shared-pool environments, where different jobs
run on one pool, and the pool isn't deleted when a job completes. If your job has fewer
tasks than nodes in the pool, task application packages can minimize data transfer, since
your application is deployed only to the nodes that run tasks.

Other scenarios that can benefit from task application packages are jobs that run a large
application but for only a few tasks. For example, task applications might be useful for a
heavyweight preprocessing stage or a merge task.

With application packages, your pool's start task doesn't have to specify a long list of individual
resource files to install on the nodes. You don't have to manually manage multiple versions of
your application files in Azure Storage or on your nodes. And you don't need to worry about
generating SAS URLs to provide access to the files in your Azure Storage account. Batch works
in the background with Azure Storage to store application packages and deploy them to
compute nodes.

Note

The total size of a start task must be less than or equal to 32,768 characters, including
resource files and environment variables. If your start task exceeds this limit, using
application packages is another option. You can also create a .zip file containing your
resource files, upload the file as a blob to Azure Storage, and then unzip it from the
command line of your start task.
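
A rough pre-flight check can flag a start task that is likely to exceed this limit. The following Python sketch is only an estimate under a stated assumption: it counts the characters in the command line, each resource file URL and file path, and each environment variable name and value. How the Batch service actually measures the size isn't specified here, so treat the result as a guide. The function names are hypothetical.

```python
START_TASK_LIMIT = 32_768  # characters, from the note above


def estimate_start_task_size(command_line, resource_files, environment):
    """Sum the characters of the command line, resource files, and variables.

    resource_files is a list of (url, file_path) pairs; environment is a dict.
    """
    size = len(command_line)
    for url, file_path in resource_files:
        size += len(url) + len(file_path)
    for name, value in environment.items():
        size += len(name) + len(value)
    return size


def fits_start_task_limit(command_line, resource_files, environment):
    """Return True if the estimated size is within the documented limit."""
    size = estimate_start_task_size(command_line, resource_files, environment)
    return size <= START_TASK_LIMIT


small = estimate_start_task_size(
    "cmd /c setup.exe",
    [("https://example.com/a.zip", "a.zip")],
    {"MODE": "fast"},
)
many_files = [("https://example.com/" + "x" * 80, "f.txt")] * 400
print(small, fits_start_task_limit("cmd /c setup.exe", many_files, {}))
```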

Upload and manage applications


You can use the Azure portal or the Batch Management APIs to manage the application
packages in your Batch account. The following sections explain how to link a storage account
and how to add and manage applications and application packages in the Azure portal.

Note

While you can define application values in the Microsoft.Batch/batchAccounts resource
of an ARM template, it's not currently possible to use an ARM template to upload
application packages to use in your Batch account. You must upload them to your linked
storage account as described in Add a new application.

Link a storage account


To use application packages, you must link an Azure Storage account to your Batch account.
The Batch service uses the associated storage account to store your application packages.
Ideally, you should create a storage account specifically for use with your Batch account.

If you haven't yet configured a storage account, the Azure portal displays a warning the first
time you select Applications from the left navigation menu in your Batch account. To link a
storage account to your Batch account:

1. Select the Warning window that states, "No Storage account configured for this batch
account."
2. Then choose Storage Account set... on the next page.
3. Choose the Select a storage account link in the Storage Account Information section.
4. Select the storage account you want to use with this batch account in the list on the
Choose storage account pane.
5. Then select Save in the top-left corner of the page.

After you link the two accounts, Batch can automatically deploy the packages stored in the
linked Storage account to your compute nodes.

Important

You can't use application packages with Azure Storage accounts configured with firewall
rules or with Hierarchical namespace set to Enabled.

The Batch service uses Azure Storage to store your application packages as block blobs. You're
charged as normal for the block blob data, and the size of each package can't exceed the
maximum block blob size. For more information, see Scalability and performance targets for
Blob storage. To minimize costs, be sure to consider the size and number of your application
packages, and periodically remove deprecated packages.

Add a new application


To create a new application, you add an application package and specify a unique application
ID.

In your Batch account, select Applications from the left navigation menu, and then select Add.

Enter the following information:

Application ID: The ID of your new application.
Version: The version for the application package you're uploading.
Application package: The .zip file containing the application binaries and supporting files
that are required to run the application.

The Application ID and Version you enter must follow these requirements:

On Windows nodes, the ID can contain any combination of alphanumeric characters,
hyphens, and underscores. On Linux nodes, only alphanumeric characters and
underscores are permitted.
Can't contain more than 64 characters.
Must be unique within the Batch account.
IDs are case-preserving and case-insensitive.
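
These rules can be checked before you upload a package. The following Python sketch encodes them as stated above, under two assumptions: "alphanumeric" is read as ASCII letters and digits, and uniqueness is compared without regard to case because IDs are case-insensitive. The helper names are hypothetical and aren't part of any Batch library.

```python
import re

# Windows nodes: letters, digits, hyphens, underscores. Linux nodes: no hyphens.
_WINDOWS_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")
_LINUX_ID = re.compile(r"[A-Za-z0-9_]{1,64}")


def is_valid_application_id(app_id: str, os_family: str) -> bool:
    """Check the character set and the 64-character limit for the node OS."""
    if os_family == "windows":
        pattern = _WINDOWS_ID
    elif os_family == "linux":
        pattern = _LINUX_ID
    else:
        raise ValueError("os_family must be 'windows' or 'linux'")
    return pattern.fullmatch(app_id) is not None


def is_unique_application_id(app_id: str, existing_ids) -> bool:
    """IDs are case-insensitive, so compare them in lowercase."""
    return app_id.lower() not in {existing.lower() for existing in existing_ids}


print(is_valid_application_id("lit-ware", "windows"))
print(is_valid_application_id("lit-ware", "linux"))
print(is_unique_application_id("Blender", ["blender"]))
```

An ID that has to work on both Windows and Linux pools should avoid hyphens.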

When you're ready, select Submit. After the .zip file has been uploaded to your Azure Storage
account, the portal displays a notification. Depending on the size of the file that you're
uploading and the speed of your network connection, this process might take some time.

View current applications


To view the applications in your Batch account, select Applications in the left navigation menu.

Selecting this menu option opens the Applications window. This window displays the ID of
each application in your account and the following properties:

Packages: The number of versions associated with this application.
Default version: If applicable, the application version that is installed if no version is
specified when deploying the application.
Allow updates: Specifies whether package updates and deletions are allowed.

To see the file structure of the application package on a compute node, navigate to your Batch
account in the Azure portal. Select Pools. Then select the pool that contains the compute node.
Select the compute node on which the application package is installed and open the
applications folder.

View application details


To see the details for an application, select it in the Applications window. You can configure
your application by selecting Settings in the left navigation menu.

Allow updates: Indicates whether application packages can be updated or deleted. The
default is Yes. If set to No, existing application packages can't be updated or deleted, but
new application package versions can still be added.
Default version: The default application package to use when the application is deployed
if no version is specified.
Display name: A friendly name that your Batch solution can use when it displays
information about the application. For example, this name can be used in the UI of a
service that you provide to your customers through Batch.

Add a new application package


To add an application package version for an existing application, select the application on the
Applications page of your Batch account. Then select Add.

As you did for the new application, specify the Version for your new package, upload your .zip
file in the Application package field, and then select Submit.

Update or delete an application package


To update or delete an existing application package, select the application on the Applications
page of your Batch account. Select the ellipsis in the row of the application package that you
want to modify. Then select the action that you want to perform.

If you select Update, you can upload a new .zip file. This file replaces the previous .zip file that
you uploaded for that version.

If you select Delete, you're prompted to confirm the deletion of that version. After you select
OK, Batch deletes the .zip file from your Azure Storage account. If you delete the default
version of an application, the Default version setting is removed for that application.

Install applications on compute nodes


You've learned how to manage application packages in the Azure portal. Now you can learn
how to deploy them to compute nodes and run them with Batch tasks.

Install pool application packages


To install an application package on all compute nodes in a pool, specify one or more
application package references for the pool. The application packages that you specify for a
pool are installed on each compute node that joins the pool and on any node that is rebooted
or reimaged.

In Batch .NET, specify one or more CloudPool.ApplicationPackageReferences when you create a
new pool or when you use an existing pool. The ApplicationPackageReference class specifies an
application ID and version to install on a pool's compute nodes.

C#

// Create the unbound CloudPool (assumes an authenticated BatchClient named batchClient)
CloudPool myCloudPool =
    batchClient.PoolOperations.CreatePool(
        poolId: "myPool",
        targetDedicatedComputeNodes: 1,
        virtualMachineSize: "standard_d1_v2",
        virtualMachineConfiguration: new VirtualMachineConfiguration(
            imageReference: new ImageReference(
                publisher: "MicrosoftWindowsServer",
                offer: "WindowsServer",
                sku: "2019-datacenter-core",
                version: "latest"),
            nodeAgentSkuId: "batch.node.windows amd64"));

// Specify the application and version to install on the compute nodes
myCloudPool.ApplicationPackageReferences = new List<ApplicationPackageReference>
{
    new ApplicationPackageReference {
        ApplicationId = "litware",
        Version = "1.1001.2b" }
};

// Commit the pool so that it's created in the Batch service. As the nodes join
// the pool, the specified application package is installed on each.
await myCloudPool.CommitAsync();

Important

If an application package deployment fails, the Batch service marks the node unusable
and no tasks are scheduled for execution on that node. If this happens, restart the node to
reinitiate the package deployment. Restarting the node also enables task scheduling again
on the node.

Install task application packages


Similar to a pool, you specify application package references for a task. When a task is
scheduled to run on a node, the package is downloaded and extracted just before the task's
command line runs. If a specified package and version is already installed on the node, the
package isn't downloaded and the existing package is used.

To install a task application package, configure the task's
CloudTask.ApplicationPackageReferences property:

C#

CloudTask task =
    new CloudTask(
        "litwaretask001",
        "cmd /c %AZ_BATCH_APP_PACKAGE_LITWARE%\\litware.exe -args -here");

task.ApplicationPackageReferences = new List<ApplicationPackageReference>
{
    new ApplicationPackageReference
    {
        ApplicationId = "litware",
        Version = "1.1001.2b"
    }
};

Execute the installed applications


The packages that you specify for a pool or task are downloaded and extracted to a named
directory within the AZ_BATCH_ROOT_DIR of the node. Batch also creates an environment variable
that contains the path to the named directory. Your task command lines use this environment
variable when referencing the application on the node.

On Windows nodes, the variable is in the following format:

Windows:
AZ_BATCH_APP_PACKAGE_APPLICATIONID#version

On Linux nodes, the format is slightly different. Periods (.), hyphens (-) and number signs (#) are
flattened to underscores in the environment variable. Also, the case of the application ID is
preserved. For example:

Linux:
AZ_BATCH_APP_PACKAGE_applicationid_version

APPLICATIONID and version are values that correspond to the application and package version
you've specified for deployment. For example, if you specify that version 2.7 of application
blender should be installed on Windows nodes, your task command lines would use this
environment variable to access its files:

Windows:
AZ_BATCH_APP_PACKAGE_BLENDER#2.7

On Linux nodes, the same package is exposed through the flattened form of the variable: the
period in the version becomes an underscore, and the case of the application ID is preserved:

Linux:
AZ_BATCH_APP_PACKAGE_blender_2_7
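
The naming rules above can be expressed as a small helper when you generate task command lines. The following Python sketch is based only on the formats and examples in this article. It assumes that the application ID is upper-cased on Windows, as the BLENDER example suggests, and that leaving out the version produces the unversioned form used for an application's default version. The function name is hypothetical.

```python
import re

PREFIX = "AZ_BATCH_APP_PACKAGE_"


def app_package_env_var(app_id: str, version: str = None, os_family: str = "windows") -> str:
    """Build the environment variable name for an application package."""
    if os_family == "windows":
        # Assumption: the ID is upper-cased, and '#' separates ID and version.
        name = PREFIX + app_id.upper()
        return name if version is None else f"{name}#{version}"
    if os_family == "linux":
        # Case is preserved; periods, hyphens, and number signs become underscores.
        suffix = app_id if version is None else f"{app_id}#{version}"
        return PREFIX + re.sub(r"[.#-]", "_", suffix)
    raise ValueError("os_family must be 'windows' or 'linux'")


print(app_package_env_var("blender", "2.7", "windows"))
print(app_package_env_var("blender", "2.7", "linux"))
```

For version 2.7 of blender, the helper reproduces the two variable names shown above.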

When you upload an application package, you can specify a default version to deploy to your
compute nodes. If you've specified a default version for an application, you can omit the
version suffix when you reference the application. You can specify the default application
version in the Azure portal, in the Applications window, as shown in Upload and manage
applications.

For example, if you set "2.7" as the default version for application blender, and your tasks
reference the following environment variable, then your Windows nodes use version 2.7:

AZ_BATCH_APP_PACKAGE_BLENDER

The following code snippet shows an example task command line that launches the default
version of the blender application:

C#

string taskId = "blendertask01";
string commandLine =
    @"cmd /c %AZ_BATCH_APP_PACKAGE_BLENDER%\blender.exe -args -here";
CloudTask blenderTask = new CloudTask(taskId, commandLine);

Tip

For more information about compute node environment settings, see Environment
settings for tasks.

Update a pool's application packages


If an existing pool has already been configured with an application package, you can specify a
new package for the pool. This means:

The Batch service installs the newly specified package on all new nodes that join the pool
and on any existing node that is rebooted or reimaged.
Compute nodes that are already in the pool when you update the package references
don't automatically install the new application package. These compute nodes must be
rebooted or reimaged to receive the new package.
When a new package is deployed, the created environment variables reflect the new
application package references.

In this example, the existing pool has version 2.7 of the blender application configured as one
of its CloudPool.ApplicationPackageReferences. To update the pool's nodes with version 2.76b,
specify a new ApplicationPackageReference with the new version, and commit the change.

C#

string newVersion = "2.76b";


CloudPool boundPool = await batchClient.PoolOperations.GetPoolAsync("myPool");
boundPool.ApplicationPackageReferences = new List<ApplicationPackageReference>
{
new ApplicationPackageReference {
ApplicationId = "blender",
Version = newVersion }
};
await boundPool.CommitAsync();

Now that the new version has been configured, the Batch service installs version 2.76b to any
new node that joins the pool. To install 2.76b on the nodes that are already in the pool, reboot
or reimage them. Rebooted nodes retain files from previous package deployments.
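
The update behavior can be modeled in a few lines. The following Python sketch is an illustration of the rule described above, not a call into the Batch service: nodes that are rebooted or reimaged pick up the new package reference, and all other existing nodes keep the version they already have. The `Node` class and `apply_pool_update` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    installed_version: str


def apply_pool_update(nodes, new_version, restarted_ids):
    """Return the node list after an update; only restarted nodes change."""
    return [
        Node(n.node_id, new_version if n.node_id in restarted_ids else n.installed_version)
        for n in nodes
    ]


pool = [Node("node-1", "2.7"), Node("node-2", "2.7")]
updated = apply_pool_update(pool, "2.76b", restarted_ids={"node-2"})
versions = [n.installed_version for n in updated]
print(versions)
```

Until every existing node has been rebooted or reimaged, a pool can contain nodes with different package versions.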

List the applications in a Batch account

You can list the applications and their packages in a Batch account by using the
ApplicationOperations.ListApplicationSummaries method.

C#

// List the applications and their application packages in the Batch account.
List<ApplicationSummary> applications = await
    batchClient.ApplicationOperations.ListApplicationSummaries().ToListAsync();
foreach (ApplicationSummary app in applications)
{
    Console.WriteLine("ID: {0} | Display Name: {1}", app.Id, app.DisplayName);

    foreach (string version in app.Versions)
    {
        Console.WriteLine("  {0}", version);
    }
}

Next steps
The Batch REST API also provides support to work with application packages. For
example, see the applicationPackageReferences element for how to specify packages to
install, and Applications for how to obtain application information.
Learn how to programmatically manage Azure Batch accounts and quotas with Batch
Management .NET. The Batch Management .NET library can enable account creation and
deletion features for your Batch application or service.

Creating and using resource files
Article • 02/07/2025

An Azure Batch task often requires some form of data to process. Resource files are the
way to provide this data to your Batch virtual machine (VM) via a task. All types of tasks
support resource files: tasks, start tasks, job preparation tasks, job release tasks, etc. This
article covers a few common methods of how to create resource files and place them on
a VM.

Resource files put data onto a VM in Batch, but the type of data and how it's used is
flexible. There are, however, some common use cases:

Provision common files on each VM using resource files on a start task.
Provision input data to be processed by tasks.

Common files could be, for example, files on a start task used to install applications that
your tasks run. Input data could be raw image or video data, or any information to be
processed by Batch.

Types of resource files


There are a few different options available to generate resource files, each with their
own methods. The creation process for resource files varies depending on where the
original data is stored and whether multiple files should be created.

Storage container URL: Generates resource files from any storage container in
Azure.
Storage container name: Generates resource files from the name of a container in
the Azure storage account linked to your Batch account (the autostorage account).
Single resource file from web endpoint: Generates a single resource file from any
valid HTTP URL.

Storage container URL


Using a storage container URL means, with the correct permissions, you can access files
in any storage container in Azure.

In this C# example, the files have already been uploaded to an Azure storage container
as blob storage. To access the data needed to create a resource file, we first need to get
access to the storage container. This can be done in several ways.

Shared Access Signature
Create a shared access signature (SAS) URI with the correct permissions to access the
storage container. Set the expiration time and permissions for the SAS. In this case, no
start time is specified, so the SAS becomes valid immediately and expires two hours
after it's generated.

C#

SharedAccessBlobPolicy sasConstraints = new SharedAccessBlobPolicy
{
    SharedAccessExpiryTime = DateTime.UtcNow.AddHours(2),
    Permissions = SharedAccessBlobPermissions.Read |
        SharedAccessBlobPermissions.List
};

Note

For container access, you must have both Read and List permissions, whereas with
blob access, you only need Read permission.

Once the permissions are configured, create the SAS token and format the SAS URL for
access to the storage container. Using the formatted SAS URL for the storage container,
generate a resource file with FromStorageContainerUrl.

C#

CloudBlobContainer container =
    blobClient.GetContainerReference(containerName);

string sasToken = container.GetSharedAccessSignature(sasConstraints);
string containerSasUrl = String.Format("{0}{1}", container.Uri, sasToken);

ResourceFile inputFile =
    ResourceFile.FromStorageContainerUrl(containerSasUrl);
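
The URL that results from this step is the container URL followed by the SAS token as a query string. The following Python sketch shows only that string handling and the two-hour expiry calculation. It doesn't generate or sign a real token. The account name, token values, and helper names are hypothetical, and the sketch assumes the token may or may not start with a question mark.

```python
from datetime import datetime, timedelta, timezone


def sas_expiry(hours: int = 2, now: datetime = None) -> str:
    """Return an expiry timestamp, in UTC, the given number of hours from now."""
    now = now or datetime.now(timezone.utc)
    return (now + timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%M:%SZ")


def container_sas_url(container_url: str, sas_token: str) -> str:
    """Join a container URL and a SAS token into one URL with a single '?'."""
    return f"{container_url.rstrip('/')}?{sas_token.lstrip('?')}"


expiry = sas_expiry(2, datetime(2025, 1, 1, 10, 0, tzinfo=timezone.utc))
url = container_sas_url(
    "https://myaccount.blob.core.windows.net/input",  # hypothetical account
    "?sv=placeholder&sig=placeholder",                # not a real token
)
print(expiry)
print(url)
```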

If desired, you can use the blobPrefix parameter to limit downloads to only those blobs
whose name begins with a specified prefix:

C#

ResourceFile inputFile =
    ResourceFile.FromStorageContainerUrl(containerSasUrl, blobPrefix: yourPrefix);
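
To preview which blobs a prefix selects, you can apply the same begins-with rule to a list of blob names. The following Python sketch is a local model of that rule, not a call to Azure Storage or Batch. The blob names and the `select_blobs` function are hypothetical.

```python
def select_blobs(blob_names, blob_prefix=None):
    """Return the blob names that begin with blob_prefix, or all names if no prefix."""
    if not blob_prefix:
        return list(blob_names)
    return [name for name in blob_names if name.startswith(blob_prefix)]


blobs = ["input/frame001.png", "input/frame002.png", "logs/run.txt"]
selected = select_blobs(blobs, "input/")
print(selected)
```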

Managed identity
Create a user-assigned managed identity and assign it the Storage Blob Data Reader
role for your Azure Storage container. Next, assign the managed identity to your pool so
that your VMs can access the identity. Finally, you can access the files in your container
by specifying the identity for Batch to use.

C#

CloudBlobContainer container =
    blobClient.GetContainerReference(containerName);

// FromStorageContainerUrl expects the container URL as a string.
ResourceFile inputFile = ResourceFile.FromStorageContainerUrl(
    container.Uri.AbsoluteUri,
    identityReference: new ComputeNodeIdentityReference()
    {
        ResourceId = "/subscriptions/SUB/resourceGroups/RG/providers/Microsoft.ManagedIdentity/userAssignedIdentities/identity-name"
    });
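
The ResourceId value follows a fixed pattern, so it can be assembled from its parts instead of being typed by hand. The following Python sketch builds the string in the format shown in the example above. SUB, RG, and identity-name are the same placeholders, and the function name is hypothetical.

```python
def user_assigned_identity_id(subscription_id: str, resource_group: str, identity_name: str) -> str:
    """Build the resource ID of a user-assigned managed identity."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.ManagedIdentity"
        f"/userAssignedIdentities/{identity_name}"
    )


resource_id = user_assigned_identity_id("SUB", "RG", "identity-name")
print(resource_id)
```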

Public access
An alternative to generating a SAS URL or using a managed identity is to enable
anonymous, public read-access to a container and its blobs in Azure Blob storage. By
doing so, you can grant read-only access to these resources without sharing your
account key, and without requiring a SAS. Public access is typically used for scenarios
where you want certain blobs to be always available for anonymous read-access. If this
scenario suits your solution, see Configure anonymous public read access for containers
and blobs to learn more about managing access to your blob data.

Storage container name (autostorage)


Instead of configuring and creating a SAS URL, you can use the name of your Azure
storage container to access your blob data. The storage container you use must be in
the Azure storage account that's linked to your Batch account, sometimes referred to as
the autostorage account. Using the autostorage container allows you to bypass
configuring and creating a SAS URL to access a storage container. Instead, you provide
the name of the storage container in your linked storage account.

If you don't have an autostorage account already, see the steps in Create a Batch
account for details on how to create and link a storage account.

The following example uses FromAutoStorageContainer to generate the file from data in the
autostorage account.

C#
ResourceFile inputFile =
ResourceFile.FromAutoStorageContainer(containerName);

As with a storage container URL, you can use the blobPrefix parameter to specify which
blobs will be downloaded:

C#

ResourceFile inputFile =
    ResourceFile.FromAutoStorageContainer(containerName, blobPrefix: yourPrefix);

Single resource file from web endpoint


To create a single resource file, you can specify a valid HTTP URL containing your input
data. The URL is provided to the Batch API, and then the data is used to create a
resource file. This method can be used whether the data to create your resource file is in
Azure Storage, or in any other web location, such as a GitHub endpoint.

The following example uses FromUrl to retrieve the file from a string that contains a
valid URL, then generates a resource file to be used by your task. No credentials are
needed for this scenario. (Credentials are required if using blob storage, unless public
read access is enabled on the blob container.)

C#

ResourceFile inputFile = ResourceFile.FromUrl(yourURL, filePath);

You can also use a string that you define as a URL (or a combination of strings that,
together, create the full URL for your file).

C#

ResourceFile inputFile = ResourceFile.FromUrl(yourDomain + yourFile,
    filePath);
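
When the URL is assembled from separate strings, a missing or doubled slash is an easy mistake. The following Python sketch joins a base location and a file name with the standard library and rejects anything that isn't an HTTP or HTTPS URL. The domain and the `build_file_url` helper are hypothetical.

```python
from urllib.parse import urljoin, urlparse


def build_file_url(base: str, file_name: str) -> str:
    """Join a base URL and a file name, then check that the result is HTTP(S)."""
    base = base if base.endswith("/") else base + "/"
    url = urljoin(base, file_name.lstrip("/"))
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError(f"not an HTTP URL: {url}")
    return url


file_url = build_file_url("https://example.com/data", "input.txt")
print(file_url)
```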

If your file is in Azure Storage, you can use a managed identity instead of generating a
Shared Access Signature for the resource file.

C#

ResourceFile inputFile = ResourceFile.FromUrl(
    yourURLFromAzureStorage,
    filePath: filePath,
    identityReference: new ComputeNodeIdentityReference()
    {
        ResourceId = "/subscriptions/SUB/resourceGroups/RG/providers/Microsoft.ManagedIdentity/userAssignedIdentities/identity-name"
    });

Note

Managed identity authentication will only work with files in Azure Storage. The
managed identity needs the Storage Blob Data Reader role assignment for the
container the file is in, and it must also be assigned to the Batch pool.

Tips and suggestions


Azure Batch tasks can use files in many ways, which is why Batch provides various
options for managing files on tasks. The following scenarios aren't meant to be
comprehensive, but cover a few common situations and provide recommendations.

Many resource files


If common task files are shared among many tasks in your Batch job, you may want to
use an application package to contain those files. Application packages provide
optimization for download speed, and data in application packages is cached between
tasks. With application packages, you don't need to manually manage several resource
files or generate SAS URLs to access the files in Azure Storage. Batch works in the
background with Azure Storage to store and deploy application packages to compute
nodes. If your task files don't change often, application packages may be a good fit for
your solution.

Conversely, if your tasks each have many files unique to that task, resource files are likely
the best option. Tasks that use unique files often need to be updated or replaced, which
is not as easy to do with application package content. Resource files provide additional
flexibility for updating, adding, or editing individual files.

Number of resource files per task


When a task specifies a large number of resource files, Batch might reject the task as
being too large. This depends on the total length of the filenames or URLs (as well as
identity reference) for all the files added to the task. It's best to keep your tasks small by
minimizing the number of resource files on the task itself.

If there's no way to minimize the number of files your task needs, you can optimize the
task by creating a single resource file that references a storage container of resource
files. To do this, put your resource files into an Azure Storage container and use one of
the methods described above to generate resource files as needed.

Next steps
Learn about application packages as an alternative to resource files.
Learn about using containers for resource files.
Learn how to gather and save the output data from your tasks.
Learn about the Batch APIs and tools available for building Batch solutions.



Choose a VM size and image for compute
nodes in an Azure Batch pool
Article • 04/28/2025

When you select a node size for an Azure Batch pool, you can choose from almost all the VM
sizes available in Azure. Azure offers a range of sizes for Linux and Windows VMs for different
workloads.

Supported VM series and sizes

Pools in Virtual Machine configuration


Batch pools in the Virtual Machine configuration support almost all VM sizes available in Azure.
The supported VM sizes in a region can be obtained via the Batch Management API. You can
use one of the following methods to return a list of VM sizes supported by Batch in a region:

PowerShell: Get-AzBatchSupportedVirtualMachineSku
Azure CLI: az batch location list-skus
Batch Management APIs: List Supported Virtual Machine SKUs

For example, using the Azure CLI, you can obtain the list of SKUs for a particular Azure region
with the following command:

Azure CLI

az batch location list-skus --location <azure-region>

 Tip

Avoid VM SKUs/families with impending Batch support end of life (EOL) dates. These dates
can be discovered via the ListSupportedVirtualMachineSkus API, PowerShell, or Azure
CLI. For more information, see the Batch best practices guide regarding Batch pool VM
SKU selection.

Using Generation 2 VM Images


Some VM series, such as FX and Mv2, can only be used with generation 2 VM images.
Generation 2 VM images are specified like any VM image, using the sku property of the
imageReference configuration; the sku strings have a suffix such as -g2 or -gen2 . To get a list
of VM images supported by Batch, including generation 2 images, use the 'list supported
images' API, PowerShell, or Azure CLI.
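
As a rough illustration of the suffix convention described above, the following Python sketch flags sku strings that look like generation 2 images. It is only a heuristic: the text names `-g2` and `-gen2` as examples, not as an exhaustive list, so the supported images list remains the authoritative source. The function name and the sample sku strings are illustrative assumptions.

```python
# Suffixes named in the text above; the list is not exhaustive.
GEN2_SUFFIXES = ("-g2", "-gen2")


def looks_like_gen2_sku(sku: str) -> bool:
    """Heuristic: True if the image sku ends with a known generation 2 suffix."""
    return sku.lower().endswith(GEN2_SUFFIXES)


# Sample sku strings used purely for illustration.
for sku in ("2019-datacenter-core-g2", "22_04-lts-gen2", "2019-datacenter-core"):
    print(sku, looks_like_gen2_sku(sku))
```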

Size considerations
Application requirements - Consider the characteristics and requirements of the
application run on the nodes. Aspects like whether the application is multithreaded and
how much memory it consumes can help determine the most suitable and cost-effective
node size. For multi-instance MPI workloads or CUDA applications, consider specialized
HPC or GPU-enabled VM sizes, respectively. For more information, see Use RDMA-
capable or GPU-enabled instances in Batch pools.

Tasks per node - It's typical to select a node size assuming one task runs on a node at a
time. However, it might be advantageous to have multiple tasks (and therefore multiple
application instances) run in parallel on compute nodes during job execution. In this case,
it's common to choose a multicore node size to accommodate the increased demand of
parallel task execution.

Load levels for different tasks - All of the nodes in a pool are the same size. If you intend
to run applications with differing system requirements and/or load levels, we recommend
that you use separate pools.

Region availability - A VM series or size might not be available in the regions where you
create your Batch accounts. To check that a size is available, see Products available by
region .

Quotas - The core quotas in your Batch account can limit the number of nodes of a
given size you can add to a Batch pool. When needed, you can request a quota increase.

Supported VM images
Use one of the following APIs to return a list of Windows and Linux VM images currently
supported by Batch, including the node agent SKU IDs for each image:

PowerShell: Get-AzBatchSupportedImage
Azure CLI: az batch pool supported-images
Batch Service APIs: List Supported Images

For example, using the Azure CLI, you can obtain the list of supported VM images with the
following command:

Azure CLI
az batch pool supported-images list

Images that have a verificationType of verified undergo regular interoperability validation
testing with the Batch service by the Azure Batch team. The verified designation doesn't
mean that every possible application or usage scenario is validated, but that functionality
exposed by the Batch API, such as executing tasks and mounting a supported virtual filesystem,
is regularly tested as part of release processes. Images that have a verificationType of
unverified don't undergo regular validation testing but were initially verified to boot on Azure
Batch compute nodes and transition to an idle compute node state. Support for unverified
images isn't guaranteed.

 Tip

Avoid images with impending Batch support end of life (EOL) dates. These dates can be
discovered via the ListSupportedImages API, PowerShell, or Azure CLI. For more
information, see the Batch best practices guide regarding Batch pool VM image selection.

 Tip

The value of the AZ_BATCH_NODE_ROOT_DIR compute node environment variable depends on
whether the VM has a local temporary disk. See Batch root directory location for more
information.

Next steps
Learn about the Batch service workflow and primary resources such as pools, nodes, jobs,
and tasks.
Learn about using specialized VM sizes with RDMA-capable or GPU-enabled instances in
Batch pools.

Update Batch pool properties
Article • 04/02/2025

When you create an Azure Batch pool, you specify certain properties that define the
configuration of the pool. Examples include specifying the VM size, VM image to use,
virtual network configuration, and encryption settings. However, you may need to
update pool properties as your workload evolves over time or if a VM image reaches
end-of-life.

Some, but not all, of these pool properties can be patched or updated to accommodate
these situations. This article provides information about updateable pool properties,
expected behaviors for pool property updates, and examples.

 Tip

Some pool properties can only be updated using the Batch Management Plane
APIs or SDKs using Entra authentication. You will need to install or use the
appropriate API or SDK for these operations to be available.

Updateable pool properties


Batch provides multiple methods to update properties on a pool. Selecting which API to
use determines the set of pool properties that can be updated as well as the update
behavior.

Note

If you want to update pool properties that aren't part of the following Update or
Patch APIs, then you must recreate the pool to reflect the desired state.

Management Plane: Pool - Update


The recommended way to update pool properties is the Pool - Update API, which is
part of the Batch Management Plane API or SDK. This API provides the most
comprehensive and flexible way to update pool properties. It lets you update select
pool properties that are exposed only through the Management Plane, as well as other
properties that are immutable via the Data Plane APIs.

Important

You must use API version 2024-07-01 or newer of the Batch Management Plane API
for updating pool properties as described in this section.

Since this operation is a PATCH , only pool properties specified in the request are
updated. If properties aren't specified as part of the request, then the existing values
remain unmodified.

Some properties can be updated only when the pool has no active nodes, that is, when
the total number of compute nodes in the pool is zero. The properties that don't require
the pool to be size zero for the new value to take effect are:

applicationPackages
certificates
metadata
scaleSettings
startTask

If there are active nodes when the pool is updated with these properties, reboot of
active compute nodes may be required for changes to take effect. For more information,
see the documentation for each individual pool property.

All other updateable pool properties require the pool to be of size zero nodes to be
accepted as part of the request to update.
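
The rule above can be turned into a small pre-flight check before sending an update. The following Python sketch is an illustration only: the helper name and the sample request are assumptions, and the check doesn't know whether a property is updateable at all. It only separates the five properties listed above from everything else.

```python
# Properties that, per the list above, can be updated while the pool has nodes.
UPDATABLE_WITH_ACTIVE_NODES = {
    "applicationPackages",
    "certificates",
    "metadata",
    "scaleSettings",
    "startTask",
}


def properties_requiring_empty_pool(requested: dict) -> list[str]:
    """Return the requested property names that need the pool at size zero."""
    return sorted(name for name in requested if name not in UPDATABLE_WITH_ACTIVE_NODES)


# Hypothetical update request used for illustration.
request = {
    "vmSize": "standard_d32ads_v5",
    "metadata": [{"name": "env", "value": "test"}],
}
blocked = properties_requiring_empty_pool(request)
print(blocked)
```

If the returned list isn't empty, resize the pool to zero nodes before sending the update.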

You may also use Pool - Create API to update these select properties, but since the
operation is a PUT , the request fully replaces all existing properties. Therefore, any
property that isn't specified in the request is removed or set with the associated default.
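
The difference between the two operations can be modeled with plain dictionaries. The following Python sketch doesn't call the Batch service. The `DEFAULTS` values and the helper names `apply_patch` and `apply_put` are hypothetical and only mimic the behavior described above.

```python
# Hypothetical defaults, for illustration only; real defaults are documented
# for each pool property.
DEFAULTS = {"metadata": [], "startTask": None}


def apply_patch(existing: dict, request: dict) -> dict:
    """PATCH semantics: only properties present in the request change."""
    updated = dict(existing)
    updated.update(request)
    return updated


def apply_put(request: dict) -> dict:
    """PUT semantics: the request replaces everything; omitted properties
    fall back to their defaults."""
    replaced = dict(DEFAULTS)
    replaced.update(request)
    return replaced


pool = {
    "metadata": [{"name": "team", "value": "render"}],
    "startTask": {"commandLine": "cmd /c echo hello"},
}
request = {"startTask": {"commandLine": "cmd /c echo updated"}}

patched = apply_patch(pool, request)
replaced = apply_put(request)
print(patched["metadata"])   # metadata is preserved by the PATCH
print(replaced["metadata"])  # metadata is reset by the PUT
```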

Example: Update VM Image Specification

The following example shows how to update a pool VM image configuration via the
Management Plane C# SDK:

C#

using System;
using System.Threading.Tasks;
using Azure;
using Azure.Core;
using Azure.Identity;
using Azure.ResourceManager;
using Azure.ResourceManager.Batch;
using Azure.ResourceManager.Batch.Models;

public async Task UpdatePoolVmImage()
{
    // Authenticate with a service principal read from environment variables
    var clientId = Environment.GetEnvironmentVariable("CLIENT_ID");
    var clientSecret = Environment.GetEnvironmentVariable("CLIENT_SECRET");
    var tenantId = Environment.GetEnvironmentVariable("TENANT_ID");
    var subscriptionId = Environment.GetEnvironmentVariable("SUBSCRIPTION_ID");
    ClientSecretCredential credential =
        new ClientSecretCredential(tenantId, clientId, clientSecret);
    ArmClient client = new ArmClient(credential, subscriptionId);

    // Get an existing Batch account
    string resourceGroupName = "<resourcegroup>";
    string accountName = "<batchaccount>";
    ResourceIdentifier batchAccountResourceId =
        BatchAccountResource.CreateResourceIdentifier(
            subscriptionId, resourceGroupName, accountName);
    BatchAccountResource batchAccount =
        client.GetBatchAccountResource(batchAccountResourceId);

    // Get the pool collection of this Batch account
    BatchAccountPoolCollection collection = batchAccount.GetBatchAccountPools();

    // Update the pool with the new VM image reference
    string poolName = "mypool";
    BatchAccountPoolData data = new BatchAccountPoolData()
    {
        DeploymentConfiguration = new BatchDeploymentConfiguration()
        {
            VmConfiguration = new BatchVmConfiguration(
                new BatchImageReference()
                {
                    Publisher = "MicrosoftWindowsServer",
                    Offer = "WindowsServer",
                    Sku = "2022-datacenter-azure-edition-smalldisk",
                    Version = "latest",
                },
                nodeAgentSkuId: "batch.node.windows amd64"),
        },
    };

    // Wait for the long-running operation to complete
    ArmOperation<BatchAccountPoolResource> lro =
        await collection.CreateOrUpdateAsync(WaitUntil.Completed, poolName, data);
    BatchAccountPoolResource result = lro.Value;

    BatchAccountPoolData resourceData = result.Data;
    Console.WriteLine($"Succeeded on id: {resourceData.Id}");
}

Example: Update VM Size and Target Node Communication Mode

The following example shows how to update a pool's VM size and set the target node
communication mode to simplified via the REST API:

HTTP
PATCH https://management.azure.com/subscriptions/<subscriptionid>/resourceGroups/<resourcegroupName>/providers/Microsoft.Batch/batchAccounts/<batchaccountname>/pools/<poolname>?api-version=2024-07-01

Request Body

JSON

{
  "type": "Microsoft.Batch/batchAccounts/pools",
  "parameters": {
    "properties": {
      "vmSize": "standard_d32ads_v5",
      "targetNodeCommunicationMode": "simplified"
    }
  }
}

Data Plane: Pool - Patch or Update Properties


The Data Plane offers the ability to either patch or update select pool properties. The
available APIs are the Pool - Patch API or the Pool - Update Properties API as part of the
Batch Data Plane API or SDK.

The Patch API allows patching of select pool properties as specified in the
documentation such as the startTask . Since this operation is a PATCH , only pool
properties specified in the request are updated. If properties aren't specified as part of
the request, then the existing values remain unmodified.

The Update Properties API allows select update of the pool properties as specified in the
documentation. This request fully replaces the existing properties, therefore any
property that isn't specified in the request is removed.

Compute nodes must be rebooted for changes to take effect for the following
properties:

applicationPackageReferences
certificateReferences
startTask

The pool must be resized to zero active nodes for updates to the
targetNodeCommunicationMode property.

FAQs
Do I need to perform any other operations after updating pool properties while
the pool has active nodes?

Yes, for pool properties that can be updated with active nodes, there are select
properties which require compute nodes to be rebooted. Alternatively, the pool can be
scaled down to zero nodes to reflect the modified properties.

Can I modify the Managed identity collection on the pool while the pool has active
nodes?

Yes, but you shouldn't. While Batch doesn't prohibit changing the collection while the
pool has active nodes, we recommend avoiding it because it leads to inconsistency in the
identity collection if the pool scales out. We recommend updating this property only
when the pool size is zero. For more information, see the Configure managed identities
article.

Next steps
Learn more about available Batch APIs and tools.
Learn how to check pools and nodes for errors.



Create an Azure Batch pool in a virtual
network
Article • 11/19/2024

When you create an Azure Batch pool, you can provision the pool in a subnet of an
Azure Virtual Network that you specify. This article explains how to set up a Batch pool
in a Virtual Network.

Why use a Virtual Network?


Compute nodes in a pool can communicate with each other, such as to run multi-
instance tasks, without requiring a separate Virtual Network. However, by default, nodes
in a pool can't communicate with any virtual machine (VM) that is outside of the pool,
such as license or file servers.

To allow compute nodes to communicate securely with other virtual machines, or with
an on-premises network, you can provision the pool in a subnet of a Virtual Network.

Prerequisites
Authentication. To use an Azure Virtual Network, the Batch client API must use
Microsoft Entra authentication. To learn more, see Authenticate Batch service
solutions with Active Directory.

An Azure Virtual Network. To prepare a Virtual Network with one or more subnets
in advance, you can use the Azure portal, Azure PowerShell, the Microsoft Azure
CLI (CLI), or other methods.

To create an Azure Resource Manager-based Virtual Network, see Create a
virtual network. A Resource Manager-based Virtual Network is recommended
for new deployments, and is supported only on pools that use Virtual Machine
Configuration.

To create a classic Virtual Network, see Create a virtual network (classic) with
multiple subnets. A classic Virtual Network is supported only on pools that use
Cloud Services Configuration.

Important
Avoid using 172.17.0.0/16 for an Azure Batch pool VNet. It is the default range of the
Docker bridge network and may conflict with other networks that you want
to connect to the VNet. Creating a virtual network for an Azure Batch pool
requires careful planning of your network infrastructure.

General virtual network requirements


The Virtual Network must be in the same subscription and region as the Batch
account you use to create your pool.

The subnet specified for the pool must have enough unassigned IP addresses to
accommodate the number of VMs targeted for the pool, that is, the sum of
the targetDedicatedNodes and targetLowPriorityNodes properties of the pool. If
the subnet doesn't have enough unassigned IP addresses, the pool partially
allocates the compute nodes, and a resize error occurs.

If you aren't using Simplified Compute Node Communication, you need to resolve
your Azure Storage endpoints by using any custom DNS servers that serve your
virtual network. Specifically, URLs of the form <account>.table.core.windows.net ,
<account>.queue.core.windows.net , and <account>.blob.core.windows.net should
be resolvable.

Multiple pools can be created in the same virtual network or in the same subnet
(as long as it has sufficient address space). A single pool can't exist across multiple
virtual networks or subnets.
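
The address-space requirement above can be checked before you create or resize a pool. The following Python sketch uses only the standard `ipaddress` module. It assumes that Azure reserves five addresses in each subnet, a figure that comes from general Azure networking behavior rather than from this article. The function name and the `addresses_in_use` parameter are illustrative.

```python
import ipaddress

# Assumption (not stated in this article): Azure reserves five IP addresses
# in every subnet.
AZURE_RESERVED_PER_SUBNET = 5


def subnet_can_hold_pool(
    subnet_cidr: str,
    target_dedicated_nodes: int,
    target_low_priority_nodes: int,
    addresses_in_use: int = 0,
) -> bool:
    """Return True if the subnet has enough unassigned addresses for the pool."""
    subnet = ipaddress.ip_network(subnet_cidr)
    unassigned = subnet.num_addresses - AZURE_RESERVED_PER_SUBNET - addresses_in_use
    return unassigned >= target_dedicated_nodes + target_low_priority_nodes


# A /24 has 256 addresses, so 251 remain after the reserved ones.
print(subnet_can_hold_pool("10.0.1.0/24", 200, 40))
print(subnet_can_hold_pool("10.0.1.0/24", 200, 60))
```

Pass `addresses_in_use` when other pools or VMs already share the subnet.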

Important

Batch pools can be configured in one of two node communication modes. Classic
node communication mode is where the Batch service initiates communication to
the compute nodes. Simplified node communication mode is where the compute
nodes initiate communication to the Batch Service.

Any virtual network or peered virtual network that will be used for Batch pools
should not have IP address ranges that overlap with software-defined networking
or routing on compute nodes. A common source of conflicts is the use of a
container runtime, such as Docker. Docker creates a default network bridge with
a defined subnet range of 172.17.0.0/16 . Any services running within a virtual
network in that default IP address space will conflict with services on the compute
node, such as remote access via SSH.
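
To catch this kind of conflict early, you can test a planned address space against the Docker default bridge range. The following Python sketch uses the standard `ipaddress` module. The function name and the sample address spaces are illustrative assumptions.

```python
import ipaddress

# Default Docker bridge range named in the text above.
DOCKER_DEFAULT_BRIDGE = ipaddress.ip_network("172.17.0.0/16")


def conflicts_with_docker_bridge(address_space: str) -> bool:
    """Return True if the address space overlaps the Docker default bridge."""
    return ipaddress.ip_network(address_space).overlaps(DOCKER_DEFAULT_BRIDGE)


for space in ("10.0.0.0/16", "172.16.0.0/12", "172.17.4.0/24"):
    print(space, conflicts_with_docker_bridge(space))
```

Run the check against every virtual network and peered virtual network that the pool can reach, not only the pool's own subnet.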

Pools in Virtual Machine Configuration
Requirements:

Supported Virtual Networks: Azure Resource Manager-based virtual networks only.


Subnet ID: when specifying the subnet using the Batch APIs, use the resource
identifier of the subnet. The subnet identifier is of the form:

/subscriptions/{subscription}/resourceGroups/{group}/providers/Microsoft.Network/virtualNetworks/{network}/subnets/{subnet}

Permissions: check whether your security policies or locks on the Virtual Network's
subscription or resource group restrict a user's permissions to manage the Virtual
Network.
Networking resources: Batch automatically creates more networking resources in
the resource group containing the Virtual Network.
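
The subnet identifier shown in the requirements above can be assembled from its four parts. The following Python sketch is a plain string helper, not part of any Batch SDK. The function name and the sample values are hypothetical.

```python
def subnet_resource_id(subscription: str, group: str, network: str, subnet: str) -> str:
    """Build the subnet resource identifier in the form shown above."""
    return (
        f"/subscriptions/{subscription}"
        f"/resourceGroups/{group}"
        f"/providers/Microsoft.Network/virtualNetworks/{network}"
        f"/subnets/{subnet}"
    )


# Placeholder values for illustration only.
subnet_id = subnet_resource_id(
    "00000000-0000-0000-0000-000000000000", "my-rg", "my-vnet", "batch-subnet"
)
print(subnet_id)
```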

Important

For each 100 dedicated or low-priority nodes, Batch creates one network security
group (NSG), one public IP address, and one load balancer. These resources are
limited by the subscription's resource quotas. For large pools, you might need to
request a quota increase for one or more of these resources.
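
For quota planning, the ratio above translates into a simple estimate. The following Python sketch assumes that a partial group of 100 nodes still needs a full set of resources (one NSG, one public IP address, one load balancer), which the text doesn't state explicitly. The text also doesn't say whether dedicated and low-priority nodes are counted together or separately, so apply the function to whichever count fits your pool. The function name is illustrative.

```python
import math

# From the text above: one NSG, one public IP address, and one load balancer
# are created for each 100 nodes.
NODES_PER_RESOURCE_SET = 100


def networking_resource_sets(node_count: int) -> int:
    """Estimate how many NSG/public IP/load balancer sets a node count needs.

    Assumption: a partial group of 100 nodes rounds up to a full set.
    """
    return math.ceil(node_count / NODES_PER_RESOURCE_SET)


for nodes in (0, 100, 250):
    print(nodes, networking_resource_sets(nodes))
```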

Network security groups for Virtual Machine Configuration pools: Batch default
Batch creates a network security group (NSG) at the network interface level of each
Virtual Machine Scale Set deployment within a Batch pool. For pools that don't have
public IP addresses under simplified compute node communication, NSGs aren't
created.

To provide the necessary communication between compute nodes and the
Batch service, these NSGs are configured to allow:

Inbound TCP traffic on ports 29876 and 29877 from Batch service IP addresses that
correspond to the BatchNodeManagement.region service tag. This rule is only
created in classic pool communication mode.
Outbound traffic over any protocol on port 443 to Batch service IP addresses that
correspond to the BatchNodeManagement.region service tag.
Outbound traffic on any port to the virtual network. This rule might be amended
per subnet-level NSG rules.
Outbound traffic on any port to the Internet. This rule might be amended per
subnet-level NSG rules.

Note

For pools created using an API version earlier than 2024-07-01 , inbound TCP traffic
on port 22 (Linux nodes) or port 3389 (Windows nodes) is configured to allow
remote access via SSH or RDP on the default ports.

Important

Use caution if you modify or add inbound or outbound rules in Batch-configured
NSGs. If communication to the compute nodes in the specified subnet is denied by
an NSG, the Batch service will set the state of the compute nodes to unusable.
Additionally, no resource locks should be applied to any resource created by Batch,
because this can prevent cleanup of resources as a result of user-initiated actions
such as deleting a pool.

Network security groups for Virtual Machine Configuration pools: Specifying subnet-level rules
If you have an NSG associated with the subnet for Batch compute nodes, you must
configure this NSG with at least the inbound and outbound security rules that are shown
in the following tables.

Warning

Batch service IP addresses can change over time. Therefore, you should use the
BatchNodeManagement.region service tag for the NSG rules indicated in the
following tables. Avoid populating NSG rules with specific Batch service IP
addresses.

Inbound security rules

| Source service tag or IP addresses | Destination ports | Protocol | Pool communication mode | Required |
|---|---|---|---|---|
| BatchNodeManagement.region service tag | 29876-29877 | TCP | Classic | Yes |
| Source IP addresses for remotely accessing compute nodes | 3389 (Windows), 22 (Linux) | TCP | Classic or Simplified | No |

Configure inbound traffic on port 3389 (Windows) or 22 (Linux) only if you need to
permit remote access to the compute nodes from outside sources on default RDP or
SSH ports, respectively. You might need to allow SSH traffic on Linux if you require
support for multi-instance tasks with certain Message Passing Interface (MPI) runtimes
in the subnet containing the Batch compute nodes as traffic may be blocked per
subnet-level NSG rules. MPI traffic is typically over private IP address space, but can vary
between MPI runtimes and runtime configuration. Allowing traffic on these ports isn't
strictly required for the pool compute nodes to be usable. You can also disable default
remote access on these ports through configuring pool endpoints.

Outbound security rules

| Destination service tag | Destination ports | Protocol | Pool communication mode | Required |
|---|---|---|---|---|
| BatchNodeManagement.region service tag | 443 | * | Simplified | Yes |
| Storage.region service tag | 443 | TCP | Classic | Yes |

Outbound access to the BatchNodeManagement.region service tag is required in classic pool
communication mode if you're using Job Manager tasks or if your tasks must
communicate back to the Batch service. For outbound access to BatchNodeManagement.region
in simplified pool communication mode, the Batch service currently only uses the TCP
protocol, but UDP might be required for future compatibility. For pools without public IP
addresses that use simplified communication mode with a node management
private endpoint, an NSG isn't needed. For more information about outbound security
rules for the BatchNodeManagement.region service tag, see Use simplified compute
node communication.
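
The required rows of the two tables above can be captured as data, which is convenient when you generate or audit subnet-level NSG rules. The following Python sketch only models the rules and doesn't create anything in Azure. The `Rule` class, the function name, and the way the region is appended to the service tag are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    direction: str    # "Inbound" or "Outbound"
    service_tag: str  # source tag for inbound, destination tag for outbound
    ports: str
    protocol: str     # "*" means any protocol


def required_rules(mode: str, region: str) -> list[Rule]:
    """Return the rules marked as required in the tables above for a mode."""
    if mode == "classic":
        return [
            Rule("Inbound", f"BatchNodeManagement.{region}", "29876-29877", "TCP"),
            Rule("Outbound", f"Storage.{region}", "443", "TCP"),
        ]
    if mode == "simplified":
        return [Rule("Outbound", f"BatchNodeManagement.{region}", "443", "*")]
    raise ValueError(f"unknown pool communication mode: {mode}")


for rule in required_rules("simplified", "region"):
    print(rule)
```

Optional rules, such as remote access on ports 3389 or 22, are left out on purpose because the tables mark them as not required.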

Create a pool with a Virtual Network in the
Azure portal
After you've created your Virtual Network and assigned a subnet to it, you can create a
Batch pool with that Virtual Network. Follow these steps to create a pool from the Azure
portal:

1. Search for and select Batch accounts in the search bar at the top of the Azure
portal. Select your Batch account. This account must be in the same subscription
and region as the resource group containing the Virtual Network you intend to
use.

2. Select Pools from the left navigation.

3. On the Pools window, select Add.

4. On the Add Pool page, select the options and enter the information for your pool,
including Node size, Target dedicated nodes, Target Spot/low-priority nodes, and
any desired optional settings. For more information on creating pools for your
Batch account, see Create a pool of compute nodes.

5. In Virtual Network, select the virtual network and subnet you wish to use.

6. Select OK to create your pool.

Important
If you try to delete a subnet that is being used by a pool, you will get an error
message. All pools using a subnet must be deleted before you delete that subnet.

User-defined routes for forced tunneling


You might have requirements in your organization to redirect (force) internet-bound
traffic from the subnet back to your on-premises location for inspection and logging.
Additionally, you might have enabled forced tunneling for the subnets in your Virtual
Network.

To ensure that the nodes in your pool work in a Virtual Network that has forced
tunneling enabled, you must add the following user-defined routes (UDR) for that
subnet.

For classic communication mode pools:

The Batch service needs to communicate with nodes for scheduling tasks. To
enable this communication, add a UDR corresponding to the
BatchNodeManagement.region service tag in the region where your Batch account
exists. Set the Next hop type to Internet.

Ensure that your on-premises network isn't blocking outbound TCP traffic to Azure
Storage on destination port 443 (specifically, URLs of the form
*.table.core.windows.net , *.queue.core.windows.net , and
*.blob.core.windows.net ).

For simplified communication mode pools without using node management private
endpoint:

Ensure that your on-premises network isn't blocking outbound TCP/UDP traffic to
the Azure Batch BatchNodeManagement.region service tag on destination port
443. Currently only TCP protocol is used, but UDP might be required for future
compatibility.

For all pools:

If you use virtual file mounts, review the networking requirements, and ensure that
no required traffic is blocked.

Warning
Batch service IP addresses can change over time. To prevent outages due to Batch
service IP address changes, do not directly specify IP addresses. Instead use the
BatchNodeManagement.region service tag.

Next steps
Batch service workflow and resources
Tutorial: Route network traffic with a route table using the Azure portal



Use simplified compute node
communication
Article • 03/27/2025

An Azure Batch pool contains one or more compute nodes that execute user-specified
workloads in the form of Batch tasks. To enable Batch functionality and Batch pool
infrastructure management, compute nodes must communicate with the Azure Batch
service.

Batch supports two types of communication modes:

Classic: the Batch service initiates communication with the compute nodes.
Simplified: the compute nodes initiate communication with the Batch service.

This article describes the simplified communication mode and the associated network
configuration requirements.

 Tip

Information in this document pertaining to networking resources and rules such as
NSGs doesn't apply to Batch pools with no public IP addresses that use the node
management private endpoint without internet outbound access.

Warning

The classic compute node communication mode will be retired on 31 March 2026
and replaced with the simplified communication mode described in this document.

Supported regions
Simplified compute node communication in Azure Batch is currently available for the
following regions:

Public: all public regions where Batch is present except for West India.
Government: USGov Arizona, USGov Virginia, USGov Texas.
China: all China regions where Batch is present except for China North 1 and China
East 1.

Differences between classic and simplified modes
The simplified compute node communication mode streamlines the way Batch pool
infrastructure is managed on behalf of users. This communication mode reduces the
complexity and scope of inbound and outbound networking connections required in
baseline operations.

Batch pools with the classic communication mode require the following networking
rules in network security groups (NSGs), user-defined routes (UDRs), and firewalls when
creating a pool in a virtual network:

Inbound:
  Destination ports 29876, 29877 over TCP from BatchNodeManagement.<region>

Outbound:
  Destination port 443 over TCP to Storage.<region>
  Destination port 443 over TCP to BatchNodeManagement.<region> for certain
  workloads that require communication back to the Batch Service, such as Job
  Manager tasks

Batch pools with the simplified communication mode only need outbound access to the
Batch account's node management endpoint (see Batch account public endpoints). They
require the following networking rules in NSGs, UDRs, and firewalls:

Inbound:
  None

Outbound:
  Destination port 443 over ANY to BatchNodeManagement.<region>

Outbound requirements for a Batch account can be discovered using the List Outbound
Network Dependencies Endpoints API. This API reports the base set of dependencies,
depending upon the Batch account pool communication mode. User-specific workloads
might need extra rules such as opening traffic to other Azure resources (such as Azure
Storage for Application Packages, Azure Container Registry) or endpoints like the
Microsoft package repository for virtual file system mounting functionality.

Benefits of simplified mode


Azure Batch users utilizing the simplified mode benefit from simplification of networking
connections and rules. Simplified compute node communication helps reduce security
risks by removing the requirement to open ports for inbound communication from the
internet. Only a single outbound rule to a well-known Service Tag is required for
baseline operation.

The simplified mode also provides finer-grained data exfiltration control than the
classic communication mode because outbound communication to Storage.<region> is no
longer required. You can explicitly lock down outbound communication to Azure
Storage if necessary for your workflow. For example, you can scope your outbound
communication rules to Azure Storage to enable your AppPackage storage accounts or
other storage accounts for resource files or output files.

Even if your workloads aren't currently impacted by the changes (as described in the
following section), it's recommended to move to the simplified mode. Future
improvements in the Batch service might only be functional with simplified compute
node communication.

Potential impact between classic and simplified communication modes
In many cases, the simplified communication mode doesn't directly affect your Batch
workloads. However, simplified compute node communication has an impact for the
following cases:

Users who specify a virtual network as part of creating a Batch pool and do one or
both of the following actions:
Explicitly disable outbound network traffic rules that are incompatible with
simplified compute node communication.
Use UDRs and firewall rules that are incompatible with simplified compute node
communication.
Users who enable software firewalls on compute nodes and explicitly disable
outbound traffic in software firewall rules that are incompatible with simplified
compute node communication.

If either of these cases applies to you, then follow the steps outlined in the next section
to ensure that your Batch workloads can still function in simplified mode. It's strongly
recommended that you test and verify all of your changes in a dev and test environment
first before pushing your changes into production.

Required network configuration changes for simplified mode
The following steps are required to migrate to the new communication mode:

1. Ensure your networking configuration as applicable to Batch pools (NSGs, UDRs,
firewalls, etc.) includes a union of the modes, that is, the combined network rules
of both classic and simplified modes. At a minimum, these rules would be:

Inbound:
Destination ports 29876, 29877 over TCP from BatchNodeManagement.<region>

Outbound:
Destination port 443 over TCP to Storage.<region>
Destination port 443 over ANY to BatchNodeManagement.<region>

2. If you have any other inbound or outbound scenarios required by your workflow,
you need to ensure that your rules reflect these requirements.
3. Use one of the following options to update your workloads to use the new
communication mode.

Create new pools with the targetNodeCommunicationMode set to simplified and
validate that the new pools are working correctly. Migrate your workload to
the new pools and delete any earlier pools.
Update the targetNodeCommunicationMode property of existing pools to simplified,
then resize all existing pools to zero nodes and scale back out.

4. Use the Get Pool API, List Pool API, or the Azure portal to confirm the
currentNodeCommunicationMode is set to the desired communication mode of
simplified.
5. Modify all applicable networking configuration to the simplified communication
rules, at the minimum (note any extra rules needed as discussed above):

Inbound:
None
Outbound:
Destination port 443 over ANY to BatchNodeManagement.<region>
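The rule sets in steps 1 and 5 can be compared mechanically to see which rules become removable once migration is confirmed. The following Python sketch is illustrative only: the `Rule` tuple and the `removable_after_migration` helper are hypothetical names, not part of any Azure SDK, and the rule values are copied from the steps above. Keep the outbound Storage rule if your workflow still needs it (for example, for resource files or output files).

```python
from typing import NamedTuple


class Rule(NamedTuple):
    direction: str    # "inbound" or "outbound"
    port: int         # destination port
    protocol: str     # "TCP" or "ANY"
    service_tag: str  # service tag named in the steps above


BATCH_TAG = "BatchNodeManagement.<region>"
STORAGE_TAG = "Storage.<region>"

# Step 1: combined rules that keep both modes working during migration.
MIGRATION_RULES = frozenset({
    Rule("inbound", 29876, "TCP", BATCH_TAG),
    Rule("inbound", 29877, "TCP", BATCH_TAG),
    Rule("outbound", 443, "TCP", STORAGE_TAG),
    Rule("outbound", 443, "ANY", BATCH_TAG),
})

# Step 5: minimum rules once every pool reports simplified mode.
SIMPLIFIED_RULES = frozenset({Rule("outbound", 443, "ANY", BATCH_TAG)})


def removable_after_migration(current, target=SIMPLIFIED_RULES):
    """Return the rules in `current` that the target mode no longer needs."""
    return sorted(set(current) - set(target))


for rule in removable_after_migration(MIGRATION_RULES):
    print(rule)
```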

If you follow these steps, but later want to switch back to classic compute node
communication, you need to take the following actions:

1. Revert any networking configuration operating exclusively in simplified compute
node communication mode.
2. Create new pools, or update existing pools, with the targetNodeCommunicationMode
property set to classic.
3. Migrate your workload to these pools, or resize existing pools and scale back out
(see step 3 above).
4. See step 4 above to confirm that your pools are operating in classic
communication mode.
5. Optionally restore your networking configuration.

Specify the communication mode on a Batch pool
The targetNodeCommunicationMode property on Batch pools allows you to indicate a
preference to the Batch service for which communication mode to utilize between the
Batch service and compute nodes. The following are the allowable options on this
property:

Classic: creates the pool using classic compute node communication.
Simplified: creates the pool using simplified compute node communication.
Default: allows the Batch service to select the appropriate compute node
communication mode. For pools without a virtual network, the pool may be
created in either classic or simplified mode. For pools with a virtual network, the
pool always defaults to classic until 30 September 2024. For more information, see
the classic compute node communication mode migration guide.

Tip

Specifying the target node communication mode indicates a preference for the
Batch service, but doesn't guarantee that it will be honored. Certain configurations
on the pool might prevent the Batch service from honoring the specified target
node communication mode, such as interaction with no public IP address, virtual
networks, and the pool configuration type.

The following are examples of how to create a Batch pool with simplified compute node
communication.

Azure portal
First, sign in to the Azure portal. Then, navigate to the Pools blade of your Batch
account and select the Add button. Under OPTIONAL SETTINGS, you can select
Simplified as an option from the pull-down of Node communication mode.

To update an existing pool to simplified communication mode, navigate to the Pools
blade of your Batch account and select the pool to update. On the left-side navigation,
select Node communication mode. There you can select a new target node
communication mode. After selecting the appropriate communication
mode, select the Save button to update. You need to scale the pool down to zero nodes
first, and then back out for the change to take effect, if conditions allow.
To display the current node communication mode for a pool, navigate to the Pools
blade of your Batch account, and select the pool to view. Select Properties on the left-
side navigation and the pool node communication mode appears under the General
section.
REST API
This example shows how to use the Batch Service REST API to create a pool with
simplified compute node communication.

HTTP

POST {batchURL}/pools?api-version=2022-10-01.16.0
client-request-id: 00000000-0000-0000-0000-000000000000

Request body

JSON

{
    "id": "pool-simplified",
    "vmSize": "standard_d2s_v3",
    "virtualMachineConfiguration": {
        "imageReference": {
            "publisher": "Canonical",
            "offer": "0001-com-ubuntu-server-jammy",
            "sku": "22_04-lts"
        },
        "nodeAgentSKUId": "batch.node.ubuntu 22.04"
    },
    "resizeTimeout": "PT15M",
    "targetDedicatedNodes": 2,
    "targetLowPriorityNodes": 0,
    "taskSlotsPerNode": 1,
    "taskSchedulingPolicy": {
        "nodeFillType": "spread"
    },
    "enableAutoScale": false,
    "enableInterNodeCommunication": false,
    "targetNodeCommunicationMode": "simplified"
}
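
As a rough illustration, the request body can also be assembled in code and the migration check from step 4 expressed as a comparison of the two mode properties. This Python sketch makes no network calls. The helper names `build_pool_body` and `migration_complete` are hypothetical, the body is assumed to be a flat pool object, and the lowercase mode strings are assumed from the JSON example above.

```python
import json

ALLOWED_MODES = {"classic", "simplified", "default"}


def build_pool_body(pool_id: str, mode: str = "simplified") -> dict:
    """Build a pool creation body mirroring the JSON example above."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported communication mode: {mode!r}")
    return {
        "id": pool_id,
        "vmSize": "standard_d2s_v3",
        "virtualMachineConfiguration": {
            "imageReference": {
                "publisher": "Canonical",
                "offer": "0001-com-ubuntu-server-jammy",
                "sku": "22_04-lts",
            },
            "nodeAgentSKUId": "batch.node.ubuntu 22.04",
        },
        "targetDedicatedNodes": 2,
        "targetNodeCommunicationMode": mode,
    }


def migration_complete(pool: dict) -> bool:
    """True when a Get Pool result reports the requested mode as current."""
    current = pool.get("currentNodeCommunicationMode")
    return current is not None and current == pool.get("targetNodeCommunicationMode")


body = build_pool_body("pool-simplified")
print(json.dumps(body, indent=2))
```

Because the target mode is only a preference, comparing it against currentNodeCommunicationMode after the pool has scaled is what confirms the mode actually in use.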

Limitations
The following are known limitations of the simplified communication mode:

Limited migration support for previously created pools without public IP addresses.
These pools can only be migrated if created in a virtual network, otherwise they
won't use simplified compute node communication, even if specified on the pool.
Cloud Service Configuration pools are not supported for simplified compute node
communication and are deprecated. Specifying a communication mode for
these types of pools isn't honored and always results in classic communication
mode. We recommend using Virtual Machine Configuration for your Batch pools.

Next steps
Learn how to use private endpoints with Batch accounts.
Learn more about pools in virtual networks.
Learn how to create a pool with specified public IP addresses.
Learn how to create a pool without public IP addresses.
Learn how to configure public network access for Batch accounts.



Create a formula to automatically scale compute nodes in a Batch pool
06/05/2025

Azure Batch can automatically scale pools based on parameters that you define, saving you
time and money. With automatic scaling, Batch dynamically adds nodes to a pool as task
demands increase, and removes compute nodes as task demands decrease.

To enable automatic scaling on a pool of compute nodes, you associate the pool with an
autoscale formula that you define. The Batch service uses the autoscale formula to determine
how many nodes are needed to execute your workload. These nodes can be dedicated nodes
or Azure Spot nodes. Batch periodically reviews service metrics data and uses it to adjust the
number of nodes in the pool based on your formula and at an interval that you define.

You can enable automatic scaling when you create a pool, or apply it to an existing pool. Batch
lets you evaluate your formulas before assigning them to pools and to monitor the status of
automatic scaling runs. Once you configure a pool with automatic scaling, you can make
changes to the formula later.

Important

When you create a Batch account, you can specify the pool allocation mode, which
determines whether pools are allocated in a Batch service subscription (the default) or in
your user subscription. If you created your Batch account with the default Batch service
configuration, then your account is limited to a maximum number of cores that can be
used for processing. The Batch service scales compute nodes only up to that core limit.
For this reason, the Batch service might not reach the target number of compute nodes
specified by an autoscale formula. To learn how to view and increase your account quotas,
see Quotas and limits for the Azure Batch service.

If you created your account with user subscription mode, then your account shares in the
core quota for the subscription. For more information, see Virtual Machines limits in
Azure subscription and service limits, quotas, and constraints.

Autoscale formulas
An autoscale formula is a string value that you define that contains one or more statements.
The autoscale formula is assigned to a pool's autoScaleFormula element (Batch REST) or
CloudPool.AutoScaleFormula property (Batch .NET). The Batch service uses your formula to
determine the target number of compute nodes in the pool for the next interval of processing.
The formula string can't exceed 8 KB, can include up to 100 statements that are separated by
semicolons, and can include line breaks and comments.

You can think of automatic scaling formulas as a Batch autoscale "language." Formula
statements are free-formed expressions that can include both service-defined variables, which
are defined by the Batch service, and user-defined variables. Formulas can perform various
operations on these values by using built-in types, operators, and functions. For example, a
statement might take the following form:

$myNewVariable = function($ServiceDefinedVariable, $myCustomVariable);

Formulas generally contain multiple statements that perform operations on values that are
obtained in previous statements. For example, first you obtain a value for variable1 , then pass
it to a function to populate variable2 :

$variable1 = function1($ServiceDefinedVariable);
$variable2 = function2($OtherServiceDefinedVariable, $variable1);

Include these statements in your autoscale formula to arrive at a target number of compute
nodes. Dedicated nodes and Spot nodes each have their own target settings. An autoscale
formula can include a target value for dedicated nodes, a target value for Spot nodes, or both.

The target number of nodes might be higher, lower, or the same as the current number of
nodes of that type in the pool. Batch evaluates a pool's autoscale formula at specific automatic
scaling intervals. Batch adjusts the target number of each type of node in the pool to the
number that your autoscale formula specifies at the time of evaluation.

Sample autoscale formulas


The following examples show two autoscale formulas, which can be adjusted to work for most
scenarios. The variables startingNumberOfVMs and maxNumberofVMs in the example formulas can
be adjusted to your needs.

Pending tasks

With this autoscale formula, the pool is initially created with a single VM. The $PendingTasks
metric defines the number of tasks that are running or queued. The formula finds the average
number of pending tasks in the last 15 minutes and sets the $TargetDedicatedNodes variable
accordingly. The formula ensures that the target number of dedicated nodes never exceeds 25
VMs. As new tasks are submitted, the pool automatically grows. As tasks complete, VMs
become free and the autoscaling formula shrinks the pool.

This formula scales dedicated nodes, but can be modified to apply to scale Spot nodes as well.

startingNumberOfVMs = 1;
maxNumberofVMs = 25;
pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(TimeInterval_Minute * 15);
pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(TimeInterval_Minute * 15));
$TargetDedicatedNodes = min(maxNumberofVMs, pendingTaskSamples);
$NodeDeallocationOption = taskcompletion;
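
To see how this formula behaves for different inputs, its arithmetic can be mirrored outside Batch. The Python sketch below is a local approximation, not the Batch evaluator: `pending_tasks_target` is a hypothetical helper, and the sample list and sample percentage are passed in directly rather than read from $PendingTasks.

```python
from statistics import fmean


def pending_tasks_target(samples, sample_percent, starting_vms=1, max_vms=25):
    """Mirror the pending-tasks formula: fall back to the starting size when
    fewer than 70 percent of samples are available, otherwise use the average
    of the samples, capped at the maximum number of VMs."""
    pending = starting_vms if sample_percent < 70 else fmean(samples)
    return min(max_vms, pending)


print(pending_tasks_target([10, 20, 30], sample_percent=100))  # average below the cap
print(pending_tasks_target([100] * 30, sample_percent=100))    # capped at max_vms
print(pending_tasks_target([100] * 5, sample_percent=50))      # too few samples
```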

Important

Currently, the Batch service has limitations with the resolution of pending tasks. When a
task is added to the job, it's also added to an internal queue that the Batch service uses for
scheduling. If the task is deleted before it can be scheduled, the task might persist within
the queue, causing it to still be counted in $PendingTasks. This deleted task is eventually
cleared from the queue when Batch gets a chance to pull tasks from the queue to
schedule on idle nodes in the Batch pool.

Preempted nodes
This example creates a pool that starts with 25 Spot nodes. Every time a Spot node is
preempted, it's replaced with a dedicated node. As with the first example, the maxNumberofVMs
variable prevents the pool from exceeding 25 VMs. This example is useful for taking advantage
of Spot VMs while also ensuring that only a fixed number of preemptions occur for the lifetime
of the pool.

maxNumberofVMs = 25;
$TargetDedicatedNodes = min(maxNumberofVMs, $PreemptedNodeCount.GetSample(180 * TimeInterval_Second));
$TargetLowPriorityNodes = min(maxNumberofVMs, maxNumberofVMs - $TargetDedicatedNodes);
$NodeDeallocationOption = taskcompletion;
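
The split between dedicated and Spot targets in this formula can be checked with a small local model. This Python sketch is an approximation under stated assumptions: `preempted_targets` is a hypothetical helper, the preempted-node samples are supplied as a plain list, and min() over a scalar and a vector is taken to mean the minimum of all values, as described for doubleVecList arguments later in this article.

```python
def preempted_targets(preempted_samples, max_vms=25):
    """Mirror the preempted-nodes formula and return
    (dedicated_target, low_priority_target)."""
    # min(maxNumberofVMs, <vector>) takes the smallest of all values.
    dedicated = min([max_vms, *preempted_samples])
    low_priority = min(max_vms, max_vms - dedicated)
    return dedicated, low_priority


print(preempted_targets([0, 0, 0]))  # no preemptions: all Spot nodes
print(preempted_targets([3, 3, 3]))  # three preempted nodes replaced
```

In this model the two targets always sum to the configured maximum, which matches the stated goal of keeping the pool at 25 VMs.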
You'll learn more about how to create autoscale formulas and see more example autoscale
formulas later in this article.

Variables
You can use both service-defined and user-defined variables in your autoscale formulas.

The service-defined variables are built in to the Batch service. Some service-defined variables
are read-write, and some are read-only.

User-defined variables are variables that you define. In the previous example,
$TargetDedicatedNodes and $PendingTasks are service-defined variables, while
startingNumberOfVMs and maxNumberofVMs are user-defined variables.

Note

Service-defined variables are always preceded by a dollar sign ($). For user-defined
variables, the dollar sign is optional.

The following tables show the read-write and read-only variables defined by the Batch service.

Read-write service-defined variables


You can get and set the values of these service-defined variables to manage the number of
compute nodes in a pool.


Variable Description

$TargetDedicatedNodes The target number of dedicated compute nodes for the pool. Specified as a
target because a pool might not always achieve the desired number of
nodes. For example, if the target number of dedicated nodes is modified by
an autoscale evaluation before the pool has reached the initial target, the
pool might not reach the target.

A pool in an account created in Batch service mode might not achieve its
target if the target exceeds a Batch account node or core quota. A pool in an
account created in user subscription mode might not achieve its target if the
target exceeds the shared core quota for the subscription.

$TargetLowPriorityNodes The target number of Spot compute nodes for the pool. Specified as a target
because a pool might not always achieve the desired number of nodes. For
example, if the target number of Spot nodes is modified by an autoscale
evaluation before the pool has reached the initial target, the pool might not
reach the target. A pool might also not achieve its target if the target
exceeds a Batch account node or core quota.

For more information on Spot compute nodes, see Use Spot VMs with Batch.

$NodeDeallocationOption The action that occurs when compute nodes are removed from a pool.
Possible values are:
- requeue: The default value. Ends tasks immediately and puts them back on
the job queue so that they're rescheduled. This action ensures the target
number of nodes is reached as quickly as possible. However, it might be less
efficient, because any running tasks are interrupted and then must be
restarted.
- terminate: Ends tasks immediately and removes them from the job queue.
- taskcompletion: Waits for currently running tasks to finish and then
removes the node from the pool. Use this option to avoid tasks being
interrupted and requeued, wasting any work the task has done.
- retaineddata: Waits for all the local task-retained data on the node to be
cleaned up before removing the node from the pool.

Note

The $TargetDedicatedNodes variable can also be specified using the alias
$TargetDedicated. Similarly, the $TargetLowPriorityNodes variable can be specified using
the alias $TargetLowPriority. If both the fully named variable and its alias are set by the
formula, the value assigned to the fully named variable takes precedence.
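
The precedence rule for aliases can be stated as a small lookup. The Python sketch below only models the rule described in the note. `resolve_targets` is a hypothetical helper, and the input dictionary stands in for the assignments a formula makes.

```python
# Alias -> fully named variable, as listed in the note above.
ALIASES = {
    "$TargetDedicated": "$TargetDedicatedNodes",
    "$TargetLowPriority": "$TargetLowPriorityNodes",
}


def resolve_targets(assignments: dict) -> dict:
    """Return the effective targets; the fully named variable wins
    when both it and its alias were assigned."""
    resolved = {}
    for alias, full_name in ALIASES.items():
        if full_name in assignments:
            resolved[full_name] = assignments[full_name]
        elif alias in assignments:
            resolved[full_name] = assignments[alias]
    return resolved


print(resolve_targets({"$TargetDedicated": 5, "$TargetDedicatedNodes": 8}))
```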

Read-only service-defined variables


You can get the value of these service-defined variables to make adjustments that are based on
metrics from the Batch service.

Important

Job release tasks aren't currently included in variables that provide task counts, such as
$ActiveTasks and $PendingTasks. Depending on your autoscale formula, this can result in
nodes being removed with no nodes available to run job release tasks.

Tip

These read-only service-defined variables are objects that provide various methods to
access data associated with each. For more information, see Obtain sample data later in
this article.


Variable Description

$CPUPercent The average percentage of CPU usage.

$ActiveTasks The number of tasks that are ready to execute but aren't yet executing. This
includes all tasks that are in the active state and whose dependencies have
been satisfied. Any tasks that are in the active state but whose dependencies
haven't been satisfied are excluded from the $ActiveTasks count. For a
multi-instance task, $ActiveTasks includes the number of instances set on
the task.

$RunningTasks The number of tasks in a running state.

$PendingTasks The sum of $ActiveTasks and $RunningTasks .

$SucceededTasks The number of tasks that finished successfully.

$FailedTasks The number of tasks that failed.

$TaskSlotsPerNode The number of task slots that can be used to run concurrent tasks on a
single compute node in the pool.

$CurrentDedicatedNodes The current number of dedicated compute nodes.

$CurrentLowPriorityNodes The current number of Spot compute nodes, including any nodes that have
been preempted.

$UsableNodeCount The number of usable compute nodes.

$PreemptedNodeCount The number of nodes in the pool that are in a preempted state.

7 Note

Use $RunningTasks when scaling based on the number of tasks running at a point in time,
and $ActiveTasks when scaling based on the number of tasks that are queued up to run.

Types
Autoscale formulas support the following types:
double
doubleVec
doubleVecList
string
timestamp--a compound structure that contains the following members:
year
month (1-12)
day (1-31)
weekday (in the format of number; for example, 1 for Monday)
hour (in 24-hour number format; for example, 13 means 1 PM)
minute (00-59)
second (00-59)
timeinterval
TimeInterval_Zero
TimeInterval_100ns
TimeInterval_Microsecond
TimeInterval_Millisecond
TimeInterval_Second
TimeInterval_Minute
TimeInterval_Hour
TimeInterval_Day
TimeInterval_Week
TimeInterval_Year

Operations
These operations are allowed on the types that are listed in the previous section.


Operation Supported operators Result type

double operator double +, -, *, / double

double operator timeinterval * timeinterval

doubleVec operator double +, -, *, / doubleVec

doubleVec operator doubleVec +, -, *, / doubleVec

timeinterval operator double *, / timeinterval

timeinterval operator timeinterval +, - timeinterval


timeinterval operator timestamp + timestamp

timestamp operator timeinterval + timestamp

timestamp operator timestamp - timeinterval

operator double -, ! double

operator timeinterval - timeinterval

double operator double <, <=, ==, >=, >, != double

string operator string <, <=, ==, >=, >, != double

timestamp operator timestamp <, <=, ==, >=, >, != double

timeinterval operator timeinterval <, <=, ==, >=, >, != double

double operator double &&, || double

Testing a double with a ternary operator ( double ? statement1 : statement2 ) treats a
nonzero value as true and zero as false.

Functions
You can use these predefined functions when defining an autoscale formula.


Function Return type Description

avg(doubleVecList) double Returns the average value for all values in the doubleVecList.

ceil(double) double Returns the smallest integer value not less than the double.

ceil(doubleVecList) doubleVec Returns the component-wise ceil of the doubleVecList.

floor(double) double Returns the largest integer value not greater than the double.

floor(doubleVecList) doubleVec Returns the component-wise floor of the doubleVecList.

len(doubleVecList) double Returns the length of the vector that is created from the
doubleVecList.

lg(double) double Returns the log base 2 of the double.

lg(doubleVecList) doubleVec Returns the component-wise lg of the doubleVecList.



ln(double) double Returns the natural log of the double.

ln(doubleVecList) doubleVec Returns the component-wise ln of the doubleVecList.

log(double) double Returns the log base 10 of the double.

log(doubleVecList) doubleVec Returns the component-wise log of the doubleVecList.

max(doubleVecList) double Returns the maximum value in the doubleVecList.

min(doubleVecList) double Returns the minimum value in the doubleVecList.

norm(doubleVecList) double Returns the two-norm of the vector that is created from the
doubleVecList.

percentile(doubleVec v, double p) double Returns the percentile element of the vector v.

rand() double Returns a random value between 0.0 and 1.0.

range(doubleVecList) double Returns the difference between the min and max values in the
doubleVecList.

round(double) double Returns the nearest integer value to the double (in floating-
point format), rounding halfway cases away from zero.

round(doubleVecList) doubleVec Returns the component-wise round of the doubleVecList.

std(doubleVecList) double Returns the sample standard deviation of the values in the
doubleVecList.

stop() Stops evaluation of the autoscaling expression.

sum(doubleVecList) double Returns the sum of all the components of the doubleVecList.

time(string dateTime="") timestamp Returns the time stamp of the current time if no parameters are
passed, or the time stamp of the dateTime string if that is
passed. Supported dateTime formats are W3C-DTF and RFC
1123.

val(doubleVec v, double i) double Returns the value of the element that is at location i in vector v,
with a starting index of zero.
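
When prototyping a formula locally, a few of these functions need care because common library defaults differ from the descriptions above. The Python sketch below gives approximate equivalents for round(), std(), norm(), and range(). The helper names are hypothetical, and the assumption that std() divides by n - 1 is based only on the phrase "sample standard deviation" in the table.

```python
import math
from statistics import stdev


def round_half_away(x: float) -> float:
    """Round to the nearest integer, halfway cases away from zero.
    Python's built-in round() sends halfway cases to the even neighbor."""
    return math.copysign(math.floor(abs(x) + 0.5), x)


def two_norm(values) -> float:
    """Two-norm (Euclidean length) of a vector."""
    return math.sqrt(sum(v * v for v in values))


def value_range(values) -> float:
    """Difference between the max and min values."""
    return max(values) - min(values)


print(round_half_away(2.5), round(2.5))   # the two rounding rules differ here
print(two_norm([3, 4]))
print(value_range([2, 9, 4]))
print(stdev([2, 4, 4, 4, 5, 5, 7, 9]))    # sample standard deviation (n - 1)
```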

Some of the functions that are described in the previous table can accept a list as an argument.
The comma-separated list is any combination of double and doubleVec. For example:

doubleVecList := ( (double | doubleVec)+(, (double | doubleVec) )* )?


The doubleVecList value is converted to a single doubleVec before evaluation. For example, if
v = [1,2,3], then calling avg(v) is equivalent to calling avg(1,2,3). Calling avg(v, 7) is
equivalent to calling avg(1,2,3,7).
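
The flattening rule can be sketched in a few lines of Python. This is a local model of the behavior described above, not Batch code. The `flatten` and `avg` helpers are hypothetical names, and Python lists stand in for doubleVec values.

```python
def flatten(*args):
    """Convert a mix of scalars and vectors into one flat list of floats,
    the way a doubleVecList becomes a single doubleVec."""
    flat = []
    for arg in args:
        if isinstance(arg, (list, tuple)):
            flat.extend(float(x) for x in arg)
        else:
            flat.append(float(arg))
    return flat


def avg(*args):
    """Average over the flattened argument list."""
    values = flatten(*args)
    return sum(values) / len(values)


v = [1, 2, 3]
print(avg(v))      # same as avg(1, 2, 3)
print(avg(v, 7))   # same as avg(1, 2, 3, 7)
```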

Metrics
You can use both resource and task metrics when you define a formula. You adjust the target
number of dedicated nodes in the pool based on the metrics data that you obtain and
evaluate. For more information on each metric, see the Variables section.


Metric Description

Resource Resource metrics are based on the CPU, the bandwidth, the memory usage of compute nodes,
and the number of nodes.

These service-defined variables are useful for making adjustments based on node count:
- $TargetDedicatedNodes
- $TargetLowPriorityNodes
- $CurrentDedicatedNodes
- $CurrentLowPriorityNodes
- $PreemptedNodeCount
- $UsableNodeCount

These service-defined variables are useful for making adjustments based on node resource
usage:
- $CPUPercent

Task Task metrics are based on the status of tasks, such as Active, Pending, and Completed. The
following service-defined variables are useful for making pool-size adjustments based on task
metrics:
- $ActiveTasks
- $RunningTasks
- $PendingTasks
- $SucceededTasks
- $FailedTasks

Obtain sample data


The core operation of an autoscale formula is to obtain task and resource metrics data
(samples), and then adjust pool size based on that data. As such, it's important to have a clear
understanding of how autoscale formulas interact with samples.
Methods
Autoscale formulas act on samples of metric data provided by the Batch service. A formula
grows or shrinks the pool compute nodes based on the values that it obtains. Service-defined
variables are objects that provide methods to access data that's associated with that object. For
example, the following expression shows a request to get the last five minutes of CPU usage:

$CPUPercent.GetSample(TimeInterval_Minute * 5)

The following methods can be used to obtain sample data about service-defined variables.


Method Description

GetSample() The GetSample() method returns a vector of data samples.

A sample is 30 seconds worth of metrics data. In other words, samples are obtained
every 30 seconds. But as noted below, there's a delay between when a sample is
collected and when it's available to a formula. As such, not all samples for a given
time period might be available for evaluation by a formula.

- doubleVec GetSample(double count) : Specifies the number of samples to obtain
from the most recent samples that were collected. GetSample(1) returns the last
available sample. For metrics like $CPUPercent , however, GetSample(1) shouldn't be
used, because it's impossible to know when the sample was collected. It could be
recent, or, because of system issues, it might be much older. In such cases, it's
better to use a time interval as shown below.

- doubleVec GetSample((timestamp or timeinterval) startTime [, double
samplePercent]) : Specifies a time frame for gathering sample data. Optionally, it
also specifies the percentage of samples that must be available in the requested
time frame. For example, $CPUPercent.GetSample(TimeInterval_Minute * 10) would
return 20 samples if all samples for the last 10 minutes are present in the
CPUPercent history. If the last minute of history wasn't available, only 18 samples
would be returned. In this case $CPUPercent.GetSample(TimeInterval_Minute * 10,
95) would fail because only 90 percent of the samples are available, but
$CPUPercent.GetSample(TimeInterval_Minute * 10, 80) would succeed.

- doubleVec GetSample((timestamp or timeinterval) startTime, (timestamp or
timeinterval) endTime [, double samplePercent]) : Specifies a time frame for
gathering data, with both a start time and an end time. As mentioned above,
there's a delay between when a sample is collected and when it becomes available
to a formula. Consider this delay when you use the GetSample method. See
GetSamplePercent below.

GetSamplePeriod() Returns the period of samples that were taken in a historical sample data set.

Count() Returns the total number of samples in the metrics history.

HistoryBeginTime() Returns the time stamp of the oldest available data sample for the metric.

GetSamplePercent() Returns the percentage of samples that are available for a given time interval. For
example, doubleVec GetSamplePercent( (timestamp or timeinterval) startTime [,
(timestamp or timeinterval) endTime] ) . Because the GetSample method fails if the
percentage of samples returned is less than the samplePercent specified, you can
use the GetSamplePercent method to check first. Then you can perform an alternate
action if insufficient samples are present, without halting the automatic scaling
evaluation.

Samples
The Batch service periodically takes samples of task and resource metrics and makes them
available to your autoscale formulas. These samples are recorded every 30 seconds by the
Batch service. However, there's typically a delay between when those samples were recorded
and when they're made available to (and read by) your autoscale formulas. Additionally,
samples might not be recorded for a particular interval because of factors such as network or
other infrastructure issues.

Sample percentage
When samplePercent is passed to the GetSample() method or the GetSamplePercent() method
is called, percent refers to a comparison between the total possible number of samples
recorded by the Batch service and the number of samples that are available to your autoscale
formula.

Let's look at a 10-minute time span as an example. Because samples are recorded every 30
seconds within that 10-minute time span, the maximum total number of samples recorded by
Batch would be 20 samples (2 per minute). However, due to the inherent latency of the
reporting mechanism and other issues within Azure, there might be only 15 samples that are
available to your autoscale formula for reading. So, for example, for that 10-minute period,
only 75 percent of the total number of samples recorded might be available to your formula.
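
The percentage arithmetic can be reproduced with a short Python model. The sketch assumes one sample every 30 seconds, as stated above. The helpers `expected_samples`, `sample_percent`, and `get_sample` are hypothetical stand-ins that imitate the documented behavior of GetSamplePercent() and GetSample() on a plain list.

```python
SAMPLE_PERIOD_SECONDS = 30  # one sample every 30 seconds


def expected_samples(interval_seconds: int) -> int:
    """Maximum number of samples Batch could record in the interval."""
    return interval_seconds // SAMPLE_PERIOD_SECONDS


def sample_percent(available: int, interval_seconds: int) -> float:
    """Share of possible samples that are available, in percent."""
    return 100.0 * available / expected_samples(interval_seconds)


def get_sample(samples, interval_seconds, required_percent=0):
    """Return the samples, or fail when too few are available."""
    percent = sample_percent(len(samples), interval_seconds)
    if percent < required_percent:
        raise ValueError(f"only {percent:.0f}% of samples available")
    return list(samples)


ten_minutes = 10 * 60
print(expected_samples(ten_minutes))    # 20 possible samples
print(sample_percent(15, ten_minutes))  # 15 of 20 available
print(sample_percent(18, ten_minutes))  # last minute missing
```

Checking the percentage first, as the pending-tasks example earlier in this article does, lets a formula choose a fallback value instead of failing the evaluation.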

GetSample() and sample ranges


Your autoscale formulas grow and shrink your pools by adding or removing nodes. Because
nodes cost you money, be sure that your formulas use an intelligent method of analysis that's
based on sufficient data. It's recommended that you use a trending-type analysis in your
formulas. This type grows and shrinks your pools based on a range of collected samples.

To do so, use GetSample(interval look-back start, interval look-back end) to return a vector
of samples:

$runningTasksSample = $RunningTasks.GetSample(1 * TimeInterval_Minute, 6 * TimeInterval_Minute);

When Batch evaluates the above line, it returns a range of samples as a vector of values. For
example:

$runningTasksSample=[1,1,1,1,1,1,1,1,1,1];

After you collect the vector of samples, you can then use functions like min() , max() , and
avg() to derive meaningful values from the collected range.

To exercise extra caution, you can force a formula evaluation to fail if less than a certain sample
percentage is available for a particular time period. When you force a formula evaluation to fail,
you instruct Batch to cease further evaluation of the formula if the specified percentage of
samples isn't available. In this case, no change is made to the pool size. To specify a required
percentage of samples for the evaluation to succeed, specify it as the third parameter to
GetSample() . Here, a requirement of 75 percent of samples is specified:

$runningTasksSample = $RunningTasks.GetSample(60 * TimeInterval_Second, 120 * TimeInterval_Second, 75);

Because there might be a delay in sample availability, you should always specify a time range
with a look-back start time that's older than one minute. It takes approximately one minute for
samples to propagate through the system, so samples in the range (0 * TimeInterval_Second,
60 * TimeInterval_Second) might not be available. Again, you can use the percentage
parameter of GetSample() to force a particular sample percentage requirement.
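
The effect of the propagation delay on different look-back ranges can be illustrated with a deliberately simplified model. In this Python sketch, samples are assumed to sit at exact 30-second ages and to become visible after a fixed 60-second delay. Real delays vary, and `available_percent` is a hypothetical helper rather than a Batch API.

```python
def available_percent(start_age, end_age, delay=60, period=30):
    """Percent of samples visible in a look-back window, given in seconds
    before now. A sample is visible once its age reaches `delay`."""
    low, high = sorted((start_age, end_age))
    ages = range(low, high, period)          # ages of samples in the window
    if not ages:
        return 0.0
    visible = [age for age in ages if age >= delay]
    return 100.0 * len(visible) / len(ages)


print(available_percent(0, 60))     # most recent minute: nothing visible yet
print(available_percent(60, 120))   # window older than one minute
print(available_percent(0, 120))    # mixed window
```

Under these assumptions, a window that starts at least one minute back is fully populated, while a window that includes the most recent minute would fail a 75 percent requirement.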

Important

We strongly recommend that you avoid relying only on GetSample(1) in your autoscale
formulas. This is because GetSample(1) essentially says to the Batch service, "Give me the
last sample you had, no matter how long ago you retrieved it." Since it's only a single
sample, and it might be an older sample, it might not be representative of the larger
picture of recent task or resource state. If you do use GetSample(1) , make sure that it's
part of a larger statement and not the only data point that your formula relies on.

Write an autoscale formula


You build an autoscale formula by forming statements that use the above components, then
combine those statements into a complete formula. In this section, you create an example
autoscale formula that can perform real-world scaling decisions and make adjustments.

First, let's define the requirements for our new autoscale formula. The formula should:

Increase the target number of dedicated compute nodes in a pool if CPU usage is high.
Decrease the target number of dedicated compute nodes in a pool when CPU usage is
low.
Always restrict the maximum number of dedicated nodes to 400.
When reducing the number of nodes, don't remove nodes that are running tasks; if
necessary, wait until tasks have finished before removing nodes.

The first statement in the formula increases the number of nodes during high CPU usage. You
define a statement that populates a user-defined variable ($totalDedicatedNodes) with a value
that is 110 percent of the current number of dedicated nodes, but only if the minimum
CPU usage sampled during the last 10 minutes was above 70 percent. Otherwise, it uses the
value for the current number of dedicated nodes.

$totalDedicatedNodes =
(min($CPUPercent.GetSample(TimeInterval_Minute * 10)) > 0.7) ?
($CurrentDedicatedNodes * 1.1) : $CurrentDedicatedNodes;

To decrease the number of dedicated nodes during low CPU usage, the next statement in the
formula sets the same $totalDedicatedNodes variable to 90 percent of the current number of
dedicated nodes ( $CurrentDedicatedNodes ), if average CPU usage in the past 60 minutes was
under 20 percent. Otherwise, it keeps the value of $totalDedicatedNodes populated in the
statement above.

$totalDedicatedNodes =
(avg($CPUPercent.GetSample(TimeInterval_Minute * 60)) < 0.2) ?
($CurrentDedicatedNodes * 0.9) : $totalDedicatedNodes;

Now, limit the target number of dedicated compute nodes to a maximum of 400.

$TargetDedicatedNodes = min(400, $totalDedicatedNodes);

Finally, ensure that nodes aren't removed until their tasks are finished.

$NodeDeallocationOption = taskcompletion;

Here's the complete formula:

$totalDedicatedNodes =
(min($CPUPercent.GetSample(TimeInterval_Minute * 10)) > 0.7) ?
($CurrentDedicatedNodes * 1.1) : $CurrentDedicatedNodes;
$totalDedicatedNodes =
(avg($CPUPercent.GetSample(TimeInterval_Minute * 60)) < 0.2) ?
($CurrentDedicatedNodes * 0.9) : $totalDedicatedNodes;
$TargetDedicatedNodes = min(400, $totalDedicatedNodes);
$NodeDeallocationOption = taskcompletion;

7 Note

If you choose, you can include both comments and line breaks in formula strings. Also be
aware that missing semicolons might result in evaluation errors.
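
To make the arithmetic of the complete formula concrete, the following Python sketch mirrors its statements with plain numbers. It is only an illustration: the function name and parameters are hypothetical, CPU samples are assumed to be fractions between 0 and 1, and the sketch doesn't model how the Batch service collects samples or rounds a fractional target.

```python
def target_dedicated_nodes(cpu_10m, cpu_60m, current_nodes, max_nodes=400):
    """Mirror the example formula's arithmetic (illustrative only).

    cpu_10m / cpu_60m: non-empty lists of CPU usage samples as fractions.
    current_nodes: stands in for $CurrentDedicatedNodes.
    """
    # Statement 1: grow by 10% only if even the lowest sample of the
    # last 10 minutes is above 70 percent.
    total = current_nodes * 1.1 if min(cpu_10m) > 0.7 else current_nodes

    # Statement 2: shrink by 10% if the 60-minute average is below 20 percent;
    # otherwise keep whatever statement 1 produced.
    average_60m = sum(cpu_60m) / len(cpu_60m)
    total = current_nodes * 0.9 if average_60m < 0.2 else total

    # Statement 3: never exceed the cap (400 in the example).
    return min(max_nodes, total)


if __name__ == "__main__":
    print(target_dedicated_nodes([0.8, 0.9], [0.8, 0.9], 100))  # busy pool
    print(target_dedicated_nodes([0.1], [0.1, 0.1], 100))       # idle pool
```

Note that the second statement overwrites the first whenever the 60-minute average is low, which is why statement order matters in the formula.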

Automatic scaling interval


By default, the Batch service adjusts a pool's size according to its autoscale formula every 15
minutes. This interval is configurable by using the following pool properties:

CloudPool.AutoScaleEvaluationInterval (Batch .NET)
autoScaleEvaluationInterval (REST API)

The minimum interval is five minutes, and the maximum is 168 hours. If an interval outside this
range is specified, the Batch service returns a Bad Request (400) error.
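
Because an out-of-range interval is rejected by the service, it can be convenient to check the value on the client before sending the request. The following is a minimal sketch of such a check, using only the limits stated above; the function and constant names are hypothetical and aren't part of any Batch SDK.

```python
from datetime import timedelta

MIN_INTERVAL = timedelta(minutes=5)       # documented minimum
MAX_INTERVAL = timedelta(hours=168)       # documented maximum
DEFAULT_INTERVAL = timedelta(minutes=15)  # documented default


def check_evaluation_interval(interval=None):
    """Return the interval to use, or raise if it is outside the allowed range."""
    if interval is None:
        return DEFAULT_INTERVAL
    if not MIN_INTERVAL <= interval <= MAX_INTERVAL:
        raise ValueError(
            f"interval {interval} is outside the allowed range "
            f"({MIN_INTERVAL} to {MAX_INTERVAL})"
        )
    return interval


if __name__ == "__main__":
    print(check_evaluation_interval(timedelta(minutes=30)))
```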

7 Note

Autoscaling is not currently intended to respond to changes in less than a minute, but
rather is intended to adjust the size of your pool gradually as you run a workload.

Create an autoscale-enabled pool with Batch SDKs


Pool autoscaling can be configured using any of the Batch SDKs, the Batch REST API, Batch
PowerShell cmdlets, and the Batch CLI. In this section, you can see examples for both .NET and
Python.

.NET
To create a pool with autoscaling enabled in .NET, follow these steps:

1. Create the pool with BatchClient.PoolOperations.CreatePool.


2. Set the CloudPool.AutoScaleEnabled property to true.
3. Set the CloudPool.AutoScaleFormula property with your autoscale formula.
4. (Optional) Set the CloudPool.AutoScaleEvaluationInterval property (default is 15 minutes).
5. Commit the pool with CloudPool.Commit or CommitAsync.

The following example creates an autoscale-enabled pool in .NET. The pool's autoscale formula
sets the target number of dedicated nodes to 5 on Mondays, and to 1 on every other day of
the week. The automatic scaling interval is set to 30 minutes. In this and the other C# snippets
in this article, myBatchClient is a properly initialized instance of the BatchClient class.

C#

CloudPool pool = myBatchClient.PoolOperations.CreatePool(
    poolId: "mypool",
    virtualMachineSize: "standard_d1_v2",
    virtualMachineConfiguration: new VirtualMachineConfiguration(
        imageReference: new ImageReference(
            publisher: "MicrosoftWindowsServer",
            offer: "WindowsServer",
            sku: "2019-datacenter-core",
            version: "latest"),
        nodeAgentSkuId: "batch.node.windows amd64"));
// Let the formula, rather than a fixed target, determine the pool size
pool.AutoScaleEnabled = true;
pool.AutoScaleFormula = "$TargetDedicatedNodes = (time().weekday == 1 ? 5:1);";
pool.AutoScaleEvaluationInterval = TimeSpan.FromMinutes(30);
await pool.CommitAsync();

) Important

When you create an autoscale-enabled pool, don't specify the targetDedicatedNodes
parameter or the targetLowPriorityNodes parameter on the call to CreatePool . Instead,
specify the AutoScaleEnabled and AutoScaleFormula properties on the pool. The values for
these properties determine the target number of each type of node.

To manually resize an autoscale-enabled pool (for example, with
BatchClient.PoolOperations.ResizePoolAsync), you must first disable automatic scaling on
the pool, then resize it.

 Tip

For more examples of using the .NET SDK, see the Batch .NET Quickstart repository on
GitHub.

Python
To create an autoscale-enabled pool with the Python SDK:

1. Create a pool and specify its configuration.


2. Add the pool to the service client.
3. Enable autoscale on the pool with a formula you write.

The following example illustrates these steps.

Python

import datetime

import azure.batch.models as batchmodels

# batch_service_client is assumed to be an initialized BatchServiceClient.

# Create a pool; specify configuration
new_pool = batchmodels.PoolAddParameter(
    id="autoscale-enabled-pool",
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(
            publisher="Canonical",
            offer="UbuntuServer",
            sku="20.04-LTS",
            version="latest"
        ),
        node_agent_sku_id="batch.node.ubuntu 20.04"),
    vm_size="STANDARD_D1_v2",
    target_dedicated_nodes=0,
    target_low_priority_nodes=0
)
batch_service_client.pool.add(new_pool)  # Add the pool to the service client

formula = """$curTime = time();
$workHours = $curTime.hour >= 8 && $curTime.hour < 18;
$isWeekday = $curTime.weekday >= 1 && $curTime.weekday <= 5;
$isWorkingWeekdayHour = $workHours && $isWeekday;
$TargetDedicatedNodes = $isWorkingWeekdayHour ? 20:10;"""

# Enable autoscale on the pool that was just added; specify the formula
response = batch_service_client.pool.enable_auto_scale(
    new_pool.id,
    auto_scale_formula=formula,
    auto_scale_evaluation_interval=datetime.timedelta(minutes=10),
    pool_enable_auto_scale_options=None,
    custom_headers=None, raw=False)

 Tip

For more examples of using the Python SDK, see the Batch Python Quickstart
repository on GitHub.

Enable autoscaling on an existing pool


Each Batch SDK provides a way to enable automatic scaling. For example:

BatchClient.PoolOperations.EnableAutoScaleAsync (Batch .NET)


Enable automatic scaling on a pool (REST API)

When you enable autoscaling on an existing pool, keep in mind:

If autoscaling is currently disabled on the pool, you must specify a valid autoscale formula
when you issue the request. You can optionally specify an automatic scaling interval. If
you don't specify an interval, the default value of 15 minutes is used.
If autoscaling is currently enabled on the pool, you can specify a new formula, a new
interval, or both. You must specify at least one of these properties.
If you specify a new automatic scaling interval, the existing schedule is stopped and a
new schedule is started. The new schedule's start time is the time at which the request
to enable autoscaling was issued.
If you omit either the autoscale formula or interval, the Batch service continues to use
the current value of that setting.
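
The rules above can be summarized as a small decision function. The following Python sketch models only the rules as stated in this section; it doesn't call the Batch service, and the function name and the tuple representation of the pool's current settings are assumptions made for illustration.

```python
from datetime import timedelta

DEFAULT_INTERVAL = timedelta(minutes=15)


def resolve_autoscale_settings(current, formula=None, interval=None):
    """Return the (formula, interval) pair that would be in effect.

    current: None if autoscaling is disabled on the pool, otherwise the
             pool's existing (formula, interval) tuple.
    """
    if current is None:
        # Autoscaling is disabled: a formula is mandatory, the interval optional.
        if not formula:
            raise ValueError("a valid autoscale formula is required")
        return formula, interval or DEFAULT_INTERVAL

    # Autoscaling is already enabled: at least one new value must be supplied.
    if formula is None and interval is None:
        raise ValueError("specify a new formula, a new interval, or both")

    current_formula, current_interval = current
    # Whichever setting is omitted keeps its current value.
    return formula or current_formula, interval or current_interval


if __name__ == "__main__":
    existing = ("$TargetDedicatedNodes = 1;", timedelta(minutes=30))
    print(resolve_autoscale_settings(existing, interval=timedelta(hours=1)))
```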

7 Note
If you specified values for the targetDedicatedNodes or targetLowPriorityNodes parameters
of the CreatePool method when you created the pool in .NET, or for the comparable
parameters in another language, then those values are ignored when the autoscale
formula is evaluated.

This C# example uses the Batch .NET library to enable autoscaling on an existing pool.

C#

// Define the autoscaling formula. This formula sets the target number of nodes
// to 5 on Mondays, and 1 on every other day of the week
string myAutoScaleFormula = "$TargetDedicatedNodes = (time().weekday == 1 ? 5:1);";

// Set the autoscale formula on the existing pool
await myBatchClient.PoolOperations.EnableAutoScaleAsync(
    "myexistingpool",
    autoscaleFormula: myAutoScaleFormula);

Update an autoscale formula


To update the formula on an existing autoscale-enabled pool, call the operation to enable
autoscaling again with the new formula. For example, if autoscaling is already enabled on
myexistingpool when the following .NET code is executed, its autoscale formula is replaced
with the contents of myNewFormula .

C#

await myBatchClient.PoolOperations.EnableAutoScaleAsync(
"myexistingpool",
autoscaleFormula: myNewFormula);

Update the autoscale interval


To update the autoscale evaluation interval of an existing autoscale-enabled pool, call the
operation to enable autoscaling again with the new interval. For example, to set the autoscale
evaluation interval to 60 minutes for a pool that's already autoscale-enabled in .NET:

C#

await myBatchClient.PoolOperations.EnableAutoScaleAsync(
"myexistingpool",
autoscaleEvaluationInterval: TimeSpan.FromMinutes(60));

Evaluate an autoscale formula


You can evaluate a formula before applying it to a pool. This lets you test the formula's results
before you put it into production.

Before you can evaluate an autoscale formula, you must first enable autoscaling on the pool
with a valid formula, such as the one-line formula $TargetDedicatedNodes = 0 . Then, use one of
the following to evaluate the formula you want to test:

BatchClient.PoolOperations.EvaluateAutoScale or EvaluateAutoScaleAsync

These Batch .NET methods require the ID of an existing pool and a string containing the
autoscale formula to evaluate.

Evaluate an automatic scaling formula

In this REST API request, specify the pool ID in the URI, and the autoscale formula in the
autoScaleFormula element of the request body. The response of the operation contains
any error information that might be related to the formula.

The following Batch .NET example evaluates an autoscale formula. If the pool doesn't already
use autoscaling, enable it first.

C#

// First obtain a reference to an existing pool
CloudPool pool = await batchClient.PoolOperations.GetPoolAsync("myExistingPool");

// If autoscaling isn't already enabled on the pool, enable it.
// You can't evaluate an autoscale formula on a non-autoscale-enabled pool.
if (pool.AutoScaleEnabled == false)
{
    // You need a valid autoscale formula to enable autoscaling on the
    // pool. This formula is valid, but won't resize the pool:
    await pool.EnableAutoScaleAsync(
        autoscaleFormula: "$TargetDedicatedNodes = $CurrentDedicatedNodes;",
        autoscaleEvaluationInterval: TimeSpan.FromMinutes(5));

    // Batch limits EnableAutoScaleAsync calls to once every 30 seconds.
    // Because you want to apply our new autoscale formula below if it
    // evaluates successfully, and you *just* enabled autoscaling on
    // this pool, pause here to ensure you pass that threshold.
    Thread.Sleep(TimeSpan.FromSeconds(31));

    // Refresh the properties of the pool so that we've got the
    // latest value for AutoScaleEnabled
    await pool.RefreshAsync();
}

// You must ensure that autoscaling is enabled on the pool prior to
// evaluating a formula
if (pool.AutoScaleEnabled == true)
{
    // The formula to evaluate - adjusts target number of nodes based on
    // day of week and time of day
    string myFormula = @"
        $curTime = time();
        $workHours = $curTime.hour >= 8 && $curTime.hour < 18;
        $isWeekday = $curTime.weekday >= 1 && $curTime.weekday <= 5;
        $isWorkingWeekdayHour = $workHours && $isWeekday;
        $TargetDedicatedNodes = $isWorkingWeekdayHour ? 20:10;
    ";

    // Perform the autoscale formula evaluation. Note that this code does not
    // actually apply the formula to the pool.
    AutoScaleRun eval =
        await batchClient.PoolOperations.EvaluateAutoScaleAsync(pool.Id, myFormula);

    if (eval.Error == null)
    {
        // Evaluation success - print the results of the AutoScaleRun.
        // This will display the values of each variable as evaluated by the
        // autoscale formula.
        Console.WriteLine("AutoScaleRun.Results: " +
            eval.Results.Replace("$", "\n $"));

        // Apply the formula to the pool since it evaluated successfully
        await batchClient.PoolOperations.EnableAutoScaleAsync(pool.Id, myFormula);
    }
    else
    {
        // Evaluation failed, output the message associated with the error
        Console.WriteLine("AutoScaleRun.Error.Message: " +
            eval.Error.Message);
    }
}

Successful evaluation of the formula shown in this code snippet produces results similar to:

AutoScaleRun.Results:
$TargetDedicatedNodes=10;
$NodeDeallocationOption=requeue;
$curTime=2016-10-13T19:18:47.805Z;
$isWeekday=1;
$isWorkingWeekdayHour=0;
$workHours=0
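
If you want to inspect these results programmatically rather than print them, the string can be split into name/value pairs. The following Python sketch assumes the results string has the semicolon-separated `$name=value` shape shown above; the function name is hypothetical, and values are left as strings rather than converted to numbers or timestamps.

```python
def parse_autoscale_results(results):
    """Split an autoscale results string into a {variable: value} dictionary."""
    parsed = {}
    for entry in results.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon or blank segment
        # Split on the first '=' only, so values may themselves contain '='.
        name, _, value = entry.partition("=")
        parsed[name.strip()] = value.strip()
    return parsed


if __name__ == "__main__":
    sample = (
        "$TargetDedicatedNodes=10;$NodeDeallocationOption=requeue;"
        "$curTime=2016-10-13T19:18:47.805Z;$isWeekday=1;"
        "$isWorkingWeekdayHour=0;$workHours=0"
    )
    for name, value in parse_autoscale_results(sample).items():
        print(name, "=", value)
```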

Get information about autoscale runs


It's recommended to periodically check the Batch service's evaluation of your autoscale
formula. To do so, get (or refresh) a reference to the pool, then examine the properties of its
last autoscale run.

In Batch .NET, the CloudPool.AutoScaleRun property has several properties that provide
information about the latest automatic scaling run performed on the pool:

AutoScaleRun.Timestamp
AutoScaleRun.Results
AutoScaleRun.Error

In the REST API, information about a pool includes the latest automatic scaling run information
in the autoScaleRun property.

The following C# example uses the Batch .NET library to print information about the last
autoscaling run on pool myPool.

C#

CloudPool pool = await myBatchClient.PoolOperations.GetPoolAsync("myPool");
Console.WriteLine("Last execution: " + pool.AutoScaleRun.Timestamp);
Console.WriteLine("Result:" + pool.AutoScaleRun.Results.Replace("$", "\n $"));
Console.WriteLine("Error: " + pool.AutoScaleRun.Error);

Sample output from the preceding example:

Last execution: 10/14/2016 18:36:43
Result:
$TargetDedicatedNodes=10;
$NodeDeallocationOption=requeue;
$curTime=2016-10-14T18:36:43.282Z;
$isWeekday=1;
$isWorkingWeekdayHour=0;
$workHours=0
Error:

Get autoscale run history using pool autoscale events


You can also check automatic scaling history by querying PoolAutoScaleEvent. Batch emits this
event to record each occurrence of autoscale formula evaluation and execution, which can be
helpful to troubleshoot potential issues.

Sample event for PoolAutoScaleEvent:

JSON

{
"id": "poolId",
"timestamp": "2020-09-21T23:41:36.750Z",
"formula": "...",
"results": "$TargetDedicatedNodes=10;$NodeDeallocationOption=requeue;$curTime=2016-10-14T18:36:43.282Z;$isWeekday=1;$isWorkingWeekdayHour=0;$workHours=0",
"error": {
"code": "",
"message": "",
"values": []
}
}
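
When troubleshooting, a quick first check is whether an event reports an error at all. The following Python sketch reads an event shaped like the sample above. It assumes, based only on that sample, that a successful run carries an `error` object with empty `code` and `message` fields; the function name is hypothetical.

```python
import json


def autoscale_event_failed(event_json):
    """Return True if the event's error object carries a code or a message."""
    event = json.loads(event_json)
    # Treat a missing or null "error" the same as an empty one (assumption).
    error = event.get("error") or {}
    return bool(error.get("code") or error.get("message"))


if __name__ == "__main__":
    sample_event = json.dumps({
        "id": "poolId",
        "timestamp": "2020-09-21T23:41:36.750Z",
        "formula": "...",
        "results": "$TargetDedicatedNodes=10;$NodeDeallocationOption=requeue",
        "error": {"code": "", "message": "", "values": []},
    })
    print(autoscale_event_failed(sample_event))
```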

Example autoscale formulas


Let's look at a few formulas that show different ways to adjust the amount of compute
resources in a pool.

Example 1: Time-based adjustment


Suppose you want to adjust the pool size based on the day of the week and time of day. This
example shows how to increase or decrease the number of nodes in the pool accordingly.

The formula first obtains the current time. If it's a weekday (1-5) and within working hours (8
AM to 6 PM), the target pool size is set to 20 nodes. Otherwise, it's set to 10 nodes.

$curTime = time();
$workHours = $curTime.hour >= 8 && $curTime.hour < 18;
$isWeekday = $curTime.weekday >= 1 && $curTime.weekday <= 5;
$isWorkingWeekdayHour = $workHours && $isWeekday;
$TargetDedicatedNodes = $isWorkingWeekdayHour ? 20:10;
$NodeDeallocationOption = taskcompletion;

$curTime can be adjusted to reflect your local time zone by adding time() to the product of
TimeInterval_Hour and your UTC offset. For instance, use $curTime = time() + (-6 *
TimeInterval_Hour); for Mountain Daylight Time (MDT). Keep in mind that the offset needs to
be adjusted at the start and end of daylight saving time, if applicable.
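
To see how the UTC offset changes the outcome, the following Python sketch reproduces the formula's logic outside of Batch. It is an illustration only: the function name is hypothetical, and it assumes that Python's `isoweekday()` numbering (Monday = 1 through Friday = 5) matches the weekday numbering that the formula relies on for Monday through Friday.

```python
from datetime import datetime, timedelta


def time_based_target(utc_now, utc_offset_hours=0):
    """Return 20 during working weekday hours in the shifted time zone, else 10."""
    # Equivalent of: $curTime = time() + (offset * TimeInterval_Hour);
    local = utc_now + timedelta(hours=utc_offset_hours)
    work_hours = 8 <= local.hour < 18
    is_weekday = 1 <= local.isoweekday() <= 5
    return 20 if work_hours and is_weekday else 10


if __name__ == "__main__":
    # The timestamp from the sample evaluation output (a Thursday, 19:18 UTC).
    sample = datetime(2016, 10, 13, 19, 18, 47)
    print(time_based_target(sample))        # evaluated in UTC
    print(time_based_target(sample, -6))    # evaluated with a -6 hour offset
```

With no offset, 19:18 falls outside working hours; with the -6 hour offset from the MDT example, the same instant is 13:18 and falls inside them.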


Example 2: Task-based adjustment


In this example, the pool size is adjusted based on the number of tasks in the queue. Both
comments and line breaks are included in the formula string.

C#

// Get pending tasks for the past 15 minutes.
$samples = $PendingTasks.GetSamplePercent(TimeInterval_Minute * 15);
// If you have fewer than 70 percent data points, use the last sample point,
// otherwise use the maximum of last sample point and the history average.
$tasks = $samples < 70 ? max(0,$PendingTasks.GetSample(1)) : max(
$PendingTasks.GetSample(1), avg($PendingTasks.GetSample(TimeInterval_Minute *
15)));
// If number of pending tasks is not 0, set targetVM to pending tasks, otherwise
// half of current dedicated.
$targetVMs = $tasks > 0? $tasks:max(0, $TargetDedicatedNodes/2);
// The pool size is capped at 20, if target VM value is more than that, set it
// to 20. This value should be adjusted according to your use case.
$TargetDedicatedNodes = max(0, min($targetVMs, 20));
// Set node deallocation mode - let running tasks finish before removing a node
$NodeDeallocationOption = taskcompletion;
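
The branching in this formula is easier to follow with concrete numbers. The following Python sketch mirrors the formula's statements one for one. The function and parameter names are hypothetical, the inputs stand in for values that the Batch service would supply, and the sketch doesn't model how Batch rounds a fractional node count.

```python
def task_based_target(sample_percent, last_sample, history, current_target, cap=20):
    """Mirror the task-based formula's arithmetic (illustrative only).

    sample_percent: stands in for GetSamplePercent over the last 15 minutes.
    last_sample:    stands in for $PendingTasks.GetSample(1).
    history:        pending-task samples from the last 15 minutes.
    current_target: stands in for the current $TargetDedicatedNodes.
    """
    if sample_percent < 70:
        # Too few data points: fall back to the most recent sample.
        tasks = max(0, last_sample)
    else:
        # Enough data: take the larger of the latest sample and the average.
        tasks = max(last_sample, sum(history) / len(history))

    # No pending tasks: halve the pool; otherwise aim for one node per task.
    target_vms = tasks if tasks > 0 else max(0, current_target / 2)

    # Cap the pool size (20 in the example) and never go below zero.
    return max(0, min(target_vms, cap))


if __name__ == "__main__":
    print(task_based_target(100, 2, [4, 6, 8], current_target=10))
    print(task_based_target(100, 0, [0, 0], current_target=8))
```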

Example 3: Accounting for parallel tasks


This C# example adjusts the pool size based on the number of tasks. This formula also takes
into account the TaskSlotsPerNode value that's been set for the pool. This approach is useful in
situations where parallel task execution has been enabled on your pool.

C#

// Determine whether 70 percent of the samples have been recorded in the past
// 15 minutes; if not, use last sample
$samples = $ActiveTasks.GetSamplePercent(TimeInterval_Minute * 15);
$tasks = $samples < 70 ? max(0,$ActiveTasks.GetSample(1)) : max(
$ActiveTasks.GetSample(1),avg($ActiveTasks.GetSample(TimeInterval_Minute * 15)));
// Set the number of nodes to add to one-fourth the number of active tasks
// (the TaskSlotsPerNode property on this pool is set to 4, adjust
// this number for your use case)
$cores = $TargetDedicatedNodes * 4;
$extraVMs = (($tasks - $cores) + 3) / 4;
$targetVMs = ($TargetDedicatedNodes + $extraVMs);
// Attempt to grow the number of compute nodes to match the number of active
// tasks, with a maximum of 3
$TargetDedicatedNodes = max(0,min($targetVMs,3));
// Keep the nodes active until the tasks finish
$NodeDeallocationOption = taskcompletion;

Example 4: Setting an initial pool size


This C# example uses an autoscale formula that sets the pool size to a
specified number of nodes for an initial time period. After that, it adjusts the pool size based
on the number of running and active tasks.

Specifically, this formula does the following:

Sets the initial pool size to four nodes.


Doesn't adjust the pool size within the first 10 minutes of the pool's lifecycle.
After 10 minutes, obtains the max value of the number of running and active tasks within
the past 60 minutes.
If both values are 0, indicating that no tasks were running or active in the last 60
minutes, the pool size is set to 0.
If either value is greater than zero, no change is made.

C#

string now = DateTime.UtcNow.ToString("r");
string formula = string.Format(@"
    $TargetDedicatedNodes = {1};
    lifespan = time() - time(""{0}"");
    span = TimeInterval_Minute * 60;
    startup = TimeInterval_Minute * 10;
    ratio = 50;

    $TargetDedicatedNodes = (lifespan > startup ?
        (max($RunningTasks.GetSample(span, ratio), $ActiveTasks.GetSample(span, ratio)) == 0 ?
            0 : $TargetDedicatedNodes) : {1});
    ", now, 4);

Next steps
Learn how to execute multiple tasks simultaneously on the compute nodes in your pool.
Along with autoscaling, this can help to lower job duration for some workloads, saving
you money.
Learn how to query the Azure Batch service efficiently.

Configure remote access to compute nodes in an Azure Batch pool
Article • 12/16/2024

If configured, you can allow a node user with network connectivity to connect externally
to a compute node in a Batch pool. For example, a user can connect by Remote Desktop
(RDP) on port 3389 to a compute node in a Windows pool. Similarly, by default, a user
can connect by Secure Shell (SSH) on port 22 to a compute node in a Linux pool.

7 Note

As of API version 2024-07-01 (and all pools created after 30 November 2025
regardless of API version), Batch no longer automatically maps common remote
access ports for SSH and RDP. If you wish to allow remote access to your Batch
compute nodes with pools created with API version 2024-07-01 or later (and after
30 November 2025), then you must manually configure the pool endpoint
configuration to enable such access.

In your environment, you might need to enable, restrict, or disable external access
settings or any other ports you wish on the Batch pool. You can modify these settings by
using the Batch APIs to set the PoolEndpointConfiguration property.

Batch pool endpoint configuration


The endpoint configuration consists of one or more network address translation (NAT)
pools of frontend ports. Don't confuse a NAT pool with the Batch pool of compute
nodes. You set up each NAT pool to override the default connection settings on the
pool's compute nodes.

Each NAT pool configuration includes one or more network security group (NSG) rules.
Each NSG rule allows or denies certain network traffic to the endpoint. You can choose
to allow or deny all traffic, traffic identified by a service tag (such as "Internet"), or traffic
from specific IP addresses or subnets.

Considerations
The pool endpoint configuration is part of the pool's network configuration. The
network configuration can optionally include settings to join the pool to an Azure
virtual network. If you set up the pool in a virtual network, you can create NSG
rules that use address settings in the virtual network.
You can configure multiple NSG rules when you configure a NAT pool. The rules
are checked in the order of priority. Once a rule applies, no more rules are tested
for matching.
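
The first-match behavior described above can be sketched in a few lines of Python. This is a simplified model for reasoning about rule order, not how Batch or Azure networking is implemented. It assumes that a lower priority number is checked first (the usual Azure NSG convention), and it handles only `*`, single IP addresses, and CIDR subnets; service tags such as "Internet" aren't modeled. The function name and the tuple layout are hypothetical.

```python
import ipaddress


def evaluate_nsg_rules(rules, source_ip):
    """Return the access of the first matching rule, or None if nothing matches.

    rules: iterable of (priority, access, source_address_prefix) tuples.
    """
    address = ipaddress.ip_address(source_ip)
    # Assumption: lower priority numbers are evaluated first.
    for priority, access, prefix in sorted(rules, key=lambda rule: rule[0]):
        if prefix == "*" or address in ipaddress.ip_network(prefix, strict=False):
            return access  # first match wins; later rules are not tested
    return None


if __name__ == "__main__":
    rdp_rules = [
        (180, "deny", "*"),
        (179, "allow", "198.168.100.7"),
    ]
    print(evaluate_nsg_rules(rdp_rules, "198.168.100.7"))
    print(evaluate_nsg_rules(rdp_rules, "10.0.0.1"))
```

The allow rule needs a lower priority number than the catch-all deny rule. If the two priorities were swapped, the deny rule would match first and the allow rule would never be tested.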

Example: Allow RDP traffic from a specific IP address
The following C# snippet shows how to configure the RDP endpoint on compute nodes
in a Windows pool to allow RDP access only from IP address 198.168.100.7. The second
NSG rule denies traffic that doesn't match the IP address.

C#

using Microsoft.Azure.Batch;
using Microsoft.Azure.Batch.Common;

namespace AzureBatch
{
    // The class name is illustrative; C# methods must be declared inside a type.
    public static class PoolEndpoints
    {
        // 'pool' is a CloudPool that hasn't been committed yet.
        public static void SetPortsPool(CloudPool pool)
        {
            pool.NetworkConfiguration = new NetworkConfiguration
            {
                EndpointConfiguration = new PoolEndpointConfiguration(new InboundNatPool[]
                {
                    new InboundNatPool("RDP", InboundEndpointProtocol.Tcp, 3389, 7500, 8000,
                        new NetworkSecurityGroupRule[]
                        {
                            // Allow the single address, then deny everything else
                            new NetworkSecurityGroupRule(179,
                                NetworkSecurityGroupRuleAccess.Allow, "198.168.100.7"),
                            new NetworkSecurityGroupRule(180,
                                NetworkSecurityGroupRuleAccess.Deny, "*")
                        })
                })
            };
        }
    }
}

Example: Allow SSH traffic from a specific subnet
The following Python snippet shows how to configure the SSH endpoint on compute
nodes in a Linux pool to allow access only from the subnet 192.168.1.0/24. The second
NSG rule denies traffic that doesn't match the subnet.

Python

from azure.batch import models as batchmodels

class AzureBatch(object):
    def set_ports_pool(self, pool, **kwargs):
        # 'pool' is the pool definition to configure before it's added
        # to the Batch service (for example, a PoolAddParameter).
        pool.network_configuration = batchmodels.NetworkConfiguration(
            endpoint_configuration=batchmodels.PoolEndpointConfiguration(
                inbound_nat_pools=[batchmodels.InboundNATPool(
                    name='SSH',
                    protocol='tcp',
                    backend_port=22,
                    frontend_port_range_start=4000,
                    frontend_port_range_end=4100,
                    network_security_group_rules=[
                        # Allow the subnet, then deny everything else
                        batchmodels.NetworkSecurityGroupRule(
                            priority=170,
                            access='allow',
                            source_address_prefix='192.168.1.0/24'
                        ),
                        batchmodels.NetworkSecurityGroupRule(
                            priority=175,
                            access='deny',
                            source_address_prefix='*'
                        )
                    ]
                )
                ]
            )
        )

Example: Deny all RDP traffic


The following C# snippet shows how to configure the RDP endpoint on compute nodes
in a Windows pool to deny all network traffic. The endpoint uses a frontend pool of
ports in the range 60000 - 60099.

7 Note

As of Batch API version 2024-07-01 , port 3389 typically associated with RDP is no
longer mapped by default. Creating an explicit deny rule is no longer required if
access is not needed from the Internet for Batch pools created with this API version
or later. You may still need to specify explicit deny rules to restrict access from
other sources.

C#

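using Microsoft.Azure.Batch;
using Microsoft.Azure.Batch.Common;

namespace AzureBatch
{
    // The class name is illustrative; C# methods must be declared inside a type.
    public static class PoolEndpoints
    {
        // 'pool' is a CloudPool that hasn't been committed yet.
        public static void SetPortsPool(CloudPool pool)
        {
            pool.NetworkConfiguration = new NetworkConfiguration
            {
                EndpointConfiguration = new PoolEndpointConfiguration(new InboundNatPool[]
                {
                    new InboundNatPool("RDP", InboundEndpointProtocol.Tcp, 3389, 60000, 60099,
                        new NetworkSecurityGroupRule[]
                        {
                            // A single catch-all rule denies all RDP traffic
                            new NetworkSecurityGroupRule(162,
                                NetworkSecurityGroupRuleAccess.Deny, "*"),
                        })
                })
            };
        }
    }
}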

Example: Deny all SSH traffic from the internet


The following Python snippet shows how to configure the SSH endpoint on compute
nodes in a Linux pool to deny all internet traffic. The endpoint uses a frontend pool of
ports in the range 4000 - 4100.

7 Note

As of Batch API version 2024-07-01 , port 22 typically associated with SSH is no
longer mapped by default. Creating an explicit deny rule is no longer required if
access is not needed from the Internet for Batch pools created with this API version
or later. You may still need to specify explicit deny rules to restrict access from
other sources.

Python

from azure.batch import models as batchmodels

class AzureBatch(object):
    def set_ports_pool(self, pool, **kwargs):
        # 'pool' is the pool definition to configure before it's added
        # to the Batch service (for example, a PoolAddParameter).
        pool.network_configuration = batchmodels.NetworkConfiguration(
            endpoint_configuration=batchmodels.PoolEndpointConfiguration(
                inbound_nat_pools=[batchmodels.InboundNATPool(
                    name='SSH',
                    protocol='tcp',
                    backend_port=22,
                    frontend_port_range_start=4000,
                    frontend_port_range_end=4100,
                    network_security_group_rules=[
                        # Deny traffic identified by the "Internet" service tag
                        batchmodels.NetworkSecurityGroupRule(
                            priority=170,
                            access=batchmodels.NetworkSecurityGroupRuleAccess.deny,
                            source_address_prefix='Internet'
                        )
                    ]
                )
                ]
            )
        )

Next steps
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
Learn more about NSG rules in Azure with Filtering network traffic with network
security groups.



Use the Azure Compute Gallery to create a
custom image pool
07/01/2025

When you create an Azure Batch pool using the Virtual Machine Configuration, you specify a
VM image that provides the operating system for each compute node in the pool. You can
create a pool of virtual machines either with a supported Azure Marketplace image or create a
custom image with an Azure Compute Gallery image.

Benefits of the Azure Compute Gallery


When you use the Azure Compute Gallery for your custom image, you have control over the
operating system type and configuration, as well as the type of data disks. Your Shared Image
can include applications and reference data that become available on all the Batch pool nodes
as soon as they're provisioned.

You can also have multiple versions of an image as needed for your environment. When you
use an image version to create a VM, the image version is used to create new disks for the VM.

Using a Shared Image saves time in preparing your pool's compute nodes to run your Batch
workload. It's possible to use an Azure Marketplace image and install software on each
compute node after provisioning, but using a Shared Image is typically more efficient.
Additionally, you can specify multiple replicas for the Shared Image so when you create pools
with many VMs (more than 600 VMs), you'll save time on pool creation.

Using a Shared Image configured for your scenario can provide several advantages:

Use the same images across the regions. You can create Shared Image replicas across
different regions so all your pools utilize the same image.
Configure the operating system (OS). You can customize the configuration of the
image's operating system disk.
Pre-install applications. Pre-installing applications on the OS disk is more efficient and
less error-prone than installing applications after provisioning the compute nodes with a
start task.
Copy large amounts of data once. Make static data part of the managed Shared Image
by copying it to a managed image's data disks. This only needs to be done once and
makes data available to each node of the pool.
Grow pools to larger sizes. With the Azure Compute Gallery, you can create larger pools
with your customized images along with more Shared Image replicas.
Better performance than using just a managed image as a custom image. For a Shared
Image custom image pool, the time to reach the steady state is up to 25% faster, and the
VM idle latency is up to 30% shorter.
Image versioning and grouping for easier management. The image grouping definition
contains information about why the image was created, what OS it is for, and information
about using the image. Grouping images allows for easier image management. For more
information, see Image definitions.

Prerequisites
An Azure Batch account. To create a Batch account, see the Batch quickstarts using the
Azure portal or Azure CLI.

7 Note

Authentication using Microsoft Entra ID is required. If you use Shared Key Auth, you will
get an authentication error.

An Azure Compute Gallery image. To create a Shared Image, you need to have or create
a managed image resource. The image should be created from snapshots of the VM's OS
disk and optionally its attached data disks.

7 Note

If the Shared Image is in a different subscription than the Batch account, you must register
the Microsoft.Batch resource provider in the subscription where the Shared Image
resides. Both the subscriptions must belong to the same Microsoft Entra tenant.

The image can be in a different region as long as it has replicas in the same region as your
Batch account.

If you use a Microsoft Entra application to create a custom image pool with an Azure Compute
Gallery image, that application must have been granted an Azure built-in role that gives it
access to the Shared Image. You can grant this access in the Azure portal by navigating to the
Shared Image, selecting Access control (IAM) and adding a role assignment for the
application.

7 Note
Reader permissions on the Azure Compute Gallery image aren't sufficient. The assigned
role must allow at least the following action:
Microsoft.Compute/disks/beginGetAccess/action .

Prepare a Shared Image


In Azure, you can prepare a shared image from a managed image, which can be created from:

Snapshots of an Azure VM's OS and data disks


A generalized Azure VM with managed disks
A generalized on-premises VHD uploaded to the cloud

7 Note

Batch only supports generalized Shared Images; a specialized Shared Image can't be used
to create a pool.

The following steps show how to prepare a VM, take a snapshot, and create an image from the
snapshot.

Prepare a VM
If you're creating a new VM for the image, use an Azure Marketplace image supported by Batch
as the base image for your managed image.

To get a full list of current Azure Marketplace image references supported by Azure Batch, use
one of the following APIs to return a list of Windows and Linux VM images:

PowerShell: Azure Batch supported images


Azure CLI: Azure Batch pool supported images
Batch service APIs: Batch service APIs and Azure Batch service supported images

Follow these guidelines when creating VMs:

Ensure the VM is created with a managed disk. This is the default storage setting when
you create a VM.
Don't install Azure extensions, such as the Custom Script extension, on the VM. If the
image contains a pre-installed extension, Azure may encounter problems when deploying
the Batch pool.
When using attached data disks, you need to mount and format the disks from within a
VM to use them.
Ensure that the base OS image you provide uses the default temp drive. The Batch node
agent currently expects the default temp drive.
Ensure that the OS disk isn't encrypted.
Once the VM is running, connect to it via RDP (for Windows) or SSH (for Linux). Install any
necessary software or copy desired data.
For faster pool provisioning, use the ReadWrite disk cache setting for the VM's OS disk.

Create an Azure Compute Gallery


You need to create an Azure Compute Gallery to make your custom image available. Select this
gallery when creating the image in the following steps. To learn how to create an Azure
Compute Gallery for your images, see Create an Azure Compute Gallery.

Create an image
To create an image from a VM in the portal, see Capture an image of a VM.

To create an image using a source other than a VM, see Create an image.

7 Note

If the base image has purchase plan information, ensure that the gallery image has the
same purchase plan information as the base image. For more information on creating an
image that has a purchase plan, refer to Supply Azure Marketplace purchase plan
information when creating images.

If the base image does not have purchase plan information, avoid specifying any purchase
plan information for the gallery image.

For the purchase plan information about these Marketplace images, see the guidance for
Linux or Windows VMs.

Use Azure PowerShell Get-AzGalleryImageDefinition or Azure CLI az sig image-definition


show to check whether the gallery image has correct plan information.

Create a pool from a Shared Image using the Azure CLI
To create a pool from your Shared Image using the Azure CLI, use the az batch pool create
command. Specify the Shared Image ID in the --image field. Make sure the OS type and SKU
match the versions specified by --node-agent-sku-id.

Important

The node agent SKU id must align with the publisher/offer/SKU in order for the node to
start.

Azure CLI

az batch pool create \
    --id mypool --vm-size Standard_A1_v2 \
    --target-dedicated-nodes 2 \
    --image "/subscriptions/{sub id}/resourceGroups/{resource group name}/providers/Microsoft.Compute/galleries/{gallery name}/images/{image definition name}/versions/{version id}" \
    --node-agent-sku-id "{node agent SKU id}"

Create a pool from a Shared Image using C#


Alternatively, you can create a pool from a Shared Image using the C# SDK.

C#

private static VirtualMachineConfiguration CreateVirtualMachineConfiguration(ImageReference imageReference)
{
    // Replace the placeholder with the node agent SKU ID that matches the image's OS.
    return new VirtualMachineConfiguration(
        imageReference: imageReference,
        nodeAgentSkuId: "{node agent SKU id}");
}

private static ImageReference CreateImageReference()
{
    // Resource ID of the Azure Compute Gallery image version.
    return new ImageReference(
        virtualMachineImageId: "/subscriptions/{sub id}/resourceGroups/{resource group name}/providers/Microsoft.Compute/galleries/{gallery name}/images/{image definition name}/versions/{version id}");
}

private static void CreateBatchPool(BatchClient batchClient, VirtualMachineConfiguration vmConfiguration)
{
    try
    {
        CloudPool pool = batchClient.PoolOperations.CreatePool(
            poolId: PoolId,
            targetDedicatedComputeNodes: PoolNodeCount,
            virtualMachineSize: PoolVMSize,
            virtualMachineConfiguration: vmConfiguration);

        pool.Commit();
    }
    ...
}

Create a pool from a Shared Image using Python


You can also create a pool from a Shared Image by using the Python SDK:

Python

# Import the required modules from the
# Azure Batch Client Library for Python
import azure.batch as batch
import azure.batch.models as batchmodels
from azure.common.credentials import ServicePrincipalCredentials

# Specify Batch account and service principal account credentials
account = "{batch-account-name}"
batch_url = "{batch-account-url}"
ad_client_id = "{sp-client-id}"
ad_tenant = "{tenant-id}"
ad_secret = "{sp-secret}"

# Pool settings
pool_id = "LinuxNodesSamplePoolPython"
vm_size = "STANDARD_D2_V3"
node_count = 1

# Initialize the Batch client with Azure AD authentication
creds = ServicePrincipalCredentials(
    client_id=ad_client_id,
    secret=ad_secret,
    tenant=ad_tenant,
    resource="https://batch.core.windows.net/"
)
client = batch.BatchServiceClient(creds, batch_url)

# Configure the start task for the pool
start_task = batchmodels.StartTask(
    command_line="printenv AZ_BATCH_NODE_STARTUP_DIR"
)
start_task.run_elevated = True

# Create an ImageReference which specifies the image from
# Azure Compute Gallery to install on the nodes.
ir = batchmodels.ImageReference(
    virtual_machine_image_id="/subscriptions/{sub id}/resourceGroups/{resource group name}/providers/Microsoft.Compute/galleries/{gallery name}/images/{image definition name}/versions/{version id}"
)

# Create the VirtualMachineConfiguration, specifying
# the VM image reference and the Batch node agent to
# be installed on the node.
vmc = batchmodels.VirtualMachineConfiguration(
    image_reference=ir,
    node_agent_sku_id="{node agent SKU id}"
)

# Create the unbound pool
new_pool = batchmodels.PoolAddParameter(
    id=pool_id,
    vm_size=vm_size,
    target_dedicated_nodes=node_count,
    virtual_machine_configuration=vmc,
    start_task=start_task
)

# Create pool in the Batch service
client.pool.add(new_pool)

Create a pool from a Shared Image or Custom Image using the Azure portal
Use the following steps to create a pool from a Shared Image in the Azure portal.

1. Open the Azure portal.
2. Go to Batch accounts and select your account.
3. Select Pools and then Add to create a new pool.
4. In the Image Type section, select Azure Compute Gallery.
5. Complete the remaining sections with information about your managed image.
6. Select OK.
7. Once the node is allocated, log in to it to verify the image: use Connect to generate a user and
the RDP file for Windows, or use SSH for Linux.
Considerations for large pools
If you plan to create a pool with hundreds or thousands of VMs or more using a Shared Image,
use the following guidance.

Azure Compute Gallery replica numbers. For every pool with up to 300 instances, we
recommend you keep at least one replica. For example, if you're creating a pool with
3,000 VMs, you should keep at least 10 replicas of your image. We always suggest
keeping more replicas than minimum requirements for better performance.

Resize timeout. If your pool contains a fixed number of nodes (if it doesn't autoscale),
increase the resizeTimeout property of the pool depending on the pool size. For every
1,000 VMs, the recommended resize timeout is at least 15 minutes. For example, the
recommended resize timeout for a pool with 2,000 VMs is at least 30 minutes.
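
The two rules of thumb above reduce to simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical (they aren't part of any Batch SDK), and it assumes the guidance scales linearly, rounding up to whole replicas and to whole 15-minute steps. The last helper formats the result as an ISO 8601 duration such as PT15M, the form that the REST examples in this documentation use for resizeTimeout.

```python
VMS_PER_REPLICA = 300        # guidance: at least one replica per 300 instances
MINUTES_PER_1000_VMS = 15    # guidance: at least 15 minutes per 1,000 VMs


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division that avoids floating-point rounding."""
    return -(-numerator // denominator)


def min_replica_count(vm_count: int) -> int:
    """Minimum recommended number of gallery image replicas for a pool."""
    return max(1, ceil_div(vm_count, VMS_PER_REPLICA))


def min_resize_timeout_minutes(vm_count: int) -> int:
    """Minimum recommended resize timeout for a fixed-size pool.

    Assumption: partial thousands are rounded up to the next 15-minute step.
    """
    return max(1, ceil_div(vm_count, 1000)) * MINUTES_PER_1000_VMS


def as_iso8601_minutes(minutes: int) -> str:
    """Format a whole number of minutes as an ISO 8601 duration."""
    return f"PT{minutes}M"


replicas = min_replica_count(3000)
timeout = as_iso8601_minutes(min_resize_timeout_minutes(2000))
```

These are minimums; as noted above, keeping more replicas than the minimum generally gives better performance.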
Next steps
For an in-depth overview of Batch, see Batch service workflow and resources.
Learn about the Azure Compute Gallery.
Use a managed image to create a custom image pool
Article • 03/19/2024

To create a custom image pool for your Batch pool's virtual machines (VMs), you can use
a managed image to create an Azure Compute Gallery image. Using just a managed
image is also supported, but only for API versions up to and including 2019-08-01.

Warning

Support for creating a Batch pool using a managed image is being retired after 31
March 2026. Please migrate to hosting custom images in Azure Compute Gallery to
use for creating a custom image pool in Batch. For more information, see the
migration guide.

This topic explains how to create a custom image pool using only a managed image.

Prerequisites
A managed image resource. To create a pool of virtual machines using a custom
image, you need to have or create a managed image resource in the same Azure
subscription and region as the Batch account. The image should be created from
snapshots of the VM's operating system's (OS) disk and optionally its attached
data disks.
Use a unique custom image for each pool you create.
To create a pool with the image using the Batch APIs, specify the resource ID of
the image, which is of the form
/subscriptions/xxxx-xxxxxx-xxxxx-xxxxxx/resourceGroups/myResourceGroup/providers/Microsoft.Compute/images/myImage.

The managed image resource should exist for the lifetime of the pool to allow
scale-up and can be removed after the pool is deleted.

Microsoft Entra authentication. The Batch client API must use Microsoft Entra
authentication. Azure Batch support for Microsoft Entra ID is documented in
Authenticate Batch service solutions with Active Directory.

Prepare a managed image


In Azure, you can prepare a managed image from:

Snapshots of an Azure VM's OS and data disks
A generalized Azure VM with managed disks
A generalized on-premises VHD uploaded to the cloud

To scale Batch pools reliably with a managed image, we recommend creating the
managed image using only the first method: using snapshots of the VM's disks. The
following steps show how to prepare a VM, take a snapshot, and create a managed
image from the snapshot.

Prepare a VM
If you're creating a new VM for the image, use a first-party Azure Marketplace image
supported by Batch as the base image for your managed image. Only first-party images
can be used as a base image. To get a full list of Azure Marketplace image references
supported by Azure Batch, see List Supported Images.

Note

You can't use a third-party image that has additional license and purchase terms as
your base image. For information about these Marketplace images, see the
guidance for Linux or Windows VMs.

To use a third-party image, you can use the Azure Compute Gallery. Please refer to
Use the Azure Compute Gallery to create a custom image pool for more
information.

Ensure the VM is created with a managed disk. This is the default storage setting
when you create a VM.
Don't install Azure extensions, such as the Custom Script extension, on the VM. If
the image contains a preinstalled extension, Azure may encounter problems when
deploying the Batch pool.
When using attached data disks, you need to mount and format the disks from
within a VM to use them.
Ensure that the base OS image you provide uses the default temp drive. The Batch
node agent currently expects the default temp drive.
Ensure that the OS disk isn't encrypted.
Once the VM is running, connect to it via RDP (for Windows) or SSH (for Linux).
Install any necessary software or copy desired data.
Create a VM snapshot
A snapshot is a full, read-only copy of a VHD. To create a snapshot of a VM's OS or data
disks, you can use the Azure portal or command-line tools. For steps and options to
create a snapshot, see the guidance for VMs.

Create an image from one or more snapshots


To create a managed image from a snapshot, use Azure command-line tools such as the
az image create command. You can create an image by specifying an OS disk snapshot
and optionally one or more data disk snapshots.

Create a pool from a managed image


Once you have found the resource ID of your managed image, create a custom image
pool from that image. The following steps show you how to create a custom image pool
using either Batch Service or Batch Management.

Note

Make sure that the identity you use for Microsoft Entra authentication has
permissions to the image resource. See Authenticate Batch service solutions with
Active Directory.

The resource for the managed image must exist for the lifetime of the pool. If the
underlying resource is deleted, the pool cannot be scaled.

Batch Service .NET SDK


C#

private static VirtualMachineConfiguration CreateVirtualMachineConfiguration(ImageReference imageReference)
{
    return new VirtualMachineConfiguration(
        imageReference: imageReference,
        nodeAgentSkuId: "batch.node.windows amd64");
}

private static ImageReference CreateImageReference()
{
    // Resource ID of the managed image.
    return new ImageReference(
        virtualMachineImageId: "/subscriptions/{sub id}/resourceGroups/{resource group name}/providers/Microsoft.Compute/images/{image definition name}");
}

private static void CreateBatchPool(BatchClient batchClient, VirtualMachineConfiguration vmConfiguration)
{
    try
    {
        CloudPool pool = batchClient.PoolOperations.CreatePool(
            poolId: PoolId,
            targetDedicatedComputeNodes: PoolNodeCount,
            virtualMachineSize: PoolVMSize,
            virtualMachineConfiguration: vmConfiguration);

        pool.Commit();
    }
    ...
}

Batch Management REST API


REST API URI

HTTP

PUT https://management.azure.com/subscriptions/{sub id}/resourceGroups/{resource group name}/providers/Microsoft.Batch/batchAccounts/{account name}/pools/{pool name}?api-version=2020-03-01

Request Body

JSON

{
  "properties": {
    "vmSize": "{VM size}",
    "deploymentConfiguration": {
      "virtualMachineConfiguration": {
        "imageReference": {
          "id": "/subscriptions/{sub id}/resourceGroups/{resource group name}/providers/Microsoft.Compute/images/{image name}"
        },
        "nodeAgentSkuId": "{Node Agent SKU ID}"
      }
    }
  }
}
Considerations for large pools
If you plan to create a pool with hundreds of VMs or more using a custom image, it's
important to follow the preceding guidance to use an image created from a VM
snapshot.

Also note the following considerations:

Size limits - Batch limits the pool size to 2500 dedicated compute nodes, or 1000
Spot nodes, when you use a custom image.

If you use the same image (or multiple images based on the same underlying
snapshot) to create multiple pools, the total compute nodes in the pools can't
exceed the preceding limits. We don't recommend using an image or its
underlying snapshot for more than a single pool.

Limits may be reduced if you configure the pool with inbound NAT pools.

Resize timeout - If your pool contains a fixed number of nodes (doesn't autoscale),
increase the resizeTimeout property of the pool to a value such as 20-30 minutes.
If your pool doesn't reach its target size within the timeout period, perform
another resize operation.

If you plan a pool with more than 300 compute nodes, you might need to resize
the pool multiple times to reach the target size.

By using the Azure Compute Gallery, you can create larger pools with your customized
images, use more Shared Image replicas, and gain performance benefits such as
decreased time for nodes to become ready.

Considerations for using Packer


The steps to create a managed image resource vary depending on your pool allocation
mode (user subscription or Batch service). Creating a managed image resource directly
with Packer can only be done with user subscription mode Batch accounts. For Batch
service mode accounts, you need to create a VHD first, then import the VHD to a
managed image resource.

Ensure that the resource used to create the managed image exists for the lifetimes of
any pool referencing the custom image. Failure to do so can result in pool allocation
failures and/or resize failures.
If the image or the underlying resource is removed, you may get an error similar to:
There was an error encountered while performing the last resize on the pool. Please
try resizing the pool again. Code: AllocationFailed. If you get this error, ensure that
the underlying resource hasn't been removed.

For more information on using Packer to create a VM, see Build a Linux image with
Packer or Build a Windows image with Packer.

Next steps
Learn how to use the Azure Compute Gallery to create a custom pool.
For an in-depth overview of Batch, see Batch service workflow and resources.
Create an Azure Batch pool across Availability Zones
Article • 08/12/2024

Azure regions which support Availability Zones have a minimum of three separate
zones, each with their own independent power source, network, and cooling system.
When you create an Azure Batch pool using Virtual Machine Configuration, you can
choose to provision your Batch pool across Availability Zones. Creating your pool with
this zonal policy helps protect your Batch compute nodes from Azure datacenter-level
failures.

For example, you could create your pool with zonal policy in an Azure region which
supports three Availability Zones. If an Azure datacenter in one Availability Zone has an
infrastructure failure, your Batch pool will still have healthy nodes in the other two
Availability Zones, so the pool will remain available for task scheduling.

Regional support and other requirements


Batch maintains parity with Azure on supporting Availability Zones. To use the zonal
option, your pool must be created in a supported Azure region.

In order for your Batch pool to be allocated across availability zones, the Azure region in
which the pool is created must support the requested VM SKU in more than one zone.
You can validate this by calling the Resource Skus List API and checking the locationInfo
field of resourceSku. Be sure that more than one zone is supported for the requested
VM SKU.

For user subscription mode Batch accounts, make sure that the subscription in which
you're creating your pool doesn't have a zone offer restriction on the requested VM
SKU. To confirm this, call the Resource Skus List API and check the
ResourceSkuRestrictions. If a zone restriction exists, you can submit a support ticket to
remove the zone restriction.
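
Both checks read fields of the same Resource Skus List response, so they can be done in one pass. The sketch below is a hedged illustration rather than an official client: it works on an already-parsed JSON response instead of calling the API, the helper name usable_zones is hypothetical, and the sample payload is invented for demonstration. The field names (locationInfo, zones, restrictions, restrictionInfo) are assumed to match the Resource SKUs response shape; verify them against the API reference before relying on this.

```python
def usable_zones(skus, sku_name, location):
    """Return the zones in `location` where `sku_name` is offered and not restricted."""
    offered, restricted = set(), set()
    wanted_name = sku_name.lower()
    wanted_location = location.lower()
    for sku in skus:
        if sku.get("resourceType") != "virtualMachines":
            continue
        if sku.get("name", "").lower() != wanted_name:
            continue
        # Zones in which the SKU is offered for the requested region.
        for info in sku.get("locationInfo", []):
            if info.get("location", "").lower() == wanted_location:
                offered.update(info.get("zones", []))
        # Zones that the subscription is restricted from using.
        for restriction in sku.get("restrictions", []):
            info = restriction.get("restrictionInfo", {})
            locations = [loc.lower() for loc in info.get("locations", [])]
            if restriction.get("type") == "Zone" and wanted_location in locations:
                restricted.update(info.get("zones", []))
    return sorted(offered - restricted)


# Invented sample data, for illustration only.
sample_skus = [
    {
        "resourceType": "virtualMachines",
        "name": "Standard_DS1_v2",
        "locationInfo": [{"location": "westus2", "zones": ["1", "2", "3"]}],
        "restrictions": [
            {
                "type": "Zone",
                "restrictionInfo": {"locations": ["westus2"], "zones": ["3"]},
            }
        ],
    }
]

zones = usable_zones(sample_skus, "Standard_DS1_v2", "westus2")
supports_zonal_policy = len(zones) > 1  # more than one usable zone is required
```

With the invented sample, zone 3 is excluded by the restriction, which still leaves more than one zone for a zonal pool.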

Also note that you can't create a pool with a zonal policy if it has inter-node
communication enabled and uses a VM SKU that supports InfiniBand.

Create a Batch pool across Availability Zones


The following examples show how to create a Batch pool across Availability Zones.
Note

When creating your pool with a zonal policy, the Batch service will try to allocate
your pool across all Availability Zones in the selected region; you can't specify a
particular allocation across the zones.

Batch Management Client .NET SDK


C#

var credential = new DefaultAzureCredential();
ArmClient _armClient = new ArmClient(credential);

var batchAccountIdentifier = ResourceIdentifier.Parse("your-batch-account-resource-id");
BatchAccountResource batchAccount = _armClient.GetBatchAccountResource(batchAccountIdentifier);

var poolName = "pool2";
var imageReference = new BatchImageReference()
{
    Publisher = "canonical",
    Offer = "0001-com-ubuntu-server-jammy",
    Sku = "22_04-lts",
    Version = "latest"
};
string nodeAgentSku = "batch.node.ubuntu 22.04";

var batchAccountPoolData = new BatchAccountPoolData()
{
    VmSize = "Standard_DS1_v2",
    DeploymentConfiguration = new BatchDeploymentConfiguration()
    {
        VmConfiguration = new BatchVmConfiguration(imageReference, nodeAgentSku)
        {
            // Spread the pool's nodes across Availability Zones.
            NodePlacementPolicy = BatchNodePlacementPolicyType.Zonal,
        },
    },
    ScaleSettings = new BatchAccountPoolScaleSettings()
    {
        FixedScale = new BatchAccountFixedScaleSettings()
        {
            TargetDedicatedNodes = 5,
            ResizeTimeout = TimeSpan.FromMinutes(15),
        }
    },
};

ArmOperation<BatchAccountPoolResource> armOperation = batchAccount.GetBatchAccountPools().CreateOrUpdate(
    WaitUntil.Completed, poolName, batchAccountPoolData);
BatchAccountPoolResource pool = armOperation.Value;

Batch REST API


REST API URL

POST {batchURL}/pools?api-version=2021-01-01.13.0
client-request-id: 00000000-0000-0000-0000-000000000000

Request body

"pool": {
  "id": "pool2",
  "vmSize": "standard_a1",
  "virtualMachineConfiguration": {
    "imageReference": {
      "publisher": "Canonical",
      "offer": "UbuntuServer",
      "sku": "20.04-lts"
    },
    "nodePlacementConfiguration": {
      "policy": "Zonal"
    },
    "nodeAgentSKUId": "batch.node.ubuntu 20.04"
  },
  "resizeTimeout": "PT15M",
  "targetDedicatedNodes": 5,
  "targetLowPriorityNodes": 0,
  "maxTasksPerNode": 3,
  "enableAutoScale": false,
  "enableInterNodeCommunication": false
}

Next steps
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
Learn about creating a pool in a subnet of an Azure virtual network.
Learn about creating an Azure Batch pool without public IP addresses.



Create a pool with disk encryption enabled
07/01/2025

When you create an Azure Batch pool using Virtual Machine Configuration, you can encrypt
compute nodes in the pool with a platform-managed key by specifying the disk encryption
configuration.

This article explains how to create a Batch pool with disk encryption enabled.

Why use a pool with disk encryption configuration?
With a Batch pool, you can access and store data on the OS and temporary disks of the
compute node. Encrypting the server-side disk with a platform-managed key will safeguard this
data with low overhead and convenience.

Batch will apply one of these disk encryption technologies on compute nodes, based on pool
configuration and regional supportability.

Managed disk encryption at rest with platform-managed keys
Encryption at host using a platform-managed key
Azure Disk Encryption

You won't be able to specify which encryption method will be applied to the nodes in your
pool. Instead, you provide the target disks you want to encrypt on their nodes, and Batch can
choose the appropriate encryption method, ensuring the specified disks are encrypted on the
compute node. The following image depicts how Batch makes that choice.

Important

If you are creating your pool with a Linux custom image, you can enable disk
encryption only if your pool is using an Encryption At Host Supported VM size.
Encryption At Host is not currently supported on User Subscription Pools until the feature
becomes publicly available in Azure.
Some disk encryption configurations require that the VM family of the pool supports
encryption at host. See End-to-end encryption using encryption at host to determine which VM
families support encryption at host.

Azure portal
When creating a Batch pool in the Azure portal, select either OsDisk, TemporaryDisk or
OsAndTemporaryDisk under Disk Encryption Configuration.

After the pool is created, you can see the disk encryption configuration targets in the pool's
Properties section.
Examples
The following examples show how to encrypt the OS and temporary disks on a Batch pool
using the Batch .NET SDK, the Batch REST API, and the Azure CLI.

Batch .NET SDK


C#

pool.VirtualMachineConfiguration.DiskEncryptionConfiguration = new
DiskEncryptionConfiguration(
targets: new List<DiskEncryptionTarget> { DiskEncryptionTarget.OsDisk,
DiskEncryptionTarget.TemporaryDisk }
);

Batch REST API


REST API URL:

POST {batchURL}/pools?api-version=2020-03-01.11.0
client-request-id: 00000000-0000-0000-0000-000000000000

Request body:

"pool": {
  "id": "pool2",
  "vmSize": "standard_a1",
  "virtualMachineConfiguration": {
    "imageReference": {
      "publisher": "Canonical",
      "offer": "UbuntuServer",
      "sku": "22.04-LTS"
    },
    "diskEncryptionConfiguration": {
      "targets": [
        "OsDisk",
        "TemporaryDisk"
      ]
    },
    "nodeAgentSKUId": "batch.node.ubuntu 22.04"
  },
  "resizeTimeout": "PT15M",
  "targetDedicatedNodes": 5,
  "targetLowPriorityNodes": 0,
  "taskSlotsPerNode": 3,
  "enableAutoScale": false,
  "enableInterNodeCommunication": false
}

Azure CLI
Azure CLI

az batch pool create \
    --id diskencryptionPool \
    --vm-size Standard_DS1_V2 \
    --target-dedicated-nodes 2 \
    --image canonical:ubuntuserver:22.04-LTS \
    --node-agent-sku-id "batch.node.ubuntu 22.04" \
    --disk-encryption-targets OsDisk TemporaryDisk

Next steps
Learn more about server-side encryption of Azure Disk Storage.
For an in-depth overview of Batch, see Batch service workflow and resources.
Create an Azure Batch pool with specified public IP addresses
07/01/2025

In Azure Batch, you can create a Batch pool in a subnet of an Azure virtual network (VNet).
Virtual machines (VMs) in the Batch pool are accessible through public IP addresses that Batch
creates. These public IP addresses can change over the lifetime of the pool. If the IP addresses
aren't refreshed, your network settings might become outdated.

You can create a list of static public IP addresses to use with the VMs in your pool instead. In
some cases, you might need to control the list of public IP addresses to make sure they don't
change unexpectedly. For example, you might be working with an external service, such as a
database, which restricts access to specific IP addresses.

For information about creating pools without public IP addresses, read Create an Azure Batch
pool without public IP addresses.

Prerequisites
The Batch client API must use Microsoft Entra authentication to use a public IP address.
An Azure VNet from the same subscription where you're creating your pool and IP
addresses. You can only use Azure Resource Manager-based VNets. Verify that the VNet
meets all of the general VNet requirements.
At least one existing Azure public IP address. Follow the public IP address requirements to
create and configure the IP addresses.

Note

Batch automatically allocates additional networking resources in the resource group
containing the public IP addresses. For each 100 dedicated nodes, Batch generally
allocates one network security group (NSG) and one load balancer. These resources are
limited by the subscription's resource quotas. When using larger pools, you may need to
request a quota increase for one or more of these resources.

Public IP address requirements


Create one or more public IP addresses through one of these methods:

Use the Azure portal
Use the Azure Command-Line Interface (Azure CLI)
Use Azure PowerShell

Make sure your public IP addresses meet the following requirements:

Create the public IP addresses in the same subscription and region as the account for the
Batch pool.
Set the IP address assignment to Static.
Set the SKU to Standard.
Specify a DNS name.
Make sure no other resources use these public IP addresses, or the pool might experience
allocation failures. Only use these public IP addresses for the VM configuration pools.
Make sure that no security policies or resource locks restrict user access to the public IP
address.
Create enough public IP addresses for the pool to accommodate the number of target
VMs.
This number must equal at least the sum of
the targetDedicatedNodes and targetLowPriorityNodes properties of the pool.
If you don't create enough IP addresses, the pool partially allocates the compute
nodes, and a resize error happens.
Currently, Batch uses one public IP address for every 100 VMs.
Also create a buffer of public IP addresses. A buffer helps Batch with internal optimization
for scaling down. A buffer also allows quicker scaling up after an unsuccessful scale up or
scale down. We recommend adding one of the following amounts of buffer IP addresses;
choose whichever number is greater.
Add at least one more IP address.
Or, add approximately 10% of the number of total public IP addresses in the pool.
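
The sizing guidance above can be sketched as a small calculation. This is one reading of the requirements, not an official formula: it assumes the base count follows the stated ratio of one public IP address per 100 VMs, and that the buffer is the greater of one address or roughly 10% of that base count, rounded up. The function name is hypothetical.

```python
VMS_PER_PUBLIC_IP = 100  # "Batch uses one public IP address for every 100 VMs"


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division."""
    return -(-numerator // denominator)


def recommended_public_ip_count(target_dedicated_nodes: int,
                                target_low_priority_nodes: int = 0) -> int:
    """Estimate how many static public IP addresses to create for a pool."""
    total_nodes = target_dedicated_nodes + target_low_priority_nodes
    base = max(1, ceil_div(total_nodes, VMS_PER_PUBLIC_IP))
    # Buffer: at least one extra address, or about 10% of the base count,
    # whichever is greater.
    buffer = max(1, ceil_div(base, 10))
    return base + buffer


small_pool = recommended_public_ip_count(5)
large_pool = recommended_public_ip_count(2500, 500)
```

Because the list of addresses can't be changed after the pool is created, it's safer to size for the largest target the pool is expected to reach.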

Important

After you create the Batch pool, you can't add or change its list of public IP addresses. If
you want to change the list, you have to delete and recreate the pool.

Create a Batch pool with public IP addresses


The following example shows how to create a pool through the Azure Batch Service REST API
that uses public IP addresses.

REST API URI:

HTTP
POST {batchURL}/pools?api-version=2020-03-01.11.0
client-request-id: 00000000-0000-0000-0000-000000000000

Request body:

JSON

"pool": {
  "id": "pool2",
  "vmSize": "standard_a1",
  "virtualMachineConfiguration": {
    "imageReference": {
      "publisher": "Canonical",
      "offer": "UbuntuServer",
      "sku": "20.04-LTS"
    },
    "nodeAgentSKUId": "batch.node.ubuntu 20.04"
  },
  "networkConfiguration": {
    "subnetId": "/subscriptions/<subId>/resourceGroups/<rgId>/providers/Microsoft.Network/virtualNetworks/<vNetId>/subnets/<subnetId>",
    "publicIPAddressConfiguration": {
      "provision": "usermanaged",
      "ipAddressIds": [
        "/subscriptions/<subId>/resourceGroups/<rgId>/providers/Microsoft.Network/publicIPAddresses/<publicIpId>"
      ]
    }
  },
  "resizeTimeout": "PT15M",
  "targetDedicatedNodes": 5,
  "targetLowPriorityNodes": 0,
  "taskSlotsPerNode": 3,
  "taskSchedulingPolicy": {
    "nodeFillType": "spread"
  },
  "enableAutoScale": false,
  "enableInterNodeCommunication": true,
  "metadata": [
    {
      "name": "myproperty",
      "value": "myvalue"
    }
  ]
}

Next steps
Learn about the Batch service workflow and primary resources.
Create a pool in a subnet of an Azure virtual network.
Create a simplified node communication pool without public IP addresses
Article • 08/14/2023

Note

This replaces the previous preview version of Azure Batch pool without public IP
addresses. This new version requires using simplified compute node
communication.

Important

Support for pools without public IP addresses in Azure Batch is currently available
for select regions.

When you create an Azure Batch pool, you can provision the virtual machine (VM)
configuration pool without a public IP address. This article explains how to set up a
Batch pool without public IP addresses.

Why use a pool without public IP addresses?


By default, all the compute nodes in an Azure Batch VM configuration pool are assigned
a public IP address. This address is used by the Batch service to support outbound
access to the internet, as well as inbound access to compute nodes from the internet.

To restrict access to these nodes and reduce the discoverability of these nodes from the
internet, you can provision the pool without public IP addresses.

Prerequisites

Important

The prerequisites have changed from the previous preview version of this feature.
Make sure to review each item for changes before proceeding.

Use simplified compute node communication. For more information, see Use
simplified compute node communication.
The Batch client API must use Azure Active Directory (AD) authentication. Azure
Batch support for Azure AD is documented in Authenticate Batch service solutions
with Active Directory.

Create your pool in an Azure virtual network (VNet) that meets the following requirements
and configurations. To prepare a VNet with one or more subnets in advance, you can
use the Azure portal, Azure PowerShell, the Azure Command-Line Interface (Azure
CLI), or other methods.

The VNet must be in the same subscription and region as the Batch account you
use to create your pool.

The subnet specified for the pool must have enough unassigned IP addresses to
accommodate the number of VMs targeted for the pool; that is, the sum of the
targetDedicatedNodes and targetLowPriorityNodes properties of the pool. If the
subnet doesn't have enough unassigned IP addresses, the pool partially
allocates the compute nodes, and a resize error occurs.

If you plan to use a private endpoint, and your virtual network has private
endpoint network policy enabled, make sure that inbound connections over
TCP/443 to the subnet hosting the private endpoint are allowed from the Batch
pool's subnet.

Enable outbound access for Batch node management. A pool with no public IP
addresses doesn't have internet outbound access enabled by default. Choose one
of the following options to allow compute nodes to access the Batch node
management service (see Use simplified compute node communication):

Use a nodeManagement private endpoint with Batch accounts, which provides
private access to the Batch node management service from the virtual network. This
solution is the preferred method.

Alternatively, provide your own internet outbound access support (see
Outbound access to the internet).

Important

There are two sub-resources for private endpoints with Batch accounts. Please use
the nodeManagement private endpoint for the Batch pool without public IP
addresses. For more details please check Use private endpoints with Azure Batch
accounts.
Current limitations
1. Pools without public IP addresses must use Virtual Machine Configuration and not
Cloud Services Configuration.
2. Custom endpoint configuration for Batch compute nodes doesn't work with pools
without public IP addresses.
3. Because there are no public IP addresses, you can't use your own specified public
IP addresses with this type of pool.
4. The task authentication token for Batch tasks is not supported. The workaround is
to use a Batch pool with managed identities.

Create a pool without public IP addresses in the Azure portal
1. If needed, create a nodeManagement private endpoint for your Batch account in the
virtual network (see the outbound access requirement in prerequisites).
2. Navigate to your Batch account in the Azure portal.
3. In the Settings window on the left, select Pools.
4. In the Pools window, select Add.
5. On the Add Pool window, select the option you intend to use from the Image
Type dropdown.
6. Select the correct Publisher/Offer/Sku of your image.
7. Specify the remaining required settings, including the Node size, Target dedicated
nodes, and Target Spot/low-priority nodes.
8. For Node communication mode, select Simplified under Optional Settings.
9. Select a virtual network and subnet you wish to use. This virtual network must be
in the same location as the pool you're creating.
10. In IP address provisioning type, select NoPublicIPAddresses.

The following screenshot shows the elements that must be modified to create a
pool without public IP addresses.
Use the Batch REST API to create a pool without public IP addresses
The following example shows how to use the Batch Service REST API to create a pool
without public IP addresses.

REST API URI


HTTP

POST {batchURL}/pools?api-version=2022-10-01.16.0
client-request-id: 00000000-0000-0000-0000-000000000000

Request body
JSON

"pool": {
  "id": "pool-npip",
  "vmSize": "standard_d2s_v3",
  "virtualMachineConfiguration": {
    "imageReference": {
      "publisher": "Canonical",
      "offer": "0001-com-ubuntu-server-jammy",
      "sku": "22_04-lts"
    },
    "nodeAgentSKUId": "batch.node.ubuntu 22.04"
  },
  "networkConfiguration": {
    "subnetId": "/subscriptions/<your_subscription_id>/resourceGroups/<your_resource_group>/providers/Microsoft.Network/virtualNetworks/<your_vnet_name>/subnets/<your_subnet_name>",
    "publicIPAddressConfiguration": {
      "provision": "NoPublicIPAddresses"
    }
  },
  "resizeTimeout": "PT15M",
  "targetDedicatedNodes": 2,
  "targetLowPriorityNodes": 0,
  "taskSlotsPerNode": 1,
  "taskSchedulingPolicy": {
    "nodeFillType": "spread"
  },
  "enableAutoScale": false,
  "enableInterNodeCommunication": false,
  "targetNodeCommunicationMode": "simplified"
}
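
The request body above can also be assembled programmatically. The following Python sketch is only an illustration using the standard library: the helper names `subnet_resource_id` and `build_no_public_ip_pool` are hypothetical, and the field names and values are copied from the JSON example rather than taken from an SDK.

```python
import json


def subnet_resource_id(subscription_id, resource_group, vnet_name, subnet_name):
    """Build the subnet resource ID in the format shown in the JSON example."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet_name}"
        f"/subnets/{subnet_name}"
    )


def build_no_public_ip_pool(pool_id, vm_size, subnet_id, dedicated_nodes=2):
    """Return a pool definition (hypothetical helper) without public IP addresses."""
    return {
        "id": pool_id,
        "vmSize": vm_size,
        "virtualMachineConfiguration": {
            "imageReference": {
                "publisher": "Canonical",
                "offer": "0001-com-ubuntu-server-jammy",
                "sku": "22_04-lts",
            },
            "nodeAgentSKUId": "batch.node.ubuntu 22.04",
        },
        "networkConfiguration": {
            "subnetId": subnet_id,
            # The setting that removes public IP addresses from the nodes.
            "publicIPAddressConfiguration": {"provision": "NoPublicIPAddresses"},
        },
        "targetDedicatedNodes": dedicated_nodes,
        "targetLowPriorityNodes": 0,
        # Pools without public IP addresses use simplified node communication.
        "targetNodeCommunicationMode": "simplified",
    }


if __name__ == "__main__":
    subnet = subnet_resource_id(
        "<your_subscription_id>", "<your_resource_group>",
        "<your_vnet_name>", "<your_subnet_name>",
    )
    print(json.dumps(build_no_public_ip_pool("pool-npip", "standard_d2s_v3", subnet), indent=2))
```

Keeping the subnet ID construction in one helper avoids typos in the long resource path.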

Create a pool without public IP addresses using ARM template
You can use this Azure Quickstart Template to create a pool without public IP
addresses using an Azure Resource Manager (ARM) template.

The template deploys the following resources:

Azure Batch account with IP firewall configured to block public network access to
Batch node management endpoint
Virtual network with network security group to block internet outbound access
Private endpoint to access Batch node management endpoint of the account
DNS integration for the private endpoint using private DNS zone linked to the
virtual network
Batch pool deployed in the virtual network and without public IP addresses

If you're familiar with using ARM templates, select the Deploy to Azure button. The
template will open in the Azure portal.

Note

If the private endpoint deployment fails because of an invalid groupId
"nodeManagement", check whether the region is in the supported list for
Simplified compute node communication. Choose a supported region, and then retry
the deployment.

Outbound access to the internet


In a pool without public IP addresses, your virtual machines won't be able to access the
public internet unless you configure your network setup appropriately, such as by using
virtual network NAT. NAT only allows outbound access to the internet from the virtual
machines in the virtual network. Batch-created compute nodes won't be publicly
accessible, since they don't have public IP addresses associated.

Another way to provide outbound connectivity is to use a user-defined route (UDR). This
method lets you route traffic to a proxy machine that has public internet access, for
example Azure Firewall.

Important

There is no extra network resource (load balancer, network security group) created
for simplified node communication pools without public IP addresses. Since the
compute nodes in the pool are not bound to any load balancer, Azure may provide
Default Outbound Access. However, Default Outbound Access is not suitable for
production workloads, so it is strongly recommended to bring your own Internet
outbound access.

Troubleshooting

Unusable compute nodes in a Batch pool


If compute nodes enter the unusable state in a Batch pool without public IP addresses,
first verify outbound access to the Batch node management service. It must be
configured correctly so that compute nodes can connect to the service from your
virtual network.

Using nodeManagement private endpoint

If you created a nodeManagement private endpoint in the virtual network for your Batch
account:

Check if the private endpoint is created in the right virtual network, in provisioning
Succeeded state, and also in Approved status.
Check if the DNS configuration is set up correctly for the node management
endpoint of your Batch account:
If your private endpoint is created with automatic private DNS zone integration,
check the DNS A record is configured correctly in the private DNS zone
privatelink.batch.azure.com , and the zone is linked to your virtual network.

If you're using your own DNS solution, make sure the DNS record for your Batch
node management endpoint is configured correctly and points to the private
endpoint IP address.
Check the DNS resolution for Batch node management endpoint of your account.
You can confirm it by running nslookup <nodeManagementEndpoint> from within your
virtual network, and the DNS name should be resolved to the private endpoint IP
address.
If your virtual network has private endpoint network policy enabled, check NSG
and UDR for subnets of both the Batch pool and the private endpoint. The
inbound connection with TCP/443 to the subnet hosting the private endpoint must
be allowed from Batch pool's subnet.
From the Batch pool's subnet, run a TCP ping to the node management endpoint
using the default HTTPS port (443). This probe can tell you whether the private link
connection is working as expected.

# Windows (PowerShell)
Test-NetConnection -ComputerName <nodeManagementEndpoint> -Port 443
# Linux
nc -v <nodeManagementEndpoint> 443

If the TCP ping fails (for example, it times out), the problem is typically with the
private link connection, and you can raise an Azure support ticket for the private
endpoint resource. Otherwise, troubleshoot the unusable node as you would for a normal
Batch pool, and raise a support ticket for your Batch account if needed.
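
The DNS and TCP checks described above can also be scripted. The following Python sketch uses only the standard library and is meant to run on a machine inside the virtual network. The function names are hypothetical, and the host you pass in is your own node management endpoint.

```python
import ipaddress
import socket


def is_private_ip(address):
    """Return True when the address falls in a private range, as a private endpoint IP would."""
    return ipaddress.ip_address(address).is_private


def resolve_ips(hostname, port=443):
    """Return the sorted set of IP addresses that the hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def tcp_probe(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A typical use is to call `resolve_ips("<nodeManagementEndpoint>")`, confirm each result with `is_private_ip`, and then call `tcp_probe("<nodeManagementEndpoint>")`. A public address in the first step points at DNS configuration, while a failed probe points at the private link connection or network rules.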

Using your own internet outbound solution

If you're using your own internet outbound solution instead of a private endpoint, run
a TCP ping to the node management endpoint. If it fails, check whether your outbound
access is configured correctly by following the detailed requirements for simplified
compute node communication.

Connect to compute nodes


There's no internet inbound access to compute nodes in the Batch pool without public
IP addresses. To access your compute nodes for debugging, you'll need to connect from
within the virtual network:

Use a jumpbox machine inside the virtual network, then connect to your compute
nodes from there.
Or, try using other remote connection solutions like Azure Bastion:
Create a Bastion host in the virtual network with IP-based connection enabled.
Use Bastion to connect to the compute node using its IP address.

You can follow the guide Connect to compute nodes to get the user credentials and IP
address for the target compute node in your Batch pool.

Migration from previous preview version of No Public IP pools
For existing pools that use the previous preview version of Azure Batch No Public IP
pool, it's only possible to migrate pools created in a virtual network.

1. Create a private endpoint for Batch node management in the virtual network.
2. Update the pool's node communication mode to simplified.
3. Scale down the pool to zero nodes.
4. Scale out the pool again. The pool is then automatically migrated to the new
version.

Next steps
Learn how to use simplified compute node communication.
Learn more about creating pools in a virtual network.
Learn how to use private endpoints with Batch accounts.
Use ephemeral OS disk nodes for Azure
Batch pools
Article • 03/27/2025

Some Azure virtual machine (VM) series support the use of ephemeral OS disks, which
create the OS disk on the node virtual machine local storage. The default Batch pool
configuration uses Azure managed disks for the node OS disk, where the managed disk
is like a physical disk, but virtualized and persisted in remote Azure Storage.

For Batch workloads, the main benefits of using ephemeral OS disks are reduced costs
associated with pools, the potential for faster node start time, and improved application
performance due to better OS disk performance. When choosing whether ephemeral OS
disks should be used for your workload, consider the following impacts:

There's lower read/write latency to ephemeral OS disks, which may lead to
improved application performance.
There's no storage cost for ephemeral OS disks, whereas there's a cost for each
managed OS disk.
Reimage for compute nodes is faster for ephemeral disks compared to managed
disks, when supported by Batch.
Node start time may be slightly faster when ephemeral OS disks are used.
Ephemeral OS disks aren't highly durable and available; when a VM is removed for
any reason, the OS disk is lost. Since Batch workloads are inherently stateless, and
don't normally rely on changes to the OS disk being persisted, ephemeral OS disks
are appropriate to use for most Batch workloads.
Ephemeral OS disks aren't currently supported by all Azure VM series. If a VM size
doesn't support an ephemeral OS disk, a managed OS disk must be used.

Note

Ephemeral OS disk configuration is only applicable to 'virtualMachineConfiguration'
pools, and isn't supported by 'cloudServiceConfiguration' pools. We recommend
using 'virtualMachineConfiguration' for your Batch pools, as
'cloudServiceConfiguration' pools do not support all features and no new
capabilities are planned.

VM series support
To determine whether a VM series supports ephemeral OS disks, check the
documentation for each VM instance. For example, the Ddv4 and Ddsv4-series support
ephemeral OS disks.

Alternatively, you can programmatically query the 'EphemeralOSDiskSupported'
capability. An example PowerShell cmdlet to query this capability is provided in the
ephemeral OS disk frequently asked questions.
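
As an illustration of the programmatic check, the following Python sketch filters SKU records for the 'EphemeralOSDiskSupported' capability. It assumes each record carries a `capabilities` list of name/value pairs, which is the shape returned by the Azure Resource SKUs API. The function name is hypothetical and the sample records are made up for the example.

```python
def supports_ephemeral_os_disk(sku):
    """Return True if a SKU record reports EphemeralOSDiskSupported as true."""
    for capability in sku.get("capabilities", []):
        if capability.get("name") == "EphemeralOSDiskSupported":
            # Capability values are reported as strings such as "True" or "False".
            return str(capability.get("value", "")).lower() == "true"
    return False


# Hypothetical sample records; real data would come from a SKU listing call.
sample_skus = [
    {"name": "Standard_D2ds_v4",
     "capabilities": [{"name": "EphemeralOSDiskSupported", "value": "True"}]},
    {"name": "Example_Size_A",
     "capabilities": [{"name": "EphemeralOSDiskSupported", "value": "False"}]},
    {"name": "Example_Size_B", "capabilities": []},
]

supported = [sku["name"] for sku in sample_skus if supports_ephemeral_os_disk(sku)]
```

A size with no capability entry is treated as unsupported, which matches the guidance to fall back to a managed OS disk.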

Create a pool that uses ephemeral OS disks


The EphemeralOSDiskSettings property isn't set by default. You must set this property
to configure ephemeral OS disk use on the pool nodes.

Tip

Ephemeral OS disks cannot be used in conjunction with Spot VMs in Batch pools
due to the service-managed eviction policy.

The following example shows how to create a Batch pool where the nodes use
ephemeral OS disks and not managed disks.

Code examples
This code snippet shows how to create a pool with ephemeral OS disks using the Azure
Batch Python SDK, with the ephemeral OS disk placed on the cache disk.

Python

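# Assumes "import azure.batch as batch" and that image_ref_to_use and
# node_sku_id were defined earlier.
virtual_machine_configuration = batch.models.VirtualMachineConfiguration(
    image_reference=image_ref_to_use,
    node_agent_sku_id=node_sku_id,
    os_disk=batch.models.OSDisk(
        # Place the ephemeral OS disk on the VM cache disk.
        ephemeral_os_disk_settings=batch.models.DiffDiskSettings(
            placement=batch.models.DiffDiskPlacement.cache_disk
        )
    )
)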

This is the same code snippet but for creating a pool with ephemeral OS disks using the
Azure Batch .NET SDK and C#.

C#
VirtualMachineConfiguration virtualMachineConfiguration = new VirtualMachineConfiguration(
    imageReference: imageReference,
    nodeAgentSkuId: nodeAgentSku
);
virtualMachineConfiguration.OSDisk = new OSDisk();
virtualMachineConfiguration.OSDisk.EphemeralOSDiskSettings = new DiffDiskSettings();
virtualMachineConfiguration.OSDisk.EphemeralOSDiskSettings.Placement = DiffDiskPlacement.CacheDisk;

Next steps
See the Ephemeral OS Disks FAQ.
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
Learn about costs that may be associated with Azure Batch workloads.



Create an Azure Batch pool with Automatic Operating System (OS) Upgrade
Article • 12/20/2024

When you create an Azure Batch pool, you can provision the pool with nodes that have
Auto OS Upgrade enabled. This article explains how to set up a Batch pool with Auto OS
Upgrade.

Why use Auto OS Upgrade?


Auto OS Upgrade is used to implement an automatic operating system upgrade
strategy and control within Azure Batch Pools. Here are some reasons for using Auto OS
Upgrade:

Security. Auto OS Upgrade ensures timely patching of vulnerabilities and security
issues within the operating system image, to enhance the security of compute
resources. It helps prevent potential security vulnerabilities from posing a threat to
applications and data.
Minimized Availability Disruption. Auto OS Upgrade is used to minimize the
availability disruption of compute nodes during OS upgrades. It is achieved
through task-scheduling-aware upgrade deferral and support for rolling upgrades,
ensuring that workloads experience minimal disruption.
Flexibility. Auto OS Upgrade allows you to configure your automatic operating
system upgrade strategy, including percentage-based upgrade coordination and
rollback support. It means you can customize your upgrade strategy to meet your
specific performance and availability requirements.
Control. Auto OS Upgrade provides you with control over your operating system
upgrade strategy to ensure secure, workload-aware upgrade deployments. You can
tailor your policy configurations to meet the specific needs of your organization.

In summary, the use of Auto OS Upgrade helps improve security, minimize availability
disruptions, and provide both greater control and flexibility for your workloads.

How does Auto OS Upgrade work?


When upgrading images, VMs in an Azure Batch pool follow roughly the same workflow
as Virtual Machine Scale Sets. To learn more about the detailed steps involved in the
Auto OS Upgrade process for Virtual Machine Scale Sets, refer to the
Virtual Machine Scale Sets page.

However, if automaticOSUpgradePolicy.osRollingUpgradeDeferral is set to 'true' and an
upgrade becomes available while a Batch node is actively running tasks, the upgrade
is delayed until all tasks on the node have completed.
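
The deferral rule can be pictured as a small decision function. The Python sketch below only models the behavior described in this section. It isn't the Batch service's scheduling logic, and the function and parameter names are hypothetical.

```python
def should_upgrade_now(upgrade_available, running_tasks, deferral_enabled):
    """Model the documented deferral rule for a single node.

    With deferral enabled, a node that is running tasks waits until it is
    idle. Without deferral, the node upgrades as soon as a new OS version
    is available, even if tasks are running.
    """
    if not upgrade_available:
        return False
    if deferral_enabled and running_tasks > 0:
        return False
    return True
```

For example, `should_upgrade_now(True, 3, True)` is false because the node is busy and deferral is on, while the same call with `deferral_enabled=False` is true.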

Note

If a pool has osRollingUpgradeDeferral enabled, its nodes are displayed in the
upgradingos state during the upgrade process. The upgradingos state is shown only
when you use API version 2024-02-01 or later. If you use an older API version to
call GetTVM/ListTVM, the node is shown in the rebooting state while upgrading.

Supported OS images
Only certain OS platform images are currently supported for automatic upgrade. For
the detailed list of images, see the Virtual Machine Scale Sets page.

Requirements
The version property of the image must be set to latest.
For Batch Management API, use API version 2024-02-01 or higher. For Batch
Service API, use API version 2024-02-01.19.0 or higher.
Ensure that external resources specified in the pool are available and updated.
Examples include SAS URI for bootstrapping payload in VM extension properties,
payload in storage account, reference to secrets in the model, and more.
If you're using the
virtualMachineConfiguration.windowsConfiguration.enableAutomaticUpdates property,
it must be set to 'false' in the pool definition.
The enableAutomaticUpdates property enables in-VM patching where "Windows
Update" applies operating system patches without replacing the OS disk. With
automatic OS image upgrades enabled, an extra patching process through
Windows Update isn't required.

Additional requirements for custom images


When a new version of the image is published and replicated to the region of that
pool, the VMs will be upgraded to the latest version of the Azure Compute Gallery
image. If the new image isn't replicated to the region where the pool is deployed,
the VM instances won't be upgraded to the latest version. Regional image
replication allows you to control the rollout of the new image for your VMs.
The new image version shouldn't be excluded from the latest version for that
gallery image. Image versions excluded from the gallery image's latest version
won't be rolled out through automatic OS image upgrade.

Configure Auto OS Upgrade


If you intend to implement Auto OS Upgrades within a pool, it's essential to configure
the UpgradePolicy field during the pool creation process. To configure automatic OS
image upgrades, make sure that the
automaticOSUpgradePolicy.enableAutomaticOSUpgrade property is set to 'true' in the
pool definition.

Note

Upgrade Policy mode and Automatic OS Upgrade Policy are separate settings and
control different aspects of the scale set provisioned by Azure Batch. The Upgrade
Policy mode determines what happens to existing instances in the scale set.
The Automatic OS Upgrade Policy enableAutomaticOSUpgrade setting is specific to
the OS image: it tracks changes the image publisher has made and determines
what happens when there is an update to the image.

REST API
The following example describes how to create a pool with Auto OS Upgrade via REST
API:

HTTP

PUT https://management.azure.com/subscriptions/<subscriptionid>/resourceGroups/<resourcegroupName>/providers/Microsoft.Batch/batchAccounts/<batchaccountname>/pools/<poolname>?api-version=2024-02-01

Request Body

JSON

{
  "name": "test1",
  "type": "Microsoft.Batch/batchAccounts/pools",
  "parameters": {
    "properties": {
      "vmSize": "Standard_d4s_v3",
      "deploymentConfiguration": {
        "virtualMachineConfiguration": {
          "imageReference": {
            "publisher": "MicrosoftWindowsServer",
            "offer": "WindowsServer",
            "sku": "2019-datacenter-smalldisk",
            "version": "latest"
          },
          "nodePlacementConfiguration": {
            "policy": "Zonal"
          },
          "nodeAgentSKUId": "batch.node.windows amd64",
          "windowsConfiguration": {
            "enableAutomaticUpdates": false
          }
        }
      },
      "scaleSettings": {
        "fixedScale": {
          "targetDedicatedNodes": 2,
          "targetLowPriorityNodes": 0
        }
      },
      "upgradePolicy": {
        "mode": "Automatic",
        "automaticOSUpgradePolicy": {
          "disableAutomaticRollback": true,
          "enableAutomaticOSUpgrade": true,
          "useRollingUpgradePolicy": true,
          "osRollingUpgradeDeferral": true
        },
        "rollingUpgradePolicy": {
          "enableCrossZoneUpgrade": true,
          "maxBatchInstancePercent": 20,
          "maxUnhealthyInstancePercent": 20,
          "maxUnhealthyUpgradedInstancePercent": 20,
          "pauseTimeBetweenBatches": "PT0S",
          "prioritizeUnhealthyInstances": false,
          "rollbackFailedInstancesOnPolicyBreach": false
        }
      }
    }
  }
}
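
Before sending a request body like the one above, it can help to check it against the requirements listed earlier. The following Python sketch is a hypothetical pre-flight check, not part of any SDK. It takes the `properties` object and covers only the three settings named in this article.

```python
def check_auto_os_upgrade_requirements(properties):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    vm_config = (properties.get("deploymentConfiguration", {})
                 .get("virtualMachineConfiguration", {}))

    # The image version must track "latest" for automatic upgrades.
    if vm_config.get("imageReference", {}).get("version") != "latest":
        problems.append("imageReference.version must be 'latest'")

    # In-VM Windows Update patching must be off when the property is used.
    windows = vm_config.get("windowsConfiguration") or {}
    if windows.get("enableAutomaticUpdates", False) is not False:
        problems.append("windowsConfiguration.enableAutomaticUpdates must be false")

    auto_policy = (properties.get("upgradePolicy", {})
                   .get("automaticOSUpgradePolicy", {}))
    if auto_policy.get("enableAutomaticOSUpgrade") is not True:
        problems.append("automaticOSUpgradePolicy.enableAutomaticOSUpgrade must be true")

    return problems
```

With the example body loaded as `body`, the call would be `check_auto_os_upgrade_requirements(body["parameters"]["properties"])`.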

SDK (C#)
The following code snippet shows how to use the Batch .NET client library to create
a pool with Auto OS Upgrade in C#. For more details about Batch .NET, view the
reference documentation.

C#

// Requires the Azure.Identity and Azure.ResourceManager.Batch packages.
public async Task CreateUpgradePolicyPool()
{
    // Authenticate
    var clientId = Environment.GetEnvironmentVariable("CLIENT_ID");
    var clientSecret = Environment.GetEnvironmentVariable("CLIENT_SECRET");
    var tenantId = Environment.GetEnvironmentVariable("TENANT_ID");
    var subscriptionId = Environment.GetEnvironmentVariable("SUBSCRIPTION_ID");
    ClientSecretCredential credential = new ClientSecretCredential(tenantId, clientId, clientSecret);
    ArmClient client = new ArmClient(credential, subscriptionId);

    // Get an existing Batch account
    string resourceGroupName = "testrg";
    string accountName = "testaccount";
    ResourceIdentifier batchAccountResourceId =
        BatchAccountResource.CreateResourceIdentifier(subscriptionId, resourceGroupName, accountName);
    BatchAccountResource batchAccount = client.GetBatchAccountResource(batchAccountResourceId);

    // Get the pool collection of this Batch account
    BatchAccountPoolCollection collection = batchAccount.GetBatchAccountPools();

    // Define the pool
    string poolName = "testpool";
    BatchAccountPoolData data = new BatchAccountPoolData()
    {
        VmSize = "Standard_d4s_v3",
        DeploymentConfiguration = new BatchDeploymentConfiguration()
        {
            VmConfiguration = new BatchVmConfiguration(new BatchImageReference()
            {
                Publisher = "MicrosoftWindowsServer",
                Offer = "WindowsServer",
                Sku = "2019-datacenter-smalldisk",
                Version = "latest",
            },
            nodeAgentSkuId: "batch.node.windows amd64")
            {
                NodePlacementPolicy = BatchNodePlacementPolicyType.Zonal,
                IsAutomaticUpdateEnabled = false
            },
        },
        ScaleSettings = new BatchAccountPoolScaleSettings()
        {
            FixedScale = new BatchAccountFixedScaleSettings()
            {
                TargetDedicatedNodes = 2,
                TargetLowPriorityNodes = 0,
            },
        },
        UpgradePolicy = new UpgradePolicy()
        {
            Mode = UpgradeMode.Automatic,
            AutomaticOSUpgradePolicy = new AutomaticOSUpgradePolicy()
            {
                DisableAutomaticRollback = true,
                EnableAutomaticOSUpgrade = true,
                UseRollingUpgradePolicy = true,
                OSRollingUpgradeDeferral = true
            },
            RollingUpgradePolicy = new RollingUpgradePolicy()
            {
                EnableCrossZoneUpgrade = true,
                MaxBatchInstancePercent = 20,
                MaxUnhealthyInstancePercent = 20,
                MaxUnhealthyUpgradedInstancePercent = 20,
                PauseTimeBetweenBatches = "PT0S",
                PrioritizeUnhealthyInstances = false,
                RollbackFailedInstancesOnPolicyBreach = false,
            }
        }
    };

    // Create the pool and wait for the operation to complete
    ArmOperation<BatchAccountPoolResource> lro =
        await collection.CreateOrUpdateAsync(WaitUntil.Completed, poolName, data);
    BatchAccountPoolResource result = lro.Value;

    // The variable result is a resource; you could call other operations
    // on this instance as well. For this demo, we only read its data
    // and print the resource ID.
    BatchAccountPoolData resourceData = result.Data;
    Console.WriteLine($"Succeeded on id: {resourceData.Id}");
}

FAQs
Will my tasks be disrupted if I enable Auto OS Upgrade?

Tasks won't be disrupted when
automaticOSUpgradePolicy.osRollingUpgradeDeferral is set to 'true'. In that case,
the upgrade is postponed until the node becomes idle. Otherwise, the node
upgrades when it receives a new OS version, regardless of whether it's currently
running a task. We strongly advise enabling
automaticOSUpgradePolicy.osRollingUpgradeDeferral.

Next steps
Learn how to use a managed image to create a pool.
Learn how to use the Azure Compute Gallery to create a pool.



Use extensions with Batch pools
Article • 03/04/2025

Extensions are small applications that facilitate post-provisioning configuration and
setup on Batch compute nodes. You can select any of the extensions that are allowed by
Azure Batch and install them on the compute nodes as they're provisioned. After that,
the extension can perform its intended operation.

You can check the live status of the extensions you use and retrieve the information they
return in order to pursue any detection, correction, or diagnostics capabilities.

Prerequisites
Pools with extensions must use Virtual Machine Configuration.
The CustomScript extension type is reserved for the Azure Batch service and can't
be overridden.
Some extensions may need pool-level Managed Identity accessible in the context
of a compute node in order to function properly. See configuring managed
identities in Batch pools if applicable for the extensions.

Tip

Extensions cannot be added to an existing pool. Pools must be recreated to add,
remove, or update extensions.

Supported extensions
The following extensions can currently be installed when creating a Batch pool:

Azure Key Vault extension for Linux


Azure Key Vault extension for Windows
Azure Monitor Logs analytics and monitoring extension for Linux
Azure Monitor Logs analytics and monitoring extension for Windows
Azure Desired State Configuration (DSC) extension
Azure Diagnostics extension for Windows VMs
HPC GPU driver extension for Windows on AMD
HPC GPU driver extension for Windows on NVIDIA
HPC GPU driver extension for Linux on NVIDIA
Microsoft Antimalware extension for Windows
Azure Monitor agent for Linux
Azure Monitor agent for Windows
Application Health extension

You can request support for other publishers and/or extension types by opening a
support request.

Create a pool with extensions


The following example creates a Batch pool of Linux/Windows nodes that uses the
Azure Key Vault extension.

REST API URI

HTTP

PUT https://management.azure.com/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Batch/batchAccounts/<batchaccountName>/pools/<batchpoolName>?api-version=2021-01-01

Request Body for Linux node

JSON

{
  "name": "test1",
  "type": "Microsoft.Batch/batchAccounts/pools",
  "properties": {
    "vmSize": "STANDARD_DS2_V2",
    "taskSchedulingPolicy": {
      "nodeFillType": "Pack"
    },
    "deploymentConfiguration": {
      "virtualMachineConfiguration": {
        "imageReference": {
          "publisher": "microsoftcblmariner",
          "offer": "cbl-mariner",
          "sku": "cbl-mariner-2",
          "version": "latest"
        },
        "nodeAgentSkuId": "batch.node.mariner 2.0",
        "extensions": [
          {
            "name": "secretext",
            "type": "KeyVaultForLinux",
            "publisher": "Microsoft.Azure.KeyVault",
            "typeHandlerVersion": "3.0",
            "autoUpgradeMinorVersion": true,
            "settings": {
              "secretsManagementSettings": {
                "pollingIntervalInS": "300",
                "certificateStoreLocation": "/var/lib/waagent/Microsoft.Azure.KeyVault",
                "requireInitialSync": true,
                "observedCertificates": [
                  "https://testkvwestus2.vault.azure.net/secrets/authsecreat"
                ]
              },
              "authenticationSettings": {
                "msiEndpoint": "http://169.254.169.254/metadata/identity",
                "msiClientId": "885b1a3d-f13c-4030-afcf-9f05044d78dc"
              }
            },
            "protectedSettings": {}
          }
        ]
      }
    },
    "scaleSettings": {
      "fixedScale": {
        "targetDedicatedNodes": 1,
        "targetLowPriorityNodes": 0,
        "resizeTimeout": "PT15M"
      }
    }
  },
  "identity": {
    "type": "UserAssigned",
    "userAssignedIdentities": {
      "/subscriptions/aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e/resourceGroups/ACR/providers/Microsoft.ManagedIdentity/userAssignedIdentities/testumaforpools": {}
    }
  }
}

Request Body for Windows node

JSON

{
  "name": "test1",
  "type": "Microsoft.Batch/batchAccounts/pools",
  "properties": {
    "vmSize": "STANDARD_DS2_V2",
    "taskSchedulingPolicy": {
      "nodeFillType": "Pack"
    },
    "deploymentConfiguration": {
      "virtualMachineConfiguration": {
        "imageReference": {
          "publisher": "microsoftwindowsserver",
          "offer": "windowsserver",
          "sku": "2022-datacenter",
          "version": "latest"
        },
        "nodeAgentSkuId": "batch.node.windows amd64",
        "extensions": [
          {
            "name": "secretext",
            "type": "KeyVaultForWindows",
            "publisher": "Microsoft.Azure.KeyVault",
            "typeHandlerVersion": "3.0",
            "autoUpgradeMinorVersion": true,
            "settings": {
              "secretsManagementSettings": {
                "pollingIntervalInS": "300",
                "requireInitialSync": true,
                "observedCertificates": [
                  {
                    "url": "https://testkvwestus2.vault.azure.net/secrets/authsecreat",
                    "certificateStoreLocation": "LocalMachine",
                    "keyExportable": true
                  }
                ]
              },
              "authenticationSettings": {
                "msiEndpoint": "http://169.254.169.254/metadata/identity",
                "msiClientId": "885b1a3d-f13c-4030-afcf-9f05044d78dc"
              }
            },
            "protectedSettings": {}
          }
        ]
      }
    },
    "scaleSettings": {
      "fixedScale": {
        "targetDedicatedNodes": 1,
        "targetLowPriorityNodes": 0,
        "resizeTimeout": "PT15M"
      }
    }
  },
  "identity": {
    "type": "UserAssigned",
    "userAssignedIdentities": {
      "/subscriptions/aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e/resourceGroups/ACR/providers/Microsoft.ManagedIdentity/userAssignedIdentities/testumaforpools": {}
    }
  }
}
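
Both request bodies end with an identity block keyed by the full resource ID of the user-assigned managed identity. As a rough sketch, that block could be generated in Python as follows. The helper names are hypothetical, and the structure is copied from the JSON examples above.

```python
def user_assigned_identity_id(subscription_id, resource_group, identity_name):
    """Build the resource ID of a user-assigned managed identity."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identity_name}"
    )


def pool_identity_block(identity_ids):
    """Return the 'identity' section for a pool with one or more identities."""
    return {
        "type": "UserAssigned",
        # Each identity is a key that maps to an empty object.
        "userAssignedIdentities": {identity_id: {} for identity_id in identity_ids},
    }
```

Passing several IDs to `pool_identity_block` produces one key per identity in the same block.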

Get extension data from a pool


The following example retrieves data from the Azure Key Vault extension.

REST API URI

HTTP

GET https://<accountName>.<region>.batch.azure.com/pools/<poolName>/nodes/<tvmNodeName>/extensions/secretext?api-version=2010-01-01

Response Body

JSON

{
  "odata.metadata": "https://testwestus2batch.westus2.batch.azure.com/$metadata#extensions/@Element",
  "instanceView": {
    "name": "secretext",
    "statuses": [
      {
        "code": "ProvisioningState/succeeded",
        "level": 0,
        "displayStatus": "Provisioning succeeded",
        "message": "Successfully started Key Vault extension service. 2021-02-08T19:49:39Z"
      }
    ]
  },
  "vmExtension": {
    "name": "KVExtensions",
    "publisher": "Microsoft.Azure.KeyVault",
    "type": "KeyVaultForLinux",
    "typeHandlerVersion": "1.0",
    "autoUpgradeMinorVersion": true,
    "settings": "{\r\n \"secretsManagementSettings\": {\r\n \"pollingIntervalInS\": \"300\",\r\n \"certificateStoreLocation\": \"/var/lib/waagent/Microsoft.Azure.KeyVault\",\r\n \"requireInitialSync\": true,\r\n \"observedCertificates\": [\r\n \"https://testkvwestus2.vault.azure.net/secrets/testumi\"\r\n ]\r\n },\r\n \"authenticationSettings\": {\r\n \"msiEndpoint\": \"http://169.254.169.254/metadata/identity\",\r\n \"msiClientId\": \"885b1a3d-f13c-4030-afcf-922f05044d78dc\"\r\n }\r\n}"
  }
}
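
In the response above, the `settings` value is itself a JSON document encoded as a string, so it needs a second decoding step. The following Python sketch shows one way to summarize such a response. The function name and the keys of the returned summary are hypothetical.

```python
import json


def summarize_extension(response):
    """Extract the provisioning result and observed certificates from a response dict."""
    statuses = response.get("instanceView", {}).get("statuses", [])
    vm_extension = response.get("vmExtension", {})

    # "settings" arrives as a JSON string in the example, so decode it again.
    raw_settings = vm_extension.get("settings")
    if isinstance(raw_settings, str):
        settings = json.loads(raw_settings)
    else:
        settings = raw_settings or {}

    return {
        "type": vm_extension.get("type"),
        "succeeded": any(
            status.get("code") == "ProvisioningState/succeeded" for status in statuses
        ),
        "observed": settings.get("secretsManagementSettings", {})
                            .get("observedCertificates", []),
    }
```

Given the response body parsed with `json.loads`, the summary reports whether provisioning succeeded and which certificate URLs the extension is watching.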

Troubleshooting Key Vault Extension


If the Key Vault extension is configured incorrectly, the compute node might be in an
unusable state. To troubleshoot a Key Vault extension failure, temporarily set
requireInitialSync to false and redeploy your pool. The compute node then reaches the
idle state, and you can log in to it to check the Key Vault extension logs for errors
and fix the configuration issues. See the following Key Vault extension documentation
for more information:

Azure Key Vault extension for Linux
Azure Key Vault extension for Windows

Considerations for Application Health extension
The Batch Node Agent running on the node always starts an HTTP server that returns
the health status of the agent. This HTTP server listens on local IP address 127.0.0.1 and
port 29879. It always returns a 200 status, with a response body of either
healthy or unhealthy. Any other response (or lack thereof) is considered an "unknown"
status. This setup follows the guidelines for running an HTTP server that provides a
"Rich Health State", per the official "Application Health extension" documentation.

If you set up your own health server, ensure that the HTTP server listens on a
unique port. Your health server should query the Batch Node Agent server and combine
the result with your own health signal to generate a composite health result.
Otherwise, you might end up with a "healthy" node that doesn't have a properly
functioning Batch Agent.
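
A composite health check along these lines could be sketched in Python as follows. The address and port come from the text above. The request path `/`, the plain-text matching of the response body, and the function names are assumptions made for this example.

```python
import http.client
import urllib.request

# Address and port are from the text above; the "/" path is an assumption.
AGENT_HEALTH_URL = "http://127.0.0.1:29879/"


def read_agent_health(url=AGENT_HEALTH_URL, timeout=2.0):
    """Return 'healthy', 'unhealthy', or 'unknown' for the Batch Node Agent."""
    # Bypass any configured proxy because the agent listens on loopback.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace").lower()
    except (OSError, ValueError, http.client.HTTPException):
        return "unknown"
    # Check "unhealthy" first because it contains the word "healthy".
    if "unhealthy" in body:
        return "unhealthy"
    if "healthy" in body:
        return "healthy"
    return "unknown"


def composite_health(agent_state, app_is_healthy):
    """Combine the agent state with your own application health signal."""
    if not app_is_healthy or agent_state == "unhealthy":
        return "unhealthy"
    if agent_state == "healthy":
        return "healthy"
    return "unknown"
```

Your own health server, listening on a different port, would return the result of `composite_health(read_agent_health(), app_is_healthy)` so that a node is reported healthy only when both the agent and your application are healthy.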

Next steps
Learn about various ways to copy applications and data to pool nodes.
Learn more about working with nodes and pools.



Configure managed identities in Batch
pools
07/08/2025

Managed identities for Azure resources eliminate complicated identity and credential
management by providing an identity for the Azure resource in Microsoft Entra ID (formerly
Azure AD). This identity is used to obtain Microsoft Entra tokens to authenticate with target
resources in Azure.

When adding a User-Assigned Managed Identity to a Batch Pool, it is crucial to set the Identity
property in your configuration. This property links the managed identity to the pool, enabling it
to access Azure resources securely. Incorrect setting of the Identity property can result in
common errors, such as access issues or upload errors.

For more information on configuring managed identities in Azure Batch, please refer to the
Azure Batch Managed Identities documentation.

This topic explains how to enable user-assigned managed identities on Batch pools and how to
use managed identities within the nodes.

Important

Creating pools with managed identities can only be performed with the Batch
Management Plane APIs or SDKs using Entra authentication. It is not possible to create
pools with managed identities using the Batch Service APIs or SDKs. For more
information, see the overview documentation for Batch APIs and tools.

Create a user-assigned managed identity


First, create your user-assigned managed identity in the same tenant as your Batch account.
You can create the identity using the Azure portal, the Azure Command-Line Interface (Azure
CLI), PowerShell, Azure Resource Manager, or the Azure REST API. This managed identity
doesn't need to be in the same resource group or even in the same subscription.

Tip

A system-assigned managed identity created for a Batch account for customer data
encryption cannot be used as a user-assigned managed identity on a Batch pool as
described in this document. If you wish to use the same managed identity on both the
Batch account and Batch pool, then use a common user-assigned managed identity
instead.

Create a Batch pool with user-assigned managed identities
After you create one or more user-assigned managed identities, you can create a Batch pool
with that identity or those identities. You can:

Use the Azure portal to create the Batch pool
Use the Batch .NET management library to create the Batch pool

Warning

In-place updates of pool managed identities are not supported while the pool has active
nodes. Existing compute nodes will not be updated with changes. It is recommended to
scale the pool down to zero compute nodes before modifying the identity collection to
ensure all VMs have the same set of identities assigned.

Create Batch pool in Azure portal


To create a Batch pool with a user-assigned managed identity through the Azure portal:

1. Sign in to the Azure portal.
2. In the search bar, enter and select Batch accounts.
3. On the Batch accounts page, select the Batch account where you want to create a Batch
pool.
4. In the menu for the Batch account, under Features, select Pools.
5. In the Pools menu, select Add to add a new Batch pool.
6. For Pool ID, enter an identifier for your pool.
7. For Identity, change the setting to User assigned.
8. Under User assigned managed identity, select Add.
9. Select the user assigned managed identity or identities you want to use. Then, select Add.

Note

You can assign only one managed identity at a time at both the autostorage account
level and the Batch account level. However, at the pool level, you can use
multiple user-assigned managed identities.

10. Under Operating System, select the publisher, offer, and SKU to use.
11. Optionally, enable the managed identity in the container registry:
a. For Container configuration, change the setting to Custom. Then, select your custom
configuration.
b. For Start task, select Enabled. Then, select Resource files and add your storage
container information.
c. Enable Container settings.
d. Change Container registry to Custom.
e. For Identity reference, select the storage container.

Create Batch pool with .NET


To create a Batch pool with a user-assigned managed identity with the Batch .NET
management library, use the following example code:

C#

var credential = new DefaultAzureCredential();
ArmClient _armClient = new ArmClient(credential);

var batchAccountIdentifier = ResourceIdentifier.Parse("your-batch-account-resource-id");
BatchAccountResource batchAccount = _armClient.GetBatchAccountResource(batchAccountIdentifier);

var poolName = "HelloWorldPool";
var imageReference = new BatchImageReference()
{
    Publisher = "canonical",
    Offer = "0001-com-ubuntu-server-jammy",
    Sku = "22_04-lts",
    Version = "latest"
};
string nodeAgentSku = "batch.node.ubuntu 22.04";

var batchAccountPoolData = new BatchAccountPoolData()
{
    VmSize = "Standard_DS1_v2",
    DeploymentConfiguration = new BatchDeploymentConfiguration()
    {
        VmConfiguration = new BatchVmConfiguration(imageReference, nodeAgentSku)
    },
    ScaleSettings = new BatchAccountPoolScaleSettings()
    {
        FixedScale = new BatchAccountFixedScaleSettings()
        {
            TargetDedicatedNodes = 1
        }
    }
};

ArmOperation<BatchAccountPoolResource> armOperation =
    batchAccount.GetBatchAccountPools().CreateOrUpdate(
        WaitUntil.Completed, poolName, batchAccountPoolData);
BatchAccountPoolResource pool = armOperation.Value;

Note

To include the Identity property, use the following example code:

C#

var pool = batchClient.PoolOperations.CreatePool(
    poolId: "myPool",
    virtualMachineSize: "STANDARD_D2_V2",
    cloudServiceConfiguration: new CloudServiceConfiguration(osFamily: "4"),
    targetDedicatedNodes: 1,
    identity: new PoolIdentity(
        type: PoolIdentityType.UserAssigned,
        userAssignedIdentities: new Dictionary<string, UserAssignedIdentity>
        {
            { "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identity-name}", new UserAssignedIdentity() }
        }
    ));

Use user-assigned managed identities in Batch nodes

Many Azure Batch functions that access other Azure resources directly on the compute nodes,
such as Azure Storage or Azure Container Registry, support managed identities. For more
information on using managed identities with Azure Batch, see the following links:

Resource files
Output files
Azure Container Registry
Azure Blob container file system

You can also manually configure your tasks so that the managed identities can directly access
Azure resources that support managed identities.
Within the Batch nodes, you can get managed identity tokens and use them to authenticate
through Microsoft Entra authentication via the Azure Instance Metadata Service.

For Windows, the PowerShell script to get an access token to authenticate is:

PowerShell

$Response = Invoke-RestMethod -Uri 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource={Resource App Id Url}' -Method GET -Headers @{Metadata="true"}

For Linux, the Bash script is:

Bash

curl 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource={Resource App Id Url}' -H Metadata:true

For more information, see How to use managed identities for Azure resources on an Azure VM
to acquire an access token.
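
The same token request can be sketched in Python using only the standard library. This is a minimal illustration, not part of the Batch tooling: the function names are hypothetical, the resource URI is whatever service you want a token for, and the optional client_id query parameter is an assumption that applies when more than one user-assigned identity is attached to the pool.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and API version are taken from the PowerShell and Bash examples above.
IMDS_TOKEN_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_url(resource, client_id=None, api_version="2018-02-01"):
    """Return the IMDS token URL for the given resource URI (hypothetical helper)."""
    params = {"api-version": api_version, "resource": resource}
    if client_id:
        # Assumption: selects one identity when several are assigned to the pool.
        params["client_id"] = client_id
    return IMDS_TOKEN_ENDPOINT + "?" + urllib.parse.urlencode(params)


def get_access_token(resource, client_id=None, timeout=10):
    """Request a token from IMDS. Only reachable from an Azure VM or Batch node."""
    request = urllib.request.Request(
        build_token_url(resource, client_id), headers={"Metadata": "true"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["access_token"]
```

On a compute node, a task could call get_access_token("https://storage.azure.com/") and pass the result as a bearer token. Building the URL with urlencode avoids hand-escaping the resource URI.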

Next steps
Learn more about Managed identities for Azure resources.
Learn how to use customer-managed keys with user-managed identities.
Learn how to enable automatic certificate rotation in a Batch pool.

Note: The author created this article with assistance from AI. Learn more

Enable automatic certificate rotation in a Batch pool
Article • 04/16/2024

You can create a Batch pool with a certificate that can automatically be renewed. To do
so, your pool must be created with a user-assigned managed identity that has access to
the certificate in Azure Key Vault.

Create a user-assigned identity


First, create your user-assigned managed identity in the same tenant as your Batch
account. This managed identity doesn't need to be in the same resource group or even
in the same subscription.

Be sure to note the Client ID of the user-assigned managed identity. You need this value
later.

Create your certificate


Next, you need to create a certificate and add it to Azure Key Vault. If you haven't
already created a key vault, you need to do that first. For instructions, see Quickstart: Set
and retrieve a certificate from Azure Key Vault using the Azure portal.

When creating your certificate, be sure to set Lifetime Action Type to automatically
renew, and specify the number of days after which the certificate should renew.
After your certificate has been created, make note of its Secret Identifier. You need this
value later.
Add an access policy in Azure Key Vault
In your key vault, assign a Key Vault access policy that allows your user-assigned
managed identity to access secrets and certificates. For detailed instructions, see Assign
a Key Vault access policy using the Azure portal.

Create a Batch pool with a user-assigned managed identity

Create a Batch pool with your managed identity by using the Batch .NET management
library. For more information, see Configure managed identities in Batch pools.
Tip

Existing pools cannot be updated with the Key Vault VM extension. You will need to
recreate your pool.

The following example uses the Batch Management REST API to create a pool. Be sure
to use your certificate's Secret Identifier for observedCertificates and your managed
identity's Client ID for msiClientId, replacing the example data below.

REST API URI

HTTP

PUT https://management.azure.com/subscriptions/<subscriptionid>/resourceGroups/<resourcegroupName>/providers/Microsoft.Batch/batchAccounts/<batchaccountname>/pools/<poolname>?api-version=2021-01-01

Request Body for Linux node

JSON

{
  "name": "test2",
  "type": "Microsoft.Batch/batchAccounts/pools",
  "properties": {
    "vmSize": "STANDARD_DS2_V2",
    "taskSchedulingPolicy": {
      "nodeFillType": "Pack"
    },
    "deploymentConfiguration": {
      "virtualMachineConfiguration": {
        "imageReference": {
          "publisher": "canonical",
          "offer": "ubuntuserver",
          "sku": "20.04-lts",
          "version": "latest"
        },
        "nodeAgentSkuId": "batch.node.ubuntu 20.04",
        "extensions": [
          {
            "name": "KVExtensions",
            "type": "KeyVaultForLinux",
            "publisher": "Microsoft.Azure.KeyVault",
            "typeHandlerVersion": "3.0",
            "autoUpgradeMinorVersion": true,
            "settings": {
              "secretsManagementSettings": {
                "pollingIntervalInS": "300",
                "certificateStoreLocation": "/var/lib/waagent/Microsoft.Azure.KeyVault",
                "requireInitialSync": true,
                "observedCertificates": [
                  "https://testkvwestus2s.vault.azure.net/secrets/authcertforumatesting/8f5f3f491afd48cb99286ba2aacd39af"
                ]
              },
              "authenticationSettings": {
                "msiEndpoint": "http://169.254.169.254/metadata/identity",
                "msiClientId": "b9f6dd56-d2d6-4967-99d7-8062d56fd84c"
              }
            }
          }
        ]
      }
    },
    "scaleSettings": {
      "fixedScale": {
        "targetDedicatedNodes": 1,
        "resizeTimeout": "PT15M"
      }
    }
  },
  "identity": {
    "type": "UserAssigned",
    "userAssignedIdentities": {
      "/subscriptions/042998e4-36dc-4b7d-8ce3-a7a2c4877d33/resourceGroups/ACR/providers/Microsoft.ManagedIdentity/userAssignedIdentities/testumaforpools": {}
    }
  }
}

Request Body for Windows node

JSON

{
  "name": "test2",
  "type": "Microsoft.Batch/batchAccounts/pools",
  "properties": {
    "vmSize": "STANDARD_DS2_V2",
    "taskSchedulingPolicy": {
      "nodeFillType": "Pack"
    },
    "deploymentConfiguration": {
      "virtualMachineConfiguration": {
        "imageReference": {
          "publisher": "microsoftwindowsserver",
          "offer": "windowsserver",
          "sku": "2022-datacenter",
          "version": "latest"
        },
        "nodeAgentSkuId": "batch.node.windows amd64",
        "extensions": [
          {
            "name": "KVExtensions",
            "type": "KeyVaultForWindows",
            "publisher": "Microsoft.Azure.KeyVault",
            "typeHandlerVersion": "3.0",
            "autoUpgradeMinorVersion": true,
            "settings": {
              "secretsManagementSettings": {
                "pollingIntervalInS": "300",
                "requireInitialSync": true,
                "observedCertificates": [
                  {
                    "url": "https://testkvwestus2s.vault.azure.net/secrets/authcertforumatesting/8f5f3f491afd48cb99286ba2aacd39af",
                    "certificateStoreLocation": "LocalMachine",
                    "keyExportable": true
                  }
                ]
              },
              "authenticationSettings": {
                "msiEndpoint": "http://169.254.169.254/metadata/identity",
                "msiClientId": "b9f6dd56-d2d6-4967-99d7-8062d56fd84c"
              }
            }
          }
        ]
      }
    },
    "scaleSettings": {
      "fixedScale": {
        "targetDedicatedNodes": 1,
        "resizeTimeout": "PT15M"
      }
    }
  },
  "identity": {
    "type": "UserAssigned",
    "userAssignedIdentities": {
      "/subscriptions/042998e4-36dc-4b7d-8ce3-a7a2c4877d33/resourceGroups/ACR/providers/Microsoft.ManagedIdentity/userAssignedIdentities/testumaforpools": {}
    }
  }
}
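
The Linux and Windows request bodies differ only in the extension type and in how observedCertificates is shaped. The following Python sketch generates the extensions entry for either OS so the two variants stay in sync. The function name and parameters are hypothetical; the field names and values are copied from the request bodies above.

```python
import json


def key_vault_extension(os_type, secret_identifier, msi_client_id, polling_interval_s=300):
    """Build the Key Vault VM extension entry for a pool request body (hypothetical helper)."""
    if os_type not in ("linux", "windows"):
        raise ValueError("os_type must be 'linux' or 'windows'")
    secrets = {
        "pollingIntervalInS": str(polling_interval_s),
        "requireInitialSync": True,
    }
    if os_type == "linux":
        # Linux: one store location for all certificates, URLs as plain strings.
        secrets["certificateStoreLocation"] = "/var/lib/waagent/Microsoft.Azure.KeyVault"
        secrets["observedCertificates"] = [secret_identifier]
    else:
        # Windows: each observed certificate is an object with its own store.
        secrets["observedCertificates"] = [{
            "url": secret_identifier,
            "certificateStoreLocation": "LocalMachine",
            "keyExportable": True,
        }]
    return {
        "name": "KVExtensions",
        "type": "KeyVaultForLinux" if os_type == "linux" else "KeyVaultForWindows",
        "publisher": "Microsoft.Azure.KeyVault",
        "typeHandlerVersion": "3.0",
        "autoUpgradeMinorVersion": True,
        "settings": {
            "secretsManagementSettings": secrets,
            "authenticationSettings": {
                "msiEndpoint": "http://169.254.169.254/metadata/identity",
                "msiClientId": msi_client_id,
            },
        },
    }


if __name__ == "__main__":
    # Placeholders stand in for your Secret Identifier and Client ID.
    print(json.dumps(key_vault_extension("linux", "<secret-identifier>", "<client-id>"), indent=2))
```

Serializing with json.dumps also guarantees a syntactically valid body, which hand-edited JSON does not.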
Validate the certificate
To confirm that the certificate is successfully deployed, log in to the compute node. You
should see output similar to the following:

root@74773db5fe1b42ab9a4b6cf679d929da000000:/var/lib/waagent/Microsoft.Azure.KeyVault.KeyVaultForLinux-1.0.1363.13/status# cat 1.status
[{"status":{"code":0,"formattedMessage":{"lang":"en","message":"Successfully started Key Vault extension service. 2021-03-03T23:12:23Z"},"operation":"Service start.","status":"success"},"timestampUTC":"2021-03-03T23:12:23Z","version":"1.0"}]root@74773db5fe1b42ab9a4b6cf679d929da000000:/var/lib/waagent/Microsoft.Azure.KeyVault.KeyVaultForLinux-1.0.1363.13/status#
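
If you want to check the status file from a script instead of reading it by eye, a small parser is enough. The sketch below is illustrative only: the function name is hypothetical, and it assumes the file holds a JSON array shaped like the sample output above (the trailing shell prompt in that output is not part of the file).

```python
import json


def extension_succeeded(status_text):
    """Return True if every entry in a *.status file reports success with code 0."""
    entries = json.loads(status_text)
    return bool(entries) and all(
        entry.get("status", {}).get("status") == "success"
        and entry.get("status", {}).get("code") == 0
        for entry in entries
    )


# Sample content copied from the output shown above.
SAMPLE_STATUS = (
    '[{"status":{"code":0,"formattedMessage":{"lang":"en","message":'
    '"Successfully started Key Vault extension service. 2021-03-03T23:12:23Z"},'
    '"operation":"Service start.","status":"success"},'
    '"timestampUTC":"2021-03-03T23:12:23Z","version":"1.0"}]'
)
```

An empty array is treated as a failure, because it means the extension hasn't reported any status yet.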

Troubleshooting Key Vault Extension


If the Key Vault extension is configured incorrectly, the compute node might end up in
the unusable state. To troubleshoot a Key Vault extension failure, temporarily set
requireInitialSync to false and redeploy your pool. The compute node then reaches the
idle state, and you can log in to the node to check the Key Vault extension logs for
errors and fix the configuration issues. For more information, see the following Key
Vault extension documentation:

Azure Key Vault extension for Linux


Azure Key Vault extension for Windows

Next steps
Learn more about Managed identities for Azure resources.
Learn how to use customer-managed keys with user-managed identities.
Mount a virtual file system on a Batch pool
Article • 06/10/2024

Azure Batch supports mounting cloud storage or an external file system on Windows or
Linux compute nodes in Batch pools. When a compute node joins the pool, the virtual
file system mounts and acts as a local drive on that node. This article shows you how to
mount a virtual file system on a pool of compute nodes by using the Batch
Management Library for .NET.

Mounting the file system to the pool makes accessing data easier and more efficient
than requiring tasks to get their own data from a large shared data set. Consider a
scenario where multiple tasks need access to a common set of data, like rendering a
movie. Each task renders one or more frames at once from the scene files. By mounting
a drive that contains the scene files, it's easier for each compute node to access the
shared data.

Also, you can choose the underlying file system to meet performance, throughput, and
input/output operations per second (IOPS) requirements. You can independently scale
the file system based on the number of compute nodes that concurrently access the
data.

For example, you could use an Avere vFXT distributed in-memory cache to support large
movie-scale renders with thousands of concurrent render nodes that access on-
premises source data. Or, for data that's already in cloud-based blob storage, you can
use BlobFuse to mount the data as a local file system. Azure Files provides a similar
workflow to that of BlobFuse and is available on both Windows and Linux.

Supported configurations
You can mount the following types of file systems:

Azure Files
Azure Blob storage
Network File System (NFS), including an Avere vFXT cache
Common Internet File System (CIFS)

Batch supports the following virtual file system types for node agents that are produced
for their respective publisher and offer.
OS type | Azure Files share | Azure Blob container | NFS mount | CIFS mount
Linux | ✔️ | ✔️ | ✔️ | ✔️
Windows | ✔️ | ❌ | ❌ | ❌

Note

Mounting a virtual file system isn't supported on Batch pools created before
August 8, 2019.
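
The support table above can be encoded as a lookup so that a deployment script rejects an unsupported combination before it creates a pool. This is a sketch only; the dictionary keys and the function name are hypothetical labels, not Batch API values.

```python
# Supported mount types per node OS, transcribed from the table above.
SUPPORTED_MOUNTS = {
    "linux": {"azure_files", "azure_blob", "nfs", "cifs"},
    "windows": {"azure_files"},
}


def is_mount_supported(os_type, mount_type):
    """Return True if the mount type is supported on the given node OS."""
    return mount_type in SUPPORTED_MOUNTS.get(os_type.lower(), set())
```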

Networking requirements
When you use virtual file mounts with Batch pools in a virtual network, keep the
following requirements in mind, and ensure that no required traffic is blocked. For more
information, see Batch pools in a virtual network.

Azure Files shares require TCP port 445 to be open for traffic to and from the
storage service tag. For more information, see Use an Azure file share with
Windows.

Azure Blob containers require TCP port 443 to be open for traffic to and from the
storage service tag. Virtual machines (VMs) must have access to
https://packages.microsoft.com to download the blobfuse and gpg packages.
Depending on your configuration, you might need access to other URLs.

Network File System (NFS) requires access to port 2049 by default. Your
configuration might have other requirements. VMs must have access to the
appropriate package manager to download the nfs-common package (for Debian or
Ubuntu). The URL might vary based on your OS version. Depending on your
configuration, you might also need access to other URLs.

Mounting Azure Blob or Azure Files through NFS might have more networking
requirements. For example, your compute nodes might need to use the same
virtual network subnet as the storage account.

Common Internet File System (CIFS) requires access to TCP port 445. VMs must
have access to the appropriate package manager to download the cifs-utils
package. The URL might vary based on your OS version.
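
To verify from a node that the required port is reachable, you can attempt a TCP connection. The following Python sketch uses the default ports listed above; the dictionary keys and function name are hypothetical, and a successful connection shows only that the port is open, not that the mount will succeed.

```python
import socket

# Default TCP ports per mount type, as listed in the requirements above.
DEFAULT_PORTS = {"azure_files": 445, "azure_blob": 443, "nfs": 2049, "cifs": 445}


def can_reach(host, mount_type, timeout=5.0):
    """Return True if a TCP connection to the mount type's default port succeeds."""
    port = DEFAULT_PORTS[mount_type]
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_reach("<storage-account-name>.file.core.windows.net", "azure_files") checks port 445 for an Azure Files share.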
Mounting configuration and implementation
Mounting a virtual file system on a pool makes the file system available to every
compute node in the pool. Configuration for the file system happens when a compute
node joins a pool, restarts, or is reimaged.

To mount a file system on a pool, you create a MountConfiguration object that matches
your virtual file system: AzureBlobFileSystemConfiguration,
AzureFileShareConfiguration, NfsMountConfiguration, or CifsMountConfiguration.

All mount configuration objects need the following base parameters. Some mount
configurations have specific parameters for the particular file system, which the code
examples present in more detail.

Account name or source of the storage account.

Relative mount path or source, the location of the file system to mount on the
compute node, relative to the standard fsmounts directory accessible via
AZ_BATCH_NODE_MOUNTS_DIR.

The exact fsmounts directory location varies depending on node OS. For example,
the location on an Ubuntu node maps to /mnt/batch/tasks/fsmounts.

Mount options or BlobFuse options that describe specific parameters for
mounting a file system.

When you create the pool and the MountConfiguration object, you assign the object to
the MountConfigurationList property. Mounting for the file system happens when a
node joins the pool, restarts, or is reimaged.
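
Inside a task on a Linux node, the full path of a mounted file system is the mounts directory joined with the relative mount path. The Python sketch below shows one way to resolve it; the function name is hypothetical, and it assumes a Linux node (on Windows, an Azure file share is exposed as a drive letter instead).

```python
import os
import posixpath


def mounted_path(relative_mount_path, mounts_dir=None):
    """Join the node's mounts directory with a relative mount path (Linux nodes)."""
    if mounts_dir is None:
        # Batch sets this variable on nodes that have a mount configuration.
        mounts_dir = os.environ["AZ_BATCH_NODE_MOUNTS_DIR"]
    return posixpath.join(mounts_dir, relative_mount_path)
```

Reading the directory from AZ_BATCH_NODE_MOUNTS_DIR rather than hard-coding it keeps the task portable across node images, because the exact location varies by OS.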

The Batch agent implements mounting differently on Windows and Linux.

On Linux, Batch installs the package cifs-utils. Then, Batch issues the mount
command.

On Windows, Batch uses cmdkey to add your Batch account credentials. Then,
Batch issues the mount command through net use. For example:

PowerShell

net use S: \\<storage-account-name>.file.core.windows.net\<fileshare> /u:AZURE\<storage-account-name> <storage-account-key>
Mounting the file system creates an environment variable, AZ_BATCH_NODE_MOUNTS_DIR,
which points to the location of the mounted file system and log files. You can use the
log files for troubleshooting and debugging.

Mount an Azure Files share with PowerShell


You can use Azure PowerShell to mount an Azure Files share on a Windows or Linux
Batch pool. The following procedure walks you through configuring and mounting an
Azure file share file system on a Batch pool.

Important

The maximum number of mounted file systems on a pool is 10. For details and
other limits, see Batch service quotas and limits.

Prerequisites
An Azure account with an active subscription.
Azure PowerShell installed, or use Azure Cloud Shell and select PowerShell for
the interface.
An existing Batch account with a linked Azure Storage account that has a file share.

Windows

1. Sign in to your Azure subscription, replacing the placeholder with your
subscription ID.

PowerShell

Connect-AzAccount -Subscription "<subscription-ID>"

2. Get the context for your Batch account. Replace the <batch-account-name>
placeholder with your Batch account name.

PowerShell

$context = Get-AzBatchAccount -AccountName <batch-account-name>


3. Create a Batch pool with the following settings. Replace the
<storage-account-name>, <storage-account-key>, and <file-share-name> placeholders
with the values from the storage account that's linked to your Batch account. Replace
the <pool-name> placeholder with the name you want for the pool.

The following script creates a pool with one Windows Server 2016 Datacenter,
Standard_D2_V2 size node, and then mounts the Azure file share to the S drive
of the node.

PowerShell

$fileShareConfig = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSAzureFileShareConfiguration" -ArgumentList @("<storage-account-name>", "https://<storage-account-name>.file.core.windows.net/<file-share-name>", "S", "<storage-account-key>")

$mountConfig = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSMountConfiguration" -ArgumentList @($fileShareConfig)

$imageReference = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSImageReference" -ArgumentList @("WindowsServer", "MicrosoftWindowsServer", "2016-Datacenter", "latest")

$configuration = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSVirtualMachineConfiguration" -ArgumentList @($imageReference, "batch.node.windows amd64")

New-AzBatchPool -Id "<pool-name>" -VirtualMachineSize "STANDARD_D2_V2" -VirtualMachineConfiguration $configuration -TargetDedicatedComputeNodes 1 -MountConfiguration @($mountConfig) -BatchContext $context

4. Connect to the node and check that the output file is correct.

Access the mounted files


Azure Batch tasks can access the mounted files by using the drive's direct path, for
example:

PowerShell

cmd /c "more S:\folder1\out.txt & timeout /t 90 > NUL"


The Azure Batch agent grants access only for Azure Batch tasks. If you use Remote
Desktop Protocol (RDP) to connect to the node, your user account doesn't have
automatic access to the mounted drive. When you connect to the node over RDP,
you must add credentials for the storage account to access the S drive directly.

Use cmdkey to add the credentials. Replace the <storage-account-name> and
<storage-account-key> placeholders with your own information.

PowerShell

cmdkey /add:"<storage-account-name>.file.core.windows.net" /user:"Azure\<storage-account-name>" /pass:"<storage-account-key>"

Troubleshoot mount issues


If a mount configuration fails, the compute node fails and the node state is set to
Unusable. To diagnose a mount configuration failure, inspect the ComputeNodeError
property for details on the error.

To get log files for debugging, you can use the OutputFiles API to upload the *.log files.
The *.log files contain information about the file system mount at the
AZ_BATCH_NODE_MOUNTS_DIR location. Mount log files have the format
<type>-<mountDirOrDrive>.log for each mount. For example, a CIFS mount at a mount
directory named test has a mount log file named cifs-test.log.
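
The naming rule makes it easy to locate the log for a given mount from a script. The following Python sketch is illustrative; both function names are hypothetical, and find_mount_logs assumes the logs sit directly in the mounts directory.

```python
import glob
import os


def mount_log_name(mount_type, mount_dir_or_drive):
    """Return the log file name for a mount, following <type>-<mountDirOrDrive>.log."""
    return f"{mount_type}-{mount_dir_or_drive}.log"


def find_mount_logs(mounts_dir):
    """List all *.log files found directly in the given mounts directory."""
    return sorted(glob.glob(os.path.join(mounts_dir, "*.log")))
```

For example, mount_log_name("cifs", "test") gives cifs-test.log, matching the example above.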

Investigate mounting errors


You can RDP or SSH to the node to check the log files pertaining to filesystem mounts.
The following example error message is possible when you try to mount an Azure file
share to a Batch node:

Output

Mount Configuration Error | An error was encountered while configuring specified mount(s)
Message: System error (out of memory, cannot fork, no more loop devices)
MountConfigurationPath: S

If you receive this error, RDP or SSH to the node to check the related log files. The Batch
agent implements mounting differently on Windows and Linux for Azure file shares. On
Linux, Batch installs the package cifs-utils. Then, Batch issues the mount command.
On Windows, Batch uses cmdkey to add your Batch account credentials. Then, Batch
issues the mount command through net use. For example:

PowerShell

net use S: \\<storage-account-name>.file.core.windows.net\<fileshare> /u:AZURE\<storage-account-name> <storage-account-key>

Windows

1. Connect to the node over RDP.

2. Open the log file fshare-S.log, at D:\batch\tasks\fsmounts.

3. Review the error messages, for example:

Output

CMDKEY: Credential added successfully.


System error 86 has occurred.

The specified network password is not correct.

4. Troubleshoot the problem by using the Azure file shares troubleshooter.

If you can't use RDP or SSH to check the log files on the node, you can upload the logs
to your Azure storage account. You can use this method for both Windows and Linux
logs.

1. In the Azure portal, search for and select the Batch account that has your pool.

2. On the Batch account page, select Pools from the left navigation.

3. On the Pools page, select the pool's name.

4. On the pool's page, select Nodes from the left navigation.

5. On the Nodes page, select the node's name.

6. On the node's page, select Upload batch logs.

7. On the Upload batch logs pane, select Pick storage container.

8. On the Storage accounts page, select a storage account.


9. On the Containers page, select or create a container to upload the files to, and
select Select.

10. Select Start upload.

11. When the upload completes, download the files and open agent-debug.log.

12. Review the error messages, for example:

Output

..20210322T113107.448Z.00000000-0000-0000-0000-000000000000.ERROR.agent.mount.filesystems.basefilesystem.basefilesystem.py.run_cmd_persist_output_async.59.2912.MainThread.3580.Mount command failed with exit code: 2, output:

CMDKEY: Credential added successfully.

System error 86 has occurred.

The specified network password is not correct.

13. Troubleshoot the problem by using the Azure file shares troubleshooter.
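
When agent-debug.log is large, a short script can pull out the mount failure details. The Python sketch below is a rough illustration: the function name is hypothetical, and the two patterns are based only on the example messages shown above, so other failures may be worded differently.

```python
import re

# Patterns derived from the example log output above.
EXIT_CODE = re.compile(r"Mount command failed with exit code: (\d+)")
SYSTEM_ERROR = re.compile(r"System error (\d+) has occurred\.")


def summarize_mount_failure(log_text):
    """Extract the mount exit code and Windows system error number, if present."""
    exit_match = EXIT_CODE.search(log_text)
    error_match = SYSTEM_ERROR.search(log_text)
    return {
        "exit_code": int(exit_match.group(1)) if exit_match else None,
        "system_error": int(error_match.group(1)) if error_match else None,
    }
```

Each value is None when the corresponding message doesn't appear in the log text.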

Manually mount a file share with PowerShell


If you can't diagnose or fix mounting errors, you can use PowerShell to mount the file
share manually instead.

Windows

1. Create a pool without a mounting configuration. For example:

PowerShell

$imageReference = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSImageReference" -ArgumentList @("WindowsServer", "MicrosoftWindowsServer", "2016-Datacenter", "latest")

$configuration = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSVirtualMachineConfiguration" -ArgumentList @($imageReference, "batch.node.windows amd64")

New-AzBatchPool -Id "<pool-name>" -VirtualMachineSize "STANDARD_D2_V2" -VirtualMachineConfiguration $configuration -TargetDedicatedComputeNodes 1 -BatchContext $context

2. Wait for the node to be in the Idle state.

3. In the Azure portal, search for and select the storage account that has your
file share.

4. On the storage account page's menu, select File shares from the left
navigation.

5. On the File shares page, select the file share you want to mount.

6. On the file share's page, select Connect.

7. In the Connect pane, select the Windows tab.

8. For Drive letter, enter the drive you want to use. The default is Z.

9. For Authentication method, select how you want to connect to the file share.

10. Select Show Script, and copy the PowerShell script for mounting the file share.

11. Connect to the node over RDP.

12. Run the command you copied to mount the file share.

13. Note any error messages in the output. Use this information to troubleshoot
any networking-related issues.

Example mount configurations


The following code example configurations demonstrate mounting various file share
systems to a pool of compute nodes.

Azure Files share


Azure Files is the standard Azure cloud file system offering. The following configuration
mounts an Azure Files share named <file-share-name> to the S drive. For information
about the parameters in the example, see Mount SMB Azure file share on Windows or
Create an NFS Azure file share and mount it on a Linux VM using the Azure portal.

C#

new PoolAddParameter
{
    Id = poolId,
    MountConfiguration = new[]
    {
        new MountConfiguration
        {
            AzureFileShareConfiguration = new AzureFileShareConfiguration
            {
                AccountName = "<storage-account-name>",
                AzureFileUrl = "https://<storage-account-name>.file.core.windows.net/<file-share-name>",
                AccountKey = "<storage-account-key>",
                RelativeMountPath = "S",
                MountOptions = "-o vers=3.0,dir_mode=0777,file_mode=0777,sec=ntlmssp"
            },
        }
    }
}

Azure Blob container


Another option is to use Azure Blob storage via BlobFuse. Mounting a blob file system
requires either an account key, shared access signature (SAS) key, or managed identity
with access to your storage account.

For information on getting these keys or identity, see the following articles:

Manage storage account access keys

Grant limited access to Azure Storage resources using shared access signatures
(SAS)

Configure managed identities in Batch pools

Tip

If you use a managed identity, ensure that the identity has been assigned to
the pool so that it's available on the VM doing the mounting. The identity
must also have the Storage Blob Data Contributor role.

The following configuration mounts a blob file system with BlobFuse options. For
illustration purposes, the example shows AccountKey , SasKey and IdentityReference ,
but you can actually specify only one of these methods.

C#
new PoolAddParameter
{
    Id = poolId,
    MountConfiguration = new[]
    {
        new MountConfiguration
        {
            AzureBlobFileSystemConfiguration = new AzureBlobFileSystemConfiguration
            {
                AccountName = "<storage-account-name>",
                ContainerName = "<container-name>",
                // Use only one of the following three lines:
                AccountKey = "<storage-account-key>",
                SasKey = "<sas-key>",
                IdentityReference = new ComputeNodeIdentityReference("/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>"),
                RelativeMountPath = "<relative-mount-path>",
                BlobfuseOptions = "-o attr_timeout=240 -o entry_timeout=240 -o negative_timeout=120 "
            },
        }
    }
}
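
Because only one authentication method can be specified, it can help to validate the choice before sending a request. The Python sketch below builds a blob mount configuration as a dictionary and rejects anything other than exactly one credential. The function name is hypothetical, and the camelCase keys are an assumption that mirrors the C# property names above; check them against the API version you use.

```python
def blob_mount_config(account_name, container_name, relative_mount_path,
                      account_key=None, sas_key=None, identity_resource_id=None,
                      blobfuse_options=None):
    """Build a blob mount configuration with exactly one authentication method."""
    supplied = [v for v in (account_key, sas_key, identity_resource_id) if v]
    if len(supplied) != 1:
        raise ValueError(
            "Specify exactly one of account_key, sas_key, or identity_resource_id"
        )
    config = {
        "accountName": account_name,
        "containerName": container_name,
        "relativeMountPath": relative_mount_path,
    }
    if account_key:
        config["accountKey"] = account_key
    elif sas_key:
        config["sasKey"] = sas_key
    else:
        # Resource ID of a user-assigned identity already assigned to the pool.
        config["identityReference"] = {"resourceId": identity_resource_id}
    if blobfuse_options:
        config["blobfuseOptions"] = blobfuse_options
    return {"azureBlobFileSystemConfiguration": config}
```

Failing early with a ValueError is cheaper than waiting for a node to report a mount configuration error.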

To get default access to the BlobFuse mounted directory, run the task as an
administrator. BlobFuse mounts the directory at the user space, and at pool creation
mounts the directory as root. In Linux, all administrator tasks are root. The FUSE
reference page describes all options for the FUSE module.

For more information and tips on using BlobFuse, see the following references:

Blobfuse2 project
Blobfuse Troubleshoot FAQ
GitHub issues in the azure-storage-fuse repository

NFS
You can mount NFS shares to pool nodes to allow Batch to access traditional file
systems. The setup can be a single NFS server deployed in the cloud or an on-premises
NFS server accessed over a virtual network. NFS mounts support Avere vFXT, a
distributed in-memory cache for data-intensive high-performance computing (HPC)
tasks. NFS mounts also support other standard NFS-compliant interfaces, such as NFS
for Azure Blob and NFS for Azure Files.
The following example shows a configuration for an NFS file system mount:

C#

new PoolAddParameter
{
    Id = poolId,
    MountConfiguration = new[]
    {
        new MountConfiguration
        {
            NfsMountConfiguration = new NFSMountConfiguration
            {
                Source = "<source>",
                RelativeMountPath = "<relative-mount-path>",
                MountOptions = "options ver=3.0"
            },
        }
    }
}

CIFS
Mounting CIFS to pool nodes is another way to provide access to traditional file
systems. CIFS is a file-sharing protocol that provides an open and cross-platform
mechanism for requesting network server files and services. CIFS is based on the
enhanced version of the SMB protocol for internet and intranet file sharing.

The following example shows a configuration for a CIFS file mount.

C#

new PoolAddParameter
{
    Id = poolId,
    MountConfiguration = new[]
    {
        new MountConfiguration
        {
            CifsMountConfiguration = new CIFSMountConfiguration
            {
                Username = "<storage-account-name>",
                RelativeMountPath = "<relative-mount-path>",
                Source = "<source>",
                Password = "<storage-account-key>",
                MountOptions = "-o vers=3.0,dir_mode=0777,file_mode=0777,serverino,domain=<domain-name>"
            },
        }
    }
}

Note

Looking for an example using PowerShell rather than C#? You can find another
great example here: Mount Azure File to Azure Batch Pool.

Next steps
Mount an Azure Files share with Windows
Mount an Azure Files share with Linux
Blobfuse2 - A Microsoft supported Azure Storage FUSE driver
Network File System overview
Microsoft SMB protocol and CIFS protocol overview


Use an Azure file share with a Batch pool
Article • 04/25/2025

Azure Files offers fully managed file shares in the cloud that are accessible via the Server
Message Block (SMB) protocol. You can mount and use an Azure file share on Batch pool
compute nodes.

Considerations for use with Batch


If you use non-premium Azure Files, consider an Azure file share for pools that run a
relatively low number of parallel tasks. Review the performance and scale targets to
determine if Azure Files (which uses an Azure Storage account) should be used, given your
expected pool size and number of asset files.

Azure file shares are cost-efficient and can be configured with data replication to another
region to be globally redundant.

You can mount an Azure file share concurrently from an on-premises computer. However,
ensure that you understand concurrency implications, especially when using REST APIs.

See also the general planning considerations for Azure file shares.

Create a file share


You can create an Azure file share in a storage account that is linked to your Batch account, or
in a separate storage account. For more information, see Create an Azure file share.

Mount an Azure file share on a Batch pool


For details on how to mount an Azure file share on a pool, see Mount a virtual file system on a
Batch pool.

Next steps
To learn about other options to read and write data in Batch, see Persist job and task
output.
Use RDMA or GPU instances in Batch pools
Article • 02/04/2025

To run certain Batch jobs, you can take advantage of Azure VM sizes designed for large-
scale computation. For example:

To run multi-instance MPI workloads, choose HB, HC, NC, or ND series or other
sizes that have a network interface for Remote Direct Memory Access (RDMA).
These sizes connect to an InfiniBand network for inter-node communication, which
can accelerate MPI applications.

For CUDA applications, choose N-series sizes that include NVIDIA Tesla graphics
processing unit (GPU) cards.

This article provides guidance and examples to use some of Azure's specialized sizes in
Batch pools. For specs and background, see:

High performance compute VM sizes (Linux, Windows)

GPU-enabled VM sizes (Linux, Windows)

Note

Certain VM sizes might not be available in the regions where you create your Batch
accounts. To check that a size is available, see Products available by region and
Choose a VM size for a Batch pool.

Dependencies
The RDMA or GPU capabilities of compute-intensive sizes in Batch are supported only in
certain operating systems. The supported operating systems for these VM sizes include
only a subset of those available for virtual machine creation. Depending on how you
create your Batch pool, you might need to install or configure extra driver or other
software on the nodes. The following tables summarize these dependencies. See linked
articles for details. For options to configure Batch pools, see later in this article.

Linux pools - Virtual machine configuration

| Size | Capability | Operating systems | Required software | Pool settings |
|---|---|---|---|---|
| H16r, H16mr, NC24r, NC24rs_v2, NC24rs_v3, ND24rs* | RDMA | Ubuntu 22.04 LTS (Azure Marketplace) | Intel MPI 5; Linux RDMA drivers | Enable inter-node communication, disable concurrent task execution |
| NCv3, NDv2, NDv4, NDv5 series | NVIDIA Tesla GPU (varies by series) | Ubuntu 22.04 LTS (Azure Marketplace) | NVIDIA CUDA or CUDA Toolkit drivers | N/A |
| NVv3, NVv4, NVv5 series | Accelerated Visualization GPU | Ubuntu 22.04 LTS (Azure Marketplace) | NVIDIA GRID drivers or AMD GPU drivers | N/A |

*RDMA-capable N-series sizes also include NVIDIA Tesla GPUs

Important

This document references a release version of Linux that is nearing or at End of
Life (EOL). Please consider updating to a more current version.

Windows pools - Virtual Machine Configuration

| Size | Capability | Operating systems | Required software | Pool settings |
|---|---|---|---|---|
| H16r, H16mr, NC24r, NC24rs_v2, NC24rs_v3, ND24rs* | RDMA | Windows Server 2016, 2012 R2, or 2012 (Azure Marketplace) | Microsoft MPI 2012 R2 or later, or Intel MPI 5; Windows RDMA drivers | Enable inter-node communication, disable concurrent task execution |
| NC, NCv2, NCv3, ND, NDv2 series | NVIDIA Tesla GPU (varies by series) | Windows Server 2016 or 2012 R2 (Azure Marketplace) | NVIDIA CUDA or CUDA Toolkit drivers | N/A |
| NV, NVv2, NVv4 series | NVIDIA Tesla M60 GPU | Windows Server 2016 or 2012 R2 (Azure Marketplace) | NVIDIA GRID drivers | N/A |

*RDMA-capable N-series sizes also include NVIDIA Tesla GPUs

Windows pools - Cloud Services Configuration

Warning

Cloud Services Configuration pools are deprecated. Please use Virtual Machine
Configuration pools instead.

| Size | Capability | Operating systems | Required software | Pool settings |
|---|---|---|---|---|
| H16r, H16mr | RDMA | Windows Server 2016, 2012 R2, 2012, or 2008 R2 (Guest OS family) | Microsoft MPI 2012 R2 or later, or Intel MPI 5; Windows RDMA drivers | Enable inter-node communication, disable concurrent task execution |

Note

N-series sizes are not supported in Cloud Services Configuration pools.

Pool configuration options


To configure a specialized VM size for your Batch pool, you have several options to
install required software or drivers:

For pools in the virtual machine configuration, choose a preconfigured Azure
Marketplace VM image that has drivers and software preinstalled. Examples:

    Data Science Virtual Machine for Linux or Windows - includes NVIDIA CUDA
    drivers

    Linux images for Batch container workloads that also include GPU and RDMA
    drivers:

        Ubuntu Server (with GPU and RDMA drivers) for Azure Batch container pools

Create a custom Windows or Linux VM image with installed drivers, software, or
other settings required for the VM size.

Install GPU and RDMA drivers by VM extension.

Create a Batch application package from a zipped driver or application installer.
Then, configure Batch to deploy this package to pool nodes and install once when
each node is created. For example, if the application package is an installer, create
a start task command line to silently install the app on all pool nodes. Consider
using an application package and a pool start task if your workload depends on a
particular driver version.

Note

The start task must run with elevated (admin) permissions, and it must wait for
success. Long-running tasks will increase the time to provision a Batch pool.

Example: NVIDIA GPU drivers on Windows NC VM pool
To run CUDA applications on a pool of Windows NC nodes, you need to install NVIDIA
GPU drivers. The following sample steps use an application package to install the
NVIDIA GPU drivers. You might choose this option if your workload depends on a
specific GPU driver version.

1. Download a setup package for the GPU drivers on Windows Server 2016 from the
NVIDIA website - for example, version 411.82. Save the file locally using a
short name like GPUDriverSetup.exe.
2. Create a zip file of the package.
3. Upload the package to your Batch account. For steps, see the application packages
guidance. Specify an application ID such as GPUDriver, and a version such as
411.82.
4. Using the Batch APIs or Azure portal, create a pool in the virtual machine
configuration with the desired number of nodes and scale. The following table
shows sample settings to install the NVIDIA GPU drivers silently using a start task:

| Setting | Value |
|---|---|
| Image Type | Marketplace (Linux/Windows) |
| Publisher | MicrosoftWindowsServer |
| Offer | WindowsServer |
| Sku | 2016-Datacenter |
| Node size | NC6 Standard |
| Application package references | GPUDriver, version 411.82 |
| Start task enabled | True; Command line - cmd /c "%AZ_BATCH_APP_PACKAGE_GPUDriver#411.82%\\GPUDriverSetup.exe /s"; User identity - Pool autouser, admin; Wait for success - True |
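
The settings in this table can also be expressed as a pool definition fragment. The following Python sketch builds that fragment as a plain dictionary. It assumes the property names of the Batch REST API pool schema (applicationPackageReferences, startTask, userIdentity.autoUser, waitForSuccess); the helper name build_driver_start_task is hypothetical, so check the names against the API version you target.

```python
import json


def build_driver_start_task(app_id: str, version: str, installer: str) -> dict:
    """Build a pool fragment that silently installs a driver package.

    Hypothetical helper: the property names are assumed to match the
    Batch REST API pool schema.
    """
    # On Windows nodes, the package location is exposed through an
    # environment variable named AZ_BATCH_APP_PACKAGE_<id>#<version>.
    package_dir = f"%AZ_BATCH_APP_PACKAGE_{app_id}#{version}%"
    return {
        "applicationPackageReferences": [
            {"applicationId": app_id, "version": version}
        ],
        "startTask": {
            # /s asks the installer to run silently.
            "commandLine": f'cmd /c "{package_dir}\\{installer} /s"',
            # The start task needs elevated (admin) permissions.
            "userIdentity": {
                "autoUser": {"scope": "pool", "elevationLevel": "admin"}
            },
            # Don't schedule tasks on a node until the driver is installed.
            "waitForSuccess": True,
        },
    }


if __name__ == "__main__":
    fragment = build_driver_start_task("GPUDriver", "411.82", "GPUDriverSetup.exe")
    print(json.dumps(fragment, indent=2))
```

Generating the command line from the application ID and version keeps the environment variable name and the package reference in step when you upgrade the driver.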

Example: NVIDIA GPU drivers on a Linux NC VM pool
To run CUDA applications on a pool of Linux NC nodes, you need to install necessary
NVIDIA Tesla GPU drivers from the CUDA Toolkit. The following sample steps create and
deploy a custom Ubuntu 22.04 LTS image with the GPU drivers:

1. Deploy an Azure NC-series VM running Ubuntu 22.04 LTS. For example, create the
VM in the South Central US region.
2. Add the NVIDIA GPU Drivers extension to the VM by using the Azure portal, a
client computer that connects to the Azure subscription, or Azure Cloud Shell.
Alternatively, follow the steps to connect to the VM and install CUDA drivers
manually.
3. Follow the steps to create an Azure Compute Gallery image for Batch.
4. Create a Batch account in a region that supports NC VMs.
5. Using the Batch APIs or Azure portal, create a pool using the custom image and
with the desired number of nodes and scale. The following table shows sample
pool settings for the image:

| Setting | Value |
|---|---|
| Image Type | Custom Image |
| Custom Image | Name of the image |
| Node agent SKU | batch.node.ubuntu 22.04 |
| Node size | NC6 Standard |

Example: Microsoft MPI on a Windows H16r VM pool
To run Windows MPI applications on a pool of Azure H16r VM nodes, you need to
configure the HpcVmDrivers extension and install Microsoft MPI. Here are sample steps
to deploy a custom Windows Server 2016 image with the necessary drivers and
software:

1. Deploy an Azure H16r VM running Windows Server 2016. For example, create the
VM in the West US region.
2. Add the HpcVmDrivers extension to the VM by running an Azure PowerShell
command from a client computer that connects to your Azure subscription, or
using Azure Cloud Shell.
3. Make a Remote Desktop connection to the VM.
4. Download the setup package (MSMpiSetup.exe) for the latest version of
Microsoft MPI, and install Microsoft MPI.
5. Follow the steps to create an Azure Compute Gallery image for Batch.
6. Using the Batch APIs or Azure portal, create a pool using the Azure Compute
Gallery image and with the desired number of nodes and scale. The following table
shows sample pool settings for the image:

| Setting | Value |
|---|---|
| Image Type | Custom Image |
| Custom Image | Name of the image |
| Node agent SKU | batch.node.windows amd64 |
| Node size | H16r Standard |
| Internode communication enabled | True |
| Max tasks per node | 1 |
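
The last two settings are the ones that RDMA workloads depend on, so it can help to check them before submitting a pool. The following Python sketch validates a pool definition held as a plain dictionary. It assumes the Batch REST API property names enableInterNodeCommunication and taskSlotsPerNode; the function name and the pool ID are hypothetical.

```python
def check_mpi_pool_settings(pool: dict) -> list:
    """Return the problems that would block multi-instance MPI tasks.

    Hypothetical helper: property names are assumed to follow the
    Batch REST API pool schema.
    """
    problems = []
    # MPI ranks on different nodes must be able to reach each other.
    if not pool.get("enableInterNodeCommunication", False):
        problems.append("enableInterNodeCommunication must be true")
    # Concurrent task execution must be disabled: one task per node.
    if pool.get("taskSlotsPerNode", 1) != 1:
        problems.append("taskSlotsPerNode must be 1")
    return problems


h16r_pool = {
    "id": "mpi-pool",  # hypothetical pool ID
    "vmSize": "Standard_H16r",
    "enableInterNodeCommunication": True,
    "taskSlotsPerNode": 1,
}

if __name__ == "__main__":
    print(check_mpi_pool_settings(h16r_pool) or "pool settings look suitable for MPI")
```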

Next steps
To run MPI jobs on an Azure Batch pool, see the Windows or Linux examples.



Provision Linux compute nodes in Batch
pools
Article • 05/18/2023

You can use Azure Batch to run parallel compute workloads on both Linux and Windows
virtual machines. This article details how to create pools of Linux compute nodes in the
Batch service by using both the Batch Python and Batch .NET client libraries.

Virtual Machine Configuration


When you create a pool of compute nodes in Batch, you have two options from which
to select the node size and operating system: Cloud Services Configuration and Virtual
Machine Configuration. Virtual Machine Configuration pools are composed of Azure
VMs, which may be created from either Linux or Windows images. When you create a
pool with Virtual Machine Configuration, you specify an available compute node size,
the virtual machine image reference to be installed on the nodes, and the Batch node
agent SKU (a program that runs on each node and provides an interface between the
node and the Batch service).

Virtual machine image reference


The Batch service uses virtual machine scale sets to provide compute nodes in the
Virtual Machine Configuration. You can specify an image from the Azure Marketplace,
or use the Azure Compute Gallery to prepare a custom image.

When you create a virtual machine image reference, you must specify the following
properties:

| Image reference property | Example |
|---|---|
| Publisher | Canonical |
| Offer | UbuntuServer |
| SKU | 20.04-LTS |
| Version | latest |

Tip
You can learn more about these properties and how to specify Marketplace images
in Find Linux VM images in the Azure Marketplace with the Azure CLI. Note that
some Marketplace images are not currently compatible with Batch.

List of virtual machine images


Not all Marketplace images are compatible with the currently available Batch node
agents. To list all supported Marketplace virtual machine images for the Batch service
and their corresponding node agent SKUs, use list_supported_images (Python),
ListSupportedImages (Batch .NET), or the corresponding API in another language SDK.

Node agent SKU


The Batch node agent is a program that runs on each node in the pool and provides
the command-and-control interface between the node and the Batch service. There are
different implementations of the node agent, known as SKUs, for different operating
systems. Essentially, when you create a Virtual Machine Configuration, you first specify
the virtual machine image reference, and then you specify the node agent to install on
the image. Typically, each node agent SKU is compatible with multiple virtual machine
images. To view the supported node agent SKUs and virtual machine image
compatibilities, you can use the Azure Batch CLI command:

Azure CLI

az batch pool supported-images list

For more information, you can refer to Account - List Supported Images - REST API
(Azure Batch Service) | Microsoft Docs.
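
Once you have the list of supported images, you can look up the node agent SKU that pairs with a given image. The following Python sketch does that lookup over data shaped like the List Supported Images response. The sample entries are illustrative only and aren't real service output, the helper name find_node_agent_sku is hypothetical, and the key casing (nodeAgentSKUId in the REST API, possibly nodeAgentSkuId in CLI output) is an assumption that the code tolerates either way.

```python
import json

# Illustrative sample only: a trimmed-down shape of the supported-images list.
SAMPLE_OUTPUT = """
[
  {"imageReference": {"publisher": "Canonical", "offer": "UbuntuServer",
                      "sku": "20.04-LTS", "version": "latest"},
   "nodeAgentSKUId": "batch.node.ubuntu 20.04", "osType": "linux"},
  {"imageReference": {"publisher": "MicrosoftWindowsServer", "offer": "WindowsServer",
                      "sku": "2016-Datacenter", "version": "latest"},
   "nodeAgentSKUId": "batch.node.windows amd64", "osType": "windows"}
]
"""


def find_node_agent_sku(images, publisher, offer, sku):
    """Return the node agent SKU ID for an image, or None if unsupported."""
    wanted = (publisher.lower(), offer.lower(), sku.lower())
    for entry in images:
        ref = entry["imageReference"]
        found = (ref["publisher"].lower(), ref["offer"].lower(), ref["sku"].lower())
        if found == wanted:
            # Accept either key casing (assumption; see the lead-in).
            return entry.get("nodeAgentSKUId") or entry.get("nodeAgentSkuId")
    return None


if __name__ == "__main__":
    images = json.loads(SAMPLE_OUTPUT)
    print(find_node_agent_sku(images, "Canonical", "UbuntuServer", "20.04-LTS"))
```

Returning None for an unknown image makes the unsupported case explicit, so the caller can stop before it tries to create a pool.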

Create a Linux pool: Batch .NET


The following code snippet shows an example of how to use the Batch .NET client
library to create a pool of Ubuntu Server compute nodes. For more details about Batch
.NET, view the reference documentation.

The following code snippet uses the PoolOperations.ListSupportedImages method to
select from the list of currently supported Marketplace image and node agent SKU
combinations. This technique is recommended, because the list of supported
combinations may change from time to time. Most commonly, supported combinations
are added.
C#

// Pool settings
const string poolId = "LinuxNodesSamplePoolDotNet";
const string vmSize = "STANDARD_D2_V3";
const int nodeCount = 1;

// Obtain a collection of all available node agent SKUs.
// This allows us to select from a list of supported
// VM image/node agent combinations.
List<ImageInformation> images =
    batchClient.PoolOperations.ListSupportedImages().ToList();

// Find the appropriate image information
ImageInformation image = null;
foreach (var img in images)
{
    if (img.ImageReference.Publisher == "Canonical" &&
        img.ImageReference.Offer == "UbuntuServer" &&
        img.ImageReference.Sku == "20.04-LTS")
    {
        image = img;
        break;
    }
}

// Stop early if the service doesn't list the requested image;
// otherwise the next statement would dereference null.
if (image == null)
{
    throw new InvalidOperationException(
        "No supported image matched the requested publisher, offer, and SKU.");
}

// Create the VirtualMachineConfiguration for use when actually
// creating the pool
VirtualMachineConfiguration virtualMachineConfiguration =
    new VirtualMachineConfiguration(image.ImageReference,
        image.NodeAgentSkuId);

// Create the unbound pool object using the VirtualMachineConfiguration
// created above
CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: poolId,
    virtualMachineSize: vmSize,
    virtualMachineConfiguration: virtualMachineConfiguration,
    targetDedicatedComputeNodes: nodeCount);

// Commit the pool to the Batch service
await pool.CommitAsync();

Although the previous snippet uses the PoolOperations.ListSupportedImages method to
dynamically list and select from supported image and node agent SKU combinations
(recommended), you can also configure an ImageReference explicitly:

C#

ImageReference imageReference = new ImageReference(
    publisher: "Canonical",
    offer: "UbuntuServer",
    sku: "20.04-LTS",
    version: "latest");

Connect to Linux nodes using SSH


During development or while troubleshooting, you may find it necessary to sign in to
the nodes in your pool. Unlike Windows compute nodes, you can't use Remote Desktop
Protocol (RDP) to connect to Linux nodes. Instead, the Batch service enables SSH access
on each node for remote connection.

Instead of a password, you can specify an SSH public key when you create a user on a
node.

In .NET, use the ComputeNodeUser.SshPublicKey property.

Pricing
Azure Batch is built on Azure Cloud Services and Azure Virtual Machines technology.
The Batch service itself is offered at no cost, which means you are charged only for the
compute resources (and associated costs that entails) that your Batch solutions
consume. When you choose Virtual Machine Configuration, you are charged based on
the Virtual Machines pricing structure.

If you deploy applications to your Batch nodes using application packages, you are also
charged for the Azure Storage resources that your application packages consume.

Next steps
Explore the Python code samples in the azure-batch-samples GitHub
repository to see how to perform common Batch operations, such as pool, job,
and task creation. The README that accompanies the Python samples has
details about how to install the required packages.
Learn about using Azure Spot VMs with Batch.
Use Spot VMs with Batch workloads
Article • 04/02/2025

Azure Batch offers Spot virtual machines (VMs) to reduce the cost of Batch workloads.
Spot VMs make new types of Batch workloads possible by enabling a large amount of
compute power to be used for a low cost.

Spot VMs take advantage of surplus capacity in Azure. When you specify Spot VMs in
your pools, Azure Batch can use this surplus, when available.

The tradeoff for using Spot VMs is that those VMs might not always be available, or they
might get preempted at any time, depending on available capacity. For this reason, Spot
VMs are most suitable for batch and asynchronous processing workloads where the job
completion time is flexible and the work is distributed across many VMs.

Spot VMs are offered at a reduced price compared with dedicated VMs. To learn more
about pricing, see Batch pricing.

Differences between Spot and low-priority VMs


Batch offers two types of low-cost preemptible VMs:

Spot VMs, a modern Azure-wide offering also available as single-instance VMs or
Virtual Machine Scale Sets.
Low-priority VMs, a legacy offering only available through Azure Batch.

The type of node you get depends on your Batch account's pool allocation mode, which
can be set during account creation. Batch accounts that use the user subscription pool
allocation mode always get Spot VMs. Batch accounts that use the Batch managed pool
allocation mode always get low-priority VMs.

Warning

Low-priority VMs will be retired after 30 September 2025. Please migrate to Spot
VMs in Batch before then.

Azure Spot VMs and Batch low-priority VMs are similar but have a few differences in
behavior.

| | Spot VMs | Low-priority VMs |
|---|---|---|
| Supported Batch accounts | User-subscription Batch accounts | Batch-managed Batch accounts |
| Supported Batch pool configurations | Virtual Machine Configuration | Virtual Machine Configuration and Cloud Service Configuration (deprecated) |
| Available regions | All regions that support Spot VMs | All regions except Microsoft Azure operated by 21Vianet |
| Customer eligibility | Not available for some subscription offer types. See more about Spot limitations. | Available for all Batch customers |
| Possible reasons for eviction | Capacity | Capacity |
| Pricing model | Variable discounts relative to standard VM prices | Fixed discounts relative to standard VM prices |
| Quota model | Subject to core quotas on your subscription | Subject to core quotas on your Batch account |
| Availability SLA | None | None |

Batch support for Spot VMs


Azure Batch provides several capabilities that make it easy to consume and benefit from
Spot VMs:

Batch pools can contain both dedicated VMs and Spot VMs. The number of each
type of VM can be specified when a pool is created, or changed at any time for an
existing pool, by using the explicit resize operation or by using autoscale. Job and
task submission can remain unchanged, regardless of the VM types in the pool.
You can also configure a pool to completely use Spot VMs to run jobs as cheaply
as possible, but spin up dedicated VMs if the capacity drops below a minimum
threshold, to keep jobs running.
Batch pools automatically seek the target number of Spot VMs. If VMs are
preempted or unavailable, Batch attempts to replace the lost capacity and return
to the target.
When tasks are interrupted, Batch detects and automatically requeues tasks to run
again.
Spot VMs have a separate vCPU quota that differs from the one for dedicated VMs.
The quota for Spot VMs is higher than the quota for dedicated VMs, because Spot
VMs cost less. For more information, see Batch service quotas and limits.

Considerations and use cases


Many Batch workloads are a good fit for Spot VMs. Consider using Spot VMs when jobs
are broken into many parallel tasks, or when you have many jobs that are scaled out and
distributed across many VMs.

Some examples of batch processing use cases that are well suited for Spot VMs are:

Development and testing: In particular, if large-scale solutions are being
developed, significant savings can be realized. All types of testing can benefit, but
large-scale load testing and regression testing are great uses.
Supplementing on-demand capacity: Spot VMs can be used to supplement
regular dedicated VMs. When available, jobs can scale and therefore complete
quicker for lower cost; when not available, the baseline of dedicated VMs remains
available.
Flexible job execution time: If there's flexibility in the time jobs have to complete,
then potential drops in capacity can be tolerated. However, with the addition of
Spot VMs, jobs frequently run faster and for a lower cost.

Batch pools can be configured to use Spot VMs in a few ways:

A pool can use only Spot VMs. In this case, Batch recovers any preempted capacity
when available. This configuration is the cheapest way to execute jobs.
Spot VMs can be used with a fixed baseline of dedicated VMs. The fixed number of
dedicated VMs ensures there's always some capacity to keep a job progressing.
A pool can use a dynamic mix of dedicated and Spot VMs, so that the cheaper
Spot VMs are solely used when available, but the full-priced dedicated VMs scale
up when required. This configuration keeps a minimum amount of capacity
available to keep jobs progressing.

Keep in mind the following practices when planning your use of Spot VMs:

To maximize the use of surplus capacity in Azure, suitable jobs can scale out.
Occasionally, VMs might not be available or are preempted, which results in
reduced capacity for jobs and could lead to task interruption and reruns.
Tasks with shorter execution times tend to work best with Spot VMs. Jobs with
longer tasks might be impacted more if interrupted. If long-running tasks
implement checkpointing to save progress as they execute, this impact might be
reduced.
Long-running MPI jobs that utilize multiple VMs aren't well suited for Spot VMs,
because one preempted VM can lead to the whole job having to run again.
Spot nodes may be marked as unusable if network security group (NSG) rules are
configured incorrectly.
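
The checkpointing practice mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a Batch API: the function name, the checkpoint file format, and the idea of passing in a checkpoint path are all assumptions. Because data stored locally on a preempted VM is lost, the sketch assumes the checkpoint path points at durable storage, such as a mounted file share.

```python
import json
import os
import tempfile


def process_items(items, checkpoint_path, handle):
    """Process items in order, resuming after the last checkpointed index.

    Returns the number of items handled in this run.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]
    for index in range(start, len(items)):
        handle(items[index])
        # Write to a temporary file and rename it, so an interruption
        # mid-write can't leave a corrupt checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": index + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return len(items) - start


if __name__ == "__main__":
    # A temporary directory stands in for durable storage in this demo.
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "checkpoint.json")
        done = process_items(["a", "b", "c"], path, print)
        print(f"processed {done} items")
```

When Batch requeues the task after a preemption, the rerun skips the items that were already recorded, so only the interrupted item is repeated.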

Create and manage pools with Spot VMs


A Batch pool can contain both dedicated and Spot VMs (also referred to as compute
nodes). You can set the target number of compute nodes for both dedicated and Spot
VMs. The target number of nodes specifies the number of VMs you want to have in the
pool.

The following example creates a pool using Azure virtual machines, in this case Linux
VMs, with a target of 5 dedicated VMs and 20 Spot VMs:

C#

ImageReference imageRef = new ImageReference(
    publisher: "Canonical",
    offer: "UbuntuServer",
    sku: "20.04-LTS",
    version: "latest");

// Create the pool. The image reference is passed first, then the node agent SKU ID.
VirtualMachineConfiguration virtualMachineConfiguration =
    new VirtualMachineConfiguration(imageRef, "batch.node.ubuntu 20.04");

CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: "vmpool",
    targetDedicatedComputeNodes: 5,
    targetLowPriorityComputeNodes: 20,
    virtualMachineSize: "Standard_D2_v2",
    virtualMachineConfiguration: virtualMachineConfiguration);

You can get the current number of nodes for both dedicated and Spot VMs:

C#

int? numDedicated = pool.CurrentDedicatedComputeNodes;
int? numLowPri = pool.CurrentLowPriorityComputeNodes;

Pool nodes have a property to indicate if the node is a dedicated or Spot VM:

C#
bool? isNodeDedicated = poolNode.IsDedicated;

Spot VMs might occasionally be preempted. When preemption happens, tasks that were
running on the preempted node VMs are requeued and run again when capacity
returns.

For Virtual Machine Configuration pools, Batch also performs the following behaviors:

The preempted VMs have their state updated to Preempted.
The VM is effectively deleted, leading to loss of any data stored locally on the VM.
A list nodes operation on the pool still returns the preempted nodes.
The pool continually attempts to reach the target number of Spot nodes available.
When replacement capacity is found, the nodes keep their IDs, but are reinitialized,
going through Creating and Starting states before they're available for task
scheduling.
Preemption counts are available as a metric in the Azure portal.
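
Because a list nodes operation still returns preempted nodes, a monitoring script can separate them from usable nodes by state. The following Python sketch works on node records held as plain dictionaries. The node IDs are hypothetical, and the lowercase state strings (such as preempted and idle) are assumed to match what the Batch REST API returns.

```python
from collections import Counter


def summarize_node_states(nodes):
    """Count nodes per state, for example {'idle': 2, 'preempted': 1}."""
    return Counter(node["state"] for node in nodes)


def preempted_node_ids(nodes):
    """Return the IDs of nodes that are currently preempted."""
    return [node["id"] for node in nodes if node["state"] == "preempted"]


# Hypothetical records, shaped like entries from a list nodes response.
sample_nodes = [
    {"id": "node-1", "state": "idle", "isDedicated": True},
    {"id": "node-2", "state": "preempted", "isDedicated": False},
    {"id": "node-3", "state": "running", "isDedicated": False},
]

if __name__ == "__main__":
    print(dict(summarize_node_states(sample_nodes)))
    print(preempted_node_ids(sample_nodes))
```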

Scale pools containing Spot VMs


As with pools solely consisting of dedicated VMs, it's possible to scale a pool containing
Spot VMs by calling the Resize method or by using autoscale.

The pool resize operation takes a second optional parameter that updates the value of
targetLowPriorityNodes:

C#

pool.Resize(
    targetDedicatedComputeNodes: 0,
    targetLowPriorityComputeNodes: 25);

The pool autoscale formula supports Spot VMs as follows:

You can get or set the value of the service-defined variable
$TargetLowPriorityNodes.

You can get the value of the service-defined variable $CurrentLowPriorityNodes.

You can get the value of the service-defined variable $PreemptedNodeCount. This
variable returns the number of nodes in the preempted state and allows you to
scale up or down the number of dedicated nodes, depending on the number of
preempted nodes that are unavailable.
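
To make the role of the preempted node count concrete, the following Python sketch models the arithmetic that an autoscale formula could apply: dedicated nodes replace preempted Spot nodes, up to a fixed pool size. This is ordinary Python rather than autoscale formula syntax, and the function name and the 25-node cap are hypothetical.

```python
def plan_node_targets(max_nodes: int, preempted_nodes: int) -> tuple:
    """Return (target_dedicated, target_spot) for a pool capped at max_nodes.

    Dedicated nodes backfill preempted Spot capacity; the rest of the
    pool stays on Spot nodes.
    """
    target_dedicated = min(max_nodes, max(0, preempted_nodes))
    target_spot = max_nodes - target_dedicated
    return target_dedicated, target_spot


if __name__ == "__main__":
    for preempted in (0, 4, 40):
        dedicated, spot = plan_node_targets(25, preempted)
        print(f"preempted={preempted}: dedicated={dedicated}, spot={spot}")
```

With no preemptions, the pool runs entirely on Spot nodes; as preemptions rise, dedicated nodes take over one for one until the cap is reached.
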
Configure jobs and tasks
Jobs and tasks may require some extra configuration for Spot nodes:

The JobManagerTask property of a job has an AllowLowPriorityNode property.
When this property is true, the job manager task can be scheduled on either a
dedicated or Spot node. If it's false, the job manager task is scheduled to a
dedicated node only.
The AZ_BATCH_NODE_IS_DEDICATED environment variable is available to a task
application so that it can determine whether it's running on a Spot or on a
dedicated node.
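
A task application can read this variable at startup and adapt its behavior. The following Python sketch checkpoints more often on Spot nodes, where preemption is possible. It assumes the variable holds the string true on dedicated nodes, and it treats any other value (including an unset variable) as a Spot node to stay on the cautious side. The interval values and function names are hypothetical.

```python
import os


def is_dedicated_node(environ=os.environ) -> bool:
    """Return True only when the variable explicitly says 'true'."""
    value = environ.get("AZ_BATCH_NODE_IS_DEDICATED", "")
    return value.strip().lower() == "true"


def checkpoint_interval_seconds(environ=os.environ) -> int:
    """Pick a checkpoint interval (hypothetical values)."""
    # Spot nodes can be preempted at any time, so save progress more often.
    return 600 if is_dedicated_node(environ) else 60


if __name__ == "__main__":
    kind = "dedicated" if is_dedicated_node() else "Spot (or unknown)"
    print(f"node type: {kind}, checkpoint every {checkpoint_interval_seconds()}s")
```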

View metrics for Spot VMs


New metrics are available in the Azure portal for Spot nodes. These metrics are:

Low-Priority Node Count
Low-Priority Core Count
Preempted Node Count

To view these metrics in the Azure portal:

1. Navigate to your Batch account in the Azure portal.
2. Select Metrics from the Monitoring section.
3. Select the metrics you desire from the Metric list.

Limitations
Spot VMs in Batch don't support setting a max price and don't support price-
based evictions. They can only be evicted for capacity reasons.
Spot VMs are only available for Virtual Machine Configuration pools and not for
Cloud Service Configuration pools, which are deprecated.
Spot VMs aren't available for some clouds, VM sizes, and subscription offer types.
See more about Spot VM limitations.
Currently, ephemeral OS disks aren't supported with Spot VMs due to the service-
managed eviction policy of Stop-Deallocate.

Next steps
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
Learn about the Batch APIs and tools available for building Batch solutions.
Start to plan the move from low-priority VMs to Spot VMs. If you use low-priority
VMs with Cloud Services Configuration pools (which are deprecated), plan to
migrate to Virtual Machine Configuration pools instead.



Azure Batch pool and node errors
Article • 01/22/2025

Some Azure Batch pool creation and management operations happen immediately.
Detecting failures for these operations is straightforward, because errors usually return
immediately from the API, command line, or user interface. However, some operations
are asynchronous, run in the background, and take several minutes to complete. This
article describes ways to detect and avoid failures that can occur in the background
operations for pools and nodes.

Make sure to set your applications to implement comprehensive error checking,
especially for asynchronous operations. Comprehensive error checking can help you
promptly identify and diagnose issues.

Pool errors
Pool errors might be related to resize timeout or failure, automatic scaling failure, or
pool deletion failure. With the inclusion of more detailed error messages, diagnosing
and resolving these issues has become more straightforward.

Relay Provider Error Details


Relay provider errors are directly relayed from the underlying Azure resource providers,
such as the Azure Virtual Machine Scale Set (VMSS), and they offer deeper insights into
why a pool operation failed. These errors typically occur when a pool's creation, resizing,
or deletion is impacted by a lower-layer service issue.

Structure of Relay Provider Error


These errors are provided in a structured JSON format containing the following key
components:

Error Code: The type of error encountered (e.g., AllocationFailed, BadRequest, etc.).
Error Message: A brief description of the error.
Provider Error Json: A detailed error message generated by the underlying Azure
service (e.g., VMSS).
Provider Error Truncated: A Boolean indicating whether the provider error
message has been truncated due to size limits.
Example Relay Provider Errors

Example 1

Error Code: AllocationFailed
Error Message: Desired number of dedicated nodes could not be allocated
Provider Error JSON:

JSON

{
  "error": {
    "code": "BadRequest",
    "message": "The selected VM size 'STANDARD_A1_V2' cannot boot Hypervisor Generation '2'. If this was a Create operation, please ensure that the Hypervisor Generation of the Image matches the Hypervisor Generation of the selected VM Size. If this was an Update operation, please choose a Hypervisor Generation '2' VM Size."
  }
}

Provider Error JSON Truncated: False

This error indicates a mismatch between the VM size and the Hypervisor generation. The
error message suggests selecting a compatible VM size to resolve the issue.

Example 2

Error Code: AllocationFailed
Error Message: An internal error was occurred while resizing the pool
Provider Error JSON:

JSON

{
  "error": {
    "code": "ScopeLocked",
    "message": "The scope '/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.Compute/VirtualMachineScaleSets/<guid>-azurebatch-VMSS-D' cannot perform write operation because the following scope(s) are locked: '/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.Compute/VirtualMachineScaleSets/<guid>-azurebatch-VMSS-D'. Please remove the lock and try again."
  }
}
Provider Error JSON Truncated: False

This error indicates that the pool resize operation failed because a scope was locked,
preventing the write operation; removing the lock can resolve the issue.

Relay provider errors offer deeper insights into pool operation failures, making it easier
to diagnose and resolve issues directly from the Azure services.
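
If your application logs pool errors, it can pull the inner code and message out of the provider error JSON. The following Python sketch assumes the payload has the error.code and error.message shape shown in the examples above. The function name is hypothetical, and the fallback to raw text is a design choice for payloads that were truncated and no longer parse.

```python
import json


def summarize_provider_error(provider_error_json: str, truncated: bool = False) -> str:
    """Return 'code: message' from a relayed provider error payload."""
    try:
        error = json.loads(provider_error_json)["error"]
        summary = f"{error['code']}: {error['message']}"
    except (ValueError, KeyError, TypeError):
        # A truncated or unexpected payload may not parse; keep the raw text.
        summary = provider_error_json.strip()
    return (summary + " [truncated]") if truncated else summary


if __name__ == "__main__":
    payload = (
        '{"error": {"code": "ScopeLocked", '
        '"message": "Please remove the lock and try again."}}'
    )
    print(summarize_provider_error(payload))
```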

Resize timeout or failure


When you create a new pool or resize an existing pool, you specify the target number of
nodes. The create or resize operation completes immediately, but the actual allocation
of new nodes or removal of existing nodes might take several minutes. You can specify
the resize timeout in the Pool - Add or Pool - Resize APIs. If Batch can't allocate the
target number of nodes during the resize timeout period, the pool goes into a steady
state, and reports resize errors.

The resizeError property lists the errors that occurred for the most recent evaluation.

Common causes for resize errors include:

Resize timeout too short. Usually, the default timeout of 15 minutes is long
enough to allocate or remove pool nodes. If you're allocating a large number of
nodes, such as more than 1,000 nodes from an Azure Marketplace image, or more
than 300 nodes from a custom virtual machine (VM) image, you can set the resize
timeout to 30 minutes.

Insufficient core quota. A Batch account is limited in the number of cores it can
allocate across all pools, and stops allocating nodes once it reaches that quota. You
can increase the core quota so Batch can allocate more nodes. For more
information, see Batch service quotas and limits.

Insufficient subnet IPs when a pool is in a virtual network. A virtual network
subnet must have enough IP addresses to allocate to every requested pool node.
Otherwise, the nodes can't be created. For more information, see Create an Azure
Batch pool in a virtual network.

Insufficient resources when a pool is in a virtual network. When you create a pool
in a virtual network, you might create resources such as load balancers, public IPs,
and network security groups (NSGs) in the same subscription as the Batch account.
Make sure the subscription quotas are sufficient for these resources.

Large pools with custom VM images. Large pools that use custom VM images can
take longer to allocate, and resize timeouts can occur. For recommendations on
limits and configuration, see Create a pool with the Azure Compute Gallery.
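
The resize timeout guidance in the first item of this list can be turned into a small helper. The following Python sketch returns a timeout as an ISO 8601 duration string, which is the format assumed here for the resizeTimeout property of the REST API. The function name is hypothetical, and the thresholds are the ones stated above (more than 1,000 Marketplace nodes or more than 300 custom-image nodes).

```python
def suggest_resize_timeout(node_count: int, custom_image: bool) -> str:
    """Suggest a resize timeout as an ISO 8601 duration string."""
    # Custom images allocate more slowly, so the threshold is lower.
    large_pool_threshold = 300 if custom_image else 1000
    minutes = 30 if node_count > large_pool_threshold else 15
    return f"PT{minutes}M"


if __name__ == "__main__":
    print(suggest_resize_timeout(50, custom_image=False))
    print(suggest_resize_timeout(500, custom_image=True))
```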

Automatic scaling failures


You can set Azure Batch to automatically scale the number of nodes in a pool, and you
define the parameters for the automatic scaling formula for the pool. The Batch service
then uses the formula to periodically evaluate the number of nodes in the pool and set
new target numbers. For more information, see Create an automatic formula for scaling
compute nodes in a Batch pool.

The following issues can occur when you use automatic scaling:

The automatic scaling evaluation fails.
The resulting resize operation fails and times out.
A problem with the automatic scaling formula leads to incorrect node target
values. The resize might either work or time out.

To get information about the last automatic scaling evaluation, use the autoScaleRun
property. This property reports the evaluation time, the values and result, and any
performance errors.

The pool resize complete event captures information about all evaluations.
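
A monitoring script can inspect the autoScaleRun property after each evaluation. The following Python sketch assumes the property is available as a dictionary with an optional error entry and a results string of semicolon-separated name=value pairs; that shape, the sample values, and both function names are assumptions to verify against the pool data you actually receive.

```python
def parse_autoscale_results(results: str) -> dict:
    """Split 'name=value;name=value' into a dictionary of strings."""
    pairs = (item.split("=", 1) for item in results.split(";") if "=" in item)
    return {name.strip(): value.strip() for name, value in pairs}


def autoscale_run_failed(auto_scale_run: dict) -> bool:
    """Report whether the last evaluation recorded an error."""
    return auto_scale_run.get("error") is not None


# Hypothetical evaluation record used for illustration.
sample_run = {
    "results": "$TargetDedicatedNodes=2;$TargetLowPriorityNodes=0;"
               "$NodeDeallocationOption=requeue",
}

if __name__ == "__main__":
    print(autoscale_run_failed(sample_run))
    print(parse_autoscale_results(sample_run["results"]))
```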

Pool deletion failures


To delete a pool that contains nodes, Batch first deletes the nodes, which can take
several minutes to complete. Batch then deletes the pool object itself.

Batch sets the poolState to deleting during the deletion process. The calling application
can detect if the pool deletion is taking too long by using the state and
stateTransitionTime properties.
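A calling application could apply the state and stateTransitionTime check roughly as follows. This sketch assumes the pool is available as a dictionary with `state` and an ISO 8601 `stateTransitionTime` ending in `Z`. The 30-minute limit is an arbitrary example value, not a Batch default.

```python
from datetime import datetime, timedelta, timezone


def deletion_is_overdue(
    pool: dict,
    now: datetime,
    limit: timedelta = timedelta(minutes=30),  # hypothetical threshold
) -> bool:
    """Return True if the pool has been in the deleting state longer than limit."""
    if pool.get("state") != "deleting":
        return False
    # Convert the trailing "Z" so datetime.fromisoformat accepts the timestamp.
    started = datetime.fromisoformat(pool["stateTransitionTime"].replace("Z", "+00:00"))
    return now - started > limit


if __name__ == "__main__":
    pool = {"state": "deleting", "stateTransitionTime": "2025-01-01T00:00:00Z"}
    print(deletion_is_overdue(pool, datetime.now(timezone.utc)))
```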

If the pool deletion is taking longer than expected, Batch retries periodically until the
pool is successfully deleted. In some cases, the delay is due to an Azure service outage
or other temporary issues. Other factors that prevent successful pool deletion might
require you to take action to correct the issue. These factors can include the following
issues:

Resource locks might be placed on Batch-created resources, or on network
resources that Batch uses.

Resources that you created might depend on a Batch-created resource. For
instance, if you create a pool in a virtual network, Batch creates an NSG, a public IP
address, and a load balancer. If you're using these resources outside the pool, you
can't delete the pool.

The Microsoft.Batch resource provider might be unregistered from the
subscription that contains your pool.

For user subscription mode Batch accounts, Microsoft Azure Batch might no
longer have the Contributor or Owner role to the subscription that contains your
pool. For more information, see Allow Batch to access the subscription.

Node errors
Even when Batch successfully allocates nodes in a pool, various issues can cause some
nodes to be unhealthy and unable to run tasks. These nodes still incur charges, so it's
important to detect problems to avoid paying for nodes you can't use. Knowing about
common node errors and knowing the current jobState is useful for troubleshooting.

Start task failures


You can specify an optional startTask for a pool. As with any task, the start task uses a
command line and can download resource files from storage. The start task runs for
each node when the node starts. The waitForSuccess property specifies whether Batch
waits until the start task completes successfully before it schedules any tasks to a node.
If you configure the node to wait for successful start task completion, but the start task
fails, the node isn't usable but still incurs charges.

You can detect start task failures by using the taskExecutionResult and
taskFailureInformation properties of the top-level startTaskInformation node property.

A failed start task also causes Batch to set the computeNodeState to starttaskfailed , if
waitForSuccess was set to true .

As with any task, there can be many causes for a start task failure. To troubleshoot,
check the stdout, stderr, and any other task-specific log files.

Start tasks must be re-entrant, because the start task can run multiple times on the same
node, for example when the node is reimaged or rebooted. In rare cases, when a start
task runs after an event causes a node reboot, one operating system (OS) or ephemeral
disk reimages while the other doesn't. Since Batch start tasks and all Batch tasks run
from the ephemeral disk, this situation isn't usually a problem. However, in cases where
the start task installs an application to the OS disk and keeps other data on the
ephemeral disk, there can be sync problems. Protect your application accordingly if you
use both disks.
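One common way to make a start task re-entrant is to guard each setup step with a marker file, so a second run skips work that's already done. The following sketch illustrates the idea. The `run_step_once` helper and the marker path are hypothetical, and the marker is deliberately kept on the same disk as the thing it tracks, so that if that disk is reimaged, the marker disappears with it and the step runs again.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_step_once(marker: Path, step: Callable[[], None]) -> bool:
    """Run step unless marker already exists. Return True if the step ran."""
    if marker.exists():
        return False
    step()
    # Write the marker only after the step succeeds, so a failed step is retried.
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text("done")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "markers" / "app-installed"
        print(run_step_once(marker, lambda: print("installing application")))
        print(run_step_once(marker, lambda: print("installing application")))
```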

Application package download failure


You can specify one or more application packages for a pool. Batch downloads the
specified package files to each node and uncompresses the files after the node starts,
but before it schedules tasks. It's common to use a start task command with application
packages, for example to copy files to a different location or to run setup.

If an application package fails to download and uncompress, the computeNodeError
property reports the failure, and Batch sets the node state to unusable.

Container download failure


You can specify one or more container references on a pool. Batch downloads the
specified containers to each node. If the container fails to download, the
computeNodeError property reports the failure, and Batch sets the node state to unusable.

Node OS updates
For Windows pools, enableAutomaticUpdates is set to true by default. Although
allowing automatic updates is recommended, updates can interrupt task progress,
especially if the tasks are long-running. You can set this value to false if you need to
ensure that an OS update doesn't happen unexpectedly.

Node in unusable state


Batch might set the computeNodeState to unusable for many reasons. You can't
schedule tasks to an unusable node, but the node still incurs charges.

If Batch can determine the cause, the computeNodeError property reports it. If a node is
in an unusable state, but has no computeNodeError, it means Batch is unable to
communicate with the VM. In this case, Batch always tries to recover the VM. However,
Batch doesn't automatically attempt to recover VMs that failed to install application
packages or containers, even if their state is unusable .

Other reasons for unusable nodes might include the following causes:

A custom VM image is invalid. For example, the image isn't properly prepared.
A VM is moved because of an infrastructure failure or a low-level upgrade. Batch
recovers the node.
A VM image has been deployed on hardware that doesn't support it.
The VMs are in an Azure virtual network, and traffic has been blocked to key ports.
The VMs are in a virtual network, but outbound traffic to Azure Storage is blocked.
The VMs are in a virtual network with a custom DNS configuration, and the DNS
server can't resolve Azure storage.
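To avoid paying for nodes that can't run tasks, it can help to sort problem nodes by what action they need. The following sketch groups nodes by the conditions described above. It assumes each node is a dictionary with `id`, `state`, and an optional `errors` list, which is an illustrative shape rather than a guaranteed response format.

```python
from typing import Dict, Iterable, List


def triage_nodes(nodes: Iterable[dict]) -> Dict[str, List[str]]:
    """Group node IDs by the kind of problem their state indicates."""
    report: Dict[str, List[str]] = {
        "start_task_failed": [],
        "unusable_with_error": [],  # Batch identified a cause, such as a failed package install
        "unusable_no_error": [],    # Batch can't reach the VM and tries to recover it
    }
    for node in nodes:
        state = node.get("state")
        if state == "starttaskfailed":
            report["start_task_failed"].append(node["id"])
        elif state == "unusable":
            key = "unusable_with_error" if node.get("errors") else "unusable_no_error"
            report[key].append(node["id"])
    return report


if __name__ == "__main__":
    sample = [
        {"id": "node-1", "state": "idle"},
        {"id": "node-2", "state": "unusable", "errors": [{"code": "DiskFull"}]},
        {"id": "node-3", "state": "unusable"},
    ]
    print(triage_nodes(sample))
```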

Node agent log files


The Batch agent process that runs on each pool node provides log files that might help
if you need to contact support about a pool node issue. You can upload log files for a
node via the Azure portal, Batch Explorer, or the Compute Node - Upload Batch Service
Logs API. After you upload and save the log files, you can delete the node or pool to
save the cost of running the nodes.

Node disk full


Batch uses the temporary drive on a node pool VM to store files such as the following
job files, task files, and shared files:

Application package files


Task resource files
Application-specific files downloaded to one of the Batch folders
Stdout and stderr files for each task application execution
Application-specific output files

Files like application packages or start task resource files are written only once, when
Batch creates the pool node. Even though they're written only once, if these files are too
large they could fill the temporary drive.

Other files, such as stdout and stderr, are written for each task that a node runs. If a large
number of tasks run on the same node, or the task files are too large, they could fill the
temporary drive.

The node also needs a small amount of space on the OS disk to create users after it
starts.

The size of the temporary drive depends on the VM size. One consideration when
picking a VM size is to ensure that the temporary drive has enough space for the
planned workload.
When you add a pool in the Azure portal, you can display the full list of VM sizes,
including a Resource disk size column. The articles that describe VM sizes have tables
with a Temp Storage column. For more information, see Compute optimized virtual
machine sizes. For an example size table, see Fsv2-series.

You can specify a retention time for files written by each task. The retention time
determines how long to keep the task files before automatically cleaning them up. You
can reduce the retention time to lower storage requirements.

If the temporary or OS disk runs out of space, or is close to running out of space, the
node moves to the unusable computeNodeState, and the node error says that the disk is
full.

If you're not sure what's taking up space on the node, try remote connecting to the
node and investigating manually. You can also use the File - List From Compute Node
API to examine files, for example task outputs, in Batch managed folders. This API only
lists files in the Batch managed directories. If your tasks created files elsewhere, this API
doesn't show them.
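If you connect to the node to investigate manually, a small script can show which folders take the most space. The following sketch is one possible approach, not a Batch tool. It walks a directory you pass in and lists its largest immediate children. The function names are hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import List, Tuple


def directory_size(path: Path) -> int:
    """Return the total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += (Path(root) / name).stat().st_size
            except OSError:
                # Skip files that vanish or can't be read while scanning.
                continue
    return total


def largest_children(root: Path, top: int = 5) -> List[Tuple[str, int]]:
    """Return the largest immediate children of root as (name, bytes) pairs."""
    sizes = []
    for child in root.iterdir():
        size = directory_size(child) if child.is_dir() else child.stat().st_size
        sizes.append((child.name, size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "task-output").mkdir()
        (Path(tmp) / "task-output" / "stdout.txt").write_bytes(b"x" * 2048)
        (Path(tmp) / "small.log").write_bytes(b"x" * 16)
        for name, size in largest_children(Path(tmp)):
            print(f"{size:>10}  {name}")
```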

After you make sure to retrieve any data you need from the node or upload it to a
durable store, you can delete data as needed to free up space.

You can delete old completed jobs or tasks whose task data is still on the nodes. Look in
the recentTasks collection in the taskInformation on the node, or use the File - List
From Compute Node API. Deleting a job deletes all the tasks in the job. Deleting the
tasks in the job triggers deletion of data in the task directories on the nodes, and frees
up space. Once you've freed up enough space, reboot the node. The node should move
out of unusable state and into idle again.

To recover an unusable node in VirtualMachineConfiguration pools, you can remove the
node from the pool by using the Pool - Remove Nodes API. Then you can grow the pool
again to replace the bad node with a fresh one.

Important

Reimage isn't currently supported for VirtualMachineConfiguration pools.

Next steps
Learn about job and task error checking.
Learn about best practices for working with Azure Batch.


Use Azure Pipelines to build and deploy an
HPC solution
Article • 04/02/2025

Azure DevOps tools can automate building and testing Azure Batch high performance
computing (HPC) solutions. Azure Pipelines provides modern continuous integration (CI) and
continuous deployment (CD) processes for building, deploying, testing, and monitoring
software. These processes accelerate your software delivery, allowing you to focus on your code
rather than support infrastructure and operations.

This article shows how to set up CI/CD processes by using Azure Pipelines with Azure Resource
Manager templates (ARM templates) to deploy HPC solutions on Azure Batch. The example
creates a build and release pipeline to deploy an Azure Batch infrastructure and release an
application package. The following diagram shows the general deployment flow, assuming the
code is developed locally:

Prerequisites
To follow the steps in this article, you need:

An Azure DevOps organization, and an Azure DevOps project with an Azure Repos
repository created in the organization. You must have Project Administrator, Build
Administrator, and Release Administrator roles in the Azure DevOps project.

An active Azure subscription with Owner or other role that includes role assignment
abilities. For more information, see Understand Azure role assignments.

A basic understanding of source control and ARM template syntax.

Prepare the solution


The example in this article uses several ARM templates and an existing open-source video
processing application, FFmpeg . You can copy or download these resources and push them
to your Azure Repos repository.

Important

This example deploys Windows software on Windows-based Batch nodes. Azure Pipelines,
ARM templates, and Batch also fully support Linux software and nodes.

Understand the ARM templates


Three capability templates, similar to units or modules, implement specific pieces of
functionality. An end-to-end solution template then deploys the underlying capability
templates. This linked template structure allows each capability template to be individually
tested and reused across solutions.

For detailed information about the templates, see the Resource Manager template reference
guide for Microsoft.Batch resource types.
Storage account template
Save the following code as a file named storageAccount.json. This template defines an Azure
Storage account, which is required to deploy the application to the Batch account.

JSON

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "accountName": {
      "type": "string",
      "metadata": {
        "description": "Name of the Azure Storage Account"
      }
    }
  },
  "variables": {},
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "name": "[parameters('accountName')]",
      "sku": {
        "name": "Standard_LRS"
      },
      "apiVersion": "2018-02-01",
      "location": "[resourceGroup().location]",
      "properties": {}
    }
  ],
  "outputs": {
    "blobEndpoint": {
      "type": "string",
      "value": "[reference(resourceId('Microsoft.Storage/storageAccounts', parameters('accountName'))).primaryEndpoints.blob]"
    },
    "resourceId": {
      "type": "string",
      "value": "[resourceId('Microsoft.Storage/storageAccounts', parameters('accountName'))]"
    }
  }
}

Batch account template


Save the following code as a file named batchAccount.json. This template defines a Batch
account. The Batch account acts as a platform to run applications across node pools.

JSON

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "batchAccountName": {
      "type": "string",
      "metadata": {
        "description": "Name of the Azure Batch Account"
      }
    },
    "storageAccountId": {
      "type": "string",
      "metadata": {
        "description": "ID of the Azure Storage Account"
      }
    }
  },
  "variables": {},
  "resources": [
    {
      "name": "[parameters('batchAccountName')]",
      "type": "Microsoft.Batch/batchAccounts",
      "apiVersion": "2017-09-01",
      "location": "[resourceGroup().location]",
      "properties": {
        "poolAllocationMode": "BatchService",
        "autoStorage": {
          "storageAccountId": "[parameters('storageAccountId')]"
        }
      }
    }
  ],
  "outputs": {}
}

Batch pool template


Save the following code as a file named batchAccountPool.json. This template creates a node
pool and nodes in the Batch account.

JSON

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "batchAccountName": {
      "type": "string",
      "metadata": {
        "description": "Name of the Azure Batch Account"
      }
    },
    "batchAccountPoolName": {
      "type": "string",
      "metadata": {
        "description": "Name of the Azure Batch Account Pool"
      }
    }
  },
  "variables": {},
  "resources": [
    {
      "name": "[concat(parameters('batchAccountName'),'/', parameters('batchAccountPoolName'))]",
      "type": "Microsoft.Batch/batchAccounts/pools",
      "apiVersion": "2017-09-01",
      "properties": {
        "deploymentConfiguration": {
          "virtualMachineConfiguration": {
            "imageReference": {
              "publisher": "MicrosoftWindowsServer",
              "offer": "WindowsServer",
              "sku": "2022-datacenter",
              "version": "latest"
            },
            "nodeAgentSkuId": "batch.node.windows amd64"
          }
        },
        "vmSize": "Standard_D2s_v3"
      }
    }
  ],
  "outputs": {}
}

Orchestrator template
Save the following code as a file named deployment.json. This final template acts as an
orchestrator to deploy the three underlying capability templates.

JSON

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "StorageContainerUri": {
      "type": "string",
      "metadata": {
        "description": "URI of the Blob Storage Container containing the Azure Resource Manager templates"
      }
    },
    "StorageContainerSasToken": {
      "type": "string",
      "metadata": {
        "description": "The SAS token of the container containing the Azure Resource Manager templates"
      }
    },
    "applicationStorageAccountName": {
      "type": "string",
      "metadata": {
        "description": "Name of the Azure Storage Account"
      }
    },
    "batchAccountName": {
      "type": "string",
      "metadata": {
        "description": "Name of the Azure Batch Account"
      }
    },
    "batchAccountPoolName": {
      "type": "string",
      "metadata": {
        "description": "Name of the Azure Batch Account Pool"
      }
    }
  },
  "variables": {},
  "resources": [
    {
      "apiVersion": "2017-05-10",
      "name": "storageAccountDeployment",
      "type": "Microsoft.Resources/deployments",
      "properties": {
        "mode": "Incremental",
        "templateLink": {
          "uri": "[concat(parameters('StorageContainerUri'), 'arm-templates/storageAccount.json', parameters('StorageContainerSasToken'))]",
          "contentVersion": "1.0.0.0"
        },
        "parameters": {
          "accountName": {"value": "[parameters('applicationStorageAccountName')]"}
        }
      }
    },
    {
      "apiVersion": "2017-05-10",
      "name": "batchAccountDeployment",
      "type": "Microsoft.Resources/deployments",
      "dependsOn": [
        "storageAccountDeployment"
      ],
      "properties": {
        "mode": "Incremental",
        "templateLink": {
          "uri": "[concat(parameters('StorageContainerUri'), 'arm-templates/batchAccount.json', parameters('StorageContainerSasToken'))]",
          "contentVersion": "1.0.0.0"
        },
        "parameters": {
          "batchAccountName": {"value": "[parameters('batchAccountName')]"},
          "storageAccountId": {"value": "[reference('storageAccountDeployment').outputs.resourceId.value]"}
        }
      }
    },
    {
      "apiVersion": "2017-05-10",
      "name": "poolDeployment",
      "type": "Microsoft.Resources/deployments",
      "dependsOn": [
        "batchAccountDeployment"
      ],
      "properties": {
        "mode": "Incremental",
        "templateLink": {
          "uri": "[concat(parameters('StorageContainerUri'), 'arm-templates/batchAccountPool.json', parameters('StorageContainerSasToken'))]",
          "contentVersion": "1.0.0.0"
        },
        "parameters": {
          "batchAccountName": {"value": "[parameters('batchAccountName')]"},
          "batchAccountPoolName": {"value": "[parameters('batchAccountPoolName')]"}
        }
      }
    }
  ],
  "outputs": {}
}
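
The templateLink values in the orchestrator template are built with concat, which joins its arguments without adding separators. The following sketch mirrors that concatenation to show what the final URI looks like. It assumes the container URI ends with a slash and the SAS token begins with a question mark. The helper name and the sample values are hypothetical.

```python
def linked_template_uri(container_uri: str, template_file: str, sas_token: str) -> str:
    """Mirror concat(StorageContainerUri, 'arm-templates/<file>', StorageContainerSasToken)."""
    if not container_uri.endswith("/"):
        raise ValueError("container URI must end with '/' because concat adds no separator")
    if not sas_token.startswith("?"):
        raise ValueError("SAS token must start with '?' so it forms the query string")
    return f"{container_uri}arm-templates/{template_file}{sas_token}"


if __name__ == "__main__":
    print(
        linked_template_uri(
            "https://example.blob.core.windows.net/templates/",  # placeholder URI
            "storageAccount.json",
            "?sv=placeholder",  # placeholder token, not a real SAS
        )
    )
```

If either value is missing its separator, the linked deployment can't resolve the template, so a check like this can be a quick sanity test before a release.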

Set up your repository


Upload the ARM templates, FFmpeg app, and a YAML build definition file into your Azure Repos
repository.

1. Upload the four ARM templates to an arm-templates folder in your repository.

2. For the application package, download and extract the Windows 64-bit version of FFmpeg
4.3.1 , and upload it to a hpc-application folder in your repository.

3. For the build definition, save the following definition as a file named hpc-app.build.yml,
and upload it to a pipelines folder in your repository.

yml

# To publish an application into Batch, you need to
# first zip the file, and then publish an artifact, so
# you can take the necessary steps in your release pipeline.
steps:
# First, zip up the files required in the Batch account.
# For this instance, those are the ffmpeg files.
- task: ArchiveFiles@2
  displayName: 'Archive applications'
  inputs:
    rootFolderOrFile: hpc-application
    includeRootFolder: false
    archiveFile: '$(Build.ArtifactStagingDirectory)/package/$(Build.BuildId).zip'
# Publish the zip file, so you can use it as part
# of your Release pipeline later.
- task: PublishPipelineArtifact@0
  inputs:
    artifactName: 'hpc-application'
    targetPath: '$(Build.ArtifactStagingDirectory)/package'

When you're finished setting up your repository, the folder structure should have the following
main sections:

An arm-templates folder that contains the ARM templates.


A hpc-application folder that contains ffmpeg.
A pipelines folder that contains the YAML build definition file for the Build pipeline.

Note

This example codebase structure demonstrates that you can store application,
infrastructure, and pipeline code in the same repository.

Create the Azure pipeline


After you set up the source code repository, use Azure Pipelines to implement a build, test, and
deployment pipeline for your application. In this stage of a pipeline, you typically run tests to
validate code and build pieces of the software. The number and types of tests, and any other
tasks that you run, depend on your overall build and release strategy.
Create the Build pipeline
In this section, you create a YAML build pipeline to work with the ffmpeg software that runs in
the Batch account.

1. In your Azure DevOps project, select Pipelines from the left navigation, and then select
New pipeline.

2. On the Where is your code screen, select Azure Repos Git.

3. On the Select a repository screen, select your repository.

Note

You can also create a build pipeline by using a visual designer. On the New pipeline
page, select Use the classic editor. You can use a YAML template in the visual
designer. For more information, see Define your Classic pipeline.

4. On the Configure your pipeline screen, select Existing Azure Pipelines YAML file.

5. On the Select an existing YAML file screen, select the hpc-app.build.yml file from your
repository, and then select Continue.
6. On the Review your pipeline YAML screen, review the build configuration, and then select
Run, or select the dropdown caret next to Run and select Save. This template enables
continuous integration, so the build automatically triggers when a new commit to the
repository meets the conditions set in the build.

7. You can view live build progress updates. To see build outcomes, select the appropriate
run from your build definition in Azure Pipelines.

Note

If you use a client application to run your HPC solution, you need to create a separate build
definition for that application. For how-to guides, see the Azure Pipelines documentation.

Create the Release pipeline


You use an Azure Pipelines Release pipeline to deploy your application and underlying
infrastructure. Release pipelines enable CD and automate your release process. There are
several steps to deploy your application and underlying infrastructure.

The linked templates for this solution must be accessible from a public HTTP or HTTPS
endpoint. This endpoint could be a GitHub repository, an Azure Blob Storage account, or
another storage location. To ensure that the uploaded template artifacts remain secure, hold
them in a private mode, but access them by using some form of shared access signature (SAS)
token.

The following example demonstrates how to deploy an infrastructure and application by using
templates from an Azure Storage blob.

Set up the pipeline


1. In your Azure DevOps project, select Pipelines > Releases in the left navigation.

2. On the next screen, select New > New release pipeline.

3. On the Select a template screen, select Empty job, and then close the Stage screen.

4. Select New release pipeline at the top of the page and rename the pipeline to something
relevant for your pipeline, such as Deploy Azure Batch + Pool.

5. In the Artifacts section, select Add.

6. On the Add an artifact screen, select Build and then select your Build pipeline to get the
output for the HPC application.

Note

You can create a Source alias or accept the default. Take note of the Source alias
value, as you need it to create tasks in the release definition.
7. Select Add.

8. On the pipeline page, select Add next to Artifacts to create a link to another artifact, your
Azure Repos repository. This link is required to access the ARM templates in your
repository. ARM templates don't need compilation, so you don't need to push them
through a build pipeline.

Note

Again note the Source alias value to use later.


9. Select the Variables tab. Create the following variables in your pipeline so you don't have
to reenter the same information into multiple tasks.


Name                           Value

applicationStorageAccountName  Name for the storage account to hold the HPC application binaries.

batchAccountApplicationName    Name for the application in the Batch account.

batchAccountName               Name for the Batch account.

batchAccountPoolName           Name for the pool of virtual machines (VMs) to do the processing.

batchApplicationId             Unique ID for the Batch application, of the form:
                               /subscriptions/<subscriptionId>/resourceGroups/<resourceGroupName>/providers/Microsoft.Batch/batchAccounts/<batchAccountName>/applications/<batchAccountApplicationName>
                               Replace the <subscriptionId> placeholder with your Azure subscription
                               ID, and the other placeholders with the values you set for the other
                               variables in this list.

batchApplicationVersion        Semantic version of your Batch application, in this case 4.3.1.

location                       Azure region for the resources to be deployed.

resourceGroupName              Name for the resource group to deploy resources in.

storageAccountName             Name for the storage account to hold the linked ARM templates.

StorageContainerSasToken       $(<referenceName>.StorageContainerSasToken) . Replace the
                               <referenceName> placeholder with the Reference name value you
                               configure in the Output Variables section of the following Azure File
                               Copy step.

StorageContainerUri            $(<referenceName>.StorageContainerUri) . Replace the <referenceName>
                               placeholder with the Reference name value you configure in the
                               Output Variables section of the Azure File Copy step.

10. Select the Tasks tab, and then select Agent job.

11. On the Agent job screen, under Agent pool, select Azure Pipelines.

12. Under Agent Specification, select windows-latest.
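
The batchApplicationId variable in the preceding list is assembled from several of the other values. As an illustration only, the following sketch builds the ID from its parts. The function name and the sample arguments are hypothetical, and the subscription ID shown is a placeholder.

```python
def batch_application_id(
    subscription_id: str,
    resource_group: str,
    batch_account: str,
    application_name: str,
) -> str:
    """Build the resource ID of a Batch application from its component names."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Batch/batchAccounts/{batch_account}"
        f"/applications/{application_name}"
    )


if __name__ == "__main__":
    print(
        batch_application_id(
            "00000000-0000-0000-0000-000000000000",  # placeholder subscription ID
            "my-resource-group",
            "mybatchaccount",
            "ffmpeg",
        )
    )
```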


Add tasks
Create six tasks to:

Download the zipped ffmpeg files.


Deploy a storage account to host the nested ARM templates.
Copy the ARM templates to the storage account.
Deploy the Batch account and required dependencies.
Create an application in the Batch account.
Upload the application package to the Batch account.

For each new task that the following steps specify:

1. Select the + symbol next to Agent job in the left pane.

2. Search for and select the specified task in the right pane.

3. Add or select the properties to configure the task.

4. Select Add.
Create the tasks as follows:

1. Select the Download Pipeline Artifacts task, and set the following properties:

Display name: Enter Download ApplicationPackage to Agent.


Artifact name: Enter hpc-application.
Destination directory: Enter $(System.DefaultWorkingDirectory) .

2. Create an Azure Storage account to store your ARM templates. You could use an existing
storage account, but to support this self-contained example and isolation of content,
make a dedicated storage account.

Select the ARM Template deployment: Resource Group scope task, and set the following
properties:

Display name: Enter Deploy storage account for ARM templates.


Azure Resource Manager connection: Select the appropriate Azure subscription.
Subscription: Select the appropriate Azure subscription.
Action: Select Create or update resource group.
Resource group: Enter $(resourceGroupName) .
Location: Enter $(location) .
Template: Enter $(System.ArtifactsDirectory)/<AzureRepoArtifactSourceAlias>/arm-templates/storageAccount.json .
Replace the <AzureRepoArtifactSourceAlias> placeholder with the repository Source alias
you noted previously.
Override template parameters: Enter -accountName $(storageAccountName) .

3. Upload the artifacts from source control into the storage account. Part of this Azure File
Copy task outputs the Storage account container URI and SAS token to a variable, so they
can be reused in later steps.

Select the Azure File Copy task, and set the following properties:
Display name: Enter AzureBlob File Copy.
Source: Enter $(System.ArtifactsDirectory)/<AzureRepoArtifactSourceAlias>/arm-templates/ .
Replace the <AzureRepoArtifactSourceAlias> placeholder with the repository Source alias
you noted previously.


Azure Subscription: Select the appropriate Azure subscription.
Destination Type: Select Azure Blob.
RM Storage Account: Enter $(storageAccountName) .
Container Name: Enter templates.
Reference name: Expand Output Variables, then enter ffmpeg.

Note

If this step fails, make sure your Azure DevOps organization has the Storage Blob Data
Contributor role in the storage account.

4. Deploy the orchestrator ARM template to create the Batch account and pool. This
template includes parameters for the Storage account container URI and SAS token. The
variables required in the ARM template are held in the variables section of the release
definition and were set from the AzureBlob File Copy task.

Select the ARM Template deployment: Resource Group scope task, and set the following
properties:

Display name: Enter Deploy Azure Batch.


Azure Resource Manager connection: Select the appropriate Azure subscription.
Subscription: Select the appropriate Azure subscription.
Action: Select Create or update resource group.
Resource group: Enter $(resourceGroupName) .
Location: Enter $(location) .
Template location: Select URL of the file.
Template link: Enter $(StorageContainerUri)arm-templates/deployment.json$(StorageContainerSasToken) .
Override template parameters: Enter -StorageContainerUri $(StorageContainerUri) -StorageContainerSasToken $(StorageContainerSasToken) -applicationStorageAccountName $(applicationStorageAccountName) -batchAccountName $(batchAccountName) -batchAccountPoolName $(batchAccountPoolName) .

A common practice is to use Azure Key Vault tasks to supply secrets. If the service principal
connected to your Azure subscription has an appropriate access policy set, the task can
download secrets from Key Vault and expose them as variables in your pipeline. Each variable
takes the name of its secret and holds the secret's value. For example, you could reference
a secret named sshPassword with $(sshPassword) in the release definition.

5. Call Azure CLI to create an application in Azure Batch.


Select the Azure CLI task, and set the following properties:

Display name: Enter Create application in Azure Batch account.


Azure Resource Manager connection: Select the appropriate Azure subscription.
Script Type: Select PowerShell Core.
Script Location: Select Inline script.
Inline Script: Enter az batch application create --application-name $(batchAccountApplicationName) --name $(batchAccountName) --resource-group $(resourceGroupName) .

6. Call Azure CLI to upload associated packages to the application, in this case the ffmpeg
files.

Select the Azure CLI task, and set the following properties:

Display name: Enter Upload package to Azure Batch account.


Azure Resource Manager connection: Select the appropriate Azure subscription.
Script Type: Select PowerShell Core.
Script Location: Select Inline script.
Inline Script: Enter az batch application package create --application-name $(batchAccountApplicationName) --name $(batchAccountName) --resource-group $(resourceGroupName) --version $(batchApplicationVersion) --package-file=$(System.DefaultWorkingDirectory)/$(Release.Artifacts.<AzureBuildArtifactSourceAlias>.BuildId).zip .
Replace the <AzureBuildArtifactSourceAlias> placeholder with the Build Source alias you
noted previously.

Note

The version number of the application package is set to a variable. The variable allows
overwriting previous versions of the package and lets you manually control the
package version pushed to Azure Batch.

Create and run the release


1. When you finish creating all the steps, select Save at the top of the pipeline page, and
then select OK.

2. Select Create release at the top of the page.

3. To view live release status, select the link at the top of the page that says the release has
been created.

4. To view the log output from the agent, hover over the stage and then select the Logs
button.
Test the environment
Once the environment is set up, confirm that the following tests run successfully. Replace the
placeholders with your resource group and Batch account values.

Connect to the Batch account


Connect to the new Batch account by using Azure CLI from a command prompt.

1. Sign in to your Azure account with az login and follow the instructions to authenticate.
2. Authenticate the Batch account with az batch account login -g <resourceGroup> -n
<batchAccount> .

List the available applications

Azure CLI
az batch application list -g <resourceGroup> -n <batchAccount>

Check that the pool is valid

Azure CLI

az batch pool list

In the command output, note the value of currentDedicatedNodes to adjust in the next test.

Resize the pool


Run the following command to resize the pool so there are compute nodes available for job
and task testing. Replace the <poolName> placeholder with your pool name value, and the
<targetNumber> placeholder with a number that's greater than the currentDedicatedNodes from
the previous command output. Check status by running the az batch pool list command until
the resizing completes and shows the target number of nodes.

Azure CLI

az batch pool resize --pool-id <poolName> --target-dedicated-nodes <targetNumber>
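
If you'd rather script the status check than reread the command output by hand, the following sketch shows one way to interpret the JSON that az batch pool list prints. It assumes the output is a JSON array whose entries include `id`, `allocationState`, and `currentDedicatedNodes`. Treat those field names as assumptions and confirm them against your own output. The function name is hypothetical.

```python
import json


def resize_finished(pool_list_json: str, pool_id: str, target: int) -> bool:
    """Return True when the named pool is steady and has reached the target node count."""
    for pool in json.loads(pool_list_json):
        if pool.get("id") == pool_id:
            return (
                pool.get("allocationState") == "steady"
                and pool.get("currentDedicatedNodes") == target
            )
    raise KeyError(f"pool {pool_id!r} not found in output")


if __name__ == "__main__":
    # Sample output shaped like the assumed CLI response, not real data.
    sample = json.dumps(
        [{"id": "mypool", "allocationState": "resizing", "currentDedicatedNodes": 0}]
    )
    print(resize_finished(sample, "mypool", target=2))
```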

Next steps
See these tutorials to learn how to interact with a Batch account via a simple application.

Run a parallel workload with Azure Batch by using the Python API
Run a parallel workload with Azure Batch by using the .NET API



Job preparation and release tasks on Batch compute nodes
07/01/2025

An Azure Batch job often requires setup before its tasks are executed, and post-job
maintenance when its tasks are completed. For example, you might need to download
common task input data to your compute nodes, or upload task output data to Azure Storage
after the job completes. You can use job preparation and job release tasks for these operations.

A job preparation task runs before a job's tasks, on all compute nodes scheduled to run at
least one task.
A job release task runs once the job is completed, on each node in the pool that ran a job
preparation task.

As with other Batch tasks, you can specify a command line to invoke when a job preparation or
release task runs. Job preparation and release tasks offer familiar Batch task features such as:

Resource file download.
Elevated execution.
Custom environment variables.
Maximum execution duration.
Retry count.
File retention time.

This article shows how to use the JobPreparationTask and JobReleaseTask classes in the Batch
.NET library.

Tip

Job preparation and release tasks are especially helpful in shared pool environments, in
which a pool of compute nodes persists between job runs and is used by many jobs.

Use cases for job preparation and release tasks


Job preparation and job release tasks are a good fit for the following scenarios:

Download common task data. Batch jobs often require a common set of data as input for
a job's tasks. You can use a job preparation task to download this data to each node
before the execution of the job's other tasks.
For example, in daily risk analysis calculations, market data is job-specific yet common to
all tasks in the job. You can use a job preparation task to download this market data,
which is often several gigabytes in size, to each compute node so that any task that runs
on the node can use it.

Delete job and task output. In a shared pool environment, where a pool's compute
nodes aren't decommissioned between jobs, you might need to delete job data between
runs. For example, you might need to conserve disk space on the nodes, or satisfy your
organization's security policies. You can use a job release task to delete data that a job
preparation task downloaded or that task execution generated.

Retain logs. You might want to keep a copy of log files that your tasks generate, or crash
dump files that failed applications generate. You can use a job release task to compress
and upload this data to an Azure Storage account.

Job preparation task


Before it runs job tasks, Batch runs the job preparation task on each compute node scheduled
to run a task. By default, Batch waits for the job preparation task to complete before running
scheduled job tasks, but you can configure it not to wait.

If the node restarts, the job preparation task runs again, but you can also disable this behavior.
If you have a job with a job preparation task and a job manager task, the job preparation task
runs before the job manager task and before all other tasks. The job preparation task always
runs first.

The job preparation task runs only on nodes that are scheduled to run a task. This behavior
prevents unnecessary runs on nodes that aren't assigned any tasks. Nodes might not be
assigned any tasks when the number of job tasks is less than the number of nodes in the pool.
This behavior also applies when concurrent task execution is enabled, which leaves some nodes
idle if the task count is lower than the total possible concurrent tasks.
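
To illustrate the rule that only nodes scheduled to run a task execute the job preparation task, here is a small Python model. It is a sketch under stated assumptions, not Batch service code: the node names, task IDs, and the helper nodes_running_job_prep are hypothetical, and the task-to-node schedule is supplied by hand rather than produced by a real scheduler.

```python
def nodes_running_job_prep(task_assignments):
    """Return the set of nodes that run the job preparation task.

    `task_assignments` maps a task ID to the node it was scheduled on.
    Only nodes that receive at least one task run the preparation task.
    """
    return set(task_assignments.values())


pool_nodes = {"node1", "node2", "node3"}

# Hypothetical schedule: the job has fewer tasks than the pool has nodes.
assignments = {"task001": "node1", "task002": "node2"}

prep_nodes = nodes_running_job_prep(assignments)
idle_nodes = pool_nodes - prep_nodes

print(sorted(prep_nodes))  # nodes that run the job preparation task
print(sorted(idle_nodes))  # nodes that never run it for this job
```

The job release task later runs on the same set of nodes, because it runs only where a job preparation task ran.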

Note

JobPreparationTask differs from CloudPool.StartTask in that JobPreparationTask runs at
the start of each job, whereas StartTask runs only when a compute node first joins a pool
or restarts.

Job release task


Once you mark a job as completed, the job release task runs on each node in the pool that ran
a job preparation task. You mark a job as completed by issuing a terminate request. This
request sets the job state to terminating, terminates any active or running tasks associated with
the job, and runs the job release task. The job then moves to the completed state.

Note

Deleting a job also executes the job release task. However, if a job is already terminated,
the release task doesn't run a second time if the job is later deleted.

Job release tasks can run for a maximum of 15 minutes before the Batch service terminates
them. For more information, see the REST API reference documentation.

Job preparation and release tasks with Batch .NET


To run a job preparation task, assign a JobPreparationTask object to your job's
CloudJob.JobPreparationTask property. Similarly, to use a job release task, initialize a
JobReleaseTask and assign it to your job's CloudJob.JobReleaseTask.

In the following code snippet, myBatchClient is an instance of BatchClient, and myPool is an
existing pool within the Batch account.

C#

// Create the CloudJob for CloudPool "myPool"
CloudJob myJob =
    myBatchClient.JobOperations.CreateJob(
        "JobPrepReleaseSampleJob",
        new PoolInformation() { PoolId = "myPool" });

// Specify the command lines for the job preparation and release tasks
string jobPrepCmdLine =
    "cmd /c echo %AZ_BATCH_NODE_ID% > %AZ_BATCH_NODE_SHARED_DIR%\\shared_file.txt";
string jobReleaseCmdLine =
    "cmd /c del %AZ_BATCH_NODE_SHARED_DIR%\\shared_file.txt";

// Assign the job preparation task to the job
myJob.JobPreparationTask =
    new JobPreparationTask { CommandLine = jobPrepCmdLine };

// Assign the job release task to the job
myJob.JobReleaseTask =
    new JobReleaseTask { CommandLine = jobReleaseCmdLine };

await myJob.CommitAsync();

The job release task runs when a job is terminated or deleted. You terminate a job by using
JobOperations.TerminateJobAsync, and delete a job by using JobOperations.DeleteJobAsync.
You typically terminate or delete a job when its tasks are completed, or when a timeout you
define is reached.

C#

// Terminate the job to mark it as completed. Terminate initiates the
// job release task on any node that ran job tasks. Note that the
// job release task also runs when a job is deleted, so you don't
// have to call Terminate if you delete jobs after task completion.
await myBatchClient.JobOperations.TerminateJobAsync("JobPrepReleaseSampleJob");

Code sample on GitHub


To see job preparation and release tasks in action, build and run the JobPrepRelease sample
project from GitHub. This console application takes the following actions:

1. Creates a pool with two nodes.
2. Creates a job with job preparation, release, and standard tasks.
3. Runs the job preparation task, which first writes the node ID to a text file in a node's
shared directory.
4. Runs a task on each node that writes its task ID to the same text file.
5. Once all tasks are completed or the timeout is reached, prints the contents of each node's
text file to the console.
6. Runs the job release task to delete the file from the node when the job is completed.
7. Prints the exit codes of the job preparation and release tasks for each node they ran on.
8. Pauses execution to allow confirmation of job and/or pool deletion.

Output from the sample application is similar to the following example:

Output

Attempting to create pool: JobPrepReleaseSamplePool
Created pool JobPrepReleaseSamplePool with 2 nodes
Checking for existing job JobPrepReleaseSampleJob...
Job JobPrepReleaseSampleJob not found, creating...
Submitting tasks and awaiting completion...
All tasks completed.

Contents of shared\job_prep_and_release.txt on tvm-2434664350_1-20160623t173951z:
-------------------------------------------
tvm-2434664350_1-20160623t173951z tasks:
task001
task004
task005
task006

Contents of shared\job_prep_and_release.txt on tvm-2434664350_2-20160623t173951z:
-------------------------------------------
tvm-2434664350_2-20160623t173951z tasks:
task008
task002
task003
task007

Waiting for job JobPrepReleaseSampleJob to reach state Completed
...

tvm-2434664350_1-20160623t173951z:
Prep task exit code: 0
Release task exit code: 0

tvm-2434664350_2-20160623t173951z:
Prep task exit code: 0
Release task exit code: 0

Delete job? [yes] no
yes
Delete pool? [yes] no
yes

Sample complete, hit ENTER to exit...

Note

Because nodes in a new pool are created and started at different times, some nodes are ready
for tasks before others, so you might see different output. Specifically, because the tasks
complete quickly, one of the pool's nodes might run all of the job's tasks. If this occurs,
the job preparation and release tasks don't exist for the node that ran no tasks.

View job preparation and release tasks in the Azure portal

You can use the Azure portal to view Batch job properties and tasks, including job
preparation and release tasks. From your Batch account page, select Jobs from the left
navigation and then select a job. If you run the sample application, navigate to the job page
after the tasks complete, but before you delete the job and pool.

You can monitor job progress and status by expanding Approximate task count on the job
Overview or Tasks page.
The following screenshot shows the JobPrepReleaseSampleJob page after the sample
application runs. This job had preparation and release tasks, so you can select Preparation
tasks or Release tasks in the left navigation to see their properties.

Next steps
Learn about error checking for jobs and tasks.
Learn how to use application packages to prepare Batch compute nodes for task
execution.
Explore different ways to copy data and applications to Batch compute nodes.
Learn about using the Azure Batch File Conventions library to persist logs and other job
and task output data.
Batch Container Isolation Task
Article • 04/02/2025

Azure Batch offers an isolation configuration at the task level, allowing tasks to avoid
mounting the entire ephemeral disk or the entire AZ_BATCH_NODE_ROOT_DIR . Instead, you
can customize the specific Azure Batch data paths you want to attach to the container
task.

Note

Azure Batch Data Path refers to the specific paths on an Azure Batch node
designated for tasks and applications. All these paths are located under
AZ_BATCH_NODE_ROOT_DIR .

Why container tasks need the isolation feature


By default, a Windows container task has the entire ephemeral disk (D:) attached to its
container, and a Linux container task has the entire AZ_BATCH_NODE_ROOT_DIR attached. Both
are attached in ReadWrite mode. If you want to customize your container volumes, this default
setup may cause some data to be shared across all containers running on the node. To address
this, Azure Batch lets you customize which Azure Batch data paths are attached to the task
container. The feature offers two benefits:

Security: Prevents the container task data from leaking into the host machine or
altering data on the host machine.
Customization: You can customize your container task volumes as needed.

Note

To use this feature, please ensure that your node agent version is greater than
1.11.11.

Configuring host data path attachments for containers

For Linux nodes: The same path is attached into the container.
For Windows nodes: Because Windows containers don't have a D: disk, the host path must be
mounted at a different path in the container. The following table lists the paths that you
can choose to mount.

Azure Batch Data Path | Path in Host Machine | Path in Container
AZ_BATCH_APP_PACKAGE_* | D:\batch\tasks\applications | C:\batch\tasks\applications
AZ_BATCH_NODE_SHARED_DIR | D:\batch\tasks\shared | C:\batch\tasks\shared
AZ_BATCH_NODE_STARTUP_DIR | D:\batch\tasks\startup | C:\batch\tasks\startup
AZ_BATCH_NODE_MOUNTS_DIR | D:\batch\tasks\fsmounts | C:\batch\tasks\fsmounts
AZ_BATCH_NODE_STARTUP_WORKING_DIR | D:\batch\tasks\startup\wd | C:\batch\tasks\startup\wd
AZ_BATCH_JOB_PREP_DIR | D:\batch\tasks\workitems\{workitemname}\{jobname}\{jobpreptaskname} | C:\batch\tasks\workitems\{workitemname}\{jobname}\{jobpreptaskname}
AZ_BATCH_JOB_PREP_WORKING_DIR | D:\batch\tasks\workitems\{workitemname}\{jobname}\{jobpreptaskname}\wd | C:\batch\tasks\workitems\{workitemname}\{jobname}\{jobpreptaskname}\wd
AZ_BATCH_TASK_DIR | D:\batch\tasks\workitems\{workitemname}\{jobname}\{taskname} | C:\batch\tasks\workitems\{workitemname}\{jobname}\{taskname}
AZ_BATCH_TASK_WORKING_DIR | D:\batch\tasks\workitems\{workitemname}\{jobname}\{taskname}\wd | C:\batch\tasks\workitems\{workitemname}\{jobname}\{taskname}\wd

Refer to the listed data paths that you can choose to attach to the container. Any
unselected data paths have their associated environment variables removed.

Data Path Enum | Data path that will be attached to the container
Shared | AZ_BATCH_NODE_SHARED_DIR
Applications | AZ_BATCH_APP_PACKAGE_*
Startup | AZ_BATCH_NODE_STARTUP_DIR, AZ_BATCH_NODE_STARTUP_WORKING_DIR
Vfsmounts | AZ_BATCH_NODE_MOUNTS_DIR
JobPrep | AZ_BATCH_JOB_PREP_DIR, AZ_BATCH_JOB_PREP_WORKING_DIR
Task | AZ_BATCH_TASK_DIR, AZ_BATCH_TASK_WORKING_DIR
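
As a quick way to reason about which environment variables survive a given selection, the mapping in the preceding table can be written down as a lookup. This Python sketch only restates the table; the helper name retained_env_vars is hypothetical and nothing here calls the Batch service.

```python
# Data path enum -> environment variables, copied from the preceding table.
DATA_PATH_ENV_VARS = {
    "Shared": ["AZ_BATCH_NODE_SHARED_DIR"],
    "Applications": ["AZ_BATCH_APP_PACKAGE_*"],
    "Startup": ["AZ_BATCH_NODE_STARTUP_DIR", "AZ_BATCH_NODE_STARTUP_WORKING_DIR"],
    "Vfsmounts": ["AZ_BATCH_NODE_MOUNTS_DIR"],
    "JobPrep": ["AZ_BATCH_JOB_PREP_DIR", "AZ_BATCH_JOB_PREP_WORKING_DIR"],
    "Task": ["AZ_BATCH_TASK_DIR", "AZ_BATCH_TASK_WORKING_DIR"],
}


def retained_env_vars(selected):
    """List the data-path environment variables kept for the selected enums."""
    unknown = set(selected) - set(DATA_PATH_ENV_VARS)
    if unknown:
        raise ValueError(f"unknown data path enum(s): {sorted(unknown)}")
    return [
        var
        for name, env_vars in DATA_PATH_ENV_VARS.items()
        if name in selected
        for var in env_vars
    ]


print(retained_env_vars(["Task", "Shared"]))
print(retained_env_vars([]))  # nothing selected, so no data-path variables remain
```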

Run a container isolation task

Note

If you use an empty list for containerHostBatchBindMounts, the NodeAgent will not mount
any data paths into the task's container. If you use null, the NodeAgent will mount the
entire ephemeral disk (in Windows) or AZ_BATCH_NODE_ROOT_DIR (in Linux).
If you don't mount the task data path into the container, you must set the
task's property workingDirectory to containerImageDefault.

Before running a container isolation task, you must create a pool that supports container
workloads. For more information, see the Docker container workload guide.

REST API

The following example describes how to create a container task with data isolation
using REST API:

HTTP

POST {batchUrl}/jobs/{jobId}/tasks?api-version=2024-07-01.20.0

JSON

{
  "id": "taskId",
  "commandLine": "bash -c 'echo hello'",
  "containerSettings": {
    "imageName": "ubuntu",
    "containerHostBatchBindMounts": [
      {
        "source": "Task",
        "isReadOnly": true
      }
    ]
  },
  "userIdentity": {
    "autoUser": {
      "scope": "task",
      "elevationLevel": "nonadmin"
    }
  }
}
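
If you generate request bodies like the preceding one programmatically, a small builder can enforce the rules from this article. The following Python sketch is illustrative only: the function name build_task_body is hypothetical, it builds the JSON but doesn't send it, and placing workingDirectory under containerSettings is an assumption of this sketch rather than something this article states.

```python
import json

# Data path enums listed earlier in this article.
ALLOWED_SOURCES = {"Shared", "Applications", "Startup", "Vfsmounts", "JobPrep", "Task"}


def build_task_body(task_id, command_line, image_name, mounts):
    """Build a task request body with per-path container bind mounts.

    `mounts` maps a data path enum to True (read-only) or False (read-write).
    """
    unknown = set(mounts) - ALLOWED_SOURCES
    if unknown:
        raise ValueError(f"unknown data path enum(s): {sorted(unknown)}")
    container_settings = {
        "imageName": image_name,
        "containerHostBatchBindMounts": [
            {"source": source, "isReadOnly": read_only}
            for source, read_only in mounts.items()
        ],
    }
    # When the Task data path isn't mounted, the task must use the image's
    # default working directory. The location of workingDirectory under
    # containerSettings is an assumption of this sketch.
    if "Task" not in mounts:
        container_settings["workingDirectory"] = "containerImageDefault"
    return {
        "id": task_id,
        "commandLine": command_line,
        "containerSettings": container_settings,
    }


body = build_task_body("taskId", "bash -c 'echo hello'", "ubuntu", {"Shared": True})
print(json.dumps(body, indent=2))
```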



Run tasks concurrently to maximize usage of Batch compute nodes
07/01/2025

You can maximize resource usage on a smaller number of compute nodes in your pool by
running more than one task simultaneously on each node.

While some scenarios work best with all of a node's resources dedicated to a single task,
certain workloads may see shorter job times and lower costs when multiple tasks share those
resources. Consider the following scenarios:

Minimize data transfer for tasks that are able to share data. You can dramatically reduce
data transfer charges by copying shared data to a smaller number of nodes, then
executing tasks in parallel on each node. This strategy especially applies if the data to be
copied to each node must be transferred between geographic regions.
Maximize memory usage for tasks that require a large amount of memory, but only
during short periods of time, and at variable times during execution. You can employ
fewer, but larger, compute nodes with more memory to efficiently handle such spikes.
These nodes have multiple tasks running in parallel on each node, but each task can take
advantage of the nodes' plentiful memory at different times.
Mitigate node number limits when inter-node communication is required within a pool.
Currently, pools configured for inter-node communication are limited to 50 compute
nodes. If each node in such a pool is able to execute tasks in parallel, a greater number of
tasks can be executed simultaneously.
Replicate an on-premises compute cluster, such as when you first move a compute
environment to Azure. If your current on-premises solution executes multiple tasks per
compute node, you can increase the maximum number of node tasks to more closely
mirror that configuration.

Example scenario
As an example, imagine a task application with CPU and memory requirements such that
Standard_D1 nodes are sufficient. However, in order to finish the job in the required time, 1,000
of these nodes are needed.

Instead of using Standard_D1 nodes that have one CPU core, you could use Standard_D14
nodes that have 16 cores each, and enable parallel task execution. You could potentially use 16
times fewer nodes: instead of 1,000 nodes, only 63 would be required. If large application files
or reference data are required for each node, job duration and efficiency are improved, since
the data is copied to only 63 nodes.

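
The node count in this scenario is a ceiling division: 1,000 concurrent tasks divided by 16 tasks per node, rounded up. The following Python sketch reproduces the arithmetic; the function name nodes_needed is illustrative, and it assumes one task slot per core.

```python
import math


def nodes_needed(concurrent_tasks, slots_per_node):
    """Smallest number of nodes that can run all tasks at the same time."""
    return math.ceil(concurrent_tasks / slots_per_node)


print(nodes_needed(1000, 1))   # single-core nodes, one task each
print(nodes_needed(1000, 16))  # 16-core nodes, sixteen tasks each
```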
Enable parallel task execution
You configure compute nodes for parallel task execution at the pool level. With the Batch .NET
library, set the CloudPool.TaskSlotsPerNode property when you create a pool. If you're using
the Batch REST API, set the taskSlotsPerNode element in the request body during pool
creation.

Note

You can set the taskSlotsPerNode element and TaskSlotsPerNode property only at pool
creation time. They can't be modified after a pool has already been created.

Azure Batch allows you to set task slots per node up to (4x) the number of node cores. For
example, if the pool is configured with nodes of size "Large" (four cores), then
taskSlotsPerNode may be set to 16. However, regardless of how many cores the node has, you
can't have more than 256 task slots per node. For details on the number of cores for each of
the node sizes, see Sizes for Cloud Services (classic). For more information on service limits, see
Batch service quotas and limits.
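
The two limits described above combine into a single expression: four times the core count, capped at 256. This Python sketch only restates that rule; the function name is hypothetical and the service remains the authority on the actual limits.

```python
def max_task_slots_per_node(cores):
    """Upper bound on taskSlotsPerNode: 4x the cores, never more than 256."""
    return min(4 * cores, 256)


for cores in (4, 16, 96):
    print(cores, max_task_slots_per_node(cores))
```

For a four-core node the bound is 16, matching the example in the text. For very large nodes the 256 cap applies instead.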

Tip

Be sure to take into account the taskSlotsPerNode value when you construct an autoscale
formula for your pool. For example, a formula that evaluates $RunningTasks could be
dramatically affected by an increase in tasks per node. For more information, see Create
an automatic formula for scaling compute nodes in a Batch pool.

Specify task distribution


When enabling concurrent tasks, it's important to specify how you want the tasks to be
distributed across the nodes in the pool.

By using the CloudPool.TaskSchedulingPolicy property, you can specify that tasks should be
assigned evenly across all nodes in the pool ("spreading"). Or you can specify that as many
tasks as possible should be assigned to each node before tasks are assigned to another node
in the pool ("packing").

As an example, consider the pool of Standard_D14 nodes (in the previous example) that is
configured with a CloudPool.TaskSlotsPerNode value of 16. If the
CloudPool.TaskSchedulingPolicy is configured with a ComputeNodeFillType of Pack, it would
maximize usage of all 16 cores of each node and allow an autoscaling pool to remove unused
nodes (nodes without any tasks assigned) from the pool. Autoscaling minimizes resource usage
and can save money.
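
To see why packing leaves whole nodes free for an autoscale formula to remove, consider this simplified Python model of the two fill types. It is a sketch, not the Batch scheduler: it assumes every task needs one slot, that all tasks are placed at once, and that spreading is plain round-robin. The function name assign_tasks is hypothetical.

```python
def assign_tasks(task_count, node_count, slots_per_node, fill_type):
    """Return the number of tasks placed on each node.

    "pack" fills a node completely before using the next one;
    "spread" places tasks round-robin across all nodes.
    """
    if task_count > node_count * slots_per_node:
        raise ValueError("more tasks than available slots")
    counts = [0] * node_count
    for i in range(task_count):
        if fill_type == "pack":
            counts[i // slots_per_node] += 1
        elif fill_type == "spread":
            counts[i % node_count] += 1
        else:
            raise ValueError(f"unknown fill type: {fill_type!r}")
    return counts


# 20 single-slot tasks on 4 nodes with 16 task slots each.
print(assign_tasks(20, 4, 16, "pack"))
print(assign_tasks(20, 4, 16, "spread"))
```

With packing, two of the four nodes receive no tasks and could be removed. With spreading, every node stays busy with a few tasks.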

Define variable slots per task


A task can be defined with the CloudTask.RequiredSlots property, specifying how many slots it
requires to run on a compute node. The default value is 1. You can set variable task slots if your
tasks have different weights associated with their resource usage on the compute node.
Variable task slots let each compute node have a reasonable number of concurrent running
tasks without overwhelming system resources like CPU or memory.

For example, for a pool with property taskSlotsPerNode = 8, you can submit CPU-intensive
tasks that require multiple cores with requiredSlots = 8, while other tasks can be set to
requiredSlots = 1. When this mixed workload is scheduled, the CPU-intensive tasks run
exclusively on their compute nodes, while other tasks can run concurrently (up to eight tasks at
once) on other nodes. The mixed workload helps you balance your workload across compute
nodes and improve resource usage efficiency.

Be sure you don't specify a task's requiredSlots to be greater than the pool's
taskSlotsPerNode, or the task never runs. The Batch service doesn't currently validate this
conflict when you submit tasks, because a job might not have a pool bound at submission time,
or it could change to a different pool by disabling/re-enabling.
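
Because the service doesn't perform this validation, you might run the check yourself before submitting tasks. The following Python sketch shows one way to do so under the assumption that you already know the target pool's taskSlotsPerNode value. The task IDs and the helper can_ever_run are hypothetical.

```python
def can_ever_run(required_slots, task_slots_per_node):
    """A task can be scheduled only if a single node could hold all its slots."""
    return required_slots <= task_slots_per_node


pool_task_slots_per_node = 8

# Hypothetical tasks mapped to their requiredSlots values.
tasks = {"cpu-heavy": 8, "light": 1, "oversized": 9}

unschedulable = [
    task_id
    for task_id, required in tasks.items()
    if not can_ever_run(required, pool_task_slots_per_node)
]
print(unschedulable)  # tasks that would never run on this pool
```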

Tip

When using variable task slots, it's possible that large tasks with more required slots can
temporarily fail to be scheduled because not enough slots are available on any compute
node, even when there are still idle slots on some nodes. You can raise the job priority for
these tasks to increase their chance to compete for available slots on nodes.

The Batch service emits the TaskScheduleFailEvent when it fails to schedule a task to run
and keeps retrying the scheduling until required slots become available. You can listen to
that event to detect potential task scheduling issues and mitigate accordingly.
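
The situation described in this tip, where idle slots exist but a large task still can't be placed, comes down to slots being fragmented across nodes. The following Python model is a sketch of that idea only: the node names and the helper find_node are hypothetical, and the real scheduler's placement logic isn't documented here.

```python
def find_node(free_slots, required_slots):
    """Return the first node with enough free slots, or None if there is none."""
    for node_id, free in free_slots.items():
        if free >= required_slots:
            return node_id
    return None


# Two nodes with 8 task slots each, both half full: 8 idle slots in total.
free_slots = {"node1": 4, "node2": 4}

print(find_node(free_slots, 8))  # no single node has 8 free slots
print(find_node(free_slots, 2))  # a small task fits immediately
```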

Batch .NET example


The following Batch .NET API code snippets show how to create a pool with multiple task slots
per node and how to submit a task with required slots.

Create a pool with multiple task slots per node


This code snippet shows a request to create a pool that contains four nodes, with four task
slots allowed per node. It specifies a task scheduling policy that fills each node with tasks prior
to assigning tasks to another node in the pool.

For more information on adding pools by using the Batch .NET API, see
BatchClient.PoolOperations.CreatePool.

C#

CloudPool pool =
    batchClient.PoolOperations.CreatePool(
        poolId: "mypool",
        targetDedicatedComputeNodes: 4,
        virtualMachineSize: "standard_d1_v2",
        virtualMachineConfiguration: new VirtualMachineConfiguration(
            imageReference: new ImageReference(
                publisher: "MicrosoftWindowsServer",
                offer: "WindowsServer",
                sku: "2019-datacenter-core",
                version: "latest"),
            nodeAgentSkuId: "batch.node.windows amd64"));

pool.TaskSlotsPerNode = 4;
pool.TaskSchedulingPolicy = new TaskSchedulingPolicy(ComputeNodeFillType.Pack);
pool.Commit();

Create a task with required slots


This code snippet creates a task with nondefault requiredSlots . This task runs when there are
enough free slots available on a compute node.

C#

CloudTask task = new CloudTask(taskId, taskCommandLine)
{
    RequiredSlots = 2
};

List compute nodes with counts for running tasks and slots
This code snippet lists all compute nodes in the pool and prints the counts for running tasks
and task slots per node.

C#
ODATADetailLevel nodeDetail = new ODATADetailLevel(selectClause:
    "id,runningTasksCount,runningTaskSlotsCount");
IPagedEnumerable<ComputeNode> nodes =
    batchClient.PoolOperations.ListComputeNodes(poolId, nodeDetail);

await nodes.ForEachAsync(node =>
{
    Console.WriteLine(node.Id + " :");
    Console.WriteLine($"RunningTasks = {node.RunningTasksCount}, RunningTaskSlots = {node.RunningTaskSlotsCount}");
}).ConfigureAwait(continueOnCapturedContext: false);

List task counts for the job


This code snippet gets task counts for the job, which includes both tasks and task slots count
per task state.

C#

TaskCountsResult result = await batchClient.JobOperations.GetJobTaskCountsAsync(jobId);

Console.WriteLine("\t\tActive\tRunning\tCompleted");
Console.WriteLine($"TaskCounts:\t{result.TaskCounts.Active}\t{result.TaskCounts.Running}\t{result.TaskCounts.Completed}");
Console.WriteLine($"TaskSlotCounts:\t{result.TaskSlotCounts.Active}\t{result.TaskSlotCounts.Running}\t{result.TaskSlotCounts.Completed}");

Batch REST example


The following Batch REST API code snippets show how to create a pool with multiple task slots
per node and how to submit a task with required slots.

Create a pool with multiple task slots per node


This snippet shows a request to create a pool that contains two large nodes with a maximum of
four tasks per node.

For more information on adding pools by using the REST API, see Add a pool to an account.

JSON

{
  "odata.metadata": "https://myaccount.myregion.batch.azure.com/$metadata#pools/@Element",
  "id": "mypool",
  "vmSize": "large",
  "virtualMachineConfiguration": {
    "imageReference": {
      "publisher": "canonical",
      "offer": "ubuntuserver",
      "sku": "20.04-lts"
    },
    "nodeAgentSKUId": "batch.node.ubuntu 20.04"
  },
  "targetDedicatedComputeNodes": 2,
  "taskSlotsPerNode": 4,
  "enableInterNodeCommunication": true
}

Create a task with required slots


This snippet shows a request to add a task with nondefault requiredSlots . This task only runs
when there are enough free slots available on the compute node.

JSON

{
  "id": "taskId",
  "commandLine": "bash -c 'echo hello'",
  "userIdentity": {
    "autoUser": {
      "scope": "task",
      "elevationLevel": "nonadmin"
    }
  },
  "requiredSlots": 2
}

Code sample on GitHub


The ParallelTasks project on GitHub illustrates the use of the CloudPool.TaskSlotsPerNode
property.

This C# console application uses the Batch .NET library to create a pool with one or more
compute nodes. It executes a configurable number of tasks on those nodes to simulate a
variable load. Output from the application shows which nodes executed each task. The
application also provides a summary of the job parameters and duration.
The following example shows the summary portion of the output from two different runs of
the ParallelTasks sample application. Job durations shown here don't include pool creation
time, since each job was submitted to a previously created pool whose compute nodes were in
the Idle state at submission time.

The first execution of the sample application shows that with a single node in the pool and the
default setting of one task per node, the job duration is over 30 minutes.

Console

Nodes: 1
Node size: large
Task slots per node: 1
Max slots per task: 1
Tasks: 32
Duration: 00:30:01.4638023

The second run of the sample shows a significant decrease in job duration. This reduction is
because the pool was configured with four tasks per node, allowing for parallel task execution
to complete the job in nearly a quarter of the time.

Console

Nodes: 1
Node size: large
Task slots per node: 4
Max slots per task: 1
Tasks: 32
Duration: 00:08:48.2423500
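
To quantify the difference between the two runs, you can convert the reported durations to seconds and compare them. This Python sketch parses the two Duration values shown above by hand, because they carry seven fractional digits. The helper name to_seconds is illustrative.

```python
def to_seconds(duration):
    """Convert an hh:mm:ss.fffffff duration string to seconds."""
    hours, minutes, seconds = duration.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


one_slot = to_seconds("00:30:01.4638023")    # one task slot per node
four_slots = to_seconds("00:08:48.2423500")  # four task slots per node

speedup = one_slot / four_slots
print(f"speedup: {speedup:.2f}x")
print(f"fraction of original time: {four_slots / one_slot:.0%}")
```

The speedup comes out at roughly 3.4 times rather than a full 4 times, which is why the text says nearly a quarter of the time.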

Next steps
Batch Explorer
Azure Batch samples on GitHub.
Create task dependencies to run tasks that depend on other tasks.

Create task dependencies to run tasks that depend on other tasks
07/01/2025

With Batch task dependencies, you create tasks that are scheduled for execution on compute
nodes after the completion of one or more parent tasks. For example, you can create a job that
renders each frame of a 3D movie with separate, parallel tasks. The final task merges the
rendered frames into the complete movie only after all frames have been successfully
rendered. In other words, the final task is dependent on the previous parent tasks.

Some scenarios where task dependencies are useful include:

MapReduce-style workloads in the cloud.


Jobs whose data processing tasks can be expressed as a directed acyclic graph (DAG).
Prerendering and post-rendering processes, where each task must complete before the
next task can begin.
Any other job in which downstream tasks depend on the output of upstream tasks.

By default, dependent tasks are scheduled for execution only after the parent task has
completed successfully. You can optionally specify a dependency action to override the default
behavior and run the dependent task even if the parent task fails.

In this article, we discuss how to configure task dependencies by using the Batch .NET library.
We first show you how to enable task dependency on your jobs, and then demonstrate how to
configure a task with dependencies. We also describe how to specify a dependency action to
run dependent tasks if the parent fails. Finally, we discuss the dependency scenarios that Batch
supports.

Enable task dependencies


To use task dependencies in your Batch application, you must first configure the job to use task
dependencies. In Batch .NET, enable it on your CloudJob by setting its UsesTaskDependencies
property to true :

C#

CloudJob unboundJob = batchClient.JobOperations.CreateJob("job001",
    new PoolInformation { PoolId = "pool001" });

// IMPORTANT: This is REQUIRED for using task dependencies.
unboundJob.UsesTaskDependencies = true;

In the preceding code snippet, "batchClient" is an instance of the BatchClient class.

Create dependent tasks


To create a task that depends on the completion of one or more parent tasks, you can specify
that the task "depends on" the other tasks. In Batch .NET, configure the CloudTask.DependsOn
property with an instance of the TaskDependencies class:

C#

// Task 'Flowers' depends on completion of both 'Rain' and 'Sun'
// before it is run.
new CloudTask("Flowers", "cmd.exe /c echo Flowers")
{
    DependsOn = TaskDependencies.OnIds("Rain", "Sun")
},

This code snippet creates a dependent task with task ID "Flowers". The "Flowers" task depends
on tasks "Rain" and "Sun". Task "Flowers" will be scheduled to run on a compute node only
after tasks "Rain" and "Sun" are completed successfully.
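
The default scheduling rule, where a dependent task becomes eligible only after all of its parents succeed, can be modeled in a few lines. The following Python sketch is a simplified illustration using the task IDs from this example. The helper runnable_tasks is hypothetical, and the model ignores dependency actions and failures.

```python
# Each task ID maps to the set of parent task IDs it depends on.
depends_on = {
    "Rain": set(),
    "Sun": set(),
    "Flowers": {"Rain", "Sun"},
}


def runnable_tasks(depends_on, succeeded):
    """Return tasks not yet run whose parents have all completed successfully."""
    return sorted(
        task
        for task, parents in depends_on.items()
        if task not in succeeded and parents <= succeeded
    )


print(runnable_tasks(depends_on, set()))            # only the parent tasks
print(runnable_tasks(depends_on, {"Rain"}))         # Flowers still waits for Sun
print(runnable_tasks(depends_on, {"Rain", "Sun"}))  # Flowers is now eligible
```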

Note

By default, a task is considered to be completed successfully when it is in the completed
state and its exit code is 0. In Batch .NET, this means a CloudTask.State property value is
Completed and the CloudTask's TaskExecutionInformation.ExitCode property value is 0.

To learn how to change this, see the Dependency actions section.

Dependency scenarios
There are three basic task dependency scenarios that you can use in Azure Batch: one-to-one,
one-to-many, and task ID range dependency. These three scenarios can be combined to
provide a fourth scenario: many-to-many.

Scenario | Example
One-to-one | taskB depends on taskA. taskB won't be scheduled for execution until taskA has completed successfully.
One-to-many | taskC depends on both taskA and taskB. taskC won't be scheduled for execution until both taskA and taskB are completed successfully.
Task ID range | taskD depends on a range of tasks. taskD won't be scheduled for execution until the tasks with IDs 1 through 10 are completed successfully.

Tip

You can create many-to-many relationships, such as where tasks C, D, E, and F each
depend on tasks A and B. This is useful, for example, in parallelized preprocessing scenarios
where your downstream tasks depend on the output of multiple upstream tasks.

In the examples in this section, a dependent task runs only after the parent tasks complete
successfully. This is the default behavior for a dependent task. You can run a dependent task
after a parent task fails by specifying a dependency action to override the default
behavior.

One-to-one
In a one-to-one relationship, a task depends on the successful completion of one parent task.
To create the dependency, provide a single task ID to the TaskDependencies.OnId static
method when you populate the CloudTask.DependsOn property.

C#
// Task 'taskA' doesn't depend on any other tasks
new CloudTask("taskA", "cmd.exe /c echo taskA"),

// Task 'taskB' depends on completion of task 'taskA'
new CloudTask("taskB", "cmd.exe /c echo taskB")
{
    DependsOn = TaskDependencies.OnId("taskA")
},

One-to-many
In a one-to-many relationship, a task depends on the completion of multiple parent tasks. To
create the dependency, provide a collection of specific task IDs to the TaskDependencies.OnIds
static method when you populate the CloudTask.DependsOn property.

C#

// 'Rain' and 'Sun' don't depend on any other tasks
new CloudTask("Rain", "cmd.exe /c echo Rain"),
new CloudTask("Sun", "cmd.exe /c echo Sun"),

// Task 'Flowers' depends on completion of both 'Rain' and 'Sun'
// before it is run.
new CloudTask("Flowers", "cmd.exe /c echo Flowers")
{
    DependsOn = TaskDependencies.OnIds("Rain", "Sun")
},

Important

Your dependent task creation fails if the combined length of parent task IDs is greater
than 64,000 characters. To specify a large number of parent tasks, consider using a Task ID
range instead.

Task ID range
In a dependency on a range of parent tasks, a task depends on the completion of tasks whose
IDs lie within a range that you specify.

To create the dependency, provide the first and last task IDs in the range to the
TaskDependencies.OnIdRange static method when you populate the CloudTask.DependsOn
property.

Important

When you use task ID ranges for your dependencies, only tasks with IDs representing
integer values are selected by the range. For example, the range 1..10 selects tasks 3 and
7, but not 5flamingoes.

Leading zeroes aren't significant when evaluating range dependencies, so tasks with string
identifiers 4, 04, and 004 are all within the range. Because they're all treated as task 4, the
first one to complete satisfies the dependency.

For the dependent task to run, every task in the range must satisfy the dependency, either
by completing successfully or by completing with a failure that is mapped to a
dependency action set to Satisfy.

C#

// Tasks 1, 2, and 3 don't depend on any other tasks. Because
// we will be using them for a task range dependency, we must
// specify string representations of integers as their ids.
new CloudTask("1", "cmd.exe /c echo 1"),
new CloudTask("2", "cmd.exe /c echo 2"),
new CloudTask("3", "cmd.exe /c echo 3"),

// Task 4 depends on a range of tasks, 1 through 3
new CloudTask("4", "cmd.exe /c echo 4")
{
    // To use a range of tasks, their ids must be integer values.
    // Note that we pass integers as parameters to TaskIdRange,
    // but their ids (above) are string representations of the ids.
    DependsOn = TaskDependencies.OnIdRange(1, 3)
},
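
To make the selection rules for ranges concrete, here is a small Python sketch that models which task IDs a range would pick up. It's a local model of the rules stated in this section (integer-valued IDs only, leading zeroes ignored), not the Batch service's implementation, and the function name is hypothetical.

```python
def select_range_parents(task_ids, start, end):
    """Return the task IDs that a start..end range dependency would select."""
    selected = []
    for task_id in task_ids:
        # Only IDs made up entirely of ASCII digits represent integer values.
        # int() ignores leading zeroes, so "04" and "004" both count as 4.
        if task_id.isascii() and task_id.isdigit() and start <= int(task_id) <= end:
            selected.append(task_id)
    return selected


ids = ["3", "7", "5flamingoes", "4", "04", "004", "11"]
print(select_range_parents(ids, 1, 10))
```

In this model, 5flamingoes is skipped because it isn't an integer, and 11 is skipped because it lies outside the range.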

Dependency actions
By default, a dependent task or set of tasks runs only after a parent task is completed
successfully. In some scenarios, you may want to run dependent tasks even if the parent task
fails. You can override the default behavior by specifying a dependency action that indicates
whether a dependent task is eligible to run.

For example, suppose that a dependent task is awaiting data from the completion of the
upstream task. If the upstream task fails, the dependent task may still be able to run using
older data. In this case, a dependency action can specify that the dependent task is eligible to
run despite the failure of the parent task.

A dependency action is based on an exit condition for the parent task. You can specify a
dependency action for any of the following exit conditions:

Whenever a pre-processing error occurs.


Whenever a file upload error occurs. If the task exits with an exit code that was specified
via exitCodes or exitCodeRanges, and then encounters a file upload error, the action
specified by the exit code takes precedence.
Whenever the task exits with an exit code defined by the ExitCodes property.
Whenever the task exits with an exit code that falls within a range specified by the
ExitCodeRanges property.
The default case, if the task exits with an exit code not defined by ExitCodes or
ExitCodeRanges, or if the task exits with a pre-processing error and the
PreProcessingError property isn't set, or if the task fails with a file upload error and the
FileUploadError property isn't set.

For .NET, these conditions are defined as properties of the ExitConditions class.

To specify a dependency action, set the ExitOptions.DependencyAction property for the exit
condition to one of the following options:

Satisfy: Indicates that dependent tasks are eligible to run if the parent task exits with a
specified error.
Block: Indicates that dependent tasks aren't eligible to run.

The default setting for the DependencyAction property is Satisfy for exit code 0, and Block for
all other exit conditions.

The following code snippet sets the DependencyAction property for a parent task. If the
parent task exits with a preprocessing error, or with the specified error codes, the dependent
task is blocked. If the parent task exits with any other nonzero error, the dependent task is
eligible to run.

C#

// Task A is the parent task.
new CloudTask("A", "cmd.exe /c echo A")
{
    // Specify exit conditions for task A and their dependency actions.
    ExitConditions = new ExitConditions
    {
        // If task A exits with a pre-processing error, block any downstream tasks
        // (in this example, task B).
        PreProcessingError = new ExitOptions
        {
            DependencyAction = DependencyAction.Block
        },
        // If task A exits with the specified error codes, block any downstream
        // tasks (in this example, task B).
        ExitCodes = new List<ExitCodeMapping>
        {
            new ExitCodeMapping(10, new ExitOptions() { DependencyAction = DependencyAction.Block }),
            new ExitCodeMapping(20, new ExitOptions() { DependencyAction = DependencyAction.Block })
        },
        // If task A succeeds or fails with any other error, any downstream tasks
        // become eligible to run (in this example, task B).
        Default = new ExitOptions
        {
            DependencyAction = DependencyAction.Satisfy
        }
    }
},
// Task B depends on task A. Whether it becomes eligible to run depends on how
// task A exits.
new CloudTask("B", "cmd.exe /c echo B")
{
    DependsOn = TaskDependencies.OnId("A")
},
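
The precedence among exit conditions can be easier to follow as code. The following Python sketch models how a dependency action might be resolved from a parent task's outcome, based only on the rules listed in this section. The dictionary layout and function name are hypothetical, and the sketch assumes that exit code 0 satisfies dependencies unless it's mapped explicitly; it isn't the Batch service's implementation.

```python
SATISFY, BLOCK = "satisfy", "block"


def resolve_dependency_action(conditions, exit_code=None,
                              pre_processing_error=False, file_upload_error=False):
    """Return the dependency action for a parent task's outcome.

    conditions may contain: "exit_codes" (dict of code -> action),
    "exit_code_ranges" (list of (start, end, action)),
    "pre_processing_error", "file_upload_error", and "default".
    """
    if pre_processing_error:
        return conditions.get("pre_processing_error", conditions.get("default", BLOCK))
    if exit_code is not None:
        # An explicitly mapped exit code takes precedence over a later file upload error.
        if exit_code in conditions.get("exit_codes", {}):
            return conditions["exit_codes"][exit_code]
        for start, end, action in conditions.get("exit_code_ranges", []):
            if start <= exit_code <= end:
                return action
    if file_upload_error:
        return conditions.get("file_upload_error", conditions.get("default", BLOCK))
    if exit_code == 0:
        return SATISFY  # assumed default for success
    return conditions.get("default", BLOCK)


# Mirrors the exit conditions of task A in the C# snippet above.
task_a = {
    "pre_processing_error": BLOCK,
    "exit_codes": {10: BLOCK, 20: BLOCK},
    "default": SATISFY,
}
for code in (0, 1, 10):
    print(code, resolve_dependency_action(task_a, exit_code=code))
```

With no exit conditions configured, the model falls back to the documented defaults: satisfy for exit code 0 and block for everything else.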

Code sample
The TaskDependencies sample project on GitHub demonstrates:

How to enable task dependency on a job.


How to create tasks that depend on other tasks.
How to execute those tasks on a pool of compute nodes.

Next steps
Learn about the application packages feature of Batch, which provides an easy way to
deploy and version the applications that your tasks execute on compute nodes.
Learn about error checking for jobs and tasks.

Run tasks under user accounts in Batch
Article • 03/04/2025

Note

The user accounts discussed in this article are different from user accounts used for
Remote Desktop Protocol (RDP) or Secure Shell (SSH), for security reasons.

To connect to a node running the Linux virtual machine configuration via SSH, see Install
and configure xrdp to use Remote Desktop with Ubuntu. To connect to nodes running
Windows via RDP, see How to connect and sign on to an Azure virtual machine
running Windows.

A task in Azure Batch always runs under a user account. By default, tasks run under standard
user accounts, without administrator permissions. For certain scenarios, you may want to
configure the user account under which you want a task to run. This article discusses the
types of user accounts and how to configure them for your scenario.

Types of user accounts


Azure Batch provides two types of user accounts for running tasks:

Auto-user accounts. Auto-user accounts are built-in user accounts that are created
automatically by the Batch service. By default, tasks run under an auto-user account. You
can configure the auto-user specification for a task to indicate under which auto-user
account a task should run. The auto-user specification allows you to specify the
elevation level and scope of the auto-user account that runs the task.

A named user account. You can specify one or more named user accounts for a pool
when you create the pool. Each user account is created on each node of the pool. In
addition to the account name, you specify the user account password, elevation level,
and, for Linux pools, the SSH private key. When you add a task, you can specify the
named user account under which that task should run.

Important

The Batch service version 2017-01-01.4.0 introduced a breaking change that requires
that you update your code to call that version or later. See Update your code to the
latest Batch client library for quick guidelines for updating your Batch code from an
older version.

User account access to files and directories
Both an auto-user account and a named user account have read/write access to the task's
working directory, shared directory, and multi-instance tasks directory. Both types of accounts
have read access to the startup and job preparation directories.

If a task runs under the same account that was used for running a start task, the task has
read-write access to the start task directory. Similarly, if a task runs under the same account
that was used for running a job preparation task, the task has read-write access to the job
preparation task directory. If a task runs under a different account than the start task or job
preparation task, then the task has only read access to the respective directory.

Important

Distinct task users in Batch aren't a sufficient security boundary for isolation between
tasks and their associated task data. In Batch, the security isolation boundary is at the pool
level. However, improper access control of the Batch API can allow anyone with sufficient
permission to access all pools under a Batch account. Refer to best practices about pool
security.

For more information on accessing files and directories from a task, see Files and directories.

Elevated access for tasks


The user account's elevation level indicates whether a task runs with elevated access. Both an
auto-user account and a named user account can run with elevated access. The two options
for elevation level are:

NonAdmin: The task runs as a standard user without elevated access. The default
elevation level for a Batch user account is always NonAdmin.
Admin: The task runs as a user with elevated access and operates with full Administrator
permissions.

Auto-user accounts
By default, tasks run in Batch under an auto-user account, as a standard user without elevated
access, and with pool scope. Pool scope means that the task runs under an auto-user account
that is available to any task in the pool. For more information about pool scope, see Run a
task as an auto-user with pool scope.

The alternative to pool scope is task scope. When the auto-user specification is configured for
task scope, the Batch service creates an auto-user account for that task only.
There are four possible configurations for the auto-user specification, each of which
corresponds to a unique auto-user account:

Non-admin access with task scope


Admin (elevated) access with task scope
Non-admin access with pool scope
Admin access with pool scope

Note

Auto-user accounts with elevated admin access have direct write access to all other task
directories on the compute node executing the task. Consider running your tasks with
the least privilege required for successful execution.

Run a task as an auto-user with elevated access


You can configure the auto-user specification for administrator privileges when you need to
run a task with elevated access. For example, a start task may need elevated access to install
software on the node.

Note

Use elevated access only when necessary. A typical use case for using elevated admin
access is for a start task that must install software on the compute node before other
tasks can be scheduled. For subsequent tasks, you should use the installed software as a
task user without elevation.

The following code snippets show how to configure the auto-user specification. The examples
set the elevation level to Admin and the scope to Task .

Batch .NET

C#

task.UserIdentity = new UserIdentity(new AutoUserSpecification(
    elevationLevel: ElevationLevel.Admin, scope: AutoUserScope.Task));

Batch Java

Java
taskToAdd.withId(taskId)
    .withUserIdentity(new UserIdentity()
        .withAutoUser(new AutoUserSpecification()
            .withElevationLevel(ElevationLevel.ADMIN)
            .withScope(AutoUserScope.TASK)))
    .withCommandLine("cmd /c echo hello");

Batch Python

Python

user = batchmodels.UserIdentity(
    auto_user=batchmodels.AutoUserSpecification(
        elevation_level=batchmodels.ElevationLevel.admin,
        scope=batchmodels.AutoUserScope.task))
task = batchmodels.TaskAddParameter(
    id='task_1',
    command_line='cmd /c "echo hello world"',
    user_identity=user)
batch_client.task.add(job_id=jobid, task=task)

Run a task as an auto-user with pool scope


When a node is provisioned, two pool-wide auto-user accounts are created on each node in
the pool, one with elevated access, and one without elevated access. Setting the auto-user's
scope to pool scope for a given task runs the task under one of these two pool-wide auto-
user accounts.

When you specify pool scope for the auto-user, all tasks that run with administrator access
run under the same pool-wide auto-user account. Similarly, tasks that run without
administrator permissions also run under a single pool-wide auto-user account.

The advantage to running under the same auto-user account is that tasks are able to easily
share data with other tasks running on the same node. There are also performance benefits
to user account reuse.
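
The effect of scope on account sharing can be pictured with a small Python model. This is only an illustration of the behavior described above (pool scope maps every task with the same elevation level to one shared account, while task scope yields an account per task); the function and its return values are hypothetical and don't correspond to anything in the Batch SDKs.

```python
def auto_user_account(task_id, scope="pool", elevation="nonadmin"):
    """Return a key identifying the auto-user account a task would run under.

    Defaults mirror the documented defaults: pool scope, no elevation.
    """
    if scope == "pool":
        # One shared account per elevation level for the whole pool.
        return ("pool", elevation)
    # Task scope: an account created for that task only.
    return ("task", elevation, task_id)


a = auto_user_account("task-1")
b = auto_user_account("task-2")
c = auto_user_account("task-3", elevation="admin")
d = auto_user_account("task-4", scope="task")

print(a == b)  # both run under the shared non-admin pool account
print(a == c)  # the admin pool account is a different account
print(a == d)  # a task-scoped account is never shared
```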

Sharing secrets between tasks is one scenario where running tasks under one of the two
pool-wide auto-user accounts is useful. For example, suppose a start task needs to provision
a secret onto the node that other tasks can use. You could use the Windows Data Protection
API (DPAPI), but it requires administrator privileges. Instead, you can protect the secret at the
user level. Tasks running under the same user account can access the secret without elevated
access.

Another scenario where you may want to run tasks under an auto-user account with pool
scope is a Message Passing Interface (MPI) file share. An MPI file share is useful when the
nodes in the MPI task need to work on the same file data. The head node creates a file share
that the child nodes can access if they're running under the same auto-user account.

The following code snippet sets the auto-user's scope to pool scope for a task in Batch .NET.
The elevation level is omitted, so the task runs under the standard pool-wide auto-user
account.

C#

task.UserIdentity = new UserIdentity(new AutoUserSpecification(scope: AutoUserScope.Pool));

Named user accounts


You can define named user accounts when you create a pool. A named user account has a
name and password that you provide. You can specify the elevation level for a named user
account. For Linux nodes, you can also provide an SSH private key.

A named user account exists on all nodes in the pool and is available to all tasks running on
those nodes. You may define any number of named users for a pool. When you add a task or
task collection, you can specify that the task runs under one of the named user accounts
defined on the pool.

A named user account is useful when you want to run all tasks in a job under the same user
account, but isolate them from tasks running in other jobs at the same time. For example, you
can create a named user for each job, and run each job's tasks under that named user
account. Each job can then share a secret with its own tasks, but not with tasks running in
other jobs.

You can also use a named user account to run a task that sets permissions on external
resources such as file shares. With a named user account, you control the user identity and
can use that user identity to set permissions.

Named user accounts enable password-less SSH between Linux nodes. You can use a named
user account with Linux nodes that need to run multi-instance tasks. Each node in the pool
can run tasks under a user account defined on the whole pool. For more information about
multi-instance tasks, see Use multi-instance tasks to run MPI applications.

Create named user accounts


To create named user accounts in Batch, add a collection of user accounts to the pool. The
following code snippets show how to create named user accounts in .NET, Java, and Python.
These code snippets show how to create both admin and non-admin named accounts on a
pool.
Batch .NET example (Windows)

C#

CloudPool pool = null;
Console.WriteLine("Creating pool [{0}]...", poolId);

// Create a pool using Virtual Machine Configuration.
pool = batchClient.PoolOperations.CreatePool(
    poolId: poolId,
    targetDedicatedComputeNodes: 2,
    virtualMachineSize: "standard_d2s_v3",
    virtualMachineConfiguration: new VirtualMachineConfiguration(
        imageReference: new ImageReference(
            publisher: "MicrosoftWindowsServer",
            offer: "WindowsServer",
            sku: "2022-datacenter-core",
            version: "latest"),
        nodeAgentSkuId: "batch.node.windows amd64"));

// Add named user accounts.
pool.UserAccounts = new List<UserAccount>
{
    new UserAccount("adminUser", "A1bC2d", ElevationLevel.Admin),
    new UserAccount("nonAdminUser", "A1bC2d", ElevationLevel.NonAdmin),
};

// Commit the pool.
await pool.CommitAsync();

Batch .NET example (Linux)

C#

CloudPool pool = null;

// Obtain a collection of all available node agent SKUs.


List<NodeAgentSku> nodeAgentSkus =
batchClient.PoolOperations.ListNodeAgentSkus().ToList();

// Define a delegate specifying properties of the VM image to use.


Func<ImageReference, bool> isUbuntu2404 = imageRef =>
imageRef.Publisher == "Canonical" &&
imageRef.Offer == "ubuntu-24_04-lts" &&
imageRef.Sku.Contains("server");

// Obtain the first node agent SKU in the collection that matches the delegate above.
NodeAgentSku ubuntuAgentSku = nodeAgentSkus.First(sku =>
sku.VerifiedImageReferences.Any(isUbuntu2404));

// Select an ImageReference from those available for node agent.


ImageReference imageReference =
ubuntuAgentSku.VerifiedImageReferences.First(isUbuntu2404);
// Create the virtual machine configuration to use to create the pool.
VirtualMachineConfiguration virtualMachineConfiguration =
new VirtualMachineConfiguration(imageReference, ubuntuAgentSku.Id);

Console.WriteLine("Creating pool [{0}]...", poolId);

// Create the unbound pool.


pool = batchClient.PoolOperations.CreatePool(
poolId: poolId,
targetDedicatedComputeNodes: 2,
virtualMachineSize: "Standard_d2s_v3",
virtualMachineConfiguration: virtualMachineConfiguration);
// Add named user accounts.
pool.UserAccounts = new List<UserAccount>
{
new UserAccount(
name: "adminUser",
password: "A1bC2d",
elevationLevel: ElevationLevel.Admin,
linuxUserConfiguration: new LinuxUserConfiguration(
uid: 12345,
gid: 98765,
sshPrivateKey: new Guid().ToString()
)),
new UserAccount(
name: "nonAdminUser",
password: "A1bC2d",
elevationLevel: ElevationLevel.NonAdmin,
linuxUserConfiguration: new LinuxUserConfiguration(
uid: 45678,
gid: 98765,
sshPrivateKey: new Guid().ToString()
)),
};

// Commit the pool.


await pool.CommitAsync();

Batch Java example

Java

List<UserAccount> userList = new ArrayList<>();
userList.add(new UserAccount()
    .withName(adminUserAccountName)
    .withPassword(adminPassword)
    .withElevationLevel(ElevationLevel.ADMIN));
userList.add(new UserAccount()
    .withName(nonAdminUserAccountName)
    .withPassword(nonAdminPassword)
    .withElevationLevel(ElevationLevel.NONADMIN));
PoolAddParameter addParameter = new PoolAddParameter()
    .withId(poolId)
    .withTargetDedicatedNodes(POOL_VM_COUNT)
    .withVmSize(POOL_VM_SIZE)
    .withVirtualMachineConfiguration(configuration)
    .withUserAccounts(userList);
batchClient.poolOperations().createPool(addParameter);

Batch Python example

Python

users = [
    batchmodels.UserAccount(
        name='pool-admin',
        password='A1bC2d',
        elevation_level=batchmodels.ElevationLevel.admin),
    batchmodels.UserAccount(
        name='pool-nonadmin',
        password='A1bC2d',
        elevation_level=batchmodels.ElevationLevel.non_admin)
]
pool = batchmodels.PoolAddParameter(
    id=pool_id,
    user_accounts=users,
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=image_ref_to_use,
        node_agent_sku_id=sku_to_use),
    vm_size=vm_size,
    target_dedicated_nodes=vm_count)
batch_client.pool.add(pool)

Run a task under a named user account with elevated access
To run a task as an elevated user, set the task's UserIdentity property to a named user
account that was created with its ElevationLevel property set to Admin .

This code snippet specifies that the task should run under a named user account. This named
user account was defined on the pool when the pool was created. In this case, the named
user account was created with admin permissions:

C#

CloudTask task = new CloudTask("1", "cmd.exe /c echo 1");


task.UserIdentity = new UserIdentity(AdminUserAccountName);

Update your code to the latest Batch client library
The Batch service version 2017-01-01.4.0 introduced a breaking change, replacing the
runElevated property available in earlier versions with the userIdentity property. The
following tables provide a simple mapping that you can use to update your code from earlier
versions of the client libraries.

Batch .NET

| If your code uses... | Update it to... |
| --- | --- |
| CloudTask.RunElevated = true; | CloudTask.UserIdentity = new UserIdentity(new AutoUserSpecification(elevationLevel: ElevationLevel.Admin)); |
| CloudTask.RunElevated = false; | CloudTask.UserIdentity = new UserIdentity(new AutoUserSpecification(elevationLevel: ElevationLevel.NonAdmin)); |
| CloudTask.RunElevated not specified | No update required |

Batch Java

| If your code uses... | Update it to... |
| --- | --- |
| CloudTask.withRunElevated(true); | CloudTask.withUserIdentity(new UserIdentity().withAutoUser(new AutoUserSpecification().withElevationLevel(ElevationLevel.ADMIN))); |
| CloudTask.withRunElevated(false); | CloudTask.withUserIdentity(new UserIdentity().withAutoUser(new AutoUserSpecification().withElevationLevel(ElevationLevel.NONADMIN))); |
| CloudTask.withRunElevated not specified | No update required |

Batch Python

| If your code uses... | Update it to... |
| --- | --- |
| run_elevated=True | user_identity=user, where user = batchmodels.UserIdentity(auto_user=batchmodels.AutoUserSpecification(elevation_level=batchmodels.ElevationLevel.admin)) |
| run_elevated=False | user_identity=user, where user = batchmodels.UserIdentity(auto_user=batchmodels.AutoUserSpecification(elevation_level=batchmodels.ElevationLevel.non_admin)) |
| run_elevated not specified | No update required |

Next steps
Learn about the Batch service workflow and primary resources such as pools, nodes,
jobs, and tasks.
Learn about files and directories in Azure Batch.


Submit a large number of tasks to a Batch job
Article • 06/13/2024

When you run large-scale Azure Batch workloads, you might want to submit tens of
thousands, hundreds of thousands, or even more tasks to a single job.

This article shows you how to submit large numbers of tasks with substantially increased
throughput to a single Batch job. After tasks are submitted, they enter the Batch queue
for processing on the pool you specify for the job.

Use task collections


When adding a large number of tasks, use the appropriate methods or overloads
provided by the Batch APIs to add tasks as a collection rather than one at a time.
Generally, you construct a task collection by defining tasks as you iterate over a set of
input files or parameters for your job.

The maximum size of the task collection that you can add in a single call depends on the
Batch API you use.

APIs allowing collections of up to 100 tasks


These Batch APIs limit the collection to 100 tasks. The limit could be smaller depending
on the size of the tasks (for example, if the tasks have a large number of resource files or
environment variables).

REST API
Python API
Node.js API

When using these APIs, you need to provide logic to divide the tasks into collections
that meet the limit, and to handle errors and retries when task addition fails. If a
task collection is too large to add, the request generates an error and should be
retried with fewer tasks.
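
The divide-and-retry logic might look something like the following Python sketch. It's a minimal illustration under stated assumptions: `submit` stands in for whatever call adds one task collection, and `RequestTooLargeError` is a hypothetical exception representing the "collection too large" error. Neither name comes from a Batch SDK.

```python
class RequestTooLargeError(Exception):
    """Hypothetical stand-in for the error returned when a collection is too large."""


def add_tasks_in_chunks(tasks, submit, max_chunk=100):
    """Submit tasks in collections of at most max_chunk, halving a collection on failure.

    Returns the number of successful submit calls.
    """
    pending = [tasks[i:i + max_chunk] for i in range(0, len(tasks), max_chunk)]
    calls = 0
    while pending:
        chunk = pending.pop(0)
        try:
            submit(chunk)
            calls += 1
        except RequestTooLargeError:
            if len(chunk) == 1:
                raise  # a single oversized task can't be split any further
            middle = len(chunk) // 2
            # Retry both halves before moving on, preserving task order.
            pending[:0] = [chunk[:middle], chunk[middle:]]
    return calls


# Demo with a fake submit function that rejects collections of more than 60 tasks.
submitted = []

def fake_submit(chunk):
    if len(chunk) > 60:
        raise RequestTooLargeError()
    submitted.extend(chunk)

calls = add_tasks_in_chunks(list(range(250)), fake_submit)
print(calls, len(submitted))
```

A production version would also need to handle transient failures and per-task errors reported in the response, which this sketch leaves out.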

APIs allowing collections of larger numbers of tasks


Other Batch APIs support much larger task collections, limited only by RAM availability
on the submitting client. These APIs transparently handle dividing the task collection
into "chunks" for the lower-level APIs and retries for task addition failures.

.NET API
Java API
Azure Batch CLI extension with Batch CLI templates
Python SDK extension

Increase throughput of task submission


It can take some time to add a large collection of tasks to a job. For example, adding
20,000 tasks via the .NET API might take up to one minute. Depending on the Batch API
and your workload, you can improve task throughput by modifying one or more of the
following.

Task size
Adding large tasks takes longer than adding smaller ones. To reduce the size of each
task in a collection, you can simplify the task command line, reduce the number of
environment variables, or handle requirements for task execution more efficiently.

For example, instead of using a large number of resource files, install task dependencies
using a start task on the pool, or use an application package or Docker container.

Number of parallel operations


Depending on the Batch API, you can increase throughput by increasing the maximum
number of concurrent operations by the Batch client. Configure this setting using the
BatchClientParallelOptions.MaxDegreeOfParallelism property in the .NET API, or the
threads parameter of methods such as TaskOperations.add_collection in the Batch
Python SDK extension. (This setting isn't available in the native Batch Python SDK.)

By default, this setting is 1, but you can set it higher to improve throughput of
operations. The trade-off for the increased throughput is higher network bandwidth
use and some CPU overhead. Because each operation can add up to 100 tasks, task
throughput increases by up to 100 times the value of MaxDegreeOfParallelism or
threads. In practice, set the number of concurrent operations below 100.

The Azure Batch CLI extension with Batch templates increases the number of concurrent
operations automatically based on the number of available cores, but this property is
not configurable in the CLI.
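
To illustrate what "concurrent operations" means here, the following Python sketch submits several task collections in parallel with a thread pool from the standard library. It's a generic sketch, not how any Batch client library is implemented; `submit` is a placeholder for a call that adds one collection, and `len` is used in the demo only so the example runs without a Batch account.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_concurrently(chunks, submit, max_workers=4):
    """Call submit(chunk) for every chunk using up to max_workers threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input chunks.
        return list(pool.map(submit, chunks))


tasks = [f"task-{n}" for n in range(1000)]
chunks = [tasks[i:i + 100] for i in range(0, len(tasks), 100)]

# len stands in for a real submit call and reports the size of each collection.
results = submit_concurrently(chunks, len, max_workers=4)
print(results)
```

Here `max_workers` plays the role that MaxDegreeOfParallelism or threads plays in the APIs described above.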

HTTP connection limits
A limit on the number of concurrent HTTP connections can throttle the performance of
the Batch client when it adds large numbers of tasks, and some APIs impose such a
limit. When developing with the .NET API, for example, the
ServicePointManager.DefaultConnectionLimit property is set to 2 by default. We
recommend that you increase the value to a number close to or greater than the
number of parallel operations.

Example: Batch .NET


The following C# snippets show settings to configure when adding a large number of
tasks using the Batch .NET API.

To increase task throughput, increase the value of the MaxDegreeOfParallelism property
of the BatchClientParallelOptions object. For example:

C#

BatchClientParallelOptions parallelOptions = new BatchClientParallelOptions()
{
    MaxDegreeOfParallelism = 15
};
...

Add a task collection to the job using the appropriate overload of the AddTaskAsync or
AddTask method. For example:

C#

// Add a list of tasks as a collection
List<CloudTask> tasksToAdd = new List<CloudTask>(); // Populate with your tasks
...
await batchClient.JobOperations.AddTaskAsync(jobId, tasksToAdd, parallelOptions);

Example: Batch CLI extension


Using the Azure Batch CLI extension with Batch CLI templates, create a job template
JSON file that includes a task factory. The task factory configures a collection of
related tasks for a job from a single task definition.

The following is a sample job template for a one-dimensional parametric sweep job with
a large number of tasks (in this case, 250,000). The task command line is a simple echo
command.

JSON

{
    "job": {
        "type": "Microsoft.Batch/batchAccounts/jobs",
        "apiVersion": "2016-12-01",
        "properties": {
            "id": "myjob",
            "constraints": {
                "maxWallClockTime": "PT5H",
                "maxTaskRetryCount": 1
            },
            "poolInfo": {
                "poolId": "mypool"
            },
            "taskFactory": {
                "type": "parametricSweep",
                "parameterSets": [
                    {
                        "start": 1,
                        "end": 250000,
                        "step": 1
                    }
                ],
                "repeatTask": {
                    "commandLine": "/bin/bash -c 'echo Hello world from task {0}'",
                    "constraints": {
                        "retentionTime": "PT1H"
                    }
                }
            },
            "onAllTasksComplete": "terminatejob"
        }
    }
}

To run a job with the template, see Use Azure Batch CLI templates and file transfer.
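
To show what the parametric sweep in the template amounts to, the following Python sketch expands a one-dimensional parameter set into command lines by substituting each value for the {0} placeholder. It only illustrates the idea and isn't the CLI extension's own expansion code. The function name is hypothetical, and the sketch assumes that end is inclusive, which matches the 250,000 tasks that the template above produces.

```python
def expand_parametric_sweep(start, end, step, command_template):
    """Yield (value, command_line) pairs for a one-dimensional sweep."""
    for value in range(start, end + 1, step):  # end is treated as inclusive
        yield value, command_template.replace("{0}", str(value))


template = "/bin/bash -c 'echo Hello world from task {0}'"

# Print the first three expanded command lines.
for value, command_line in expand_parametric_sweep(1, 3, 1, template):
    print(value, command_line)

# Count the tasks that the full sweep would generate.
total = sum(1 for _ in expand_parametric_sweep(1, 250000, 1, template))
print(total)
```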

Example: Batch Python SDK extension


To use the Azure Batch Python SDK extension, first install the Python SDK and the
extension:
pip install azure-batch
pip install azure-batch-extensions

After importing the package using import azext.batch as batch, set up a
BatchExtensionsClient that uses the SDK extension:

Python

client = batch.BatchExtensionsClient(
base_url=BATCH_ACCOUNT_URL, resource_group=RESOURCE_GROUP_NAME,
batch_account=BATCH_ACCOUNT_NAME)
...

Create a collection of tasks to add to a job. For example:

Python

tasks = list()
# Populate the list with your tasks
...

Add the task collection using task.add_collection. Set the threads parameter to increase
the number of concurrent operations:

Python

try:
    # Pass the task list created in the previous step.
    client.task.add_collection(job_id, tasks, threads=100)
except Exception as e:
    raise e

The Batch Python SDK extension also supports adding task parameters to a job using a
JSON specification for a task factory. For example, configure job parameters for a
parametric sweep similar to the one in the preceding Batch CLI template example:

Python

parameter_sweep = {
    "job": {
        "type": "Microsoft.Batch/batchAccounts/jobs",
        "apiVersion": "2016-12-01",
        "properties": {
            "id": "myjob",
            "poolInfo": {
                "poolId": "mypool"
            },
            "taskFactory": {
                "type": "parametricSweep",
                "parameterSets": [
                    {
                        "start": 1,
                        "end": 250000,
                        "step": 1
                    }
                ],
                "repeatTask": {
                    "commandLine": "/bin/bash -c 'echo Hello world from task {0}'",
                    "constraints": {
                        "retentionTime": "PT1H"
                    }
                }
            },
            "onAllTasksComplete": "terminatejob"
        }
    }
}
...
job_json = client.job.expand_template(parameter_sweep)
job_parameter = client.job.jobparameter_from_json(job_json)

Add the job parameters to the job. Set the threads parameter to increase the number
of concurrent operations:

Python

try:
    client.job.add(job_parameter, threads=50)
except Exception as e:
    raise e

Next steps
Learn more about using the Azure Batch CLI extension with Batch CLI templates.
Learn more about the Batch Python SDK extension .
Read about best practices for Azure Batch.

Schedule Batch jobs for efficiency
Article • 03/21/2025

Scheduling Batch jobs lets you prioritize the jobs you want to run first, while taking into
account task dependencies. It also helps you use the fewest resources: nodes can be
decommissioned when they aren't needed, and tasks that depend on other tasks are
spun up just in time to optimize the workflow. Because a job schedule runs only one
job at a time, you can set jobs to complete automatically, and a new job doesn't start
until the previous one completes.

The tasks you schedule using the job manager task are associated with a job. The job
manager task will create tasks for the job. To do so, the job manager task needs to
authenticate with the Batch account. Use the AZ_BATCH_AUTHENTICATION_TOKEN
access token. The token allows access to the rest of the job.

To manage a job using the Azure CLI, see az batch job-schedule. You can also create job
schedules in the Azure portal.

Schedule a job in the Azure portal


1. Sign in to the Azure portal .

2. Select the Batch account you want to schedule jobs in.

3. In the left navigation pane, select Job schedules.

4. Select Add to create a new job schedule.

5. Under Basic form, enter the following information:

Job schedule ID: A unique identifier for this job schedule.


Display name: This name is optional and doesn't have to be unique. It has a
maximum length of 1024 characters.

6. In the Schedule section, enter the following information:

Do not run until: Specifies the earliest time the job will run. If you don't set
this, the schedule becomes ready to run jobs immediately.

Do not run after: No jobs will run after the time you enter here. If you don't
specify a time, then you're creating a recurring job schedule, which remains
active until you explicitly terminate it.

Recurrence interval: Select Enabled if you want to specify the amount of time
between jobs. You can have only one job at a time scheduled, so if it's time to
create a new job under a job schedule but the previous job is still running,
the Batch service won't create the new job until the previous job finishes.

Start window: Select Custom if you'd like to specify the time interval within
which a job must be created. If a job isn't created within this window, no new
job will be created until the next recurrence of the schedule.

7. In the Job Specification section, enter the following information:


Pool ID: Select the pool where you want the job to run. To choose from a list
of pools in your Batch account, select Update.

Job configuration task: Select Update to name and configure the job
manager task, as well as the job preparation task and job release tasks, if
you're using them.

8. In the Advanced settings section, enter the following information:

Display name: This name is optional and doesn't have to be unique. It has a
maximum length of 1024 characters.

Priority: Use the slider to set a priority for the job, or enter a value in the box.

Max wall clock time: Select Custom if you want to set a maximum amount of
time for the job to run. If you do so, Batch will terminate the job if it doesn't
complete within that time frame.

Max task retry count: Select Custom if you want to specify the number of
times a task can be retried, or Unlimited if you want the task to be tried for
as many times as is needed. This isn't the same as the number of retries an
API call might have.

When all tasks complete: The default is NoAction, but you can select
TerminateJob if you prefer to terminate the job when all tasks have been
completed (or if there are no tasks in the job).

When a task fails: A task fails if the retry count is exhausted or there's an
error when starting the task. The default is NoAction, but you can select
PerformExitOptionsJobAction if you prefer to take the action associated with
the task's exit condition if it fails.

9. Select Save to create your job schedule.

To track the execution of the job, return to Job schedules and select the job schedule.
Expand Execution info to see details. You can also terminate, delete, or disable the job
schedule from this screen.
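
The portal fields above correspond to properties of the job schedule resource. As a rough sketch only, the following Python builds a request body using the Batch REST API property names as we understand them (schedule.doNotRunUntil, recurrenceInterval, startWindow, and so on). The IDs nightly-report and mypool, the chosen intervals, and the iso_duration helper are all hypothetical, and nothing here sends a request to the service.

```python
import json
from datetime import datetime, timedelta, timezone


def iso_duration(delta: timedelta) -> str:
    """Format a whole-second timedelta as an ISO 8601 duration such as PT1H30M."""
    total = int(delta.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    parts = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    return "PT" + (parts or "0S")


def build_job_schedule(schedule_id: str, pool_id: str, start: datetime) -> dict:
    """Assemble a job schedule body (property names assumed from the REST API)."""
    return {
        "id": schedule_id,
        "schedule": {
            "doNotRunUntil": start.isoformat(),
            "recurrenceInterval": iso_duration(timedelta(hours=1)),
            "startWindow": iso_duration(timedelta(minutes=10)),
            # doNotRunAfter is omitted, so the schedule recurs until terminated.
        },
        "jobSpecification": {
            "poolInfo": {"poolId": pool_id},
            "priority": 0,
            "constraints": {
                "maxWallClockTime": iso_duration(timedelta(minutes=30)),
                "maxTaskRetryCount": 3,
            },
            "onAllTasksComplete": "terminatejob",
            "onTaskFailure": "noaction",
        },
    }


body = build_job_schedule(
    "nightly-report", "mypool", datetime(2025, 1, 1, tzinfo=timezone.utc)
)
print(json.dumps(body, indent=2))
```

Leaving out doNotRunAfter mirrors the portal behavior described in step 6: the schedule stays active until you explicitly terminate it.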

Next steps
Learn more about jobs and tasks.
Create task dependencies to run tasks that depend on other tasks.



Azure Batch job and task errors
Article • 04/25/2025

Various errors can happen when you add, schedule, or run Azure Batch jobs and tasks. It's
straightforward to detect errors that occur when you add jobs and tasks. The API, command
line, or user interface usually returns any failures immediately. This article covers how to check
for and handle errors that occur after jobs and tasks are submitted.

Job failures
A job is a group of one or more tasks, which specify command lines to run. You can specify the
following optional parameters when you add a job. These parameters influence how the job
can fail.

JobConstraints. You can optionally use the maxWallClockTime property to set the
maximum amount of time a job can be active or running. If the job exceeds the
maxWallClockTime, the job terminates with the terminateReason property set to
MaxWallClockTimeExpiry in the JobExecutionInformation.

JobPreparationTask. You can optionally specify a job preparation task to run on each
compute node scheduled to run a job task. The node runs the job preparation task before
the first time it runs a task for the job. If the job preparation task fails, the task doesn't run
and the job doesn't complete.

JobReleaseTask. You can optionally specify a job release task for jobs that have a job
preparation task. When a job is being terminated, the job release task runs on each pool
node that ran a job preparation task. If a job release task fails, the job still moves to a
completed state.

In the Azure portal, you can set these parameters in the Job manager, preparation and release
tasks and Advanced sections of the Batch Add job screen.

Job properties
Check the following job properties in the JobExecutionInformation for errors:

The terminateReason property indicates MaxWallClockTimeExpiry if the job exceeded the
maxWallClockTime specified in the job constraints and therefore the job terminated. This
property can also be set to taskFailed if the job's onTaskFailure attribute is set to
performExitOptionsJobAction, and a task fails with an exit condition that specifies a
jobAction of terminate.

The JobSchedulingError property is set if there has been a scheduling error.
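
As an illustration of the checks above, this sketch inspects a job's execution information after it has been fetched and deserialized into a Python dict. The key names terminateReason and schedulingError are assumed to follow the REST API, the diagnose_job helper is hypothetical, and the comparison is case-insensitive because the exact casing returned by the service isn't confirmed here.

```python
def diagnose_job(execution_info: dict) -> list:
    """Return human-readable findings for a job's executionInfo dict."""
    findings = []
    reason = (execution_info.get("terminateReason") or "").lower()
    if reason == "maxwallclocktimeexpiry":
        findings.append("job exceeded maxWallClockTime")
    elif reason == "taskfailed":
        findings.append("a task failure terminated the job")
    elif reason:
        findings.append(f"terminated: {reason}")

    error = execution_info.get("schedulingError")
    if error:
        findings.append(
            f"scheduling error {error.get('code')}: {error.get('message')}"
        )
    return findings


# Hypothetical sample, shaped like a job's executionInfo.
print(diagnose_job({"terminateReason": "MaxWallClockTimeExpiry"}))
```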

Job preparation tasks


An instance of a job preparation task runs on each compute node the first time the node runs a
task for the job. You can think of the job preparation task as a task template, with multiple
instances being run, up to the number of nodes in a pool. Check the job preparation task
instances to determine if there were errors.

You can use the Job - List Preparation and Release Task Status API to list the execution status of
all instances of job preparation and release tasks for a specified job. As with other tasks,
JobPreparationTaskExecutionInformation is available with properties such as failureInfo ,
exitCode , and result .

When a job preparation task runs, the task that triggered the job preparation task moves to a
taskState of preparing . If the job preparation task fails, the triggering task reverts to the
active state and doesn't run.

Because the triggering task never runs, the job can't complete and is stuck. If no other jobs
have tasks that can be scheduled, the pool might sit idle.

Job release tasks


When the job is being terminated, an instance of the job release task runs on each node that
ran a job preparation task. Check the job release task instances to determine if there were errors.

You can use the Job - List Preparation and Release Task Status API to list the execution status of
all instances of job preparation and release tasks for a specified job. As with other tasks,
JobReleaseTaskExecutionInformation is available with properties such as failureInfo ,
exitCode , and result .

If one or more job release tasks fail, the job is still terminated and moves to a completed state.
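
To make the per-node check concrete, the sketch below scans a list of preparation and release task status entries (already fetched and deserialized) and reports the instances that failed. The key names nodeId, jobPreparationTaskExecutionInfo, and jobReleaseTaskExecutionInfo are assumed from the REST API response shape; the failed_instances helper and the sample data are hypothetical.

```python
KINDS = ("jobPreparationTaskExecutionInfo", "jobReleaseTaskExecutionInfo")


def failed_instances(status_items: list) -> list:
    """Return (nodeId, kind, exitCode) for every failed prep or release instance."""
    failures = []
    for item in status_items:
        for kind in KINDS:
            info = item.get(kind)
            if info and (info.get("result") or "").lower() == "failure":
                failures.append((item["nodeId"], kind, info.get("exitCode")))
    return failures


# Hypothetical sample: node-1 failed preparation, node-2 failed release.
sample = [
    {"nodeId": "node-1",
     "jobPreparationTaskExecutionInfo": {"result": "failure", "exitCode": 1}},
    {"nodeId": "node-2",
     "jobPreparationTaskExecutionInfo": {"result": "success", "exitCode": 0},
     "jobReleaseTaskExecutionInfo": {"result": "failure", "exitCode": 2}},
]
for node_id, kind, exit_code in failed_instances(sample):
    print(node_id, kind, exit_code)
```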

Task failures
Job tasks can fail for the following reasons:

The task command line fails and returns with a nonzero exit code.
One or more resourceFiles specified for a task don't download.
One or more outputFiles specified for a task don't upload.
The elapsed time for the task exceeds the maxWallClockTime property specified in the
TaskConstraints.

In all cases, check the following properties for errors and information about the errors:

The TaskExecutionInformation property has multiple properties that provide information
about an error. The taskExecutionResult indicates if the task failed for any reason, and
exitCode and failureInfo provide more information about the failure.

The task always moves to the completed TaskState, whether it succeeded or failed.

Consider the impact of task failures on the job and on any task dependencies. You can specify
ExitConditions to configure actions for dependencies and for the job.

DependencyAction controls whether to block or run tasks that depend on the failed task.
JobAction controls whether the failed task causes the job to be disabled, terminated, or
unchanged.
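
One possible shape for such exit conditions is sketched below as a REST-style dict. The property names (exitCodes, exitOptions, jobAction, dependencyAction, default) and the values terminate, none, and block are assumed from the REST API; the exit_conditions_for helper and the choice of fatal exit codes are hypothetical. The job action only takes effect when the job's onTaskFailure attribute is set to performExitOptionsJobAction.

```python
def exit_conditions_for(fatal_codes: list) -> dict:
    """Terminate the job on listed exit codes; block dependents on any other failure."""
    return {
        "exitCodes": [
            {
                "code": code,
                "exitOptions": {
                    "jobAction": "terminate",
                    "dependencyAction": "block",
                },
            }
            for code in fatal_codes
        ],
        # Applies to failures not covered by the entries above.
        "default": {
            "exitOptions": {"jobAction": "none", "dependencyAction": "block"}
        },
    }


conditions = exit_conditions_for([2, 3])
print(conditions)
```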

Task command lines


Task command lines don't run under a shell on compute nodes, so they can't natively use shell
features such as environment variable expansion. To take advantage of such features, you must
invoke the shell in the command line. For more information, see Command-line expansion of
environment variables.
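
A small helper can make the shell invocation explicit when you build command lines in client code. This is a minimal sketch: wrap_in_shell is a hypothetical name, the Windows branch assumes the command contains no embedded double quotes, and AZ_BATCH_TASK_ID is used only as a sample environment variable.

```python
import shlex


def wrap_in_shell(command: str, windows: bool) -> str:
    """Prefix a command with the platform shell so variables are expanded."""
    if windows:
        # Assumes no embedded double quotes in the command.
        return f'cmd /c "{command}"'
    # shlex.quote keeps the whole command as a single argument to /bin/sh.
    return f"/bin/sh -c {shlex.quote(command)}"


print(wrap_in_shell("echo %AZ_BATCH_TASK_ID%", windows=True))
print(wrap_in_shell("echo $AZ_BATCH_TASK_ID", windows=False))
```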

Task command line output writes to stderr.txt and stdout.txt files. Your application might also
write to application-specific log files. Make sure to implement comprehensive error checking
for your application to promptly detect and diagnose issues.

Task logs
If the pool node that ran a task still exists, you can get and view the task log files. Several APIs
allow listing and getting task files, such as File - Get From Task. You can also list and view log
files for a task or node by using the Azure portal.

1. At the top of the Overview page for a node, select Upload batch logs.
2. On the Upload Batch logs page, select Pick storage container, select an Azure Storage
container to upload to, and then select Start upload.

3. You can view, open, or download the logs from the storage container page.

Output files
Because Batch pools and pool nodes are often ephemeral, with nodes being continuously
added and deleted, it's best to save the log files when the job runs. Task output files are a
convenient way to save log files to Azure Storage. For more information, see Persist task data
to Azure Storage with the Batch service API.

On every file upload, Batch writes two log files to the compute node, fileuploadout.txt and
fileuploaderr.txt. You can examine these log files to learn more about a specific failure. If the file
upload wasn't attempted, for example because the task itself couldn't run, these log files don't
exist.

Next steps
Learn more about Batch jobs and tasks and job preparation and release tasks.
Learn about Batch pool and node errors.


Persist job and task output
Article • 04/02/2025

A task running in Azure Batch may produce output data when it runs. Task output data
often needs to be stored for retrieval by other tasks in the job, the client application that
executed the job, or both. Tasks write output data to the file system of a Batch compute
node, but all data on the node is lost when it is reimaged or when the node leaves the
pool. Tasks may also have a file retention period, after which files created by the task are
deleted. For these reasons, it's important to persist task output that you'll need later to a
data store such as Azure Storage.

For storage account options in Batch, see Batch accounts and Azure Storage accounts.

Some common examples of task output include:

Files created when the task processes input data.


Log files associated with task execution.

This article describes various options for persisting output data. You can persist output
data from Batch tasks and jobs to Azure Storage, or other stores.

Options for persisting output


There are multiple ways to persist output data. Choose the best method for your
scenario:

Use the Batch service API.


Use the Batch File Conventions library for C# and .NET applications.
Use the Batch File Conventions standard for languages other than .NET.
Use a custom file movement solution.

Batch service API


You can use the Batch service API to persist output data. Specify output files in Azure
Storage for task data when you add a task to a job or add a collection of tasks to a job.

For more information, see Persist task data to Azure Storage with the Batch service API.

Batch File Conventions library


The Batch File Conventions standard is an optional set of conventions for naming task
output files in Azure Storage. The standard provides naming conventions for a file's
destination container and blob path, based on the names of the job and task.

It's optional to use the File Conventions standard for naming your output data files. You
can choose to name the destination container and blob path instead. If you do use the
File Conventions standard, then you can view your output files in the Azure portal.

If you're building a Batch solution with C# and .NET, you can use the Batch File
Conventions library for .NET. The library moves output files to Azure Storage, and
names destination containers and blobs according to the Batch File Conventions
standard.

For more information, see Persist job and task data to Azure Storage with the Batch File
Conventions library for .NET.

Batch File Conventions standard


If you're using a language other than .NET, you can implement the Batch File
Conventions standard in your own application. Use this approach when:

You want to use a common naming scheme.


You want to view task output in the Azure portal.

Custom file movement solution


You can also implement your own complete file movement solution. Use this approach
when:

You want to persist task data to a data store other than Azure Storage. For
example, you want to upload files to a data store like Azure SQL or Azure
Data Lake. Create a custom script or executable to upload to that location. Then,
call the custom script or executable on the command line after running your
primary executable. For example, on a Windows node, call doMyWork.exe &&
uploadMyFilesToSql.exe.

You want to do checkpointing or early uploading of initial results.


You want to maintain granular control over error handling. For example, you want
to use task dependency actions to take certain upload actions based on specific
task exit codes.

Design considerations
When you design your Batch solution, consider the following factors.

Compute nodes are often transient, especially in Batch pools with autoscaling enabled.
You can only see output from a task:

While the node where the task is running exists.


During the file retention period that you set for the task.

When you view a Batch task in the Azure portal, and select Files on node, you see all
files for that task, not just the output files. To retrieve task output directly from the
compute nodes in your pool, you need the file name and its output location on the
node.

If you want to keep task output data longer, configure the task to upload its output files
to a data store. It's recommended to use Azure Storage as the data store. There's
integration for writing task output data to Azure Storage in the Batch service API. You
can use other durable storage options to keep your data. However, you need to write
the application logic for other storage options yourself.

To view your output data in Azure Storage, use the Azure portal or an Azure Storage
client application, such as Azure Storage Explorer. Note your output file's location, and
go to that location directly.

Next step
PersistOutputs sample project



Persist task data to Azure Storage with the Batch service API
07/01/2025

A task running in Azure Batch may produce output data when it runs. Task output data often
needs to be stored for retrieval by other tasks in the job, the client application that executed
the job, or both. Tasks write output data to the file system of a Batch compute node, but all
data on the node is lost when it is reimaged or when the node leaves the pool. Tasks may also
have a file retention period, after which files created by the task are deleted. For these reasons,
it's important to persist task output that you'll need later to a data store such as Azure Storage.

For storage account options in Batch, see Batch accounts and Azure Storage accounts.

The Batch service API supports persisting output data to Azure Storage for tasks and job
manager tasks that run on pools with Virtual Machine Configuration. When you add a task, you
can specify a container in Azure Storage as the destination for the task's output. The Batch
service then writes any output data to that container when the task is complete.

When using the Batch service API to persist task output, you don't need to modify the
application that the task is running. Instead, with a few modifications to your client application,
you can persist the task's output from within the same code that creates the task.

Important

Persisting task data to Azure Storage with the Batch service API does not work with pools
created before February 1, 2018.

When do I use the Batch service API to persist task output?
Azure Batch provides more than one way to persist task output. Using the Batch service API is a
convenient approach that's best suited to these scenarios:

You want to write code to persist task output from within your client application, without
modifying the application that your task is running.
You want to persist output from Batch tasks and job manager tasks in pools created with
the virtual machine configuration.
You want to persist output to an Azure Storage container with an arbitrary name.
You want to persist output to an Azure Storage container named according to the Batch
File Conventions standard.

If your scenario differs from those listed above, you may need to consider a different approach.
For example, the Batch service API does not currently support streaming output to Azure
Storage while the task is running. To stream output, consider using the Batch File Conventions
library, available for .NET. For other languages, you'll need to implement your own solution. For
more information about other options, see Persist job and task output to Azure Storage.

Create a container in Azure Storage


To persist task output to Azure Storage, you'll need to create a container that serves as the
destination for your output files. Create the container before you run your task, preferably
before you submit your job, by using the appropriate Azure Storage client library or SDK. For
more information about Azure Storage APIs, see the Azure Storage documentation.

For example, if you are writing your application in C#, use the Azure Storage client library for
.NET. The following example shows how to create a container:

C#

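// Uses the legacy Azure Storage client library for .NET.
CloudBlobContainer container =
    storageAccount.CreateCloudBlobClient().GetContainerReference(containerName);
// Use the async overload; the synchronous CreateIfNotExists() can't be awaited.
await container.CreateIfNotExistsAsync();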

Specify output files for task output


To specify output files for a task, create a collection of OutputFile objects and assign it to the
CloudTask.OutputFiles property when you create the task. You can use a Shared Access
Signature (SAS) or managed identity to authenticate access to the container.

Using a Shared Access Signature


After you create the container, get a shared access signature (SAS) with write access to the
container. A SAS provides delegated access to the container. The SAS grants access with a
specified set of permissions and over a specified time interval. The Batch service needs a SAS
with write permissions to write task output to the container. For more information about SAS,
see Using shared access signatures (SAS) in Azure Storage.

When you get a SAS using the Azure Storage APIs, the API returns a SAS token string. This
token string includes all parameters of the SAS, including the permissions and the interval over
which the SAS is valid. To use the SAS to access a container in Azure Storage, you need to
append the SAS token string to the resource URI. The resource URI, together with the
appended SAS token, provides authenticated access to Azure Storage.

The following example shows how to get a write-only SAS token string for the container and
then append the SAS to the container URI:

C#

string containerSasToken = container.GetSharedAccessSignature(
    new SharedAccessBlobPolicy()
    {
        // Write-only access that expires after one day.
        SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddDays(1),
        Permissions = SharedAccessBlobPermissions.Write
    });

// Append the SAS token to the container URI to form the SAS URL.
string containerSasUrl = container.Uri.AbsoluteUri + containerSasToken;
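
For client code written in another language, the same concatenation can be sketched in Python. The container_sas_url helper is hypothetical, the account and container names are placeholders, and the token is a dummy string rather than a real signature. The helper tolerates a token with or without a leading question mark, since that detail depends on the library that produced it.

```python
def container_sas_url(container_url: str, sas_token: str) -> str:
    """Join a container URL and a SAS token into a single SAS URL."""
    return container_url.rstrip("/") + "?" + sas_token.lstrip("?")


url = container_sas_url(
    "https://myaccount.blob.core.windows.net/mycontainer",
    "?sp=w&sig=EXAMPLE",  # placeholder token, not a real signature
)
print(url)
```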

The following C# code example creates a task that writes random numbers to a file named
output.txt. The example creates an output file for output.txt to be written to the container.

The example also creates output files for any log files that match the file pattern std*.txt (e.g.,
stdout.txt and stderr.txt). The container URL requires the SAS that was created previously
for the container. The Batch service uses the SAS to authenticate access to the container.

C#

CloudTask task = new CloudTask(taskId,
    "cmd /v:ON /c \"echo off && set && (FOR /L %i IN (1,1,100000) DO (ECHO !RANDOM!)) > output.txt\"")
{
    OutputFiles = new List<OutputFile>
    {
        // Upload stdout.txt and stderr.txt, which sit one level above the working directory.
        new OutputFile(
            filePattern: @"..\std*.txt",
            destination: new OutputFileDestination(
                new OutputFileBlobContainerDestination(
                    containerUrl: containerSasUrl,
                    path: taskId)),
            uploadOptions: new OutputFileUploadOptions(
                uploadCondition: OutputFileUploadCondition.TaskCompletion)),
        // Upload output.txt to a blob named <taskId>/output.txt.
        new OutputFile(
            filePattern: @"output.txt",
            destination: new OutputFileDestination(
                new OutputFileBlobContainerDestination(
                    containerUrl: containerSasUrl,
                    path: taskId + @"/output.txt")),
            uploadOptions: new OutputFileUploadOptions(
                uploadCondition: OutputFileUploadCondition.TaskCompletion)),
    }
};

Note

If using this example with Linux, be sure to change the backslashes to forward slashes.
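
For a Linux pool or a non-.NET client, the equivalent task definition might look like the REST-style body sketched here. The property names (outputFiles, filePattern, destination.container.containerUrl, path, uploadOptions.uploadCondition) are assumed from the Batch REST API; the helper names, the task ID, the command line, and the SAS URL are hypothetical placeholders.

```python
def output_file(pattern: str, container_url: str, path: str,
                condition: str = "taskcompletion") -> dict:
    """Build one outputFiles entry (REST property names assumed)."""
    return {
        "filePattern": pattern,
        "destination": {
            "container": {"containerUrl": container_url, "path": path}
        },
        "uploadOptions": {"uploadCondition": condition},
    }


def build_task(task_id: str, command_line: str, sas_url: str) -> dict:
    """Build a task body that uploads its logs and output.txt on completion."""
    return {
        "id": task_id,
        "commandLine": command_line,
        "outputFiles": [
            # Forward slashes, because this sketch targets a Linux node.
            output_file("../std*.txt", sas_url, task_id),
            output_file("output.txt", sas_url, f"{task_id}/output.txt"),
        ],
    }


task = build_task(
    "task1",
    "/bin/sh -c 'seq 1 10 > output.txt'",
    "https://myaccount.blob.core.windows.net/mycontainer?sig=EXAMPLE",
)
print(task["outputFiles"][1]["destination"]["container"]["path"])
```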

Using Managed Identity


Instead of generating a SAS with write access to the container and passing it to Batch, you can
use a managed identity to authenticate with Azure Storage. The identity must be assigned to the
Batch pool and must also have the Storage Blob Data Contributor role assignment for the
container to be written to. You can then configure the Batch service to use the managed
identity instead of a SAS to authenticate access to the container.

C#

CloudBlobContainer container =
    storageAccount.CreateCloudBlobClient().GetContainerReference(containerName);
await container.CreateIfNotExistsAsync();

CloudTask task = new CloudTask(taskId,
    "cmd /v:ON /c \"echo off && set && (FOR /L %i IN (1,1,100000) DO (ECHO !RANDOM!)) > output.txt\"")
{
    OutputFiles = new List<OutputFile>
    {
        new OutputFile(
            filePattern: @"..\std*.txt",
            destination: new OutputFileDestination(
                new OutputFileBlobContainerDestination(
                    // No SAS is appended; the managed identity authenticates the upload.
                    containerUrl: container.Uri.AbsoluteUri,
                    path: taskId,
                    identityReference: new ComputeNodeIdentityReference
                    {
                        ResourceId = "/subscriptions/SUB/resourceGroups/RG/providers/Microsoft.ManagedIdentity/userAssignedIdentities/identity-name"
                    })),
            uploadOptions: new OutputFileUploadOptions(
                uploadCondition: OutputFileUploadCondition.TaskCompletion))
    }
};

Specify a file pattern for matching


When you specify an output file, you can use the OutputFile.FilePattern property to specify a
file pattern for matching. The file pattern may match zero files, a single file, or a set of files that
are created by the task.

The FilePattern property supports standard filesystem wildcards such as * (for non-recursive
matches) and ** (for recursive matches). For example, the code sample above specifies the file
pattern to match std*.txt non-recursively:

filePattern: @"..\std*.txt"

To upload a single file, specify a file pattern with no wildcards. For example, the code sample
above specifies the file pattern to match output.txt :

filePattern: @"output.txt"
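
The difference between non-recursive and recursive wildcards can be explored locally. The sketch below uses Python's pathlib globbing purely as an analogy; it is not the matcher that Batch uses, and the file layout and the match_names helper are made up for the demonstration.

```python
import tempfile
from pathlib import Path


def match_names(root: Path, pattern: str) -> list:
    """Return the sorted relative paths of files under root that match pattern."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "logs" / "deep").mkdir(parents=True)
    for name in ("stdout.txt", "logs/a.txt", "logs/deep/b.txt"):
        (root / name).write_text("x")

    flat = match_names(root, "*.txt")          # top level only
    recursive = match_names(root, "**/*.txt")  # top level and all subdirectories

print(flat)
print(recursive)
```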

Specify an upload condition


The OutputFileUploadOptions.UploadCondition property permits conditional uploading of
output files. A common scenario is to upload one set of files if the task succeeds, and a
different set of files if it fails. For example, you may want to upload verbose log files only when
the task fails and exits with a nonzero exit code. Similarly, you may want to upload result files
only if the task succeeds, as those files may be missing or incomplete if the task fails.

The code sample above sets the UploadCondition property to TaskCompletion. This setting
specifies that the file is to be uploaded after the task completes, regardless of the value of the
exit code.

uploadCondition: OutputFileUploadCondition.TaskCompletion

For other settings, see the OutputFileUploadCondition enum.
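
The success-versus-failure scenario described above could be expressed as three output file entries, sketched here as REST-style dicts. The condition values tasksuccess, taskfailure, and taskcompletion are assumed from the REST API; the file patterns, blob paths, and helper name are hypothetical.

```python
def conditional_outputs(container_url: str, task_id: str) -> list:
    """Results on success, verbose logs on failure, stdout/stderr always."""
    def entry(pattern: str, path: str, condition: str) -> dict:
        return {
            "filePattern": pattern,
            "destination": {
                "container": {"containerUrl": container_url, "path": path}
            },
            "uploadOptions": {"uploadCondition": condition},
        }

    return [
        entry("results/*.csv", f"{task_id}/results", "tasksuccess"),
        entry("logs/*.log", f"{task_id}/logs", "taskfailure"),
        entry("../std*.txt", task_id, "taskcompletion"),
    ]


outputs = conditional_outputs(
    "https://myaccount.blob.core.windows.net/mycontainer?sig=EXAMPLE", "task1"
)
for item in outputs:
    print(item["filePattern"], "->", item["uploadOptions"]["uploadCondition"])
```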

Disambiguate files with the same name


The tasks in a job may produce files that have the same name. For example, stdout.txt and
stderr.txt are created for every task that runs in a job. Because each task runs in its own
context, these files don't conflict on the node's file system. However, when you upload files
from multiple tasks to a shared container, you'll need to disambiguate files with the same
name.

The OutputFileBlobContainerDestination.Path property specifies the destination blob or virtual
directory for output files. You can use the Path property to name the blob or virtual directory in
such a way that output files with the same name are uniquely named in Azure Storage. Using
the task ID in the path is a good way to ensure unique names and easily identify files.

If the FilePattern property is set to a wildcard expression, then all files that match the pattern
are uploaded to the virtual directory specified by the Path property. For example, if the
container is mycontainer , the task ID is mytask , and the file pattern is ..\std*.txt , then the
absolute URIs to the output files in Azure Storage will be similar to:

https://myaccount.blob.core.windows.net/mycontainer/mytask/stderr.txt
https://myaccount.blob.core.windows.net/mycontainer/mytask/stdout.txt

If the FilePattern property is set to match a single file name, meaning it does not contain any
wildcard characters, then the value of the Path property specifies the fully qualified blob name.
If you anticipate naming conflicts with a single file from multiple tasks, then include the name
of the virtual directory as part of the file name to disambiguate those files. For example, set the
Path property to include the task ID, the delimiter character (typically a forward slash), and the
file name:

path: taskId + @"/output.txt"

The absolute URIs to the output files for a set of tasks will be similar to:

https://myaccount.blob.core.windows.net/mycontainer/task1/output.txt
https://myaccount.blob.core.windows.net/mycontainer/task2/output.txt

For more information about virtual directories in Azure Storage, see List the blobs in a
container.
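
The two path rules can be summarized in a few lines of Python. This sketch only models the naming described above: with a wildcard pattern, the path acts as a virtual directory and each matched file name is appended; with a single-file pattern, the path is the full blob name. The output_blob_url helper is hypothetical.

```python
from typing import Optional


def output_blob_url(container_url: str, path: str,
                    file_name: Optional[str] = None) -> str:
    """Model where an output file lands in the container.

    Pass file_name for wildcard patterns (path is a virtual directory);
    omit it for single-file patterns (path is the full blob name).
    """
    blob = path if file_name is None else f"{path}/{file_name}"
    return f"{container_url.rstrip('/')}/{blob}"


base = "https://myaccount.blob.core.windows.net/mycontainer"
print(output_blob_url(base, "mytask", "stderr.txt"))
print(output_blob_url(base, "task1/output.txt"))
```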

Many Output Files


When a task specifies numerous output files, you may encounter limits imposed by the Azure
Batch API. It is advisable to keep your tasks small and keep the number of output files low.

If you encounter limits, consider reducing the number of output files by using file
patterns, or by consolidating the output files into an archive such as tar or zip. Alternatively,
use mounting or other approaches to persist output data (see Persist job and task output).
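
One way to consolidate many files is to have the task bundle them into a single archive before it exits, and then list only that archive as an output file. The following sketch uses Python's standard tarfile module; the bundle_outputs helper, the directory layout, and the file names are hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def bundle_outputs(source_dir: Path, archive_path: Path) -> int:
    """Pack every file under source_dir into one gzip-compressed tar archive."""
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in files:
            # Store paths relative to source_dir so the archive is portable.
            archive.add(path, arcname=path.relative_to(source_dir).as_posix())
    return len(files)


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "wd"
    out_dir.mkdir()
    for i in range(3):
        (out_dir / f"part-{i}.txt").write_text(str(i))

    archive_file = Path(tmp) / "outputs.tar.gz"
    count = bundle_outputs(out_dir, archive_file)
    with tarfile.open(archive_file) as tar:
        names = sorted(tar.getnames())

print(count, names)
```

The task would then need a single output file entry whose file pattern names the archive, instead of one entry per file.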

Diagnose file upload errors


If uploading output files to Azure Storage fails, then the task moves to the Completed state
and the TaskExecutionInformation.FailureInformation property is set. Examine the
FailureInformation property to determine what error occurred. For example, here is an error
that occurs on file upload if the container cannot be found:

Category: UserError
Code: FileUploadContainerNotFound
Message: One of the specified Azure container(s) was not found while attempting to
upload an output file

On every file upload, Batch writes two log files to the compute node, fileuploadout.txt and
fileuploaderr.txt. You can examine these log files to learn more about a specific failure. If the
file upload was never attempted, for example because the task itself couldn't run, these log
files don't exist.
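
When checking many tasks, it can help to reduce each task's execution information to a one-line summary. The sketch below assumes the information has already been fetched and deserialized into a dict with result and failureInfo keys, as in the REST API; describe_failure is a hypothetical helper and the sample reuses the error shown above.

```python
from typing import Optional


def describe_failure(execution_info: dict) -> Optional[str]:
    """Return 'category/code: message' for a failed task, or None otherwise."""
    if (execution_info.get("result") or "").lower() != "failure":
        return None
    info = execution_info.get("failureInfo") or {}
    return f"{info.get('category')}/{info.get('code')}: {info.get('message')}"


sample = {
    "result": "failure",
    "failureInfo": {
        "category": "UserError",
        "code": "FileUploadContainerNotFound",
        "message": "One of the specified Azure container(s) was not found "
                   "while attempting to upload an output file",
    },
}
print(describe_failure(sample))
```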

Diagnose file upload performance


The fileuploadout.txt file logs upload progress. You can examine this file to learn more about
how long your file uploads are taking. Keep in mind that there are many contributing factors to
upload performance, including the size of the node, other activity on the node at the time of
the upload, whether the target container is in the same region as the Batch pool, how many
nodes are uploading to the storage account at the same time, and so on.

Use the Batch service API with the Batch File Conventions standard
When you persist task output with the Batch service API, you can name your destination
container and blobs however you like. You can also choose to name them according to the
Batch File Conventions standard. The File Conventions standard determines the names of the
destination container and blob in Azure Storage for a given output file based on the names of
the job and task. If you do use the File Conventions standard for naming output files, then your
output files are available for viewing in the Azure portal.

If you are developing in C#, you can use the methods built into the Batch File Conventions
library for .NET. This library creates the properly named containers and blob paths for you.
For example, you can call the API to get the correct name for the container, based on the job
name:

C#

string containerName = job.OutputStorageContainerName();

You can use the CloudJobExtensions.GetOutputStorageContainerUrl method to return a shared
access signature (SAS) URL that is used to write to the container. You can then pass this SAS to
the OutputFileBlobContainerDestination constructor.

If you are developing in a language other than C#, you will need to implement the File
Conventions standard yourself.

Code sample
The PersistOutputs sample project is one of the Azure Batch code samples on GitHub. This
Visual Studio solution demonstrates how to use the Batch client library for .NET to persist task
output to durable storage. To run the sample, follow these steps:

1. Open the project in Visual Studio 2019.


2. Add your Batch and Storage account credentials to AccountSettings.settings in the
Microsoft.Azure.Batch.Samples.Common project.
3. Build (but do not run) the solution. Restore any NuGet packages if prompted.
4. Use the Azure portal to upload an application package for PersistOutputsTask. Include
the PersistOutputsTask.exe and its dependent assemblies in the .zip package, set the
application ID to "PersistOutputsTask", and the application package version to "1.0".
5. Start (run) the PersistOutputs project.
6. When prompted to choose the persistence technology to use for running the sample,
enter 2 to run the sample using the Batch service API to persist task output.
7. If desired, run the sample again, entering 3 to persist output with the Batch service API,
and also to name the destination container and blob path according to the File
Conventions standard.

Next steps
To learn more about persisting task output with the File Conventions library for .NET, see
Persist job and task data to Azure Storage with the Batch File Conventions library for .NET.
To learn about other approaches for persisting output data in Azure Batch, see Persist job
and task output to Azure Storage.


Persist job and task data to Azure Storage with the Batch File Conventions library for .NET
Article • 04/02/2025

A task running in Azure Batch may produce output data when it runs. Task output data
often needs to be stored for retrieval by other tasks in the job, the client application that
executed the job, or both. Tasks write output data to the file system of a Batch compute
node, but all data on the node is lost when it is reimaged or when the node leaves the
pool. Tasks may also have a file retention period, after which files created by the task are
deleted. For these reasons, it's important to persist task output that you'll need later to a
data store such as Azure Storage.

For storage account options in Batch, see Batch accounts and Azure Storage accounts.

You can persist task data from Azure Batch using the File Conventions library for .NET.
The File Conventions library simplifies the process of storing and retrieving task output
data in Azure Storage. You can use the File Conventions library in both task and client
code. In task mode, use the library to persist files. In client mode, use the library to list
and retrieve files. Your task code can also retrieve the output of upstream tasks using
the library, such as in a task dependencies scenario.

To retrieve output files with the File Conventions library, locate the files for a job or task.
You don't need to know the names or locations of the files. Instead, you can list the files
by ID and purpose. For example, list all intermediate files for a given task. Or, get a
preview file for a given job.

Starting with version 2017-05-01, the Batch service API supports persisting output data
to Azure Storage for tasks and job manager tasks that run on pools created with the
virtual machine (VM) configuration. You can persist output from within the code that
creates a task. This method is an alternative to the File Conventions library. You can
modify your Batch client applications to persist output without needing to update the
application that your task is running. For more information, see Persist task data to
Azure Storage with the Batch service API.

Library use cases


Azure Batch provides multiple ways to persist task output. Use the File Conventions
library when you want to:

Modify the code for the application that your task is running to persist files.
Stream data to Azure Storage while the task is still running.
Persist data from pools.
Locate and download task output files by ID or purpose in your client application
or other tasks.
View task output in the Azure portal.

For other scenarios, you might want to consider a different approach. For more
information on other options, see Persist job and task output to Azure Storage.

What is the Batch File Conventions standard?


The Batch File Conventions standard provides a naming scheme for the destination
containers and blob paths to which your output files are written. Files persisted to Azure
Storage that follow the standard are automatically viewable in the Azure portal.

The File Conventions library for .NET automatically names your storage containers and
task output files according to the standard. The library also provides methods to query
output files in Azure Storage. You can query by job ID, task ID, or purpose.

If you're developing with a language other than .NET, you can implement the File
Conventions standard yourself in your application. For more information, see Implement
the Batch File Conventions standard.
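
For readers implementing the standard in another language, the following Python sketch shows the general shape of such a naming scheme. The layout it uses (a `job-` container prefix, task outputs under `{task-id}/${kind}/{filename}`, job outputs under `${kind}/{filename}`) is an assumption drawn from the standard's README and is not stated in this article, so verify it against the README before relying on it. All function names are hypothetical, and the container-name helper only covers job IDs that are already valid container-name fragments.

```python
# Sketch of File Conventions-style names. The layout is an ASSUMPTION
# based on the standard's README; it is not the library's implementation.


def simple_container_name(job_id: str) -> str:
    """Container name for a job whose ID is already a valid name fragment.

    Assumption: the name is 'job-' plus the lower-cased job ID. The standard
    defines extra normalization rules for other IDs; they are not covered here.
    """
    return "job-" + job_id.lower()


def task_output_blob_path(task_id: str, kind: str, filename: str) -> str:
    """Blob path for a task output: {task-id}/${kind}/{filename}."""
    return f"{task_id}/${kind}/{filename}"


def job_output_blob_path(kind: str, filename: str) -> str:
    """Blob path for a job output: ${kind}/{filename}."""
    return f"${kind}/{filename}"


def filter_task_outputs(blob_names, task_id: str, kind: str):
    """Select the blobs of one kind for one task by matching the path prefix."""
    prefix = f"{task_id}/${kind}/"
    return [name for name in blob_names if name.startswith(prefix)]


if __name__ == "__main__":
    blobs = [
        task_output_blob_path("task109", "TaskOutput", "frame_full_res.jpg"),
        task_output_blob_path("task109", "TaskPreview", "frame_low_res.jpg"),
        job_output_blob_path("JobOutput", "mymovie.mp4"),
    ]
    print(simple_container_name("myJob"))
    print(filter_task_outputs(blobs, "task109", "TaskPreview"))
```

Because the purpose is encoded in the path, a query by task ID and kind reduces to a prefix match, which is why a client can list outputs without knowing file names.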

Link an Azure Storage account


To persist output data to Azure Storage using the File Conventions library, first link an
Azure Storage account to your Batch account.

1. Sign in to the Azure portal.
2. Search for and select Batch in the search bar.
3. Select the Batch account to link with Azure Storage.
4. On the Batch account page, under Settings, select Storage Account.
5. If you don't already have an Azure Storage account associated with your Batch
account, select Storage Account (None).
6. Select the Azure Storage account to use. For best performance, use an account in
the same region as the Batch account.

Persist output data


You can persist job and task output data with the File Conventions library. First, create a
container in Azure Storage. Then, save the output to the container. Use the Azure
Storage client library for .NET in your task code to upload the task output to the
container.

For more information about working with containers and blobs in Azure Storage, see
Get started with Azure Blob storage using .NET.

All job and task outputs persisted with the File Conventions library are stored in the
same container. If a large number of tasks try to persist files at the same time, Azure
Storage throttling limits might be enforced. For more information, see Performance and
scalability checklist for Blob storage.

Create storage container


To persist task output to Azure Storage, first create a container by calling
CloudJob.PrepareOutputStorageAsync. This extension method takes a
CloudStorageAccount object as a parameter. The method creates a container named
according to the File Conventions standard. The container's contents are discoverable by
the Azure portal and the retrieval methods described in this article.

Typically, create a container in your client application, which creates your pools, jobs,
and tasks. For example:

C#

CloudJob job = batchClient.JobOperations.CreateJob(
    "myJob",
    new PoolInformation { PoolId = "myPool" });

// Create reference to the linked Azure Storage account
CloudStorageAccount linkedStorageAccount =
    new CloudStorageAccount(myCredentials, true);

// Create the blob storage container for the outputs
await job.PrepareOutputStorageAsync(linkedStorageAccount);

Store task outputs


After creating your storage container, tasks can save output to the container using
TaskOutputStorage. This class is available in the File Conventions library.

In your task code, create a TaskOutputStorage object. When the task completes its work,
call the TaskOutputStorage.SaveAsync method. This step saves the output to Azure
Storage.

C#

// Create reference to the linked Azure Storage account.
// Pass true to use HTTPS, matching the client-side example.
CloudStorageAccount linkedStorageAccount =
    new CloudStorageAccount(myCredentials, true);

// Batch sets these environment variables on the compute node for each task
string jobId = Environment.GetEnvironmentVariable("AZ_BATCH_JOB_ID");
string taskId = Environment.GetEnvironmentVariable("AZ_BATCH_TASK_ID");

TaskOutputStorage taskOutputStorage = new TaskOutputStorage(
    linkedStorageAccount, jobId, taskId);

/* Code to process data and produce output file(s) */

// Persist each file under the category that describes its purpose
await taskOutputStorage.SaveAsync(TaskOutputKind.TaskOutput,
    "frame_full_res.jpg");
await taskOutputStorage.SaveAsync(TaskOutputKind.TaskPreview,
    "frame_low_res.jpg");

The kind parameter of the TaskOutputStorage.SaveAsync method categorizes the
persisted files. There are four predefined TaskOutputKind types: TaskOutput,
TaskPreview, TaskLog, and TaskIntermediate. You can also define custom categories of
output.

These output types let you specify which type of outputs to list when you later query
Batch for a task's persisted outputs. When you list the outputs for a task, you can filter
on one of the output types. For example, you can request only the preview output for
task 109. For more information, see Retrieve output data.

The output type also determines where an output file appears in the Azure portal. Files
in the category TaskOutput are under Task output files. Files in the category TaskLog
are under Task logs.

Store job outputs


You can also store the outputs associated with an entire job. For example, in the merge
task of a movie-rendering job, you can persist the fully rendered movie as a job output.
When your job completes, your client application can list and retrieve the outputs for
the job. Your client application doesn't have to query the individual tasks.

Store job output by calling the JobOutputStorage.SaveAsync method. Specify the
JobOutputKind and filename. For example:

C#

// Create the job output storage from the linked account and the job ID.
// In client code that holds a CloudJob object, calling
// job.OutputStorage(linkedStorageAccount) returns the same kind of object.
JobOutputStorage jobOutputStorage =
    new JobOutputStorage(linkedStorageAccount, jobId);
await jobOutputStorage.SaveAsync(JobOutputKind.JobOutput, "mymovie.mp4");
await jobOutputStorage.SaveAsync(JobOutputKind.JobPreview,
    "mymovie_preview.mp4");

As with the TaskOutputKind type for task outputs, use the JobOutputKind type to
categorize a job's persisted files. Later, you can list a specific type of output. The
JobOutputKind type includes both output and preview categories. The type also
supports creating custom categories.

Store task logs


You might also need to persist files that are updated during the execution of a task. For
example, you might need to persist log files, or stdout.txt and stderr.txt. The File
Conventions library provides the TaskOutputStorage.SaveTrackedAsync method to
persist these kinds of files. Track updates to a file on the node at a specified interval with
SaveTrackedAsync. Then, persist those updates to Azure Storage.

The following example uses SaveTrackedAsync to update stdout.txt in Azure Storage
every 15 seconds during the execution of the task:

C#

TimeSpan stdoutFlushDelay = TimeSpan.FromSeconds(3);

string logFilePath = Path.Combine(
    Environment.GetEnvironmentVariable("AZ_BATCH_TASK_DIR"), "stdout.txt");

// The primary task logic is wrapped in a using statement that sends updates to
// the stdout.txt blob in Storage every 15 seconds while the task code runs.
using (ITrackedSaveOperation stdout =
    await taskStorage.SaveTrackedAsync(
        TaskOutputKind.TaskLog,
        logFilePath,
        "stdout.txt",
        TimeSpan.FromSeconds(15)))
{
    /* Code to process data and produce output file(s) */

    // We are tracking the disk file to save our standard output, but the
    // node agent may take up to 3 seconds to flush the stdout stream to
    // disk. So give the file a moment to catch up.
    await Task.Delay(stdoutFlushDelay);
}

Replace the commented section Code to process data and produce output file(s) with
the code your task normally runs. For example, you might have code that
downloads data from Azure Storage, then performs transformations or calculations. You
can wrap this code in a using block to periodically update a file with SaveTrackedAsync.

The node agent is a program that runs on each node in the pool. This program provides
the command-and-control interface between the node and the Batch service. The
Task.Delay call is required at the end of this using block. The call makes sure that the
node agent has time to flush the contents of standard output to the stdout.txt file on
the node. Without this delay, it's possible to miss the last few seconds of output. You
might not need this delay for all files.

When you enable file tracking with SaveTrackedAsync, only appends to the tracked file
are persisted to Azure Storage. Only use this method for tracking non-rotating log files,
or other files that are written to with append operations to the end of the file.
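
The append-only restriction is easier to see in a small model. The following Python sketch is not the library's implementation; it is a hypothetical tracker (the `AppendTracker` name and its methods are invented for illustration) that remembers how many bytes it has already read and forwards only what comes after that offset. Under that assumption, appended lines are picked up, while an in-place rewrite of earlier content goes unnoticed.

```python
import os
import tempfile


class AppendTracker:
    """Hypothetical model of append-only file tracking (not the Batch library)."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.offset = 0  # number of bytes already forwarded

    def read_new(self) -> bytes:
        """Return only the bytes written after the last call."""
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            chunk = handle.read()
        self.offset += len(chunk)
        return chunk


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "stdout.txt")
    with open(path, "wb") as log:
        log.write(b"line 1\n")

    tracker = AppendTracker(path)
    print(tracker.read_new())  # first flush sees the whole file so far

    with open(path, "ab") as log:
        log.write(b"line 2\n")
    print(tracker.read_new())  # only the appended line

    with open(path, "r+b") as log:
        log.write(b"LINE")  # overwrite earlier bytes in place
    print(tracker.read_new())  # nothing new past the offset, so the edit is missed
```

A rotating log file behaves like the last case: once the file is truncated or replaced, an offset-based tracker no longer lines up with the content.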

Retrieve output data


To retrieve output files for a specific task or job, you don't need to know the path in
Azure Storage, or file names. Instead, you can request output files by task or job ID.

The following example code iterates through a job's tasks. Next, the code prints some
information about the output files for the task. Then, the code downloads the files from
Azure Storage.

C#

foreach (CloudTask task in myJob.ListTasks())
{
    foreach (OutputFileReference output in
        task.OutputStorage(storageAccount).ListOutputs(
            TaskOutputKind.TaskOutput))
    {
        Console.WriteLine($"output file: {output.FilePath}");

        output.DownloadToFileAsync(
            $"{jobId}-{output.FilePath}",
            System.IO.FileMode.Create).Wait();
    }
}

View output files in the Azure portal


If your task output files use the Batch File Conventions standard, you can view the files
in the Azure portal.

For output files to automatically display in the Azure portal, you must:

1. Link an Azure Storage account to your Batch account.
2. Follow the predefined naming conventions for Azure Storage containers and files.
Review the README for all definitions. If you use the File Conventions library
to persist your output, your files are persisted according to the File Conventions
standard.

To view task output files and logs in the Azure portal:

1. Sign in to the Azure portal.
2. Go to the task for which you want to view output.
3. Select either Saved output files or Saved logs.

Code sample
The PersistOutputs sample project is one of the Azure Batch code samples on
GitHub. This Visual Studio solution shows how to use the Azure Batch File Conventions
library to persist task output to durable storage. To run the sample, follow these steps:

1. Open the project in Visual Studio 2019.
2. Add your Batch and Azure Storage account credentials to
AccountSettings.settings in the Microsoft.Azure.Batch.Samples.Common project.
3. Build the solution. Don't run the solution yet.
4. If prompted, restore any NuGet packages.
5. Upload an application package for PersistOutputsTask through the Azure portal.
a. Include the PersistOutputsTask.exe executable and its dependent assemblies in
the .zip package.
b. Set the application ID to PersistOutputsTask.
c. Set the application package version to 1.0.
6. Select Start to run the project.
7. When prompted to select the persistence technology to use, enter 1. This option
runs the sample using the File Conventions library to persist task output.

Get the Batch File Conventions library for .NET


The Batch File Conventions library for .NET is available on NuGet. The library extends
the CloudJob and CloudTask classes with new methods. For more information, see the
File Conventions library reference documentation.

The File Conventions library source code is available on GitHub.

Next steps
Persist job and task output to Azure Storage
Persist task data to Azure Storage with the Batch service API



Monitor Azure Batch
Article • 07/19/2024

This article describes:

The types of monitoring data you can collect for this service.
Ways to analyze that data.

Note

If you're already familiar with this service and/or Azure Monitor and just want to
know how to analyze monitoring data, see the Analyze section near the end of this
article.

When you have critical applications and business processes that rely on Azure resources,
you need to monitor and get alerts for your system. The Azure Monitor service collects
and aggregates metrics and logs from every component of your system. Azure Monitor
provides you with a view of availability, performance, and resilience, and notifies you of
issues. You can use the Azure portal, PowerShell, Azure CLI, REST API, or client libraries
to set up and view monitoring data.

For more information on Azure Monitor, see the Azure Monitor overview.
For more information on how to monitor Azure resources in general, see Monitor
Azure resources with Azure Monitor.

Resource types
Azure uses the concept of resource types and IDs to identify everything in a
subscription. Azure Monitor similarly organizes core monitoring data into metrics and
logs based on resource types, also called namespaces. Different metrics and logs are
available for different resource types. Your service might be associated with more than
one resource type.

Resource types are also part of the resource IDs for every resource running in Azure. For
example, one resource type for a virtual machine is Microsoft.Compute/virtualMachines.
For a list of services and their associated resource types, see Resource providers.

For more information about the resource types for Batch, see Batch monitoring data
reference.
Data storage
For Azure Monitor:

Metrics data is stored in the Azure Monitor metrics database.


Log data is stored in the Azure Monitor logs store. Log Analytics is a tool in the
Azure portal that can query this store.
The Azure activity log is a separate store with its own interface in the Azure portal.

You can optionally route metric and activity log data to the Azure Monitor logs store.
You can then use Log Analytics to query the data and correlate it with other log data.

Many services can use diagnostic settings to send metric and log data to other storage
locations outside Azure Monitor. Examples include Azure Storage, hosted partner
systems, and non-Azure partner systems, by using Event Hubs.

For detailed information on how Azure Monitor stores data, see Azure Monitor data
platform.

Access diagnostics logs in storage


If you archive Batch diagnostic logs in a storage account, a storage container is created
in the storage account as soon as a related event occurs. Blobs are created according to
the following naming pattern:

JSON

insights-{log category name}/resourceId=/SUBSCRIPTIONS/{subscription ID}/
RESOURCEGROUPS/{resource group name}/PROVIDERS/MICROSOFT.BATCH/
BATCHACCOUNTS/{Batch account name}/y={four-digit numeric year}/
m={two-digit numeric month}/d={two-digit numeric day}/
h={two-digit 24-hour clock hour}/m=00/PT1H.json

For example:

JSON

insights-metrics-pt1m/resourceId=/SUBSCRIPTIONS/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/
RESOURCEGROUPS/MYRESOURCEGROUP/PROVIDERS/MICROSOFT.BATCH/
BATCHACCOUNTS/MYBATCHACCOUNT/y=2018/m=03/d=05/h=22/m=00/PT1H.json

Each PT1H.json blob file contains JSON-formatted events that occurred within the hour
specified in the blob URL (for example, h=12). During the current hour, events are
appended to the PT1H.json file as they occur. The minute value (m=00) is always 00,
since diagnostic log events are broken into individual blobs per hour. All times are in
UTC.
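
To locate the blob for a particular hour, you can build the path from the pattern above. The following Python sketch is one way to do that under stated assumptions: the function name and parameters are hypothetical, and upper-casing the subscription, resource group, and account segments simply mirrors the example path rather than a documented rule.

```python
from datetime import datetime, timezone


def diagnostic_blob_path(category: str, subscription_id: str,
                         resource_group: str, account: str,
                         hour: datetime) -> str:
    """Build the hourly PT1H.json path for a Batch account.

    `hour` must be timezone-aware; it is converted to UTC because the
    path segments are expressed in UTC.
    """
    if hour.tzinfo is None:
        raise ValueError("hour must be timezone-aware")
    utc = hour.astimezone(timezone.utc)
    return (
        f"insights-{category}/resourceId=/SUBSCRIPTIONS/{subscription_id.upper()}/"
        f"RESOURCEGROUPS/{resource_group.upper()}/PROVIDERS/MICROSOFT.BATCH/"
        f"BATCHACCOUNTS/{account.upper()}/"
        f"y={utc:%Y}/m={utc:%m}/d={utc:%d}/h={utc:%H}/m=00/PT1H.json"
    )


if __name__ == "__main__":
    print(diagnostic_blob_path(
        "metrics-pt1m",
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
        "myresourcegroup",
        "mybatchaccount",
        datetime(2018, 3, 5, 22, 0, tzinfo=timezone.utc),
    ))
```

With the inputs shown, the result matches the example path for 22:00 UTC on March 5, 2018.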

The following example shows a PoolResizeCompleteEvent entry in a PT1H.json log file.
The entry includes information about the current and target number of dedicated and
low-priority nodes and the start and end time of the operation.

JSON

{
  "Tenant": "65298bc2729a4c93b11c00ad7e660501",
  "time": "2019-08-22T20:59:13.5698778Z",
  "resourceId": "/SUBSCRIPTIONS/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/RESOURCEGROUPS/MYRESOURCEGROUP/PROVIDERS/MICROSOFT.BATCH/BATCHACCOUNTS/MYBATCHACCOUNT/",
  "category": "ServiceLog",
  "operationName": "PoolResizeCompleteEvent",
  "operationVersion": "2017-06-01",
  "properties": {
    "id": "MYPOOLID",
    "nodeDeallocationOption": "Requeue",
    "currentDedicatedNodes": 10,
    "targetDedicatedNodes": 100,
    "currentLowPriorityNodes": 0,
    "targetLowPriorityNodes": 0,
    "enableAutoScale": false,
    "isAutoPool": false,
    "startTime": "2019-08-22 20:50:59.522",
    "endTime": "2019-08-22 20:59:12.489",
    "resultCode": "Success",
    "resultMessage": "The operation succeeded"
  }
}

To access the logs in your storage account programmatically, use the Storage APIs.
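
Once you've downloaded a log entry, you can process it with any JSON parser. As a rough illustration, the following Python sketch reads one PoolResizeCompleteEvent entry and computes how long the resize took. The function name is hypothetical, the sample is a trimmed copy of the entry shown earlier, and the timestamp format is assumed from that one sample.

```python
import json
from datetime import datetime

# Trimmed copy of the sample entry; only the fields used below are kept.
SAMPLE_ENTRY = """
{
  "operationName": "PoolResizeCompleteEvent",
  "properties": {
    "id": "MYPOOLID",
    "currentDedicatedNodes": 10,
    "targetDedicatedNodes": 100,
    "startTime": "2019-08-22 20:50:59.522",
    "endTime": "2019-08-22 20:59:12.489",
    "resultCode": "Success"
  }
}
"""

# Assumed from the sample: date, space, time with fractional seconds.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def resize_summary(entry_json: str):
    """Return pool ID, result code, and duration for a resize event, else None."""
    entry = json.loads(entry_json)
    if entry.get("operationName") != "PoolResizeCompleteEvent":
        return None
    props = entry["properties"]
    start = datetime.strptime(props["startTime"], TIME_FORMAT)
    end = datetime.strptime(props["endTime"], TIME_FORMAT)
    return {
        "pool": props["id"],
        "result": props["resultCode"],
        "seconds": (end - start).total_seconds(),
    }


if __name__ == "__main__":
    print(resize_summary(SAMPLE_ENTRY))
```

For the sample entry, the computed duration is a little over eight minutes.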

Azure Monitor platform metrics


Azure Monitor provides platform metrics for most services. These metrics are:

Individually defined for each namespace.


Stored in the Azure Monitor time-series metrics database.
Lightweight and capable of supporting near real-time alerting.
Used to track the performance of a resource over time.

Collection: Azure Monitor collects platform metrics automatically. No configuration is
required.

Routing: You can also usually route platform metrics to Azure Monitor Logs / Log
Analytics so you can query them with other log data. For more information, see the
Metrics diagnostic setting. For how to configure diagnostic settings for a service, see
Create diagnostic settings in Azure Monitor.

For a list of all metrics it's possible to gather for all resources in Azure Monitor, see
Supported metrics in Azure Monitor.
Examples of metrics in a Batch account are Pool Create Events, Low-Priority Node Count,
and Task Complete Events. These metrics can help identify trends and can be used for
data analysis.

Note

Metrics emitted in the last 3 minutes might still be aggregating, so values might be
underreported during this time frame. Metric delivery isn't guaranteed and might
be affected by out-of-order delivery, data loss, or duplication.

For a complete list of available metrics for Batch, see Batch monitoring data reference.
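
If you analyze exported metric values yourself, one way to account for the aggregation delay described in the note is to ignore data points from the last 3 minutes. The following Python sketch shows the idea. The function name and the `(timestamp, value)` tuple layout are assumptions for illustration, not part of any Azure SDK.

```python
from datetime import datetime, timedelta, timezone


def settled_points(points, now, settle=timedelta(minutes=3)):
    """Keep only data points old enough to have finished aggregating.

    `points` is an iterable of (timestamp, value) tuples with timezone-aware
    timestamps; anything newer than `now - settle` is dropped.
    """
    cutoff = now - settle
    return [(ts, value) for ts, value in points if ts <= cutoff]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    samples = [
        (now - timedelta(minutes=5), 10),
        (now - timedelta(minutes=3), 10),
        (now - timedelta(minutes=2), 7),  # may still be aggregating
        (now - timedelta(minutes=1), 4),  # may still be aggregating
    ]
    for ts, value in settled_points(samples, now):
        print(ts.isoformat(), value)
```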

Azure Monitor resource logs


Resource logs provide insight into operations that were done by an Azure resource.
Logs are generated automatically, but you must route them to Azure Monitor logs to
save or query them. Logs are organized in categories. A given namespace might have
multiple resource log categories.

Collection: Resource logs aren't collected and stored until you create a diagnostic setting
and route the logs to one or more locations. When you create a diagnostic setting, you
specify which categories of logs to collect. There are multiple ways to create and
maintain diagnostic settings, including the Azure portal, programmatically, and through
Azure Policy.

Routing: The suggested default is to route resource logs to Azure Monitor Logs so you
can query them with other log data. Other locations such as Azure Storage, Azure Event
Hubs, and certain Microsoft monitoring partners are also available. For more
information, see Azure resource logs and Resource log destinations.

For detailed information about collecting, storing, and routing resource logs, see
Diagnostic settings in Azure Monitor.

For a list of all available resource log categories in Azure Monitor, see Supported
resource logs in Azure Monitor.

All resource logs in Azure Monitor have the same header fields, followed by service-
specific fields. The common schema is outlined in Azure Monitor resource log schema.

For the available resource log categories, their associated Log Analytics tables, and the
logs schemas for Batch, see Batch monitoring data reference.
You must explicitly enable diagnostic settings for each Batch account you want to
monitor.

For the Batch service, you can collect the following logs:

ServiceLog: Events emitted by the Batch service during the lifetime of an individual
resource such as a pool or task.
AllMetrics: Metrics at the Batch account level.

The following screenshot shows an example diagnostic setting that sends allLogs and
AllMetrics to a Log Analytics workspace.

When you create an Azure Batch pool, you can install any of the following monitoring-
related extensions on the compute nodes to collect and analyze data:

Azure Monitor agent for Linux


Azure Monitor agent for Windows
Azure Diagnostics extension for Windows VMs
Azure Monitor Logs analytics and monitoring extension for Linux
Azure Monitor Logs analytics and monitoring extension for Windows

For a comparison of the different extensions and agents and the data they collect, see
Compare agents.

Azure activity log


The activity log contains subscription-level events that track operations for each Azure
resource as seen from outside that resource; for example, creating a new resource or
starting a virtual machine.

Collection: Activity log events are automatically generated and collected in a separate
store for viewing in the Azure portal.

Routing: You can send activity log data to Azure Monitor Logs so you can analyze it
alongside other log data. Other locations such as Azure Storage, Azure Event Hubs, and
certain Microsoft monitoring partners are also available. For more information on how
to route the activity log, see Overview of the Azure activity log.

For Batch accounts specifically, the activity log collects events related to account
creation and deletion and key management.

Analyze monitoring data


There are many tools for analyzing monitoring data.

Azure Monitor tools


Azure Monitor supports the following basic tools:

Metrics explorer, a tool in the Azure portal that allows you to view and analyze
metrics for Azure resources. For more information, see Analyze metrics with Azure
Monitor metrics explorer.

Log Analytics, a tool in the Azure portal that allows you to query and analyze log
data by using the Kusto query language (KQL). For more information, see Get
started with log queries in Azure Monitor.

The activity log, which has a user interface in the Azure portal for viewing and basic
searches. To do more in-depth analysis, you have to route the data to Azure
Monitor logs and run more complex queries in Log Analytics.

Tools that allow more complex visualization include:

Dashboards that let you combine different kinds of data into a single pane in the
Azure portal.
Workbooks, customizable reports that you can create in the Azure portal.
Workbooks can include text, metrics, and log queries.
Grafana, an open platform tool that excels in operational dashboards. You can use
Grafana to create dashboards that include data from multiple sources other than
Azure Monitor.
Power BI, a business analytics service that provides interactive visualizations across
various data sources. You can configure Power BI to automatically import log data
from Azure Monitor to take advantage of these visualizations.

When you analyze count-based Batch metrics like Dedicated Core Count or Low-Priority
Node Count, use the Avg aggregation. For event-based metrics like Pool Resize
Complete Events, use the Count aggregation. Avoid using the Sum aggregation, which
adds up the values of all data points received over the period of the chart.
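
A small numeric example shows why the aggregation matters. The following Python sketch uses hypothetical per-minute samples. It assumes, for illustration only, that a count-based metric reports the current count at each sample and that an event-based metric contributes one data point per event.

```python
from statistics import mean

# Hypothetical samples of a count-based metric (for example, Low-Priority
# Node Count) for a pool that holds steady at 10 nodes for five minutes.
node_count_samples = [10, 10, 10, 10, 10]

# Hypothetical samples of an event-based metric (for example, Pool Resize
# Complete Events), assuming one data point per event.
resize_event_samples = [1, 1, 1]


def summarize(samples):
    """Compute the three aggregations discussed above for one chart period."""
    return {"avg": mean(samples), "sum": sum(samples), "count": len(samples)}


if __name__ == "__main__":
    nodes = summarize(node_count_samples)
    events = summarize(resize_event_samples)
    print("node count avg:", nodes["avg"])      # reflects the real pool size
    print("node count sum:", nodes["sum"])      # grows with the chart period
    print("resize events count:", events["count"])
```

For the steady pool, the average stays at 10 nodes no matter how long the chart period is, while the sum reports 50 and keeps growing as more samples arrive.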

Azure Monitor export tools


You can get data out of Azure Monitor into other tools by using the following methods:

Metrics: Use the REST API for metrics to extract metric data from the Azure
Monitor metrics database. The API supports filter expressions to refine the data
retrieved. For more information, see Azure Monitor REST API reference.

Logs: Use the REST API or the associated client libraries.

Another option is the workspace data export.

To get started with the REST API for Azure Monitor, see Azure monitoring REST API
walkthrough.

Kusto queries
You can analyze monitoring data in the Azure Monitor Logs / Log Analytics store by
using the Kusto query language (KQL).

Important

When you select Logs from the service's menu in the portal, Log Analytics opens
with the query scope set to the current service. This scope means that log queries
will only include data from that type of resource. If you want to run a query that
includes data from other Azure services, select Logs from the Azure Monitor menu.
See Log query scope and time range in Azure Monitor Log Analytics for details.

For a list of common queries for any service, see the Log Analytics queries interface.

Sample queries
Here are a few sample log queries for Batch:
Pool resizes: Lists resize times by pool and result code (success or failure):

Kusto

AzureDiagnostics
| where OperationName=="PoolResizeCompleteEvent"
| summarize operationTimes=make_list(startTime_s) by poolName=id_s,
resultCode=resultCode_s

Task durations: Gives the elapsed time of tasks in seconds, from task start to task
complete.

Kusto

AzureDiagnostics
| where OperationName=="TaskCompleteEvent"
// For longer running tasks, consider changing 'second' to 'minute' or 'hour'
| extend taskId=id_s, ElapsedTime=datetime_diff('second',
    executionInfo_endTime_t, executionInfo_startTime_t)
| summarize taskList=make_list(taskId) by ElapsedTime

Failed tasks per job: Lists failed tasks by parent job.

Kusto

AzureDiagnostics
| where OperationName=="TaskFailEvent"
| summarize failedTaskList=make_list(id_s) by jobId=jobId_s, ResourceId

Alerts
Azure Monitor alerts proactively notify you when specific conditions are found in your
monitoring data. Alerts allow you to identify and address issues in your system before
your customers notice them. For more information, see Azure Monitor alerts.

There are many sources of common alerts for Azure resources. For examples of common
alerts for Azure resources, see Sample log alert queries. The Azure Monitor Baseline
Alerts (AMBA) site provides a semi-automated method of implementing important
platform metric alerts, dashboards, and guidelines. The site applies to a continually
expanding subset of Azure services, including all services that are part of the Azure
Landing Zone (ALZ).

The common alert schema standardizes the consumption of Azure Monitor alert
notifications. For more information, see Common alert schema.
Types of alerts
You can alert on any metric or log data source in the Azure Monitor data platform. There
are many different types of alerts depending on the services you're monitoring and the
monitoring data you're collecting. Different types of alerts have various benefits and
drawbacks. For more information, see Choose the right monitoring alert type.

The following list describes the types of Azure Monitor alerts you can create:

Metric alerts evaluate resource metrics at regular intervals. Metrics can be platform
metrics, custom metrics, logs from Azure Monitor converted to metrics, or
Application Insights metrics. Metric alerts can also apply multiple conditions and
dynamic thresholds.
Log alerts allow users to use a Log Analytics query to evaluate resource logs at a
predefined frequency.
Activity log alerts trigger when a new activity log event occurs that matches
defined conditions. Resource Health alerts and Service Health alerts are activity log
alerts that report on your service and resource health.

Some Azure services also support smart detection alerts, Prometheus alerts, or
recommended alert rules.

For some services, you can monitor at scale by applying the same metric alert rule to
multiple resources of the same type that exist in the same Azure region. Individual
notifications are sent for each monitored resource. For supported Azure services and
clouds, see Monitor multiple resources with one alert rule.

Note

If you're creating or running an application that runs on your service, Azure
Monitor application insights might offer more types of alerts.

Batch alert rules


Because metric delivery can be subject to inconsistencies such as out-of-order delivery,
data loss, or duplication, you should avoid alerts that trigger on a single data point.
Instead, use thresholds to account for these inconsistencies over a period of time.

For example, you might want to configure a metric alert when your low priority core
count falls to a certain level. You could then use this alert to adjust the composition of
your pools. For best results, set a period of 10 or more minutes where the alert triggers
if the average low priority core count falls lower than the threshold value for the entire
period. This time period allows for metrics to aggregate so that you get more accurate
results.
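
The difference between a single-data-point alert and a windowed alert can be sketched in a few lines of Python. This is only a model of the evaluation logic, not how Azure Monitor implements alert rules. The function name, the per-minute sample list, and the threshold of 20 cores are hypothetical.

```python
def should_alert(samples, threshold, window=10):
    """Fire only if every one of the last `window` samples is below threshold.

    `samples` holds per-minute average core counts, oldest first. A single
    low reading, which might be a delivery glitch, doesn't trigger the alert.
    """
    if len(samples) < window:
        return False  # not enough data to cover the whole period
    return all(value < threshold for value in samples[-window:])


if __name__ == "__main__":
    one_dip = [40] * 9 + [5]           # one low data point at the end
    sustained = [40] * 3 + [5] * 10    # low for the entire 10-minute period
    print(should_alert(one_dip, threshold=20))
    print(should_alert(sustained, threshold=20))
```

The first series has one low reading and doesn't trigger the alert. The second stays below the threshold for the full period and does.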

The following table lists some alert rule triggers for Batch. These alert rules are just
examples. You can set alerts for any metric, log entry, or activity log entry listed in the
Batch monitoring data reference.

Alert type   Condition             Description

Metric       Unusable node count   Whenever the Unusable Node Count is greater than 0
Metric       Task Fail Events      Whenever the total Task Fail Events is greater than dynamic threshold

Advisor recommendations
For some services, if critical conditions or imminent changes occur during resource
operations, an alert displays on the service Overview page in the portal. You can find
more information and recommended fixes for the alert in Advisor recommendations
under Monitoring in the left menu. During normal operations, no advisor
recommendations display.

For more information on Azure Advisor, see Azure Advisor overview.

Other Batch monitoring options


Batch Explorer is a free, rich-featured, standalone client tool to help create, debug,
and monitor Azure Batch applications. You can use Azure Batch Insights with Batch
Explorer to get system statistics for your Batch nodes, such as virtual machine (VM)
performance counters.

In your Batch applications, you can use the Batch .NET library to monitor or query the
status of your resources including jobs, tasks, nodes, and pools. For example:

Monitor the task state.


Monitor the node state.
Monitor the pool state.
Monitor pool usage in the account.
Count pool nodes by state.
You can use the Batch APIs to create list queries for Batch jobs, tasks, compute nodes,
and other resources. For more information about how to filter list queries, see Create
queries to list Batch resources efficiently.

Or, instead of potentially time-consuming list queries that return detailed information
about large collections of tasks or nodes, you can use the Get Task Counts and List Pool
Node Counts operations to get counts for Batch tasks and compute nodes. For more
information, see Monitor Batch solutions by counting tasks and nodes by state.

Insights
Some services in Azure have a built-in monitoring dashboard in the Azure portal that
provides a starting point for monitoring your service. These dashboards are called
insights, and you can find them in the Insights Hub of Azure Monitor in the Azure
portal.

Application Insights
You can integrate Application Insights with your Azure Batch applications to instrument
your code with custom metrics and tracing. For a detailed walkthrough of how to add
Application Insights to a Batch .NET solution, instrument application code, monitor the
application in the Azure portal, and build custom dashboards, see Monitor and debug
an Azure Batch .NET application with Application Insights and the accompanying code
sample.

Related content
See Batch monitoring data reference for a reference of the metrics, logs, and other
important values created for Batch.
See Monitoring Azure resources with Azure Monitor for general details on
monitoring Azure resources.
Learn about the Batch APIs and tools available for building Batch solutions.



Monitor and debug an Azure Batch .NET
application with Application Insights
Article • 11/06/2024

Application Insights provides an elegant and powerful way for developers to monitor
and debug applications deployed to Azure services. Use Application Insights to monitor
performance counters and exceptions as well as instrument your code with custom
metrics and tracing. Integrating Application Insights with your Azure Batch application
allows you to gain deep insights into behaviors and investigate issues in near-real time.

This article shows how to add and configure the Application Insights library into your
Azure Batch .NET solution and instrument your application code. It also shows ways to
monitor your application via the Azure portal and build custom dashboards. For
Application Insights support in other languages, see the languages, platforms, and
integrations documentation.

A sample C# solution with code to accompany this article is available on GitHub. This
example adds Application Insights instrumentation code to the TopNWords example.
If you're not familiar with that example, try building and running TopNWords first. Doing
this will help you understand a basic Batch workflow of processing a set of input blobs
in parallel on multiple compute nodes.

Prerequisites
Visual Studio 2017 or later

Batch account and linked storage account

Application Insights resource. Use the Azure portal to create an Application
Insights resource. Select the General Application type.

Copy the instrumentation key from the Azure portal. You'll need this value later.

Note

You may be charged for data stored in Application Insights. This includes
the diagnostic and monitoring data discussed in this article.

Add Application Insights to your project


The Microsoft.ApplicationInsights.WindowsServer NuGet package and its
dependencies are required for your project. Add or restore them to your application's
project. To install the package, use the Install-Package command or NuGet Package
Manager.

PowerShell

Install-Package Microsoft.ApplicationInsights.WindowsServer

Reference Application Insights from your .NET application by using the
Microsoft.ApplicationInsights namespace.

Instrument your code


To instrument your code, your solution needs to create an Application Insights
TelemetryClient. In the example, the TelemetryClient loads its configuration from the
ApplicationInsights.config file. Be sure to update ApplicationInsights.config in the
following projects with your Application Insights instrumentation key:
Microsoft.Azure.Batch.Samples.TelemetryStartTask and TopNWordsSample.

XML

<InstrumentationKey>YOUR-IKEY-GOES-HERE</InstrumentationKey>

Also add the instrumentation key in the file TopNWords.cs.

The example in TopNWords.cs uses the following instrumentation calls from the
Application Insights API:

TrackMetric() - Tracks how long, on average, a compute node takes to download
the required text file.
TrackTrace() - Adds debugging calls to your code.
TrackEvent() - Tracks interesting events to capture.

This example purposely leaves out exception handling. Instead, Application Insights
automatically reports unhandled exceptions, which significantly improves the debugging
experience.

The following snippet illustrates how to use these methods.

C#

public void CountWords(string blobName, int numTopN, string storageAccountName, string storageAccountKey)
{
    // simulate exception for some set of tasks
    Random rand = new Random();
    if (rand.Next(0, 10) % 10 == 0)
    {
        blobName += ".badUrl";
    }

    // log the url we are downloading the file from
    insightsClient.TrackTrace(new TraceTelemetry(
        string.Format("Task {0}: Download file from: {1}", this.taskId, blobName),
        SeverityLevel.Verbose));

    // open the cloud blob that contains the book
    var storageCred = new StorageCredentials(storageAccountName, storageAccountKey);
    CloudBlockBlob blob = new CloudBlockBlob(new Uri(blobName), storageCred);
    using (Stream memoryStream = new MemoryStream())
    {
        // calculate blob download time
        DateTime start = DateTime.Now;
        blob.DownloadToStream(memoryStream);
        TimeSpan downloadTime = DateTime.Now.Subtract(start);

        // track how long the blob takes to download on this node
        // this will help debug timing issues or identify poorly performing nodes
        insightsClient.TrackMetric("Blob download in seconds", downloadTime.TotalSeconds, this.CommonProperties);

        memoryStream.Position = 0; //Reset the stream
        var sr = new StreamReader(memoryStream);
        var myStr = sr.ReadToEnd();
        string[] words = myStr.Split(' ');

        // log how many words were found in the text file
        insightsClient.TrackTrace(new TraceTelemetry(
            string.Format("Task {0}: Found {1} words", this.taskId, words.Length),
            SeverityLevel.Verbose));
        var topNWords =
            words.
                Where(word => word.Length > 0).
                GroupBy(word => word, (key, group) => new KeyValuePair<String, long>(key, group.LongCount())).
                OrderByDescending(x => x.Value).
                Take(numTopN).
                ToList();
        foreach (var pair in topNWords)
        {
            Console.WriteLine("{0} {1}", pair.Key, pair.Value);
        }

        // emit an event to track the completion of the task
        insightsClient.TrackEvent("Done counting words");
    }
}
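
The CountWords snippet times the download by subtracting two DateTime.Now readings. Those readings follow the wall clock, so the result can be skewed if the node's clock is adjusted during the download. As a hedged alternative that is not part of the sample, System.Diagnostics.Stopwatch measures elapsed time with a monotonic timer. The class and method names below (DownloadTimer, MeasureSeconds) are hypothetical and used only for illustration.

```csharp
using System;
using System.Diagnostics;

public static class DownloadTimer
{
    // Hypothetical helper (not in the sample): runs the given operation and
    // returns how long it took, in seconds, using a monotonic timer.
    public static double MeasureSeconds(Action operation)
    {
        if (operation == null)
        {
            throw new ArgumentNullException(nameof(operation));
        }

        Stopwatch stopwatch = Stopwatch.StartNew();
        operation();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalSeconds;
    }
}
```

In CountWords, you could wrap the download as `DownloadTimer.MeasureSeconds(() => blob.DownloadToStream(memoryStream))` and pass the returned value to TrackMetric in place of downloadTime.TotalSeconds.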

Azure Batch telemetry initializer helper


When reporting telemetry for a given server and instance, Application Insights uses the
Azure VM Role and VM name for the default values. In the context of Azure Batch, the
example shows how to use the pool name and compute node name instead. Use a
telemetry initializer to override the default values.

C#

using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.Extensibility;
using System;
using System.Threading;

namespace Microsoft.Azure.Batch.Samples.TelemetryInitializer
{
    public class AzureBatchNodeTelemetryInitializer : ITelemetryInitializer
    {
        // Azure Batch environment variables
        private const string PoolIdEnvironmentVariable = "AZ_BATCH_POOL_ID";
        private const string NodeIdEnvironmentVariable = "AZ_BATCH_NODE_ID";

        private string roleInstanceName;
        private string roleName;

        public void Initialize(ITelemetry telemetry)
        {
            if (string.IsNullOrEmpty(telemetry.Context.Cloud.RoleName))
            {
                // override the role name with the Azure Batch Pool name
                string name = LazyInitializer.EnsureInitialized(ref this.roleName, this.GetPoolName);
                telemetry.Context.Cloud.RoleName = name;
            }

            if (string.IsNullOrEmpty(telemetry.Context.Cloud.RoleInstance))
            {
                // override the role instance with the Azure Batch Compute Node name
                string name = LazyInitializer.EnsureInitialized(ref this.roleInstanceName, this.GetNodeName);
                telemetry.Context.Cloud.RoleInstance = name;
            }
        }

        private string GetPoolName()
        {
            return Environment.GetEnvironmentVariable(PoolIdEnvironmentVariable) ?? string.Empty;
        }

        private string GetNodeName()
        {
            return Environment.GetEnvironmentVariable(NodeIdEnvironmentVariable) ?? string.Empty;
        }
    }
}
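
The initializer above reads each environment variable through LazyInitializer.EnsureInitialized, so the lookup runs once and the result is reused for every telemetry item. The following minimal sketch isolates that pattern without the Application Insights types. The BatchNodeIdentity class is hypothetical. The fallback to string.Empty matters because EnsureInitialized throws an InvalidOperationException if the factory returns null.

```csharp
using System;
using System.Threading;

// Hypothetical class that isolates the caching pattern used by the initializer.
public sealed class BatchNodeIdentity
{
    private string poolId;

    // The factory runs on first access only; later reads return the cached value.
    // The "?? string.Empty" fallback keeps the factory from returning null,
    // which LazyInitializer.EnsureInitialized rejects.
    public string PoolId => LazyInitializer.EnsureInitialized(
        ref this.poolId,
        () => Environment.GetEnvironmentVariable("AZ_BATCH_POOL_ID") ?? string.Empty);
}
```

Because the value is cached per instance, a later change to the environment variable isn't picked up. That is acceptable on a compute node, where the pool ID and node ID are fixed for the life of the process.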

To enable the telemetry initializer, the ApplicationInsights.config file in the
TopNWordsSample project includes the following:

XML

<TelemetryInitializers>
    <Add Type="Microsoft.Azure.Batch.Samples.TelemetryInitializer.AzureBatchNodeTelemetryInitializer, Microsoft.Azure.Batch.Samples.TelemetryInitializer"/>
</TelemetryInitializers>

Update the job and tasks to include Application
Insights binaries
In order for Application Insights to run correctly on your compute nodes, make sure the
binaries are correctly placed. Add the required binaries to your task's resource files
collection so that they get downloaded at the time your task executes. The following
snippets are similar to code in Job.cs.

First, create a static list of Application Insights files to upload.

C#

private static readonly List<string> AIFilesToUpload = new List<string>()
{
    // Application Insights config and assemblies
    "ApplicationInsights.config",
    "Microsoft.ApplicationInsights.dll",
    "Microsoft.AI.Agent.Intercept.dll",
    "Microsoft.AI.DependencyCollector.dll",
    "Microsoft.AI.PerfCounterCollector.dll",
    "Microsoft.AI.ServerTelemetryChannel.dll",
    "Microsoft.AI.WindowsServer.dll",
    // custom telemetry initializer assemblies
    "Microsoft.Azure.Batch.Samples.TelemetryInitializer.dll",
};
...

Next, create the staging files that are used by the task.

C#

...
// create file staging objects that represent the executable and its dependent assembly to run as the task.
// These files are copied to every node before the corresponding task is scheduled to run on that node.
FileToStage topNWordExe = new FileToStage(TopNWordsExeName, stagingStorageAccount);
FileToStage storageDll = new FileToStage(StorageClientDllName, stagingStorageAccount);

// Upload Application Insights assemblies
List<FileToStage> aiStagedFiles = new List<FileToStage>();
foreach (string aiFile in AIFilesToUpload)
{
    aiStagedFiles.Add(new FileToStage(aiFile, stagingStorageAccount));
}
...

The FileToStage class is a helper in the code sample that allows you to
easily upload a file from local disk to an Azure Storage blob. Each file is later
downloaded to a compute node and referenced by a task.

Finally, add the tasks to the job and include the necessary Application Insights binaries.

C#

...
// initialize a collection to hold the tasks that will be submitted in their entirety
List<CloudTask> tasksToRun = new List<CloudTask>(topNWordsConfiguration.NumberOfTasks);
for (int i = 1; i <= topNWordsConfiguration.NumberOfTasks; i++)
{
    CloudTask task = new CloudTask("task_no_" + i, String.Format("{0} --Task {1} {2} {3} {4}",
        TopNWordsExeName,
        string.Format("https://{0}.blob.core.windows.net/{1}",
            accountSettings.StorageAccountName,
            documents[i]),
        topNWordsConfiguration.TopWordCount,
        accountSettings.StorageAccountName,
        accountSettings.StorageAccountKey));

    // This is the list of files to stage to a container -- for each job, one container is created and
    // files all resolve to Azure Blobs by their name (so two tasks with the same named file will create
    // just 1 blob in the container).
    task.FilesToStage = new List<IFileStagingProvider>
    {
        // required application binaries
        topNWordExe,
        storageDll,
    };
    foreach (FileToStage stagedFile in aiStagedFiles)
    {
        task.FilesToStage.Add(stagedFile);
    }
    task.RunElevated = false;
    tasksToRun.Add(task);
}

View data in the Azure portal


Now that you've configured the job and tasks to use Application Insights, run the
example job in your pool. Navigate to the Azure portal and open the Application
Insights resource that you provisioned. After the pool is provisioned, you should start to
see data flowing and getting logged. The rest of this article touches on only a few
Application Insights features, but feel free to explore the full feature set.

View live stream data


To view live data in your Application Insights resource, click Live Stream. The following
screenshot shows how to view live data coming from the compute nodes in the pool, for
example the CPU usage per compute node.

View trace logs


To view trace logs in your Application Insights resource, click Search. This view shows a
list of diagnostic data captured by Application Insights including traces, events, and
exceptions.

The following screenshot shows how a single trace for a task is logged and later queried
for debugging purposes.

View unhandled exceptions


Application Insights logs exceptions thrown from your application. In this case, within
seconds of the application throwing the exception, you can drill into a specific exception
and diagnose the issue.

Measure blob download time
Custom metrics are also a valuable tool in the portal. For example, you can display the
average time it took each compute node to download the required text file it was
processing.

To create a sample chart:

1. In your Application Insights resource, click Metrics Explorer > Add chart.
2. Click Edit on the chart that was added.
3. Update the chart details as follows:

   Set Chart type to Grid.
   Set Aggregation to Average.
   Set Group by to NodeId.
   In Metrics, select Custom > Blob download in seconds.
   Adjust display Color palette to your choice.

Monitor compute nodes continuously
You may have noticed that all metrics, including performance counters, are only logged
when the tasks are running. This behavior is useful because it limits the amount of data
that Application Insights logs. However, there are cases when you would always like to
monitor the compute nodes. For example, they might be running background work
which is not scheduled via the Batch service. In this case, set up a monitoring process to
run for the life of the compute node.

One way to achieve this behavior is to spawn a process that loads the Application
Insights library and runs in the background. In the example, the start task loads the
binaries on the machine and keeps a process running indefinitely. Configure the
Application Insights configuration file for this process to emit additional data you're
interested in, such as performance counters.

C#

...
// Batch start task telemetry runner
private const string BatchStartTaskFolderName = "StartTask";
private const string BatchStartTaskTelemetryRunnerName = "Microsoft.Azure.Batch.Samples.TelemetryStartTask.exe";
private const string BatchStartTaskTelemetryRunnerAIConfig = "ApplicationInsights.config";
...
CloudPool pool = client.PoolOperations.CreatePool(
    topNWordsConfiguration.PoolId,
    targetDedicated: topNWordsConfiguration.PoolNodeCount,
    virtualMachineSize: "standard_d1_v2",
    virtualMachineConfiguration: new VirtualMachineConfiguration(
        imageReference: new ImageReference(
            publisher: "MicrosoftWindowsServer",
            offer: "WindowsServer",
            sku: "2019-datacenter-core",
            version: "latest"),
        nodeAgentSkuId: "batch.node.windows amd64"));
...

// Create a start task which will run a dummy exe in background that simply emits performance
// counter data as defined in the relevant ApplicationInsights.config.
// Note that the waitForSuccess on the start task was not set so the Compute Node will be
// available immediately after this command is run.
pool.StartTask = new StartTask()
{
    CommandLine = string.Format("cmd /c {0}", BatchStartTaskTelemetryRunnerName),
    ResourceFiles = resourceFiles
};
...

Tip

To increase the manageability of your solution, you can bundle the assembly in an
application package. Then, to deploy the application package automatically to your
pools, add an application package reference to the pool configuration.

Throttle and sample data


Due to the large-scale nature of Azure Batch applications running in production, you
might want to limit the amount of data collected by Application Insights to manage
costs. See Sampling in Application Insights for some mechanisms to achieve this.

Next steps
Learn more about Application Insights.
For Application Insights support in other languages, see the languages, platforms,
and integrations documentation.


Create queries to list Batch resources
efficiently
07/01/2025

Most Azure Batch applications do monitoring or other operations that query the Batch service.
Such list queries often happen at regular intervals. For example, before you can check for
queued tasks in a job, you must get data on every task in that job. Reducing the amount of
data that the Batch service returns for queries improves your application's performance. This
article explains how to create and execute such queries in an efficient way. You can create
filtered queries for Batch jobs, tasks, compute nodes, and other resources with the Batch .NET
library.

Note

The Batch service provides API support for the common scenarios of counting tasks in a
job, and counting compute nodes in a Batch pool. You can call the operations Get Task
Counts and List Pool Node Counts instead of using a list query. However, these more
efficient operations return more limited information that might not be up to date. For
more information, see Count tasks and compute nodes by state.

Specify a detail level


There can be thousands of entities like jobs, tasks, and compute nodes in a production Batch
application. For each query you make about the resources, a potentially large amount of data
goes from the Batch service to your application. Limit how many items and what information
your query returns to improve performance.

This Batch .NET API code snippet lists every task that is associated with a job, along with all of
the properties of each task.

C#

// Get a collection of all of the tasks and all of their properties for job-001
IPagedEnumerable<CloudTask> allTasks =
batchClient.JobOperations.ListTasks("job-001");

Apply a detail level to your query to list information more efficiently. Supply an
ODATADetailLevel object to the JobOperations.ListTasks method. This snippet returns only the
ID, command line, and compute node information properties of completed tasks.

C#

// Configure an ODATADetailLevel specifying a subset of tasks and
// their properties to return
ODATADetailLevel detailLevel = new ODATADetailLevel();
detailLevel.FilterClause = "state eq 'completed'";
detailLevel.SelectClause = "id,commandLine,nodeInfo";

// Supply the ODATADetailLevel to the ListTasks method
IPagedEnumerable<CloudTask> completedTasks =
    batchClient.JobOperations.ListTasks("job-001", detailLevel);

In this example scenario, if there are thousands of tasks in the job, the results from the second
query typically are returned more quickly than from the first query. For more information about
using ODATADetailLevel when you list items with the Batch .NET API, see the section Efficient
querying in Batch .NET.

Important

We highly recommend that you always supply an ODATADetailLevel object to your .NET
API list calls for maximum efficiency and performance of your application. By specifying a
detail level, you can help to lower Batch service response times, improve network
utilization, and minimize memory usage by client applications.

Use query strings


You can use the Batch .NET and Batch REST APIs to reduce how many items a query
returns, and how much information the query returns for each item. There are three query
string types you can use to narrow your query: $filter, $select, and $expand.

For the Batch .NET API, see the ODATADetailLevel Class properties. Also review the section
Efficient querying in Batch .NET.

For the Batch REST API, see the Batch REST API reference. Find the List reference for the
resource you want to query. Then, review the URI Parameters section for details about
$filter, $select, and $expand. For example, see the URI parameters for Pool - List. Also see
how to make efficient Batch queries with the Azure CLI.

Note

When constructing any of the three query string types, you must ensure that the property
names and case match those of their REST API element counterparts. For example, when
working with the .NET CloudTask class, you must specify state instead of State, even
though the .NET property is CloudTask.State. For more information, see the property
mappings between the .NET and REST APIs.

Filter
The $filter expression string reduces the number of items that are returned. For example, you
can list only the running tasks for a job, or list only compute nodes that are ready to run tasks.

This string consists of one or more expressions, each of which consists of a property
name, an operator, and a value. The properties that can be specified are specific to each entity
type that you query, as are the operators that are supported for each property. Multiple
expressions can be combined by using the logical operators and and or.

This example lists only the running render tasks: (state eq 'running') and startswith(id,
'renderTask').

Select
The $select expression string limits the property values that are returned for each item. You
specify a list of comma-separated property names, and only those property values are returned
for the items in the query results. You can specify any of the properties for the entity type
you're querying.

This example specifies that only three property values should be returned for each task:
id, state, stateTransitionTime.

Expand
The $expand expression string reduces the number of API calls that are required to obtain
certain information. You can use this string to obtain more information about each item with a
single API call. This method helps to improve performance by reducing API calls. Use an
$expand string instead of getting the list of entities and requesting information about each list
item.

Similar to $select, $expand controls whether certain data is included in list query results. When
all properties are required and no select string is specified, $expand must be used to get
statistics information. If a select string is used to obtain a subset of properties, then stats can
be specified in the select string, and $expand doesn't need to be specified.

Supported uses of this string include listing jobs, job schedules, tasks, and pools. Currently, the
string only supports statistics information.

This example specifies that statistics information should be returned for each item in the list:
stats.
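
When you call the REST API directly, the three strings travel as URI query parameters, so their values need to be URL-encoded. The following sketch shows one way to append them to a list request URL. The BatchListQuery class and AppendODataOptions method are hypothetical, and the base URL (host, path, and api-version value) is a placeholder you supply yourself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: appends $filter, $select, and $expand to a list URL.
public static class BatchListQuery
{
    public static string AppendODataOptions(
        string listUrl, string filter = null, string select = null, string expand = null)
    {
        var parts = new List<string>();

        // Encode only the values; the parameter names are sent as-is.
        if (!string.IsNullOrEmpty(filter)) parts.Add("$filter=" + Uri.EscapeDataString(filter));
        if (!string.IsNullOrEmpty(select)) parts.Add("$select=" + Uri.EscapeDataString(select));
        if (!string.IsNullOrEmpty(expand)) parts.Add("$expand=" + Uri.EscapeDataString(expand));

        if (parts.Count == 0)
        {
            return listUrl;
        }

        // Use '&' when the URL already carries a query string (for example, api-version).
        string separator = listUrl.Contains("?") ? "&" : "?";
        return listUrl + separator + string.Join("&", parts);
    }
}
```

For example, passing the filter `state eq 'running'` and the select string `id,state` adds both parameters after the existing api-version parameter, with spaces and commas percent-encoded.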

Rules for filter, select, and expand strings


Make sure properties' names in filter, select, and expand strings appear as they do in the
Batch REST API. This rule applies even when you use Batch .NET or one of the other Batch
SDKs.
All property names are case-sensitive, but property values are case-insensitive.
Date/time strings can be one of two formats, and must be preceded with DateTime .
W3C-DTF format example: creationTime gt DateTime'2011-05-08T08:49:37Z'
RFC 1123 format example: creationTime gt DateTime'Sun, 08 May 2011 08:49:37 GMT'
Boolean strings are either true or false .
If an invalid property or operator is specified, a 400 (Bad Request) error will result.
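
The two date/time formats above map directly onto .NET format strings. The following sketch builds a creationTime filter in each format. The BatchDateFilters class and its method names are hypothetical. Both methods convert the value to UTC first, which assumes that the caller passes a DateTime whose Kind is Utc or Local.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: formats creationTime filters in the two accepted formats.
public static class BatchDateFilters
{
    // W3C-DTF, for example: creationTime gt DateTime'2011-05-08T08:49:37Z'
    public static string CreatedAfterW3c(DateTime time)
    {
        string stamp = time.ToUniversalTime()
            .ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
        return "creationTime gt DateTime'" + stamp + "'";
    }

    // RFC 1123, for example: creationTime gt DateTime'Sun, 08 May 2011 08:49:37 GMT'
    public static string CreatedAfterRfc1123(DateTime time)
    {
        // The "R" specifier does not convert to UTC on its own, so convert first.
        string stamp = time.ToUniversalTime().ToString("R", CultureInfo.InvariantCulture);
        return "creationTime gt DateTime'" + stamp + "'";
    }
}
```

Using CultureInfo.InvariantCulture keeps the day and month names in English regardless of the machine's regional settings.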

Efficient querying in Batch .NET


In the Batch .NET API, the ODATADetailLevel class provides filter, select, and expand strings to
list operations. The ODATADetailLevel class has three public string properties. You can specify
these properties in the constructor, or set the properties directly on the object. Then, pass the
ODATADetailLevel object as a parameter to the various list operations such as ListPools,
ListJobs, and ListTasks.

ODATADetailLevel.FilterClause: Limit the number of items that are returned.
ODATADetailLevel.SelectClause: Specify which property values are returned with each
item.
ODATADetailLevel.ExpandClause: Retrieve data for all items in a single API call instead of
separate calls for each item.

The following code snippet uses the Batch .NET API to query the Batch service efficiently for
the statistics of a specific set of pools. The Batch user has both test and production pools. The
test pool IDs are prefixed with "test", and the production pool IDs are prefixed with "prod".
myBatchClient is a properly initialized instance of the BatchClient class.

C#

// First we need an ODATADetailLevel instance on which to set the filter, select,
// and expand clause strings
ODATADetailLevel detailLevel = new ODATADetailLevel();

// We want to pull only the "test" pools, so we limit the number of items returned
// by using a FilterClause and specifying that the pool IDs must start with "test"
detailLevel.FilterClause = "startswith(id, 'test')";

// To further limit the data that crosses the wire, configure the SelectClause to
// limit the properties that are returned on each CloudPool object to only
// CloudPool.Id and CloudPool.Statistics
detailLevel.SelectClause = "id, stats";

// Specify the ExpandClause so that the .NET API pulls the statistics for the
// CloudPools in a single underlying REST API call. Note that we use the pool's
// REST API element name "stats" here as opposed to "Statistics" as it appears in
// the .NET API (CloudPool.Statistics)
detailLevel.ExpandClause = "stats";

// Now get our collection of pools, minimizing the amount of data that is returned
// by specifying the detail level that we configured above
List<CloudPool> testPools =
    await myBatchClient.PoolOperations.ListPools(detailLevel).ToListAsync();

Tip

An instance of ODATADetailLevel that is configured with Select and Expand clauses can
also be passed to appropriate Get methods, such as PoolOperations.GetPool, to limit the
amount of data that is returned.

Batch REST to .NET API mappings


Property names in filter, select, and expand strings must reflect their REST API counterparts,
both in name and case. The tables below provide mappings between the .NET and REST API
counterparts.

Mappings for filter strings


.NET list methods: Each of the .NET API methods in this column accepts an
ODATADetailLevel object as a parameter.
REST list requests: Each REST API page listed in this column contains a table with the
properties and operations allowed in filter strings. You can use these property names and
operations when you construct an ODATADetailLevel.FilterClause string.

.NET list methods                                       REST list requests

CertificateOperations.ListCertificates                  List the certificates in an account
CloudTask.ListNodeFiles                                 List the files associated with a task
JobOperations.ListJobPreparationAndReleaseTaskStatus    List the status of the job preparation and job release tasks for a job
JobOperations.ListJobs                                  List the jobs in an account
JobOperations.ListNodeFiles                             List the files on a node
JobOperations.ListTasks                                 List the tasks associated with a job
JobScheduleOperations.ListJobSchedules                  List the job schedules in an account
JobScheduleOperations.ListJobs                          List the jobs associated with a job schedule
PoolOperations.ListComputeNodes                         List the compute nodes in a pool
PoolOperations.ListPools                                List the pools in an account

Mappings for select strings


Batch .NET types: Batch .NET API types.
REST API entities: Each page in this column contains one or more tables that list the REST
API property names for the type. These property names are used when you construct
select strings. You use these same property names when you construct an
ODATADetailLevel.SelectClause string.

Batch .NET types     REST API entities

Certificate          Get information about a certificate
CloudJob             Get information about a job
CloudJobSchedule     Get information about a job schedule
ComputeNode          Get information about a node
CloudPool            Get information about a pool
CloudTask            Get information about a task

Example: construct a filter string


To construct a filter string for ODATADetailLevel.FilterClause, find the corresponding REST API
page. Filterable properties and their supported operators are in the first multi-row table. For
example, to retrieve all tasks whose exit code was nonzero, check List the tasks associated with
a job for the applicable property string and allowable operators:

Property                  Operations allowed    Type

executionInfo/exitCode    eq, ge, gt, le, lt    Int

The related filter string is:

(executionInfo/exitCode lt 0) or (executionInfo/exitCode gt 0)

Example: construct a select string


To construct ODATADetailLevel.SelectClause, find the corresponding REST API page for the
entity that you're listing. Selectable properties are in the first multi-row table. For
example, to retrieve only the ID and command line for each task in a list,
check Get information about a task:

Property       Type      Notes

id             String    The ID of the task.
commandLine    String    The command line of the task.

The related select string is:

id, commandLine

Code samples

Efficient list queries


The EfficientListQueries sample project shows how efficient list querying affects application
performance. This C# console application creates and adds a large number of tasks to a job.
Then, the application makes multiple calls to the JobOperations.ListTasks method and passes
ODATADetailLevel objects. These objects are configured with different property values to vary
the amount of data to be returned. This sample produces output similar to:

Adding 5000 tasks to job jobEffQuery...
5000 tasks added in 00:00:47.3467587, hit ENTER to query tasks...

4943 tasks retrieved in 00:00:04.3408081 (ExpandClause: | FilterClause: state eq 'active' | SelectClause: id,state)
0 tasks retrieved in 00:00:00.2662920 (ExpandClause: | FilterClause: state eq 'running' | SelectClause: id,state)
59 tasks retrieved in 00:00:00.3337760 (ExpandClause: | FilterClause: state eq 'completed' | SelectClause: id,state)
5000 tasks retrieved in 00:00:04.1429881 (ExpandClause: | FilterClause: | SelectClause: id,state)
5000 tasks retrieved in 00:00:15.1016127 (ExpandClause: | FilterClause: | SelectClause: id,state,environmentSettings)
5000 tasks retrieved in 00:00:17.0548145 (ExpandClause: stats | FilterClause: | SelectClause: )

Sample complete, hit ENTER to continue...

The example shows you can greatly lower query response times by limiting the properties and
the number of items that are returned. You can find this and other sample projects in the
azure-batch-samples repository on GitHub.


BatchMetrics library
The following BatchMetrics sample project demonstrates how to efficiently monitor Azure
Batch job progress using the Batch API.

This sample includes a .NET class library project, which you can incorporate into your own
projects. There's also a simple command-line program to exercise and demonstrate the use of
the library.

The sample application within the project demonstrates these operations:

Selecting specific attributes to download only the properties you need
Filtering on state transition times to download only changes since the last query

For example, the following method appears in the BatchMetrics library. It returns an
ODATADetailLevel that specifies that only the id and state properties should be obtained for
the entities that are queried. It also specifies that only entities whose state has changed since
the specified DateTime parameter should be returned.

C#

internal static ODATADetailLevel OnlyChangedAfter(DateTime time)
{
    return new ODATADetailLevel(
        selectClause: "id, state",
        filterClause: string.Format("stateTransitionTime gt DateTime'{0:o}'", time)
    );
}
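
The "o" (round-trip) format specifier used above produces different suffixes depending on the DateTime.Kind of the value: "Z" for Utc, a numeric offset for Local, and no suffix for Unspecified. The documented W3C-DTF example ends in "Z", so normalizing to UTC before formatting is a cautious choice. This is an assumption, not a documented requirement of the BatchMetrics library. The ChangedSinceFilter class below is hypothetical and builds only the filter string.

```csharp
using System;

// Hypothetical helper: builds the same filter string as OnlyChangedAfter,
// but converts the time to UTC first so the "o" format ends in "Z".
public static class ChangedSinceFilter
{
    public static string Build(DateTime time)
    {
        // ToUniversalTime treats an Unspecified value as local time.
        DateTime utc = time.ToUniversalTime();
        return string.Format("stateTransitionTime gt DateTime'{0:o}'", utc);
    }
}
```

A typical polling loop would record DateTime.UtcNow just before each query and pass the previous timestamp to the next call, so that each query returns only entities whose state changed in between.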

Next steps
Maximize Azure Batch compute resource usage with concurrent node tasks. Some types
of workloads can benefit from executing parallel tasks on larger (but fewer) compute
nodes. Check out the example scenario in the article for details on such a scenario.
Monitor Batch solutions by counting tasks and nodes by state


Monitor Batch solutions by counting tasks
and nodes by state
Article • 05/02/2025

To monitor and manage large-scale Azure Batch solutions, you may need to determine counts
of resources in various states. Azure Batch provides efficient operations to get counts for Batch
tasks and compute nodes. You can use these operations instead of potentially time-consuming
list queries that return detailed information about large collections of tasks or nodes.

Get Task Counts gets an aggregate count of active, running, and completed tasks in a job,
and of tasks that succeeded or failed. By counting tasks in each state, you can easily
display job progress to a user, or detect unexpected delays or failures that may affect the
job.

List Pool Node Counts gets the number of dedicated and Spot compute nodes in each
pool that are in various states: creating, idle, offline, preempted, rebooting, reimaging,
starting, and others. By counting nodes in each state, you can determine when you have
adequate compute resources to run your jobs, and identify potential issues with your
pools.

At times, the numbers returned by these operations may not be up to date. If you need to be
sure that a count is accurate, use a list query to count these resources. List queries also let you
get information about other Batch resources such as applications. For more information about
applying filters to list queries, see Create queries to list Batch resources efficiently.

Task state counts


The Get Task Counts operation counts tasks by the following states:

Active: A task that's queued and ready to run but isn't currently assigned to any compute
node. A task is also active if it's dependent on a parent task that hasn't yet completed.
Running: A task that has been assigned to a compute node but hasn't yet finished. A task
is counted as running when its state is either preparing or running, as indicated by the
Get information about a task operation.
Completed: A task that's no longer eligible to run, because it either finished successfully,
or finished unsuccessfully and also exhausted its retry limit.
Succeeded: A task where the result of task execution is success. Batch determines
whether a task has succeeded or failed by checking the TaskExecutionResult property of
the executionInfo property.
Failed: A task where the result of task execution is failure.

The following .NET code sample shows how to retrieve task counts by state.

C#

var taskCounts = await batchClient.JobOperations.GetJobTaskCountsAsync("job-1");

Console.WriteLine("Task count in active state: {0}", taskCounts.Active);
Console.WriteLine("Task count in preparing or running state: {0}", taskCounts.Running);
Console.WriteLine("Task count in completed state: {0}", taskCounts.Completed);
Console.WriteLine("Succeeded task count: {0}", taskCounts.Succeeded);
Console.WriteLine("Failed task count: {0}", taskCounts.Failed);

You can use a similar pattern for REST and other supported languages to get task counts for a
job.
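
As a rough illustration of how these counts can be combined, the following Python sketch works on a plain dictionary rather than on a Batch SDK object. The function name, the dictionary keys, and the sample numbers are assumptions made for this example. Because the counts returned by the service may lag behind the real state, treat the result as an estimate.

```python
def summarize_task_counts(counts):
    """Summarize job progress from a dict of task counts by state."""
    total = counts["active"] + counts["running"] + counts["completed"]
    # Guard against an empty job so the division below is safe.
    fraction_done = counts["completed"] / total if total else 0.0
    return {
        "total": total,
        "fraction_done": fraction_done,
        # True only when nothing is queued or still in flight.
        "all_done": total > 0 and counts["completed"] == total,
        "has_failures": counts["failed"] > 0,
    }


# Hypothetical counts, shaped like the values printed by the C# sample above.
sample = {"active": 2, "running": 3, "completed": 5, "succeeded": 4, "failed": 1}
summary = summarize_task_counts(sample)
print(summary)
```

A script could poll the counts and stop waiting once all_done is true, then inspect has_failures to decide whether to look at individual tasks.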

Node state counts


The List Pool Node Counts operation counts compute nodes by the following states in each
pool. Separate aggregate counts are provided for dedicated nodes and Spot nodes in each
pool.

Creating: An Azure-allocated VM that hasn't yet started to join a pool.
Idle: A compute node that's available and currently not running any tasks.
LeavingPool: A node that is leaving the pool, either because the user explicitly removed it
or because the pool is resizing or autoscaling down.
Offline: A node that Batch cannot use to schedule new tasks.
Preempted: A Spot node that was removed from the pool because Azure reclaimed the
VM. A preempted node can be reinitialized when replacement Spot VM capacity is
available.
Rebooting: A node that is restarting.
Reimaging: A node where the OS is being reinstalled.
Running: A node that is running one or more tasks (other than the start task).
Starting: A node where the Batch service is starting up.
StartTaskFailed: A node where the start task failed after all retries, and waitForSuccess is
enabled. This node cannot run tasks.
Unknown: A node that lost contact with the Batch service and whose state isn't known.
Unusable: A node that can't be used for task execution because of errors.
WaitingForStartTask: A node on which the start task is running, but waitForSuccess is
enabled and it hasn't completed.

The following C# snippet shows how to list node counts for all pools in the current account:
C#

foreach (var nodeCounts in batchClient.PoolOperations.ListPoolNodeCounts())
{
    Console.WriteLine("Pool Id: {0}", nodeCounts.PoolId);

    Console.WriteLine("Total dedicated node count: {0}", nodeCounts.Dedicated.Total);

    // Get dedicated node counts in Idle and Offline states; you can get additional states.
    Console.WriteLine("Dedicated node count in Idle state: {0}", nodeCounts.Dedicated.Idle);
    Console.WriteLine("Dedicated node count in Offline state: {0}", nodeCounts.Dedicated.Offline);

    Console.WriteLine("Total Spot node count: {0}", nodeCounts.LowPriority.Total);

    // Get Spot node counts in Running and Preempted states; you can get additional states.
    Console.WriteLine("Spot node count in Running state: {0}", nodeCounts.LowPriority.Running);
    Console.WriteLine("Spot node count in Preempted state: {0}", nodeCounts.LowPriority.Preempted);
}

The following C# snippet shows how to list node counts for a given pool in the current
account.

C#

foreach (var nodeCounts in batchClient.PoolOperations.ListPoolNodeCounts(
    new ODATADetailLevel(filterClause: "poolId eq 'testpool'")))
{
    Console.WriteLine("Pool Id: {0}", nodeCounts.PoolId);

    Console.WriteLine("Total dedicated node count: {0}", nodeCounts.Dedicated.Total);

    // Get dedicated node counts in Idle and Offline states; you can get additional states.
    Console.WriteLine("Dedicated node count in Idle state: {0}", nodeCounts.Dedicated.Idle);
    Console.WriteLine("Dedicated node count in Offline state: {0}", nodeCounts.Dedicated.Offline);

    Console.WriteLine("Total Spot node count: {0}", nodeCounts.LowPriority.Total);

    // Get Spot node counts in Running and Preempted states; you can get additional states.
    Console.WriteLine("Spot node count in Running state: {0}", nodeCounts.LowPriority.Running);
    Console.WriteLine("Spot node count in Preempted state: {0}", nodeCounts.LowPriority.Preempted);
}

You can use a similar pattern for REST and other supported languages to get node counts for
pools.
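
To show how node counts might feed a capacity check, here is a minimal Python sketch that operates on a plain dictionary of per-state counts for one pool. It does not call the Batch service. Which states are treated as usable or as problems, along with the function name and sample numbers, are assumptions chosen for the example.

```python
# States in which a node is available to Batch for task execution.
USABLE_STATES = ("idle", "running")
# States that usually call for investigation.
PROBLEM_STATES = ("offline", "startTaskFailed", "unknown", "unusable")


def check_capacity(node_counts, required_nodes):
    """Return (usable, enough, problems) for one pool's node counts."""
    usable = sum(node_counts.get(state, 0) for state in USABLE_STATES)
    problems = {
        state: node_counts[state]
        for state in PROBLEM_STATES
        if node_counts.get(state, 0) > 0
    }
    return usable, usable >= required_nodes, problems


# Hypothetical dedicated-node counts for a single pool.
dedicated = {"idle": 3, "running": 4, "starting": 1, "unusable": 2}
usable, enough, problems = check_capacity(dedicated, required_nodes=8)
print(usable, enough, problems)
```

The same function can be applied separately to dedicated and Spot counts, since the service reports them as separate aggregates.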

Next steps
Learn about the Batch service workflow and primary resources such as pools, nodes, jobs,
and tasks.
To learn about applying filters to queries that list Batch resources, see Create queries to list
Batch resources efficiently.
Manage Batch resources with PowerShell cmdlets
Article • 04/02/2025

With the Azure Batch PowerShell cmdlets, you can perform and script many common
Batch tasks. This is a quick introduction to the cmdlets you can use to manage your
Batch accounts and work with your Batch resources such as pools, jobs, and tasks.

For a complete list of Batch cmdlets and detailed cmdlet syntax, see the Azure Batch
cmdlet reference.

We recommend that you update your Azure PowerShell modules frequently to take
advantage of service updates and enhancements.

Prerequisites
Install and configure the Azure PowerShell module. To install a specific Azure Batch
module, such as a pre-release module, see the PowerShell Gallery .

Run the Connect-AzAccount cmdlet to connect to your subscription (the Azure
Batch cmdlets ship in the Azure Resource Manager module):

PowerShell

Connect-AzAccount

Register with the Batch provider namespace. You only need to perform this
operation once per subscription.

PowerShell

Register-AzResourceProvider -ProviderNamespace Microsoft.Batch

Manage Batch accounts and keys

Create a Batch account


New-AzBatchAccount creates a Batch account in a specified resource group. If you
don't already have a resource group, create one by running the New-AzResourceGroup
cmdlet. Specify one of the Azure regions in the Location parameter, such as "Central
US". For example:

PowerShell

New-AzResourceGroup -Name MyBatchResourceGroup -Location "Central US"

Then, create a Batch account in the resource group. Specify a name for the account in
<account_name>, and the location and name of your resource group. Creating the
Batch account can take some time to complete. For example:

PowerShell

New-AzBatchAccount -AccountName <account_name> -Location "Central US" -ResourceGroupName <res_group_name>

Note

The Batch account name must be unique to the Azure region for the resource
group, contain between 3 and 24 characters, and use lowercase letters and
numbers only.
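
The format rules in this note can be checked locally before calling New-AzBatchAccount. The Python sketch below is only an illustration of those rules; the function name is hypothetical, and it cannot verify that a name is unique within a region, which only the service can determine.

```python
import re

# 3 to 24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_account_name(name):
    """Check a candidate Batch account name against the format rules above."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


print(is_valid_account_name("mybatchaccount01"))
print(is_valid_account_name("My-Batch"))
```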

Get account access keys


Get-AzBatchAccountKeys shows the access keys associated with an Azure Batch
account. For example, run the following to get the primary and secondary keys of the
account you created.

PowerShell

$Account = Get-AzBatchAccountKeys -AccountName <account_name>

$Account.PrimaryAccountKey

$Account.SecondaryAccountKey

Generate a new access key


New-AzBatchAccountKey generates a new primary or secondary account key for an
Azure Batch account. For example, to generate a new primary key for your Batch
account, type:
PowerShell

New-AzBatchAccountKey -AccountName <account_name> -KeyType Primary

Note

To generate a new secondary key, specify "Secondary" for the KeyType parameter.
You have to regenerate the primary and secondary keys separately.

Delete a Batch account


Remove-AzBatchAccount deletes a Batch account. For example:

PowerShell

Remove-AzBatchAccount -AccountName <account_name>

When prompted, confirm you want to remove the account. Account removal can take
some time to complete.

Create a BatchAccountContext object


You can authenticate to manage Batch resources using either shared key authentication
or Microsoft Entra authentication. To authenticate using the Batch PowerShell cmdlets,
first create a BatchAccountContext object to store your account credentials or identity.
You pass the BatchAccountContext object into cmdlets that use the BatchContext
parameter.

Shared key authentication


PowerShell

$context = Get-AzBatchAccountKeys -AccountName <account_name>

Note

By default, the account's primary key is used for authentication, but you can
explicitly select the key to use by changing your BatchAccountContext object’s
KeyInUse property: $context.KeyInUse = "Secondary" .
Microsoft Entra authentication
PowerShell

$context = Get-AzBatchAccount -AccountName <account_name>

Create and modify Batch resources


Use cmdlets such as New-AzBatchPool, New-AzBatchJob, and New-AzBatchTask to
create resources under a Batch account. There are corresponding Get- cmdlets to
query existing resources, Set- cmdlets to update their properties, and Remove-
cmdlets to remove resources under a Batch account.

When using many of these cmdlets, in addition to passing a BatchContext object, you
need to create or pass objects that contain detailed resource settings, as shown in the
following example. See the detailed help for each cmdlet for additional examples.

Create a Batch pool


When creating or updating a Batch pool, you specify a configuration. Pools should
generally be configured with Virtual Machine Configuration, which lets you either
specify one of the supported Linux or Windows VM images listed in the Azure Virtual
Machines Marketplace , or provide a custom image that you have prepared. Cloud
Services Configuration pools provide only Windows compute nodes and do not support
all Batch features.

When you run New-AzBatchPool, pass the operating system settings in a
PSVirtualMachineConfiguration or PSCloudServiceConfiguration object. For example, the
following snippet creates a Batch pool with size Standard_A1 compute nodes in the
virtual machine configuration, imaged with Ubuntu Server 20.04-LTS. Here, the
VirtualMachineConfiguration parameter specifies the $configuration variable as the
PSVirtualMachineConfiguration object. The BatchContext parameter specifies a
previously defined variable $context as the BatchAccountContext object.

PowerShell

$imageRef = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSImageReference" -ArgumentList @("UbuntuServer","Canonical","20.04-LTS")

$configuration = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSVirtualMachineConfiguration" -ArgumentList @($imageRef, "batch.node.ubuntu 20.04")

New-AzBatchPool -Id "mypspool" -VirtualMachineSize "Standard_a1" -VirtualMachineConfiguration $configuration -AutoScaleFormula '$TargetDedicated=4;' -BatchContext $context

The target number of compute nodes in the new pool is calculated by an autoscaling
formula. In this case, the formula is simply $TargetDedicated=4, indicating the number
of compute nodes in the pool is 4 at most.

Query for pools, jobs, tasks, and other details


Use cmdlets such as Get-AzBatchPool, Get-AzBatchJob, and Get-AzBatchTask to query
for entities created under a Batch account.

Query for data


As an example, use Get-AzBatchPool to find your pools. By default this queries for all
pools under your account, assuming you already stored the BatchAccountContext object
in $context:

PowerShell

Get-AzBatchPool -BatchContext $context

Use an OData filter


You can supply an OData filter using the Filter parameter to find only the objects you’re
interested in. For example, you can find all pools with IDs starting with “myPool”:

PowerShell

$filter = "startswith(id,'myPool')"

Get-AzBatchPool -Filter $filter -BatchContext $context

This method is not as flexible as using “Where-Object” in a local pipeline. However, the
query gets sent to the Batch service directly so that all filtering happens on the server
side, saving Internet bandwidth.
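
When filter strings are built from variables, quoting is the main thing to get right. The following Python sketch assembles a startswith() clause like the one above; the helper names are hypothetical, and the doubling of embedded single quotes follows the general OData convention for string literals rather than anything specific to Batch.

```python
def odata_string(value):
    """Quote a string literal for OData, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def startswith_filter(prop, prefix):
    """Build a startswith() filter clause for the given property."""
    return "startswith({},{})".format(prop, odata_string(prefix))


print(startswith_filter("id", "myPool"))
```

The resulting string is what you would assign to $filter in the PowerShell example.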

Use the Id parameter


An alternative to an OData filter is to use the Id parameter. To query for a specific pool
with id "myPool":

PowerShell

Get-AzBatchPool -Id "myPool" -BatchContext $context

The Id parameter supports only full-ID search, not wildcards or OData-style filters.

Use the MaxCount parameter


By default, each cmdlet returns a maximum of 1000 objects. If you reach this limit, either
refine your filter to bring back fewer objects, or explicitly set a maximum using the
MaxCount parameter. For example:

PowerShell

Get-AzBatchTask -MaxCount 2500 -BatchContext $context

To remove the upper bound, set MaxCount to 0 or less.

Use the PowerShell pipeline


Batch cmdlets use the PowerShell pipeline to send data between cmdlets. This has the
same effect as specifying a parameter, but makes working with multiple entities easier.

For example, find and display all tasks under your account:

PowerShell

Get-AzBatchJob -BatchContext $context | Get-AzBatchTask -BatchContext $context

Restart (reboot) every compute node in a pool:

PowerShell

Get-AzBatchComputeNode -PoolId "myPool" -BatchContext $context | Restart-AzBatchComputeNode -BatchContext $context

Application package management


Application packages provide a simplified way to deploy applications to the compute
nodes in your pools. With the Batch PowerShell cmdlets, you can upload and manage
application packages in your Batch account, and deploy package versions to compute
nodes.

Important

You must link an Azure Storage account to your Batch account to use application
packages.

Create an application:

PowerShell

New-AzBatchApplication -AccountName <account_name> -ResourceGroupName <res_group_name> -ApplicationId "MyBatchApplication"

Add an application package:

PowerShell

New-AzBatchApplicationPackage -AccountName <account_name> -ResourceGroupName <res_group_name> -ApplicationId "MyBatchApplication" -ApplicationVersion "1.0" -Format zip -FilePath package001.zip

Set the default version for the application:

PowerShell

Set-AzBatchApplication -AccountName <account_name> -ResourceGroupName <res_group_name> -ApplicationId "MyBatchApplication" -DefaultVersion "1.0"

List an application's packages:

PowerShell

$application = Get-AzBatchApplication -AccountName <account_name> -ResourceGroupName <res_group_name> -ApplicationId "MyBatchApplication"

$application.ApplicationPackages

Delete an application package:

PowerShell
Remove-AzBatchApplicationPackage -AccountName <account_name> -ResourceGroupName <res_group_name> -ApplicationId "MyBatchApplication" -ApplicationVersion "1.0"

Delete an application:

PowerShell

Remove-AzBatchApplication -AccountName <account_name> -ResourceGroupName <res_group_name> -ApplicationId "MyBatchApplication"

Note

You must delete all of an application's application package versions before you
delete the application. You will receive a 'Conflict' error if you try to delete an
application that currently has application packages.

Deploy an application package


You can specify one or more application packages for deployment when you create a
pool. When you specify a package at pool creation time, it is deployed to each node as
the node joins the pool. Packages are also deployed when a node is rebooted or reimaged.

Specify the -ApplicationPackageReferences option when creating a pool to deploy an
application package to the pool's nodes as they join the pool. First, create a
PSApplicationPackageReference object, and configure it with the application ID and
package version you want to deploy to the pool's compute nodes:

PowerShell

$appPackageReference = New-Object Microsoft.Azure.Commands.Batch.Models.PSApplicationPackageReference

$appPackageReference.ApplicationId = "MyBatchApplication"

$appPackageReference.Version = "1.0"

Now create the pool, and specify the package reference object as the argument to the
ApplicationPackageReferences option:

PowerShell
New-AzBatchPool -Id "PoolWithAppPackage" -VirtualMachineSize "Small" -VirtualMachineConfiguration $configuration -BatchContext $context -ApplicationPackageReferences $appPackageReference

You can find more information on application packages in Deploy applications to
compute nodes with Batch application packages.

Update a pool's application packages


To update the applications assigned to an existing pool, first create a
PSApplicationPackageReference object with the desired properties (application ID and
package version):

PowerShell

$appPackageReference = New-Object Microsoft.Azure.Commands.Batch.Models.PSApplicationPackageReference

$appPackageReference.ApplicationId = "MyBatchApplication"

$appPackageReference.Version = "2.0"

Next, get the pool from Batch, clear out any existing packages, add the new package
reference, and update the Batch service with the new pool settings:

PowerShell

$pool = Get-AzBatchPool -BatchContext $context -Id "PoolWithAppPackage"

$pool.ApplicationPackageReferences.Clear()

$pool.ApplicationPackageReferences.Add($appPackageReference)

Set-AzBatchPool -BatchContext $context -Pool $pool

You've now updated the pool's properties in the Batch service. To actually deploy the
new application package to compute nodes in the pool, however, you must restart or
reimage those nodes. You can restart every node in a pool with this command:

PowerShell

Get-AzBatchComputeNode -PoolId "PoolWithAppPackage" -BatchContext $context | Restart-AzBatchComputeNode -BatchContext $context

Tip

You can deploy multiple application packages to the compute nodes in a pool. If
you'd like to add an application package instead of replacing the currently
deployed packages, omit the $pool.ApplicationPackageReferences.Clear() line
above.

Next steps
Review the Azure Batch cmdlet reference for detailed cmdlet syntax and examples.
Learn how to deploy applications to compute nodes with Batch application
packages.



Manage Batch resources with Azure CLI
07/01/2025

You can manage your Azure Batch accounts and resources using the Azure Command-Line
Interface (Azure CLI). There are commands for creating and updating Batch resources such as
pools, jobs, and tasks. You can also create scripts for many of the same tasks you do through
Batch APIs, PowerShell cmdlets, and the Azure portal.

You can run the Azure CLI in Azure Cloud Shell or install the Azure CLI locally. Versions are
available for Windows, Mac, and Linux operating systems (OS).

This article explains how to use the Azure CLI with Batch accounts and resources.

Set up the Azure CLI


Choose how you want to set up the Azure CLI:

Run the Azure CLI in Cloud Shell.


Install the Azure CLI locally.
Install the Azure CLI on Windows
Install the Azure CLI on macOS
Install the Azure CLI on Linux for multiple Linux distributions.

If you're new to using the Azure CLI, see Get started with the Azure CLI before you continue.

If you've previously installed the Azure CLI locally, make sure to update your installation to the
latest version.

Authenticate with the Azure CLI


To use the Azure CLI with Batch, first sign in to your Azure account, then sign in to your Batch
account.

Sign in to Azure account


To use the Azure CLI, first sign in to your Azure account. This step gives you access to Azure
Resource Manager commands, which include Batch Management service commands. Then, you
can run commands to manage Batch accounts, keys, application packages, and quotas.

You can authenticate your Azure account in the Azure CLI in two ways. To run commands by
yourself, sign in to the Azure CLI interactively. The Azure CLI caches your credentials, and can
use those same credentials to sign you in to your Batch account afterward. To run commands
from a script or an application, sign in to the Azure CLI with a service principal.

To sign in to the Azure CLI interactively, run az login :

Azure CLI

az login

Sign in to Batch account


Next, sign in to your Batch account in the Azure CLI using the az batch account login
command. This step gives you access to Batch service commands. Then, you can manage Batch
resources like pools, jobs, and tasks.

You can authenticate your Batch account in the Azure CLI in two ways. The default method is to
authenticate using Microsoft Entra ID. We recommend using this method in most scenarios.
Another option is to use Shared Key authentication.

If you're creating Azure CLI scripts to automate Batch commands, you can use either
authentication method. In some scenarios, Shared Key authentication might be simpler than
creating a service principal.

Authenticate with Microsoft Entra ID


The default method for authenticating with your Batch account is through Microsoft Entra ID.
When you sign in to the Azure CLI interactively or with a service principal, you can use those
same cached credentials to sign you into your Batch account with Microsoft Entra ID. This
authentication method also offers Azure role-based access control (Azure RBAC). With Azure
RBAC, user access depends on their assigned role, not account keys. You only need to manage
the Azure roles, not account keys. Microsoft Entra ID then handles access and authentication.

To sign in to your Batch account with Microsoft Entra ID, run az batch account login. Make sure
to include the required parameters for your Batch account's name ( -n ) and your resource
group's name ( -g ).

Azure CLI

az batch account login -g <your-resource-group> -n <your-batch-account>

Authenticate with Shared Key


You can also use Shared Key authentication to sign in to your Batch account. This method uses
your account access keys to authenticate Azure CLI commands for the Batch service.

To sign in to your Batch account with Shared Key authentication, run az batch account login
with the parameter --shared-key-auth. Make sure to include the required parameters for your
Batch account's name ( -n ) and your resource group's name ( -g ).

Azure CLI

az batch account login -g <your-resource-group> -n <your-batch-account> --shared-key-auth

Learn Batch commands


The Azure CLI reference documentation lists all Azure CLI commands for Batch.

To list all Batch commands in the Azure CLI, run az batch -h .

There are multiple example CLI scripts for common Batch tasks. These examples show how to
use many available commands for Batch in the Azure CLI. You can learn how to create and
manage Batch accounts, pools, jobs, and tasks.

Use Batch CLI extension commands


You can use the Batch CLI extension to run Batch jobs without writing code. The extension
provides commands to use JSON templates for creating pools, jobs, and tasks with the Azure
CLI. The extension also provides commands to connect to an Azure Storage account linked to
your Batch account. Then, you can upload job input files and download job output files.

Create resources with JSON


You can create most Batch resources using only command-line parameters. Some features
require that you specify a JSON configuration file instead. The JSON file contains the configuration
details for your new resource. For example, you have to use a JSON file to specify resource files
for a start task.

For example, to use a JSON file to configure a new Batch pool resource:

Azure CLI

az batch pool create --json-file <your-batch-pool-configuration>.json


When you specify a JSON file for a new resource, don't use other parameters in your
command. The service only uses the JSON file to configure the resource.

To see the JSON syntax required to create a resource, refer to the Batch REST API reference
documentation. Go to the Examples section in the resource operation's reference page. Then,
find the subsection titled Add <resource type>, such as Add a basic task. Use the
example JSON code as templates for your configuration files.

For a sample script that specifies a JSON file, see Run a job and tasks with Batch.
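
If you generate configuration files from a script, a small Python sketch such as the following can produce one. The property names mirror pool properties that appear in the REST-style examples in this documentation; the values, the file name, and the helper name are placeholders chosen for illustration, not recommendations, and you should confirm the exact schema against the REST API reference.

```python
import json
import os
import tempfile

# Placeholder pool settings, formatted like a raw REST request body.
pool_config = {
    "id": "mypool",
    "vmSize": "STANDARD_D3_V2",
    "virtualMachineConfiguration": {
        "imageReference": {
            "publisher": "Canonical",
            "offer": "UbuntuServer",
            "sku": "20.04-LTS",
            "version": "latest",
        },
        "nodeAgentSKUId": "batch.node.ubuntu 20.04",
    },
    "targetDedicatedNodes": 2,
}


def write_pool_config(path, config):
    """Write the configuration as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)


# Write to a temporary directory so the sketch has no side effects elsewhere.
path = os.path.join(tempfile.mkdtemp(), "pool-config.json")
write_pool_config(path, pool_config)
print(path)
```

The resulting file is the kind of JSON configuration file that you pass to the pool creation command in place of individual parameters.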

Query Batch resources efficiently


You can query your Batch account and list all resources using the list command. For example,
to list the pools in your account and tasks in a job:

Azure CLI

az batch pool list


az batch task list --job-id <your-job-id>

To limit the amount of data your Batch query returns, specify an OData clause. All filtering
occurs server-side, so you only receive the data you request. Use these OData clauses to save
bandwidth and time with list operations. For more information, see Design efficient list
queries for Batch resources.

Clause                             Description

--select-clause [select-clause]    Returns a subset of properties for each entity.

--filter-clause [filter-clause]    Returns only entities that match the specified OData expression.

--expand-clause [expand-clause]    Obtains the entity information in a single underlying REST call. The
                                   expand clause currently supports only the stats property.

For an example script that shows how to use these clauses, see Run a job and tasks with Batch.
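
When a script assembles these commands, passing the OData expression as a single argument avoids shell quoting problems. The Python sketch below only builds the argument list for az batch pool list with the --filter-clause and --select-clause options described above; it does not run the command. The function name and the sample filter and property list are assumptions for illustration.

```python
import shlex


def pool_list_command(filter_clause=None, select_clause=None):
    """Assemble an `az batch pool list` command line with optional OData clauses."""
    args = ["az", "batch", "pool", "list"]
    if filter_clause:
        args += ["--filter-clause", filter_clause]
    if select_clause:
        args += ["--select-clause", select_clause]
    return args


command = pool_list_command(
    filter_clause="startswith(id,'myPool')",
    select_clause="id,state",
)
# shlex.join quotes each argument for a POSIX shell, so the filter stays intact.
print(shlex.join(command))
```

A list in this form can be handed to subprocess.run without going through a shell, which sidesteps quoting entirely.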
Troubleshooting
To get help with any Batch command, add -h to the end of your command. Don't add other
options. For example, to get help creating a Batch account, run az batch account create -h .

To return verbose command output, add -v or -vv to the end of your command. Use these
switches to display the full error output. The -vv flag returns the actual REST requests and
responses.

To view the command output in JSON format, add --output json to the end of your command. For
example, to display the properties of a pool named pool001, run
az batch pool show --pool-id pool001 --output json. Then, copy and modify the output to create
Batch resources using a JSON configuration file.

General Azure CLI troubleshooting

The Azure CLI can run in several shell environments, but with slight format variations. If you
have unexpected results with Azure CLI commands, see How to use the Azure CLI successfully.

Next steps
Quickstart: Run your first Batch job with the Azure CLI
Use Azure Batch CLI templates and file transfer
Article • 04/02/2025

Warning

The Batch Azure CLI extension will be retired on 30 September 2024. Please
uninstall the extension with the command
az extension remove --name azure-batch-cli-extensions.

By using a Batch extension to Azure CLI, users can run Batch jobs without writing code.

Create and use JSON template files with Azure CLI to create Batch pools, jobs, and tasks.
Use CLI extension commands to easily upload job input files to the storage account
associated with the Batch account, and download job output files.

Note

JSON files don't support the same functionality as Azure Resource Manager
templates. They are meant to be formatted like the raw REST request body. The CLI
extension doesn't change any existing commands, but it does have a similar
template option that adds partial Azure Resource Manager template functionality.
See Azure Batch CLI Extensions for Windows, Mac and Linux .

Overview
An extension to the Azure CLI enables Batch to be used end-to-end by users who are
not developers. With only CLI commands, you can create a pool, upload input data,
create jobs and associated tasks, and download the resulting output data. No additional
code is required. Run the CLI commands directly or integrate them into scripts.

Batch templates build on the existing Batch support in the Azure CLI for JSON files to
specify property values when creating pools, jobs, tasks, and other items. Batch
templates add the following capabilities:

Parameters can be defined. When the template is used, only the parameter values
are specified to create the item, with other item property values specified in the
template body. A user who understands Batch and the applications to be run by
Batch can create templates, specifying pool, job, and task property values. A user
less familiar with Batch and/or the applications only needs to specify the values for
the defined parameters.

Job task factories create one or more tasks associated with a job, avoiding the
need for many task definitions to be created and significantly simplifying job
submission.

Jobs typically use input data files and produce output data files. A storage account is
associated, by default, with each Batch account. You can transfer files to and from this
storage account using Azure CLI, with no coding and no storage credentials.

For example, ffmpeg is a popular application that processes audio and video files.
Using the Azure Batch CLI extension, you could make it easier for a user to invoke
ffmpeg to transcode source video files to different resolutions. The process might look
like this:

Create a pool template. The user creating the template knows how to call the
ffmpeg application and its requirements; they specify the appropriate OS, VM size,
how ffmpeg is installed (from an application package or using a package manager,
for example), and other pool property values. Parameters are created so when the
template is used, only the pool ID and number of VMs need to be specified.
Create a job template. The user creating the template knows how ffmpeg needs to
be invoked to transcode source video to a different resolution and specifies the
task command line; they also know that there is a folder containing the source
video files, with a task required per input file.
An end user with a set of video files to transcode first creates a pool using the pool
template, specifying only the pool ID and number of VMs required. They can then
upload the source files to transcode. A job can then be submitted using the job
template, specifying only the pool ID and location of the source files uploaded. The
Batch job is created, with one task per input file being generated. Finally, the
transcoded output files can be downloaded.

Installation
To install the Azure Batch CLI extension, first install the Azure CLI 2.0, or run the Azure
CLI in Azure Cloud Shell.

Install the latest version of the Batch extension using the following Azure CLI command:

Azure CLI
az extension add --name azure-batch-cli-extensions

For more information about the Batch CLI extension and additional installation options,
see the GitHub repo .

To use the CLI extension features, you need an Azure Batch account and, for the
commands that transfer files to and from storage, a linked storage account.

To log into a Batch account with the Azure CLI, see Manage Batch resources with Azure
CLI.

Templates
Azure Batch templates are similar to Azure Resource Manager templates, in functionality
and syntax. They are JSON files that contain item property names and values, but add
the following main concepts:

Parameters: Allow property values to be specified in a body section, with only
parameter values needing to be supplied when the template is used. For example,
the complete definition for a pool could be placed in the body and only one
parameter defined for poolId; only a pool ID string therefore needs to be supplied
to create a pool. The template body can be authored by someone with knowledge
of Batch and the applications to be run by Batch; only values for the author-
defined parameters must be supplied when the template is used. This lets users
without any in-depth Batch and/or application knowledge use the templates.
Variables: Allow simple or complex parameter values to be specified in one place
and used in one or more places in the template body. Variables can simplify and
reduce the size of the template, as well as make it more maintainable by having
one location to change properties.
Higher-level constructs: Some higher-level constructs are available in the template
that are not yet available in the Batch APIs. For example, a task factory can be
defined in a job template that creates multiple tasks for the job, using a common
task definition. These constructs avoid the need to code to dynamically create
multiple JSON files, such as one file per task, as well as create script files to install
applications via a package manager.

Pool templates
Pool templates support the standard template capabilities of parameters and variables.
They also support package references, which optionally allow software to be copied to
pool nodes by using package managers. The package manager and package ID are
specified in the package reference. By declaring one or more packages, you avoid
creating a script that gets the required packages, installing the script, and running the
script on each pool node.

The following is an example of a template that creates a pool of Linux VMs with ffmpeg
installed. To use it, supply only a pool ID string and the number of VMs in the pool:

JSON

{
    "parameters": {
        "nodeCount": {
            "type": "int",
            "metadata": {
                "description": "The number of pool nodes"
            }
        },
        "poolId": {
            "type": "string",
            "metadata": {
                "description": "The pool ID"
            }
        }
    },
    "pool": {
        "type": "Microsoft.Batch/batchAccounts/pools",
        "apiVersion": "2016-12-01",
        "properties": {
            "id": "[parameters('poolId')]",
            "virtualMachineConfiguration": {
                "imageReference": {
                    "publisher": "Canonical",
                    "offer": "UbuntuServer",
                    "sku": "20.04-LTS",
                    "version": "latest"
                },
                "nodeAgentSKUId": "batch.node.ubuntu 20.04"
            },
            "vmSize": "STANDARD_D3_V2",
            "targetDedicatedNodes": "[parameters('nodeCount')]",
            "enableAutoScale": false,
            "taskSlotsPerNode": 1,
            "packageReferences": [
                {
                    "type": "aptPackage",
                    "id": "ffmpeg"
                }
            ]
        }
    }
}

If the template file was named pool-ffmpeg.json, then invoke the template as follows:

Azure CLI

az batch pool create --template pool-ffmpeg.json

The CLI prompts you to provide values for the poolId and nodeCount parameters. You
can also supply the parameters in a JSON file. For example:

JSON

{
"poolId": {
"value": "mypool"
},
"nodeCount": {
"value": 2
}
}

If the parameters JSON file was named pool-parameters.json, then invoke the template
as follows:

Azure CLI

az batch pool create --template pool-ffmpeg.json --parameters pool-parameters.json
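
If you generate parameter files from a script, the { "value": ... } wrapping shown in the parameters file above is easy to produce programmatically. The following JavaScript sketch is only an illustration: toTemplateParameters is a hypothetical helper name, not part of the Azure CLI or any SDK.

```javascript
// Sketch: wrap plain values in the { "value": ... } shape used by the
// parameters file above. toTemplateParameters() is a hypothetical helper.
function toTemplateParameters(values) {
  const parameters = {};
  for (const [name, value] of Object.entries(values)) {
    parameters[name] = { value };
  }
  return parameters;
}

const poolParameters = toTemplateParameters({ poolId: "mypool", nodeCount: 2 });

// Print the JSON text that could be saved as pool-parameters.json.
console.log(JSON.stringify(poolParameters, null, 2));
```

Saving the printed text as pool-parameters.json gives a file in the same shape as the example above.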

Job templates
Job templates support the standard template capabilities of parameters and variables.
They also support the task factory construct, which creates multiple tasks for a job from
one task definition. Three types of task factory are supported: parametric sweep, task
per file, and task collection.

The following is an example of a template that creates a job to transcode MP4 video
files with ffmpeg to one of two lower resolutions. It creates one task per source video
file. See File groups and file transfer for more about file groups for job input and output.

JSON

{
"parameters": {
"poolId": {
"type": "string",
"metadata": {
"description": "The name of Azure Batch pool which runs the job"
}
},
"jobId": {
"type": "string",
"metadata": {
"description": "The name of Azure Batch job"
}
},
"resolution": {
"type": "string",
"defaultValue": "428x240",
"allowedValues": [
"428x240",
"854x480"
],
"metadata": {
"description": "Target video resolution"
}
}
},
"job": {
"type": "Microsoft.Batch/batchAccounts/jobs",
"apiVersion": "2016-12-01",
"properties": {
"id": "[parameters('jobId')]",
"constraints": {
"maxWallClockTime": "PT5H",
"maxTaskRetryCount": 1
},
"poolInfo": {
"poolId": "[parameters('poolId')]"
},
"taskFactory": {
"type": "taskPerFile",
"source": {
"fileGroup": "ffmpeg-input"
},
"repeatTask": {
"commandLine": "ffmpeg -i {fileName} -y -s [parameters('resolution')] -strict -2 {fileNameWithoutExtension}_[parameters('resolution')].mp4",
"resourceFiles": [
{
"blobSource": "{url}",
"filePath": "{fileName}"
}
],
"outputFiles": [
{
"filePattern": "{fileNameWithoutExtension}_[parameters('resolution')].mp4",
"destination": {
"autoStorage": {
"path": "{fileNameWithoutExtension}_[parameters('resolution')].mp4",
"fileGroup": "ffmpeg-output"
}
},
"uploadOptions": {
"uploadCondition": "TaskSuccess"
}
}
]
}
},
"onAllTasksComplete": "terminatejob"
}
}
}

If the template file was named job-ffmpeg.json, then invoke the template as follows:

Azure CLI

az batch job create --template job-ffmpeg.json

As before, the CLI prompts you to provide values for the parameters. You can also
supply the parameters in a JSON file.
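
To make the task-per-file factory in the job template more concrete, the following JavaScript sketch imitates the expansion locally. It is an illustration only, not the CLI extension's implementation: substitute and expandTaskPerFile are hypothetical helpers, the input file list and its URL are made up, and only the placeholders that appear in the template ({url}, {fileName}, {fileNameWithoutExtension}, and [parameters('...')]) are handled.

```javascript
// Illustration only: a local imitation of how a taskPerFile factory could
// expand one repeatTask definition into one task per input file.
// substitute() and expandTaskPerFile() are hypothetical helpers, not part
// of the Azure CLI or the Batch SDK.

// Replace [parameters('name')] and {placeholder} tokens in a string.
function substitute(text, parameters, placeholders) {
  return text
    .replace(/\[parameters\('([^']+)'\)\]/g, (match, name) =>
      name in parameters ? String(parameters[name]) : match)
    .replace(/\{(\w+)\}/g, (match, name) =>
      name in placeholders ? placeholders[name] : match);
}

// Build one task definition per file in the (assumed) file list.
function expandTaskPerFile(repeatTask, files, parameters) {
  return files.map((file, index) => {
    const dot = file.name.lastIndexOf(".");
    const placeholders = {
      url: file.url,
      fileName: file.name,
      fileNameWithoutExtension: dot > 0 ? file.name.slice(0, dot) : file.name
    };
    return {
      id: String(index),
      commandLine: substitute(repeatTask.commandLine, parameters, placeholders)
    };
  });
}

const tasks = expandTaskPerFile(
  { commandLine: "ffmpeg -i {fileName} -y -s [parameters('resolution')] -strict -2 {fileNameWithoutExtension}_[parameters('resolution')].mp4" },
  [{ name: "clip1.mp4", url: "https://example.invalid/clip1.mp4" }],
  { resolution: "428x240" }
);

console.log(tasks[0].commandLine);
// → ffmpeg -i clip1.mp4 -y -s 428x240 -strict -2 clip1_428x240.mp4
```

Each file in the group yields its own command line, which is why the template needs only a single repeatTask definition.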

Use templates in Batch Explorer


You can upload a Batch CLI template to the Batch Explorer desktop application to
create a Batch pool or job. You can also select from predefined pool and job templates
in the Batch Explorer Gallery.

To upload a template:

1. In Batch Explorer, select Gallery > Local templates.


2. Select, or drag and drop, a local pool or job template.
3. Select Use this template, and follow the on-screen prompts.

File groups and file transfer


Most jobs and tasks require input files and produce output files. Usually, input files and
output files are transferred, either from the client to the node, or from the node to the
client. The Azure Batch CLI extension abstracts away file transfer and utilizes the storage
account that you can associate with each Batch account.
A file group equates to a container that is created in the Azure storage account. The file
group may have subfolders.

The Batch CLI extension provides commands to upload files from client to a specified
file group and download files from the specified file group to a client.

Azure CLI

az batch file upload --local-path c:\source_videos\*.mp4 --file-group ffmpeg-input

az batch file download --file-group ffmpeg-output --local-path c:\output_lowres_videos

Pool and job templates allow files stored in file groups to be specified for copy onto
pool nodes or off pool nodes back to a file group. For example, in the job template
specified previously, the file group ffmpeg-input is specified for the task factory as the
location of the source video files copied down to the node for transcoding. The file
group ffmpeg-output is the location where the transcoded output files are copied from
the node running each task.

Summary
Template and file transfer support have currently been added only to the Azure CLI. The
goal is to expand the audience that can use Batch to users who do not need to develop
code using the Batch APIs, such as researchers and IT users. Without coding, users with
knowledge of Azure, Batch, and the applications to be run by Batch can create templates
for pool and job creation. With template parameters, users without detailed knowledge
of Batch and the applications can use the templates.

Try out the Batch extension for the Azure CLI and provide us with any feedback or
suggestions, either in the comments for this article or via the Batch Community repo.

Next steps
View detailed installation and usage documentation, samples, and source code in
the Azure GitHub repo.
Learn more about using Batch Explorer to create and manage Batch resources.



Get started with Batch SDK for JavaScript
Article • 04/02/2025

Learn the basics of building a Batch client in JavaScript using the Azure Batch
JavaScript SDK. We take a step-by-step approach: first we describe a scenario for a
Batch application, and then we set it up using JavaScript.

Prerequisites
This article assumes that you have a working knowledge of JavaScript and familiarity
with Linux. It also assumes that you have an Azure account set up with access rights to
create Batch and Storage services.

We recommend reading Azure Batch Technical Overview before you go through the
steps outlined in this article.

Understand the scenario


Here, we have a simple script written in Python that downloads all csv files from an
Azure Blob storage container and converts them to JSON. To process multiple storage
account containers in parallel, we can deploy the script as an Azure Batch job.

Azure Batch architecture


The following diagram depicts how we can scale the Python script using Azure Batch
and a client.
The JavaScript sample deploys a batch job with a preparation task (explained in detail
later) and a set of tasks depending on the number of containers in the storage account.
You can download the scripts from the GitHub repository.

Sample Code
Preparation task shell scripts
Python csv to JSON processor

Tip

The JavaScript sample in the link specified does not contain specific code to be
deployed as an Azure function app. You can refer to the following links for
instructions to create one.

Create function app


Create timer trigger function

Build the application


Now, let us build the JavaScript client step by step:

Step 1: Install Azure Batch SDK


You can install Azure Batch SDK for JavaScript using the npm install command.

npm install @azure/batch


This command installs the latest version of the @azure/batch JavaScript SDK.

Tip

In an Azure Function app, you can go to "Kudu Console" in the Azure function's
Settings tab to run the npm install commands, in this case to install the Azure Batch
SDK for JavaScript.

Step 2: Create an Azure Batch account


You can create it from the Azure portal or from the command line (PowerShell or Azure CLI).

The following commands create one through the Azure CLI.

Create a resource group. Skip this step if you already have one where you want to
create the Batch account:

az group create -n "<resource-group-name>" -l "<location>"

Next, create an Azure Batch account.

az batch account create -l "<location>" -g "<resource-group-name>" -n "<batch-account-name>"

Each Batch account has its corresponding access keys. These keys are needed to create
further resources in the Azure Batch account. A good practice for production environments
is to use Azure Key Vault to store these keys. You can then create a service principal for
the application. Using this service principal, the application can create an OAuth token to
access keys from the key vault.

az batch account keys list -g "<resource-group-name>" -n "<batch-account-name>"

Copy and store the key to be used in the subsequent steps.
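
Rather than pasting the key into source code, an application can read the account details from its environment at startup. The following is a minimal sketch under stated assumptions: the environment variable names and the loadBatchSettings helper are hypothetical choices for this example and are not defined by Azure Batch.

```javascript
// Sketch: read Batch account settings from environment variables instead of
// hard-coding them. The variable names below are assumptions chosen for this
// example; they are not defined by Azure Batch.
function loadBatchSettings(env) {
  const required = ["BATCH_ACCOUNT_NAME", "BATCH_ACCOUNT_KEY", "BATCH_ACCOUNT_URL"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
  return {
    accountName: env.BATCH_ACCOUNT_NAME,
    accountKey: env.BATCH_ACCOUNT_KEY,
    accountUrl: env.BATCH_ACCOUNT_URL
  };
}

// Example with an in-memory object; in an application, pass process.env.
const settings = loadBatchSettings({
  BATCH_ACCOUNT_NAME: "mybatchaccount",
  BATCH_ACCOUNT_KEY: "<batch-account-key>",
  BATCH_ACCOUNT_URL: "https://mybatchaccount.westus.batch.azure.com"
});

console.log(settings.accountName);
```

Failing fast when a value is missing makes a misconfigured deployment visible before any Batch call is attempted.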

Step 3: Create an Azure Batch service client


The following code snippet first imports the @azure/batch JavaScript module and then
creates a Batch service client. You need to first create a BatchSharedKeyCredentials
object with the Batch account key copied from the previous step.

JavaScript

// Initializing Azure Batch variables
import { BatchServiceClient, BatchSharedKeyCredentials } from "@azure/batch";

// Replace values below with Batch Account details
const batchAccountName = '<batch-account-name>';
const batchAccountKey = '<batch-account-key>';
const batchEndpoint = '<batch-account-url>';

const credentials = new BatchSharedKeyCredentials(batchAccountName, batchAccountKey);
const batchClient = new BatchServiceClient(credentials, batchEndpoint);

The Azure Batch URI can be found in the Overview tab of the Azure portal. It is of the
format:

https://accountname.location.batch.azure.com

Refer to the screenshot:

Step 4: Create an Azure Batch pool


An Azure Batch pool consists of multiple VMs (also known as Batch nodes). The Azure
Batch service deploys the tasks on these nodes and manages them. You can define the
following configuration parameters for your pool.

Type of Virtual Machine image


Size of Virtual Machine nodes
Number of Virtual Machine nodes

Tip

The size and number of Virtual Machine nodes largely depend on the number of
tasks you want to run in parallel and also the task itself. We recommend testing to
determine the ideal number and size.

The following code snippet creates the configuration parameter objects.

JavaScript

// Creating Image reference configuration for Ubuntu Linux VM
const imgRef = {
    publisher: "Canonical",
    offer: "UbuntuServer",
    sku: "20.04-LTS",
    version: "latest"
};
// Creating the VM configuration object with the SKUID
const vmConfig = {
    imageReference: imgRef,
    nodeAgentSKUId: "batch.node.ubuntu 20.04"
};
// Number of VMs to create in a pool
const numVms = 4;

// Setting the VM size
const vmSize = "STANDARD_D1_V2";

Tip

For the list of Linux VM images available for Azure Batch and their SKU IDs, see List
of virtual machine images.

Once the pool configuration is defined, you can create the Azure Batch pool. Creating
the pool provisions Azure virtual machine nodes and prepares them to receive tasks.
Each pool should have a unique ID for reference in subsequent steps.

The following code snippet creates an Azure Batch pool.

JavaScript

// Create a unique Azure Batch pool ID.
// Note: getMonth() is zero-based and getDay() returns the day of the week,
// so the suffix is a loose timestamp rather than a formatted date.
const now = new Date();
const poolId = `processcsv_${now.getFullYear()}${now.getMonth()}${now.getDay()}${now.getHours()}${now.getSeconds()}`;

const poolConfig = {
    id: poolId,
    displayName: "Processing csv files",
    vmSize: vmSize,
    virtualMachineConfiguration: vmConfig,
    targetDedicatedNodes: numVms,
    enableAutoScale: false
};

// Creating the Pool
var pool = batchClient.pool.add(poolConfig, function (error, result) {
    if (error != null) { console.log(error.response); }
});
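
The pool ID in the snippet above concatenates unpadded date parts, so two different moments can produce the same string. If you want an ID that sorts chronologically and is easier to read, one option is a zero-padded UTC timestamp. The following is a sketch only; makePoolId is a hypothetical helper and not part of the Batch SDK.

```javascript
// Sketch: build a pool ID from a zero-padded UTC timestamp.
// makePoolId() is a hypothetical helper, not part of the Batch SDK.
function makePoolId(prefix, date) {
  const pad = (n) => String(n).padStart(2, "0");
  const stamp =
    date.getUTCFullYear() +
    pad(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad(date.getUTCDate()) +      // day of the month, not day of the week
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds());
  return `${prefix}_${stamp}`;
}

const examplePoolId = makePoolId("processcsv", new Date(Date.UTC(2022, 0, 10, 7, 12, 21)));
console.log(examplePoolId);
// → processcsv_20220110071221
```

Two pools created within the same second would still collide, so add a random suffix if that can happen in your application.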

You can check the status of the pool you created and ensure that its state is "active"
before submitting a job to that pool.

JavaScript

var cloudPool = batchClient.pool.get(poolId, function (error, result, request, response) {
    if (error == null) {
        if (result.state == "active") {
            console.log("Pool is active");
        }
    }
    else {
        if (error.statusCode == 404) {
            console.log("Pool not found yet returned 404...");
        }
        else {
            console.log("Error occurred while retrieving pool data");
        }
    }
});

Following is a sample result object returned by the pool.get function.

{
id: 'processcsv_2022002321',
displayName: 'Processing csv files',
url: 'https://<batch-account-name>.westus.batch.azure.com/pools/processcsv_2022002321',
eTag: '0x8D9D4088BC56FA1',
lastModified: 2022-01-10T07:12:21.943Z,
creationTime: 2022-01-10T07:12:21.943Z,
state: 'active',
stateTransitionTime: 2022-01-10T07:12:21.943Z,
allocationState: 'steady',
allocationStateTransitionTime: 2022-01-10T07:13:35.103Z,
vmSize: 'standard_d1_v2',
virtualMachineConfiguration: {
imageReference: {
publisher: 'Canonical',
offer: 'UbuntuServer',
sku: '20.04-LTS',
version: 'latest'
},
nodeAgentSKUId: 'batch.node.ubuntu 20.04'
},
resizeTimeout: 'PT15M',
currentDedicatedNodes: 4,
currentLowPriorityNodes: 0,
targetDedicatedNodes: 4,
targetLowPriorityNodes: 0,
enableAutoScale: false,
enableInterNodeCommunication: false,
taskSlotsPerNode: 1,
taskSchedulingPolicy: { nodeFillType: 'Spread' }
}
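
A single status check is rarely enough, because nodes take time to allocate. The following sketch polls until the pool reports both state "active" and allocationState "steady", the two fields shown in the sample result. It is written against an injected getPool function so that it can run without an Azure account; waitForPoolReady, its options, and the stand-in fakeGetPool are hypothetical names for this example.

```javascript
// Sketch: poll until the pool reports state "active" and allocationState
// "steady". getPool is any function that returns a promise for the pool
// object; waitForPoolReady and its options are hypothetical names.
async function waitForPoolReady(getPool, poolId, { intervalMs = 10000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const pool = await getPool(poolId);
      if (pool.state === "active" && pool.allocationState === "steady") {
        return pool;
      }
    } catch (error) {
      // A 404 can mean the pool is not visible yet; anything else is fatal.
      if (error.statusCode !== 404) {
        throw error;
      }
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Pool ${poolId} was not ready after ${maxAttempts} attempts`);
}

// Example with a stand-in for the service call: ready on the third poll.
let calls = 0;
const fakeGetPool = async (id) => ({
  id,
  state: "active",
  allocationState: ++calls < 3 ? "resizing" : "steady"
});

const ready = waitForPoolReady(fakeGetPool, "processcsv_demo", { intervalMs: 1 });
ready.then((pool) => console.log(pool.id, pool.allocationState));
```

With the SDK client, you would pass a function such as (id) => batchClient.pool.get(id), assuming the call returns a promise when no callback is supplied.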

Step 5: Submit an Azure Batch job


An Azure Batch job is a logical group of similar tasks. In our scenario, it is "Process csv to
JSON." Each task here processes the csv files present in one Azure Storage container.

These tasks run in parallel, deployed across multiple nodes and orchestrated by
the Azure Batch service.

Tip

You can use the taskSlotsPerNode property to specify the maximum number of tasks
that can run concurrently on a single node.

Preparation task

The VM nodes created are blank Ubuntu nodes. Often, you need to install a set of
programs as prerequisites. Typically, for Linux nodes you can have a shell script that
installs the prerequisites before the actual tasks run. However, it can be any
executable program.

The shell script in this example installs Python-pip and the Azure Storage Blob SDK for
Python.

You can upload the script to an Azure Storage account and generate a SAS URI to
access the script. This process can also be automated using the Azure Storage JavaScript
SDK.

Tip

A preparation task for a job runs only on the VM nodes where the specific task
needs to run. If you want prerequisites to be installed on all nodes irrespective of
the tasks that run on them, you can use the startTask property while adding a pool.
You can use the following preparation task definition for reference.

A preparation task is specified when you submit the Azure Batch job. The following are
some configurable preparation task parameters:

id: A unique identifier for the preparation task
commandLine: Command line to execute the task executable
resourceFiles: Array of objects that describe the files to download before this
task runs. Each object has the following options:
    httpUrl: The URL of the file to download
    filePath: Local path to download and save the file
    fileMode: Only applicable for Linux nodes; fileMode is in octal format with a
    default value of 0770
waitForSuccess: If set to true, the job's tasks do not run on a node where the
preparation task fails
runElevated: Set it to true if elevated privileges are needed to run the task

Following code snippet shows the preparation task script configuration sample:

JavaScript

var jobPrepTaskConfig = {
    id: "installprereq",
    commandLine: "sudo sh startup_prereq.sh > startup.log",
    resourceFiles: [{ 'httpUrl': 'Blob sh url', 'filePath': 'startup_prereq.sh' }],
    waitForSuccess: true,
    runElevated: true,
    userIdentity: { autoUser: { elevationLevel: "admin", scope: "pool" } }
};

If there are no prerequisites to be installed for your tasks to run, you can skip the
preparation task. The following code creates a job with the display name "process csv files."

JavaScript

// Setting Batch Pool ID
const poolInfo = { poolId: poolId };
// Batch job configuration object
const jobId = "processcsvjob";
const jobConfig = {
    id: jobId,
    displayName: "process csv files",
    jobPreparationTask: jobPrepTaskConfig,
    poolInfo: poolInfo
};
// Adding Azure batch job to the pool
const job = batchClient.job.add(jobConfig, function (error, result) {
    if (error !== null) {
        console.log("An error occurred while creating the job...");
        console.log(error.response);
    }
});

Step 6: Submit Azure Batch tasks for a job


Now that our process csv job is created, let us create tasks for that job. Assuming we
have four containers, we have to create four tasks, one for each container.

If we look at the Python script, it accepts two parameters:

container name: The Storage container to download files from
pattern: An optional parameter of file name pattern

Assuming we have four containers "con1", "con2", "con3", and "con4", the following code
submits four tasks to the Azure Batch job "process csv" we created earlier.

JavaScript

// Storing container names in an array.
// Replace with the list of blob containers within your storage account.
const containerList = ["con1", "con2", "con3", "con4"];
containerList.forEach(function (val, index) {
    console.log("Submitting task for container : " + val);
    const containerName = val;
    const taskID = containerName + "_process";
    // Task configuration object
    const taskConfig = {
        id: taskID,
        displayName: 'process csv in ' + containerName,
        commandLine: 'python processcsv.py --container ' + containerName,
        resourceFiles: [{ 'httpUrl': 'Blob script url', 'filePath': 'processcsv.py' }]
    };

    const task = batchClient.task.add(jobId, taskConfig, function (error, result) {
        if (error !== null) {
            console.log("Error occurred while creating task for container " + containerName + ". Details : " + error.response);
        }
        else {
            console.log("Task for container : " + containerName + " submitted successfully");
        }
    });
});

The code adds multiple tasks to the job, and each task is executed on a node in
the pool of VMs created. If the number of tasks exceeds the number of task slots in the
pool (the number of VMs multiplied by the taskSlotsPerNode value), the remaining tasks
wait until a slot becomes available. This orchestration is handled by Azure Batch
automatically.
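
As a rough planning aid, you can estimate how many rounds of tasks a pool needs before all of them have started. The following sketch assumes every task takes about the same time, which is a simplification; estimateWaves is a hypothetical helper, not an SDK function.

```javascript
// Sketch: estimate how many "waves" of tasks a pool needs when every task
// takes roughly the same time. estimateWaves() is a hypothetical helper.
function estimateWaves(taskCount, nodeCount, taskSlotsPerNode = 1) {
  const concurrentTasks = nodeCount * taskSlotsPerNode;
  if (concurrentTasks <= 0) {
    throw new Error("The pool needs at least one task slot");
  }
  return Math.ceil(taskCount / concurrentTasks);
}

console.log(estimateWaves(4, 4));     // → 1
console.log(estimateWaves(10, 4));    // → 3
console.log(estimateWaves(10, 4, 2)); // → 2
```

In the scenario above, four tasks on four single-slot nodes all start in the first wave.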

The portal has detailed views of the task and job statuses. You can also use the list and
get functions in the Azure JavaScript SDK. Details are provided in the documentation
link.
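
Once you have a list of task objects, for example from the SDK's task list operation, a small helper can summarize progress. The following sketch works on any array of objects that have a state property; countByState and the sample data are hypothetical.

```javascript
// Sketch: count tasks by state. The input is an array of objects with a
// "state" property, such as the task objects returned by a list call.
function countByState(tasks) {
  const counts = {};
  for (const task of tasks) {
    counts[task.state] = (counts[task.state] || 0) + 1;
  }
  return counts;
}

// Sample data standing in for the result of a list call.
const summary = countByState([
  { id: "con1_process", state: "completed" },
  { id: "con2_process", state: "running" },
  { id: "con3_process", state: "active" },
  { id: "con4_process", state: "completed" }
]);

console.log(summary);
```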

Next steps
Learn about the Batch service workflow and primary resources such as pools,
nodes, jobs, and tasks.
See the Batch JavaScript reference to explore the Batch API.



Use multi-instance tasks to run Message Passing Interface (MPI) applications in Batch
Article • 04/02/2025

Multi-instance tasks allow you to run an Azure Batch task on multiple compute nodes
simultaneously. These tasks enable high performance computing scenarios like Message
Passing Interface (MPI) applications in Batch. In this article, you learn how to execute
multi-instance tasks using the Batch .NET library.

Note

While the examples in this article focus on Batch .NET, MS-MPI, and Windows
compute nodes, the multi-instance task concepts discussed here are applicable to
other platforms and technologies (Python and Intel MPI on Linux nodes, for
example).

Multi-instance task overview


In Batch, each task is normally executed on a single compute node--you submit multiple
tasks to a job, and the Batch service schedules each task for execution on a node.
However, by configuring a task's multi-instance settings, you tell Batch to instead create
one primary task and several subtasks that are then executed on multiple nodes.

When you submit a task with multi-instance settings to a job, Batch performs several
steps unique to multi-instance tasks:
1. The Batch service creates one primary and several subtasks based on the multi-
instance settings. The total number of tasks (primary plus all subtasks) matches the
number of instances (compute nodes) you specify in the multi-instance settings.
2. Batch designates one of the compute nodes as the master, and schedules the
primary task to execute on the master. It schedules the subtasks to execute on the
remainder of the compute nodes allocated to the multi-instance task, one subtask
per node.
3. The primary and all subtasks download any common resource files you specify in
the multi-instance settings.
4. After the common resource files have been downloaded, the primary and subtasks
execute the coordination command you specify in the multi-instance settings. The
coordination command is typically used to prepare nodes for executing the task.
This can include starting background services (such as Microsoft MPI's smpd.exe)
and verifying that the nodes are ready to process inter-node messages.
5. The primary task executes the application command on the master node after the
coordination command has been completed successfully by the primary and all
subtasks. The application command is the command line of the multi-instance task
itself, and is executed only by the primary task. In an MS-MPI-based solution, this
is where you execute your MPI-enabled application using mpiexec.exe.

Note

Though it is functionally distinct, the "multi-instance task" is not a unique task type
like the StartTask or JobPreparationTask. The multi-instance task is simply a
standard Batch task (CloudTask in Batch .NET) whose multi-instance settings have
been configured. In this article, we refer to this as the multi-instance task.

Requirements for multi-instance tasks


Multi-instance tasks require a pool with inter-node communication enabled, and with
concurrent task execution disabled. To disable concurrent task execution, set the
CloudPool.TaskSlotsPerNode property to 1.

Note

Batch limits the size of a pool that has inter-node communication enabled.

This code snippet shows how to create a pool for multi-instance tasks using the Batch
.NET library.
C#

CloudPool myCloudPool =
    myBatchClient.PoolOperations.CreatePool(
        poolId: "MultiInstanceSamplePool",
        targetDedicatedComputeNodes: 3,
        virtualMachineSize: "standard_d1_v2",
        virtualMachineConfiguration: new VirtualMachineConfiguration(
            imageReference: new ImageReference(
                publisher: "MicrosoftWindowsServer",
                offer: "WindowsServer",
                sku: "2019-datacenter-core",
                version: "latest"),
            nodeAgentSkuId: "batch.node.windows amd64"));

// Multi-instance tasks require inter-node communication, and those nodes
// must run only one task at a time.
myCloudPool.InterComputeNodeCommunicationEnabled = true;
myCloudPool.TaskSlotsPerNode = 1;

Note

If you try to run a multi-instance task in a pool with inter-node communication
disabled, or with a taskSlotsPerNode value greater than 1, the task is never
scheduled--it remains indefinitely in the "active" state.

Pools with InterComputeNodeCommunication enabled do not allow the automatic
deprovisioning of nodes.

Use a StartTask to install MPI


To run MPI applications with a multi-instance task, you first need to install an MPI
implementation (MS-MPI or Intel MPI, for example) on the compute nodes in the pool.
This is a good time to use a StartTask, which executes whenever a node joins a pool, or
is restarted. This code snippet creates a StartTask that specifies the MS-MPI setup
package as a resource file. The start task's command line is executed after the resource
file is downloaded to the node. In this case, the command line performs an unattended
install of MS-MPI.

C#

// Create a StartTask for the pool which we use for installing MS-MPI on
// the nodes as they join the pool (or when they are restarted).
StartTask startTask = new StartTask
{
    CommandLine = "cmd /c MSMpiSetup.exe -unattend -force",
    ResourceFiles = new List<ResourceFile> {
        new ResourceFile("https://mystorageaccount.blob.core.windows.net/mycontainer/MSMpiSetup.exe",
                         "MSMpiSetup.exe") },
    UserIdentity = new UserIdentity(new AutoUserSpecification(elevationLevel: ElevationLevel.Admin)),
    WaitForSuccess = true
};
myCloudPool.StartTask = startTask;

// Commit the fully configured pool to the Batch service to actually create
// the pool and its compute nodes.
await myCloudPool.CommitAsync();

Remote direct memory access (RDMA)


When you choose an RDMA-capable size such as A9 for the compute nodes in your
Batch pool, your MPI application can take advantage of Azure's high-performance, low-
latency remote direct memory access (RDMA) network.

Look for the sizes specified as "RDMA capable" in Sizes for virtual machines in Azure (for
VirtualMachineConfiguration pools) or Sizes for Cloud Services (for
CloudServicesConfiguration pools).

Note

To take advantage of RDMA on Linux compute nodes, you must use Intel MPI on
the nodes.

Create a multi-instance task with Batch .NET


Now that we've covered the pool requirements and MPI package installation, let's create
the multi-instance task. In this snippet, we create a standard CloudTask, then configure
its MultiInstanceSettings property. As mentioned earlier, the multi-instance task is not a
distinct task type, but a standard Batch task configured with multi-instance settings.

C#

// Create the multi-instance task. Its command line is the "application command"
// and will be executed *only* by the primary, and only after the primary and
// subtasks execute the CoordinationCommandLine.
CloudTask myMultiInstanceTask = new CloudTask(id: "mymultiinstancetask",
    commandline: "cmd /c mpiexec.exe -wdir %AZ_BATCH_TASK_SHARED_DIR% MyMPIApplication.exe");

// Configure the task's MultiInstanceSettings. The CoordinationCommandLine will be
// executed by the primary and all subtasks.
myMultiInstanceTask.MultiInstanceSettings =
    new MultiInstanceSettings(numberOfNodes) {
        CoordinationCommandLine = @"cmd /c start cmd /c ""%MSMPI_BIN%\smpd.exe"" -d",
        CommonResourceFiles = new List<ResourceFile> {
            new ResourceFile("https://mystorageaccount.blob.core.windows.net/mycontainer/MyMPIApplication.exe",
                             "MyMPIApplication.exe")
        }
    };

// Submit the task to the job. Batch will take care of splitting it into subtasks and
// scheduling them for execution on the nodes.
await myBatchClient.JobOperations.AddTaskAsync("mybatchjob", myMultiInstanceTask);

Primary task and subtasks


When you create the multi-instance settings for a task, you specify the number of
compute nodes that are to execute the task. When you submit the task to a job, the
Batch service creates one primary task and enough subtasks that together match the
number of nodes you specified.

These tasks are assigned an integer ID in the range of 0 to numberOfInstances - 1. The
task with ID 0 is the primary task, and all other IDs are subtasks. For example, if you
create the following multi-instance settings for a task, the primary task would have an ID
of 0, and the subtasks would have IDs 1 through 9.

C#

int numberOfNodes = 10;
myMultiInstanceTask.MultiInstanceSettings = new MultiInstanceSettings(numberOfNodes);
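
The ID layout described above can be written out as a small function. This JavaScript sketch only restates the numbering rule (ID 0 is the primary, the rest are subtasks) for illustration; describeInstances is a hypothetical helper and does not call the Batch service.

```javascript
// Sketch: the ID layout described above for a multi-instance task.
// describeInstances() is a hypothetical helper for illustration only.
function describeInstances(numberOfInstances) {
  if (!Number.isInteger(numberOfInstances) || numberOfInstances < 1) {
    throw new Error("numberOfInstances must be a positive integer");
  }
  const ids = Array.from({ length: numberOfInstances }, (_, id) => id);
  return { primaryId: ids[0], subtaskIds: ids.slice(1) };
}

const layout = describeInstances(10);
console.log(layout.primaryId);         // → 0
console.log(layout.subtaskIds.length); // → 9
```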

Master node
When you submit a multi-instance task, the Batch service designates one of the
compute nodes as the "master" node, and schedules the primary task to execute on the
master node. The subtasks are scheduled to execute on the remainder of the nodes
allocated to the multi-instance task.
Coordination command
The coordination command is executed by both the primary and subtasks.

The invocation of the coordination command is blocking--Batch does not execute the
application command until the coordination command has returned successfully for all
subtasks. The coordination command should therefore start any required background
services, verify that they are ready for use, and then exit. For example, this coordination
command for a solution using MS-MPI version 7 starts the SMPD service on the node,
then exits:

cmd /c start cmd /c ""%MSMPI_BIN%\smpd.exe"" -d

Note the use of start in this coordination command. This is required because the
smpd.exe application does not return immediately after execution. Without the use of
the start command, this coordination command would not return, and would therefore
block the application command from running.

Application command
Once the primary task and all subtasks have finished executing the coordination
command, the multi-instance task's command line is executed by the primary task only.
We call this the application command to distinguish it from the coordination command.

For MS-MPI applications, use the application command to execute your MPI-enabled
application with mpiexec.exe. For example, here is an application command for a
solution using MS-MPI version 7:

cmd /c ""%MSMPI_BIN%\mpiexec.exe"" -c 1 -wdir %AZ_BATCH_TASK_SHARED_DIR% MyMPIApplication.exe

Note

Because MS-MPI's mpiexec.exe uses the CCP_NODES variable by default (see
Environment variables), the example application command line above excludes it.

Environment variables
Batch creates several environment variables specific to multi-instance tasks on the
compute nodes allocated to a multi-instance task. Your coordination and application
command lines can reference these environment variables, as can the scripts and
programs they execute.

The following environment variables are created by the Batch service for use by multi-
instance tasks:

CCP_NODES
AZ_BATCH_NODE_LIST
AZ_BATCH_HOST_LIST
AZ_BATCH_MASTER_NODE
AZ_BATCH_TASK_SHARED_DIR
AZ_BATCH_IS_CURRENT_NODE_MASTER

For full details on these and the other Batch compute node environment variables,
including their contents and visibility, see Compute node environment variables.

Tip

The Batch Linux MPI code sample contains an example of how several of these
environment variables can be used.
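
As an illustration of how a script started by a coordination or application command might read these variables, here is a minimal Node.js sketch. The value formats are assumptions to verify against Compute node environment variables: it treats AZ_BATCH_HOST_LIST as a comma-separated list and AZ_BATCH_IS_CURRENT_NODE_MASTER as "true" or "false". The readMultiInstanceEnv helper and the sample values are made up for this example.

```javascript
// Sketch: read multi-instance environment variables in a Node.js script.
// Assumed formats (check "Compute node environment variables"):
//   AZ_BATCH_HOST_LIST              comma-separated node addresses
//   AZ_BATCH_IS_CURRENT_NODE_MASTER "true" or "false"
function readMultiInstanceEnv(env) {
  const hostList = env.AZ_BATCH_HOST_LIST || "";
  return {
    hosts: hostList.split(",").map((h) => h.trim()).filter((h) => h.length > 0),
    isMaster: String(env.AZ_BATCH_IS_CURRENT_NODE_MASTER).toLowerCase() === "true",
    masterNode: env.AZ_BATCH_MASTER_NODE,
    sharedDir: env.AZ_BATCH_TASK_SHARED_DIR
  };
}

// Example with made-up sample values; on a compute node, pass process.env.
const info = readMultiInstanceEnv({
  AZ_BATCH_HOST_LIST: "10.0.0.4,10.0.0.5",
  AZ_BATCH_IS_CURRENT_NODE_MASTER: "true",
  AZ_BATCH_MASTER_NODE: "10.0.0.4:6000",
  AZ_BATCH_TASK_SHARED_DIR: "<shared-directory-path>"
});

console.log(info.hosts.length, info.isMaster);
```

Reading the shared directory from the variable, rather than building the path by hand, keeps the script portable across nodes.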

Resource files
There are two sets of resource files to consider for multi-instance tasks: common
resource files that all tasks download (both primary and subtasks), and the resource
files specified for the multi-instance task itself, which only the primary task downloads.

You can specify one or more common resource files in the multi-instance settings for a
task. These common resource files are downloaded from Azure Storage into each node's
task shared directory by the primary and all subtasks. You can access the task shared
directory from application and coordination command lines by using the
AZ_BATCH_TASK_SHARED_DIR environment variable. The AZ_BATCH_TASK_SHARED_DIR path is
identical on every node allocated to the multi-instance task, thus you can share a single
coordination command between the primary and all subtasks. Batch does not "share"
the directory in a remote access sense, but you can use it as a mount or share point as
mentioned earlier in the tip on environment variables.

Resource files that you specify for the multi-instance task itself are downloaded to the
task's working directory, AZ_BATCH_TASK_WORKING_DIR , by default. As mentioned, in
contrast to common resource files, only the primary task downloads resource files
specified for the multi-instance task itself.
Important

Always use the environment variables AZ_BATCH_TASK_SHARED_DIR and
AZ_BATCH_TASK_WORKING_DIR to refer to these directories in your command lines. Do
not attempt to construct the paths manually.

Task lifetime
The lifetime of the primary task controls the lifetime of the entire multi-instance task.
When the primary exits, all of the subtasks are terminated. The exit code of the primary
is the exit code of the task, and is therefore used to determine the success or failure of
the task for retry purposes.

If any of the subtasks fail, exiting with a non-zero return code, for example, the entire
multi-instance task fails. The multi-instance task is then terminated and retried, up to its
retry limit.

When you delete a multi-instance task, the primary and all subtasks are also deleted by
the Batch service. All subtask directories and their files are deleted from the compute
nodes, just as for a standard task.

TaskConstraints for a multi-instance task, such as the MaxTaskRetryCount,
MaxWallClockTime, and RetentionTime properties, are honored as they are for a
standard task, and apply to the primary and all subtasks. However, if you change
the RetentionTime property after adding the multi-instance task to the job, this change is
applied only to the primary task, and all of the subtasks continue to use the original
RetentionTime.

A compute node's recent task list reflects the ID of a subtask if the recent task was part
of a multi-instance task.

Obtain information about subtasks


To obtain information on subtasks by using the Batch .NET library, call the
CloudTask.ListSubtasks method. This method returns information on all subtasks, and
information about the compute node that executed the tasks. From this information,
you can determine each subtask's root directory, the pool ID, its current state, exit code,
and more. You can use this information in combination with the
PoolOperations.GetNodeFile method to obtain the subtask's files. Note that this method
does not return information for the primary task (ID 0).
Note

Unless otherwise stated, Batch .NET methods that operate on the multi-instance
CloudTask itself apply only to the primary task. For example, when you call the
CloudTask.ListNodeFiles method on a multi-instance task, only the primary task's
files are returned.

The following code snippet shows how to obtain subtask information, as well as request
file contents from the nodes on which they executed.

C#

// Obtain the job and the multi-instance task from the Batch service
CloudJob boundJob = batchClient.JobOperations.GetJob("mybatchjob");
CloudTask myMultiInstanceTask = boundJob.GetTask("mymultiinstancetask");

// Now obtain the list of subtasks for the task
IPagedEnumerable<SubtaskInformation> subtasks =
    myMultiInstanceTask.ListSubtasks();

// Asynchronously iterate over the subtasks and print their stdout and stderr
// output if the subtask has completed
await subtasks.ForEachAsync(async (subtask) =>
{
    Console.WriteLine("subtask: {0}", subtask.Id);
    Console.WriteLine("exit code: {0}", subtask.ExitCode);

    if (subtask.State == SubtaskState.Completed)
    {
        // Look up the compute node on which the subtask ran
        ComputeNode node =
            await batchClient.PoolOperations.GetComputeNodeAsync(
                subtask.ComputeNodeInformation.PoolId,
                subtask.ComputeNodeInformation.ComputeNodeId);

        // Request the subtask's stdout and stderr files from that node
        NodeFile stdOutFile = await node.GetNodeFileAsync(
            subtask.ComputeNodeInformation.TaskRootDirectory +
            "\\" + Constants.StandardOutFileName);
        NodeFile stdErrFile = await node.GetNodeFileAsync(
            subtask.ComputeNodeInformation.TaskRootDirectory +
            "\\" + Constants.StandardErrorFileName);
        string stdOut = await stdOutFile.ReadAsStringAsync();
        string stdErr = await stdErrFile.ReadAsStringAsync();

        Console.WriteLine("node: {0}:", node.Id);
        Console.WriteLine("stdout.txt: {0}", stdOut);
        Console.WriteLine("stderr.txt: {0}", stdErr);
    }
    else
    {
        Console.WriteLine("\tSubtask {0} is in state {1}", subtask.Id,
            subtask.State);
    }
});

Code sample
The MultiInstanceTasks code sample on GitHub demonstrates how to use a multi-
instance task to run an MS-MPI application on Batch compute nodes. Follow the steps
below to run the sample.

Preparation
1. Download the MS-MPI SDK and Redist installers and install them. After installation
you can verify that the MS-MPI environment variables have been set.
2. Build a Release version of the MPIHelloWorld sample MPI program. This is the
program that will be run on compute nodes by the multi-instance task.
3. Create a zip file containing MPIHelloWorld.exe (which you built in step 2) and
MSMpiSetup.exe (which you downloaded in step 1). You'll upload this zip file as an
application package in the next step.


4. Use the Azure portal to create a Batch application called "MPIHelloWorld", and
specify the zip file you created in the previous step as version "1.0" of the
application package. See Upload and manage applications for more information.

 Tip

Building a Release version of MPIHelloWorld.exe ensures that you don't have to
include any additional dependencies (for example, msvcp140d.dll or
vcruntime140d.dll) in your application package.

Execution
1. Download the azure-batch-samples .zip file from GitHub.

2. Open the MultiInstanceTasks solution in Visual Studio 2019. The
MultiInstanceTasks.sln solution file is located in:

azure-batch-samples\CSharp\ArticleProjects\MultiInstanceTasks\

3. Enter your Batch and Storage account credentials in AccountSettings.settings in
the Microsoft.Azure.Batch.Samples.Common project.

4. Build and run the MultiInstanceTasks solution to execute the MPI sample
application on compute nodes in a Batch pool.

5. Optional: Use the Azure portal or Batch Explorer to examine the sample pool,
job, and task ("MultiInstanceSamplePool", "MultiInstanceSampleJob",
"MultiInstanceSampleTask") before you delete the resources.

 Tip

You can download Visual Studio Community for free if you don't already have
Visual Studio.

Output from MultiInstanceTasks.exe is similar to the following:

Creating pool [MultiInstanceSamplePool]...
Creating job [MultiInstanceSampleJob]...
Adding task [MultiInstanceSampleTask] to job [MultiInstanceSampleJob]...
Awaiting task completion, timeout in 00:30:00...

Main task [MultiInstanceSampleTask] is in state [Completed] and ran on
compute node [tvm-1219235766_1-20161017t162002z]:
---- stdout.txt ----
Rank 2 received string "Hello world" from Rank 0
Rank 1 received string "Hello world" from Rank 0

---- stderr.txt ----

Main task completed, waiting 00:00:10 for subtasks to complete...

---- Subtask information ----
subtask: 1
exit code: 0
node: tvm-1219235766_3-20161017t162002z
stdout.txt:
stderr.txt:
subtask: 2
exit code: 0
node: tvm-1219235766_2-20161017t162002z
stdout.txt:
stderr.txt:

Delete job? [yes] no: yes
Delete pool? [yes] no: yes
Sample complete, hit ENTER to exit...

Next steps
Read more about MPI support for Linux on Azure Batch.
Learn how to create pools of Linux compute nodes for use in your Azure Batch MPI
solutions.



Use Azure Batch to run container workloads
Article • 06/05/2024

Caution

This article references CentOS, a Linux distribution that is nearing End Of Life (EOL)
status. Please consider your use and planning accordingly. For more information,
see the CentOS End Of Life guidance.

Azure Batch lets you run and scale large numbers of batch computing jobs on Azure.
Batch tasks can run directly on virtual machines (nodes) in a Batch pool, but you can
also set up a Batch pool to run tasks in Docker-compatible containers on the nodes. This
article shows you how to create a pool of compute nodes that support running
container tasks, and then run container tasks on the pool.

The code examples here use the Batch .NET and Python SDKs. You can also use other
Batch SDKs and tools, including the Azure portal, to create container-enabled Batch
pools and to run container tasks.

Why use containers?


Containers provide an easy way to run Batch tasks without having to manage an
environment and dependencies to run applications. Containers deploy applications as
lightweight, portable, self-sufficient units that can run in several different environments.
For example, build and test a container locally, then upload the container image to a
registry in Azure or elsewhere. The container deployment model ensures that the
runtime environment of your application is always correctly installed and configured
wherever you host the application. Container-based tasks in Batch can also take
advantage of features of non-container tasks, including application packages and
management of resource files and output files.

Prerequisites
You should be familiar with container concepts and how to create a Batch pool and job.

SDK versions: The Batch SDKs support container images as of the following
versions:
Batch REST API version 2017-09-01.6.0
Batch .NET SDK version 8.0.0
Batch Python SDK version 4.0
Batch Java SDK version 3.0
Batch Node.js SDK version 3.0

Accounts: In your Azure subscription, you need to create a Batch account and
optionally an Azure Storage account.

A supported virtual machine (VM) image: Containers are only supported in pools
created with the Virtual Machine Configuration, from a supported image (listed in
the next section). If you provide a custom image, see the considerations in the
following section and the requirements in Use a managed image to create a
custom image pool.

Note

Starting with the following Batch SDK versions, containerConfiguration requires the Type
property to be set:

Batch .NET SDK version 16.0.0
Batch Python SDK version 14.0.0
Batch Java SDK version 11.0.0
Batch Node.js SDK version 11.0.0

The supported values are ContainerType.DockerCompatible and
ContainerType.CriCompatible.

Keep in mind the following limitations:

Batch provides remote direct memory access (RDMA) support only for containers
that run on Linux pools.
For Windows container workloads, you should choose a multicore VM size for your
pool.

Important

Docker, by default, creates a network bridge with a subnet specification of
172.17.0.0/16. If you are specifying a virtual network for your pool, ensure that
there are no conflicting IP ranges.
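
One way to check a planned virtual network range against the default Docker bridge is with Python's standard ipaddress module. This is a sketch only; the function name is hypothetical, and it assumes the Docker default of 172.17.0.0/16 stated above has not been changed on the nodes.

```python
import ipaddress

# Default Docker bridge subnet, as stated above.
DOCKER_DEFAULT_BRIDGE = ipaddress.ip_network("172.17.0.0/16")


def conflicts_with_docker_bridge(cidr):
    """Return True if the given CIDR range overlaps the Docker bridge subnet."""
    return ipaddress.ip_network(cidr).overlaps(DOCKER_DEFAULT_BRIDGE)
```

For example, `conflicts_with_docker_bridge("172.16.0.0/12")` is true because that range contains 172.17.0.0/16, while `conflicts_with_docker_bridge("10.0.0.0/16")` is false.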


Supported VM images
Use one of the following supported Windows or Linux images to create a pool of VM
compute nodes for container workloads. For more information about Marketplace
images that are compatible with Batch, see List of virtual machine images.

Windows support
Batch supports Windows server images that have container support designations. The
API to list all supported images in Batch denotes a DockerCompatible capability if the
image supports Docker containers. Batch allows, but doesn't directly support, images
published by Mirantis with capability noted as DockerCompatible . These images may
only be deployed under a User Subscription pool allocation mode Batch account.

You can also create a custom image to enable container functionality on Windows.

Note

The image SKUs -with-containers or -with-containers-smalldisk are retired.
Please see the announcement for details and alternative container runtime
options.

Linux support
For Linux container workloads, Batch currently supports the following Linux images
published in the Azure Marketplace without the need for a custom image.

Publisher: microsoft-dsvm
Offer: ubuntu-hpc
Publisher: almalinux
Offer: 8-hpc-gen1
Offer: 8-hpc-gen2

Alternate image options

Currently there are other images published by microsoft-azure-batch that support
container workloads:

Publisher: microsoft-azure-batch
Offer: centos-container
Offer: centos-container-rdma (For use exclusively on VM SKUs with Infiniband)
Offer: ubuntu-server-container
Offer: ubuntu-server-container-rdma (For use exclusively on VM SKUs with
Infiniband)

Warning

Use images other than those published by microsoft-azure-batch, because these
images are deprecated due to imminent image end-of-life.

Notes

The docker data root of the above images lies in different places:

For the Azure Batch published microsoft-azure-batch images (Offer:
centos-container-rdma, etc.), the docker data root is mapped to /mnt/batch/docker, which
is located on the temporary disk.

For the HPC image, or microsoft-dsvm (Offer: ubuntu-hpc, etc.), the docker data
root is unchanged from the Docker default, which is /var/lib/docker on Linux and
C:\ProgramData\Docker on Windows. These folders are located on the OS disk.

For non-Batch published images, the OS disk has the potential risk of being filled up
quickly as container images are downloaded.

Potential solutions for customers


Change the docker data root in a start task when creating a pool in BatchExplorer.
Here's an example of the Start Task command:

Bash

1) sudo systemctl stop docker
2) sudo vi /lib/systemd/system/docker.service
+++
FROM:
ExecStart=/usr/bin/docker daemon -H fd://
TO:
ExecStart=/usr/bin/docker daemon -g /new/path/docker -H fd://
+++
3) sudo systemctl daemon-reload
4) sudo systemctl start docker

The images published by microsoft-azure-batch are only supported for use in Azure Batch
pools and are geared for Docker container execution. They feature:

A pre-installed Docker-compatible Moby container runtime.
Pre-installed NVIDIA GPU drivers and NVIDIA container runtime, to streamline
deployment on Azure N-series VMs.
VM images with the suffix of -rdma are pre-configured with support for InfiniBand
RDMA VM sizes. These VM images shouldn't be used with VM sizes that don't have
InfiniBand support.

You can also create custom images compatible with Batch containers on one of the Linux
distributions that are compatible with Batch. For Docker support on a custom image,
install a suitable Docker-compatible runtime, such as a version of Docker or Mirantis
Container Runtime. Installing just a Docker-CLI compatible tool is insufficient; a
Docker Engine compatible runtime is required.

Important

Neither Microsoft nor Azure Batch will provide support for issues related to Docker
(any version or edition), Mirantis Container Runtime, or Moby runtimes. Customers
electing to use these runtimes in their images should reach out to the company or
entity providing support for runtime issues.

More considerations for using a custom Linux image:

To take advantage of the GPU performance of Azure N-series sizes when using a
custom image, pre-install NVIDIA drivers. Also, you need to install the Docker
Engine Utility for NVIDIA GPUs, NVIDIA Docker.
To access the Azure RDMA network, use an RDMA-capable VM size. Necessary
RDMA drivers are installed in the CentOS HPC and Ubuntu images supported by
Batch. Extra configuration may be needed to run MPI workloads. See Use RDMA or
GPU instances in Batch pool.

Container configuration for Batch pool


To enable a Batch pool to run container workloads, you must specify
ContainerConfiguration settings in the pool's VirtualMachineConfiguration object. This
article provides links to the Batch .NET API reference. Corresponding settings are in the
Batch Python API.
You can create a container-enabled pool with or without prefetched container images,
as shown in the following examples. The pull (or prefetch) process lets you preload
container images from either Docker Hub or another container registry on the Internet.
For best performance, use an Azure container registry in the same region as the Batch
account.

The advantage of prefetching container images is that when tasks first start running,
they don't have to wait for the container image to download. The container
configuration pulls container images to the VMs when the pool is created. Tasks that run
on the pool can then reference the list of container images and container run options.

Note

Docker Hub limits the number of image pulls. Ensure that your workload doesn't
exceed published rate limits for Docker Hub-based images. It's recommended to
use Azure Container Registry directly or leverage Artifact cache in ACR.
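
To audit which prefetched images would be pulled from Docker Hub, you can classify image names by registry. The sketch below assumes Docker's usual naming convention, in which the first path component is treated as a registry host only if it contains a dot or a colon, or is `localhost`; otherwise the image resolves to Docker Hub (`docker.io`). The function names are hypothetical.

```python
def registry_host(image_name):
    """Return the registry host implied by a container image name."""
    first, sep, _rest = image_name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    # No explicit registry host: the image is resolved against Docker Hub.
    return "docker.io"


def docker_hub_images(image_names):
    """Return the image names that would be pulled from Docker Hub."""
    return [name for name in image_names if registry_host(name) == "docker.io"]
```

Running `docker_hub_images` over a pool's list of container image names highlights the entries that are subject to Docker Hub pull limits.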

Pool without prefetched container images


To configure a container-enabled pool without prefetched container images, define
ContainerConfiguration and VirtualMachineConfiguration objects as shown in the
following examples. These examples use the ubuntu-hpc image published by
microsoft-dsvm in the Marketplace.

Note: The Ubuntu version used in the examples is for illustration purposes. Change the
node_agent_sku_id to match the version you're using.

Python

image_ref_to_use = batch.models.ImageReference(
    publisher='microsoft-dsvm',
    offer='ubuntu-hpc',
    sku='2204',
    version='latest')

# Specify container configuration. This is required even though there are no
# prefetched images.
container_conf = batch.models.ContainerConfiguration()

new_pool = batch.models.PoolAddParameter(
    id=pool_id,
    virtual_machine_configuration=batch.models.VirtualMachineConfiguration(
        image_reference=image_ref_to_use,
        container_configuration=container_conf,
        node_agent_sku_id='batch.node.ubuntu 22.04'),
    vm_size='STANDARD_D2S_V3',
    target_dedicated_nodes=1)
...

C#

ImageReference imageReference = new ImageReference(
    publisher: "microsoft-dsvm",
    offer: "ubuntu-hpc",
    sku: "2204",
    version: "latest");

// Specify container configuration. This is required even though there are
// no prefetched images.
ContainerConfiguration containerConfig = new ContainerConfiguration();

// VM configuration
VirtualMachineConfiguration virtualMachineConfiguration = new
    VirtualMachineConfiguration(
        imageReference: imageReference,
        nodeAgentSkuId: "batch.node.ubuntu 22.04");
virtualMachineConfiguration.ContainerConfiguration = containerConfig;

// Create pool
CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: poolId,
    targetDedicatedComputeNodes: 1,
    virtualMachineSize: "STANDARD_D2S_V3",
    virtualMachineConfiguration: virtualMachineConfiguration);

Prefetch images for container configuration


To prefetch container images on the pool, add the list of container images
( container_image_names in Python) to the ContainerConfiguration .

The following basic Python example shows how to prefetch a standard Ubuntu
container image from Docker Hub.

Python

image_ref_to_use = batch.models.ImageReference(
    publisher='microsoft-dsvm',
    offer='ubuntu-hpc',
    sku='2204',
    version='latest')

# Specify container configuration, fetching the official Ubuntu container
# image from Docker Hub.
container_conf = batch.models.ContainerConfiguration(
    container_image_names=['ubuntu'])

new_pool = batch.models.PoolAddParameter(
    id=pool_id,
    virtual_machine_configuration=batch.models.VirtualMachineConfiguration(
        image_reference=image_ref_to_use,
        container_configuration=container_conf,
        node_agent_sku_id='batch.node.ubuntu 22.04'),
    vm_size='STANDARD_D2S_V3',
    target_dedicated_nodes=1)
...

The following C# example assumes that you want to prefetch a TensorFlow image from
Docker Hub. This example includes a start task that runs in the VM host on the pool
nodes. You might run a start task in the host, for example, to mount a file server that can
be accessed from the containers.

C#

ImageReference imageReference = new ImageReference(
    publisher: "microsoft-dsvm",
    offer: "ubuntu-hpc",
    sku: "2204",
    version: "latest");

ContainerRegistry containerRegistry = new ContainerRegistry(
    registryServer: "https://hub.docker.com",
    userName: "UserName",
    password: "YourPassword"
);

// Specify container configuration, prefetching Docker images
ContainerConfiguration containerConfig = new ContainerConfiguration();
containerConfig.ContainerImageNames = new List<string> {
    "tensorflow/tensorflow:latest-gpu" };
containerConfig.ContainerRegistries = new List<ContainerRegistry> {
    containerRegistry };

// VM configuration
VirtualMachineConfiguration virtualMachineConfiguration = new
    VirtualMachineConfiguration(
        imageReference: imageReference,
        nodeAgentSkuId: "batch.node.ubuntu 22.04");
virtualMachineConfiguration.ContainerConfiguration = containerConfig;

// Set a native host command line start task
StartTask startTaskContainer = new StartTask(
    commandLine: "<native-host-command-line>" );

// Create pool
CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: poolId,
    virtualMachineSize: "Standard_NC6S_V3",
    virtualMachineConfiguration: virtualMachineConfiguration);

// Assign the start task to the pool
pool.StartTask = startTaskContainer;
...

Prefetch images from a private container registry


You can also prefetch container images by authenticating to a private container registry
server. In the following examples, the ContainerConfiguration and
VirtualMachineConfiguration objects prefetch a private TensorFlow image from a
private Azure container registry. The image reference is the same as in the previous
example.

Python

image_ref_to_use = batch.models.ImageReference(
    publisher='microsoft-dsvm',
    offer='ubuntu-hpc',
    sku='2204',
    version='latest')

# Specify a container registry
container_registry = batch.models.ContainerRegistry(
    registry_server="myRegistry.azurecr.io",
    user_name="myUsername",
    password="myPassword")

# Create container configuration, prefetching Docker images from the
# container registry
container_conf = batch.models.ContainerConfiguration(
    container_image_names=["myRegistry.azurecr.io/samples/myImage"],
    container_registries=[container_registry])

new_pool = batch.models.PoolAddParameter(
    id="myPool",
    virtual_machine_configuration=batch.models.VirtualMachineConfiguration(
        image_reference=image_ref_to_use,
        container_configuration=container_conf,
        node_agent_sku_id='batch.node.ubuntu 22.04'),
    vm_size='STANDARD_D2S_V3',
    target_dedicated_nodes=1)

C#

// Specify a container registry
ContainerRegistry containerRegistry = new ContainerRegistry(
    registryServer: "myContainerRegistry.azurecr.io",
    userName: "myUserName",
    password: "myPassword");

// Create container configuration, prefetching Docker images from the
// container registry
ContainerConfiguration containerConfig = new ContainerConfiguration();
containerConfig.ContainerImageNames = new List<string> {
    "myContainerRegistry.azurecr.io/tensorflow/tensorflow:latest-gpu" };
containerConfig.ContainerRegistries = new List<ContainerRegistry> {
    containerRegistry };

// VM configuration
VirtualMachineConfiguration virtualMachineConfiguration = new
    VirtualMachineConfiguration(
        imageReference: imageReference,
        nodeAgentSkuId: "batch.node.ubuntu 22.04");
virtualMachineConfiguration.ContainerConfiguration = containerConfig;

// Create pool
CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: poolId,
    targetDedicatedComputeNodes: 2,
    virtualMachineSize: "Standard_NC6S_V3",
    virtualMachineConfiguration: virtualMachineConfiguration);
...

Managed identity support for ACR


When you access containers stored in Azure Container Registry, either a
username/password or a managed identity can be used to authenticate with the service.
To use a managed identity, first ensure that the identity has been assigned to the pool
and that the identity has the AcrPull role assigned for the container registry you wish
to access. Then, tell Batch which identity to use when authenticating with ACR.

C#

ContainerRegistry containerRegistry = new ContainerRegistry(
    registryServer: "myContainerRegistry.azurecr.io",
    identityReference: new ComputeNodeIdentityReference() { ResourceId =
        "/subscriptions/SUB/resourceGroups/RG/providers/Microsoft.ManagedIdentity/userAssignedIdentities/identity-name" }
);

// Create container configuration, prefetching Docker images from the
// container registry
ContainerConfiguration containerConfig = new ContainerConfiguration();
containerConfig.ContainerImageNames = new List<string> {
    "myContainerRegistry.azurecr.io/tensorflow/tensorflow:latest-gpu" };
containerConfig.ContainerRegistries = new List<ContainerRegistry> {
    containerRegistry };

// VM configuration
VirtualMachineConfiguration virtualMachineConfiguration = new
    VirtualMachineConfiguration(
        imageReference: imageReference,
        nodeAgentSkuId: "batch.node.ubuntu 22.04");
virtualMachineConfiguration.ContainerConfiguration = containerConfig;

// Create pool
CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: poolId,
    targetDedicatedComputeNodes: 2,
    virtualMachineSize: "Standard_NC6S_V3",
    virtualMachineConfiguration: virtualMachineConfiguration);
...
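
The resource ID passed as the identity reference follows the pattern shown in the C# example. As a small convenience, it can be assembled from its parts; this is a sketch with a hypothetical helper name, and the argument values are placeholders.

```python
def user_assigned_identity_id(subscription_id, resource_group, identity_name):
    """Build the resource ID of a user-assigned managed identity."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.ManagedIdentity"
        f"/userAssignedIdentities/{identity_name}"
    )
```

Calling `user_assigned_identity_id("SUB", "RG", "identity-name")` yields the same string that the C# example assigns to ResourceId.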

Container settings for the task


To run a container task on a container-enabled pool, specify container-specific settings.
Settings include the image to use, registry, and container run options.

Use the ContainerSettings property of the task classes to configure
container-specific settings. These settings are defined by the TaskContainerSettings
class. The --rm container option doesn't require another --runtime option since it's
taken care of by Batch.

If you run tasks on container images, the cloud task and job manager task require
container settings. However, the start task, job preparation task, and job release
task don't require container settings (that is, they can run within a container
context or directly on the node).

For Linux, Batch maps the user/group permission to the container. If access to any
folder within the container requires Administrator permission, you may need to run
the task as pool scope with admin elevation level. This ensures that Batch runs the
task as root in the container context. Otherwise, a non-admin user might not have
access to those folders.

For container pools with GPU-enabled hardware, Batch automatically enables GPU
for container tasks, so you shouldn't include the --gpus argument.

Container task command line


When you run a container task, Batch automatically uses the docker create command
to create a container using the image specified in the task. Batch then controls task
execution in the container.

As with non-container Batch tasks, you set a command line for a container task. Because
Batch automatically creates the container, the command line only specifies the
command or commands that run in the container.

The following are the default behaviors Batch applies to Docker container tasks:

Batch will run the container with the specified task commandline as the CMD .
Batch won't interfere with the specified ENTRYPOINT of the container image.
Batch will override the WORKDIR with the Batch task working directory.

Ensure that you review the Docker documentation on how ENTRYPOINT and CMD interact so
you understand the effects that can arise when container images have a
specified ENTRYPOINT and you also specify a task commandline.

If you would like to override the container image ENTRYPOINT, you can specify the
--entrypoint <args> argument as a containerRunOption. Refer to the optional
ContainerRunOptions for arguments that you can provide to the docker create
command that Batch uses to create and run the container. For example, to set a working
directory for the container, set the --workdir <directory> option.

The following are some examples of container image and Batch container options or
task command lines and their effect:

Container image ENTRYPOINT isn't specified, and Batch task commandline is
"/bin/sh -c python myscript.py".
  Batch creates the container with the Batch task commandline as specified and
  runs it in the Batch task working directory. This may result in failure if
  "myscript.py" isn't in the Batch task working directory.
  If the task commandline was specified as "/bin/sh -c python
  /path/to/script/myscript.py", then this task may work correctly even with the
  working directory set as the Batch task working directory if all dependencies for
  the script are satisfied.
Container image ENTRYPOINT is specified as "./myscript.sh", and Batch task
commandline is empty.
  Batch creates the container relying on the ENTRYPOINT and runs it in the Batch
  task working directory. This task may result in failure if the container image
  WORKDIR isn't the same as the Batch task working directory, which is
  dependent upon various factors such as the operating system, job ID, task ID,
  etc.
  If "--workdir /path/to/script" was specified as a containerRunOption, then this
  task may work correctly if all dependencies for the script are satisfied.
Container image ENTRYPOINT isn't specified, Batch task commandline is
"./myscript.sh", and WORKDIR is overridden in ContainerRunOptions as "--workdir
/path/to/script".
  Batch creates the container with the working directory set to "/path/to/script" and
  executes the commandline "./myscript.sh", which is successful as the script is
  found in the specified working directory.
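
The three default behaviors and the effect of run options can be modeled as a small function. This is an illustrative sketch for reasoning about the cases above, not how Batch is implemented; the function name, parameters, and returned dictionary are hypothetical, and only the --workdir (or -w) and --entrypoint options are interpreted.

```python
import shlex


def effective_container_settings(image_entrypoint, task_command_line,
                                 container_run_options, batch_task_working_dir):
    """Model the ENTRYPOINT, CMD, and working directory a task would run with."""
    entrypoint = image_entrypoint          # left alone unless --entrypoint is given
    workdir = batch_task_working_dir       # the image WORKDIR is overridden by default
    options = shlex.split(container_run_options)
    i = 0
    while i < len(options):
        name, sep, value = options[i].partition("=")
        if name in ("--workdir", "-w", "--entrypoint"):
            if not sep and i + 1 < len(options):
                # Value supplied as the next argument, for example "--workdir /app"
                i += 1
                value = options[i]
            if name == "--entrypoint":
                entrypoint = value
            else:
                workdir = value
        i += 1
    return {
        "entrypoint": entrypoint,
        "cmd": shlex.split(task_command_line),  # task command line becomes CMD
        "workdir": workdir,
    }
```

For the third case above, passing no image ENTRYPOINT, the command line "./myscript.sh", and the run option "--workdir /path/to/script" gives a working directory of /path/to/script, so the relative script path resolves.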

Container task working directory


A Batch container task executes in a working directory in the container that's similar to
the directory that Batch sets up for a regular (non-container) task. This working directory
is different from the WORKDIR if configured in the image, or the default container
working directory ( C:\ on a Windows container, or / on a Linux container).

For a Batch container task:

All directories recursively below the AZ_BATCH_NODE_ROOT_DIR on the host node (the
root of Azure Batch directories) are mapped into the container.
All task environment variables are mapped into the container.
The task working directory AZ_BATCH_TASK_WORKING_DIR on the node is set the same
as for a regular task and mapped into the container.

Important

For Windows container pools on VM families with ephemeral disks, the entire
ephemeral disk is mapped to container space due to Windows container
limitations.

These mappings allow you to work with container tasks in much the same way as non-
container tasks. For example, install applications using application packages, access
resource files from Azure Storage, use task environment settings, and persist task output
files after the container stops.

Regardless of how the WORKDIR is set for a container image, both stdout.txt and
stderr.txt are captured into the AZ_BATCH_TASK_DIR.

Troubleshoot container tasks


If your container task doesn't run as expected, you might need to get information about
the WORKDIR or ENTRYPOINT configuration of the container image. To see the
configuration, run the docker image inspect command.

If needed, adjust the settings of the container task based on the image:

Specify an absolute path in the task command line. If the image's default
ENTRYPOINT is used for the task command line, ensure that an absolute path is
set.
In the task's container run options, change the working directory to match the
WORKDIR in the image. For example, set --workdir /app .
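
The output of docker image inspect is a JSON array with one object per image, and the WORKDIR, ENTRYPOINT, and CMD values sit under the Config key. The sketch below extracts them from that output; the function name is hypothetical, and the JSON text is assumed to have been captured already (for example, by redirecting the command's output to a file).

```python
import json


def image_run_config(inspect_json):
    """Return (WorkingDir, Entrypoint, Cmd) from docker image inspect output."""
    images = json.loads(inspect_json)
    config = images[0].get("Config") or {}
    # WorkingDir is an empty string when the image does not set WORKDIR.
    return (
        config.get("WorkingDir", ""),
        config.get("Entrypoint"),
        config.get("Cmd"),
    )
```

If the returned working directory differs from the Batch task working directory and the entrypoint is a relative path, that mismatch is a likely cause of the failure.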

Container task examples


The following Python snippet shows a basic command line running in a container
created from a fictitious image pulled from Docker Hub. Here, the --rm container
option removes the container after the task finishes, and the --workdir option sets a
working directory. The command line runs a simple shell command in the container
that writes a small file to the task working directory on the host.

Python

task_id = 'sampletask'
task_container_settings = batch.models.TaskContainerSettings(
    image_name='myimage',
    container_run_options='--rm --workdir /')
task = batch.models.TaskAddParameter(
    id=task_id,
    command_line='/bin/sh -c \"echo \'hello world\' > $AZ_BATCH_TASK_WORKING_DIR/output.txt\"',
    container_settings=task_container_settings
)

The following C# example shows basic container settings for a cloud task:

C#

// Simple container task command
string cmdLine = "c:\\app\\myApp.exe";

TaskContainerSettings cmdContainerSettings = new TaskContainerSettings (
    imageName: "myimage",
    containerRunOptions: "--rm --workdir c:\\app"
);

CloudTask containerTask = new CloudTask (
    id: "Task1",
    commandline: cmdLine);
containerTask.ContainerSettings = cmdContainerSettings;

Next steps
For information on installing and using Docker CE on Linux, see the Docker
documentation.
Learn how to Use a managed image to create a custom image pool.
Learn more about the Moby project, a framework for creating container-based
systems.

Rendering using Azure
07/01/2025

Rendering is the process of taking 3D models and converting them into 2D images. 3D scene
files are authored in applications such as Autodesk 3ds Max, Autodesk Maya, and Blender.
Rendering applications such as Autodesk Maya, Autodesk Arnold, Chaos Group V-Ray, and
Blender Cycles produce 2D images. Sometimes single images are created from the scene files.
However, it's common to model and render multiple images, and then combine them in an
animation.

The rendering workload is heavily used for special effects (VFX) in the Media and Entertainment
industry. Rendering is also used in many other industries such as advertising, retail, oil and gas,
and manufacturing.

The process of rendering is computationally intensive; there can be many frames/images to
produce and each image can take many hours to render. Rendering is therefore a perfect batch
processing workload that can use Azure to run many renders in parallel and utilize a wide
range of hardware, including GPUs.

Why use Azure for rendering?


For many reasons, rendering is a workload perfectly suited for Azure:

Rendering jobs can be split into many pieces that can be run in parallel using multiple
VMs:
Animations consist of many frames and each frame can be rendered in parallel. The
more VMs available to process each frame, the faster all the frames and the animation
can be produced.
Some rendering software allows single frames to be broken up into multiple pieces,
such as tiles or slices. Each piece can be rendered separately, then combined into the
final image when all pieces are finished. The more VMs that are available, the faster a
frame can be rendered.
Rendering projects can require huge scale:
Individual frames can be complex and require many hours to render, even on high-end
hardware; animations can consist of hundreds of thousands of frames. A huge amount
of compute is required to render high-quality animations in a reasonable amount of
time. In some cases, over 100,000 cores are being used to render thousands of frames
in parallel.
Rendering projects are project-based and require varying amounts of compute:
Allocate compute and storage capacity when required, scale it up or down according
to load during a project, and remove it when a project is finished.
Pay for capacity when allocated, but don’t pay for it when there's no load, such as
between projects.
Cater for bursts due to unexpected changes; scale higher if there are unexpected
changes late in a project and those changes need to be processed on a tight schedule.
Choose from a wide selection of hardware according to application, workload, and
timeframe:
There’s a wide selection of hardware available in Azure that can be allocated and
managed with Batch.
Depending on the project, the requirement may be for the best price/performance or
the best overall performance. Different scenes and/or rendering applications can have
different memory requirements. Some rendering applications can use GPUs for the
best performance or certain features.
Low-priority or Azure Spot VMs reduce cost:
Low-priority and Spot VMs are available for a large discount compared to standard
VMs and are suitable for some job types.
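
As a rough illustration of the first point above, an animation's frame range can be divided into per-task chunks that are then rendered in parallel. The following is a minimal Python sketch rather than part of any Batch API; the function name `frame_chunks` and the chunk size are illustrative assumptions.

```python
from typing import Iterator, Tuple


def frame_chunks(first: int, last: int, frames_per_task: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) frame ranges, one range per render task."""
    if frames_per_task < 1:
        raise ValueError("frames_per_task must be at least 1")
    if last < first:
        raise ValueError("last frame must not precede first frame")
    start = first
    while start <= last:
        # The final chunk may be shorter than frames_per_task.
        end = min(start + frames_per_task - 1, last)
        yield start, end
        start = end + 1


# Hypothetical 10-frame animation, rendered four frames per task.
for start, end in frame_chunks(1, 10, 4):
    print(f"task renders frames {start}-{end}")
```

Smaller chunks spread the work over more VMs, while larger chunks reduce the per-task overhead of loading the scene; the right balance depends on the application and the scene.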

Existing on-premises rendering environment


The most common case is for there to be an existing on-premises render farm that's managed
by a render management application such as PipelineFX Qube, Royal Render, Thinkbox
Deadline, or a custom application. The requirement is to extend the on-premises render farm
capacity using Azure VMs.

Azure infrastructure and services are used to create a hybrid environment where Azure is used
to supplement the on-premises capacity. For example:

Use a Virtual Network to place the Azure resources on the same network as the on-
premises render farm.
Use Avere vFXT for Azure or Azure HPC Cache to cache source files in Azure to reduce
bandwidth use and latency, maximizing performance.
Ensure the existing license server is on the virtual network and purchase more licenses as
required to cater for the extra Azure-based capacity.

No existing render farm


Client workstations may be performing rendering, but the rendering load is increasing and it's
taking too long to solely use workstation capacity.

There are two main options available:


Deploy an on-premises render manager, such as Royal Render, and configure a hybrid
environment to use Azure when further capacity or performance is required. A render
manager is specially tailored for rendering workloads and will include plug-ins for the
popular client applications, enabling easy submission of rendering jobs.

Build a custom solution that uses Azure Batch to allocate and manage the compute capacity
and to provide the job scheduling that runs the render jobs.

Next steps
Learn more about Azure Batch rendering capabilities.
Azure Batch rendering capabilities
07/01/2025

Standard Azure Batch capabilities are used to run rendering workloads and applications. Batch
also includes specific features to support rendering workloads.

For an overview of Batch concepts, including pools, jobs, and tasks, see this article.

Batch pools using custom VM images and standard application licensing
As with other workloads and types of application, a custom VM image can be created with the
required rendering applications and plug-ins. The custom VM image is placed in the Azure
Compute Gallery and can be used to create Batch Pools.

The task command line strings will need to reference the applications and paths used when
creating the custom VM image.

Most rendering applications will require licenses obtained from a license server. If there's an
existing on-premises license server, then both the pool and license server need to be on the
same virtual network. It is also possible to run a license server on an Azure VM, with the Batch
pool and license server VM being on the same virtual network.

Batch pools using custom VM images


A custom image from the Azure Compute Gallery:
Using this option, you can configure your VM with the exact applications and specific
versions that you require. For more information, see Create a pool with the Azure
Compute Gallery. Autodesk and Chaos Group have modified Arnold and V-Ray,
respectively, to validate against an Azure Batch licensing service. Make sure you have
the versions of these applications with this support, otherwise the pay-per-use
licensing won't work. Current versions of Maya or 3ds Max don't require a license
server when running headless (in batch/command-line mode). Contact Azure support
if you're not sure how to proceed with this option.
Application packages:
Package the application files using one or more ZIP files, upload via the Azure portal,
and specify the package in pool configuration. When pool VMs are created, the ZIP
files are downloaded and the files extracted.
Resource files:
Application files are uploaded to Azure blob storage, and you specify file references in
the pool start task. When pool VMs are created, the resource files are downloaded
onto each VM.

Azure VM families
As with other workloads, rendering application system requirements vary, and performance
requirements vary for jobs and projects. A large variety of VM families are available in Azure
depending on your requirements – lowest cost, best price/performance, best performance, and
so on. Some rendering applications, such as Arnold, are CPU-based; others such as V-Ray and
Blender Cycles can use CPUs and/or GPUs. For a description of available VM families and VM
sizes, see VM types and sizes.

Spot VMs
As with other workloads, Azure Spot VMs can be utilized in Batch pools for rendering. Spot
VMs perform the same as regular dedicated VMs but utilize surplus Azure capacity and are
available for a large discount. The tradeoff for using Spot VMs is that those VMs may not be
available to be allocated or may be preempted at any time, depending on available capacity.
For this reason, Spot VMs aren't going to be suitable for all rendering jobs. For example, if
images take many hours to render then it's likely that having the rendering of those images
interrupted and restarted due to VMs being preempted wouldn't be acceptable.

For more information about the characteristics of Spot VMs and the various ways to configure
them using Batch, see Use Spot VMs with Batch.

Jobs and tasks


No rendering-specific support is required for jobs and tasks. The main configuration item is the
task command line, which needs to reference the required application. When the Azure
Marketplace VM images are used, then the best practice is to use the environment variables to
specify the path and application executable.

Next steps
Learn about Batch rendering services.
Learn about Storage and data movement options for rendering asset and output files.
Storage and data movement options for
rendering asset and output files
Article • 02/07/2025

There are multiple options for making the scene and asset files available to the
rendering applications on the pool VMs:

Azure Blob Storage:


Scene and asset files are uploaded to blob storage from a local file system.
When the application is run by a task, then the required files are copied from
blob storage onto the VM so they can be accessed by the rendering application.
The output files are written by the rendering application to the VM disk and
then copied to blob storage. If necessary, the output files can be downloaded
from blob storage to a local file system.
Azure Blob Storage is a simple and cost-effective option for smaller projects.
Because all asset files are required on each pool VM, care is needed to keep the
file transfers as efficient as possible as the number and size of asset files
increase.
Azure storage as a file system using blobfuse:
For Linux VMs, a storage account can be exposed and used as a file system
when the blobfuse virtual file system driver is used.
This option has the advantage that it is cost-effective, as no VMs are required
for the file system, plus blobfuse caching on the VMs avoids repeated
downloads of the same files for multiple jobs and tasks. Data movement is also
simple as the files are simply blobs and standard APIs and tools, such as azcopy,
can be used to copy files between an on-premises file system and Azure storage.
File system or file share:
Depending on the VM operating system and performance/scale requirements,
options include Azure Files, using a VM with attached disks for NFS, using
multiple VMs with attached disks for a distributed file system like GlusterFS, or
using a third-party offering.
Avere Systems is now part of Microsoft and will have solutions soon that are
ideal for large-scale, high-performance rendering. The Avere solution enables an
Azure-based NFS or SMB cache to be created that works with blob storage or
with on-premises NAS devices.
With a file system, files can be read or written directly to the file system or can
be copied between file system and the pool VMs.
A shared file system allows a large number of assets shared between projects
and jobs to be utilized, with rendering tasks only accessing what is required.
Using Azure Blob Storage
A blob storage account or a general-purpose v2 storage account should be used. These
two storage account types can be configured with higher limits compared to a general-
purpose v1 storage account, as detailed in this blog post. When configured, the
higher limits enable better performance and scalability, especially when there are many
pool VMs accessing the storage account.

Copying files between client and blob storage


To copy files to and from Azure storage, various mechanisms can be used including the
storage blob API, the Azure Storage Data Movement Library, the azcopy command
line tool for Windows or Linux, Azure Storage Explorer, and Azure Batch Explorer.

For example, using azcopy, all assets in a folder can be transferred as follows:

azcopy /source:. /dest:https://account.blob.core.windows.net/rendering/project /destsas:"?st=2018-03-30T16%3A26%3A00Z&se=2020-03-31T16%3A26%3A00Z&sp=rwdl&sv=2017-04-17&sr=c&sig=sig" /Y

To copy only modified files, the /XO parameter can be used:

azcopy /source:. /dest:https://account.blob.core.windows.net/rendering/project /destsas:"?st=2018-03-30T16%3A26%3A00Z&se=2020-03-31T16%3A26%3A00Z&sp=rwdl&sv=2017-04-17&sr=c&sig=sig" /XO /Y
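
When uploads are scripted, it can help to assemble the command line in one place so that the quoting of the SAS token stays consistent. The following Python sketch only builds the command string shown above; it uses the same options (/source, /dest, /destsas, /XO, /Y), and the function name and the shortened SAS token are illustrative assumptions.

```python
def build_azcopy_upload(source_dir: str, dest_url: str, dest_sas: str,
                        only_newer: bool = False) -> str:
    """Compose the azcopy upload command line in the style used in this article."""
    parts = [
        "azcopy",
        f"/source:{source_dir}",
        f"/dest:{dest_url}",
        f'/destsas:"{dest_sas}"',  # quoted because the token contains '&' characters
    ]
    if only_newer:
        parts.append("/XO")  # copy only modified files, as described above
    parts.append("/Y")
    return " ".join(parts)


# Placeholder account, container, and SAS token; substitute real values.
print(build_azcopy_upload(
    ".",
    "https://account.blob.core.windows.net/rendering/project",
    "?sv=2017-04-17&sr=c&sig=sig",
    only_newer=True,
))
```

The resulting string can be run from a script or used as part of a task command line.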

Copying input asset files from blob storage to Batch pool VMs
There are a couple of different approaches to copying files, with the best approach
determined by the size of the job assets. The simplest approach is to copy all the asset
files to the pool VMs for each job:

When there are files unique to a job that are required for all the tasks of the job,
a job preparation task can be specified to copy all the files. The job preparation
task is run once when the first job task is executed on a VM but is not run again for
subsequent job tasks.
A job release task should be specified to remove the per-job files once
the job has completed; this avoids the VM disk getting filled by all the job asset
files.
When there are multiple jobs using the same assets, with only incremental changes
to the assets for each job, then all asset files are still copied, even if only a subset
were updated. This would be inefficient when there are lots of large asset files.

When asset files are reused between jobs, with only incremental changes between jobs,
then a more efficient but slightly more involved approach is to store assets in the shared
folder on the VM and sync changed files.

The job preparation task would perform the copy using azcopy with the /XO
parameter to the VM shared folder specified by AZ_BATCH_NODE_SHARED_DIR
environment variable. This will only copy changed files to each VM.
Thought will have to be given to the size of all assets to ensure they'll fit on the
temporary drive of the pool VMs.

Azure Batch has built-in support to copy files between a storage account and Batch pool
VMs. Task resource files copy files from storage to pool VMs and could be specified for
the job preparation task. Unfortunately, when there are hundreds of files, it's possible to
hit a limit and for tasks to fail. When there are large numbers of assets, it's recommended to
use the azcopy command line in the job preparation task, which can use wildcards and
has no limit.
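
The job preparation and job release tasks described above are properties of the job itself. The following Python sketch builds a job body as a dictionary. The field names (`poolInfo`, `jobPreparationTask`, `jobReleaseTask`, `commandLine`) are intended to follow the Batch REST job schema and should be checked against the API reference. The function name, job and pool IDs, container URL, and SAS token are placeholders, and the `/sourcesas` and `/S` options are assumed to behave as in the azcopy version used elsewhere in this article.

```python
import json
from typing import Optional


def job_with_asset_copy(job_id: str, pool_id: str, copy_command: str,
                        cleanup_command: Optional[str] = None) -> dict:
    """Build a job body whose preparation task copies assets onto each node."""
    job = {
        "id": job_id,
        "poolInfo": {"poolId": pool_id},
        # Runs once on a node, before the first task of the job runs there.
        "jobPreparationTask": {"commandLine": f"cmd /c {copy_command}"},
    }
    if cleanup_command:
        # Runs after the job has completed, to remove per-job files.
        job["jobReleaseTask"] = {"commandLine": f"cmd /c {cleanup_command}"}
    return job


# Sync only changed files into the node's shared folder (placeholder URL and SAS).
SYNC_COMMAND = (
    "azcopy /source:https://account.blob.core.windows.net/rendering/project "
    "/dest:%AZ_BATCH_NODE_SHARED_DIR%\\assets "
    '/sourcesas:"?sig=sig" /S /XO /Y'
)

print(json.dumps(job_with_asset_copy("render-job-1", "render-pool", SYNC_COMMAND), indent=2))
```

With the shared-folder approach no cleanup command is passed, because the assets are meant to persist between jobs. For per-job files, a cleanup command would normally be supplied so that the job release task removes them.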

Copying output files to blob storage from Batch pool VMs
Output files can be used to copy files from a pool VM to storage. One or more files can be
copied from the VM to a specified storage account once the task has completed. The
rendered output should be copied, but it also may be desirable to store log files.
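
As a sketch of how this might look in a task definition, the following Python code builds an `outputFiles` fragment that uploads rendered images on success and log files on completion. The field names and upload condition values are intended to follow the Batch REST task schema and should be verified against the API reference. The function name, file patterns, container URL, and blob paths are illustrative assumptions.

```python
import json

# Upload conditions as understood from the Batch REST schema (an assumption to verify).
UPLOAD_CONDITIONS = {"tasksuccess", "taskfailure", "taskcompletion"}


def output_file(file_pattern: str, container_url: str, blob_path: str,
                condition: str = "taskcompletion") -> dict:
    """Describe one set of files to upload from the node to a blob container."""
    if condition not in UPLOAD_CONDITIONS:
        raise ValueError(f"unknown upload condition: {condition}")
    return {
        "filePattern": file_pattern,
        "destination": {
            "container": {"containerUrl": container_url, "path": blob_path}
        },
        "uploadOptions": {"uploadCondition": condition},
    }


# Placeholder container URL with a SAS token that grants write access.
CONTAINER_URL = "https://account.blob.core.windows.net/rendering?sig=sig"

task_fragment = {
    "outputFiles": [
        # Rendered frames are only worth keeping when the task succeeded.
        output_file("*.jpg", CONTAINER_URL, "output/frames", "tasksuccess"),
        # Logs are useful whether or not the task succeeded.
        output_file("../std*.txt", CONTAINER_URL, "output/logs"),
    ]
}

print(json.dumps(task_fragment, indent=2))
```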

Using a blobfuse virtual file system for Linux VM pools
Blobfuse is a virtual file system driver for Azure Blob Storage, which allows you to access
files stored as blobs in a Storage account through the Linux file system.

Pool nodes can mount the file system when started, or the mount can happen as part of
a job preparation task (a task that is only run when the first task in a job runs on a
node). Blobfuse can be configured to leverage both a ramdisk and the VM's local SSD for
caching of files, which will increase performance significantly if multiple tasks on a node
access some of the same files.
Sample templates are available to run standalone V-Ray renders using a blobfuse file
system and can be used as the basis for templates for other applications.

Accessing files
Job tasks specify paths for input files and output files using the mounted file system.

Copying input asset files from blob storage to Batch pool VMs
As files are simply blobs in Azure Storage, standard blob APIs, tools, and UIs can be
used to copy files between an on-premises file system and blob storage; for example,
azcopy, Storage Explorer, Batch Explorer, etc.

Using Azure Files with Windows VMs


Azure Files offers fully managed file shares in the cloud that are accessible via the SMB
protocol. Azure Files is based on Azure Blob Storage; it's cost-efficient and can be
configured with data replication to another region so that data is globally redundant.
Scale targets should be reviewed to determine whether Azure Files should be used, given
the forecast pool size and number of asset files.

There's documentation covering how to mount an Azure File share.

Mounting an Azure Files share


To use the share in Batch, a mount operation needs to be performed each time a task is run, as it
isn't possible to persist the connection between tasks. The easiest way to do this is to
use cmdkey to persist credentials using the start task in the pool configuration, then
mount the share before each task.

Example use of cmdkey in a pool template (escaped for use in a JSON file). Note that
when separating the cmdkey call from the net use call, the user context for the start task
must be the same as that used for running the tasks:

"startTask": {
    "commandLine": "cmdkey /add:storageaccountname.file.core.windows.net /user:AZURE\\markscuscusbatch /pass:storage_account_key",
    "userIdentity": {
        "autoUser": {
            "elevationLevel": "nonadmin",
            "scope": "pool"
        }
    }
}

Example job task command line:

"commandLine":"net use S:
\\\\storageaccountname.file.core.windows.net\\rendering &
3dsmaxcmdio.exe -v:5 -rfw:0 -10 -end:10
-bitmapPath:\"s:\\3dsMax\\Dragon\\Assets\"
-outputName:\"s:\\3dsMax\\Dragon\\RenderOutput\\dragon.jpg\"
-w:1280 -h:720
\"s:\\3dsMax\\Dragon\\Assets\\Dragon_Character_Rig.max\""

Accessing files
Job tasks specify paths for input files and output files using the mounted file system,
either using a mapped drive or a UNC path.

Copying input asset files from blob storage to Batch pool VMs
Azure Files is supported by all the main APIs and tools that have Azure Storage
support; for example, azcopy, Azure CLI, Storage Explorer, Azure PowerShell, Batch Explorer, etc.

Azure File Sync is available to automatically synchronize files between an on-premises
file system and an Azure File share.

Next steps
For more information about the storage options, see the in-depth documentation:

Azure Blob Storage
Blobfuse
Azure Files


Reference architectures for Azure rendering
Article • 02/07/2025

This article shows high-level architecture diagrams for scenarios to extend, or "burst", an
on-premises render farm to Azure. The examples show different options for Azure
compute, networking, and storage services.

Hybrid with NFS or CFS


The following diagram shows a hybrid scenario that includes the following Azure
services:

Compute - Azure Batch pool or Virtual Machine Scale Set.

Network - On-premises: Azure ExpressRoute or VPN. Azure: Azure VNet.

Storage - Input and output files: NFS or CFS using Azure VMs, synchronized with
on-premises storage via Azure File Sync or RSync. Alternatively: Avere vFXT to
input or output files from on-premises NAS devices using NFS.

Hybrid with Blobfuse


The following diagram shows a hybrid scenario that includes the following Azure
services:
Compute - Azure Batch pool or Virtual Machine Scale Set.

Network - On-premises: Azure ExpressRoute or VPN. Azure: Azure VNet.

Storage - Input and output files: Blob storage, mounted to compute resources via
Azure Blobfuse.

Hybrid compute and storage


The following diagram shows a fully connected hybrid scenario for both compute and
storage and includes the following Azure services:

Compute - Azure Batch pool or Virtual Machine Scale Set.

Network - On-premises: Azure ExpressRoute or VPN. Azure: Azure VNet.

Storage - Cross-premises: Avere vFXT. Optional archiving of on-premises files via
Azure Data Box to Blob storage, or on-premises Avere FXT for NAS acceleration.
Next steps
Learn more about options for rendering in Azure.
Learn about using rendering applications with Batch.


Use custom activities in an Azure Data
Factory or Azure Synapse Analytics pipeline
Article • 03/27/2025

APPLIES TO: Azure Data Factory Azure Synapse Analytics

Tip

Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises.
Microsoft Fabric covers everything from data movement to data science, real-time
analytics, business intelligence, and reporting. Learn how to start a new trial for free!

There are two types of activities that you can use in an Azure Data Factory or Synapse pipeline.

Data movement activities to move data between supported source and sink data stores.
Data transformation activities to transform data using compute services such as Azure
HDInsight and Azure Batch.

To move data to/from a data store that the service does not support, or to transform/process
data in a way that isn't supported by the service, you can create a Custom activity with your
own data movement or transformation logic and use the activity in a pipeline. The custom
activity runs your customized code logic on an Azure Batch pool of virtual machines.

Note

We recommend that you use the Azure Az PowerShell module to interact with Azure. To
get started, see Install Azure PowerShell. To learn how to migrate to the Az PowerShell
module, see Migrate Azure PowerShell from AzureRM to Az.

See the following articles if you're new to the Azure Batch service:

Azure Batch basics for an overview of the Azure Batch service.
New-AzBatchAccount cmdlet to create an Azure Batch account, or the Azure portal to create
the account interactively. See the Using PowerShell to manage Azure Batch
Account article for detailed instructions on using the cmdlet.
New-AzBatchPool cmdlet to create an Azure Batch pool.

Important
When creating a new Azure Batch pool, 'VirtualMachineConfiguration' must be used and
NOT 'CloudServiceConfiguration'.

Add custom activities to a pipeline with UI


To use a Custom activity in a pipeline, complete the following steps:

1. Search for Custom in the pipeline Activities pane, and drag a Custom activity to the
pipeline canvas.

2. Select the new Custom activity on the canvas if it is not already selected.

3. Select the Azure Batch tab to select or create a new Azure Batch linked service that will
execute the custom activity.

4. Select the Settings tab and specify a command to be executed on Azure Batch, and
optional advanced details.
Azure Batch linked service
The following JSON defines a sample Azure Batch linked service. For details, see Supported
compute environments.

JSON

{
    "name": "AzureBatchLinkedService",
    "properties": {
        "type": "AzureBatch",
        "typeProperties": {
            "accountName": "batchaccount",
            "accessKey": {
                "type": "SecureString",
                "value": "access key"
            },
            "batchUri": "https://batchaccount.region.batch.azure.com",
            "poolName": "poolname",
            "linkedServiceName": {
                "referenceName": "StorageLinkedService",
                "type": "LinkedServiceReference"
            }
        }
    }
}
To learn more about Azure Batch linked service, see Compute linked services article.

Custom activity
The following JSON snippet defines a pipeline with a simple Custom Activity. The activity
definition has a reference to the Azure Batch linked service.

JSON

{
    "name": "MyCustomActivityPipeline",
    "properties": {
        "description": "Custom activity sample",
        "activities": [{
            "type": "Custom",
            "name": "MyCustomActivity",
            "linkedServiceName": {
                "referenceName": "AzureBatchLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "command": "helloworld.exe",
                "folderPath": "customactv2/helloworld",
                "resourceLinkedService": {
                    "referenceName": "StorageLinkedService",
                    "type": "LinkedServiceReference"
                }
            }
        }]
    }
}

In this sample, helloworld.exe is a custom application stored in the customactv2/helloworld
folder of the Azure Storage account used in the resourceLinkedService. The Custom activity
submits this custom application to be executed on Azure Batch. You can replace the command
with any preferred application that can be executed on the target operating system of the Azure
Batch pool nodes.

The following table describes names and descriptions of properties that are specific to this
activity.

Property | Description | Required
name | Name of the activity in the pipeline. | Yes
description | Text describing what the activity does. | No
type | For Custom activity, the activity type is Custom. | Yes
linkedServiceName | Linked Service to Azure Batch. To learn about this linked service, see Compute linked services article. | Yes
command | Command of the custom application to be executed. If the application is already available on the Azure Batch Pool Node, the resourceLinkedService and folderPath can be skipped. For example, you can specify the command to be cmd /c dir, which is natively supported by the Windows Batch Pool node. | Yes
resourceLinkedService | Azure Storage Linked Service to the Storage account where the custom application is stored. | No *
folderPath | Path to the folder of the custom application and all its dependencies. If you have dependencies stored in subfolders - that is, in a hierarchical folder structure under folderPath - the folder structure is currently flattened when the files are copied to Azure Batch. That is, all files are copied into a single folder with no subfolders. To work around this behavior, consider compressing the files, copying the compressed file, and then unzipping it with custom code in the desired location. | No *
referenceObjects | An array of existing Linked Services and Datasets. The referenced Linked Services and Datasets are passed to the custom application in JSON format so your custom code can reference resources of the service. | No
extendedProperties | User-defined properties that can be passed to the custom application in JSON format so your custom code can reference additional properties. | No
retentionTimeInDays | The retention time for the files submitted for custom activity. Default value is 30 days. | No

* The properties resourceLinkedService and folderPath must either both be specified or both
be omitted.
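
The rules in the table and footnote above can be checked before a pipeline definition is deployed. The following Python sketch validates only the two rules stated here (command is required, and resourceLinkedService and folderPath go together); the function name is hypothetical, and the service may enforce additional rules.

```python
from typing import List


def check_custom_type_properties(type_properties: dict) -> List[str]:
    """Return a list of problems found in a Custom activity's typeProperties."""
    problems = []
    if not type_properties.get("command"):
        problems.append("command is required")
    has_service = "resourceLinkedService" in type_properties
    has_folder = "folderPath" in type_properties
    if has_service != has_folder:
        problems.append(
            "resourceLinkedService and folderPath must both be specified or both be omitted"
        )
    return problems


# A command that already exists on the node needs neither optional property.
print(check_custom_type_properties({"command": "cmd /c dir"}))

# Specifying folderPath without resourceLinkedService is reported as a problem.
print(check_custom_type_properties(
    {"command": "helloworld.exe", "folderPath": "customactv2/helloworld"}
))
```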

Note

If you are passing linked services as referenceObjects in Custom Activity, it is a good
security practice to pass an Azure Key Vault enabled linked service (since it does not
contain any secure strings) and fetch the credentials using the secret name directly from Key
Vault in your code. You can find an example here that references an AKV enabled linked
service, retrieves the credentials from Key Vault, and then accesses the storage in the code.

Note

Currently, only Azure Blob storage is supported for resourceLinkedService in the custom
activity. It is the only linked service that gets created by default, and there is no option to
choose other connectors such as ADLS Gen2.
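
The folderPath description above suggests compressing dependencies to work around the flattened folder structure. The following Python sketch shows one way that might be done with the standard library: the first function would run before upload, and the second would run inside the custom application on the Batch node. The function names and folder layout are illustrative assumptions.

```python
import shutil
import zipfile
from pathlib import Path


def pack_dependencies(source_dir: str, archive_base: str) -> str:
    """Zip source_dir, keeping its subfolder layout; returns the path of the .zip file."""
    return shutil.make_archive(archive_base, "zip", root_dir=source_dir)


def unpack_dependencies(archive_path: str, target_dir: str) -> None:
    """Recreate the original folder layout under target_dir."""
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
```

For example, `pack_dependencies("SampleApp/deps", "deps")` would produce deps.zip, which is uploaded alongside the application. At startup the application could call `unpack_dependencies("deps.zip", "deps")` to restore the subfolders.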

Custom activity permissions


The custom activity sets the Azure Batch auto-user account to Non-admin access with task
scope (the default auto-user specification). You can't change the permission level of the auto-
user account. For more info, see Run tasks under user accounts in Batch | Auto-user accounts.

Executing commands
You can directly execute a command using Custom Activity. The following example runs the
"echo hello world" command on the target Azure Batch Pool nodes and prints the output to
stdout.

JSON

{
    "name": "MyCustomActivity",
    "properties": {
        "description": "Custom activity sample",
        "activities": [{
            "type": "Custom",
            "name": "MyCustomActivity",
            "linkedServiceName": {
                "referenceName": "AzureBatchLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "command": "cmd /c echo hello world"
            }
        }]
    }
}

Passing objects and properties


This sample shows how you can use the referenceObjects and extendedProperties to pass
objects and user-defined properties from the service to your custom application.

JSON

{
    "name": "MyCustomActivityPipeline",
    "properties": {
        "description": "Custom activity sample",
        "activities": [{
            "type": "Custom",
            "name": "MyCustomActivity",
            "linkedServiceName": {
                "referenceName": "AzureBatchLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "command": "SampleApp.exe",
                "folderPath": "customactv2/SampleApp",
                "resourceLinkedService": {
                    "referenceName": "StorageLinkedService",
                    "type": "LinkedServiceReference"
                },
                "referenceObjects": {
                    "linkedServices": [{
                        "referenceName": "AzureBatchLinkedService",
                        "type": "LinkedServiceReference"
                    }]
                },
                "extendedProperties": {
                    "connectionString": {
                        "type": "SecureString",
                        "value": "aSampleSecureString"
                    },
                    "PropertyBagPropertyName1": "PropertyBagValue1",
                    "propertyBagPropertyName2": "PropertyBagValue2",
                    "dateTime1": "2015-04-12T12:13:14Z"
                }
            }
        }]
    }
}

When the activity is executed, referenceObjects and extendedProperties are stored in the following
files, which are deployed to the same execution folder as SampleApp.exe:

activity.json

Stores extendedProperties and properties of the custom activity.

linkedServices.json
Stores an array of Linked Services defined in the referenceObjects property.

datasets.json

Stores an array of Datasets defined in the referenceObjects property.

The following sample code demonstrates how SampleApp.exe can access the required
information from the JSON files:

C#

using Newtonsoft.Json;
using System;
using System.IO;

namespace SampleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // From extendedProperties
            dynamic activity = JsonConvert.DeserializeObject(File.ReadAllText("activity.json"));
            Console.WriteLine(activity.typeProperties.extendedProperties.connectionString.value);

            // From LinkedServices
            dynamic linkedServices = JsonConvert.DeserializeObject(File.ReadAllText("linkedServices.json"));
            Console.WriteLine(linkedServices[0].properties.typeProperties.accountName);
        }
    }
}
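
Because the command can be any application that runs on the pool nodes, the same lookup can be done from a script. The following Python sketch mirrors the C# sample above, assuming the same file names and JSON paths; the function name is hypothetical, and the script assumes a Python interpreter is available on the pool nodes.

```python
import json
from pathlib import Path


def load_activity_context(folder: str = "."):
    """Read activity.json and linkedServices.json from the execution folder."""
    base = Path(folder)
    # utf-8-sig accepts files with or without a byte-order mark.
    activity = json.loads((base / "activity.json").read_text(encoding="utf-8-sig"))
    linked_services = json.loads(
        (base / "linkedServices.json").read_text(encoding="utf-8-sig")
    )
    # Same path as the C# sample: typeProperties -> extendedProperties.
    extended = activity["typeProperties"]["extendedProperties"]
    return extended, linked_services
```

A script could then read `extended["connectionString"]["value"]` and `linked_services[0]["properties"]["typeProperties"]["accountName"]`, matching the two values printed by the C# sample.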

Retrieve execution outputs


You can start a pipeline run using the following PowerShell command:

PowerShell

$runId = Invoke-AzDataFactoryV2Pipeline -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineName $pipelineName

When the pipeline is running, you can check the execution output using the following
commands:
PowerShell

while ($True) {
    $result = Get-AzDataFactoryV2ActivityRun -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineRunId $runId -RunStartedAfter (Get-Date).AddMinutes(-30) -RunStartedBefore (Get-Date).AddMinutes(30)

    if(!$result) {
        Write-Host "Waiting for pipeline to start..." -foregroundcolor "Yellow"
    }
    elseif (($result | Where-Object { $_.Status -eq "InProgress" } | Measure-Object).count -ne 0) {
        Write-Host "Pipeline run status: In Progress" -foregroundcolor "Yellow"
    }
    else {
        Write-Host "Pipeline '"$pipelineName"' run finished. Result:" -foregroundcolor "Yellow"
        $result
        break
    }
    ($result | Format-List | Out-String)
    Start-Sleep -Seconds 15
}

Write-Host "Activity `Output` section:" -foregroundcolor "Yellow"
$result.Output -join "`r`n"

Write-Host "Activity `Error` section:" -foregroundcolor "Yellow"
$result.Error -join "`r`n"

The stdout and stderr of your custom application are saved to the adfjobs container in the
Azure Storage linked service that you defined when creating the Azure Batch linked service, under
a folder named with the GUID of the task. You can get the detailed path from the Activity Run
output, as shown in the following snippet:

Pipeline ' MyCustomActivity' run finished. Result:

ResourceGroupName : resourcegroupname
DataFactoryName : datafactoryname
ActivityName : MyCustomActivity
PipelineRunId : xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
PipelineName : MyCustomActivity
Input : {command}
Output : {exitcode, outputs, effectiveIntegrationRuntime}
LinkedServiceName :
ActivityRunStart : 10/5/2017 3:33:06 PM
ActivityRunEnd : 10/5/2017 3:33:28 PM
DurationInMs : 21203
Status : Succeeded
Error : {errorCode, message, failureType, target}
Activity Output section:
"exitcode": 0
"outputs": [
"https://<container>.blob.core.windows.net/adfjobs/<GUID>/output/stdout.txt",
"https://<container>.blob.core.windows.net/adfjobs/<GUID>/output/stderr.txt"
]
"effectiveIntegrationRuntime": "DefaultIntegrationRuntime (East US)"
Activity Error section:
"errorCode": ""
"message": ""
"failureType": ""
"target": "MyCustomActivity"

If you would like to consume the content of stdout.txt in downstream activities, you can get the
path to the stdout.txt file in expression "@activity('MyCustomActivity').output.outputs[0]".

Important

The activity.json, linkedServices.json, and datasets.json are stored in the runtime
folder of the Batch task. For this example, the activity.json, linkedServices.json, and
datasets.json are stored in the
https://adfv2storage.blob.core.windows.net/adfjobs/<GUID>/runtime/ path. If
needed, you need to clean them up separately.

For Linked Services that use the Self-Hosted Integration Runtime, sensitive
information like keys or passwords is encrypted by the Self-Hosted Integration
Runtime to ensure that credentials stay in the customer-defined private network
environment. Some sensitive fields could be missing when referenced by your
custom application code in this way. Use SecureString in extendedProperties instead
of a Linked Service reference if needed.

Pass outputs to another activity


You can send custom values from your code in a Custom Activity back to the service. You can
do so by writing them into outputs.json from your application. The service copies the content
of outputs.json and appends it into the Activity Output as the value of the customOutput
property. (The size limit is 2MB.) If you want to consume the content of outputs.json in
downstream activities, you can get the value by using the expression
@activity('<MyCustomActivity>').output.customOutput .
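
As a minimal sketch of the step above, a Python application could produce outputs.json like this. It assumes the file belongs in the application's working directory (the text does not say where); the helper name write_custom_outputs and the sample values are hypothetical, and the size check only mirrors the 2 MB limit quoted above.

```python
import json
import tempfile
from pathlib import Path

MAX_OUTPUT_BYTES = 2 * 1024 * 1024  # size limit quoted in the text (2 MB)


def write_custom_outputs(values, directory="."):
    """Serialize `values` to outputs.json in `directory` and return the path.

    Hypothetical helper: the working directory as the target is an assumption.
    """
    payload = json.dumps(values).encode("utf-8")
    if len(payload) > MAX_OUTPUT_BYTES:
        raise ValueError("outputs.json would exceed the 2 MB limit")
    path = Path(directory) / "outputs.json"
    path.write_bytes(payload)
    return path


# Demo in a temporary folder so the current directory is left untouched.
with tempfile.TemporaryDirectory() as demo_dir:
    written = write_custom_outputs({"rowsProcessed": 1250, "status": "ok"}, demo_dir)
    print(written.name, json.loads(written.read_text(encoding="utf-8")))
```

A downstream activity would then read these values through the customOutput property rather than from the file itself.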
Retrieve SecureString outputs
Sensitive property values designated as type SecureString, as shown in some of the examples in
this article, are masked out in the Monitoring tab in the user interface. In actual pipeline
execution, however, a SecureString property is serialized as JSON within the activity.json file
as plain text. For example:

JSON

"extendedProperties": {
"connectionString": {
"type": "SecureString",
"value": "aSampleSecureString"
}
}

This serialization is not truly secure, and is not intended to be secure. The intent is a hint to the
service to mask the value in the Monitoring tab.

To access properties of type SecureString from a custom activity, read the activity.json file,
which is placed in the same folder as your .EXE, deserialize the JSON, and then access the JSON
property (extendedProperties => [propertyName] => value).
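
The lookup described above can be sketched in Python as follows. The exact nesting of extendedProperties inside activity.json is not shown here, so the sketch searches the parsed document for the first extendedProperties object; the helper names and the typeProperties wrapper in the sample are assumptions made for illustration.

```python
import json


def load_activity(path="activity.json"):
    """Read activity.json, which sits in the same folder as the executable."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_extended_properties(node):
    """Return the first 'extendedProperties' object in parsed JSON, or None."""
    if isinstance(node, dict):
        if isinstance(node.get("extendedProperties"), dict):
            return node["extendedProperties"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_extended_properties(child)
        if found is not None:
            return found
    return None


def read_secure_string(activity, property_name):
    """Follow extendedProperties => [propertyName] => value."""
    props = find_extended_properties(activity)
    if props is None or property_name not in props:
        raise KeyError(property_name)
    return props[property_name]["value"]


# Sample built from the JSON fragment above; the outer wrapper is assumed.
sample = {
    "typeProperties": {
        "extendedProperties": {
            "connectionString": {"type": "SecureString", "value": "aSampleSecureString"}
        }
    }
}
print(read_secure_string(sample, "connectionString"))
```

In a real custom activity, `read_secure_string(load_activity(), "connectionString")` would replace the in-memory sample. Because the value is plain text in the file, avoid logging it.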

Auto-scaling of Azure Batch


You can also create an Azure Batch pool with the autoscale feature. For example, you could create
an Azure Batch pool with 0 dedicated VMs and an autoscale formula based on the number of
pending tasks.

The sample formula here achieves the following behavior: When the pool is initially created, it
starts with 1 VM. The $PendingTasks metric defines the number of tasks in the running + active
(queued) state. The formula finds the average number of pending tasks in the last 180 seconds
and sets TargetDedicated accordingly. It ensures that TargetDedicated never goes beyond 25
VMs. So, as new tasks are submitted, the pool automatically grows, and as tasks complete, VMs
become free one by one and autoscaling removes them. startingNumberOfVMs and
maxNumberofVMs can be adjusted to your needs.

Autoscale formula:

startingNumberOfVMs = 1;
maxNumberofVMs = 25;
pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(180 * TimeInterval_Second);
pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(180 * TimeInterval_Second));
$TargetDedicated=min(maxNumberofVMs,pendingTaskSamples);

See Automatically scale compute nodes in an Azure Batch pool for details.
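
To see how the formula behaves for different inputs, its arithmetic can be mirrored in Python. This is only an offline sketch and not how the Batch service evaluates formulas: the function name target_dedicated is hypothetical, the empty-sample guard is an added assumption, and any rounding the service applies to a fractional result is not modeled.

```python
from statistics import mean

STARTING_NUMBER_OF_VMS = 1   # startingNumberOfVMs in the formula
MAX_NUMBER_OF_VMS = 25       # maxNumberofVMs in the formula


def target_dedicated(pending_task_samples, sample_percent):
    """Mirror the autoscale formula's arithmetic.

    pending_task_samples: $PendingTasks samples from the last 180 seconds.
    sample_percent: percentage of expected samples that were available.
    """
    if sample_percent < 70 or not pending_task_samples:
        # Too few samples: fall back to the starting pool size.
        pending = STARTING_NUMBER_OF_VMS
    else:
        pending = mean(pending_task_samples)
    # Never exceed the configured maximum.
    return min(MAX_NUMBER_OF_VMS, pending)


for samples, percent in [([10, 20, 30], 100), ([100, 200], 90), ([5, 5], 50), ([0, 0], 100)]:
    print(samples, percent, "->", target_dedicated(samples, percent))
```

The last case shows why the pool can shrink to zero VMs once no tasks are pending, which is what makes a pool with 0 dedicated VMs workable.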

If the pool is using the default autoScaleEvaluationInterval, the Batch service could take 15-30
minutes to prepare the VM before running the custom activity. If the pool is using a different
autoScaleEvaluationInterval, the Batch service could take autoScaleEvaluationInterval + 10
minutes.

Related content
See the following articles that explain how to transform data in other ways:

U-SQL activity
Hive activity
Pig activity
MapReduce activity
Hadoop Streaming activity
Spark activity
Stored procedure activity
az batch
Manage Azure Batch.

Commands
Name | Description | Type | Status
az batch account | Manage Azure Batch accounts. | Core | GA
az batch account autostorage-keys | Manage the access keys for the auto storage account configured for a Batch account. | Core | GA
az batch account autostorage-keys sync | Synchronizes access keys for the auto-storage account configured for the specified Batch account, only if storage key authentication is being used. | Core | GA
az batch account create | Create a Batch account with the specified parameters. | Core | GA
az batch account delete | Deletes the specified Batch account. | Core | GA
az batch account identity | Manage identities of a batch account. | Core | GA
az batch account identity assign | Add managed identities to an existing batch account. | Core | GA
az batch account identity remove | Remove managed identities from an existing batch account. | Core | GA
az batch account identity show | Display managed identities of a batch account. | Core | GA
az batch account keys | Manage Batch account keys. | Core | GA
az batch account keys list | Gets the account keys for the specified Batch account. This operation applies only to Batch accounts with allowedAuthenticationModes containing 'SharedKey'. If the Batch account doesn't contain 'SharedKey' in its allowedAuthenticationMode, clients cannot use shared keys to authenticate, and must use another allowedAuthenticationModes instead. In this case, getting the keys will fail. | Core | GA
az batch account keys renew | Renew keys for a Batch account. | Core | GA
az batch account list | List the Batch accounts associated with a subscription or resource group. | Core | GA
az batch account login | Log in to a Batch account through Azure Active Directory or Shared Key authentication. | Core | GA
az batch account network-profile | Manage Batch account Network profiles. | Core | GA
az batch account network-profile network-rule | Manage Batch account Network rules in Network Profile. | Core | GA
az batch account network-profile network-rule add | Add a Network rule from a Network Profile. | Core | GA
az batch account network-profile network-rule delete | Delete a Network rule from a Network Profile. | Core | GA
az batch account network-profile network-rule list | List the Network rules from a Network Profile. | Core | GA
az batch account network-profile set | Set the Network profile for Batch account. | Core | GA
az batch account network-profile show | Get information about the Network profile for Batch account. | Core | GA
az batch account outbound-endpoints | List an account's outbound network dependencies. | Core | GA
az batch account set | Update properties for a Batch account. | Core | GA
az batch account show | Get a specified Batch account or the currently set account. | Core | GA
az batch application | Manage Batch applications. | Core | GA
az batch application create | Adds an application to the specified Batch account. | Core | GA
az batch application delete | Deletes an application. | Core | GA
az batch application list | Lists all of the applications in the specified account. | Core | GA
az batch application package | Manage Batch application packages. | Core | GA
az batch application package activate | Activates a Batch application package. | Core | GA
az batch application package create | Create a Batch application package record and activate it. | Core | GA
az batch application package delete | Deletes an application package record and its associated binary file. | Core | GA
az batch application package list | Lists all of the application packages in the specified application. | Core | GA
az batch application package show | Gets information about the specified application package. | Core | GA
az batch application set | Update properties for a Batch application. | Core | GA
az batch application show | Gets information about the specified application. | Core | GA
az batch application summary | View a summary of Batch application packages. | Core | GA
az batch application summary list | Lists all of the applications available in the specified account. | Core | GA
az batch application summary show | Gets information about the specified application. | Core | GA
az batch job | Manage Batch jobs. | Core | GA
az batch job-schedule | Manage Batch job schedules. | Core | GA
az batch job-schedule create | Add a Batch job schedule to an account. | Core | GA
az batch job-schedule delete | Deletes a Job Schedule from the specified Account. | Core | GA
az batch job-schedule disable | Disables a Job Schedule. | Core | GA
az batch job-schedule enable | Enables a Job Schedule. | Core | GA
az batch job-schedule list | Lists all of the Job Schedules in the specified Account. | Core | GA
az batch job-schedule reset | Reset the properties of a job schedule. An updated job specification only applies to new jobs. | Core | GA
az batch job-schedule set | Update the properties of a job schedule. | Core | GA
az batch job-schedule show | Gets information about the specified Job Schedule. | Core | GA
az batch job-schedule stop | Terminates a Job Schedule. | Core | GA
az batch job create | Add a job to a Batch account. | Core | GA
az batch job delete | Deletes a job from a Batch account. | Core | GA
az batch job disable | Disable a Batch job. | Core | GA
az batch job enable | Enable a Batch job. | Core | GA
az batch job list | List all of the jobs or job schedule in a Batch account. | Core | GA
az batch job prep-release-status | View the status of Batch job preparation and release tasks. | Core | GA
az batch job prep-release-status list | Lists the execution status of the Job Preparation and Job Release Task for the specified Job across the Compute Nodes where the Job has run. | Core | GA
az batch job reset | Update the properties of a Batch job. Unspecified properties which can be updated are reset to their defaults. | Core | GA
az batch job set | Update the properties of a Batch job. Updating a property in a subgroup will reset the unspecified properties of that group. | Core | GA
az batch job show | Gets information about the specified Batch job. | Core | GA
az batch job stop | Stop a running Batch job. | Core | GA
az batch job task-counts | View the number of tasks and slots in a Batch job and their states. | Core | GA
az batch job task-counts show | Gets the Task counts for the specified Job. | Core | GA
az batch location | Manage Batch service options for a subscription at the region level. | Core | GA
az batch location list-skus | List virtual machine SKUs available in a location. | Core | GA
az batch location quotas | Manage Batch service quotas at the region level. | Core | GA
az batch location quotas show | Gets the Batch service quotas for the specified subscription at the given location. | Core | GA
az batch node | Manage Batch compute nodes. | Core | GA
az batch node delete | Removes Compute Nodes from the specified Pool. | Core | GA
az batch node file | Manage Batch compute node files. | Core | GA
az batch node file delete | Deletes the specified file from the Compute Node. | Core | GA
az batch node file download | Download the content of a node file. | Core | GA
az batch node file list | Lists all of the files in Task directories on the specified Compute Node. | Core | GA
az batch node file show | Gets the properties of the specified Compute Node file. | Core | GA
az batch node list | Lists the Compute Nodes in the specified Pool. | Core | GA
az batch node reboot | Reboot a Batch compute node. | Core | GA
az batch node remote-login-settings | Retrieve the remote login settings for a Batch compute node. | Core | GA
az batch node remote-login-settings show | Gets the settings required for remote login to a Compute Node. | Core | GA
az batch node scheduling | Manage task scheduling for a Batch compute node. | Core | GA
az batch node scheduling disable | Disable scheduling on a Batch compute node. | Core | GA
az batch node scheduling enable | Enable scheduling on a Batch compute node. | Core | GA
az batch node service-logs | Manage the service log files of a Batch compute node. | Core | GA
az batch node service-logs upload | Upload service logs from a specified Batch compute node. | Core | GA
az batch node show | Gets information about the specified Compute Node. | Core | GA
az batch node user | Manage the user accounts of a Batch compute node. | Core | GA
az batch node user create | Add a user account to a Batch compute node. | Core | GA
az batch node user delete | Deletes a user Account from the specified Compute Node. | Core | GA
az batch node user reset | Update the properties of a user account on a Batch compute node. Unspecified properties which can be updated are reset to their defaults. | Core | GA
az batch pool | Manage Batch pools. | Core | GA
az batch pool autoscale | Manage automatic scaling of Batch pools. | Core | GA
az batch pool autoscale disable | Disables automatic scaling for a Pool. | Core | GA
az batch pool autoscale enable | Enables automatic scaling for a Pool. | Core | GA
az batch pool autoscale evaluate | Gets the result of evaluating an automatic scaling formula on the Pool. | Core | GA
az batch pool create | Create a Batch pool in an account. When creating a pool, choose arguments from either Cloud Services Configuration or Virtual Machine Configuration. | Core | GA
az batch pool delete | Deletes a Pool from the specified Account. | Core | GA
az batch pool list | Lists all of the Pools in the specified Account. | Core | GA
az batch pool node-counts | Get node counts for Batch pools. | Core | GA
az batch pool node-counts list | Gets the number of Compute Nodes in each state, grouped by Pool. | Core | GA
az batch pool reset | Update the properties of a Batch pool. Unspecified properties which can be updated are reset to their defaults. | Core | GA
az batch pool resize | Resize or stop resizing a Batch pool. | Core | GA
az batch pool set | Update the properties of a Batch pool. Updating a property in a subgroup will reset the unspecified properties of that group. | Core | GA
az batch pool show | Gets information about the specified Pool. | Core | GA
az batch pool supported-images | Query information on VM images supported by Azure Batch service. | Core | GA
az batch pool supported-images list | Lists all Virtual Machine Images supported by the Azure Batch service. | Core | GA
az batch pool usage-metrics | View usage metrics of Batch pools. | Core | GA
az batch pool usage-metrics list | Lists the usage metrics, aggregated by Pool across individual time intervals, for the specified Account. | Core | GA
az batch private-endpoint-connection | Manage Batch account private endpoint connections. | Core | GA
az batch private-endpoint-connection list | List all of the private endpoint connections in the specified account. | Core | GA
az batch private-endpoint-connection show | Get information about the specified private endpoint connection. | Core | GA
az batch private-link-resource | Manage Batch account private Link Resources. | Core | GA
az batch private-link-resource list | List all of the private link resources in the specified account. | Core | GA
az batch private-link-resource show | Get information about the specified private link resource. | Core | GA
az batch task | Manage Batch tasks. | Core | GA
az batch task create | Create Batch tasks. | Core | GA
az batch task delete | Deletes a Task from the specified Job. | Core | GA
az batch task file | Manage Batch task files. | Core | GA
az batch task file delete | Deletes the specified Task file from the Compute Node where the Task ran. | Core | GA
az batch task file download | Download the content of a Batch task file. | Core | GA
az batch task file list | Lists the files in a Task's directory on its Compute Node. | Core | GA
az batch task file show | Gets the properties of the specified Task file. | Core | GA
az batch task list | Lists all of the Tasks that are associated with the specified Job. | Core | GA
az batch task reactivate | Reactivates a Task, allowing it to run again even if its retry count has been exhausted. | Core | GA
az batch task reset | Reset the properties of a Batch task. | Core | GA
az batch task show | Gets information about the specified Task. | Core | GA
az batch task stop | Terminates the specified Task. | Core | GA
az batch task subtask | Manage subtask information of a Batch task. | Core | GA
az batch task subtask list | Lists all of the subtasks that are associated with the specified multi-instance Task. | Core | GA
Az.Batch Module

The Azure Batch cmdlets in the Azure module enable you to manage Microsoft Azure Batch
services in Azure PowerShell.

Batch
Cmdlet | Description
Disable-AzBatchAutoScale | Disables automatic scaling of a pool.
Disable-AzBatchComputeNodeScheduling | Disables task scheduling on the specified compute node.
Disable-AzBatchJob | Disables a Batch job.
Disable-AzBatchJobSchedule | Disables a Batch job schedule.
Enable-AzBatchAutoScale | Enables automatic scaling of a pool.
Enable-AzBatchComputeNodeScheduling | Enables task scheduling on the specified compute node.
Enable-AzBatchJob | Enables a Batch job.
Enable-AzBatchJobSchedule | Enables a Batch job schedule.
Enable-AzBatchTask | Reactivates a task.
Get-AzBatchAccount | Gets a Batch account in the current subscription.
Get-AzBatchAccountKey | Gets the keys of a Batch account.
Get-AzBatchApplication | Gets information about the specified application.
Get-AzBatchApplicationPackage | Gets information about an application package in a Batch account.
Get-AzBatchCertificate | Gets the certificates in a Batch account.
Get-AzBatchComputeNode | Gets Batch compute nodes from a pool.
Get-AzBatchComputeNodeExtension | Gets Batch compute node extensions from a compute node.
Get-AzBatchJob | Gets Batch jobs for a Batch account or job schedule.


Get-AzBatchJobPreparationAndReleaseTaskStatus | Gets Batch job preparation and release task status.
Get-AzBatchJobSchedule | Gets Batch job schedules.
Get-AzBatchLocationQuota | Gets the Batch service quotas for your subscription at the given location.
Get-AzBatchNodeFile | Gets the properties of Batch node files.
Get-AzBatchNodeFileContent | Gets a Batch node file.
Get-AzBatchPool | Gets Batch pools under the specified Batch account.
Get-AzBatchPoolNodeCount | Gets Batch node counts per node state grouped by pool id.
Get-AzBatchPoolUsageMetric | Gets pool usage metrics for a Batch account.
Get-AzBatchRemoteDesktopProtocolFile | Gets an RDP file from a compute node.
Get-AzBatchRemoteLoginSetting | Gets remote logon settings for a compute node.
Get-AzBatchSubtask | Gets the subtask information of the specified task.
Get-AzBatchSupportedImage | Gets Batch supported images for a Batch account.
Get-AzBatchSupportedVirtualMachineSku | Gets the list of Batch supported Virtual Machine VM sizes available at the given location.
Get-AzBatchTask | Gets the Batch tasks for a job.
Get-AzBatchTaskCount | Gets the task counts for the specified job.
Get-AzBatchTaskSlotCount | Gets the task slot counts for the specified job.
New-AzBatchAccount | Creates a Batch account.
New-AzBatchAccountKey | Regenerates a key of a Batch account.
New-AzBatchApplication | Adds an application to the specified Batch account.
New-AzBatchApplicationPackage | Creates an application package in a Batch account.
New-AzBatchCertificate | Adds a certificate to the specified Batch account.
New-AzBatchComputeNodeUser | Creates a user account on a Batch compute node.
New-AzBatchJob | Creates a job in the Batch service.
New-AzBatchJobSchedule | Creates a job schedule in the Batch service.


New-AzBatchPool | Creates a pool in the Batch service.
New-AzBatchResourceFile | Creates a Resource File for usage by New-AzBatchTask.
New-AzBatchTask | Creates a Batch task under a job.
Remove-AzBatchAccount | Removes a Batch account.
Remove-AzBatchApplication | Deletes an application from a Batch account.
Remove-AzBatchApplicationPackage | Deletes an application package record and the binary file.
Remove-AzBatchCertificate | Deletes a certificate from an account.
Remove-AzBatchComputeNode | Removes compute nodes from a pool.
Remove-AzBatchComputeNodeUser | Deletes a user account from a Batch compute node.
Remove-AzBatchJob | Deletes a Batch job.
Remove-AzBatchJobSchedule | Removes a Batch job schedule.
Remove-AzBatchNodeFile | Deletes a node file for a task or compute node.
Remove-AzBatchPool | Deletes the specified Batch pool.
Remove-AzBatchTask | Deletes a Batch task.
Reset-AzBatchComputeNode | Reinstalls the operating system on the specified compute node.
Restart-AzBatchComputeNode | Reboots the specified compute node.
Set-AzBatchAccount | Updates a Batch account.
Set-AzBatchApplication | Updates settings for the specified application.
Set-AzBatchComputeNodeUser | Modifies properties of an account on a Batch compute node.
Set-AzBatchJob | Updates a Batch job.
Set-AzBatchJobSchedule | Sets a job schedule.
Set-AzBatchPool | Updates the properties of a pool.
Set-AzBatchTask | Updates the properties of a task.
Start-AzBatchComputeNodeServiceLogUpload | Upload compute node service log files to an Azure Storage container.
Start-AzBatchPoolResize | Starts to resize a pool.
Stop-AzBatchCertificateDeletion | Cancels a failed deletion of a certificate.
Stop-AzBatchJob | Stops a Batch job.
Stop-AzBatchJobSchedule | Stops a Batch job schedule.
Stop-AzBatchPoolResize | Stops a pool resize operation.
Stop-AzBatchTask | Stops a Batch task.
Test-AzBatchAutoScale | Gets the result of an automatic scaling formula on a pool.
Azure Batch SDK for .NET - latest
08/11/2025

Packages - latest
Reference | Package | Source
Batch | Microsoft.Azure.Batch | GitHub
Batch - Conventions Files | Microsoft.Azure.Batch.Conventions.Files | GitHub
Batch - File Staging | Microsoft.Azure.Batch.FileStaging | GitHub
Resource Management - Batch | Azure.ResourceManager.Batch | GitHub


Azure Batch libraries for Java
09/01/2025

Overview
Run large-scale parallel and high-performance computing applications efficiently in the cloud
with Azure Batch.

To get started with Azure Batch, see Create a Batch account with the Azure portal.

Client library
The Azure Batch client libraries let you configure compute nodes and pools, define tasks and
configure them to run in jobs, and set up a job manager to control and monitor job execution.
Learn more about using these objects to run large-scale parallel compute solutions.

Add a dependency to your Maven pom.xml file to use the client library in your project. The
client library source code can be found on GitHub.

XML

<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-batch</artifactId>
    <version>4.0.0</version>
</dependency>

Example
Set up a pool of Linux compute nodes in a batch account:

Java

// create the batch client for an account using its URI and keys
BatchClient client = BatchClient.open(new BatchSharedKeyCredentials(
        "https://fabrikambatch.eastus.batch.azure.com", "fabrikambatch", batchKey));

// configure a pool of VMs to use
VirtualMachineConfiguration configuration = new VirtualMachineConfiguration();
configuration.withNodeAgentSKUId("batch.node.ubuntu 16.04");
client.poolOperations().createPool(poolId, poolVMSize, configuration, poolVMCount);
Explore the Client APIs

Management API
Use the Azure Batch management libraries to create and delete batch accounts, read and
regenerate batch account keys, and manage batch account storage.

Add a dependency to your Maven pom.xml file to use the management API in your project.

XML

<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-mgmt-batch</artifactId>
    <version>1.3.0</version>
</dependency>

Example
Create an Azure Batch account and configure a new application and Azure storage account for
it.

Java

BatchAccount batchAccount = azure.batchAccounts().define("newBatchAcct")
        .withRegion(Region.US_EAST)
        .withNewResourceGroup("myResourceGroup")
        .defineNewApplication("batchAppName")
            .defineNewApplicationPackage(applicationPackageName)
            .withAllowUpdates(true)
            .withDisplayName(applicationDisplayName)
            .attach()
        .withNewStorageAccount("batchStorageAcct")
        .create();

Explore the Management APIs

Samples
Manage Batch accounts

Explore more sample Java code for Azure Batch you can use in your apps.
Azure Batch SDK for JavaScript - latest
09/01/2025

Packages - latest
Reference | Package | Source
Batch | @azure/batch | GitHub
Resource Management - Batch | @azure/arm-batch | GitHub


Azure Batch libraries for Python
09/01/2025

Overview
Run large-scale parallel and high-performance computing applications efficiently in the cloud
with Azure Batch.

To get started with Azure Batch, see Create a Batch account with the Azure portal.

Install the libraries

Client library
The Azure Batch client libraries let you configure compute nodes and pools, define tasks and
configure them to run in jobs, and set up a job manager to control and monitor job execution.
Learn more about using these objects to run large-scale parallel compute solutions.

Bash

pip install azure-batch

Example
Set up a pool of Linux compute nodes in a batch account:

Python

import azure.batch
from azure.batch import batch_auth, BatchServiceClient, models

# create the batch client for an account using its URI and keys
# (account, key, and batch_url are placeholders for your own values)
creds = batch_auth.SharedKeyCredentials(account, key)
client = BatchServiceClient(creds, batch_url)

# Create the VirtualMachineConfiguration, specifying
# the VM image reference and the Batch node agent to
# be installed on the node.
vmc = models.VirtualMachineConfiguration(
    image_reference=models.ImageReference(
        publisher='Canonical',
        offer='UbuntuServer',
        sku='18.04-LTS'
    ),
    node_agent_sku_id="batch.node.ubuntu 18.04")

# Assign the virtual machine configuration to the pool
new_pool = models.PoolAddParameter(
    id='new_pool',
    vm_size='standard_d2_v2',
    virtual_machine_configuration=vmc
)

# Create pool in the Batch service
client.pool.add(new_pool)

Explore the Client APIs

Management API
Use the Azure Batch management libraries to create and delete batch accounts, read and
regenerate batch account keys, and manage batch account storage.

Bash

pip install azure-mgmt-batch

Example
Create an Azure Batch account and configure a new application and Azure storage account for
it.

Python

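from azure.identity import DefaultAzureCredential
from azure.mgmt.batch import BatchManagementClient
from azure.mgmt.batch.models import AutoStorageBaseProperties, BatchAccountCreateParameters
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import Kind, Sku, SkuName, StorageAccountCreateParameters

SUBSCRIPTION_ID = '<subscription-id>'  # placeholder: your Azure subscription ID
LOCATION = 'eastus'
GROUP_NAME = 'batchresourcegroup'
STORAGE_ACCOUNT_NAME = 'batchstorageaccount'
BATCH_ACCOUNT_NAME = 'mybatchaccount'  # lowercase letters and numbers only

# Create the management clients (assumes the azure-identity package and a
# credential that DefaultAzureCredential can discover)
credential = DefaultAzureCredential()
resource_client = ResourceManagementClient(credential, SUBSCRIPTION_ID)
storage_client = StorageManagementClient(credential, SUBSCRIPTION_ID)
batch_client = BatchManagementClient(credential, SUBSCRIPTION_ID)

# Create Resource group
print('Create Resource Group')
resource_client.resource_groups.create_or_update(GROUP_NAME, {'location': LOCATION})

# Create a storage account
storage_async_operation = storage_client.storage_accounts.begin_create(
    GROUP_NAME,
    STORAGE_ACCOUNT_NAME,
    StorageAccountCreateParameters(
        sku=Sku(name=SkuName.STANDARD_RAGRS),
        kind=Kind.STORAGE,
        location=LOCATION
    )
)
storage_account = storage_async_operation.result()

# Create a Batch Account, specifying the storage account we want to link
batch_account_parameters = BatchAccountCreateParameters(
    location=LOCATION,
    auto_storage=AutoStorageBaseProperties(storage_account_id=storage_account.id)
)
creating = batch_client.batch_account.begin_create(
    GROUP_NAME, BATCH_ACCOUNT_NAME, batch_account_parameters)
creating.wait()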

Explore the Management APIs


Azure Batch Service REST API Reference
Article • 04/07/2022

The REST APIs for the Azure Batch service offer developers a means to schedule large-
scale parallel and HPC applications in the cloud.

Azure Batch REST APIs can be accessed from within a service running in Azure, or
directly over the Internet from any application that can send an HTTPS request and
receive an HTTPS response.

Batch account
All access to the Batch service requires a Batch account, and the account is the basis for
authentication.

The Base URL for Batch service is https://{account-name}.{region-id}.batch.azure.com
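
As a small illustration of the template above, the base URL can be assembled from its two parts. The function name batch_base_url is hypothetical; the sample account and region are taken from the Java example elsewhere in this document.

```python
def batch_base_url(account_name, region_id):
    """Fill the https://{account-name}.{region-id}.batch.azure.com template."""
    if not account_name or not region_id:
        raise ValueError("account_name and region_id are required")
    return f"https://{account_name}.{region_id}.batch.azure.com"


print(batch_base_url("fabrikambatch", "eastus"))
# prints: https://fabrikambatch.eastus.batch.azure.com
```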

REST APIs
Use these APIs to schedule and run large scale computational workloads. All operations
conform to the HTTP/1.1 protocol specification and each operation returns a request-id
header that can be used to obtain information about the request. You must make sure
that requests made to these resources are secure. For more information, see
Authenticate Requests to the Azure Batch Service.

Account

Application

Certificate

Compute Node

File

Job

Job Schedule

Pool

Task
Common operations
Add a pool to an account

Add a task to a job

List the compute nodes in a pool

Get information about a task


Batch Management REST API Reference
Article • 10/31/2023

Azure Batch enables you to run large-scale parallel and high-performance computing
(HPC) applications efficiently in the cloud. It's a platform service that schedules
compute-intensive work to run on a managed collection of virtual machines, and can
automatically scale compute resources to meet the needs of your jobs.

The Batch Management REST API provides operations for working with the Batch service
through the Microsoft.Batch provider.

See also
Azure Batch documentation
Azure Batch code samples on GitHub
Microsoft.Batch resource types
Article • 12/09/2024

This article lists the available versions for each resource type.

For a list of changes in each API version, see the change log.

Resource types and versions



Types Versions

Microsoft.Batch/batchAccounts 2015-12-01
2017-01-01
2017-05-01
2017-09-01
2018-12-01
2019-04-01
2019-08-01
2020-03-01
2020-05-01
2020-09-01
2021-01-01
2021-06-01
2022-01-01
2022-06-01
2022-10-01
2023-05-01
2023-11-01
2024-02-01
2024-07-01

Microsoft.Batch/batchAccounts/applications 2015-12-01
2017-01-01
2017-05-01
2017-09-01
2018-12-01
2019-04-01
2019-08-01
2020-03-01
2020-05-01
2020-09-01
2021-01-01
2021-06-01
2022-01-01
2022-06-01
2022-10-01
2023-05-01
2023-11-01
2024-02-01
2024-07-01

Microsoft.Batch/batchAccounts/applications/versions 2015-12-01
2017-01-01
2017-05-01
2017-09-01
2018-12-01
2019-04-01
2019-08-01
2020-03-01
2020-05-01
2020-09-01
2021-01-01
2021-06-01
2022-01-01
2022-06-01
2022-10-01
2023-05-01
2023-11-01
2024-02-01
2024-07-01

Microsoft.Batch/batchAccounts/certificates 2017-09-01
2018-12-01
2019-04-01
2019-08-01
2020-03-01
2020-05-01
2020-09-01
2021-01-01
2021-06-01
2022-01-01
2022-06-01
2022-10-01
2023-05-01
2023-11-01
2024-02-01
2024-07-01

Microsoft.Batch/batchAccounts/detectors 2022-01-01
2022-06-01
2022-10-01
2023-05-01
2023-11-01
2024-02-01
2024-07-01

Microsoft.Batch/batchAccounts/networkSecurityPerimeterConfigurations 2024-07-01

Microsoft.Batch/batchAccounts/pools 2017-09-01
2018-12-01
2019-04-01
2019-08-01
2020-03-01
2020-05-01
2020-09-01
2021-01-01
2021-06-01
2022-01-01
2022-06-01
2022-10-01
2023-05-01
2023-11-01
2024-02-01
2024-07-01

Microsoft.Batch/batchAccounts/privateEndpointConnections 2020-03-01
2020-05-01
2020-09-01
2021-01-01
2021-06-01
2022-01-01
2022-06-01
2022-10-01
2023-05-01
2023-11-01
2024-02-01
2024-07-01

Microsoft.Batch/batchAccounts/privateLinkResources 2020-03-01
2020-05-01
2020-09-01
2021-01-01
2021-06-01
2022-01-01
2022-06-01
2022-10-01
2023-05-01
2023-11-01
2024-02-01
2024-07-01
Azure Batch monitoring data reference
Article • 04/02/2025

This article contains all the monitoring reference information for this service.

See Monitor Azure Batch for details on the data you can collect for Azure Batch and how
to use it.

Metrics
This section lists all the automatically collected platform metrics for this service. These
metrics are also part of the global list of all platform metrics supported in Azure
Monitor.

For information on metric retention, see Azure Monitor Metrics overview.

Supported metrics for Microsoft.Batch/batchaccounts


The following table lists the metrics available for the Microsoft.Batch/batchaccounts
resource type.

All columns might not be present in every table.


Some columns might be beyond the viewing area of the page. Select Expand table
to view all available columns.

Table headings

Category - The metrics group or classification.
Metric - The metric display name as it appears in the Azure portal.
Name in REST API - The metric name as referred to in the REST API.
Unit - Unit of measure.
Aggregation - The default aggregation type. Valid values: Average (Avg), Minimum
(Min), Maximum (Max), Total (Sum), Count.
Dimensions - Dimensions available for the metric.
Time Grains - Intervals at which the metric is sampled. For example, PT1M indicates
that the metric is sampled every minute, PT30M every 30 minutes, PT1H every hour,
and so on.
DS Export - Whether the metric is exportable to Azure Monitor Logs via diagnostic
settings. For information on exporting metrics, see Create diagnostic settings in
Azure Monitor.
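
The time grain values above (PT1M, PT30M, PT1H) follow the ISO 8601 duration notation. As a rough sketch, a client that needs the sampling interval in seconds could parse them as follows. The helper name is hypothetical, and only the hour, minute, and second components used by these examples are handled.

```python
import re
from datetime import timedelta

# Matches time-only ISO 8601 durations such as PT1M, PT30M, PT1H, PT1H30M.
_TIME_GRAIN = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def time_grain_to_timedelta(grain):
    """Convert a time grain such as 'PT1M' into a timedelta."""
    match = _TIME_GRAIN.match(grain)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported time grain: {grain!r}")
    hours, minutes, seconds = (int(part) if part else 0 for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


for grain in ("PT1M", "PT30M", "PT1H"):
    print(grain, int(time_grain_to_timedelta(grain).total_seconds()))
```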
Metric | Name in REST API | Unit | Aggregation | Dimensions | Time Grains | DS Export
Dedicated Core Count - Total number of dedicated cores in the batch account | CoreCount | Count | Total (Sum) | <none> | PT1M | No
Creating Node Count - Number of nodes being created | CreatingNodeCount | Count | Total (Sum) | <none> | PT1M | No
Idle Node Count - Number of idle nodes | IdleNodeCount | Count | Total (Sum) | <none> | PT1M | No
Job Delete Complete Events - Total number of jobs that have been successfully deleted. | JobDeleteCompleteEvent | Count | Total (Sum) | jobId | PT1M | Yes
Job Delete Start Events - Total number of jobs that have been requested to be deleted. | JobDeleteStartEvent | Count | Total (Sum) | jobId | PT1M | Yes

Job JobDisableCompleteEvent Count Total (Sum) jobId PT1M Yes


Disable
Complete
Events

Total
number of
jobs that
have been
successfully
disabled.

Job JobDisableStartEvent Count Total (Sum) jobId PT1M Yes


Disable
Start
Events

Total
number of
jobs that
have been
requested
to be
disabled.

Job Start JobStartEvent Count Total (Sum) jobId PT1M Yes


Events

Total
number of
jobs that
have been
successfully
started.

Job JobTerminateCompleteEvent Count Total (Sum) jobId PT1M Yes


Terminate
Complete
Events

Total
number of
jobs that
have been
Metric Name in REST API Unit Aggregation Dimensions Time DS
Grains Export

successfully
terminated.

Job JobTerminateStartEvent Count Total (Sum) jobId PT1M Yes


Terminate
Start
Events

Total
number of
jobs that
have been
requested
to be
terminated.

Leaving LeavingPoolNodeCount Count Total (Sum) <none> PT1M No


Pool Node
Count

Number of
nodes
leaving the
Pool

LowPriority LowPriorityCoreCount Count Total (Sum) <none> PT1M No


Core Count

Total
number of
low-priority
cores in the
batch
account

Offline OfflineNodeCount Count Total (Sum) <none> PT1M No


Node
Count

Number of
offline
nodes

Pool PoolCreateEvent Count Total (Sum) poolId PT1M Yes


Create
Events

Total
Metric Name in REST API Unit Aggregation Dimensions Time DS
Grains Export

number of
pools that
have been
created

Pool PoolDeleteCompleteEvent Count Total (Sum) poolId PT1M Yes


Delete
Complete
Events

Total
number of
pool
deletes
that have
completed

Pool PoolDeleteStartEvent Count Total (Sum) poolId PT1M Yes


Delete
Start
Events

Total
number of
pool
deletes
that have
started

Pool Resize PoolResizeCompleteEvent Count Total (Sum) poolId PT1M Yes


Complete
Events

Total
number of
pool
resizes that
have
completed

Pool Resize PoolResizeStartEvent Count Total (Sum) poolId PT1M Yes


Start
Events

Total
number of
pool
Metric Name in REST API Unit Aggregation Dimensions Time DS
Grains Export

resizes that
have
started

Preempted PreemptedNodeCount Count Total (Sum) <none> PT1M No


Node
Count

Number of
preempted
nodes

Rebooting RebootingNodeCount Count Total (Sum) <none> PT1M No


Node
Count

Number of
rebooting
nodes

Reimaging ReimagingNodeCount Count Total (Sum) <none> PT1M No


Node
Count

Number of
reimaging
nodes

Running RunningNodeCount Count Total (Sum) <none> PT1M No


Node
Count

Number of
running
nodes

Starting StartingNodeCount Count Total (Sum) <none> PT1M No


Node
Count

Number of
nodes
starting

Start Task StartTaskFailedNodeCount Count Total (Sum) <none> PT1M No


Failed
Node
Count
Metric Name in REST API Unit Aggregation Dimensions Time DS
Grains Export

Number of
nodes
where the
Start Task
has failed

Task TaskCompleteEvent Count Total (Sum) poolId , PT1M Yes


Complete jobId
Events

Total
number of
tasks that
have
completed

Task Fail TaskFailEvent Count Total (Sum) poolId , PT1M Yes


Events jobId

Total
number of
tasks that
have
completed
in a failed
state

Task Start TaskStartEvent Count Total (Sum) poolId , PT1M Yes


Events jobId

Total
number of
tasks that
have
started

Low- TotalLowPriorityNodeCount Count Total (Sum) <none> PT1M No


Priority
Node
Count

Total
number of
low-priority
nodes in
Metric Name in REST API Unit Aggregation Dimensions Time DS
Grains Export

the batch
account

Dedicated TotalNodeCount Count Total (Sum) <none> PT1M No


Node
Count

Total
number of
dedicated
nodes in
the batch
account

Unusable UnusableNodeCount Count Total (Sum) <none> PT1M No


Node
Count

Number of
unusable
nodes

Waiting WaitingForStartTaskNodeCount Count Total (Sum) <none> PT1M No


For Start
Task Node
Count

Number of
nodes
waiting for
the Start
Task to
complete

Metric dimensions
For information about what metric dimensions are, see Multi-dimensional metrics.

This service has the following dimensions associated with its metrics.

poolId
jobId

Resource logs
This section lists the types of resource logs you can collect for this service. The section
pulls from the list of all resource logs category types supported in Azure Monitor.

Supported resource logs for Microsoft.Batch/batchaccounts

| Category | Category display name | Log table | Supports basic log plan | Supports ingestion-time transformation | Example queries | Costs to export |
|---|---|---|---|---|---|---|
| AuditLog | Audit Logs | AzureDiagnostics (Logs from multiple Azure resources.) | No | No | Queries | Yes |
| ServiceLog | Service Logs | AzureDiagnostics (Logs from multiple Azure resources.) | No | No | Queries | No |
| ServiceLogs | Service Logs (deprecated) | AzureDiagnostics (Logs from multiple Azure resources.) | No | No | Queries | Yes |

Service log events


Batch service logs contain events emitted by the Batch service during the lifetime of an
individual Batch resource, such as a pool or task. The Batch service emits the following
log events:

Pool create
Pool delete start
Pool delete complete
Pool resize start
Pool resize complete
Pool autoscale
Task start
Task complete
Task fail
Task schedule fail

Each event emitted by Batch is logged in JSON format. The following example shows the
body of a sample pool create event:

JSON

{
"id": "myPool1",
"displayName": "Production Pool",
"vmSize": "Standard_F1s",
"imageType": "VirtualMachineConfiguration",
"cloudServiceConfiguration": {
"osFamily": "3",
"targetOsVersion": "*"
},
"networkConfiguration": {
"subnetId": " "
},
"virtualMachineConfiguration": {
"imageReference": {
"publisher": " ",
"offer": " ",
"sku": " ",
"version": " "
},
"nodeAgentId": " "
},
"resizeTimeout": "300000",
"targetDedicatedNodes": 2,
"targetLowPriorityNodes": 2,
"taskSlotsPerNode": 1,
"vmFillType": "Spread",
"enableAutoScale": false,
"enableInterNodeCommunication": false,
"isAutoPool": false
}

Azure Monitor Logs tables


This section lists the Azure Monitor Logs tables relevant to this service, which are
available for query by Log Analytics using Kusto queries. The tables contain resource log
data and possibly more depending on what is collected and routed to them.

Batch Accounts
microsoft.batch/batchaccounts
AzureActivity
AzureMetrics
AzureDiagnostics

Activity log
The linked table lists the operations that can be recorded in the activity log for this
service. These operations are a subset of all the possible resource provider operations in
the activity log.

For more information on the schema of activity log entries, see Activity Log schema.

Microsoft.Batch resource provider operations

Related content
See Monitor Batch for a description of monitoring Batch.
See Monitor Azure resources with Azure Monitor for details on monitoring Azure
resources.
Learn about the Batch APIs and tools available for building Batch solutions.



Batch Analytics
Article • 04/02/2025

The topics in this section contain reference information for the events and alerts
available for Batch service resources.

See Azure Batch diagnostic logging for more information on enabling and consuming
Batch diagnostic logs.

Diagnostic logs
The Azure Batch service emits the following diagnostic log events during the lifetime of
certain Batch resources.

Service Log events


Pool create
Pool delete start
Pool delete complete
Pool resize start
Pool resize complete
Pool autoscale
Task start
Task complete
Task fail
Task schedule fail



Pool create event
Article • 12/14/2021

This event is emitted once a pool has been created. The content of the log will expose
general information about the pool. Note that if the target size of the pool is greater
than 0 compute nodes, a pool resize start event will follow immediately after this event.

The following example shows the body of a pool create event.

{
"id": "myPool1",
"displayName": "Production Pool",
"vmSize": "Standard_F1s",
"imageType": "VirtualMachineConfiguration",
"cloudServiceConfiguration": {
"osFamily": "3",
"targetOsVersion": "*"
},
"networkConfiguration": {
"subnetId": " "
},
"virtualMachineConfiguration": {
"imageReference": {
"publisher": " ",
"offer": " ",
"sku": " ",
"version": " "
},
"nodeAgentId": " "
},
"resizeTimeout": "300000",
"targetDedicatedNodes": 2,
"targetLowPriorityNodes": 2,
"taskSlotsPerNode": 1,
"vmFillType": "Spread",
"enableAutoScale": false,
"enableInterNodeCommunication": false,
"isAutoPool": false
}
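As a rough illustration of the note above about the follow-up resize event, the sketch below checks whether a pool create event describes a pool with a nonzero target size. The function name `expects_resize_start` is hypothetical, and treating the target size as the sum of targetDedicatedNodes and targetLowPriorityNodes is an assumption, not something the event schema states.

```python
import json


def expects_resize_start(pool_create_event: dict) -> bool:
    """Return True if the pool was created with a nonzero target size.

    Assumption: the target size is the sum of dedicated and low-priority
    (Spot) target nodes; missing fields are treated as zero.
    """
    dedicated = pool_create_event.get("targetDedicatedNodes", 0)
    low_priority = pool_create_event.get("targetLowPriorityNodes", 0)
    return (dedicated + low_priority) > 0


# Trimmed copy of the sample event body shown above.
event = json.loads(
    '{"id": "myPool1", "targetDedicatedNodes": 2, "targetLowPriorityNodes": 2}'
)
if expects_resize_start(event):
    print(f"Expect a pool resize start event for {event['id']}")
```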

Element Type Notes

id String The ID of the pool.

displayName String The display name of the pool.

vmSize String The size of the virtual machines in the pool. All virtual machines in a pool are
the same size.

For information about available sizes of virtual machines for Cloud Services pools (pools
created with cloudServiceConfiguration), see Sizes for Cloud Services. Batch supports all
Cloud Services VM sizes except ExtraSmall.

For information about available VM sizes for pools using images from the Virtual Machines
Marketplace (pools created with virtualMachineConfiguration), see Sizes for Virtual Machines
(Linux) or Sizes for Virtual Machines (Windows). Batch supports all Azure VM sizes except
STANDARD_A0 and those with premium storage (STANDARD_GS, STANDARD_DS, and STANDARD_DSV2
series).

imageType String The deployment method for the image. Supported values are
virtualMachineConfiguration or cloudServiceConfiguration.

cloudServiceConfiguration Complex Type The cloud services configuration for the pool.

virtualMachineConfiguration Complex Type The virtual machine configuration for the pool.

networkConfiguration Complex Type The network configuration for the pool.

resizeTimeout Time The timeout for allocation of compute nodes to the pool specified for the
last resize operation on the pool. (The initial sizing when the pool is created counts as a
resize.)

targetDedicatedNodes Int32 The number of dedicated compute nodes that are requested for the
pool.

targetLowPriorityNodes Int32 The number of Azure Spot compute nodes that are requested for the
pool.

enableAutoScale Bool Specifies whether the pool size automatically adjusts over time.

enableInterNodeCommunication Bool Specifies whether the pool is set up for direct
communication between nodes.

isAutoPool Bool Specifies whether the pool was created via a job's AutoPool mechanism.

taskSlotsPerNode Int32 The maximum number of tasks that can run concurrently on a single
compute node in the pool.

vmFillType String Defines how the Batch service distributes tasks between compute nodes in the
pool. Valid values are Spread or Pack.

cloudServiceConfiguration

Warning

Cloud Services Configuration pools are deprecated. Please use Virtual Machine
Configuration pools instead.

Element name Type Notes

osFamily String The Azure Guest OS family to be installed on the virtual machines in
the pool.

Possible values are:

2 – OS Family 2, equivalent to Windows Server 2008 R2 SP1.

3 – OS Family 3, equivalent to Windows Server 2012.

4 – OS Family 4, equivalent to Windows Server 2012 R2.

For more information, see Azure Guest OS Releases.

targetOSVersion String The Azure Guest OS version to be installed on the virtual machines in
the pool.

The default value is *, which specifies the latest operating system version for the
specified family.

For other permitted values, see Azure Guest OS Releases.

virtualMachineConfiguration

Element name Type Notes

imageReference Complex Type Specifies information about the platform or Marketplace image to
use.

nodeAgentId String The SKU of the Batch node agent provisioned on the compute node.

windowsConfiguration Complex Type Specifies Windows operating system settings on the virtual
machine. This property must not be specified if the imageReference is referencing a Linux OS
image.

imageReference

Element name Type Notes

publisher String The publisher of the image.

offer String The offer of the image.

sku String The SKU of the image.

version String The version of the image.

windowsConfiguration

Element name Type Notes

enableAutomaticUpdates Boolean Indicates whether the virtual machine is enabled for automatic
updates. If this property is not specified, the default value is true.

networkConfiguration

Element name Type Notes

subnetId String Specifies the resource identifier of the subnet in which the pool's compute
nodes are created.
Pool delete start event
07/01/2025

This event is emitted when a pool delete operation is started. Since the pool delete is an
asynchronous event, you can expect a pool delete complete event to be emitted once the
delete operation completes.

The following example shows the body of a pool delete start event.

{
"id": "myPool1"
}


Element Type Notes

id String The ID of the pool.


Pool delete complete event
07/01/2025

This event is emitted when a pool delete operation is completed.

The following example shows the body of a pool delete complete event.

{
"id": "myPool1",
"startTime": "2016-09-09T22:13:48.579Z",
"endTime": "2016-09-09T22:14:08.836Z"
}


Element Type Notes

id String The ID of the pool.

startTime DateTime The time the pool delete started.

endTime DateTime The time the pool delete completed.
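The startTime and endTime elements can be used to measure how long the delete took. The following is a minimal sketch, assuming timestamps always use the UTC format with fractional seconds shown in the sample; the helper names `parse_batch_time` and `operation_seconds` are hypothetical.

```python
from datetime import datetime, timezone

# Assumed timestamp layout, taken from the sample event body.
_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_batch_time(value: str) -> datetime:
    """Parse a timestamp such as '2016-09-09T22:13:48.579Z' as UTC."""
    return datetime.strptime(value, _FORMAT).replace(tzinfo=timezone.utc)


def operation_seconds(event: dict) -> float:
    """Return the elapsed seconds between startTime and endTime."""
    elapsed = parse_batch_time(event["endTime"]) - parse_batch_time(event["startTime"])
    return elapsed.total_seconds()


event = {
    "id": "myPool1",
    "startTime": "2016-09-09T22:13:48.579Z",
    "endTime": "2016-09-09T22:14:08.836Z",
}
print(f"{event['id']} deleted in {operation_seconds(event):.3f} s")
```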

Remarks
For more information about states and error codes for the pool delete operation, see Delete a
pool from an account.
Pool resize start event
07/02/2025

This event is emitted when a pool resize is started. Since the pool resize is an asynchronous
event, you can expect a pool resize complete event to be emitted once the resize operation
completes.

The following example shows the body of a pool resize start event for a pool resizing from 0 to
2 nodes with a manual resize.

{
"id": "myPool1",
"nodeDeallocationOption": "Invalid",
"currentDedicatedNodes": 0,
"targetDedicatedNodes": 2,
"currentLowPriorityNodes": 0,
"targetLowPriorityNodes": 2,
"enableAutoScale": false,
"isAutoPool": false
}


Element Type Notes

id String The ID of the pool.

nodeDeallocationOption String Specifies when nodes may be removed from the pool, if the pool size
is decreasing.

Possible values are:

requeue – Terminate running tasks and requeue them. The tasks run again when the job is
enabled. Remove nodes as soon as tasks are terminated.

terminate – Terminate running tasks. The tasks won't run again. Remove nodes as soon as tasks
are terminated.

taskcompletion – Allow currently running tasks to complete. Schedule no new tasks while
waiting. Remove nodes when all tasks are completed.

retaineddata – Allow currently running tasks to complete, then wait for all task data
retention periods to expire. Schedule no new tasks while waiting. Remove nodes when all task
retention periods are expired.

The default value is requeue.

If the pool size is increasing, then the value is set to invalid.

currentDedicatedNodes Int32 The number of dedicated compute nodes currently assigned to the
pool.

targetDedicatedNodes Int32 The number of dedicated compute nodes that are requested for the
pool.

currentLowPriorityNodes Int32 The number of Spot compute nodes currently assigned to the pool.

targetLowPriorityNodes Int32 The number of Spot compute nodes that are requested for the pool.

enableAutoScale Bool Specifies whether the pool size automatically adjusts over time.

isAutoPool Bool Specifies whether the pool was created via a job's AutoPool
mechanism.
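To show how the current and target counts relate, here is a small sketch that classifies a resize start event as growing or shrinking the pool. The function name `resize_direction` is hypothetical, and comparing the combined dedicated and Spot counts is an assumption made for illustration.

```python
def resize_direction(event: dict) -> str:
    """Classify a pool resize start event as 'grow', 'shrink', or 'unchanged'.

    Assumption: pool size is the sum of dedicated and low-priority (Spot) nodes.
    """
    current = event["currentDedicatedNodes"] + event["currentLowPriorityNodes"]
    target = event["targetDedicatedNodes"] + event["targetLowPriorityNodes"]
    if target > current:
        return "grow"
    if target < current:
        return "shrink"
    return "unchanged"


# Values from the sample event body above (a manual resize from 0 nodes).
event = {
    "id": "myPool1",
    "currentDedicatedNodes": 0,
    "targetDedicatedNodes": 2,
    "currentLowPriorityNodes": 0,
    "targetLowPriorityNodes": 2,
}
print(event["id"], resize_direction(event))
```

A mixed change (for example, more dedicated nodes but fewer Spot nodes) is collapsed into a single label here; track the two counts separately if that distinction matters.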
Pool resize complete event
07/02/2025

This event is emitted when a pool resize completes or fails.

The following example shows the body of a pool resize complete event for a pool that
increased in size and completed successfully.

{
"id": "myPool",
"nodeDeallocationOption": "invalid",
"currentDedicatedNodes": 10,
"targetDedicatedNodes": 10,
"currentLowPriorityNodes": 5,
"targetLowPriorityNodes": 5,
"enableAutoScale": false,
"isAutoPool": false,
"startTime": "2016-09-09T22:13:06.573Z",
"endTime": "2016-09-09T22:14:01.727Z",
"resultCode": "Success",
"resultMessage": "The operation succeeded"
}


Element Type Notes

id String The ID of the pool.

nodeDeallocationOption String Specifies when nodes may be removed from the pool, if the pool
size is decreasing.

Possible values are:

requeue – Terminate running tasks and requeue them. The tasks run again when the job is
enabled. Remove nodes as soon as tasks are terminated.

terminate – Terminate running tasks. The tasks won't run again. Remove nodes as soon as tasks
are terminated.

taskcompletion – Allow currently running tasks to complete. Schedule no new tasks while
waiting. Remove nodes when all tasks are completed.

retaineddata – Allow currently running tasks to complete, then wait for all task data
retention periods to expire. Schedule no new tasks while waiting. Remove nodes when all task
retention periods are expired.

The default value is requeue.

If the pool size is increasing, then the value is set to invalid.

currentDedicatedNodes Int32 The number of dedicated compute nodes currently assigned to
the pool.

targetDedicatedNodes Int32 The number of dedicated compute nodes that are requested for
the pool.

currentLowPriorityNodes Int32 The number of Spot compute nodes currently assigned to the
pool.

targetLowPriorityNodes Int32 The number of Spot compute nodes that are requested for the
pool.

enableAutoScale Bool Specifies whether the pool size automatically adjusts over time.

isAutoPool Bool Specifies whether the pool was created via a job's AutoPool
mechanism.

startTime DateTime The time the pool resize started.

endTime DateTime The time the pool resize completed.

resultCode String The result of the resize.

resultMessage String A detailed message about the result.

If the resize completed successfully, it states that the operation succeeded.
Pool autoscale event
07/01/2025

This event is emitted each time automatic scaling is evaluated for a pool. The log content
includes the autoscale formula and its evaluation results for the pool.

The following example shows the body of a pool autoscale event for an autoscale evaluation
that failed due to insufficient sample data.

{
"id": "myPool1",
"timestamp": "2020-09-21T18:59:56.204Z",
"formula": "...",
"results": "...",
"error": {
"code": "InsufficientSampleData",
"message": "Autoscale evaluation failed due to insufficient sample data",
"values": [{
"name": "Message",
"value": "Line 15, Col 44: Insufficient data from data set: $RunningTasks wanted 70%, received 50%"
}
]
}
}


Element Type Notes

id String The ID of the pool.

timestamp DateTime The timestamp when automatic scaling is executed.

formula String The formula defined for automatic scaling.

results String Evaluation results for all variables used in the formula.

error Complex Type The detailed error for automatic scaling.

error

Element name Type Notes

code String An identifier for the automatic scaling error. Codes are invariant and are intended
to be consumed programmatically.

message String A message describing the automatic scaling error, intended to be suitable for
display in a user interface.

values Array List of name-value pairs describing more details of the automatic scaling error.
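The error element can be flattened into a single line for alerting or logging. The sketch below is one possible approach, assuming the error element is absent or empty when the evaluation succeeds (the source does not state this); `describe_autoscale_error` is a hypothetical helper name.

```python
from typing import Optional


def describe_autoscale_error(event: dict) -> Optional[str]:
    """Return a one-line summary of the autoscale error, or None if there is none.

    Assumption: a missing or empty 'error' element means the evaluation succeeded.
    """
    error = event.get("error")
    if not error:
        return None
    summary = f"{error['code']}: {error['message']}"
    # 'values' is a list of name-value pairs with extra detail.
    details = "; ".join(f"{v['name']}: {v['value']}" for v in error.get("values", []))
    return f"{summary} ({details})" if details else summary


event = {
    "id": "myPool1",
    "error": {
        "code": "InsufficientSampleData",
        "message": "Autoscale evaluation failed due to insufficient sample data",
        "values": [{"name": "Message", "value": "Line 15, Col 44: Insufficient data"}],
    },
}
print(describe_autoscale_error(event))
```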
Task start event
07/02/2025

This event is emitted once a task is scheduled to start on a compute node by the scheduler. If
the task is retried or requeued, this event will be emitted again for the same task. The retry
count and system task version will be updated accordingly.

The following example shows the body of a task start event.

{
"jobId": "myJob",
"id": "myTask",
"taskType": "User",
"systemTaskVersion": 220192842,
"requiredSlots": 1,
"nodeInfo": {
"poolId": "pool-001",
"nodeId": "tvm-257509324_1-20160908t162728z"
},
"multiInstanceSettings": {
"numberOfInstances": 1
},
"constraints": {
"maxTaskRetryCount": 2
},
"executionInfo": {
"retryCount": 0
}
}


Element name Type Notes

jobId String The ID of the job containing the task.

id String The ID of the task.

taskType String The type of the task. It's either a 'JobManager' indicating it's a job
manager task or 'User' indicating it isn't a job manager task.

systemTaskVersion Int32 The internal retry counter on a task. Internally the Batch service
retries a task to account for transient issues. These issues include internal scheduling
errors or attempts to recover from compute nodes in a bad state.

requiredSlots Int32 The required slots to run the task.

nodeInfo Complex Type Contains information about the compute node on which the task ran.

multiInstanceSettings Complex Type Specifies that the task is a multi-instance task requiring
multiple compute nodes. See multiInstanceSettings for details.

constraints Complex Type The execution constraints that apply to this task.

executionInfo Complex Type Contains information about the execution of the task.

nodeInfo


Element name Type Notes

poolId String The ID of the pool on which the task ran.

nodeId String The ID of the node on which the task ran.

multiInstanceSettings


Element name Type Notes

numberOfInstances Int32 The number of compute nodes required by the task.

constraints


Element name Type Notes

maxTaskRetryCount Int32 The maximum number of times the task is retried. The Batch service retries a
task if its exit code is nonzero.

This value specifically controls the number of retries. The Batch service tries
the task once, and may then retry up to this limit. For example, if the
maximum retry count is 3, Batch tries a task up to 4 times (one initial try and
3 retries).

If the maximum retry count is 0, the Batch service doesn't retry tasks.

If the maximum retry count is -1, the Batch service retries tasks without limit.

The default value is 0 (no retries).
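The retry rules above reduce to simple arithmetic. A minimal sketch follows; `max_attempts` is a hypothetical helper, and returning None to represent "no limit" is a choice made for this example.

```python
from typing import Optional


def max_attempts(max_task_retry_count: int) -> Optional[int]:
    """Return the maximum number of times Batch may try a task.

    One initial try plus the allowed retries; None means retries are unlimited (-1).
    """
    if max_task_retry_count == -1:
        return None
    if max_task_retry_count < -1:
        raise ValueError("maxTaskRetryCount must be -1 or greater")
    return 1 + max_task_retry_count


print(max_attempts(3))   # one initial try and 3 retries
print(max_attempts(0))   # no retries
print(max_attempts(-1))  # unlimited
```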

executionInfo

Element name Type Notes

retryCount Int32 The number of times the task is retried by the Batch service. The task is retried if it
exits with a nonzero exit code, up to the specified MaxTaskRetryCount.
Task complete event
07/02/2025

This event is emitted once a task is completed, regardless of the exit code. This event can be
used to determine the duration of a task, where the task ran, and whether it was retried.

The following example shows the body of a task complete event.

{
"jobId": "myJob",
"id": "myTask",
"taskType": "User",
"systemTaskVersion": 0,
"requiredSlots": 1,
"nodeInfo": {
"poolId": "pool-001",
"nodeId": "tvm-257509324_1-20160908t162728z"
},
"multiInstanceSettings": {
"numberOfInstances": 1
},
"constraints": {
"maxTaskRetryCount": 2
},
"executionInfo": {
"startTime": "2016-09-08T16:32:23.799Z",
"endTime": "2016-09-08T16:34:00.666Z",
"exitCode": 0,
"retryCount": 0,
"requeueCount": 0
}
}


Element name Type Notes

jobId String The ID of the job containing the task.

id String The ID of the task.

taskType String The type of the task. This is either 'JobManager', indicating a job manager
task, or 'User', indicating any other task. This event isn't emitted for job preparation
tasks, job release tasks, or start tasks.

systemTaskVersion Int32 The internal retry counter on a task. Internally the Batch service can
retry a task to account for transient issues. These issues can include internal scheduling
errors or attempts to recover from compute nodes in a bad state.

requiredSlots Int32 The required slots to run the task.

nodeInfo Complex Type Contains information about the compute node on which the task ran.

multiInstanceSettings Complex Type Specifies that the task is a multi-instance task requiring
multiple compute nodes. See multiInstanceSettings for details.

constraints Complex Type The execution constraints that apply to this task.

executionInfo Complex Type Contains information about the execution of the task.

nodeInfo


Element name Type Notes

poolId String The ID of the pool on which the task ran.

nodeId String The ID of the node on which the task ran.

multiInstanceSettings


Element name Type Notes

numberOfInstances Int32 The number of compute nodes required by the task.

constraints

Element name Type Notes

maxTaskRetryCount Int32 The maximum number of times the task may be retried. The Batch service
retries a task if its exit code is nonzero.

This value specifically controls the number of retries. The Batch service tries
the task once, and may then retry up to this limit. For example, if the
maximum retry count is 3, Batch tries a task up to 4 times (one initial try and
3 retries).

If the maximum retry count is 0, the Batch service doesn't retry tasks.

If the maximum retry count is -1, the Batch service retries tasks without limit.

The default value is 0 (no retries).

executionInfo

Element name Type Notes

startTime DateTime The time when the task started running. 'Running' corresponds to the
running state, so if the task specifies resource files or application packages,
then the start time reflects the time when the task started downloading or
deploying these. If the task restarted or retried, this is the most recent time at
which the task started running.

endTime DateTime The time when the task completed.

exitCode Int32 The exit code of the task.

retryCount Int32 The number of times the task is retried by the Batch service. The task is
retried if it exits with a nonzero exit code, up to the specified
MaxTaskRetryCount.

requeueCount Int32 The number of times the task is requeued by the Batch service as the result
of a user request.

When you remove nodes from a pool (by resizing or shrinking it) or disable a
job, you can choose to requeue the running tasks on those nodes for
execution. This count tracks how many times the task requeued for these
reasons.
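As the introduction notes, this event can be used to determine a task's duration, where it ran, and whether it was retried. The following sketch pulls those three facts out of an event body. The function name `summarize_task` is hypothetical, and the timestamp format is assumed to match the sample. Because startTime is the most recent start, the duration covers only the last run of the task.

```python
from datetime import datetime

# Assumed timestamp layout, taken from the sample event body.
_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def summarize_task(event: dict) -> dict:
    """Extract duration, location, and retry status from a task complete event."""
    info = event["executionInfo"]
    elapsed = datetime.strptime(info["endTime"], _FORMAT) - datetime.strptime(
        info["startTime"], _FORMAT
    )
    return {
        "task": f"{event['jobId']}/{event['id']}",
        "pool": event["nodeInfo"]["poolId"],
        "node": event["nodeInfo"]["nodeId"],
        "seconds": elapsed.total_seconds(),  # duration of the most recent run only
        "succeeded": info["exitCode"] == 0,
        "retried": info["retryCount"] > 0,
    }


event = {
    "jobId": "myJob",
    "id": "myTask",
    "nodeInfo": {"poolId": "pool-001", "nodeId": "tvm-257509324_1-20160908t162728z"},
    "executionInfo": {
        "startTime": "2016-09-08T16:32:23.799Z",
        "endTime": "2016-09-08T16:34:00.666Z",
        "exitCode": 0,
        "retryCount": 0,
        "requeueCount": 0,
    },
}
print(summarize_task(event))
```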
Task fail event
07/02/2025

This event is emitted when a task completes with a failure. Currently all nonzero exit codes are
considered failures. This event is emitted in addition to a task complete event and can be used
to detect when a task fails.

The following example shows the body of a task fail event.

{
"jobId": "myJob",
"id": "myTask",
"taskType": "User",
"systemTaskVersion": 0,
"requiredSlots": 1,
"nodeInfo": {
"poolId": "pool-001",
"nodeId": "tvm-257509324_1-20160908t162728z"
},
"multiInstanceSettings": {
"numberOfInstances": 1
},
"constraints": {
"maxTaskRetryCount": 2
},
"executionInfo": {
"startTime": "2016-09-08T16:32:23.799Z",
"endTime": "2016-09-08T16:34:00.666Z",
"exitCode": 1,
"retryCount": 2,
"requeueCount": 0
}
}


Element name Type Notes

jobId String The ID of the job containing the task.

id String The ID of the task.

taskType String The type of the task. This is either 'JobManager', indicating a job manager
task, or 'User', indicating any other task. This event isn't emitted for job preparation
tasks, job release tasks, or start tasks.

systemTaskVersion Int32 The internal retry counter on a task. Internally the Batch service can
retry a task to account for transient issues. These issues can include internal scheduling
errors or attempts to recover from compute nodes in a bad state.

requiredSlots Int32 The required slots to run the task.

nodeInfo Complex Type Contains information about the compute node on which the task ran.

multiInstanceSettings Complex Type Specifies that the task is a multi-instance task requiring
multiple compute nodes. See multiInstanceSettings for details.

constraints Complex Type The execution constraints that apply to this task.

executionInfo Complex Type Contains information about the execution of the task.

nodeInfo


Element name Type Notes

poolId String The ID of the pool on which the task ran.

nodeId String The ID of the node on which the task ran.

multiInstanceSettings


Element name Type Notes

numberOfInstances Int32 The number of compute nodes required by the task.

constraints

Element name Type Notes

maxTaskRetryCount Int32 The maximum number of times the task may be retried. The Batch service
retries a task if its exit code is nonzero.

This value specifically controls the number of retries. The Batch service tries
the task once, and may then retry up to this limit. For example, if the
maximum retry count is 3, Batch tries a task up to 4 times (one initial try and
3 retries).

If the maximum retry count is 0, the Batch service doesn't retry tasks.

If the maximum retry count is -1, the Batch service retries tasks without limit.

The default value is 0 (no retries).

executionInfo

Element name Type Notes

startTime DateTime The time when the task started running. 'Running' corresponds to the
running state, so if the task specifies resource files or application packages,
then the start time reflects the time at which the task started downloading or
deploying them. If the task is restarted or retried, it's the most recent time at
which the task started running.

endTime DateTime The time when the task completed.

exitCode Int32 The exit code of the task.

retryCount Int32 The number of times the task is retried by the Batch service. The task is
retried if it exits with a nonzero exit code, up to the specified
MaxTaskRetryCount.

requeueCount Int32 The number of times the task is requeued by the Batch service as a result of
user request.

When users remove nodes from a pool (by resizing or shrinking it) or disable
a job, they can choose to requeue the running tasks on those nodes for
execution. This count tracks how many times the task is requeued for these
reasons.
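One way to use this event is to tell a failure that may still be retried from one that has used up its retries. The sketch below compares retryCount with maxTaskRetryCount. This comparison is an interpretation of the fields described above rather than documented Batch behavior, and `retries_exhausted` is a hypothetical helper name.

```python
def retries_exhausted(event: dict) -> bool:
    """Return True if a failed task has no retries left.

    Assumption: the task is out of retries once retryCount reaches
    maxTaskRetryCount; a limit of -1 means retries are unlimited.
    """
    limit = event["constraints"]["maxTaskRetryCount"]
    if limit == -1:
        return False
    return event["executionInfo"]["retryCount"] >= limit


# Values from the sample event body above.
event = {
    "jobId": "myJob",
    "id": "myTask",
    "constraints": {"maxTaskRetryCount": 2},
    "executionInfo": {"exitCode": 1, "retryCount": 2, "requeueCount": 0},
}
if retries_exhausted(event):
    print(f"{event['jobId']}/{event['id']} failed with no retries left")
```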
Task schedule fail event
07/02/2025

This event is emitted when a task fails to be scheduled and is retried later. It indicates a
temporary failure at task scheduling time due to a resource limitation, for example, not enough
slots available on nodes to run a task with requiredSlots specified.

The following example shows the body of a task schedule fail event.

{
"jobId": "job-01",
"id": "task-01",
"taskType": "User",
"systemTaskVersion": 665378862,
"requiredSlots": 1,
"nodeInfo": {
"poolId": "pool-01",
"nodeId": " "
},
"multiInstanceSettings": {
"numberOfInstances": 1
},
"constraints": {
"maxTaskRetryCount": 0
},
"schedulingError": {
"category": "UserError",
"code": "JobPreparationTaskFailed",
"message": "Task cannot run because the job preparation task failed on node"
}
}


Element name Type Notes

jobId String The ID of the job containing the task.

id String The ID of the task.

taskType String The type of the task. This is either 'JobManager', indicating a job manager
task, or 'User', indicating any other task. This event isn't emitted for job preparation
tasks, job release tasks, or start tasks.

systemTaskVersion Int32 The internal retry counter on a task. Internally the Batch service can
retry a task to account for transient issues. These issues can include internal scheduling
errors or attempts to recover from compute nodes in a bad state.

requiredSlots Int32 The required slots to run the task.

nodeInfo Complex Type Contains information about the compute node on which the task ran.

multiInstanceSettings Complex Type Specifies that the task is a multi-instance task requiring
multiple compute nodes. See multiInstanceSettings for details.

constraints Complex Type The execution constraints that apply to this task.

schedulingError Complex Type Contains information about the scheduling error of the task.

nodeInfo


Element name Type Notes

poolId String The ID of the pool on which the task ran.

nodeId String The ID of the node on which the task ran.

multiInstanceSettings


Element name Type Notes

numberOfInstances Int32 The number of compute nodes required by the task.

constraints

| Element name | Type | Notes |
|---|---|---|
| maxTaskRetryCount | Int32 | The maximum number of times the task may be retried. The Batch service retries a task if its exit code is nonzero. This value specifically controls the number of retries. The Batch service tries the task once, and may then retry up to this limit. For example, if the maximum retry count is 3, Batch tries a task up to 4 times (one initial try and 3 retries). If the maximum retry count is 0, the Batch service doesn't retry tasks. If the maximum retry count is -1, the Batch service retries tasks without limit. The default value is 0 (no retries). |
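
The retry arithmetic described for maxTaskRetryCount can be sketched in a few lines of Python. This is an illustration only, not part of any Batch SDK: the function name max_attempts is hypothetical, and rejecting values below -1 is an assumption of this sketch rather than documented service behavior.

```python
from typing import Optional


def max_attempts(max_task_retry_count: int = 0) -> Optional[int]:
    """Return the total number of tries Batch may make, or None for no limit.

    Hypothetical helper: it mirrors the rule "one initial try plus up to
    maxTaskRetryCount retries", with -1 meaning unlimited retries.
    """
    if max_task_retry_count < -1:
        # Assumption of this sketch: values below -1 are treated as invalid.
        raise ValueError("maxTaskRetryCount must be -1 or greater")
    if max_task_retry_count == -1:
        return None  # retries without limit
    return 1 + max_task_retry_count  # one initial try plus the retries


print(max_attempts(0))   # default: a single try, no retries
print(max_attempts(3))   # one initial try and 3 retries
print(max_attempts(-1))  # unlimited
```

Returning None for the unlimited case keeps the finite results as plain integers, so callers can compare them against an attempt counter directly.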

schedulingError

| Element name | Type | Notes |
|---|---|---|
| category | String | The category of the error. |
| code | String | An identifier for the task scheduling error. Codes are invariant and are intended to be consumed programmatically. |
| message | String | A message describing the task scheduling error, intended to be suitable for display in a user interface. |
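
As a rough illustration of consuming this event programmatically, the following Python sketch parses the sample body shown earlier and builds a one-line summary from the schedulingError element. The function name summarize_schedule_failure and the summary format are hypothetical; only the element names come from the tables above.

```python
import json

# Sample body copied from the task schedule fail event example.
EVENT_BODY = """
{
    "jobId": "job-01",
    "id": "task-01",
    "taskType": "User",
    "systemTaskVersion": 665378862,
    "requiredSlots": 1,
    "nodeInfo": {"poolId": "pool-01", "nodeId": " "},
    "multiInstanceSettings": {"numberOfInstances": 1},
    "constraints": {"maxTaskRetryCount": 0},
    "schedulingError": {
        "category": "UserError",
        "code": "JobPreparationTaskFailed",
        "message": "Task cannot run because the job preparation task failed on node"
    }
}
"""


def summarize_schedule_failure(body: str) -> str:
    """Build a short summary line from a task schedule fail event body."""
    event = json.loads(body)
    error = event["schedulingError"]
    task = f"{event['jobId']}/{event['id']}"
    # 'code' is the invariant identifier; 'message' is meant for display.
    return f"{task} [{error['category']}] {error['code']}: {error['message']}"


print(summarize_schedule_failure(EVENT_BODY))
```

Branching on the code element rather than on message follows the note that codes are invariant and intended for programmatic use.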
Azure Policy built-in definitions for Azure Batch
Article • 02/06/2024

This page is an index of Azure Policy built-in policy definitions for Azure Batch. For
additional Azure Policy built-ins for other services, see Azure Policy built-in definitions.

The name of each built-in policy definition links to the policy definition in the Azure
portal. Use the link in the Version column to view the source on the Azure Policy GitHub
repo.

Azure Batch

| Name (Azure portal) | Description | Effect(s) | Version (GitHub) |
|---|---|---|---|
| Azure Batch account should use customer-managed keys to encrypt data | Use customer-managed keys to manage the encryption at rest of your Batch account's data. By default, customer data is encrypted with service-managed keys, but customer-managed keys are commonly required to meet regulatory compliance standards. Customer-managed keys enable the data to be encrypted with an Azure Key Vault key created and owned by you. You have full control and responsibility for the key lifecycle, including rotation and management. Learn more at https://aka.ms/Batch-CMK. | Audit, Deny, Disabled | 1.0.1 |
| Azure Batch pools should have disk encryption enabled | Enabling Azure Batch disk encryption ensures that data is always encrypted at rest on your Azure Batch compute node. Learn more about disk encryption in Batch at https://docs.microsoft.com/azure/batch/disk-encryption. | Audit, Disabled, Deny | 1.0.0 |
| Batch accounts should have local authentication methods disabled | Disabling local authentication methods improves security by ensuring that Batch accounts require Azure Active Directory identities exclusively for authentication. Learn more at: https://aka.ms/batch/auth. | Audit, Deny, Disabled | 1.0.0 |
| Configure Batch accounts to disable local authentication | Disable local authentication methods so that your Batch accounts require Azure Active Directory identities exclusively for authentication. Learn more at: https://aka.ms/batch/auth. | Modify, Disabled | 1.0.0 |
| Configure Batch accounts to disable public network access | Disabling public network access on a Batch account improves security by ensuring your Batch account can only be accessed from a private endpoint. Learn more about disabling public network access at https://docs.microsoft.com/azure/batch/private-connectivity. | Modify, Disabled | 1.0.0 |
| Configure Batch accounts with private endpoints | Private endpoints connect your virtual network to Azure services without a public IP address at the source or destination. By mapping private endpoints to Batch accounts, you can reduce data leakage risks. Learn more about private links at: https://docs.microsoft.com/azure/batch/private-connectivity. | DeployIfNotExists, Disabled | 1.0.0 |
| Deploy Diagnostic Settings for Batch Account to Event Hub | Deploys the diagnostic settings for Batch Account to stream to a regional Event Hub when any Batch Account which is missing this diagnostic settings is created or updated. | DeployIfNotExists, Disabled | 2.0.0 |
| Deploy Diagnostic Settings for Batch Account to Log Analytics workspace | Deploys the diagnostic settings for Batch Account to stream to a regional Log Analytics workspace when any Batch Account which is missing this diagnostic settings is created or updated. | DeployIfNotExists, Disabled | 1.0.0 |
| Metric alert rules should be configured on Batch accounts | Audit configuration of metric alert rules on Batch account to enable the required metric | AuditIfNotExists, Disabled | 1.0.0 |
| Private endpoint connections on Batch accounts should be enabled | Private endpoint connections allow secure communication by enabling private connectivity to Batch accounts without a need for public IP addresses at the source or destination. Learn more about private endpoints in Batch at https://docs.microsoft.com/azure/batch/private-connectivity. | AuditIfNotExists, Disabled | 1.0.0 |
| Public network access should be disabled for Batch accounts | Disabling public network access on a Batch account improves security by ensuring your Batch account can only be accessed from a private endpoint. Learn more about disabling public network access at https://docs.microsoft.com/azure/batch/private-connectivity. | Audit, Deny, Disabled | 1.0.0 |
| Resource logs in Batch accounts should be enabled | Audit enabling of resource logs. This enables you to recreate activity trails to use for investigation purposes; when a security incident occurs or when your network is compromised. | AuditIfNotExists, Disabled | 5.0.0 |

Next steps
See the built-ins on the Azure Policy GitHub repo.
Review the Azure Policy definition structure.
Review Understanding policy effects.
High-performance computing (HPC) on Azure
12/12/2024

Introduction to HPC
https://www.youtube-nocookie.com/embed/rKURT32faJk

High-performance computing (HPC), also called "big compute", uses a large number of CPU or
GPU-based computers to solve complex mathematical tasks.

Many industries use HPC to solve some of their most difficult problems. These include
workloads such as:

Genomics
Oil and gas simulations
Finance
Semiconductor design
Engineering
Weather modeling

How is HPC different on the cloud?


One of the primary differences between an on-premises HPC system and one in the cloud is
the ability to add and remove resources dynamically as they're needed. Dynamic
scaling removes compute capacity as a bottleneck and instead allows customers to right-size
their infrastructure for the requirements of their jobs.

The following articles provide more detail about this dynamic scaling capability.

Big Compute Architecture Style


Autoscaling best practices

Implementation checklist
As you're looking to implement your own HPC solution on Azure, ensure you've reviewed the
following topics:

✓ Choose the appropriate architecture based on your requirements
✓ Know which compute option is right for your workload
✓ Identify the right storage solution that meets your needs
✓ Decide how you're going to manage all your resources
✓ Optimize your application for the cloud
✓ Secure your infrastructure

Infrastructure
There are many infrastructure components that are necessary to build an HPC system.
Compute, storage, and networking provide the underlying components, no matter how you
choose to manage your HPC workloads.

Compute
Azure offers a range of VM sizes that are optimized for both CPU-intensive and GPU-intensive workloads.

CPU-based virtual machines

Linux VMs
Windows VMs

GPU-enabled virtual machines


N-series VMs feature NVIDIA GPUs designed for compute-intensive or graphics-intensive
applications including artificial intelligence (AI) learning and visualization.

Linux VMs
Windows VMs

Storage
Large-scale Batch and HPC workloads have demands for data storage and access that exceed
the capabilities of traditional cloud file systems. There are many solutions that manage both
the speed and capacity needs of HPC applications on Azure:

Azure NetApp Files


Storage Optimized Virtual Machines
Blob, table, and queue storage
Azure SMB File storage
Azure Managed Lustre
For more information comparing Lustre, GlusterFS, and BeeGFS on Azure, review the Parallel
Files Systems on Azure e-book and the Lustre on Azure blog.

Networking
H16r, H16mr, A8, and A9 VMs can connect to a high-throughput back-end RDMA network. This
network can improve the performance of tightly coupled parallel applications running under
Microsoft MPI (Message Passing Interface) or Intel MPI.

RDMA Capable Instances


Virtual Network
ExpressRoute

Management

Do-it-yourself
Building an HPC system from scratch on Azure offers a significant amount of flexibility, but it is
often very maintenance intensive.

1. Set up your own cluster environment in Azure virtual machines or Virtual Machine Scale
Sets.
2. Use Azure Resource Manager templates to deploy leading workload managers,
infrastructure, and applications.
3. Choose HPC and GPU VM sizes that include specialized hardware and network
connections for MPI or GPU workloads.
4. Add high-performance storage for I/O-intensive workloads.

Hybrid and cloud bursting


If you have an existing on-premises HPC system that you'd like to connect to Azure, there are
several resources to help get you started.

First, review the Options for connecting an on-premises network to Azure article in the
documentation. From there, you can find additional information on these connectivity options:
Connect an on-premises network to Azure using a VPN gateway
This reference architecture shows how to extend an on-premises network to Azure, using a
site-to-site virtual private network (VPN).

Connect an on-premises network to Azure using ExpressRoute with VPN failover
Implement a highly available and secure site-to-site network architecture that spans an
Azure virtual network and an on-premises network connected using ExpressRoute with VPN
gateway failover.

Once network connectivity is securely established, you can start using cloud compute resources
on-demand with the bursting capabilities of your existing workload manager.

Marketplace solutions
There are many workload managers offered in the Azure Marketplace.

SUSE Linux Enterprise Server for HPC


TIBCO DataSynapse GridServer
Azure Data Science VM for Windows and Linux
D3View
UberCloud

Azure Batch
Azure Batch is a platform service for running large-scale parallel and HPC applications
efficiently in the cloud. Azure Batch schedules compute-intensive work to run on a managed
pool of virtual machines, and can automatically scale compute resources to meet the needs of
your jobs.
SaaS providers or developers can use the Batch SDKs and tools to integrate HPC applications
or container workloads with Azure, stage data to Azure, and build job execution pipelines.

With Azure Batch, all the services run in the cloud. The following image shows how the
architecture looks with Azure Batch: the scalability and job schedule configurations
run in the cloud, while the results and reports can be sent to your on-premises
environment.

Azure CycleCloud
Azure CycleCloud provides the simplest way to manage HPC workloads on Azure using any
scheduler (like Slurm, Grid Engine, HPC Pack, HTCondor, LSF, PBS Pro, or Symphony).

CycleCloud allows you to:

Deploy full clusters and other resources, including scheduler, compute VMs, storage,
networking, and cache
Orchestrate job, data, and cloud workflows
Give admins full control over which users can run jobs, as well as where and at what cost
Customize and optimize clusters through advanced policy and governance features,
including cost controls, Active Directory integration, monitoring, and reporting
Use your current job scheduler and applications without modification
Take advantage of built-in autoscaling and battle-tested reference architectures for a
wide range of HPC workloads and industries

Hybrid / cloud bursting model

This hybrid example diagram shows how these services are distributed between the cloud
and the on-premises environment, which gives you the opportunity to run jobs in both
environments.

Cloud native model

The following cloud native model example diagram shows how the workload in the cloud
handles everything while still keeping the connection to the on-premises environment.

Comparison chart

| Feature | Azure Batch | Azure CycleCloud |
|---|---|---|
| Scheduler | Batch APIs and tools and command-line scripts in the Azure portal (Cloud Native). | Use standard HPC schedulers such as Slurm, PBS Pro, LSF, Grid Engine, and HTCondor, or extend CycleCloud autoscaling plugins to work with your own scheduler. |
| Compute Resources | Software as a Service Nodes – Platform as a Service | Platform as a Service Software – Platform as a Service |
| Monitor Tools | Azure Monitor | Azure Monitor, Grafana |
| Customization | Custom image pools, Third Party images, Batch API access. | Use the comprehensive RESTful API to customize and extend functionality, deploy your own scheduler, and support into existing workload managers |
| Integration | Synapse Pipelines, Azure Data Factory, Azure CLI | Built-In CLI for Windows and Linux |
| User type | Developers | Classic HPC administrators and users |
| Work Type | Batch, Workflows | Tightly coupled (Message Passing Interface/MPI). |
| Windows Support | Yes | Varies, depending on scheduler choice |

Workload managers
The following are examples of cluster and workload managers that can run in Azure
infrastructure. Create stand-alone clusters in Azure VMs or burst to Azure VMs from an on-
premises cluster.

Alces Flight Compute


Altair PBS Works
Rescale
Altair Grid Engine
Microsoft HPC Pack
HPC Pack for Windows
HPC Pack for Linux

Containers
Containers can also be used to manage some HPC workloads. Services like the Azure
Kubernetes Service (AKS) make it simple to deploy a managed Kubernetes cluster in Azure.

Azure Kubernetes Service (AKS)


Container Registry

Cost management
You can manage your HPC costs on Azure in a few different ways. Ensure you've
reviewed the Azure purchasing options to find the method that works best for your
organization.

Security
For an overview of security best practices on Azure, review the Azure Security Documentation.

In addition to the network configurations available in the Cloud Bursting section, you can
implement a hub/spoke configuration to isolate your compute resources:

Implement a hub-spoke network topology in Azure
The hub is a virtual network (VNet) in Azure that acts as a central point of connectivity
to your on-premises network. The spokes are VNets that peer with the hub, and can be used
to isolate workloads.

Implement a hub-spoke network topology with shared services in Azure
This reference architecture builds on the hub-spoke reference architecture to include
shared services in the hub that can be consumed by all spokes.

HPC applications
Run custom or commercial HPC applications in Azure. Several examples in this section are
benchmarked to scale efficiently with additional VMs or compute cores. Visit the Azure
Marketplace for ready-to-deploy solutions.

Note

Check with the vendor of any commercial application for licensing or other restrictions for
running in the cloud. Not all vendors offer pay-as-you-go licensing. You might need a
licensing server in the cloud for your solution, or connect to an on-premises license server.

Engineering applications
MATLAB Distributed Computing Server
StarCCM+

Graphics and rendering


Autodesk Maya, 3ds Max, and Arnold on Azure Batch

AI and deep learning


Microsoft Cognitive Toolkit

MPI providers
Microsoft MPI

Remote visualization
Run GPU-powered virtual machines in Azure in the same region as the HPC output for the
lowest-latency access, and visualize the results remotely through Azure Virtual Desktop.

GPU-optimized virtual machine sizes


Configure GPU acceleration for Azure Virtual Desktop
Windows desktops using Azure Virtual Desktop on Azure
Build a VDI environment for Windows desktops using Azure Virtual Desktop on Azure.

Performance benchmarks
Compute benchmarks

Customer stories
There are many customers who have seen great success by using Azure for their HPC
workloads. You can find a few of these customer case studies below:

AXA Global P&C


Axioma
d3View
EFS
Hymans Robertson
MetLife
Microsoft Research
Milliman
Mitsubishi UFJ Securities International
NeuroInitiative
Towers Watson

Other important information


Ensure your vCPU quota has been increased before attempting to run large-scale
workloads.

Next steps
For the latest announcements, see the following resources:

Microsoft HPC and Batch team blog


Visit the Azure blog.

Microsoft Batch Examples


These tutorials will provide you with details on running applications on Microsoft Batch:

Get started developing with Batch


Use Azure Batch code samples
Use low-priority VMs with Batch
Use compute-intensive VMs in Batch pools

Related resources
Big compute architecture style
Microsoft.Batch batchAccounts
Article • 12/09/2024

Bicep resource definition


The batchAccounts resource type can be deployed with operations that target:

Resource groups - See resource group deployment commands

For a list of changed properties in each API version, see change log.

Resource format
To create a Microsoft.Batch/batchAccounts resource, add the following Bicep to your
template.

Bicep

resource symbolicname 'Microsoft.Batch/batchAccounts@2024-07-01' = {
  identity: {
    type: 'string'
    userAssignedIdentities: {
      {customized property}: {}
    }
  }
  location: 'string'
  name: 'string'
  properties: {
    allowedAuthenticationModes: [
      'string'
    ]
    autoStorage: {
      authenticationMode: 'string'
      nodeIdentityReference: {
        resourceId: 'string'
      }
      storageAccountId: 'string'
    }
    encryption: {
      keySource: 'string'
      keyVaultProperties: {
        keyIdentifier: 'string'
      }
    }
    keyVaultReference: {
      id: 'string'
      url: 'string'
    }
    networkProfile: {
      accountAccess: {
        defaultAction: 'string'
        ipRules: [
          {
            action: 'Allow'
            value: 'string'
          }
        ]
      }
      nodeManagementAccess: {
        defaultAction: 'string'
        ipRules: [
          {
            action: 'Allow'
            value: 'string'
          }
        ]
      }
    }
    poolAllocationMode: 'string'
    publicNetworkAccess: 'string'
  }
  tags: {
    {customized property}: 'string'
  }
}

Property values

AutoStorageBasePropertiesOrAutoStorageProperties

| Name | Description | Value |
|---|---|---|
| authenticationMode | The authentication mode which the Batch service will use to manage the auto-storage account. | 'BatchAccountManagedIdentity', 'StorageKeys' |
| nodeIdentityReference | The identity referenced here must be assigned to pools which have compute nodes that need access to auto-storage. | ComputeNodeIdentityReference |
| storageAccountId | The resource ID of the storage account to be used for auto-storage account. | string (required) |

BatchAccountCreateParametersTags

| Name | Description | Value |
|---|---|---|

BatchAccountCreatePropertiesOrBatchAccountProperties

| Name | Description | Value |
|---|---|---|
| allowedAuthenticationModes | List of allowed authentication modes for the Batch account that can be used to authenticate with the data plane. This does not affect authentication with the control plane. | String array containing any of: 'AAD', 'SharedKey', 'TaskAuthenticationToken' |
| autoStorage | The properties related to the auto-storage account. | AutoStorageBasePropertiesOrAutoStorageProperties |
| encryption | Configures how customer data is encrypted inside the Batch account. By default, accounts are encrypted using a Microsoft managed key. For additional control, a customer-managed key can be used instead. | EncryptionProperties |
| keyVaultReference | A reference to the Azure key vault associated with the Batch account. | KeyVaultReference |
| networkProfile | The network profile only takes effect when publicNetworkAccess is enabled. | NetworkProfile |
| poolAllocationMode | The pool allocation mode also affects how clients may authenticate to the Batch Service API. If the mode is BatchService, clients may authenticate using access keys or Microsoft Entra ID. If the mode is UserSubscription, clients must use Microsoft Entra ID. The default is BatchService. | 'BatchService', 'UserSubscription' |
| publicNetworkAccess | If not specified, the default value is 'enabled'. | 'Disabled', 'Enabled', 'SecuredByPerimeter' |

BatchAccountIdentity

| Name | Description | Value |
|---|---|---|
| type | The type of identity used for the Batch account. | 'None', 'SystemAssigned', 'UserAssigned' (required) |
| userAssignedIdentities | The list of user identities associated with the Batch account. | BatchAccountIdentityUserAssignedIdentities |

BatchAccountIdentityUserAssignedIdentities

| Name | Description | Value |
|---|---|---|

ComputeNodeIdentityReference

| Name | Description | Value |
|---|---|---|
| resourceId | The ARM resource id of the user assigned identity. | string |

EncryptionProperties

| Name | Description | Value |
|---|---|---|
| keySource | Type of the key source. | 'Microsoft.Batch', 'Microsoft.KeyVault' |
| keyVaultProperties | Additional details when using Microsoft.KeyVault | KeyVaultProperties |

EndpointAccessProfile

| Name | Description | Value |
|---|---|---|
| defaultAction | Default action for endpoint access. It is only applicable when publicNetworkAccess is enabled. | 'Allow', 'Deny' (required) |
| ipRules | Array of IP ranges to filter client IP address. | IPRule[] |

IPRule

| Name | Description | Value |
|---|---|---|
| action | Action when client IP address is matched. | 'Allow' (required) |
| value | IPv4 address, or IPv4 address range in CIDR format. | string (required) |
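
Before putting a value into an ipRules entry, it can help to check locally that it is an IPv4 address or an IPv4 CIDR range. The following Python sketch uses only the standard ipaddress module. The helper names are hypothetical, and accepting ranges with host bits set (strict=False) is an assumption of this sketch; the service may apply stricter or different validation.

```python
import ipaddress


def is_valid_ip_rule_value(value: str) -> bool:
    """Return True if value is an IPv4 address or an IPv4 range in CIDR form.

    Local sanity check only; it does not call any Azure API.
    """
    try:
        if "/" in value:
            # Assumption: host bits may be set, for example 10.0.0.5/24.
            ipaddress.IPv4Network(value, strict=False)
        else:
            ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return True


def make_ip_rule(value: str) -> dict:
    """Build one ipRules entry; 'Allow' is the only action listed above."""
    if not is_valid_ip_rule_value(value):
        raise ValueError(f"not an IPv4 address or CIDR range: {value!r}")
    return {"action": "Allow", "value": value}


print(make_ip_rule("203.0.113.0/24"))
print(is_valid_ip_rule_value("2001:db8::/32"))  # IPv6 is rejected
```

Both AddressValueError and NetmaskValueError derive from ValueError, so a single except clause covers malformed addresses and malformed prefixes.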

KeyVaultProperties

| Name | Description | Value |
|---|---|---|
| keyIdentifier | Full path to the secret with or without version. Example https://mykeyvault.vault.azure.net/keys/testkey/6e34a81fef704045975661e297a4c053 or https://mykeyvault.vault.azure.net/keys/testkey. To be usable the following prerequisites must be met: | string |

Prerequisites for keyIdentifier:

The Batch Account has a System Assigned identity
The account identity has been granted Key/Get, Key/Unwrap and Key/Wrap permissions
The KeyVault has soft-delete and purge protection enabled
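
To illustrate the two accepted keyIdentifier shapes (with and without a version), here is a small Python sketch that splits an identifier into its vault URL, key name, and optional version. The KeyIdentifier type and parse_key_identifier function are hypothetical names for this example, and the requirement that the path start with "keys" is inferred from the two example URLs rather than from a documented rule.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class KeyIdentifier(NamedTuple):
    vault_url: str
    name: str
    version: Optional[str]  # None when the identifier has no version


def parse_key_identifier(identifier: str) -> KeyIdentifier:
    """Split a key identifier URL into vault URL, key name, and version."""
    parsed = urlparse(identifier)
    parts = [p for p in parsed.path.split("/") if p]
    # Assumption: the path looks like /keys/<name> or /keys/<name>/<version>.
    if parsed.scheme != "https" or len(parts) not in (2, 3) or parts[0] != "keys":
        raise ValueError(f"unexpected key identifier: {identifier!r}")
    version = parts[2] if len(parts) == 3 else None
    return KeyIdentifier(f"{parsed.scheme}://{parsed.netloc}", parts[1], version)


print(parse_key_identifier("https://mykeyvault.vault.azure.net/keys/testkey"))
print(parse_key_identifier(
    "https://mykeyvault.vault.azure.net/keys/testkey/6e34a81fef704045975661e297a4c053"
))
```

The sketch does not check the prerequisites listed above; those depend on the Batch account identity and the key vault configuration, not on the URL.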
KeyVaultReference

| Name | Description | Value |
|---|---|---|
| id | The resource ID of the Azure key vault associated with the Batch account. | string (required) |
| url | The URL of the Azure key vault associated with the Batch account. | string (required) |

Microsoft.Batch/batchAccounts

| Name | Description | Value |
|---|---|---|
| identity | The identity of the Batch account. | BatchAccountIdentity |
| location | The region in which to create the account. | string (required) |
| name | The resource name | string. Constraints: Min length = 3, Max length = 24, Pattern = ^[a-z0-9]+$ (required) |
| properties | The properties of the Batch account. | BatchAccountCreatePropertiesOrBatchAccountProperties |
| tags | Resource tags | Dictionary of tag names and values. See Tags in templates |
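
The name constraints can be checked locally before a deployment is attempted. The following Python sketch is a hypothetical helper, not part of any Azure SDK. It assumes the Batch account naming rule of 3 to 24 characters drawn from lowercase letters and digits, combined with the pattern ^[a-z0-9]+$ from the table; it does not check whether the name is already taken.

```python
import re

# Pattern taken from the name constraints in the table.
NAME_PATTERN = re.compile(r"^[a-z0-9]+$")

# Assumed length limits for Batch account names (3 to 24 characters).
MIN_LENGTH = 3
MAX_LENGTH = 24


def is_valid_batch_account_name(name: str) -> bool:
    """Return True if name satisfies the length and pattern constraints."""
    if not MIN_LENGTH <= len(name) <= MAX_LENGTH:
        return False
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ("mybatch01", "ab", "MyBatch", "a" * 25):
    print(candidate, is_valid_batch_account_name(candidate))
```

Using fullmatch rather than match avoids accepting a name that ends with a trailing newline, which the $ anchor alone would allow.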

NetworkProfile

| Name | Description | Value |
|---|---|---|
| accountAccess | Network access profile for batchAccount endpoint (Batch account data plane API). | EndpointAccessProfile |
| nodeManagementAccess | Network access profile for nodeManagement endpoint (Batch service managing compute nodes for Batch pools). | EndpointAccessProfile |
UserAssignedIdentities

| Name | Description | Value |
|---|---|---|

Quickstart samples
The following quickstart samples deploy this resource type.

| Bicep File | Description |
|---|---|
| Azure Batch pool without public IP addresses | This template creates Azure Batch simplified node communication pool without public IP addresses. |
| Create a Batch Account using a template | This template creates a Batch Account and a storage account. |
