TECH BRIEF
Deploying Splunk Enterprise on Microsoft Azure
Splunk® provides the leading platform for Operational Intelligence. Splunk
software searches, monitors, analyzes and visualizes machine-generated big
data from websites, applications, servers, networks, sensors and mobile
devices. More than 11,000 organizations use Splunk software to deepen
business and customer understanding, mitigate cybersecurity risk, improve
service performance and reduce costs. Splunk Enterprise indexes machine data
in real time, enabling multiple roles across the organization, from system
administrators to business analysts, to rapidly gain insight from the massive
amounts of machine data generated by your environment.

Adopting a cloud strategy enables organizations to increase agility, reduce
costs, decrease time to market and empower innovation. Splunk Enterprise is
perfect for deploying in a cloud environment, offering enterprise-grade
availability and scalability to support the collection of hundreds of
terabytes of data per day from workloads residing on-premises, in the cloud
or across hybrid environments. This document covers guidelines for deploying
Splunk Enterprise on Microsoft Azure, an open and flexible cloud platform
with a growing collection of integrated cloud services, including analytics,
computing, database, mobile, networking, storage and web.

Splunk Deployment Components
A typical Splunk deployment includes Splunk forwarders, indexers and search
heads. Splunk Enterprise is a single package that can perform one or many of
the roles that each component would normally deliver, in addition to others.
The software can be installed within minutes on your choice of hardware
(physical, cloud or virtual) and operating system. The package is available
for download for most operating systems. While all major Splunk components
can be run from a single installation on a single cloud instance, they can
also run independently from within different cloud instances. Depending on
the deployment infrastructure, considerations must also be taken to allocate
the proper amount of resources per component type.

Forwarders perform data collection, data forwarding and data load balancing.
Forwarders require few resources because they typically read and send data
with minimal overhead. A Universal Forwarder is a lightweight package of the
Splunk software that can perform most, if not all, of the forwarder
functionality.

Indexers write the data to a storage device and perform searching on the
data. These can be resource intensive and require I/O and CPU allotment.

Search heads search for information across indexers and require CPU and
memory allotment.

Budgeting system resources and bandwidth to enable search and index
performance depends on the total volume of data being indexed and the number
of active concurrent searches (scheduled or otherwise) at any time.

In addition to rapidly writing data to disk, indexers perform much of the
work involved in running searches: reading data off disk, decompressing it,
extracting knowledge and reporting. Since indexers incur most of the
workload, increases in indexing volume should be tied to an increase in
indexer instances. Deploying additional indexers will distribute the load of
increased data volume, resulting in reduced contention for resources and
improved search performance.

Common Azure deployments leverage a combination of forwarders and network
streams to send data to the Splunk indexer(s). While forwarders are not
required to gather data from the source, they do provide certain benefits
such as flexibility, load balancing and reliability. Using a syslog output
(from a data source) or a file mount is also a common method of getting data
into the Splunk indexer. Additionally, modular inputs, which are extensions
to Splunk Enterprise that define a
custom data input, and HTTP Event Collector, a highly efficient and secure
mechanism to send high volumes of data directly to Splunk, can be used to
collect data from various API sources.

Other Splunk components include the Deployment Server (configuration
management), License Master (license management) and Master Node (data
replication management).

Performance Considerations Within Microsoft Azure
There are several performance factors to consider when deploying Splunk
software on Microsoft Azure. These considerations are Azure Virtual Machine
(VM) image and size, and underlying Azure Storage.

Azure VM image
Splunk Enterprise runs on most widely available operating systems including
Windows and *nix platforms. Splunk is persistent software that is intended to
gather and index data at all times; thus, reserved instances are preferred.

Azure VM size
The size of an Azure VM is defined by the number of CPU cores, the generation
of CPU, the amount of available memory, the maximum network bandwidth, and
the number of data disks that can be attached. The following are recommended
minimum Azure VM requirements:
• 8 CPU cores (compute optimized series)
• 14GB of RAM

Splunk Enterprise scales horizontally, making it well suited for Microsoft
Azure. Adding Splunk instances can give you more performance and capacity
depending on usage and data volume requirements. See below for more detail on
recommended sizes.

Azure Storage
An Azure VM has two types of disks: a local temporary disk and a
network-attached persistent disk or virtual hard disk (VHD). Each VM comes
with a local disk, one OS disk as VHD, and can have one or more data disks as
VHDs.

A local disk is generally not suitable to store Splunk indexes since it’s
intended for temporary data only: data on local disk may be lost in case of a
hardware failure or upon VM resize or reboot.

A VHD is stored in a standard or premium storage account in Azure. VHDs can
be managed or unmanaged. Splunk recommends using managed VHDs for Splunk
storage. More specifically, you can store Splunk application and
configurations in the persistent OS disk and store Splunk indexes across
multiple persistent data disks.

Managed Disks are preferred for various reasons:
• Managed Disks transparently handle storage accounts. With Unmanaged Disks,
  if IOPS across all disks in a storage account approach storage account
  limits, you must create additional storage accounts as well as rebalance
  your virtual machine disks across the storage accounts to ensure they stay
  within the IOPS limit. Managed Disks remove the need to provision
  additional storage accounts, effectively removing these IOPS limits.
• Managed Disks provide greater reliability for Availability Sets by ensuring
  disks are sufficiently isolated to avoid single points of failure. This
  ensures that VMs in an Availability Set will not be stored in the same
  storage scale unit. Therefore, if one VM in an Availability Set goes down
  due to hardware or software failure, the other VMs will not be impacted.
• VHDs configured as Managed Disks are highly available and are designed for
  99.999% availability. Refer to Microsoft’s documentation on specific
  storage limits.

Deployment Guidelines and Examples
The tables below describe general guidelines for mapping instances to Splunk
workloads. Best practices for architecting and sizing should still be
considered when referencing these guidelines. It is important to remember
that overall Splunk load is composed of both indexing and searching.

Small-Scale Deployment

Table 1: Indexers
Instance Size (Type)    Daily Indexing Volume (GB)    Performance
Standard_DS4_v2         Up to 100                     Good
Standard_DS5_v2         100-500                       Better
Standard_DS15_v2        150-250                       Best

Table 2: Search Heads
Instance Size (Type)    Concurrent Users    Performance
Standard_DS5_v2         Up to 8             Good
Standard_DS15_v2        Up to 16            Better

Table 3: Deployment Server, License or Cluster Master
Instance Size (Type)    Performance
Standard_DS3_v2         Good
Standard_DS4_v2         Better

The following specifications outline an example of a small-scale deployment
that is capable of indexing up to 100GB/day, with a maximum of six concurrent
searches running at all times. It is not uncommon for this type of instance
to be deployed for indexing volumes in the single digit GB/day range.
• 1 – Standard_DS4_v2 with VHDs-backed storage
• N – Universal Forwarders (data sources)

Architecturally, this is a single Splunk instance performing indexing and
searching. Data can be sent to this system via Splunk forwarders, HTTP event
collector, local files, NFS mounted files, SMB file shares, and scripted
calls or modular inputs. The number and size of your VHD volume(s) should be
based on your retention requirements and expected daily indexing volume.

Distributed Deployments and Azure Availability Sets
An Availability Set is a logical grouping capability used in Azure to ensure
that the VM resources placed within it are isolated from each other when
deployed within an Azure datacenter. Azure ensures that the VMs placed within
an Availability Set run across multiple physical servers, compute racks,
storage units and network switches. When more than one VM is used to fulfill
a role, Splunk recommends using an availability set for that role. For
example, if more than one indexer is used in a deployment, Splunk recommends
placing the indexers in an Availability Set. If search head clustering is
used, place the search heads in a separate Availability Set.

Medium-Scale Deployment
The following specifications outline an example of a medium-scale deployment
that is capable of indexing 500GB/day, with a concurrent search load of eight
users.
• 5 – Standard_DS5_v2 with VHDs-backed storage in an Availability Set
  (Indexers)
• 1 – Standard_DS5_v2 with VHDs-backed storage (Search Head)
• 1 – Standard_D(S)3_v2 (License Master)
• N – Universal Forwarders (data sources)

Architecturally, this deployment consists of six Splunk instances in a
traditional distributed configuration. Five of these instances act as
indexers and one acts as the search head. This deployment leverages the
horizontal scalability of Splunk software. The number and size of your VHD
volumes should be based on your retention requirements and expected daily
indexing volume.

Large-Scale Deployment
The following specifications outline an example of a large-scale deployment
that is capable of indexing 1TB/day, with a concurrent search load of 16
users. As noted earlier, Splunk software scales horizontally. To increase the
capacity or performance of this installation, simply add indexers or search
heads when appropriate.
• 5 – Standard_DS15_v2 with VHDs-backed storage in an Availability Set
  (Indexers)
• 1 – Standard_DS15_v2 with VHDs-backed storage (Search Head)
• 1 – Standard_D(S)3_v2 (License Master)
• N – Universal Forwarders (data sources)

Architecturally, there is a single search head distributing searches to five
Splunk indexers and N number of Splunk forwarders distributing data to these
indexers. The number and size of your VHD volumes should be based on your
retention requirements and expected daily indexing volume.
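
As a rough illustration of horizontal scaling, the medium- and large-scale
examples above imply about 100GB/day per Standard_DS5_v2 indexer and about
200GB/day per Standard_DS15_v2 indexer. The Python sketch below turns those
derived figures into an indexer count for a given daily volume. The
per-indexer figures, the treatment of 1TB as 1,000GB and the function name
indexers_needed are assumptions made for illustration, not Splunk sizing
guidance. Search load also affects sizing and is not modeled here.

```python
import math

# Per-indexer daily volume (GB/day) implied by the example deployments.
# Assumption: these are derived figures, not published limits. They come from
# 500GB/day over five Standard_DS5_v2 indexers and 1TB/day (taken as 1,000GB)
# over five Standard_DS15_v2 indexers.
GB_PER_INDEXER_PER_DAY = {
    "Standard_DS5_v2": 500 / 5,
    "Standard_DS15_v2": 1000 / 5,
}


def indexers_needed(daily_volume_gb: float, instance_size: str) -> int:
    """Return how many indexers of one size cover a daily indexing volume."""
    if daily_volume_gb <= 0:
        raise ValueError("daily_volume_gb must be positive")
    per_indexer = GB_PER_INDEXER_PER_DAY[instance_size]
    # Round up: a partially loaded indexer is still a whole instance.
    return max(1, math.ceil(daily_volume_gb / per_indexer))


if __name__ == "__main__":
    for size in GB_PER_INDEXER_PER_DAY:
        print(size, indexers_needed(750, size))
```

Under these assumptions, 750GB/day would call for eight Standard_DS5_v2
indexers or four Standard_DS15_v2 indexers.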

Clustered Deployment
The following specifications are an example of a large-scale deployment
leveraging the index replication feature. Index replication creates and
manages multiple copies of indexes’ buckets, so they are readily available in
the rare event of a Splunk indexer outage. This deployment is capable of
indexing 500GB/day, with a concurrent search load of up to eight users.
Similar to the previous example, adding indexers or search heads will
increase performance or capacity when appropriately applied.
• 5 – Standard_DS5_v2 with VHDs-backed storage in an Availability Set
  (Indexers)
• 1 – Standard_DS5_v2 with VHDs-backed storage (Search Head)
• 1 – Standard_D(S)3_v2 (License Master and Master Node)
• N – Universal Forwarders (data sources)

Architecturally, there are five Splunk indexers and a single Splunk search
head. All of these components communicate with the cluster and license
manager instance for replication and licensing purposes. Like the previous
example, the search head distributes searches to all five indexers, although
it does so based on information from the cluster master. To increase
retention, capacity or both, simply add more indexers and/or consider larger
instance sizes.

Hybrid Environment
[Figure: hybrid environment diagram. Labels: Indexer; Indexer/Search Head;
Databases; Web Services; App Servers; Networking; Cloud; On-Premises]

The graphic above represents a hybrid environment where Splunk Enterprise is
installed on-premises and in the cloud. Splunk software’s distributed search
capability allows you to peer into both environments from a single interface.

Additional Considerations
• Leverage Splunk Universal Forwarders to gather data from existing systems.
• Use the Splunk deployment server to manage and propagate Splunk apps and
  configurations from a central Splunk instance.
• The Index Replication feature allows for high availability of the indexed
  data across multiple Splunk systems. Availability is managed at the Splunk
  software layer versus traditional storage availability methods (like VHD
  with RAID).
• Consider provisioning an Azure VM for use as a jump box for SSH or RDP
  console access. To secure the jump box, add an NSG rule that allows
  connections only from a safe set of public IP addresses.

Summary
For best performance when deploying Splunk Enterprise on Azure, use the
recommended Azure VM sizes and Azure Storage volumes, and plan according to
your expected daily volume requirements. As Azure VM and Azure Storage are
friendly to horizontal scaling, deploy additional Splunk instances and data
disk volumes to gain capacity and performance.
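
One way to plan data disk volumes from retention requirements and expected
daily indexing volume is sketched below in Python. It assumes that indexed
data occupies roughly half of the raw daily volume on disk (a commonly cited
rule of thumb that varies widely by data type) and a hypothetical data disk
size of 1,024GB. Both values, and the function names, are illustrative
assumptions rather than figures from this brief. Index replication, if used,
would multiply the total and is not modeled.

```python
import math


def index_storage_gb(daily_volume_gb: float, retention_days: int,
                     on_disk_ratio: float = 0.5) -> float:
    """Estimate index storage as daily raw volume x retention x on-disk ratio.

    on_disk_ratio is an assumed rule of thumb, not a measured value.
    """
    if daily_volume_gb < 0 or retention_days < 0 or on_disk_ratio <= 0:
        raise ValueError("inputs must be non-negative and the ratio positive")
    return daily_volume_gb * retention_days * on_disk_ratio


def data_disks_needed(total_gb: float, disk_size_gb: float = 1024) -> int:
    """Return how many data disks of a (hypothetical) size hold total_gb."""
    if disk_size_gb <= 0:
        raise ValueError("disk_size_gb must be positive")
    return math.ceil(total_gb / disk_size_gb)


if __name__ == "__main__":
    total = index_storage_gb(daily_volume_gb=100, retention_days=90)
    print(total, data_disks_needed(total))
```

For example, 100GB/day retained for 90 days comes to about 4,500GB under
these assumptions, or five 1,024GB data disks.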
Use Splunk Solutions on Microsoft Azure
Download Splunk Enterprise for free to quickly deploy Splunk Enterprise as either a single instance or a distributed cluster on
Azure. You’ll get a Splunk Enterprise license for 60 days and you can index up to 500 megabytes of data per day. After 60 days,
or any time before then, you can convert to a perpetual free license or purchase an enterprise license by contacting Splunk at
www.splunk.com/asksales.
Splunk Add-on for Microsoft Cloud Services. Get started with the Splunk Add-on for Microsoft Cloud Services to gain operational
visibility and security from a variety of Office 365 and Azure services.
Learn more: www.splunk.com/asksales www.splunk.com
Splunk, Splunk>, Data-to-Everything, D2E and Turn Data Into Doing are trademarks and registered trademarks of Splunk Inc. in the United States and
other countries. All other brand names, product names or trademarks belong to their respective owners. © 2020 Splunk Inc. All rights reserved. 20-13235-Splunk-ES on Microsoft Azure-102-TB