Splunk is an incredibly robust tool that can scale depending on the certain parameters:

Number of users using the deployment

Amount of data coming in

Number of endpoints sending data to the deployment

Depending upon the above parameters you can horizontally/vertically scale a deployment to accommodate to your needs. In this blog we will briefly discuss following deployments:

Standalone deployment

Distributed deployment

Clustered deployment

Before we dive into various deployments, let us go over some of the widely used components in a Splunk deployment. Splunk comes out of the box with the following components and can be tailored suit your needs. Bear in mind – these components will be used in all the deployments except “Standalone”. Will shed more light on this later.

Search Head

A search head is used to – as the name suggests –search the data. Search heads get all the traffic from the end users. End users log into the UI using the search head and run their searches, reports, alerts, and dashboards and other knowledge objects.

Indexer

An indexer is used to index/parse the data. Splunk uses its proprietary algorithm to store the data in a way that it can be retrieved in a faster manner and then searched upon.

In a distributed deployment – search head (where user searches) and an indexer (where the data is stored) can be separated out. This makes sure that both of the functions i.e. searching and storing is done in a quick and efficient manner. To understand how Splunk indexes data, you can follow this link

Forwarder

A forwarder is used to – as the name suggests –orward the data to a specific target. There are two types of forwarders:

Heavy Forwarders

Lightweight Forwarders

Depending on the use case for the data and infrastructure which decide role for selecting the type of forwarder. To learn more about the difference refer this link.

Components above are represented diagrammatically as follows:

Now that we have covered understanding of basic components, let’s go over the different deployments of Splunk.

Standalone Deployment

A standalone deployment in Splunk means that all the functions that Splunk does are managed by a single instance. Various functions that a Standalone Deployment can do are:

Searching

Indexing

Parsing

Hold Knowledge Objects (This covers Reporting/Alerting/Dashboard Creation and many more)

When is this deployment type used?

This type of deployment is typically used when there are a limited number of users and a very limited amount of data flowing into Splunk.

Pros/Cons of this type of deployment:

Component Pros Cons Supportability Very easy to manage and support as it has only one instance NA High Availability NA No high availability as it is a single point of failure Disaster Recovery NA No disaster availability as it is a single point of failure Search Concurrency NA Low search concurrency as it is a single instance and can be over-loaded easily

Distributed Deployment

There are a few drawbacks of a “Standalone” deployment for Splunk in terms of High Availability, Disaster Recovery and Search Concurrency. To overcome some of these, Splunk can be set up in a way to distribute the tasks to different instances within the platform.

In this deployment, the roles of the Search Head, Indexers and Forwarders are split to create a distributed deployment.

To do this we need to create a distributed search. To learn more about distributed search click on this link.

We can see that we have now split the functions of each component to create a distributed environment. Find the comparison as follows:

Component Pros Cons Supportability Easy to support as the components are separated out in different functions NA High Availability NA Single point of failure. If the indexer goes down, then indexing stops. Disaster Recovery NA Single point of failure. Search Concurrency Higher search concurrency compared to standalone as Search head is separated out If the users are more, consider going into search head clustering for higher search concurrency

Clustered deployment

We can’t achieve features like High Availability and Disaster Recovery for mission-critical production deployments. To achieve this, we need a clustered deployment which looks as follows:

(Image source: Splunk)

In the above deployment, the indexers are in a cluster and there is something called a “Master Node” – this Master Node or Cluster Master manages the indexers and replicates the data across multiple indexers. This creates more than one copy of data across the deployment giving the users “High Availability” of the data.

Master Node Functions:

Manages configurations/apps across all the Peer nodes/indexers

Manages incoming search requests from the search heads

Knows all the copies of the data that is indexed and replicates if any indexer goes out of service

Search Head Captain Functions:

Manages the search load coming in from users

Delegates search jobs coming in from different search head members

Replicates the knowledge bundle across the search head cluster.

We can also see the Search Heads are now in a cluster. This means that it now gives us “High Search Concurrency” which was a drawback in a previous distributed deployment.

To learn how to do indexer clustering please follow this link. To learn how to do search head clustering please follow this link.

Component Pros Cons Supportability Supportaibility is challenging, however, with Master and Captain Nodes we can manage the Splunk configs and apps easily NA High Availability Highly Available as data is replicated across multiple nodes and if single indexer goes down still the data is searchable. If a search head goes down, other search heads will continue to provide the service NA Disaster Recovery No disaster recovery No disaster recovery Search Concurrency High search concurrency as there is a cluster of Search Heads serving multiple clients NA

The learn more on how to size and scale any Splunk deployments please refer this link.

This blog explains different deployments and compares them. It is suggested that once you have the basic differences between different deployments in Splunk, it is easier to architect your deployment per your needs.

Happy Splunking!