The ABC’s of Splunk Part One: What deployment to Choose

When I first started working with Splunk, I really didn’t understand the nuanced differences between a Clustered environment and a standalone other than the fact that one is much more complex and powerful than the other. In this blog, I’m going to share my experience of the factors that need to be considered and what I learned throughout the process. 

Let’s start with the easy stuff:
  1. Do you intend to run Enterprise Security? If you are, clustered is the way to go unless you are a very small shop (less than 10GB/day of ingestion)

  2. How many log messages, systems, and feeds will you configure? If you intend to receive in excess of 50GB/day of logs, you will need a clustered environment. You can potentially get away with a standalone but your decision will most likely change to a clustered environment over time as your system matures and adds the necessary alerts and searches

Now, moving on to the harder items:
  • How about if I’m receiving less than 50GB/day: In this scenario, it will depend primarily on the following factors:

    • Number of Users: Splunk allocates 1 CPU core for each search being executed. Increasing the number of users will also increase the number of searches in your deployment. On average, If <10, then standalone, otherwise clustered

    • Scheduled Saved-searches, Reports, and Alerts:  How many alerts do you intend to configure, and how frequently will they run the searches? If less than 30, then a standalone will work, but more will require a clustered environment especially if the alerts/searches are running every 5 minutes

    • How many  cloud tenancies are you going to be pulling logs from AWS, O365, GSuite, Sophos, and others collect lots of logs and if you have more than 5 of these to pull logs from, I would choose a clustered environment over a standalone (the larger your user environment, the more logs you will be collecting from your cloud tenancies)

    • How many systems are you pulling the logs from? If you have in excess of 70 systems, I would choose a clustered environment over standalone

    • Finally, Is your organization going to grow? I assume you know the drill here

A recent “how-to” question came from a Splunk user that is pertinent to this blog ”What if I want to build a standalone server because the complexity of the clustered environment is beyond my abilities, and my deployment based on the items above marginally requires a clustered environment, is there something I can do?”

The simple answer is yes, there are two things that will make a standalone environment work in this scenario:

  1. Add more memory and CPUs which you can always do after the fact: (look at the specs of the standalone server at the bottom of the document)

  2. Add a heavy forwarder: Heavy forwarders can handle the initial incoming traffic to your Splunk from all the different feeds and cloud tenancies which will help the Splunk platform dedicate the resources to acceleration, searches, dashboards, alerts/reports, etc.

Finally, it’s important to note that a clustered environment has a replication factor that can be used to recover data in case a single indexer fails and or the data on it is lost

Important Note when using Distributed Architecture:

Network latency plays an important role in a distributed/clustered environment, therefore, minimal network latency between your indexers and search heads will ensure optimal performance.

Hardware Requirements

Standalone Environment (Single Instance)

Splunk Recommended Hardware Configuration
  • Intel x86 64-bit chip architecture

  • 12 CPU cores at 2Ghz or greater speed per core

  • 12GB RAM

  • Standard 64-bit Linux or Windows distribution

  • Storage Requirement – Calculate Storage Requirement

View Reference Here

Standalone Environment with a separate Heavy Forwarder

Hardware Configuration
  • Same as Standalone hardware requirement for both the Standalone Instance and the Heavy Forwarder, however, the heavy forwarder does not store data and therefore you can get away with a 50 or 100 GB drive partition

Distributed Clustered Architecture

Distributed Architecture will have the following components:
  • Heavy Forwarder – Collects the data and forwards it to Indexers.

  • Indexers – Stores the data and performs a search on that data (3 or more)

  • Search Head – Users will interact here. The search head will trigger the search on indexers to fetch the data.

  • Licensing Server

  • Master Cluster Node

  • Deployment Server

Search Head hardware requirements

  • Intel 64-bit chip architecture

  • 16 CPU cores at 2Ghz or greater speed per core

  • 12GB RAM

  • A 1Gb Ethernet NIC

  • A 64-bit Linux or Windows distribution

Indexer requirements

  • Intel 64-bit chip architecture

  • 12 CPU cores at 2GHz or greater per core

  • 12GB RAM

  • 800 average IOPS as a minimum for the disk subsystem. For details, see the topic Disk subsystem. Refer Calculate Storage Requirement see how much storage will your deployment need

  • A 1Gb Ethernet NIC

  • A 64-bit Linux or Windows distribution

Heavy Forwarder requirements

  • Intel 64-bit chip architecture

  • 12 CPU cores at 2Ghz or greater speed per core.

  • 12GB RAM

  • A 1Gb Ethernet NIC

  • A 64-bit Linux or Windows distribution

Deployment/Licensing/Cluster Master requirements

  • Intel 64-bit chip architecture

  • 12 CPU cores at 2GHz or greater per core

  • 12GB RAM

  • A 1Gb Ethernet NIC

  • A 64-bit Linux or Windows distribution

View Reference Here

Calculate Storage Requirements

Splunk will compress the data that you are ingesting. At a very high-level, Splunk’s compressed data to almost half the size, so for your standalone environment, you can calculate storage requirements with the below equation.

( Daily average indexing rate ) x ( retention policy in days ) x 1/2

For or your clustered environment, you can calculate storage requirements for each indexer with the below equation.

((( Daily average indexing rate ) x ( retention policy in days ) x 1/2) x replication factor)) / No. of Indexers)
View Reference Here

Written by Usama Houlila.

Any questions, comments, or feedback are appreciated! Leave a comment or send me an email to uhoulila@newtheme.jlizardo.com for any questions you might have.

If you wish to learn more, click the button below to schedule a free consultation with Usama Houlila.

Leave a Reply

Your email address will not be published. Required fields are marked *