Category: ABC’s of Splunk

The ABC’s of Splunk Part Five: Splunk CheatSheet

Aug 12, 2020 by Sam Taylor

In the past few blogs, I wrote about which environment to choose – clustered or standalone – how to install Splunk on Linux, how to manage storage over time, and how to set up the deployment server.

If you haven’t read our previous blogs, get caught up here! Part 1, Part 2, Part 3, Part 4

For this blog, I decided to switch it around and provide you with a CheatSheet (takes me back to high school) for the items you will need throughout your installation process that are sometimes hard to find.

This blog will be split into two sections: Splunk and Linux CheatSheets

Splunk CheatSheet:

1: Management Commands

$SPLUNK_HOME/bin/splunk status – To check Splunk status

$SPLUNK_HOME/bin/splunk start – To start the Splunk processes

$SPLUNK_HOME/bin/splunk stop – To stop the Splunk processes

$SPLUNK_HOME/bin/splunk restart – To restart Splunk

2: How to Check Licensing Usage

Go to “Settings” > “Licensing”. 

For a more detailed report, go to “Settings” > “Monitoring Console” > “Indexing” > “License Usage”.

3: How to Delete Index Data: You’re done configuring your installation, but you have lots of logs going into an old index, and/or data you no longer need is taking up space.

Clean Index Data (Note: you cannot recover these logs once you issue the command)

$SPLUNK_HOME/bin/splunk clean eventdata -index <index_name>

If you do not provide the -index argument, the command will clear all indexes.

Do not apply this command directly in a clustered environment.
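For example – a minimal sketch assuming a standalone instance and an index named main (Splunk must be stopped before you run clean; the -f flag skips the confirmation prompt):

$SPLUNK_HOME/bin/splunk stop

$SPLUNK_HOME/bin/splunk clean eventdata -index main -f

$SPLUNK_HOME/bin/splunk start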

4: Changing your TimeZone (Per User)

Click on your username on the top navigation bar and select “Preferences”.

5: Search Commands That Are Nice To Know For Beginners

index="<name of the index you’re trying to search>" – e.g. index="pan_log" for Palo Alto firewalls

sourcetype="<name of the sourcetype for the items you are looking for>" – e.g. sourcetype="pan:traffic", "pan:userid", "pan:threat", or "pan:system"

The following are more examples of how to filter further in your search:

| dedup – removes duplicate events based on the field you specify. For instance, if you dedup on user and your firewall is generating logs for all user activity, you will not see every event for each user, just one event per distinct user.

| stats – calculates aggregate statistics, such as average, count, and sum, over the result set

| stats count by rule – shows you the number of events that match each specific rule on your firewall
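Putting these together – a sketch of a search, assuming the Palo Alto index and sourcetype names used above, that counts firewall traffic events per rule:

index="pan_log" sourcetype="pan:traffic" | stats count by rule | sort -count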

How to get actual event ingestion time?

As most of you may know, the _time field in the events in Splunk is not always the event ingestion time. So, how to get event ingestion time in Splunk? You can get that with the _indextime field.

| eval it=strftime(_indextime, "%F %T") | table it, _time, other_fields
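Building on that – a sketch that measures ingestion lag per sourcetype (the index name and the field name lag are illustrative choices, not requirements):

index="pan_log" | eval lag=_indextime-_time | stats avg(lag), max(lag) by sourcetype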

Search for where data is coming in on a receiving port

index=_internal source=*metrics.log tcpin_connections OR udpin_connections
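To narrow that down, here is a sketch that counts incoming forwarder connections, assuming the usual metrics.log connection fields (such as hostname and sourceIp) are present in your environment:

index=_internal source=*metrics.log group=tcpin_connections | stats count by hostname, sourceIp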

Linux CheatSheet:

User Operations

whoami – Which user is active. Useful to verify you are using the correct user to make configuration changes in the backend.

chown -R <user>:<group> <directory> – Change the owner of a directory (recursively).

Directory Operations

mv <source> <destination> – Move a file or directory to a new location.

mv <old_name> <new_name> – Rename a file or directory.

cp <source> <destination> – Copy a file to a new location.

cp -r <source_directory> <destination> – Copy a directory to a new location.

rm -rf <file_or_directory> – Remove a file or directory (recursively, without prompting).

Get Size

df -h – Get disk usage (in human-readable size unit)

du -sh * – Get the size of all the directories under the current directory.

watch df -h – Monitor disk usage (in human-readable size unit). Update stats every two seconds. Press Ctrl+C to exit.

watch du -sh * – Get size of all the directories under the current directory. Update stats every two seconds. Press Ctrl+C to exit.

Processes

ps aux – List all the running processes.

top – Get resource utilization statistics by the processes

Work with Files

vi <file> – Open and edit a file with the vi editor.

tail -f <file> – Tail a log file. Unlike cat or vi, it keeps displaying new lines live as they are written to the file.
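For example, a common way to watch Splunk’s own log live (assuming the default install path used elsewhere in this series):

tail -f /opt/splunk/var/log/splunk/splunkd.log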

Networking

ifconfig – To get the IP address of the machine

Written by Usama Houlila.

Any questions, comments, or feedback are appreciated! Leave a comment or send me an email to uhoulila@newtheme.jlizardo.com for any questions you might have.

The ABC’s of Splunk Part Four: Deployment Server

Aug 3, 2020 by Sam Taylor

Thank you for joining us for part four of our ABC’s of Splunk series. If you haven’t read our first three blogs, get caught up here! Part 1, Part 2, Part 3.

When I started working with Splunk, our installations were mostly small, with fewer than 10 servers, and the rest of the devices were mainly switches, routers, and firewalls. In the environments we currently manage, most installations have more than three hundred servers, which are impossible to manage without some form of automation. As you manage your environment over time, one of the following scenarios will make you appreciate the deployment server:

  1. You need to update a TA (technology add-on) on some, if not all, of your Universal Forwarders.
  2. Your logging needs have changed over time and now you need to collect more or less data from your Universal Forwarders.
  3. You’re in the middle of investigating a breach, and/or an attack, and need to quickly push a monitoring change to your entire environment. – How cool is that!

What is a Deployment Server?

A deployment server is an easy way to manage forwarders without logging into them directly and individually to make any changes. Forwarders are the Linux or Microsoft Windows servers that you are collecting logs from by installing the Splunk Universal Forwarder.

Deployment servers also provide a way to show you which server has which Apps and whether those servers are in a connected state or offline.

Please note that whether you use Splunk Cloud or on-prem, the Universal Forwarders are still your responsibility and I hope that this blog will provide you with some good insights.

Deployment Server Architecture:

The below image shows how a deployment architecture looks conceptually.

There are three core components of the deployment server architecture:

  1. Deployment Apps
    Splunk Apps that will be deployed to the forwarders.
  2. Deployment Client
    The forwarder instances on which Splunk Apps will be deployed.
  3. Server Classes
    A logical way to map between Apps and Deployment Clients.
    • You can have multiple Apps within a Server Class.
    • You can deploy multiple Server Classes on a single Deployment Client.
    • You can have the same Server Class deployed on multiple Clients.

How Deployment Server Works:

  1. Each deployment client periodically polls the deployment server, identifying itself.
  2. The deployment server determines the set of deployment Apps for the client based on which server classes the client belongs to.
  3. The deployment server gives the client the list of Apps that belong to it, along with the current checksums of the Apps.
  4. The client compares the App info from the deployment server with its own App info to determine whether there are any new or updated Apps that it needs to download.
  5. If there are new or updated Apps, the Deployment Client downloads them.
  6. Depending on the configuration for a given App, the client might restart itself before the App changes take effect.

Where to Configure the Deployment Server:

The recommendation is to use a dedicated machine for the Deployment Server. However, you can use the same machine for other management components like the “License Master”, “SH Cluster Deployer”, or “DMC”. Do not combine it with the Cluster Master.

Configuration:

I started writing this in a loose format explaining the concepts, but quickly realized that a step-by-step guide is a much easier way to digest the process.

1. Create a Deployment Server

By default, a Splunk server install does not have the deployment server configured, and if you go to the GUI and click on Settings > Forwarder Management, you will get the following message.

To enable a deployment server, you start by placing any App in the $SPLUNK_HOME/etc/deployment-apps directory. If you’re not sure how to do that, install any App you want through the GUI on the server you want to configure (see the example below)

and then, using the Linux shell or Windows cut/paste, move the entire App directory that was created from $SPLUNK_HOME/etc/apps (where Apps install by default) to $SPLUNK_HOME/etc/deployment-apps. See below:

Move

/opt/splunk/etc/apps/Splunk_TA_windows

to

/opt/splunk/etc/deployment-apps/Splunk_TA_windows

This will automatically allow your Splunk server to present you with the forwarder management interface.

2. Manage Server Classes Apps and Clients

Next, you will need to add a server class. Go to Splunk UI > Forwarder Management > Server Class. Create a new server class from here.

Give it a name that is meaningful to you and your staff and go to Step 3

3. Point the Clients to this Deployment Server

You can either specify this in the guided GUI config when you install the Splunk Universal Forwarder on a machine, or use the CLI post-installation:

$SPLUNK_HOME/bin/splunk set deploy-poll <IP_address/hostname>:<management_port>

Where,

IP_Address – IP Address of Deployment Server

management_port – Management port of deployment server (default is 8089)
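Equivalently – as a minimal sketch, assuming a deployment server reachable at 10.0.0.5 on the default management port – the same setting can be written directly into $SPLUNK_HOME/etc/system/local/deploymentclient.conf on the forwarder, followed by a restart of the forwarder:

[target-broker:deploymentServer]
targetUri = 10.0.0.5:8089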

4. Whitelist the Clients on the Deployment Server

Go to any of the server classes you just created and click on Edit Clients.

For Client selection, you can choose the “Whitelist” and “Blacklist” parameters. You can write a comma-separated IP address list in the “Whitelist” box to select those Clients.

5. Assign Apps to Server Classes:

Go to any of the server classes you just created, and click on edit Apps.

Click on the Apps you want to assign to the server class.

Once you add Apps and Clients to a Server Class, Splunk will start deploying the Apps to the listed Clients under that Server Class.

You will also see whether the server is connected and the last time it phoned home.

Note – Some Apps that you push require the Universal Forwarder to be restarted. If you want Splunk Forwarder to restart on update of any App, edit that App (using the GUI) and then select the checkbox “Restart on Deploy”.

Example:

You have a few AD servers, a few DNS servers and a few Linux servers with Universal Forwarders installed to get some fixed sets of data, and you have 4 separate Apps to collect Windows Performance data, DNS specific logs, Linux audit logs, and syslogs.

Now you want to collect Windows Performance logs from all the Windows servers, which include the AD servers and DNS servers. You would also like to collect syslog and audit logs from the Linux servers.

Here is what your deployment server would look like:

  • Server Class – Windows
    • Apps – Windows_Performance
    • Deployment Client – All AD servers and All DNS servers
  • Server Class – DNS
    • Apps – DNS_Logs
    • Deployment Client – DNS servers
  • Server Class – Linux
    • Apps – linux_auditd, linux_syslog
    • Deployment Client – Linux servers
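For reference, here is a rough sketch of how the same mapping might look in serverclass.conf on the deployment server. The whitelist hostname patterns are illustrative assumptions, and restartSplunkd corresponds to the “Restart on Deploy” checkbox mentioned above; most people manage all of this through the Forwarder Management GUI instead:

[serverClass:Windows]
whitelist.0 = ad*
whitelist.1 = dns*

[serverClass:Windows:app:Windows_Performance]
restartSplunkd = true

[serverClass:DNS]
whitelist.0 = dns*

[serverClass:DNS:app:DNS_Logs]

[serverClass:Linux]
whitelist.0 = linux*

[serverClass:Linux:app:linux_auditd]

[serverClass:Linux:app:linux_syslog]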
6. How to Verify Whether Forwarder is Sending Data or Not?

Go to the Search Head and run the search below (make sure you have rights to see internal index data):

index=_internal | dedup host | fields host | table host

Check whether your Forwarder’s hostname appears in the list; if it is present, the Forwarder is connected. If a host is missing from the results, you might have one of two problems:

  1. A networking and/or firewall issue somewhere in between, or on the host itself.
  2. You need to redo Step 3 and/or restart the Splunk process on that server.

If you are missing data for a particular index/source, check the inputs.conf configuration in the App that you pushed to that host.

Other Useful Content:

Protect content during App updates (a must-read to minimize the amount of work you have to do over time managing your environment)

https://docs.splunk.com/Documentation/Splunk/8.0.5/Updating/Excludecontent

Example on the Documentation

https://docs.splunk.com/Documentation/Splunk/8.0.5/Updating/Extendedexampledeployseveralstandardforwarders

Written by Usama Houlila.

Any questions, comments, or feedback are appreciated! Leave a comment or send me an email to uhoulila@newtheme.jlizardo.com for any questions you might have.

The ABC’s of Splunk Part Three: Storage, Indexes, and Buckets

Jul 28, 2020 by Sam Taylor

In our previous two blogs, we discussed whether to build a clustered or single Splunk environment and how to properly secure a Splunk installation using a Splunk user.

Read our first blog here

Read our second blog here

For this blog, we will discuss the art of Managing Storage with indexes.conf

In my experience, it’s easy to create and start using a large Splunk environment, until you see storage on your Splunk indexers getting full. What do you do? You start reading about it and you get information about indexes and buckets, but you really don’t know what those are. Let’s find out.

What is an Index?

Indexes are a logical collection of data. On disk, index data is stored in different buckets.

What are Buckets?

Buckets are sets of directories that contain raw data (logs) and index files that point to the raw data, organized by age.

Types of Buckets:

Splunk has the following types of buckets, based on the age of the data:

  1. Hot Bucket
    1. Location – homePath (default – $SPLUNK_DB/<index_name>/db)
    2. Age – New events are written to these buckets
    3. Searchable – Yes
  2. Warm Bucket
    1. Location – homePath (default – $SPLUNK_DB/<index_name>/db)
    2. Age – Hot buckets are rolled to warm buckets based on multiple Splunk policies
    3. Searchable – Yes
  3. Cold Bucket
    1. Location – coldPath (default – $SPLUNK_DB/<index_name>/colddb)
    2. Age – Warm buckets are rolled to cold buckets based on multiple Splunk policies
    3. Searchable – Yes
  4. Frozen Bucket (Archived)
    1. Location – coldToFrozenDir (no default; frozen data is deleted unless you configure archiving)
    2. Age – Cold buckets can optionally be archived. Archived data is referred to as frozen buckets.
    3. Searchable – No
  5. Thawed Bucket
    1. Location – thawedPath (no default)
    2. Age – Splunk does not put any data here. This is the location where archived (frozen) data can be unarchived – we will be covering this topic at a later date
    3. Searchable – Yes
Manage Storage and Buckets

I always like to include the reference material that the blog is based upon, and the link below has all the different parameters that can be altered (whether or not they should be). It’s a long read, but necessary if you intend to become an expert on Splunk.

https://docs.splunk.com/Documentation/Splunk/8.0.5/Admin/Indexesconf

Continuing with the blog:

Index level settings
  • homePath
    • Path where hot and warm buckets live
    • Default – $SPLUNK_DB/<index_name>/db
    • MyView – Data in the hot and warm buckets is the most recent and is what gets searched most often, so keep it on faster storage to get better search performance.
  • coldPath
    • Path where cold buckets are stored
    • Default – $SPLUNK_DB/<index_name>/colddb
    • MyView – Since Splunk moves data here from the warm buckets, slower storage can be used as long as you don’t have searches that span long periods (> 2 months)
  • thawedPath
    • Path where you can unarchive the data when needed
    • Volume reference does not work with this parameter
    • Default – $SPLUNK_DB/<index_name>/thaweddb
  • maxTotalDataSizeMB
    • The maximum size of an index, in megabytes.
    • Default – 500000
    • MyView – When I started working with Splunk, I left this field as-is for all indexes. Later on, I realized that the decision was ill-advised because the total number of indexes multiplied by the individual size far exceeded my allocated disk space. If you can estimate the data size in any way, do it at this stage and save yourself the headache.
  • repFactor = 0|auto
    • Valid only for indexer cluster peer nodes.
    • Determines whether an index gets replicated.
    • Default – 0
    • MyView – When creating indexes on a cluster, set repFactor = auto so that if you change your mind down the line and decide to increase your resiliency, you can simply edit the replication factor from the GUI and the change will apply to all your indexes without making manual changes to each one.
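Putting these index-level settings together, here is a rough sketch of what a custom index stanza in indexes.conf might look like. The index name pan_log and the 200 GB cap are illustrative assumptions, not recommendations, and repFactor only applies on indexer cluster peers:

[pan_log]
homePath = $SPLUNK_DB/pan_log/db
coldPath = $SPLUNK_DB/pan_log/colddb
thawedPath = $SPLUNK_DB/pan_log/thaweddb
maxTotalDataSizeMB = 200000
repFactor = auto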

 

And now for the main point of this blog: How do I control the size of the buckets in my tenancy?

Option 1: Control how buckets migrate between hot to warm to cold

Hot to Warm (Limiting Bucket’s Size)

  • maxDataSize = <positive integer>|auto|auto_high_volume
    • The maximum size, in megabytes, that a hot bucket can reach before Splunk triggers a roll to warm.
    • auto – 750MB
    • auto_high_volume – 10GB
    • Default – auto
    • MyView – Do not change it.
  • maxHotSpanSecs
    • Upper bound on the timespan of hot/warm buckets, in seconds – the maximum timespan any single bucket can cover.
    • This is an advanced setting that should be set with care and understanding of the characteristics of your data.
    • Default – 7776000 (90 days)
    • MyView – Do not increase this value.
  • maxHotBuckets
    • Maximum number of hot buckets that can exist per index.
    • Default – 3
    • MyView – Do not change this.

Warm to Cold

  • homePath.maxDataSizeMB
    • Specifies the maximum size of ‘homePath’ (which contains hot and warm buckets).
    • If this size is exceeded, splunk moves buckets with the oldest value of latest time (for a given bucket) into the cold DB until homePath is below the maximum size.
    • If you set this setting to 0, or do not set it, splunk does not constrain the size of ‘homePath’.
    • Default – 0
  • maxWarmDBCount
    • The maximum number of warm buckets.
    • Default – 300
    • MyView – Set this parameter with care, as the number of warm buckets an index accumulates is fairly arbitrary and depends on a number of factors.

Cold to Frozen

When to move the buckets?
  • frozenTimePeriodInSecs [after this time, the data is frozen – deleted by default, or archived if configured]
    • The number of seconds after which indexed data rolls to frozen.
    • Default – 188697600 (6 years)
    • MyView – If you do not want to archive the data, set this parameter to the length of time you want to keep your data. After that, Splunk will delete it.
  • coldPath.maxDataSizeMB
    • Specifies the maximum size of ‘coldPath’ (which contains cold buckets).
    • If this size is exceeded, splunk freezes buckets with the oldest value of the latest time (for a given bucket) until coldPath is below the maximum size.
    • If you set this setting to 0, or do not set it, splunk does not constrain the size of ‘coldPath’.
    • Default – 0
What to do when freezing the buckets?
  • Delete the data
    • Default setting for Splunk
  • Archive the data
    • Please note – If you archive the data, Splunk will not delete the archived data automatically; you have to do that manually.
    • coldToFrozenDir
      • Archives the data into another directory
      • This data is not searchable
      • It cannot use volume reference.
    • coldToFrozenScript
      • A script that tells Splunk how to archive the data as it rolls out of cold storage
      • See indexes.conf.spec for more information
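For illustration only – a sketch of an indexes.conf stanza that keeps an index for one year and archives frozen buckets to a local directory instead of deleting them. The index name and archive path are assumptions, and remember that pruning the archive afterwards is your responsibility:

[pan_log]
frozenTimePeriodInSecs = 31536000
coldToFrozenDir = /mnt/archive/splunk/pan_log_frozen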

Option 2: Control the maximum volume size of your buckets

Volumes

There are only two important settings that you really need to care about.

  • path
    • Path on the disk
  • maxVolumeDataSizeMB
    • If set, this setting limits the total size of all databases that reside on this volume to the maximum size specified, in MB.  Note that this will act only on those indexes which reference this volume, not on the total size of the path set in the ‘path’ setting of this volume.
    • If the size is exceeded, splunk removes buckets with the oldest value of the latest time (for a given bucket) across all indexes in the volume, until the volume is below the maximum size. This is the trim operation. This can cause buckets to be chilled [moved to cold] directly from a hot DB, if those buckets happen to have the least value of latest-time (LT) across all indexes in the volume.
    • MyView – I would not recommend using this parameter if you have multiple (small and large) indexes in the same volume, because then the size of the volume decides when data moves from the hot/warm buckets to the cold buckets, irrespective of how important the data is or how fast you need it.
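As a sketch of the syntax (the volume name, path, and 900 GB limit are illustrative assumptions), a volume is defined once in indexes.conf and then referenced from an index’s homePath; note that thawedPath cannot use a volume reference:

[volume:hot_storage]
path = /mnt/ssd/splunk
maxVolumeDataSizeMB = 900000

[pan_log]
homePath = volume:hot_storage/pan_log/db
coldPath = $SPLUNK_DB/pan_log/colddb
thawedPath = $SPLUNK_DB/pan_log/thaweddb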

The Scenario that led to this blog:

Issue

One of our clients has a clustered environment where the hot/warm paths were on SSD drives of limited size (1 TB per indexer) and the cold path had 3 TB per indexer. The ingestion rate was somewhere around 60 GB per day across 36+ indexes, which caused the hot/warm volume to fill up before any normal migration would occur. When we tried to research the problem and ask the experts, there was no consensus on the best method, and I would summarize the answers as follows: “It’s an art and different per environment,” i.e. “we don’t have any advice for you.”

Resolution 

We initially started looking for an option to move data to cold storage when data reaches a certain age (time) limit. But there is no way to do that. (Reference – https://community.splunk.com/t5/Deployment-Architecture/How-to-move-the-data-to-colddb-after-30-days/m-p/508807#M17467)

So, then we had two options as mentioned in the Warm to Cold section.

  1. maxWarmDBCount
  2. homePath.maxDataSizeMB

The problem with the homePath.maxDataSizeMB setting is that it would impact all indexes, which means some data would end up in the cold buckets even though it is still needed in hot/warm and isn’t taking much space. So we went the warm-bucket-count route, because we knew that only three indexes seemed to consume most of the storage. We looked at those and found that they each contained 180+ warm buckets.

We reduced maxWarmDBCount to 40 for these large indexes only, and the storage used by the hot and warm buckets normalized for the entire environment.
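A sketch of what that change might look like in indexes.conf (the index names are placeholders for the three large indexes; on a cluster, the edit is made in the configuration bundle on the cluster master and pushed to the peers):

[large_index_1]
maxWarmDBCount = 40

[large_index_2]
maxWarmDBCount = 40

[large_index_3]
maxWarmDBCount = 40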

For our next blog, we will be discussing how to archive and unarchive data in Splunk.

 

Written by Usama Houlila.

Any questions, comments, or feedback are appreciated! Leave a comment or send me an email to uhoulila@newtheme.jlizardo.com for any questions you might have.

If you wish to learn more, click the button below to schedule a free consultation with Usama Houlila.

The ABC’s of Splunk Part Two: How to Install Splunk on Linux

Jul 21, 2020 by Sam Taylor

 In the last blog, we discussed how to choose between a single or clustered environment. Read our first blog here!

Regardless of which one you choose, you must install Splunk using a user other than root to prevent the Splunk platform from being used in a security breach.

The following instructions have to be done in sequence:

Step 1: Create a Splunk user

We will first create a separate user for Splunk and add a group for that user.
groupadd splunk
useradd -d /opt/splunk -m -g splunk splunk

 

Step 2: Download and Extract Splunk

The easiest way to download Splunk on a Linux machine is with wget. To get the URL do the following:

  1. Go to https://www.splunk.com/en_us/download/splunk-enterprise.html
  2. Log in with your Splunk credential.
  3. Select to download the Linux .tgz file. This will download the latest version of Splunk. To download an older version click on the “Older Releases” link.
  4. Once you click download, Splunk will start downloading in your browser. Cancel the download.
  5. On the newly opened page, under “Link for useful tools,” select “Download via Command Line (wget)” to get the URL.
  6. Select and copy the full wget link.

Open a Linux SSH session, change to the /opt/ directory, and paste the wget command. This will download the Splunk .tgz file.

Extract Splunk:

tar -xvzf <splunk_package_name>.tgz

Step 3: Start Splunk

Make sure that from this point onwards you always use the splunk user for any backend activity related to Splunk.

Change ownership of the Splunk directory.
chown -R splunk:splunk /opt/splunk

Change user to Splunk.
su splunk

Start Splunk
/opt/splunk/bin/splunk start --accept-license

It will ask you to enter the admin username and password.

Step 4: Enable Splunk boot start.

/opt/splunk/bin/splunk enable boot-start -user splunk

Step 5: Use Splunk

Open your browser and go to the URL below and you will be able to use Splunk.
http://<ip-or-host-of-your-linux-machine>:8000/

Use the username and password you entered in Step 3 while starting Splunk.

Click here for a reference

Written by Usama Houlila.

Any questions, comments, or feedback are appreciated! Leave a comment or send me an email to uhoulila@newtheme.jlizardo.com for any questions you might have.
If you wish to learn more, click the button below to schedule a free consultation with Usama Houlila.

The ABC’s of Splunk Part One: What deployment to Choose

Jul 15, 2020 by Sam Taylor

When I first started working with Splunk, I really didn’t understand the nuanced differences between a Clustered environment and a standalone other than the fact that one is much more complex and powerful than the other. In this blog, I’m going to share my experience of the factors that need to be considered and what I learned throughout the process. 

Let’s start with the easy stuff:
  1. Do you intend to run Enterprise Security? If you do, clustered is the way to go unless you are a very small shop (less than 10GB/day of ingestion)

  2. How many log messages, systems, and feeds will you configure? If you intend to receive in excess of 50GB/day of logs, you will need a clustered environment. You can potentially get away with a standalone but your decision will most likely change to a clustered environment over time as your system matures and adds the necessary alerts and searches

Now, moving on to the harder items:
  • What if I’m receiving less than 50GB/day? In this scenario, it will depend primarily on the following factors:

    • Number of Users: Splunk allocates 1 CPU core for each search being executed. Increasing the number of users will also increase the number of searches in your deployment. As a rule of thumb, if you have fewer than 10 users, then standalone; otherwise, clustered

    • Scheduled Saved-searches, Reports, and Alerts:  How many alerts do you intend to configure, and how frequently will they run the searches? If less than 30, then a standalone will work, but more will require a clustered environment especially if the alerts/searches are running every 5 minutes

    • How many cloud tenancies are you going to be pulling logs from? AWS, O365, GSuite, Sophos, and others collect lots of logs, and if you have more than 5 of these to pull logs from, I would choose a clustered environment over a standalone (the larger your user environment, the more logs you will be collecting from your cloud tenancies)

    • How many systems are you pulling the logs from? If you have in excess of 70 systems, I would choose a clustered environment over standalone

    • Finally, Is your organization going to grow? I assume you know the drill here

A recent “how-to” question pertinent to this blog came from a Splunk user: “What if I want to build a standalone server because the complexity of the clustered environment is beyond my abilities, but my deployment, based on the items above, marginally requires a clustered environment – is there something I can do?”

The simple answer is yes, there are two things that will make a standalone environment work in this scenario:

  1. Add more memory and CPUs, which you can always do after the fact (look at the specs of the standalone server at the bottom of the document)

  2. Add a heavy forwarder: Heavy forwarders can handle the initial incoming traffic to your Splunk from all the different feeds and cloud tenancies which will help the Splunk platform dedicate the resources to acceleration, searches, dashboards, alerts/reports, etc.

Finally, it’s important to note that a clustered environment has a replication factor that can be used to recover data in case a single indexer fails and/or the data on it is lost.

Important Note when using Distributed Architecture:

Network latency plays an important role in a distributed/clustered environment, therefore, minimal network latency between your indexers and search heads will ensure optimal performance.

Hardware Requirements

Standalone Environment (Single Instance)

Splunk Recommended Hardware Configuration
  • Intel x86 64-bit chip architecture

  • 12 CPU cores at 2GHz or greater speed per core

  • 12GB RAM

  • Standard 64-bit Linux or Windows distribution

  • Storage Requirement – Calculate Storage Requirement

View Reference Here

Standalone Environment with a separate Heavy Forwarder

Hardware Configuration
  • Same as Standalone hardware requirement for both the Standalone Instance and the Heavy Forwarder, however, the heavy forwarder does not store data and therefore you can get away with a 50 or 100 GB drive partition

Distributed Clustered Architecture

Distributed Architecture will have the following components:
  • Heavy Forwarder – Collects the data and forwards it to Indexers.

  • Indexers – Stores the data and performs a search on that data (3 or more)

  • Search Head – Users will interact here. The search head will trigger the search on indexers to fetch the data.

  • Licensing Server

  • Master Cluster Node

  • Deployment Server

Search Head hardware requirements

  • Intel 64-bit chip architecture

  • 16 CPU cores at 2GHz or greater speed per core

  • 12GB RAM

  • A 1Gb Ethernet NIC

  • A 64-bit Linux or Windows distribution

Indexer requirements

  • Intel 64-bit chip architecture

  • 12 CPU cores at 2GHz or greater per core

  • 12GB RAM

  • 800 average IOPS as a minimum for the disk subsystem. For details, see the Disk Subsystem topic. Refer to Calculate Storage Requirements to see how much storage your deployment will need

  • A 1Gb Ethernet NIC

  • A 64-bit Linux or Windows distribution

Heavy Forwarder requirements

  • Intel 64-bit chip architecture

  • 12 CPU cores at 2GHz or greater speed per core.

  • 12GB RAM

  • A 1Gb Ethernet NIC

  • A 64-bit Linux or Windows distribution

Deployment/Licensing/Cluster Master requirements

  • Intel 64-bit chip architecture

  • 12 CPU cores at 2GHz or greater per core

  • 12GB RAM

  • A 1Gb Ethernet NIC

  • A 64-bit Linux or Windows distribution

View Reference Here

Calculate Storage Requirements

Splunk will compress the data that you are ingesting. At a very high level, Splunk compresses data to almost half its original size, so for your standalone environment, you can calculate storage requirements with the equation below.

( Daily average indexing rate ) x ( retention policy in days ) x 1/2

For your clustered environment, you can calculate storage requirements for each indexer with the equation below.

(( Daily average indexing rate ) x ( retention policy in days ) x 1/2 x ( replication factor )) / ( number of indexers )
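For example, assuming 60 GB/day of ingestion, 90 days of retention, a replication factor of 2, and 3 indexers (all illustrative numbers), each indexer would need roughly (60 x 90 x 1/2 x 2) / 3 = 1,800 GB of index storage, plus headroom.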
View Reference Here

Written by Usama Houlila.

Any questions, comments, or feedback are appreciated! Leave a comment or send me an email to uhoulila@newtheme.jlizardo.com for any questions you might have.

If you wish to learn more, click the button below to schedule a free consultation with Usama Houlila.