How to check the compression ratio of data ingested into Splunk

Release date: 2016-11-14
Last updated: 2023-12-05
Version: Splunk Enterprise 9.0.4
Overview
How to check the compression ratio of data ingested into Splunk
Content

About Compressing Ingested Data

Splunk compresses ingested data to roughly 50% of its original size when it stores the data on the file system. This 50% figure is only a guideline; the actual compression ratio varies depending on the data.

Therefore, when selecting hardware, you should check in advance how much the data you plan to ingest is actually compressed once it is stored in Splunk.
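As a rough illustration of why the ratio matters for hardware selection, the following minimal sketch multiplies a daily ingest volume by a retention period and a measured compression ratio. The function name, parameters, and sample figures are hypothetical and are not part of Splunk; the sketch also ignores replication and operating-system overhead.

```python
def estimate_storage_gb(daily_ingest_gb: float,
                        retention_days: int,
                        compression_ratio: float = 0.5) -> float:
    """Return the approximate disk space (GB) needed for one copy of the data.

    compression_ratio is the on-disk size divided by the original size
    (0.5 corresponds to the 50% guideline; replace it with a measured value).
    """
    if daily_ingest_gb < 0 or retention_days < 0 or compression_ratio <= 0:
        raise ValueError("inputs must be non-negative and the ratio must be positive")
    return daily_ingest_gb * retention_days * compression_ratio


if __name__ == "__main__":
    # Hypothetical workload: 100 GB/day kept for 90 days.
    for ratio in (0.3, 0.5, 0.8):
        needed = estimate_storage_gb(100, 90, ratio)
        print(f"ratio {ratio:.0%}: about {needed:,.0f} GB")
```

Running the sketch with several ratios shows how far the required capacity can move away from the 50% guideline, which is the reason for measuring the ratio with real data first.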

*1. When Splunk stores data, it creates a raw data file and index files. The raw data file is about 10% of the size of the ingested file, and the index files are about 10% to 110% of the size of the ingested file, so the overall compression ratio may differ considerably from 50%.

*2. The index files may become larger if the data contains Japanese (double-byte characters) or if fields are extracted at index time.

See the following documents for details.

*1. About the compression ratio
https://docs.splunk.com/Documentation/Splunk/6.5.0/Capacity/Estimateyourstoragerequirements

*2. About field extraction
http://docs.splunk.com/Documentation/Splunk/6.5.0/Data/Aboutindexedfieldextraction
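The percentages quoted in note *1 can be turned into a rough lower and upper bound for the on-disk size. The sketch below is only an illustration of that arithmetic: the function name and the sample input are hypothetical, and the default percentages are the rule-of-thumb values stated in this article, not measured values.

```python
def on_disk_range(source_bytes: int,
                  rawdata_pct: int = 10,
                  index_pct_min: int = 10,
                  index_pct_max: int = 110) -> tuple[float, float]:
    """Return (smallest, largest) expected on-disk size in bytes.

    All percentages are relative to the size of the ingested file,
    as described in note *1 of this article.
    """
    smallest = source_bytes * (rawdata_pct + index_pct_min) / 100
    largest = source_bytes * (rawdata_pct + index_pct_max) / 100
    return smallest, largest


if __name__ == "__main__":
    # Hypothetical 1 GB source file.
    low, high = on_disk_range(1_000_000_000)
    print(f"expected on-disk size: {low:,.0f} to {high:,.0f} bytes")
```

With these defaults the estimate spans 20% to 120% of the original size, which shows why the 50% guideline should be confirmed against real data.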

How to check the compression ratio of ingested data

You can use the Monitoring Console to check the compression ratio of each index.

The procedure is described below.

Procedure for checking the data compression ratio

  1) Go to Splunk Web and create a new index from Settings > Indexes > New Index.
  2) Ingest the data whose compression ratio you want to check into the index created in 1).
    *If the data whose compression ratio you want to check is already stored in an existing index, skip steps 1) and 2).
  3) Select Settings > Monitoring Console.
  4) Select Indexing > Index Detail: Instance from the navigation menu.
  5) In the index field of the input form, select the index created in 1).
  6) Review the following items on the overview dashboard:
    • Index Size
      Compressed data size stored in Splunk
    • Uncompressed Raw Data Size
      Actual data size
    • Raw to Index Size Ratio
      Ratio of the actual data size to the data size after compression
      *For example, 2:1 means that the data is stored at 50% of its original size.
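The relationship between the three dashboard values can be checked by hand. The following minimal sketch assumes you have read the Index Size and Uncompressed Raw Data Size values from the dashboard and typed them in yourself; the function name and the sample numbers are hypothetical, and the sketch does not query Splunk.

```python
def compression_summary(raw_size_mb: float, index_size_mb: float) -> tuple[float, float]:
    """Return (raw-to-index ratio, stored size as a percentage of the raw size)."""
    if raw_size_mb <= 0 or index_size_mb <= 0:
        raise ValueError("both sizes must be positive")
    ratio = raw_size_mb / index_size_mb          # e.g. 2.0 is shown as 2:1
    percent = index_size_mb / raw_size_mb * 100  # e.g. 50.0 means 50% of original
    return ratio, percent


if __name__ == "__main__":
    # Hypothetical values: 200 MB of raw data stored as 100 MB on disk.
    ratio, percent = compression_summary(raw_size_mb=200, index_size_mb=100)
    print(f"Raw to Index Size Ratio: {ratio:.2f}:1")
    print(f"Stored at {percent:.0f}% of the original size")
```

A ratio below 1:1 means the index occupies more space than the original data, which can happen in the cases described in notes *1 and *2.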
