What the HEC Is This Thing?

The Splunk HTTP Event Collector (HEC) is a great mechanism for receiving streaming data from a variety of sources where it may not be practical to use another collection mechanism, such as monitoring a log file. These include many types of cloud services and applications, as well as custom applications that can do logging via a web POST request. In some cases, you may have the option of using HEC or an API pull on a heavy forwarder to collect data, such as for Amazon Web Services (AWS). Generally, if HEC is an available option, it is the best one to use.

Why HEC?

I’ve covered some of the benefits of using HEC near the end of my 2019 Splunk.conf talk, Administrators Anonymous: Splunk Best Practices and Useful Tricks I Learned the Hard Way, available here for your viewing pleasure.

HEC is advantageous for data ingestion because a streaming mechanism generally works better than a polling method, for several reasons. A streaming mechanism:

Provides close to real-time ingestion: Streaming data can be received by Splunk as soon as it is created, which makes it available for searching much faster. A polling method runs on a scheduled interval, checking for new data in a queue on a regular basis. This means there will always be at least some delay in getting data, depending on how often the polling happens.

Avoids API limits: In order to reduce the delay associated with an API pull, it may be tempting to increase the frequency of these checks to get the data faster. However, many APIs enforce rate limits if they are accessed too frequently. Streaming data doesn’t rely on this mechanism, so any API rate limits don’t apply in the same way.

Improves load balancing and redundancy: There’s not a great mechanism to load balance or allow for failover when using a heavy forwarder for data collection. HEC, on the other hand, can use an external load balancing mechanism to distribute incoming data to all of your indexers.

Improves reliability: Many vendors force API tokens to expire on a recurring basis, some as frequently as every 90 days. When a token expires, your data ingestion breaks until the token is renewed. Streaming data removes the need to constantly keep an API token up to date.

Configuring HEC

By this point, I’m sure you’re thrilled to get started with using HEC and want to see how it’s done. I’ve put together some instructions and also a few video demos of the process on a lab instance to show you how it works.

See the video below for a walkthrough of setting up HEC:

To set up HEC on a single Splunk instance:

Navigate to Settings -> Data Inputs:

Click on HTTP Event Collector:

In the HTTP Event Collector screen, there will be two configurations that need to be set. First, you will need to create a token. Once a token is created, you must globally enable HTTP event collector under the Global Settings menu.

To create a new token, click on “New Token”. Then, specify a name for the token and click Next.

On the next screen, you can customize the input and specify the app context and index for the input. In this example, I am putting the HTTP configuration into the search app, but you’ll likely want to put this in a HEC-specific inputs app in a real deployment. You can also force a sourcetype for everything received through this input here.

The index configuration allows you to select indexes that are permitted to receive data via the HEC input being configured. In the example, I am selecting the hec index and using that as both the allowed index and default index.

Next, review the configuration and click submit to save it.

Finally, grab the token value that is displayed on the confirmation screen. You will need this token to send data to HEC.

Now, you need to globally enable HEC. From the HTTP Event Collector screen in the data inputs settings, click on Global Settings:

Set all tokens to Enabled, verify the HTTP Port Number is set to 8088, and click Save.

Now you are ready to test your HEC setup.
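Before sending any events, it can be worth confirming that the collector is listening at all. Here is a minimal sketch, assuming your Splunk version exposes the /services/collector/health endpoint (recent versions do, but check the documentation for yours) and reusing the lab address from this post. The hec_url helper and the SPLUNK_HOST and HEC_PORT variables are my own illustrative names, not anything Splunk defines.

```shell
# Illustrative variables (not Splunk settings); override them for your instance.
SPLUNK_HOST="${SPLUNK_HOST:-172.16.212.130}"   # lab address used in this post
HEC_PORT="${HEC_PORT:-8088}"                   # port set in Global Settings

# Hypothetical helper: build a HEC URL from the host, port, and endpoint path.
hec_url() {
    # $1 = endpoint path, e.g. /services/collector/health
    printf 'https://%s:%s%s\n' "$SPLUNK_HOST" "$HEC_PORT" "$1"
}

# Print the health-check command instead of running it, so it can be
# reviewed first. -k skips certificate verification (lab use only).
echo "curl -k $(hec_url /services/collector/health)"
```

If the printed command fails to connect when you run it, the usual suspects are that HEC is still globally disabled or that port 8088 is blocked by a firewall.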

Testing Your HEC Configuration

To test your new HEC token, we will use the curl command to send a web request to the HEC endpoint.

This video demonstrates the process of sending some sample data:

First, make sure you have the address of your Splunk instance, as well as the HEC token you generated in the last section. You will use this to create the following curl command:

curl -k https://172.16.212.130:8088/services/collector -H 'Authorization: Splunk 578254cc-05f5-46b5-957b-910d1400341a' -d '{"sourcetype": "demo", "event":"Hello, world!"}'

In this example, we are sending the data in JSON format, using the /services/collector endpoint. If the data were in raw format, you could send it to the /services/collector/raw endpoint instead. The JSON body sets the sourcetype to “demo” and carries the event text in the event field.
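To make the difference between the two endpoints concrete, here is a rough sketch of both calls wrapped in shell functions. It assumes the lab address and token from this post; the function names send_json_event and send_raw_event are hypothetical helpers of my own. Depending on your Splunk version and whether indexer acknowledgment is enabled, the raw endpoint may also require a channel identifier, which this sketch leaves out.

```shell
# Assumed values from this post's lab setup; replace with your own.
HEC_BASE="https://172.16.212.130:8088"
HEC_TOKEN="578254cc-05f5-46b5-957b-910d1400341a"

# JSON endpoint: metadata such as sourcetype travels inside the JSON body.
send_json_event() {
    # $1 = complete JSON body
    curl -k "$HEC_BASE/services/collector" \
        -H "Authorization: Splunk $HEC_TOKEN" \
        -d "$1"
}

# Raw endpoint: the body is the event text itself, so metadata moves
# into the query string instead.
send_raw_event() {
    # $1 = raw event text, $2 = sourcetype (defaults to "demo")
    curl -k "$HEC_BASE/services/collector/raw?sourcetype=${2:-demo}" \
        -H "Authorization: Splunk $HEC_TOKEN" \
        -d "$1"
}

# Example calls (uncomment to run against your own instance):
# send_json_event '{"sourcetype": "demo", "event": "Hello, world!"}'
# send_raw_event 'Hello, world!'
```

Nothing is sent until you call one of the functions, so the block is safe to paste into a shell and adjust first.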

Upon running the curl command, you will see a Success response from the Splunk instance, indicating that the data was received.
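The Success response is a small JSON body, {"text":"Success","code":0}, and failures come back with a non-zero code. If you are scripting your tests, a simple check on that code can help. This is only a sketch: hec_response_ok is a hypothetical helper name, and it uses plain pattern matching rather than a real JSON parser.

```shell
# Succeed (exit status 0) only if a HEC response body reports code 0.
# Pattern matching keeps this free of dependencies such as jq.
hec_response_ok() {
    # $1 = response body returned by curl
    case "$1" in
        *'"code":0'[,}]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample body for illustration; in practice, capture it from curl with -s.
response='{"text":"Success","code":0}'
if hec_response_ok "$response"; then
    echo "event accepted"
else
    echo "event rejected: $response"
fi
```

For anything beyond a quick test, parsing the response with a proper JSON tool is the more robust choice.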

You can then search for this data in your Splunk instance:

Under the Hood: Inputs.conf and HEC

Now that we’ve seen how to configure HEC and how to test it, let’s dive into the config files to see how they control this data input.

I’ve put together a quick demo, where I add an additional HEC token in inputs.conf. See below:

If you followed the configuration steps and set the app to the Search & Reporting (search) app, you’ll find the inputs.conf for the HEC token you set up in the web interface in $SPLUNK_HOME/etc/apps/search/local/inputs.conf. In this example, it looks like this:

[http://hec-test]
description = Splunk HTTP Event Collector Demo
disabled = 0
index = hec
indexes = hec
token = 578254cc-05f5-46b5-957b-910d1400341a

If you want to create a new HEC token, you can simply create a new stanza in inputs.conf. To easily create a token, you can use uuidgen, if it is available on your system:

# uuidgen
96d7fb16-1fc2-4d6f-ba2f-f686787bcb92

To enable this token, simply create an additional stanza in inputs.conf, reload or restart Splunk on the system, and you can receive data via your new token as well.

[http://hec-test-2]
description = Splunk HTTP Event Collector Demo #2
disabled = 0
index = main
indexes = main
token = 96d7fb16-1fc2-4d6f-ba2f-f686787bcb92
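If you add tokens this way often, the two steps (generate a token, write a stanza) are easy to script. The sketch below is one way to do it. The helper names make_hec_stanza and new_token and the input name hec-test-3 are made up for illustration, and the fallback to /proc/sys/kernel/random/uuid assumes a Linux system.

```shell
# Print an inputs.conf stanza for a new HEC token.
# make_hec_stanza is a hypothetical helper name, not a Splunk command.
make_hec_stanza() {
    # $1 = input name, $2 = index, $3 = token
    printf '[http://%s]\n' "$1"
    printf 'disabled = 0\n'
    printf 'index = %s\n' "$2"
    printf 'indexes = %s\n' "$2"
    printf 'token = %s\n' "$3"
}

# Generate a token with uuidgen when it is installed; otherwise fall back
# to the Linux kernel's UUID source (assumption: running on Linux).
new_token() {
    if command -v uuidgen >/dev/null 2>&1; then
        uuidgen
    else
        cat /proc/sys/kernel/random/uuid
    fi
}

# Print a ready-to-paste stanza for a hypothetical third input.
make_hec_stanza "hec-test-3" "main" "$(new_token)"
```

The script only prints the stanza. Review the output, append it to the inputs.conf of the app you are using, and then reload or restart Splunk as described above.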

Conclusion

I hope these examples and instructions are helpful in learning more about how to configure and test the Splunk HTTP event collector. May your data streaming be smooth!