Pre-Processing Data

Customers often ask me how data can be transformed before it is indexed in Splunk. Damien from Baboonbones has done a tremendous job creating add-ons that provide custom inputs for Splunk. Most of his custom inputs let you pre-process data by writing custom event handlers. Sometimes, however, you want to pre-process data that is collected by Splunk's standard input types, such as file monitors, Windows Event Logs, or scripted inputs. Also, not everyone is able to write custom event handlers. These customers have typically already rolled out a large number of Splunk Universal Forwarders and do not want to install another agent. To summarize, a solution for pre-processing data should be easy to use, easy to integrate, and built on top of the existing architecture.

How to plumb Splunk Pipelines

Splunk has its own fittings to connect a Universal Forwar
Welcome back to the "Heating up the Data Pipeline" blog series. In part 1 we talked about how to route data from Splunk to a 3rd party system. In part 2 we walked through a simple data flow that passes data collected from Splunk Forwarders through Apache NiFi and back to Splunk over the HTTP Event Collector. In this part, we will look at a more complex use case, where we route events to an index based on the sending host's classification. The classification will be looked up from a CSV file. We will make use of Apache NiFi's new Record-Oriented data handling capabilities, which may look a bit more complicated at first, but once you grasp them, they make further use cases easier and faster to build.

High-Level Dataflow

We will again start with our ListenTCP input, but this time we will send the data to another Processor Group. The Processor Group will again emit events suitable for Splunk HEC. Note that we have increased the Max Batch Size from 1
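
The routing idea described in this post (look up the sending host's classification in a CSV file, then pick an index) can be sketched outside of NiFi in a few lines of Python. This is only an illustration of the logic, not the NiFi flow itself: the CSV column names, the classification values, the index names and the `main` fallback are all assumptions made up for this example.

```python
import csv
import io

# Hypothetical lookup file content; column names and values are assumptions.
LOOKUP_CSV = """host,classification
web01,public
db01,restricted
"""

# Hypothetical mapping from a host classification to a target index.
INDEX_BY_CLASSIFICATION = {
    "public": "idx_public",
    "restricted": "idx_restricted",
}
DEFAULT_INDEX = "main"  # assumed fallback for hosts missing from the lookup


def load_lookup(text):
    """Parse the CSV text into a dict of host -> classification."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["host"]: row["classification"] for row in reader}


def route(event, lookup):
    """Return a copy of the event with an 'index' field chosen by host."""
    classification = lookup.get(event.get("host"))
    index = INDEX_BY_CLASSIFICATION.get(classification, DEFAULT_INDEX)
    return {**event, "index": index}


lookup = load_lookup(LOOKUP_CSV)
routed = route({"host": "db01", "event": "login failed"}, lookup)
```

Hosts that are not listed in the lookup fall through to the default index instead of being dropped, which is one possible design choice among several.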
In Part 1 we went through how to route events from Splunk to a 3rd party system without losing metadata. Now I'll show you how events can be transformed using Apache NiFi and sent back to Splunk via the HTTP Event Collector.

Note: The following is not a step-by-step documentation. To learn how to use Apache NiFi you should read the Getting Started Guide.

Simple Pass-Through Flow

As a first exercise we will create a simple flow that only passes data through NiFi, without applying any complex transformations. The following picture shows a high-level NiFi flow that receives events in our custom uncooked event format with a TCP listener, then sends the data into a transformation "black box" (aka Processor Group), which emits events in a format that can be ingested by a Splunk HTTP Event Collector input. Apache NiFi provides a rapidly growing number of processors (currently 266), which can be used for data ingestion, transfo
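
To make the target format of the "black box" more concrete, here is a minimal Python sketch that wraps a raw event and its metadata in the JSON envelope accepted by the HTTP Event Collector event endpoint. The function name `to_hec_event` and the sample values are hypothetical, and the sketch does not reproduce the custom uncooked input format or the NiFi processors used in the post.

```python
import json


def to_hec_event(raw, host, source, sourcetype, index=None, time=None):
    """Build one HEC JSON event from a raw event string and its metadata.

    'index' and 'time' are optional; when omitted, they are left out of
    the payload so that the receiving side can apply its own defaults.
    """
    payload = {
        "event": raw,
        "host": host,
        "source": source,
        "sourcetype": sourcetype,
    }
    if index is not None:
        payload["index"] = index
    if time is not None:
        payload["time"] = time  # epoch seconds
    return json.dumps(payload)


# Sample values below are made up for illustration.
body = to_hec_event(
    "user=alice action=login",
    host="web01",
    source="/var/log/app.log",
    sourcetype="app_log",
)
```

The resulting string would be sent as the request body to the collector, together with an HEC token in the `Authorization` header.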