A question I often hear from our customers is how data can be transformed before it is indexed in Splunk.
Damien from Baboonbones has done a tremendous job creating add-ons that provide custom inputs for Splunk. Most of these custom inputs let you pre-process data by writing custom event handlers.
Sometimes, however, you want to pre-process data that is collected through Splunk's standard input types, such as file monitors, Windows Event Logs, and scripted inputs. Also, not everyone is able to write custom event handlers.
Another requirement these customers have: they have rolled out a large number of Splunk Universal Forwarders and do not want to install yet another agent.
To summarize, a solution for pre-processing data should be easy to use, integrate easily, and build on top of their existing architecture.
How to plumb Splunk Pipelines
Splunk has its own fittings to connect a Universal Forwarder to a Heavy Forwarder o…
In Part 1 we went through how to route events from Splunk to a 3rd-party system without losing metadata. Now I'll show you how events can be transformed using Apache NiFi and sent back to Splunk via the HTTP Event Collector.
Note: The following is not a step-by-step documentation. To learn how to use Apache NiFi you should read the Getting Started Guide.
Simple Pass-Through Flow
As a first exercise we will create a simple flow that only passes data through NiFi, without applying any complex transformations.
The following picture shows a high-level NiFi flow that receives events in our custom uncooked event format on a TCP listener, passes them into a transformation "black box" (a NiFi Process Group), and emits events in a format that can be ingested by a Splunk HTTP Event Collector input.
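Whatever the black box does internally, the data leaving it has to match the JSON envelope that HEC expects. The sketch below shows that envelope; the endpoint path and the `Authorization: Splunk <token>` scheme follow Splunk's HEC documentation, while the helper name and all host, token, and event values are made up for illustration.

```python
import json

def to_hec_payload(raw_event, host=None, source=None, sourcetype=None, time=None):
    """Wrap a raw event string in the HEC event envelope, keeping metadata."""
    payload = {"event": raw_event}
    # Optional metadata fields recognized by HEC alongside "event":
    if host is not None:
        payload["host"] = host
    if source is not None:
        payload["source"] = source
    if sourcetype is not None:
        payload["sourcetype"] = sourcetype
    if time is not None:
        payload["time"] = time  # event timestamp as epoch seconds
    return json.dumps(payload)

# Example: one event as it could leave the transformation step
body = to_hec_payload(
    "2023-01-01 12:00:00 action=login user=alice",   # illustrative event
    host="webserver01",
    source="/var/log/auth.log",
    sourcetype="auth",
    time=1672574400,
)

# This body would be POSTed to https://<splunk-host>:8088/services/collector/event
# with a header like (token value is a placeholder):
headers = {"Authorization": "Splunk 00000000-0000-0000-0000-000000000000"}
print(body)
```

Keeping the metadata fields in the envelope is what preserves `host`, `source`, and `sourcetype` from the original forwarder all the way through NiFi and back into the index.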
Apache NiFi provides a rapidly growing number of processors (currently 266), which can be used for data ingestion, transformation, and output. P…