Pre-Processing Data

I often hear the question from our customers: how can data be transformed prior to indexing in Splunk? Damien from Baboonbones has done a tremendous job creating add-ons that provide custom inputs for Splunk. Most of his custom inputs allow data to be pre-processed through custom event handlers. Sometimes, however, you want to pre-process data collected through Splunk's standard input types, such as file monitors, Windows Event Logs, and scripted inputs. Also, not everyone is able to write custom event handlers. Another requirement these customers have is that they have already rolled out a large number of Splunk Universal Forwarders and do not want to install yet another agent. To summarize, a solution for pre-processing data should be easy to use, integrate easily, and be built on top of their existing architecture.

How to plumb Splunk Pipelines

Splunk has its own fittings to connect a Univer...
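To make the "fitting" idea concrete: the approach used in part 1 of this series is simply a forwarder output that clones raw data to a third-party TCP listener, such as NiFi's ListenTCP. The snippet below is a minimal sketch of such an outputs.conf; the group name, host, and port are placeholders I have assumed, not values from this post.

    # outputs.conf on the Universal Forwarder (illustrative values only)
    [tcpout]
    # clone data to both the regular indexers and the hypothetical "nifi" group
    defaultGroup = primary_indexers, nifi

    [tcpout:primary_indexers]
    server = idx1.example.com:9997

    [tcpout:nifi]
    # placeholder host/port for a NiFi ListenTCP processor
    server = nifi.example.com:9999
    # send raw (uncooked) events so a non-Splunk receiver can parse them
    sendCookedData = false

With sendCookedData disabled, the receiver gets plain text lines instead of Splunk's internal cooked format, which is what a generic TCP listener expects.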
Welcome back to the "Heating up the Data Pipeline" blog series. In part 1 we talked about how to route data from Splunk to a 3rd party system. In part 2 we walked through a simple data flow that passes data collected from Splunk Forwarders through Apache NiFi and back to Splunk over the HTTP Event Collector. In this part, we will look at a more complex use case, where we route events to an index based on the sending host's classification. The classification will be looked up from a CSV file. We will make use of Apache NiFi's new Record-Oriented data handling capabilities, which may look a bit more complicated at first, but once you grasp them, further use cases become easier and faster to build.

High-Level Dataflow

We will again start with our ListenTCP input, but this time we will send the data to another Processor Group. The Processor Group will again emit events suitable for Splunk HEC. Note that we have increased the Max Batch Size from 1 ...
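The routing itself is built entirely with NiFi record processors, but to make the logic concrete, here is a small Python sketch of the equivalent behaviour: look up the sending host's classification in a CSV file, pick a target index, and post the event to Splunk HEC. The URL, token, CSV column names, and index names are all illustrative assumptions, not values taken from the flow.

    import csv
    import json
    import requests

    HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # placeholder
    HEC_TOKEN = "00000000-0000-0000-0000-000000000000"                    # placeholder

    # Load the host classification lookup; the column names are assumed here.
    with open("host_classification.csv", newline="") as f:
        classification = {row["host"]: row["classification"] for row in csv.DictReader(f)}

    # Map a classification to a target index; these index names are illustrative.
    INDEX_BY_CLASS = {"prod": "main_prod", "dev": "main_dev"}

    def send_event(host, raw_event):
        """Route one event to an index based on the sending host's classification."""
        idx = INDEX_BY_CLASS.get(classification.get(host, "dev"), "main")
        payload = {"host": host, "index": idx, "event": raw_event}
        requests.post(
            HEC_URL,
            headers={"Authorization": f"Splunk {HEC_TOKEN}"},
            data=json.dumps(payload),
            verify=False,  # lab setup only; verify certificates in production
        )

    send_event("web01", "sample syslog line from web01")

In the NiFi flow, the same lookup and the per-event "index" field are produced by the record processors rather than by code; the sketch is only meant to show what the flow computes.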
After reading Brendan's newest blog entry, I was curious what kind of SLOG latency we would see for our data migration load. As a reminder, only synchronous writes go to the SLOG SSD devices. Since a migration is running, we see mostly NFS write operations: In our configuration the SSD SLOG is mirrored (HDD4 and HDD8), hence the same number of IOPS on both devices: The next picture shows the latency for our SSD SLOG device HDD4. We can see that latencies start at 79 us and are mostly under 200 us. There are some outliers, but approximately 95% are under 500 us: This matches quite well with the values Brendan blogged about (137-181 us), which include NFSv3 latency. For reference (no picture here), we see latencies of mostly about 170-500 us for NFSv4. By the way, SLOG devices are mostly one-way devices, as shown here; only if things go really bad are they read from...
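The screenshots referenced above come from the appliance's own analytics, but if you want a rough, client-side feel for the synchronous-write latency that ends up on the SLOG, a small probe like the one below works. This is not from the original post; the mount path and sample count are assumptions, and the numbers will include the full NFS round trip, much like the figures quoted above.

    import os
    import time

    # Hypothetical path on an NFS mount backed by the ZFS pool with the SLOG.
    PATH = "/mnt/migration/slog_probe.bin"
    BLOCK = b"\0" * 4096
    SAMPLES = 1000

    # O_SYNC forces each write to be acknowledged as stable storage, i.e. the
    # synchronous path that lands in the ZIL / SLOG device.
    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o644)
    latencies = []
    for _ in range(SAMPLES):
        t0 = time.perf_counter()
        os.write(fd, BLOCK)
        latencies.append((time.perf_counter() - t0) * 1e6)  # microseconds
    os.close(fd)

    latencies.sort()
    print("median %.0f us, 95th pct %.0f us" %
          (latencies[len(latencies) // 2], latencies[int(len(latencies) * 0.95)]))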