Splunk Plugin

Introduction

Splunk Enterprise is a fully featured, powerful platform for collecting, searching, monitoring and analyzing machine data. Splunk Enterprise is easy to deploy and use. It turns machine data into rapid visibility, insight and intelligence. 

Use case(s)

User Stories

  • As a pipeline developer, I would like to write relevant machine level data to Splunk for analysis

  • As a pipeline developer, I would like to filter and transform relevant fields to send to Splunk for analysis

  • As a pipeline developer, I would like to ensure that the records and relevant metadata are transformed into the format required by the Splunk HTTP Event Collector (Splunk HEC). Each record must contain event data and, optionally, metadata in the required format.

  • As a pipeline developer, I would like to send HTTP POST requests to Splunk HEC in JSON. One request is generated per batch of records.

  • As a pipeline developer, I would like to get an error if the data is not written successfully to the Splunk HTTP Event Collector (Splunk HEC)
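The last three stories can be illustrated with a minimal Python sketch. It assumes that HEC accepts a batch as JSON event objects concatenated in one request body, and that it replies with a JSON body whose `code` is 0 on success. The function names `to_hec_payload` and `check_hec_response` are hypothetical and are not part of the plugin.

```python
import json


def to_hec_payload(records, metadata=None):
    """Serialize one batch of records into a single HEC request body.

    Each record becomes one JSON object with an "event" key. Optional
    metadata (for example "host") is copied into every object. The objects
    are concatenated, which is the assumed batch layout for HEC.
    """
    metadata = metadata or {}
    events = [dict(metadata, event=record) for record in records]
    return "".join(json.dumps(e, sort_keys=True) for e in events)


def check_hec_response(status, body):
    """Raise an error unless the reply reports a successful write."""
    try:
        reply = json.loads(body)
    except ValueError:
        reply = {}
    # Assumption: a successful write returns HTTP 200 and {"code": 0}.
    if status != 200 or not isinstance(reply, dict) or reply.get("code") != 0:
        raise RuntimeError(f"Splunk HEC write failed (HTTP {status}): {body}")
```

A sink would call `to_hec_payload` once per batch, POST the result, and pass the HTTP status and response body to `check_hec_response` so that a failed write fails the pipeline run.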

Plugin Type

Batch Source
Batch Sink 
Real-time Source
Real-time Sink
Action
Post-Run Action
Aggregate
Join
Spark Model
Spark Compute

Configurables

The following fields must be configurable for the plugin. The plugin should be created as a wrapper around HTTPSink, with the additional attributes required for the Splunk HTTP Event Collector.

 

Batch Sink

User Facing Name

Type

Description

Constraints

Macro Enabled?

URL

String

Required. The URL to post data to.

 

yes

Authentication Type

Select

  • Basic Authentication

    • Example: -u "x:<hec_token>"

  • HTTP Authentication

    • Example: "Authorization: Splunk <hec_token>"

  • Query String (Splunk Cloud Only)

    • Example: ?token=<hec_token>

    Prerequisites for Query String URL

  • Query string authentication must also be enabled on a per-token basis. Ask Splunk Support to edit the file $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf on your Splunk server. Your tokens are listed by name in this file, in the form http://<token_name>

  • Within the stanza for each token for which you want to enable query string authentication, add the following setting (or change the existing setting, if applicable): allowQueryStringAuth = true
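The three authentication types listed for this field could be applied to a request roughly as follows. This is a sketch under assumptions: the function name `apply_auth` and the mode keys `"basic"`, `"http"` and `"query"` are hypothetical, and the default username `x` simply mirrors the Basic Authentication example above.

```python
import base64
from urllib.parse import urlencode


def apply_auth(url, auth_type, hec_token, username="x"):
    """Return (url, headers) carrying the HEC token for the chosen mode."""
    headers = {}
    if auth_type == "basic":
        # Equivalent to curl -u "x:<hec_token>"
        raw = f"{username}:{hec_token}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")
    elif auth_type == "http":
        headers["Authorization"] = f"Splunk {hec_token}"
    elif auth_type == "query":
        # Only works when allowQueryStringAuth = true is set for the token.
        separator = "&" if "?" in url else "?"
        url = url + separator + urlencode({"token": hec_token})
    else:
        raise ValueError(f"unknown authentication type: {auth_type}")
    return url, headers
```

Keeping the three modes behind one function means the rest of the sink only ever deals with a final URL and a header map.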

 

 

HEC Token

String

Required. The value of the token created for authentication to Splunk.

 

 

Username

String

Required for Basic Authentication. Login name for authentication to the Splunk API.

 

 

Endpoint

Select

Splunk endpoint to send data to.

  • Event

  • Raw

 

 

Batch Size

Number - with upper bound

The number of messages to batch before sending

> 0, default 1 (no batching)

yes
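The Batch Size setting amounts to grouping records before each request. A minimal sketch, with the hypothetical helper name `batches`:

```python
def batches(records, batch_size=1):
    """Yield lists of at most batch_size records; the last one may be shorter."""
    if batch_size <= 0:
        raise ValueError("batch size must be greater than 0")
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the remainder so no record is dropped at the end of the run.
        yield batch
```

With the default of 1, every record is sent in its own request, which matches the "no batching" default in the table.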

Event metadata

String

Optional event metadata, given as a JSON string, to include with each event sent to the destination

JSON
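Because the Event metadata field is constrained to JSON, it can be validated before any data is sent. The sketch below assumes the metadata keys described in the HEC event format documentation (`time`, `host`, `source`, `sourcetype`, `index`, `fields`); the plugin may accept a different set, and `parse_event_metadata` is a hypothetical name.

```python
import json

# Assumption: the metadata keys HEC recognizes alongside "event".
HEC_METADATA_KEYS = {"time", "host", "source", "sourcetype", "index", "fields"}


def parse_event_metadata(text):
    """Parse the optional metadata string into a dict, rejecting bad input."""
    if not text or not text.strip():
        return {}
    metadata = json.loads(text)  # raises ValueError on malformed JSON
    if not isinstance(metadata, dict):
        raise ValueError("event metadata must be a JSON object")
    unknown = sorted(set(metadata) - HEC_METADATA_KEYS)
    if unknown:
        raise ValueError(f"unsupported metadata keys: {unknown}")
    return metadata
```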

 

Channel Identifier Header

String

GUID for Channel Identifier

 

yes
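The Endpoint and Channel Identifier Header settings together determine the request path and one extra header. The sketch below assumes that the URL field holds only scheme, host and port, that the two endpoints map to `/services/collector/event` and `/services/collector/raw`, and that the channel is sent in the `X-Splunk-Request-Channel` header. The helper name `build_request_target` is hypothetical.

```python
import uuid

# Assumption: HEC paths for the two selectable endpoints.
ENDPOINT_PATHS = {
    "Event": "/services/collector/event",
    "Raw": "/services/collector/raw",
}


def build_request_target(base_url, endpoint, channel=None):
    """Return (url, headers) for the selected endpoint and optional channel."""
    if endpoint not in ENDPOINT_PATHS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    headers = {}
    if channel:
        uuid.UUID(channel)  # raises ValueError if the value is not a GUID
        headers["X-Splunk-Request-Channel"] = channel
    return base_url.rstrip("/") + ENDPOINT_PATHS[endpoint], headers
```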

Connect Timeout

Number - with upper bound

The time in milliseconds to wait for a connection. Set to 0 for infinite. Defaults to 60000 (1 minute).

 

 

Read Timeout

Number

The time in milliseconds to wait for a read. Set to 0 for infinite. Defaults to 60000 (1 minute).

 

 

Number of Retries

Toggle

The number of times the request should be retried if the request fails. Defaults to 3.

0,1,2,3,4,5,6,7,8,9,10

 

Max Retry Wait

Number

Maximum time in milliseconds retries can take. Set to 0 for infinite. Defaults to 60000 (1 minute).

 

 

Max Retry Jitter Wait

Number

Maximum time in milliseconds added to retries. Defaults to 100.
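One way to read the three retry settings together is as a schedule of waits between attempts. The sketch below is an interpretation, not the HTTPSink behavior: exponential backoff and the `base_ms` parameter are assumptions introduced for illustration, and `retry_delays_ms` is a hypothetical name.

```python
import random


def retry_delays_ms(retries=3, max_wait_ms=60000, max_jitter_ms=100,
                    base_ms=1000, rng=random.random):
    """Yield the wait before each retry, in milliseconds.

    retries       -> Number of Retries
    max_wait_ms   -> Max Retry Wait (0 means no overall limit)
    max_jitter_ms -> Max Retry Jitter Wait
    base_ms       -> assumed first backoff step, doubled on every retry
    """
    elapsed = 0
    for attempt in range(retries):
        delay = base_ms * (2 ** attempt) + int(rng() * max_jitter_ms)
        if max_wait_ms and elapsed + delay > max_wait_ms:
            return  # the next wait would exceed the total retry budget
        elapsed += delay
        yield delay
```

Passing `rng` explicitly keeps the schedule reproducible in tests while the default still adds random jitter in production use.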

 

 

 

Streaming Source or Batch Source using Splunk Search Jobs API

To stream data from Splunk, real-time searches must be performed. Each search runs against incoming events within a sliding time window, using the search criteria defined in Splunk. A search can be run in either normal or real-time mode.
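The difference between the two modes can be expressed as the time-range parameters sent with the search. This sketch assumes Splunk's `rt` time modifiers for real-time windows and a `search_mode` request parameter; the helper name `time_window` and its `window` argument are hypothetical.

```python
def time_window(realtime, window="5m"):
    """Return time-range parameters for a sliding window of the given size."""
    if realtime:
        # "rt-5m" to "rt" describes a window that slides with incoming events.
        return {
            "search_mode": "realtime",
            "earliest_time": f"rt-{window}",
            "latest_time": "rt",
        }
    # A normal search covers the same span once, relative to the current time.
    return {
        "search_mode": "normal",
        "earliest_time": f"-{window}",
        "latest_time": "now",
    }
```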

User Facing Name

Type

Description

Default

Data Source URL

String

Required. URL of the Splunk instance, including the API port. The default Splunk API port is 8089.

 

Authentication Type

Select

  • Basic Authentication

    • Example: -u "x:<pass>"

  • HTTP Authentication

    • Example: "Authorization: Bearer <token>"

 

Token

String

Required. The value of the token created for authentication to Splunk.

 

Username

String

Required. Login name for authentication to the Splunk API.

 

Password

Password

Required. Password for authentication to the Splunk API.

 

Execution Mode

Select

Blocking, Oneshot, or Normal (default is Normal)

Normal = Asynchronous search

Oneshot = Retrieve all results at the same time; specify the output format for this mode

Blocking = Return the search ID (SID) on completion

 

Output format

Select

Default is xml. Valid values: csv | json | xml

Specified only if Execution Mode = Oneshot

 

Search String

String

Splunk search string for retrieving results

 

Search ID

String

Optional. Search ID (SID) of an existing job whose results should be retrieved.

 

Auto Cancel

Number

If specified, the job is automatically cancelled after this many seconds of inactivity (0 means never auto-cancel). Default is 0.

 

Earliest Time

String

Specify the earliest _time for the time range of your search.

 

Latest Time

String

Specify the latest _time for the time range of your search.

 

Indexed Earliest Time

String

Specify the earliest _indextime for the time range of your search.

 

Indexed Latest Time

String

Specify the latest _indextime for the time range of your search.
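The source properties above map fairly directly onto the form parameters of a search job request. The sketch below assumes the parameter names used by the Splunk search jobs REST endpoint (`search`, `exec_mode`, `output_mode`, `auto_cancel`, `earliest_time`, `latest_time`, `index_earliest`, `index_latest`); the function names are hypothetical and the plugin itself is expected to use the Splunk Java SDK instead.

```python
from urllib.parse import urlencode

EXEC_MODES = ("blocking", "oneshot", "normal")
OUTPUT_MODES = ("csv", "json", "xml")
TIME_BOUNDS = ("earliest_time", "latest_time", "index_earliest", "index_latest")


def build_search_job_params(search, exec_mode="normal", output_mode="xml",
                            auto_cancel=0, **time_bounds):
    """Map the source properties onto form parameters for a search job."""
    if exec_mode not in EXEC_MODES:
        raise ValueError(f"unknown execution mode: {exec_mode}")
    params = {"search": search, "exec_mode": exec_mode, "auto_cancel": auto_cancel}
    if exec_mode == "oneshot":
        # The output format is only meaningful for oneshot searches.
        if output_mode not in OUTPUT_MODES:
            raise ValueError(f"unknown output format: {output_mode}")
        params["output_mode"] = output_mode
    for name in TIME_BOUNDS:
        if time_bounds.get(name):
            params[name] = time_bounds[name]
    return params


def to_form_body(params):
    """Encode the parameters as an application/x-www-form-urlencoded body."""
    return urlencode(params)
```

Unset time bounds are left out of the request so that Splunk applies its own defaults.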

 

 

References

https://docs.splunk.com/Documentation/Splunk/7.1.1/Data/FormateventsforHTTPEventCollector

https://docs.splunk.com/Documentation/Splunk/7.3.1/RESTREF/RESTsearch#search.2Fjobs.2Fexport

https://docs.splunk.com/Documentation/Splunk/latest/RESTREF/RESTsearch#search.2Fjobs

https://docs.splunk.com/Documentation/Splunk/7.3.1/Search/Aboutrealtimesearches

Design / Implementation Tips

  • Plugin will be implemented using Splunk Java SDK.

  • Authentication will be performed using Username/Password or API Token.

  • Output schema will be automatically generated from the specified Search String or Search ID.
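Automatic schema generation could work by scanning the fields present in the search results. The following is only a sketch: it assumes results arrive as a list of field-to-value mappings, treats every field as a nullable string, and uses an Avro-style layout and the record name `splunkEvent` purely for illustration.

```python
def infer_schema(rows, record_name="splunkEvent"):
    """Build a simple schema from the field names seen in result rows."""
    fields = []
    for row in rows:
        for name in row:
            if name not in fields:
                fields.append(name)  # keep first-seen order
    return {
        "type": "record",
        "name": record_name,
        # Not every event carries every field, so each one is nullable.
        "fields": [{"name": n, "type": ["string", "null"]} for n in fields],
    }
```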

Design

Approach(es)

Properties

Security

Limitation(s)

Future Work

  • Some future work – HYDRATOR-99999

  • Another future work – HYDRATOR-99999

Test Case(s)

  • Test case #1

  • Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data. 

Pipeline #1

Pipeline #2

 

 

Checklist

User stories documented 
User stories reviewed 
Design documented 
Design reviewed 
Feature merged 
Examples and guides 
Integration tests 
Documentation for feature 
Short video demonstrating the feature

Created in 2020 by Google Inc.