Introduction

Splunk Enterprise is a fully featured, powerful platform for collecting, searching, monitoring and analyzing machine data. Splunk Enterprise is easy to deploy and use. It turns machine data into rapid visibility, insight and intelligence. 

Use case(s)

User Stories

  • As a pipeline developer, I would like to write relevant machine-level data to Splunk for analysis.
  • As a pipeline developer, I would like to filter and transform relevant fields to send to Splunk for analysis.
  • As a pipeline developer, I would like to ensure that records and relevant metadata are transformed into the format required by Splunk HTTP Event Collector (Splunk HEC). Each record must contain event data and, optionally, metadata in the required format.
  • As a pipeline developer, I would like to send HTTP POST requests to Splunk HEC in JSON. One request is generated per batch of records sent to Splunk HEC.
  • As a pipeline developer, I would like to get an error if the data is not written successfully to Splunk HEC.
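
To illustrate the record format and the one-request-per-batch story, here is a minimal Python sketch of how a batch of records might be wrapped for the HEC event endpoint. The function names, the sample records, and the metadata values are hypothetical; the sketch assumes the HEC convention of an "event" key per record with optional metadata keys (such as host or sourcetype) alongside it, and of batching by placing the JSON objects one after another in a single request body.

```python
import json
from typing import Any, Dict, Iterable, Optional


def to_hec_event(record: Dict[str, Any],
                 metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Wrap one record in an HEC event envelope (hypothetical helper)."""
    envelope = dict(metadata or {})  # optional keys, e.g. host, source, sourcetype
    envelope["event"] = record       # the event data itself
    return envelope


def build_batch_body(records: Iterable[Dict[str, Any]],
                     metadata: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a batch as stacked JSON objects: one body, one POST request."""
    return "".join(
        json.dumps(to_hec_event(record, metadata), sort_keys=True)
        for record in records
    )


# Sample data is made up for illustration only.
body = build_batch_body(
    [{"level": "ERROR", "msg": "disk full"}, {"level": "INFO", "msg": "recovered"}],
    metadata={"sourcetype": "pipeline_log", "host": "worker-1"},
)
print(body)
```

The resulting string would be sent as the body of a single POST; a non-success response from HEC is what the last user story expects to surface as a pipeline error.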

Plugin Type

  •  Batch Source
  •  Batch Sink 
  •  Real-time Source
  •  Real-time Sink
  •  Action
  •  Post-Run Action
  •  Aggregate
  •  Join
  •  Spark Model
  •  Spark Compute

Configurables

The following fields must be configurable for the plugin. The plugin should be created as a wrapper on HTTPSink, with the additional attributes required for Splunk HTTP Event Collector.


Batch Sink

If your request includes raw events, you must include an X-Splunk-Request-Channel header field in the request, and it must be set to a unique channel identifier (a GUID).
curl https://http-inputs-<customer>.splunkcloud.com/services/collector/raw  -H "X-Splunk-Request-Channel: FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v

Alternatively, the channel identifier can be sent as a URL query parameter instead of the X-Splunk-Request-Channel header, as shown here (the URL is quoted so that the shell does not interpret the "?"):

curl "https://http-inputs-<customer>.splunkcloud.com/services/collector/raw?channel=FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v
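
The two ways of passing the channel identifier can be sketched in Python as follows. This is only an illustration of the curl commands above: the helper name raw_request_parts, the host splunk.example.com, and the placeholder token are assumptions, and the sketch builds the URL and headers without sending anything.

```python
import uuid
from typing import Dict, Tuple
from urllib.parse import urlencode


def raw_request_parts(base_url: str, token: str, channel: str,
                      channel_in_query: bool = False) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a POST to the HEC raw endpoint (hypothetical helper)."""
    url = base_url.rstrip("/") + "/services/collector/raw"
    headers = {"Authorization": "Splunk " + token}
    if channel_in_query:
        # Variant 2: channel identifier as a URL query parameter.
        url += "?" + urlencode({"channel": channel})
    else:
        # Variant 1: channel identifier as a request header.
        headers["X-Splunk-Request-Channel"] = channel
    return url, headers


# The channel must be a GUID; here one is generated for the example.
channel = str(uuid.uuid4()).upper()
url, headers = raw_request_parts("https://splunk.example.com:8088", "<hec_token>", channel)
print(url)
print(headers)
```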
| User Facing Name | Type | Description | Constraints | Macro Enabled? |
| --- | --- | --- | --- | --- |
| URL | String | Required. The URL to post data to. | | yes |
| Authentication Type | Select | How the HEC token is sent. Basic Authentication (example: -u "x:<hec_token>"), HTTP Authentication (example: "Authorization: Splunk <hec_token>"), or Query String, Splunk Cloud only (example: ?token=<hec_token>). Query String has the prerequisites listed below the table. | Basic Authentication, HTTP Authentication, Query String | |
| HEC Token | String | Required. Value of the token created for authentication to Splunk. | | |
| Username | String | Required for Basic Authentication. Login name for authentication to the Splunk API. | | |
| Endpoint | Select | Splunk endpoint to send data to. | Event, Raw | |
| Batch Size | Number - with upper bound | The number of messages to batch before sending. | > 0, default 1 (no batching) | yes |
| Format | Select | The format to send the message in. JSON formats the entire input record as JSON and sends it as the payload. Form converts the input message to a query string and sends it in the payload. Custom uses the Request Body field as the payload. | JSON, Form, Custom | |
| Request Body | String | Optional request body. Only required if the Custom format is specified. | | yes |
| Event metadata | String | Optional event metadata string in the JSON export for destination. | JSON | |
| Content Type | String | Used to specify the Content-Type header. | | yes |
| Channel Identifier Header | String | GUID for Channel Identifier. | | yes |
| Should Follow Redirects? | Toggle | Whether to automatically follow redirects. Defaults to true. | true, false | |
| Connect Timeout | Number - with upper bound | The time in milliseconds to wait for a connection. Set to 0 for infinite. Defaults to 60000 (1 minute). | | |
| Read Timeout | Number | The time in milliseconds to wait for a read. Set to 0 for infinite. Defaults to 60000 (1 minute). | | |
| Number of Retries | Toggle | The number of times the request should be retried if the request fails. Defaults to 3. | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | |
| Max Retry Wait | Number | Maximum time in milliseconds that retries can take. Set to 0 for infinite. Defaults to 60000 (1 minute). | | |
| Max Retry Jitter Wait | Number | Maximum time in milliseconds added to retries. Defaults to 100. | | |
| Use Proxy | Select | True or false to enable an HTTP proxy to connect to the system. Defaults to false. | true, false | |
| Proxy URI | String | Proxy URI. | | |
| Proxy Username | String | Username. | | |
| Proxy Password | String | Password. | | |

Prerequisites for Query String authentication

  • You must also enable query string authentication on a per-token basis. Ask Splunk Support to edit the file at $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf on your Splunk server. Your tokens are listed by name in this file, in the form http://<token_name>.
  • Within the stanza for each token for which you want to enable query string authentication, add the following setting (or change the existing setting, if applicable): allowQueryStringAuth = true
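
The three Authentication Type options could map onto a request roughly as in the following Python sketch. The function name apply_auth and the short option keys ("basic", "http", "query") are hypothetical, and the default username "x" simply mirrors the -u "x:<hec_token>" example; the plugin's actual implementation may differ.

```python
import base64
from typing import Dict, Tuple
from urllib.parse import urlencode


def apply_auth(url: str, auth_type: str, token: str,
               username: str = "x") -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) carrying the HEC token for the chosen auth type."""
    headers: Dict[str, str] = {}
    if auth_type == "basic":
        # Basic Authentication: the token is used as the password.
        credentials = f"{username}:{token}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(credentials).decode("ascii")
    elif auth_type == "http":
        # HTTP Authentication: "Authorization: Splunk <hec_token>".
        headers["Authorization"] = "Splunk " + token
    elif auth_type == "query":
        # Query String: requires allowQueryStringAuth = true for the token.
        separator = "&" if "?" in url else "?"
        url = url + separator + urlencode({"token": token})
    else:
        raise ValueError(f"unsupported authentication type: {auth_type}")
    return url, headers


print(apply_auth("https://splunk.example.com:8088/services/collector", "http", "<hec_token>"))
```

Keeping the token out of the URL (the first two options) avoids it appearing in access logs, which is one reason query string authentication has to be enabled explicitly per token.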



Streaming Source or Batch Source using Splunk Search Jobs API

To stream data from Splunk, real-time searches need to be performed. A real-time search runs against incoming events within a sliding time window, using the search criteria defined in Splunk. A search can be run in either normal or real-time mode.

| User Facing Name | Type | Description | Default |
| --- | --- | --- | --- |
| Data Source URL | String | Required. URL pointing to the defined Splunk port. The default API port for Splunk is 8089. | |
| Authentication Type | Select | Basic Authentication (example: -u "x:<pass>") or HTTP Authentication (example: "Authorization: Bearer <token>"). | |
| Token | String | Required. Value of the token created for authentication to Splunk. | |
| Username | String | Required. Login name for authentication to the Splunk API. | |
| Password | Password | Required. Password for authentication to the Splunk API. | |
| Execution Mode | Select | Blocking, Oneshot, or Normal. Normal = asynchronous search. Oneshot = retrieve all results at the same time; specify the output format for this mode. Blocking = return the search ID (SID) on completion. | Normal |
| Output format | Select | Valid values: csv, json, xml. Specified only if Execution Mode = Oneshot. | xml |
| Search String | String | Splunk search string for retrieving results. | |
| Search ID | String | Optional. Search ID for retrieving job results. | |
| Auto Cancel | Number | If specified, the job automatically cancels after this many seconds of inactivity (0 means never auto-cancel). | 0 |
| Earliest Time | String | The earliest _time for the time range of your search. | |
| Latest Time | String | The latest _time for the time range of your search. | |
| Indexed Earliest Time | String | The earliest _indextime for the time range of your search. | |
| Indexed Latest Time | String | The latest _indextime for the time range of your search. | |
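
As a rough illustration of how these fields might translate into a search job request, the Python sketch below maps them onto form parameters for a POST to the search/jobs endpoint. The function name is hypothetical, and the parameter names (search, exec_mode, output_mode, auto_cancel, earliest_time, latest_time, index_earliest, index_latest) are assumed from the Splunk REST API reference linked in References; they should be verified against the Splunk version in use. The sketch only builds the parameters and does not contact a server.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def build_search_params(search: str,
                        exec_mode: str = "normal",
                        output_mode: Optional[str] = None,
                        auto_cancel: int = 0,
                        earliest_time: Optional[str] = None,
                        latest_time: Optional[str] = None,
                        index_earliest: Optional[str] = None,
                        index_latest: Optional[str] = None) -> Dict[str, str]:
    """Map the plugin fields to search job form parameters (hypothetical helper)."""
    if exec_mode not in ("normal", "blocking", "oneshot"):
        raise ValueError(f"unsupported execution mode: {exec_mode}")
    query = search.strip()
    # Assumption: plain queries need a leading "search" command.
    if not query.startswith(("search ", "|")):
        query = "search " + query
    params = {"search": query, "exec_mode": exec_mode, "auto_cancel": str(auto_cancel)}
    if exec_mode == "oneshot":
        # Output format only applies to oneshot mode; xml is the default.
        params["output_mode"] = output_mode or "xml"
    optional = {
        "earliest_time": earliest_time,
        "latest_time": latest_time,
        "index_earliest": index_earliest,
        "index_latest": index_latest,
    }
    params.update({key: value for key, value in optional.items() if value is not None})
    return params


# Example: a sliding five-minute real-time window, expressed with "rt" time modifiers.
params = build_search_params("index=main error", earliest_time="rt-5m", latest_time="rt")
print(urlencode(params))
```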


References

https://docs.splunk.com/Documentation/Splunk/7.1.1/Data/FormateventsforHTTPEventCollector

https://docs.splunk.com/Documentation/Splunk/7.3.1/RESTREF/RESTsearch#search.2Fjobs.2Fexport

https://docs.splunk.com/Documentation/Splunk/latest/RESTREF/RESTsearch#search.2Fjobs

https://docs.splunk.com/Documentation/Splunk/7.3.1/Search/Aboutrealtimesearches

Design / Implementation Tips

  • The plugin will be implemented using the Splunk Java SDK.
  • Authentication will be performed using Username/Password or an API Token.
  • The output schema will be automatically generated from the specified Search String or Search ID.

Design

Approach(s)

Properties

Security

Limitation(s)

Future Work

  • Some future work – HYDRATOR-99999
  • Another future work – HYDRATOR-99999

Test Case(s)

  • Test case #1
  • Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data. 

Pipeline #1

Pipeline #2




Checklist

  •  User stories documented 
  •  User stories reviewed 
  •  Design documented 
  •  Design reviewed 
  •  Feature merged 
  •  Examples and guides 
  •  Integration tests 
  •  Documentation for feature 
  •  Short video demonstrating the feature