Introduction

Splunk Enterprise is a fully featured, powerful platform for collecting, searching, monitoring, and analyzing machine data. It is easy to deploy and use, and it turns machine data into rapid visibility, insight, and intelligence.

Use case(s)

Confluent Cloud is a streaming data service that delivers Apache Kafka as a managed service.

User Stories

  • As a pipeline developer, I would like to write relevant machine-level data to Splunk for analysis.
  • As a pipeline developer, I would like to filter and transform relevant fields to send to Splunk for analysis.
  • As a pipeline developer, I would like the records and relevant metadata to be transformed into the format required by the Splunk HTTP Event Collector (Splunk HEC). Each record must contain event data and optional metadata in the required format (see the sketch after this list).
  • As a pipeline developer, I would like to send HTTP POST requests to Splunk HEC in JSON, with one request generated per batch of records.
  • As a pipeline developer, I would like to get an error if the data is not written successfully to Splunk HEC.
  • As a pipeline developer, I would like to stream data in real time from Confluent Cloud based on a user-specified schema.
  • As a pipeline developer, I would like to stream data in real time to Confluent Cloud and perform serialization during event streaming.
  • As a pipeline developer, I would like to capture records that did not get delivered downstream to Confluent Cloud for analysis.
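The HEC event format these stories refer to is described in the Splunk documentation linked under References. As a rough sketch only (the metadata fields "host" and "sourcetype" and the helper names are illustrative assumptions, not the plugin's final API), a batch of records could be serialized into a single HEC request body like this:

    import java.util.List;
    import java.util.stream.Collectors;

    // Sketch only: builds one Splunk HEC request body for a batch of records.
    // The metadata fields and helper names are illustrative assumptions.
    public class HecBatchFormatter {

      // Wraps one record's JSON payload in an HEC envelope with optional metadata.
      // (Escaping of the metadata values is omitted for brevity.)
      static String toHecEvent(String recordJson, String host, String sourcetype) {
        return "{\"host\":\"" + host + "\","
            + "\"sourcetype\":\"" + sourcetype + "\","
            + "\"event\":" + recordJson + "}";
      }

      // One HTTP request body per batch: HEC accepts multiple JSON event objects
      // concatenated in a single POST body, which is what allows one request per batch.
      static String toRequestBody(List<String> recordJsons) {
        return recordJsons.stream()
            .map(r -> toHecEvent(r, "pipeline-host", "_json"))
            .collect(Collectors.joining("\n"));
      }
    }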

Plugin Type

  •  Batch Source
  •  Batch Sink 
  •  Real-time Source
  •  Real-time Sink
  •  Action
  •  Post-Run Action
  •  Aggregate
  •  Join
  •  Spark Model
  •  Spark Compute

Configurables

The following fields must be configurable for the plugin. The plugin should be created as a wrapper around HTTPSink, with the additional attributes required by the Splunk HTTP Event Collector.

Each field is listed below with its user-facing name, type, description, constraints, and whether it is macro-enabled.

  • URL (String, macro-enabled): Required. The URL to post data to.
  • HEC Token (String): Required. Specify the value of the token created for authentication to Splunk.
  • Authentication Type (Select): How the HEC token is passed to Splunk.
    • Basic Authentication
      • Example: -u "x:<hec_token>"
    • HTTP Authentication
      • Example: "Authorization: Splunk <hec_token>"
    • Query String (Splunk Cloud only)
      • Example: ?token=<hec_token>
      • Prerequisites for the Query String option:
        • You must also enable query string authentication on a per-token basis. On your Splunk server, request Splunk Support to edit the file at $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf. Your tokens are listed by name in this file, in the form http://<token_name>.
        • Within the stanza for each token you want to enable query string authentication for, add the following setting (or change the existing setting, if applicable): allowQueryStringAuth = true
  • Batch Size (Number with upper bound, macro-enabled): The number of messages to batch before sending. Constraints: > 0; default 1 (no batching).
  • Format (Select): The format to send the message in. JSON will format the entire input record as JSON and send it as the payload. Form will convert the input message to a query string and send it in the payload. Custom will use the Request Body field. Constraints: JSON, Form, Custom.
  • Request Body (String, macro-enabled): Optional request body. Only required if the Custom format is specified.
  • Content Type (String, macro-enabled): Used to specify the Content-Type header.
  • Channel Identifier Header (KeyValue, macro-enabled): If your request includes raw events, you must include an X-Splunk-Request-Channel header field in the request, set to a unique channel identifier (a GUID). For example:

    curl https://http-inputs-<customer>.splunkcloud.com/services/collector/raw -H "X-Splunk-Request-Channel: FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v

    Alternatively, the X-Splunk-Request-Channel header field can be sent as a URL query parameter, as shown here:

    curl https://http-inputs-<customer>.splunkcloud.com/services/collector/raw?channel=FE0ECFAD-13D5-401B-847D-77833BD77131 -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v
  • Should Follow Redirects? (Toggle): Whether to automatically follow redirects. Defaults to true. Constraints: true, false.
  • Number of Retries (Select): The number of times the request should be retried if the request fails. Defaults to 3. Constraints: 0 to 10.
  • Connect Timeout (String): The time in milliseconds to wait for a connection. Set to 0 for infinite. Defaults to 60000 (1 minute).
  • Read Timeout (String): The time in milliseconds to wait for a read. Set to 0 for infinite. Defaults to 60000 (1 minute).
  • Use Proxy (Toggle): True or false to enable an HTTP proxy for the connection. Defaults to false. Constraints: true, false.
  • Proxy URI (String): Proxy URI.
  • Proxy Username (String): Username.
  • Proxy Password (String): Password.
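To make the mapping from these configurables to the outgoing request concrete, here is a minimal sketch (plain Java with HttpURLConnection; the method signature and error handling are assumptions, not the HTTPSink wrapper's actual implementation) showing how the URL, HEC token, channel identifier, timeouts, and retry count could be applied to one batched POST:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    // Sketch only: posts one batch of HEC events using the configurables described above.
    // URL, token, channel, timeouts, and retry count would come from the plugin config.
    public class HecPoster {

      static void postBatch(String url, String hecToken, String channelId,
                            String body, int connectTimeoutMs, int readTimeoutMs,
                            int numRetries) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= numRetries; attempt++) {
          try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("POST");
            conn.setConnectTimeout(connectTimeoutMs);
            conn.setReadTimeout(readTimeoutMs);
            // "HTTP Authentication" option from the table: Authorization: Splunk <hec_token>
            conn.setRequestProperty("Authorization", "Splunk " + hecToken);
            // Channel Identifier Header, required for raw events
            conn.setRequestProperty("X-Splunk-Request-Channel", channelId);
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
              out.write(body.getBytes(StandardCharsets.UTF_8));
            }
            int code = conn.getResponseCode();
            if (code >= 200 && code < 300) {
              return; // delivered
            }
            last = new RuntimeException("Splunk HEC returned HTTP " + code);
          } catch (Exception e) {
            last = e; // retry up to numRetries times, per the Number of Retries setting
          }
        }
        throw last; // surfaces as a pipeline error, matching the error-handling user story
      }
    }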

References

https://docs.splunk.com/Documentation/Splunk/7.1.1/Data/FormateventsforHTTPEventCollector

Design / Implementation Tips

Tip #1

Prerequisites

  • Kafka Broker: Confluent Platform 3.3.0 or above, or Kafka 0.11.0 or above
  • Connect: Confluent Platform 3.3.0 or above, or Kafka 0.11.0 or above
  • Java 1.8

Properties

Real time source

Each property is listed below with its description and whether it is mandatory or optional.

  • referenceName: Uniquely identifies the source. (Mandatory)
  • Kafka cluster credential API key: API key used to connect to the Confluent cluster. (Mandatory)
  • Kafka cluster credential secret: Secret used to connect to the Confluent cluster. (Mandatory)
  • Kafka ZooKeeper: The connect string location of ZooKeeper; either this or the list of brokers is required. (Required if brokers are not specified)
  • Kafka brokers: Comma-separated list of Kafka brokers; either this or the ZooKeeper quorum is required. (Required if ZooKeeper is not specified)
  • Kafka partition: Number of partitions. (Mandatory)
  • Kafka offset: The initial offset for the partition.
  • Kafka topic: List of topics to listen to for streaming. (Mandatory)
  • Schema registry URL: URL endpoint for the schema registry on Confluent Cloud, or a self-hosted schema registry URL. (Optional)
  • Schema registry API key: API key for the schema registry. (Optional)
  • Schema registry secret: Secret for the schema registry. (Optional)
  • Format: The format of the Kafka event. Any format supported by CDAP is supported. By default, the key and value are output as bytes. (Optional)
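For illustration, the sketch below shows how these source properties would typically map onto standard Kafka consumer and Confluent schema-registry client settings (the configuration keys are standard Confluent Cloud client settings; the class, method, and parameter names are assumptions for this sketch, not the plugin's actual property names):

    import java.util.Properties;

    // Sketch only: translates the source properties above into Kafka consumer
    // settings for Confluent Cloud. The placeholder names are assumptions.
    public class ConfluentSourceConfig {

      static Properties consumerProperties(String brokers, String apiKey, String apiSecret,
                                           String schemaRegistryUrl, String srKey, String srSecret) {
        Properties props = new Properties();
        props.put("bootstrap.servers", brokers);          // Kafka brokers
        props.put("security.protocol", "SASL_SSL");       // Confluent Cloud uses SASL_SSL
        props.put("sasl.mechanism", "PLAIN");
        // Kafka cluster credential API key / secret
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"" + apiKey + "\" password=\"" + apiSecret + "\";");
        // Schema registry URL / API key / secret (optional, per the table above)
        props.put("schema.registry.url", schemaRegistryUrl);
        props.put("basic.auth.credentials.source", "USER_INFO");
        props.put("basic.auth.user.info", srKey + ":" + srSecret);
        // Format: default output is key and value as bytes
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        return props;
      }
    }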


Real time sink

Each property is listed below with its type, description, and whether it is mandatory.

  • Reference Name (String): Uniquely identifies the sink. (Mandatory)
  • Kafka cluster credential API key (String): API key used to connect to the Confluent cluster. (Mandatory)
  • Kafka cluster credential secret (String): Secret used to connect to the Confluent cluster. (Mandatory)
  • Kafka brokers (String): Comma-separated list of Kafka brokers. (Mandatory)
  • Async (Select): Specifies whether writing the events to the broker is asynchronous or synchronous. (Mandatory)
  • partitionfield (Int or Long): Specifies the input field used to determine the partition id. (Mandatory)
  • key (String): Specifies the input field to use as the key for the event published into Kafka. (Mandatory)
  • Kafka topic (String): List of topics to which the data should be published. (Mandatory)
  • format (String): Specifies the format of the event published to Confluent Cloud. (Mandatory)
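A minimal sketch of how the Async, key, partitionfield, and Kafka topic properties could translate into a Kafka producer write is shown below (the class and method names are assumptions; the same Confluent Cloud security settings as in the source sketch would be needed when building the producer):

    import java.util.concurrent.Future;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    // Sketch only: illustrates the Async property above. Field names are assumptions.
    public class ConfluentSinkWriter {

      static void write(KafkaProducer<byte[], byte[]> producer, String topic,
                        byte[] key, byte[] value, Integer partition, boolean async) throws Exception {
        ProducerRecord<byte[], byte[]> record = new ProducerRecord<>(topic, partition, key, value);
        Future<RecordMetadata> future = producer.send(record); // always asynchronous under the hood
        if (!async) {
          future.get(); // synchronous mode: block until the broker acknowledges the write
        }
      }
    }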


References

https://docs.confluent.io/current/connect/kafka-connect-bigquery/index.html

https://docs.confluent.io/current/cloud/connectors/cc-gcs-sink-connector.html#cc-gcs-connect-sink

https://docs.confluent.io/current/quickstart/cloud-quickstart/index.html

https://docs.cask.co/cdap/4.2.0/en/developer-manual/pipelines/plugins/sinks/kafkaproducer-realtimesink.html

https://docs.cask.co/cdap/4.2.0/en/developer-manual/pipelines/plugins/sources/kafka-realtimesource.html

https://docs.confluent.io/current/cloud/limits.html#cloud-limits



Design / Implementation Tips

Design

Approach(s)

Properties

Security

Limitation(s)

Future Work

  • Some future work – HYDRATOR-99999
  • Another future work – HYDRATOR-99999

Test Case(s)

  • Test case #1
  • Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data.

Pipeline #1

Pipeline #2




Checklist

  •  User stories documented
  •  User stories reviewed
  •  Design documented
  •  Design reviewed
  •  Feature merged
  •  Examples and guides
  •  Integration tests
  •  Documentation for feature
  •  Short video demonstrating the feature