Introduction

Confluent Cloud is a resilient, scalable streaming data service based on Apache Kafka®, delivered as a fully managed service. Confluent Cloud has a web interface and a local command-line interface. You can manage cluster resources, settings, and billing with the web interface.

Use case(s)

User Storie(s)

As a pipeline developer, I would like to stream data in real time from Confluent Cloud based on the schema specified by the user.
As a pipeline developer, I would like to stream data in real time to Confluent Cloud and perform serialization during event streaming.
As a pipeline developer, I would like to capture records that did not get delivered downstream to Confluent Cloud for analysis.

Plugin Type

Batch Source
Batch Sink
Real-time Source
Real-time Sink
Action
Post-Run Action
Aggregate
Join
Spark Model
Spark Compute

Configurables

The following fields must be configurable for the plugin. The plugin should be created as a wrapper on HTTPSink with the additional attributes required for the Splunk HTTP Event Collector.

User Facing Name	Type	Description	Constraints	Macro Enabled?
URL	String	Required. The URL to post data to.		yes
HEC Token	String	Required. Specify the value of the token created for authentication to Splunk.
Authentication Type	Select	One of Basic Authentication, HTTP Authentication, or Query String (Splunk Cloud only). Examples follow the table.
Batch Size	Number - with upper bound	The number of messages to batch before sending.	> 0, default 1 (no batching)	yes
Format	Number with upper limit	The format to send the message in. JSON will format the entire input record as JSON and send it as a payload. Form will convert the input message to a query string and send it in the payload. Custom will use the Request Body field as the payload.	JSON, Form, Custom
Request Body	String	Optional request body. Only required if Custom format is specified.		yes
Content Type	String	Used to specify the Content-Type header.		yes
Channel Identifier Header	KeyValue	The X-Splunk-Request-Channel header for raw events. Details follow the table.		yes
Should Follow Redirects?	Toggle	Whether to automatically follow redirects. Defaults to true.	true, false
Number of Retries	Toggle	The number of times the request should be retried if the request fails. Defaults to 3.	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Connect Timeout	String	The time in milliseconds to wait for a connection. Set to 0 for infinite. Defaults to 60000 (1 minute).
Read Timeout	String	The time in milliseconds to wait for a read. Set to 0 for infinite. Defaults to 60000 (1 minute).
Use Proxy	Toggle	True or false to enable an HTTP proxy to connect to the system. Defaults to false.	true, false
Proxy URI	String	Proxy URI
Proxy Username	String	Username
Proxy Password	String	Password

Authentication Type options

Basic Authentication

Example: -u "x:<hec_token>"

HTTP Authentication

Example: "Authorization: Splunk <hec_token>"

Query String (Splunk Cloud only)

Example: ?token=<hec_token>

Prerequisites for Query String URL

You must also enable query string authentication on a per-token basis. On your Splunk server, request Splunk Support to edit the file at $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf. Your tokens are listed by name in this file, in the form http://<token_name>

Within the stanza for each token for which you want to enable query string authentication, add the following setting (or change the existing setting, if applicable):

allowQueryStringAuth = true

Channel Identifier Header

If your request includes raw events, you must include an X-Splunk-Request-Channel header field in the request, and it must be set to a unique channel identifier (a GUID).

curl https://http-inputs-<customer>.splunkcloud.com/services/collector/raw  -H "X-Splunk-Request-Channel: FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v

Alternatively, the X-Splunk-Request-Channel header field can be sent as a URL query parameter, as shown here:

curl "https://http-inputs-<customer>.splunkcloud.com/services/collector/raw?channel=FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<raw data string>' -v
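The three authentication types listed for the HEC token differ only in where the token is placed in the request. The following is a minimal sketch of that mapping, not the plugin's implementation: the function name hec_auth and the type labels "basic", "http", and "query" are hypothetical names chosen for illustration, and only the Python standard library is used.

```python
import base64
from urllib.parse import urlencode


def hec_auth(auth_type, hec_token, url):
    """Return (url, headers) with the token placed per the chosen type.

    The auth_type labels are hypothetical; the token placement follows
    the examples given for each authentication type.
    """
    headers = {}
    if auth_type == "basic":
        # Same as curl -u "x:<hec_token>": user "x", password is the token.
        raw = f"x:{hec_token}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")
    elif auth_type == "http":
        # "Authorization: Splunk <hec_token>"
        headers["Authorization"] = f"Splunk {hec_token}"
    elif auth_type == "query":
        # ?token=<hec_token>, appended to any existing query string.
        separator = "&" if "?" in url else "?"
        url = f"{url}{separator}{urlencode({'token': hec_token})}"
    else:
        raise ValueError(f"unknown authentication type: {auth_type}")
    return url, headers
```

Query string authentication only works once allowQueryStringAuth is enabled for the token, as described in the prerequisites.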

References

https://docs.splunk.com/Documentation/Splunk/7.1.1/Data/FormateventsforHTTPEventCollector

Design / Implementation Tips

Tip #1

Prerequisites

Kafka Broker: Confluent Platform 3.3.0 or above, or Kafka 0.11.0 or above
Connect: Confluent Platform 3.3.0 or above, or Kafka 0.11.0 or above
Java 1.8

Properties

Real time source

Property	Description	Mandatory
referenceName	Uniquely identify the source	Yes
Kafka cluster credential API key	API key in order to connect to the Confluent cluster	Yes
Kafka cluster credential secret	Secret in order to connect to the Confluent cluster	Yes
Kafka ZooKeeper	The ZooKeeper connect string. Either this or the list of brokers is required	Required if brokers not specified
Kafka brokers	Comma-separated list of Kafka brokers. Either this or the ZooKeeper quorum is required	Required if ZooKeeper not specified
Kafka partition	Number of partitions	Yes
Kafka offset	The initial offset for the partition
Kafka topic	List of topics to listen to for streaming	Yes
Schema registry URL	URL endpoint for the schema registry on Confluent Cloud or self hosted schema registry URL	No
Schema registry API key	API key	No
Schema registry secret	Secret	No
Format	Specifies the format of the Kafka event. Any format supported by CDAP can be used. By default, the output is the key and value as bytes	No
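The credential and schema registry properties above would typically be translated into Kafka client settings. The sketch below shows one plausible translation, assuming the cluster is reached over SASL_SSL with the PLAIN mechanism and that the schema registry uses basic authentication. The function name and its parameters are hypothetical; the property keys are the standard Apache Kafka client and Confluent serializer configuration names, but how the plugin actually wires them is not specified on this page.

```python
def confluent_client_properties(brokers, api_key, api_secret,
                                schema_registry_url=None,
                                sr_api_key=None, sr_api_secret=None):
    """Build a Kafka client property map from the plugin properties.

    Assumption: SASL_SSL + PLAIN with the API key as username and the
    secret as password. This mapping is illustrative, not confirmed.
    """
    props = {
        "bootstrap.servers": brokers,
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": (
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
            f'username="{api_key}" password="{api_secret}";'
        ),
    }
    # Schema registry settings are optional, matching the table above.
    if schema_registry_url:
        props["schema.registry.url"] = schema_registry_url
        if sr_api_key and sr_api_secret:
            props["basic.auth.credentials.source"] = "USER_INFO"
            props["basic.auth.user.info"] = f"{sr_api_key}:{sr_api_secret}"
    return props
```

Keeping the schema registry keys out of the map when no URL is given mirrors the table, where all three schema registry properties are optional.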

Real time sink

Property	Description	Type	Mandatory
Reference Name	Uniquely identify the sink	String	Yes
Kafka cluster credential API key	API key in order to connect to the Confluent cluster	String	Yes
Kafka cluster credential secret	Secret in order to connect to the Confluent cluster	String	Yes
Kafka brokers	Comma-separated list of Kafka brokers	String	Yes
Async	Specifies whether writing the events to broker is Asynchronous or Synchronous	Select	Yes
partitionfield	Specifies the input fields that need to be used to determine the partition id	Int or Long	Yes
key	Specifies the input field that should be used as the key for the event published into Kafka.	String	Yes
Kafka topic	List of topics to which the data should be published	String	Yes
format	Specifies the format of the event published to Confluent Cloud	String	Yes
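The partitionfield property takes an Int or Long input field and uses it to determine the partition id. This page does not say how the value is mapped, so the sketch below assumes the simplest scheme, the field value modulo the number of partitions. Both the function name and the modulo mapping are assumptions made for illustration.

```python
def partition_for(record, partition_field, num_partitions):
    """Pick a partition id from an integer field of the record.

    Assumption: partition = field value modulo partition count.
    The actual mapping used by the plugin is not specified here.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    value = record[partition_field]
    # The property is typed Int or Long; reject anything else (bool included).
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"field {partition_field!r} must be an integer")
    # Python's % returns a non-negative result for a positive modulus,
    # so negative field values still map to a valid partition id.
    return value % num_partitions
```

With this scheme, records that share the same field value always land on the same partition, which preserves their relative order within that partition.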

References

https://docs.confluent.io/current/connect/kafka-connect-bigquery/index.html

https://docs.confluent.io/current/cloud/connectors/cc-gcs-sink-connector.html#cc-gcs-connect-sink

https://docs.confluent.io/current/quickstart/cloud-quickstart/index.html

https://docs.cask.co/cdap/4.2.0/en/developer-manual/pipelines/plugins/sinks/kafkaproducer-realtimesink.html

https://docs.cask.co/cdap/4.2.0/en/developer-manual/pipelines/plugins/sources/kafka-realtimesource.html

https://docs.confluent.io/current/cloud/limits.html#cloud-limits

Design / Implementation Tips

Tip #1: https://docs.confluent.io/current/quickstart/cloud-quickstart/index.html
Tip #2

Design

Approach(s)

Properties

Security

Limitation(s)

Future Work

Some future work – HYDRATOR-99999
Another future work – HYDRATOR-99999

Test Case(s)

Test case #1
Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data.

Pipeline #1

Pipeline #2

Checklist

User stories documented
User stories reviewed
Design documented
Design reviewed
Feature merged
Examples and guides
Integration tests
Documentation for feature
Short video demonstrating the feature
