Confluent Cloud Plugin

Introduction

Confluent Cloud is a streaming data service that delivers Apache Kafka as a fully managed service.

User Story(s)

  • As a pipeline developer, I would like to stream data in real time from Confluent Cloud based on a user-specified schema

  • As a pipeline developer, I would like to stream data in real time to Confluent Cloud and perform serialization during event streaming

  • As a pipeline developer, I would like to capture records that were not delivered downstream to Confluent Cloud, for later analysis
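The third story (capturing undelivered records) can be sketched with a delivery-callback pattern. Note that `DeadLetterCollector`, `produce_all`, and `flaky_send` below are illustrative assumptions, not part of the plugin's or Confluent's actual API:

```python
# Sketch: collect records whose delivery to Confluent Cloud failed,
# so they can be analyzed later (e.g. written to an error dataset).
# DeadLetterCollector is a hypothetical helper, not a CDAP/Confluent API.

class DeadLetterCollector:
    def __init__(self):
        self.failed = []

    def on_delivery(self, err, record):
        """Delivery callback: err is None on success, an error message otherwise."""
        if err is not None:
            self.failed.append({"record": record, "error": err})

def produce_all(records, send, collector):
    """Try to send each record; report every outcome to the collector."""
    for record in records:
        try:
            send(record)
            collector.on_delivery(None, record)
        except Exception as exc:
            collector.on_delivery(str(exc), record)

collector = DeadLetterCollector()

def flaky_send(record):  # stand-in for a real Kafka producer send
    if record.get("bad"):
        raise RuntimeError("broker rejected record")

produce_all([{"id": 1}, {"id": 2, "bad": True}], flaky_send, collector)
# collector.failed now holds the rejected record together with its error
```

A real sink would wire the same callback into the Kafka producer's acknowledgment path instead of a try/except around a local function.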

Plugin Type

  • Batch Source
  • Batch Sink
  • Real-time Source
  • Real-time Sink
  • Action
  • Post-Run Action
  • Aggregate
  • Join
  • Spark Model
  • Spark Compute

Prerequisites

  • Kafka Broker: Confluent Platform 3.3.0 or above, or Kafka 0.11.0 or above

  • Connect: Confluent Platform 3.3.0 or above, or Kafka 0.11.0 or above

  • Java 1.8

Properties

Real-time Source

| Property | Description | Mandatory or Optional |
| --- | --- | --- |
| referenceName | Uniquely identifies the source | Yes |
| Kafka cluster credential API key | API key used to connect to the Confluent cluster | Yes |
| Kafka cluster credential secret | Secret used to connect to the Confluent cluster | Yes |
| Kafka ZooKeeper | The ZooKeeper connect string. Either this or the list of brokers is required | Required if brokers are not specified |
| Kafka brokers | Comma-separated list of Kafka brokers. Either this or the ZooKeeper quorum is required | Required if ZooKeeper is not specified |
| Kafka partition | Number of partitions | Yes |
| Kafka offset | The initial offset for the partition |  |
| Kafka topic | List of topics to listen to for streaming | Yes |
| Schema registry URL | URL endpoint of the schema registry, either on Confluent Cloud or self-hosted | No |
| Schema registry API key | API key used to connect to the schema registry | No |
| Schema registry secret | Secret used to connect to the schema registry | No |
| Format | Format of the Kafka event. Any format supported by CDAP is supported. By default, the key and value are output as bytes | No |

 
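As a concrete illustration, the source properties above typically map onto standard Confluent client settings (librdkafka-style keys such as `sasl.username` and `basic.auth.user.info`). The helper function and the placeholder endpoint values below are illustrative, not part of the plugin:

```python
# Sketch: map the source properties above onto a librdkafka-style
# consumer configuration for Confluent Cloud. The helper is illustrative;
# the configuration keys are standard Confluent client settings.

def build_source_config(brokers, api_key, api_secret,
                        schema_registry_url=None,
                        sr_api_key=None, sr_secret=None):
    config = {
        "bootstrap.servers": brokers,        # "Kafka brokers" property
        "security.protocol": "SASL_SSL",     # Confluent Cloud requires TLS + SASL
        "sasl.mechanisms": "PLAIN",
        "sasl.username": api_key,            # "Kafka cluster credential API key"
        "sasl.password": api_secret,         # "Kafka cluster credential secret"
    }
    if schema_registry_url:                  # optional, per the table above
        config["schema.registry.url"] = schema_registry_url
        if sr_api_key and sr_secret:
            config["basic.auth.credentials.source"] = "USER_INFO"
            config["basic.auth.user.info"] = f"{sr_api_key}:{sr_secret}"
    return config

# Placeholder endpoints for illustration only.
cfg = build_source_config(
    "pkc-12345.us-west-2.aws.confluent.cloud:9092",
    "MY_API_KEY", "MY_API_SECRET",
    schema_registry_url="https://psrc-12345.us-west-2.aws.confluent.cloud",
    sr_api_key="SR_KEY", sr_secret="SR_SECRET")
```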

Real-time Sink

| Property | Description | Type | Mandatory |
| --- | --- | --- | --- |
| Reference Name | Uniquely identifies the sink | String | Yes |
| Kafka cluster credential API key | API key used to connect to the Confluent cluster | String | Yes |
| Kafka cluster credential secret | Secret used to connect to the Confluent cluster | String | Yes |
| Kafka brokers | Comma-separated list of Kafka brokers | String | Yes |
| Async | Whether events are written to the broker asynchronously or synchronously | Select | Yes |
| partitionfield | The input field used to determine the partition ID | Int or Long | Yes |
| key | The input field used as the key for the event published into Kafka | String | Yes |
| Kafka topic | List of topics to which the data should be published | String | Yes |
| format | Format of the event published to Confluent Cloud | String | Yes |

 
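To make the sink's `key`, `partitionfield`, and `Async` properties concrete, here is a minimal sketch of how they could influence publishing. All function and field names are illustrative assumptions, not the plugin's actual implementation:

```python
# Sketch: how the sink's "key", "partitionfield", and "Async" properties
# could influence publishing. All names are illustrative.

def partition_for(record, partition_field, num_partitions):
    """Derive a partition id from the configured int/long input field."""
    return record[partition_field] % num_partitions

def publish(records, key_field, partition_field, num_partitions, async_mode):
    """Return (partition, key, record) tuples; flush once at the end when
    asynchronous, or after every record when synchronous."""
    out, flushes = [], 0
    for record in records:
        out.append((partition_for(record, partition_field, num_partitions),
                    record[key_field], record))
        if not async_mode:
            flushes += 1      # synchronous: wait for each acknowledgment
    if async_mode:
        flushes = 1           # asynchronous: single flush at the end
    return out, flushes

events = [{"user": "a", "ts": 7}, {"user": "b", "ts": 12}]
batch, flushes = publish(events, key_field="user",
                         partition_field="ts", num_partitions=4,
                         async_mode=True)
```

Synchronous mode trades throughput for a delivery guarantee per record; asynchronous mode batches acknowledgments, which is why undelivered-record capture (see the user stories above) matters more there.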

References

https://docs.confluent.io/current/connect/kafka-connect-bigquery/index.html

https://docs.confluent.io/current/cloud/connectors/cc-gcs-sink-connector.html#cc-gcs-connect-sink

https://docs.confluent.io/current/quickstart/cloud-quickstart/index.html

https://docs.cask.co/cdap/4.2.0/en/developer-manual/pipelines/plugins/sinks/kafkaproducer-realtimesink.html

https://docs.cask.co/cdap/4.2.0/en/developer-manual/pipelines/plugins/sources/kafka-realtimesource.html

https://docs.confluent.io/current/cloud/limits.html#cloud-limits

 

 

Design / Implementation Tips

Design

Approach(s)

Properties

Security

Limitation(s)

Future Work

  • Some future work – HYDRATOR-99999

  • Another future work – HYDRATOR-99999

Test Case(s)

  • Test case #1

  • Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data. 

Pipeline #1

Pipeline #2

 

 

Checklist

  • User stories documented
  • User stories reviewed
  • Design documented
  • Design reviewed
  • Feature merged
  • Examples and guides
  • Integration tests
  • Documentation for feature
  • Short video demonstrating the feature

Created in 2020 by Google Inc.