
The Amazon Kinesis Spark Stream source plugin is available in the Hub.

Plugin version: 12.30.10

Apache Spark streaming source that reads from AWS Kinesis streams. Use this source when you want to read data from a Kinesis stream in real time. For example, you might want to read data from a Kinesis stream and write it to Google BigQuery.

This source requires Scala 2.12 for execution (Dataproc 1.5, Dataproc 2.0 or later).

Configuration

Property

Macro Enabled?

Description

Reference Name

No

Required. Name used to uniquely identify this source for lineage, annotating metadata, etc.

Application Name

Yes

Required. The name of the Kinesis application. This name is used to checkpoint the Kinesis sequence numbers in a DynamoDB table.

Stream Name

Yes

Required. The name of the Kinesis stream to get the data from. The stream should be active.

Kinesis Endpoint URL

Yes

Required. Valid Kinesis endpoint URL, for example, kinesis.us-east-1.amazonaws.com.

Region

Yes

Required. Valid AWS region of the Kinesis stream, for example, ap-south-1.

Default is us-east-1.

Checkpoint Interval Duration

Yes

Required. The interval, in milliseconds, at which the Kinesis Client Library saves its position in the stream.

Initial Position in Stream

Yes

Required. Initial position in the stream. Can be either TRIM_HORIZON or LATEST.

Default is LATEST.

AWS Access Key ID

Yes

Required. The access key ID provided by AWS to access the Kinesis streams. The ID can be stored in the CDAP secure store and provided as a macro.

AWS Access Secret

Yes

Required. The AWS secret access key that has access to the Kinesis streams. The key can be stored in the CDAP secure store and provided as a macro.

Format

No

Optional. Format of the Kinesis shard payload. Any format that CDAP supports can be used. For example, a value of ‘csv’ will attempt to parse Kinesis payloads as comma-separated values. If no format is given, Kinesis payloads will be treated as bytes.

...
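
The property rules listed above can be checked before a pipeline is deployed. The following Python sketch is illustrative only and is not part of the plugin. The property keys (reference_name, checkpoint_interval_ms, and so on) and the secure store key names are hypothetical, so the plugin's real configuration keys may differ. Rejecting a non-positive checkpoint interval is an assumption. The sketch applies the documented defaults (us-east-1 and LATEST), checks the required properties, and rejects macros in the two properties that are not macro enabled.

```python
# Hypothetical property keys for illustration; the plugin's real keys may differ.
REQUIRED = (
    "reference_name", "application_name", "stream_name", "endpoint_url",
    "region", "checkpoint_interval_ms", "initial_position",
    "aws_access_key_id", "aws_access_secret",
)
# Defaults documented in the table above.
DEFAULTS = {"region": "us-east-1", "initial_position": "LATEST"}
INITIAL_POSITIONS = {"TRIM_HORIZON", "LATEST"}
# Properties listed with "Macro Enabled? No".
NOT_MACRO_ENABLED = ("reference_name", "format")


def is_macro(value):
    """Return True if the value looks like a CDAP macro such as ${secure(name)}."""
    return isinstance(value, str) and value.startswith("${") and value.endswith("}")


def resolve_config(props):
    """Apply the documented defaults, then check the documented constraints."""
    config = {**DEFAULTS, **props}

    missing = [key for key in REQUIRED if not config.get(key)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))

    for key in NOT_MACRO_ENABLED:
        if is_macro(config.get(key)):
            raise ValueError(key + " is not macro enabled")

    # Macros are resolved at run time, so their values cannot be checked here.
    position = config["initial_position"]
    if not is_macro(position) and position not in INITIAL_POSITIONS:
        raise ValueError("initial_position must be TRIM_HORIZON or LATEST")

    # Assumption: a non-positive interval is treated as invalid.
    interval = config["checkpoint_interval_ms"]
    if not is_macro(interval) and int(interval) <= 0:
        raise ValueError("checkpoint_interval_ms must be a positive number")

    return config


example = resolve_config({
    "reference_name": "kinesis_orders",
    "application_name": "orders-reader",
    "stream_name": "orders",
    "endpoint_url": "kinesis.us-east-1.amazonaws.com",
    "checkpoint_interval_ms": "2000",
    # Credentials read from the CDAP secure store through a macro.
    "aws_access_key_id": "${secure(aws-access-key-id)}",
    "aws_access_secret": "${secure(aws-access-secret)}",
})
print(example["region"], example["initial_position"])  # prints: us-east-1 LATEST
```

Because Format is optional, the example omits it, which corresponds to treating Kinesis payloads as bytes.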