Description

Spark streaming source to ingest real-time streaming data into Hydrator pipelines.

User Story

A user has a continuous stream of clickstream data that various nodes write to Amazon Kinesis streams. The user runs live dashboards powered by Hydrator pipelines and wants to feed this data into them. The Kinesis Spark streaming source connects the incoming stream to a Hydrator pipeline.

This also lets users draw on AWS sources that Hydrator does not currently support or that have no existing plugin. Kinesis integrates out of the box with many AWS products, so it can act as a connector between AWS services and Hydrator pipelines.

Properties

Kinesis App Name: The application name that is used to checkpoint the Kinesis sequence numbers in a DynamoDB table by the Kinesis Spark streaming library. The application name must be unique for a given account and region.

...

AWS access secret: API access secret for Amazon Web Services.
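
As a rough illustration of how the properties above might be held and checked before the stream is created, here is a minimal sketch in plain Java. The class name, field names, and validation rule are assumptions made for this example; they are not taken from the plugin's actual config class, and only the two properties named above are modeled.

```java
// Hypothetical holder for two of the properties listed above. The class and
// field names are illustrative assumptions, not the plugin's real config class.
final class KinesisSourceConfigSketch {
  final String appName;          // "Kinesis App Name"
  final String awsAccessSecret;  // "AWS access secret"

  KinesisSourceConfigSketch(String appName, String awsAccessSecret) {
    this.appName = requireNonBlank(appName, "Kinesis App Name");
    this.awsAccessSecret = requireNonBlank(awsAccessSecret, "AWS access secret");
  }

  // Rejects null or blank values and trims surrounding whitespace.
  private static String requireNonBlank(String value, String propertyName) {
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException(propertyName + " must be set");
    }
    return value.trim();
  }

  // Masks the secret so it does not leak into logs.
  @Override
  public String toString() {
    return "KinesisSourceConfigSketch{appName=" + appName + ", awsAccessSecret=****}";
  }
}
```

The uniqueness of the application name per account and region cannot be checked locally, so the sketch only guards against missing values.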

Design

 

Implementation

 

getStream (Java)
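public JavaDStream<StructuredRecord> getStream(StreamingContext streamingContext) throws Exception {
  registerUsage(streamingContext);
  // Unwrap the Spark streaming context from the Hydrator streaming context.
  JavaStreamingContext javaStreamingContext = streamingContext.getSparkStreamingContext();
  // Shorthand: config stands for the stream settings taken from the plugin properties.
  JavaReceiverInputDStream<byte[]> kinesisStream = KinesisUtils.createStream(javaStreamingContext, config);
  // Convert each raw Kinesis payload into a StructuredRecord.
  return kinesisStream.map(new MyObjectToStructuredRecordFunction());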

...
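
The final map step in getStream turns each raw byte[] payload into a record. The sketch below shows one way that conversion could look, assuming the payload is UTF-8 text. To stay runnable without Spark or CDAP on the classpath, it uses java.util.function.Function and a Map as stand-ins for Spark's function interface and StructuredRecord. The class name and the "body" field name are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;

// Stand-in for the record-mapping function used in getStream. A Map replaces
// StructuredRecord here (an assumption made so the sketch has no dependencies).
final class BytesToRecordSketch implements Function<byte[], Map<String, Object>> {
  // Hypothetical output field name; the real schema is not given in this document.
  static final String BODY_FIELD = "body";

  @Override
  public Map<String, Object> apply(byte[] payload) {
    // Assumes the Kinesis payload is UTF-8 encoded text.
    String body = new String(payload, StandardCharsets.UTF_8);
    return Collections.singletonMap(BODY_FIELD, body);
  }
}
```

In the real plugin, the same logic would sit inside a Spark function class and build a StructuredRecord against the output schema instead of a Map.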