Amazon SQS plugin

Introduction

Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. The SQS plugin in CDAP will enable ETL developers to create streaming pipelines that read events from SQS queues in real time and process them.

Use case(s)

  • As a user, I would like to create a streaming pipeline that reads events from Amazon SQS, runs transformations and aggregations on them, and joins the data with other sources, so that I can generate real-time enrichments and insights from telemetry data in SQS.
  • A web beacon pushes log records to SQS, and I want to read these log events in real time.

User Stories

  • I want to specify credentials securely as an Access ID and Access Key.
  • I also want to be able to specify credentials using IAM.
  • I want to specify the SQS queue and region to read events from.
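
The credential and region stories above could translate into client settings roughly as follows. This is a minimal sketch, not the plugin's implementation: the function name `sqs_client_kwargs` and its parameters are hypothetical, and it assumes the resulting dictionary would be passed to boto3 as `boto3.client("sqs", **kwargs)`. With IAM, no keys are supplied, so the SDK falls back to its default credential chain.

```python
from typing import Optional


def sqs_client_kwargs(auth_method: str,
                      region: str,
                      access_id: Optional[str] = None,
                      access_key: Optional[str] = None,
                      endpoint: Optional[str] = None) -> dict:
    """Build keyword arguments for an SQS client (hypothetical helper)."""
    kwargs = {"region_name": region}
    if endpoint:
        # Only set when connecting to a non-AWS, SQS-compatible server.
        kwargs["endpoint_url"] = endpoint
    if auth_method == "Access Credentials":
        if not access_id or not access_key:
            raise ValueError("Access ID and Access Key are required")
        kwargs["aws_access_key_id"] = access_id
        kwargs["aws_secret_access_key"] = access_key
    elif auth_method == "IAM":
        # No explicit keys: rely on the SDK's default credential chain.
        pass
    else:
        raise ValueError(f"Unknown authentication method: {auth_method}")
    return kwargs
```

Keeping the secret out of the returned dictionary in the IAM case means that no credential material needs to be stored in the pipeline configuration.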

Plugin Type

  • Batch Source
  • Batch Sink 
  • Real-time Source
  • Real-time Sink
  • Action
  • Post-Run Action
  • Aggregate
  • Join
  • Spark Model
  • Spark Compute

Real-time Source

This section defines properties that are configurable for this plugin. 

| Section | User Facing Name | Type | Description | Constraints | Optional? | Default |
|---|---|---|---|---|---|---|
| Credentials | Authentication method | Radio button | Either Access Credentials or IAM | | N | Access Credentials |
| Credentials | Access ID | Textbox | AWS Access ID. Only shown when Authentication method is Access Credentials | | Y | |
| Credentials | Access Key | Password | AWS Secret Access Key. Only shown when Authentication method is Access Credentials | | Y | |
| SQS properties | Region | Drop down | Select from a list of available regions where your SQS queue is located | | | us-west-1 |
| SQS properties | Queue name | Textbox | Specifies the queue name to read from | | | |
| SQS properties | Endpoint | Textbox | Endpoint of the SQS server to connect to. Omit this field to connect to AWS. | | Y | |
| SQS properties | Delete Messages | Drop down | Delete messages from the SQS queue after successfully reading them. | | N | true |
| SQS properties | Wait Time | Number | SQS long-poll wait time, in seconds. | Valid values: 1-20 | Y | 10 |
| SQS properties | Interval | Number | The amount of time to wait between polls, in seconds. The plugin waits for the specified duration before issuing a receive message call. | | Y | 0 |
| SQS properties | Number of Messages to return | Number | Maximum number of messages to return for each API call. | Valid values: 1-10 | Y | 10 |
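
To illustrate how the Wait Time, Interval, Number of Messages to return, and Delete Messages properties might interact, here is a minimal sketch of a single poll cycle. It is an illustration under stated assumptions, not the plugin's code: `validate_source_config` and `poll_once` are hypothetical names, `client` is assumed to expose the boto3 SQS methods `receive_message` and `delete_message`, and `queue_url` is assumed to have been resolved from the queue name beforehand.

```python
import time


def validate_source_config(wait_time=10, interval=0, max_messages=10):
    """Check the documented ranges; defaults mirror the table above."""
    if not 1 <= wait_time <= 20:
        raise ValueError("Wait Time must be between 1 and 20 seconds")
    if interval < 0:
        raise ValueError("Interval must not be negative")
    if not 1 <= max_messages <= 10:
        raise ValueError("Number of Messages to return must be between 1 and 10")


def poll_once(client, queue_url, wait_time=10, interval=0,
              max_messages=10, delete_messages=True, sleep=time.sleep):
    """Run one poll cycle and return the bodies of the messages read."""
    validate_source_config(wait_time, interval, max_messages)
    if interval > 0:
        sleep(interval)  # wait before issuing the receive call
    response = client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,
        WaitTimeSeconds=wait_time,  # long polling
    )
    bodies = []
    # The "Messages" key is absent when the queue returned nothing.
    for message in response.get("Messages", []):
        bodies.append(message["Body"])
        if delete_messages:
            client.delete_message(QueueUrl=queue_url,
                                  ReceiptHandle=message["ReceiptHandle"])
    return bodies
```

In this sketch each message is deleted right after its body is collected. A real source would more likely delete only once the record has been handed to the pipeline successfully, so that a failure does not lose events.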

Batch Sink

This section defines properties that are configurable for this plugin. 

| Section | User Facing Name | Type | Description | Constraints | Optional? | Default |
|---|---|---|---|---|---|---|
| Credentials | Authentication method | Radio button | Either Access Credentials or IAM | | N | Access Credentials |
| Credentials | Access ID | Textbox | AWS Access ID. Only shown when Authentication method is Access Credentials | | Y | |
| Credentials | Access Key | Password | AWS Secret Access Key. Only shown when Authentication method is Access Credentials | | Y | |
| SQS properties | Region | Drop down | Select from a list of available regions where your SQS queue is located | | | us-west-1 |
| SQS properties | Queue name | Textbox | Specifies the queue name to write to | | | |
| SQS properties | Endpoint | Textbox | Endpoint of the SQS server to connect to. Omit this field to connect to AWS. | | Y | |
| SQS properties | Message format | Select | Either CSV or JSON. Converts the structured record into CSV or JSON before it is sent to SQS. | | | JSON |
| SQS properties | Delay seconds | Number | The length of time, in seconds, for which a specific message is delayed. | Valid values: 0 to 900 | | |
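
The Message format and Delay seconds properties could be applied along these lines. This is a hedged sketch using only the Python standard library: `format_record` and `build_send_request` are hypothetical names, the record is assumed to be a flat dictionary, the delay default of 0 is an assumption (the table does not give one), and the returned dictionary is assumed to match the keyword arguments of boto3's `send_message` (`QueueUrl`, `MessageBody`, `DelaySeconds`).

```python
import csv
import io
import json


def format_record(record: dict, message_format: str = "JSON") -> str:
    """Serialize a flat record as a JSON object or a single CSV row."""
    if message_format == "JSON":
        return json.dumps(record)
    if message_format == "CSV":
        buffer = io.StringIO()
        csv.writer(buffer).writerow(record.values())
        # Drop the row terminator that csv.writer appends.
        return buffer.getvalue().rstrip("\r\n")
    raise ValueError(f"Unsupported message format: {message_format}")


def build_send_request(queue_url: str, record: dict,
                       message_format: str = "JSON",
                       delay_seconds: int = 0) -> dict:
    """Build the arguments for one send call (delay default is assumed)."""
    if not 0 <= delay_seconds <= 900:
        raise ValueError("Delay seconds must be between 0 and 900")
    return {
        "QueueUrl": queue_url,
        "MessageBody": format_record(record, message_format),
        "DelaySeconds": delay_seconds,
    }
```

Note that the CSV form carries values only, in field order, so a consumer would need to know the schema separately. The JSON form keeps the field names in each message.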

Design / Implementation Tips

  • Tip #1
  • Tip #2

Design

Approach(es)

Properties

Security

Limitation(s)

Future Work

  • Some future work – HYDRATOR-99999
  • Another future work – HYDRATOR-99999

Test Case(s)

  • Test case #1
  • Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data. 

Pipeline #1

Pipeline #2



Checklist

  • User stories documented 
  • User stories reviewed 
  • Design documented 
  • Design reviewed 
  • Feature merged 
  • Examples and guides 
  • Integration tests 
  • Documentation for feature 
  • Short video demonstrating the feature

Created in 2020 by Google Inc.