A financial customer would like to quickly unload financial reports, generated by processing in Redshift, into S3. The pipeline would begin with a Redshift-to-S3 action and then leverage the S3 source to read that data into a processing pipeline.
User Stories
As a user, I would like to unload data from Redshift to S3 using the UNLOAD command.
I would like to authenticate with IAM credentials as well as with access key ID and secret key pairs.
I would like to use that S3 data as an input to a Hydrator pipeline.
I would like the location of the data to be passed via the workflow token so that the next plugin can use it in a macro.
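The statement the action would issue takes the standard UNLOAD form: the SELECT query as a quoted string literal, a target S3 path, and a credentials clause. A minimal sketch of composing it (the `build_unload` helper and its argument values are hypothetical, for illustration only):

```python
def build_unload(query, s3_path, credentials_clause):
    # Compose a Redshift UNLOAD statement. Because UNLOAD takes the SELECT
    # statement as a quoted string literal, single quotes in the query must
    # be doubled.
    escaped = query.replace("'", "''")
    return f"UNLOAD ('{escaped}') TO '{s3_path}' {credentials_clause}"

# Hypothetical example values:
stmt = build_unload(
    "select * from financial_reports",
    "s3://mybucket/reports/part_",
    "IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftUnload'",
)
```

The action would run this statement over the JDBC connection to the cluster, then set the S3 location into the workflow token for the downstream S3 source.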
Plugin Type
Action
Configurables
This section defines properties that are configurable for this plugin.
| User Facing Name | Type | Description | Constraints |
| --- | --- | --- | --- |
| Query | String | SELECT statement to be used for unloading the data | |
| Access Key | String | AWS access key for S3 | |
| Secret Access Key | String | AWS secret access key for S3 | |
| AWS IAM Role | String | IAM role | |
| S3 Bucket | String | Amazon S3 bucket (including key prefix) | |
| Manifest | Boolean | Whether a manifest file is created while unloading the data into S3 | |
| S3 Delimiter | String | The delimiter by which fields in a character-delimited file are separated | |
| Parallel | String | Whether to write data in parallel to multiple files, according to the number of slices in the cluster, or to a single file. The default is ON (TRUE). | |
| Compression | String | Unload to a compressed file of type BZIP2 or GZIP | |
| Allow Overwrite | String | By default, UNLOAD fails if it finds files that it would possibly overwrite. If ALLOWOVERWRITE is specified, UNLOAD overwrites existing files, including the manifest file. | |
| Redshift Cluster DB Url | String | JDBC URL for connecting to the Redshift cluster | |
| Master User | String | Master user for Redshift | |
| Master User Password | String | Master user password | |
| Redshift Table Name | String | Redshift table from which data is to be unloaded | |
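Assuming these configurables map one-to-one onto UNLOAD options, the optional clauses might be assembled as below. The `option_clauses` helper is a hypothetical sketch; the keywords (MANIFEST, DELIMITER, PARALLEL OFF, GZIP/BZIP2, ALLOWOVERWRITE) are the ones UNLOAD accepts:

```python
def option_clauses(manifest=False, delimiter=None, parallel=True,
                   compression=None, allow_overwrite=False):
    # Translate plugin configurables into UNLOAD option clauses.
    clauses = []
    if manifest:
        clauses.append("MANIFEST")
    if delimiter is not None:
        clauses.append(f"DELIMITER '{delimiter}'")
    if not parallel:
        # PARALLEL is ON by default, so only the OFF case needs a clause.
        clauses.append("PARALLEL OFF")
    if compression in ("GZIP", "BZIP2"):
        clauses.append(compression)
    if allow_overwrite:
        clauses.append("ALLOWOVERWRITE")
    return " ".join(clauses)
```

With defaults the function emits nothing, matching UNLOAD's own defaults; only non-default settings produce clauses.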
The user can connect to S3 buckets using either an access key and secret access key pair, or an IAM role.
By default, the UNLOAD command assumes that Redshift and S3 are in the same region. If the S3 bucket is not in the same region as the Redshift cluster, the user can provide the region using 'Region for S3'.
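The choice between the two credential styles, plus the cross-region case, can be sketched as follows. The `credentials_clause` helper is hypothetical; IAM_ROLE, ACCESS_KEY_ID/SECRET_ACCESS_KEY, and REGION are the clause keywords UNLOAD accepts:

```python
def credentials_clause(iam_role=None, access_key=None, secret_key=None,
                       region=None):
    # Either an IAM role ARN or an access-key/secret-key pair is used,
    # not both; the role takes precedence here if both are supplied.
    if iam_role:
        clause = f"IAM_ROLE '{iam_role}'"
    else:
        clause = f"ACCESS_KEY_ID '{access_key}' SECRET_ACCESS_KEY '{secret_key}'"
    if region:
        # Only needed when the bucket is in a different region than the cluster.
        clause += f" REGION '{region}'"
    return clause
```

For example, a key pair with a bucket in another region would yield `ACCESS_KEY_ID '...' SECRET_ACCESS_KEY '...' REGION 'us-west-2'`.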