S3 To Redshift Action

The S3 To Redshift action plugin is available in the Hub.

The S3 To Redshift action plugin loads data from an AWS S3 bucket into an AWS Redshift table.

Configuration

Property

Macro Enabled?

Description

S3 Access Key

Yes

Optional. Access key used to connect to AWS S3. Provide either the keys (access key and secret access key) or an IAM role to connect to the S3 bucket.

S3 Secret Access Key

Yes

Optional. Secret access key used to connect to AWS S3. Provide either the keys (access key and secret access key) or an IAM role to connect to the S3 bucket.

S3 IAM Role

Yes

Optional. IAM role used to connect to AWS S3. This option can only be used if the cluster is hosted on AWS. Provide either the keys (access key and secret access key) or an IAM role to connect to the S3 bucket.

S3 Region

Yes

Optional. The AWS region of the S3 bucket. If not specified, the plugin assumes that the S3 bucket is in the same region as the Redshift cluster.

S3 Data Path

Yes

Required. The S3 path from which data is loaded into the Redshift table. For example, 's3://bucket-name/test/' or 's3://bucket-name/test/2017-02-22/' (loads the files in that specific directory), 's3://bucket-name/test' (loads the files that have the prefix test), or 's3://bucket-name/test/2017-02-22' (loads the files in the test directory that have the prefix 2017-02-22).

JDBC Redshift Cluster Database URL

Yes

Required. The JDBC URL of the Redshift database that contains the table. For example, 'jdbc:redshift://x.y.us-west-2.redshift.amazonaws.com:5439/dev'.

Redshift Master User

Yes

Required. Master user name used to connect to the Redshift cluster.

Redshift Master Password

Yes

Required. Master password used to connect to the Redshift cluster.

Redshift Table Name

Yes

Required. Name of the Redshift table into which the data from the S3 bucket is loaded.

List of Columns

Yes

Optional. Comma-separated list of Redshift table column names, used to load only specific columns from the S3 data. If not provided, all columns from the S3 data are loaded into the Redshift table.
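
The prefix behavior described for the S3 Data Path property can be illustrated with a short sketch. It assumes that the part of the path after the bucket name is treated as a plain key prefix, which is what the examples in the description suggest. The function keys_matching and the sample object keys are hypothetical and exist only for illustration; they are not part of the plugin.

```python
def keys_matching(data_path, object_keys):
    """Return the object keys that a given S3 data path would select.

    Assumption: everything after the bucket name acts as a plain key
    prefix, so a trailing slash limits the match to one directory.
    """
    scheme = "s3://"
    if not data_path.startswith(scheme):
        raise ValueError("S3 data path must start with s3://")
    # Split 's3://bucket-name/test/...' into the bucket and the key prefix.
    bucket, _, prefix = data_path[len(scheme):].partition("/")
    return [key for key in object_keys if key.startswith(prefix)]


# Hypothetical bucket contents, for illustration only.
keys = [
    "test/2017-02-22/part-0.avro",
    "test/2017-02-23/part-0.avro",
    "test-archive/old.avro",
]

# Trailing slash: only files inside the 'test' directory.
print(keys_matching("s3://bucket-name/test/", keys))
# No trailing slash: every key that starts with 'test'.
print(keys_matching("s3://bucket-name/test", keys))
# Files in the 'test' directory whose names start with '2017-02-22'.
print(keys_matching("s3://bucket-name/test/2017-02-22", keys))
```

Note that under this assumption 's3://bucket-name/test' also selects keys such as 'test-archive/old.avro', so a trailing slash is the safer choice when only one directory should be loaded.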

Conditions

  • Any invalid configuration for connecting to the AWS S3 bucket or the Redshift cluster results in a runtime failure.

  • The keys (access key and secret access key) and the IAM role cannot both be provided, and cannot both be empty. Provide exactly one of the two to connect to the S3 bucket.

  • The S3 data path must start with the s3:// URI scheme, not with s3n:// or s3a://.

  • The table must already exist in the Redshift cluster. If it does not, the load fails at runtime.

  • The schema of the table must match the schema of the data in the S3 bucket. If it does not, the load fails at runtime.

  • The plugin supports loading only Avro-formatted data from the S3 bucket into the Redshift table, and uses the 'auto' option for formatting.
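
The credential and path conditions above can be checked before a pipeline is deployed. The following is a minimal sketch of such a check, not the plugin's own validation code; the function validate_config and its parameter names are hypothetical. The table existence and schema conditions need a live Redshift connection and are not covered here.

```python
S3_SCHEME = "s3://"


def validate_config(access_key="", secret_access_key="", iam_role="",
                    s3_data_path=""):
    """Return a list of problems; an empty list means the checks pass."""
    errors = []
    has_keys = bool(access_key or secret_access_key)
    has_role = bool(iam_role)

    # Keys and IAM role are mutually exclusive, and one of them is required.
    if has_keys and has_role:
        errors.append("provide either keys or an IAM role, not both")
    if not has_keys and not has_role:
        errors.append("provide either keys or an IAM role")
    # Assumption: the two keys only make sense as a pair.
    if has_keys and not (access_key and secret_access_key):
        errors.append("access key and secret access key must be given together")

    # Only the s3:// scheme is accepted, not s3n:// or s3a://.
    if not s3_data_path.startswith(S3_SCHEME):
        errors.append("S3 data path must start with s3://")
    return errors


# Valid: keys only, with an s3:// path.
print(validate_config("access-key", "secret-access-key", "",
                      "s3://bucket-name/test/"))
# Invalid: keys and IAM role together, and an s3a:// path.
print(validate_config("access-key", "secret-access-key", "some-role",
                      "s3a://bucket-name/test/"))
```

Returning a list of messages instead of raising on the first problem lets all configuration mistakes be reported in one pass.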

Example

This example connects to S3 using 'accessKey' and 'secretAccessKey', and to a Redshift cluster using 'clusterDbUrl', 'masterUser', and 'masterPassword'. Data from the S3 path provided through 's3DataPath' is loaded into the Redshift table 'redshifttest'.

Property

Value

S3 Access Key

access-key

S3 Secret Access Key

secret-access-key

S3 Data Path

s3://bucket-name/test/

JDBC Redshift Cluster Database URL

jdbc:redshift://x.y.us-west-2.redshift.amazonaws.com:5439/dev

Redshift Master User

master-user

Redshift Master Password

master-password

Redshift Table Name

redshifttest
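
For orientation, the example configuration corresponds roughly to a Redshift COPY statement like the one built below. This is a sketch under assumptions, not the plugin's actual implementation: the documentation only states that Avro data is loaded with the 'auto' option, so the exact statement the plugin issues may differ. The helper build_copy_statement is hypothetical, and values are interpolated without escaping, so the output is for reading rather than for execution.

```python
def build_copy_statement(table, s3_data_path, access_key=None,
                         secret_access_key=None, iam_role=None,
                         columns=None, region=None):
    """Build a Redshift COPY statement for Avro data stored in S3."""
    parts = ["COPY", table]
    if columns:
        # Optional column list, e.g. "(id, name)".
        parts.append("({})".format(", ".join(columns)))
    parts.append("FROM '{}'".format(s3_data_path))
    if iam_role:
        parts.append("IAM_ROLE '{}'".format(iam_role))
    else:
        parts.append("ACCESS_KEY_ID '{}' SECRET_ACCESS_KEY '{}'".format(
            access_key, secret_access_key))
    # Avro input with automatic field-to-column mapping.
    parts.append("FORMAT AS AVRO 'auto'")
    if region:
        # Only needed when the bucket is in a different region.
        parts.append("REGION '{}'".format(region))
    return " ".join(parts)


# Values taken from the example table above.
statement = build_copy_statement(
    table="redshifttest",
    s3_data_path="s3://bucket-name/test/",
    access_key="access-key",
    secret_access_key="secret-access-key",
)
print(statement)
```

Passing columns or region adds the corresponding optional clause, which mirrors the optional List of Columns and S3 Region properties.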

 

Created in 2020 by Google Inc.