Amazon S3 Sink

Plugin version: 0.19.0

Use this sink to write to Amazon S3 in various formats. For example, you might create daily snapshots of a database by reading the entire contents of a table and writing it to this sink, so that other programs can then analyze the contents of the output files.

Configuration

Property

Macro Enabled?

Version Introduced

Description

Use Connection

No

6.7.0/0.17.0

Optional. Whether to use a connection. If a connection is used, you do not need to provide the credentials.

Connection

Yes

6.7.0/0.17.0

Optional. Name of the connection to use. You can also use the macro function ${conn(connection_name)}.

Authentication Method

Yes

 

Optional. Authentication method to access S3. The default value is Access Credentials. IAM can only be used if the plugin is run in an AWS environment, such as on EMR.

Access ID

Yes

 

Optional. Amazon access ID required for authentication.

Access Key

Yes

 

Optional. Amazon access key required for authentication.

Session Token

Yes

6.7.0/0.17.0

Optional. Amazon session token for authentication. Required only for temporary credentials, which are supported only for S3A paths.

Reference Name

No

 

Required. Name used to uniquely identify this sink for lineage, annotating metadata, etc.

Path

Yes

 

Required. Path to write the output to. For example, s3a://<bucket>/path/to/output

You can also use the logicalStartTime function to append a date to the output filename.

Path Suffix

Yes

 

Optional. Time format for the output directory that will be appended to the path. For example, the format ‘yyyy-MM-dd-HH-mm’ will result in a directory of the form ‘2015-01-01-20-42’. If not specified, nothing will be appended to the path.

Format

No

 

Required. Format to write the records in. The format must be one of ‘json’, ‘avro’, ‘parquet’, ‘csv’, ‘tsv’, or ‘delimited’.

Delimiter

Yes

 

Optional. Delimiter to use if the format is ‘delimited’. The delimiter will be ignored if the format is anything other than ‘delimited’.

File System Properties

Yes

 

Optional. Additional properties to use with the output format when writing the data. This is an advanced feature that requires knowledge of the properties supported by the underlying filesystem.

Enable Encryption

Yes

 

Optional. Whether to enable server-side encryption. The only supported algorithm is AES256.

Output Schema

Yes

 

Schema of the data to write. If a schema is provided, it must be compatible with the schema in S3.
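
The rules in the table above for Path, Path Suffix, Format, and Delimiter can be illustrated with a short Python sketch. It is not the plugin's implementation. The key names (referenceName, path, format, delimiter), the bucket name, and the helper functions are hypothetical and chosen only for this example. The sketch also assumes that the suffix is joined to the path with a slash, and it uses the strftime pattern %Y-%m-%d-%H-%M as a stand-in for the plugin's ‘yyyy-MM-dd-HH-mm’ time format.

```python
from datetime import datetime

# Formats listed in the Format row of the table above.
SUPPORTED_FORMATS = {"json", "avro", "parquet", "csv", "tsv", "delimited"}

# Hypothetical key names for the properties marked Required in the table;
# the plugin's real internal property names may differ.
REQUIRED_KEYS = ("referenceName", "path", "format")


def validate_sink_config(config):
    """Return a list of problems found in a sink configuration dict."""
    problems = []
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"missing required property: {key}")
    fmt = config.get("format")
    if fmt and fmt not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    return problems


def effective_delimiter(config):
    """Return the delimiter only when the format is 'delimited'."""
    if config.get("format") == "delimited":
        return config.get("delimiter")
    return None


def output_directory(path, suffix_pattern, run_time):
    """Append a time-formatted directory to the path, if a suffix is set.

    Assumption: the suffix is joined to the path with a single slash.
    suffix_pattern is a strftime pattern in this sketch.
    """
    if not suffix_pattern:
        return path
    return path.rstrip("/") + "/" + run_time.strftime(suffix_pattern)


config = {
    "referenceName": "daily_snapshot",
    "path": "s3a://my-bucket/path/to/output",  # hypothetical bucket
    "format": "csv",
    "delimiter": "|",  # ignored because the format is not 'delimited'
}

print(validate_sink_config(config))
print(effective_delimiter(config))
print(output_directory(config["path"], "%Y-%m-%d-%H-%M",
                       datetime(2015, 1, 1, 20, 42)))
```

With this sample configuration, validation reports no problems, the delimiter is ignored because the format is ‘csv’, and the resolved output directory ends in 2015-01-01-20-42, matching the Path Suffix example in the table.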

 

Created in 2020 by Google Inc.