Google Cloud Spanner Sink

Plugin version: 0.22.0

This sink writes to a Google Cloud Spanner table. Cloud Spanner is a fully managed, mission-critical, relational database service that offers transactional consistency at global scale, schemas, SQL (ANSI 2011 with extensions), and automatic, synchronous replication for high availability.

Credentials

If the plugin is run on a Google Cloud Dataproc cluster, the service account key does not need to be provided and can be set to 'auto-detect'. Credentials will be automatically read from the cluster environment.

If the plugin is not run on a Dataproc cluster, the path to a service account key must be provided. The service account key can be found on the Dashboard in the Cloud Platform Console. Make sure the service account has permission to access Google Cloud Spanner. The service account key file must be available on every node in your cluster and must be readable by all users running the job.
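The requirements above (file present, readable, and a valid key) can be checked on a node before a pipeline runs. The following is a minimal sketch, not part of the plugin: the function name `check_service_account_key` is hypothetical, and the list of expected fields is an assumption based on the usual layout of a Google service account key file.

```python
import json
import os

# Assumption: fields normally present in a Google service account key file.
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path):
    """Return a list of problems found with the key file (empty if none)."""
    if path == "auto-detect":
        # On Dataproc, credentials are read from the cluster environment.
        return []
    if not os.path.isfile(path):
        return [f"key file not found: {path}"]
    if not os.access(path, os.R_OK):
        return [f"key file is not readable by the current user: {path}"]
    try:
        with open(path, encoding="utf-8") as handle:
            key = json.load(handle)
    except json.JSONDecodeError as err:
        return [f"key file is not valid JSON: {err}"]
    if not isinstance(key, dict):
        return ["key file does not contain a JSON object"]
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_KEY_FIELDS
        if name not in key
    ]
    if key.get("type") != "service_account":
        problems.append("field 'type' is not 'service_account'")
    return problems
```

Running the check on every node, as the user that runs the job, mirrors the requirement that the file be present and readable cluster-wide.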

Configuration

Property

Macro Enabled?

Version Introduced

Description

Use Connection

No

6.7.0/0.20.0

Optional. Whether to use a connection. If a connection is used, you do not need to provide the credentials.

Connection

Yes

6.7.0/0.20.0

Optional. Name of the connection to use. Project and service account information will be provided by the connection. You can also use the macro function ${conn(connection_name)}.

Project ID

Yes

 

Optional. Google Cloud Project ID, which uniquely identifies a project. It can be found on the Dashboard in the Google Cloud Platform Console.

Service Account Type

Yes

6.3.0/0.16.0

Optional. Select one of the following options:

  • File Path. Path to the file that contains the service account key.

  • JSON. JSON content of the service account key.

Service Account File Path

Yes

 

Optional. Path on the local file system of the service account key used for authorization. Can be set to 'auto-detect' when running on a Dataproc cluster. When running on other clusters, the file must be present on every node in the cluster.

Default is auto-detect.

Service Account JSON

Yes

6.3.0/0.16.0

Optional. JSON content of the service account key.

Reference Name

No

 

Optional. Name used to uniquely identify this sink for lineage, annotating metadata, etc.

Instance ID

Yes

 

Required. Instance the Spanner database belongs to. A Spanner instance is contained within a specific project. An instance is an allocation of resources that is used by the Cloud Spanner databases created in it.

Database Name

Yes

 

Required. Database the Spanner table belongs to. A Spanner database is contained within a specific Spanner instance. If the database does not exist, it is created.

Table Name

Yes

 

Required. Table to write to. A table contains individual records organized in rows. Each record is composed of columns (also called fields). Every table is defined by a schema that describes the column names, data types, and other information. If the table does not exist, it is created.

Primary Key

Yes

 

Optional. If the table does not exist, a primary key must be provided in order to auto-create the table. The key can be a composite key of multiple fields in the schema. This is not required if the table already exists.

Encryption Key Name

Yes

6.5.1/0.18.1

Optional. The GCP customer-managed encryption key (CMEK) used to encrypt data written to any database created by the plugin. If the database already exists, this is ignored.

Write Batch Size

Yes

 

Optional. Number of records in each batched write to the Spanner table. Each write to Cloud Spanner carries some overhead, so bulk write throughput is highest when each write stores as much data as possible. A good technique is for each commit to mutate hundreds of rows. Commits in the range of 1 MiB to 5 MiB usually provide the best performance. Default is 100 mutations.

Output Schema

Yes

 

Optional. Schema of the data to write. Must be compatible with the table schema.
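To illustrate why the Primary Key property is needed when the table does not exist, here is a minimal sketch of how a CREATE TABLE statement could be derived from a schema and a key list. It is not the plugin's actual implementation: the function `build_create_table`, the `(name, type, nullable)` field tuples, and the type mapping are all hypothetical choices made for this example.

```python
# Hypothetical mapping from simple schema type names to Cloud Spanner column types.
SPANNER_TYPES = {
    "long": "INT64",
    "double": "FLOAT64",
    "boolean": "BOOL",
    "string": "STRING(MAX)",
    "bytes": "BYTES(MAX)",
}


def build_create_table(table, fields, primary_key):
    """Build a CREATE TABLE statement.

    fields is a list of (name, type_name, nullable) tuples;
    primary_key is a list of field names (more than one for a composite key).
    """
    if not primary_key:
        raise ValueError("a primary key is required to auto-create the table")
    names = [name for name, _, _ in fields]
    missing = [key for key in primary_key if key not in names]
    if missing:
        raise ValueError(f"primary key fields not in schema: {missing}")
    columns = []
    for name, type_name, nullable in fields:
        column = f"{name} {SPANNER_TYPES[type_name]}"
        if not nullable:
            column += " NOT NULL"
        columns.append(column)
    return (
        f"CREATE TABLE {table} ({', '.join(columns)}) "
        f"PRIMARY KEY ({', '.join(primary_key)})"
    )
```

For example, `build_create_table("users", [("id", "long", False), ("name", "string", True)], ["id"])` returns `CREATE TABLE users (id INT64 NOT NULL, name STRING(MAX)) PRIMARY KEY (id)`. Without a key list there is nothing to put in the PRIMARY KEY clause, which is why the property only becomes optional once the table exists.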
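The effect of Write Batch Size can be pictured as grouping records into fixed-size chunks, with one commit per chunk. The sketch below only shows the grouping step and makes no Spanner calls; the generator name `batches` is hypothetical, and the default of 100 simply mirrors the property's documented default.

```python
from itertools import islice


def batches(records, batch_size=100):
    """Yield lists of at most batch_size records, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return  # input exhausted
        yield chunk
```

With 250 records and a batch size of 100, this yields chunks of 100, 100, and 50 records, so three commits instead of 250. A larger batch size reduces per-write overhead, but the total size of each commit still matters more than the row count alone.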

 

Created in 2020 by Google Inc.