...

If the plugin is not run on a Dataproc cluster, the path to a service account key must be provided. The service account key can be found on the Dashboard in the Cloud Platform Console. Make sure the account key has permission to access BigQuery and Google Cloud Storage. The service account key file needs to be available on every node in your cluster and must be readable by all users running the job.
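Before deploying outside Dataproc, it can help to confirm on each node that the key file is present, readable, and looks like a service account key. The sketch below is one way to do that with the Python standard library. The helper name check_key_file is hypothetical. The field names it checks (type, client_email, private_key) are those found in a standard Google service account JSON key. It only tests readability for the user who runs the check, so run it as the user that runs the job.

```python
import json
import os

# Fields expected in a standard Google service account JSON key file.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def check_key_file(path):
    """Return a list of problems found with the key file (empty if none).

    Hypothetical helper for illustration; not part of the plugin.
    """
    if not os.path.isfile(path):
        return ["%s does not exist or is not a regular file" % path]
    if not os.access(path, os.R_OK):
        return ["%s is not readable by the current user" % path]

    try:
        with open(path, "r", encoding="utf-8") as handle:
            key = json.load(handle)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return ["%s does not contain valid JSON" % path]

    if not isinstance(key, dict):
        return ["%s does not contain a JSON object" % path]

    problems = []
    for field in REQUIRED_FIELDS:
        if field not in key:
            problems.append("missing field: %s" % field)
    if key.get("type") not in (None, "service_account"):
        problems.append("field 'type' is not 'service_account'")
    return problems
```

An empty list means no problem was detected on that node. The check does not verify that the account has the required permissions.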

Configuration

Property

Macro Enabled?

Version Introduced

Description

Project ID

Yes

Optional. Google Cloud Project ID, which uniquely identifies a project. It can be found on the Dashboard in the Google Cloud Platform Console.

Default is auto-detect.

Run Condition

No

Required. When to run the action. Must be completion, success, or failure. Defaults to completion. If set to completion, the action runs and the marker file is created regardless of whether the pipeline run succeeded or failed. If set to success, the action runs and the marker file is created only if the pipeline run succeeded. If set to failure, the action runs and the marker file is created only if the pipeline run failed.

Path

Yes

Required. Google Cloud Storage path to the marker file, in the format gs://<bucket>/directory/marker-file-name. For example, gs://billing-data/2021-01-21/__SUCCESS. The marker file is created only if no marker file already exists at that path; otherwise, creation is skipped. If the bucket does not exist, it is created automatically.

Encryption Key Name

Yes

6.5.1/0.18.1

Optional. The GCP customer-managed encryption key (CMEK) used to encrypt data written to any bucket created by the plugin. If the bucket already exists, this is ignored.

Service Account Type

Yes

Optional. Select one of the following options:

  • File Path. Path to the file that contains the service account key.

  • JSON. JSON content of the service account key.

Service Account File Path

Yes

Optional. Path on the local file system of the service account key used for authorization. Can be set to 'auto-detect' when running on a Dataproc cluster. When running on other clusters, the file must be present on every node in the cluster.

Default is auto-detect.

Service Account JSON

Yes

Optional. JSON content of the service account key.
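The Run Condition and Path behavior described in the table can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: should_run, split_gcs_path, and mark_done are hypothetical helper names, and an in-memory dictionary stands in for Google Cloud Storage.

```python
from urllib.parse import urlparse

RUN_CONDITIONS = ("completion", "success", "failure")


def should_run(run_condition, pipeline_succeeded):
    """Decide whether the action runs for a given pipeline outcome."""
    if run_condition not in RUN_CONDITIONS:
        raise ValueError("Run Condition must be one of %s" % (RUN_CONDITIONS,))
    if run_condition == "completion":
        return True  # runs whether the pipeline succeeded or failed
    if run_condition == "success":
        return pipeline_succeeded
    return not pipeline_succeeded  # run_condition == "failure"


def split_gcs_path(path):
    """Split gs://<bucket>/directory/marker-file-name into (bucket, object)."""
    parsed = urlparse(path)
    object_name = parsed.path.lstrip("/")
    if parsed.scheme != "gs" or not parsed.netloc or not object_name:
        raise ValueError("expected gs://<bucket>/<object>, got %r" % path)
    return parsed.netloc, object_name


def mark_done(store, path):
    """Create the marker in the stand-in store; return True if it was created.

    `store` maps bucket names to sets of object names. It is an assumption
    used only to illustrate the create-if-absent behavior.
    """
    bucket, object_name = split_gcs_path(path)
    objects = store.setdefault(bucket, set())  # missing bucket is created
    if object_name in objects:
        return False  # an existing marker is left alone
    objects.add(object_name)
    return True
```

For example, with an empty store, calling mark_done(store, "gs://billing-data/2021-01-21/__SUCCESS") twice creates the bucket and marker on the first call and skips creation on the second.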

Example

Suppose you want to copy an object from bucketX to bucketY. Upon a successful copy, you want to mark the process as done by creating an empty DONE file in bucketY.

...