...

Dataproc image URI. If the URI is not specified, it will be inferred from the Image Version.

...

Staging Bucket

Google Cloud Storage bucket used to stage job dependencies and config files for running pipelines in Google Cloud Dataproc.

Temp Bucket

Google Cloud Storage bucket used to store ephemeral cluster and job data, such as Spark and MapReduce history files, in Google Cloud Dataproc.

Encryption Key Name

The name of the GCP customer-managed encryption key (CMEK) used by Cloud Dataproc.
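As an illustration, the sketch below assembles a key name in the full Cloud KMS resource-name format. It assumes this field takes the full resource name rather than a short key ID, which this page does not state. The helper cmek_resource_name and all sample values are hypothetical.

```python
def cmek_resource_name(project: str, location: str, key_ring: str, key: str) -> str:
    """Build a Cloud KMS key resource name from its four components.

    Hypothetical helper: the arguments are placeholders, not real resources.
    """
    parts = {"project": project, "location": location, "key_ring": key_ring, "key": key}
    for label, value in parts.items():
        # Each component must be non-empty and must not contain a path separator.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return (
        f"projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )
```

For example, cmek_resource_name("my-project", "us-central1", "my-ring", "my-key") yields a name of the form projects/my-project/locations/us-central1/keyRings/my-ring/cryptoKeys/my-key.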

OAuth Scopes

The OAuth 2.0 scopes to request for access to Google APIs, depending on the level of access you need. The Google Cloud Platform scope is always included.
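To make the "always included" behavior concrete, here is a minimal sketch that merges a list of requested scopes with the Google Cloud Platform scope. It is not the provisioner's actual implementation. The function name effective_scopes is hypothetical, and the BigQuery scope in the usage note is only an example value.

```python
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"

def effective_scopes(requested):
    """Return cloud-platform plus the requested scopes, de-duplicated, order preserved."""
    seen = []
    for scope in [CLOUD_PLATFORM_SCOPE, *requested]:
        scope = scope.strip()
        # Skip blank entries and scopes that were already added.
        if scope and scope not in seen:
            seen.append(scope)
    return seen
```

Calling effective_scopes(["https://www.googleapis.com/auth/bigquery"]) returns the Cloud Platform scope followed by the BigQuery scope; passing only the Cloud Platform scope returns it once.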

Initialization Actions

A list of scripts to run during cluster initialization. Initialization actions should be stored in Google Cloud Storage.
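Because the scripts are expected to live in Google Cloud Storage, a quick pre-flight check can catch entries that are not gs:// object URIs. The sketch below is one possible check. The function name non_gcs_actions and the sample paths are hypothetical, and it inspects only the URI shape, not whether the object exists.

```python
def non_gcs_actions(actions):
    """Return the entries that are not of the form gs://bucket/object."""
    bad = []
    for uri in actions:
        # Strip the scheme; anything without it is treated as invalid.
        rest = uri[len("gs://"):] if uri.startswith("gs://") else ""
        bucket, _, obj = rest.partition("/")
        # Both a bucket name and an object path are required.
        if not bucket or not obj:
            bad.append(uri)
    return bad
```

Given ["gs://my-bucket/init/install.sh", "/tmp/local.sh", "gs://my-bucket/"], the function flags the last two entries: a local path and a bucket with no object.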

...

Cluster properties used to override default configuration properties for the Hadoop services.

Labels

...

Common Labels

A label is a key-value pair that helps you organize your Google Cloud Dataproc clusters and jobs. You can attach a label to each resource, and then filter the resources based on their labels. Information about labels is forwarded to the billing system, so you can break down your billing charges by label.

Specifies labels for the Dataproc clusters and jobs being created.
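The two uses of labels described above, filtering resources and breaking down charges, can be illustrated with plain dictionaries. This is a conceptual sketch only: the resource names, label values, and costs are made up, and it calls no Google Cloud API.

```python
from collections import defaultdict

# Hypothetical resources: names, labels, and costs are invented for illustration.
resources = [
    {"name": "cluster-a", "labels": {"team": "data", "env": "prod"}, "cost": 12.5},
    {"name": "cluster-b", "labels": {"team": "data", "env": "dev"}, "cost": 3.0},
    {"name": "cluster-c", "labels": {"team": "ml", "env": "prod"}, "cost": 7.25},
]

def filter_by_labels(items, **wanted):
    """Keep items whose labels contain every wanted key-value pair."""
    return [
        r for r in items
        if all(r["labels"].get(k) == v for k, v in wanted.items())
    ]

def cost_by_label(items, key):
    """Sum cost per value of one label key; items lacking the key group under None."""
    totals = defaultdict(float)
    for r in items:
        totals[r["labels"].get(key)] += r["cost"]
    return dict(totals)
```

Here filter_by_labels(resources, env="prod") selects cluster-a and cluster-c, and cost_by_label(resources, "team") groups the sample costs by team.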

Cluster Labels

Note: Labels (now Cluster Labels) were introduced in CDAP 6.5.0.

Specifies labels for the Dataproc cluster being created.

...