...

  • task.driver.system.resources.memory to configure the memory for Spark Driver.

    • Memory is specified in megabytes (MB).

    • Example: Setting task.driver.system.resources.memory to 2048 sets the driver memory resources to 2 GB (2048 MB).

  • task.driver.system.resources.cores to configure the CPU (cores) for Spark Driver.

    • By default, the driver CPU is set to 1 core.

    • Example: Setting task.driver.system.resources.cores to 2 sets the driver cores to 2.

  • task.executor.system.resources.memory to configure the memory for Spark Executors.

    • Memory is specified in megabytes (MB).

    • Example: Setting task.executor.system.resources.memory to 2048 sets the executor memory resources to 2 GB (2048 MB).

  • task.executor.system.resources.cores to configure the CPU (cores) for Spark Executors.

    • By default, the executor CPU is set to 1 core.

    • Example: Setting task.executor.system.resources.cores to 2 configures 2 cores for each executor.
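The four properties above can be assembled into a single set of key/value pairs. The following Python sketch is illustrative only: the helper name spark_resource_args is hypothetical, and the values are written as strings on the assumption that the arguments are supplied as string key/value pairs. It converts memory from GB to the MB values the properties expect.

```python
import json


def spark_resource_args(driver_gb, driver_cores, executor_gb, executor_cores):
    """Build the driver/executor resource arguments (hypothetical helper).

    Memory is given in GB and converted to MB, since the memory
    properties are specified in megabytes.
    """
    return {
        "task.driver.system.resources.memory": str(driver_gb * 1024),
        "task.driver.system.resources.cores": str(driver_cores),
        "task.executor.system.resources.memory": str(executor_gb * 1024),
        "task.executor.system.resources.cores": str(executor_cores),
    }


# 2 GB and 2 cores for the driver and for each executor.
args = spark_resource_args(driver_gb=2, driver_cores=2,
                           executor_gb=2, executor_cores=2)
print(json.dumps(args, indent=2))
```

Here both memory properties come out as 2048, matching the 2 GB examples in the list above.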

Configuring Compute Resources

...

for Dataproc

  • system.profile.properties.serviceAccount to set the service account for the Dataproc cluster.

  • system.profile.properties.masterNumNodes to set the number of master nodes.

  • system.profile.properties.masterMemoryMB to set the memory per master node.

  • system.profile.properties.masterCPUs to set the number of CPUs for the master.

  • system.profile.properties.masterDiskGB to set the disk in GB per master node.

  • system.profile.properties.workerNumNodes to set the number of worker nodes.

  • system.profile.properties.workerMemoryMB to set the memory per worker node.

  • system.profile.properties.workerCPUs to set the number of CPUs per worker node.

  • system.profile.properties.workerDiskGB to set the disk in GB per worker node.

  • system.profile.properties.stackdriverLoggingEnabled to enable Stackdriver logging for the pipelines (set to true).

  • system.profile.properties.stackdriverMonitoringEnabled to enable Stackdriver monitoring for the pipelines (set to true).

  • system.profile.properties.imageVersion to configure the Dataproc image version.

  • system.profile.properties.network to configure the network for the Dataproc cluster.
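As a rough illustration of how the Dataproc properties listed above share the system.profile.properties. prefix, the Python sketch below builds a set of overrides from short property names. The helper dataproc_overrides and the sample values (4 workers, 8192 MB, and so on) are assumptions made for the example, not recommended settings. Booleans are rendered as lowercase true/false, on the assumption that all values are supplied as strings.

```python
PREFIX = "system.profile.properties."


def dataproc_overrides(**settings):
    """Prefix each short property name and stringify its value (hypothetical helper)."""
    overrides = {}
    for name, value in settings.items():
        # Check bool before anything else so True becomes "true", not "True".
        if isinstance(value, bool):
            text = "true" if value else "false"
        else:
            text = str(value)
        overrides[PREFIX + name] = text
    return overrides


# Sample values only; choose sizes that fit the actual workload.
overrides = dataproc_overrides(
    workerNumNodes=4,
    workerMemoryMB=8192,
    workerCPUs=4,
    workerDiskGB=500,
    stackdriverLoggingEnabled=True,
)

for key in sorted(overrides):
    print(f"{key}={overrides[key]}")
```

Keeping the prefix in one place avoids typos in the long property names when several overrides are set together.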