Preferences provide the ability to save configuration information at various levels of the system, including the CDAP instance, namespace, application, and program levels. A configuration is represented by a map of string-string pairs. Preferences can be retrieved, saved, and deleted through the Preferences Microservices and through the Command Line Interface. When programs are started, all the preferences at the different levels are collapsed into a single map. Preferences are persisted across a restart of either programs or CDAP.

...

Example: A configuration preference SAMPLE_KEY is set to 20 at the namespace level and is set to 10 at the program level. When the program is started, the value set at the program level overrides the value set at the namespace level and thus the value for the preference SAMPLE_KEY will be 10.
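The collapse of preference levels described above can be sketched as a simple map merge, where more specific levels override broader ones. This is an illustrative sketch, not CDAP's actual implementation; the function name and level ordering are assumptions:

```python
def resolve_preferences(instance, namespace, application, program):
    """Illustrative sketch: collapse preference maps into a single map.

    Levels are applied from broadest (instance) to most specific (program),
    so later updates override earlier ones.
    """
    resolved = {}
    for level in (instance, namespace, application, program):
        resolved.update(level)
    return resolved

# SAMPLE_KEY is 20 at the namespace level and 10 at the program level;
# the program-level value wins.
resolved = resolve_preferences({}, {"SAMPLE_KEY": "20"}, {}, {"SAMPLE_KEY": "10"})
print(resolved["SAMPLE_KEY"])  # → 10
```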

Programs such as MapReduce and Spark programs, services, workflows, and workers receive the resolved preferences, which can be accessed through the getRuntimeArguments method of the context:

  • For services and workers: preferences are available to the initialize method in the context.

  • For MapReduce and Spark: preferences are available to the initialize and destroy methods in the context. The initialize method can pass them to the mappers and reducers through the job configuration.

  • When a workflow receives preferences, it passes them to each MapReduce in the workflow.
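In CDAP itself this access pattern is a Java API (getRuntimeArguments on the program's context). As a language-neutral sketch, assuming a hypothetical context object and service class, reading resolved preferences at initialize time might look like:

```python
class ProgramContext:
    """Hypothetical stand-in for a CDAP program context.

    The real Java API exposes getRuntimeArguments() on the context; here the
    resolved preferences are simply injected for illustration.
    """
    def __init__(self, resolved_preferences):
        self._args = dict(resolved_preferences)

    def get_runtime_arguments(self):
        return dict(self._args)


class SampleService:
    """Illustrative service: reads its configuration in initialize."""
    def initialize(self, context):
        # Resolved preferences arrive as runtime arguments.
        args = context.get_runtime_arguments()
        self.sample_value = int(args.get("SAMPLE_KEY", "0"))


svc = SampleService()
svc.initialize(ProgramContext({"SAMPLE_KEY": "10"}))
print(svc.sample_value)  # → 10
```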

...

On each program run, CDAP populates the runtime arguments with pre-defined values (currently, one):

  • logical.start.time: the start time of the run as a timestamp in milliseconds. If the run is started by a schedule, this is equal to the trigger time for the schedule. For example, if the schedule is set to run at midnight on Jan 1, 2016 UTC, the logical start time would be 1451606400000.
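The example timestamp can be checked by converting midnight on Jan 1, 2016 UTC to epoch milliseconds:

```python
from datetime import datetime, timezone

# Midnight on Jan 1, 2016 UTC, as epoch milliseconds.
ts = int(datetime(2016, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)
print(ts)  # → 1451606400000
```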

...

When a workflow is configured, you may want to pass specific runtime arguments to the different programs and datasets used inside the workflow. To achieve this, you can prefix the runtime arguments with a <scope>. Currently supported scopes are dataset, mapreduce, and spark.

Note: Datasets are deprecated and will be removed in CDAP 7.0.0.

Example: To set a runtime argument of read.timeout=30 for the MapReduce program oneMapReduce in a workflow, the argument can be provided with a scope of mapreduce.oneMapReduce.read.timeout=30. In this case, oneMapReduce and the datasets used in oneMapReduce will receive two arguments: one with a scope of mapreduce.oneMapReduce.read.timeout=30, and another with the scope extracted as read.timeout=30. Programs other than oneMapReduce and datasets used in them will receive only the single argument mapreduce.oneMapReduce.read.timeout=30.
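The scope-extraction behavior in this example can be sketched as follows. This is an illustrative sketch of the resolution rule, not CDAP's implementation; the function name is an assumption:

```python
def resolve_scoped_args(args, scope, name):
    """Illustrative sketch: resolve <scope>.<name>.-prefixed runtime arguments.

    The target program (and its datasets) receives each scoped argument twice:
    once with the full scoped key, and once with the prefix stripped.
    """
    prefix = f"{scope}.{name}."
    resolved = dict(args)  # all programs see the scoped argument as-is
    for key, value in args.items():
        if key.startswith(prefix):
            resolved[key[len(prefix):]] = value  # extracted, unscoped form
    return resolved


args = {"mapreduce.oneMapReduce.read.timeout": "30"}

# The program oneMapReduce sees both the scoped and the extracted argument:
print(resolve_scoped_args(args, "mapreduce", "oneMapReduce"))
# → {'mapreduce.oneMapReduce.read.timeout': '30', 'read.timeout': '30'}

# Any other MapReduce program sees only the scoped argument:
print(resolve_scoped_args(args, "mapreduce", "anotherMapReduce"))
# → {'mapreduce.oneMapReduce.read.timeout': '30'}
```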

...