As a CDAP pipeline user, I would like to specify the resources for my pipeline, so that I can dynamically allocate or deallocate resources when the pipeline needs more or fewer of them.
Specs:
BATCH MAPREDUCE:
Label: Driver; Pipeline Spec Property: driverResources
Tooltip: Resources for the MapReduce driver process, which initializes the pipeline
Label: Executor (Mapper/Reducer); Pipeline Spec Property: resources
Tooltip: Resources for Map and Reduce Tasks of the MapReduce program
BATCH SPARK:
Label: Driver; Pipeline Spec Property: driverResources
Tooltip: Resources for the Apache Spark driver process, which initializes the pipeline
Label: Executor; Pipeline Spec Property: resources
Tooltip: Resources for executor processes which run tasks in an Apache Spark pipeline.
DATA STREAMS:
Label: Driver; Pipeline Spec Property: driverResources
Tooltip: Resources for the Apache Spark driver process, which initializes the pipeline
Label: Client; Pipeline Spec Property: clientResources
Tooltip: Resources for the Apache Spark client process
Label: Executor; Pipeline Spec Property: resources
Tooltip: Resources for executor processes which run tasks in an Apache Spark Streaming pipeline.
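Example (sketch only): the mapping above from pipeline type to Pipeline Spec Property could be expressed as follows. The property names (driverResources, clientResources, resources) come from the specs above. The pipeline type keys, the helper name build_resource_config, and the "memoryMB" / "virtualCores" field names are assumptions made for illustration and may not match the actual pipeline config schema.

```python
import json

# Pipeline Spec Properties per pipeline type, as listed in the specs above.
# The dictionary keys ("batch-mapreduce", ...) are hypothetical identifiers.
RESOURCE_PROPERTIES = {
    "batch-mapreduce": ["driverResources", "resources"],
    "batch-spark": ["driverResources", "resources"],
    "data-streams": ["driverResources", "clientResources", "resources"],
}


def build_resource_config(pipeline_type, memory_mb, virtual_cores):
    """Build a config fragment with one entry per resource property.

    Assumption: each resource entry has "memoryMB" and "virtualCores"
    fields; the real schema may differ.
    """
    if pipeline_type not in RESOURCE_PROPERTIES:
        raise ValueError("unknown pipeline type: " + pipeline_type)
    if memory_mb <= 0 or virtual_cores <= 0:
        raise ValueError("memory and cores must be positive")
    return {
        prop: {"memoryMB": memory_mb, "virtualCores": virtual_cores}
        for prop in RESOURCE_PROPERTIES[pipeline_type]
    }


if __name__ == "__main__":
    # Print the fragment for a Data Streams pipeline, which has all three properties.
    print(json.dumps(build_resource_config("data-streams", 1024, 1), indent=2))
```

The sketch gives every process the same values for brevity; in practice each of Driver, Client, and Executor would be set independently in the UI.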
For the Batch pipeline, if the engine is MapReduce instead of Spark, the terminology could be slightly different: would the labels be Client, Mapper, and Reducer?
Why does only the Data Streams pipeline have both Client and Driver, while the Spark Batch pipeline has no Client?