Default to one attempt for Spark pipeline
Description
By default, Spark will retry a YARN app if the first attempt fails. For our pipelines, this second attempt almost always fails, often in a way that misleads the user. For example, the first attempt may create an output directory on GCS based on the logical start time. The second attempt then fails because that output directory already exists, and this error is what the user sees instead of the original failure.
It makes sense to turn app retries off, especially since Spark already retries individual tasks within an app when they fail.
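For illustration, here is a minimal sketch of what limiting the app to one attempt can look like when launching through spark-submit. It relies on the standard Spark property spark.yarn.maxAppAttempts. The helper name, jar name, and main class are hypothetical, and the actual change in CDAP may set the property in a different place. Task-level retries are unaffected, since they are governed separately by spark.task.maxFailures.

```python
import shlex


def spark_submit_args(app_jar, main_class, max_app_attempts=1):
    """Build a spark-submit command that caps YARN application attempts.

    Hypothetical helper for illustration; not taken from the CDAP code base.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--class", main_class,
        # With a value of 1, YARN does not resubmit the app after a failure,
        # so the first attempt's error is the one the user sees.
        "--conf", f"spark.yarn.maxAppAttempts={max_app_attempts}",
        app_jar,
    ]


# Placeholder jar and class names, assumed for the example.
cmd = spark_submit_args("pipeline.jar", "com.example.PipelineMain")
print(shlex.join(cmd))
```

The effective number of attempts is also bounded by the cluster-wide YARN setting, so this property can lower the limit for a single app but not raise it above the cluster maximum.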
Release Notes
Disabled Spark YARN app retries, since Spark already retries failures at the task level.
Activity
Spark retries task failures by default. We should not retry failures at the YARN app level, since the retry will almost always fail again, replace the original error message, and mislead the user.
PR merged to disable YARN app retries: https://github.com/cdapio/cdap/pull/12805