Consider a MapReduce that performs some writes in its beforeSubmit method.
In MapReduceRuntimeService, beforeSubmit executes in its own transaction, which commits before the MapReduce job is even submitted.
If an exception occurs after this point but before the Hadoop framework executes the job, the MapReduce job will not run, yet the data modified by beforeSubmit remains committed.
This seems wrong: a MapReduce's writes should not be committed if the run failed.
TLDR: dataset writes made in a MapReduce's beforeSubmit can be persisted and visible to others even if the run subsequently fails.
An example of such an error:
In this case, beforeSubmit marked partitions as IN_PROGRESS, and they remained IN_PROGRESS even after the failure. To handle at least this case, we could submit the job in the same transaction as beforeSubmit.
Alternatively, if the MapReduce's destroy method were called on such a failure, it would give us an opportunity to release the IN_PROGRESS partitions and mark them AVAILABLE again.
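To make the failure mode concrete, here is a minimal, self-contained sketch (plain Java, not real CDAP APIs; the class and method names are hypothetical) simulating the transaction boundary: beforeSubmit commits its partition-state writes, job submission fails independently, and only a compensating destroy() call would return the partitions to AVAILABLE.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation of the issue described above; names are illustrative.
public class BeforeSubmitLeak {
    enum State { AVAILABLE, IN_PROGRESS }

    final Map<String, State> partitions = new HashMap<>();

    BeforeSubmitLeak(Collection<String> names) {
        for (String n : names) {
            partitions.put(n, State.AVAILABLE);
        }
    }

    // Runs in its own transaction; the writes commit here, before submission.
    void beforeSubmit() {
        partitions.replaceAll((name, state) -> State.IN_PROGRESS);
        // commit point: the IN_PROGRESS markers are now visible to others
    }

    // Simulates a failure between beforeSubmit and job execution.
    void submitJob() {
        throw new IllegalStateException("job submission failed");
    }

    // Compensating cleanup that a destroy() hook could perform once it is
    // actually invoked on this failure path: release the partitions.
    void destroy() {
        partitions.replaceAll((name, state) -> State.AVAILABLE);
    }

    public static void main(String[] args) {
        BeforeSubmitLeak mr = new BeforeSubmitLeak(Arrays.asList("p1", "p2"));
        mr.beforeSubmit();
        try {
            mr.submitJob();
        } catch (IllegalStateException e) {
            // Without a destroy() call, partitions stay IN_PROGRESS forever.
            System.out.println("after failure: " + mr.partitions);
            mr.destroy();
            System.out.println("after destroy: " + mr.partitions);
        }
    }
}
```

The point of the sketch is only the ordering: the commit in beforeSubmit happens unconditionally, so any cleanup has to be a compensating write after the failure, not a rollback.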
The missing destroy call is caused by CDAP-7444. Once that is fixed, we can do the cleanup in destroy.
Submitting the job inside the beforeSubmit transaction may not be a good idea, since that would pull a lot of unrelated work into that transaction.
deprioritizing non-critical fixes to MR