Lifecycle methods should be able to use the same or a different transaction

Description

(Batch) programs such as Spark and MapReduce have a beforeSubmit() and an onFinish() method.

In some use cases, it makes sense to run beforeSubmit() in the same transaction as the main program. For example, suppose an MR job processes a range of a stream and needs to persist a "high watermark" recording how far it has read. If beforeSubmit() persists the mark in its own transaction and the MR job subsequently fails, the persisted mark is wrong: it claims that events were processed when they were not. What we want is for the mark to be updated only on success.

However, it can also make sense to persist the mark in its own transaction. For example, if we expect concurrent runs of the same MR job (say it runs for 20 minutes, but the next run is scheduled 15 minutes later), then the next run should see the updated mark so that it processes a disjoint range of events.
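The two behaviours can be illustrated with a small simulation. This is only a sketch: the `Tx` class is a toy in-memory stand-in for a transaction, and `WatermarkSketch`, `run`, and the `ownTx` flag are hypothetical names invented for illustration, not part of any existing API.

```java
import java.util.HashMap;
import java.util.Map;

public class WatermarkSketch {

    /** Toy transaction: buffers writes and applies them to the store only on commit. */
    static final class Tx {
        private final Map<String, Long> store;
        private final Map<String, Long> pending = new HashMap<>();

        Tx(Map<String, Long> store) { this.store = store; }

        void put(String key, long value) { pending.put(key, value); }

        void commit() { store.putAll(pending); pending.clear(); }

        void rollback() { pending.clear(); }
    }

    /**
     * Simulates one job run that intends to advance the watermark to newMark.
     * ownTx = true:  the beforeSubmit() write is committed before the job runs.
     * ownTx = false: the write is committed or rolled back together with the job.
     * Returns the watermark visible in the store after the run.
     */
    public static long run(Map<String, Long> store, long newMark,
                           boolean ownTx, boolean jobSucceeds) {
        Tx tx = new Tx(store);
        tx.put("watermark", newMark);   // what beforeSubmit() would persist
        if (ownTx) {
            tx.commit();                // visible immediately, also to concurrent runs
            tx = new Tx(store);         // the job itself gets a fresh transaction
        }
        if (jobSucceeds) {
            tx.commit();
        } else {
            tx.rollback();
        }
        return store.getOrDefault("watermark", 0L);
    }

    public static void main(String[] args) {
        // Shared transaction, failed job: the mark is not advanced.
        System.out.println(run(new HashMap<>(), 100L, false, false)); // prints 0
        // Own transaction, failed job: the mark was already advanced.
        System.out.println(run(new HashMap<>(), 100L, true, false));  // prints 100
    }
}
```

With a shared transaction the mark stays correct after a failure, but a concurrent run cannot see it until the first job commits. With its own transaction the mark is visible right away, at the cost of being wrong if the job fails.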

We can find similar reasoning for onFinish().

The conclusion is that these methods should support running either in their own transaction or in the same transaction as the main program. Alternatively, we could introduce two pairs of methods, one pair for each mode, and let a program implement either or both pairs.
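One possible shape for the first option is a per-method setting that the runtime consults when it opens and closes transactions. The sketch below is an assumption, not a proposed final API: `TxScope`, `LifecycleTxSketch`, and `plan` are hypothetical names, and the method only lists the transaction boundaries a runtime might produce for a successful run.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleTxSketch {

    /** Hypothetical setting: which transaction a lifecycle method runs in. */
    public enum TxScope { SAME_AS_PROGRAM, OWN }

    /**
     * Returns the ordered transaction boundaries for a successful run,
     * given the scope chosen for beforeSubmit() and for onFinish().
     */
    public static List<String> plan(TxScope beforeSubmit, TxScope onFinish) {
        List<String> trace = new ArrayList<>();

        if (beforeSubmit == TxScope.OWN) {
            // beforeSubmit() commits on its own, before the program starts.
            trace.add("begin tx(beforeSubmit)");
            trace.add("beforeSubmit()");
            trace.add("commit tx(beforeSubmit)");
            trace.add("begin tx(program)");
        } else {
            // beforeSubmit() shares the program's transaction.
            trace.add("begin tx(program)");
            trace.add("beforeSubmit()");
        }

        trace.add("run program");

        if (onFinish == TxScope.OWN) {
            // The program commits first; onFinish() then runs separately.
            trace.add("commit tx(program)");
            trace.add("begin tx(onFinish)");
            trace.add("onFinish()");
            trace.add("commit tx(onFinish)");
        } else {
            // onFinish() runs before the program's transaction commits.
            trace.add("onFinish()");
            trace.add("commit tx(program)");
        }
        return trace;
    }

    public static void main(String[] args) {
        for (String step : plan(TxScope.SAME_AS_PROGRAM, TxScope.OWN)) {
            System.out.println(step);
        }
    }
}
```

The alternative with two pairs of methods would express the same choice by which methods a program implements rather than through a setting.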