The mapreduce pipeline planner can place the same sink in multiple mapreduce phases. For some sinks this is fine, but for others it is not. For example, I believe the partitioned file set sinks will fail: whichever job happens to finish first will successfully add a partition, and the second job will then try to add that same partition and fail.
Instead, the planner should use connectors so that each sink is written to only once, in a single mapreduce job, similar to how we ensure that a source is only read from once in a single mapreduce job.
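To make the invariant concrete, here is a minimal sketch of a check that a plan could be run through. It is not the planner's actual code or API: the function name `find_duplicated_sinks`, the dict-of-sets plan representation, and the stage and phase names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def find_duplicated_sinks(phases, sinks):
    """Return {sink: [phase names]} for every sink placed in more than one phase.

    phases: dict mapping a phase name to the set of stage names in that phase
            (hypothetical representation, not the planner's real data model).
    sinks:  set of stage names that are sinks.
    """
    placements = defaultdict(list)
    for phase_name, stages in phases.items():
        for stage in stages:
            if stage in sinks:
                placements[stage].append(phase_name)
    # A valid plan places each sink in exactly one phase.
    return {
        sink: sorted(names)
        for sink, names in placements.items()
        if len(names) > 1
    }


# Hypothetical plan in which the stage "sink" was placed in both phases.
bad_plan = {
    "phase-1": {"source", "aggregator", "sink"},
    "phase-2": {"connector", "joiner", "sink"},
}

# Hypothetical plan in which a connector carries data into the single phase
# that writes to the sink.
good_plan = {
    "phase-1": {"source", "aggregator", "connector"},
    "phase-2": {"connector", "joiner", "sink"},
}

duplicates = find_duplicated_sinks(bad_plan, {"sink"})
no_duplicates = find_duplicated_sinks(good_plan, {"sink"})
```

Under these assumptions, an empty result means every sink is written from a single phase, which is the property the fix is meant to guarantee.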
An example pipeline that causes this issue looks like:
Fixed a planner bug to ensure that sinks are never placed in two different mapreduce phases in the same pipeline.