Configuring output directories for pipelines
This document provides best practices for configuring the output directory of file-based sink plugins (Amazon S3 and GCS).
General Tips
Ensure that output paths are unique when a pipeline contains multiple file-based sinks
Using the same output path more than once in a pipeline (for example, two error collectors writing to the same path) causes the pipeline to fail with the error: “User class threw exception: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory xxxxxx already exists.”
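One way to catch this before deployment is to compare the configured output paths of all file-based sinks. The sketch below is illustrative only and is not part of any plugin API: the function names, sink names, and bucket paths are hypothetical, and it assumes you have already collected each sink's output path into a dictionary. It also assumes that paths differing only by a trailing slash refer to the same directory.

```python
from collections import defaultdict
from typing import Dict, List


def normalize(path: str) -> str:
    """Strip trailing slashes so 'dir/' and 'dir' compare as equal (an assumption)."""
    return path.rstrip("/")


def find_duplicate_paths(sink_paths: Dict[str, str]) -> Dict[str, List[str]]:
    """Return {path: [sink names]} for every path used by more than one sink."""
    sinks_by_path = defaultdict(list)
    for sink_name, path in sink_paths.items():
        sinks_by_path[normalize(path)].append(sink_name)
    return {p: names for p, names in sinks_by_path.items() if len(names) > 1}


# Hypothetical sink names and output paths, for illustration only.
sinks = {
    "ErrorCollector1": "gs://example-bucket/errors/",
    "ErrorCollector2": "gs://example-bucket/errors",
    "MainSink": "gs://example-bucket/output",
}
duplicates = find_duplicate_paths(sinks)
for path, names in duplicates.items():
    print(f"Output path {path} is shared by sinks: {', '.join(names)}")
```

Any path reported here should be changed so that each sink writes to its own directory.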
Created in 2020 by Google Inc.