Configuring output directories for pipelines
This document provides best practices for configuring the output directory of file-based sink plugins (Amazon S3 and GCS).
General Tips
Ensure that output paths are unique if there are multiple file-based sinks in the same pipeline.
Using the same output path more than once in a pipeline (for example, two error collectors writing to the same path) causes the pipeline to fail with the error: “User class threw exception: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory xxxxxx already exists.”
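The check described above can be sketched as a small pre-deployment script. This is only an illustration, not part of any plugin: the function names, the sink names, and the bucket paths are all hypothetical, and it assumes you have already collected each sink's configured path into a dictionary. It also assumes that paths differing only by a trailing slash point to the same directory.

```python
from collections import defaultdict
from typing import Dict, List


def normalize(path: str) -> str:
    """Assumption: a trailing slash does not change which directory is meant."""
    return path.rstrip("/")


def find_duplicate_output_paths(sink_paths: Dict[str, str]) -> Dict[str, List[str]]:
    """Return every output path used by more than one sink, with the sink names."""
    sinks_by_path = defaultdict(list)
    for sink_name, path in sink_paths.items():
        sinks_by_path[normalize(path)].append(sink_name)
    # Keep only the paths that are shared by two or more sinks.
    return {path: names for path, names in sinks_by_path.items() if len(names) > 1}


# Hypothetical pipeline: two error collectors configured with the same directory.
sinks = {
    "ErrorSinkA": "gs://example-bucket/errors/",
    "ErrorSinkB": "gs://example-bucket/errors",
    "MainSink": "gs://example-bucket/output",
}
duplicates = find_duplicate_output_paths(sinks)
for path, names in duplicates.items():
    print(f"Output path {path} is shared by sinks: {', '.join(names)}")
```

If the script reports a shared path, giving each sink its own subdirectory (for instance, one per error collector) is one way to avoid the collision.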
Created in 2020 by Google Inc.