Migrating data pipelines to different environments

You might need to migrate a data pipeline from one environment to another. For example, you might create a set of pipelines in a development environment and then need to move them to a production environment. To migrate pipelines between instances, export them from the source instance and then import them into the target instance. The JSON representation of an exported pipeline includes the connectivity information for its plugins, for example the paths for GCS sources and sinks. However, an exported pipeline does not include any triggers that were created after the pipeline was deployed. If you need those triggers, you must recreate them in the target environment.
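Because the export carries connectivity information verbatim, it can help to check an exported file for environment-specific values before importing it elsewhere. The following Python sketch illustrates one way to do that. The JSON excerpt, the pipeline, stage, and plugin names, and the helper hardcoded_properties are all hypothetical and simplified for illustration; a real export contains more fields and its layout may differ.

```python
import json

# Hypothetical, simplified excerpt of an exported pipeline (assumed layout).
EXPORTED = """
{
  "name": "example_pipeline",
  "config": {
    "stages": [
      {"name": "source",
       "plugin": {"name": "GCSFile",
                  "properties": {"path": "gs://dev-bucket/input"}}},
      {"name": "sink",
       "plugin": {"name": "BigQueryTable",
                  "properties": {"dataset": "${bq_dataset}"}}}
    ]
  }
}
"""


def hardcoded_properties(pipeline, keys=("path", "dataset")):
    """Return (stage, key, value) for watched properties that contain no macro."""
    found = []
    for stage in pipeline.get("config", {}).get("stages", []):
        props = stage.get("plugin", {}).get("properties", {})
        for key in keys:
            value = props.get(key)
            # A value without "${" is treated as a literal, environment-specific setting.
            if value is not None and "${" not in value:
                found.append((stage["name"], key, value))
    return found


pipeline = json.loads(EXPORTED)
for stage, key, value in hardcoded_properties(pipeline):
    print(f"{stage}.{key} is hard-coded: {value}")
```

In this excerpt only the source path is reported, because the sink dataset is already expressed as a macro.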

To change the connectivity information for sources and sinks when you migrate a pipeline to another environment, configure the input and output locations in the pipeline as macros. The same pipeline can then be deployed to all environments and configured in each one to read from and write to different locations.

For example, if a pipeline has a GCS source, you can set the path to a macro. In the development environment, you can set up the runtime arguments so that the path resolves to the development bucket. In the production environment, you can set up the runtime arguments so that the path resolves to the production bucket. Similarly, if the pipeline writes to BigQuery, you can set the dataset to a macro so that the development pipeline writes to the development dataset and the production pipeline writes to the production dataset.
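The following Python sketch illustrates the idea of resolving macros from runtime arguments. It is a simplified model, not the product's actual resolution logic: it assumes macros are written in the ${name} form, and the macro names (input_path, bq_dataset), bucket names, dataset names, and the resolve helper are hypothetical.

```python
import re

# Matches a macro of the form ${name} (assumed syntax for this sketch).
MACRO = re.compile(r"\$\{([^}]+)\}")


def resolve(value, runtime_args):
    """Replace each ${key} in value with runtime_args[key]."""
    def lookup(match):
        key = match.group(1)
        if key not in runtime_args:
            raise KeyError(f"no runtime argument for macro '{key}'")
        return runtime_args[key]
    return MACRO.sub(lookup, value)


# Plugin properties as they might appear in a single pipeline definition.
properties = {"path": "${input_path}", "dataset": "${bq_dataset}"}

# Hypothetical runtime arguments, one set per environment.
dev_args = {"input_path": "gs://dev-bucket/input", "bq_dataset": "dev_dataset"}
prod_args = {"input_path": "gs://prod-bucket/input", "bq_dataset": "prod_dataset"}

for env, args in (("development", dev_args), ("production", prod_args)):
    resolved = {name: resolve(value, args) for name, value in properties.items()}
    print(env, resolved)
```

The pipeline definition stays identical across environments; only the runtime arguments differ, so nothing in the exported JSON needs to be edited during migration.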

Note: To export all deployed data pipelines for all namespaces, see Lifecycle Microservices, “Export All Application Details”. To export all data pipelines in Draft mode, see Pipeline Microservices, “List Draft Pipelines”.

Created in 2020 by Google Inc.