Using Triggers

Note: You can create triggers for batch data pipelines. You cannot create triggers for realtime pipelines.

In the Pipeline Studio, you can create a trigger on a batch data pipeline so that it runs when one or more other pipeline runs complete. A pipeline whose completion starts another pipeline is called an upstream pipeline, and the pipeline that runs in response is called a downstream pipeline. You create the trigger on the downstream pipeline so that it runs based on the completion of one or more upstream pipelines.

Triggers are useful when you want to:

  • Clean your data once and make it available to multiple downstream pipelines for consumption. 

  • Share information such as runtime arguments and plugin configurations between pipelines. This is called payload configuration.

  • Have a set of dynamic pipelines that run using the data of the hour, day, week, or month, instead of a static pipeline that needs to be updated for every run.

For example, suppose you have a dataset that contains all of the information about your company’s shipments, and you have several business questions that you want answered based on this data. You create one pipeline, called Shipments Data Cleansing, that cleanses the raw shipment data. Then you create a second pipeline, called Delayed Shipments USA, that reads the cleansed data and finds the shipments within the USA that were delayed by more than a specified threshold. The Delayed Shipments USA pipeline can be triggered as soon as the upstream Shipments Data Cleansing pipeline completes successfully.

Because the downstream pipeline consumes the output of the upstream pipeline, you also want the trigger to give the downstream pipeline the input directory to read from, which is the directory where the upstream pipeline wrote its output. This is called passing payload configuration, and you define it with runtime arguments. Payload configuration lets you build a set of dynamic pipelines that run using the data of the hour, day, week, or month, as opposed to a static pipeline that needs to be updated for every run.
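The idea of a directory that changes with every run can be sketched in a few lines of Python. This is only an illustration, not CDAP code: the function names, the base path, and the input.dir argument key are all hypothetical, and weekly partitions are left out for brevity.

```python
from datetime import datetime, timezone

# Hypothetical directory layouts for each run granularity.
PATTERNS = {
    "hour": "%Y/%m/%d/%H",
    "day": "%Y/%m/%d",
    "month": "%Y/%m",
}


def output_dir_for_run(base_dir: str, run_time: datetime, granularity: str = "day") -> str:
    """Build the time-partitioned directory that one upstream run writes to."""
    if granularity not in PATTERNS:
        raise ValueError(f"unsupported granularity: {granularity}")
    return f"{base_dir.rstrip('/')}/{run_time.strftime(PATTERNS[granularity])}"


def downstream_runtime_args(upstream_output_dir: str) -> dict:
    """Payload handed to the downstream run (the key name is an assumption)."""
    return {"input.dir": upstream_output_dir}


run_time = datetime(2020, 3, 14, 9, 0, tzinfo=timezone.utc)
out_dir = output_dir_for_run("/data/shipments/cleansed", run_time)
payload = downstream_runtime_args(out_dir)
print(payload)
```

Because the directory is derived from the run time and passed along as a runtime argument, neither pipeline needs to be edited between runs.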

Note: You can also use the Schedule Lifecycle Microservices to create inbound triggers.

Before you begin

In the Pipeline Studio, deploy the pipelines that are your upstream and downstream pipelines.

Optional: Set runtime arguments for your upstream pipeline

If you want to pass payload configuration as runtime arguments, set the runtime arguments for your upstream pipeline:

  1. Go to the List page. In the Deployed tab, click the name of the upstream pipeline. The Deploy view for that pipeline appears.

  2. Click the arrow to the right of the Run button.

  3. Click the + button and fill in the Key and Value for your runtime argument.

  4. Click Save.

Create an inbound trigger on a downstream pipeline

Starting in CDAP 6.8.0, you can create OR and AND triggers. An OR trigger runs the downstream pipeline when the selected event (Succeeds, Stops, or Fails) occurs for any one of the upstream pipelines. An AND trigger runs the downstream pipeline only when the selected events (Succeeds, Stops, or Fails) have occurred for all of the upstream pipelines.

In CDAP 6.7.x and earlier, you can create only OR triggers.  

When you upgrade to CDAP 6.8.x, all existing triggers are set as OR triggers.
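The difference between the two trigger types can be illustrated with a small Python model. This is a simplified sketch, not how CDAP implements triggers: it ignores when events are consumed or reset, and the should_run function and the Orders Data Cleansing pipeline name are hypothetical.

```python
from enum import Enum


class Status(Enum):
    SUCCEEDS = "Succeeds"
    STOPS = "Stops"
    FAILS = "Fails"


def should_run(trigger_type: str, watched: dict, observed: dict) -> bool:
    """Decide whether the downstream pipeline should start.

    watched maps each upstream pipeline name to the set of events selected
    for it; observed maps pipeline names to the latest event seen so far.
    """
    # A pipeline with no observed event yields None, which never matches.
    matches = [observed.get(name) in events for name, events in watched.items()]
    if trigger_type == "OR":
        return any(matches)
    if trigger_type == "AND":
        # Require at least one watched pipeline so an empty trigger never fires.
        return bool(matches) and all(matches)
    raise ValueError(f"unknown trigger type: {trigger_type}")


watched = {
    "Shipments Data Cleansing": {Status.SUCCEEDS},
    "Orders Data Cleansing": {Status.SUCCEEDS, Status.FAILS},
}
observed = {"Shipments Data Cleansing": Status.SUCCEEDS}

print(should_run("OR", watched, observed))
print(should_run("AND", watched, observed))
```

With only one of the two upstream pipelines finished, the OR trigger fires and the AND trigger keeps waiting until the second pipeline reports one of its selected events.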

To create an inbound trigger on a downstream pipeline, follow these steps:

  1. Deploy both the upstream and downstream pipelines.

  2. From the List > Deployed Pipeline page, click the name of the downstream pipeline. The pipeline opens in Deploy mode.

  3. On the left side of the page, click Inbound triggers. A list of available pipelines displays.

  4. In the View pipelines in namespace field, select the namespace for the upstream pipelines.

  5. In the Trigger Type field, select Trigger on any selected event (OR) or Wait for all events (AND).

  6. Required. In the Trigger Name field, enter a unique name for the trigger.

  7. To select the upstream pipeline, click the arrow to the left of the pipeline name. You can also search for pipelines to use in the trigger.

    Note: Do not select the checkbox to the right of the pipeline yet. Selecting it locks the trigger configuration.

  8. Select the events that trigger the downstream pipeline to run. You can trigger the downstream pipeline when the upstream pipeline Succeeds, Stops, or Fails, and you can select more than one event.

  9. If you do not want to pass any payload configuration between these pipelines, click the checkbox to the right of the pipeline name and click Enable Trigger.

Note: To pass payload configuration between these pipelines, see “Passing payload configuration as runtime arguments”.

Passing payload configuration as runtime arguments

Payload configuration lets you pass relevant information such as the output directory, the format of data, and the day for which it was run from the upstream pipeline into the downstream pipeline. The downstream pipeline can then use this information to make decisions such as picking the right dataset to read from.

You can pass payload information from the upstream pipeline to the downstream pipeline by setting the runtime arguments of the downstream pipeline using the values of either the runtime arguments or the configuration of any plugin in the upstream pipeline.

The Pipeline Studio ensures that every time the downstream pipeline is triggered, its payload is set using the runtime arguments of the particular run of the upstream pipeline that triggered the downstream pipeline.
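As a rough mental model, payload configuration is a set of mappings from upstream values to downstream runtime arguments. The Python sketch below shows that mapping under stated assumptions: the mapping structure, the function name, the stage name Cleansed Sink, and all keys are hypothetical and do not reflect how CDAP stores trigger configuration.

```python
def build_downstream_args(mappings, upstream_runtime_args, upstream_plugin_props):
    """Resolve each mapping against the values of one specific upstream run."""
    args = {}
    for mapping in mappings:
        if mapping["source"] == "runtime_arg":
            value = upstream_runtime_args[mapping["key"]]
        elif mapping["source"] == "plugin_property":
            # Plugin properties are looked up per stage, then per property name.
            value = upstream_plugin_props[mapping["stage"]][mapping["key"]]
        else:
            raise ValueError(f"unknown source: {mapping['source']}")
        args[mapping["target"]] = value
    return args


# Values from the upstream run that fired the trigger (illustrative only).
upstream_runtime_args = {"output.dir": "/data/shipments/cleansed/2020/03/14"}
upstream_plugin_props = {"Cleansed Sink": {"format": "avro"}}

mappings = [
    {"source": "runtime_arg", "key": "output.dir", "target": "input.dir"},
    {"source": "plugin_property", "stage": "Cleansed Sink", "key": "format",
     "target": "input.format"},
]

downstream_args = build_downstream_args(
    mappings, upstream_runtime_args, upstream_plugin_props
)
print(downstream_args)
```

The mappings stay fixed once the trigger is enabled, while the resolved values change with each upstream run.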

To add payload configuration to a trigger, follow these steps:

  1. Click Trigger Config to see runtime arguments and plugin configuration that you can pass to the downstream pipeline when this trigger executes.

  2. In the Runtime arguments section, use the dropdown lists to select the runtime arguments from the upstream pipeline to pass to the downstream pipeline.

  3. Click Plugin Config to see a list of properties from all the plugins you have in your upstream pipeline that you can pass along with the trigger to the downstream pipeline.

  4. After you have all the runtime arguments and plugin configuration set to include with the trigger, click Select.

  5. To enable the trigger, click Enable The Trigger.

    Note: Enable The Trigger is disabled if the trigger configuration is incomplete.

    After you enable the trigger, the count in the Inbound triggers indicator increases.

  6. To view the configuration you have set for this trigger, click View Payload.

  7. Optional. To add more triggers, click Add New Trigger.

Testing a trigger

To test a trigger, follow these steps:

  1. From the Pipeline List > Deploy page, click the name of the upstream pipeline.

  2. Click the Run button.

  3. Wait for the upstream pipeline to complete, and then navigate to the downstream pipeline. You should see that a run of the downstream pipeline was triggered.

Navigating triggers

You can navigate from an upstream pipeline to the pipelines it triggers by clicking the Outbound triggers button on the details page of the upstream pipeline.

Likewise, you can navigate from a downstream pipeline to the pipelines that trigger it by clicking the Inbound triggers button on the Pipeline Deploy page of the downstream pipeline.

You can use this feature to navigate through a set of interconnected pipelines.

Created in 2020 by Google Inc.