Previewing data

You can preview data to debug issues before you deploy and run a pipeline. If you encounter any errors, you can easily fix them while in Draft mode.

The Pipeline Studio uses the first 100 rows of your source dataset to generate the preview. 

In Preview mode, the Pipeline Studio displays the status and duration of the Preview job. You can stop the job at any time, and you can monitor its log events as it runs.

Note: Actions and Post-run actions are ignored when running preview.

How to preview data


To preview data, follow these steps:

  1. Make sure that no source, transformation, or sink has any errors.

  2. Click Preview.
     Notice that the task bar changes and there are three new buttons: Run, Duration, and Logs.

  3. Optionally, click Configure to change the following settings before running Preview:

       • Runtime arguments.

       • Preview config. You can change the number of rows to preview.

       • Advanced options. You can change Pipeline config and Engine config.

  4. Click Run to start the preview job.
     While the pipeline is running in preview mode, no data is actually written to the sink, but you will be able to confirm that data is being read properly and that it will be written as expected once the pipeline is deployed.

After you run the data preview, you can click Preview Data on any node that handles data, such as a source, transformation, or sink, to see what your data looks like at that stage of the pipeline.
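As a rough mental model only, the behavior described above can be sketched in a few lines of Python. This is not the Pipeline Studio implementation or any of its APIs: the function run_preview, the stage names, and the sample records are all hypothetical. The sketch reads a limited number of rows from the source, passes them through each transformation, keeps a copy of the records at every stage, and never writes to a sink.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]


def run_preview(
    source: Iterable[Record],
    transforms: List[Tuple[str, Callable[[Record], Record]]],
    limit: int = 100,
) -> Dict[str, List[Record]]:
    """Hypothetical preview run: returns the records seen at each stage."""
    stage_records: Dict[str, List[Record]] = {}

    # Read only the first `limit` rows from the source.
    records = list(islice(source, limit))
    stage_records["source"] = records

    # Apply each transformation in order and keep its output for inspection.
    for name, transform in transforms:
        records = [transform(record) for record in records]
        stage_records[name] = records

    # Nothing is written anywhere; only record what the sink would receive.
    stage_records["sink"] = records
    return stage_records


# Illustrative data: 1,000 source rows, of which only the first 100 are previewed.
rows = ({"id": i, "name": f"user{i}"} for i in range(1000))
preview = run_preview(
    rows,
    [("uppercase_name", lambda r: {**r, "name": str(r["name"]).upper()})],
)
print(len(preview["source"]))
print(preview["uppercase_name"][0])
```

Inspecting preview["source"], preview["uppercase_name"], or preview["sink"] in this sketch plays the same role as clicking Preview Data on the corresponding node.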

The Preview button is a toggle, so be sure to click it again to get out of preview mode when you’re finished.

 

Created in 2020 by Google Inc.