Viewing the records.updated BigQuery sink metric

After a pipeline run completes, you can use the CDAP Microservices to view the number of records that a BigQuery sink processed (inserted, updated, or upserted). Starting in versions 6.1.3 and 6.2.1, the user-defined metric records.updated reports the total number of records inserted, updated, or upserted into a BigQuery table. The metric is tracked by the label of the BigQuery sink.

To see the number of records updated for a particular pipeline run, include the namespace, pipeline name, and Run ID tags in the HTTP POST request. For more information about including tags in the metrics context, see the CDAP Microservices documentation.

Note: If you omit the additional tags (namespace, pipeline name, and Run ID), the records.updated count covers all BigQuery sinks with the same label across all pipelines in the default namespace.
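The following Python sketch shows one way to assemble the tagged query path for a single pipeline run. The function name build_records_updated_query is hypothetical and not part of CDAP. The tag names (namespace, app, run, workflow) and the example values are taken from the request format used in this document.

```python
from urllib.parse import quote


def build_records_updated_query(namespace, pipeline, run_id, sink_label):
    """Return the relative metrics query path for one pipeline run.

    Hypothetical helper: it only formats a string and does not contact CDAP.
    """
    tags = [
        ("namespace", namespace),
        ("app", pipeline),
        ("run", run_id),
        ("workflow", "DataPipelineWorkflow"),
    ]
    # Percent-encode each value so unusual characters cannot break the query string.
    tag_params = "&".join(
        f"tag={name}:{quote(value, safe='')}" for name, value in tags
    )
    metric = quote(f"user.{sink_label}.records.updated", safe="")
    return f"metrics/query?{tag_params}&metric={metric}"


query = build_records_updated_query(
    "default",
    "POS_Sales_per_Region-1_v8",
    "d1685723-cba5-11ea-a338-ce6a685813cb",
    "BigQuery55",
)
print(query)
```

Passing all four values scopes the count to one run of one pipeline, which avoids the aggregated count described in the note above.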

Before you begin

Run a pipeline and verify that its status is Succeeded. Then copy the Run ID for that pipeline run.

To locate the Run ID for a pipeline:

  1. On the Deploy Pipeline page, click Summary.

  2. In the Number of records in section, click Table.
    The Run ID for each pipeline run is listed.

  3. To copy the Run ID for the pipeline run, click RunID.
    Save the Run ID so you can add it to the HTTP POST request that returns the records.updated count.

Viewing the user-defined metric records.updated

To view the user-defined metric records.updated, complete the following steps:

  1. In the Pipeline Studio, click System Admin.

  2. From the System Admin page, click Configuration.

  3. Click Make HTTP Calls.

  4. From the drop-down list, select POST.

  5. Enter the following HTTP request:

    metrics/query?tag=namespace:<namespace>&tag=app:<pipelinename>&tag=run:<run-Id>&tag=workflow:DataPipelineWorkflow&metric=<metric-name>

  6. Replace <namespace>, <pipelinename>, and <run-Id> with the namespace, pipeline name, and Run ID for the pipeline.

  7. Replace <metric-name> with user.<BigQueryLabel>.records.updated.
    For example, to get the records.updated count for a sink labeled BigQuery55 in the default namespace, for a pipeline named POS_Sales_per_Region-1_v8 with a Spark Run ID of d1685723-cba5-11ea-a338-ce6a685813cb, enter:

    metrics/query?tag=namespace:default&tag=app:POS_Sales_per_Region-1_v8&tag=run:d1685723-cba5-11ea-a338-ce6a685813cb&tag=workflow:DataPipelineWorkflow&metric=user.BigQuery55.records.updated

  8. Click Send to view the records.updated count.

    The records.updated count is listed next to “value” in the response.
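The same request can also be sent from a script instead of the Make HTTP Calls page. The Python sketch below is illustrative only and rests on several assumptions that this page does not confirm: the base URL passed to fetch_records_updated (for example, a CDAP router endpoint ending in /v3), a JSON response in which each "value" sits under series and data, and an instance that needs no authentication header. Both function names are hypothetical, and the sample response body is made up for the offline check.

```python
import json
import urllib.request


def extract_records_updated(response_body):
    """Sum every "value" in a metrics query response.

    Assumes a {"series": [{"data": [{"value": N}]}]} layout; adjust the
    keys if your instance returns a different structure.
    """
    payload = json.loads(response_body)
    return sum(
        point["value"]
        for series in payload.get("series", [])
        for point in series.get("data", [])
    )


def fetch_records_updated(base_url, query_path):
    """POST the query to base_url (an assumed endpoint) and return the count."""
    request = urllib.request.Request(
        f"{base_url}/{query_path}", data=b"", method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return extract_records_updated(response.read().decode("utf-8"))


# Offline check against a made-up response body; the numbers are illustrative.
sample = (
    '{"series": [{"metricName": "user.BigQuery55.records.updated", '
    '"data": [{"time": 0, "value": 1200}]}]}'
)
print(extract_records_updated(sample))
```

Keeping the parsing separate from the network call lets you check the extraction logic against a saved response before pointing the script at a live instance.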

Created in 2020 by Google Inc.