BigQuery Pushdown Pipeline Metrics

This topic outlines the new metrics CDAP collects for BigQuery Pushdown pipelines in CDAP 6.7.2 and later releases.

New metrics for BigQuery Pushdown pipelines

Using the Metrics Context supplied by the CDAP platform, we have added logic to capture additional metrics from the portions of the pipeline executed in BigQuery.

These metrics all share a common prefix:

user.pushdown.<the name of the engine>

For the BigQuery SQL Engine implementation, this results in:

user.pushdown.BigQueryPushdownEngine

These metrics are further split into two categories:

  1. Pipeline metrics are collected by the platform and report the number of stages and the number of records that get pushed down.
    <prefix>.pipeline.records.in -> # input records to pushdown stages
    <prefix>.pipeline.records.out -> # output records from pushdown stages
    <prefix>.pipeline.records.pull -> # pulled records from the engine
    <prefix>.pipeline.records.push -> # pushed records into the engine
    <prefix>.pipeline.stages.count -> # executed stages in the engine
    <prefix>.pipeline.stages.count.join -> # executed join stages in the engine
    <prefix>.pipeline.stages.count.transform -> # executed transform stages in the engine

  2. Engine metrics are collected by the engine itself and report resource utilization, cost, and other values emitted directly by the engine.
    <prefix>.engine.bytes.billed -> Bytes Billed by BigQuery
    <prefix>.engine.bytes.processed -> Bytes Processed by BigQuery
    <prefix>.engine.slot.ms -> Slot usage for BigQuery jobs
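
The naming scheme above can be illustrated with a short Python sketch that composes fully qualified metric names from the engine name and the suffixes listed in both categories. The helper names (metric_prefix, metric_names) are hypothetical and exist only for this illustration; they are not part of CDAP.

```python
# Suffixes copied from the two categories listed above.
# The helper names below are illustrative only, not a CDAP API.
PIPELINE_SUFFIXES = [
    "records.in",
    "records.out",
    "records.pull",
    "records.push",
    "stages.count",
    "stages.count.join",
    "stages.count.transform",
]
ENGINE_SUFFIXES = ["bytes.billed", "bytes.processed", "slot.ms"]


def metric_prefix(engine_name: str) -> str:
    """Return the common prefix: user.pushdown.<the name of the engine>."""
    return f"user.pushdown.{engine_name}"


def metric_names(engine_name: str) -> dict:
    """Map each category to its fully qualified metric names."""
    prefix = metric_prefix(engine_name)
    return {
        "pipeline": [f"{prefix}.pipeline.{s}" for s in PIPELINE_SUFFIXES],
        "engine": [f"{prefix}.engine.{s}" for s in ENGINE_SUFFIXES],
    }


if __name__ == "__main__":
    # Print every metric name for the BigQuery SQL Engine implementation.
    for category, names in metric_names("BigQueryPushdownEngine").items():
        for name in names:
            print(category, name)
```

Keeping the suffixes in two separate lists mirrors the pipeline and engine split described above, so a dashboard or alerting script could handle each category on its own.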

A non-exhaustive list of the metrics we collect from the BigQuery engine, as listed by a metrics search request for the pipeline:

POST v3/metrics/search?target=metric&tag=namespace:default&tag=app:the_pipeline_name

user.pushdown.BigQueryPushdownEngine.engine.bytes.billed
user.pushdown.BigQueryPushdownEngine.engine.bytes.processed
user.pushdown.BigQueryPushdownEngine.engine.slot.ms
user.pushdown.BigQueryPushdownEngine.pipeline.records.in
user.pushdown.BigQueryPushdownEngine.pipeline.records.out
user.pushdown.BigQueryPushdownEngine.pipeline.records.pull
user.pushdown.BigQueryPushdownEngine.pipeline.records.push
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count.join
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count.transform
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count.pull
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count.push
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count.spark_pull
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count.spark_push
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count.read
user.pushdown.BigQueryPushdownEngine.pipeline.stages.count.write
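
As a rough sketch, the search request shown above could be issued from Python using only the standard library. Several details here are assumptions rather than facts from this topic: the base URL (http://localhost:11015 is a placeholder), the absence of authentication, and the response being a JSON array of metric names. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode


def metrics_search_url(base_url: str, namespace: str, pipeline_name: str) -> str:
    """Build the metrics search URL for one pipeline."""
    query = urlencode(
        [
            ("target", "metric"),
            ("tag", f"namespace:{namespace}"),
            ("tag", f"app:{pipeline_name}"),
        ],
        safe=":",  # keep the colon in tag values readable, as in the request above
    )
    return f"{base_url.rstrip('/')}/v3/metrics/search?{query}"


def pushdown_metrics(names, engine_name="BigQueryPushdownEngine"):
    """Keep only the metric names that carry the pushdown prefix."""
    prefix = f"user.pushdown.{engine_name}."
    return [name for name in names if name.startswith(prefix)]


def search_pushdown_metrics(base_url: str, namespace: str, pipeline_name: str):
    """Send the POST request and filter the result.

    Assumes an unauthenticated endpoint that returns a JSON array of names.
    """
    url = metrics_search_url(base_url, namespace, pipeline_name)
    request = urllib.request.Request(url, method="POST")
    with urllib.request.urlopen(request) as response:
        return pushdown_metrics(json.load(response))


if __name__ == "__main__":
    # Placeholder base URL; only the URL is printed, no request is sent here.
    print(metrics_search_url("http://localhost:11015", "default", "the_pipeline_name"))
```

Filtering on the prefix after the search keeps the example independent of whatever other metrics the pipeline emits.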


Created in 2020 by Google Inc.