...

  1. Properties associated with the Dataset. For example: the file path and directory name associated with the "HR File", or the Broker Id, Topic name, etc. associated with the Kafka plugin. This will be a single row per dataset per namespace per run of the pipeline.
  2. Fields associated with the Dataset. This will be a single row per dataset per namespace per run of the pipeline. We will store each field as a separate column in this row. The value of the column can hold additional properties such as creation time, last update time, the run ID responsible for the last update, etc.
  3. Lineage information associated with each field of the target dataset. For each field belonging to each target dataset, there will be a single row per run, which will contain the entire lineage graph.
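
The three kinds of rows listed above can be sketched as follows. This is only an illustration of the row granularity, not the actual store implementation: the `row_key` format, the `p`/`f`/`l` prefixes, the `graph` column name, and the representation of the lineage graph as a list are all assumptions made for the example.

```python
def row_key(kind, namespace, dataset, run_id, field_name=None):
    """Build a hypothetical row key; the real key format is not specified here."""
    parts = [kind, namespace, dataset, run_id]
    if field_name is not None:
        parts.append(field_name)
    return ":".join(parts)


def build_rows(namespace, dataset, run_id, properties, fields, lineage):
    """Return {row_key: {column: value}} for one run writing to one dataset."""
    rows = {}
    # 1. One properties row per dataset per namespace per run.
    rows[row_key("p", namespace, dataset, run_id)] = dict(properties)
    # 2. One fields row per dataset per namespace per run;
    #    each field is its own column, and the value holds field metadata.
    rows[row_key("f", namespace, dataset, run_id)] = {
        name: dict(meta) for name, meta in fields.items()
    }
    # 3. One lineage row per field per run, holding that field's lineage graph.
    for name, graph in lineage.items():
        rows[row_key("l", namespace, dataset, run_id, name)] = {"graph": graph}
    return rows
```

Under these assumptions, a run that writes a dataset with N fields produces N + 2 rows: one properties row, one fields row, and one lineage row per field.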

Example: After one run of the pipeline shown above, the store will contain the following sample data.

...