Discovery and Lineage


Metadata can be used to tag CDAP components so that they are easy to identify and manage, which in turn makes them easier to discover.

For example, you can tag a dataset as experimental or an application as production. These entities can then be discovered with search queries against the annotated metadata.

Using search, you can discover entities:

  • that have a particular value for any key in their properties;

  • that have a particular key with a particular value in their properties; or

  • that have a particular tag.
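The three kinds of search listed above can be illustrated with a small in-memory sketch. It is not how CDAP stores or indexes metadata; the `Entity` class, the helper functions, and the sample entities are all hypothetical and only show what each kind of match means.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical stand-in for a CDAP entity and its metadata."""
    name: str
    properties: dict = field(default_factory=dict)
    tags: set = field(default_factory=set)


def search_by_value(entities, value):
    """Entities that have `value` under any key in their properties."""
    return [e.name for e in entities if value in e.properties.values()]


def search_by_key_value(entities, key, value):
    """Entities that have `key` set to exactly `value` in their properties."""
    return [e.name for e in entities if e.properties.get(key) == value]


def search_by_tag(entities, tag):
    """Entities that carry `tag`."""
    return [e.name for e in entities if tag in e.tags]


# Sample data, invented for illustration only.
ENTITIES = [
    Entity("purchases", {"owner": "analytics", "stage": "experimental"}, {"experimental"}),
    Entity("PurchaseApp", {"team": "analytics"}, {"production"}),
]
```

Here a search for the value `analytics` matches both entities, because the key is ignored; a search for `owner` set to `analytics` matches only `purchases`; and a search for the tag `production` matches only `PurchaseApp`.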

You can also search for a dataset that has a "field with the given name" or a "field with the given name and the given type".

To search metadata, you can use the Metadata Microservices.
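As a rough sketch of what such a request might look like, the helper below only builds a search URL and sends nothing. The path layout (`/v3/namespaces/<namespace>/metadata/search`), the `query` and `target` parameter names, and the example host and port are assumptions made for illustration; check them against the Metadata Microservices reference for your CDAP version. `build_search_url` itself is a hypothetical helper, not part of CDAP.

```python
from urllib.parse import quote, urlencode


def build_search_url(base_url, namespace, query, target=None):
    """Build a metadata search URL.

    Assumed (not confirmed by this document): the endpoint path and the
    `query` / `target` parameter names.
    """
    params = {"query": query}
    if target is not None:
        # Optionally restrict the search to one entity type, e.g. "dataset".
        params["target"] = target
    path = f"/v3/namespaces/{quote(namespace, safe='')}/metadata/search"
    return f"{base_url}{path}?{urlencode(params)}"


# Example: look for datasets annotated as experimental in a namespace
# called "default" (host and port are placeholders).
url = build_search_url("http://localhost:11015", "default", "experimental", "dataset")
```

Building the URL separately from sending it keeps the sketch free of network calls; the resulting string can be passed to any HTTP client.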


Lineage can be retrieved for dataset entities. A lineage shows, for a specified time range, all data accesses of the entity and details of where each access originated.

For datasets, lineage can indicate whether an access was for reading, writing, or both, provided that the methods in the dataset have the appropriate annotations. If annotations are absent, lineage can only indicate that an access took place, not whether it was for reading or writing.
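The behavior described above can be modeled with a minimal sketch. It does not reflect CDAP's internal lineage store or its response format; the `Access` record, the `lineage` function, and the sample program names are hypothetical. An access recorded without annotations is represented here with the kind `"unknown"`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """Hypothetical record of one dataset access."""
    dataset: str
    program: str    # where the access originated
    timestamp: int  # seconds since epoch
    kind: str       # "read", "write", or "unknown" if annotations are absent


def lineage(accesses, dataset, start, end):
    """Summarize, per originating program, how `dataset` was accessed
    in the time range [start, end)."""
    kinds = {}
    for a in accesses:
        if a.dataset == dataset and start <= a.timestamp < end:
            kinds.setdefault(a.program, set()).add(a.kind)

    summary = {}
    for program, seen in kinds.items():
        if {"read", "write"} <= seen:
            summary[program] = "both"
        elif "read" in seen:
            summary[program] = "read"
        elif "write" in seen:
            summary[program] = "write"
        else:
            # Only un-annotated accesses: we know an access happened,
            # but not whether it was a read or a write.
            summary[program] = "unknown"
    return summary


# Sample data, invented for illustration only.
ACCESSES = [
    Access("purchases", "PurchaseFlow", 100, "write"),
    Access("purchases", "PurchaseQuery", 150, "read"),
    Access("purchases", "PurchaseFlow", 200, "read"),
    Access("purchases", "LegacyJob", 250, "unknown"),
    Access("purchases", "PurchaseQuery", 900, "write"),  # outside the range below
    Access("history", "PurchaseFlow", 120, "write"),     # different dataset
]

result = lineage(ACCESSES, "purchases", 0, 500)
```

With this sample data, `result` reports `PurchaseFlow` as `both`, `PurchaseQuery` as `read`, and `LegacyJob` as `unknown`, mirroring the distinction between annotated and un-annotated access.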