Metadata and Lineage Commands

The CLI includes the following metadata and lineage commands:

Command

Description

add metadata-properties <entity> <properties>

Adds metadata properties for an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.
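The <entity-type>:<entity-id> format described above can be sketched as a small Python helper. This is only an illustration and not part of the CLI: the function name entity_arg, the PROGRAM_TYPES and PART_COUNTS tables, and the sample names (default, purchases, PurchaseApp, PurchaseFlow) are all hypothetical.

```python
# Hypothetical helper (not part of the CLI): composes the <entity> argument
# from its parts, following the formats listed in the description above.
PROGRAM_TYPES = {"mapreduce", "service", "spark", "worker", "workflow"}

# Number of dot-separated parts expected after the namespace, per entity type.
PART_COUNTS = {
    "artifact": 2,     # <artifact-name>.<artifact-version>
    "application": 2,  # <app-name>.<app-version>
    "dataset": 1,      # <dataset-name>
    "program": 3,      # <app-name>.<program-type>.<program-name>
}


def entity_arg(entity_type, namespace, *parts):
    """Return '<entity-type>:<entity-id>' for one of the four entity types."""
    if entity_type not in PART_COUNTS:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    if len(parts) != PART_COUNTS[entity_type]:
        raise ValueError(
            f"{entity_type} needs {PART_COUNTS[entity_type]} part(s) after the namespace"
        )
    if entity_type == "program" and parts[1] not in PROGRAM_TYPES:
        raise ValueError(f"unknown program type: {parts[1]!r}")
    return f"{entity_type}:{'.'.join((namespace, *parts))}"


if __name__ == "__main__":
    print(entity_arg("dataset", "default", "purchases"))
    print(entity_arg("program", "default", "PurchaseApp", "workflow", "PurchaseFlow"))
```

The returned string is what would be passed as <entity> to commands such as add metadata-properties or get metadata.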

add metadata-tags <entity> <tags>

Adds metadata tags for an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.

get lineage dataset <dataset-name> [start <start>] [end <end>] [levels <levels>]

DEPRECATED. Gets the lineage of a dataset.

get metadata <entity> [scope <scope>]

Gets the metadata of an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.
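The custom-entity form can be illustrated in the same way. The sketch below assumes a hypothetical helper, custom_entity_arg, that takes the hierarchy as an ordered list of key-value pairs and appends type=<type> only when the last key is not already the entity type, as described above. The sample values (default, purchases, price, my-lib, 1.0) are made up.

```python
# Hypothetical helper (not part of the CLI): composes a custom entity
# specification from ordered (key, value) pairs.
def custom_entity_arg(pairs, entity_type=None):
    """Return 'k1=v1,k2=v2,...' with ',type=<entity_type>' appended when the
    last key in the hierarchy is not the type of the entity."""
    if not pairs:
        raise ValueError("at least one key-value pair is required")
    parts = [f"{key}={value}" for key, value in pairs]
    last_key = pairs[-1][0]
    if entity_type is not None and entity_type != last_key:
        parts.append(f"type={entity_type}")
    return ",".join(parts)


if __name__ == "__main__":
    # A field in a dataset: the last key ('field') is already the type.
    print(custom_entity_arg(
        [("namespace", "default"), ("dataset", "purchases"), ("field", "price")]
    ))
    # A jar in a namespace: the last key is 'version', so the type is explicit.
    print(custom_entity_arg(
        [("namespace", "default"), ("jar", "my-lib"), ("version", "1.0")],
        entity_type="jar",
    ))
```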

get metadata-properties <entity> [scope <scope>]

Gets the metadata properties of an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.

get metadata-tags <entity> [scope <scope>]

Gets the metadata tags of an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.

remove metadata <entity>

Removes metadata for an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.

remove metadata-properties <entity>

Removes all metadata properties for an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.

remove metadata-property <entity> <property>

Removes a specific metadata property for an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.

remove metadata-tag <entity> <tag>

Removes a specific metadata tag for an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.

remove metadata-tags <entity>

Removes all metadata tags for an entity. <entity> is of the form <entity-type>:<entity-id>, where <entity-type> is one of 'artifact', 'application', 'dataset' or 'program'.

For artifacts and apps, <entity-id> is composed of the namespace, entity name, and version, such as <namespace-name>.<artifact-name>.<artifact-version> or <namespace-name>.<app-name>.<app-version>.

Note: Metadata for versioned entities is not versioned, including entities such as applications, programs, schedules, and program runs. Additions to metadata in one version are reflected in all versions.

For programs, <entity-id> includes the application name and the program type: <namespace-name>.<app-name>.<program-type>.<program-name>. <program-type> is one of mapreduce, service, spark, worker, or workflow.

For datasets, <entity-id> is the namespace and entity names, such as <namespace-name>.<dataset-name>.

Custom entities can be specified as hierarchical key-value pairs, with an optional type if the last key in the hierarchy is not the type of the entity. For example, a 'field' in a dataset can be specified as: namespace=<namespace-name>,dataset=<dataset-name>,field=<field-name>. A 'jar' in a namespace can be specified as: namespace=<namespace-name>,jar=<jar-name>,version=<version-number>,type=jar.

search metadata <search-query> [filtered by target-type <target-type>]

Searches CDAP entities based on the metadata annotated on them. The search can be restricted by adding a comma-separated list of target types: 'artifact', 'app', 'dataset', 'program', 'stream', or 'view'.
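As a rough illustration of the search syntax, the following Python sketch assembles a search metadata command line with an optional comma-separated target-type filter. The helper name search_metadata_command and the sample query are hypothetical. How the CLI expects queries containing spaces or special characters to be quoted is not covered here, so the sketch passes the query through unchanged.

```python
# Hypothetical helper (not part of the CLI): assembles the text of a
# 'search metadata' command from a query and optional target types.
TARGET_TYPES = ("artifact", "app", "dataset", "program", "stream", "view")


def search_metadata_command(query, target_types=()):
    """Return the command text, validating target types against the list above."""
    unknown = [t for t in target_types if t not in TARGET_TYPES]
    if unknown:
        raise ValueError(f"unknown target type(s): {', '.join(unknown)}")
    command = f"search metadata {query}"
    if target_types:
        # Target types are given as a comma-separated list.
        command += " filtered by target-type " + ",".join(target_types)
    return command


if __name__ == "__main__":
    print(search_metadata_command("purchases"))
    print(search_metadata_command("purchases", ["dataset", "app"]))
```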



Created in 2020 by Google Inc.