...
Metadata consists of properties (a list of key-value pairs) or tags (a list of keys). Metadata and their use are described in the Metadata and Lineage section.
The Metadata Microservices are divided into these sections:
- Metadata properties
- Metadata tags
- Searching metadata
- Viewing lineage
- Field level lineage
- Metadata for a run of a program
Metadata keys, values, and tags must conform to the CDAP alphanumeric extra extended character set, and are limited to 50 characters in length. The entire metadata object associated with a single entity is limited to 10K bytes in size.
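The length and size limits above can be checked on the client before a request is sent. The following is a minimal sketch, not CDAP's own validation: it assumes "10K bytes" means 10 * 1024 bytes, approximates the object size by the UTF-8 length of a JSON encoding, and does not check the character set. The function name `check_metadata` is hypothetical.

```python
import json

MAX_TOKEN_LENGTH = 50            # per-key, per-value, and per-tag limit stated above
MAX_METADATA_BYTES = 10 * 1024   # assumption: "10K bytes" read as 10 * 1024


def check_metadata(properties, tags):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    for key, value in properties.items():
        if len(key) > MAX_TOKEN_LENGTH:
            problems.append(f"key too long: {key!r}")
        if len(value) > MAX_TOKEN_LENGTH:
            problems.append(f"value too long for key {key!r}")
    for tag in tags:
        if len(tag) > MAX_TOKEN_LENGTH:
            problems.append(f"tag too long: {tag!r}")
    # Assumption: the size of the metadata object is approximated by the
    # UTF-8 length of a JSON encoding; the server may measure it differently.
    encoded = json.dumps({"properties": properties, "tags": sorted(tags)})
    if len(encoded.encode("utf-8")) > MAX_METADATA_BYTES:
        problems.append("metadata object exceeds the total size limit")
    return problems
```

An empty result only means the stated limits are respected under these assumptions; the server remains the authority on what it accepts.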
...
All methods or endpoints described in this API have a base URL (typically http://<host>:11015 or https://<host>:10443) that precedes the resource identifier, as described in the Microservices Conventions. These methods return a status code, as listed in the Microservices Status Codes.
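As an illustration of how the base URL precedes a resource identifier, here is a small sketch. The helper names `base_url` and `full_url` are hypothetical, the default ports are only the typical values quoted above, and the resource path in the usage note is a placeholder rather than a real endpoint.

```python
def base_url(host, use_ssl=False, port=None):
    """Build the base URL that precedes a resource identifier."""
    scheme = "https" if use_ssl else "http"
    if port is None:
        # Typical ports quoted above; a real deployment may use others.
        port = 10443 if use_ssl else 11015
    return f"{scheme}://{host}:{port}"


def full_url(host, resource, use_ssl=False, port=None):
    """Join the base URL and a resource identifier with exactly one slash."""
    return base_url(host, use_ssl, port) + "/" + resource.lstrip("/")
```

For example, `full_url("example.com", "some/resource")` yields `http://example.com:11015/some/resource`.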
Note: Datasets are deprecated and will be removed in CDAP 7.0.0.
Metadata Properties
Annotating Properties
...
Parameter | Description |
---|---|
| Namespace ID. |
| Hierarchical key-value representation of the entity. |
| Name of the application. |
| One of |
| Name of the program. |
| Name of the artifact. |
| Version of the artifact. |
| Name of the dataset. |
| Name of the field. |
HTTP Responses
Status Codes | Description |
---|---|
| The properties were set. |
Note: When using this API, properties can be added to the metadata of the specified entity only in the user scope.
...
Parameter | Description |
---|---|
| Namespace ID. |
| Hierarchical key-value representation of the entity. |
| Name of the application. |
| One of |
| Name of the program. |
| Name of the artifact. |
| Version of the artifact. |
| Name of the dataset. |
| Name of the field. |
| Optional scope filter. If not specified, properties in the |
...
Status Codes | Description |
---|---|
| The requested properties were returned as a JSON string in the body of the response. The body can be empty if there are no properties associated with the entity or if the entity does not exist. |
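Because the response body can be empty both when the entity has no properties and when the entity does not exist, a client may want to normalize it before use. The sketch below assumes that a non-empty body is a JSON object of key-value pairs; the function name `parse_properties` is hypothetical.

```python
import json


def parse_properties(body):
    """Turn a response body into a dict, treating an empty body as no properties."""
    if body is None or not body.strip():
        return {}
    parsed = json.loads(body)
    # Assumption: properties are serialized as a JSON object.
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object of key-value pairs")
    return parsed
```

An empty result therefore does not tell the caller whether the entity exists; that has to be established by other means.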
Deleting Properties
To delete all user metadata properties for an application, dataset, or other entities including custom entities, submit an HTTP DELETE request:
...
Parameter | Description |
---|---|
| Namespace ID. |
| Hierarchical key-value representation of the entity. |
| Name of the application. |
| One of |
| Name of the program. |
| Name of the artifact. |
| Version of the artifact. |
| Name of the dataset. |
| Name of the field. |
| Metadata property key. |
HTTP Responses
Status Codes | Description |
---|---|
| The method was successfully called: the properties were deleted (or, for a specific key, the key was deleted or was not present), or the entity itself was not present. |
...
Parameter | Description |
---|---|
| Namespace ID. |
| Hierarchical key-value representation of the entity. |
| Name of the application. |
| One of |
| Name of the program. |
| Name of the artifact. |
| Version of the artifact. |
| Name of the dataset. |
| Name of the field. |
HTTP Responses
Status Codes | Description |
---|---|
| The tags were set. |
Note: When using this API, tags can be added to the metadata of the specified entity only in the user scope.
...
Parameter | Description |
---|---|
| Namespace ID. |
| Hierarchical key-value representation of the entity. |
| Name of the application. |
| One of |
| Name of the program. |
| Name of the artifact. |
| Version of the artifact. |
| Name of the dataset. |
| Name of the field. |
| Optional scope filter. If not specified, properties in the |
...
Status Codes | Description |
---|---|
| The requested tags were returned as a JSON string in the body of the response. The body can be empty if there are no tags associated with the entity or if the entity does not exist. |
Removing Tags
To delete all user metadata tags for an application, dataset, or other entities including custom entities, submit an HTTP DELETE request:
...
Parameter | Description |
---|---|
| Namespace ID. |
| Hierarchical key-value representation of the entity. |
| Name of the application. |
| One of |
| Name of the program. |
| Name of the artifact. |
| Version of the artifact. |
| Name of the dataset. |
| Name of the field. |
| Metadata tag. |
HTTP Responses
Status Codes | Description |
---|---|
| The method was successfully called: the tags were deleted (or, for a specific tag, the tag was deleted or was not present), or the entity itself was not present. |
Note: When using this API, only tags in the user scope can be deleted.
...
Parameter | Description |
---|---|
| Namespace ID. |
| Query term, as described below. Query terms are case-insensitive. |
| Restricts the search to either all or specified entity types: |
| Options for controlling cursors, limits, offsets, the inclusion of hidden and custom entities, and sorting. Format for an option: |
...
Status Codes | Description |
---|---|
| Entity ID and metadata of entities that match the query and entity type(s) are returned in the body of the response. |
Query Terms
CDAP supports prefix-based search of metadata properties and tags across both user and system scopes. Search metadata of entities by using either a complete name, or a partial name followed by an asterisk (*).
Search for properties and tags by specifying one of:
- Complete property key-value pair, separated by a colon, such as type:production
- Complete property key with a partial value, such as type:prod*
- Complete tags key with a complete or partial value, such as tags:production or tags:prod*, to search for tags only
- Complete or partial value, such as prod*; this will return both properties and tags
- Multiple search terms separated by a space, such as type:prod* author:joe; this will return entities having any of the terms in their metadata
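The query forms listed above can be assembled with a few small helpers. This is a sketch only: the helper names are hypothetical, and percent-encoding the query before placing it in a URL is standard HTTP practice rather than something this section specifies.

```python
from urllib.parse import quote


def property_term(key, value, prefix=False):
    """Build a key:value term; prefix=True appends * for a partial value."""
    return f"{key}:{value}{'*' if prefix else ''}"


def tag_term(value, prefix=False):
    """Build a term that searches tags only."""
    return property_term("tags", value, prefix)


def value_term(value, prefix=False):
    """Build a term that matches both properties and tags."""
    return f"{value}{'*' if prefix else ''}"


def combine(*terms):
    """Join terms with spaces; entities matching any of the terms are returned."""
    return " ".join(terms)


def encode(query):
    """Percent-encode a query for use inside a URL."""
    return quote(query, safe="")
```

For example, `combine(property_term("type", "prod", prefix=True), property_term("author", "joe"))` produces the string `type:prod* author:joe` from the list above.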
Because CDAP also annotates entities with system metadata by default, as described in System Metadata, the following special search queries are also supported:
- Artifacts or applications containing a specific plugin: plugin:<plugin-name>
- Programs with a specific mode: batch or realtime
- Applications with a specific program type: service:<service-name>, mapreduce:<mapreduce-name>, spark:<spark-name>, worker:<worker-name>, workflow:<workflow-name>
- Datasets or views with schema field:
  - field name only: <field-name>
  - field name with a type: <field-name>:<field-type>, where field-type can be:
    - simple types: int, long, boolean, float, double, bytes, string, enum
    - complex types: array, map, record, union
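A schema-field query can be built and checked against the field types listed above before it is sent. This is a minimal sketch; the function name `schema_field_term` is hypothetical, and lowercasing the type relies on query terms being case-insensitive.

```python
SIMPLE_TYPES = {"int", "long", "boolean", "float", "double", "bytes", "string", "enum"}
COMPLEX_TYPES = {"array", "map", "record", "union"}


def schema_field_term(field_name, field_type=None):
    """Build a query for a schema field, optionally restricted to a type."""
    if field_type is None:
        return field_name
    normalized = field_type.lower()
    if normalized not in SIMPLE_TYPES | COMPLEX_TYPES:
        raise ValueError(f"unsupported field type: {field_type!r}")
    return f"{field_name}:{normalized}"
```

Rejecting an unknown type locally avoids sending a query that cannot match any of the listed schema types.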
...
To view the lineage of a dataset, submit an HTTP GET request:
...
Parameter | Description |
---|---|
| Namespace ID. |
|
|
| Name of the |
| Starting time-stamp of lineage (inclusive), in seconds. Supports |
| Ending time-stamp of lineage (exclusive), in seconds. Supports |
| Number of levels of lineage output to return. Defaults to 10. Determines how far back the provenance of the data in the lineage chain is calculated. |
| An optional set of |
| An optional |
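The start and end parameters describe a half-open time range: the start is inclusive and the end is exclusive. The sketch below assumes the values are Unix epoch seconds; the function names are hypothetical, and the request parameter names are not shown because they are not given here.

```python
import time


def lineage_window(hours_back, now=None):
    """Return (start, end) in epoch seconds covering the last hours_back hours."""
    end = int(time.time()) if now is None else int(now)
    start = end - hours_back * 3600
    return start, end


def in_window(timestamp, start, end):
    """Start is inclusive and end is exclusive, as in the table above."""
    return start <= timestamp < end
```

With a half-open range, consecutive windows can share a boundary without counting the same second twice.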
...
For more information about collapsing lineage output, see the Collapsing Lineage Output section below.
...
Parameter | Description |
---|---|
| Namespace ID. |
| Name of the |
| Starting time-stamp (inclusive), in seconds. Supports |
| Ending time-stamp (exclusive), in seconds. Supports |
| Optional |
| Optional flag. When set to true, the current fields of the dataset are included irrespective of whether they have any lineage information. |
...
Status Codes | Description |
---|---|
| Fields of dataset are returned as a list of strings in the body of the response. |
| Failure to parse the time range provided. |
Field Lineage Summary
Gets the field lineage summary for a specified field of a dataset. The field lineage summary consists of the sets of datasets and their respective fields used to compute the specified field of a dataset:
...
Parameter | Description |
---|---|
| Namespace ID. |
| Name of the |
| Name of the |
| Starting time-stamp (inclusive), in seconds. Supports |
| Ending time-stamp (exclusive), in seconds. Supports |
|
|
...
Status Codes | Description |
---|---|
| Fields of dataset are returned as a list of strings in the body of the response. |
| Failure to parse the time range provided. |
Field Lineage Operations
Gets the details of operations responsible for computation of a specified field of a dataset for a specified range of time:
...
Parameter | Description |
---|---|
| Namespace ID. |
| Name of the |
| Name of the |
| Starting time-stamp (inclusive), in seconds. Supports |
| Ending time-stamp (exclusive), in seconds. Supports |
|
|
...
Status Codes | Description |
---|---|
| Fields of dataset are returned as a list of strings in the body of the response. |
| Failure to parse the time range provided. |
Metadata for Custom Entities
...
Custom entities are represented as hierarchical key-value pairs and can optionally have an explicitly defined type.
If a type is not specified, the last key in the hierarchy is treated as the type.
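The type rule above can be sketched as follows. The representation of the hierarchy as an ordered list of (key, value) pairs, the function name `entity_type`, and the example keys are all assumptions made for illustration.

```python
def entity_type(hierarchy, explicit_type=None):
    """Resolve the type of a custom entity.

    hierarchy is an ordered list of (key, value) pairs, outermost first.
    """
    if not hierarchy:
        raise ValueError("a custom entity needs at least one key-value pair")
    if explicit_type is not None:
        return explicit_type
    # No explicit type: fall back to the last key in the hierarchy.
    last_key, _ = hierarchy[-1]
    return last_key
```

For a hypothetical hierarchy `[("dataset", "orders"), ("field", "price")]`, the resolved type is `field` unless another type is passed explicitly.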
...