...
Code Block |
---|
POST /v3/namespaces/<namespace-id>/<entity-details>/metadata/properties
|
or, for a particular program of a specific application:
Code Block |
---|
POST /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/properties
|
or, for a particular version of an artifact:
Code Block |
---|
POST /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/properties
|
or, for a custom entity such as a field of a dataset:
Code Block |
---|
POST /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/properties
|
with the metadata properties passed in the request body as a JSON map of string-string pairs:
Code Block |
---|
{
"key1" : "value1",
"key2" : "value2",
...
}
|
New property keys will be added and existing keys will be updated. Existing keys not in the properties map will not be deleted.
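The add-or-update behavior described above can be sketched in Python. This is a minimal illustration, not a client library: the host and port in `BASE_URL`, the helper names, and the sample property values are assumptions made for the example. The request is only built, never sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11015"  # assumed host and port; adjust for your instance


def build_add_properties_request(namespace, entity_path, properties):
    """Build (but do not send) a POST request that adds metadata properties."""
    url = f"{BASE_URL}/v3/namespaces/{namespace}/{entity_path}/metadata/properties"
    body = json.dumps(properties).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def merged_properties(existing, posted):
    """Model the documented semantics: add new keys, update existing ones, delete nothing."""
    result = dict(existing)
    result.update(posted)
    return result


request = build_add_properties_request(
    "default", "datasets/purchases", {"owner": "data-team", "tier": "gold"}
)
print(request.get_method(), request.full_url)
print(merged_properties({"owner": "ops", "region": "eu"}, {"owner": "data-team", "tier": "gold"}))
```

In the second call, `owner` is updated, `tier` is added, and `region` is kept because it is not in the posted map.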
...
Code Block |
---|
GET /v3/namespaces/<namespace-id>/<entity-details>/metadata/properties[?scope=<scope>]
|
or, for a specific application:
...
Code Block |
---|
GET /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/properties[?scope=<scope>]
|
or, for a particular version of an artifact:
Code Block |
---|
GET /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/properties[?scope=<scope>]
|
or, for a custom entity such as a field of a dataset:
Code Block |
---|
GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/properties[?scope=<scope>]
|
with the metadata properties returned in the response body as a JSON map of string-string pairs (pretty-printed):
Code Block |
---|
{
"key1" : "value1",
"key2" : "value2",
...
}
|
Parameter | Description |
---|---|
namespace-id | Namespace ID. |
entity-details | Hierarchical key-value representation of the entity. |
app-id | Name of the application. |
program-type | One of |
program-id | Name of the program. |
artifact-id | Name of the artifact. |
artifact-version | Version of the artifact. |
dataset-id | Name of the dataset. |
field-name | Name of the field. |
scope | Optional scope filter. If not specified, properties in the |
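As a rough sketch of the retrieval call, the helpers below build the request path with the optional `scope` query parameter and decode a response body. The function names and the scope value `USER` are assumptions for illustration; check your instance for the scope values it accepts.

```python
import json
from urllib.parse import urlencode


def get_properties_path(namespace, entity_path, scope=None):
    """Build the path for retrieving properties, adding ?scope= only when given."""
    path = f"/v3/namespaces/{namespace}/{entity_path}/metadata/properties"
    if scope is not None:
        path += "?" + urlencode({"scope": scope})
    return path


def parse_properties(response_body):
    """Decode the JSON response body into a dict of string keys and string values."""
    properties = json.loads(response_body)
    if not all(isinstance(k, str) and isinstance(v, str) for k, v in properties.items()):
        raise ValueError("expected a map of string-string pairs")
    return properties


# "USER" is an assumed scope value used only for this example.
print(get_properties_path("default", "datasets/purchases", scope="USER"))
print(parse_properties('{"key1": "value1", "key2": "value2"}'))
```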
...
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/<entity-details>/metadata/properties
|
or, for all user metadata properties of a particular program of a specific application:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/properties
|
or, for a particular version of an artifact:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/properties
|
To delete a specific property of an entity such as an application or dataset, submit an HTTP DELETE request with the property key:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/<entity-type>/<entity-id>/metadata/properties/<key>
|
or, for a particular property of a program of a specific application:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/properties/<key>
|
or, for a particular version of an artifact:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/properties/<key>
|
or, for a custom entity such as a field of a dataset:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/properties/<key>
|
Parameter | Description |
---|---|
namespace-id | Namespace ID. |
entity-details | Hierarchical key-value representation of the entity. |
app-id | Name of the application. |
program-type | One of |
program-id | Name of the program. |
artifact-id | Name of the artifact. |
artifact-version | Version of the artifact. |
dataset-id | Name of the dataset. |
field-name | Name of the field. |
key | Metadata property key. |
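The two forms of the delete call (all properties, or a single key) differ only in the trailing path segment. A minimal sketch follows; the helper name is hypothetical, and percent-encoding the key is an assumption made so that keys containing spaces or slashes still form a valid path.

```python
from urllib.parse import quote


def delete_properties_path(namespace, entity_path, key=None):
    """Return the DELETE path for all properties, or for one property when key is given."""
    path = f"/v3/namespaces/{namespace}/{entity_path}/metadata/properties"
    if key is not None:
        # Percent-encode the key (an assumption of this sketch, not stated in the API text).
        path += "/" + quote(key, safe="")
    return path


print(delete_properties_path("default", "apps/PurchaseHistory"))
print(delete_properties_path("default", "apps/PurchaseHistory", key="data owner"))
```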
...
Code Block |
---|
POST /v3/namespaces/<namespace-id>/<entity-details>/metadata/tags
|
or, for a particular program of a specific application:
Code Block |
---|
POST /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/tags
|
or, for a particular version of an artifact:
Code Block |
---|
POST /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/tags
|
or, for a custom entity such as a field of a dataset:
Code Block |
---|
POST /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/tags
|
with the metadata tags, as a list of strings, passed in the JSON request body:
Code Block |
---|
["tag1", "tag2"]
|
Parameter | Description |
---|---|
namespace-id | Namespace ID. |
entity-details | Hierarchical key-value representation of the entity. |
app-id | Name of the application. |
program-type | One of |
program-id | Name of the program. |
artifact-id | Name of the artifact. |
artifact-version | Version of the artifact. |
dataset-id | Name of the dataset. |
field-name | Name of the field. |
...
Code Block |
---|
GET /v3/namespaces/<namespace-id>/<entity-details>/metadata/tags[?scope=<scope>]
|
or, for a particular program of a specific application:
Code Block |
---|
GET /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/tags[?scope=<scope>]
|
or, for a particular version of an artifact:
Code Block |
---|
GET /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/tags[?scope=<scope>]
|
or, for a custom entity such as a field of a dataset:
Code Block |
---|
GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/tags[?scope=<scope>]
|
with the metadata tags returned as a JSON list of strings in the response body:
Code Block |
---|
["tag1", "tag2"]
|
Parameter | Description |
---|---|
namespace-id | Namespace ID. |
entity-details | Hierarchical key-value representation of the entity. |
app-id | Name of the application. |
program-type | One of |
program-id | Name of the program. |
artifact-id | Name of the artifact. |
artifact-version | Version of the artifact. |
dataset-id | Name of the dataset. |
field-name | Name of the field. |
scope | Optional scope filter. If not specified, properties in the |
...
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/<entity-details>/metadata/tags
|
or, for all user metadata tags of a particular program of a specific application:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/tags
|
or, for a particular version of an artifact:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/tags
|
To delete a specific user metadata tag of an entity such as an application or dataset, submit an HTTP DELETE request with the tag:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/<entity-type>/<entity-id>/metadata/tags/<tag>
|
or, for a particular user metadata tag of a program of a specific application:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/tags/<tag>
|
or, for a particular version of an artifact:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/tags/<tag>
|
or, for a custom entity such as a field of a dataset:
Code Block |
---|
DELETE /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/tags/<tag>
|
Parameter | Description |
---|---|
namespace-id | Namespace ID. |
entity-details | Hierarchical key-value representation of the entity. |
app-id | Name of the application. |
program-type | One of |
program-id | Name of the program. |
artifact-id | Name of the artifact. |
artifact-version | Version of the artifact. |
dataset-id | Name of the dataset. |
field-name | Name of the field. |
tag | Metadata tag. |
...
Code Block |
---|
GET /v3/namespaces/<namespace-id>/metadata/search?query=<term>[&target=<entity-type>&target=<entity-type2>...][&<option>=<option-value>&...]
|
Parameter | Description |
---|---|
| Namespace ID. |
| Query term, as described below. Query terms are case-insensitive. |
| Restricts the search to either all or specified entity types: |
| Options for controlling cursors, limits, offsets, the inclusion of hidden and custom entities, and sorting. |
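Because `target` can be repeated and options are passed as extra query parameters, the search URL is easiest to assemble from a list of pairs. The sketch below is illustrative only: the helper name, the target values `dataset` and `app`, and the option name `limit` are assumptions, since the accepted values are not listed here.

```python
from urllib.parse import urlencode


def search_path(namespace, term, targets=(), options=None):
    """Build the metadata search path with repeated target parameters and extra options."""
    params = [("query", term)]
    params += [("target", target) for target in targets]
    # Options are appended in sorted order so the result is deterministic.
    params += sorted((options or {}).items())
    return f"/v3/namespaces/{namespace}/metadata/search?" + urlencode(params)


# Target values and the "limit" option name are assumed for this example.
print(search_path("default", "employee:record", targets=["dataset", "app"], options={"limit": 20}))
```

`urlencode` percent-encodes the colon in the query term, so `employee:record` is sent as `employee%3Arecord`.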
...
Code Block |
---|
{
"type":"record",
"name":"employee",
"fields":[
{
"name":"employeeName",
"type":"string"
},
{
"name":"departments",
"type":{
"type":"array",
"items":"long"
}
}
]
}
|
With a schema as shown above, queries such as employee:record, employeeName:string, departments, and departments:array can be issued.
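To make the pattern behind these query terms concrete, the following sketch enumerates name and name:type terms for a record schema. It only mirrors the examples given in the text; how the search index actually tokenizes schemas is not shown here, and the function name is hypothetical.

```python
def schema_query_terms(schema):
    """Return name and name:type terms for a record schema and its fields."""

    def type_name(field_type):
        # A complex type is a dict carrying its own "type"; a simple type is a string.
        return field_type["type"] if isinstance(field_type, dict) else field_type

    terms = [schema["name"], f'{schema["name"]}:{type_name(schema["type"])}']
    for field in schema.get("fields", []):
        terms.append(field["name"])
        terms.append(f'{field["name"]}:{type_name(field["type"])}')
    return terms


employee_schema = {
    "type": "record",
    "name": "employee",
    "fields": [
        {"name": "employeeName", "type": "string"},
        {"name": "departments", "type": {"type": "array", "items": "long"}},
    ],
}
print(schema_query_terms(employee_schema))
```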
...
Code Block |
---|
GET /v3/namespaces/<namespace-id>/<entity-type>/<entity-id>/lineage?start=<start-ts>&end=<end-ts>[&levels=<levels>][&collapse=<collapse>&collapse=<collapse>...]
|
where:
Parameter | Description |
---|---|
| Namespace ID. |
|
|
| Name of the |
| Starting time-stamp of lineage (inclusive), in seconds. Supports |
| Ending time-stamp of lineage (exclusive), in seconds. Supports |
| Number of levels of lineage output to return. Defaults to 10. Determines how far back the provenance of the data in the lineage chain is calculated. |
| An optional set of |
| An optional |
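A lineage request combines required `start` and `end` values with an optional `levels` value and any number of `collapse` parameters. A minimal sketch of assembling that URL follows; the helper name and the entity type `datasets` are assumptions for the example, and the collapse values are taken from the collapse-by-run, access, and component cases this page describes.

```python
from urllib.parse import urlencode


def lineage_path(namespace, entity_type, entity_id, start, end, levels=None, collapse=()):
    """Build the lineage path; levels and collapse are added only when supplied."""
    params = [("start", start), ("end", end)]
    if levels is not None:
        params.append(("levels", levels))
    params += [("collapse", value) for value in collapse]
    return f"/v3/namespaces/{namespace}/{entity_type}/{entity_id}/lineage?" + urlencode(params)


print(lineage_path("default", "datasets", "purchases", 1442863938, 1442881938,
                   levels=3, collapse=["run", "access"]))
```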
...
Code Block |
---|
{
"start": 1442863938,
"end": 1442881938,
"relations": [
{
"data": "stream.default.purchaseStream",
"program": "flows.default.PurchaseHistory.PurchaseFlow",
"access": "read",
"runs": [
"4b5d7891-60a7-11e5-a9b0-42010af01c4d"
],
"components": [
"reader"
]
},
{
"data": "dataset.default.purchases",
"program": "flows.default.PurchaseHistory.PurchaseFlow",
"access": "unknown",
"runs": [
"4b5d7891-60a7-11e5-a9b0-42010af01c4d"
],
"components": [
"collector"
]
}
],
"data": {
"dataset.default.purchases": {
"entityId": {
"dataset": "purchases",
"entity": "DATASET",
"namespace": "default"
}
},
"stream.default.purchaseStream": {
"entityId": {
"entity": "STREAM",
"namespace": "default",
"stream": "purchaseStream"
}
}
},
"programs": {
"flows.default.PurchaseHistory.PurchaseFlow": {
"entityId": {
"application": "PurchaseHistory",
"entity": "PROGRAM",
"namespace": "default",
"program": "PurchaseFlow",
"type": "Flow",
"version": "-SNAPSHOT"
}
}
}
}
|
HTTP Responses
Status Codes | Description |
---|---|
| Entity IDs of the entities with the specified metadata properties were returned as a list of strings in the body of the response |
| No entities matching the specified query were found |
...
Code Block |
---|
{
"relations": [
{
"accesses": [
"read"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525"
]
},
{
"accesses": [
"read"
],
"components": [
"reader"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525"
]
},
{
"accesses": [
"read"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"ae188ea2-0c2f-11e6-b499-561602fdb525"
]
},
{
"accesses": [
"write"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525"
]
},
{
"accesses": [
"write"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"ae188ea2-0c2f-11e6-b499-561602fdb525"
]
}
]
}
|
Collapsing the above by run would group the runs together as:
Code Block |
---|
{
"relations": [
{
"accesses": [
"read"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525",
"ae188ea2-0c2f-11e6-b499-561602fdb525"
]
},
{
"accesses": [
"read"
],
"components": [
"reader"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525"
]
},
{
"accesses": [
"write"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525",
"ae188ea2-0c2f-11e6-b499-561602fdb525"
]
}
]
}
|
Collapsing by access would produce:
Code Block |
---|
{
"relations": [
{
"accesses": [
"read",
"write"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525"
]
},
{
"accesses": [
"read",
"write"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"ae188ea2-0c2f-11e6-b499-561602fdb525"
]
},
{
"accesses": [
"read"
],
"components": [
"reader"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525"
]
}
]
}
|
Similarly, collapsing by component will generate:
Code Block |
---|
{
"relations": [
{
"accesses": [
"read"
],
"components": [
"collector",
"reader"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525"
]
},
{
"accesses": [
"read"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"ae188ea2-0c2f-11e6-b499-561602fdb525"
]
},
{
"accesses": [
"write"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"a442db61-0c2f-11e6-bc75-561602fdb525"
]
},
{
"accesses": [
"write"
],
"components": [
"collector"
],
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
"runs": [
"ae188ea2-0c2f-11e6-b499-561602fdb525"
]
}
]
}
|
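The collapse behavior in these examples can be modeled as merging relations that agree on every attribute except the collapsed one. The sketch below is one way to reproduce the groupings, not the server's implementation; the function names are hypothetical, and the order of the returned relations is not meant to match the server's.

```python
def make_relation(access, component, run):
    """Build one uncollapsed relation shaped like the sample output."""
    return {
        "accesses": [access],
        "components": [component],
        "data": "dataset.default.purchases",
        "program": "mapreduce.default.PurchaseHistory.PurchaseFlow",
        "runs": [run],
    }


def collapse_relations(relations, by):
    """Merge relations that agree on every field except the collapsed one."""
    field = {"run": "runs", "access": "accesses", "component": "components"}[by]
    others = [name for name in ("accesses", "components", "runs") if name != field]
    merged = {}
    for relation in relations:
        key = (relation["data"], relation["program"]) + tuple(
            tuple(relation[name]) for name in others
        )
        if key not in merged:
            # Copy the collapsed list so the input relations are left untouched.
            merged[key] = {**relation, field: list(relation[field])}
        else:
            for value in relation[field]:
                if value not in merged[key][field]:
                    merged[key][field].append(value)
    return list(merged.values())


RUN_A = "a442db61-0c2f-11e6-bc75-561602fdb525"
RUN_B = "ae188ea2-0c2f-11e6-b499-561602fdb525"
relations = [
    make_relation("read", "collector", RUN_A),
    make_relation("read", "reader", RUN_A),
    make_relation("read", "collector", RUN_B),
    make_relation("write", "collector", RUN_A),
    make_relation("write", "collector", RUN_B),
]
for relation in collapse_relations(relations, "run"):
    print(relation["accesses"], relation["components"], relation["runs"])
```

With these five sample relations, collapsing by run or by access yields three relations, and collapsing by component yields four.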
Rolling Up Lineage Output
...
Code Block |
---|
{
"start": 1442863938,
"end": 1442881938,
"relations": [
{
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.phase-1",
"access": "read",
"runs": [
"4b5d7891-60a7-11e5-a9b0-42010af01c4d"
],
"components": [
"reader"
]
},
{
"data": "dataset.default.purchases",
"program": "mapreduce.default.PurchaseHistory.phase-2",
"access": "unknown",
"runs": [
"7d6r7891-60a7-11e5-a9b0-42010af01c4d"
],
"components": [
"collector"
]
}
],
"data": {
"dataset.default.purchases": {
"entityId": {
"dataset": "purchases",
"entity": "DATASET",
"namespace": "default"
}
}
},
"programs": {
"mapreduce.default.PurchaseHistory.phase-1": {
"entityId": {
"application": "PurchaseHistory",
"entity": "PROGRAM",
"namespace": "default",
"program": "phase-1",
"type": "Mapreduce",
"version": "-SNAPSHOT"
}
},
"mapreduce.default.PurchaseHistory.phase-2": {
"entityId": {
"application": "PurchaseHistory",
"entity": "PROGRAM",
"namespace": "default",
"program": "phase-2",
"type": "Mapreduce",
"version": "-SNAPSHOT"
}
}
}
}
|
Rolling up the above using rollup=workflow would group the programs together as:
Code Block |
---|
{
"start": 1442863938,
"end": 1442881938,
"relations": [
{
"data": "dataset.default.purchases",
"program": "workflows.default.PurchaseHistory.DataPipelineWorkflow",
"access": [
"read",
"unknown"
],
"runs": [
"5b3ar7891-60a7-11e5-a9b0-42010af01c4d"
],
"components": [
"reader"
]
}
],
"data": {
"dataset.default.purchases": {
"entityId": {
"dataset": "purchases",
"entity": "DATASET",
"namespace": "default"
}
}
},
"programs": {
"workflows.default.PurchaseHistory.DataPipelineWorkflow": {
"entityId": {
"application": "PurchaseHistory",
"entity": "PROGRAM",
"namespace": "default",
"program": "DataPipelineWorkflow",
"type": "Workflow",
"version": "-SNAPSHOT"
}
}
}
}
|
Field Level Lineage
Fields associated with the Dataset
...
Code Block |
---|
GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/lineage/fields?start=<start-ts>&end=<end-ts>[&prefix=<prefix>]
|
where:
Parameter | Description |
---|---|
| Namespace ID. |
| Name of the |
| Starting time-stamp (inclusive), in seconds. Supports |
| Ending time-stamp (exclusive), in seconds. Supports |
| Optional |
| Optional flag. When set to true, the current fields of the dataset are included irrespective of whether they have any lineage information. |
...
Code Block |
---|
[{"name":"firstName","lineage":true},{"name":"lastName","lineage":true},{"name":"customer_id","lineage":false}]
|
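A client will usually want to separate fields that have lineage from those that do not. A short sketch, using the sample response above as input (the variable names are illustrative):

```python
import json

response_body = (
    '[{"name":"firstName","lineage":true},'
    '{"name":"lastName","lineage":true},'
    '{"name":"customer_id","lineage":false}]'
)

fields = json.loads(response_body)
# Split field names by the boolean "lineage" flag in each entry.
with_lineage = [field["name"] for field in fields if field["lineage"]]
without_lineage = [field["name"] for field in fields if not field["lineage"]]

print(with_lineage)
print(without_lineage)
```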
HTTP Responses
Status Codes | Description |
---|---|
| Fields of the dataset are returned as a list of JSON objects in the body of the response. |
| Failure to parse the time range provided. |
...
Code Block |
---|
GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/lineage/fields/<field-name>?start=<start-ts>&end=<end-ts>&direction=incoming
|
where:
Parameter | Description |
---|---|
| Namespace ID. |
| Name of the |
| Name of the |
| Starting time-stamp (inclusive), in seconds. Supports |
| Ending time-stamp (exclusive), in seconds. Supports |
|
|
...
Code Block |
---|
{
"incoming": [
{
"dataset": {
"dataset": "Customer",
"entity": "DATASET",
"namespace": "default"
},
"fields": [
"body"
]
},
{
"dataset": {
"dataset": "purchases",
"entity": "DATASET",
"namespace": "default"
},
"fields": [
"body"
]
}
]
}
|
HTTP Responses
Status Codes | Description |
---|---|
| The incoming datasets and their fields are returned in the body of the response. |
| Failure to parse the time range provided. |
...
Code Block |
---|
GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/lineage/fields/<field-name>/operations?start=<start-ts>&end=<end-ts>&direction=incoming
|
where:
Parameter | Description |
---|---|
| Namespace ID. |
| Name of the |
| Name of the |
| Starting time-stamp (inclusive), in seconds. Supports |
| Ending time-stamp (exclusive), in seconds. Supports |
|
|
...
Code Block |
---|
{
"incoming": [
{
"operations": [
{
"description": "Read files",
"inputs": {
"endPoint": {
"name": "purchases",
"namespace": "default"
}
},
"name": "File2.Read",
"outputs": {
"fields": [
"offset",
"body"
]
}
},
{
"description": "Parsed field",
"inputs": {
"fields": [
{
"name": "body",
"origin": "File2.Read"
}
]
},
"name": "CSVParser2.CSV Parse",
"outputs": {
"fields": [
"customer_id",
"item",
"price"
]
}
},
{
"description": "Read files",
"inputs": {
"endPoint": {
"name": "Customer",
"namespace": "default"
}
},
"name": "File.Read",
"outputs": {
"fields": [
"offset",
"body"
]
}
},
{
"description": "Parsed field",
"inputs": {
"fields": [
{
"name": "body",
"origin": "File.Read"
}
]
},
"name": "CSVParser.CSV Parse",
"outputs": {
"fields": [
"id",
"first_name",
"last_name"
]
}
},
{
"description": "Used as a key in a join",
"inputs": {
"fields": [
{
"name": "id",
"origin": "CSVParser.CSV Parse"
},
{
"name": "customer_id",
"origin": "CSVParser2.CSV Parse"
}
]
},
"name": "Joiner.Join",
"outputs": {
"fields": [
"id",
"customer_id"
]
}
},
{
"description": "Wrote to TPFS dataset",
"inputs": {
"fields": [
{
"name": "id_from_customer",
"origin": "Joiner.Rename id"
},
{
"name": "fname",
"origin": "Joiner.Rename CSVParser.first_name"
},
{
"name": "lname",
"origin": "Joiner.Rename CSVParser.last_name"
},
{
"name": "customer_id",
"origin": "Joiner.Join"
},
{
"name": "item",
"origin": "Joiner.Identity CSVParser2.item"
}
]
},
"name": "Parquet Time Partitioned Dataset.Write",
"outputs": {
"endPoint": {
"name": "parquet_data",
"namespace": "default"
}
}
}
],
"programs": [
{
"lastExecutedTimeInSeconds": 1532468358,
"program": {
"application": "customer_pipeline_spark",
"entity": "PROGRAM",
"namespace": "default",
"program": "DataPipelineWorkflow",
"type": "Workflow",
"version": "-SNAPSHOT"
}
}
]
}
]
}
|
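Each operation's input fields name the operation they came from in `origin`, so the list can be walked backwards to find everything upstream of a given operation. The sketch below assumes only the response shape shown above; the function name is hypothetical, and origins that do not appear as operations in the response are recorded but not followed.

```python
def upstream_operations(operations, name):
    """Return names of operations that feed `name`, directly or transitively."""
    by_name = {operation["name"]: operation for operation in operations}
    seen = []
    stack = [name]
    while stack:
        current = by_name.get(stack.pop())
        if current is None:
            continue  # origin is not present as an operation in this response
        # Operations that read from an endpoint have no "fields" under "inputs".
        for field in current["inputs"].get("fields", []):
            origin = field["origin"]
            if origin not in seen:
                seen.append(origin)
                stack.append(origin)
    return seen


operations = [
    {"name": "File.Read",
     "inputs": {"endPoint": {"name": "Customer", "namespace": "default"}},
     "outputs": {"fields": ["offset", "body"]}},
    {"name": "CSVParser.CSV Parse",
     "inputs": {"fields": [{"name": "body", "origin": "File.Read"}]},
     "outputs": {"fields": ["id", "first_name", "last_name"]}},
    {"name": "Joiner.Join",
     "inputs": {"fields": [{"name": "id", "origin": "CSVParser.CSV Parse"}]},
     "outputs": {"fields": ["id", "customer_id"]}},
]
print(upstream_operations(operations, "Joiner.Join"))
```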
HTTP Responses
Status Codes | Description |
---|---|
| The incoming operations and the programs that performed them are returned in the body of the response. |
| Failure to parse the time range provided. |
...
Code Block |
---|
POST /v3/namespaces/<namespace-id>/datasets/<dataset-id>/file/<file-name>/metadata/tags
|
In the example above, the custom entity is a single key-value pair where file is the key and <file-name> is the value.
...
Code Block |
---|
POST /v3/namespaces/<namespace-id>/jar/<jar-id>/versions/<jar-version>/metadata/tags[?type=jar]
|
In the example above, the custom entity consists of two key-value pairs. The first has key jar and value <jar-id>. The second has key versions and value <jar-version>. We pass jar as the type to specify the type of the entity, since the last key in the hierarchy is not the type in this case.
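The hierarchical key-value form of a custom entity maps directly onto path segments, with the optional `type` parameter appended when the last key is not the entity type. A minimal sketch under those assumptions follows; the helper name and the sample IDs are hypothetical.

```python
from urllib.parse import urlencode


def custom_entity_tags_path(namespace, pairs, entity_type=None):
    """Join ordered key-value pairs into the entity path and append ?type= if given."""
    hierarchy = "/".join(f"{key}/{value}" for key, value in pairs)
    path = f"/v3/namespaces/{namespace}/{hierarchy}/metadata/tags"
    if entity_type is not None:
        path += "?" + urlencode({"type": entity_type})
    return path


# A file inside a dataset: the last key ("file") is also the entity type.
print(custom_entity_tags_path("default", [("datasets", "purchases"), ("file", "part-0001")]))
# A jar version: the last key is "versions", so the type is passed explicitly.
print(custom_entity_tags_path("default", [("jar", "my-jar"), ("versions", "1.0.0")], entity_type="jar"))
```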
...