...
Note that the "body" field is generated by both the "HR Record" and the "Person Record" stages. To distinguish the two when storing, we could prefix the field name with the stage name. Alternatively, we can associate the name of the stage with the Operation itself as a property.
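A minimal sketch of the second option, assuming an operation carries a free-form property map. The `Operation` class, its `properties` field, the `"stage"` key, the operation name `"read"`, and the `producers_of` helper are hypothetical names used only for illustration, not the platform's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Operation:
    """Hypothetical operation record; not the platform's real class."""
    name: str
    inputs: List[str]
    outputs: List[str]
    # Free-form properties; the "stage" key is an assumption of this sketch.
    properties: Dict[str, str] = field(default_factory=dict)


def producers_of(field_name: str, operations: List[Operation]) -> List[str]:
    """Return the stage names of all operations that output field_name."""
    return [
        op.properties.get("stage", "<unknown>")
        for op in operations
        if field_name in op.outputs
    ]


# Both stages emit a field called "body"; the stage property tells them apart.
ops = [
    Operation("read", [], ["body"], {"stage": "HR Record"}),
    Operation("read", [], ["body"], {"stage": "Person Record"}),
]
stages = producers_of("body", ops)
```

Keeping the stage as a property leaves the field name itself unchanged, so a lookup by field name still works without knowing any prefix convention.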
As additional information for the source and target datasets, we might want to show associated properties such as the file path, the regex used, etc.
...
- When the platform receives the LineageGraph from the app, it processes the graph before storing the data, so that retrieval is straightforward.
- In the above pipeline, the "HR File Parser" stage parses the body and generates the fields "Employee_Name", "Dept_Name", "Salary", and "Start_Date". However, the actual JSON stored for the ID field only contains operations from "
...
- related to "Employee_Name" and "Dept_Name", since these are the only fields involved in generating "ID"; "Salary" and "Start_Date" are not.
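The pruning described above can be sketched as a backward walk from the target field. This is only an illustration under stated assumptions: the `Op` tuple, the operation names `parse` and `generate_id`, and the stage name "ID Generator" are hypothetical, and operations are assumed to arrive in pipeline order.

```python
from typing import List, NamedTuple, Set


class Op(NamedTuple):
    """Hypothetical operation: a stage reads inputs and writes outputs."""
    name: str
    stage: str
    inputs: List[str]
    outputs: List[str]


def prune_for_field(target: str, operations: List[Op]) -> List[Op]:
    """Keep only the operations, and only those outputs, that feed target.

    Simplification: if several operations write the same field name,
    only the most recent writer is followed.
    """
    needed: Set[str] = {target}
    kept: List[Op] = []
    for op in reversed(operations):
        used = [out for out in op.outputs if out in needed]
        if not used:
            continue  # this operation does not contribute to the target
        kept.append(op._replace(outputs=used))
        needed.difference_update(used)  # these fields are now explained
        needed.update(op.inputs)        # their inputs must be explained next
    kept.reverse()  # restore pipeline order
    return kept


pipeline = [
    Op("parse", "HR File Parser", ["body"],
       ["Employee_Name", "Dept_Name", "Salary", "Start_Date"]),
    Op("generate_id", "ID Generator", ["Employee_Name", "Dept_Name"], ["ID"]),
]
pruned = prune_for_field("ID", pipeline)
```

For the "ID" field, the parse operation is kept with only "Employee_Name" and "Dept_Name" as outputs; "Salary" and "Start_Date" are dropped because nothing on the path to "ID" reads them.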
Retrieval:
The following REST APIs are available:
- Get the list of fields in the dataset.
GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/fields

Where:
namespace-id: namespace name
dataset-id: dataset name

Sample Response:
[
  {
    "name": "ID",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  },
  {
    "name": "name",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  },
  {
    "name": "Department",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  },
  {
    "name": "ContactDetails",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  },
  {
    "name": "JoiningDate",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  }
]
- Get the properties associated with the dataset.
- Get the lineage associated with the particular field in a dataset.
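As an illustration of how a client might use the first endpoint in this list (the field listing), the sketch below builds the request path and extracts field names from a response body. The helper names `fields_url` and `field_names`, the host `http://cdap.example.com`, and the namespace and dataset names are assumptions; the response shape follows the sample response given for that endpoint, shortened to two entries.

```python
import json
from typing import List
from urllib.parse import quote


def fields_url(base_url: str, namespace_id: str, dataset_id: str) -> str:
    """Build the URL for GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/fields."""
    return "{}/v3/namespaces/{}/datasets/{}/fields".format(
        base_url.rstrip("/"),
        quote(namespace_id, safe=""),
        quote(dataset_id, safe=""),
    )


def field_names(response_body: str) -> List[str]:
    """Extract the "name" of every field entry in the JSON response."""
    return [entry["name"] for entry in json.loads(response_body)]


# Shortened copy of the sample response format; no network call is made here.
sample = """[
  {"name": "ID", "properties": {"creation_time": 12345678,
   "last_update_time": 12345688, "last_modified_run": "runid_x"}},
  {"name": "Department", "properties": {"creation_time": 12345678,
   "last_update_time": 12345688, "last_modified_run": "runid_x"}}
]"""

url = fields_url("http://cdap.example.com/", "default", "employees")
names = field_names(sample)
```

The URL would be passed to any HTTP client; authentication and error handling are left out of this sketch.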