Note that the "body" field is generated from "HR Record" as well as "Person Record". To distinguish the two when storing, we might need to prefix the field name with the stage name.
- To distinguish them, we can associate the name of the stage as a property with the Operation itself.
As additional information for the source and target datasets, we might want to show the associated properties, such as the file path, the regex used, etc.
- When the platform receives the LineageGraph from the app, it processes the graph before storing the data so that retrieval is straightforward.
- In the above pipeline, the "HR File Parser" stage parses the body and generates the fields "Employee_Name", "Dept_Name", "Salary", and "Start_Date". However, the actual JSON stored for the ID field only contains operations from "
- related to "Employee_Name" and "Dept_Name", since these are the only fields involved in the "ID" generation, and not "Salary" and "Start_Date".
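The pruning described above can be sketched as a backward walk over the operations. This is only an illustration under stated assumptions: the Operation class, the operation names, the "ID Generator" stage, and the one-operation-per-parsed-field layout are hypothetical, and the operations are assumed to arrive in the order they ran. Each input is referenced as a (stage name, field name) pair, which is one way the stage name could disambiguate the two "body" fields.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# A field reference is (stage name, field name), so the "body" produced by
# "HR Record" is distinct from the "body" produced by "Person Record".
FieldRef = Tuple[str, str]


@dataclass(frozen=True)
class Operation:
    """Hypothetical stand-in for a lineage operation; not the platform's class."""
    name: str
    stage: str                    # stage name kept as a property of the operation
    inputs: Tuple[FieldRef, ...]  # fields read, qualified by their producing stage
    outputs: Tuple[str, ...]      # fields written by this stage


def operations_for_field(operations: List[Operation], target: FieldRef) -> List[Operation]:
    """Keep only the operations that contribute to `target`.

    Assumes `operations` is listed in execution order, so a single
    reverse pass is enough to reach every contributing operation.
    """
    needed: Set[FieldRef] = {target}
    relevant: List[Operation] = []
    for op in reversed(operations):
        produced = {(op.stage, out) for out in op.outputs}
        if needed & produced:
            relevant.append(op)
            needed.update(op.inputs)
    relevant.reverse()
    return relevant


# Simplified, assumed slice of the pipeline discussed above.
HR_BODY = ("HR Record", "body")
OPERATIONS = [
    Operation("read", "HR Record", (), ("body",)),
    Operation("read", "Person Record", (), ("body",)),
    Operation("parse_name", "HR File Parser", (HR_BODY,), ("Employee_Name",)),
    Operation("parse_dept", "HR File Parser", (HR_BODY,), ("Dept_Name",)),
    Operation("parse_salary", "HR File Parser", (HR_BODY,), ("Salary",)),
    Operation("parse_start_date", "HR File Parser", (HR_BODY,), ("Start_Date",)),
    Operation(
        "generate_id",
        "ID Generator",  # hypothetical stage name
        (("HR File Parser", "Employee_Name"), ("HR File Parser", "Dept_Name")),
        ("ID",),
    ),
]

id_lineage = operations_for_field(OPERATIONS, ("ID Generator", "ID"))
id_lineage_names = [(op.stage, op.name) for op in id_lineage]
```

With this sample data, the operations producing "Salary" and "Start_Date" and the "Person Record" read are dropped from the lineage stored for "ID", because nothing on the path to "ID" consumes their outputs.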
The following REST APIs are available:
- Get the list of fields in the dataset.
GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/fields

Where:
namespace-id: namespace name
dataset-id: dataset name

Sample Response:
[
  {
    "name": "ID",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  },
  {
    "name": "name",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  },
  {
    "name": "Department",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  },
  {
    "name": "ContactDetails",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  },
  {
    "name": "JoiningDate",
    "properties": {
      "creation_time": 12345678,
      "last_update_time": 12345688,
      "last_modified_run": "runid_x"
    }
  }
]
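A client for the fields endpoint might look like the following minimal sketch. The helper names (fields_url, parse_fields, fetch_fields) and the base URL used in the usage note are assumptions for illustration; only the path and the response shape come from the description above. The sample payload is shortened to two of the fields.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def fields_url(base_url: str, namespace_id: str, dataset_id: str) -> str:
    """Build the URL for the 'list fields' call; path segments are percent-encoded."""
    return "{}/v3/namespaces/{}/datasets/{}/fields".format(
        base_url.rstrip("/"),
        quote(namespace_id, safe=""),
        quote(dataset_id, safe=""),
    )


def parse_fields(payload: str) -> dict:
    """Map each field name to its properties, following the sample response shape."""
    return {entry["name"]: entry["properties"] for entry in json.loads(payload)}


def fetch_fields(base_url: str, namespace_id: str, dataset_id: str) -> dict:
    """Issue the GET request and parse the body (needs a reachable server)."""
    with urlopen(fields_url(base_url, namespace_id, dataset_id)) as response:
        return parse_fields(response.read().decode("utf-8"))


# Two entries copied from the sample response, used to exercise the parser offline.
SAMPLE_RESPONSE = """
[
  {"name": "ID",
   "properties": {"creation_time": 12345678,
                  "last_update_time": 12345688,
                  "last_modified_run": "runid_x"}},
  {"name": "Department",
   "properties": {"creation_time": 12345678,
                  "last_update_time": 12345688,
                  "last_modified_run": "runid_x"}}
]
"""

sample_fields = parse_fields(SAMPLE_RESPONSE)
```

Against a running instance, the call would be something like fetch_fields("http://localhost:11015", "default", "employees"), where the host, port, namespace, and dataset name are placeholders.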
- Get the properties associated with the dataset.
- Get the lineage associated with a particular field in a dataset.