...

Note that the "body" field is generated from "HR Record" as well as from "Person Record". To distinguish the two when storing, we might need to prefix the field with the stage name. Alternatively, we can associate the name of the stage as a property with the Operation itself.
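The property-based approach can be sketched as follows. This is only an illustration: the `Operation` class, its attributes, and the `"stage"` property key are hypothetical names chosen for the example and are not taken from the platform API.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    """Hypothetical operation record; not the platform's actual class."""
    name: str
    inputs: list
    outputs: list
    # Assumption: free-form properties, with the stage name stored under "stage".
    properties: dict = field(default_factory=dict)


def qualified_outputs(op):
    """Return (stage, field) pairs so that same-named fields do not collide."""
    stage = op.properties.get("stage")
    return [(stage, out) for out in op.outputs]


# Two stages that both produce a field called "body".
hr_read = Operation("read", [], ["body"], {"stage": "HR Record"})
person_read = Operation("read", [], ["body"], {"stage": "Person Record"})
```

With the stage kept as a property, the field name itself stays unchanged ("body"), and the stage is consulted only when two fields need to be told apart.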

As additional information for the source and target datasets, we might want to show associated properties such as the file path, the regex used, etc.

...

  1. When the platform receives the LineageGraph from the app, the graph is processed before the data is stored, so that retrieval is straightforward.
  2. In the above pipeline, the "HR File Parser" stage parses the body and generates the fields "Employee_Name", "Dept_Name", "Salary", and "Start_Date". However, the actual JSON stored for the ID field only contains operation from "

 

...

  1. related to "Employee_Name" and "Dept_Name", since only these fields, and not "Salary" and "Start_Date", are involved in generating "ID".
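The pruning described above can be sketched as a backward walk over the recorded operations. This is a minimal sketch under stated assumptions, not the platform's implementation: operations are modelled as plain dicts, the operation names are hypothetical, the parser is assumed to record one operation per generated field, and each field name is assumed to be produced only once (stage qualification is ignored).

```python
def operations_for_field(operations, target):
    """Return names of operations that contribute, directly or transitively, to `target`.

    `operations` is a list of dicts in execution order, each with the keys
    "name", "inputs" and "outputs".
    """
    needed = {target}
    selected = []
    # Walk backwards so that every producer is visited after its consumers.
    for op in reversed(operations):
        if needed & set(op["outputs"]):
            selected.append(op["name"])
            needed |= set(op["inputs"])
    return list(reversed(selected))


# Hypothetical operations for the pipeline discussed above.
ops = [
    {"name": "read_body", "inputs": [], "outputs": ["body"]},
    {"name": "parse_employee_name", "inputs": ["body"], "outputs": ["Employee_Name"]},
    {"name": "parse_dept_name", "inputs": ["body"], "outputs": ["Dept_Name"]},
    {"name": "parse_salary", "inputs": ["body"], "outputs": ["Salary"]},
    {"name": "parse_start_date", "inputs": ["body"], "outputs": ["Start_Date"]},
    {"name": "generate_id", "inputs": ["Employee_Name", "Dept_Name"], "outputs": ["ID"]},
]

id_lineage = operations_for_field(ops, "ID")
```

Doing this work once at write time means a lineage lookup for a field becomes a direct read of the stored result rather than a graph traversal per request.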

Retrieval:

The following REST APIs are available:

  1. Get the list of fields in the dataset.

    GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/fields
     
    Where:
    namespace-id: namespace name
    dataset-id: dataset name
     
    Sample Response:
    [
      {
        "name": "ID",
        "properties": {
          "creation_time": 12345678,
          "last_update_time": 12345688,
          "last_modified_run": "runid_x"
        }
      },
      {
        "name": "name",
        "properties": {
          "creation_time": 12345678,
          "last_update_time": 12345688,
          "last_modified_run": "runid_x"
        }
      },
      {
        "name": "Department",
        "properties": {
          "creation_time": 12345678,
          "last_update_time": 12345688,
          "last_modified_run": "runid_x"
        }
      },
      {
        "name": "ContactDetails",
        "properties": {
          "creation_time": 12345678,
          "last_update_time": 12345688,
          "last_modified_run": "runid_x"
        }
      },
      {
        "name": "JoiningDate",
        "properties": {
          "creation_time": 12345678,
          "last_update_time": 12345688,
          "last_modified_run": "runid_x"
        }
      }
    ]
  2. Get the properties associated with the dataset.
  3. Get the lineage associated with the particular field in a dataset.
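As a rough illustration of how a client might use the first endpoint in the list above, the sketch below builds the request URL and extracts the field names from a response shaped like the sample. The base URL, the function names, and the namespace and dataset names in the usage note are assumptions made for the example; only the path template and the response shape come from this document.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: placeholder host and port; substitute the real router address.
BASE_URL = "http://localhost:11015"


def fields_url(namespace_id, dataset_id, base_url=BASE_URL):
    """Build the URL for GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/fields."""
    return "{}/v3/namespaces/{}/datasets/{}/fields".format(
        base_url.rstrip("/"),
        quote(namespace_id, safe=""),
        quote(dataset_id, safe=""),
    )


def field_names(response_body):
    """Extract the "name" of every field from a JSON response body."""
    return [entry["name"] for entry in json.loads(response_body)]


def get_field_names(namespace_id, dataset_id, base_url=BASE_URL):
    """Call the endpoint and return the field names (requires a reachable server)."""
    with urlopen(fields_url(namespace_id, dataset_id, base_url)) as resp:
        return field_names(resp.read().decode("utf-8"))
```

For example, `get_field_names("default", "employees")` would issue the request against the placeholder host, where "default" and "employees" are made-up namespace and dataset names.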