| Row Key | Column Key | Value | Note |
| --- | --- | --- | --- |
| MyNamespace:HRFile:<runidX-inverted-start-time>:runidX | Properties | inputDir=/data/2017/hr; regex=*.csv; failOnError=false | One row per namespace per dataset per run |
| MyNamespace:PersonFile:<runidX-inverted-start-time>:runidX | Properties | inputDir=/data/2017/person; regex=*.csv; failOnError=false | One row per namespace per dataset per run |
| MyNamespace:EmployeeData:<runidX-inverted-start-time>:runidX | Properties | rowid=ID /* Should we store the schema too? What if it changes per run? */ | One row per namespace per dataset per run |
| MyNamespace:EmployeeData:AllFields:<runidX-inverted-start-time>:runidX | ID | created_time:12345678; updated_time:12345678; last_updated_by:runid_X /* We may not necessarily be required to store any value */ | One row per namespace per dataset per run |
| MyNamespace:EmployeeData:AllFields:<runidX-inverted-start-time>:runidX | Name | | |
| MyNamespace:EmployeeData:AllFields:<runidX-inverted-start-time>:runidX | Department | | |
| MyNamespace:EmployeeData:AllFields:<runidX-inverted-start-time>:runidX | ContactDetails | | |
| MyNamespace:EmployeeData:AllFields:<runidX-inverted-start-time>:runidX | JoiningDate | | |
| MyNamespace:EmployeeData:ID:<runidX-inverted-start-time>:runidX | Lineage | Please see the full JSON below. | One row per run if field is part of target |
| MyNamespace:EmployeeData:Name:<runidX-inverted-start-time>:runidX | Lineage | Similar JSON | One row per run if field is part of target |
| MyNamespace:EmployeeData:ContactDetails:<runidX-inverted-start-time>:runidX | Lineage | Similar JSON | One row per run if field is part of target |
| MyNamespace:EmployeeData:JoiningDate:<runidX-inverted-start-time>:runidX | Lineage | Similar JSON | |

JSON representation of the LineageGraph provided by the app to the platform.

One row per run if the field is part of the target dataset.
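The row-key layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page does not say how the inverted start time is computed, so the sketch assumes the common pattern of subtracting the run's start time (in milliseconds) from the largest signed 64-bit value and zero-padding the result, which makes newer runs sort first in a lexicographic scan. The helper names `inverted_time`, `dataset_row_key`, and `field_row_key` are hypothetical.

```python
# Assumption: start times are epoch milliseconds that fit in a signed 64-bit value.
MAX_LONG = 2**63 - 1


def inverted_time(start_time_ms: int) -> str:
    """Return a fixed-width string that sorts newest-first lexicographically."""
    # MAX_LONG has 19 digits, so padding to 19 keeps string order equal to numeric order.
    return str(MAX_LONG - start_time_ms).zfill(19)


def dataset_row_key(namespace: str, dataset: str, run_id: str, start_time_ms: int) -> str:
    """Key for the per-dataset rows, e.g. MyNamespace:HRFile:<inverted>:<runid>."""
    return ":".join([namespace, dataset, inverted_time(start_time_ms), run_id])


def field_row_key(namespace: str, dataset: str, field: str,
                  run_id: str, start_time_ms: int) -> str:
    """Key for the per-field rows, e.g. MyNamespace:EmployeeData:ID:<inverted>:<runid>.

    Pass field="AllFields" to build the key of the field-listing rows.
    """
    return ":".join([namespace, dataset, field, inverted_time(start_time_ms), run_id])


older = dataset_row_key("MyNamespace", "HRFile", "run1", 1_000)
newer = dataset_row_key("MyNamespace", "HRFile", "run2", 2_000)
print(newer < older)  # True: the more recent run sorts first
```

The sketch also assumes that namespace, dataset, field, and run id never contain a colon; otherwise the key would need escaping or a different separator.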

JSON stored for ID field:

{
  "sources": [
    {
      "name": "PersonFile",
      "properties": {
        "inputPath": "/data/2017/persons",
        "regex": "*.csv"
      }
    },
    {
      "name": "HRFile",
      "properties": {
        "inputPath": "/data/2017/hr",
        "regex": "*.csv"
      }
    }
  ],
  "targets": [
    {
      "name": "Employee Data"
    }
  ],
  "operations": [
    {
      "inputs": [
        {
          "name": "PersonRecord",
          "properties": {
            "source": "PersonFile"
          }
        }
      ],
      "outputs": [
        {
          "name": "body"
        }
      ],
      "name": "READ",
      "description": "Read Person file.",
      "properties": {
        "stage": "Person File Reader"
      }
    },
    {
      "inputs": [
        {
          "name": "body"
        }
      ],
      "outputs": [
        {
          "name": "SSN"
        }
      ],
      "name": "PARSE",
      "description": "Parse the body field",
      "properties": {
        "stage": "Person File Parser"
      }
    },
    {
      "inputs": [
        {
          "name": "HRRecord",
          "properties": {
            "source": "HRFile"
          }
        }
      ],
      "outputs": [
        {
          "name": "body"
        }
      ],
      "name": "READ",
      "description": "Read HR file.",
      "properties": {
        "stage": "HR File Reader"
      }
    },
    {
      "inputs": [
        {
          "name": "body"
        }
      ],
      "outputs": [
        {
          "name": "Employee_Name"
        },
        {
          "name": "Dept_Name"
        }
      ],
      "name": "PARSE",
      "description": "Parse the body field",
      "properties": {
        "stage": "HR File Parser"
      }
    },
    {
      "inputs": [
        {
          "name": "Employee_Name"
        },
        {
          "name": "Dept_Name"
        },
        {
          "name": "SSN"
        }
      ],
      "outputs": [
        {
          "name": "ID",
          "properties": {
            "target": "Employee Data"
          }
        }
      ],
      "name": "GenerateID",
      "description": "Generate unique Employee Id",
      "properties": {
        "stage": "Field Normalizer"
      }
    }
  ]
}
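One way to read the graph above is to ask which source datasets a given field ultimately derives from. The following Python sketch does that for a condensed copy of the operations list (descriptions and stage properties are omitted). It is an illustration, not the platform's algorithm: the function name `trace_sources` is hypothetical, and it assumes operations are listed in execution order so that a later output named `body` shadows the earlier one. Matching purely by field name could not otherwise tell the two `body` fields apart.

```python
# Condensed copy of the "operations" list from the JSON above.
OPERATIONS = [
    {"name": "READ",
     "inputs": [{"name": "PersonRecord", "properties": {"source": "PersonFile"}}],
     "outputs": [{"name": "body"}]},
    {"name": "PARSE",
     "inputs": [{"name": "body"}],
     "outputs": [{"name": "SSN"}]},
    {"name": "READ",
     "inputs": [{"name": "HRRecord", "properties": {"source": "HRFile"}}],
     "outputs": [{"name": "body"}]},
    {"name": "PARSE",
     "inputs": [{"name": "body"}],
     "outputs": [{"name": "Employee_Name"}, {"name": "Dept_Name"}]},
    {"name": "GenerateID",
     "inputs": [{"name": "Employee_Name"}, {"name": "Dept_Name"}, {"name": "SSN"}],
     "outputs": [{"name": "ID", "properties": {"target": "Employee Data"}}]},
]


def trace_sources(operations):
    """Map each field name to the set of source datasets it derives from.

    Assumption: operations appear in execution order, so each output
    overwrites any earlier field with the same name.
    """
    origins = {}
    for op in operations:
        derived = set()
        for field in op.get("inputs", []):
            source = field.get("properties", {}).get("source")
            if source is not None:
                # The input is read directly from a source dataset.
                derived.add(source)
            else:
                # The input was produced by an earlier operation.
                derived |= origins.get(field["name"], set())
        for field in op.get("outputs", []):
            origins[field["name"]] = set(derived)
    return origins


origins = trace_sources(OPERATIONS)
print(sorted(origins["ID"]))  # ['HRFile', 'PersonFile']
```

Under these assumptions `SSN` traces back to `PersonFile` only, while `Employee_Name` and `Dept_Name` trace back to `HRFile` only.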
