Value Mapper

Value Mapper

Introduction 

Value Mapper is a type of transform that maps string values of a field in the input record to another value. Mappings are usually stored in another dataset. This provides you a simple alternative for mapping data in the record.

Use-case(s)

If you want to replace language codes in the input record field to language description.

  • Source Field name : language_code

  • Target field name : language_desc

  • Mappings, Source / Target :  DE/German, ES/Spanish, EN/English, etc.

Conditions

  • If source field to be mapped is null, then target field will be null

  • If source field to be mapped is empty, then target field will be empty

  • If source field cannot be mapped, then a default could be assigned and that’s populated in the target field

  • More than one field can be mapped using this transform

  • Source field type can be of only string type

  • Output field type can be of only string type

Options

Following are the options that should be provided for user to configure

  • Configuring the value mapper fields

    • Mapping

      • Source Field Name

      • Dataset

      • Target Field Name

    • Can support one or more such mappings

  • How the source field should be handled

    • If NULL, then user can provide a default value or NULL

    • If EMPTY, then user can provide a default value or EMPTY

  • Output Schema should allow

    • Remove Source Field

    • Include Target Field

Design

 

Implementation strategy to allow user to provide default value for mapping field (default value or empty or null)

Plugin would expect the input with following UI Widget:

 

This will be represented in json format in plugin as: "defaults":"field1:value1, field2:value2" 

Examples

Suppose that user takes the input data( employee details) through the csv file or any other source and  wants to apply value mapper on certain field (field=Designation).

This would be helpful for user to access data in terms of readability.

 

Source: We are considering the source as CSV file, For Example:

Source

 

Output from Source

Type

Value

S3

 

Id

String

1234

Path: Path on S3

 

Name

String

John

CSV File with fields:

 

Salary

INT

9000

id, name,salary,designation

 

Designation

String

2

 

ValueMapper Plugin :  For this input will be the StructuredRecord from source and Mapping/Lookup Dataset.

The transform function in this plugin will apply the mapping on the source fields using Lookup Interface.

StructuredRecord Format:

Id

String

Name

String

Salary

INT

Designation

String

 

Sample structure for Mapping/Lookup Dataset

Designation

Value

1

SE

2

SSE

3

ML

 

Sink : After the transformations from ValueMapper plugin, output will have below structure:

FieldName

Type

Value

Id

String

1234

Name

String

John

Salary

INT

9000

Designation

String

SSE

 

Properties:

  • source: name of the field that contains the input column name to be mapped. This field should be removed from the output schema.

  • target: name of the field that contains the output column name that will be mapped. This field should be included in output schema.

  • lookup table: The configuration of the lookup tables to be used in your transform as mapping dataset. For example, if lookup table "designation" is configured, then you will be able to perform operations with that lookup table in your transform. Currently supports KeyValueTable.

  • User would provide these properties in input text field as triplet (as shown below)

  • User would provide default values for mapping fields. Json representation is provided in below example

Example:
         {
             "name": "ValueMapper",
             "type": "transform",
             "properties": {
                     "mapping": "source1:LookupTableName:target1,source2:LookupTable2:target2",

                      "defaults":"field1:value1,field2:value2                  

             }
        }

 

The transform takes record that have "id,name,salary,designationID" fields, maps the value of designationID field to generate designationName field based on the mapping dataset and then returns a record containing "id,name,salary,designationName" fields.

For example, if the input is like:

field name

type 

value 

field name

type 

value 

id 

string 

US1

name 

string 

samuel

salary 

string 

1000

designationID 

string 

D3

it will transform it to this output record:

field name

type 

value 

field name

type 

value 

id 

string 

US1

name 

string 

samuel

salary 

string 

1000

designationName 

string 

Module Lead

Table of Contents

Checklist

User stories documented 
User stories reviewed 
Design documented 
Design reviewed 
Feature merged 
Examples and guides 
Integration tests 
Documentation for feature 
Short video demonstrating the feature

Created in 2020 by Google Inc.