Transpose Transformation
Plugin version: 2.11.0
Transpose is a transformation plugin that breaks one source row into multiple target rows. Attributes stored in the columns of a table or a file may need to be broken into multiple records: for example, one record per column attribute. In general, the plugin allows the conversion of columns to rows.
The Transpose transformation can be used if you want to reduce the restructuring of a dataset when a new type of data is introduced into the collection. For example, assume you are building a master customer table that aggregates data for a user from multiple sources, and each of the sources has its own type of data to be added to a “customer-id”. Instead of creating wide columns, Transpose allows you to transform data into its canonical form and update the master customer profile simultaneously from the multiple sources.
Configuration
Property | Macro Enabled? | Description |
---|---|---|
Fields to be Mapped | No | Required. A string that is a comma-separated list of field names. Specifies the input schema field to be mapped to the output schema field. Example: “CustomerID:ID” maps the value of the CustomerID field to the ID field of the output schema. |
Fields to be Normalized | No | Optional. A string that is a comma-separated list of field names, a common column for the field types, and a common column for the field values. Specifies the name of the field to be normalized, to which output field its name should be mapped as a type, and the output field where the value needs to be saved. Example: “ItemId:AttributeType:AttributeValue” will save the name “ItemId” to the “AttributeType” field, and the value of “ItemId” column will be saved in the “AttributeValue” field of the output schema. |
Output Schema | No | Required. The output schema for the data. |
Example
This example creates a customer profile table from two sources. Assume we have as sources a “Customer_Profile” table and a “Customer_Purchase” table which we need to normalize into a “Customer” table.
Customer_Profile table:
CustomerId | First_Name | Last_Name | Shipping_Address | Credit_Card | Billing_Address | Last_Update |
---|---|---|---|---|---|---|
S23424242 | Joltie | Root | 32826 Mars Way, Marsville, MR, 24344 | 2334-232132-2323 | 32826 Mars Way, Marsville, MR, 24344 | 05/12/2015 |
R45764646 | Iris | Cask | 32423, Your Way, YourVille, YR, 65765 | 2343-12312-12313 | 32421 MyVilla, YourVille, YR, 23423 | 04/03/2012 |
Map the “CustomerId” column to the “ID” column of the output schema, and the “Last_Update_Date” to the “Date” column of the output schema. Normalize the “First_Name”, “Last_Name”, “Credit_Card”, and “Billing_Address” columns by mapping each column name to the “Attribute_Type” column and their values to the “Attribute_Value” column of the output schema.
The plugin’s properties will be:
Property | Value |
---|---|
Fields to be Mapped |
|
Fields to be Normalized |
|
After the transformation, the output records in the Customer table will be:
ID | Attribute_Type | Attribute_Value | Date |
---|---|---|---|
S23424242 | First Name | Joltie | 05/12/2015 |
S23424242 | Last Name | Root | 05/12/2015 |
S23424242 | Credit Card | 2334-232132-2323 | 05/12/2015 |
S23424242 | Billing Address | 32826 Mars Way, Marsville, MR, 24344 | 05/12/2015 |
R45764646 | First Name | Iris | 04/03/2012 |
R45764646 | Last Name | Cask | 04/03/2012 |
R45764646 | Credit Card | 2343-12312-12313 | 04/03/2012 |
R45764646 | Billing Address | 32421, MyVilla Ct, YourVille, YR, 23423 | 04/03/2012 |
Next, create a new pipeline to normalize the Customer_Purchase table to the revised Customer table.
Customer_Purchase table:
CustomerID | Item_ID | Item_Cost | Purchase_Date |
---|---|---|---|
S23424242 | UR-AR-243123-ST | 245.67 | 08/09/2015 |
S23424242 | SKU-234294242942 | 67.90 | 10/12/2015 |
R45764646 | SKU-567757543532 | 14.15 | 06/09/2014 |
Map the “CustomerId” column to the “ID” column of the output schema, and the “Purchase_Date” to the “Date” column of the output schema. Transpose the “Item_ID” and “Item_Cost” columns so that each column name will be mapped to the “Attribute_Type” column and each value will be mapped to the “Attribute_Value” column of the output schema.
The plugin’s properties will be:
Property | Value |
---|---|
Fields to be Mapped |
|
Fields to be Normalized |
|
After the transformation, the output records in the Customer table will be:
ID | Attribute_Type | Attribute_Value | Date |
---|---|---|---|
S23424242 | First_Name | Joltie | 05/12/2015 |
S23424242 | Last_Name | Root | 05/12/2015 |
S23424242 | Credit_Card | 2334-232132-2323 | 05/12/2015 |
S23424242 | Billing_Address | 32826 Mars Way, Marsville, MR, 24344 | 05/12/2015 |
R45764646 | First_Name | Iris | 04/03/2012 |
R45764646 | Last_Name | Cask | 04/03/2012 |
R45764646 | Credit_Card | 2343-12312-12313 | 04/03/2012 |
R45764646 | Billing_Address | 32421, MyVilla Ct, YourVille, YR, 23423 | 08/09/2015 |
S23424242 | Item_ID | UR-AR-243123-ST | 08/09/2015 |
S23424242 | Item_Cost | 245.67 | 08/09/2015 |
S23424242 | Item_ID | SKU-234294242942 | 10/12/2015 |
S23424242 | Item_Cost | 67.90 | 10/12/2015 |
R45764646 | Item_ID | SKU-567757543532 | 06/09/2014 |
R45764646 | Item_Cost | 14.15 | 06/09/2014 |
Created in 2020 by Google Inc.