Row Denormalizer Analytics
Plugin version: 2.11.0
Converts raw data into denormalized data based on a key column. User is able to specify the list of fields that should be used in the denormalized record, with an option to use an alias for the output field name. For example, ‘ADDRESS’ in the input is mapped to ‘addr’ in the output schema.
The transform takes input record that stores a variable set of custom attributes for an entity, denormalizes it on the basis of the key field, and then returns a denormalized table according to the output schema specified by the user. The denormalized data is easier to query.
Configuration
Property | Marco Enabled? | Description |
---|---|---|
Key Field | Yes | Required. Name of the column in the input record which will be used to group the raw data. For Example, id. |
Input Name Field | Yes | Required. Name of the column in the input record which contains the names of output schema columns. For example, input records have columns ‘id’, ‘attribute’, ‘value’ and the ‘attribute’ column contains ‘FirstName’, ‘LastName’, ‘Address’. “So the output record will have column names as ‘FirstName’, ‘LastName’, ‘Address’. |
Input Value Field | Yes | Required. Name of the column in the input record which contains the values for output schema columns. For example, input records have columns ‘id’, ‘attribute’, ‘value’ and the ‘value’ column contains ‘John’, ‘Wagh’, ‘NE Lakeside’. So the output record will have values for columns ‘FirstName’, ‘LastName’, ‘Address’ as ‘John’, ‘Wagh’, ‘NE Lakeside’ respectively. |
Output fields to include | Yes | Required. List of the output fields to be included in denormalized output. |
Output fields to rename | Yes | Optional. List of the output fields to rename. The key specifies the name of the field to rename, with its corresponding value specifying the new name for that field. |
numPartitions | Yes | Optional. Number of partitions to use when grouping data. If not specified, the execution framework will decide on the number to use. |
Output Schema | No | Required. The output schema for the data. |
Conditions
In case a field value is not present, then it will be considered as NULL.
For Example,
If keyfield(‘id’) in the input record is NULL, then that particular record will be filtered out.
If namefield(‘attribute’) or valuefield(‘value’) is not present for a particular keyfield(‘id’) value, then the denormalized output value for that namefield will be NULL.
If user provides output field which is not present in the input record, then it will be considered as NULL.
Example
The transform takes input records that have columns id, attribute, value, denormalizes it on the basis of id, and then returns a denormalized table according to the output schema specified by the user.
Property | Value |
---|---|
Key Field |
|
Input Name Field |
|
Input Value Field |
|
Output fields to include |
|
Output fields to rename |
|
For example, suppose the aggregator receives the input record:
id | attribute | value |
---|---|---|
joltie | Firstname | John |
joltie | Lastname | Wagh |
joltie | Address | NE Lakeside |
Output records will contain all the output fields specified by user:
id | Firstname | Lastname | Office Address |
---|---|---|---|
joltie | John | Wagh | NE Lakeside |
Now, let’s suppose the aggregator receives the input record with NULL values:
id | attribute | value |
---|---|---|
joltie | Firstname | John |
joltie | Lastname | Wagh |
joltie | Address | NE Lakeside |
brett | Firstname | Brett |
| Lastname | Lee |
brett | Address | SE Lakeside |
bob | Firstname | Bob |
bob |
| Smith |
bob | Address |
|
Output records will contain all the output fields specified by user:
id | Firstname | Lastname | Office Address |
---|---|---|---|
joltie | John | Wagh | NE Lakeside |
brett | Brett |
| SE Lakeside |
bob | Bob |
|
|
Created in 2020 by Google Inc.