Union Splitter Transformation
Plugin version: 2.11.0
The Union Splitter splits data by a union schema so that type-specific logic can be applied downstream.
The Union Splitter emits records to different ports depending on the schema of a particular field, or of the entire record. If no field is specified, each record is emitted to a port named after its record schema. If a field is specified, the schema for that field must be a union of supported schemas. All schemas except maps, arrays, unions, and enums are supported. For each input record, the value of that field is examined and the record is emitted to the port corresponding to the value's schema in the union.
For record schemas, the output port will be the name of the record schema. For simple types, the output port will be the schema type in lowercase (‘null’, ‘bool’, ‘bytes’, ‘int’, ‘long’, ‘float’, ‘double’, or ‘string’).
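The port naming rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the plugin's implementation; the dict-based schema representation and the `port_name` function are hypothetical.

```python
# Simple types and their port names, as listed in the text above.
SIMPLE_TYPE_PORTS = {"null", "bool", "bytes", "int", "long", "float", "double", "string"}
# Schema types that the text says are not supported inside the union.
UNSUPPORTED_TYPES = {"map", "array", "union", "enum"}


def port_name(schema):
    """Return the output port for one member schema of the union.

    `schema` is a hypothetical dict form used only in this sketch, e.g.
    {"type": "int"} or {"type": "record", "name": "itemMeta"}.
    """
    schema_type = schema["type"].lower()
    if schema_type in UNSUPPORTED_TYPES:
        raise ValueError(f"unsupported schema type in union: {schema_type}")
    if schema_type == "record":
        # Record schemas use the record name as the port name.
        return schema["name"]
    if schema_type in SIMPLE_TYPE_PORTS:
        # Simple types use the lowercase type name as the port name.
        return schema_type
    raise ValueError(f"unknown schema type: {schema_type}")


union = [{"type": "int"}, {"type": "long"}, {"type": "record", "name": "itemMeta"}]
ports = [port_name(member) for member in union]
```

For the union used in the example further down, `ports` holds one entry per member schema: the two simple type names and the record name.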
Configuration
Property | Macro Enabled? | Description |
---|---|---|
Union field to split on | No | Required. The union field to split on. The schema for the field must be a union of supported schemas. All schemas except maps, arrays, unions, and enums are supported. Note that nulls are supported, which means all nulls will get sent to the ‘null’ port. |
Modify Schema | No | Optional. Whether to modify the output schema to remove the union. For example, suppose the field ‘x’ is a union of int and long. If Modify Schema is true, the schema for field ‘x’ will be just an int for the ‘int’ port and just a long for the ‘long’ port. If Modify Schema is false, the output schema for each port will be the same as the input schema. Default is true. |
Output Schema | No | Required. The output schema for the data. |
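The effect of the Modify Schema property can be illustrated with a small Python sketch. It assumes a schema is modeled as a plain dict from field name to type, with a list standing in for a union; that representation and the `output_schema` helper are hypothetical and exist only for this illustration.

```python
def output_schema(input_fields, union_field, member_type, modify_schema=True):
    """Return the schema emitted on the port for `member_type`.

    input_fields: dict of field name -> type, where a list denotes a union
                  (a hypothetical representation used only in this sketch).
    """
    union = input_fields[union_field]
    if not isinstance(union, list) or member_type not in union:
        raise ValueError(f"{member_type!r} is not a member of the union for {union_field!r}")
    if not modify_schema:
        # Modify Schema is false: every port keeps the input schema unchanged.
        return dict(input_fields)
    # Modify Schema is true: the union is replaced by the single member type.
    narrowed = dict(input_fields)
    narrowed[union_field] = member_type
    return narrowed


fields = {"x": ["int", "long"]}
int_port_schema = output_schema(fields, "x", "int", modify_schema=True)
long_port_schema = output_schema(fields, "x", "long", modify_schema=True)
unmodified_schema = output_schema(fields, "x", "int", modify_schema=False)
```

With Modify Schema set to true, the field 'x' is narrowed to a single type per port; with it set to false, each port sees the original union.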
Example
Suppose the Union Splitter is configured to split on the ‘item’ field:
Property | Value |
---|---|
Union field to split on | item |
Modify Schema | true |
Suppose the Union Splitter receives records with the following schema:
name | type |
---|---|
id | long |
user | string |
item | [ int, long, itemMeta ] |
Here the ‘item’ field is a union of int, long, and a record named ‘itemMeta’ with the following schema:
name | type |
---|---|
id | long |
desc | string |
This means the Union Splitter will have three output ports, one for each schema in the union: ‘int’, ‘long’, and ‘itemMeta’.
If a record contains an integer for the ‘item’ field, it will be emitted to the ‘int’ port with output schema:
name | type |
---|---|
id | long |
user | string |
item | int |
If a record contains a long for the ‘item’ field, it will be emitted to the ‘long’ port with output schema:
name | type |
---|---|
id | long |
user | string |
item | long |
If a record contains a StructuredRecord with the itemMeta schema for the ‘item’ field, it will be emitted to the ‘itemMeta’ port with output schema:
name | type |
---|---|
id | long |
user | string |
item | itemMeta |
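The routing in this example can be sketched end to end in Python. This is an illustration under stated assumptions, not the plugin's code: records are modeled as plain dicts, the `item_port` and `split` helpers are hypothetical, and because Python has a single integer type the sketch treats values that fit in 32 bits as int and larger values as long. In the plugin itself the port is determined by the value's schema, not by its magnitude.

```python
from collections import defaultdict

# Assumption of this sketch only: the 32-bit range separates int from long.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def item_port(value):
    """Pick the port for a value of the 'item' field."""
    if isinstance(value, dict):
        # A nested record stands in for a StructuredRecord with the itemMeta schema.
        return "itemMeta"
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"value {value!r} does not match any schema in the union")
    return "int" if INT32_MIN <= value <= INT32_MAX else "long"


def split(records):
    """Group records by the port chosen for their 'item' field."""
    ports = defaultdict(list)
    for record in records:
        ports[item_port(record["item"])].append(record)
    return dict(ports)


records = [
    {"id": 1, "user": "alice", "item": 7},
    {"id": 2, "user": "bob", "item": 2**40},
    {"id": 3, "user": "carol", "item": {"id": 9, "desc": "widget"}},
]
ports = split(records)
```

Each of the three sample records lands on a different port, matching the three output schemas listed above.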
Created in 2020 by Google Inc.