Wrangler concepts
Wrangler uses the concepts of record, column, directive, recipe, transformation step, and data pipeline.
Record
A record is a collection of field names and field values.
In this documentation, a record is shown as a JSON object in which each key is a column name and each value is the plain representation of the data, with no type information.
For example:
{
  "id": 1,
  "fname": "root",
  "lname": "joltie",
  "address": {
    "housenumber": "678",
    "street": "Mars Street",
    "city": "Marcity",
    "state": "Maregon",
    "country": "Mari"
  },
  "gender": "M"
}
Column
A column is a group of field values of any of the supported data types. Each field value is part of one record.
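The relationship between records and columns can be sketched in Python. This is only an illustration, not how Wrangler stores data: the `column` helper and the second record are hypothetical, and records are modeled as plain dictionaries.

```python
# Two records modeled as dictionaries. The first reuses values from the
# record example in this document; the second is made up for illustration.
records = [
    {"id": 1, "fname": "root", "lname": "joltie"},
    {"id": 2, "fname": "ada", "lname": "byron"},
]


def column(records, name):
    """Collect the field value stored under `name` from every record."""
    return [record.get(name) for record in records]


fname_column = column(records, "fname")
print(fname_column)
```

Each value in `fname_column` comes from exactly one record, which mirrors the statement that each field value is part of one record.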
Directive
A directive is a single data manipulation instruction that transforms, filters, or pivots a single record into zero or more records. A directive can generate one or more steps to be executed by a data pipeline.
A directive can be represented in text in this format:
<command> <argument-1> <argument-2> ... <argument-n>
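As a rough sketch of this text format, the following Python splits a directive into its command and arguments. It is not Wrangler's actual parser: the `parse_directive` function is hypothetical, the sample directive text is an assumed example, and whitespace splitting with shell-style quoting is a simplification.

```python
import shlex


def parse_directive(text):
    """Split directive text into a command and a list of arguments.

    Assumption: tokens are separated by whitespace, and quoted strings
    count as a single argument.
    """
    tokens = shlex.split(text)
    if not tokens:
        raise ValueError("empty directive")
    return tokens[0], tokens[1:]


# Assumed sample directive text, used only to show the shape of the result.
command, arguments = parse_directive("set-type :Fare integer")
print(command)
print(arguments)
```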
Recipe
A recipe is a set of one or more directives. For example, a recipe can consist of a single directive that changes the data type of Fare to integer.
Transformation step
A transformation step is an implementation of a data transformation directive, operating on a single record or set of records. A transformation step can generate zero or more records from the application of a directive. Pipeline Studio applies the transformation steps in the order listed in the recipe.
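The ordering and the "zero or more records" behavior can be illustrated with a minimal Python sketch. All names here (`set_fare_to_int`, `drop_zero_fare`, `apply_steps`) and the sample rows are hypothetical, and the conversion rule (truncating the decimal part) is an assumption made for the example, not documented Wrangler behavior.

```python
def set_fare_to_int(record):
    """Transform step: returns exactly one record with Fare as an integer."""
    updated = dict(record)
    # Assumption: decimal fares are truncated toward zero.
    updated["Fare"] = int(float(record["Fare"]))
    return [updated]


def drop_zero_fare(record):
    """Filter step: returns zero records when Fare is 0, otherwise one."""
    return [] if record["Fare"] == 0 else [record]


def apply_steps(records, steps):
    """Apply each step, in the order listed, to every record."""
    for step in steps:
        next_records = []
        for record in records:
            next_records.extend(step(record))
        records = next_records
    return records


rows = [{"Name": "a", "Fare": "7.25"}, {"Name": "b", "Fare": "0"}]
result = apply_steps(rows, [set_fare_to_int, drop_zero_fare])
print(result)
```

Order matters in this sketch: `drop_zero_fare` compares against the integer 0, so it only works after `set_fare_to_int` has run.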
Data pipeline
A data pipeline is a collection of stages applied to a record. The records output by one stage are passed to the next stage in the pipeline.
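The hand-off between stages can be sketched with Python generators. This is a simplified model under stated assumptions, not the actual pipeline engine: the stage functions, `run_pipeline`, and the input records are all hypothetical.

```python
from functools import reduce


def parse_stage(records):
    """Stage that converts the id field of each record to an integer."""
    for record in records:
        yield {**record, "id": int(record["id"])}


def filter_stage(records):
    """Stage that passes along only records whose id is greater than 1."""
    for record in records:
        if record["id"] > 1:
            yield record


def run_pipeline(records, stages):
    """Feed the output of each stage into the next stage, in order."""
    stream = reduce(lambda current, stage: stage(current), stages, records)
    return list(stream)


output = run_pipeline([{"id": "1"}, {"id": "2"}], [parse_stage, filter_stage])
print(output)
```

Generators are used so that each record flows through the stages one at a time; a list-based version would behave the same for small inputs.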
Created in 2020 by Google Inc.