Encoding records

Base encoding is commonly used to store or transfer data in environments that, for legacy reasons, are restricted to US-ASCII. It can also be useful in new applications that have no such restrictions, because encoded objects can be manipulated with text editors.

You can apply any of the following encoding schemes, which are based on RFC 4648, to all values in a column:

  • base32

  • base64

  • hex

  • url
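To illustrate what each scheme produces, here is a minimal sketch using only the Python standard library. It is not Wrangler's implementation: the sample value is made up, and details such as hex letter case or how spaces are URL-encoded are assumptions that may differ from Wrangler's output.

```python
import base64
from urllib.parse import quote

# Hypothetical sample value; any string or byte value works the same way.
text = "hello world"
raw = text.encode("utf-8")

encoded = {
    # Base 32, Base 64, and Base 16 (hex) alphabets from RFC 4648.
    "base32": base64.b32encode(raw).decode("ascii"),
    "base64": base64.b64encode(raw).decode("ascii"),
    "hex": base64.b16encode(raw).decode("ascii"),
    # Percent-encoding via the standard library; safe="" also escapes "/".
    "url": quote(text, safe=""),
}

for scheme, value in encoded.items():
    print(f"{scheme}: {value}")
```

Every result contains only US-ASCII characters, which is what makes the encoded values safe to store or edit as plain text.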

When you apply an encoding, Wrangler generates a new column named in the format <column>_encode_<type>. The exception is url, which encodes the values in the current column instead of creating a new one.

Different column values are handled following these rules:

  • If the column value is null, the resulting column value is also null.

  • If the column specified is not found in the record, then the record is skipped.

  • If the column value is neither a string nor a byte type, the encoding fails and an error is displayed.
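The rules above, together with the column naming convention, can be sketched in Python as follows. This is an illustration, not Wrangler's code. The function name `encode_column` and the dict-based record model are hypothetical. The sketch also assumes that a "skipped" record passes through unchanged and that <type> in the new column name is the scheme name.

```python
import base64
from urllib.parse import quote

# Standard-library encoders for the RFC 4648 schemes.
ENCODERS = {
    "base32": base64.b32encode,
    "base64": base64.b64encode,
    "hex": base64.b16encode,
}


def _encode(value, scheme):
    """Encode a single str or bytes value with the given scheme."""
    if scheme == "url":
        return quote(value, safe="")  # quote() accepts str or bytes
    raw = value.encode("utf-8") if isinstance(value, str) else value
    return ENCODERS[scheme](raw).decode("ascii")


def encode_column(records, column, scheme):
    """Hypothetical helper: apply an encoding to one column of each record."""
    # url encodes in place; the other schemes write to a new column.
    target = column if scheme == "url" else f"{column}_encode_{scheme}"
    result = []
    for record in records:
        if column not in record:
            # Column not found: skip the record (assumed to pass through as-is).
            result.append(record)
            continue
        value = record[column]
        updated = dict(record)
        if value is None:
            updated[target] = None  # null in, null out
        elif isinstance(value, (str, bytes)):
            updated[target] = _encode(value, scheme)
        else:
            raise TypeError(
                f"column '{column}' must be string or bytes, "
                f"got {type(value).__name__}"
            )
        result.append(updated)
    return result
```

For example, `encode_column([{"name": "hi"}], "name", "base64")` returns a record with an added `name_encode_base64` column, while the `"url"` scheme overwrites `name` itself.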

Encode base32

Encode32 adds the encode32 directive as a transformation step to the recipe and creates a new column with encoded values.

Encode base64

Encode64 adds the encode64 directive as a transformation step to the recipe and creates a new column with encoded values.

Encode hex

Encode hex adds the encode hex directive as a transformation step to the recipe and creates a new column with encoded values.

Encode url

Encode url adds the url-encode directive as a transformation step to the recipe and encodes the current column.

Created in 2020 by Google Inc.