Decoding records

Base encoding of data is used in many situations to store or transfer data in environments that, for legacy reasons, are restricted to US-ASCII data. It can also be used in new applications that have no legacy restrictions, because it lets you manipulate objects with text editors. Decoding converts base-encoded values back to their original form.

You can apply the following decoding schemes to all values in a column. The base32, base64, and hex schemes are based on RFC 4648:

  • base32

  • base64

  • hex

  • url

When you apply a decoding scheme, Wrangler generates a new column with a name following the format <column>_decode_<type>. The exception is url-decode, which decodes the values in the current column.

Different column values are handled following these rules:

  • If the column value is null, the value in the resulting column is also null.

  • If the column specified is not found in the record, then the record is skipped.

  • If the column value is neither a string nor a byte array, the directive fails.
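
The rules above can be sketched in Python. This is only an illustration of the described behavior, not Wrangler's implementation: the function name apply_decode, the dict-based record, and the decoder and target parameters are all assumptions made for the example.

```python
import base64
from typing import Callable, Optional


def apply_decode(
    record: dict,
    column: str,
    target: str,
    decoder: Callable[[object], bytes],
) -> Optional[dict]:
    """Decode one column of a record (hypothetical helper).

    Returns the updated record, or None when the record is skipped.
    """
    # The column is not found in the record: the record is skipped.
    if column not in record:
        return None

    value = record[column]
    result = dict(record)  # copy so the input record is left untouched

    # A null input value produces a null output value.
    if value is None:
        result[target] = None
        return result

    # Only strings and byte arrays can be decoded; anything else fails.
    if not isinstance(value, (str, bytes, bytearray)):
        raise TypeError(f"cannot decode value of type {type(value).__name__}")

    result[target] = decoder(value)
    return result


# Example call with a base64 decoder and a hypothetical target column name.
row = apply_decode({"body": "aGVsbG8="}, "body", "body_out", base64.b64decode)
print(row)  # {'body': 'aGVsbG8=', 'body_out': b'hello'}
```

Passing the decoder as a parameter keeps the null, missing-column, and type checks in one place, so the same rules apply to every scheme.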

Decode base32

Decode32 adds the decode32 directive as a transformation step to the recipe and creates a new column with the decoded values.
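
To show what base32 decoding does to a single value, here is a minimal sketch using Python's standard library rather than Wrangler itself. The helper name decode_base32 is hypothetical.

```python
import base64


def decode_base32(value):
    """Decode an RFC 4648 base32 string or byte array into raw bytes."""
    # b32decode expects padded input whose length is a multiple of 8.
    return base64.b32decode(value)


print(decode_base32("NBSWY3DP"))  # b'hello'
```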

Decode base64

Decode64 adds the decode64 directive as a transformation step to the recipe and creates a new column with the decoded values.
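
As an illustration of base64 decoding on a single value, the following sketch uses Python's standard library, not Wrangler. The helper name decode_base64 is hypothetical, and the strict validation shown is a choice made for this example rather than documented Wrangler behavior.

```python
import base64


def decode_base64(value):
    """Decode an RFC 4648 base64 string or byte array into raw bytes."""
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    return base64.b64decode(value, validate=True)


raw = decode_base64("aGVsbG8gd29ybGQ=")
print(raw)                  # b'hello world'
print(raw.decode("utf-8"))  # hello world
```

The decoded result is a byte array; converting it to text is a separate step that depends on the character encoding of the original data.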

Decode hex

Decode hex adds the decode hex directive as a transformation step to the recipe and creates a new column with the decoded values.
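
Hex decoding turns each pair of hexadecimal digits into one byte. A minimal sketch with Python's standard library, not Wrangler, follows. The helper name decode_hex is hypothetical.

```python
import binascii


def decode_hex(value):
    """Decode a hexadecimal string or byte array into raw bytes."""
    # unhexlify accepts both upper-case and lower-case digits and
    # requires an even number of them.
    return binascii.unhexlify(value)


print(decode_hex("68656c6c6f"))  # b'hello'
```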

Decode url

Decode url adds the url-decode directive as a transformation step to the recipe and decodes the current column.
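
URL decoding replaces percent-encoded sequences such as %20 with the characters they represent. The sketch below uses Python's standard library, not Wrangler, and the helper names are hypothetical. This page does not say whether url-decode treats a plus sign as a space, so both variants are shown.

```python
from urllib.parse import unquote, unquote_plus


def decode_url(value: str) -> str:
    """Replace %xx escapes with the characters they encode."""
    return unquote(value)


def decode_url_form(value: str) -> str:
    """Like decode_url, but also converts '+' to a space (form encoding)."""
    return unquote_plus(value)


print(decode_url("hello%20world%21"))  # hello world!
print(decode_url("a+b"))               # a+b
print(decode_url_form("a+b"))          # a b
```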

Created in 2020 by Google Inc.