You can parse and explode (flatten) JSON files with arrays in Wrangler. Wrangler supports up to 10 MB for sampling files, so if the JSON file is large, store it in Google Cloud Storage or BigQuery.
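
As a rough pre-check before choosing where to store the file, the size limit above can be tested locally. This is a minimal sketch, not part of Wrangler: the function name `exceeds_sample_limit` is hypothetical, and it assumes "10 MB" means 10 * 1024 * 1024 bytes, which the text above does not specify.

```python
import os

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
# The exact cutoff Wrangler applies is not stated in this document.
SAMPLE_LIMIT_BYTES = 10 * 1024 * 1024


def exceeds_sample_limit(path: str, limit: int = SAMPLE_LIMIT_BYTES) -> bool:
    """Return True if the file at `path` is larger than `limit` bytes."""
    return os.path.getsize(path) > limit
```

If the check returns True, the file is a candidate for storing in Google Cloud Storage or BigQuery, as described above.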

To parse JSON files with arrays, follow these steps:

  1. In Wrangler, select the JSON file you want to parse. Wrangler displays the JSON as a single row of String data type.

  2. Click the down arrow to the left of the column and click Parse > JSON.

  3. Select the required Depth and click Apply.
    After parsing, the fields in the JSON become columns. If the JSON has an array, Wrangler displays the Array data type above the column.

  4. Find the Array column, click the down arrow, and click Explode > Array (by flattening).

This explodes the elements of the JSON array into individual rows in the same column in Wrangler.
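
For intuition about what parsing followed by exploding an array does to the data, here is a minimal Python sketch using only the standard library. It does not call Wrangler or reproduce its implementation. The function name `parse_and_explode`, the sample record, and the field names are hypothetical, and the handling of empty or missing arrays is an assumption of this sketch rather than documented Wrangler behavior.

```python
import json
from typing import Any, Dict, List


def parse_and_explode(raw: str, array_field: str) -> List[Dict[str, Any]]:
    """Parse a JSON object, then emit one row per element of `array_field`."""
    record = json.loads(raw)  # analogous to Parse > JSON
    items = record.get(array_field)
    if not isinstance(items, list):
        # Assumption of this sketch: a record without an array is kept as one row.
        return [record]
    # Analogous to Explode > Array (by flattening): each element gets its own row,
    # stays in the same column, and the other fields are repeated.
    # Note: an empty array yields no rows in this sketch.
    return [{**record, array_field: item} for item in items]


# Hypothetical sample record with one array field.
sample = '{"id": 7, "tags": ["a", "b", "c"]}'
for row in parse_and_explode(sample, "tags"):
    print(row)
# {'id': 7, 'tags': 'a'}
# {'id': 7, 'tags': 'b'}
# {'id': 7, 'tags': 'c'}
```

The array column keeps its name, and each element lands in its own row alongside a copy of the remaining fields.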

...

Continue adding directives to transform the data before creating a data pipeline.