Performing simple date transformations in Wrangler

The Wrangler allows you to interactively and visually specify transformations on a sample of data. It provides a list of built-in directives that you can use to transform your data. Additionally, it also provides a set of directives such as set-column, where you can invoke JEXL expressions. One of the most common and powerful use cases of this is to perform date manipulation. This topic describes some of the available transformations.

Before you begin

  1. Download this sample CSV dataset. This is only a sample. The steps in this topic apply to any dataset that has a date column.

Uploading a sample

  1. In the Pipeline Studio, go to Wrangler by clicking the main menu using the Hamburger icon on the top left, and choosing Wrangler.

  2. Click Add Connection and create a File connection.

  3. Locate the SalesJan2009.csv file and double-click it.  

  4. Set Enable Quoted Values to True.

  5. Set Use First Row as Header to True.



  6. Click Confirm.
    The file appears in your Wrangler workspace:

Transforming dates

Parsing the Transaction_date column as a date

  1. Let’s parse the Transaction_date column as a date. Click the directive dropdown in that column, and then choose Parse > Simple Date.

  2. Specify a Custom format as MM/dd/yy HH:mm, since that’s the format of the dates in this column. The date format must match the dates in the column. If you select a date format that doesn’t match the dates in the column, you will get incorrect results.

  3. Click Apply.
    In the Transformation steps column, you should see the transformation registered as parse-as-simple-date :Transaction_date MM/dd/yy HH:mm.

  4. Also observe that the data type of the column has now changed from String to Timestamp_micros, which exposes a lot of utility functions on dates that we will use in the next section.

Getting components

You can get various components of the date such as Month, Day, Year, Minute, Second, and Hour by applying functions such as:

set-column :month Transaction_date.getMonth() set-column :year Transaction_date.getYear() set-column :day_of_week Transaction_date.getDayOfWeek() set-column :day_of_month Transaction_date.getDayOfMonth() set-column :day_of_year Transaction_date.getDayOfYear() set-column :minute Transaction_date.getMinute() set-column :hour Transaction_date.getHour() set-column :second Transaction_date.getSecond() set-column :zone Transaction_date.getZone()

Adding time

You can add various units of time to the date by using the directives such as:

set-column :add_days Transaction_date.plusDays(1) set-column :add_hours Transaction_date.plusHours(200) set-column :add_minutes Transaction_date.plusMinutes(3600) set-column :add_months Transaction_date.plusMonths(3) set-column :add_weeks Transaction_date.plusWeeks(2) set-column :add_years Transaction_date.plusYears(2) set-column :add_seconds Transaction_date.plusSeconds(36000) set-column :add_nonoseconds Transaction_date.plusNanos(3600000)

Subtracting time

You can subtract various units of time to the date by using the directives such as:

set-column :subtract_days Transaction_date.minusDays(1) set-column :subtract_hours Transaction_date.minusHours(200) set-column :subtract_minutes Transaction_date.minusMinutes(3600) set-column :subtract_months Transaction_date.minusMonths(3) set-column :subtract_weeks Transaction_date.minusWeeks(2) set-column :subtract_years Transaction_date.minusYears(2) set-column :subtract_seconds Transaction_date.minusSeconds(36000) set-column :subtract_nonoseconds Transaction_date.minusNanos(3600000)

Related articles

Reference of all available date manipulation functions: ZonedDateTime.

Created in 2020 by Google Inc.