Performing simple date transformations in Wrangler
The Wrangler allows you to interactively and visually specify transformations on a sample of data. It provides a list of built-in directives that you can use to transform your data. Additionally, it also provides a set of directives such as set-column
, where you can invoke JEXL expressions. One of the most common and powerful use cases of this is to perform date manipulation. This topic describes some of the available transformations.
Before you begin
Download this sample CSV dataset. This is only a sample. The steps in this topic apply to any dataset that has a date column.
Uploading a sample
In the Pipeline Studio, go to Wrangler by clicking the main menu using the Hamburger icon on the top left, and choosing Wrangler.
Click Add Connection and create a File connection.
Locate the SalesJan2009.csv file and double-click it. Â
Set Enable Quoted Values to
True
.Set Use First Row as Header to
True
.Click Confirm.
The file appears in your Wrangler workspace:
Transforming dates
Parsing the Transaction_date column as a date
Let’s parse the Transaction_date column as a date. Click the directive dropdown in that column, and then choose Parse > Simple Date.
Specify a Custom format as MM/dd/yy HH:mm, since that’s the format of the dates in this column. The date format must match the dates in the column. If you select a date format that doesn’t match the dates in the column, you will get incorrect results.
Click Apply.
In the Transformation steps column, you should see the transformation registered as parse-as-simple-date :Transaction_date MM/dd/yy HH:mm.Also observe that the data type of the column has now changed from String to Timestamp_micros, which exposes a lot of utility functions on dates that we will use in the next section.
Getting components
You can get various components of the date such as Month, Day, Year, Minute, Second, and Hour by applying functions such as:
set-column :month Transaction_date.getMonth()
set-column :year Transaction_date.getYear()
set-column :day_of_week Transaction_date.getDayOfWeek()
set-column :day_of_month Transaction_date.getDayOfMonth()
set-column :day_of_year Transaction_date.getDayOfYear()
set-column :minute Transaction_date.getMinute()
set-column :hour Transaction_date.getHour()
set-column :second Transaction_date.getSecond()
set-column :zone Transaction_date.getZone()
Adding time
You can add various units of time to the date by using the directives such as:
set-column :add_days Transaction_date.plusDays(1)
set-column :add_hours Transaction_date.plusHours(200)
set-column :add_minutes Transaction_date.plusMinutes(3600)
set-column :add_months Transaction_date.plusMonths(3)
set-column :add_weeks Transaction_date.plusWeeks(2)
set-column :add_years Transaction_date.plusYears(2)
set-column :add_seconds Transaction_date.plusSeconds(36000)
set-column :add_nonoseconds Transaction_date.plusNanos(3600000)
Subtracting time
You can subtract various units of time to the date by using the directives such as:
set-column :subtract_days Transaction_date.minusDays(1)
set-column :subtract_hours Transaction_date.minusHours(200)
set-column :subtract_minutes Transaction_date.minusMinutes(3600)
set-column :subtract_months Transaction_date.minusMonths(3)
set-column :subtract_weeks Transaction_date.minusWeeks(2)
set-column :subtract_years Transaction_date.minusYears(2)
set-column :subtract_seconds Transaction_date.minusSeconds(36000)
set-column :subtract_nonoseconds Transaction_date.minusNanos(3600000)
Related articles
Reference of all available date manipulation functions: ZonedDateTime.
Created in 2020 by Google Inc.