Working with numbers in Wrangler
In Wrangler, you can quickly perform numeric calculations on the following types of columns:
Integer
Decimal
Double
Float
Long
Short
Note: To use numeric functions on Decimal columns, use the Wrangler CLI. For more information, see the set-column directive.
You can use numeric functions on one or more columns. The available numeric functions vary depending on how many columns you select. You can either create a new column with the results of the calculation or have the results replace the values in the column where you add the transformation.
A numeric function behaves differently when you apply it to a single column than when you apply it to multiple columns.
For example, if you use the Multiply function on a single column, you must specify a decimal value by which to multiply each value in the column. Wrangler performs the multiplication on the sample data and displays the new values in the same column or in a new column.
If you use the Multiply function on multiple columns, Wrangler multiplies the values in each row for the selected columns and displays the new values in the first column of the transformation.
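The difference between the two behaviors can be sketched in plain Python. This is only an illustration of the row-wise semantics described above, not Wrangler's implementation. The sample rows, the column names `Price` and `Quantity`, and the helper names `multiply_single` and `multiply_multi` are all hypothetical.

```python
# Hypothetical sample rows; Wrangler works on its own sample data.
rows = [
    {"Price": 2.5, "Quantity": 4.0},
    {"Price": 10.0, "Quantity": 3.0},
]


def multiply_single(rows, column, factor, target=None):
    """Multiply every value in one column by a constant factor.

    If no target is given, the result overwrites the source column.
    """
    target = target or column
    return [{**row, target: row[column] * factor} for row in rows]


def multiply_multi(rows, columns, target=None):
    """Multiply the values of the selected columns together, row by row.

    If no target is given, the result lands in the first selected column,
    mirroring the behavior described in the text.
    """
    target = target or columns[0]
    result = []
    for row in rows:
        product = 1.0
        for name in columns:
            product *= row[name]
        result.append({**row, target: product})
    return result


single = multiply_single(rows, "Price", 2.0)
multi = multiply_multi(rows, ["Price", "Quantity"])
```

In this sketch, `single` doubles each `Price`, while `multi` replaces `Price` with `Price * Quantity` for each row. Both helpers return new rows and leave the input untouched.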
Performing numeric calculations on values in one column
You can perform the following calculations on all values in the column:
To apply a numeric calculation to one column, follow these steps:
Click the drop-down button next to the column name.
Click Calculate, and then select the numeric function you want to perform.
Some functions require a decimal value to complete the calculation. For example, if you select Subtract, enter the number to subtract from each row.
You can choose to create a new column with the results of the numeric calculation or overwrite the values in the current column.
Click Apply.
The values change based on the calculation. Wrangler adds the corresponding function as a step in the recipe. For example, if you Subtract 2 from each value in the Price column, Wrangler adds the following transformation to the recipe:
set-column :Price Price - 2
When you run the data pipeline, the transformation is applied to all values in the column.
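As a rough mental model, the recipe step above evaluates the expression once per row and writes the result to the named column. The following Python sketch illustrates that idea under stated assumptions: `set_column` is a hypothetical helper, not part of Wrangler, and the sample values are invented.

```python
def set_column(rows, target, expression):
    """Return new rows where `target` holds expression(row) for each row."""
    return [{**row, target: expression(row)} for row in rows]


# Invented sample data standing in for the Price column.
sample = [{"Price": 10}, {"Price": 7}]

# Roughly analogous to: set-column :Price Price - 2 (overwrites Price).
overwritten = set_column(sample, "Price", lambda row: row["Price"] - 2)

# Writing to a different, hypothetical column name keeps the original values.
with_new_column = set_column(sample, "Discounted", lambda row: row["Price"] - 2)
```

The first call corresponds to overwriting the current column, and the second to creating a new column with the results.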
Performing numeric calculations on values in two columns
You can perform the following numeric calculations on values in each row in two columns:
To apply a numeric calculation to two columns, follow these steps:
To select the columns, click the box to the right of the column names.
Click the drop-down button next to one of the column names and click Calculate.
Select the numeric function you want to perform.
You can choose to create a new column with the results of the numeric calculation or overwrite the values in the current column.
Click Apply.
The values change based on the calculation. Wrangler adds the corresponding function as a step in the recipe. For example, if you Add the values in each row of the Q1_Sales and Q2_Sales columns and create a new column called H1_Sales, Wrangler adds the following transformation to the recipe:
set-column :H1_Sales arithmetic:add(Q1_Sales, Q2_Sales)
When you run the data pipeline, Wrangler performs the transformation and creates a new column called H1_Sales with the total of Q1_Sales and Q2_Sales.
Performing numeric calculations on values in three or more columns
Note: Performing numeric calculations on values in three or more columns was added in CDAP 6.8.
You can perform the following calculations on values in each row in three or more columns:
To apply a numeric calculation to three or more columns, follow these steps:
To select the columns, click the box to the right of the column names.
Click the drop-down button next to one of the column names and click Calculate.
Select the numeric function you want to perform.
You can choose to create a new column with the results of the numeric calculation or overwrite the values in the current column.
Click Apply.
The values change based on the calculation. Wrangler adds the corresponding function as a step in the recipe. For example, if you Add the values in each row of the Q1_Sales, Q2_Sales, Q3_Sales, and Q4_Sales columns and create a new column called 2022_Sales, Wrangler adds the following transformation to the recipe:
set-column :2022_Sales arithmetic:add(Q1_Sales, Q2_Sales, Q3_Sales, Q4_Sales)
When you run the data pipeline, Wrangler performs the transformation and creates a new column called 2022_Sales with the total of Q1_Sales, Q2_Sales, Q3_Sales, and Q4_Sales.
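The row-wise Add described in this section and the previous one can be sketched in Python as a sum over any number of selected columns. This is an illustrative sketch only: `add_columns` is a hypothetical helper, the sales figures are invented, and the sketch does not model how Wrangler treats null or non-numeric values.

```python
def add_columns(rows, columns, target):
    """Return new rows where `target` is the sum of `columns` in each row."""
    return [{**row, target: sum(row[name] for name in columns)} for row in rows]


# Invented quarterly figures for two rows.
sales = [
    {"Q1_Sales": 100, "Q2_Sales": 120, "Q3_Sales": 90, "Q4_Sales": 150},
    {"Q1_Sales": 10, "Q2_Sales": 20, "Q3_Sales": 30, "Q4_Sales": 40},
]

# Two columns: comparable to the H1_Sales example.
half_year = add_columns(sales, ["Q1_Sales", "Q2_Sales"], "H1_Sales")

# Four columns: comparable to the 2022_Sales example.
full_year = add_columns(
    sales, ["Q1_Sales", "Q2_Sales", "Q3_Sales", "Q4_Sales"], "2022_Sales"
)
```

The same helper covers both the two-column and the four-column case because the sum is computed per row over whichever columns are selected.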
Created in 2020 by Google Inc.