Working with Decimal types in Wrangler
This article describes how to convert fields into decimal types in Wrangler and perform transformations on them.
Before you begin
This article presumes that you have set up a Wrangler connection to a database, file system, or another supported storage system that contains your data.
Reading decimal type data
Open an object in Wrangler, for example a table from a database or a file from a file system such as GCS.
In the case of a database or a BigQuery connection, if the table has a decimal column, Wrangler automatically converts it into a BigDecimal type.
When you create a pipeline from Wrangler, such a column will automatically get mapped to the CDAP decimal type.
On the other hand, if your dataset contains non-decimal data that you want to convert into a decimal type, you can do it using the set-column directive as shown below:
set-column :decimal_column exp:{new("java.math.BigDecimal", <input column name>)}
Once this directive is executed, the column's data type changes to BigDecimal and, as in the previous case, the pipeline schema contains the appropriate decimal type.
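The expression in the directive constructs a java.math.BigDecimal from the input value. The following plain-Java sketch (not Wrangler code) shows how the constructors behave for different input types. It assumes the expression resolves to the constructor that matches the value's type; the class and method names are illustrative only.

```java
import java.math.BigDecimal;

class DecimalConversion {
    // Exact: the scale is taken from the digits written in the string.
    static BigDecimal fromString(String s) {
        return new BigDecimal(s);
    }

    // Exact: whole numbers get a scale of 0.
    static BigDecimal fromLong(long v) {
        return new BigDecimal(v);
    }

    // Carries over the binary approximation stored in the double,
    // so the result may have many more digits than the literal suggests.
    static BigDecimal fromDouble(double v) {
        return new BigDecimal(v);
    }

    public static void main(String[] args) {
        System.out.println(fromString("1.05"));         // 1.05
        System.out.println(fromLong(42L));              // 42
        System.out.println(fromDouble(0.5));            // 0.5 (exactly representable in binary)
        System.out.println(fromDouble(1.05).scale());   // much larger than 2
        System.out.println(BigDecimal.valueOf(1.05));   // 1.05 (goes through Double.toString)
    }
}
```

If the source values are floating point and the exact decimal digits matter, converting from a string representation is usually the safer route.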
Note: The <input column name> can be of type String, Integer, Long, Float, or Double.
If your dataset includes values with varying scale, such as 1.05, 2.698, and 5.8745512, you need to set the scale with a Wrangler directive and also edit the schema in the pipeline to set the scale for the decimal column.
To set the scale in Wrangler, use a directive similar to the following, where <scale> is the number of digits to keep after the decimal point:
set-column :output_col exp:{new("java.math.BigDecimal", decimal_col).setScale(<scale>)}
For example, to convert a string column called cost to a decimal with a scale of 9 and write the results to a new column called output_col, use the following directive:
set-column :output_col exp:{new("java.math.BigDecimal", cost).setScale(9)}
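To illustrate what setScale does to values of varying scale, here is a minimal plain-Java sketch using the sample values mentioned earlier. The helper toScale and the choice of RoundingMode.HALF_UP are assumptions made for illustration; they are not part of the directive above.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class ScaleExample {
    // Hypothetical helper: pad or round a value to a fixed scale.
    // HALF_UP is an illustrative rounding choice, not a Wrangler default.
    static BigDecimal toScale(String raw, int scale) {
        return new BigDecimal(raw).setScale(scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Increasing the scale only pads with zeros, so no rounding mode is needed.
        for (String raw : new String[] {"1.05", "2.698", "5.8745512"}) {
            System.out.println(new BigDecimal(raw).setScale(9));
        }
        // prints 1.050000000, 2.698000000, 5.874551200

        // Reducing the scale discards digits, so a rounding mode is required.
        System.out.println(toScale("5.8745512", 3)); // 5.875

        try {
            new BigDecimal("5.8745512").setScale(3); // no rounding mode given
        } catch (ArithmeticException e) {
            System.out.println("setScale(3) without a rounding mode throws");
        }
    }
}
```

Choosing a scale at least as large as the longest fractional part in the data avoids the rounding question entirely.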
Transforming decimal data
Because the underlying data type for decimal columns in Wrangler is the Java BigDecimal class, you can use BigDecimal methods to transform these columns once they have been converted. In all the following directives, decimal_col is the decimal column to transform and output_col is the column that receives the result of the operation:
Transformation | Directive |
---|---|
Get the absolute value | set-column :output_col decimal_col.abs() |
Get the precision of a decimal value | set-column :output_col decimal_col.precision() |
Get the scale of a decimal value | set-column :output_col decimal_col.scale() |
Get the unscaled value of a decimal value | |
Add two decimal columns | |
Subtract a decimal from another | |
Multiply a decimal with another | |
Divide a decimal column by another and return the quotient | |
Divide a decimal column by another and return the remainder | |
Convert decimal to an integer | |
Convert decimal to a long | |
Convert decimal to a float | |
Convert decimal to a double | |
Check if a decimal value is equal to another | |
Find the maximum of two decimal columns | |
Find the minimum of two decimal columns | |
Move the decimal point n places to the left | |
Move the decimal point n places to the right | |
Get the nth power of a decimal | |
Negate a decimal | |
Strip trailing zeros in a decimal | |
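The transformations listed above correspond to standard methods of java.math.BigDecimal. The following plain-Java sketch (run outside Wrangler) exercises those methods on two sample values. The sample values and the helpers sameValue and safeDivide are illustrative assumptions, not Wrangler features.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class DecimalTransforms {
    // Value comparison that ignores scale (2.0 and 2.00 count as equal).
    static boolean sameValue(BigDecimal x, BigDecimal y) {
        return x.compareTo(y) == 0;
    }

    // Division with bounded precision, so non-terminating quotients do not throw.
    static BigDecimal safeDivide(BigDecimal x, BigDecimal y) {
        return x.divide(y, MathContext.DECIMAL64);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("-12.500"); // sample values, illustrative only
        BigDecimal b = new BigDecimal("4");

        System.out.println(a.abs());                // 12.500
        System.out.println(a.precision());          // 5
        System.out.println(a.scale());              // 3
        System.out.println(a.unscaledValue());      // -12500
        System.out.println(a.add(b));               // -8.500
        System.out.println(a.subtract(b));          // -16.500
        System.out.println(a.multiply(b));          // -50.000
        System.out.println(a.divide(b));            // -3.125
        System.out.println(a.remainder(b));         // remainder of a / b
        System.out.println(a.intValue());           // -12
        System.out.println(a.longValue());          // -12
        System.out.println(a.floatValue());         // -12.5
        System.out.println(a.doubleValue());        // -12.5
        System.out.println(a.max(b));               // 4
        System.out.println(a.min(b));               // -12.500
        System.out.println(a.movePointLeft(2));     // -0.12500
        System.out.println(a.movePointRight(2));    // -1250.0
        System.out.println(a.pow(2));               // 156.250000
        System.out.println(a.negate());             // 12.500
        System.out.println(a.stripTrailingZeros()); // -12.5

        // equals() also compares scale; compareTo() compares numeric value only.
        BigDecimal x = new BigDecimal("2.0");
        BigDecimal y = new BigDecimal("2.00");
        System.out.println(x.equals(y));            // false
        System.out.println(sameValue(x, y));        // true

        // 1 / 3 has no exact decimal expansion, so plain divide() would throw.
        System.out.println(safeDivide(BigDecimal.ONE, new BigDecimal("3")));
    }
}
```

Two behaviors are worth keeping in mind when using these methods in a directive: equals treats values with different scales as different, and divide throws an ArithmeticException when the exact quotient has a non-terminating decimal expansion unless a precision or scale is supplied.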
Created in 2020 by Google Inc.