Oracle batch source plugin (versions 1.9.1 and 1.8.5) converts Oracle NUMBER data type with undefined precision and scale to CDAP string

In Oracle batch source versions 1.9.0, 1.8.3, and earlier:

  • The Oracle NUMBER data type with undefined precision and scale is mapped to the CDAP decimal(38,0) data type.

While plugin versions 1.9.1, 1.8.4, and 1.8.5 are backward incompatible, pipelines working with earlier versions might not work after upgrading to versions 1.9.1, 1.8.5, and 1.8.4, if a downstream stage in the pipeline relies on the output schema of the source because the output schema has changed. When there is an output schema defined for the Oracle NUMBER data type defined without precision and scale in the previous plugin version, then after upgrading to versions 1.9.1, 1.8.5, or 1.8.4, the Oracle batch source plugin throws a schema mismatch error for the such types stating: Schema field '<field name>' is expected to have type 'decimal with precision <precision> and scale <scale> but found 'string'. Change the data type of field <field name> to string.

Versions 1.9.1, 1.8.5, and 1.8.4 will work with an output schema of CDAP string data type for Oracle NUMBER data type defined without precision and scale. It is not advised to use the older version of the Oracle plugin if there is any Oracle NUMBER data type defined without precision and scale present in the Oracle source output schema, as it can lead to rounding errors.

The special case is when you use a macro for the database name, schema name, or table name, and if you haven't manually specified an output schema. Thus, the schema gets detected and mapped at runtime. The older version of the Oracle batch source plugin maps the Oracle NUMBER data type defined without precision and scale to the CDAP decimal(38,0) data type, while versions 1.9.1, 1.8.5, and 1.8.4 and higher will map such data types to string at runtime.

Recommendation

To resolve the possible precision loss issue while working with Oracle NUMBER data types with undefined precision and scale, upgrade your pipelines to use Oracle batch source plugin versions 1.9.1, 1.8.5, or 1.8.4.

After the upgrade, the Oracle NUMBER data type defined without precision and scale is mapped to the CDAP string data type at runtime. If you have a downstream stage or sink that consumes the original CDAP decimal data type (to which the Oracle NUMBER data type defined without precision and scale was mapped), either update it or expect it to consume string data. 

If you understand the risk of possible data loss due to rounding errors but still want continue to use Oracle NUMBER data type defined without precision and scale as CDAP decimal(38,0) data type then, deploy the Oracle plugin version 1.8.6 (for CDAP 6.7.3) or 1.9.2 (for CDAP 6.8.1) from the Hub, and update the pipelines to use them instead.

For more information, see CDAP Oracle Batch Source.

Created in 2020 by Google Inc.