Wrangler silently mangles values when the wrong schema is given

Description

The Wrangler transform lets the user set an output schema. However, Wrangler doesn't check that the data it processes actually matches that schema. When a value doesn't match, Wrangler still emits it under the declared type, and the mismatch only surfaces later in the pipeline as errors that are hard to trace back to Wrangler.

This is related to CDAP-15317, where CDAP assumes plugin authors are well-behaved and will do proper type/error checking.

Release Notes

None

Activity

Albert Shau, January 26, 2021 at 5:40 PM

The relevant code is in the RecordConvertor.decode() method, where a value for any logical type is passed through as-is, without checking its Java type. For example, a String can be set on a field declared as a decimal, which puts an invalid value into the output record.
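
To illustrate the kind of check that is missing, here is a minimal sketch in plain Java. It does not use the actual Wrangler or CDAP classes: LogicalTypeCheck, LogicalType, and decodeChecked are hypothetical names, and the accepted Java types per logical type are assumptions made for the example. The idea is only that a value should be rejected at decode time, with the field name in the message, instead of being returned unchanged.

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import java.time.LocalDate;

public class LogicalTypeCheck {

  // Hypothetical stand-in for the logical types an output schema field can declare.
  public enum LogicalType { DECIMAL, DATE }

  // Returns the value unchanged if its Java type is acceptable for the declared
  // logical type, and fails fast with a descriptive message otherwise.
  public static Object decodeChecked(String fieldName, Object value, LogicalType type) {
    if (value == null) {
      // Nullability is assumed to be validated elsewhere.
      return null;
    }
    boolean ok;
    switch (type) {
      case DECIMAL:
        // Assumption: decimals arrive as BigDecimal or as raw bytes.
        ok = value instanceof BigDecimal
            || value instanceof byte[]
            || value instanceof ByteBuffer;
        break;
      case DATE:
        // Assumption: dates arrive as LocalDate or as an int count of days.
        ok = value instanceof LocalDate || value instanceof Integer;
        break;
      default:
        ok = false;
    }
    if (!ok) {
      throw new IllegalArgumentException(String.format(
          "Field '%s' is declared as %s in the output schema but the value is of type %s",
          fieldName, type, value.getClass().getName()));
    }
    return value;
  }
}
```

With a check along these lines, passing the String "12.50" for a field declared as DECIMAL would fail inside Wrangler with the field name in the message, rather than producing a record that breaks a later stage.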

Created January 26, 2021 at 5:37 PM
Updated January 26, 2021 at 5:40 PM