Batch data pipelines with Spark engine fail on Spark 2.2

Description

The pipelines fail with the following exception in Spark 2.2 -

Release Notes

Fixed an issue where Spark 2.2 batch pipelines with HDFS sinks would fail with a delegation token error.

Activity

Poorna Chandra
April 6, 2018, 6:19 PM

This is because in Spark 2, rdd.saveAsNewAPIHadoopDataset(conf) expects conf to be a JobConf object that already contains the credentials. If a regular Configuration object is passed instead, Spark creates a new JobConf object from it, and the credentials in that new JobConf object are set to null. Hence the error.
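The mechanism described in this comment can be sketched with a small, self-contained Java model. It does not use the real Hadoop or Spark classes: Conf, JobConfModel, toJobConf and withCredentials are hypothetical stand-ins introduced only for illustration, and the missing credentials are modelled as an empty map rather than null. It is a rough sketch of why the tokens go missing and of one plausible caller-side workaround, not the actual fix applied for this issue.

```java
import java.util.HashMap;
import java.util.Map;

public class CredentialLossDemo {
    // Hypothetical stand-in for a plain Hadoop Configuration: settings only.
    static class Conf {
        final Map<String, String> props = new HashMap<>();
    }

    // Hypothetical stand-in for JobConf: settings plus credentials (alias -> token).
    static class JobConfModel extends Conf {
        final Map<String, String> credentials = new HashMap<>();
    }

    // Models the behaviour described in the comment: a JobConfModel is used as-is,
    // anything else is wrapped in a fresh one that copies settings but carries no tokens.
    static JobConfModel toJobConf(Conf conf) {
        if (conf instanceof JobConfModel) {
            return (JobConfModel) conf;
        }
        JobConfModel wrapped = new JobConfModel();
        wrapped.props.putAll(conf.props);
        return wrapped;
    }

    // Assumed workaround: build the JobConfModel up front and attach the tokens,
    // so that no credential-less wrapper is created later.
    static JobConfModel withCredentials(Conf conf, Map<String, String> tokens) {
        JobConfModel jobConf = new JobConfModel();
        jobConf.props.putAll(conf.props);
        jobConf.credentials.putAll(tokens);
        return jobConf;
    }

    public static void main(String[] args) {
        Conf plain = new Conf();
        plain.props.put("output.dir", "/tmp/out");

        Map<String, String> tokens = new HashMap<>();
        tokens.put("hdfs", "delegation-token");

        // Passing the plain configuration: the wrapper has no credentials.
        System.out.println("plain conf has tokens: "
                + !toJobConf(plain).credentials.isEmpty());

        // Passing a credential-carrying job configuration: tokens survive.
        System.out.println("job conf has tokens: "
                + !toJobConf(withCredentials(plain, tokens)).credentials.isEmpty());
    }
}
```

In the real API the equivalent step would presumably be to construct the JobConf from the configuration and add the current user's credentials to it before calling saveAsNewAPIHadoopDataset; that is an assumption drawn from the comment above, not something this page confirms.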

Poorna Chandra
April 6, 2018, 7:57 PM

Assignee

Poorna Chandra

Reporter

Poorna Chandra

Labels

None

Docs Impact

None

UX Impact

None

Components

Fix versions

Affects versions

Priority

Major