CDAP Release 6.10.0

Release date: January 11, 2024

Improvements

CDAP-15361: Wrangler is schema aware.

CDAP-20799: CDAP supports multi-pipeline pull and push as part of source control management with GitHub.

CDAP-20831: If a task is stuck, task workers are forcefully restarted.

CDAP-20868: Added the capability to run concurrent tasks in task workers.

PLUGIN-1694: Added validation for incorrect credentials in the Amazon S3 source.

Changes

CDAP-20904 and CDAP-20581: In Source Control Management, the GitHub personal access token (PAT) was removed from the CDAP web interface for repository configurations.

CDAP-20846: Improved latency when BigQuery pushdown is enabled by fetching artifacts from a local cache.

PLUGIN-1718: The BigQuery sink supports flexible table names and column names.

PLUGIN-1692: BigQuery sinks support ingesting data to JSON data type fields.

PLUGIN-1705: In BigQuery sink jobs, you can add labels in the form of key-value pairs.

PLUGIN-1729: In BigQuery execute jobs, you can add labels in the form of key-value pairs.

PLUGIN-1293: The Cloud Storage Java Client is upgraded to version 2.3 or later.

Fixes

CDAP-20521: Fixed an issue causing columns that have all null values to be dropped in Wrangler.

CDAP-20587: Fixed an issue causing slow API responses when fetching the runs of all applications in a namespace.

CDAP-20815: Fixed an issue causing upgraded pipelines to not have the intended description.

CDAP-20839: Made the following fixes to Wrangler grammar:

  • The NUMERIC token type supports negative numbers.

  • The PROPERTIES token type supports one or more properties.

PLUGIN-1681: Fixed an issue in the Postgres DB plugin causing macros to be unsupported for database configuration.

Deprecated

The Spark compute engine running on Scala 2.11 is no longer supported.

Breaking

Memory usage might increase for pipelines that use Dataproc 2.1 clusters. If you upgrade your instance to version 6.10.0 or later, and previous pipelines are failing due to memory issues, increase the driver and executor memory to 2048 MB in the Resources configuration for the pipeline.

Alternatively, you can override the Dataproc version by setting the system.profile.properties.imageVersion runtime argument to 2.0-debian10.
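As a rough illustration of the override described above, the following Python sketch merges the image version into a set of runtime arguments and prints them as JSON. Only the key system.profile.properties.imageVersion and the value 2.0-debian10 come from this release note. The helper name with_dataproc_image and the input.path argument are hypothetical, and how the arguments are passed to a pipeline run (web interface, REST API, or CLI) is left open.

```python
import json

# Runtime argument named in the release note; its value pins the Dataproc image.
IMAGE_VERSION_ARG = "system.profile.properties.imageVersion"


def with_dataproc_image(runtime_args, image_version="2.0-debian10"):
    """Return a copy of runtime_args with the Dataproc image version overridden.

    Hypothetical helper for illustration; it is not part of CDAP.
    """
    merged = dict(runtime_args)  # copy so the caller's dict is left unchanged
    merged[IMAGE_VERSION_ARG] = image_version
    return merged


# "input.path" is a made-up, pre-existing argument used only for illustration.
existing_args = {"input.path": "gs://example-bucket/in"}
args = with_dataproc_image(existing_args)

# Print the merged arguments as JSON, for example to paste into a request body.
print(json.dumps(args, indent=2))
```

Returning a copy rather than mutating the input keeps the original arguments available if the override needs to be dropped again later.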

Created in 2020 by Google Inc.