CDAP Release 6.1.2
Important: CDAP 6.1.2 is deprecated.
Summary
This release primarily focuses on bug fixes and performance improvements. Some of the highlights include:
Performance improvements
Improve preview performance and limit concurrent preview runs to 10 by default
Move polling logic to the UI to avoid polling leaks in the Node.js server
Use batch APIs in the UI to reduce the load on backend services
Pipeline and Plugin fixes
Support Field Level Lineage for Streaming pipelines
Improve Field Level Lineage computation algorithm
Add support for Spark 2.4
Reduce memory consumption during pipeline execution
New Features
CDAP-15579 - Added the ability for SparkCompute and SparkSink to record field lineage.
CDAP-16107 - Added support for Spark 2.4.
CDAP-13643 - Added the ability to record field lineage for streaming pipelines.
Bug Fixes
CDAP-16002 - Fixed a bug that caused errors when Wrangler's parse-as-csv directive with the header option was used while reading multiple small files.
CDAP-16526 - Fixed the BigQuery sink to properly allow certain types as clustering fields.
CDAP-16471 - Fixed a bug that caused zombie processes when using the Remote Hadoop Provisioner.
CDAP-16472 - Fixed a bug that prevented getSchema from working for database plugins.
CDAP-16453 - Fixed a bug that caused the DBSource plugin to fail in preview mode.
CDAP-16309 - Fixed a race condition that could cause failures when running Spark programs.
Improvements
CDAP-16517 - Added an option to skip the header in files in the delimited, csv, tsv, and text formats.
CDAP-16525 - Added an option for the database source to replace characters in field names.
CDAP-16308 - Reduced preview startup time by 60%. Also added a limit on the maximum number of concurrent preview runs (10 by default).
CDAP-16509 - Reduced the memory footprint of StructuredRecord, which improves overall memory consumption during pipeline execution.
CDAP-16339 - Introduced a new REST endpoint for fetching the scheduled time for multiple programs.
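As a rough illustration of the skip-header option described in CDAP-16517, the sketch below reads several delimited files and drops the first line of each one when the option is enabled. This is not CDAP's implementation: the function name `read_delimited` and the parameter `skip_header` are hypothetical, and the files are modeled as in-memory strings to keep the example self-contained.

```python
import csv
import io
from typing import Iterable, Iterator, List


def read_delimited(
    file_contents: Iterable[str],
    delimiter: str = ",",
    skip_header: bool = False,
) -> Iterator[List[str]]:
    """Yield rows from several delimited files (hypothetical helper).

    When skip_header is True, the first line of EACH file is discarded,
    so header rows never leak into the data when many small files are read.
    """
    for content in file_contents:
        reader = csv.reader(io.StringIO(content), delimiter=delimiter)
        if skip_header:
            # Discard the header row; the default avoids an error on empty files.
            next(reader, None)
        yield from reader


# Two small files that share the same header line.
files = ["id,name\n1,alice\n", "id,name\n2,bob\n"]
rows = list(read_delimited(files, skip_header=True))
```

The header is skipped per file rather than once per read, which is the case that matters when a source reads many small files that each carry their own header line.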
Created in 2020 by Google Inc.