CDAP Release 6.1.1
Important: CDAP 6.1.1 is deprecated.
Summary
This release introduces a number of new features, improvements, and bug fixes for CDAP. Some of the main highlights of the release are:
Pipeline improvements
Validation checks for plugins for early error detection and prevention
New widgets for better pipeline configurability
Wrangler ADLS connection
Field Level Lineage
New, intuitive UI for field level lineage
Field level lineage support for more plugins
Platform enhancements
Performance improvements across the platform
Migration of more UI components from Angular to React
New Features
CDAP-16102 - Added field level lineage support for Error Transform
CDAP-16037 - Added region support for Google Cloud plugins
CDAP-15795 - New UI landing page
CDAP-15789 - Allowed plugin developers to define filters that show or hide properties based on custom plugin configuration logic.
CDAP-15787 - Introduced new FailureCollector APIs for better user experience via contextual error messages
CDAP-15767 - Added support for reading INT96 types in Parquet file sources.
CDAP-15728 - New ConfigurationGroup component in UI
CDAP-15723 - Added support for running pipelines in a shared VPC network
CDAP-15619 - Added stage level validation for plugin properties.
CDAP-15482 - Added a new REST endpoint that retrieves all field lineage information for a dataset.
CDAP-15342 - Added support for bytes types in the BigQuery sink
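CDAP-15787 and CDAP-15619 describe validating plugin properties per stage and reporting contextual error messages. The Python sketch below illustrates that general pattern only: collect every failure with its context instead of stopping at the first one. All names in it (ValidationFailure, SimpleFailureCollector, add_failure, raise_if_failures, validate_stage, and the "path" and "format" properties) are hypothetical and are not the CDAP FailureCollector API, which is a Java API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValidationFailure:
    """One problem found while validating a stage (hypothetical type)."""
    message: str
    corrective_action: Optional[str] = None
    config_property: Optional[str] = None


class ValidationError(Exception):
    """Raised once, carrying every failure that was collected."""

    def __init__(self, failures: List[ValidationFailure]):
        super().__init__(f"{len(failures)} validation failure(s)")
        self.failures = failures


@dataclass
class SimpleFailureCollector:
    """Accumulates failures so that all of them can be reported together."""
    failures: List[ValidationFailure] = field(default_factory=list)

    def add_failure(self, message, corrective_action=None, config_property=None):
        self.failures.append(
            ValidationFailure(message, corrective_action, config_property)
        )

    def raise_if_failures(self):
        if self.failures:
            raise ValidationError(list(self.failures))


def validate_stage(config: dict) -> SimpleFailureCollector:
    """Check a made-up stage configuration and record each problem found."""
    collector = SimpleFailureCollector()
    if not config.get("path"):
        collector.add_failure("Path is not set.", "Provide an input path.", "path")
    if config.get("format") not in {"csv", "tsv", "parquet"}:
        collector.add_failure(
            f"Unsupported format: {config.get('format')!r}.",
            "Use one of csv, tsv, or parquet.",
            "format",
        )
    return collector
```

Because failures are tied to a property name, a caller could show each message next to the field that caused it rather than as one generic error.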
Deprecation
CDAP-15917 - Removed the outdated Validator plugin
Bug Fixes
CDAP-16193 - Fixed the preview run state after a JVM restart
CDAP-16146 - Content type detection now uses case-insensitive file extensions
CDAP-16137 - Fixed a bug that prevented users from navigating to the pipeline studio (the UI indicated that system artifacts were loading for a long time).
CDAP-15973 - Fixed the Dataproc provisioner to log the error message if the Dataproc creation operation fails.
CDAP-15899 - Fixed a bug that caused pipeline startup to take longer than needed for cloud runs
CDAP-15879 - Fixed regex usage in GCS and S3 source plugins.
CDAP-15878 - Fixed a bug with the Datastore source that was overly restrictive when validating the user-provided schema
CDAP-15809 - Fixed a bug that could cause a thread to spin in an infinite while loop when multiple threads consumed from a queue that allows only a single consumer.
CDAP-15770 - Fixed a bug that caused pipeline failures when writing nullable byte fields as JSON.
CDAP-15757 - Fixed a bug that caused MapReduce and Spark logs to be missing for remote pipeline runs
CDAP-15747 - Fixed a race condition that could cause a program to get stuck in the pending state when stopped in the pending state
CDAP-15742 - Added some safeguards to prevent cloud pipeline runs from getting stuck in certain edge cases
CDAP-15726 - Fixed a bug where secure macros were not evaluated in preview mode
CDAP-15617 - Fixed a bug in the BigQuery source that caused automatic bucket creation to fail if the dataset is in a different project.
CDAP-15583 - Fixed a bug in the new user tour on lower-resolution screens
CDAP-15554 - Fixed a bug where the wrong resolution was used when a time range was specified for a metrics query
CDAP-15535 - Fixed an issue where the BigQuery multi sink did not work when an Oracle database was used as a source.
CDAP-15498 - Fixed the Dataproc provisioner to disable YARN pre-emptive container killing and to disable Conscrypt.
CDAP-15445 - Fixed a bug in the MLPredictor plugin that caused an error when using a classification model
CDAP-15423 - Fixed a bug that prevented users from pasting a schema as a runtime argument
CDAP-15388 - Spark pipelines no longer try to run sinks in parallel unless runtime argument 'pipeline.spark.parallel.sinks.enabled' is set to 'true'. This prevents pipeline sections from being re-processed in the majority of situations.
CDAP-15373 - Fixed the Dataproc provisioner to handle networks that do not use automatic subnet creation
CDAP-15353 - Fixed a Wrangler bug where the wrong JDBC driver would be used in some situations and where required classes could be unavailable.
CDAP-15221 - Fixed a bug in artifact version comparison
CDAP-15206 - Fixed a bug where the rollup of the workflow lineage did not remove local datasets.
CDAP-15097 - Expanded the filename formats that the UI accepts when uploading artifacts.
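CDAP-15388 above names the runtime argument 'pipeline.spark.parallel.sinks.enabled'. As a minimal sketch, assuming runtime arguments are passed as a flat map of string keys to string values, the Python below assembles such a map. The helper name build_runtime_args and the extra key "input.path" are hypothetical; only the parallel-sinks key and its 'true' value come from the release note, and how the map is submitted (UI, CLI, or REST) is not shown.

```python
import json
from typing import Dict, Optional

# Key taken from the CDAP-15388 release note.
PARALLEL_SINKS_KEY = "pipeline.spark.parallel.sinks.enabled"


def build_runtime_args(parallel_sinks: bool,
                       extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a runtime-argument map, opting in to parallel sinks on request.

    When parallel_sinks is False the key is left out entirely, which relies
    on the default described in the release note (sinks are not parallel).
    """
    args = dict(extra or {})
    if parallel_sinks:
        args[PARALLEL_SINKS_KEY] = "true"
    else:
        args.pop(PARALLEL_SINKS_KEY, None)
    return args


if __name__ == "__main__":
    # "input.path" is a made-up argument used only for illustration.
    print(json.dumps(build_runtime_args(True, {"input.path": "/data/in"}), indent=2))
```

Leaving the key unset rather than writing 'false' keeps the argument map small and makes the opt-in explicit.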
Improvements
CDAP-16110 - Fixed batch pipeline preview to read only the preview records instead of the full input.
CDAP-16069 - Greatly reduced the time it takes to calculate field level lineage
CDAP-15983 - Set Spark as the default execution engine for batch pipelines
CDAP-15794 - Improved error message for CSV, TSV, and delimited formats when the schema has fewer fields than the data
CDAP-15782 - Added support to automatically fill field level lineage for plugins that do not emit any
CDAP-15738 - Upgraded Node.js from 8.x to 10.16.2
CDAP-15677 - Added support to restore preview status after restart
CDAP-15659 - Routed users directly to the pipeline's detail page from the pipeline card in Control Center.
CDAP-15489 - New user experience for log level selection.
CDAP-15265 - Added image version as a configuration setting to the Dataproc provisioner
CDAP-16076 - Improved the way pipelines with macros that are provided by intermediate stages run.
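CDAP-15794 above concerns the case where a delimited record has more fields than the schema defines. The Python sketch below shows one way such a mismatch could be reported with a descriptive message. It is an illustration only, not CDAP's parser: the function name parse_delimited, the message wording, and the choice to fill missing trailing values with None are all assumptions, and the naive split does not handle quoted delimiters.

```python
from typing import Dict, List, Optional


def parse_delimited(line: str,
                    field_names: List[str],
                    delimiter: str = ",") -> Dict[str, Optional[str]]:
    """Split one record and map its values onto the schema's field names."""
    values = line.split(delimiter)
    if len(values) > len(field_names):
        # Report both counts and the schema fields so the user can see the gap.
        raise ValueError(
            f"Found {len(values)} fields in the record, but the schema defines "
            f"only {len(field_names)} ({', '.join(field_names)}). "
            "Add the missing fields to the schema or check the delimiter."
        )
    # Sketch choice: schema fields with no corresponding value become None.
    padded = values + [None] * (len(field_names) - len(values))
    return dict(zip(field_names, padded))
```

For example, parsing "1,alice" against the fields id and name yields a two-entry record, while "1,alice,extra" against the same schema raises the error with both field counts in the message.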
Created in 2020 by Google Inc.