CDAP Release 6.0.0
Important: CDAP 6.0.0 is deprecated.
Summary
This release introduces a number of new features, improvements, and bug fixes for CDAP. Some of the main highlights of the release are:
Storage SPIs
Storage SPIs provide an abstraction over all system storage used by CDAP, making CDAP more portable across runtime environments, whether Hadoop or Hadoop-free.
Portable Runtime
Provides a runtime architecture that lets CDAP support both Hadoop and Hadoop-free environments, such as Kubernetes, in a distributed and secure fashion.
Pipeline Enhancements
Improved the experience of building pipelines with features such as copy and paste and a minimap of the pipeline.
Added support for more data types.
New Features
CDAP-14330 - Added Google Cloud Storage copy and move action plugins.
CDAP-14533 - Added a new pipeline list user interface.
CDAP-14613 - Added a minimap to the pipeline canvas.
CDAP-14645 - Added support for running CDAP system services in a Kubernetes environment.
CDAP-14657 - Added the ability to copy and paste a node in pipeline studio.
CDAP-15058 - Added the ability to limit the number of concurrent pipeline runs.
CDAP-15095 - Added support for toggling Stackdriver integration in Google Cloud Dataproc clusters.
CDAP-15256 - Added support for Numeric and Array types in Google BigQuery plugins.
CDAP-15339 - Added support for showing decimal field types in plugin schemas in pipeline view.
Improvements
CDAP-13632 - Added support for CDH 5.15.
CDAP-14653 - Revamped the top navbar of the CDAP UI based on material design.
CDAP-14667 - Secure store now supports integration with other KMS systems, such as Google Cloud KMS, through new Secure Store SPIs.
CDAP-7208 - Improved CDAP Master logging of events related to programs that it launches.
CDAP-14343 - Provisioning tasks now use a shared thread pool to increase thread utilization.
CDAP-14569 - Improved performance of the LevelDB-backed Table implementation.
CDAP-14571 - Wrangler now supports secure macros in connections.
CDAP-14617 - Significantly improved performance of the Transactional Messaging System.
CDAP-14821 - Added early validation for the properties of the Google BigQuery sink so that invalid properties fail during pipeline deployment instead of at runtime.
CDAP-14823 - Improved the error message when a null value is read for a non-nullable field in Avro file sources.
CDAP-15047 - System artifacts are now loaded in parallel instead of sequentially.
CDAP-15059 - Improved the Google Cloud Dataproc provisioner to allow configuring the default projectID from CDAP configuration.
CDAP-15318 - Added support for using runtime arguments to pass extra configurations to the Google Cloud Dataproc provisioner.
CDAP-14579 - Added support for spaces in file paths for the Google Cloud Storage plugin.
CDAP-14897 - Google BigQuery source now validates the schema when the pipeline is deployed.
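As a rough illustration of the kind of change described in CDAP-15047, the sketch below contrasts sequential and parallel loading using a thread pool. It is not CDAP code: the `load_artifact` function and the artifact names are hypothetical stand-ins, and the pool size is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def load_artifact(name: str) -> str:
    """Hypothetical stand-in for loading one system artifact."""
    return f"loaded:{name}"


def load_sequentially(names):
    # One artifact at a time; total time is the sum of all load times.
    return [load_artifact(name) for name in names]


def load_in_parallel(names, max_workers=4):
    # Loads run concurrently on a thread pool; Executor.map returns
    # results in the same order as the input names.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_artifact, names))


# Hypothetical artifact names, used only for illustration.
artifacts = ["artifact-a", "artifact-b", "artifact-c"]
sequential_result = load_sequentially(artifacts)
parallel_result = load_in_parallel(artifacts)
```

Both approaches produce the same results in the same order; the parallel version only helps when individual loads are dominated by I/O or other waiting.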
Bug Fixes
CDAP-12211 - Fixed a casting bug for the DB source where unsigned integer columns were incorrectly being treated as integers instead of longs.
CDAP-13410 - Removed the need for ZooKeeper for service discovery in remote runtime environment.
CDAP-7230 - Fixed an issue with recording lineage for realtime sources.
CDAP-12941 - Fixed dynamic Spark plugin to use the appropriate context classloader for loading dynamic Spark code.
CDAP-13554 - Fixed a bug that caused MapReduce pipelines to fail when using too many macros.
CDAP-13982 - Fixed an issue that caused pipelines with too many macros to fail when running in MapReduce.
CDAP-14666 - Fixed an issue with publishing metadata changes for profile assignments.
CDAP-14691 - Fixed a bug that would cause workspace ids to clash when wrangling items of the same name.
CDAP-14702 - Fixed a bug in secure store caused by breaking changes in Java update 171. Users can now get secure keys on Java 8u171.
CDAP-14708 - Fixed a bug that caused Google Cloud Dataproc clusters to fail provisioning if a firewall rule that denies ingress traffic existed in the project.
CDAP-14709 - Fixed a bug that would cause data preparation to fail when preparing a large file in Google Cloud Storage.
CDAP-14724 - Fixed a bug that caused action-only pipelines to fail when running using a cloud profile.
CDAP-14744 - Fixed an issue with adding business tags to an entity.
CDAP-14778 - Fixed an issue in handling metadata search parameters.
CDAP-14779 - Fixed a bug that would cause pipelines to fail on remote clusters if the very first pipeline run was an action-only pipeline.
CDAP-14857 - Fixed the standard deviation aggregate functions to work even if there is only one element in a group.
CDAP-14951 - Fixed a bug in the Google BigQuery sink that would cause pipelines to fail when writing to a dataset in a different region.
CDAP-15001 - Fixed a race condition in processing profile assignments.
CDAP-15013 - Fixed an issue that could cause inconsistencies in metadata.
CDAP-15069 - Fixed an issue with displaying workspace metadata in the UI.
CDAP-15127 - Fixed a race condition in the remote runtime SCP implementation that could cause the process to hang.
CDAP-15196 - Fixed an issue with metadata search result pagination.
CDAP-15223 - Fixed an issue in Wrangler DB connections where a bad JDBC driver could stay in the cache for 60 minutes, making the DB connection unusable.
CDAP-15249 - Fixed a NullPointerException in Google Cloud Dataproc provisioning when no network was configured.
CDAP-15299 - Fixed a bug that caused some aggregator and joiner keys to be dropped if they hashed to the same value as another key.
CDAP-15332 - Fixed a bug where the RuntimeMonitor did not reconnect through SSH correctly, causing failures in monitoring the correct program state.
CDAP-15369 - Fixed Google Cloud Dataproc runtime for Google Cloud Platform projects where OS Login is enabled.
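To see why the casting fix in CDAP-12211 matters, the following minimal sketch shows what happens when the same four bytes are read as a signed 32-bit integer versus an unsigned one. It is a general illustration in Python, not the DB source's actual code; the helper names and the sample value are assumptions chosen for the example.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def as_signed_int32(raw: bytes) -> int:
    # Interprets 4 big-endian bytes as a signed 32-bit integer.
    return struct.unpack(">i", raw)[0]


def as_unsigned_int32(raw: bytes) -> int:
    # Interprets the same 4 bytes as an unsigned 32-bit integer.
    return struct.unpack(">I", raw)[0]


# A value that fits in an unsigned 32-bit column but not in a signed int.
raw = struct.pack(">I", 3_000_000_000)

signed_value = as_signed_int32(raw)      # -1294967296 (wrapped, wrong)
unsigned_value = as_unsigned_int32(raw)  # 3000000000 (needs a 64-bit long)
```

Any unsigned value above `INT32_MAX` wraps to a negative number when treated as a signed 32-bit integer, which is why a wider type such as a long is needed to represent the full unsigned range.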
Deprecated and Removed Features
CDAP-15241 - Deprecated HDFSMove and HDFSDelete plugins from core plugins.
CDAP-14591 - Removed Streams and Stream Views, which were deprecated in CDAP 5.0.
CDAP-14592 - Removed Flow, which was deprecated in CDAP 5.0.
CDAP-14529 - Removed the deprecated HDFSSink plugin.
CDAP-14772 - Removed the plugin endpoints feature to prevent execution of plugin code in the CDAP Master. Endpoints were only used for schema propagation, which has moved to the pipeline system service.
CDAP-14886 - Removed support for custom routing for user services.
Created in 2020 by Google Inc.