CDAP Release 6.6.0
Release Date: February 24, 2022
New Features
CDAP-18653: Added one-click autoscaling for Dataproc compute profiles.
Enhancements
PLUGIN-994: Added support for Fetch Size to the following plugins with the new limit of 1000 rows:
CDAP-18738: Dataproc Cluster Reuse. The runtime property system.profile.properties.clusterReuseEnabled is no longer required to enable cluster reuse. The default Max Idle Time is set to 30 minutes to prevent accidental cluster leaks.
CDAP-18725: Added more details for pipeline success and failure metrics.
CDAP-18712: Added ability to limit published lineage messages to a configurable size to avoid out of memory errors due to large lineages.
CDAP-18651: Preview runners no longer perform any kind of access enforcement.
CDAP-18647: Added a new limit of 5000 records for previewing data in the Pipeline Studio.
CDAP-18621: Added new default value of 30 minutes for the Dataproc profile Max Idle Time property. Previously, Max Idle Time had no default value.
CDAP-18836: Added temporary namespace UPDATE enforcement for pipeline connections.
CDAP-18798: Added the system.program.starting.delay.seconds metric to measure the time a program takes to transition from the provisioning state to the running state.
CDAP-18714: Added metrics for API call latency.
CDAP-18725: Added new tags (Provisioner, Cluster Status, Existing Status) to the existing program failure/success metrics.
CDAP-17772: Added authn/z between internal system services via token verification.
Instance Stability and Memory Usage
CDAP-18696: Added a new Applications parameter (app.max.concurrent.launching) to cdap-default.xml to control back pressure on pipeline start requests. Requests exceeding the limit fail with a 429 (Too Many Requests) status.
CDAP-18712: Added a new Metadata parameter (metadata.messaging.publish.size.limit) to cdap-default.xml to limit the size of published lineage messages and avoid out of memory errors due to large lineages.
CDAP-18672: Added a new Dataset parameter (data.storage.sql.scan.size.rows) to cdap-default.xml to set the number of rows fetched for database reads from PostgreSQL.
CDAP-18559, CDAP-17986: Added retries to Dataproc API calls to ensure transient errors don’t affect cluster provisioning.
CDAP-18594, CDAP-18810: Fixed a problem where a pipeline could not be deleted because its program state was not updated after retries.
CDAP-18857: Added a new Applications parameter (app.artifact.parallelism.max) to cdap-default.xml that limits artifact repository initialization parallelism to prevent Out of Memory errors on App Fabric startup.
CDAP-18848: Reduced the default of the Metrics parameter (metrics.processor.queue.size) from 20000 to 1000 to prevent Out of Memory errors during metric processing.
CDAP-18791, CDAP-18627, CDAP-18553: Improved LevelDB performance and memory usage.
CDAP-18748, CDAP-18737, CDAP-18685, CDAP-18680: Improved running pipelines handling during App Fabric restarts.
CDAP-18656: Prevented App Fabric Out Of Memory error when it’s asked to retrieve a long list of pipelines within a namespace.
CDAP-18603: Added pagination to Lifecycle Microservices List Applications.
CDAP-18586: Prevented App Fabric Out Of Memory when system argument list is too long.
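With the launch limit from CDAP-18696 in place, a client that starts many pipelines at once may receive 429 responses. One way a client could handle this is to retry with exponential backoff. The sketch below is not part of CDAP: `start_with_backoff`, `send_start_request`, and the default attempt count and delays are hypothetical names and values chosen for illustration. The caller supplies the function that actually issues the start request and returns the HTTP status code.

```python
import time
from typing import Callable

TOO_MANY_REQUESTS = 429  # status returned when the launch limit is exceeded


def start_with_backoff(
    send_start_request: Callable[[], int],
    max_attempts: int = 5,
    base_delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry a pipeline start request while the server answers 429.

    send_start_request is a hypothetical, caller-supplied function that
    issues the start request and returns the HTTP status code.
    Returns the last status code seen (429 if every attempt was throttled).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = TOO_MANY_REQUESTS
    for attempt in range(max_attempts):
        status = send_start_request()
        if status != TOO_MANY_REQUESTS:
            return status
        if attempt < max_attempts - 1:
            # Double the wait after each throttled attempt: 1s, 2s, 4s, ...
            sleep(base_delay_seconds * (2 ** attempt))
    return status
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting. Any status other than 429 is returned to the caller unchanged, so other failures are not retried.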
Bug Fixes
PLUGIN-1035: Fixed an issue that caused pipelines to fail when a Database batch source included a decimal column with precision greater than 19.
PLUGIN-1022: Fixed an issue that caused pipelines with a Conditional plugin and running on MapReduce to fail.
PLUGIN-1015: Fixed an issue that caused pipelines with a Conditional plugin and running on Spark to fail.
PLUGIN-974: Fixed an issue that caused validation to fail for GCS Multi File sinks.
Behavior Changes
CDAP-18586: The getApplicationSpecification() method in the io.cdap.cdap.api.schedule.ProgramStatusTriggerInfo interface has been removed in CDAP 6.6.0, which can break your build if you use this method.
Known Issues
SQL Server Replication Source
CDAP-19354: The default setting for the snapshot transaction isolation level (snapshot.isolation.mode) is repeatable_read, which locks the source table until the initial snapshot completes. If the initial snapshot takes a long time, this can block other queries.
If the transaction isolation level doesn't work or is not enabled on the SQL Server instance, follow these steps:
1. Configure SQL Server with one of the following transaction isolation levels:
In most cases, set snapshot.isolation.mode to snapshot.
If schema modification will not happen during the initial snapshot, set snapshot.isolation.mode to read_committed.
For more information, see Enable the snapshot transaction isolation level in SQL Server 2005 Analysis Services.
2. After SQL Server is configured, pass a Debezium argument to the Replication job. To pass a Debezium argument to a Replication job in CDAP, specify a runtime argument prefixed with source.connector. For example, set the Key to source.connector.snapshot.isolation.mode and the Value to snapshot.
For more information about setting a Debezium property, see Pass a Debezium argument to a Replication job.
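The prefixing rule in step 2 can be sketched as a small helper that turns Debezium property names into runtime argument keys. This is an illustration only: `debezium_runtime_args` and `SOURCE_CONNECTOR_PREFIX` are hypothetical names, not part of CDAP or Debezium, and the resulting Key/Value pairs still have to be supplied as runtime arguments to the Replication job.

```python
from typing import Dict, Mapping

# Prefix applied to Debezium properties passed as runtime arguments.
SOURCE_CONNECTOR_PREFIX = "source.connector."


def debezium_runtime_args(debezium_properties: Mapping[str, str]) -> Dict[str, str]:
    """Return runtime arguments with each Debezium property name prefixed."""
    return {
        SOURCE_CONNECTOR_PREFIX + name: value
        for name, value in debezium_properties.items()
    }


# Key/Value pair for the snapshot isolation level described above.
runtime_args = debezium_runtime_args({"snapshot.isolation.mode": "snapshot"})
print(runtime_args)
```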
Created in 2020 by Google Inc.