CDAP Hub Release Log
September 6, 2024
The CloudSQL MySQL plugin (version 1.10.7) is available in CDAP versions 6.9.0 and 6.10.0. This plugin version lets you use a macro to specify the name of the CloudSQL instance in the plugin's Connection name field.
CDAP-1640 Added support for macros for the connection name.
August 30, 2024
The Excel plugin (version 2.12.3) is available in CDAP version 6.10.0 and later. The release includes the following changes:
PLUGIN-1771 and PLUGIN-1795: Fixed an issue in the Excel batch source causing pipelines with large XLSX files to consume high memory and fail.
The Excel plugin (version 2.11.5) is available in CDAP 6.9 versions. The release includes the following changes:
PLUGIN-1771 and PLUGIN-1795: Fixed an issue in the Excel batch source causing pipelines with large XLSX files to consume high memory and fail.
The Excel plugin (version 2.10.3) is available in CDAP 6.8 versions. The release includes the following changes:
PLUGIN-1771 and PLUGIN-1795: Fixed an issue in the Excel batch source causing pipelines with large XLSX files to consume high memory and fail.
July 31, 2024
The Python Transform plugin (version 2.3.1) is available in CDAP version 6.10.1. The release includes the following change:
CDAP-21054 You can use a macro in the Script plugin property field to pass runtime arguments.
July 15, 2024
The GCS Copy/Move plugin (version 0.23.2), which is bundled with Google Cloud Platform plugins, is available in the CDAP Hub in CDAP 6.10.0 and later. The release includes the following change:
PLUGIN-698 You can use a wildcard character (*) in the source path to copy and move multiple files. For example, the source path gs://demo0/prod/reports/*.csv copies and moves all CSV files in the reports directory.
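To illustrate how a wildcard source path selects files, here is a minimal Python sketch. It only approximates the matching with the standard fnmatch module and is not the plugin's implementation. The object names and the match_objects helper are hypothetical, and the sketch assumes the wildcard does not descend into subdirectories, which the release note does not state.

```python
from fnmatch import fnmatchcase


def match_objects(pattern, object_names):
    """Return the object names that match a wildcard in the last path segment."""
    prefix, _, leaf_pattern = pattern.rpartition("/")
    matched = []
    for name in object_names:
        parent, _, leaf = name.rpartition("/")
        # Assumption: only objects directly under the same prefix are considered.
        if parent == prefix and fnmatchcase(leaf, leaf_pattern):
            matched.append(name)
    return matched


# Hypothetical bucket listing used only for illustration.
objects = [
    "gs://demo0/prod/reports/q1.csv",
    "gs://demo0/prod/reports/q2.csv",
    "gs://demo0/prod/reports/readme.txt",
    "gs://demo0/prod/reports/archive/old.csv",
]

selected = match_objects("gs://demo0/prod/reports/*.csv", objects)
for name in selected:
    print(name)
```

Under these assumptions, only q1.csv and q2.csv in the reports directory are selected.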
June 28, 2024
The Cloud Storage (GCS) Multi File sink plugin (version 0.23.2) is available in CDAP version 6.10.1 and later. The release includes the following change:
PLUGIN-1780: Fixed an issue causing pipelines to fail when Flexible schema was set to true.
June 20, 2024
The Oracle sink plugin (version 1.10.7) is available in CDAP version 6.9. The release includes the following change:
PLUGIN-1793: Fixed an issue in the Oracle sink causing null values to be assigned to fields in the input schema that have lowercase letters in the field name.
June 5, 2024
The Google Sheets plugin (version 1.4.3), which is bundled with the Google Drive plugins, is available in the CDAP Hub. The release includes the following changes:
PLUGIN-1785: Fixed an issue causing the Google Sheets plugin to incorrectly parse column names that have special characters.
PLUGIN-1791: Fixed an issue causing pipelines to fail when the Google Sheets plugin was used with Wrangler and any of the fields required to fetch the schema was a macro.
May 27, 2024
The Cloud Storage (GCS) Multi File sink plugin (version 0.22.8) is available in the Hub in CDAP version 6.9.2. The release includes the following change:
PLUGIN-1780: Fixed an issue in the Cloud Storage Multi File sink causing pipelines to fail when Flexible schema was set to true.
The Decompress plugin (version 1.2.1) is available in the CDAP Hub in CDAP version 6.10.1 and later. The release includes the following change:
PLUGIN-1743: Fixed an issue in the Decompress plugin causing concatenated GZIP files (.gz) to not decompress as intended. In version 1.2.1, decompression occurs until EOF is reached.
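The difference between stopping after the first GZIP member and reading until EOF can be sketched in Python with the standard gzip and zlib modules. This is an illustration of the general technique, not the Decompress plugin's code; the decompress_all_members helper and the sample payloads are hypothetical.

```python
import gzip
import zlib


def decompress_all_members(data: bytes) -> bytes:
    """Decompress every GZIP member in the input until no bytes remain."""
    chunks = []
    while data:
        # wbits = 16 + MAX_WBITS tells zlib to expect a GZIP header and trailer.
        member = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
        chunks.append(member.decompress(data))
        chunks.append(member.flush())
        # Bytes after the end of this member belong to the next member.
        data = member.unused_data
    return b"".join(chunks)


# Two independently compressed payloads appended into one .gz stream.
concatenated = gzip.compress(b"first member\n") + gzip.compress(b"second member\n")

# A single decompressor stops at the end of the first member.
single = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
first_only = single.decompress(concatenated)

everything = decompress_all_members(concatenated)
print(first_only)
print(everything)
```

The single decompressor returns only the first payload, while the loop continues with the leftover bytes until the input is exhausted.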
April 26, 2024
HTTP plugin (version 1.4.2) is available in the CDAP Hub in CDAP versions 6.8.0 and later. The release includes the following change:
PLUGIN-1781: Fixed an issue in the HTTP source causing an error in the retrieved schema when one of the retrieved columns contained a quoted value with a delimiter, such as a comma.
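The kind of schema error described here typically arises when a line is split on every delimiter without honoring quotes. The following Python sketch, using the standard csv module, shows the difference on a hypothetical record; it does not reproduce the HTTP source's parser.

```python
import csv
import io

# Hypothetical response body: the second column holds a quoted value with a comma.
raw = 'id,name,city\n1,"Doe, Jane",Paris\n'

# Naive splitting treats the comma inside the quotes as a delimiter.
naive_record = raw.splitlines()[1].split(",")

# A quote-aware reader keeps the quoted value as one field.
header, record = list(csv.reader(io.StringIO(raw)))

print(len(header), len(naive_record), len(record))
print(record)
```

With naive splitting the record has one more field than the header, so a schema inferred from it would not line up with the columns.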
April 12, 2024
Salesforce Marketing Cloud plugin (version 1.3.1) is available in the Hub in CDAP version 6.8.0 and later. The release includes the following change:
PLUGIN-1773: Fixed an issue in the Salesforce Marketing Cloud sink plugin causing upsert operations to fail.
April 2, 2024
Google Sheets plugin (version 1.4.2), which is bundled with the Google Drive plugins, is available in the CDAP Hub. The release includes the following changes:
PLUGIN-1762: Macros are supported for OAuth fields: Client ID, Client Secret, and Refresh Token.
PLUGIN-1763: You can specify a single file ID in the File Identifier field.
PLUGIN-1764: Added an Access Token field, which supports macros.
PLUGIN-1766: You can turn on auto detection for the number of rows and columns.
March 26, 2024
Redshift plugin (version 1.11.1) is available in the CDAP Hub in CDAP versions 6.10.0 and later. For more information, see Redshift batch source.
Redshift plugin (version 1.10.6) is available in the CDAP Hub in CDAP 6.9 versions. For more information, see Redshift batch source.
March 15, 2024
HTTP plugin (version 1.4.1) is available in the CDAP Hub in CDAP versions 6.8.0 and later. The release includes the following changes:
PLUGIN-1737: The HTTP batch source plugin supports schema detection for CSV and TSV file formats.
PLUGIN-1740: Fixed an issue in the HTTP sink plugin causing performance to degrade severely due to the read timeout property getting used as the timeout between calls to the HTTP server. To mitigate the issue, a new property, Wait Time Between Request, has been added in the HTTP Sink plugin to set up a time gap between requests.
PLUGIN-1757: Fixed an issue in the HTTP source plugin causing pipelines to fail when importing 10 million records.
March 14, 2024
Salesforce plugins (version 1.6.3) are available in the CDAP Hub in CDAP versions 6.8.0 and later. The release includes the following changes:
PLUGIN-1749: Fixed an issue in the Salesforce sink plugin that threw an unsupported type datetime error for DateTime type fields in the input schema. In this version, the Salesforce sink plugin supports datetime and decimal logical types.
PLUGIN-1767: Fixed an issue in all Salesforce plugins causing pipelines to fail while using an OAuth macro because the OAuth macro value didn't get passed to the plugin as intended. In this version, all Salesforce plugins support an OAuth macro.
PLUGIN-1768: When a failure occurs in the Salesforce sink and the Error handling property is set to Fail on error, the Salesforce job is aborted, which stops newer batches from being added to the job due to Spark retry settings in CDAP.
To make debugging easier, additional debug logs and batch results in logs are available.
February 27, 2024
Firestore plugins (version 1.1.0) are available in the CDAP Hub in CDAP versions 6.8.x and later. The release includes the following change:
PLUGIN-1753: Added support for Named Databases, updated the UI so that it is consistent with other plugins, and removed the deprecated ServiceAccountCredentials library.
February 12, 2024
CloudSQL PostgreSQL (version 1.10.5) is available in the CDAP Hub in CDAP versions 6.9.x. The release includes the following change:
PLUGIN-1640: Fixed an issue where the database and connection name were not supported as macros in Google CloudSQL plugins. Also added support for using different ports when connecting to a private Google Cloud SQL instance through a Compute Engine VM.
February 9, 2024
Google Drive plugins (version 1.4.1) are available in the CDAP Hub. The release includes the following change:
PLUGIN-1746: Fixed an issue which caused the validation in Google Drive Sheets Plugin to fail when using auto-detect.
February 8, 2024
Google Cloud Platform plugins (version 0.22.6) are available in the CDAP Hub in CDAP versions 6.9.1 and later. The release includes the following change:
PLUGIN-1735: Fixed an issue which caused the pipeline to fail with a timeout when using the GCS Copy Action Plugin to copy large files. A new config named ReadTimeout was introduced, which can be set to higher values when copying large files.
BigQuery Replication plugins (version 0.9.0) are available in Cloud Data Fusion version 6.10.0 with the following change:
The BigQuery Replication sink lets you use BigQuery tables that exist in one project in another project.
BigQuery Replication plugins (version 0.8.6) are available in Cloud Data Fusion 6.9 versions with the following change:
The BigQuery Replication sink lets you use BigQuery tables that exist in one project in another project.
BigQuery Replication plugins (version 0.7.5) are available in Cloud Data Fusion 6.8 versions with the following change:
The BigQuery Replication sink lets you use BigQuery tables that exist in one project in another project.
January 9, 2024
Snowflake plugins (version 1.1.2) are available in the CDAP Hub. The release includes the following changes:
PLUGIN-1721: Snowflake plugins support the timestamp_tz format.
PLUGIN-1731: Fixed an issue causing Snowflake pipelines to fail when using OAuth for authorization in the plugin.
PLUGIN-1734: Fixed an issue causing pipelines with a Snowflake sink to fail due to usage of an older version of the JDBC driver. The driver was upgraded from version 3.13.24 to 3.14.4, which is the latest version.
December 22, 2023
Salesforce plugins (version 1.6.2) are available in the CDAP Hub in CDAP versions 6.8.0 and later. The release includes the following changes:
PLUGIN-1719: Fixed an issue in the Salesforce plugin causing the following error in some pipelines that run more than 4 hours: java.lang.IllegalStateException: SSLException reading next record: javax.net.ssl.SSLException: Connection reset.
A Connection Timeout property was added to the Salesforce plugin properties in the web interface with the default value of 3600 seconds.
PLUGIN-1720: To ensure accuracy, the following change was made to the schema handling for referenced object fields: child fields are explicitly marked as nullable, regardless of the schema values in the referenced object.
In earlier versions, when retrieving schema information for fields in referenced objects, such as contact.account_lastmodifieddate, the schema inherited properties from the referenced object, causing incorrect, non-nullable assumptions.
PLUGIN-1706: A retry mechanism has been added in the Salesforce batch source and Multi-Source plugins for connection timeout issues.
December 8, 2023
ServiceNow plugins (version 1.2.0) are available in CDAP versions 6.8.0 and later. The release includes the following changes:
PLUGIN-1689: Fixed an issue in the ServiceNow Batch Source and Multi-Source plugins causing page size to be unconfigurable. You can set a custom page size through a plugin property.
PLUGIN-1645: Fixed an issue in the plugin causing the plugin to accept a start date value, even if an end date value was missing. To use either property, you must specify both a start and end date.
PLUGIN-1651: Fixed an issue in the plugin causing the pipeline to succeed after encountering a data recovery failed error.
PLUGIN-1650: Fixed an issue in the plugin causing the schema not to load when you used Wrangler on some tables.
PLUGIN-1662: Fixed an issue in the plugin causing all data types to be mapped to string in the schema.
November 28, 2023
HTTP plugin (version 1.4.0) is available in the CDAP Hub in CDAP versions 6.9.1 and later. The release includes the following changes:
PLUGIN-1699: Fixed an issue in the HTTP Source plugin causing it to validate successfully when the credentials for Basic and OAuth2 authentication were wrong.
Now, the plugin validates the credentials and provides specific error messages to identify the field responsible for the failure.
PLUGIN-1700: Fixed an issue in the HTTP Sink plugin, where it lacked a configuration field for HTTP Proxy, which is available in the HTTP Batch Source plugin.
Now, the HTTP Sink plugin supports the proxy functionality, which lets you route HTTP requests through a proxy server when communicating with external endpoints.
PLUGIN-1695: In the HTTP Sink plugin, added support for linear and exponential retry policies, and non-HTTP code error handling.
PLUGIN-1696: In the HTTP Sink plugin, added support for placeholders in the PUT and DELETE endpoints.
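The release note for PLUGIN-1695 does not define how the linear and exponential retry policies compute their wait times. As a generic illustration only, the Python sketch below contrasts two common schedules. The function names, the base interval, and the cap are hypothetical and may not correspond to the HTTP Sink plugin's properties or formulas.

```python
def linear_delays(base_seconds, max_retries):
    """Wait time grows by a constant step: base, 2*base, 3*base, ..."""
    return [base_seconds * attempt for attempt in range(1, max_retries + 1)]


def exponential_delays(base_seconds, max_retries, cap_seconds=None):
    """Wait time doubles on every retry, optionally limited by a cap."""
    delays = []
    for attempt in range(max_retries):
        delay = base_seconds * (2 ** attempt)
        if cap_seconds is not None:
            delay = min(delay, cap_seconds)
        delays.append(delay)
    return delays


print(linear_delays(2, 4))
print(exponential_delays(2, 4))
print(exponential_delays(2, 4, cap_seconds=10))
```

An exponential schedule backs off faster than a linear one, which is why a cap is often applied to keep the longest wait bounded.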
October 27, 2023
Google Cloud Platform plugins (version 0.22.5) are available in the CDAP Hub in CDAP versions 6.9.1 and later. The release includes the following changes:
PLUGIN-1707: Fixed an issue which caused the pipeline to fail when using BQ Execute Plugin if any error occurred during metric emission.
October 16, 2023
Amazon S3 Batch Source Plugin (version 1.19.5) is available in the Hub (versions 6.9.0 and later) with the following change:
PLUGIN-1694: Added support for credentials verification. Also added a boolean flag in the plugin properties to turn the credentials verification feature on or off. The default value for the flag is False.
Amazon S3 Batch Source Plugin (version 1.18.4) is available in the Hub (versions 6.8.0-6.8.4) with the following change:
PLUGIN-1694: Added support for credentials verification. Also added a boolean flag in the plugin properties to turn the credentials verification feature on or off. The default value for the flag is False.
September 26, 2023
Amazon S3 plugins (version 1.18.3) are available in the Hub (versions 6.8.0 to 6.8.3) with the following change:
PLUGIN-1683 Fixed an issue causing the following exception when creating an S3 connection: com/fasterxml/jackson/databind/JsonMappingException. As part of the fix, aws-java-sdk-s3 was upgraded to 1.12.522 and maven-bundle-plugin to 3.5.0.
Amazon S3 plugins (version 1.19.4) are available in the Hub (versions 6.9.0 and later) with the following change:
PLUGIN-1683 Fixed an issue causing the following exception when creating an S3 connection: com/fasterxml/jackson/databind/JsonMappingException. As part of the fix, aws-java-sdk-s3 was upgraded to 1.12.522 and maven-bundle-plugin to 3.5.0.
DB2 plugins (version 1.10.4) are available in the CDAP Hub in CDAP versions 6.9.0 and later. This release includes the following change:
PLUGIN-1688 Fixed an issue where plugin deployment succeeded but no sources/sinks/actions showed up in Studio.
September 25, 2023
CloudSQL PostgreSQL (version 1.10.3) is available in the CDAP Hub in CDAP versions 6.9 and later. The release includes the following change:
PLUGIN-1522: Fixed an issue that occurred when reading from a CloudSQL PostgreSQL database causing pipelines to fail with the following error: Column columnName has unsupported SQL Type: numeric with precision 0. This issue occurs when you use a Numeric data type without defined Precision and Scale.
September 22, 2023
CloudSQL PostgreSQL (version 1.9.5) is available in the CDAP Hub in CDAP versions 6.8 and later. The release includes the following change:
PLUGIN-1522: Fixed an issue that occurred when reading from a CloudSQL PostgreSQL database causing pipelines to fail with the following error: Column columnName has unsupported SQL Type: numeric with precision 0. This issue occurs when you use a Numeric data type without defined Precision and Scale.
Snowflake Batch Source plugin (version 1.1.1) is available in the CDAP Hub. This release includes the following changes:
PLUGIN-1682: Fixed an issue causing the Snowflake plugin to throw a null pointer exception when macros were used with Wrangler as a transform plugin.
September 20, 2023
CloudSQL PostgreSQL (version 1.8.7) is available in the CDAP Hub in CDAP versions 6.7 and later. The release includes the following change:
PLUGIN-1522: Fixed an issue that occurred when reading from a CloudSQL PostgreSQL database causing pipelines to fail with the following error: Column columnName has unsupported SQL Type: numeric with precision 0. This issue occurs when you use a Numeric data type without defined Precision and Scale.
September 13, 2023
FTP Batch Source plugin (version 4.0.0) is available in the CDAP Hub in CDAP versions 6.9.2 and later. This release includes the following changes:
PLUGIN-1655 Supports CSV files with column values that are wrapped in quotation marks or that span multiple lines.
PLUGIN-1529 Supports symbols in FTP passwords, such as colons (:).
PLUGIN-1524 Splits the Path property into multiple properties. It exposes separate properties for Host, Port, Path, Username, and Password. Previously, those aspects were combined in a single path property, which couldn't be parsed without URL encoding.
PLUGIN-1181 Supports localized changes to set the Connection Timeout in the plugin configurations.
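To show why a combined path property needed URL encoding (PLUGIN-1524, PLUGIN-1529), the Python sketch below percent-encodes a password that contains a colon before embedding it in a single URL, then contrasts that with keeping each value in its own field. The host, user, password, and dictionary keys are hypothetical, and the code uses only the standard urllib.parse module rather than any plugin API.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical credentials; the password contains reserved URL characters.
user = "etl_user"
password = "p:ss@word"

# Single combined path: reserved characters must be percent-encoded first.
encoded_password = quote(password, safe="")
combined = f"ftp://{quote(user, safe='')}:{encoded_password}@ftp.example.com:21/data/in.csv"

parts = urlsplit(combined)
recovered_password = unquote(parts.password)

# Separate fields: no encoding is needed because nothing has to be parsed.
separate = {
    "host": "ftp.example.com",
    "port": 21,
    "path": "/data/in.csv",
    "username": user,
    "password": password,
}

print(combined)
print(recovered_password == separate["password"])
```

With separate properties, the password is passed through as entered, so symbols such as colons no longer need special handling by the user.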
PostgreSQL plugins (version 1.9.4) are available in the CDAP Hub in CDAP version 6.8.3. The release includes the following change:
PLUGIN-1681: Fixed an issue where the database config was not supported as a macro in PostgreSQL Batch Source and PostgreSQL Sink.
September 6, 2023
Google Cloud Platform plugins (version 0.22.3) are available in the CDAP Hub in CDAP versions 6.9.1 and later. The release includes the following changes:
PLUGIN-1647: Upgraded the Cloud Storage Hadoop Connector to version 2.2.9, which supports using the Cloud Storage Connector as a Hadoop Credential Provider (see the Hadoop connector issue on GitHub).
PLUGIN-1672: The BigQuery connector supports hyphens in the table name property.
PLUGIN-1445: Fixed an issue causing OOM errors when there are a lot of empty tasks with a BigQuery sink.
August 21, 2023
Netezza plugins (version 1.10.2) are available in the CDAP Hub in CDAP versions 6.9.0 and later. The release includes the following change:
PLUGIN-1654: For the Netezza database sink plugin, fixed an issue causing the pipeline to fail without any usable error message when loading the data into the Netezza database with exactly 52,000 records.
August 16, 2023
Netezza plugins (version 1.7.2) are available in the CDAP Hub in CDAP versions 6.5 through 6.8. The release includes the following change:
PLUGIN-1654: For the Netezza database sink plugin, fixed an issue causing the pipeline to fail without any usable error message when loading the data into the Netezza database with exactly 52,000 records.
August 7, 2023
Salesforce plugins (version 1.6.1) are available in the CDAP Hub in versions 6.8 and later with the following changes:
PLUGIN-1656: In the Salesforce Sink, added Data type Validation toggle button to skip data type validation for input schema fields.
August 4, 2023
Salesforce plugins (version 1.4.8) are available in the CDAP Hub in CDAP 6.7 versions with the following change:
PLUGIN-1656: In the Salesforce Sink, added Data type Validation toggle button to skip data type validation for input schema fields.
July 7, 2023
In Confluent Streaming plugins (version 2.0.0), the following fixes have been made:
CDAP-20176: Added support for state management and at-least-once processing.
PLUGIN-1495: Fixed a schema validation issue occurring when the previous (default) schema was cleared.
PLUGIN-1496: Fixed a schema validation issue occurring when the previous (default) schema was cleared.
PLUGIN-1578: To support connection to the Confluent Platform, made auth optional in the Confluent Streaming Source Plugin.
PLUGIN-1588: To support connection to the Confluent Platform, made auth optional in Confluent Streaming Sink Plugin.
In Salesforce plugins (version 1.6.0), the following fix has been made to the Salesforce Batch Sink:
CDAP-1539: In the Salesforce Sink, added support for file attachments for the Attachment and ContentVersion sObjects.
In Salesforce plugins (version 1.4.7), the following fix has been made to the Salesforce Batch Sink:
CDAP-1539: In the Salesforce Sink, added support for file attachments for Attachment and ContentVersion sObjects. This plugin version is only supported in CDAP 6.7 versions.
July 6, 2023
Salesforce Marketing plugins (version 1.3.0) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1623 For Salesforce Marketing Cloud Batch Source, fixed an issue where the data filter field (where clause) validates in design time and works in preview mode, but fails to filter records in runtime after deployment.
PLUGIN-1637 For Salesforce Marketing Cloud Batch Source, fixed an issue where the data extension external key field validates with invalid values in design-time.
PLUGIN-1638 For Salesforce Marketing Cloud Batch Source, fixed an issue where Data extension keys, which contain spaces, cause an error at runtime, as they’re not handled at design time.
PLUGIN-1639 For Salesforce Marketing Cloud Batch Source, fixed an issue where Column mapping validates with false values in design-time, which should be validated only if the Column mapping is specific to an object and its fields.
July 3, 2023
HTTP plugins (version 1.3.5) are available in the CDAP Hub (versions 6.8.0+) with the following fixed issue:
PLUGIN-1635: Fixed an issue causing a 401 Unauthorized response status code due to an expired access token.
June 30, 2023
Zendesk plugins (version 1.2.1) are available in the Hub with the following change:
PLUGIN-1641: Updated the Markdown (MD) documentation files for the Zendesk Batch Source and Zendesk Multi Source plugins.
June 26, 2023
HTTP plugins (version 1.3.4) are available in the CDAP Hub (versions 6.8.0+) with the following change:
PLUGIN-1634: For the HTTP Batch Source, fixed an issue that occurred in versions 1.3.0 and 1.3.3 on the Properties page when the Output Schema field contained a macro. It resulted in the following error message: Output schema cannot be empty. Provide valid value for config property 'schema'. To fix the issue, upgrade to the newest plugin version.
June 7, 2023
Zendesk plugins (version 1.2.0) are available in the Hub (versions 6.7.1+) with the following change:
PLUGIN-1591 Zendesk plugins version 1.2.0 supports Connection Management. Added Use Connection and Browse Connections properties to the Zendesk Batch Source.
Salesforce plugins (version 1.5.0) are available in the CDAP Hub in versions 6.8.0 and later with the following changes:
PLUGIN-1409: Salesforce plugins version 1.5.0 supports Connection Management for batch pipelines.
CDAP-19638: Added FQN support to Salesforce batch plugins.
May 25, 2023
Salesforce plugins (version 1.4.6) are available in the CDAP Hub (all versions) with the following changes:
CDAP-20537: For Salesforce plugins, fixed an issue that caused the following error: Failed to configure the pipeline: Stage 'History Import Compute' encountered : Error encountered while configuring the stage:<stage>. If the Salesforce Batch source plugin schema and connection properties were configured with a macro, and the schema was imported in the source plugin, the schema was not propagated to the next plugin at deployment time, which caused the deployment to fail. The issue is fixed in this version.
PLUGIN-1545: Fixed an issue that caused data to be duplicated because the spark.task.maxFailures property is set to 10 by default. To prevent the data from getting duplicated, set this property to 1 in the Engine config. In version 1.4.6, the timeout for Bulk API jobs has increased to prevent failures of batches that take more than 10 minutes. Additional validation has been added to the Salesforce Sink to validate the schema type, in addition to the name.
May 11, 2023
Kafka plugins (version 3.1.2 for 6.8.0+ and version 3.2.1 for 6.9.0+) are available in the CDAP Hub with the following change:
PLUGIN-1594: For the Kafka batch source, fixed an issue where the initial offset was not considered.
April 27, 2023
HTTP plugins (version 1.3.2) are available in the CDAP Hub (versions 6.8.0+) with the following changes:
PLUGIN-1592: For the HTTP Batch Source, fixed an issue where the pipeline failed when the Service Account File Path property was set to auto-detect.
April 14, 2023
Multiple Database Tables plugins (version 1.4.0) are available in the CDAP Hub (version 6.9.0) with the following change:
CDAP-20440: For the Multiple Database Tables Batch Source, added field-level lineage support.
April 13, 2023
Salesforce plugins (version 1.4.5) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1538: For Salesforce plugins, added support to use a Proxy URL to connect to Salesforce. Added Proxy URL property to the following plugins:
April 11, 2023
Quickstart pipeline (version 1.2.3) is available in the CDAP Hub (all versions) with the following changes:
CDAP-20542: For the Quickstart pipeline, increased the Spark Driver and Executor memory configuration from 1024 MB to 2048 MB. This fixes an issue where the pipeline failed when run on Dataproc 2.1 with Spark-3.3.0. To run the Quickstart pipeline on Dataproc 2.1 with Spark-3.3.0, upgrade Quickstart pipelines to version 1.2.3.
HTTP plugins (versions 1.2.7 and 1.3.1) are available in the CDAP Hub (versions 6.7.0, 6.7.1, 6.7.2, 6.7.3, and 6.8.0+) with the following changes:
PLUGIN-1544: For the HTTP Batch source, fixed an issue that caused pipeline runs to fail with a 401 authentication error when the Authentication Type property was set to Basic Authentication, and the URL entered in the URL property did not contain a port number. With this fix, pipelines with these configurations no longer fail.
Note: HTTP plugins version 1.2.6 will continue to work with CDAP 6.6.0 and earlier.
March 20, 2023
Salesforce plugins (version 1.4.4) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1533: In the Salesforce Sink, added the Concurrency Mode property to let you configure the plugin for parallel or serial concurrency. Use this property to help resolve lock contention issues with Salesforce bulk API.
In the Salesforce Batch Source, Salesforce Multi Object Batch Source, Salesforce Streaming Source, and Salesforce Sink, added the Connection Timeout property to let you set the maximum time in milliseconds to wait for connection initialization before it times out.
PLUGIN-1469: In the Salesforce Streaming Source, improved error handling in the CDAP pipeline logs.
March 14, 2023
Oracle plugins (versions 1.8.6 and 1.9.2) are available in the CDAP Hub (versions 6.7.1, 6.7.2, 6.7.3 and 6.8.0, 6.8.1) with the following changes:
PLUGIN-1535: For the Oracle Batch Source, fixed a backward compatibility issue. In plugin versions 1.8.3, 1.9.0, and earlier, CDAP maps the Oracle NUMBER data type with undefined precision and scale to CDAP decimal(38,0), which can cause data loss due to rounding errors. In plugin versions 1.8.4, 1.8.5, and 1.9.1, the Oracle NUMBER data type with undefined precision and scale maps to the CDAP string data type by default, which preserves all decimal digits. In versions 1.8.6 and 1.9.2, the Oracle NUMBER data type with undefined precision and scale is mapped to CDAP string by default, and you can edit the output schema to use the older mapping to the decimal(38,0) data type.
For more information, see Oracle batch source plugin (versions 1.9.1 and 1.8.5) converts Oracle NUMBER data type with undefined precision and scale to CDAP string.
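The data loss mentioned for the decimal(38,0) mapping can be illustrated with Python's standard decimal module. This is a sketch of the general effect of forcing a value to scale 0, not the plugin's conversion code. The sample value is hypothetical, and the rounding mode is an assumption because the release note does not specify one.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical value read from an Oracle NUMBER column with no declared precision or scale.
oracle_value = "12345.6789"

# Mapping to decimal(38,0) forces scale 0, so the fractional digits are rounded away.
# Assumption: half-up rounding is used here only for illustration.
as_decimal_38_0 = Decimal(oracle_value).quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Mapping to string keeps every digit exactly as read.
as_string = oracle_value

print(as_decimal_38_0)
print(as_string)
```

Once the value is stored with scale 0, the original fractional part cannot be recovered, whereas the string mapping can still be converted to a decimal of any scale downstream.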
March 1, 2023
HTTP plugins (versions 1.2.6 and 1.3.0) are available in the CDAP Hub (versions 6.5.1+ and 6.8.0+) with the following changes:
PLUGIN-1515: For the HTTP Sink, fixed an issue that caused the plugin to drop data if the Batch Size property was greater than 1.
FTP Batch Source plugins (versions 3.1.1 and 3.2.1) are available in the CDAP Hub (versions 6.7.2+ and 6.8.0+) with the following changes:
PLUGIN-1525: For the FTP Batch Source, fixed an issue that caused the SFTP source to fail if the password contained a colon.
PLUGIN-1520: For the FTP Batch Source, fixed an issue that caused the FTP Batch Source to fail to fetch the schema if the password contained a special character.
February 23, 2023
FTP Batch Source plugins (versions 3.1.0 and 3.2.0) are available in the CDAP Hub (versions 6.7.2+ and 6.8.0+) with the following changes:
PLUGIN-1493: For the FTP Batch Source, added support for blob, csv, delimited, json, text, and tsv formats, or the name of any format plugin that you have deployed to your environment.
Added the following plugin properties:
Format
Get Schema
Delimiter
Use First Row as Header
Enable Quoted Values
CDAP-18632, PLUGIN-1479: For the FTP Batch Source, fixed an issue where pipelines failed when run with Dataproc 2.0.
February 22, 2023
Google CloudSQL PostgreSQL plugins (version 1.5.6) are available in the CDAP Hub (versions 6.5.x) with the following changes:
PLUGIN-1510, PLUGIN-986: For the Google CloudSQL PostgreSQL sink, fixed an issue where plugin validation failed, but didn’t display a useful error message. Now, when validation fails, the plugin displays a detailed error message.
February 14, 2023
Hubspot plugins (version 1.1.0) are available in the CDAP Hub (versions 6.5+) with the following changes:
PLUGIN-1039: For Hubspot plugins (Hubspot Batch Source, Hubspot Streaming Source, and Hubspot Sink), added access token authorization support (Authorization method: Private App Access Token property) and deprecated the API Key plugin property due to Hubspot's deprecation of API Keys.
January 12, 2023
Database plugins (versions 2.10.1 and 2.9.3) are available in the CDAP Hub (versions 6.8.0, 6.7.1, and 6.7.2) with the following changes:
CDAP-20235: In the Database Batch Source, fixed an issue where the database username and password appeared in the App Fabric logs.
January 9, 2023
Data Profiler Analytics plugin (version 1.1.1) is available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1472: In the Data Profiler Analytics plugin, fixed an issue where the plugin did not calculate the count for nulls correctly.
December 20, 2022
HTTP plugins (version 1.2.5) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1434: In the HTTP Batch Source, fixed an issue where plugin validation failed when any of the OAuth2 properties used a macro.
December 12, 2022
HTTP plugins (version 1.2.4) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1447: In the HTTP Batch Source, fixed an issue where the pipeline failed when the Format property was set to Text and the Pagination property was set to Custom.
Oracle plugins (version 1.9.1) are available in the CDAP Hub (version 6.8.0) with the following change:
PLUGIN-1119: The Oracle Batch Source (version 1.9.1) reads the Oracle NUMBER data type with undefined precision and scale as CDAP string. In plugin versions 1.9.0 and earlier, CDAP read the Oracle NUMBER data type (with undefined precision and scale) as decimal(38,0), which could have resulted in data loss.
Oracle plugins (version 1.8.4) are available in the CDAP Hub (versions 6.7.1 and 6.7.2) with the following change:
PLUGIN-1119: The Oracle Batch Source (version 1.8.4) reads the Oracle NUMBER data type with undefined precision and scale as CDAP string. In plugin versions 1.8.3 and earlier, CDAP read the Oracle NUMBER data type (with undefined precision and scale) as decimal(38,0), which could have resulted in data loss.
December 5, 2022
FTP plugins (version 3.0.2) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1460: The FTP Batch Source now correctly filters input files based on the Regex Path Filter property.
MongoDB plugins (version 2.0.1) are available in the CDAP Hub (all versions) with the following change:
PLUGIN-1286: In the MongoDB Batch Source, fixed an issue where pipelines failed when the collection data was larger than 2 MB.
ServiceNow plugins (version 1.1.0) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1142, PLUGIN-980: For the ServiceNow Batch Source, fixed an issue so the Output Schema only displays when the Mode property is set to Table and there is only one Table Name.
PLUGIN-1089: For the ServiceNow Batch Source, fixed an issue where pipelines were failing with timeout errors when reading large amounts of data.
New plugin: ServiceNow Batch Multi Source is available in the CDAP Hub.
Zendesk plugins (version 1.1.0) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1301: For the Zendesk Batch Source, fixed an issue that caused pipelines to fail when pulling records for object Ticket Metrics.
PLUGIN-1302: For the Zendesk Batch Source, fixed discrepancies in the list of fields in Zendesk API response and the Output Schema in the plugin.
PLUGIN-1303: For the Zendesk Batch Source, fixed an issue where pipelines reading Post Comments were only extracting records made by the Admin (owner) user.
PLUGIN-1304: For the Zendesk Batch Source and Multi Objects Batch Source, added content to the documentation tabs.
PLUGIN-1305: For the Zendesk Batch Source, fixed an issue where plugin validation failed when the Advanced properties were configured with macros.
PLUGIN-1306: For the Zendesk Batch Source, fixed an issue where pipelines failed while extracting records for object Article Comments.
For the Zendesk Multi Object Batch Source, fixed an issue where pipelines failed with the error No matching schema found for union type: ["string","null"] for token: NUMBER when the Objects to Pull property was set to more than one object.
For the Zendesk Multi Object Batch Source, fixed a validation error when Advanced properties (Max Retry Count, Connect Timeout, and Read Timeout) were configured as macros.
For the Zendesk Multi Object Batch Source, fixed an issue where pipelines failed when the Objects to Pull property was set to Ticket Comments.
Redesigned the Zendesk Multi Object Batch Source plugin.
December 1, 2022
Google Cloud Platform plugins (version 0.20.4) are available in the CDAP Hub (versions 6.7.1 and 6.7.2) with the following changes:
PLUGIN-1450: The Dataplex Batch Source and Dataplex Sink are generally available (GA).
PLUGIN-1378: In the Dataplex Sink plugin, added a new property, Update Dataplex Metadata, which adds support for updating metadata in Dataplex for newly generated data. If enabled, the pipeline automatically copies the output schema to the destination Dataplex entities, and the automated Dataplex Discovery doesn’t run for them. The user is responsible for the compatibility of the changes applied to the output schema.
Google Cloud Platform plugins (version 0.19.3) are available in the CDAP Hub (version 6.6.0) with the following changes:
PLUGIN-1449: The Dataplex Batch Source and Dataplex Sink are generally available (GA).
PLUGIN-1378: In the Dataplex Sink plugin, added a new property, Update Dataplex Metadata, which adds support for updating metadata in Dataplex for newly generated data. If enabled, the pipeline automatically copies the output schema to the destination Dataplex entities, and the automated Dataplex Discovery doesn’t run for them. The user is responsible for the compatibility of the changes applied to the output schema.
November 30, 2022
Snowflake plugins (version 1.1.0) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1443: In the Snowflake Batch Source, Snowflake Sink, Snowflake to Cloud Storage Action, Snowflake Run SQL Action, and Cloud Storage to Snowflake Action plugins, upgraded the JDBC driver version to 3.13.24.
November 15, 2022
HTTP plugins (version 1.2.3) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1439: In the HTTP Batch Source, fixed an issue where the pipeline failed with the error No matching schema found when the source had missing fields and Null wasn’t selected in the Output Schema in the plugin. Now, if there are missing fields and Null isn’t selected in the Output Schema, CDAP treats the missing fields as Null.
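The null-filling behavior described for PLUGIN-1439 can be pictured with a short sketch. This is not the plugin's code: the function name fill_missing_fields and the sample field names are hypothetical, and Python's None stands in for Null.

```python
def fill_missing_fields(record, schema_fields):
    """Return a copy of record that contains every schema field.

    Fields absent from the source record are set to None (standing in
    for Null) instead of causing a failure.
    """
    return {name: record.get(name) for name in schema_fields}


schema_fields = ["id", "name", "email"]   # hypothetical output schema
source_record = {"id": 7, "name": "Ada"}  # "email" is missing in the source

filled = fill_missing_fields(source_record, schema_fields)
print(filled)
```

In this sketch the missing email field ends up present with a None value, which mirrors the described behavior of treating missing fields as Null.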
November 8, 2022
Salesforce plugins (version 1.4.3) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1435: In the Salesforce Streaming Source, fixed the following issues:
Fixed Salesforce streaming Meta-Subscribe, Meta-Handshake, and Meta-Connect error handling.
Fixed message loss caused by Salesforce connectivity interruptions (network failure and recovery) by re-handshaking.
November 4, 2022
Oracle plugins (version 1.8.3) are available in the CDAP Hub (versions 6.7.1 and 6.7.2) with the following changes:
PLUGIN-1433: In the Oracle Batch Source, when the source data included fields with the Numeric data type (undefined precision and scale), CDAP set the precision to 38 and the scale to 0. If any values in the field had scale other than 0, CDAP truncated these values, which could have resulted in data loss. If the scale for a field was overridden in the plugin output schema, the pipeline failed.
Now, if an Oracle source has Numeric data type fields with undefined precision and scale, you must manually set the scale for these fields in the plugin output schema. When you run the pipeline, it doesn't fail, and CDAP uses the new scale for the field. However, values whose scale is greater than the scale defined in the plugin might be truncated. CDAP writes warning messages in the pipeline log indicating the presence of Numbers with undefined precision and scale in the pipeline. For more information about setting precision and scale in a plugin, see Changing the precision and scale for decimal fields in the output schema.
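To illustrate the effect of the scale on a value, the following sketch uses Python's decimal module. It only models the arithmetic: the helper apply_scale and the sample value are hypothetical, and ROUND_DOWN is an assumption chosen to mirror the truncation described above, not a statement about how the plugin is implemented.

```python
from decimal import Decimal, ROUND_DOWN


def apply_scale(value, scale):
    """Truncate value to the given number of fractional digits."""
    quantum = Decimal(10) ** -scale  # scale=2 gives Decimal("0.01")
    return value.quantize(quantum, rounding=ROUND_DOWN)


source_value = Decimal("123.456")  # hypothetical NUMBER value in Oracle

scale_zero = apply_scale(source_value, 0)  # scale 0, as with decimal(38,0)
scale_two = apply_scale(source_value, 2)   # scale 2 set in the output schema

print(scale_zero, scale_two)
```

With scale 0 the fractional part is dropped entirely. With scale 2 the value keeps two fractional digits, but the third digit is still lost, which is the remaining truncation risk the note mentions.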
October 27, 2022
FTP Batch Source plugin (version 3.0.1) is available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1127: In the FTP Batch Source, fixed design-time validation to validate the connection to the FTP or SFTP source server. Now, if the connection information is incorrect, when you click Validate, validation fails. Previously, if the connection information was incorrect, plugin validation succeeded and the pipeline failed at runtime.
October 26, 2022
Snowflake plugins (version 1.0.1) are available in the CDAP Hub (all versions) with the following changes:
PLUGIN-1421: Fixed an issue in the Snowflake Batch Source plugin where the pipeline failed with ArrayIndexOutOfBoundsException if a line in the input CSV file had fewer fields than expected (that is, the CSV file was malformed). Now, when the CSV file is malformed, CDAP reads the fields that are present in the input, ignores missing fields, and writes the malformed line to the pipeline log.
For example, three fields are expected to be present in the CSV file: field1, field2, field3. If a line in the file contains value1,value2, then it is parsed as field1=value1, field2=value2, and field3 is unset. Previously, the pipeline failed.
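The lenient handling of short lines can be sketched as follows. This is an illustration only, not the plugin's implementation: the function parse_lenient and the logger name are hypothetical, and the field names come from the example above.

```python
import csv
import io
import logging

logger = logging.getLogger("csv_sketch")  # hypothetical logger name


def parse_lenient(text, field_names):
    """Parse CSV text; lines with too few values keep the fields they have."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < len(field_names):
            # Report the malformed line instead of failing.
            logger.warning("Malformed line (missing fields): %s", row)
        # zip stops at the shorter sequence, so missing fields stay unset.
        records.append(dict(zip(field_names, row)))
    return records


records = parse_lenient("value1,value2\n", ["field1", "field2", "field3"])
print(records)
```

The resulting record contains field1 and field2 only; field3 is absent rather than triggering an index error.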
October 20, 2022
BigQuery Replication Target plugin (version 0.6.4) is available in the CDAP Hub (versions 6.7.1 and 6.7.2) with the following changes:
CDAP-19599: Fixed an issue in the BigQuery Replication Target plugin that caused replication jobs to fail when the BigQuery target table already existed. When you deploy the new version of the plugin, it will automatically be used in new replication jobs. Due to CDAP-19622, if you want to use the new plugin version in existing jobs, recreate each replication job.
September 30, 2022
Database plugins (version 2.9.2) are available in the CDAP Hub (version 6.7.1) with the following changes:
CDAP-19532: Fixed an issue in the Database Batch Source plugin that caused pipelines to fail during runtime when there was a column with precision of 0 in the source returned by JDBC. Now, if a column has a precision of 0, the pipeline no longer fails. This affected CDAP 6.7.1 only. Note: In the Database Batch Source, if a column has precision 0, you must change the data type to Double in the Output Schema to ensure the pipeline runs successfully.
September 27, 2022
Hive Bulk Export and Hive Bulk Import plugins (version 1.9.0-1.1.0) are available in the CDAP Hub (versions 6.5.1, 6.6.0, and 6.7.1) with the following changes:
PLUGIN-1294: Upgraded the hive-jdbc dependency in the Hive action plugins to 2.3.3, which resolves a security vulnerability in org.apache.hive:hive-jdbc (CVE-2018-1282 for SQL injection).
Note: The Hive JDBC driver 2.3.3 is not backward compatible. You must upgrade your Hive Server to 2.3.3 to use the Hive Bulk Export and Hive Bulk Import plugins version 1.9.0-1.1.0. For more information, see Apache note for CVE-2018-1282.
September 20, 2022
Google BigQuery Replication Target plugin (version 0.6.2) is available in the CDAP Hub versions 6.7.x with the following changes:
PLUGIN-1388: In the Replication BigQuery Target plugin, fixed an issue where the error Already Exists: Job occurred when a BigQuery job was submitted multiple times internally during the replication job run. This error no longer fails the pipeline run.
September 12, 2022
Google Cloud Platform plugins (versions 0.18.7, 0.19.2, 0.20.2) are available in the CDAP Hub versions 6.5.1, 6.6.0, and 6.7.1 with the following changes:
PLUGIN-1373: In the BigQuery Sink plugin, fixed an issue that sometimes resulted in a NullPointerException error when trying to update table metrics.
PLUGIN-1367: In the BigQuery Sink plugin, fixed an issue that caused a NullPointerException error when the output schema was not defined. Note: This fix is available for the BigQuery Sink plugin version 0.20.2.
September 7, 2022
Amazon Kinesis Streaming Source plugin (version 2.0.0) is available in the CDAP Hub with the following changes:
PLUGIN-1364: The Amazon Kinesis Streaming Source version 2.0.0 now requires Scala 2.12 for the execution environment (Dataproc 1.5, Dataproc 2.0 or later).
August 25, 2022
Salesforce plugins (version 1.4.2) are available in the CDAP Hub with the following changes:
PLUGIN-1184: In the Salesforce Batch Source plugin and Salesforce Multi Objects Batch Source plugin, fixed an issue where the plugin failed to retrieve data when the case of an object in the query didn’t match the Salesforce object case. For example, in the Salesforce Batch Source, if the SObject Name was cost and the Salesforce object was Cost, the pipeline failed at runtime.
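One way to picture a case-insensitive object lookup is the sketch below. It is an assumption about the general technique, not a description of how the plugin was fixed: the function resolve_sobject_name and the list of available objects are hypothetical.

```python
def resolve_sobject_name(requested, available_objects):
    """Return the canonical object name that matches requested, ignoring case."""
    by_lowercase = {name.lower(): name for name in available_objects}
    try:
        return by_lowercase[requested.lower()]
    except KeyError:
        raise ValueError(f"Unknown sObject: {requested!r}") from None


# Hypothetical list of object names as the service reports them.
resolved = resolve_sobject_name("cost", ["Account", "Cost", "Event"])
print(resolved)
```

Here the lowercase name cost resolves to the canonical name Cost, so the difference in case no longer matters for the lookup.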
August 5, 2022
Salesforce plugins (version 1.4.1) are available in the CDAP Hub with the following changes:
PLUGIN-1033: In the Salesforce Batch Source plugin and Salesforce Multi Objects Batch Source plugin, fixed an issue where query parsing failed when the query contained DATA, SCOPE, or END keywords.
May 23, 2022
Google Cloud Platform plugins (version 0.19.1) are available in the CDAP Hub version 6.6.0 with the following new plugins and fixes:
PLUGIN-1256: Fixed an issue that caused the BigQuery Execute action plugin configured with an Encryption Key Name (CMEK) to fail when the SQL query contained DDL Statements.
PLUGIN-954: In the BigQuery Execute action plugin, added property Store Results in a BigQuery Table.
Dataplex Batch Source Known Issues
The plugin currently does not support CSV data on Cloud Storage.
Partition Start Date and Partition End Date are not applicable for Cloud Storage Entities.
The plugin can read data from Cloud Storage entities only if the lake is associated with a Dataproc Metastore.
May 12, 2022
Oracle plugins (version 1.7.1) are available in the CDAP Hub with the following changes:
Oracle Batch Source and Sink plugins
PLUGIN-1178: Added the ability to select the database isolation level to use when reading from or writing to Oracle. You can select READ_COMMITTED or SERIALIZABLE. The default is SERIALIZABLE.
Oracle Sink plugin
PLUGIN-1146: Fixed an issue where the table was required to exist in the login user's schema. Now you can specify a different schema when writing to a table.
Oracle Batch Source plugin
PLUGIN-1126: Fixed an issue so that all timestamps from the database are treated as Gregorian calendar dates, including those older than the Gregorian cutover date (October 15, 1582).
Oracle Batch Source, Sink, and Action plugins
PLUGIN-1095: Added support for a TNS Connect String. This lets you take advantage of Oracle JDBC driver features, such as load balancing across multiple Oracle instances and failover protection.
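For orientation, the sketch below builds a TNS connect descriptor of the general shape that Oracle uses for load balancing and failover, and the corresponding thin-driver JDBC URL. The host names, service name, and helper function are hypothetical, and this log doesn't state how the plugin expects the string to be entered, so treat it as an illustration of the descriptor format only.

```python
def build_tns_descriptor(hosts, service_name, port=1521):
    """Build a TNS connect descriptor with load balancing and failover enabled."""
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))" for host in hosts
    )
    return (
        "(DESCRIPTION="
        f"(ADDRESS_LIST=(LOAD_BALANCE=on)(FAILOVER=on){addresses})"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name}))"
        ")"
    )


# Hypothetical hosts and service name, for illustration only.
descriptor = build_tns_descriptor(["db1.example.com", "db2.example.com"], "orcl")
jdbc_url = f"jdbc:oracle:thin:@{descriptor}"
print(jdbc_url)
```

Listing more than one ADDRESS entry is what gives the driver alternative instances to balance across or fail over to.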
April 18, 2022
Google Drive plugins (1.4.0) are generally available (GA) with the following changes:
Improvements
PLUGIN-1170: Renamed the following Google Sheets Batch Source plugin properties in the UI (display names only):
Renamed Last Data Column Index to Number of Columns to Read
Renamed Last Data Row Index to Number of Rows to Read
Renamed Custom Row Index for Column Names to Column Names Row Number
Renamed First Header Row Index to First Row of Header
Renamed Last Header Row Index to Last Row of Header
Renamed First Footer Row Index to First Row of Footer
Renamed Last Footer Row Index to Last Row of Footer
PLUGIN-1135: Added OAuth2 generation steps to the documentation for Google Drive Batch Source, Google Drive Sink, Google Sheets Batch Source, and Google Sheets Sink.
PLUGIN-1060: Added support in the Google Drive plugins for connecting to shared drives.
Behavior Changes
PLUGIN-1149: Changed the default value for the Modification Date Range property to lifetime for Google Drive Batch Source and Google Sheets Batch Source so that, by default, the plugin no longer looks only for files that were updated on the day the pipeline runs. In previous versions, the default value for this property was today.
PLUGIN-632: Removed the following Connection Retry properties from Google Drive plugins:
Max Retry Count
Max Retry Wait
Max Retry Jitter Wait
Bug Fixes
PLUGIN-1151: Fixed the Google Sheets Batch Source plugin Number of Columns to Read property to work correctly.
PLUGIN-1075: Fixed Google Sheets Batch Source to correctly skip empty records when the Skip Empty Data property is set to Yes.
PLUGIN-1034: Improved error messages for Google Sheets Batch Source and Google Sheets Sink when:
Google Drive API is not enabled
Google Sheets API is not enabled
Filter search query is invalid
March 22, 2022
Multi Table plugins (version 1.3.2) are available in the CDAP Hub (all versions) with the following changes:
In the Multi Table Batch Source, added the Fetch Size property that lets you set the number of rows to fetch at a time per split.
January 12, 2022
Salesforce Plugins (1.3.11) are available with the following changes:
PLUGIN-1002: Salesforce API was upgraded from 45.0 to 53.0, which enables the endDate field in the Event table. Now, if you are using the Event table in the Salesforce batch source or Salesforce Multi Objects batch source sObjects field, you will see an endDate field.
PLUGIN-777: Fixed an issue in the Salesforce batch source plugin where Salesforce sessions were not closed properly when PK Chunking was enabled.
Multi Table Plugins (1.3.1) are now available with the following changes:
PLUGIN-1027: Improved Multi Database Table batch source plugin to correctly read decimal data.
August 16, 2021
SQL Server Plugins (1.5.5) are now available. Version 1.5.5 and above support the datetime data type in SQL Server batch sources.
Issue with SQL Server batch source plugin version 1.5.4: Pipelines with SQL Server batch sources version 1.5.4 that have datetime, datetime2, or datetimeoffset columns might fail due to data type mismatches.
Recommendation: Upgrade all pipelines that use SQL Server batch sources version 1.5.4 and have datetime, datetime2, or datetimeoffset columns to version 1.5.5. Version 1.5.5 is backward compatible. SQL Server batch sources with output schemas that map datetime data types to CDAP timestamp and string data types will run successfully.
Note: If a pipeline has a SQL Server batch source and other SQL Server plugins, such as a sink, it’s recommended that you upgrade each SQL Server plugin to version 1.5.5.
All new pipelines that use SQL Server batch source version 1.5.5 map datetime data types to the CDAP datetime data type.
For more information about SQL Server data type mappings, see SQL Server Batch Source.
To use SQL Server batch source version 1.5.5, follow these steps:
Download SQL Server Plugins version 1.5.5 from the CDAP Hub.
To upgrade a pipeline to use SQL Server batch source version 1.5.5, run the following POST request:
POST /v3/namespaces/<namespace-id>/apps/<app-id>/upgrade
This POST request upgrades all artifacts in the pipeline to use the latest available version.
For more information, see “Upgrade an Application” in Lifecycle Microservices.
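The upgrade request can be built with Python's standard library, as in the following sketch. The base URL (including port 11015), namespace, and application name are assumptions for illustration; replace them with the values for your instance. Authentication is omitted, and the request is only constructed here, not sent.

```python
import urllib.request
from urllib.parse import quote


def build_upgrade_request(base_url, namespace_id, app_id):
    """Build the POST request for the application upgrade endpoint."""
    path = (
        f"/v3/namespaces/{quote(namespace_id, safe='')}"
        f"/apps/{quote(app_id, safe='')}/upgrade"
    )
    return urllib.request.Request(base_url.rstrip("/") + path, method="POST")


# Hypothetical endpoint, namespace, and pipeline name.
request = build_upgrade_request("http://localhost:11015", "default", "my_pipeline")
print(request.get_method(), request.full_url)

# To send it against a reachable instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping the construction separate from the send makes it easy to check the URL before upgrading a pipeline.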
Special use case: If you use a macro for the database name, schema name, or table name, and if you haven't manually specified an output schema, the schema gets detected and mapped at runtime. Versions 1.5.3 and earlier map datetime and datetime2 to the timestamp data type, and datetimeoffset to the string data type at runtime, while versions 1.5.5 and later map all three to datetime at runtime.
After the upgrade, the SQL Server datetime, datetime2, and datetimeoffset data types are mapped to the CDAP datetime data type at runtime. If you have a downstream stage or sink that consumes the original timestamp data (which datetime and datetime2 were mapped to) or string data (which datetimeoffset was mapped to), either update them or expect them to consume datetime data.
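To check which columns a downstream stage needs to handle differently after the upgrade, the runtime mappings described above can be written as a small lookup. The mappings are taken from this note; the helper changed_columns and the sample column list are hypothetical.

```python
# Runtime type mappings as described in this release note.
OLD_MAPPING = {  # SQL Server batch source 1.5.3 or earlier
    "datetime": "timestamp",
    "datetime2": "timestamp",
    "datetimeoffset": "string",
}
NEW_MAPPING = {  # SQL Server batch source 1.5.5 and later
    "datetime": "datetime",
    "datetime2": "datetime",
    "datetimeoffset": "datetime",
}


def changed_columns(column_types):
    """Return {column: (old CDAP type, new CDAP type)} for affected columns."""
    changes = {}
    for column, sql_type in column_types.items():
        key = sql_type.lower()
        if key in OLD_MAPPING and OLD_MAPPING[key] != NEW_MAPPING[key]:
            changes[column] = (OLD_MAPPING[key], NEW_MAPPING[key])
    return changes


# Hypothetical source table: column name -> SQL Server type.
columns = {"id": "int", "created": "datetime2", "seen": "datetimeoffset"}
changes = changed_columns(columns)
print(changes)
```

Columns that appear in the result are the ones whose downstream consumers previously received timestamp or string data and now receive datetime data.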
May 19, 2021
PLUGIN-691: Microsoft SQL Server source plugin version 0.15.2 is available. The SQL Server plugin now correctly handles nullable TIME fields.
May 5, 2021
Google Cloud Platform plugins version 0.17.2 are available. This version fixes the following issues:
PLUGIN-635: Fixed an issue in the BigQuery plugins to correctly delete temporary GCS buckets.
PLUGIN-655: Fixed an issue in the BigQuery sink that caused failures when the input schema was not provided.
April 27, 2021
Google Cloud Platform plugins version 0.17.1 are available. This version fixes the following issues:
PLUGIN-678: Data pipelines that include BigQuery sinks version 0.17.0 fail or give incorrect results. For more information, see the CDAP 6.4.0 Release Log.
PLUGIN-654: Fixed an issue that caused pipelines to fail when Pub/Sub source version 0.17.0 Subscription field was a macro.
Created in 2020 by Google Inc.