Amazon AuroraDB MySQL plugin
Introduction
Amazon Aurora is a MySQL-compatible relational database offered as a managed service. Users need to be able to read from and write to AuroraDB.
Use-case
- Users would like to build a batch data pipeline that reads a complete table from an Amazon Aurora DB instance and writes it to BigTable.
- Users would like to build a batch data pipeline that performs upserts on AuroraDB tables.
- Users should get relevant information from the tool tip while configuring the AuroraDB source and AuroraDB sink
- The tool tip for the connection string should be specific to the database.
- The tool tip should describe accurately what each field is used for
- Users should get field level lineage for the source and sink that is being used
- Reference documentation should be available from the source and sink plugins
User Stories
- Users should be able to install the AuroraDB MySQL source and sink plugins from the Hub
- Users should have each tool tip accurately describe what each field does
- Users should get field level lineage information for the AuroraDB MySQL source and sink
- Users should be able to set up a pipeline without specifying redundant information
- Users should get an updated reference document for the AuroraDB MySQL source and sink
- Users should be able to read all the DB types
Deliverables
- Source code in the data-integrations org
- Integration test code
- Relevant documentation in the source repo and in the reference documentation section of the plugin
Relevant links
- Data-integrations org: https://github.com/data-integrations/
- Field level lineage: https://docs.cdap.io/cdap/6.0.0-SNAPSHOT/en/developer-manual/metadata/field-lineage.html
- Integration test repos: https://github.com/caskdata/cdap-integration-tests
Plugin Type
- Batch Source
- Batch Sink
- Real-time Source
- Real-time Sink
- Action
- Post-Run Action
- Aggregate
- Join
- Spark Model
- Spark Compute
Design / Implementation Tips
- Reuse the database-commons module from the database-plugins repo.
Design
- Place the plugin code in the database-plugins repository so that it can reuse the existing database capabilities.
Source Properties
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Label | String | Label for UI | |
Reference Name | String | Name that uniquely identifies this source for lineage | Required |
Driver Name | String | Name of the JDBC driver to use | Required (defaults to mysql) |
Cluster endpoint | String | URL of the current master instance of the MySQL cluster | Required |
Port | Number | Port of the MySQL cluster's master instance | Optional (defaults to 3306) |
Database | String | Name of the database to connect to | Required |
Import Query | String | Query used to import data | Valid SQL query |
Username | String | DB username | Required |
Password | Password | User password | Required |
Bounding Query | String | Query that returns the minimum and maximum values of the Split-By field | Valid SQL query |
Split-By Field Name | String | Field name which will be used to generate splits | |
Number of Splits to Generate | Number | Number of splits to generate | |
Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs passed as connection arguments. For the list of supported properties, see https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-configuration-properties.html | |
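
The way Bounding Query, Split-By Field Name, and Number of Splits to Generate relate can be illustrated with a minimal Python sketch. It assumes an integer split-by field and an even division of the range; the actual splitting is done by the input format the plugin builds on and may differ. The function names `split_ranges` and `split_conditions` and the field name `id` are hypothetical and used only for illustration.

```python
from typing import List, Tuple


def split_ranges(min_value: int, max_value: int, num_splits: int) -> List[Tuple[int, int]]:
    """Divide the inclusive range [min_value, max_value] into contiguous sub-ranges.

    min_value and max_value stand for the two values a bounding query
    would return for the split-by field (assumed to be an integer column).
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if max_value < min_value:
        raise ValueError("max_value must not be less than min_value")

    total = max_value - min_value + 1
    # Never create more splits than there are distinct values.
    num_splits = min(num_splits, total)
    base, extra = divmod(total, num_splits)

    ranges = []
    low = min_value
    for i in range(num_splits):
        # The first `extra` splits take one additional value each.
        size = base + (1 if i < extra else 0)
        high = low + size - 1
        ranges.append((low, high))
        low = high + 1
    return ranges


def split_conditions(field: str, ranges: List[Tuple[int, int]]) -> List[str]:
    """Render each range as a SQL predicate on the split-by field."""
    return [f"{field} >= {lo} AND {field} <= {hi}" for lo, hi in ranges]


if __name__ == "__main__":
    # Hypothetical bounds, as if returned by a bounding query on `id`.
    for condition in split_conditions("id", split_ranges(1, 10, 3)):
        print(condition)
```

Each predicate would restrict the import query to one slice of the table, so that the slices can be read in parallel.
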
Sink Properties
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Label | String | Label for UI | |
Reference Name | String | Name that uniquely identifies this sink for lineage | Required |
Driver Name | String | Name of the JDBC driver to use | Required (defaults to mysql) |
Host | String | URL of the current master instance of the MySQL cluster | Required |
Port | Number | Port of the MySQL cluster's master instance | Optional (defaults to 3306) |
Database | String | Name of the database to connect to | Required |
Username | String | DB username | Required |
Password | Password | User password | Required |
Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs passed as connection arguments. For the list of supported properties, see https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-configuration-properties.html | |
Table Name | String | Name of the database table to write to | Required |
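
For both the source and the sink, the endpoint (Host), Port, Database, and Connection Arguments properties are presumably combined into a MySQL Connector/J JDBC URL. The sketch below shows one way this could be done, in Python for brevity. It is an illustration under stated assumptions, not the plugin's implementation: the function name `build_jdbc_url` and the example hostname are hypothetical, while `useSSL` and `connectTimeout` are properties listed in the Connector/J reference linked above.

```python
from typing import Mapping, Optional
from urllib.parse import quote


def build_jdbc_url(
    host: str,
    database: str,
    port: int = 3306,
    connection_args: Optional[Mapping[str, str]] = None,
) -> str:
    """Build a Connector/J style URL: jdbc:mysql://host:port/database?key=value&..."""
    url = f"jdbc:mysql://{host}:{port}/{quote(database, safe='')}"
    if connection_args:
        # Percent-encode keys and values so special characters survive in the URL.
        query = "&".join(
            f"{quote(str(key), safe='')}={quote(str(value), safe='')}"
            for key, value in connection_args.items()
        )
        url = f"{url}?{query}"
    return url


if __name__ == "__main__":
    # Hypothetical cluster endpoint and database name.
    print(
        build_jdbc_url(
            "example-cluster.cluster-abc123.us-east-1.rds.amazonaws.com",
            "sales",
            connection_args={"useSSL": "true", "connectTimeout": "5000"},
        )
    )
```

Defaulting the port to 3306 mirrors the Port row in the tables, which is why only the endpoint and database are mandatory arguments here.
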
Future Work
- Amazon AuroraDB PostgreSQL plugin
Test Case(s)
- Test case #1
- Test case #2
Sample Pipeline
Please attach one or more sample pipeline(s) and associated data.
Pipeline #1
Pipeline #2
Created in 2020 by Google Inc.