Amazon AuroraDB PostgreSQL plugin

Introduction

Amazon Aurora is a PostgreSQL-compatible database offered as a managed service. Users need to read from and write to AuroraDB.

Use-case

  • Users would like to build a batch data pipeline that reads a complete table from an Amazon Aurora DB instance and writes it to Bigtable. 

  • Users would like to build a batch data pipeline that performs upserts on AuroraDB tables. 

  • Users should get relevant information from the tooltips while configuring the AuroraDB source and AuroraDB sink

    • The tooltip for the connection string should be customized for the specific database. 

    • Each tooltip should accurately describe what its field is used for

  • Users should get field-level lineage for the source and sink being used

  • Reference documentation should be available from the source and sink plugins
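The upsert use case above maps naturally onto PostgreSQL's `INSERT ... ON CONFLICT DO UPDATE` clause. A minimal sketch of the kind of statement a sink might generate; the function and its inputs are illustrative, not part of the plugin (the actual sink would derive table, columns, and conflict keys from the pipeline configuration and schema):

```python
def build_upsert(table, columns, conflict_keys):
    """Build a PostgreSQL upsert statement (illustrative sketch only).

    Non-key columns are updated from the incoming row via EXCLUDED
    when a row with the same conflict key already exists.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join("%s" for _ in columns)
    updates = ", ".join(
        f"{c} = EXCLUDED.{c}" for c in columns if c not in conflict_keys
    )
    return (
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
        f"ON CONFLICT ({', '.join(conflict_keys)}) DO UPDATE SET {updates}"
    )

print(build_upsert("users", ["id", "name"], ["id"]))
# INSERT INTO users (id, name) VALUES (%s, %s)
#   ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name
```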

User Stories

  • Users should be able to install the AuroraDB PostgreSQL source and sink plugins from the Hub

  • Users should have tooltips that accurately describe what each field does

  • Users should get field-level lineage information for the AuroraDB PostgreSQL source and sink 

  • Users should be able to set up a pipeline without specifying redundant information

  • Users should get updated reference documentation for the AuroraDB PostgreSQL source and sink

  • Users should be able to read all supported database data types

Deliverables 

  • Source code in data integrations org

  • Integration test code 

  • Relevant documentation in the source repo and a reference documentation section in the plugin

Relevant links 

Plugin Type

Batch Source
Batch Sink 
Real-time Source
Real-time Sink
Action
Post-Run Action
Aggregate
Join
Spark Model
Spark Compute

Design / Implementation Tips

Design

  • It is suggested to place the plugin code under the database-plugin repository to reuse existing database capabilities.

Source Properties

| User Facing Name | Type | Description | Constraints |
| --- | --- | --- | --- |
| Label | String | Label for UI | |
| Reference Name | String | Unique name used to identify the source for lineage | Required |
| Driver Name | String | Name of the JDBC driver to use | Required (defaults to postgres) |
| Cluster Endpoint | String | URL of the current master instance of the PostgreSQL cluster | Required |
| Port | Number | Port of the PostgreSQL cluster's master instance | Optional (defaults to 5432) |
| Database | String | Name of the database to connect to | Required |
| Import Query | String | Query used to import data | Valid SQL query |
| Username | String | Database username | Required |
| Password | String | User password | Required |
| Bounding Query | String | Query that returns the min and max values of the split-by field | Valid SQL query |
| Split-By Field Name | String | Field name used to generate splits | |
| Number of Splits to Generate | Number | Number of splits to generate | |
| Connection Arguments | Key/Value | A list of arbitrary string key/value pairs used as connection arguments; see https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters | |
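To illustrate how Bounding Query, Split-By Field Name, and Number of Splits to Generate interact: the bounding query supplies the min and max of the split-by field, and the source divides that range into roughly equal sub-ranges, one per split. A simplified sketch for numeric split-by fields; the function name and range logic are illustrative assumptions, not the plugin's actual implementation:

```python
def generate_splits(min_val, max_val, num_splits):
    """Divide [min_val, max_val] into num_splits contiguous ranges,
    mirroring how a JDBC source turns a bounding query result into splits."""
    size = (max_val - min_val + 1) / num_splits
    splits = []
    lo = min_val
    for i in range(num_splits):
        # Last split absorbs any rounding remainder so the full range is covered.
        hi = max_val if i == num_splits - 1 else int(min_val + size * (i + 1)) - 1
        splits.append((lo, hi))
        lo = hi + 1
    return splits

# e.g. a split-by field spanning ids 1..100, divided into 4 splits
print(generate_splits(1, 100, 4))
# [(1, 25), (26, 50), (51, 75), (76, 100)]
```

Each range would then be appended to the import query as a WHERE condition so the splits can be read in parallel.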



Sink Properties

| User Facing Name | Type | Description | Constraints |
| --- | --- | --- | --- |
| Label | String | Label for UI | |
| Reference Name | String | Unique name used to identify the sink for lineage | Required |
| Driver Name | String | Name of the JDBC driver to use | Required (defaults to postgres) |
| Host | String | URL of the current master instance of the PostgreSQL cluster | Required |
| Port | Number | Port of the PostgreSQL cluster's master instance | Optional (defaults to 5432) |
| Database | String | Name of the database to connect to | Required |
| Username | String | Database username | Required |
| Password | Password | User password | Required |
| Connection Arguments | Key/Value | A list of arbitrary string key/value pairs used as connection arguments; see https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters | |
| Table Name | String | Name of the database table to write to | Required |
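The host (or cluster endpoint), port, database, and connection arguments together form the JDBC connection URL. A sketch of how those properties might be assembled; the helper name is hypothetical, while the URL format follows the PostgreSQL JDBC driver's `jdbc:postgresql://host:port/database?key=value` convention:

```python
def build_jdbc_url(endpoint, database, port=5432, connection_args=None):
    """Assemble a PostgreSQL JDBC URL from the plugin's connection properties."""
    url = f"jdbc:postgresql://{endpoint}:{port}/{database}"
    if connection_args:
        # Connection arguments become driver properties in the query string.
        url += "?" + "&".join(f"{k}={v}" for k, v in connection_args.items())
    return url

print(build_jdbc_url(
    "mycluster.cluster-abc.us-east-1.rds.amazonaws.com",
    "mydb",
    connection_args={"ssl": "true"},
))
# jdbc:postgresql://mycluster.cluster-abc.us-east-1.rds.amazonaws.com:5432/mydb?ssl=true
```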


Test Case(s)

  • Test case #1

  • Test case #2

Sample Pipeline

Please attach one or more sample pipelines and associated data. 

Pipeline #1

Pipeline #2

Created in 2020 by Google Inc.