CloudSQL plugins

Introduction

Cloud SQL is a fully managed database service that makes it easy to set up, maintain, manage, and administer relational databases on Google Cloud Platform. Cloud SQL supports MySQL, PostgreSQL, and SQL Server (currently in beta). The Cloud SQL plugins will allow CDAP users to read from and write to their Cloud SQL instances without specialized technical knowledge, such as JDBC connection string syntax.

Use case(s)

  • As an ETL developer, I would like to read my data in Cloud SQL, so that I can transform it using CDAP

  • As an ETL developer, I want to write the output of my pipeline to Cloud SQL, so that I can use the insights generated from my analytical processes to power my production database in Cloud SQL

User Stories

  • As a user, I would like to create a pipeline using a Cloud SQL source

  • As a user, I want to create a pipeline using a Cloud SQL sink

  • As a user, I want to only specify a query, project ID and instance name to connect to Cloud SQL, so that I don't have to remember complex JDBC connection string syntax

  • As a user, I want to execute a SQL query on Cloud SQL as part of the control flow in my pipeline

  • As a user, I want to execute a SQL query on Cloud SQL as a notification of my pipeline's completion

  • As a user, I want to create a multi-table source and sink for Cloud SQL so that I can read from or write to multiple tables at the same time

  • As a user, I want to connect to Cloud SQL over a proxy
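The user story about avoiding JDBC connection strings can be illustrated with a minimal sketch. It assumes the plugin connects through the Cloud SQL JDBC socket factory and that the user also supplies a region and a database name; neither is specified in this document. The function name `build_jdbc_url` and its parameters are hypothetical. SQL Server uses a different URL shape and is left out here.

```python
from urllib.parse import urlencode

# Socket factory class names published with the Cloud SQL JDBC socket factory.
SOCKET_FACTORIES = {
    "postgresql": "com.google.cloud.sql.postgres.SocketFactory",
    "mysql": "com.google.cloud.sql.mysql.SocketFactory",
}


def build_jdbc_url(engine, project_id, region, instance, database):
    """Build a JDBC URL from the few values a user would enter (hypothetical helper)."""
    if engine not in SOCKET_FACTORIES:
        raise ValueError(f"unsupported engine: {engine}")
    # Cloud SQL identifies an instance as project:region:instance.
    connection_name = f"{project_id}:{region}:{instance}"
    params = urlencode(
        {
            "cloudSqlInstance": connection_name,
            "socketFactory": SOCKET_FACTORIES[engine],
        },
        safe=":",  # keep the colons of the connection name readable
    )
    return f"jdbc:{engine}:///{database}?{params}"


if __name__ == "__main__":
    print(build_jdbc_url("postgresql", "my-project", "us-central1", "my-instance", "sales"))
```

With this approach the plugin, not the user, owns the connection string syntax; the username and password would be passed separately from the Credentials properties.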

Plugin Type

Batch Source
Batch Sink 
Real-time Source
Real-time Sink
Action
Post-Run Action
Aggregate
Join
Spark Model
Spark Compute

Configurables

CloudSQL Postgres Batch Source

This section defines properties that are configurable for this plugin. 

| Section | User Facing Name | Type | Description | Constraints | Optional? | Default |
| --- | --- | --- | --- | --- | --- | --- |
| Credentials | Service Account | Textbox | | | | auto-detect |
| Credentials | Database username | Textbox | The username to use to connect to the CloudSQL database | | | |
| Credentials | Database password | Password | The password to use to connect to the CloudSQL database | | | |
| Cloud SQL properties | Instance name | Select | Select the Cloud SQL instance name | Open question: can this be a select, or does it have to be a textbox? | N | |
| Cloud SQL properties | Import Query | Textarea | The query that specifies the data to pull from CloudSQL Postgres | | N | |
| Advanced | Bounding query | Textarea | The query to use to derive the bounds (min and max) to use to generate the splits | | Y | |
| Advanced | Split Column | Select | The column to split by | | Y | |
| Advanced | Number of splits | Number | The number of splits to generate | | Y | 1 |
| Advanced | Additional Connection arguments | Keyvalue | A list of keyvalue pairs of connection arguments passed to CloudSQL | | | |

This plugin should also expose every configuration parameter of the PostgreSQL database plugin that CloudSQL supports. Please add them to the table above during development.

Additionally, the plugin should handle all PostgreSQL data types, using the type mappings defined in the PostgreSQL database plugin.
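To make the roles of the bounding query, split column, and number of splits concrete, here is a minimal sketch of how the bounds could be divided into per-split conditions. It assumes an integer split column and assumes the import query carries a `$CONDITIONS` placeholder, a convention borrowed from generic JDBC sources that this document does not define. The function names are hypothetical.

```python
def split_conditions(split_column, lower, upper, num_splits):
    """Return one WHERE condition per split, together covering [lower, upper]."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    total = upper - lower + 1
    # Never create more splits than there are distinct integer values.
    num_splits = min(num_splits, total)
    base, extra = divmod(total, num_splits)
    conditions = []
    start = lower
    for i in range(num_splits):
        # The first `extra` splits take one additional value each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        conditions.append(f"{split_column} >= {start} AND {split_column} <= {end}")
        start = end + 1
    return conditions


def apply_condition(import_query, condition):
    """Substitute a split condition into the assumed $CONDITIONS placeholder."""
    return import_query.replace("$CONDITIONS", condition)


if __name__ == "__main__":
    # lower/upper would come from the bounding query, e.g. SELECT MIN(id), MAX(id) FROM users
    for cond in split_conditions("id", 1, 10, 3):
        print(apply_condition("SELECT * FROM users WHERE $CONDITIONS", cond))
```

Each generated query can then be read by a separate task, which is why the default of one split needs neither a bounding query nor a split column.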

CloudSQL MySQL Batch Source

This source should be identical to the Postgres source in terms of configuration, but with specific handling for all MySQL data types: every type mapping in the MySQL database plugin should be handled here. It should likewise expose the connection parameters that the MySQL plugin exposes, provided they are available in CloudSQL.

CloudSQL SQLServer Batch Source

This source should be identical to the Postgres source in terms of configuration, but with specific handling for all SQL Server data types: every type mapping in the Microsoft SQL Server database plugin should be handled here. It should likewise expose the connection parameters that the SQL Server plugin exposes, provided they are available in CloudSQL.

CloudSQL Postgres Batch Sink

This section defines properties that are configurable for this plugin. 

| Section | User Facing Name | Type | Description | Constraints | Optional? | Default |
| --- | --- | --- | --- | --- | --- | --- |
| Credentials | Project ID | Textbox | | | | auto-detect |
| Credentials | Service Account | Textbox | | | | auto-detect |
| Credentials | Database username | Textbox | The username to use to connect to the CloudSQL database | | | |
| Credentials | Database password | Password | The password to use to connect to the CloudSQL database | | | |
| Cloud SQL properties | Instance name | Select | Select the Cloud SQL instance name | Open question: can this be a select, or does it have to be a textbox? | N | |
| Cloud SQL properties | Table name | Textbox | The table to write data to | | N | |
| Advanced | Transaction Isolation Level | Select | Transaction isolation level for queries run by this sink | Possible values: TRANSACTION_NONE, TRANSACTION_READ_UNCOMMITTED, TRANSACTION_READ_COMMITTED, TRANSACTION_REPEATABLE_READ, TRANSACTION_SERIALIZABLE | Y | |
| Advanced | Connection Timeout | Number | The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds; a value of zero disables it | | Y | |
| Advanced | Additional connection properties | Keyvalue | A list of keyvalue pairs of connection arguments passed to CloudSQL | | Y | |

Like the source, this sink should handle all the type mappings and expose any additional properties from the PostgreSQL database plugin.
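The sink properties above suggest a small amount of validation before a pipeline runs. The following is a minimal sketch only: the property keys (`instanceName`, `tableName`, `transactionIsolationLevel`, `connectionTimeout`) and the function name are hypothetical, and the isolation level names and values follow the constants defined on `java.sql.Connection`.

```python
# Values of the isolation-level constants defined on java.sql.Connection.
ISOLATION_LEVELS = {
    "TRANSACTION_NONE": 0,
    "TRANSACTION_READ_UNCOMMITTED": 1,
    "TRANSACTION_READ_COMMITTED": 2,
    "TRANSACTION_REPEATABLE_READ": 4,
    "TRANSACTION_SERIALIZABLE": 8,
}


def validate_sink_config(config):
    """Return a list of problems; an empty list means the config looks valid."""
    errors = []
    # Instance name and table name are the non-optional properties.
    for required in ("instanceName", "tableName"):
        if not config.get(required):
            errors.append(f"{required} is required")
    level = config.get("transactionIsolationLevel")
    if level is not None and level not in ISOLATION_LEVELS:
        errors.append(f"unknown transaction isolation level: {level}")
    timeout = config.get("connectionTimeout")
    if timeout is not None and (not isinstance(timeout, int) or timeout < 0):
        errors.append("connectionTimeout must be a non-negative integer (0 disables it)")
    return errors


if __name__ == "__main__":
    print(validate_sink_config({"instanceName": "my-instance", "tableName": "orders"}))
    print(validate_sink_config({"tableName": "orders", "connectionTimeout": -1}))
```

Collecting every problem in one pass, rather than failing on the first, lets a user fix a misconfigured sink in a single round trip.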

CloudSQL MySQL Batch Sink

The properties should be similar to those of the Postgres sink, but this sink should also expose any additional properties from the MySQL database plugin and support all of its type mappings.

CloudSQL SQLServer Batch Sink

The properties should be similar to those of the Postgres sink, but this sink should also expose any additional properties from the Microsoft SQL Server database plugin and support all of its type mappings.

CloudSQL Postgres Action

This section defines properties that are configurable for this plugin. 

| Section | User Facing Name | Type | Description | Constraints | Optional? | Default |
| --- | --- | --- | --- | --- | --- | --- |
| Credentials | Project ID | Textbox | | | | auto-detect |
| Credentials | Service Account | Textbox | | | | auto-detect |
| Credentials | Database username | Textbox | The username to use to connect to the CloudSQL database | | | |
| Credentials | Database password | Password | The password to use to connect to the CloudSQL database | | | |
| Cloud SQL properties | Instance name | Select | Select the Cloud SQL instance name | Open question: can this be a select, or does it have to be a textbox? | N | |
| Cloud SQL properties | Database command | Textarea | The database command to run | | N | |
| Advanced | Additional Connection arguments | Keyvalue | A list of keyvalue pairs of connection arguments passed to CloudSQL | | Y | |
| Advanced | Connection Timeout | Number | The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds; a value of zero disables it | | Y | |
| Advanced | Number of splits | Number | The number of splits to generate | | Y | 1 |
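The Additional Connection arguments property is a list of key/value pairs. As a rough sketch, assuming the widget serializes the pairs as a single string of the form `key1=value1;key2=value2` (the delimiters are an assumption, not something this document specifies), the plugin could parse them as follows. The function name is hypothetical.

```python
def parse_connection_arguments(raw, pair_delimiter=";", kv_delimiter="="):
    """Parse 'k1=v1;k2=v2' into a dict, rejecting malformed pairs."""
    args = {}
    for pair in raw.split(pair_delimiter):
        pair = pair.strip()
        if not pair:
            # Tolerate empty input and trailing delimiters.
            continue
        key, sep, value = pair.partition(kv_delimiter)
        if not sep or not key.strip():
            raise ValueError(f"malformed connection argument: {pair!r}")
        args[key.strip()] = value.strip()
    return args


if __name__ == "__main__":
    print(parse_connection_arguments("sslmode=require; connectTimeout=10;"))
```

The resulting dictionary could then be handed to the driver as connection properties instead of being concatenated into the connection string by hand.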

CloudSQL MySQL Action

This action exposes configuration similar to that of the Postgres action, and also exposes any other relevant properties from the MySQL database plugin.

CloudSQL SQLServer Action

This action exposes configuration similar to that of the Postgres action, and also exposes any other relevant properties from the Microsoft SQL Server database plugin.

Design / Implementation Tips

Design

Approach(s)

Properties

Security

Limitation(s)

Future Work

  • Some future work – HYDRATOR-99999

  • Another future work – HYDRATOR-99999

Test Case(s)

  • Test case #1

  • Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data. 

Pipeline #1

Pipeline #2

 

 

Checklist

User stories documented 
User stories reviewed 
Design documented 
Design reviewed 
Feature merged 
Examples and guides 
Integration tests 
Documentation for feature 
Short video demonstrating the feature

Created in 2020 by Google Inc.