MySQL Replication Source

The MySQL Replication Source replicates all row-level changes in the databases on a MySQL server.

Setting up MySQL

Enable the binlog

This is done in the MySQL server configuration file and will look similar to this:

    server-id = 1
    log_bin = mysql-bin
    binlog_format = row
    binlog_row_image = full
    expire_logs_days = 10
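After restarting the server, the settings can be checked from any MySQL client. The query below is a minimal sketch, not part of the plugin; it only reads server variables using the standard SHOW VARIABLES statement.

```sql
-- Read back the binlog-related settings from the running server.
-- With the configuration above, log_bin should report ON,
-- binlog_format ROW, and binlog_row_image FULL.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN (
  'server_id',
  'log_bin',
  'binlog_format',
  'binlog_row_image',
  'expire_logs_days'
);
```

A variable that the server version does not know is simply missing from the result. For example, expire_logs_days is deprecated in MySQL 8.0 in favor of binlog_expire_logs_seconds, so newer servers may not list it.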

Enable GTIDs (optional)

The MySQL server can be configured to use GTID-based replication. GTIDs greatly simplify replication and make it easy to confirm whether masters and slaves are consistent. Note that GTIDs cannot be enabled on MySQL versions earlier than 5.6.5.

    gtid_mode = on
    enforce_gtid_consistency = on
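To confirm that GTIDs are active after the restart, the two settings can be read back as system variables. This is a small sketch that assumes a server version with GTID support:

```sql
-- Both columns should report ON when GTID-based replication is enabled.
SELECT @@GLOBAL.gtid_mode                AS gtid_mode,
       @@GLOBAL.enforce_gtid_consistency AS enforce_gtid_consistency;
```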

Set up Session Timeouts (optional)

When the initial snapshot of a very large database is taken, an established connection can time out while the contents of the database tables are being read. To prevent this, increase the following settings:

    interactive_timeout = <duration-in-seconds>
    wait_timeout = <duration-in-seconds>
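Both timeouts are dynamic system variables, so they can also be inspected and raised on a running server. The sketch below assumes an account that is allowed to set global variables; the value 86400 (24 hours) is purely illustrative and not a recommendation from this document.

```sql
-- Show the current values, in seconds.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('interactive_timeout', 'wait_timeout');

-- Raise both timeouts without restarting the server.
-- The new global values apply only to connections opened afterwards.
SET GLOBAL interactive_timeout = 86400;
SET GLOBAL wait_timeout = 86400;
```

Values changed with SET GLOBAL are lost on restart, so the configuration file entries are still needed to make the change permanent.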

Create a MySQL User

A MySQL user must be defined with all of the following permissions on every database that is to be replicated: SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE and REPLICATION CLIENT.

If you use a hosted option such as Amazon RDS or Amazon Aurora that does not allow a global read lock, table-level locks are used to create the consistent snapshot. In this case, you must also grant the LOCK TABLES permission to the user that you create.
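The permissions listed above could be granted as follows. This is a sketch under stated assumptions: the user name replication_user and the password placeholder are hypothetical, and the grant is made on all databases (*.*) because RELOAD, SHOW DATABASES, REPLICATION SLAVE and REPLICATION CLIENT are global privileges in MySQL.

```sql
-- Hypothetical account name and placeholder password; replace both.
CREATE USER 'replication_user'@'%' IDENTIFIED BY '<password>';

-- Permissions required for replication.
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT
  ON *.* TO 'replication_user'@'%';

-- Only needed on hosted servers that do not allow a global read lock.
GRANT LOCK TABLES ON *.* TO 'replication_user'@'%';
```

The host part '%' matches the account form that the source uses when it connects, as described for the User property.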

Setting up the JDBC Driver

If the MySQL JDBC driver is not already installed, follow the installation instructions on the Hub. The installed driver should be version 8 or above.

Plugin Properties

| Property | Macro Enabled? | Description |
| --- | --- | --- |
| Host | No | Required. Hostname of the MySQL server to read from. |
| Port | No | Required. Port to use to connect to the MySQL server. |
| JDBC Plugin Name | No | Required. Identifier for the MySQL JDBC driver, which is the name used while uploading the MySQL JDBC driver. |
| Database Name | No | Required. Name of the database to replicate data from. |
| User | No | Required. Username to use to connect to the MySQL server. The account that the source actually connects with has the form 'user_name'@'%', where user_name is the value of this field. |
| Password | Yes | Required. Password to use to connect to the MySQL server. Note: If you use a macro for the password, it must be in the Secure Store. If it is not in the Secure Store, the replication job fails. For more information, see Using Secure Keys. |
| Consumer ID | No | Optional. Unique numeric ID that identifies this origin as an event consumer. It cannot be the same as the ID of another replication job that is reading from the server, and it cannot be the same as the server-id of any MySQL slave that is replicating from the server. By default, a random number is used. |
| Server Timezone | No | Optional. Time zone of the MySQL server. This is used when converting dates into timestamps. |
| Replicate Existing Data | No | Optional. Whether to replicate existing data from the source database. By default, the pipeline replicates the existing data from the source tables. If set to false, any existing data in the source tables is ignored and only changes made after the pipeline starts are replicated. |
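When choosing values for Consumer ID and Server Timezone, it can help to look at what the server itself reports. The queries below are a sketch, not part of the plugin. They assume an account with the REPLICATION SLAVE permission, and the list of replicas may be incomplete depending on how each replica is configured.

```sql
-- server-id of the MySQL server itself.
SELECT @@GLOBAL.server_id;

-- Replicas registered with this server and their server IDs
-- (SHOW REPLICAS is the newer name in MySQL 8.0.22 and later).
SHOW SLAVE HOSTS;

-- Time zone settings of the server. A time_zone value of SYSTEM means
-- the server follows the operating system zone in system_time_zone.
SELECT @@GLOBAL.time_zone, @@GLOBAL.system_time_zone;
```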

Schema Mapping

For information about data type conversions from supported source databases to the BigQuery destination, see https://cloud.google.com/data-fusion/docs/reference/replication-data-types#mysql

Schema Evolution

This section lists the data definition language (DDL) operations supported during MySQL replication.

| DDL Operation | Supported? |
| --- | --- |
| Create table | Yes (a new table is picked up dynamically when no tables are selected in the pipeline config) |
| Rename table | No |
| Truncate table | Yes |
| Drop table | Yes |
| Add nullable column | Yes |
| Add required column | No |
| Alter column to make it nullable | Yes |
| Alter column to make it required | No |
| Alter column type | No |
| Rename column | No |
| Drop column | No |

Troubleshooting

If the replication job is able to start snapshotting the data but fails with errors in the log when it switches over to reading from the binlog, the most likely cause is that the replication user uses an incompatible password type. This is especially common with MySQL 8 and later. To fix this, change the user to use a MySQL native password.
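A statement along the following lines switches an account to the native password plugin. It is a sketch rather than the exact command from the original page: the account name replication_user and the password placeholder are hypothetical, and the host part '%' follows the account form described for the User property.

```sql
-- Hypothetical account name and placeholder password; replace both.
-- Re-creates the credentials using the mysql_native_password plugin.
ALTER USER 'replication_user'@'%'
  IDENTIFIED WITH mysql_native_password BY '<password>';
```

Recent MySQL releases deprecate the mysql_native_password plugin, so check that your server version still supports it before relying on this approach.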

Created in 2020 by Google Inc.