CDAP replication relies on the cluster administrator setting up replication on HBase, HDFS, Hive, and Kafka.

  • It is assumed that CDAP is only running on the master cluster.

  • It is assumed that you have not started CDAP before any of these steps.

...

Set up HDFS replication using the solution provided by your distribution. HDFS has no built-in cross-cluster replication; it is usually approximated by scheduling regular distcp jobs that copy data from the master cluster to the slave cluster.
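As an illustration of the distcp approach, the following minimal sketch builds the command line that a scheduled job (for example, a cron entry) might run. The NameNode addresses and the /cdap path are hypothetical placeholders, not values taken from this guide; -update and -delete are standard distcp options, but check your distribution's documentation before relying on them.

```python
from typing import List


def build_distcp_command(src: str, dst: str, delete_extra: bool = False) -> List[str]:
    """Build a 'hadoop distcp' command that syncs src to dst.

    -update copies only files that are missing or differ on the target.
    -delete (optional) removes target files that no longer exist on the source.
    """
    cmd = ["hadoop", "distcp", "-update"]
    if delete_extra:
        cmd.append("-delete")
    cmd += [src, dst]
    return cmd


if __name__ == "__main__":
    # Hypothetical NameNode hosts and path; replace with your own values.
    command = build_distcp_command(
        "hdfs://master-nn:8020/cdap",
        "hdfs://slave-nn:8020/cdap",
    )
    # Print the command rather than running it; a scheduler could pass
    # the same list to subprocess.run(command, check=True).
    print(" ".join(command))
```

Returning the command as a list rather than a single string avoids shell quoting problems if the job is later launched with subprocess.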

Hive

Set up replication for the database backing your Hive Metastore. Note that this replicates only the Hive metadata (which tables exist, table metadata, and so on), not the data itself. It is assumed that you will not run Hive queries on the slave until after a manual failover occurs.

For example, to set up MySQL 5.7 replication, follow the steps described at Setting Up Binary Log File Position Based Replication.
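As a rough sketch of one step in that MySQL procedure, the helper below assembles the CHANGE MASTER TO statement that is run on the slave database. The host, user, password, and binary log coordinates are hypothetical; in practice the log file and position come from SHOW MASTER STATUS on the master. The helper name is also an assumption made for illustration.

```python
def build_change_master_sql(host: str, user: str, password: str,
                            log_file: str, log_pos: int) -> str:
    """Return a MySQL 5.7 CHANGE MASTER TO statement for the slave.

    Values are assumed to be trusted administrator input; quotes and
    backslashes are rejected instead of escaped to keep the sketch simple.
    """
    for value in (host, user, password, log_file):
        if "'" in value or "\\" in value:
            raise ValueError("quotes and backslashes are not supported: %r" % value)
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{host}', "
        f"MASTER_USER='{user}', "
        f"MASTER_PASSWORD='{password}', "
        f"MASTER_LOG_FILE='{log_file}', "
        f"MASTER_LOG_POS={int(log_pos)};"
    )


if __name__ == "__main__":
    # Hypothetical values; take the log coordinates from SHOW MASTER STATUS.
    print(build_change_master_sql(
        "master-db", "repl", "secret", "mysql-bin.000003", 73))
```

After running the generated statement on the slave, replication is started with START SLAVE; as described in the MySQL documentation.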

Kafka

Set up replication for the Kafka brokers you are using. Kafka MirrorMaker is the most common solution. See Mirroring data between clusters and Kafka mirroring (MirrorMaker) for additional information.
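As a minimal sketch of a MirrorMaker invocation, the helper below builds the command line for the legacy MirrorMaker tool. The script name kafka-mirror-maker.sh matches the Apache Kafka distribution but may differ in yours, and the config file paths and topic pattern are hypothetical. The consumer config is assumed to point at the master cluster and the producer config at the slave cluster.

```python
from typing import List


def build_mirror_maker_command(consumer_config: str, producer_config: str,
                               whitelist: str, num_streams: int = 1) -> List[str]:
    """Build a legacy Kafka MirrorMaker command line.

    consumer_config: properties file for the source (master) cluster.
    producer_config: properties file for the target (slave) cluster.
    whitelist: regular expression selecting the topics to mirror.
    """
    return [
        "kafka-mirror-maker.sh",
        "--consumer.config", consumer_config,
        "--producer.config", producer_config,
        "--whitelist", whitelist,
        "--num.streams", str(num_streams),
    ]


if __name__ == "__main__":
    # Hypothetical file names and topic pattern; adjust for your brokers.
    command = build_mirror_maker_command(
        "master-consumer.properties",
        "slave-producer.properties",
        ".*",
        num_streams=2,
    )
    print(" ".join(command))
```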