HBase Sink

Plugin version: 2.11.0

Writes records to a column family in an HBase table with one record field mapping to the rowkey, and all other record fields mapping to table column qualifiers. This sink differs from the Table sink in that it does not use CDAP datasets, but writes to HBase directly.

Use this sink when you need to write to an HBase table in batch. For example, you may want to periodically dump the contents of a relational database into an HBase table.

Configuration

| Property | Macro Enabled? | Description |
| --- | --- | --- |
| Reference Name | No | Required. Uniquely identifies this sink for lineage, metadata annotation, and similar purposes. |
| HBase Table Name | Yes | Required. The name of the table to write to. Note: the table must already exist before the pipeline runs. |
| HBase Column Family | Yes | Required. The name of the column family to write to. |
| Zookeeper Quorum String | Yes | Optional. The ZooKeeper quorum for the HBase instance you are writing to. This should be a comma-separated list of the hosts that make up the quorum. You can find the correct value by looking at the hbase.zookeeper.quorum setting in your hbase-site.xml file. Defaults to 'localhost'. |
| Zookeeper Client Port | Yes | Optional. The client port used to connect to the ZooKeeper quorum. You can find the correct value by looking at the hbase.zookeeper.property.clientPort setting in your hbase-site.xml file. Defaults to 2181. |
| Row Field Name | No | Required. The name of the field whose value is written as the rowkey instead of to a column. The field must be present in the schema and must not be nullable. |
| Parent Node of HBase in Zookeeper | No | Optional. The parent node of HBase in ZooKeeper. You can find the correct value by looking at the zookeeper.znode.parent setting in your hbase-site.xml file. Defaults to '/hbase'. |
| Output Schema | No | Required. Schema of records written to the table. Record fields map to row columns. For example, if the schema contains a field named 'user' of type string, the value of that field will be written to the 'user' column. Only simple types are allowed (boolean, int, long, float, double, bytes, string). |
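
The constraints on Row Field Name and Output Schema can be checked before a pipeline is deployed. The following is a minimal sketch of such a check, not the plugin's own validation code. The function name `validate_schema` and the `(name, type, nullable)` tuple representation are assumptions made for illustration, and the sketch assumes that fields other than the row field may be nullable.

```python
# Simple types listed in the Output Schema description.
SIMPLE_TYPES = {"boolean", "int", "long", "float", "double", "bytes", "string"}


def validate_schema(fields, row_field):
    """Return a list of problems found in a schema (hypothetical helper).

    fields: list of (name, type, nullable) tuples describing the schema.
    row_field: the value configured as Row Field Name.
    """
    errors = []
    by_name = {name: (ftype, nullable) for name, ftype, nullable in fields}

    # The row field must be present in the schema and must not be nullable.
    if row_field not in by_name:
        errors.append(f"row field '{row_field}' is not in the schema")
    elif by_name[row_field][1]:
        errors.append(f"row field '{row_field}' must not be nullable")

    # Only simple types are allowed.
    for name, (ftype, _) in by_name.items():
        if ftype not in SIMPLE_TYPES:
            errors.append(f"field '{name}' has unsupported type '{ftype}'")
    return errors
```

An empty list means the schema satisfies the documented constraints; each string in a non-empty list names one violation.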

Example

This example writes to the attr column family of an HBase table named users:

| Property | Value |
| --- | --- |
| Reference Name | hbasesink |
| HBase Table Name | users |
| HBase Column Family | attr |
| Zookeeper Quorum String | host1,host2,host3 |
| Zookeeper Client Port | 2181 |
| Row Field Name | id |
| Parent Node of HBase in Zookeeper | /hbase |
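
The three ZooKeeper properties correspond to standard HBase client settings found in hbase-site.xml. As a rough sketch of that correspondence, the hypothetical helper below (`hbase_client_settings` is not part of the plugin) builds the equivalent settings from the example values, falling back to the documented defaults when a property is omitted.

```python
def hbase_client_settings(zk_quorum="localhost", zk_client_port=2181, zk_parent="/hbase"):
    """Map the sink's ZooKeeper properties to hbase-site.xml setting names.

    Defaults mirror the documented defaults of the optional properties.
    """
    return {
        "hbase.zookeeper.quorum": zk_quorum,
        "hbase.zookeeper.property.clientPort": str(zk_client_port),
        "zookeeper.znode.parent": zk_parent,
    }


# Values taken from the example configuration.
settings = hbase_client_settings("host1,host2,host3", 2181, "/hbase")
for key, value in settings.items():
    print(f"{key} = {value}")
```

Comparing this output with the corresponding entries in your cluster's hbase-site.xml is a quick way to confirm the sink points at the intended HBase instance.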

It takes records with this schema as input:

| field name | type |
| --- | --- |
| id | long |
| name | string |
| birthyear | int |

The id field will be used as the rowkey when writing to the table. The name and birthyear record fields will be written to column qualifiers named name and birthyear.
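
To make the mapping concrete, here is a small sketch that splits one record into a rowkey and a set of `family:qualifier` cells. It only illustrates the mapping described above: the function `to_row` and the sample record are hypothetical, and the values are left as plain Python objects rather than encoded as the bytes that HBase actually stores.

```python
def to_row(record, row_field, column_family):
    """Split a record into (rowkey, cells) following the sink's mapping.

    The row field becomes the rowkey; every other field becomes a cell
    keyed by "<column family>:<field name>".
    """
    rowkey = record[row_field]
    cells = {
        f"{column_family}:{name}": value
        for name, value in record.items()
        if name != row_field
    }
    return rowkey, cells


# Hypothetical input record matching the example schema.
rowkey, cells = to_row({"id": 1, "name": "alice", "birthyear": 1990}, "id", "attr")
print(rowkey)  # 1
print(cells)   # {'attr:name': 'alice', 'attr:birthyear': 1990}
```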

Created in 2020 by Google Inc.