Cassandra Batch Source

The Cassandra batch source plugin is available in the Hub.

Plugin version: 2.3.4

Use the Cassandra source plugin when you need to read data from Apache Cassandra. For example, you might want to read in a column family from Cassandra and store the data in a BigQuery table.

The Cassandra source plugin reads the rows returned by the user's query and converts each row to a structured record, using the schema specified by the user.
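To illustrate the row-to-record conversion described above, here is a minimal Python sketch. The schema representation (ordered pairs of field name and converter), the helper name row_to_record, and the sample row are all assumptions made for this example. The plugin performs the conversion internally, and its implementation may differ.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical schema: ordered (field name, converter) pairs.
# This is an illustration only, not the plugin's schema format.
Schema = List[Tuple[str, Callable[[Any], Any]]]


def row_to_record(row: Dict[str, Any], schema: Schema) -> Dict[str, Any]:
    """Build a structured record containing only the fields named in the schema."""
    record: Dict[str, Any] = {}
    for name, convert in schema:
        if name not in row:
            raise KeyError(f"row has no column named {name!r}")
        value = row[name]
        # Keep nulls as they are; otherwise coerce to the schema's type.
        record[name] = None if value is None else convert(value)
    return record


schema = [("id", int), ("name", str), ("salary", float)]
row = {"id": "7", "name": "Ada", "salary": 100, "unused": "x"}
record = row_to_record(row, schema)
```

In this sketch, columns that the schema does not name (such as "unused") are dropped, and a column that the schema names but the row lacks raises an error.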

Note: Apache Cassandra 2.1.0 is the only supported version.

Configuration

| Property | Macro Enabled? | Description |
| --- | --- | --- |
| Reference Name | No | Required. Uniquely identifies this source for lineage, annotating metadata, etc. |
| Initial Address | Yes | Required. The initial address to connect to. |
| Port | Yes | Optional. The RPC port for Cassandra. Make sure that start_rpc is set to true in cassandra.yaml. |
| Username | Yes | Optional. The username for the keyspace (if one exists). If a username is set, you must also supply a password. |
| Password | Yes | Optional. The password for the keyspace (if one exists). If a password is set, you must also supply a username. |
| Other Properties | No | Optional. Any extra properties to include, given as comma-separated property-value pairs, with each property separated from its value by a colon. |
| Keyspace | Yes | Required. The keyspace to select data from. |
| Partitioner | Yes | Required. The partitioner for the keyspace. |
| Column Family | Yes | Required. The column family or table to select data from. |
| CQL Query | Yes | Required. The CQL query that selects the data to read. |
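The format of Other Properties and the pairing rule for Username and Password can be sketched in Python as follows. The function names, the property names in the sample string, and the handling of whitespace and of colons inside values are assumptions made for this example. The plugin's own parsing and validation may behave differently.

```python
from typing import Dict, Optional


def parse_other_properties(text: str) -> Dict[str, str]:
    """Parse 'property1:value1,property2:value2' into a dictionary."""
    properties: Dict[str, str] = {}
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty input and trailing commas
        # Split on the first colon only, so a value may itself contain colons.
        key, sep, value = pair.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected 'property:value', got {pair!r}")
        properties[key.strip()] = value.strip()
    return properties


def check_credentials(username: Optional[str], password: Optional[str]) -> None:
    """Username and password must be supplied together or not at all."""
    if bool(username) != bool(password):
        raise ValueError("username and password must both be set or both be empty")


# Hypothetical property names, used only to show the format.
extra = parse_other_properties("prop.one:1, prop.two:abc")
check_credentials(None, None)  # no credentials at all is accepted
```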

Example

This example connects to an Apache Cassandra instance running locally and reads records from the specified keyspace (megacorp) and column family (employees). The query selects all records, so all data in the column family is read on each run.

| Property | Value |
| --- | --- |
| Initial Address | localhost |
| Port | 9160 |
| Keyspace | megacorp |
| Partitioner | org.apache.cassandra.dht.Murmur3Partitioner |
| Column Family | employees |
| CQL Query | select * from employees where token(id) > ? and token(id) <= ? |
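The two ? placeholders in the example query bound a range of partition-key tokens, presumably so that the read can be divided into independent slices. The Python sketch below shows one way such ranges could be computed for Murmur3Partitioner, whose tokens are signed 64-bit integers. The function name, the number of splits, and the even division of the token space are assumptions made for this example. The plugin's actual splitting logic is not described here and may differ.

```python
from typing import List, Tuple

# Murmur3Partitioner tokens are signed 64-bit integers.
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1

QUERY = "select * from employees where token(id) > ? and token(id) <= ?"


def token_ranges(num_splits: int) -> List[Tuple[int, int]]:
    """Split (MIN_TOKEN, MAX_TOKEN] into contiguous (start, end] ranges."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // num_splits for i in range(num_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


for start, end in token_ranges(4):
    # Each (start, end) pair would be bound to the two ? placeholders, in order.
    print(QUERY, (start, end))
```

Because each range excludes its start and includes its end, adjacent ranges share a boundary without reading any token twice, which matches the `>` and `<=` comparisons in the query.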

Created in 2020 by Google Inc.