Hive Bulk Import Action

The Hive Bulk Import action plugin is available in the Hub.

Plugin version: 1.9.0-1.1.0

Imports data from an HDFS directory or file into a Hive table by executing the provided Hive LOAD statement. If the LOAD statement executes successfully, all files in the source directory are moved, not copied, to the Hive warehouse directory.

The Hive Bulk Import action accepts only Hive LOAD statements. If any other Hive query is provided, the pipeline fails at publish time. The LOCAL option is not allowed because a pipeline can run on any machine, so a local path may not exist where the pipeline executes. If the LOCAL option is specified, the pipeline also fails at publish time.
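The publish-time checks described above can be approximated as follows. This is a minimal Python sketch, not the plugin's actual validation code; the function name `check_load_statement` and the regular expression are assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical pattern: matches the start of a LOAD statement and
# captures the LOCAL keyword if it is present.
_LOAD_RE = re.compile(
    r"^\s*LOAD\s+DATA\s+(?P<local>LOCAL\s+)?INPATH\s+", re.IGNORECASE
)


def check_load_statement(statement: str) -> Optional[str]:
    """Return a problem description, or None if the statement looks acceptable.

    Illustrative only: it mirrors the two rules in the text above
    (LOAD statements only, no LOCAL option).
    """
    match = _LOAD_RE.match(statement)
    if match is None:
        return "only Hive LOAD statements are accepted"
    if match.group("local"):
        return "the LOCAL option is not allowed; load from HDFS instead"
    return None


statements = [
    "LOAD DATA INPATH '/tmp/hive' INTO TABLE testTable",
    "LOAD DATA LOCAL INPATH '/tmp/hive' INTO TABLE testTable",
    "SELECT * FROM testTable",
]
for statement in statements:
    problem = check_load_statement(statement)
    print("ok" if problem is None else f"rejected: {problem}")
```

Running a check like this before publishing a pipeline may catch a disallowed statement earlier, but the plugin's own validation remains authoritative.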

Important: The Hive Bulk Import action works with Hive 2.3.3.

Configuration

Property

Macro Enabled?

Description

Hive Metastore Username

Yes

User identity for connecting to the specified Hive database. Required for databases that need authentication. Optional for databases that do not require authentication.

Hive Metastore Password

Yes

Password for connecting to the specified Hive database. Required for databases that need authentication. Optional for databases that do not require authentication.

JDBC Connection String

Yes

Required. JDBC connection string, including the database name. Use auth=delegationToken. The CDAP platform provides the appropriate delegation token when the pipeline runs.

Statement to Load data into Hive

Yes

Required. LOAD statement that loads file data into a Hive table. The LOCAL option of the LOAD statement is not supported.
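As an illustration of the JDBC Connection String property, the sketch below composes a HiveServer2 JDBC URL with auth=delegationToken and checks that the setting is present. The helper names `build_connection_string` and `uses_delegation_token` are hypothetical, and the host, port, and database values are placeholders.

```python
import re


def build_connection_string(host: str, port: int, database: str) -> str:
    """Compose a HiveServer2 JDBC URL that requests delegation-token auth."""
    return f"jdbc:hive2://{host}:{port}/{database};auth=delegationToken"


def uses_delegation_token(connection_string: str) -> bool:
    """Return True if the session settings include auth=delegationToken.

    Session settings follow the first ';' and end at an optional '?' or '#'.
    """
    _, _, tail = connection_string.partition(";")
    session_part = re.split(r"[?#]", tail, maxsplit=1)[0]
    return "auth=delegationToken" in session_part.split(";")


url = build_connection_string("localhost", 10000, "mydb")
print(url)
print(uses_delegation_token(url))
```

A check of this kind only inspects the string; whether a delegation token is actually issued depends on the CDAP platform and the cluster configuration.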

Example

This example connects to the ‘mydb’ database of a Hive instance running on ‘localhost’, as given by the JDBC Connection String, and runs the LOAD statement. The plugin reads all the files from the HDFS path /tmp/hive and loads the data into the table testTable.

Property

Value

Hive Metastore Username

username

Hive Metastore Password

password

JDBC Connection String

jdbc:hive2://localhost:10000/mydb;auth=delegationToken

Statement to Load data into Hive

LOAD DATA INPATH '/tmp/hive' INTO TABLE testTable
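To make the example values concrete, the following sketch pulls the host, port, database, HDFS path, and target table out of the two strings above. The regular expressions are simplified assumptions written for these example values only; they do not cover the full JDBC URL or LOAD statement grammar.

```python
import re

# Values copied from the example table above.
connection_string = "jdbc:hive2://localhost:10000/mydb;auth=delegationToken"
load_statement = "LOAD DATA INPATH '/tmp/hive' INTO TABLE testTable"

# Simplified pattern: host, port, and database name before any ';' settings.
jdbc = re.match(
    r"jdbc:hive2://(?P<host>[^:/;]+):(?P<port>\d+)/(?P<database>[^;?#]+)",
    connection_string,
)

# Simplified pattern: quoted source path and target table name.
load = re.match(
    r"LOAD\s+DATA\s+INPATH\s+'(?P<path>[^']+)'\s+(?:OVERWRITE\s+)?INTO\s+TABLE\s+(?P<table>\w+)",
    load_statement,
    re.IGNORECASE,
)

print(jdbc["host"], jdbc["port"], jdbc["database"])  # localhost 10000 mydb
print(load["path"], load["table"])  # /tmp/hive testTable
```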

Created in 2020 by Google Inc.