Hive Bulk Import Action (Deprecated)
This plugin is no longer available as of July 26, 2024.
Imports data from an HDFS directory or file into a Hive table by executing the provided Hive LOAD statement. Local file storage is not allowed because a pipeline can run on any machine. If the LOCAL file storage option is provided, pipeline deployment fails at publish time. The Hive Bulk Import action accepts only Hive LOAD statements. If any other Hive query is provided, pipeline publish fails. If the LOAD command executes successfully, all the files in the directory are moved, not copied, to a Hive/warehouse directory.
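The publish-time rules above (LOAD statements only, no LOCAL) can be illustrated with a small check. This is a minimal sketch, not the plugin's actual validation code, which is not shown here. The function name `check_load_statement` and the regular expression are assumptions for illustration; the pattern follows the general Hive syntax `LOAD DATA [LOCAL] INPATH 'path' [OVERWRITE] INTO TABLE name` and handles only single-quoted paths.

```python
import re
from typing import Tuple

# Assumed pattern for illustration only, modelled on the Hive syntax:
#   LOAD DATA [LOCAL] INPATH 'path' [OVERWRITE] INTO TABLE name [PARTITION (...)]
_LOAD_RE = re.compile(
    r"^\s*LOAD\s+DATA\s+(?P<local>LOCAL\s+)?INPATH\s+'(?P<path>[^']+)'"
    r"\s+(?:OVERWRITE\s+)?INTO\s+TABLE\s+(?P<table>[\w.]+)",
    re.IGNORECASE,
)


def check_load_statement(statement: str) -> Tuple[str, str]:
    """Return (path, table) for an acceptable LOAD statement, else raise ValueError."""
    match = _LOAD_RE.match(statement)
    if match is None:
        # Any query that is not a LOAD statement is rejected.
        raise ValueError("only Hive LOAD statements are accepted")
    if match.group("local"):
        # LOCAL would read from the machine running the pipeline, which is not fixed.
        raise ValueError("LOCAL file storage is not allowed")
    return match.group("path"), match.group("table")
```

Calling `check_load_statement("LOAD DATA INPATH '/tmp/hive' INTO TABLE testTable")` returns the path and table name, while a statement containing LOCAL or a non-LOAD query such as a SELECT raises `ValueError`.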
Important: The Hive Bulk Import action works with Hive 2.3.3.
Configuration
Property | Macro Enabled? | Description |
---|---|---|
Hive Metastore Username | Yes | User identity for connecting to the specified Hive database. Required for databases that need authentication. Optional for databases that do not require authentication. |
Hive Metastore Password | Yes | Password to use to connect to the specified database. Required for databases that need authentication. Optional for databases that do not require authentication. |
JDBC Connection String | Yes | Required. JDBC connection string including database name. Use |
Statement to Load data into Hive | Yes | Required. Load command to load files data into a Hive table. |
Example
This example connects to a Hive database using the specified JDBC Connection String, which means it will connect to the ‘mydb’ database of a Hive instance running on ‘localhost’ and run the load query. The plugin reads all the files from the HDFS path /tmp/hive and loads the data into the table testTable.
Property | Value |
---|---|
Hive Metastore Username | |
Hive Metastore Password | |
JDBC Connection String | |
Statement to Load data into Hive | |
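As a rough illustration of the values this example describes, the sketch below builds a LOAD statement for the HDFS path /tmp/hive and the table testTable. The helper `build_load_statement` and the `example_properties` dictionary are hypothetical. The JDBC URL is an assumption: it uses the `jdbc:hive2://` scheme and HiveServer2's default port 10000, neither of which is stated in this page.

```python
def build_load_statement(hdfs_path: str, table: str, overwrite: bool = False) -> str:
    """Build a Hive LOAD statement that reads from HDFS (no LOCAL keyword)."""
    if "'" in hdfs_path:
        # Keep the sketch simple: refuse paths that would break the quoting.
        raise ValueError("path must not contain single quotes")
    mode = "OVERWRITE INTO" if overwrite else "INTO"
    return f"LOAD DATA INPATH '{hdfs_path}' {mode} TABLE {table}"


# Hypothetical property values matching the description above.
# The JDBC URL assumes HiveServer2 listening on its default port 10000.
example_properties = {
    "JDBC Connection String": "jdbc:hive2://localhost:10000/mydb",
    "Statement to Load data into Hive": build_load_statement("/tmp/hive", "testTable"),
}
```

Because the statement omits LOCAL, Hive resolves /tmp/hive against HDFS, and the files are moved into the table's storage location rather than copied.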
Created in 2020 by Google Inc.