...
The Vertica Bulk Import Action plugin is executed after a successful MapReduce or Spark job. It reads all the files in a given directory and bulk imports the contents of those files into a Vertica table.
Configuration
Property | Macro Enabled? | Description |
---|---|---|
Username | Yes | Optional. User identity for connecting to the specified database. Required for databases that need authentication. Optional for databases that do not require authentication. |
Password | Yes | Optional. Password to use to connect to the specified database. Required for databases that need authentication. Optional for databases that do not require authentication. |
File Path | Yes | Required. Path of the directory or file to load into the database. |
Copy Statement level | Yes | Required. Copy statement level used by the plugin. If Basic is selected, the copy statement is generated automatically. The Advanced option takes the whole copy statement from the user. Default is Basic. |
Auto commit after each file? | No | Required. Whether to commit after every file loaded from the directory. If false, a single commit is applied after all the files are loaded. If true, a commit is applied after each file. Default is false. |
Vertica Table name | Yes | Optional. Name of the Vertica table into which the data will be loaded. The table must already exist in Vertica. Only works with the Basic Copy Statement level. |
Delimiter for the input file | Yes | Optional. Delimiter used in the input file. Only works with the Basic Copy Statement level. Default is , (comma). |
Copy Statement | Yes | Optional. Copy statement for the Vertica bulk load. Only works with the Advanced Copy Statement level. |
Connection String | Yes | Required. JDBC connection string including database name. |
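
To make the Basic level more concrete, the following is a minimal sketch of how a copy statement could be assembled from the table name and delimiter properties. The exact statement the plugin generates is not documented here, so the `COPY ... FROM STDIN DELIMITER ...` shape, the class name `BasicCopyStatementSketch`, and the `build` method are assumptions for illustration only.

```java
// Hypothetical helper, not part of the plugin: sketches how a Basic-level
// copy statement might be built from the table name and delimiter.
final class BasicCopyStatementSketch {

    // Returns "COPY <table> FROM STDIN DELIMITER '<delimiter>'".
    // A missing delimiter falls back to a comma, the documented default.
    static String build(String tableName, String delimiter) {
        if (tableName == null || tableName.trim().isEmpty()) {
            throw new IllegalArgumentException("table name is needed to build a Basic copy statement");
        }
        String d = (delimiter == null || delimiter.isEmpty()) ? "," : delimiter;
        // Double any single quote so the delimiter stays a valid SQL string literal.
        String escaped = d.replace("'", "''");
        return "COPY " + tableName.trim() + " FROM STDIN DELIMITER '" + escaped + "'";
    }

    public static void main(String[] args) {
        System.out.println(build("public.events", "|"));
        System.out.println(build("public.events", null));
    }
}
```

At the Advanced level no such assembly would take place, since the whole statement is supplied in the Copy Statement property.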
Usage Notes
The plugin can be configured to read a single file or multiple files from a configured HDFS directory and bulk load them into a Vertica table. The plugin uses the capabilities of Vertica to load the data from HDFS into Vertica. The load commands are issued through the Vertica JDBC driver. Vertica's Java API class VerticaCopyStream is then used to write the contents of each file to the Vertica table as a stdin stream.
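
The effect of the "Auto commit after each file?" setting on this flow can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the class `CommitOrderSketch` and its `loadAll` method are hypothetical, and the per-file load step (which in the plugin would presumably go through VerticaCopyStream) is replaced by a plain callback so the sketch runs without a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the commit ordering described in the configuration table.
final class CommitOrderSketch {

    // Loads every path with the given loader and commits either after each
    // file or once at the end. Returns the number of commits issued.
    // A real loader would also have to handle SQL and I/O errors.
    static int loadAll(List<String> paths, boolean autoCommitPerFile,
                       Consumer<String> loader, Runnable committer) {
        int commits = 0;
        for (String path : paths) {
            loader.accept(path);          // stand-in for streaming one file to Vertica
            if (autoCommitPerFile) {
                committer.run();          // commit right after this file
                commits++;
            }
        }
        if (!autoCommitPerFile && !paths.isEmpty()) {
            committer.run();              // single commit after all files
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        List<String> files = Arrays.asList("part-0", "part-1");
        loadAll(files, false, p -> events.add("load " + p), () -> events.add("commit"));
        System.out.println(events);
    }
}
```

With the default of false, a failure partway through the directory leaves nothing committed by this loop, whereas true keeps the files that were already loaded.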
...