DB Export Netezza

DB Export Netezza

Introduction 

A source Plugin in Hydrator is needed to efficiently export data from Netezza.   Netezza includes strategies for exporting data including the use of transient external tables,  and a command line tool NZSQL that can export data to a flat file. 

Use-case

A Hydrator User would like to incorporate Netezza data into a pipeline using a source plugin that does not require a JDBC connection to perform the export from Netezza.

 

User Stories

  • As a Hydrator User I want a plugin that I can use to export data from Netezza so that I can use the data in my pipeline.  

  • As a Hydrator User I want a Netezza plugin that exports data efficiently using existing Netezza tools. 

  • As a Hydrator User I want the export data capability of the Netezza plugin to be based on a sql query that I issue.  

  • User should be able to specify credentials.

  • Passwords should not be viewable in plain text from inside pipeline viewer or hydrator studio.  

  • User should be able to specify Netezza Instance

  • User should be able to specify location of nzqsql Utility. 

  • User should be able to specify type of output.

  • User should be able to specify location of output files.  

  • User should know of connectivity errors, or malformed queries/output identifier.

Example

Example for how the plugin should work

Implementation Tips

NZSQL is a command line utility that can run remotely to allow exporting.   The output format can be specified to a flat file with specified delimiter or an external table.

 

nzsql information: Netezza Docs

External Table: Netezza Docs

Command example to output to a flat file: nzsql -d {dbname} -u {username} -h {host} -c "select * from table_with_90_columns " -t -F -o output.csv

 

Command example to use External Table: CREATE EXTERNAL TABLE '/tmp/export.csv' USING (DELIM ',') AS
SELECT foo.x, bar.y, bar.dt FROM foo, bar WHERE foo.x = bar.x;

 

Design 

Design:

Input Json Format:

{ "name": "Netezza", "type": "batchsource", "properties": { "userName": "cdap", "password": "****", "host": "localhost", "nzsqlPath": "/opt/netezza", "db": "user", "outputType": "File", "delimiter": ",", "outputPath": "/home/cdap/netezza", "sqlQuery": "select * from table_with_90_columns" } }

Properties:

  • userName: User name for user db.

  • password: Password for user db.

  • host: Hostname or IP of the Netezza instance.

  • nzsqlPath: Path of nzsql utility.

  • db: Name of the DB from on which sql query needs to be executed.

  • outputType: Type of the output. It may be Flat file or External table.

  • delimiter: Delimiter needs to be specified which can be used while writing records in a flat file.

  • outputPath: Path of the output files.

  • sqlQuery: Sql query which needs to be executed on the specified db name to export the data.

 

Approach:

  • Ganymed SSH-2 library can be used to connect to the Netezza instance to run NZSQL utility as specified by the user.

  • The output format can only be among : Flat files(with specified delimiters) or external table.

  • Assuming the sql query provided by the user will be in below format:

          For example:
          1. nzsql -d {dbname} -u {username} -h {host} -c "select * from table_with_90_columns " -t -F -o output.csv
          2. CREATE EXTERNAL TABLE '/tmp/export.csv' USING (DELIM ',') AS SELECT foo.x, bar.y, bar.dt FROM foo, bar WHERE foo.x = bar.x;

  • User will only specify the query part like "select * from table_with_90_columns" or "SELECT foo.x, bar.y, bar.dt FROM foo, bar WHERE foo.x = bar.x" in "sqlQuery" input and rest of the command has to be created at the plugin side using the inputs provided by the user.

  • In the host input, user will specify the hostname for the Netezza machine instance.

Table of Contents

Checklist

User stories documented 
User stories reviewed 
Design documented 
Design reviewed 
Feature merged 
Examples and guides 
Integration tests 
Documentation for feature 
Short video demonstrating the feature

 

Created in 2020 by Google Inc.