Versions Compared
Key
- This line was added.
- This line was removed.
- Formatting was changed.
Introduction
A plugin that can efficiently export data from Oracle to be used as a source in Hydrator pipelines. Oracle includes command line tools to export data that can be utilized to perform this task.
User Stories
- As a Hydrator User I want to export data from Oracle to be used in my hydrator Pipeline.
- As a Hydrator User I want a Oracle Export plugin that exports data efficiently using existing existing Oracle tools.
- As a Hydrator User I want the export data capability of the Oracle plugin to be based on a sql query that I issue.
- User should be able to specify credentials.
- Passwords should not be viewable in plain text from inside pipeline viewer or hydrator studio.
- User should be able to specify Oracle Instance
- User should be able to specify location of EXP Utility.
- User should be able to specify type of output.
- User should be able to specify location of output files.
- User should know of connectivity errors, or malformed queries/output identifier.
Example
User wants to export the data from test table using filter on name='cask' i.e Select * from test where name='cask';
Plugin configurations:
"oracleServerHostname": "example.com",
"oracleServerPort": "22",
"serverUsernameoracleServerUsername": "oracle",
"serverPasswordoracleServerPassword": "oracle@123",
"dbUsername": "system",
"dbPassword": "cask",
"oracleHome": "/u01/app/oracle/product/11.2.0/xe",
"oracleSID": "cask",
"queryToExecute": "select * from test where name='cask';"
"columnSeparatorpathToWriteFinalOutput" : ","
"outputFilename" : "results/tmp/data.csv"
"destinationDirectoryformat" : "file:///home/cdap/export/csv"
Implementation Tips
Design
Design:
Code Block | ||
---|---|---|
| ||
{ "name": "OracleExportAction", "plugin": { "name": "OracleExportAction", "type": "action", "label": "OracleExportAction", "artifact": { "name": "core-plugins", "version": "1.4.0-SNAPSHOT", "scope": "SYSTEM" }, "properties": { "oracleServerHostname": "example.com", "oracleServerPort": "22", "serverUsernameoracleServerUsername": "oracle", "serverPasswordoracleServerPassword": "oracle@123", "dbUsername": "system", "dbPassword": "cask", "oracleHome": "/u01/app/oracle/product/11.2.0/xe", "oracleSID": "cask", "queryToExecute": "select * from test where name='cask';" "columnSeparatorpathToWriteFinalOutput" : "," "outputFilename" : "results/tmp/data.csv" "destinationDirectoryformat" : "file:///home/cdap/export/csv" } } |
oracleServerHostname:Host nameHostname of the remote DB machine where the data dump command is to be executed
oracleServerPort:Port of the remote DB machine to connect to.Defaults to 22
serverUsernameoracleServerUsername:User name used to connect to Username for remote DB host
serverPasswordoracleServerPassword:Password used to connect to for remote DB host
dbUsername:UserUsername name to connect to oracle DB
dbPassword:Password to connect to oracle DB
oracleHome:Path of the ORACLE_HOME
oracleSID:Oracle SID
queryToExecute: Query to be executed to export the data.
columnSeparator: Column separator to be used in the output file.
outputFilename: Output file name in which to export the data.
destinationDirectory: Absolute path of directory on the current machine where the file is to be copied. (Can be local or hdfs.).Query should have ';' at the end.
pathToWriteFinalOutput: Path where output file to be exported
format: Format of the output file
Plugin would perform below stepsrun below sequence of commands in one session:
1.SSH to the box using the provided $serverUsername and $userPassword.
2.Run below commands:
a.export ORACLE_HOME = $oracleHome;b.export and ORACLE_SID = $oracleSID
c2. The plugin would internally create a temporary sql file(or else we can take sql script as an input from user to avoid temp file creation and deletion ) with the above contents and run the below command:
Eg - "echo \"@script file(/tmp/test.sql
"+"\" | $oracleHome+/bin/sqlplus system/cask@cask";content of test.sql would be:) and add below content.We can take the path of the tmp file as a config from the user or use the home folder of the logged in user where program would always have the access.
set colsep ","
set linesize 999910000
set trimspool on
set heading offnewpage none
set pagesize 0
set wrap off
set feedback heading off
spool /tmp/results.csvon
select * from test where name='cask';
spool off
NOTE : Cannot directly use spool in sqlplus prompt since it prints the query as well along with the data in the final output file.
d. Copy the exported file from the remote machine to the folder specified by the user on the current machine.
e. Remove temporary .sql and dump file.
exit
3.execute $oracleHome/bin/sqlplus -s $dbUsername/$dbPassword@$oracleSID @/tmp/test.sql
4.Read the outstream and write into the specified output file on local.Before writing,multiline trailing spaces remover regex will be applied.
5.Since sqlplus spool generates trailing spaces before and after the column separators,sed command will be applied to remove the spaces.
Table of Contents
Table of Contents style circle
Checklist
- User stories documented
- User stories reviewed
- Design documented
- Design reviewed
- Feature merged
- Examples and guides
- Integration tests
- Documentation for feature
- Short video demonstrating the feature