Introduction
A separate database plugin to support Jethro Data features and configurations.
Use case
- Users can choose and install the Jethro Data plugin.
- Users should see the Jethro Data logo on the plugin configuration page for a better experience.
- Users should get relevant information from the tool tip:
  - The tool tip should describe accurately what each field is used for.
- Users should not have to specify any redundant configuration.
- Users should get field level lineage for the source and sink that is being used.
- Reference documentation should be updated to account for the changes.
- The source code for the Jethro Data database plugin should be placed in a repo under the data-integrations org.
User Stories
- Users should be able to install the Jethro Data-specific database plugin from the Hub.
- Users should have each tool tip accurately describe what each field does.
- Users should get field level lineage information for the Jethro Data plugin.
- Users should be able to set up a pipeline without specifying redundant information.
- Users should get an updated reference document for the Jethro Data plugin.
- Users should be able to read all the DB types.
Plugin Type
- Batch Source
- Batch Sink
- Real-time Source
- Real-time Sink
- Action
- Post-Run Action
- Aggregate
- Join
- Spark Model
- Spark Compute
Design Tips
- Reference to the Jethro Data JDBC driver: https://jethro.io/driver-downloads
- Reference to the Jethro Data JDBC driver documentation: http://docs.jethro.io/display/JethroLatest/JDBC+Driver
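As a rough illustration of how the plugin might use the driver, the sketch below builds a JDBC connection string and opens a connection through `java.sql.DriverManager`. The driver class name `com.jethrodata.JethroDriver`, the URL pattern `jdbc:JethroData://host:port/instance`, and the example port are assumptions to verify against the driver documentation linked above. The class and method names are hypothetical and are not part of the driver or of CDAP.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper for illustration; not part of the Jethro driver or of CDAP. */
final class JethroJdbcSketch {
  // Assumed values: confirm both against the Jethro JDBC driver documentation.
  static final String ASSUMED_DRIVER_CLASS = "com.jethrodata.JethroDriver";
  static final String ASSUMED_URL_PREFIX = "jdbc:JethroData://";

  /** Builds a connection string of the assumed form jdbc:JethroData://host:port/instance. */
  static String connectionString(String host, int port, String instance) {
    return ASSUMED_URL_PREFIX + host + ":" + port + "/" + instance;
  }

  /** Opens a connection; this needs the Jethro JDBC driver jar on the classpath. */
  static Connection open(String host, int port, String instance,
                         String user, String password)
      throws SQLException, ClassNotFoundException {
    Class.forName(ASSUMED_DRIVER_CLASS);
    return DriverManager.getConnection(connectionString(host, port, instance), user, password);
  }

  public static void main(String[] args) {
    // Only builds the string, so it runs without the driver installed.
    System.out.println(connectionString("localhost", 9111, "demo"));
  }
}
```

Keeping the string construction separate from `open` lets the URL format be unit tested without a running Jethro instance.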
Design
Jethro Data Overview
Customers use Jethro for interactive BI on Big Data. Jethro is a transparent middle tier that requires no changes to existing apps or data. It is self-driving with no maintenance required.
Jethro is compatible with BI tools such as Tableau, Qlik, and MicroStrategy, and is data source agnostic.
Jethro lets thousands of concurrent business users run complex queries over billions of records at the interactive speed they expect.
Powerful Architecture
Jethro combines two systems to cover the widest range of queries with the highest performance: full indexing and auto cubes. Together they deliver the fastest query performance regardless of query type or repeatability.
Self Driving
Jethro requires no human maintenance or tuning. Indexes, auto cubes and query caches are automatically maintained and kept current by background services.
Source Properties
Section | User Facing Name | Widget Type | Description | Constraints |
---|---|---|---|---|
General | Label | textbox | | |
General | Reference Name | textbox | Required | |
General | Driver Name | textbox | Required | |
General | Host | textbox | Required | |
General | Port | textbox | Required | |
General | Instance | textbox | Required | |
General | Import Query | textarea | | |
General | Bounding Query | textarea | | |
Credentials | Username | textbox | Required | |
Credentials | Password | password | Required | |
Advanced | Split-By Field Name | textbox | | |
Advanced | Number of Splits to Generate | textbox | | |
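The required properties in the Source Properties table could be checked at configuration time along the following lines. This is a minimal sketch in plain Java: the class name, the method name, and the use of user-facing names as map keys are all hypothetical, and an actual plugin would rely on CDAP's own configuration and validation classes instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the plugin configuration, for illustration only. */
final class JethroSourceConfigSketch {
  // Properties marked Required in the Source Properties table, by user-facing name.
  static final List<String> REQUIRED = List.of(
      "Reference Name", "Driver Name", "Host", "Port", "Instance", "Username", "Password");

  /** Returns the required property names that are missing or blank, in table order. */
  static List<String> missingRequired(Map<String, String> properties) {
    List<String> missing = new ArrayList<>();
    for (String name : REQUIRED) {
      String value = properties.get(name);
      if (value == null || value.trim().isEmpty()) {
        missing.add(name);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Example values are made up; several required properties are left out on purpose.
    Map<String, String> properties = new LinkedHashMap<>();
    properties.put("Reference Name", "jethro_source");
    properties.put("Host", "localhost");
    properties.put("Port", "9111");
    System.out.println("Missing required properties: " + missingRequired(properties));
  }
}
```

Reporting every missing property at once, rather than failing on the first, lets a user fix the whole configuration in a single pass.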
Source Data Types Mapping
Jethro Data Types | CDAP Schema Data Types |
---|---|
INTEGER | int |
BIGINT | long |
FLOAT | float |
DOUBLE | double |
STRING | string |
TIMESTAMP | timestamp-micros |
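The mapping above can be expressed as a simple lookup. The sketch below uses plain strings for both sides so that it stays self-contained. A real implementation would more likely map JDBC type codes from the result set metadata to CDAP schema types, and rejecting unlisted types with an exception is an assumption rather than a stated requirement. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical lookup mirroring the Source Data Types Mapping table. */
final class JethroTypeMappingSketch {
  static final Map<String, String> JETHRO_TO_CDAP = new LinkedHashMap<>();

  static {
    JETHRO_TO_CDAP.put("INTEGER", "int");
    JETHRO_TO_CDAP.put("BIGINT", "long");
    JETHRO_TO_CDAP.put("FLOAT", "float");
    JETHRO_TO_CDAP.put("DOUBLE", "double");
    JETHRO_TO_CDAP.put("STRING", "string");
    JETHRO_TO_CDAP.put("TIMESTAMP", "timestamp-micros");
  }

  /** Returns the CDAP schema type name for a Jethro type name (case-insensitive). */
  static String toCdapType(String jethroType) {
    String cdapType = JETHRO_TO_CDAP.get(jethroType.trim().toUpperCase(Locale.ROOT));
    if (cdapType == null) {
      // Assumption: types outside the table are treated as unsupported.
      throw new IllegalArgumentException("Unsupported Jethro data type: " + jethroType);
    }
    return cdapType;
  }

  public static void main(String[] args) {
    for (String jethroType : JETHRO_TO_CDAP.keySet()) {
      System.out.println(jethroType + " -> " + toCdapType(jethroType));
    }
  }
}
```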
Approach
Create a jethro-plugin module in the database-plugins project and reuse the existing database-plugins code where possible. Add Jethro-specific properties to the configuration and add support for Jethro-specific data types. Update the UI widget JSON definitions.
Sample Pipeline
Please attach one or more sample pipeline(s) and associated data.
Releases
Release X.Y.Z