Teradata Batch Source

The Teradata Batch source is available in the Hub.

Plugin version: 1.7.0

Reads from a Teradata database using a configurable SQL query. Outputs one record for each row returned by the query.

Use this source when you need to read from a Teradata database. For example, you may want to create daily snapshots of a database table by using this source and writing to a table in BigQuery.

Configuration

| Property | Macro Enabled? | Version Introduced | Description |
| --- | --- | --- | --- |
| Reference Name | No | | Required. Name used to uniquely identify this source for lineage, annotating metadata, etc. |
| Driver Name | No | | Required. Name of the JDBC driver to use. Default is teradata. |
| Host | Yes | | Required. Host that Teradata is running on. Default is localhost. |
| Port | Yes | | Required. Port that Teradata is running on. Default is 1025. |
| Database | Yes | | Required. Teradata database name. |
| Import Query | Yes | | Required. The SELECT query to use to import data from the specified table. You can specify an arbitrary number of columns to import, or import all columns using *. The query should contain the '$CONDITIONS' string. For example, 'SELECT * FROM table WHERE $CONDITIONS'. The '$CONDITIONS' string is replaced with the limits on the Split-By Field Name field, which are derived from the bounding query. The '$CONDITIONS' string is not required if Number of Splits to Generate is set to 1. |
| Bounding Query | Yes | | Optional. The bounding query should return the minimum and maximum values of the 'splitBy' field. For example, 'SELECT MIN(id),MAX(id) FROM table'. Not required if Number of Splits to Generate is set to 1. |
| Split-By Field Name | Yes | | Optional. Name of the field used to generate splits. Not required if Number of Splits to Generate is set to 1. |
| Number of Splits to Generate | Yes | | Optional. Number of splits to generate. Default is 1. |
| Fetch Size | Yes | 6.6.0/1.7.0 | Optional. The number of rows to fetch at a time per split. A larger Fetch Size can result in a faster import, with the trade-off of higher memory usage. Default is 1000. |
| Username | Yes | | Optional. User identity for connecting to the specified database. |
| Password | Yes | | Optional. Password to use to connect to the specified database. |
| Connection Arguments | Yes | | Optional. A list of arbitrary string key/value pairs. These are passed to the JDBC driver as connection arguments, for drivers that need additional configuration. |
| Schema | | | Required. The schema of records output by the source. This is used in place of whatever schema comes back from the query. However, it must match the schema that comes back from the query, except that it can mark fields as nullable and can contain a subset of the fields. |
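The interaction between Import Query, Bounding Query, Split-By Field Name, and Number of Splits to Generate can be sketched as follows. This is only an illustration of the general idea (an integer range cut into contiguous sub-ranges), not the plugin's actual splitting code. The function names are hypothetical, and substituting '1 = 1' when there is a single split is an assumption of this sketch.

```python
def split_conditions(split_by, lo, hi, num_splits):
    """Return one WHERE-clause fragment per split, together covering [lo, hi].

    lo and hi stand for the MIN and MAX values returned by the bounding query.
    """
    if num_splits <= 1 or hi <= lo:
        # Assumption of this sketch: a single split reads everything.
        return ["1 = 1"]
    total = hi - lo + 1
    num_splits = min(num_splits, total)
    # Boundaries spaced as evenly as integer arithmetic allows.
    bounds = [lo + (total * i) // num_splits for i in range(num_splits + 1)]
    return [
        f"{split_by} >= {bounds[i]} AND {split_by} < {bounds[i + 1]}"
        for i in range(num_splits)
    ]


def expand_import_query(import_query, split_by, lo, hi, num_splits):
    """Substitute each split's condition for the $CONDITIONS placeholder."""
    return [
        import_query.replace("$CONDITIONS", condition)
        for condition in split_conditions(split_by, lo, hi, num_splits)
    ]


queries = expand_import_query(
    "SELECT * FROM table WHERE $CONDITIONS", "id", 1, 100, 4
)
for query in queries:
    print(query)
```

With a bounding query result of MIN(id) = 1 and MAX(id) = 100 and four splits, each generated query reads a disjoint quarter of the id range, so the splits can be fetched in parallel without overlap.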

Example

Suppose you want to read data from a Teradata database named "prod" that is running on "localhost", port 1025, as the "dbc" user with the "dbc" password. Ensure that the driver for Teradata is installed. You can also provide the name of a specific driver; otherwise "teradata" is used. Configure the plugin with:

| Property | Value |
| --- | --- |
| Reference Name | src1 |
| Driver Name | teradata |
| Host | localhost |
| Port | 1025 |
| Database | prod |
| Import Query | select id, name, email, phone from users |
| Number of Splits to Generate | 1 |
| Username | dbc |
| Password | dbc |
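When a pipeline is defined as JSON rather than through the UI, the same settings would appear as string-valued properties of a source stage. The sketch below builds such a stage in Python. The property keys (referenceName, jdbcPluginName, importQuery, numSplits, user, and so on) and the plugin name and type are assumptions, since this page lists only the display names; verify them against the plugin's own specification before use.

```python
import json

# Hypothetical stage definition; every key name below is an assumption.
stage = {
    "name": "Teradata",
    "plugin": {
        "name": "Teradata",        # assumed plugin name
        "type": "batchsource",     # assumed plugin type
        "properties": {
            "referenceName": "src1",
            "jdbcPluginName": "teradata",
            "host": "localhost",
            "port": "1025",
            "database": "prod",
            "importQuery": "select id, name, email, phone from users",
            "numSplits": "1",
            "user": "dbc",
            "password": "dbc",
        },
    },
}

print(json.dumps(stage, indent=2))
```

Because Number of Splits to Generate is 1, the import query needs no '$CONDITIONS' placeholder, and Bounding Query and Split-By Field Name are omitted.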

For example, if the ‘id’ column is a primary key of type int and the other columns are non-nullable varchars, output records will have this schema:

| Field | Type |
| --- | --- |
| id | int |
| name | string |
| email | string |
| phone | string |
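The schema above can also be written as a JSON record definition, with a nullable field expressed as a union with "null". The following is a minimal sketch that assumes the Avro-style JSON layout used for CDAP schemas. The helper record_schema and the record name "output" are hypothetical.

```python
import json


def record_schema(name, fields, nullable=()):
    """Build a record schema; fields listed in `nullable` become [type, "null"]."""
    def field_type(field_name, type_name):
        return [type_name, "null"] if field_name in nullable else type_name

    return {
        "type": "record",
        "name": name,
        "fields": [
            {"name": field_name, "type": field_type(field_name, type_name)}
            for field_name, type_name in fields
        ],
    }


columns = [("id", "int"), ("name", "string"), ("email", "string"), ("phone", "string")]

# Schema matching the example: no nullable fields.
schema = record_schema("output", columns)

# Variant that marks 'phone' as nullable, which the Schema property permits.
relaxed = record_schema("output", columns, nullable={"phone"})

print(json.dumps(schema, indent=2))
```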

Data Types Mapping

Teradata-specific data types are mapped to string. They can have multiple input formats and one "canonical" output form. To determine the proper formats, see the Teradata data types documentation.

| Teradata Data Type | CDAP Schema Data Type |
| --- | --- |
| BYTEINT | INT |
| SMALLINT | INT |
| INTEGER | INT |
| BIGINT | LONG |
| DECIMAL/NUMERIC | DECIMAL |
| FLOAT/REAL/DOUBLE PRECISION | DOUBLE |
| NUMBER | DECIMAL |
| BYTE | BYTES |
| VARBYTE | BYTES |
| BLOB | BYTES |
| CHAR | STRING |
| VARCHAR | STRING |
| CLOB | STRING |
| DATE | DATE |
| TIME | TIME_MICROS |
| TIMESTAMP | TIMESTAMP_MICROS |
| TIME WITH TIME ZONE | TIME_MICROS |
| TIMESTAMP WITH TIME ZONE | TIMESTAMP_MICROS |
| INTERVAL YEAR | STRING |
| INTERVAL YEAR TO MONTH | STRING |
| INTERVAL MONTH | STRING |
| INTERVAL DAY | STRING |
| INTERVAL DAY TO HOUR | STRING |
| INTERVAL DAY TO MINUTE | STRING |
| INTERVAL DAY TO SECOND | STRING |
| INTERVAL HOUR | STRING |
| INTERVAL HOUR TO MINUTE | STRING |
| INTERVAL HOUR TO SECOND | STRING |
| INTERVAL MINUTE | STRING |
| INTERVAL MINUTE TO SECOND | STRING |
| INTERVAL SECOND | STRING |
| ST_Geometry | STRING |
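For planning a downstream schema, the mapping above can be encoded as a simple lookup. The sketch below is a convenience for readers, not part of the plugin. The helper cdap_type is hypothetical, and it assumes that column types arrive as text such as "DECIMAL(10,2)", with precision or length arguments that can be ignored for the lookup.

```python
import re

# Non-interval rows of the mapping table, keyed by upper-cased type name.
TERADATA_TO_CDAP = {
    "BYTEINT": "INT", "SMALLINT": "INT", "INTEGER": "INT", "BIGINT": "LONG",
    "DECIMAL": "DECIMAL", "NUMERIC": "DECIMAL", "NUMBER": "DECIMAL",
    "FLOAT": "DOUBLE", "REAL": "DOUBLE", "DOUBLE PRECISION": "DOUBLE",
    "BYTE": "BYTES", "VARBYTE": "BYTES", "BLOB": "BYTES",
    "CHAR": "STRING", "VARCHAR": "STRING", "CLOB": "STRING",
    "DATE": "DATE",
    "TIME": "TIME_MICROS", "TIME WITH TIME ZONE": "TIME_MICROS",
    "TIMESTAMP": "TIMESTAMP_MICROS",
    "TIMESTAMP WITH TIME ZONE": "TIMESTAMP_MICROS",
    "ST_GEOMETRY": "STRING",
}


def cdap_type(teradata_type):
    """Look up the CDAP schema type for a Teradata column type."""
    # Drop arguments such as "(10,2)", then normalise case and spacing.
    base = re.sub(r"\(.*?\)", "", teradata_type)
    base = " ".join(base.upper().split())
    # Every INTERVAL variant in the table maps to STRING.
    if base.startswith("INTERVAL "):
        return "STRING"
    return TERADATA_TO_CDAP[base]  # raises KeyError for types not in the table


print(cdap_type("DECIMAL(10,2)"))
print(cdap_type("TIMESTAMP(6) WITH TIME ZONE"))
```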

Created in 2020 by Google Inc.