CDAP Table Dataset Batch Source (Deprecated)
Note: Datasets and the CDAP Table Dataset Batch Source are deprecated and will be removed in CDAP 7.0.0.
Reads the entire contents of a CDAP Table. Outputs one record for each row in the Table. The Table must conform to a given schema.
Use this source whenever you need to read the contents of a Table in a batch pipeline. For example, you may want to periodically dump the contents of a CDAP Table to a relational database.
Configuration
Property | Macro Enabled? | Description |
---|---|---|
Name | Yes | Required. Table name. If the table does not already exist, it will be created. |
Row Field | Yes | Optional. Name of the record field whose value is read from the row key instead of from a row column. The field must be present in the schema and must not be nullable. |
Output Schema | Yes | Required. Schema of records read from the table. Row columns map to record fields. For example, if the schema contains a field named 'user' of type string, the value of that field will be taken from the value stored in the 'user' column. Only simple types are allowed (boolean, int, long, float, double, bytes, string). |
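The constraints in the table above can be checked before a pipeline is deployed. The following is a minimal sketch in Python, not the plugin's own validation logic: `check_schema` is a hypothetical helper, and it assumes that a nullable field is written as a union containing "null" in the schema JSON.

```python
import json

# Simple types listed in the Output Schema description.
SIMPLE_TYPES = {"boolean", "int", "long", "float", "double", "bytes", "string"}


def check_schema(schema_json, row_field=None):
    """Return a list of problems found in a record schema (hypothetical helper)."""
    schema = json.loads(schema_json)
    problems = []
    nullable_by_name = {}
    for field in schema.get("fields", []):
        name, ftype = field["name"], field["type"]
        # Assumption: a nullable field is a union (JSON list) that contains "null".
        is_union = isinstance(ftype, list)
        nullable_by_name[name] = is_union and "null" in ftype
        members = [t for t in ftype if t != "null"] if is_union else [ftype]
        for member in members:
            # Complex types appear as JSON objects, so check for str first.
            if not isinstance(member, str) or member not in SIMPLE_TYPES:
                problems.append(f"field '{name}' has non-simple type {member!r}")
    if row_field is not None:
        if row_field not in nullable_by_name:
            problems.append(f"row field '{row_field}' is not in the schema")
        elif nullable_by_name[row_field]:
            problems.append(f"row field '{row_field}' must not be nullable")
    return problems
```

An empty list means the schema satisfies the documented constraints. The plugin itself may report such errors differently.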
Example
This example reads from a Table named ‘users’:
{
  "name": "Table",
  "type": "batchsource",
  "properties": {
    "name": "users",
    "schema": "{
      \"type\":\"record\",
      \"name\":\"user\",
      \"fields\":[
        {\"name\":\"id\",\"type\":\"long\"},
        {\"name\":\"name\",\"type\":\"string\"},
        {\"name\":\"birthyear\",\"type\":\"int\"}
      ]
    }",
    "schema.row.field": "id"
  }
}
It outputs records with this schema:
field name | type |
---|---|
id | long |
name | string |
birthyear | int |
The ‘id’ field will be read from the row key of the table. The ‘name’ field will be read from the ‘name’ column in the table. The ‘birthyear’ field will be read from the ‘birthyear’ column in the table. Any other columns in the Table will be ignored by the source.
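The mapping described above can be sketched in a few lines of Python. This is an illustration of the documented behavior, not the plugin's implementation: `row_to_record` is a hypothetical function, the sample row is made up, and returning `None` for a column that is absent from the row is an assumption.

```python
def row_to_record(row_key, columns, field_names, row_field=None):
    """Build one output record from a table row (illustrative only)."""
    record = {}
    for name in field_names:
        if name == row_field:
            # The Row Field takes its value from the row key.
            record[name] = row_key
        else:
            # Every other field is read from the column of the same name.
            # Assumption: a column missing from the row yields None.
            record[name] = columns.get(name)
    return record


# Made-up row: the 'email' column is not in the schema, so it is ignored.
record = row_to_record(
    row_key=1,
    columns={"name": "alice", "birthyear": 1990, "email": "a@example.com"},
    field_names=["id", "name", "birthyear"],
    row_field="id",
)
print(record)  # {'id': 1, 'name': 'alice', 'birthyear': 1990}
```

Only fields named in the schema appear in the record, which mirrors how the source ignores columns that the schema does not mention.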
Created in 2020 by Google Inc.