Note: Datasets and the Dynamic Multiple Fileset Sink are deprecated and will be removed in CDAP 7.0.0.

This plugin is normally used with the Multiple Database Table batch source to write records from multiple databases into multiple filesets in text format. Each fileset it writes to contains a single ‘ingesttime’ partition, which holds the logical start time of the pipeline run. The plugin expects the filesets it needs to write to to be set as pipeline arguments, where the key is ‘multisink.[fileset]’ and the value is the fileset schema. Normally, the Multiple Database Table source sets those pipeline arguments, but you can also set them manually or with an Action plugin in your pipeline, such as an HTTP Argument Setter.

The sink expects each record to contain a special Split Field, which determines the fileset each record is written to. For example, suppose the split field is ‘tablename’. A record whose ‘tablename’ field is set to ‘activity’ is written to the ‘activity’ fileset.
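The routing rule described above can be illustrated with a small Python sketch. This is not the plugin's implementation: the function names `fileset_argument_key` and `route_records` and the sample records are hypothetical and exist only for illustration. The sketch assumes each record is a plain dictionary and shows how the ‘multisink.[fileset]’ argument key and the split field relate to the target fileset.

```python
from collections import defaultdict

# Key prefix for the pipeline arguments described in the text.
ARGUMENT_PREFIX = "multisink."


def fileset_argument_key(fileset):
    """Build the pipeline-argument key for a fileset name (hypothetical helper)."""
    return ARGUMENT_PREFIX + fileset


def route_records(records, split_field="tablename"):
    """Group records by the value of their split field.

    The value of the split field names the fileset the record belongs to.
    """
    routed = defaultdict(list)
    for record in records:
        routed[record[split_field]].append(record)
    return dict(routed)


# Illustrative records only; field names other than 'tablename' are made up.
records = [
    {"tablename": "activity", "id": 1, "action": "login"},
    {"tablename": "users", "id": 7, "name": "alice"},
    {"tablename": "activity", "id": 2, "action": "logout"},
]

routed = route_records(records)
for fileset, rows in routed.items():
    print(fileset_argument_key(fileset), len(rows))
```

Both ‘activity’ records end up grouped under the same fileset name, and the corresponding argument key is formed by prefixing that name with ‘multisink.’.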

Configuration

| Property | Macro Enabled? | Description |
| Split Field | No | Optional. The name of the field that will be used to determine which fileset to write to. Default is ‘tablename’. |
| Field Delimiter | No | Optional. The delimiter used to separate record fields. Defaults to the tab character. |
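As a rough illustration of what the Field Delimiter controls, the following Python sketch joins a record's field values into one line of text. The helper `format_record` and the sample record are hypothetical, and the sketch does no quoting or escaping; how the plugin itself treats a delimiter that appears inside a field value is not covered here.

```python
def format_record(record, field_names, delimiter="\t"):
    """Join the record's values, in the given field order, with the delimiter.

    The tab default mirrors the documented default of the Field Delimiter property.
    """
    return delimiter.join(str(record[name]) for name in field_names)


# Illustrative record only.
record = {"id": 1, "action": "login"}

tab_line = format_record(record, ["id", "action"])
comma_line = format_record(record, ["id", "action"], delimiter=",")
```

With the default, the two values are separated by a tab character; passing a comma yields `1,login`.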

Example

This example uses a comma to delimit record fields:

...