Parse as CSV directive

PARSE-AS-CSV is a directive that parses a column of an input record as comma-separated values.

Syntax

parse-as-csv :col ['delimiter'] [<header=true/false>]

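The :col argument names the column in the record to parse as CSV. The optional 'delimiter' argument sets the field separator, and the optional header flag controls whether the first record is treated as a header.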
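
As a rough illustration of what parsing a column with a delimiter means, the following Python sketch splits one column of a record into fields using the standard csv module. It is not the directive's implementation: the function name split_column and the use of a plain dict as a record are assumptions made for this example, and the sketch does not model how the directive names its output columns.

```python
import csv


def split_column(record, col, delimiter=","):
    """Hypothetical helper: split the text in record[col] into a list of fields.

    The record is modeled as a plain dict; this only illustrates the idea of
    delimiter-based parsing, not the directive's actual behavior.
    """
    line = record[col]
    # csv.reader accepts any iterable of lines; wrap the single line in a list.
    return next(csv.reader([line], delimiter=delimiter))


# Empty values between consecutive delimiters are kept as empty strings.
fields = split_column({"body": "a,b,,d"}, "body")
print(fields)  # ['a', 'b', '', 'd']

# A different delimiter can be passed, mirroring the optional 'delimiter' argument.
print(split_column({"body": "a|b"}, "body", delimiter="|"))  # ['a', 'b']
```

Using a CSV-aware parser rather than a plain string split means quoted fields that contain the delimiter stay intact.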

This directive supports reading the first record as a header. However, avoid this option in most situations, because the header is not guaranteed to be the first record processed. For example, this can happen when an input file is split into multiple pieces that are processed separately. Instead, make sure the pipeline source is configured to use the correct schema. If your input files contain headers, configure the source to skip the header so that it is not read as data.
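
The header caveat can be made concrete with a small Python sketch. It assumes a hypothetical four-line file that is split into two pieces, each parsed independently with "first line is the header" logic; the data, the split point, and the function names are all invented for this illustration and do not describe how any particular pipeline divides its input.

```python
import csv

# Hypothetical input file: one header line followed by three data lines.
lines = ["id,name", "1,alice", "2,bob", "3,carol"]


def parse_with_first_line_as_header(chunk):
    """Naively treat the first line of whatever chunk arrives as the header."""
    rows = list(csv.reader(chunk))
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def parse_with_schema(chunk, schema):
    """Apply a fixed schema; assumes the header was already skipped upstream."""
    return [dict(zip(schema, row)) for row in csv.reader(chunk)]


# Assumed split of the file into two pieces that are processed separately.
first_piece, second_piece = lines[:2], lines[2:]

print(parse_with_first_line_as_header(first_piece))
# [{'id': '1', 'name': 'alice'}]
print(parse_with_first_line_as_header(second_piece))
# [{'2': '3', 'bob': 'carol'}]  <- a data line was mistaken for the header

# With a fixed schema and the header line skipped, both pieces parse consistently.
schema = ["id", "name"]
print(parse_with_schema(first_piece[1:], schema))
# [{'id': '1', 'name': 'alice'}]
print(parse_with_schema(second_piece, schema))
# [{'id': '2', 'name': 'bob'}, {'id': '3', 'name': 'carol'}]
```

The second piece shows the failure mode: its first data line is consumed as column names, which is why a schema set on the source is the safer choice.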

Examples

Consider a single line from a consumer complaint CSV file. Each line of the CSV file is added as a record:

{ "body": "07/29/2013,Consumer Loan,Vehicle Loan,Managing the loan or lease,,,,Wells Fargo & Company,VA,24540,,N/A,Phone,07/30/2013,Closed with explanation,Yes,No,468882" }

Applying this directive:

parse-as-csv :body ','

results in this record:

Using this record, with a header as the first record, as an example:

Applying this directive:

results in this record:


Created in 2020 by Google Inc.