Goal 

This source plugin lets users read and process mainframe files whose record layout is defined by a COBOL copybook. It is intended as a basic first implementation.

...

Input Format implementation : here 

Design

  • Assumptions:
    1. The .cbl file contains the schema as a COBOL data structure.
    2. Both the data file and the .cbl file reside on HDFS.

  • For each AbstractLine read from the data file, if the field's binary or binaryFile flag is true, its data is encoded to Base64 while reading:
    for (ExternalField field : externalRecord.getRecordFields()) {
      AbstractFieldValue fieldValue = line.getFieldValue(field.getName());
      if (fieldValue.isBinary()) {
        // Binary fields are stored as Base64 text. The earlier version decoded the
        // value again right after encoding it, which left the data unencoded.
        value.put(field.getName(), Base64.encodeBase64String(
            fieldValue.toString().getBytes()));
      } else {
        // Non-binary fields are stored as plain strings.
        value.put(field.getName(), fieldValue.toString());
      }
    }
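The Base64 step above can be sketched in isolation. This is a minimal illustration, not the plugin's code: it uses the JDK's `java.util.Base64` rather than the Commons Codec `Base64` class used in the snippet, and the class name `BinaryFieldEncoding`, its method names, and the sample bytes are all hypothetical. It also shows why encoding from raw bytes may be preferable to going through a `String` first, since arbitrary binary data does not always survive a character-set conversion.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class BinaryFieldEncoding {

    // Encode the raw bytes of a binary field as Base64 text.
    static String encodeBinaryField(byte[] rawBytes) {
        return Base64.getEncoder().encodeToString(rawBytes);
    }

    // Recover the original bytes from the Base64 text.
    static byte[] decodeBinaryField(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    // True if the bytes are unchanged after a round trip through a UTF-8 String.
    static boolean survivesStringRoundTrip(byte[] rawBytes) {
        byte[] roundTripped = new String(rawBytes, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(rawBytes, roundTripped);
    }

    public static void main(String[] args) {
        // Sample packed-decimal bytes: digits 1, 2, 3 and a positive sign nibble (0xC).
        byte[] packed = {0x12, 0x3C};
        System.out.println(encodeBinaryField(packed)); // Ejw=

        // 0x98 is not a valid UTF-8 sequence on its own, so the String round trip is lossy,
        // while the Base64 round trip returns the original bytes.
        byte[] other = {(byte) 0x98, 0x7D};
        System.out.println(survivesStringRoundTrip(other)); // false
        System.out.println(Arrays.equals(other,
                decodeBinaryField(encodeBinaryField(other)))); // true
    }
}
```

A downstream consumer that needs the original bytes would call the decode step on the stored string.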

...

"name": "CopyBookReader",
"type": "batchsource",
"properties": {

"schema": "{
  \"type\":\"record\",
  \"name\":\"etlSchemaBody\",
  \"fields\":[
    {
      \"name\":\"DTAR020-KEYCODE-NO\",
      \"type\":\"int\"
    },
    ...
    {
      \"name\":\"DTAR020-QTY-SOLD\",
      \"type\":[\"int\",\"null\"]
    },
    {
      \"name\":\"DTAR020-SALE-PRICE\",
      \"type\":[\"double\",\"null\"]
    }
  ]
}",

"referenceName": "CopyBook",

"copybookContents": "000100* \n
000200* DTAR020 IS THE OUTPUT FROM DTAB020 FROM THE IML \n
000300* CENTRAL REPORTING SYSTEM \n
000400* \n
000500* CREATED BY BRUCE ARTHUR 19/12/90 \n
000600* \n
000700* RECORD LENGTH IS 27. \n
000800* \n
000900 03 DTAR020-KCODE-STORE-KEY. \n
001000 05 DTAR020-KEYCODE-NO PIC X(08). \n
001100 05 DTAR020-STORE-NO PIC S9(03) COMP-3. \n
001200 03 DTAR020-DATE PIC S9(07) COMP-3. \n
001300 03 DTAR020-DEPT-NO PIC S9(03) COMP-3. \n
001400 03 DTAR020-QTY-SOLD PIC S9(9) COMP-3. \n
001500 03 DTAR020-SALE-PRICE PIC S9(9)V99 COMP-3. ",

"binaryFilePath": "file:///home/cdap/cdap/DTAR020_FB.bin",

...
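Most fields in the DTAR020 copybook are COMP-3 (packed decimal), so a short sketch of how such a field is laid out may help when reading the sample configuration. Each byte holds two decimal digits, and the low nibble of the last byte holds the sign (0xD or 0xB for negative). The class `PackedDecimal` and its methods are hypothetical and written only for illustration; they are not part of the plugin or of any COBOL library. The sketch also checks the copybook comment "RECORD LENGTH IS 27" against the PIC clauses.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, for illustration only.
final class PackedDecimal {

    // Bytes used by a COMP-3 field: one nibble per digit plus one sign nibble,
    // rounded up to a whole number of bytes.
    static int storageBytes(int digits) {
        return digits / 2 + 1;
    }

    // Record length implied by the DTAR020 copybook fields.
    static int recordLength() {
        return 8                  // DTAR020-KEYCODE-NO  PIC X(08)
             + storageBytes(3)    // DTAR020-STORE-NO    PIC S9(03) COMP-3
             + storageBytes(7)    // DTAR020-DATE        PIC S9(07) COMP-3
             + storageBytes(3)    // DTAR020-DEPT-NO     PIC S9(03) COMP-3
             + storageBytes(9)    // DTAR020-QTY-SOLD    PIC S9(9) COMP-3
             + storageBytes(11);  // DTAR020-SALE-PRICE  PIC S9(9)V99 COMP-3
    }

    // Decode packed-decimal bytes; scale is the number of digits after the implied point (V).
    static BigDecimal decode(byte[] packed, int scale) {
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < packed.length; i++) {
            int hi = (packed[i] >> 4) & 0x0F;
            int lo = packed[i] & 0x0F;
            boolean last = (i == packed.length - 1);
            if (hi > 9 || (!last && lo > 9)) {
                throw new IllegalArgumentException("Invalid digit nibble at byte " + i);
            }
            digits.append(hi);
            if (!last) {
                digits.append(lo); // in the last byte the low nibble is the sign
            }
        }
        int sign = packed[packed.length - 1] & 0x0F;
        BigInteger unscaled = new BigInteger(digits.toString());
        if (sign == 0x0D || sign == 0x0B) {
            unscaled = unscaled.negate();
        }
        return new BigDecimal(unscaled, scale);
    }

    public static void main(String[] args) {
        System.out.println(recordLength()); // 27

        // A made-up DTAR020-SALE-PRICE value: 6 bytes, 11 digits, 2 decimal places.
        byte[] price = {0x00, 0x00, 0x00, 0x01, 0x23, 0x4C};
        System.out.println(decode(price, 2)); // 12.34
    }
}
```

The computed length of 27 bytes agrees with the comment on copybook line 000700.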