Goal
This source plugin allows users to read and process mainframe files whose layout is defined by a COBOL copybook. It is intended as a basic first implementation.
...
Input Format implementation : here
Design
- If the "AbstractFieldValue" (JRecord) type is binary, the data will be encoded to Base64 format:
Integer.parseInt(new String(Base64.decodeBase64(Base64.encodeBase64(value.toString().getBytes()))));
or
Base64.decodeInteger(Base64.encodeInteger(value.asBigInteger()));
Which one is used depends on the field data type (int or BigInteger).
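The encode-then-decode round trip described above can be sketched with the JDK alone. This is a minimal illustration, not the plugin's code: it uses java.util.Base64 as a stand-in for the Base64 class in the expressions above (presumably Apache Commons Codec), and the class and method names are hypothetical. It also shows that the decoded bytes have to be turned back into text with new String(...), since calling toString() on a byte array does not return its contents.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class; java.util.Base64 stands in for the codec used by the plugin.
public class Base64RoundTrip {

    // Round-trips the textual form of an int field through Base64 and parses it back.
    static int roundTripInt(String fieldText) {
        byte[] encoded = Base64.getEncoder().encode(fieldText.getBytes(StandardCharsets.UTF_8));
        byte[] decoded = Base64.getDecoder().decode(encoded);
        // Rebuild the text from the bytes; decoded.toString() would not give the digits back.
        return Integer.parseInt(new String(decoded, StandardCharsets.UTF_8));
    }

    // Round-trips a BigInteger through Base64 using its two's-complement bytes.
    static BigInteger roundTripBigInteger(BigInteger value) {
        byte[] encoded = Base64.getEncoder().encode(value.toByteArray());
        return new BigInteger(Base64.getDecoder().decode(encoded));
    }

    public static void main(String[] args) {
        System.out.println(roundTripInt("1234"));
        System.out.println(roundTripBigInteger(new BigInteger("-98765432109876543210")));
    }
}
```

Both calls return the value they were given, which is the property the design relies on.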
JRecord AbstractFieldValue type to Java data type
| JRecord AbstractFieldValue type | Java data type | Description | Comments |
|---|---|---|---|
| char, char right justified, char null terminated, char null padded | java.lang.String | | |
| num left justified, num right justified, num zero padded | int | | |
| binary int, binary int positive, positive binary int fields | int | Decode it using Base64 and then retrieve it: Integer.parseInt(new String(Base64.decodeBase64(Base64.encodeBase64(fieldValue.toString().getBytes())))) | Base64.decodeBase64() accepts either binary or String data, so the value is first encoded and then decoded. Decoding it directly results in improper values. |
| decimal, Mainframe Packed Decimal, Mainframe Zoned Numeric | java.math.BigDecimal | Since CDAP Schema.Type does not have a BigDecimal data type, everything is converted to DOUBLE. | |
| Binary Integer Big Endian (Mainframe, AIX etc) | java.math.BigDecimal | Decode it using Base64 and then retrieve it: Base64.decodeInteger(Base64.encodeInteger(fieldValue.asBigInteger())) | Base64.decodeBase64() accepts either binary or String data, so the value is first encoded and then decoded. Decoding it directly resulted in improper values. Since CDAP Schema.Type does not have BigInteger, this is converted to LONG. |
| Boolean / (Y/N) | java.lang.Boolean | | |
| Default | java.lang.String | | |
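The narrowing conversions in the table (BigDecimal to DOUBLE, big integer values to LONG, everything else to String) can be sketched as a single dispatch method. This is only an illustration under stated assumptions: the FieldKind enum and the FieldValueMapper class are hypothetical names that mirror the table rows, the input is assumed to be already extracted from JRecord, and the use of longValueExact() to fail on overflow is a choice made here, since the table does not say how out-of-range values are handled.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical mapper; it does not use JRecord or CDAP classes.
public class FieldValueMapper {

    // Hypothetical categories, one per row of the mapping table.
    enum FieldKind { CHAR, NUMERIC, BINARY_INT, DECIMAL, BIG_ENDIAN_BINARY_INT, BOOLEAN, OTHER }

    // Converts an extracted field value into the Java type the output record would hold.
    static Object toSchemaValue(FieldKind kind, Object value) {
        String text = value.toString().trim();
        switch (kind) {
            case NUMERIC:
            case BINARY_INT:
                return Integer.parseInt(text);
            case DECIMAL:
                // No BigDecimal schema type, so narrow to double (precision may be lost).
                return new BigDecimal(text).doubleValue();
            case BIG_ENDIAN_BINARY_INT:
                // No BigInteger schema type, so narrow to long; throws if the value does not fit.
                return new BigInteger(text).longValueExact();
            case BOOLEAN:
                // Accepts the Y/N form as well as "true"/"false".
                return text.equalsIgnoreCase("Y") || text.equalsIgnoreCase("true");
            case CHAR:
            case OTHER:
            default:
                // Strings are returned untrimmed so padding is preserved.
                return value.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(toSchemaValue(FieldKind.DECIMAL, "123.45"));
        System.out.println(toSchemaValue(FieldKind.BIG_ENDIAN_BINARY_INT, "9000000000"));
        System.out.println(toSchemaValue(FieldKind.BOOLEAN, "Y"));
    }
}
```

Converting to double is lossy for values with many significant digits, so a pipeline that needs exact decimals may prefer to keep such fields as strings.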
Examples
Properties :
referenceName : This will be used to uniquely identify this source for lineage, annotating metadata, etc.
copybookContents : Contents of the COBOL copybook file which will contain the data structure
binaryFilePath : Complete path of the .bin file to be read. This will be a fixed-length binary format file that matches the copybook.
drop : Comma-separated list of fields to drop. For example: 'field1,field2,field3'.
maxSplitSize : Maximum split size for each mapper in the MapReduce job. Defaults to 128MB.
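As an illustration of how the drop property could be applied, the sketch below parses the comma-separated list and filters a list of field names. The DropFields class and its methods are hypothetical. Trimming whitespace around names and matching names case-sensitively are assumptions, since the property description does not specify either.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper for the `drop` property; not taken from the plugin source.
public class DropFields {

    // Parses a comma-separated `drop` value into a set of trimmed, non-empty field names.
    static Set<String> parseDrop(String drop) {
        if (drop == null || drop.trim().isEmpty()) {
            return new LinkedHashSet<>();
        }
        return Arrays.stream(drop.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // Returns the field names that remain after removing the dropped ones, in their original order.
    static List<String> keptFields(List<String> allFields, Set<String> dropped) {
        return allFields.stream()
                .filter(name -> !dropped.contains(name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> dropped = parseDrop("field1, field2,field3");
        List<String> all = Arrays.asList("field1", "field2", "field3", "field4");
        System.out.println(keptFields(all, dropped)); // prints [field4]
    }
}
```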
...