Parse XML to JSON directive
The PARSE-XML-TO-JSON directive parses an XML document into a JSON structure. The directive operates on an input column of type string. Application of this directive transforms the XML into a JSON document, simplifying further parsing using the PARSE-AS-JSON directive.
Syntax
parse-xml-to-json :column [depth] :keepStrings [boolean]
column
is the name of the column in the record that is an XML document.
depth
indicates the depth at which parsing of the XML document should stop.
keepStrings
is an OPTIONAL boolean value. If true, values are not coerced into boolean or numeric values and are instead left as strings. The default value is false.
Note: keepStrings config was introduced in CDAP 6.10.1.
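The effect of keepStrings can be illustrated with a small Python sketch. This is not the directive's implementation: the function name coerce and the exact coercion rules (case-insensitive true/false, then integer, then float) are assumptions made only to show the difference between the two modes.

```python
def coerce(text, keep_strings=False):
    """Illustrative only: mimic value coercion with and without keepStrings."""
    if keep_strings:
        # keepStrings=true: leave every value exactly as it appeared in the XML.
        return text
    lowered = text.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    # Try the narrower numeric type first, then fall back to float.
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    # Anything that is neither boolean nor numeric stays a string.
    return text


print(coerce("00123"))                     # coerced to a number
print(coerce("00123", keep_strings=True))  # left as the original string
```

Under these assumed rules, a value such as a postal code with leading zeros loses its formatting when coerced, which is the kind of case where leaving values as strings is useful.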
Usage Notes
The PARSE-XML-TO-JSON directive efficiently parses an XML document and presents it as a JSON object for further transformation.
The XML document contains elements, attributes, and content text. A sequence of similar elements is turned into a JSON array, which can then be further parsed using the PARSE-AS-JSON directive.
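The mapping described above can be sketched in Python using only the standard library. This is a simplified illustration, not the directive's code: the helper names element_to_value and xml_to_json are hypothetical, and placing mixed content text under a "content" key is an assumption.

```python
import json
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Convert an element into a dict, or into a plain string for leaf elements."""
    result = dict(elem.attrib)  # attributes become ordinary keys
    for child in elem:
        value = element_to_value(child)
        if child.tag in result:
            existing = result[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                # Second occurrence of the same tag: promote to a JSON array.
                result[child.tag] = [existing, value]
        else:
            result[child.tag] = value
    text = (elem.text or "").strip()
    if not result:
        return text  # leaf element with no attributes: just its text
    if text:
        result["content"] = text  # assumed key for text alongside children
    return result


def xml_to_json(xml_text):
    """Parse an XML string and serialize the converted structure as JSON."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_value(root)})


doc = '<order id="7"><item>pen</item><item>ink</item><note>gift</note></order>'
print(xml_to_json(doc))
# {"order": {"id": "7", "item": ["pen", "ink"], "note": "gift"}}
```

The two item elements collapse into a single array, while the attribute and the single note element become plain members of the enclosing object.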
During parsing, comments, prologs, DTDs, and <[[ ]]> notations are ignored.
Created in 2020 by Google Inc.