MarkLogic Server is a software solution for managing your digital content in a single database. MarkLogic enables you to build complex applications that interact with large volumes of JSON, XML, SGML, HTML, RDF triples, binary files, and other popular content formats. Its architecture is designed to keep applications scalable and high-performance, delivering query results at search-engine speeds while providing transactional integrity over the underlying database. These plugins allow you to integrate data in MarkLogic with the rest of your data using CDAP.
User Stories
As a pipeline developer, I would like to read data in MarkLogic in batch using CDAP, so that I can integrate it easily with the rest of my data.
As a pipeline developer, I would like to write complex structures (XML, JSON, SGML, HTML, RDF triples, binary data, etc.) to MarkLogic in batch using CDAP, so that I do not have to develop custom code to load my data into MarkLogic and can take advantage of the standardization that CDAP offers.
As a pipeline developer, I would like CDAP to support ELT in MarkLogic, so that I can take advantage of MarkLogic's powerful search and analytics features after loading the data, while still maintaining standardization and lineage in CDAP.
Plugin Type
Batch Source
Batch Sink
Real-time Source
Real-time Sink
Action
Post-Run Action
Aggregate
Join
Spark Model
Spark Compute
Configurables
MarkLogic batch source.
| Category | User Facing Name | Type | Description | Constraints |
| --- | --- | --- | --- | --- |
| Basic | Host | text | The host running the MarkLogic REST Server | Should validate URL |
| Basic | Port | number | The port that the MarkLogic REST Server listens on | |
| Basic | Database | text | Database | |
| Basic | Input Query | text | Query for data search | |
| Credentials | User | text | The user to perform operations as. The user should have appropriate read privileges | |
| Credentials | Password | password | The password for the user | |
| Connection | Authentication Type | radio button | The type of authentication to use - Digest or | |
| Connection | Connection Type | radio button | The type of connection to use - Direct or Gateway | |
| Advanced | Format | select | Type of document (AUTO/JSON/XML/TEXT/BLOB/DELIMITED), default: AUTO | |
| Advanced | Delimiter | text | Delimiter if the format is 'delimited' | |
| Advanced | Bounding Query | text | Query for splits generation | |
| Advanced | Max Splits | number | Maximum number of splits | |
| Advanced | File Name Field | text | Field to store information about the file | |
| Advanced | Payload Field | text | Field to store data from Binary and Text files | |
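
As an illustration of how the batch source options above relate to each other, the following Python sketch models them as a small config object with basic checks. It is not the plugin's implementation (CDAP plugins are written in Java). The class name `MarkLogicSourceConfig`, the field names, and the specific validation rules (non-empty host, port range, delimiter required for the DELIMITED format, positive Max Splits) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Document formats listed for the batch source "Format" option.
FORMATS = {"AUTO", "JSON", "XML", "TEXT", "BLOB", "DELIMITED"}


@dataclass
class MarkLogicSourceConfig:
    """Hypothetical model of the batch source options (names are assumptions)."""
    host: str
    port: int
    database: str
    input_query: str
    user: str
    password: str
    format: str = "AUTO"               # default taken from the options table
    delimiter: Optional[str] = None    # only meaningful for DELIMITED
    max_splits: Optional[int] = None

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        errors: List[str] = []
        if not self.host.strip():
            errors.append("Host must not be empty")
        if not 1 <= self.port <= 65535:
            errors.append("Port must be between 1 and 65535")
        fmt = self.format.upper()
        if fmt not in FORMATS:
            errors.append("Format must be one of " + ", ".join(sorted(FORMATS)))
        # Assumed rule: a delimiter is required when the format is DELIMITED.
        if fmt == "DELIMITED" and not self.delimiter:
            errors.append("Delimiter is required when Format is DELIMITED")
        if self.max_splits is not None and self.max_splits < 1:
            errors.append("Max Splits must be at least 1")
        return errors
```

Collecting every problem in a list, rather than raising on the first one, lets a pipeline UI report all misconfigured fields at once.
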
MarkLogic batch sink.
| Category | User Facing Name | Type | Description | Constraints |
| --- | --- | --- | --- | --- |
| Basic | Host | text | The host running the MarkLogic REST Server | Should validate URL |
| Basic | Port | number | The port that the MarkLogic REST Server listens on | |
| Basic | User | text | The user to perform operations as. The user should have appropriate read privileges/ | |
| Basic | Password | password | The password for the user | |
| Basic | Authentication Type | radio button | The type of authentication to use - Digest or | |
| Basic | Connection Type | radio button | The type of connection to use - Direct or Gateway | |
| Basic | Path | text | Path to document | |
| Advanced | Batch size | number | The batch size for writing to MarkLogic | |
| Advanced | Max retries | number | The maximum number of retries for requests to MarkLogic | |
| Advanced | Format | select | Type of document, default: JSON | |
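
The Batch size and Max retries options suggest that the sink groups records into batches and retries failed requests. A minimal Python sketch of that pattern follows. The helper names `batches` and `write_with_retries`, the `send` callable, the backoff parameter, and the reading of Max retries as the number of additional attempts after the first failure are all assumptions, not the plugin's actual behavior.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group records into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def write_with_retries(
    batch: List[T],
    send: Callable[[List[T]], None],
    max_retries: int,
    backoff_seconds: float = 0.0,
) -> int:
    """Call send(batch), retrying on failure; return the number of retries used.

    `send` is a hypothetical stand-in for whatever issues the write request.
    """
    attempt = 0
    while True:
        try:
            send(batch)
            return attempt
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            time.sleep(backoff_seconds * attempt)  # simple linear backoff
```

A caller would iterate over `batches(records, batch_size)` and pass each batch to `write_with_retries`, so that one failing request does not force the whole run to restart.
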
MarkLogic query executor action.
| Category | User Facing Name | Type | Description | Constraints |
| --- | --- | --- | --- | --- |
| Basic | Host | text | The host running the MarkLogic REST Server | Should validate URL |
| Basic | Port | number | The port that the MarkLogic REST Server listens on | |
| Basic | User | text | The user to perform operations as. The user should have appropriate read privileges/ | |