Introduction

MarkLogic Server is a database for managing all of your digital content in one place. MarkLogic enables you to build complex applications that interact with large volumes of JSON, XML, SGML, HTML, RDF triples, binary files, and other popular content formats. Its architecture is designed to keep applications scalable and high-performance, delivering query results at search-engine speeds while providing transactional integrity over the underlying database. These plugins will allow you to integrate data in MarkLogic with the rest of your data using CDAP.

User Stories

  • As a pipeline developer, I would like to read data in MarkLogic in batch using CDAP, so that I can integrate it easily with the rest of my data.
  • As a pipeline developer, I would like to write complex structures (XML, JSON, SGML, HTML, RDF triples, binary data, etc.) to MarkLogic in batch using CDAP, so that I do not have to develop custom code to load my data into MarkLogic and can take advantage of the standardization that CDAP offers.
  • As a pipeline developer, I would like CDAP to support ELT in MarkLogic, so that I can take advantage of MarkLogic's powerful search and analytics features after loading the data, while still maintaining standardization and lineage in CDAP.

Plugin Type

  •  Batch Source
  •  Batch Sink 
  •  Real-time Source
  •  Real-time Sink
  •  Action
  •  Post-Run Action
  •  Aggregate
  •  Join
  •  Spark Model
  •  Spark Compute

Configurables

This section defines properties that are configurable for this plugin. 

MarkLogic batch source.

| Category | User Facing Name | Type | Description | Constraints |
|---|---|---|---|---|
| Basic | Host | text | The host running the MarkLogic REST Server | Should validate URL |
| Basic | Port | number | The port that the MarkLogic REST Server listens on | |
| Basic | User | text | The user to perform operations as. The user should have appropriate read privileges/ | |
| Basic | Password | password | The password for the user | |
| Basic | Authentication Type | radio button | The type of authentication to use - Digest or | |
| Basic | Connection Type | radio button | The type of connection to use - Direct or Gateway | |
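The Basic properties above could be checked before a pipeline runs. The following is a minimal sketch of such a check, not the plugin's actual configuration class: the class name `MarkLogicConnectionConfig`, its field names, and the specific rules (port range, no whitespace in the host) are assumptions made for illustration. Only the connection types Direct and Gateway come from the table; the authentication options are left unchecked beyond "must be set" because the source lists them incompletely.

```python
from dataclasses import dataclass
from typing import List

# Connection types named in the properties table.
CONNECTION_TYPES = ("Direct", "Gateway")


@dataclass
class MarkLogicConnectionConfig:
    """Hypothetical holder for the Basic properties (illustration only)."""
    host: str
    port: int
    user: str
    password: str
    authentication_type: str
    connection_type: str

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means no problem was found."""
        errors = []
        # Assumed rule: a host is non-empty and contains no whitespace.
        if not self.host or any(ch.isspace() for ch in self.host):
            errors.append("Host must be a non-empty value without whitespace")
        # Assumed rule: a valid TCP port number.
        if not 1 <= self.port <= 65535:
            errors.append("Port must be between 1 and 65535")
        if not self.user:
            errors.append("User must not be empty")
        # The allowed authentication values are not fully listed in the source,
        # so this only checks that one was selected.
        if not self.authentication_type:
            errors.append("Authentication Type must be selected")
        if self.connection_type not in CONNECTION_TYPES:
            errors.append("Connection Type must be Direct or Gateway")
        return errors
```

A caller would typically run `validate()` at deployment time and fail fast if the returned list is not empty, so that a misconfigured stage is reported before any data is read.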

MarkLogic batch sink.

| Category | User Facing Name | Type | Description | Constraints |
|---|---|---|---|---|
| Basic | Host | text | The host running the MarkLogic REST Server | Should validate URL |
| Basic | Port | number | The port that the MarkLogic REST Server listens on | |
| Basic | User | text | The user to perform operations as. The user should have appropriate read privileges/ | |
| Basic | Password | password | The password for the user | |
| Basic | Authentication Type | radio button | The type of authentication to use - Digest or | |
| Basic | Connection Type | radio button | The type of connection to use - Direct or Gateway | |
| Advanced | Batch size | number | The batch size for writing to MarkLogic | |
| Advanced | Max retries | number | The maximum number of retries for requests to MarkLogic | |
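To illustrate how the two Advanced properties might interact, here is a minimal sketch under stated assumptions. The helper names `batches` and `write_with_retries`, the `send` callback, and the choice to count "Max retries" as attempts after the first one are all hypothetical; the source does not specify the sink's retry semantics or how it talks to MarkLogic.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batches(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group records into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


def write_with_retries(batch: List[dict],
                       send: Callable[[List[dict]], None],
                       max_retries: int) -> int:
    """Call send(batch), retrying on IOError up to max_retries extra times.

    Returns the number of attempts made. `send` is a hypothetical callback
    standing in for whatever performs the actual write.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            send(batch)
            return attempt
        except IOError:
            # Assumption: "Max retries" counts attempts after the first one.
            if attempt > max_retries:
                raise
```

With this reading, a batch size of 2 and five input records produces three writes, and a max retries value of 2 allows up to three attempts per write before the error is propagated to the pipeline.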

MarkLogic query executor action.

| Category | User Facing Name | Type | Description | Constraints |
|---|---|---|---|---|
| Basic | Host | text | The host running the MarkLogic REST Server | Should validate URL |
| Basic | Port | number | The port that the MarkLogic REST Server listens on | |
| Basic | User | text | The user to perform operations as. The user should have appropriate read privileges/ | |
| Basic | Password | password | The password for the user | |
| Basic | Authentication Type | radio button | The type of authentication to use - Digest or | |
| Basic | Connection Type | radio button | The type of connection to use - Direct or Gateway | |
| Basic | Query | textarea | The query to execute in MarkLogic | |
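One way the action could submit its Query property is through the `/v1/eval` endpoint of the MarkLogic REST Client API, which accepts a form-encoded `xquery` parameter. The sketch below only builds the HTTP request and does not send it. It assumes that the query is XQuery, that the connection is plain HTTP, and that the action uses this endpoint at all; none of these is stated in this document, and the function name `build_eval_request` is hypothetical. Authentication is deliberately left out.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_eval_request(host: str, port: int, query: str) -> Request:
    """Build (but do not send) a POST request that would evaluate `query`.

    Assumption: the query is XQuery and is submitted to /v1/eval.
    """
    body = urlencode({"xquery": query}).encode("utf-8")
    return Request(
        url=f"http://{host}:{port}/v1/eval",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Keeping request construction separate from sending makes the Host, Port, and Query handling easy to unit test without a running MarkLogic server.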

Design / Implementation Tips

Design

Approach(s)

Properties

Security

Limitation(s)

Future Work

  • Some future work – HYDRATOR-99999
  • Another future work – HYDRATOR-99999

Test Case(s)

  • Test case #1
  • Test case #2

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data. 

Pipeline #1

Pipeline #2




Checklist

  •  User stories documented 
  •  User stories reviewed 
  •  Design documented 
  •  Design reviewed 
  •  Feature merged 
  •  Examples and guides 
  •  Integration tests 
  •  Documentation for feature 
  •  Short video demonstrating the feature