Developing a UDD

Custom directives are developed against a single interface, io.cdap.wrangler.api.Directive. Implementing it is all that is needed to create a user defined directive (UDD).

Simple API

Building a UDD with the simple API involves nothing more than writing a class with four methods (define, initialize, execute and destroy) and a few annotations. Here is an example:

    @Plugin(type = UDD.Type)
    @Name(SimpleUDD.NAME)
    @Categories(categories = {"example", "simple"})
    @Description("My first simple user defined directive")
    public class SimpleUDD implements Directive {
      public static final String NAME = "my-simple-udd";

      // Declares the arguments this directive accepts.
      public UsageDefinition define() {
        ...
      }

      // Receives the arguments parsed by the framework.
      public void initialize(Arguments args) throws DirectiveParseException {
        ...
      }

      // Transforms the rows handed over by the previous directive.
      public List<Row> execute(List<Row> rows, ExecutorContext context)
          throws RecipeException, ErrorRowException {
        ...
      }

      public void destroy() {
        ...
      }
    }

The following is a detailed explanation of the above code:

  • @Plugin annotation tells the framework the type of plugin this class represents.

  • @Name annotation provides the name of the plugin. For this type, the directive name and plugin name are the same.

  • @Description annotation provides a short description of the directive.

  • @Categories annotation provides the category this directive belongs to.

  • UsageDefinition define() { } Defines the arguments that are expected by the directive.

  • void initialize(Arguments args) { } Invoked to configure the directive with the arguments parsed by the framework, based on the UsageDefinition returned by define().

  • execute(...) { } Every Row from the previous directive's execution is passed to this method for processing.

  • void destroy() { } Gives the directive a place to release any resources it holds.

Testing a simple UDD

Because the UDD is a simple class with four methods, you can test it with regular testing tools, like JUnit.

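    import java.util.List;
    import org.junit.Test;

    public class SimpleUDDTest {

      @Test
      public void testSimpleUDD() throws Exception {
        // Build the recipe one directive at a time.
        TestRecipe recipe = new TestRecipe();
        recipe.add("parse-as-csv :body ',';");
        recipe.add("drop :body;");
        recipe.add("rename :body_1 :simpledata;");
        recipe.add("!my-simple-udd ...");

        // A single input row with the raw CSV text in the body column.
        TestRows rows = new TestRows();
        rows.add(new Row("body", "root,joltie,mars avenue"));

        // Register the directive under test and run the recipe.
        RecipePipeline pipeline = TestingRig.pipeline(SimpleUDD.class, recipe);
        List<Row> actual = pipeline.execute(rows.toList());
      }
    }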

Building a UDD Plugin

There is not much to do here: this example repository includes a Maven POM file that is pre-configured for building the directive JAR. All a developer has to do is build the project using the following command.

mvn clean package

This generates two files:

  • Artifact: my-simple-udd-1.0-SNAPSHOT.jar

  • Artifact Configuration: my-simple-udd-1.0-SNAPSHOT.json

Deploying Plugin

There are multiple ways to deploy the custom directive to CDAP. The two most popular are the CDAP CLI (command line interface) and the CDAP UI.

CDAP CLI

To deploy the directive through the CLI, start the CDAP CLI and use the load artifact command to load the plugin artifact into CDAP.

CDAP UI

Example

Let's walk through the creation of a user defined directive (UDD) called text-reverse that takes one argument: Column Name. It is the name of the column in a Row whose value needs to be reversed. In the resulting row, the column named in the input holds the reversed string of characters.

Here is the implementation of the above UDD.
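The core of such a directive can be sketched as follows. This is a minimal, self-contained illustration and not the repository's actual implementation: SketchRow is a hypothetical stand-in for the Wrangler Row class so that the snippet compiles without CDAP on the classpath, TextReverseSketch only mirrors the shape of the initialize() and execute() methods of the Directive interface, and IllegalArgumentException stands in for DirectiveParseException. Leaving non-string values untouched is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Wrangler Row class: an ordered map of column name to value.
class SketchRow {
  private final Map<String, Object> columns = new LinkedHashMap<>();

  // Adds or overwrites a column and returns this row for chaining.
  SketchRow add(String name, Object value) {
    columns.put(name, value);
    return this;
  }

  // Returns the value of a column, or null if the column is absent.
  Object get(String name) {
    return columns.get(name);
  }
}

// Sketch of the text-reverse logic; it does not implement the real Directive interface.
class TextReverseSketch {
  public static final String NAME = "text-reverse";

  private String column;

  // Mirrors initialize(): validate and remember the single column-name argument.
  public void initialize(String columnName) {
    if (columnName == null || columnName.isEmpty()) {
      // Stand-in for throwing DirectiveParseException in a real directive.
      throw new IllegalArgumentException(NAME + " requires a column name");
    }
    this.column = columnName;
  }

  // Mirrors execute(): reverse the string held in the configured column of every row.
  public List<SketchRow> execute(List<SketchRow> rows) {
    for (SketchRow row : rows) {
      Object value = row.get(column);
      if (value instanceof String) {
        String reversed = new StringBuilder((String) value).reverse().toString();
        row.add(column, reversed);
      }
      // Assumption: rows without the column, or with a non-string value, pass through unchanged.
    }
    return rows;
  }
}
```

With the sketch above, initializing with the column name b and executing on a row whose b column holds "joltie" leaves "eitloj" in that column.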

Code Walk Through

Annotations

The following annotations are required for the plugin. If any of these are missing, the plugin or the directive will not be loaded.

  • @Plugin defines the type of plugin it is. For all UDDs it's set to UDD.Type.

  • @Name defines the name of the plugin as well as the directive name.

  • @Categories defines one or more categories the directive belongs to.

  • @Description provides a short description for the plugin as well as for the directive.

Call Pattern

The call pattern of a UDD is the following:

  • DEFINE: At configure time, either in the CDAP Pipeline Transform or in the Wrangler Service, the define() method is invoked only once to retrieve the usage information. The usage specifies the arguments that this directive accepts. In our example of text-reverse, the directive accepts only one argument, of type TokenType.COLUMN_NAME.

  • INITIALIZE: During initialization, just before Rows are pumped through the directive, the initialize() method is invoked. This method is passed the arguments that were parsed by the system. It also gives the UDD writer the opportunity to validate the arguments and throw an exception if a value is not as expected.

  • EXECUTE: Once the pipeline has been set up, each Row is passed into the execute() method to be transformed.
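The order of these calls can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: SketchDirective is a simplified stand-in for the Directive interface with rows reduced to plain strings, RecordingDirective merely logs which method was invoked, and CallPatternSketch.run imitates what the framework does rather than reproducing it. Calling destroy() after the last batch is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the Directive interface; rows are plain strings here.
interface SketchDirective {
  String define();                          // describes the expected arguments
  void initialize(List<String> args);       // receives the parsed arguments
  List<String> execute(List<String> rows);  // transforms a batch of rows
  void destroy();                           // releases resources
}

// Records each lifecycle call so the order can be inspected afterwards.
class RecordingDirective implements SketchDirective {
  final List<String> calls = new ArrayList<>();

  @Override
  public String define() {
    calls.add("define");
    return "column";
  }

  @Override
  public void initialize(List<String> args) {
    calls.add("initialize");
  }

  @Override
  public List<String> execute(List<String> rows) {
    calls.add("execute");
    return rows;
  }

  @Override
  public void destroy() {
    calls.add("destroy");
  }
}

class CallPatternSketch {
  // Drives a directive through the DEFINE, INITIALIZE and EXECUTE phases in order.
  static void run(SketchDirective directive, List<String> args, List<List<String>> batches) {
    directive.define();            // DEFINE: once, at configure time
    directive.initialize(args);    // INITIALIZE: once, before any rows flow
    for (List<String> batch : batches) {
      directive.execute(batch);    // EXECUTE: for the rows passing through the pipeline
    }
    directive.destroy();           // assumption: cleanup after the last batch
  }
}
```

Running the sketch with two batches records define and initialize once each, execute twice, and destroy last.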

Testing

The following is the JUnit class that couldn't be any simpler.

 

Created in 2020 by Google Inc.