The io.cdap.wrangler.api.Directive interface is the single, simple interface for developing a customized, user-defined directive (UDD).
Building a UDD with this API involves nothing more than writing a class with four methods (define, initialize, execute, and destroy) and a few annotations. Here is an example:
@Plugin(type = UDD.Type)
@Name(SimpleUDD.NAME)
@Categories(categories = {"example", "simple"})
@Description("My first simple user defined directive")
public class SimpleUDD implements Directive {
  public static final String NAME = "my-simple-udd";

  public UsageDefinition define() { ... }

  public void initialize(Arguments args) throws DirectiveParseException { ... }

  public List<Row> execute(List<Row> rows, ExecutorContext context)
    throws RecipeException, ErrorRowException { ... }

  public void destroy() { ... }
}
The following is a detailed explanation of the code above:
@Plugin annotation tells the framework the type of plugin this class represents.
@Name annotation provides the name of the plugin. For this type, the directive name and plugin name are the same.
@Description annotation provides a short description of the directive.
@Categories annotation provides the categories this directive belongs to.
UsageDefinition define() { } defines the arguments that the directive expects.
void initialize(Arguments args) { } is invoked to configure the directive with the arguments parsed by the framework, based on the UsageDefinition returned by define().
execute(...) { } receives every Row produced by the previous directive's execution and transforms it.
Because a UDD is a simple class with four methods, you can test it with regular testing tools such as JUnit.
public class SimpleUDDTest {

  @Test
  public void testSimpleUDD() throws Exception {
    TestRecipe recipe = new TestRecipe();
    recipe.add("parse-as-csv :body ',';");
    recipe.add("drop :body;");
    recipe.add("rename :body_1 :simpledata;");
    recipe.add("!my-simple-udd ...");

    TestRows rows = new TestRows();
    rows.add(new Row("body", "root,joltie,mars avenue"));

    // Register the directive under test with the testing rig.
    RecipePipeline pipeline = TestingRig.pipeline(SimpleUDD.class, recipe);
    List<Row> actual = pipeline.execute(rows.toList());
  }
}
There is not much to do here: this example repository includes a Maven POM file that is pre-configured for building the directive JAR. All a developer needs to do is build the project using the following command.
mvn clean package
This would generate two files:
Artifact: my-simple-udd-1.0-SNAPSHOT.jar
Artifact Configuration: my-simple-udd-1.0-SNAPSHOT.json
There are multiple ways to deploy the custom directive to CDAP. The two most common are the CDAP CLI (command-line interface) and the CDAP UI.
To deploy the directive through the CLI, start the CDAP CLI and use the load artifact command to load the plugin artifact into CDAP.
$ $CDAP_HOME/bin/cdap cli
cdap > load artifact my-simple-udd-1.0-SNAPSHOT.jar config-file my-simple-udd-1.0-SNAPSHOT.json
Let's walk through the creation of a user-defined directive (UDD) called text-reverse that takes one argument: Column Name. It is the name of the column in a Row whose value needs to be reversed. In the resulting row, the column specified by Column Name holds the reversed string of characters.
text-reverse :address
text-reverse :id
Here is the implementation of the above UDD.
@Plugin(type = UDD.Type)
@Name(TextReverse.NAME)
@Categories(categories = {"text-manipulation"})
@Description("Reverses the column value")
public final class TextReverse implements Directive {
  public static final String NAME = "text-reverse";
  private String column;

  // Declares the single argument the directive accepts: a column name.
  public UsageDefinition define() {
    UsageDefinition.Builder builder = UsageDefinition.builder(NAME);
    builder.define("column", TokenType.COLUMN_NAME);
    return builder.build();
  }

  // Stores the column name parsed by the framework.
  public void initialize(Arguments args) throws DirectiveParseException {
    this.column = ((ColumnName) args.value("column")).value();
  }

  // Reverses the String value of the configured column in every row.
  public List<Row> execute(List<Row> rows, ExecutorContext context)
    throws RecipeException, ErrorRowException {
    for (Row row : rows) {
      int idx = row.find(column);
      if (idx != -1) {
        Object object = row.getValue(idx);
        if (object instanceof String) {
          String value = (String) object;
          row.setValue(idx, new StringBuilder(value).reverse().toString());
        }
      }
    }
    return rows;
  }

  public void destroy() {
    // no-op
  }
}
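The transformation at the heart of execute() can be tried without a CDAP installation. The following is a minimal standalone sketch, not the Wrangler API: it assumes rows are modelled as plain Map<String, Object> instances, and the class name ReverseSketch and the method reverseColumn are hypothetical names chosen for illustration. Like the directive above, it reverses only String values and leaves missing columns and non-String values untouched.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone illustration only. ReverseSketch is a hypothetical class and
// does not use the Wrangler Row or Directive types.
class ReverseSketch {

  // Reverses the String value stored under `column` in every row, in place.
  // Rows without the column, or with a non-String value, are left unchanged.
  static List<Map<String, Object>> reverseColumn(List<Map<String, Object>> rows, String column) {
    for (Map<String, Object> row : rows) {
      Object value = row.get(column);
      if (value instanceof String) {
        row.put(column, new StringBuilder((String) value).reverse().toString());
      }
    }
    return rows;
  }

  public static void main(String[] args) {
    Map<String, Object> first = new LinkedHashMap<>();
    first.put("a", "root");
    first.put("b", "joltie");

    Map<String, Object> second = new LinkedHashMap<>();
    second.put("a", "joltie");
    second.put("b", 42); // non-String value: skipped by reverseColumn

    List<Map<String, Object>> rows = new ArrayList<>();
    rows.add(first);
    rows.add(second);

    // prints [{a=root, b=eitloj}, {a=joltie, b=42}]
    System.out.println(reverseColumn(rows, "b"));
  }
}
```

Skipping non-String values mirrors the instanceof check in the directive, so a column holding mixed types does not cause a failure part-way through a batch of rows.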
The following annotations are required for the plugin. If any of these are missing, the plugin or the directive will not be loaded.
@Plugin defines the type of plugin it is. For all UDDs it is set to UDD.Type.
@Name defines the name of the plugin as well as the directive name.
@Categories defines one or more categories the directive belongs to.
@Description provides a short description of the plugin as well as the directive.
The call pattern of a UDD is the following:
DEFINE: At configure time, in either the CDAP Pipeline Transform or the Wrangler Service, the define() method is invoked only once to retrieve the usage information. The usage specifies the arguments that the directive accepts. In our text-reverse example, the directive accepts only one argument, of type TokenType.COLUMN_NAME.
INITIALIZE: During initialization, just before Rows are pumped through the directive, the initialize() method is invoked. It is passed the arguments parsed by the system, and it gives the UDD writer the opportunity to validate them and throw an exception if a value is not as expected.
EXECUTE: Once the pipeline has been set up, Rows are passed into the execute() method to be transformed.
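The call order can be sketched in plain Java. This is a simplified illustration under stated assumptions, not the framework's actual driver: MiniDirective, UpperCase, and LifecycleSketch are hypothetical stand-ins rather than Wrangler types, arguments are reduced to a Map<String, String>, everything runs in a single method call, and a final destroy() call is assumed as the cleanup step. It shows define() being consulted once, initialize() validating its argument before any rows arrive, and execute() transforming the rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Standalone illustration only; none of these types belong to Wrangler.
class LifecycleSketch {

  // Hypothetical stand-in for a directive with the four lifecycle methods.
  interface MiniDirective {
    List<String> define();                      // names of the expected arguments
    void initialize(Map<String, String> args);  // validate and store arguments
    List<Map<String, Object>> execute(List<Map<String, Object>> rows);
    void destroy();                             // release resources
  }

  // Example directive: upper-cases the String value of one column.
  static class UpperCase implements MiniDirective {
    final List<String> calls = new ArrayList<>(); // records the call order
    private String column;

    public List<String> define() {
      calls.add("define");
      return Collections.singletonList("column");
    }

    public void initialize(Map<String, String> args) {
      calls.add("initialize");
      String value = args.get("column");
      if (value == null || value.isEmpty()) {
        // Reject bad arguments before any row is processed.
        throw new IllegalArgumentException("argument 'column' is required");
      }
      this.column = value;
    }

    public List<Map<String, Object>> execute(List<Map<String, Object>> rows) {
      calls.add("execute");
      for (Map<String, Object> row : rows) {
        Object v = row.get(column);
        if (v instanceof String) {
          row.put(column, ((String) v).toUpperCase(Locale.ROOT));
        }
      }
      return rows;
    }

    public void destroy() {
      calls.add("destroy");
    }
  }

  // Drives a directive through define -> initialize -> execute -> destroy.
  static List<Map<String, Object>> run(MiniDirective d, Map<String, String> args,
                                       List<Map<String, Object>> rows) {
    for (String name : d.define()) {
      if (!args.containsKey(name)) {
        throw new IllegalArgumentException("missing argument: " + name);
      }
    }
    d.initialize(args);
    try {
      return d.execute(rows);
    } finally {
      d.destroy(); // runs even if execute() throws
    }
  }

  public static void main(String[] argv) {
    UpperCase directive = new UpperCase();
    Map<String, Object> row = new HashMap<>();
    row.put("name", "joltie");
    List<Map<String, Object>> rows = new ArrayList<>();
    rows.add(row);

    // prints [{name=JOLTIE}]
    System.out.println(run(directive, Collections.singletonMap("column", "name"), rows));
    // prints [define, initialize, execute, destroy]
    System.out.println(directive.calls);
  }
}
```

Validating in initialize() rather than in execute() means a misconfigured directive fails once, up front, instead of failing on every batch of rows.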
The following JUnit test could hardly be simpler.
@Test
public void testBasicReverse() throws Exception {
  TestRecipe recipe = new TestRecipe();
  recipe.add("parse-as-csv :body ',';");
  recipe.add("set-headers :a,:b,:c;");
  recipe.add("text-reverse :b;");

  TestRows rows = new TestRows();
  rows.add(new Row("body", "root,joltie,mars avenue"));
  rows.add(new Row("body", "joltie,root,venus blvd"));

  // Register the directive under test with the testing rig.
  RecipePipeline pipeline = TestingRig.pipeline(TextReverse.class, recipe);
  List<Row> actual = pipeline.execute(rows.toList());

  // Column b of each row should hold the reversed string.
  Assert.assertEquals(2, actual.size());
  Assert.assertEquals("eitloj", actual.get(0).getValue("b"));
  Assert.assertEquals("toor", actual.get(1).getValue("b"));
}