Developing a UDD
There is one simple interface for developing a customized directive: io.cdap.wrangler.api.Directive. Implement it to create a user defined directive (UDD).
Simple API
Building a UDD with the simple UDD API involves nothing more than writing a class with four methods (define, initialize, execute and destroy) and a few annotations. Here is an example:
@Plugin(type = UDD.Type)
@Name(SimpleUDD.NAME)
@Categories(categories = {"example", "simple"})
@Description("My first simple user defined directive")
public class SimpleUDD implements Directive {
  public static final String NAME = "my-simple-udd";

  // Declares the arguments the directive accepts.
  public UsageDefinition define() {
    ...
  }

  // Receives the parsed arguments before any rows are processed.
  public void initialize(Arguments args) throws DirectiveParseException {
    ...
  }

  // Transforms the rows handed over by the previous directive.
  public List<Row> execute(List<Row> rows, ExecutorContext context) throws RecipeException, ErrorRowException {
    ...
  }

  // Releases any resources held by the directive.
  public void destroy() {
    ...
  }
}
The following is a detailed explanation of the above code:
@Plugin annotation tells the framework the type of plugin this class represents.
@Name annotation provides the name of the plugin. For this type, the directive name and plugin name are the same.
@Description annotation provides a short description of the directive.
@Categories annotation provides the category this directive belongs to.
UsageDefinition define() { } defines the arguments that are expected by the directive.
void initialize(Arguments args) { } is invoked to configure the directive, before any rows are processed, with the arguments the framework parsed according to the UsageDefinition returned by define().
execute(...) { } receives every Row produced by the previous directive's execution.
Testing a simple UDD
Because a UDD is a plain class with four methods, you can test it with regular testing tools such as JUnit.
public class SimpleUDDTest {
  @Test
  public void testSimpleUDD() throws Exception {
    // Build the recipe one directive at a time.
    TestRecipe recipe = new TestRecipe();
    recipe.add("parse-as-csv :body ',';");
    recipe.add("drop :body;");
    recipe.add("rename :body_1 :simpledata;");
    recipe.add("!my-simple-udd ...");

    // Input rows fed to the pipeline.
    TestRows rows = new TestRows();
    rows.add(new Row("body", "root,joltie,mars avenue"));

    // Register the directive under test and run the recipe.
    RecipePipeline pipeline = TestingRig.pipeline(SimpleUDD.class, recipe);
    List<Row> actual = pipeline.execute(rows.toList());
  }
}
Building a UDD Plugin
Little needs to be done here: this example repository includes a Maven POM file that is pre-configured for building the directive JAR. All a developer needs to do is build the project using the following command.
mvn clean package
This generates two files:
Artifact:
my-simple-udd-1.0-SNAPSHOT.jar
Artifact Configuration:
my-simple-udd-1.0-SNAPSHOT.json
Deploying Plugin
There are multiple ways to deploy the custom directive to CDAP. The two most common are the CDAP CLI (command line interface) and the CDAP UI.
CDAP CLI
To deploy the directive through the CLI, start the CDAP CLI and use the load artifact command to load the plugin artifact into CDAP.
CDAP UI
Example
Let’s walk through the creation of a user defined directive (UDD) called text-reverse that takes one argument: Column Name. It is the name of the column in a Row whose value needs to be reversed. In the resulting row, the string in the specified column has its characters reversed.
Here is the implementation of the above UDD.
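As a rough illustration of the idea, the sketch below mirrors the four-method shape of a directive for text-reverse. It is not the repository's source: SketchRow and TextReverseSketch are hypothetical stand-ins written so the example compiles without the Wrangler libraries, and the real io.cdap.wrangler.api types (Directive, UsageDefinition, Arguments, Row) have different signatures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Wrangler row: an ordered map of column name to value.
// The real io.cdap.wrangler.api.Row class has a different API.
class SketchRow {
  final Map<String, Object> columns = new LinkedHashMap<>();

  SketchRow add(String name, Object value) {
    columns.put(name, value);
    return this;
  }
}

// Sketch of the text-reverse logic using plain Java types only.
class TextReverseSketch {
  static final String NAME = "text-reverse";
  private String column;

  // Stand-in for define(): the directive expects a single argument, a column name.
  List<String> define() {
    List<String> arguments = new ArrayList<>();
    arguments.add("column");
    return arguments;
  }

  // Stand-in for initialize(): remember which column should be reversed.
  void initialize(Map<String, String> args) {
    column = args.get("column");
  }

  // Stand-in for execute(): reverse string values in the configured column
  // and leave non-string or missing values untouched.
  List<SketchRow> execute(List<SketchRow> rows) {
    for (SketchRow row : rows) {
      Object value = row.columns.get(column);
      if (value instanceof String) {
        row.columns.put(column, new StringBuilder((String) value).reverse().toString());
      }
    }
    return rows;
  }

  // Stand-in for destroy(): nothing to release in this sketch.
  void destroy() { }
}
```

Leaving non-string values unchanged is a design choice made for this sketch; a real directive might instead report such rows as errors.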
Code Walk Through
Annotations
The following annotations are required for the plugin. If any of these are missing, the plugin or the directive will not be loaded.
@Plugin defines the type of plugin it is. For all UDDs it is set to UDD.Type.
@Name defines the name of the plugin as well as the directive name.
@Categories defines one or more categories the directive belongs to.
@Description provides a short description for the plugin as well as for the directive.
Call Pattern
The call pattern of a UDD is the following:
DEFINE: At configure time, either in the CDAP Pipeline Transform or in the Wrangler Service, the define() method is invoked only once to retrieve the usage information. The usage defines the specification of the arguments that this directive accepts. In our example of text-reverse, the directive accepts only one argument, of type TokenType.COLUMN_NAME.
INITIALIZE: During initialization, just before Rows are pumped through the directive, the initialize() method is invoked. This method is passed the arguments that are parsed by the system. It also gives the UDD writer the opportunity to validate the values and throw an exception if a value is not as expected.
EXECUTE: Once the pipeline has been set up, Rows are passed into the execute() method to be transformed.
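The call order can be sketched with a small driver. Everything here is an assumption made for illustration: LifecycleDirective, ReverseColumn and LifecycleDriver are hypothetical names, rows are modelled as plain maps, IllegalArgumentException stands in for the framework's parse exception, and calling destroy() after the last batch is assumed rather than stated above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the directive contract; the real interface is
// io.cdap.wrangler.api.Directive and uses richer types.
interface LifecycleDirective {
  List<String> define();
  void initialize(Map<String, String> args);
  List<Map<String, Object>> execute(List<Map<String, Object>> rows);
  void destroy();
}

class ReverseColumn implements LifecycleDirective {
  // Records the order of calls so the sequence can be inspected.
  final List<String> calls = new ArrayList<>();
  private String column;

  public List<String> define() {
    calls.add("define");
    return Collections.singletonList("column");
  }

  public void initialize(Map<String, String> args) {
    calls.add("initialize");
    String value = args.get("column");
    // Validate the argument and fail early if it is not usable.
    if (value == null || value.isEmpty()) {
      throw new IllegalArgumentException("text-reverse requires a column name");
    }
    column = value;
  }

  public List<Map<String, Object>> execute(List<Map<String, Object>> rows) {
    calls.add("execute");
    for (Map<String, Object> row : rows) {
      Object value = row.get(column);
      if (value instanceof String) {
        row.put(column, new StringBuilder((String) value).reverse().toString());
      }
    }
    return rows;
  }

  public void destroy() {
    calls.add("destroy");
  }
}

class LifecycleDriver {
  // Runs define once, initialize once, execute per batch, then destroy.
  static List<Map<String, Object>> run(LifecycleDirective directive,
                                       Map<String, String> args,
                                       List<List<Map<String, Object>>> batches) {
    directive.define();
    directive.initialize(args);
    List<Map<String, Object>> output = new ArrayList<>();
    try {
      for (List<Map<String, Object>> batch : batches) {
        output.addAll(directive.execute(batch));
      }
    } finally {
      directive.destroy();
    }
    return output;
  }
}
```

Validating in initialize() means a bad argument is rejected before any row is processed, rather than partway through execution.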
Testing
The following is the JUnit class that couldn't be any simpler.
Created in 2020 by Google Inc.