Overview
The purpose of this page is to illustrate the plan for ApplicationTemplate and Application consolidation. This work is being tracked in
Motivation
Why do we want to consolidate templates and applications? In CDAP 3.0, an ApplicationTemplate is a way for somebody to write an Application that can be given some configuration to create an Adapter. The story is confusing; one would expect an ApplicationTemplate to create... Applications. Instead, we use the term Adapter because Application already means something else. In addition, an ApplicationTemplate can only include a single workflow or a single worker, so templates and applications give people different experiences.
Really, the goal of templates was to be able to write one piece of Application code that could be used to create multiple Applications. This requires that an Application can be configured at creation time instead of at compile time. For example, a user should be able to set the name of their dataset based on configuration instead of hardcoding it. To support this, we plan on making it possible to get a configuration object from the ApplicationContext available in the Application's configure() method. This allows somebody to pass in a config when creating an Application through the RESTful API, and that config can then be used to configure the Application. The relevant programmatic API changes are shown below.
Definitions
Artifact - A jar file containing classes that can be used by CDAP.
Application Class - A Java class that implements the CDAP Application interface. Deployed by bundling it in an artifact.
Application Config - Configuration given to CDAP to create an Application (can be empty).
Application - An instantiation of an Application Class, created by passing an Application Config to an Application Class.
Plugin - An extension to an Application Class. Usually implements an interface used by the Application Class.
Use Case Walkthrough
1. Create an Application that uses config
1.1 Deploying the Artifact
A developer writes a configurable Application Class that uses a Flow to read from a stream and write to a Table.
    public class MyApp extends AbstractApplication<MyApp.MyConfig> {

      public static class MyConfig extends Config {
        @Nullable
        @Description("The name of the stream to read from. Defaults to 'A'.")
        private String stream;

        @Nullable
        @Description("The name of the table to write to. Defaults to 'X'.")
        private String table;

        private MyConfig() {
          this.stream = "A";
          this.table = "X";
        }
      }

      @Override
      public void configure() {
        // ApplicationContext now has a method to get a custom config object whose fields will
        // be injected using the values given in the RESTful API
        MyConfig config = getContext().getConfig();
        addStream(new Stream(config.stream));
        createDataset(config.table, Table.class);
        addFlow(new MyFlow(config.stream, config.table));
      }
    }

    public class MyFlow implements Flow {
      @Property
      private String stream;
      @Property
      private String table;

      MyFlow(String stream, String table) {
        this.stream = stream;
        this.table = table;
      }

      @Override
      public FlowSpecification configure() {
        return FlowSpecification.Builder.with()
          .setName("MyFlow")
          .setDescription("Reads from a stream and writes to a table")
          .withFlowlets()
            // the flowlet needs the table name so it can look up the dataset at runtime
            .add("reader", new Reader(table))
          .connect()
            .fromStream(stream).to("reader")
          .build();
      }
    }

    public class Reader extends AbstractFlowlet {
      @Property
      private String tableName;
      private Table table;

      Reader(String tableName) {
        this.tableName = tableName;
      }

      @Override
      public void initialize(FlowletContext context) throws Exception {
        table = context.getDataset(tableName);
      }

      @ProcessInput
      public void process(StreamEvent event) {
        // the row key is expected to come as the 'rowkey' header of the stream event
        Put put = new Put(Bytes.toBytes(event.getHeaders().get("rowkey")));
        put.add("timestamp", event.getTimestamp());
        put.add("body", Bytes.toBytes(event.getBody()));
        table.put(put);
      }
    }
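The comment in configure() says that the config object's fields are injected from the values given in the RESTful API. As a rough illustration of that idea only, the sketch below overwrites the fields of a config object from a map of request values and leaves unspecified fields at their constructor defaults. ConfigInjectionSketch, its stand-in MyConfig, and the inject method are hypothetical names invented for this example; they are not the CDAP implementation, and only String fields are handled.

```java
import java.lang.reflect.Field;
import java.util.Map;

// Hypothetical sketch of config injection; not the CDAP implementation.
class ConfigInjectionSketch {

  // Stand-in for MyApp.MyConfig: defaults are assigned when the object is created.
  static class MyConfig {
    String stream = "A";
    String table = "X";
  }

  // Overwrites each String field that has a matching key in 'values'.
  // Fields without a matching key keep their default value.
  static <T> T inject(T config, Map<String, String> values) {
    for (Field field : config.getClass().getDeclaredFields()) {
      String value = values.get(field.getName());
      if (value == null || field.getType() != String.class) {
        continue;
      }
      try {
        field.setAccessible(true);
        field.set(config, value);
      } catch (IllegalAccessException e) {
        throw new IllegalStateException("Cannot set field " + field.getName(), e);
      }
    }
    return config;
  }

  public static void main(String[] args) {
    // Only 'stream' is supplied, so 'table' stays at its default.
    MyConfig config = inject(new MyConfig(), Map.of("stream", "purchases"));
    System.out.println(config.stream + " -> " + config.table);
  }
}
```

With this shape, an empty Application Config simply yields the defaults, which matches the definition above that the config can be empty.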
A jar named 'myapp-1.0.0.jar' is built which contains the Application Class. The jar is deployed via the RESTful API:
POST /namespaces/default/artifacts/myapp --data-binary @myapp-1.0.0.jar
Version is determined from the Bundle-Version in the artifact Manifest. It can also be provided as a header. Artifact details are now visible through other RESTful API calls:
    GET /namespaces/default/artifacts
    [
      { "name": "myapp", "version": "1.0.0" }
    ]

    GET /namespaces/default/artifacts/myapp/versions/1.0.0
    {
      "name": "myapp",
      "version": "1.0.0",
      "classes": {
        "apps": [
          {
            "className": "co.cask.cdap.examples.myapp.MyApp",
            "properties": {
              "stream": {
                "name": "stream",
                "description": "The name of the stream to read from. Defaults to 'A'.",
                "type": "string",
                "required": false
              },
              "table": {
                "name": "table",
                "description": "The name of the table to write to. Defaults to 'X'.",
                "type": "string",
                "required": false
              }
            }
          }
        ],
        "flows": [ ... ],
        "flowlets": [ ... ],
        "datasetModules": [ ... ]
      }
    }
In addition, a call can be made to get all Application Classes:
    GET /namespaces/default/appClasses
    [
      {
        "className": "co.cask.cdap.examples.myapp.MyApp",
        "artifact": { "name": "myapp", "version": "1.0.0" }
      }
    ]
1.2 Creating an Application
The user decides to create an application from the deployed artifact. From the calls above, the user gathers that input and output are both configurable. The user decides to create an Application that reads from the 'purchases' stream and writes to the 'events' table.
    PUT /namespaces/default/apps/purchaseDump -H 'Content-Type: application/json' -d '
    {
      "artifact": { "name": "myapp", "version": "1.0.0" },
      "config": { "stream": "purchases", "table": "events" }
    }'
The Application now shows up in all the normal RESTful APIs, with all its programs, streams, and datasets.
1.3 Updating an Application
A bug is found in the code, a fix is provided, and a 'myapp-1.0.1.jar' release is made. The artifact is deployed:
POST /namespaces/default/artifacts/myapp --data-binary @myapp-1.0.1.jar
Note: Artifacts are immutable unless they are snapshot versions. Deploying again to version 1.0.0 would cause a conflict error.
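The immutability rule in the note can be sketched as a small in-memory store. This is an illustration under assumptions, not the planned implementation: ArtifactStoreSketch and its methods are hypothetical names, and recognizing a snapshot version by a "-SNAPSHOT" suffix is an assumption that this page does not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory sketch of the "immutable unless snapshot" rule.
class ArtifactStoreSketch {
  private final Map<String, byte[]> artifacts = new HashMap<>();

  // Assumption: a snapshot version is one that ends with "-SNAPSHOT".
  static boolean isSnapshot(String version) {
    return version.endsWith("-SNAPSHOT");
  }

  // Returns true if the jar was stored, false if the deploy conflicts with an
  // existing non-snapshot artifact of the same name and version.
  boolean deploy(String name, String version, byte[] jar) {
    String key = name + ":" + version;
    if (artifacts.containsKey(key) && !isSnapshot(version)) {
      return false;
    }
    artifacts.put(key, jar);
    return true;
  }
}
```

In a RESTful setting the false result would map to the conflict error mentioned in the note.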
A call can be made to determine if there are any Applications using the older artifact:
    GET /namespaces/default/apps?artifactName=myapp&artifactVersion=1.0.0
    [ "purchaseDump" ]
Calls are made to stop running programs. Another call is then made to update the app:
    POST /namespaces/default/apps/purchaseDump/update -d '
    {
      "artifact": { "name": "myapp", "version": "1.0.1" },
      "config": { "stream": "purchases", "table": "events" }
    }'
1.4 Rolling Back an Application
Actually, version 1.0.1 has a bug that's even worse and needs to be rolled back. The same update call can be made:
    POST /namespaces/default/apps/purchaseDump/update -d '
    {
      "artifact": { "name": "myapp", "version": "1.0.0" },
      "config": { "stream": "purchases", "table": "events" }
    }'
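Update and rollback are the same operation: the application is pointed at a different artifact version, and this page requires that no programs are running when that happens. A minimal sketch of that precondition follows. AppUpdateSketch and its methods are hypothetical names used only for illustration, and tracking running programs with a simple counter is an assumption.

```java
// Hypothetical sketch of the update precondition; not the CDAP implementation.
class AppUpdateSketch {
  private String artifactVersion;
  private int runningPrograms;

  AppUpdateSketch(String artifactVersion) {
    this.artifactVersion = artifactVersion;
  }

  void programStarted() {
    runningPrograms++;
  }

  void programStopped() {
    if (runningPrograms > 0) {
      runningPrograms--;
    }
  }

  // Switches to the given artifact version. Returns false, leaving the
  // application unchanged, while any program is still running.
  boolean update(String newVersion) {
    if (runningPrograms > 0) {
      return false;
    }
    artifactVersion = newVersion;
    return true;
  }

  String artifactVersion() {
    return artifactVersion;
  }
}
```

Because the target version is just a parameter, rolling back from 1.0.1 to 1.0.0 needs no separate code path.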
1.5 Deploying an Artifact and Creating an App in one step
For backwards compatibility, the deploy app API will remain the same and will internally deploy an artifact and create the app in one call. An additional header will be supported specifying the Application Config.
POST /namespaces/default/apps --data-binary @myapp-1.0.0.jar -H 'X-App-Config: { "stream": "purchases", "table": "events" }'
2. Create an Application that uses plugins
2.1 Application Class changes
Now the user decides to update the MyApp Application Class to support pluggable ways of reading from a stream. This is done by introducing a 'StreamReader' interface in their project:
public interface StreamReader { Put read(StreamEvent event); }
The user wants this StreamReader interface to be pluggable. There can be many implementations of StreamReader, and which implementation to use should be configurable. The Flowlet code changes to use the new StreamReader interface using the plugin java API:
    public class Reader extends AbstractFlowlet {
      @Property
      private String tableName;
      private Table table;
      private StreamReader streamReader;

      Reader(String tableName) {
        this.tableName = tableName;
      }

      @Override
      public void initialize(FlowletContext context) throws Exception {
        table = context.getDataset(tableName);
        streamReader = context.newPluginInstance("readerPluginID");
      }

      @ProcessInput
      public void process(StreamEvent event) {
        table.put(streamReader.read(event));
      }
    }
The Application Class is changed to register a "streamreader" plugin based on configuration:
    public class MyApp extends AbstractApplication<MyApp.MyConfig> {

      public static class MyConfig extends Config {
        @Nullable
        @Description("The name of the stream to read from. Defaults to 'A'.")
        private String stream;

        @Nullable
        @Description("The name of the table to write to. Defaults to 'X'.")
        private String table;

        @Description("The name of the streamreader plugin to use.")
        private String readerPlugin;

        @Nullable
        @Description("Properties to send to the streamreader plugin.")
        @PluginType("streamreader")
        private PluginProperties readerPluginProperties;

        private MyConfig() {
          this.stream = "A";
          this.table = "X";
        }
      }

      @Override
      public void configure() {
        // ApplicationContext now has a method to get a custom config object whose fields will
        // be injected using the values given in the RESTful API
        MyConfig config = getContext().getConfig();
        addStream(new Stream(config.stream));
        createDataset(config.table, Table.class);
        addFlow(new MyFlow(config.stream, config.table));
        // arguments are: type, name, id, properties
        usePlugin("streamreader", config.readerPlugin, "readerPluginID", config.readerPluginProperties);
      }
    }
This becomes v2 of the Application Class. It is deployed via the same RESTful API:
POST /namespaces/default/artifacts/myapp --data-binary @myapp-2.0.0.jar
The metadata about this artifact now includes additional information about the config:
    GET /namespaces/default/artifacts/myapp/versions/2.0.0
    {
      "name": "myapp",
      "version": "2.0.0",
      "classes": {
        "apps": [
          {
            "className": "co.cask.cdap.examples.myapp.MyApp",
            "properties": {
              "stream": {
                "name": "stream",
                "description": "The name of the stream to read from. Defaults to 'A'.",
                "type": "string",
                "required": false
              },
              "table": {
                "name": "table",
                "description": "The name of the table to write to. Defaults to 'X'.",
                "type": "string",
                "required": false
              },
              "readerPlugin": {
                "name": "readerPlugin",
                "description": "The name of the streamreader plugin to use.",
                "type": "string",
                "required": true
              },
              "readerPluginProperties": {
                "name": "readerPluginProperties",
                "description": "Properties to send to the streamreader plugin.",
                "type": "plugin",
                "plugintype": "streamreader",
                "required": false
              }
            }
          }
        ],
        "flows": [ ... ],
        "flowlets": [ ... ],
        "datasetModules": [ ... ]
      }
    }
2.2 Adding plugins
A default implementation of the streamreader plugin is created to implement the previous logic:
    @Plugin(type = "streamreader")
    @Name("default")
    @Description("Writes timestamp and body as two columns and expects the row key to come as a header in the stream event.")
    public class DefaultStreamReader implements StreamReader {
      private DefaultConfig config;

      public static class DefaultConfig extends PluginConfig {
        @Description("The header that should be used as the row key to write to. Defaults to 'rowkey'.")
        @Nullable
        private String rowkey;

        private DefaultConfig() {
          rowkey = "rowkey";
        }
      }

      public Put read(StreamEvent event) {
        Put put = new Put(Bytes.toBytes(event.getHeaders().get(config.rowkey)));
        put.add("timestamp", event.getTimestamp());
        put.add("body", Bytes.toBytes(event.getBody()));
        return put;
      }
    }
The plugin is bundled into a 'streamreaders-1.0.0.jar' artifact. It is added as an extension to the myapp artifact:
POST /namespaces/default/artifacts/streamreaders --data-binary @streamreaders-1.0.0.jar -H 'X-Extends-Artifacts: myapp-[2.0.0,3.0.0)'
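The X-Extends-Artifacts value names a parent artifact and a version range. The sketch below shows one way such a value could be checked against a concrete artifact version. It assumes the usual interval notation, where a square bracket is an inclusive bound and a parenthesis is an exclusive bound, and it only handles purely numeric dotted versions. ArtifactRangeSketch and its methods are hypothetical names, not part of CDAP.

```java
// Hypothetical sketch of matching "myapp-[2.0.0,3.0.0)" against a version.
class ArtifactRangeSketch {

  // Parses a numeric dotted version such as "2.0.0" into its components.
  static int[] parseVersion(String version) {
    String[] parts = version.trim().split("\\.");
    int[] numbers = new int[parts.length];
    for (int i = 0; i < parts.length; i++) {
      numbers[i] = Integer.parseInt(parts[i]);
    }
    return numbers;
  }

  // Compares component by component; missing components count as 0.
  static int compare(int[] a, int[] b) {
    for (int i = 0; i < Math.max(a.length, b.length); i++) {
      int x = i < a.length ? a[i] : 0;
      int y = i < b.length ? b[i] : 0;
      if (x != y) {
        return Integer.compare(x, y);
      }
    }
    return 0;
  }

  // Returns true if the artifact name matches and the version lies in the range.
  static boolean matches(String spec, String artifactName, String version) {
    int split = Math.max(spec.indexOf("-["), spec.indexOf("-("));
    if (split < 0) {
      throw new IllegalArgumentException("Expected <name>-<range>, got: " + spec);
    }
    String name = spec.substring(0, split);
    String range = spec.substring(split + 1);
    boolean lowerInclusive = range.startsWith("[");
    boolean upperInclusive = range.endsWith("]");
    String[] bounds = range.substring(1, range.length() - 1).split(",");
    int[] parsed = parseVersion(version);
    int toLower = compare(parsed, parseVersion(bounds[0]));
    int toUpper = compare(parsed, parseVersion(bounds[1]));
    return name.equals(artifactName)
        && (lowerInclusive ? toLower >= 0 : toLower > 0)
        && (upperInclusive ? toUpper <= 0 : toUpper < 0);
  }
}
```

Under these assumptions the range in the call above would cover every 2.x release of myapp and exclude 3.0.0.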
The plugin details can now be seen by querying for extensions to myapp:
    GET /namespaces/default/artifacts/myapp/versions/2.0.0/extensions
    [ "streamreader" ]

    GET /namespaces/default/artifacts/myapp/versions/2.0.0/extensions/streamreader
    [
      {
        "name": "default",
        "type": "streamreader",
        "description": "Writes timestamp and body as two columns and expects the row key to come as a header in the stream event.",
        "className": "co.cask.cdap.examples.myapp.plugins.DefaultStreamReader",
        "properties": {
          "rowkey": {
            "name": "rowkey",
            "description": "The header that should be used as the row key to write to. Defaults to 'rowkey'.",
            "type": "string",
            "required": false
          }
        },
        "artifact": { "name": "streamreaders", "version": "1.0.0" }
      }
    ]
5. System Artifacts
System artifacts are special artifacts that can be accessed in other namespaces. They cannot be deployed through the RESTful API unless a conf setting is set. Instead, they are placed in a directory on the CDAP master host. When CDAP starts up, the directory will be scanned and those artifacts will be added to the system. Example uses for system artifacts are the ETLBatch and ETLRealtime applications that we want to include out of the box.
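The startup scan could look roughly like the sketch below, which lists the jar files in a directory and derives a name and version from each file name. The file naming convention of name, dash, version, ".jar" is an assumption made for this example, as are the class and method names; the page does not say how system artifact names and versions are determined. A directory holding several versions of the same artifact would need a richer return type than this map.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical sketch of scanning a directory for system artifacts.
class SystemArtifactScanSketch {

  // Returns artifact name -> version for each "<name>-<version>.jar" file in 'dir'.
  // Assumption: the version is whatever follows the last dash in the file name.
  static Map<String, String> scan(Path dir) {
    Map<String, String> found = new TreeMap<>();
    try (Stream<Path> files = Files.list(dir)) {
      files.map(path -> path.getFileName().toString())
          .filter(file -> file.endsWith(".jar") && file.lastIndexOf('-') > 0)
          .forEach(file -> {
            int dash = file.lastIndexOf('-');
            String version = file.substring(dash + 1, file.length() - ".jar".length());
            found.put(file.substring(0, dash), version);
          });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return found;
  }
}
```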
System artifacts are included in results by default and are indicated with a special flag.
    GET /namespaces/default/artifacts?includeSystem=true
    [
      { "name": "ETLBatch", "version": "3.1.0", "isSystem": true },
      { "name": "ETLRealtime", "version": "3.1.0", "isSystem": true },
      { "name": "ETLPlugins", "version": "3.1.0", "isSystem": true },
      { "name": "myapp", "version": "1.0.0", "isSystem": false },
      { "name": "myapp", "version": "1.0.1", "isSystem": false }
    ]
System artifacts can be excluded from results using a filter:
    GET /namespaces/default/artifacts?includeSystem=false
    [
      { "name": "myapp", "version": "1.0.0", "isSystem": false },
      { "name": "myapp", "version": "1.0.1", "isSystem": false }
    ]
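The effect of the includeSystem filter on the two listings above can be sketched as a simple predicate over artifact summaries. ArtifactListingSketch and ArtifactSummary are hypothetical names for this example; the fields mirror the name, version, and isSystem keys shown in the responses.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the includeSystem filter; not the CDAP implementation.
class ArtifactListingSketch {

  // Mirrors one entry of the listing response.
  static final class ArtifactSummary {
    final String name;
    final String version;
    final boolean isSystem;

    ArtifactSummary(String name, String version, boolean isSystem) {
      this.name = name;
      this.version = version;
      this.isSystem = isSystem;
    }
  }

  // Keeps every artifact when includeSystem is true, otherwise drops system artifacts.
  static List<ArtifactSummary> list(List<ArtifactSummary> all, boolean includeSystem) {
    return all.stream()
        .filter(artifact -> includeSystem || !artifact.isSystem)
        .collect(Collectors.toList());
  }
}
```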
When a user wants to create an application from a system artifact, they make the same RESTful call as before, except adding a special flag to indicate it is a system artifact:
    PUT /namespaces/default/apps/somePipeline -d '
    {
      "artifact": { "name": "ETLBatch", "version": "3.1.0", "isSystem": true },
      "config": { ... }
    }'
6. Deleting an Artifact
Non-snapshot artifacts will be immutable. Advanced users can delete an existing artifact, but the assumption will be that they know exactly what they are doing. Deleting an artifact may cause programs that are using it to fail.
7. CDAP Upgrade
The programmatic API changes are all backwards compatible, so existing apps will not need to be recompiled. They will, however, need to be added to the artifact repository, either by the upgrade tool or by having people redeploy their existing apps.
Any existing adapters will need to be migrated. Ideally, the upgrade tool will create matching applications based on the adapter conf, but at a minimum we will simply delete existing adapters and templates.
RESTful API changes
Application APIs
Type | Path | Body | Headers | Description |
---|---|---|---|---|
GET | /v3/namespaces/<namespace-id>/apps?artifactName=<name>[&artifactVersion=<version>] | | | get all apps using the given artifact name and version |
POST | /v3/namespaces/<namespace-id>/apps | application jar contents | Application-Config: <json of config> | same as deploy api today, except allows passing config as a header |
PUT | /v3/namespaces/<namespace-id>/apps/<app-name> | application jar contents | Application-Config: <json of config> | same as deploy api today, except allows passing config as a header |
PUT | /v3/namespaces/<namespace-id>/apps/<app-name> | { 'artifact': {'name':<name>, 'version':<version>}, 'config': { ... } } | Content-Type: application/json | create an application from an existing artifact. Note: Edits existing API, different behavior based on content-type |
PUT | /v3/namespaces/<namespace-id>/apps/<app-name>/properties | { 'artifact': {'name':<name>, 'version':<version>}, 'config': { ... } } | | update an existing application. No programs can be running |
Artifact APIs
Type | Path | Body | Headers | Description |
---|---|---|---|---|
GET | /v3/namespaces/<namespace-id>/artifacts | | | |
GET | /v3/namespaces/<namespace-id>/artifacts/<artifact-name> | | | Get data about all artifact versions |
POST | /v3/namespaces/<namespace-id>/artifacts/<artifact-name> | jar contents | Artifact-Version: <version> | Add a new artifact. Version header only needed if Bundle-Version is not in jar Manifest. If both present, header wins. |
GET | /v3/namespaces/<namespace-id>/artifacts/<artifact-name>/versions/<version> | | | Get details about the artifact, such as what plugins and applications are in the artifact and properties they support |
PUT | /v3/namespaces/<namespace-id>/artifacts/<artifact-name>/versions/<version>/classes | list of classes contained in the jar | | This is required for 3rd party jars, such as the mysql jdbc connector. It is the equivalent of the .json file we have in 3.0 |
GET | /v3/namespaces/<namespace-id>/classes/plugintypes | | | |
GET | /v3/namespaces/<namespace-id>/classes/plugintypes/<plugin-type> | | | |
GET | /v3/namespaces/<namespace-id>/classes/plugintypes/<plugin-type>/plugins/<plugin-name> | | | Config properties can be nested now. For example: { "className": "co.cask.cdap.example.MyPlugin", "description": "My Plugin", "name": "MyPlugin", "properties": { "threshold": { "name": "thresh", "type": "int", "required": false }, "user": { "name": "user", "type": "config", "required": true, "fields": { "id": { "name": "id", "type": "long", "required": true }, "digits": { "name": "phoneNumber", "type": "string", "required": true } } } } } |
GET | /v3/namespaces/<namespace-id>/classes/apps | | | |
GET | /v3/namespaces/<namespace-id>/classes/apps/<app-classname> | | | |
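The version rule for adding an artifact (use Bundle-Version from the jar Manifest, let the header override it when both are present) can be sketched with the standard java.util.jar.Manifest class. ArtifactVersionSketch and resolveVersion are hypothetical names for this example, and rejecting an upload that has neither source of version is an assumption.

```java
import java.util.jar.Manifest;

// Hypothetical sketch of artifact version resolution; not the CDAP implementation.
class ArtifactVersionSketch {

  // Returns the header value when present, otherwise the manifest's Bundle-Version.
  // Assumption: an upload with neither is rejected.
  static String resolveVersion(Manifest manifest, String headerVersion) {
    if (headerVersion != null && !headerVersion.isEmpty()) {
      return headerVersion;
    }
    String bundleVersion = manifest == null
        ? null
        : manifest.getMainAttributes().getValue("Bundle-Version");
    if (bundleVersion == null) {
      throw new IllegalArgumentException("No version header and no Bundle-Version in the manifest");
    }
    return bundleVersion;
  }
}
```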
Template APIs (will be removed)
Type | Path | Replaced By |
---|---|---|
GET | /v3/templates | |
GET | /v3/templates/<template-name> | |
GET | /v3/templates/<template-name>/extensions/<plugin-type> | /v3/namespaces/<namespace-id>/classes/plugintypes/<plugin-type> |
GET | /v3/templates/<template-name>/extensions/<plugin-type>/plugins/<plugin-name> | /v3/namespaces/<namespace-id>/classes/plugintypes/<plugin-type>/plugins/<plugin-name> |
PUT | /v3/namespaces/<namespace-id>/templates/<template-id> | |
GET | /v3/namespaces/<namespace-id>/adapters | |
GET | /v3/namespaces/<namespace-id>/adapters/<adapter-name> | |
POST | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/start | |
POST | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/stop | |
GET | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/status | |
GET | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/runs | |
GET | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/runs/<run-id> | |
DELETE | /v3/namespaces/<namespace-id>/adapters/<adapter-name> | |