...

Artifacts are managed using the Artifact HTTP RESTful APIs.

Deploying an Artifact

An artifact is deployed through the RESTful API. If it contains an Application class, the artifact can then be used to create applications. Once an artifact is deployed, it cannot be changed, with the exception of snapshot versions of artifacts. Snapshot artifacts can be deployed multiple times, with each deployment overwriting the previous artifact. If a program is using a snapshot artifact, changes made to the artifact are picked up when the program is started. Once a program has started, it is unaffected by changes made to the artifact.
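
The deployment rule described above can be modeled in a few lines of Java. This is only a sketch and not CDAP code: the class `ArtifactStoreSketch` and its methods are hypothetical, and it assumes that a snapshot version is recognized by a `-SNAPSHOT` suffix, which the text above does not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the deployment rule; not part of the CDAP API.
final class ArtifactStoreSketch {
  // Maps "name:version" to the deployed JAR contents.
  private final Map<String, byte[]> artifacts = new HashMap<>();

  // Assumption: a snapshot version is one that ends with "-SNAPSHOT".
  static boolean isSnapshot(String version) {
    return version.endsWith("-SNAPSHOT");
  }

  // Stores the artifact; an existing entry is overwritten only for snapshot versions.
  void deploy(String name, String version, byte[] jar) {
    String key = name + ":" + version;
    if (artifacts.containsKey(key) && !isSnapshot(version)) {
      throw new IllegalStateException(key + " already exists and is not a snapshot");
    }
    artifacts.put(key, jar.clone());
  }

  // Returns the stored contents, or null if nothing was deployed under that key.
  byte[] get(String name, String version) {
    return artifacts.get(name + ":" + version);
  }
}
```

In this model, deploying `myapp` version `1.0.0` a second time throws an exception, while deploying a `-SNAPSHOT` version again replaces the stored contents.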

...

Normally, an artifact is added to a specific namespace. Users in one namespace cannot see or use artifacts in another namespace. These are referred to as user artifacts.

Sometimes there is a need to provide an artifact that can be used across namespaces. One example is the set of pipeline artifacts shipped with CDAP. In such scenarios, a system artifact can be used.

System artifacts cannot be added through the RESTful API; they must be added by placing the artifact in a special directory. For Distributed CDAP, this directory is defined by the app.artifact.dir setting in cdap-site.xml, which defaults to /opt/cdap/master/artifacts. Multiple directories can be defined by separating them with a semicolon. For the CDAP Sandbox, the directory is set to the artifacts directory.
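
To make the semicolon-separated form of app.artifact.dir concrete, here is a minimal Java sketch of how such a value could be split into individual directories. The class `ArtifactDirsSketch` is hypothetical, the second path in the usage note is made up for illustration, and trimming whitespace and skipping empty entries are assumptions rather than documented CDAP behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not CDAP's own configuration parser.
final class ArtifactDirsSketch {
  // Splits a semicolon-separated directory list, trimming and dropping empty entries.
  static List<String> parse(String value) {
    List<String> dirs = new ArrayList<>();
    for (String part : value.split(";")) {
      String dir = part.trim();
      if (!dir.isEmpty()) {
        dirs.add(dir);
      }
    }
    return dirs;
  }
}
```

For example, `ArtifactDirsSketch.parse("/opt/cdap/master/artifacts;/data/extra-artifacts")` yields two entries, one per directory.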

Any artifact in the directory will be added to CDAP when it starts up. In addition, a RESTful API call can be made to scan the directory for any new artifacts that may have been added since CDAP started.

...

Code Block
{
  "parents": [ "cdap-data-pipeline[3.2.0,4.0.0)" ],
  "plugins": [
    {
      "name": "mysql",
      "type": "jdbc",
      "description": "MYSQL JDBC external plugin",
      "className": "com.mysql.jdbc.Driver"
    }
  ]
}

This config file specifies that the artifact can be used by versions 3.2.0 (inclusive) to 4.0.0 (exclusive) of the cdap-data-pipeline artifact. It also specifies that there is one plugin of type jdbc and name mysql with class com.mysql.jdbc.Driver. Once added, this system artifact would be usable by applications in all namespaces.
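
The range notation in the parents entry can be illustrated with a small Java sketch that parses a string such as cdap-data-pipeline[3.2.0,4.0.0) and checks whether a version falls inside it. This is not CDAP's own parser: `ParentRangeSketch` is a hypothetical class, and it assumes versions are plain dot-separated integers with no qualifiers.

```java
import java.util.Arrays;

// Hypothetical helper that mirrors the range notation in the config file.
final class ParentRangeSketch {
  final String name;
  final int[] lower;
  final int[] upper;
  final boolean lowerInclusive;
  final boolean upperInclusive;

  ParentRangeSketch(String spec) {
    // Only one of '[' or '(' opens the range, so the larger index is the one present.
    int open = Math.max(spec.indexOf('['), spec.indexOf('('));
    int comma = open < 0 ? -1 : spec.indexOf(',', open);
    char close = spec.isEmpty() ? ' ' : spec.charAt(spec.length() - 1);
    if (open < 0 || comma < 0 || (close != ']' && close != ')')) {
      throw new IllegalArgumentException("Not a parent range: " + spec);
    }
    name = spec.substring(0, open);
    lowerInclusive = spec.charAt(open) == '[';
    upperInclusive = close == ']';
    lower = parse(spec.substring(open + 1, comma));
    upper = parse(spec.substring(comma + 1, spec.length() - 1));
  }

  // Assumption: versions are dot-separated integers such as 3.2.0.
  static int[] parse(String version) {
    return Arrays.stream(version.trim().split("\\.")).mapToInt(Integer::parseInt).toArray();
  }

  // Compares two versions part by part, treating missing parts as zero.
  static int compare(int[] a, int[] b) {
    for (int i = 0; i < Math.max(a.length, b.length); i++) {
      int x = i < a.length ? a[i] : 0;
      int y = i < b.length ? b[i] : 0;
      if (x != y) {
        return Integer.compare(x, y);
      }
    }
    return 0;
  }

  // True when the version lies within the range, honoring inclusive and exclusive bounds.
  boolean contains(String version) {
    int[] v = parse(version);
    int lo = compare(v, lower);
    int hi = compare(v, upper);
    return (lo > 0 || (lo == 0 && lowerInclusive)) && (hi < 0 || (hi == 0 && upperInclusive));
  }
}
```

With the range from the config file, `contains("3.2.0")` is true because the lower bound is inclusive, and `contains("4.0.0")` is false because the upper bound is exclusive.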

...

Code Block
public class MyApp extends AbstractApplication<MyApp.MyConfig> {

  public static class MyConfig extends Config {
    private String stream;
    private String table;

    private MyConfig() {
      this.stream = "A";
      this.table = "X";
    }
  }

  public void configure() {
    MyConfig config = getContext().getConfig();
    addStream(new Stream(config.stream));
    createDataset(config.table, Table.class);
    addFlow(new MyFlow(config.stream, config.table));
  }
}

public class MyFlow extends AbstractFlow {
  private String stream;
  private String table;

  MyFlow(String stream, String table) {
    this.stream = stream;
    this.table = table;
  }

  @Override
  public void configure() {
    setName("MyFlow");
    setDescription("Reads from a stream and writes to a table");
    addFlowlet("reader", new Reader(table));
    connectStream(stream, "reader");
  }
}

public class Reader extends AbstractFlowlet {
  @Property
  private String tableName;
  private Table table;

  Reader(String tableName) {
    this.tableName = tableName;
  }

  @Override
  public void initialize(FlowletContext context) throws Exception {
    table = context.getDataset(tableName);
  }

  @ProcessInput
  public void process(StreamEvent event) {
    // Assumption: the row key arrives in an event header named "rowkey" (hypothetical name).
    Put put = new Put(Bytes.toBytes(event.getHeaders().get("rowkey")));
    put.add("timestamp", event.getTimestamp());
    put.add("body", Bytes.toBytes(event.getBody()));
    table.put(put);
  }
}

Our build system creates a JAR named myapp-1.0.0.jar that contains the MyApp class. The JAR is deployed via the RESTful API:

Code Block
curl localhost:11015/v3/namespaces/default/artifacts/myapp --data-binary @myapp-1.0.0.jar

CDAP determines that the version is 1.0.0 by examining the manifest file contained in the JAR. Information about the artifact and its application class is now visible through RESTful API calls:

Code Block
curl "localhost:11015/v3/namespaces/default/artifacts?scope=user"
[
  { "name": "myapp", "scope": "USER", "version": "1.0.0" }
]

curl localhost:11015/v3/namespaces/default/artifacts/myapp/versions/1.0.0
{
  "classes": {
    "apps": [
      {
        "className": "com.company.example.MyApp",
        "configSchema": {
          "fields": [
            { "name": "stream", "type": [ "string", "null" ] },
            { "name": "table", "type": [ "string", "null" ] }
          ],
          "name": "com.company.example.MyApp$MyConfig",
          "type": "record"
        },
        "description": ""
      }
    ],
    "plugins": []
  },
  "name": "myapp",
  "scope": "USER",
  "version": "1.0.0"
}
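
The version 1.0.0 reported above comes from the JAR's manifest. The Java sketch below shows one way to read a version from a manifest using the JDK's java.util.jar classes. It is an illustration, not CDAP's implementation: `ManifestVersionSketch` is a hypothetical class, and the use of the Bundle-Version attribute is an assumption, since the text does not name the attribute that CDAP reads.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper; it does not call any CDAP code.
final class ManifestVersionSketch {
  // Assumption: the version is stored in the Bundle-Version main attribute.
  static String versionOf(Manifest manifest) {
    return manifest == null ? null : manifest.getMainAttributes().getValue("Bundle-Version");
  }

  // Reads the version from manifest text as it would appear in META-INF/MANIFEST.MF.
  static String versionOfText(String manifestText) {
    try {
      byte[] bytes = manifestText.getBytes(StandardCharsets.UTF_8);
      return versionOf(new Manifest(new ByteArrayInputStream(bytes)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Reads the version from a JAR file on disk, for example myapp-1.0.0.jar.
  static String versionOfJar(String path) throws IOException {
    try (JarFile jar = new JarFile(path)) {
      return versionOf(jar.getManifest());
    }
  }
}
```

Both methods return null when the attribute is absent, so a caller can tell whether the version needs to be supplied some other way.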

With this information, a separate deployment team can see that the artifact contains an application class whose config takes values for stream and table. From this, we decide to create an application named purchaseDump that reads from the purchases stream and writes to the events table:

Code Block
curl -X PUT localhost:11015/v3/namespaces/default/apps/purchaseDump -H 'Content-Type: application/json' -d '
{
  "artifact": {
    "name": "myapp",
    "version": "1.0.0",
    "scope": "user"
  },
  "config": {
    "stream": "purchases",
    "table": "events"
  }
}'
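
For callers who prefer Java over curl, the same create-application request can be built with the JDK's java.net.http client (Java 11 or later). This is a sketch under stated assumptions: `CreateAppRequestSketch` is a hypothetical class, and the explicit http:// scheme in the base URL is an assumption, since the curl command above omits it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that mirrors the curl command; not part of CDAP.
final class CreateAppRequestSketch {
  // Builds a PUT request to /v3/namespaces/{namespace}/apps/{app} with a JSON body.
  static HttpRequest build(String baseUrl, String namespace, String app, String json) {
    URI uri = URI.create(baseUrl + "/v3/namespaces/" + namespace + "/apps/" + app);
    return HttpRequest.newBuilder(uri)
        .header("Content-Type", "application/json")
        .PUT(HttpRequest.BodyPublishers.ofString(json))
        .build();
  }
}
```

The resulting request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, passing the same JSON document shown in the curl command as the body.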

We can then manage the lifecycle of the flow using the Application Lifecycle RESTful APIs. After it has been running for a while, a bug is found in the code. The development team provides a fix, and myapp-1.0.1.jar is released. The artifact is deployed:

...

Code Block
curl "localhost:11015/v3/namespaces/default/apps?artifactName=myapp&artifactVersion=1.0.0"
[
  {
    "name": "purchaseDump",
    "artifact": {
      "name": "myapp",
      "version": "1.0.0",
      "scope": "user"
    },
    ...
  }
]

The flow for the purchaseDump application is stopped, then the application is updated:

Code Block
curl localhost:11015/v3/namespaces/default/apps/purchaseDump/update -d '
{
  "artifact": {
    "name": "myapp",
    "version": "1.0.1",
    "scope": "user"
  },
  "config": {
    "stream": "purchases",
    "table": "events"
  }
}'

The flow is started again, which picks up the new code. We quickly realize version 1.0.1 has a serious bug and decide to roll back to the previous version. The flow is stopped and another update call is made:

...