...

Applications are created using an Artifact and optional configuration. An Artifact is a JAR file that packages the Java Application class, which defines how the Programs, Services, Schedules, Streams, and Datasets interact. It also packages any dependent classes and libraries needed to run the Application.

...

Code Block
public class MyApp extends AbstractApplication {
  @Override
  public void configure() {
    setName("myApp");
    setDescription("My Sample Application");
    addStream(new Stream("myAppStream"));
    createDataset("myAppDataset", Table.class);

    addFlow(new MyAppFlow());
    addService(new MyService());
    addMapReduce(new MyMapReduce());
    addWorkflow(new MyAppWorkflow());
  }
}

Notice that Streams are defined using the provided Stream class, and Datasets are defined by passing a Table class; both are referenced by name.

Other components are defined using user-written classes that implement the corresponding interfaces. They are referenced by passing an object, in addition to being assigned a unique name.

Names used for streams and datasets need to be unique across the CDAP namespace, while names used for programs and services need to be unique only to the application.

...

A typical design of a CDAP application class consists of:

  • Streams to ingest data into CDAP;

  • Flows, consisting of Flowlets linked together, to process the ingested data in real time or batch;

  • MapReduce programs, Spark programs, and Workflows for batch processing tasks;

  • Workers for processing data in an ad-hoc manner that doesn't fit into real-time or batch paradigms;

  • Datasets for storage of data, either raw or the processed results; and

  • Services for serving data and processed results.

Of course, not all components are required: it depends on the application. A minimal application could include a stream, a flow, a flowlet, a workflow, and a dataset. A stream may not be needed if other methods of bringing in data are used. In the next pages, we'll look at these components and their interactions.

...

Application classes can use a Config class to receive a configuration when an Application is created. For example, configuration can be used to specify, at application creation time, a stream to be created or a dataset to be read, rather than having them hard-coded in the AbstractApplication's configure method. The configuration class needs to be the type parameter of the AbstractApplication class, and it should extend the Config class present in the CDAP API. The configuration is provided as part of the request body to create an application. It is available at configuration time through the getConfig() method in AbstractApplication.

...

Code Block
public class MyApp extends AbstractApplication<MyApp.MyAppConfig> {

  public static class MyAppConfig extends Config {
    String streamName;
    String datasetName;

    public MyAppConfig() {
      // Default values
      this.streamName = "myAppStream";
      this.datasetName = "myAppDataset";
    }
  }

  @Override
  public void configure() {
    MyAppConfig config = getConfig();
    setName("myApp");
    setDescription("My Sample Application");
    addStream(new Stream(config.streamName));
    createDataset(config.datasetName, Table.class);
    addFlow(new MyAppFlow(config));
    addService(new MyService(config.datasetName));
    addMapReduce(new MyMapReduce(config.datasetName));
    addWorkflow(new MyAppWorkflow());
  }
}
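
Because the configuration travels in the request body used to create the application, it can help to see what such a body might look like for MyAppConfig. The sketch below is plain Java with no CDAP dependency and only assembles a JSON string. It assumes the body carries an "artifact" object (name, version, scope) next to a "config" object whose keys match the field names of MyAppConfig; that layout, the class name CreateAppRequestSketch, and the sample values are illustrative assumptions, so check the CDAP RESTful API reference for the exact format your version expects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper, not part of the CDAP API: builds a create-application request body. */
class CreateAppRequestSketch {

  // Renders a flat map of string fields as a JSON object.
  // Assumption: keys and values contain no characters that need JSON escaping.
  static String toJsonObject(Map<String, String> fields) {
    StringBuilder sb = new StringBuilder("{");
    boolean first = true;
    for (Map.Entry<String, String> e : fields.entrySet()) {
      if (!first) {
        sb.append(",");
      }
      sb.append('"').append(e.getKey()).append("\":\"").append(e.getValue()).append('"');
      first = false;
    }
    return sb.append("}").toString();
  }

  // Combines the artifact coordinates and the application config into one body.
  // Assumption: the top-level keys are "artifact" and "config".
  static String buildRequestBody(String artifactName, String artifactVersion,
                                 String artifactScope, Map<String, String> config) {
    Map<String, String> artifact = new LinkedHashMap<>();
    artifact.put("name", artifactName);
    artifact.put("version", artifactVersion);
    artifact.put("scope", artifactScope);
    return "{\"artifact\":" + toJsonObject(artifact)
        + ",\"config\":" + toJsonObject(config) + "}";
  }
}
```

Calling buildRequestBody with a config map containing streamName and datasetName yields a body whose "config" object would be handed to the application as a MyAppConfig instance. Any field left out of the config would be expected to keep the default assigned in the MyAppConfig constructor, assuming the configuration is deserialized on top of a default-constructed object.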

...