Services (Developer)

Services can be run in a CDAP application to serve data to external clients. Services run in containers, and the number of running service instances can be scaled dynamically. Developers can implement custom services to interface with a legacy system and to perform additional processing beyond the CDAP processing paradigms. Examples include running an IP-to-geo lookup and serving user profiles.

The lifecycle of a custom service can be controlled via the CDAP UI, by using the CDAP Java Client API, or with the CDAP Microservices.

You can add services to your application by calling the addService method in the application's configure method:

public class AnalyticsApp extends AbstractApplication {
  @Override
  public void configure() {
    setName("AnalyticsApp");
    setDescription("Application for generating mobile analytics");
    ...
    addService(new IPGeoLookupService());
    addService(new UserLookupService());
    ...
  }
}

Services are implemented by extending AbstractService, which consists of HttpServiceHandlers to serve requests:

public class IPGeoLookupService extends AbstractService {
  @Override
  protected void configure() {
    setName("IpGeoLookupService");
    setDescription("Service to lookup locations of IP addresses.");
    useDataset("IPGeoTable");
    addHandler(new IPGeoLookupHandler());
  }
}

Service Handlers

ServiceHandlers are used to handle and serve HTTP requests.

You add handlers to your service by calling the addHandler method in the service's configure method, as shown above. Only handler classes that are declared public, with public methods for endpoints, will be exposed by the service.

To use a dataset within a handler, either include the @UseDataSet annotation in the handler, or call the getDataset() method in the handler to obtain an instance of the dataset dynamically (see Service Microservices). Each request to a method is committed as a single transaction.

public class IPGeoLookupHandler extends AbstractHttpServiceHandler {
  @UseDataSet("IPGeoTable")
  Table table;

  @Path("lookup/{ip}")
  @GET
  public void lookup(HttpServiceRequest request, HttpServiceResponder responder,
                     @PathParam("ip") String ip) {
    // ...
    responder.sendString(200, location, Charsets.UTF_8);
  }
}

Path and Query Parameters

Handler endpoints can have Path and Query parameters. Path parameters are used to assist with path-mapping of requests, while Query parameters are used to easily parse the query string of a request.

For example, the WordCount application has a Service that exposes an endpoint to retrieve the count of a word and its word associations. In the @Path annotation, {word} is a path parameter that is mapped to a Java String using @PathParam("word") String word. Similarly, the endpoint also allows the query parameter limit with a default value of 10.
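As a rough illustration of what a query parameter with a default value amounts to, the following plain-Java sketch parses a raw query string and falls back to 10 when limit is absent. It uses only the JDK. The class QueryParamSketch and its methods are hypothetical names, not part of the CDAP API; a real handler would declare the parameter on the endpoint method rather than parse the query string by hand.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not a CDAP class.
final class QueryParamSketch {
  static final int DEFAULT_LIMIT = 10;

  // Splits "a=1&b=2" into decoded name/value pairs.
  static Map<String, String> parseQuery(String rawQuery) {
    Map<String, String> params = new LinkedHashMap<>();
    if (rawQuery == null || rawQuery.isEmpty()) {
      return params;
    }
    for (String pair : rawQuery.split("&")) {
      if (pair.isEmpty()) {
        continue;
      }
      int eq = pair.indexOf('=');
      String name = eq < 0 ? pair : pair.substring(0, eq);
      String value = eq < 0 ? "" : pair.substring(eq + 1);
      params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                 URLDecoder.decode(value, StandardCharsets.UTF_8));
    }
    return params;
  }

  // Returns the "limit" query parameter, or the default when it is missing or empty.
  static int limit(String rawQuery) {
    String value = parseQuery(rawQuery).get("limit");
    return (value == null || value.isEmpty()) ? DEFAULT_LIMIT : Integer.parseInt(value);
  }

  public static void main(String[] args) {
    System.out.println(limit(null));       // prints 10
    System.out.println(limit("limit=5"));  // prints 5
  }
}
```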

An example of calling this endpoint through the Microservices is shown in Service Microservices.

Note: Any reserved or unsafe characters in the path parameters should be encoded using percent-encoding. See “About Path Parameters” below.

Handling a Large Request Body

Sometimes the request body of a PUT or POST request is too large to keep entirely in memory. In that case, the handler method can return an HttpContentConsumer instead of void, so that the request body is processed in smaller pieces.

For example, the SportResults application has an UploadService that exposes an endpoint for uploading files to PartitionedFileSets. It returns an HttpContentConsumer so that it receives the request body in a series of small chunks:
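The idea of consuming a body chunk by chunk can be sketched in plain Java as follows. This is not the SportResults code and does not use the CDAP API: ChunkConsumer is a hypothetical stand-in for a content consumer, and the input stream stands in for the request body. The point is only that each chunk is handled and then discarded, so memory use is bounded by the chunk size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustration only; all names here are hypothetical, not CDAP classes.
final class ChunkedUploadSketch {

  // Stand-in for a consumer that is handed the body piece by piece.
  interface ChunkConsumer {
    void onReceived(byte[] buffer, int length);
    long onFinish();
  }

  // Keeps only running totals, never the whole body.
  static final class ByteCounter implements ChunkConsumer {
    private long total;
    private int chunks;

    @Override
    public void onReceived(byte[] buffer, int length) {
      total += length;
      chunks++;
    }

    @Override
    public long onFinish() {
      return total;
    }

    int chunks() {
      return chunks;
    }
  }

  // Reads the body in pieces of at most chunkSize bytes and feeds each to the consumer.
  static long consume(InputStream body, int chunkSize, ChunkConsumer consumer) {
    byte[] buffer = new byte[chunkSize];
    try {
      int read;
      while ((read = body.read(buffer)) != -1) {
        if (read > 0) {
          consumer.onReceived(buffer, read);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return consumer.onFinish();
  }

  public static void main(String[] args) {
    ByteCounter counter = new ByteCounter();
    long total = consume(new ByteArrayInputStream(new byte[10000]), 1024, counter);
    System.out.println(total + " bytes in " + counter.chunks() + " chunks"); // prints "10000 bytes in 10 chunks"
  }
}
```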

About Path Parameters

The value of a path parameter cannot contain any characters that have a special meaning in URI syntax. If a request has a path parameter that contains such a character, it must be URL-encoded using the "%hh" notation, a percent-symbol followed by two hex characters.

In general, any character that is not a letter, a digit, or one of $-_.+!*'() should be encoded.

However, if the special character is a forward-slash (/), then it will appear to the path matcher as a "/", even if it is escaped as "%2f". This occurs because the path is decoded prior to matching.

There are two ways to work around this:

  • Double-escape any forward-slashes (/) as "%252f". When the path is decoded before matching, "%252f" becomes "%2f" rather than "/", so the path still matches as intended. However, the path parameter's value will then contain "%2f" instead of "/", and the application code must decode the parameter itself to obtain the actual value.

  • Use a query parameter instead. This is the better solution because "/" is allowed to appear unencoded in the query component of a URI.
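The encoding rule and the double-escape workaround described above can be sketched on the client side with the JDK alone. The class and method names are hypothetical. Note two simplifications: URLEncoder also encodes some characters that the list above treats as safe (which is harmless), and the sketch double-encodes the whole value rather than only the forward-slashes, so that a single decode in application code recovers the original string.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustration only; hypothetical helper, not part of CDAP.
final class PathParamEncodingSketch {

  // Percent-encodes a value for use as one path segment.
  // URLEncoder targets form encoding, so "+" (its encoding of a space) is rewritten as "%20".
  static String encodeSegment(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
  }

  // Encodes twice, so "/" becomes "%252F" and survives one round of decoding as "%2F".
  static String doubleEncode(String value) {
    return encodeSegment(encodeSegment(value));
  }

  // One round of percent-decoding, as the server does before path matching
  // and as application code must do again for a double-encoded value.
  static String decodeOnce(String value) {
    return URLDecoder.decode(value, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(encodeSegment("a/b c"));                 // prints a%2Fb%20c
    System.out.println(doubleEncode("a/b"));                    // prints a%252Fb
    System.out.println(decodeOnce(doubleEncode("a/b")));        // prints a%2Fb
    System.out.println(decodeOnce(decodeOnce(doubleEncode("a/b")))); // prints a/b
  }
}
```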

Service Discovery

Services announce the host and port they are running on so that they can be discovered—and accessed—by other programs.

Services are announced using the name passed in the configure method. The application name, service ID, and hostname required for registering the service are obtained automatically.

The service can then be discovered from a MapReduce program, a Spark program, a Worker, or another service by using the appropriate program context. You may also access a service in a different application by specifying the application name in the getServiceURL call.

For example, in Workers:
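A minimal sketch of what a program typically does with a discovered base URL: resolve an endpoint path against it. The sketch uses only the JDK. The base URL shown is a made-up placeholder for the value that getServiceURL would return, and it is assumed to end in a slash; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustration only; hypothetical helper, not part of CDAP.
final class ServiceUrlSketch {

  // Builds the full endpoint URI from a discovered base, a relative path prefix
  // such as "lookup/", and a path parameter that is percent-encoded first.
  static URI endpoint(URI base, String relativePrefix, String pathParam) {
    String encoded = URLEncoder.encode(pathParam, StandardCharsets.UTF_8).replace("+", "%20");
    return base.resolve(relativePrefix + encoded);
  }

  public static void main(String[] args) {
    // Placeholder for the URL a program context would hand back.
    URI base = URI.create("http://localhost:11015/base/");
    System.out.println(endpoint(base, "lookup/", "10.0.0.1"));
    // prints http://localhost:11015/base/lookup/10.0.0.1
  }
}
```

The trailing slash on the base matters for relative resolution: without it, the last path segment of the base is replaced rather than extended.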

Services and Resources

When a service is configured, the resource requirements for the server that runs all handlers of the service can be set, both in terms of the amount of memory (in megabytes) and the number of virtual cores assigned.

If both the memory and the number of cores need to be set, this can be done using:
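A minimal configuration fragment, assuming the CDAP Resources class takes the memory in megabytes followed by the number of virtual cores, and that setResources is called from the service's configure method. The values 1024 and 2 are placeholders chosen for illustration, not recommendations.

```java
// Fragment for the service's configure() method; not a complete class.
// Assumption: Resources(memoryMB, virtualCores). The values are illustrative only.
setResources(new Resources(1024, 2));
```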

The resource requirements can also be altered through runtime arguments, as explained in Configuring Resources.

Service Thread Model

An HTTP server is started for each service instance; by default, it starts 60 threads to handle client requests. Each thread is tied to one active client request and has its own instance of each HttpServiceHandler, which guarantees that there are no concurrent calls to any single HttpServiceHandler object instance. Also by default, a thread that has been idle for more than 60 seconds is terminated automatically, and the HttpServiceHandler.destroy method is called to release resources.

Both the number of service threads and the thread keep-alive time can be altered by these runtime arguments:

  • system.service.threads: Number of threads to use in the HTTP server

  • system.service.thread.keepalive.secs: Number of seconds a thread can sit idle before getting terminated
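To make the two settings concrete, here is a plain-Java sketch that reads them from a map of runtime arguments, falls back to the defaults of 60 threads and 60 seconds, and builds a thread pool whose idle threads time out. It illustrates the semantics only and is not how CDAP implements its HTTP server; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustration only; hypothetical helper, not part of CDAP.
final class ServiceThreadSketch {
  static final String THREADS_KEY = "system.service.threads";
  static final String KEEPALIVE_KEY = "system.service.thread.keepalive.secs";
  static final int DEFAULT_THREADS = 60;
  static final int DEFAULT_KEEPALIVE_SECS = 60;

  // Reads an integer runtime argument, using the default when the key is absent.
  static int intArg(Map<String, String> runtimeArgs, String key, int defaultValue) {
    String value = runtimeArgs.get(key);
    return value == null ? defaultValue : Integer.parseInt(value.trim());
  }

  // Builds a pool with a fixed upper bound on threads; idle threads are
  // terminated once the keep-alive time has elapsed.
  static ThreadPoolExecutor newPool(Map<String, String> runtimeArgs) {
    int threads = intArg(runtimeArgs, THREADS_KEY, DEFAULT_THREADS);
    int keepAliveSecs = intArg(runtimeArgs, KEEPALIVE_KEY, DEFAULT_KEEPALIVE_SECS);
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        threads, threads, keepAliveSecs, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  public static void main(String[] args) {
    ThreadPoolExecutor pool = newPool(Map.of(THREADS_KEY, "8"));
    System.out.println(pool.getMaximumPoolSize());                 // prints 8
    System.out.println(pool.getKeepAliveTime(TimeUnit.SECONDS));   // prints 60
    pool.shutdown();
  }
}
```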

Created in 2020 by Google Inc.