Operations and Monitoring Guide

  • Logging and Monitoring: CDAP collects logs for all of its internal services and user applications; at the same time, CDAP can be monitored through external systems. Covers log location, logging messages, the logback configuration for system services and user applications, and CDAP support for logging through the standard SLF4J (Simple Logging Facade for Java) APIs and Logback.

  • Metrics: CDAP collects metrics about the application’s behavior and performance.

  • Preferences and Runtime Arguments: Preferences provide the ability to save configuration information. MapReduce and Spark programs, services, workers, and workflows can receive runtime arguments.

  • CDAP UI: The CDAP UI is available for deploying, querying, and managing CDAP.

Command Line Interface

Most administrative operations are also available through the Command Line Interface, which is often more convenient. See the Command Line Interface API for details.

Getting a Health Check

Administrators can check the health of various services in the system. (In these examples, substitute for <host> the host name or IP address of the CDAP server.)

  • To retrieve the health check of the CDAP UI, make a GET request to the URI:

    http://<host>:11011/status
  • To retrieve the health check of the CDAP Router, make a GET request to the URI:

    http://<host>:11015/status
  • To retrieve the health check of the CDAP Authentication Server, make a GET request to the URI:

    http://<host>:10009/status

On success, the calls return a valid HTTP response with a 200 code.
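The three status checks above can be scripted. The following is a minimal Python sketch using only the standard library; the ports come from the URIs listed above, while the function names (`status_url`, `is_healthy`, `check_all`), the dictionary labels, and the five-second timeout are choices made for this illustration and are not part of CDAP.

```python
import urllib.error
import urllib.request

# Ports are taken from the status URIs listed above.
# The keys are display labels chosen for this sketch.
STATUS_PORTS = {
    "CDAP UI": 11011,
    "CDAP Router": 11015,
    "CDAP Authentication Server": 10009,
}


def status_url(host, port):
    """Build the status URI for a given host and port."""
    return f"http://{host}:{port}/status"


def is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if a GET request to url answers with HTTP 200.

    Connection failures, timeouts, and HTTP error codes all count as
    unhealthy. The opener parameter exists only so the function can be
    exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def check_all(host, opener=urllib.request.urlopen):
    """Check every service in STATUS_PORTS and map its label to True/False."""
    return {
        name: is_healthy(status_url(host, port), opener=opener)
        for name, port in STATUS_PORTS.items()
    }
```

Calling `check_all("<host>")` with the host name or IP address of the CDAP server returns a dictionary mapping each label to `True` or `False`. If the Authentication Server is not enabled in a given installation, its entry would be expected to report `False`.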

  • To retrieve the health check of all the services running in YARN, make a GET request to the URI:

    http://<host>:11015/v3/system/services

    On success, the call returns a JSON string with component names and their corresponding statuses (reformatted to fit):

    [{"name":"appfabric","description":"Service for managing application lifecycle.","status":"OK","logs":"OK","min":1,"max":1,"requested":1,"provisioned":1},
     {"name":"dataset.executor","description":"Service to perform dataset operations.","status":"OK","logs":"OK","min":1,"max":1,"requested":1,"provisioned":1},
     {"name":"explore.service","description":"Service to run ad-hoc queries.","status":"OK","logs":"OK","min":1,"max":1,"requested":1,"provisioned":1},
     {"name":"log.saver","description":"Service to collect and store logs.","status":"OK","logs":"NOTOK","min":1,"max":1,"requested":1,"provisioned":1},
     {"name":"metrics","description":"Service to handle metrics requests.","status":"OK","logs":"OK","min":1,"max":1,"requested":1,"provisioned":1},
     {"name":"metrics.processor","description":"Service to process application and system metrics.","status":"OK","logs":"NOTOK","min":1,"max":1,"requested":1,"provisioned":1},
     ingestion.","status":"OK","logs":"OK","min":1,"max":1,"requested":1,"provisioned":1},
     {"name":"transaction","description":"Service that maintains transaction states.","status":"OK","logs":"NOTOK","min":1,"max":1,"requested":1,"provisioned":1}]
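A response of this shape can be scanned for services that need attention. The sketch below is a minimal Python example that parses the JSON string and reports every service whose "status" or "logs" field is anything other than "OK". Treating every non-"OK" value as a problem is an assumption of this sketch (the sample output shows only "OK" and "NOTOK"), and the function name `unhealthy_services` is hypothetical. The sample payload reuses two entries from the output shown above.

```python
import json


def unhealthy_services(payload):
    """Return (name, field, value) for each non-"OK" status or logs field.

    payload is the JSON string returned by the /v3/system/services call.
    Assumption: any value other than "OK" indicates a problem.
    """
    problems = []
    for service in json.loads(payload):
        for field in ("status", "logs"):
            value = service.get(field)
            if value != "OK":
                problems.append((service.get("name"), field, value))
    return problems


# Two entries copied from the sample response above.
SAMPLE = (
    '[{"name":"appfabric","description":"Service for managing application lifecycle.",'
    '"status":"OK","logs":"OK","min":1,"max":1,"requested":1,"provisioned":1},'
    ' {"name":"log.saver","description":"Service to collect and store logs.",'
    '"status":"OK","logs":"NOTOK","min":1,"max":1,"requested":1,"provisioned":1}]'
)

print(unhealthy_services(SAMPLE))  # prints [('log.saver', 'logs', 'NOTOK')]
```

In the full sample response, this check would flag the "logs" field of log.saver, metrics.processor, and transaction, since those are the entries reporting "NOTOK".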