Metrics Microservices
Use the CDAP Metrics Microservices to retrieve the metrics created and saved by CDAP.
As applications process data, CDAP collects metrics about the application’s behavior and performance. Some of these metrics are similar for every application, such as how many events are processed and how many data operations are performed, and are called system or CDAP metrics.
Other metrics are user-defined and differ from application to application.
All methods or endpoints described in this API have a base URL (typically http://<host>:11015 or https://<host>:10443) that precedes the resource identifier, as described in the Microservices Conventions. These methods return a status code, as listed in the Microservices Status Codes.
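As a minimal sketch of how the base URL is composed, the helper below builds it from a host name. The function name `base_url` and the `use_ssl` flag are hypothetical, and the ports are only the typical values given above; an installation may be configured differently.

```python
def base_url(host: str, use_ssl: bool = False) -> str:
    """Return the typical base URL for a CDAP host.

    The ports are the typical values from the text above (an assumption
    for any particular installation).
    """
    scheme, port = ("https", 10443) if use_ssl else ("http", 11015)
    return f"{scheme}://{host}:{port}"


# The resource identifier of an endpoint is appended to this base URL.
plain = base_url("example.com")
secure = base_url("example.com", use_ssl=True)
```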
Metrics Data
Metrics data is identified by a combination of context and name.
A metrics context consists of a collection of tags. Each tag is composed of a tag name and a tag value.
Metrics contexts are hierarchical: they are rooted in the CDAP instance and extend through namespaces and applications down to the individual components.
For example, the metrics context:
namespace:default app:PurchaseHistory spark:PurchaseTracker
is a context that identifies a Spark program. It has a parent context, namespace:default app:PurchaseHistory, which identifies the parent application.
Each level of the context is described by a pair, composed of a tag name and a value, such as:
namespace:default (tag name: namespace, value: default)
app:PurchaseHistory (tag name: app, value: PurchaseHistory)
spark:PurchaseTracker (tag name: spark, value: PurchaseTracker)
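The tag pairs above can be illustrated with a short sketch that splits a context string into ordered (tag name, tag value) pairs. The function `parse_context` is hypothetical and only handles the space-separated `name:value` notation used in this section; it is not part of CDAP.

```python
from typing import List, Tuple


def parse_context(context: str) -> List[Tuple[str, str]]:
    """Split 'name:value name:value ...' into ordered (tag name, tag value) pairs."""
    pairs = []
    for token in context.split():
        name, sep, value = token.partition(":")
        if not sep or not name or not value:
            raise ValueError(f"malformed tag: {token!r}")
        pairs.append((name, value))
    return pairs


tags = parse_context("namespace:default app:PurchaseHistory spark:PurchaseTracker")
for tag_name, tag_value in tags:
    print(f"tag name: {tag_name}, value: {tag_value}")
```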
A metric name is either generated by CDAP and prefixed with system, or set by a developer when writing an application and prefixed with user.
The system metrics vary depending on the context; a list of common system metrics for different contexts is available.
User metrics are defined by the application developer and thus are completely dependent on what the developer sets.
In both cases, searches using this API show, for a given context, all available metrics.
Available Contexts
The context of a metric is typically enclosed in a hierarchy of contexts. For example, the Spark context is enclosed in the application context, which in turn is enclosed in the namespace context. A metric can always be queried (and aggregated) relative to any enclosing context.
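To make the idea of enclosing contexts concrete, the sketch below lists every enclosing context of a given context, from the nearest parent outward. It assumes that tags are written from outermost to innermost, as in the example namespace:default app:PurchaseHistory spark:PurchaseTracker; the function name `enclosing_contexts` is hypothetical.

```python
from typing import List


def enclosing_contexts(context: str) -> List[str]:
    """Return the enclosing contexts of `context`, nearest parent first.

    Assumes tags are ordered from outermost to innermost, so each
    enclosing context is a leading run of the tag list.
    """
    tags = context.split()
    return [" ".join(tags[:i]) for i in range(len(tags) - 1, 0, -1)]


parents = enclosing_contexts(
    "namespace:default app:PurchaseHistory spark:PurchaseTracker"
)
for parent in parents:
    print(parent)
```

A metric emitted by the Spark program could then be aggregated at any of the returned levels, for example across the whole application or the whole namespace.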
System Metric | Context |
---|---|
All Mappers of a MapReduce | |
All Reducers of a MapReduce | |
One Run of a MapReduce | |
One MapReduce | |
All MapReduce of an application | |
One service | |
All services of an application | |
One Spark program | |
All Spark programs of an application | |
One worker | |
All workers of an application | |
All components of an application | |
All components of all applications | |
Dataset metrics are available at the dataset level, but they can also be queried down to the worker, service, Mapper, or Reducer level:
Dataset Metric | Context |
---|---|
A single dataset in the context of a specific application | |
A single dataset | |
All datasets | |
Available System Metrics
Note: A user metric may have the same name as a system metric. They are distinguished by prepending the respective prefix when querying: user or system.
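The sketch below shows how prefixing keeps a user metric and a system metric with the same name apart. The dot separator between prefix and name is an assumption (this section does not show the exact separator), and both the function `qualified_name` and the metric name `records.read` are hypothetical.

```python
VALID_PREFIXES = ("system", "user")


def qualified_name(prefix: str, name: str) -> str:
    """Prepend the 'system' or 'user' prefix to a metric name.

    The '.' separator is an assumption, not confirmed by this section.
    """
    if prefix not in VALID_PREFIXES:
        raise ValueError(f"prefix must be one of {VALID_PREFIXES}, got {prefix!r}")
    return f"{prefix}.{name}"


# The same (hypothetical) metric name yields two distinct qualified names.
system_metric = qualified_name("system", "records.read")
user_metric = qualified_name("user", "records.read")
```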
Dataset Metrics
These metrics are available in a dataset context:
Dataset Metric | Description |
---|---|
| Number of bytes written |
| Operations (reads and writes) performed |
| Read operations performed |
| Write operations performed |
Mappers or Reducers Metrics
These metrics are available in a Mappers or Reducers context (specify whether a Mapper or Reducer context is desired, as shown above):
Mappers or Reducers Metrics | Description |
---|---|
| A number from 0 to 100 indicating the progress of the Map or Reduce phase |
| Number of entries read in by the Map or Reduce phase |
| Number of entries written out by the Map or Reduce phase |
Service Metrics
These metrics are available in a service context:
Service Metrics | Description |
---|---|
| Number of requests made to the service |
| Number of successful requests completed by the service |
| Number of failures seen by the service |
Spark Metrics
These metrics are available in a Spark context, where <spark-id> depends on the Spark program being queried:
Spark Metrics | Description |
---|---|
| Disk space used by the Block Manager |
| Maximum memory given to the Block Manager |
| Memory used by the Block Manager |
| Memory remaining to the Block Manager |
| Number of active jobs |
| Total number of jobs |
| Number of failed stages |
| Number of running stages |
| Number of waiting stages |
Request and Response Metrics
These metrics are available for services, in either the system services component context or the user services context:
Request and Response Metrics | Description |
---|---|
| Number of requests received for the service |
| Number of successful responses sent |
| Number of |
Application Logging Metrics
These metrics are available for every application context:
Application Logging Metrics | Description |
---|---|
| Number of |
System Services Logging Metrics
These logging metrics are available for system services, in the system component context:
System Services Logging Metrics | Description |
---|---|
| Number of |
System Services Metric Processor Metrics
These processing metrics are available for system services, in the system component context:
System Services Metric Processor Metrics | Description |
---|---|
| Number of metrics processed by metric processor instance |
| Metrics processing delay in milliseconds: the difference between the current time and the last metric's timestamp |
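The processing delay described in the table can be sketched as a simple subtraction. This is an illustration of the definition only, not CDAP's implementation; it assumes timestamps are expressed in milliseconds since the Unix epoch, and the function name `processing_delay_ms` is hypothetical.

```python
import time
from typing import Optional


def processing_delay_ms(last_metric_ts_ms: int, now_ms: Optional[int] = None) -> int:
    """Return current time minus the last metric's timestamp, in milliseconds.

    Assumes both values are milliseconds since the Unix epoch.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - last_metric_ts_ms


# With an explicit 'now', the result is deterministic.
delay = processing_delay_ms(last_metric_ts_ms=1_000, now_ms=1_250)
```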
Transaction Metrics
These metrics are available for the CDAP transaction service:
Transaction Metrics |
---|