Logging and Monitoring
CDAP collects logs and metrics for all of its internal services and user applications. Being able to view these details can be very helpful in debugging CDAP applications as well as analyzing their performance. CDAP gives access to its logs, metrics, and other monitoring information through its Microservices, the CDAP UI, and a Java Client.
In Hadoop clusters, the programs of an application run inside YARN containers, and each container generates its own log files. Because an application can consist of multiple programs distributed across the nodes of a cluster, the complete logs for the application may similarly be distributed. As these files generally do not persist beyond the life of the container, they are volatile and not very useful for post-mortem diagnostics and troubleshooting, or for analyzing performance.
To address these issues, the CDAP log framework was designed:
To centralize the location of logs, so that the logs of the individual containers of a program can be merged into one;
To make logs both persistent (available for later use and analysis) and available while the program is still running;
To be extensible using custom log pipelines; and
To allow fine-tuning of the logging behavior at the level of an individual application as well as the entire cluster.
Logging Example
This diagram shows the steps CDAP follows when logging a program of an application:
Logs are collected from an individual program running in a YARN container.
YARN writes the log messages emitted by containers to files inside the container.
In addition, CDAP programs publish these messages to Kafka.
The CDAP Log Saver Service is configured to read log messages from Kafka. Log saver reads the messages from Kafka, groups them by program or application, buffers and sorts them in memory, and finally persists them to files in HDFS. Each of these files corresponds to one program or application, depending on how the grouping is configured. (This is set by the property log.publish.partition.key, described below.)
In addition to persisting logs to files, the Log Saver Service also emits metrics about the number of log messages emitted by each program. These metrics can be retrieved by querying the CDAP metrics system.
For security, the files written out to persistent storage in HDFS have permissions set such that they are accessible only by the cdap user.
CDAP Logging Example: From a YARN container, through Kafka and the CDAP Log Saver Service, to HDFS.
Logging uses the standard SLF4J (Simple Logging Facade for Java) APIs and Logback. Logging is configured using instances of Logback's "logback" file, consisting of log pipelines with log appenders:
A log pipeline is a process that consumes log events from Kafka, buffers them, groups them by application or program, sorts them, and then invokes the log appenders defined in its configuration.
A log appender (or appender) is a Java class responsible for consuming and processing messages; typically, this includes persisting the log events. It can also, for example, collect metrics, maintain metadata about the storage, or emit alerts when it finds certain messages.
User Application Program Logs
Emitting Log Messages from a Program
CDAP supports application logging through the standard SLF4J (Simple Logging Facade for Java) APIs.
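For example, a program class can obtain a logger through the SLF4J API and emit messages at different levels. The class below is a minimal illustration (the class name and logic are hypothetical, not part of the CDAP API); in a real CDAP program, the emitted messages are collected by the log framework described above:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OrderProcessor {
  // SLF4J logger, named after the class as is conventional.
  private static final Logger LOG = LoggerFactory.getLogger(OrderProcessor.class);

  // Processes one order ID, logging at an appropriate level at each step.
  public String process(String orderId) {
    LOG.debug("Starting to process order {}", orderId);
    if (orderId == null || orderId.isEmpty()) {
      LOG.warn("Received an empty order ID; skipping");
      return "skipped";
    }
    LOG.info("Processed order {}", orderId);
    return "processed:" + orderId;
  }
}
```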
Retrieving Log Messages from a Program
The log messages emitted by your application code can be retrieved by:
Using the CDAP Microservices v3: the Logging Microservices details the available contexts that can be called to retrieve different messages.
Using the CDAP UI: log messages of a program can be viewed there. For more information about viewing logs, see Viewing Logs in the Pipeline Studio.
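As a sketch, a logs request to the Microservices can be composed as below. The path pattern and the start/stop query parameters follow the Logging Microservices documentation; the host, port, and program names are placeholders to adapt to your installation:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class LogsRequestExample {
  // Builds (but does not send) a GET request for the logs of a service program.
  // start and stop are timestamps in seconds, bounding the returned log events.
  static HttpRequest logsRequest(String host, int port, String namespace, String app,
                                 String service, long start, long stop) {
    String url = String.format(
        "http://%s:%d/v3/namespaces/%s/apps/%s/services/%s/logs?start=%d&stop=%d",
        host, port, namespace, app, service, start, stop);
    return HttpRequest.newBuilder().uri(URI.create(url)).GET().build();
  }
}
```

Sending the request with java.net.http.HttpClient returns the matching log events as text.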
Program Log File Locations
Program logs are stored in locations specified by properties in the cdap-site.xml file, depending on the mode of CDAP (Sandbox or Distributed):
For CDAP Sandbox: the property log.collection.root (default ${local.data.dir}/logs) is the root location for collecting program logs.
For Distributed CDAP: the property hdfs.namespace (default /cdap) is the base directory in HDFS; program logs are stored in ${hdfs.namespace}/logs (by default, /cdap/logs).
Configuring Program Logs and Log Levels
The logging of an application's programs is configured by the logback-container.xml file, packaged with the CDAP distribution. This "logback" rotates logs once a day at midnight and expires logs older than 14 days. Changes can be made to logback-container.xml; afterwards, applications or programs need to be restarted for the modified logback file to take effect. Changing logback-container.xml will only affect programs that are started after the change; programs already running will not be affected.
For CDAP Sandbox: as the entire CDAP Sandbox runs in a single JVM, the logback.xml file, located in <CDAP-HOME>/conf, configures both "container" and CDAP system service logging.
For Distributed CDAP: the logback-container.xml file is located in /etc/cdap/conf.
You can also use a custom "logback" file with your application, as described in the Developer Manual section Application Logback.
Changing Program Log Levels
When running under Distributed CDAP, the log levels of a program can be changed without modifying the logback.xml or logback-container.xml files. This can be done, for all program types, before a particular run or, in the case of a service or worker, while it is running.
The CDAP Logging Microservices can be used to set the log levels of a service or worker while it is running. Once changed, they can be reset back to what they were originally by using the reset endpoint.
Only the log levels of services or workers can be changed dynamically; other program types are currently not supported. For other program types, log levels can only be changed using their preferences before the program starts.
To configure the log level before an application starts, you can add the logger name as the key and the log level as the value in the preferences, using the CDAP UI, CDAP CLI, or other command line tools. The logger name should be prefixed with system.log.level.
For example, to set the log level of a class named MyService in the package my.sample.package to DEBUG, you would use system.log.level.my.sample.package.MyService as the key and DEBUG as the value. This can be applied to any package or class. If the logger name is omitted, the log level of ROOT is changed.
To configure the log level of a program dynamically, such as a service or worker that is currently running, see the Logging Microservices.
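The key construction described above can be sketched with a small helper (illustrative only, not a CDAP API):

```java
public class LogLevelPreferences {
  // Builds the preference key for setting a logger's level before a program starts.
  // Per the rule above, an empty logger name targets the ROOT logger.
  static String preferenceKey(String loggerName) {
    return loggerName.isEmpty() ? "system.log.level" : "system.log.level." + loggerName;
  }
}
```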
Note: The Logging Microservices for changing program log levels can only be used with programs that are running under Distributed CDAP. To change the log levels of programs run under CDAP Sandbox, you either modify the logback.xml file, or you provide a "logback.xml" with your application before it is deployed.
CDAP System Services Logs
As CDAP system services run either on cluster edge nodes or in YARN containers, their logging and its configuration depend on the service and where it is located.
Retrieving Log Messages from a System Service
The log messages emitted by CDAP system services can be retrieved by:
Using the CDAP Microservices v3: the Logging Microservices details downloading the logs emitted by a system service.
Using the CDAP UI: log messages of system services can be viewed on the Administration page.
System Service Log File Locations
The location of CDAP system service logs depends on the mode of CDAP (Sandbox or Distributed) and the Hadoop distribution:
For CDAP Sandbox: system logs are located in <CDAP-HOME>/logs.
For Distributed CDAP: system logs are located in /var/log/cdap (with the exception of Cloudera Manager-based clusters). With Cloudera Manager installations, system log files are located in directories under /var/run/cloudera-scm-agent/process.
Configuring System Service Logs
CDAP system services that run in YARN containers, such as the Metrics Service, are configured by the same logback-container.xml that configures user application program logging.
CDAP system services that run on cluster edge nodes, such as CDAP Master or Router, are configured by the logback.xml file. Changes can be made to logback.xml; afterwards, the affected service(s) will need to be restarted for the modified "logback" file to take effect.
For CDAP Sandbox: the logback.xml file, located in <CDAP-HOME>/conf, configures both "container" and CDAP system service logging.
For Distributed CDAP: the logback.xml file is located in /etc/cdap/conf.
Changing System Service Log Levels
When running under Distributed CDAP, the log levels of system services can be changed at runtime, without modifying the logback.xml file or restarting CDAP.
The CDAP Logging Microservices can be used to set the log levels of a system service while it is running. Once changed, they can be reset back to what they were originally by using the reset endpoint.
The Microservices endpoints can be applied to all system services listed at Logging Microservices. However, since appfabric and dataset.service run on the same node, changing the log levels of the appfabric service will also change the log levels of the dataset.service.
Note: The Logging Microservices for changing system service log levels can only be used with system services that are running under Distributed CDAP. To change the log levels of system services under CDAP Sandbox, you need to modify the logback.xml file and restart CDAP.
Configuring the Log Saver Service
The Log Saver Service is the CDAP service that reads log messages from Kafka, processes them in log pipelines, persists them to HDFS, and sends metrics on logging to the Metrics Service.
In addition to the default CDAP Log Pipeline, you can specify custom log pipelines that are run by the Log Saver Service and perform custom actions.
The cdap-site.xml file has properties that control the writing of logs to Kafka, the Log Saver Service, the CDAP log pipeline, and any custom log pipelines that have been defined.
Writing Logs to Kafka
These properties control the writing of logs to Kafka:
Parameter Name | Default Value | Description |
---|---|---|
log.kafka.topic | logs.user-v2 | Kafka topic name used to publish logs |
log.publish.num.partitions | 10 | Number of CDAP Kafka service partitions to publish the logs to |
log.publish.partition.key | program | Publish logs from an application or a program to the same partition. Valid values are "application" or "program". If set to "application", logs from all the programs of an application go to the same partition. If set to "program", logs from the same program go to the same partition. Changes to this property require restarting all CDAP applications. |
Notes:
If an external Kafka service is used (instead of the CDAP Kafka service), the number of partitions used for log.publish.num.partitions must match the number set in the external service for the topic being used to publish logs (log.kafka.topic).
By default, log.publish.partition.key is set to program, which means that all logs for the same program go to the same partition. Set this to application if you want all logs from an application to go to the same instance of the Log Saver Service.
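The effect of the two settings can be pictured with a short sketch. This is an illustrative model of the grouping behavior, not CDAP's actual partitioning code:

```java
public class PartitionKeySketch {
  // Chooses a Kafka partition for a log message. With mode "application", all
  // programs of an application share one grouping key (and thus one partition);
  // with mode "program", each program gets its own key.
  static int partitionFor(String mode, String namespace, String app, String program,
                          int numPartitions) {
    String key = mode.equals("application")
        ? namespace + ":" + app
        : namespace + ":" + app + ":" + program;
    return Math.floorMod(key.hashCode(), numPartitions);
  }
}
```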
Log Saver Service
These properties control the Log Saver Service:
Parameter Name | Default Value | Description |
---|---|---|
 | | Maximum number of log saver instances to run in YARN |
 | | Number of log saver instances to run in YARN |
 | | Memory in megabytes for each log saver instance to run in YARN |
 | | Number of virtual cores for each log saver instance in YARN |
Log saver instances should be from a minimum of one to a maximum of ten. The maximum is set by the number of Kafka partitions (log.publish.num.partitions), which by default is 10.
Log Pipeline Configuration
Configuration properties for logging and custom log pipelines are shown in the documentation of the logging properties section of the cdap-site.xml file.
The CDAP log pipeline is configured by settings in the cdap-site.xml file.
Custom log pipelines are configured by a combination of the settings in the cdap-site.xml file and a "logback" file used to specify the custom pipeline. The XML file is placed in the log.process.pipeline.config.dir, a local directory on the CDAP Master node that is scanned for log processing pipeline configurations. Each pipeline is defined by a file in the Logback XML format, with .xml as the file name extension.
These properties control the CDAP log pipeline:
Parameter Name | Default Value | Description |
---|---|---|
 | | Permissions used by the system log pipeline when creating directories |
 | | Batch size to clean up log metadata table |
log.pipeline.cdap.file.max.lifetime.ms | | Maximum time span in milliseconds of a log file created by the system log pipeline |
log.pipeline.cdap.file.max.size.bytes | | Maximum size in bytes of a log file created by the system log pipeline |
 | | Permissions used by the system log pipeline when creating files |
 | | Time in days a log file is retained |
These properties control both the CDAP log pipeline and custom log pipelines:
Parameter Name | Default Value | Description |
---|---|---|
 | | The time between log processing pipeline checkpoints in milliseconds |
log.process.pipeline.config.dir | | A local directory on the CDAP Master that is scanned for log processing pipeline configurations. Each pipeline is defined by a file in the Logback XML format, with ".xml" as the file name extension. |
 | | The time in milliseconds a log event stays in the log processing pipeline buffer before being written out to log appenders. A longer delay results in better time ordering of log events before they are presented to log appenders, but consumes more memory. |
 | | The buffer size in bytes, per topic partition, for fetching log events from Kafka |
 | | Comma-separated list of local directories on the CDAP Master scanned for additional library JAR files to be included for log processing |
The log.process.pipeline.* properties can be overridden at the custom pipeline level by providing a value in a pipeline's "logback" file for any of these properties.
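The event-delay buffering described in the table above can be pictured with a short sketch: events are held in a timestamp-ordered buffer and released to the appenders only once they are older than the configured delay, which restores time ordering across producers. This is illustrative code, not CDAP's implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class EventDelayBuffer {
  // A minimal log event for the sketch: a timestamp and a message.
  public static final class Event {
    final long timestampMs;
    final String message;
    Event(long timestampMs, String message) {
      this.timestampMs = timestampMs;
      this.message = message;
    }
  }

  private final long delayMs;
  // Buffer ordered by timestamp, so events are released oldest-first.
  private final PriorityQueue<Event> buffer =
      new PriorityQueue<>((a, b) -> Long.compare(a.timestampMs, b.timestampMs));

  public EventDelayBuffer(long delayMs) {
    this.delayMs = delayMs;
  }

  public void offer(Event e) {
    buffer.add(e);
  }

  // Releases, in timestamp order, every buffered event older than the delay.
  public List<Event> drainReady(long nowMs) {
    List<Event> ready = new ArrayList<>();
    while (!buffer.isEmpty() && buffer.peek().timestampMs <= nowMs - delayMs) {
      ready.add(buffer.poll());
    }
    return ready;
  }
}
```

A longer delay holds more events in the queue before release, which is exactly the memory-for-ordering trade-off the property description notes.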
Logging Framework
This diagram shows in greater detail the components and steps CDAP follows when logging programs of an application and system services with the logging framework:
CDAP Logging Framework: From YARN containers, through Kafka and the Log Saver Service, to HDFS.
Logs are collected from individual programs running in YARN containers.
YARN writes the log messages emitted by containers to files inside the container.
In addition, CDAP programs publish these messages to Kafka.
CDAP System Services run (depending on the service) either on cluster edge nodes or in YARN containers. Where they run determines the file that configures that service's logging.
The Log Saver Service (log.saver) is configured to read log messages from the logs.user-v2 Kafka topic (set by the property log.kafka.topic). The number of log saver instances can be scaled to process the Kafka partitions in parallel, if needed.
Log saver, by default, runs only the CDAP Log Pipeline: it reads the messages from Kafka, groups them by program or application, buffers and sorts them in memory, and finally persists them to files in HDFS. Each of these files corresponds to one program or application, depending on how the grouping is configured. (This is set by the property log.publish.partition.key, described above.)
Note: These files are configured to rotate based on time and size; the settings can be changed using the properties log.pipeline.cdap.file.max.size.bytes and log.pipeline.cdap.file.max.lifetime.ms in the cdap-site.xml file, as described in "Log Pipeline Configuration".
For security, the files written out to persistent storage in HDFS have permissions set such that they are accessible only by the cdap user.
In addition, custom log pipelines can be configured by adding an XML file in a prescribed location. Each pipeline buffers log messages in memory and sorts them based on their timestamp.
In addition to persisting logs to files, the log saver also emits metrics about the number of log messages emitted by each program. These metrics can be retrieved by querying the CDAP metrics system.
These tables list the metrics from the section Available System Metrics of the Metrics Microservices. See that section for further information.
Application Logging Metric | Description |
---|---|
system.app.log.{error, info, warn} | Number of error, info, or warn log messages logged by an application or applications |

System Services Logging Metric | Description |
---|---|
system.services.log.{error, info, warn} | Number of error, info, or warn log messages logged by a system service or system services |
Custom Log Pipelines
For a custom log pipeline, create and configure a "logback" file, configuring loggers, appenders, and properties based on your requirements, and place the file in the directory specified by the property log.process.pipeline.config.dir of the cdap-site.xml file.
Each custom pipeline requires a unique name. Properties controlling the pipeline (the log.process.pipeline.* properties) are described above.
For every XML file in the log.process.pipeline.config.dir directory, a separate log pipeline is created. As they are separate Kafka consumers and processes, the pipelines are isolated and independent of each other; the performance of one pipeline does not affect the performance of another. Though CDAP has been tested with multiple log pipelines and appenders, the fewer of each that are specified, the better the performance.
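The one-pipeline-per-XML-file rule can be sketched as a filter over the directory's file names. This is illustrative only; CDAP's actual loader also parses and validates each configuration file:

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineConfigScanner {
  // Given the file names found in log.process.pipeline.config.dir, returns the
  // base names of the ".xml" files, each of which defines one log pipeline.
  static List<String> pipelineConfigs(String[] fileNames) {
    List<String> pipelines = new ArrayList<>();
    for (String name : fileNames) {
      if (name.endsWith(".xml")) {
        // Strip the extension to get a readable identifier for the pipeline.
        pipelines.add(name.substring(0, name.length() - ".xml".length()));
      }
    }
    return pipelines;
  }
}
```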
Configuring Custom Log Pipelines
CDAP looks for "logback" files located in the directory set by the property log.process.pipeline.config.dir in the cdap-site.xml file. In the default configuration, this is:
For CDAP Sandbox: <CDAP-HOME>/ext/logging/config
For Distributed CDAP: /opt/cdap/master/ext/logging/config
Example "logback" File for a Custom Log Pipeline
Here is an example "logback" file, using two appenders (STDOUT and rollingAppender). This file must be located (as noted above) in the pipeline configuration directory, with a file extension of .xml:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<encoder>
<pattern>%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n</pattern>
</encoder>
</appender>
<property name="cdap.log.saver.instance.id" value="instanceId"/>
<appender name="rollingAppender" class="io.cdap.cdap.logging.plugins.RollingLocationLogAppender">
<!-- log file path will be created by the appender as: <basePath>/<namespace-id>/<application-id>/<filePath> -->
<basePath>plugins/applogs</basePath>
<filePath>securityLogs/logFile-${cdap.log.saver.instance.id}.log</filePath>
<!-- cdap is the owner of the log files directory, so cdap will get read/write/execute permissions.
Log files will be read-only for others. -->
<dirPermissions>744</dirPermissions>
<!-- cdap is the owner of the log files, so cdap will get read/write permissions.
Log files will be read-only for others -->
<filePermissions>644</filePermissions>
<!-- This is an optional parameter, which takes a number of milliseconds.
The appender will close a file if it is not modified for the fileMaxInactiveTimeMs
period of time. Here it is set to thirty minutes. -->
<fileMaxInactiveTimeMs>1800000</fileMaxInactiveTimeMs>
<rollingPolicy class="io.cdap.cdap.logging.plugins.FixedWindowRollingPolicy">
<!-- Only specify the file name without a directory, as the appender will use the
appropriate directory specified in filePath -->
<fileNamePattern>logFile-${cdap.log.saver.instance.id}.log.%i</fileNamePattern>
<minIndex>1</minIndex>
<maxIndex>9</maxIndex>
</rollingPolicy>
<triggeringPolicy class="io.cdap.cdap.logging.plugins.SizeBasedTriggeringPolicy">
<!-- Set the maximum file size appropriately to avoid a large number of small files -->
<maxFileSize>100MB</maxFileSize>
</triggeringPolicy>
<encoder>
<pattern>%-4relative [%thread] %-5level %logger{35} - %msg%n</pattern>
<!-- Do not flush on every event -->
<immediateFlush>false</immediateFlush>
</encoder>
</appender>
<logger name="io.cdap.cdap.logging.plugins.RollingLocationLogAppenderTest" level="INFO">
<appender-ref ref="rollingAppender"/>
</logger>
<root level="INFO">
<appender-ref ref="STDOUT"/>
</root>
</configuration>
Custom Log Appender
You can use any existing Logback appender. The RollingLocationLogAppender (an extension of Logback's FileAppender) lets you use HDFS locations in your log pipelines.
If you need an appender beyond what is available through Logback or CDAP, you can write and implement your own custom appender. See the Logback documentation for information on this.
As the CDAP log framework uses Logback's Appender API, your custom appender needs to implement the same Appender interface. Access to CDAP's system components (such as datasets, metrics, and the LocationFactory) is made available through the AppenderContext, an extension of Logback's LoggerContext:
public class CustomLogAppender extends FileAppender<ILoggingEvent> implements Flushable, Syncable {
  . . .
  private LocationManager locationManager;

  @Override
  public void start() {
    if (context instanceof AppenderContext) {
      AppenderContext context = (AppenderContext) this.context;
      locationManager = new LocationManager(context.getLocationFactory() . . .);
      . . .
    }
  }

  @Override
  public void doAppend(ILoggingEvent eventObject) throws LogbackException {
    try {
      . . .
      OutputStream locationOutputStream = locationManager.getLocationOutputStream . . .
      setOutputStream(locationOutputStream);
      writeOut(eventObject);
      . . .
    } catch
      . . .
  }
}
Adding a dependency on the cdap-watchdog API will allow you to access the AppenderContext class (io/cdap/cdap/api/logging/AppenderContext.java in the cdap-watchdog-api module) in your appender.
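For example, with Maven the dependency might be declared as below. The coordinates follow the io.cdap.cdap group and cdap-watchdog-api module named above, but verify them (and the version) against your CDAP release:

```xml
<dependency>
  <groupId>io.cdap.cdap</groupId>
  <artifactId>cdap-watchdog-api</artifactId>
  <version>${cdap.version}</version>
  <scope>provided</scope>
</dependency>
```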
Enabling Access Log
Access logging can be enabled on Distributed CDAP with security turned on. Once enabled, each HTTP access via the Authentication Server and Router will be logged. Log output will be in the standard Apache HTTPd access log format.
To enable the access logging, complete the following steps:
In the logback-container.xml file located in /etc/cdap/conf, include the following properties:

<appender name="AUDIT" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>access.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
    <fileNamePattern>access.log.%d{yyyy-MM-dd}</fileNamePattern>
    <maxHistory>30</maxHistory>
  </rollingPolicy>
  <encoder>
    <pattern>%msg%n</pattern>
  </encoder>
</appender>
<logger name="http-access" level="TRACE" additivity="false">
  <appender-ref ref="AUDIT" />
</logger>

<appender name="EXTERNAL_AUTH_AUDIT" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>external_auth_access.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
    <fileNamePattern>external_auth_access.log.%d{yyyy-MM-dd}</fileNamePattern>
    <maxHistory>30</maxHistory>
  </rollingPolicy>
  <encoder>
    <pattern>%msg%n</pattern>
  </encoder>
</appender>
<logger name="external-auth-access" level="TRACE" additivity="false">
  <appender-ref ref="EXTERNAL_AUTH_AUDIT" />
</logger>
Note: By default, these properties are already at the bottom of the logback.xml file, but commented out.
The log files access.log and external_auth_access.log will be available by default under the /home/cdap directory. To configure the log paths, provide the path of the log in the logback.xml file. For example, using:
<file>/var/log/cdap/access.log</file>
will write the access.log file to the /var/log/cdap directory.
ย directory.After modification ofย
logback.xml
, restartยcdap-router
ย andยcdap-auth-server
ย using the following commands:$ /etc/init.d/cdap-router restart $ /etc/init.d/cdap-auth-server restart
Monitoring Utilities
CDAP can be monitored using external systems such as Nagios; a Nagios-style plugin is available for checking the status of CDAP applications, programs, and the CDAP instance itself.
Additional References
For additional information, see the Logging, Metrics, and Monitoring Microservices, the Java Client, and the Application Logback.
Created in 2020 by Google Inc.