Logging Guidelines

This page lists the guidelines that must be followed when creating new application logs.

"Relevant Logs" are defined as the logs that are especially interesting to a CDAP User and that help the User quickly assess progress or debug failures in the application. More detailed guidelines for "Relevant Logs" follow:

Guidelines for "Relevant Logs"

Plugins:

1. Any logs generated directly from within the Plugin code.

2. Logs generated from within user scripts [JavaScript and Python], in plugins that allow such scripts, are also considered relevant User Logs.

3. Logs from libraries used by the Plugins [CDAP or otherwise] are not included. [Ideally, underlying libraries should not generate logs at all; any errors are expected to be thrown as exceptions.]

4. Plugin code must handle exceptions by:

a. Generating an error-level log with the stack trace.

b. Following the Error Logs Guideline [below] and updating the message appropriately.

5. Since all logs generated from within the Plugin are visible to the user, they must follow strict guidelines:

1. Crisp and clear messaging oriented towards the Pipeline User, not the developer. Error messages received from an underlying library may have to be reworded.

2. Log level must be assigned from the perspective of the Pipeline User. An error for a developer is not always an error for the Pipeline User.

3. Logs must indicate the name of the Plugin as provided in the "Label" by the User.

4. Error logs must follow the Error Logs Guideline below.

6. When a Pipeline User writes their own plugin, all logs it generates are included because of #1. #5 above is mainly for internally developed Plugins, like the ones available on Market, although it is generally applicable.
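
Items 4 and 5 above can be sketched as follows. This is only an illustration in Python using the standard logging module, not CDAP plugin code: PluginLogger, read_record, the field name 'age' and the wording of the message are all hypothetical. It shows an error-level log that keeps the stack trace, carries the user-provided Label, and replaces the library's message with one aimed at the Pipeline User.

```python
import logging


class PluginLogger(logging.LoggerAdapter):
    """Prefixes every message with the user-provided plugin Label (hypothetical helper)."""

    def process(self, msg, kwargs):
        return f"[{self.extra['label']}] {msg}", kwargs


def read_record(raw, log):
    """Parse one input value; on failure, log at ERROR level and re-raise."""
    try:
        return int(raw)
    except ValueError:
        # The library message ("invalid literal for int() ...") is replaced by a
        # message written for the Pipeline User. exc_info=True attaches the
        # stack trace to the log record.
        log.error("Field 'age' has non-numeric value %r. "
                  "Fix the input or change the field type to string.",
                  raw, exc_info=True)
        raise
```

A caller would wrap its logger once, for example PluginLogger(logging.getLogger("plugin"), {"label": "Customer CSV Source"}), and pass it to every function that logs.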

CDAP Applications:

1. Any logs generated directly from a Program of the application are always included.

2. Logs from libraries used by the Programs [CDAP or otherwise] are not included. [Ideally, underlying libraries should not generate logs at all; any errors are expected to be thrown as exceptions.]

3. Application code must handle exceptions by:

a. Generating an error-level log with the stack trace.

b. Following the Error Logs Guideline [below] and updating the message appropriately.

4. Since all logs generated from the Application Programs are considered relevant, they must follow strict guidelines:

1. Crisp and clear messaging oriented towards the Application User and the target use case of the application. Error messages received from an underlying library may have to be reworded.

2. Log level must be assigned from the perspective of the Application User and the target use case. An error for a developer is not always an error for the Application User.

3. Error logs must follow the Error Logs Guideline below.

5. When an Application User writes their own application, all logs it generates are shown because of #1. #4 above is mainly for internally developed Applications, like the ones available on Market, although it is generally applicable.
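
Items 1 and 2 above separate logs written by the Programs themselves from logs written by the libraries they use. One possible way to express that split is a filter on the logger name, sketched here in Python. How CDAP actually makes this selection is not described on this page; RelevantLogFilter, is_relevant and the prefix com.example.orders are hypothetical.

```python
import logging


class RelevantLogFilter(logging.Filter):
    """Passes only records whose logger name falls under an allowed prefix."""

    def __init__(self, prefixes):
        super().__init__()
        self.prefixes = tuple(prefixes)

    def filter(self, record):
        # Match the prefix itself or any dotted child of it, so that
        # "com.example.ordersx" does not match "com.example.orders".
        return any(record.name == p or record.name.startswith(p + ".")
                   for p in self.prefixes)


def is_relevant(logger_name, prefixes):
    """Convenience check on a bare logger name."""
    record = logging.LogRecord(logger_name, logging.INFO, "", 0, "", None, None)
    return RelevantLogFilter(prefixes).filter(record)
```

Attached to a handler with addFilter, such a filter would drop library records before they reach the user-facing log.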

CDAP Platform:

Only include the System logs that are helpful to the User in debugging a Pipeline/Application:

1. Pipeline Lifecycle Info: Logs reflecting the execution of the complete Pipeline: start and finish [with success or failure]

2. Program Lifecycle Fatal Errors: Any fatal error related to setting up, initializing, running, or shutting down programs that are required for pipeline execution.

3. System Fatal Errors: Resource Constraints, Invalid Arguments, Artifact not found, Unsupported Operation, etc., from CDAP or underlying dependencies.

4. Authentication/Authorization Errors.

5. Any errors received from the underlying dependencies must be reworded to give a clear message to the Pipeline User. A stack trace must never be logged with a null message.

6. Stack Traces received as plain messages by CDAP from underlying system library errors must be handled appropriately. [*pending.]

7. Warning level logs may be used very judiciously with clear messaging and guidance.
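
Item 5 rules out stack traces with a null message. A minimal sketch of one way to guarantee a non-empty message is to walk the exception chain until some cause provides text. The function name user_message and the wording of the fallback are assumptions made for illustration only.

```python
def user_message(exc, action):
    """Build a non-empty message for the Pipeline User from an exception chain."""
    cause = exc
    seen = set()  # guards against cycles in the cause chain
    while cause is not None and id(cause) not in seen:
        seen.add(id(cause))
        text = str(cause).strip()
        if text:
            return f"{action} failed: {text}"
        # Prefer the explicit cause; fall back to the implicit context.
        cause = cause.__cause__ or cause.__context__
    # No exception in the chain carried a message: name the type instead.
    return f"{action} failed: {type(exc).__name__} (no further detail was reported)"
```

The returned text would then be logged at error level together with the stack trace.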

*Error Logs Guideline:

All error logs must follow this guideline:

1. Clear and crisp description of the root cause, oriented towards the Pipeline User.

2. Includes relevant user-defined information pertaining to the failure, e.g., the names of the programs, datasets, namespaces, etc. created by the user.

3. Error logs guide the Pipeline User on the steps to fix the error, e.g., in cases like Invalid Argument or Insufficient Resources.
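
The three parts of this guideline (root cause, user-defined context, steps to fix) can be assembled into a single message. The sketch below is one possible layout, not a format mandated by CDAP; format_error and its parameter names are hypothetical.

```python
def format_error(root_cause, context, fix):
    """Join root cause, user-defined context and remediation into one line."""
    details = ", ".join(f"{key}='{value}'" for key, value in context.items())
    return f"{root_cause} [{details}]. To fix: {fix}"


message = format_error(
    "Dataset does not exist",
    {"namespace": "default", "dataset": "purchases"},
    "create the dataset or correct its name in the plugin configuration.",
)
```

Keeping the three parts as separate arguments makes it harder to log an error that omits the context or the remediation.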

Logging Levels

This section provides guidelines for choosing log levels for User logs, that is, the logs served to the user through the UI. [Pipeline/Application/Program/Plugin]

ERROR

indicates severe issues with the program that require user (administrator or direct user) intervention. In the absence of a FATAL level, this level is also used for errors that force a shutdown of the service or application because it cannot recover from them. E.g., failing to start a program, a missing dataset, invalid user input, missing services, etc.

WARN

should be used to indicate potentially unexpected behavior or potential errors. It's not an error (yet) because the program/application can still proceed with whatever it was doing, although with potentially reduced functionality. E.g., a config file isn't where it should be and the program will have to run with default settings, or some "expected" transient environmental conditions such as a short loss of network or database connectivity.

INFO

is used for logging generally useful information highlighting the overall progress of the program/application. This is information that the user always wants to have available but usually doesn't care about in successful runs. The User log level is set to INFO by default, so this level must be used conservatively. E.g., service start/stop, configuration assumptions, etc.

DEBUG

level designates informational events of lower importance than INFO. DEBUG logs carry diagnostically helpful information, not just for developers but also for application/program users. E.g., loading user configuration, setting up security tokens, etc.

TRACE

Trace is mostly out of bounds for user logs and should generally be used only for development, testing, or very targeted debugging. Traces can be used to capture a specific flow through the application. E.g., important events passing through multiple systems.
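
The effect of the default INFO threshold can be seen in a small Python sketch that emits one message per level and collects what passes. Two assumptions apply: Python's logging module names the level WARNING rather than WARN, and it has no TRACE level, so one is registered here at a value below DEBUG. The logger name user-log-demo and the function run_with_level are hypothetical; the messages are adapted from the examples above.

```python
import logging

TRACE = 5  # assumption: custom level below DEBUG (10), since Python has no TRACE
logging.addLevelName(TRACE, "TRACE")


def run_with_level(level):
    """Emit one message per level and return those that pass the threshold."""
    seen = []

    class Collect(logging.Handler):
        def emit(self, record):
            seen.append(f"{record.levelname}: {record.getMessage()}")

    log = logging.getLogger("user-log-demo")
    log.handlers = [Collect()]
    log.propagate = False
    log.setLevel(level)

    log.error("Failed to start program: dataset 'purchases' is missing")
    log.warning("Config file not found; running with default settings")
    log.info("Service started")
    log.debug("Loaded user configuration")
    log.log(TRACE, "Event passed from source to transform")
    return seen
```

With the threshold at logging.INFO only the ERROR, WARNING and INFO messages are collected; lowering it to TRACE collects all five.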