Javadoc Standards


The goal of this document is to formalize conventions and best practices around javadocs in the CDAP code base. 

Content

The purpose of javadoc is to explain the intended meaning or purpose of the element it's attached to. It serves as a specification for that element's behavior. This specification is what then gets fulfilled by the implementation.

A javadoc should not assume that the reader has the code in front of them and should not assume deep familiarity with the code base. In fact, a reader should be able to implement a method based on the method javadoc.

Here are some tips to keep in mind:

  • Write the doc from the reader's point of view. If the reader could not read the code, would they know how to use the class/method?
  • Be specific. If there is a timestamp, document whether it is in milliseconds or seconds. A concrete example is often enormously helpful.
  • Consider error scenarios. What can cause an exception to be thrown? Is dirty state left around if there is an error? Can the method be retried if there is an error?
  • Be consistent. Use the same terminology as other methods in the class and other classes in the module. Try to keep a similar style.
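As a minimal sketch of the "be specific" tip, the class below states the unit of every timestamp and gives a concrete example value. RunRecord and its members are hypothetical names chosen for illustration; they are not taken from the CDAP code base.

```java
import java.util.concurrent.TimeUnit;

/**
 * A record of when a program run started. This class is hypothetical and exists only for
 * illustration.
 */
final class RunRecord {
  private final long startTimeSeconds;

  /**
   * Creates a record for a run that started at the specified time.
   *
   * @param startTimeSeconds the start time in seconds since the Unix epoch. For example,
   *   1577836800 represents 2020-01-01T00:00:00Z.
   */
  RunRecord(long startTimeSeconds) {
    this.startTimeSeconds = startTimeSeconds;
  }

  /**
   * Returns the start time of the run in milliseconds since the Unix epoch.
   */
  long getStartTimeMillis() {
    // The unit conversion is explicit so the code matches what the javadoc promises.
    return TimeUnit.SECONDS.toMillis(startTimeSeconds);
  }
}
```

A reader who sees only the javadoc can tell that the constructor takes seconds and the getter returns milliseconds, without opening the implementation.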

Requirements CDAP 6.9+

Starting from CDAP 6.9, please follow the Google JavaDoc Coding Style.

Requirements CDAP 6.8 and before

In general, every class and public method in an api or spi module must have javadocs. Exceptions to this rule are getter methods that simply return the value of a private variable and methods that override another method.

Format

Format is not as important as content, but it is desirable to have a consistent standard across the project so that people know what to expect. Elements of a javadoc should be present in the following order:

/**
 * Summary fragment.
 *
 * Additional descriptive text.
 *
 * @param
 * @return
 * @throws
 * @deprecated
 */

There should be a blank line before the at-clauses.
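The ordering above can be sketched on a real method as follows. ResultLimiter and firstResults are hypothetical names used only to show every element in its expected position; the choice to deprecate the method is likewise only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * A helper that truncates result lists. This class is hypothetical and exists only for
 * illustration.
 */
final class ResultLimiter {

  /**
   * Returns the first results of the specified list.
   *
   * The specified list is never modified. A new list is created on every call.
   *
   * @param results the results to truncate
   * @param limit the maximum number of results to return. If the limit is equal to or less than 0,
   *   no limit will be applied.
   * @return a new list containing the first results, in their original order
   * @throws IllegalArgumentException if results is null
   * @deprecated use {@link List#subList(int, int)} directly
   */
  @Deprecated
  static List<String> firstResults(List<String> results, int limit) {
    if (results == null) {
      throw new IllegalArgumentException("results must not be null");
    }
    if (limit <= 0 || limit >= results.size()) {
      // No limit applies, so copy everything.
      return new ArrayList<>(results);
    }
    return new ArrayList<>(results.subList(0, limit));
  }
}
```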

Summary Fragment

The summary fragment is a required description of what the class or method does. It is a fragment, meaning it is not a complete sentence. The first word should be capitalized and the fragment should end with a period.

Classes

Anything that can be instantiated should be described as a noun. For example:

A service that manages the lifecycle of programs.

and not:

Manages the lifecycle of programs.


Methods

Methods should be described as a verb. In other words, treat it as if there is an implicit 'This method ...' in front of the summary fragment. For example:

Validates the specified program options and starts the program if they are valid. 


Additional Descriptive Text

A summary fragment is often not enough for a reader to understand the full behavior of the API. Any additional details after the fragment should be in complete, grammatically correct sentences.

It is a good place to put clarifying examples, notes about performance, warnings about misusing the API, and any other information that might not fall under common usage patterns.

If a method modifies state, the additional descriptive text should mention what state is modified.

If the method throws exceptions and modifies state, the method should document what happens to the state if an exception is thrown. Ideally, methods are implemented in a way where an exception will leave state as it was before the method call. If this is not the case, it should be documented. 
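One way to document state changes and exception behavior is sketched below. ProgramRegistry and recordStart are hypothetical names, not CDAP classes; the point is that the descriptive text says what state is modified and what happens to it when an exception is thrown.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * An in-memory registry of running programs. This class is hypothetical and exists only for
 * illustration.
 */
final class ProgramRegistry {
  // Maps program name to the time it was started, in milliseconds since the Unix epoch.
  private final Map<String, Long> startTimes = new HashMap<>();

  /**
   * Records the specified program as running.
   *
   * This method modifies the registry by adding one entry for the program. All validation happens
   * before the registry is modified, so if an exception is thrown, the registry is left exactly as
   * it was before the call.
   *
   * @param programName the name of the program to record. It must not be null or empty.
   * @param startTimeMillis the start time in milliseconds since the Unix epoch
   * @throws IllegalArgumentException if the program name is null or empty
   * @throws IllegalStateException if the program is already recorded as running
   */
  void recordStart(String programName, long startTimeMillis) {
    if (programName == null || programName.isEmpty()) {
      throw new IllegalArgumentException("programName must not be null or empty");
    }
    if (startTimes.containsKey(programName)) {
      throw new IllegalStateException(programName + " is already running");
    }
    startTimes.put(programName, startTimeMillis);
  }

  /**
   * Returns the number of programs recorded as running.
   */
  int size() {
    return startTimes.size();
  }
}
```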

At-clauses

Every at-clause should be followed by a fragment as a description. By convention, the fragment should not be capitalized and should not be a complete sentence. If the fragment is all that is needed, it should not contain a period at the end. For example:

@param limit the maximum number of results to return

If the fragment is not enough to sufficiently document the at-clause, everything after the first fragment should be a complete, grammatically correct sentence. Normal punctuation and capitalization should apply. For example:

@param limit the maximum number of results to return. If the limit is equal to or less than 0, no limit will be applied.

Notice that the second sentence is a complete sentence and not a fragment. The first word 'If' is capitalized, and it ends with a period.


@param

Every method argument should be documented with its own @param clause. Multiple @param clauses should appear in the same order that they appear in the method signature. If some instances or values for that parameter are not valid input, they must be documented. Similarly, if the parameter can be null, the behavior for a null value must be documented. For example:

@param limit the maximum number of results to return

is not a good javadoc because it does not address what happens if the limit is 0 or below. Does the method throw an exception? Does it do something else? A better javadoc would be:

@param limit the maximum number of results to return. If the limit is equal to or less than 0, no limit will be applied. 

Parameters should be described as nouns and not verbs. In the examples above, 'limit' is correctly treated as a noun. Compare this with:

@param limit limits the size of the results

This incorrectly treats 'limit' as a verb.
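Putting the @param rules together, a method with several parameters might be documented as in the sketch below. ProgramNameFilter and its behavior for null and non-positive values are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * A filter over program names. This class is hypothetical and exists only for illustration.
 */
final class ProgramNameFilter {

  /**
   * Returns the program names that start with the specified prefix.
   *
   * @param names the program names to filter. It must not be null, but it can be empty.
   * @param prefix the prefix that returned names must start with. If the prefix is null or empty,
   *   every name matches.
   * @param limit the maximum number of names to return. If the limit is equal to or less than 0,
   *   no limit will be applied.
   * @return a new list of matching names, in the order they appear in the input
   */
  static List<String> filter(List<String> names, String prefix, int limit) {
    List<String> matches = new ArrayList<>();
    for (String name : names) {
      if (limit > 0 && matches.size() >= limit) {
        break;
      }
      // A null prefix matches everything; an empty prefix does too, via startsWith("").
      if (prefix == null || name.startsWith(prefix)) {
        matches.add(name);
      }
    }
    return matches;
  }
}
```

The @param clauses appear in signature order, each is phrased as a noun, and each one states what happens for null or out-of-range input.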


@return

If a method returns something, it should be documented with a @return clause. It should be described as a noun. For example:

@return the list of programs that were started

If the method can return null, be sure to document when it will do so.

Trivial methods like getters can just use a summary fragment and omit the return clause.
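The sketch below shows both cases: a trivial getter that relies on its summary fragment alone, and a lookup whose @return clause documents when null is returned. ArtifactVersions is a hypothetical class, not part of CDAP.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * A lookup table of artifact versions within one namespace. This class is hypothetical and exists
 * only for illustration.
 */
final class ArtifactVersions {
  private final String namespace;
  private final Map<String, String> versions = new HashMap<>();

  /**
   * Creates an empty lookup table for the specified namespace.
   *
   * @param namespace the namespace that the artifacts belong to
   */
  ArtifactVersions(String namespace) {
    this.namespace = namespace;
  }

  /**
   * Returns the namespace that the artifacts belong to.
   */
  String getNamespace() {
    return namespace;
  }

  /**
   * Records the version of the specified artifact, replacing any version recorded earlier.
   *
   * @param artifactName the name of the artifact
   * @param version the version of the artifact
   */
  void put(String artifactName, String version) {
    versions.put(artifactName, version);
  }

  /**
   * Looks up the version of the specified artifact.
   *
   * @param artifactName the name of the artifact to look up
   * @return the version of the artifact, or null if no version has been recorded for it
   */
  String getVersion(String artifactName) {
    return versions.get(artifactName);
  }
}
```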

@throws

Every checked exception must be documented with a @throws clause. Usually, the description begins with an 'if ...'. For example:

@throws ArtifactNotFoundException if the artifact required to create the application could not be found

Enough information must be given so that the caller knows how to respond to the exception. The caller should have enough information to know if they can retry the operation, if they should log a message and move on, or if they need to propagate the exception up.

A bad, but very common pattern is:

@throws IOException if an I/O error occurs

Sometimes this is required because some underlying dependency throws an exception like this and does not document it any better than that. Any exception that comes from an internal class must be documented much better than that.


Unchecked exceptions do not need to be documented, though they can be documented if it is helpful to the caller. For example, NullPointerExceptions are usually not documented. An IllegalArgumentException is often documented when there is some validation method on a proto object.
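As a sketch of a @throws clause that tells the caller how to respond, consider the example below. ConfigNotFoundException and ConfigStore are hypothetical names invented for this illustration and do not correspond to CDAP classes.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * An exception thrown when a requested configuration does not exist. This class is hypothetical
 * and exists only for illustration.
 */
class ConfigNotFoundException extends Exception {
  ConfigNotFoundException(String message) {
    super(message);
  }
}

/**
 * An in-memory store of application configurations. This class is hypothetical and exists only
 * for illustration.
 */
final class ConfigStore {
  private final Map<String, String> configs = new HashMap<>();

  /**
   * Saves the configuration of the specified application, replacing any earlier configuration.
   *
   * @param appName the name of the application
   * @param config the configuration to save
   */
  void save(String appName, String config) {
    configs.put(appName, config);
  }

  /**
   * Reads the configuration of the specified application.
   *
   * @param appName the name of the application
   * @return the configuration that was last saved for the application
   * @throws ConfigNotFoundException if no configuration has been saved for the application.
   *   Retrying will not succeed until a configuration is saved with {@link #save(String, String)}.
   */
  String read(String appName) throws ConfigNotFoundException {
    String config = configs.get(appName);
    if (config == null) {
      throw new ConfigNotFoundException("No configuration saved for application " + appName);
    }
    return config;
  }
}
```

The @throws description begins with 'if', names the exact condition, and adds a complete sentence that tells the caller a retry alone will not help.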

@deprecated

Be sure to document what should be used instead of the deprecated method or class.
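A minimal sketch of a @deprecated clause that points to its replacement follows. Durations and both of its methods are hypothetical and used only for illustration.

```java
/**
 * A converter between time units. This class is hypothetical and exists only for illustration.
 */
final class Durations {

  /**
   * Converts the specified number of minutes to seconds.
   *
   * @param minutes the number of minutes to convert
   * @return the equivalent number of seconds
   * @deprecated use {@link #toSeconds(long)} instead. The replacement returns a long, so results
   *   above Integer.MAX_VALUE are represented correctly.
   */
  @Deprecated
  static int minutesToSeconds(int minutes) {
    return minutes * 60;
  }

  /**
   * Converts the specified number of minutes to seconds.
   *
   * @param minutes the number of minutes to convert
   * @return the equivalent number of seconds
   */
  static long toSeconds(long minutes) {
    return minutes * 60L;
  }
}
```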

Guidelines

This section contains guidelines for writing javadocs for common CDAP constructs. 

HTTP Handlers

HTTP Handlers should ideally only be responsible for validating user input and calling an underlying service. As such, they do not need to be documented as rigorously as the service, since any actual logic should be handled by the service.

This is not the case for older HTTP Handlers in the code base. If you are adding javadocs to an older HTTP handler, you will have to treat it like a service.

Services

Services usually sit between an HTTP handler and a dataset. Sometimes they sit between other services as well. A service is usually the one that controls the lifecycle of CDAP entities and state about those entities. It is usually in charge of which stateful operations happen transactionally. Services should be carefully documented, as they control much of the actual behavior in CDAP. State change and exception handling in particular should be carefully documented. Services are also greatly under-documented in the code base.

Datasets

Datasets or Stores in the CDAP code base are generally responsible for writing data to files or tables. The data schema should be documented for every dataset. Row keys, column names, and column values should all be documented with example data. Some examples of this are the MetadataDataset and ArtifactStore.
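A schema description with example data might look like the sketch below. RunCountStore and its schema are invented for illustration; they are an assumption and do not describe the MetadataDataset, the ArtifactStore, or any other CDAP dataset.

```java
import java.nio.charset.StandardCharsets;

/**
 * A store of program run counts. This class and its schema are hypothetical and exist only for
 * illustration.
 *
 * Each row holds the run count of one program:
 *
 * <pre>
 * row key:      namespace and program name joined by a colon, e.g. 'default:purchase-flow'
 * column name:  'c'
 * column value: the run count as a decimal string, e.g. '42'
 * </pre>
 */
final class RunCountStore {

  /**
   * Builds the row key for the specified program.
   *
   * @param namespace the namespace of the program. It must not contain a colon.
   * @param programName the name of the program
   * @return the UTF-8 encoded row key, for example the bytes of 'default:purchase-flow'
   * @throws IllegalArgumentException if the namespace contains a colon
   */
  static byte[] rowKey(String namespace, String programName) {
    if (namespace.indexOf(':') >= 0) {
      // A colon in the namespace would make the key ambiguous to parse.
      throw new IllegalArgumentException("namespace must not contain a colon: " + namespace);
    }
    return (namespace + ":" + programName).getBytes(StandardCharsets.UTF_8);
  }
}
```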


Created in 2020 by Google Inc.