Proposed Changes for Version 3.5.0 Documentation

Users and User Stories

We would like to support different users with clear paths to follow in finding what they need in the CDAP documentation.

Currently, we provide limited direction for users as to how to engage and work with the docs. Aside from a particular order or certain examples, we don't suggest a particular approach.

To assist, we identify these different users:

  1. New CDAP Developer
  2. Architect trying to understand CDAP at a high level
  3. Ops guys looking to install CDAP
  4. Passer-by looking to find out what CDAP is
  5. ETL developer looking to build a data pipeline on Hydrator
  6. Developer with a bit of experience in CDAP trying to understand "how is...?"

We propose replacing these user stories with these users and interests:

  1. User (a Data Scientist)
    1. New to CDAP
    2. Developing a Hydrator Pipeline
  2. Developer (of Java or other supported languages)
    1. New to CDAP
    2. Developing a first Java Application
    3. Developing a first Plugin
    4. Looking for an answer to a particular question
  3. Architect (or a CxO, a corporate officer)
    1. Reviewing CDAP abstractions and components
  4. Administrator/Operator (of networks or clusters)
    1. Installing CDAP
    2. Operating CDAP

Certain users may fall into more than one category. For example, a User may want to install CDAP themselves, in a Standalone setting, to evaluate whether it suits their situation before passing it to an Administrator/Operator to add to a cluster.

Different Paths

We can identify paths either by user or by theme, where the themes would be the interests listed above.

Wikipedia identifies three broad approaches to the organization of documentation:

Approach            User Level
Tutorial            Beginner
Thematic            Intermediate
List or Reference   Advanced

We currently employ all of these approaches, and can suggest resources as the user advances in proficiency.

Two examples of this are the Python Documentation page and the Python Comprehensive Help page.

Another example is the documentation for the Django Project. It shows "The basics" and "Advanced" links for each thematic area.

Introduction Pages

Each manual—and the Documentation itself—has an "Introduction" page at the start.

Currently

These pages are a list of links with descriptions and important terms or phrases marked in bold:

Proposal

Use a tabbed panel that shows appropriate content for each user stream (note: the content would not be a literal; that is a limitation of the current custom Sphinx extension used to illustrate this):

See http://builds.cask.co/artifact/CDAP-DQB31/shared/build-1/Docs-HTML/3.5.0-SNAPSHOT/en/index.html for an example of this.

  • These tabbed panels could be used at the start of each Introduction.
  • As with our existing tabbed-parsed-literals, they would be synchronized, so that all would show the same tab, but with different content, in each area.
  • For each panel or "user stream" we could suggest a series of "The basics" and "Advanced" links.
  • One panel could be a complete index, similar to what we have currently.

Other Ideas

Here are additional ideas that can be employed to help users find appropriate material:

  • We list at the end of each Building Block page the examples that pertain to that building block; for example, Service Examples.
  • We need to check that this list is up-to-date; an easy way to help with this is to modify the Examples page to indicate, for each example, which building blocks it illustrates.
    For example, we describe each example ("A variation of the WordCount example that operates on files. It demonstrates the usage of the FileSet dataset, including a service to upload and download files, and a MapReduce that operates over these files."), but this is too long and may be incomplete. A quick-to-read column with abbreviations might be better: all examples that demonstrate usage of a Stream or a MapReduce could be indicated by an "S" or "MR" in a column.
  • Each Building Block example section should have a link to the examples page.
  • The availability of Training resources from Cask should be mentioned in the docs as a potential resource.
  • Like the Django Project, we could have a section titled "How the documentation is organized" that explains how we intend people to navigate and use the docs. People might not read it, but it can help…
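
The abbreviation column suggested above for the Examples page could be generated from a single mapping of examples to building blocks. The following is only a minimal sketch under stated assumptions, not an existing part of the docs build: the "S" and "MR" abbreviations come from the proposal, while "Svc", "FS", the example names, and every function name are hypothetical. The same mapping can be inverted to check the list of examples at the end of each Building Block page.

```python
# Abbreviations for building blocks. "S" and "MR" are taken from the proposal;
# "Svc" and "FS" are assumed here purely for illustration.
ABBREVIATIONS = {
    "Stream": "S",
    "MapReduce": "MR",
    "Service": "Svc",  # hypothetical abbreviation
    "FileSet": "FS",   # hypothetical abbreviation
}

# Hypothetical data: example name -> building blocks it illustrates.
# The first entry follows the description quoted above; the second is a placeholder.
EXAMPLES = {
    "FileSet example": ["FileSet", "Service", "MapReduce"],
    "Placeholder stream example": ["Stream", "MapReduce"],
}


def building_block_column(blocks):
    """Return the abbreviation column for one example, in ABBREVIATIONS order."""
    unknown = [b for b in blocks if b not in ABBREVIATIONS]
    if unknown:
        raise KeyError("no abbreviation defined for: " + ", ".join(unknown))
    wanted = set(blocks)
    return " ".join(abbr for name, abbr in ABBREVIATIONS.items() if name in wanted)


def examples_for(block, examples):
    """Return the sorted names of the examples that illustrate a building block."""
    return sorted(name for name, blocks in examples.items() if block in blocks)


def render_table(examples):
    """Render a plain-text table with one quick-to-read column per example."""
    width = max(len("Example"), *(len(name) for name in examples))
    lines = ["Example".ljust(width) + "  Building Blocks"]
    for name, blocks in examples.items():
        lines.append(name.ljust(width) + "  " + building_block_column(blocks))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table(EXAMPLES))
    print(examples_for("MapReduce", EXAMPLES))
```

Keeping the mapping in one place means the Examples page column and the per-building-block example lists would be derived from the same data, so they cannot drift apart.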

 

Created in 2020 by Google Inc.