Tracker 0.2 Spec

A place to store and gather feedback about the features of Tracker 0.2.

Use Cases

  • As a user of Tracker, I want...

    • To see the most popular datasets in the CDAP system.

    • To see how popular a dataset is.

    • To edit metadata tags without leaving Tracker.

    • To edit User Properties Metadata (name might change) without leaving Tracker.

    • To suggest common tags that can be applied to datasets in Tracker.

    • To get a visual indication of the trustworthiness of the dataset.

    • To preview the data in the dataset without leaving Tracker.

New Features

  • Tracker Analytics - List overall analytics for Tracker as well as analytics for individual datasets

  • Metadata Management - Edit User Properties Metadata (name might change) and User Tags directly from Entity Details Page

  • Promoted Tags - Use Promoted Tags to indicate preferred metadata vocabulary to users

  • Tracker Trust Meter - Quickly visualize the trustworthiness of your data based on other programs utilizing the dataset.

  • Data Preview - Preview datasets directly in Tracker

Tracker Analytics (UI, New Tracker Dataset Cube, New Tracker Service Endpoint)

  • Shown in two places

    • On the Home Page under the search box

    • On a separate tab on dataset details page

  • Overall Metrics for Home Page 

    • Metric 1: Applications accessing the most datasets (Bar)

      • Names clickable and lead to CDAP detail page

      • Hover over name shows tooltip "View in CDAP"

    • Metric 2: Programs accessing the most datasets (Bar)

      • Names clickable and lead to CDAP detail page

      • Hover over name shows tooltip "View in CDAP"

    • Metric 3: Datasets being accessed the most

      • Dataset names clickable directly to detail page

    • Data for past 7 days, static timeframe

  • Dataset Detail Page Metrics

    • Metric 1: Histogram of Audit Messages for dataset

      • Gives the user a visual of when and how active the dataset is

      • Custom Date range

    • Metric 2: Applications accessing the dataset (Bar)

      • Names clickable and lead to CDAP detail page

      • Hover over name shows tooltip "View in CDAP"

    • Metric 3: Programs accessing the dataset (Bar)

      • Names clickable and lead to CDAP detail page

      • Hover over name shows tooltip "View in CDAP"

    • Metric 4: Recent Activity showing time since last...

      • New Program Reading from dataset

      • New Program Writing to dataset

      • Since last truncate

      • Since last update

      • Since last metadata change

    • No custom datepicker on tab, but possible for the histogram widget

    • Timeframe = all time

  • If no data is available for a graph, display an empty graph with a message.
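
The home page metrics above can be sketched as a simple aggregation over audit events. This is a minimal illustration only: the AuditEvent record, its field names, and the functions top_datasets and top_programs are assumptions made for this example, not the actual Tracker dataset schema or service endpoint.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record; the field names are assumptions."""
    dataset: str
    program: str
    timestamp: datetime


def top_datasets(events, now, days=7, limit=5):
    """Return (dataset, access_count) pairs for the most accessed datasets."""
    cutoff = now - timedelta(days=days)
    counts = Counter(e.dataset for e in events if cutoff <= e.timestamp <= now)
    return counts.most_common(limit)


def top_programs(events, now, days=7, limit=5):
    """Return (program, dataset_count) pairs for programs touching the most distinct datasets."""
    cutoff = now - timedelta(days=days)
    seen = {}
    for e in events:
        if cutoff <= e.timestamp <= now:
            seen.setdefault(e.program, set()).add(e.dataset)
    # Sort by distinct dataset count (descending), then by name for stable output.
    ranked = sorted(seen.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(program, len(datasets)) for program, datasets in ranked[:limit]]
```

An empty result list corresponds to the "empty graph with a message" case. The seven-day default mirrors the static home page timeframe; the dataset detail page would pass a wider window.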


Metadata Management (UI, Tracker Service, Tracker Dataset)

  • Provides a way for the user to edit User metadata directly in Tracker

  • Tags may be added/removed on the Dataset details page in a style similar to the CDAP UI

  • User Properties Metadata may be added/removed on the Dataset details page

  • If a user attempts to add a tag that is already on the dataset, a warning is displayed and the tag is not added.

  • If a user attempts to add a tag that is invalid (for example, it contains a space or another invalid character), a warning is displayed and the tag is not added.

  • As a user types a tag to add to a dataset, a list of tags is displayed along with a count of the number of datasets using each tag.

  • Type-ahead tags are sorted by usage, with the most used tags at the top.

  • Each tag displays an “x” so that it can be deleted.
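
The add-tag rules and the type-ahead ordering described above could look roughly like the following. This is a sketch under stated assumptions: the allowed character set in VALID_TAG is a guess (the set CDAP actually accepts may differ), and add_tag and suggest_tags are hypothetical helper names, not Tracker Service calls.

```python
import re

# Assumed rule: letters, digits, underscores and hyphens only.
# The character set that CDAP really accepts may differ.
VALID_TAG = re.compile(r"[A-Za-z0-9_-]+")


def add_tag(existing_tags, new_tag):
    """Return (tags, warning); warning is None when the tag was added."""
    if not VALID_TAG.fullmatch(new_tag):
        return existing_tags, f"'{new_tag}' is not a valid tag"
    if new_tag in existing_tags:
        return existing_tags, f"'{new_tag}' is already on this dataset"
    return existing_tags + [new_tag], None


def suggest_tags(prefix, usage_counts, limit=10):
    """Return (tag, dataset_count) pairs that start with prefix, most used first."""
    matches = [(tag, n) for tag, n in usage_counts.items() if tag.startswith(prefix)]
    # Highest usage first; ties are broken alphabetically.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:limit]
```

In both warning cases the tag list is returned unchanged, which matches the requirement that the tag is not added.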

Promoted Tags (UI, Tracker Service, Tracker Dataset)

  • A way to indicate to your users the preferred tags to add to datasets.

  • Configured via the Tags menu bar

  • When the user first visits the configuration page, a list of all user-entered tags is displayed, sorted alphabetically

  • User tags displayed in a table with the following columns

    • Column 1: Tag Name

    • Column 2: Count of datasets using this tag, shown as a clickable link that takes the user to the Tracker search results page listing those datasets

    • Column 3: Actions

      • Promote

  • Promoted tags are displayed in a separate table above the user tags with the following columns

    • Column 1: Tag Name

    • Column 2: Count of datasets using this tag, shown as a clickable link that takes the user to the Tracker search results page listing those datasets

    • Column 3: Actions

      • Demote

  • Columns are sortable

  • Promoted tags are visually different from User tags (green color)

  • Promoted tags can be added one at a time similar to how tags are added in CDAP today

  • Promoted tags can be added as a list or uploaded as a text file

  • As a user types a tag to add to a dataset, a list of tags is displayed along with a count of the number of datasets using each tag. Promoted tags are displayed first and in a different color.

  • All tags are sorted by usage when the user is typing (most used first).

  • When someone uploads a list that also contains user tags, those user tags become promoted tags.

  • Before promoted tags are uploaded, a confirmation dialog is displayed to the user indicating the number of tags that will be added. If some tags cannot be added, this number will be less than the number of lines in the file.
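
The upload confirmation count and the promoted-first type-ahead can be illustrated with the sketch below. Several details are assumptions rather than part of this spec: the tag character rule, treating already-promoted and duplicate lines as "failed", and ordering by usage within the promoted and non-promoted groups. The function names are hypothetical.

```python
import re

# Assumed rule: letters, digits, underscores and hyphens only.
VALID_TAG = re.compile(r"[A-Za-z0-9_-]+")


def plan_promoted_upload(file_text, promoted_tags):
    """Split uploaded lines into tags that will be added and lines that are rejected."""
    to_add, rejected = [], []
    for line in file_text.splitlines():
        tag = line.strip()
        if not tag:
            continue  # blank lines are ignored
        # Assumption: invalid, already promoted, or repeated tags count as failures.
        if not VALID_TAG.fullmatch(tag) or tag in promoted_tags or tag in to_add:
            rejected.append(tag)
        else:
            to_add.append(tag)
    return to_add, rejected


def suggest_with_promoted(prefix, usage_counts, promoted_tags, limit=10):
    """Return (tag, dataset_count, is_promoted) tuples, promoted tags first."""
    candidates = set(usage_counts) | set(promoted_tags)
    matches = [tag for tag in candidates if tag.startswith(prefix)]
    # False sorts before True, so promoted tags lead; then most used first.
    matches.sort(key=lambda t: (t not in promoted_tags, -usage_counts.get(t, 0), t))
    return [(t, usage_counts.get(t, 0), t in promoted_tags) for t in matches[:limit]]
```

The confirmation dialog would show len(to_add). A user tag that appears in the upload simply lands in to_add and so becomes a promoted tag.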


Tracker Trust Meter (UI, Tracker Service, Tracker Dataset)

  • A visual display of the trustworthiness of a dataset.

  • Available in the search results and the details page

  • A red, yellow, and green scale with an icon indicating where the dataset is on the scale

  • Metrics used in the score should be accessible via the Analytics tab

  • Initial Metrics to factor into the score

    • Number of programs reading from the dataset (the more, the better)

    • Time since last new program started reading

    • Overall audit log activity
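
One possible way to turn the three initial metrics into a position on the red/yellow/green scale is sketched below. The spec does not define a formula, so everything numeric here is a placeholder: the weights, the saturation points (10 readers, 1000 audit events), the 30-day decay, the band thresholds, and the assumption that a more recent new reader is better.

```python
import math


def trust_score(reader_count, days_since_new_reader, audit_events):
    """Combine the three metrics into a score between 0.0 and 1.0.

    All constants are illustrative placeholders, not values from the spec.
    """
    readers = min(reader_count / 10.0, 1.0)               # more readers is better, capped
    freshness = math.exp(-days_since_new_reader / 30.0)   # assumed: recent new readers score higher
    activity = min(audit_events / 1000.0, 1.0)            # overall audit activity, capped
    return 0.5 * readers + 0.3 * freshness + 0.2 * activity


def trust_band(score, yellow_at=0.33, green_at=0.66):
    """Map a score to the color band shown by the meter."""
    if score >= green_at:
        return "green"
    if score >= yellow_at:
        return "yellow"
    return "red"
```

Keeping the three components separate makes it straightforward to expose them on the Analytics tab alongside the combined score.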


Data Preview (UI, CDAP Service (maybe))

  • An easy way to preview the data in the dataset

  • Accessed via an additional tab in the Dataset details page

  • Only available if the dataset is record scannable.

  • No custom queries; the preview is a fixed select * ordered by most recent key or timestamp (if possible)

  • No time range or downloading of data.

  • Jump button allows the user to go to the full Explore option in the CDAP UI

  • Limit results to a max of 500

  • Infinite scroll up to 500.
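
The fixed preview query and the 500-row cap on infinite scroll might be sketched as follows. The SQL text is generic and only illustrative (the exact syntax accepted by the CDAP Explore service is not confirmed here), and preview_query, next_page, the page size of 50, and the identifier rule are all assumptions for this example.

```python
import re

MAX_PREVIEW_ROWS = 500

# Assumed identifier rule, used to avoid interpolating arbitrary text into the query.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def preview_query(table, order_by=None):
    """Build the fixed preview query; users cannot supply their own SQL."""
    for name in (table, order_by):
        if name is not None and not IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    sql = f"SELECT * FROM {table}"
    if order_by:
        sql += f" ORDER BY {order_by} DESC"  # most recent key or timestamp first
    return sql + f" LIMIT {MAX_PREVIEW_ROWS}"


def next_page(offset, page_size=50, max_rows=MAX_PREVIEW_ROWS):
    """Return (offset, limit) for the next scroll fetch, or None once the cap is reached."""
    if offset >= max_rows:
        return None
    return offset, min(page_size, max_rows - offset)
```

When next_page returns None the UI stops loading more rows, and the Jump button remains the route to the full Explore page.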

Created in 2020 by Google Inc.