...

  • CDAP Uptime
    • P1: Should indicate how long (in hours or days?) the CDAP Master process has been running.
    • P2: In an HA environment, it would be nice to indicate the time of the last master failover.
  • CDAP System Services
    • P1: Should indicate the current number of instances.
    • P1: Should have a way to scale services.
    • P1: Should show service logs.
    • P2: Node name where container started
    • P2: Container name
    • P2: master.services YARN application name
  • Middle Drawer:
    • CDAP:
      • P1: # of masters, routers, kafka-servers, auth-servers
      • P1: Router requests - # 200s, 404s, 500s
      • P1: # namespaces, artifacts, apps, programs, datasets, streams, views
      • P1: Transaction snapshot summary (invalid, in-progress, committing, committed)
      • P1: Logs/Metrics service lags
      • P2: Last GC pause time
    • HDFS:
      • P1: Space metrics: total, free, used
      • P1: Nodes: total, healthy, decommissioned, decommissionInProgress
      • P1: Blocks: missing, corrupt, under-replicated
    • YARN:
      • P1: Nodes: total, new, running, unhealthy, decommissioned, lost, rebooted
      • P1: Apps: total, submitted, accepted, running, failed, killed, new, new_saving
      • P1: Memory: total, used, free
      • P1: Virtual Cores: total, used, free
      • P1: Queues: total, stopped, running, max_capacity, current_capacity
    • HBase
      • P1: Nodes: total_regionservers, live_regionservers, dead_regionservers, masters
      • P1: No. of namespaces, tables
      • P2: Last major compaction (time + info)
    • ZooKeeper: Most of these come from the output of echo mntr | nc localhost 2181
      • P1: Num of alive connections
      • P1: Num of znodes
      • P1: Num of watches
      • P1: Num of ephemeral nodes
      • P1: Data size
      • P1: Open file descriptor count
      • P1: Max file descriptor count
    • Kafka
    • Sentry
      • P1: # of roles
      • P1: # of privileges
      • P1: memory: total, used, available
      • P1: requests per second
      • any more?
    • KMS
      • TBD: Having a hard time hitting the JMX endpoint for KMS
  • Component Overview
    • P1: YARN, HDFS, HBase, ZooKeeper, Kafka, Hive
    • P1: For each component: version, url, logs_url
    • P2: Sentry, KMS
    • P2: Distribution info
    • P2: Plus button - to store custom components and version, url, logs_url for each.
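
For the ZooKeeper stats in the Middle Drawer, the mntr output mentioned above is a list of tab-separated key/value lines, so collecting the P1 numbers could look roughly like the sketch below. This is only an illustration, not the CDAP implementation: the function names (fetch_mntr, parse_mntr) and the display labels are made up for this example, and the mapping from mntr keys to the P1 items is an assumption about which keys are meant. On newer ZooKeeper releases the mntr command may also need to be enabled in the server configuration before it responds.

```python
import socket

# Assumed mapping from `mntr` keys to the ZooKeeper P1 items in the list above.
# The labels on the right are hypothetical display names.
MNTR_KEYS = {
    "zk_num_alive_connections": "alive connections",
    "zk_znode_count": "znodes",
    "zk_watch_count": "watches",
    "zk_ephemerals_count": "ephemeral nodes",
    "zk_approximate_data_size": "data size",
    "zk_open_file_descriptor_count": "open file descriptors",
    "zk_max_file_descriptor_count": "max file descriptors",
}


def fetch_mntr(host="localhost", port=2181, timeout=5.0):
    """Send the `mntr` command to a ZooKeeper server and return the raw reply.

    Roughly equivalent to `echo mntr | nc localhost 2181`.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Pick the P1 stats out of tab-separated `mntr` output as {label: int}."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep or key not in MNTR_KEYS:
            continue  # skip malformed lines and keys we do not display
        try:
            stats[MNTR_KEYS[key]] = int(value.strip())
        except ValueError:
            continue  # skip non-numeric values
    return stats
```

A caller would run parse_mntr(fetch_mntr()) against each ZooKeeper server. Keeping the parsing separate from the socket call makes it possible to test the parser on captured output without a live server.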

...