...
- CDAP Uptime
- P1: Should indicate how long (in hours or days?) the CDAP Master process has been running.
- P2: In an HA environment, it would be nice to indicate the time of the last master failover.
- CDAP System Services:
- P1: Should indicate the current number of instances.
- P1: Should have a way to scale services.
- P1: Should show service logs.
- P2: Node name where container started
- P2: Container name
- P2: master.services YARN application name
- Middle Drawer:
- CDAP:
- P1: # of masters, routers, kafka-servers, auth-servers
- P1: Router requests - # 200s, 404s, 500s
- P1: # namespaces, artifacts, apps, programs, datasets, streams, views
- P1: Transaction snapshot summary (invalid, in-progress, committing, committed)
- P1: Logs/Metrics service lags
- P2: Last GC pause time
- HDFS:
- P1: Space metrics: total, free, used
- P1: Nodes: total, healthy, decommissioned, decommissionInProgress
- P1: Blocks: missing, corrupt, under-replicated
- YARN:
- P1: Nodes: total, new, running, unhealthy, decommissioned, lost, rebooted
- P1: Apps: total, submitted, accepted, running, failed, killed, new, new_saving
- P1: Memory: total, used, free
- P1: Virtual Cores: total, used, free
- P1: Queues: total, stopped, running, max_capacity, current_capacity
- HBase
- P1: Nodes: total_regionservers, live_regionservers, dead_regionservers, masters
- P1: No. of namespaces, tables
- P2: Last major compaction (time + info)
- Zookeeper: Most of these are from the output of: echo mntr | nc localhost 2181
- P1: Num of alive connections
- P1: Num of znodes
- P1: Num of watches
- P1: Num of ephemeral nodes
- P1: Data size
- P1: Open file descriptor count
- P1: Max file descriptor count
- Kafka
- JMX Metrics that Kafka exposes: https://kafka.apache.org/documentation#monitoring
- P1: # of topics
- P1: Message in rate
- P1: Request rate
- P1: # of under-replicated partitions
- P1: Partition counts
- Sentry
- P1: # of roles
- P1: # of privileges
- P1: Memory: total, used, available
- P1: Requests per second
- any more?
- KMS
- TBD: Having a hard time hitting the JMX endpoint for KMS
- CDAP:
- Component Overview
- P1: YARN, HDFS, HBase, Zookeeper, Kafka, Hive
- P1: For each component: version, url, logs_url
- P2: Sentry, KMS
- P2: Distribution info
- P2: Plus button to add custom components, storing version, url, logs_url for each.
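
As an illustration of how the Zookeeper items above could be collected, here is a minimal sketch that sends the mntr command over a socket (the equivalent of echo mntr | nc localhost 2181) and picks out the listed values. The function names, the ZK_FIELDS mapping, and the dashboard labels are hypothetical and not part of CDAP. The zk_* keys are the ones mntr normally reports, but they should be checked against the ZooKeeper version in use. Newer ZooKeeper releases may also require mntr to be enabled in the four-letter-word whitelist.

```python
import socket


def parse_mntr(text):
    """Parse tab-separated mntr output into a dict, converting integer values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip()
        stats[key.strip()] = int(value) if value.lstrip("-").isdigit() else value
    return stats


def fetch_mntr(host="localhost", port=2181, timeout=5.0):
    """Send the mntr four-letter command and return the raw response text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Hypothetical dashboard labels mapped to the mntr keys they would read from.
ZK_FIELDS = {
    "alive_connections": "zk_num_alive_connections",
    "znodes": "zk_znode_count",
    "watches": "zk_watch_count",
    "ephemeral_nodes": "zk_ephemerals_count",
    "data_size": "zk_approximate_data_size",
    "open_fds": "zk_open_file_descriptor_count",
    "max_fds": "zk_max_file_descriptor_count",
}


def zk_summary(stats):
    """Return only the fields listed in the requirements; missing keys map to None."""
    return {label: stats.get(key) for label, key in ZK_FIELDS.items()}
```

Calling zk_summary(parse_mntr(fetch_mntr())) against a reachable server would return one value per P1 item. Keeping parsing separate from the socket call lets the parser be exercised with captured output.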
...