Tracker Audit Metrics
This is not for the 3.4 release.
The goal of this page is to document the design of the Tracker Audit Metrics.
Use-Cases
- As a user of Tracker, I would like to see the total number of audit messages by type/subtype in the past T timeframe.
- "Show me the total number of reads in the system in the past 1 hour."
- As a user of Tracker, I would like to see the top N datasets/streams by audit message type/subtype activity in the past T timeframe.
- "Show me the 5 datasets with the most writes in the past 24 hours."
- "Show me the 5 streams with the most metadata_changes in the past 7 days."
- As a user of Tracker, I would like to see the top N namespaces with the most type/subtype activity in the past T timeframe.
- "Show me the 5 namespaces with the most reads in the past 1 hour."
- As a user of Tracker, I would like to see the top N programs reading/writing to a specific dataset in the past T timeframe.
- "Show me the top 5 programs writing to dataset1 in the past 1 hour."
Initial High Level Plan
- As audit messages arrive from the Kafka broker and are written to the AuditLog table, any message that matches one of the metrics criteria also updates the metrics in a separate OLAP Cube table (within the same dataset).
- In the service layer, expose a new endpoint that lets users query the data in the metrics table and returns the results as JSON.
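The first step of the plan can be sketched as follows. This is a minimal, in-memory illustration rather than the Tracker implementation: the message field names (`type`, `access_type`, `namespace`, `entity_type`, `entity_name`) and the helper names are assumptions made for the example, and a `Counter` stands in for the Cube table. The measurement names are the ones listed in the cube properties on this page.

```python
from collections import Counter

# Measurement names taken from the cube properties on this page.
KNOWN_TYPES = {"access", "create", "update", "truncate", "delete", "metadata_change"}
ACCESS_SUBTYPES = {"read", "write", "unknown"}


def measurements_for(message):
    """Return the measurement names that one audit message increments.

    The message layout is hypothetical: a dict with a "type" key and,
    for access messages, an optional "access_type" key.
    """
    names = ["count"]  # every message contributes to the overall count
    msg_type = message["type"].lower()
    if msg_type not in KNOWN_TYPES:
        return names
    names.append(msg_type)
    if msg_type == "access":
        subtype = message.get("access_type", "unknown").lower()
        names.append(subtype if subtype in ACCESS_SUBTYPES else "unknown")
    return names


def record(cube, message):
    """Increment the matching measurements for the message's entity."""
    dims = (message["namespace"], message["entity_type"], message["entity_name"])
    for name in measurements_for(message):
        cube[(dims, name)] += 1


cube = Counter()  # stand-in for the metrics Cube table
record(cube, {"type": "ACCESS", "access_type": "READ",
              "namespace": "default", "entity_type": "dataset",
              "entity_name": "dataset1"})
```

Under these assumptions, an access message with no recognizable subtype increments `count`, `access`, and `unknown` together.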
Storing Metrics in AuditLog Dataset
- Add an additional Cube table to the AuditLog custom dataset to hold metrics.
- The properties of the cube will be as follows:
- Resolutions: 1h, 6h, 24h, 1w, 1m, 3m, 6m, 1y
- Aggregations:
- namespace (e.g. default, ns1, ns2)
- namespace,entity_type,entity_name (e.g. default,stream,stream1 or default,dataset,dataset1)
- namespace,entity_type,entity_name,program (e.g. default,stream,stream1,program1 or default,dataset,dataset1,program2)
- Measurements:
- access
- read
- write
- unknown
- create
- update
- truncate
- delete
- metadata_change
- count
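To illustrate how the resolutions and aggregations listed above combine, here is a rough sketch of the keys a single fact could be written under. The aggregation names `agg1` to `agg3` and the function names are hypothetical, chosen only for the example. Only the fixed-width resolutions are covered; the calendar-based ones (1m, 3m, 6m, 1y) are left out because their length in seconds is not constant.

```python
# Fixed-width resolutions from the list above, in seconds.
RESOLUTIONS = {"1h": 3600, "6h": 21600, "24h": 86400, "1w": 604800}

# Dimension sets from the list above; the agg1..agg3 labels are hypothetical.
AGGREGATIONS = {
    "agg1": ("namespace",),
    "agg2": ("namespace", "entity_type", "entity_name"),
    "agg3": ("namespace", "entity_type", "entity_name", "program"),
}


def bucket_start(ts, resolution_seconds):
    """Round an epoch-seconds timestamp down to the start of its bucket."""
    return ts - (ts % resolution_seconds)


def cube_keys(fact, ts):
    """List every (aggregation, resolution, bucket, dimension values) key for a fact.

    An aggregation is skipped when the fact lacks one of its dimensions,
    for example a fact that carries no program.
    """
    keys = []
    for agg, dims in AGGREGATIONS.items():
        if any(d not in fact for d in dims):
            continue
        values = tuple(fact[d] for d in dims)
        for label, seconds in RESOLUTIONS.items():
            keys.append((agg, label, bucket_start(ts, seconds), values))
    return keys
```

A fact without a `program` dimension lands in the first two aggregations only, once per resolution.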
- Queries to OLAP cube
- "Show me the total number of reads in the system in the past 1 hour."
{ "aggregation": "agg2", "resolution": 3600, "startTs": now-1h, "endTs": now, "measurements": {"access_reads": "SUM"}, "limit": 1 }
- "Show me the 5 datasets with the most writes in the past 24 hours."
{ "aggregation": "agg2", "resolution": 86400, "startTs": now-24h, "endTs": now, "measurements": {"access_writes": "SUM"},
"dimensionValues": { "entity_type": "dataset" },
"groupByDimensions": ["namespace","entity_type","entity_name"],
"limit": 10000
}
The results are then sorted and the top 5 returned.
- "Show me the 5 streams with the most metadata_changes in the past 7 days."
{ "aggregation": "agg2", "resolution": 604800, "startTs": now-7d, "endTs": now, "measurements": {"metadata_changes": "SUM"},
"dimensionValues": { "entity_type": "stream" },
"groupByDimensions": ["namespace","entity_type","entity_name"],
"limit": 10000
}
The results are then sorted and the top 5 returned.
- "Show me the 5 namespaces with the most reads in the past 1 hour."
{ "aggregation": "agg2", "resolution": 3600, "startTs": now-1h, "endTs": now, "measurements": {"access_reads": "SUM"},
"groupByDimensions": ["namespace"],
"limit": 10000
}
The results are then sorted and the top 5 returned.
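The query pattern used in these examples (a time window ending now, a large limit, then a sort for the top N) can be sketched as below. This is an illustration only: `build_query` and `top_n` are hypothetical helpers, timestamps are assumed to be epoch seconds with `startTs` as the older bound, and the shape of the result rows (a dict with a `total` key) is an assumption rather than the cube's actual result format.

```python
import time


def build_query(measurement, window_seconds, resolution,
                group_by=None, dimension_values=None,
                limit=10000, now=None):
    """Build a cube query dict covering the last `window_seconds` seconds."""
    end_ts = int(time.time()) if now is None else now
    query = {
        "aggregation": "agg2",
        "resolution": resolution,
        "startTs": end_ts - window_seconds,  # older bound of the window
        "endTs": end_ts,                     # newer bound of the window
        "measurements": {measurement: "SUM"},
        "limit": limit,
    }
    if dimension_values:
        query["dimensionValues"] = dict(dimension_values)
    if group_by:
        query["groupByDimensions"] = list(group_by)
    return query


def top_n(rows, n=5):
    """Sort grouped totals in descending order and keep the first n rows."""
    return sorted(rows, key=lambda row: row["total"], reverse=True)[:n]


# "Show me the 5 datasets with the most writes in the past 24 hours."
query = build_query(
    "access_writes", window_seconds=86400, resolution=86400,
    group_by=["namespace", "entity_type", "entity_name"],
    dimension_values={"entity_type": "dataset"},
)
```

Fetching up to 10000 grouped rows and sorting in the service keeps the cube query simple, at the cost of moving more rows than the final answer needs.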
Endpoints
Method | Endpoint | Description | Params | Sample Data
---|---|---|---|---
GET | /auditmetrics/topEntities?limit={limit} | Returns the entities with the most activity, for use in a general chart listing the entities with the most activity in CDAP | | [ { "namespace": "default", "entityType": "09ed6ccb-fd1a-11e5-a248-0000003b6093", "entityName": "AuditMetrics", "columnValues": { "count": 15, "unknown": 15, "access": 15 } }, { "namespace": "default", "entityType": "b78f346d-fa88-11e5-b588-2ef89310f408", "entityName": "AuditLog", "columnValues": { "count": 12, "unknown": 12, "access": 12 } }, { "namespace": "default", "entityType": "application", "entityName": "CDAPToSlack", "columnValues": { "count": 10, "metadata_change": 10 } }, ... ]
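As a rough illustration of how the service layer might shape this response, the sketch below turns per-entity totals into entries with the field names shown in the sample data. The `top_entities` helper and the layout of its `totals` input are hypothetical; ordering by `count` is an assumption based on the descending counts in the sample.

```python
import json


def top_entities(totals, limit):
    """Build topEntities-style entries from per-entity measurement totals.

    `totals` is a hypothetical mapping of
    (namespace, entity_type, entity_name) -> {measurement: value}.
    """
    entries = [
        {
            "namespace": namespace,
            "entityType": entity_type,
            "entityName": entity_name,
            "columnValues": dict(values),
        }
        for (namespace, entity_type, entity_name), values in totals.items()
    ]
    # Most active entities first, judged by the overall message count.
    entries.sort(key=lambda e: e["columnValues"].get("count", 0), reverse=True)
    return entries[:limit]


totals = {
    ("default", "application", "CDAPToSlack"): {"count": 10, "metadata_change": 10},
    ("default", "dataset", "dataset1"): {"count": 15, "unknown": 15, "access": 15},
}
body = json.dumps(top_entities(totals, limit=1))
```

With `limit=1`, only the entry for `dataset1` is serialized, since it has the higher `count`.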
Created in 2020 by Google Inc.