Replace Kafka With TMS for metrics

Checklist

  • User Stories Documented
  • User Stories Reviewed
  • Design Reviewed
  • APIs reviewed
  • Release priorities assigned
  • Test cases reviewed
  • Blog post

Introduction 

Use TMS to publish and fetch metrics.

Goals

Better replication and less code with TMS.

User Stories 

 

Design

Currently, metrics are published to Kafka by KafkaMetricsCollectionService and processed by KafkaMetricsProcessorService. In KafkaMetricsProcessorService, MetricsMessageCallback is called to process Kafka messages and store the metrics in a metrics data HBase table. MetricsMessageCallback also records the offset of the latest processed message in a metrics meta table.

To replace Kafka with TMS, metrics need to be published to and fetched from the same TMS topic, and processed metrics will continue to be stored in the same metrics data HBase table. A new metrics meta table is also needed to store the messageId of the latest processed message. However, if a user upgrades CDAP from a version that uses Kafka for metrics to a version that uses TMS, unprocessed messages remaining in Kafka still need to be processed.
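The fetch-and-checkpoint loop described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical and in-memory: InMemoryTopic stands in for a TMS topic, Processor.dataTable for the metrics data HBase table, and Processor.lastProcessedId for the new metrics meta table. Message ids are modeled as increasing long values and metric values are simply summed, both of which are simplifying assumptions. None of these names are CDAP or TMS APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MetricsResumeSketch {

  /** Hypothetical message: an id assigned by the topic plus a metric payload. */
  static final class Message {
    final long id;
    final String metricName;
    final long value;

    Message(long id, String metricName, long value) {
      this.id = id;
      this.metricName = metricName;
      this.value = value;
    }
  }

  /** Hypothetical in-memory stand-in for a single messaging topic. */
  static final class InMemoryTopic {
    private final List<Message> log = new ArrayList<>();

    /** Appends a metric and returns its id; ids start at 1 and increase. */
    long publish(String metricName, long value) {
      long id = log.size() + 1;
      log.add(new Message(id, metricName, value));
      return id;
    }

    /** Returns up to 'limit' messages whose id is strictly greater than 'afterId'. */
    List<Message> fetchAfter(long afterId, int limit) {
      List<Message> result = new ArrayList<>();
      for (Message m : log) {
        if (m.id > afterId && result.size() < limit) {
          result.add(m);
        }
      }
      return result;
    }
  }

  /** Hypothetical processor holding stand-ins for the data and meta tables. */
  static final class Processor {
    // Stand-in for the metrics data table: metric name -> summed value.
    final Map<String, Long> dataTable = new HashMap<>();
    // Stand-in for the meta table entry; 0 means nothing has been processed yet.
    long lastProcessedId = 0;

    /** Fetches one batch after the checkpoint, stores it, and advances the checkpoint. */
    int processBatch(InMemoryTopic topic, int limit) {
      List<Message> batch = topic.fetchAfter(lastProcessedId, limit);
      for (Message m : batch) {
        dataTable.merge(m.metricName, m.value, Long::sum);
        lastProcessedId = m.id;
      }
      return batch.size();
    }
  }

  public static void main(String[] args) {
    InMemoryTopic topic = new InMemoryTopic();
    topic.publish("reads", 2);
    topic.publish("reads", 3);
    topic.publish("writes", 1);

    Processor processor = new Processor();
    // Keep fetching until the topic has nothing newer than the checkpoint.
    while (processor.processBatch(topic, 2) > 0) {
      System.out.println("checkpoint=" + processor.lastProcessedId);
    }
    System.out.println(processor.dataTable);
  }
}
```

Because the checkpoint only advances after a message has been stored, a processor that restarts with the persisted id resumes from the first unprocessed message. In a real deployment the data write and the checkpoint write would go to separate tables, so how the two are kept consistent is a design choice this sketch does not address.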

Approach

 

Security Impact 

What is the impact on authorization, and how does the design take care of this aspect?

Impact on Infrastructure Outages 

System behavior (if applicable, document the impact of downstream component failures [YARN, HBase, etc.]) and how the design takes care of this aspect

Test Scenarios

Test ID | Test Description | Expected Results
   
   
   
   

Releases

Release 4.1.0

Related Work

  • Work #1
  • Work #2
  • Work #3

 

Future work

Created in 2020 by Google Inc.