CDAP Components and Functional Responsibilities
Infrastructure components used by Cask Data Application Platform (CDAP)
The following are the underlying infrastructure components used by CDAP and by applications running in CDAP. They are listed in no particular order.
- HDFS
- HBase
- Hive
- Kafka
- YARN
- Zookeeper
- KMS
- Sentry ???
Functional use of infrastructure components
This section describes what each of the underlying components is used for.
HDFS
- CDAP Stream
- Apache Tephra WAL
- Deployed Application Artifact and Dataset Artifact
- Aggregated Logs
- CDAP Fileset Dataset
- YARN distributed cache
- Coprocessor jars
HBase
- CDAP system data and metadata (e.g., Preferences, Application, Namespace, Artifact…)
- Metrics Cube
- Lineage
- Workflow Statistics
- Run Record and Statistics
- Checkpoint information
- CDAP Table Dataset
Kafka
- Logs
- Metrics
- Audit Logs (will be moved to HBase in 4.0)
- Metadata updates (will be moved to HBase in 4.0)
- Notifications (will be moved to HBase in 4.x)
YARN
- System Services
- User applications
Zookeeper
- Routing Tables
- Coordination
- Secret keys
- Auth keys
Hive
- Dataset integration
- Schema
- Properties
- Serde
KMS
- User secrets (e.g., passwords, access tokens)
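
The component-to-responsibility mapping above can also be kept as a small lookup table, which may help when working out which infrastructure component to check for a given kind of data. The following is only an illustrative sketch: the names `COMPONENT_USES` and `components_for` are hypothetical and are not part of any CDAP API, and the entries are transcribed from the lists above.

```python
# Illustrative lookup table transcribed from the lists above.
# COMPONENT_USES and components_for are hypothetical names, not a CDAP API.
COMPONENT_USES = {
    "HDFS": [
        "CDAP Stream",
        "Apache Tephra WAL",
        "Deployed Application Artifact and Dataset Artifact",
        "Aggregated Logs",
        "CDAP Fileset Dataset",
        "YARN distributed cache",
        "Coprocessor jars",
    ],
    "HBase": [
        "CDAP system data and metadata",
        "Metrics Cube",
        "Lineage",
        "Workflow Statistics",
        "Run Record and Statistics",
        "Checkpoint information",
        "CDAP Table Dataset",
    ],
    "Kafka": ["Logs", "Metrics", "Audit Logs", "Metadata updates", "Notifications"],
    "YARN": ["System Services", "User applications"],
    "Zookeeper": ["Routing Tables", "Coordination", "Secret keys", "Auth keys"],
    "Hive": ["Dataset integration", "Schema", "Properties", "Serde"],
    "KMS": ["User secrets"],
}


def components_for(keyword):
    """Return the sorted names of components with a use that mentions keyword.

    Matching is a case-insensitive substring test against each listed use.
    """
    needle = keyword.lower()
    return sorted(
        name
        for name, uses in COMPONENT_USES.items()
        if any(needle in use.lower() for use in uses)
    )


if __name__ == "__main__":
    # Which components are involved in handling logs?
    print(components_for("logs"))
```

For example, `components_for("keys")` returns only `["Zookeeper"]`, since secret keys and auth keys are listed under Zookeeper alone.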
Created in 2020 by Google Inc.