System Datasets
CDAP comes with several system-defined datasets, including but not limited to key-value Tables, indexed Tables and time series. Each of them is defined with the help of one or more embedded Tables, but defines its own interface. Examples include:
The KeyValueTable implements a key/value store as a Table with a single column.
The IndexedTable implements a Table with a secondary key using two embedded Tables, one for the data and one for the secondary index.
The TimeseriesTable uses a Table to store keyed data over time and allows querying that data over ranges of time.
The ObjectMappedTable uses a Table to store Java Objects by mapping object fields to table columns. It can be explored through the use of ad-hoc SQL-like queries as described in ObjectMappedTable Exploration.
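The way these datasets layer their own interface over embedded Tables can be illustrated with a minimal plain-Java sketch. It does not use the CDAP API: SimpleTable, SimpleKeyValueTable, and SimpleIndexedTable are hypothetical in-memory stand-ins, and the real KeyValueTable and IndexedTable may differ in their methods and storage layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a Table: row key -> (column -> value). Not the CDAP Table API. */
class SimpleTable {
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(column, value);
    }

    String get(String row, String column) {
        Map<String, String> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    /** Returns a copy of the row's columns, or an empty map if the row does not exist. */
    Map<String, String> getRow(String row) {
        Map<String, String> columns = rows.get(row);
        return columns == null ? new TreeMap<>() : new TreeMap<>(columns);
    }

    void delete(String row, String column) {
        Map<String, String> columns = rows.get(row);
        if (columns != null) {
            columns.remove(column);
            if (columns.isEmpty()) {
                rows.remove(row);
            }
        }
    }
}

/** Key/value store built on one embedded table that uses a single fixed column. */
class SimpleKeyValueTable {
    private static final String COLUMN = "c"; // assumed column name for this sketch
    private final SimpleTable table = new SimpleTable();

    void write(String key, String value) { table.put(key, COLUMN, value); }

    String read(String key) { return table.get(key, COLUMN); }
}

/** Table with a secondary index: one embedded table for data, one for the index. */
class SimpleIndexedTable {
    private final SimpleTable data = new SimpleTable();
    private final SimpleTable index = new SimpleTable(); // row = indexed value, column = data row key
    private final String indexedColumn;

    SimpleIndexedTable(String indexedColumn) { this.indexedColumn = indexedColumn; }

    void put(String row, String column, String value) {
        if (column.equals(indexedColumn)) {
            String old = data.get(row, column);
            if (old != null) {
                index.delete(old, row); // drop the stale index entry
            }
            index.put(value, row, "");
        }
        data.put(row, column, value);
    }

    /** Looks up data row keys by the value of the indexed column. */
    List<String> rowsByIndex(String value) {
        return new ArrayList<>(index.getRow(value).keySet());
    }
}

class IndexedTableDemo {
    public static void main(String[] args) {
        SimpleIndexedTable users = new SimpleIndexedTable("city");
        users.put("u1", "city", "Paris");
        users.put("u2", "city", "Paris");
        System.out.println(users.rowsByIndex("Paris"));
    }
}
```

The index table in this sketch is kept consistent by removing the old index entry whenever the indexed column of a row is overwritten.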
Custom Datasets
You can define your own dataset classes to implement common data patterns specific to your code.
...
Code Block |
---|
class MyApp extends AbstractApplication { public void configure() { createDataset("myCounters", UniqueCountTable.class); ... } } |
Passing Properties
You can also pass DatasetProperties as a third parameter to the createDataset method. These properties will be used by embedded datasets during creation and will be available via the DatasetSpecification passed to the dataset constructor. For example, to create a dataset with a TTL (time-to-live, specified in seconds) property, you can use:
...
You can pass other properties, such as for conflict detection and for pre-splitting into multiple regions.
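To make the meaning of a TTL property given in seconds more concrete, here is a minimal plain-Java model of a store that reads its TTL from a properties map passed to its constructor. It is only a sketch of the semantics, not CDAP code: TtlStore, the "ttl" property key, and the injected clock are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical in-memory store that honours a "ttl" property given in seconds. */
class TtlStore {
    static final String PROPERTY_TTL = "ttl"; // hypothetical key, not a CDAP constant

    private final long ttlMillis;
    private final LongSupplier clock; // supplies the current time in milliseconds
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> writeTimes = new HashMap<>();

    TtlStore(Map<String, String> properties, LongSupplier clock) {
        String ttl = properties.get(PROPERTY_TTL);
        // No TTL property means entries never expire in this sketch.
        this.ttlMillis = ttl == null ? Long.MAX_VALUE : Long.parseLong(ttl) * 1000L;
        this.clock = clock;
    }

    void write(String key, String value) {
        values.put(key, value);
        writeTimes.put(key, clock.getAsLong());
    }

    /** Returns the value, or null if it was never written or is older than the TTL. */
    String read(String key) {
        Long written = writeTimes.get(key);
        if (written == null) {
            return null;
        }
        if (clock.getAsLong() - written >= ttlMillis) {
            values.remove(key);
            writeTimes.remove(key);
            return null;
        }
        return values.get(key);
    }
}

class TtlStoreDemo {
    public static void main(String[] args) {
        Map<String, String> properties = new HashMap<>();
        properties.put(TtlStore.PROPERTY_TTL, "3600"); // one hour, in seconds
        TtlStore store = new TtlStore(properties, System::currentTimeMillis);
        store.write("k", "v");
        System.out.println(store.read("k"));
    }
}
```

Passing the clock as a parameter is a choice made for this sketch so that expiry can be exercised without waiting for real time to pass.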
Accessing a Dataset
Application components can access a custom dataset in the same way as all other datasets: via either the @UseDataSet annotation, or the getDataset() method of the program context. This is described in more detail in the section on Using Datasets in Programs.
You can also create, drop, and truncate datasets using the Dataset Microservices.
Annotating Dataset Methods
Dataset methods can be annotated with the type of access that they perform on data. Annotations help the CDAP runtime to enforce authorization, as well as track lineage. Dataset methods (including constructors) can be annotated with one of:
...
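As a rough illustration of how a runtime could use such annotations, the following plain-Java sketch marks dataset methods with an access type and reads that type back through reflection. The annotations ReadsData and WritesData, the CounterDataset class, and the AccessInspector helper are hypothetical names defined here for the example; they are not the CDAP annotations, and CDAP's actual enforcement mechanism may work differently.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical marker for methods that only read data. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.CONSTRUCTOR})
@interface ReadsData {}

/** Hypothetical marker for methods that only write data. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.CONSTRUCTOR})
@interface WritesData {}

/** Hypothetical dataset whose methods declare the access they perform. */
class CounterDataset {
    private long count;

    @WritesData
    public void increment() { count++; }

    @ReadsData
    public long get() { return count; }

    // Deliberately left without an annotation to show the fallback case.
    public String describe() { return "counter"; }
}

class AccessInspector {
    /** Reports the declared access type of a no-argument method: READ, WRITE, or UNKNOWN. */
    static String accessTypeOf(Class<?> type, String methodName) {
        Method method;
        try {
            method = type.getDeclaredMethod(methodName);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
        if (method.isAnnotationPresent(ReadsData.class)) {
            return "READ";
        }
        if (method.isAnnotationPresent(WritesData.class)) {
            return "WRITE";
        }
        return "UNKNOWN";
    }
}

class AccessInspectorDemo {
    public static void main(String[] args) {
        System.out.println(AccessInspector.accessTypeOf(CounterDataset.class, "increment"));
        System.out.println(AccessInspector.accessTypeOf(CounterDataset.class, "get"));
    }
}
```

The annotations need runtime retention for this to work, since annotations with the default class-file retention are not visible to reflection.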