...
Why are we adding this to RuntimeContext, and not to DatasetContext? The idea is that DatasetContext represents a way to obtain an instance of a dataset in a transactional context. Admin operations are not transactional, and therefore it seems cleaner to add them separately from DatasetContext. Also, in the future this Admin interface will provide non-dataset-related operations.
One complication lies hidden in the implementation of getDatasetProperties(): when creating a dataset with a certain set of properties, the dataset framework of CDAP does not store that set of properties. Instead, it calls the configure() method of the dataset definition with these properties. This method returns a dataset spec that contains properties, but it is up to the implementation of every single dataset definition to construct that spec, and it may not reflect the original properties that were passed in. For example, it may contain some properties that are derived from the original properties, or it may use the original properties to set properties on its embedded datasets, but not its own properties.
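The problem can be illustrated with a minimal sketch. It does not use the CDAP classes: SpecSketch and ConfigureSketch are hypothetical stand-ins, and the property names "ttl.seconds" and "ttl.millis" as well as the embedded dataset name "index" are made up for illustration. The sketch assumes a configure() that stores only a derived property on its own spec and forwards the original properties to an embedded dataset, so reading the properties back from the top-level spec does not return what the caller passed in.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a dataset spec: own properties plus embedded specs.
// This is NOT the CDAP DatasetSpecification class.
final class SpecSketch {
    final Map<String, String> properties;
    final Map<String, SpecSketch> embedded;

    SpecSketch(Map<String, String> properties, Map<String, SpecSketch> embedded) {
        this.properties = Collections.unmodifiableMap(new HashMap<String, String>(properties));
        this.embedded = Collections.unmodifiableMap(new HashMap<String, SpecSketch>(embedded));
    }
}

// Hypothetical stand-in for one dataset definition's configure() logic.
final class ConfigureSketch {

    // Builds a spec whose own properties are derived from the originals,
    // while the originals themselves go only to an embedded dataset.
    static SpecSketch configure(Map<String, String> original) {
        Map<String, String> own = new HashMap<String, String>();
        String ttlSeconds = original.get("ttl.seconds");
        if (ttlSeconds != null) {
            // Derived property: the original key is not kept on this spec.
            own.put("ttl.millis", String.valueOf(Long.parseLong(ttlSeconds) * 1000L));
        }

        // The embedded dataset receives the original properties unchanged.
        SpecSketch index = new SpecSketch(
            original, Collections.<String, SpecSketch>emptyMap());
        Map<String, SpecSketch> embedded = new HashMap<String, SpecSketch>();
        embedded.put("index", index);

        return new SpecSketch(own, embedded);
    }

    // True if every original key/value pair is present in the spec's own properties.
    static boolean preservesOriginal(Map<String, String> original, SpecSketch spec) {
        return spec.properties.entrySet().containsAll(original.entrySet());
    }
}
```

In this sketch, preservesOriginal() returns false for the top-level spec and true for the embedded "index" spec. A check of this kind is one possible way to test whether a given definition keeps the original properties.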
TODO: check all CDAP implementations of DatasetDefinition.configure() for whether they preserve the original dataset properties. There are about 40 implementations in our code base.