...
The FileSet, PartitionedFileSet, and TimePartitionedFileSet datasets can be explored through ad-hoc SQL-like queries. To enable exploration, you must set several properties when creating the dataset, and the files in your dataset must meet certain requirements. These properties and requirements are described below.
Explore Properties
A FileSet, PartitionedFileSet, or TimePartitionedFileSet is made explorable by setting several properties when creating the dataset. The FileSetProperties class (PartitionedFileSetProperties or TimePartitionedFileSetsProperties classes for the other two types) should be used to set the following required properties:
...
setUseExisting(true) allows the dataset to use an existing Hive table. However, when the dataset is dropped or truncated, or when exploration is disabled for the dataset, the Hive table remains unaffected.
setPossessExisting(true) directs the dataset to take possession of an existing Hive table. This means that when the dataset is dropped or truncated, or when exploration is disabled for the dataset, the Hive table will be dropped or cleared of all partitions.
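The difference between the two settings above can be illustrated with a small toy model. This is only a sketch of the ownership semantics described in the text, not CDAP code: the names ExploreOwnershipSketch, HiveTable, Mode, dropDataset, and truncateDataset are hypothetical and exist only for this illustration, and the sketch assumes that dropping maps to removing the table and truncating maps to clearing its partitions. Disabling exploration is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the ownership semantics; hypothetical names, not part of CDAP. */
final class ExploreOwnershipSketch {

    /** Hypothetical stand-in for a Hive table that existed before the dataset was created. */
    static final class HiveTable {
        boolean exists = true;
        final List<String> partitions =
            new ArrayList<>(Arrays.asList("2015-01-01", "2015-01-02"));
    }

    /** Mirrors the two settings: setUseExisting(true) and setPossessExisting(true). */
    enum Mode { USE_EXISTING, POSSESS_EXISTING }

    /** Dropping the dataset removes the Hive table only if the dataset possesses it. */
    static void dropDataset(HiveTable table, Mode mode) {
        if (mode == Mode.POSSESS_EXISTING) {
            table.partitions.clear();
            table.exists = false;
        }
        // USE_EXISTING: the table is left untouched.
    }

    /** Truncating the dataset clears the partitions only if the dataset possesses the table. */
    static void truncateDataset(HiveTable table, Mode mode) {
        if (mode == Mode.POSSESS_EXISTING) {
            table.partitions.clear();
        }
        // USE_EXISTING: the table is left untouched.
    }

    public static void main(String[] args) {
        HiveTable used = new HiveTable();
        dropDataset(used, Mode.USE_EXISTING);
        System.out.println("use existing, after drop: exists=" + used.exists
            + ", partitions=" + used.partitions.size());

        HiveTable possessed = new HiveTable();
        truncateDataset(possessed, Mode.POSSESS_EXISTING);
        System.out.println("possess existing, after truncate: exists=" + possessed.exists
            + ", partitions=" + possessed.partitions.size());

        dropDataset(possessed, Mode.POSSESS_EXISTING);
        System.out.println("possess existing, after drop: exists=" + possessed.exists);
    }
}
```

In practice, the choice comes down to who owns the table's lifecycle: use an existing table when other systems depend on it and it must outlive the dataset, and possess it when the dataset should be the single point of cleanup.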
Limitations
There are several limitations for fileset exploration:
...