...

The FileSet, PartitionedFileSet, and TimePartitionedFileSet datasets can be explored through ad-hoc SQL-like queries. To enable exploration, you must set several properties when creating the dataset, and the files in your dataset must meet certain requirements. These properties and requirements are described below.

Explore Properties

A FileSet, PartitionedFileSet, or TimePartitionedFileSet is made explorable by setting several properties when creating the dataset. Use the FileSetProperties class (or the PartitionedFileSetProperties or TimePartitionedFileSetProperties class for the other two types) to set the following required properties:

...

  • setUseExisting(true) allows the dataset to use an existing Hive table. When the dataset is dropped or truncated, or when exploration is disabled for the dataset, the Hive table remains unaffected.

  • setPossessExisting(true) directs the dataset to take possession of an existing Hive table. When the dataset is dropped or truncated, or when exploration is disabled for the dataset, the Hive table is dropped or cleared of all its partitions.
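
The difference between the two options can be illustrated with a small self-contained model. This is only a sketch and not the actual implementation: the ExistingTableModes class, its Mode enum, and the HiveTable stand-in are all hypothetical names invented for illustration. It also assumes that dropping the dataset drops a possessed table while truncating only clears its partitions; the text above does not say which outcome belongs to which event, and disabling exploration is left out for the same reason.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the two "existing table" options; not real dataset code. */
public class ExistingTableModes {

    /** Which option was set when the dataset was created. */
    public enum Mode { USE_EXISTING, POSSESS_EXISTING }

    /** Toy stand-in for a Hive table: a list of partition names and a dropped flag. */
    public static final class HiveTable {
        public final List<String> partitions = new ArrayList<>();
        public boolean dropped = false;
    }

    /** Assumption: dropping the dataset drops the table only if it is possessed. */
    public static void onDatasetDropped(Mode mode, HiveTable table) {
        if (mode == Mode.POSSESS_EXISTING) {
            table.partitions.clear();
            table.dropped = true;
        }
        // USE_EXISTING: the table is left exactly as it was.
    }

    /** Assumption: truncating the dataset clears partitions only if the table is possessed. */
    public static void onDatasetTruncated(Mode mode, HiveTable table) {
        if (mode == Mode.POSSESS_EXISTING) {
            table.partitions.clear();
        }
        // USE_EXISTING: the table is left exactly as it was.
    }

    public static void main(String[] args) {
        HiveTable used = new HiveTable();
        used.partitions.add("2015-01-01");
        onDatasetDropped(Mode.USE_EXISTING, used);
        System.out.println("use existing, dropped: " + used.dropped);

        HiveTable possessed = new HiveTable();
        possessed.partitions.add("2015-01-01");
        onDatasetDropped(Mode.POSSESS_EXISTING, possessed);
        System.out.println("possess existing, dropped: " + possessed.dropped);
    }
}
```

In this model, a table that is merely used survives every dataset lifecycle event, so it is the safer choice when other systems also depend on the table.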

Limitations

There are several limitations for fileset exploration:

...