One common use case is someone dumping snapshots of a database into a TPFS. They don't want the TPFS to grow infinitely; they only want to keep X days of data in the dataset. This setting lets them cap how many days of data are kept by deleting older partitions after the pipeline run.
Added a configuration property to Hydrator TPFS sinks that cleans up data older than a threshold age.
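For illustration, the retention logic being described boils down to computing a cutoff from the configured number of days and dropping partitions older than it. A minimal sketch (Python here purely for brevity; the actual sink is Java and would go through the CDAP partition APIs, and the function name is made up):

```python
from datetime import datetime, timedelta

def partitions_to_drop(partition_times, retention_days, now=None):
    """Return the partition timestamps older than the retention window.

    partition_times: list of datetime objects, one per partition.
    retention_days: keep only partitions newer than this many days.
    now: injectable clock for testing; defaults to the current UTC time.
    """
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=retention_days)
    # Anything strictly older than the cutoff is eligible for deletion.
    return [t for t in partition_times if t < cutoff]
```

So with a 7-day retention configured, a post-run cleanup step would compute the cutoff once and drop every partition whose time falls before it.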
In a way, we are implementing a dataset feature (TTL) in the sink here. Is that the correct approach?
I would say it's a useful application-level stopgap that is good enough for a lot of use cases. Since proper fileset TTL is much more involved, I think it makes sense to have this now until the platform feature is available.
That works for me. Do we have a JIRA for supporting TTL in FileSets?