Preview doesn't close leveldb files

Description

I'm running preview of a pipeline about 4 times per hour on a CDAP Sandbox instance. Within about 11 hours, the CDAP Sandbox has hit its open file limit of ~4k.

Each run seems to leak about 40-100 files, and they mostly seem to be files used for LevelDB.

Release Notes

Fixed a resource leak in the preview feature.

Activity

Ali Anwar
August 14, 2018, 12:10 AM

Two users may be running preview independently, right? So it wouldn't be good to lower that value to 1.
There is a continued leak even after that limit of 10 is encountered.

See the attached graph of open file usage. The first 10 bumps are large, but the count keeps increasing even when we run preview more times after that.

Sagar Kapare
August 14, 2018, 12:33 AM

It's unclear from the graph at which point we reach the limit of 10. Is it at 22:00, or does it happen much earlier than that? Any idea why the graph is flat after that at the value of 3.8K?

One thing we can try: set PREVIEW_CACHE_SIZE to 1 and launch one preview run. Capture the lsof output for the process. Launch another preview run. Since the cache size is 1, the LevelDB for the older run should get deleted. Capture the lsof output again and check whether the file descriptors for the deleted directories are still held by the process.
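
The before/after lsof comparison could also be scripted. Below is a minimal sketch in Java, assuming a Linux host where /proc/<pid>/fd is readable by the current user and where the kernel appends " (deleted)" to the link target of an unlinked file. The class name DeletedFdCheck and the "preview" path marker are hypothetical and are not part of CDAP.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class DeletedFdCheck {
  // Resolve the target of every fd symlink of a process (Linux only).
  static List<String> fdTargets(String pid) throws IOException {
    List<String> targets = new ArrayList<>();
    try (DirectoryStream<Path> fds = Files.newDirectoryStream(Paths.get("/proc", pid, "fd"))) {
      for (Path fd : fds) {
        try {
          targets.add(Files.readSymbolicLink(fd).toString());
        } catch (IOException e) {
          // The fd was closed between listing and reading; skip it.
        }
      }
    }
    return targets;
  }

  // Count targets that contain the marker and refer to an unlinked file.
  static long countDeleted(List<String> targets, String marker) {
    return targets.stream()
        .filter(t -> t.contains(marker) && t.endsWith(" (deleted)"))
        .count();
  }

  public static void main(String[] args) throws IOException {
    // Assumption: the pid is passed as the first argument; "self" inspects this JVM.
    String pid = args.length > 0 ? args[0] : "self";
    long leaked = countDeleted(fdTargets(pid), "preview");
    System.out.println("fds held on deleted preview files: " + leaked);
  }
}
```

If the count is non-zero after the second preview run, the process is still holding descriptors for the first run's deleted directory.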

Ali Anwar
August 14, 2018, 12:56 AM

The graph shows the runs from the beginning, so the 10th step (counting from the left) is the 10th run.
The graph is flat after the value of 3.8K because additional preview runs fail to run (due to "Too many open files" errors).

mentioned that the leak is probably due to not closing the DB object in LevelDBTableService.

Sagar Kapare
August 14, 2018, 1:00 AM

Yeah, that's true.

We are deleting the directory but not closing the underlying DB in LevelDBTableService, which causes the process to keep holding fds for the deleted files as well.
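
The ordering problem described here can be illustrated with a small sketch: close the open handle first, then delete the directory. This is not the actual LevelDBTableService code. The class TableHandles and its methods are hypothetical, and any java.io.Closeable stands in for the LevelDB DB object.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TableHandles {
  // Hypothetical registry: data directory -> open handle (stand-in for a LevelDB DB).
  private final Map<Path, Closeable> open = new ConcurrentHashMap<>();

  void register(Path dir, Closeable handle) {
    open.put(dir, handle);
  }

  // Close the handle before deleting, so no fds stay attached to unlinked files.
  void drop(Path dir) throws IOException {
    Closeable handle = open.remove(dir);
    if (handle != null) {
      handle.close();
    }
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order puts children before their parent directories.
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }

  int openCount() {
    return open.size();
  }
}
```

Skipping the close() call and only deleting the directory would remove the files from the filesystem while leaving their descriptors open in the process, which matches the behavior seen in the graph.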

Ali Anwar
August 16, 2018, 8:52 PM
Fixed

Assignee

Ali Anwar

Reporter

Ali Anwar

Docs Impact

None

UX Impact

None

Fix versions

Priority

Major