Description

Recently we observed on a few CDAP instances that the Runtime pod goes OOM every few minutes.

The Runtime pod uses LevelDB for caching metadata. When it goes OOM, we noticed that the LevelDB directories had grown very large:
```
/data/ldb# du -h
38G ./cdap_system.entity.store.d
20K ./cdap_system.entity.registry
12K ./cdap_system.entity.store.i
```

After cleaning these LevelDB directories, the pod seems to be stable. We should investigate what is causing this OOM and fix it.

Release Notes

None

Activity

Wangyuan Zhang February 7, 2022 at 4:39 PM

Internal. No need for a release note

Robin Rielley January 28, 2022 at 9:26 PM

Does this have doc impact or need a release note? Looks like an internal fix?

Wangyuan Zhang January 28, 2022 at 7:30 PM

The issue is large trigger info in the run record. In 6.6, we trim the run record before writing it to the local LevelDB in the runtime pod.
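
To illustrate the idea of trimming a record before it is cached, here is a minimal sketch. It is not the actual CDAP change: the record layout, the `triggerInfo` key, the `trim_run_record` helper, and the 1 KiB limit are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical size limit; the threshold used by the real fix is not stated in this issue.
MAX_PROPERTY_BYTES = 1024


def trim_run_record(record: dict, max_bytes: int = MAX_PROPERTY_BYTES) -> dict:
    """Return a copy of the record with oversized property values dropped.

    The input record is left unmodified.
    """
    trimmed = dict(record)
    props = record.get("properties", {})
    # Keep only properties whose UTF-8 encoded value fits within the limit.
    trimmed["properties"] = {
        key: value
        for key, value in props.items()
        if len(value.encode("utf-8")) <= max_bytes
    }
    return trimmed


# Example record with one very large (hypothetical) trigger info property.
record = {
    "runId": "run-1",
    "status": "RUNNING",
    "properties": {
        "triggerInfo": "x" * 50_000,
        "namespace": "default",
    },
}

small = trim_run_record(record)
serialized = json.dumps(small)  # this is what would be written to the local store
```

Only the trimmed copy would be written to the local store, so a single run with a large trigger payload no longer inflates the cache; the full record remains available from wherever it originated.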

Fixed
Details

Assignee

Wangyuan Zhang

Reporter

Vinisha Shah

Affects versions

Fix versions

Priority

Created October 14, 2021 at 10:46 PM
Updated February 7, 2022 at 4:39 PM
Resolved January 28, 2022 at 7:28 PM