Expected Mgr Memory Usage - ceph-users

3 Mar 2020

Hello all,

I'm maintaining a small 12-OSD Nautilus cluster (36 TB raw). My mon nodes have the
mgr and mds daemons collocated (stacked) with the mon. Each node is allocated 10 GB of RAM.

During a recent single-disk failure and the corresponding recovery, I noticed that my
mgrs and mons were getting OOM-killed and restarted roughly every 5 hours, with the mgr
using around 6.5 GB on all my nodes. My monitoring shows an interesting sawtooth pattern:
network usage (100 MB/s at peak), disk storage usage, and disk I/O (up to 300 MB/s against
SSDs at peak) all increase in parallel with memory usage.
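
For anyone who wants to reproduce the memory curve, here is a minimal sketch of sampling
the mgr's resident memory. It assumes a Linux host with /proc and that the daemon's
process name is ceph-mgr; the function names are made up for illustration.

```python
import os


def parse_vmrss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line format: "VmRSS:     123456 kB"
            return int(line.split()[1])
    return None


def find_pids(process_name):
    """Return PIDs whose /proc/<pid>/comm matches process_name."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() == process_name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return pids


def sample_rss_gib(process_name="ceph-mgr"):
    """Return {pid: resident memory in GiB} for matching processes.

    The default process name is an assumption; adjust it for your packaging.
    """
    result = {}
    for pid in find_pids(process_name):
        try:
            with open(f"/proc/{pid}/status") as f:
                rss_kib = parse_vmrss_kib(f.read())
        except OSError:
            continue
        if rss_kib is not None:
            result[pid] = rss_kib / (1024 * 1024)
    return result
```

Calling sample_rss_gib() once a minute (from cron, for example) and logging the result
with a timestamp should be enough to show whether the sawtooth lines up with recovery
traffic.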

I know the docs for hardware recommendations say:
...
  Monitor and manager daemon memory usage generally scales with the size of the
  cluster. For small clusters, 1-2 GB is generally sufficient. For large clusters,
  you should provide more (5-10 GB).

Now, I would like to think my cluster is on the small side of things, so I was hoping
10 GB would be enough for the mgr and mon (my OSD nodes are only allocated 32 GB of RAM),
but that assumption appears to be false.

So I was wondering how mgrs (and, to a lesser extent, mons) are expected to scale
in terms of memory. Is it driven by the OSD count, the OSD size, the number of PGs, etc.?
Also, is there a way to limit the amount of RAM used by the mgrs? (It seems the
mon_osd_cache_size and rocksdb_cache_size settings are for mons, if I'm not mistaken.)
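
For reference, a minimal sketch of reading those two options through the `ceph config get`
CLI. It assumes the ceph client and an admin keyring are available on the host; the helper
names are hypothetical, and the value reported comes from the centralized config (or the
default), which may differ from what a running daemon picked up from a local ceph.conf.

```python
import subprocess


def config_get_cmd(who, option):
    """Build the argument list for `ceph config get <who> <option>`."""
    return ["ceph", "config", "get", who, option]


def config_get(who, option):
    """Run the command and return its trimmed output (needs a reachable cluster)."""
    proc = subprocess.run(
        config_get_cmd(who, option),
        check=True,
        capture_output=True,
        text=True,
    )
    return proc.stdout.strip()


def show_mon_cache_settings():
    """Print the two mon-side cache options mentioned above."""
    for option in ("mon_osd_cache_size", "rocksdb_cache_size"):
        print(option, "=", config_get("mon", option))
```

Running show_mon_cache_settings() on an admin node prints the current values; it does not
answer whether an equivalent knob exists for the mgr.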

Regards,
Mark