[adding dev]
On Wed, 9 Oct 2019, Aaron Johnson wrote:
> Hi all,
>
> I have a smallish test cluster (14 servers, 84 OSDs) running 14.2.4.
> Monthly OS patching, and the reboots that go along with it, have left
> the cluster very unwell.
>
> Many of the servers in the cluster are OOM-killing the ceph-osd
> processes when they try to start (6 OSDs per server, running on
> FileStore). strace shows the ceph-osd processes spending hours
> reading through the 220k osdmap files after being started.
Is the process size growing during this time? There should be a cap on
the size of the OSDMap cache; perhaps there is a regression there.
One common thing to do here is 'ceph osd set noup' and restart the OSD,
and then monitor the OSD's progress catching up on maps with 'ceph daemon
osd.NN status' (compare the epoch to what you get from 'ceph osd dump |
head'). This will take a while if you really are 220k maps (!!!) behind,
but memory usage during that period should be relatively constant.
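A minimal sketch of that epoch comparison, in case it helps. The two
ceph invocations are left as comments and replaced with hypothetical
sample numbers so the sketch runs without a cluster (the exact JSON
field name from the admin socket may vary by release):

```shell
#!/bin/sh
# Sketch: compare one OSD's newest map epoch to the cluster epoch.
# On a live cluster the values would come from:
#   osd_epoch=$(ceph daemon osd.NN status | jq .newest_map)
#   cluster_epoch=$(ceph osd dump | awk 'NR==1 {print $2}')
# Hypothetical sample numbers stand in here for illustration.
osd_epoch=183000
cluster_epoch=403000

behind=$((cluster_epoch - osd_epoch))
if [ "$behind" -gt 0 ]; then
    echo "osd is $behind maps behind"
else
    echo "osd is caught up"
fi
```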
> This behavior started after we recently made it about 72% full to see
> how things behaved. We also upgraded it to Nautilus 14.2.2 at about
> the same time.
>
> I've tried starting just one OSD per server at a time in hopes of
> avoiding the OOM killer. I also tried setting noin, rebooting the
> whole cluster, waiting a day, then marking each of the OSDs in
> manually. The end result is the same either way: about 60% of PGs are
> still down, 30% are peering, and the rest are in worse shape.
In instances like this in the past, getting all OSDs caught up on maps
and then unsetting 'noup' has let them all come up and peer at the same
time. What usually goes wrong is that many of the OSDs are not actually
caught up, and it's not immediately obvious, so PGs don't peer. Setting
noup and waiting for all OSDs to catch up (as per 'ceph daemon osd.NNN
status') first generally helps.
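Concretely, the wait-for-everyone step might look like this sketch. The
per-OSD query is stubbed with a hypothetical epoch so it runs without a
cluster; on a real host the commented lines replace the stubs:

```shell
#!/bin/sh
# Wait until every OSD's newest map matches the cluster epoch before
# unsetting noup. OSD ids and epochs are stubbed for illustration;
# on a live cluster:
#   cluster_epoch=$(ceph osd dump | awk 'NR==1 {print $2}')
#   epoch=$(ceph daemon osd.$id status | jq .newest_map)
cluster_epoch=403000
all_caught_up=yes
for id in 0 1 2; do
    epoch=403000   # stub for the per-OSD admin socket query above
    if [ "$epoch" -lt "$cluster_epoch" ]; then
        all_caught_up=no
        echo "osd.$id is still at epoch $epoch"
    fi
done
if [ "$all_caught_up" = yes ]; then
    echo "all OSDs caught up; safe to run: ceph osd unset noup"
fi
```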
But none of that explains why you're seeing OOM, so I'm curious what you
see with memory usage while OSDs are catching up...
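For watching that memory usage, a tiny sketch of sampling an OSD's
resident set size (Linux /proc). It reads the current shell's own RSS
so it runs anywhere; the commented pgrep line is a hypothetical way to
find the real ceph-osd pid:

```shell
#!/bin/sh
# Sample a process's resident set size from /proc (Linux).
# For a real OSD something like this would find the pid:
#   pid=$(pgrep -f 'ceph-osd.*--id NN')
# Here we read this shell's own RSS so the sketch is self-contained.
pid=$$
rss_kb=$(awk '/^VmRSS/ {print $2}' /proc/"$pid"/status)
echo "pid $pid RSS: ${rss_kb} kB"
```

Sampling that in a loop (e.g. every few seconds) while an OSD chews
through maps should show whether the process is growing or flat.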
Thanks!
sage
> Anyone out there have suggestions about how I should go about getting
> this cluster healthy again? Any ideas appreciated.
>
> Thanks!
> - Aaron