On 3/10/20 10:48 AM, Hartwig Hauschild wrote:
Hi,
I've done a bit more testing ...
On 05.03.2020, Hartwig Hauschild wrote:
Hi,
I'm (still) testing upgrading from Luminous to Nautilus and ran into the
following situation:
The lab setup I'm testing in has three OSD hosts.
If one of those hosts dies, the store.db in /var/lib/ceph/mon/ on all my
mon nodes starts to grow rapidly until either the OSD host comes back up
or the disks are full.
This also happens when I take a single OSD offline: /var/lib/ceph/mon/
grew from around 100MB to ~2GB in about 5 minutes, at which point I
aborted the test.
Since we've had an OSD host fail over a weekend, I know that the growth
won't stop until the disk is full. That usually happens in around 20
minutes, by which point the store takes up 17GB of disk space.
On another cluster that's still on Luminous I don't see any growth at all.
I retested that cluster as well. Observing the size on disk of
/var/lib/ceph/mon/ suggests that there are writes and deletes / compactions
going on, as it kept floating within +-5% of the original size.
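
For anyone who wants to reproduce the size observations, here is a minimal
sketch of how the store could be sampled. It is not how the numbers in this
thread were taken; the helper names (dir_size_bytes, growth_rate_mb_per_min,
watch) and the 60 second interval are assumptions made for illustration.

```python
import os
import time


def dir_size_bytes(path):
    """Sum the sizes of all files below path (similar in spirit to du -sb)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # Files can vanish during a compaction while we walk the tree.
                pass
    return total


def growth_rate_mb_per_min(size_before, size_after, seconds):
    """Average growth in MB per minute between two samples given in bytes."""
    return (size_after - size_before) / 1e6 / (seconds / 60.0)


def watch(path, interval=60, samples=5):
    """Print the store size and its growth rate every `interval` seconds."""
    prev = dir_size_bytes(path)
    for _ in range(samples):
        time.sleep(interval)
        cur = dir_size_bytes(path)
        rate = growth_rate_mb_per_min(prev, cur, interval)
        print("%.1f MB, %+.1f MB/min" % (cur / 1e6, rate))
        prev = cur


# Hypothetical usage on a mon node:
# watch("/var/lib/ceph/mon")
```

With the figures from the single-OSD test (about 100MB to about 2GB in 5
minutes), growth_rate_mb_per_min(100e6, 2000e6, 300) comes out at roughly
380 MB per minute.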
Is that a difference in behaviour between Luminous and Nautilus, or is it
caused by the lab setup only having three hosts, so that one lost host
causes all PGs to be degraded at the same time?
I've read somewhere in the docs that I should provide ample space (tens of
GB) for the store.db, and found on the ML and bug tracker that ~100GB might
not be a bad idea and that large clusters may require an order of magnitude
more.
Is there some sort of formula I can use to approximate the space required?
I don't know of a formula, but make sure you have enough space. MONs
are dedicated nodes in most production environments, so I usually
install a 400 to 1000GB SSD just to make sure they don't run out of space.
Also: is the db supposed to grow this fast in Nautilus when it did not do
that in Luminous? Is that behaviour configurable somewhere?
The MONs need to cache the OSDMaps while not all PGs are active+clean,
so their database grows.
You can compact RocksDB in the meantime, but that won't help forever.
Just make sure the MONs have enough space.
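
As a rough sketch of the manual compaction mentioned above: the CLI command
is "ceph tell mon.<id> compact". The wrapper below is only an illustration;
the function names and the mon ID "a" are assumptions, and it defaults to a
dry run so nothing is executed unless you ask for it.

```python
import subprocess


def compact_cmd(mon_id):
    """Build the CLI call that asks one monitor to compact its store."""
    return ["ceph", "tell", "mon.%s" % mon_id, "compact"]


def compact_mon(mon_id, dry_run=True):
    """Return the command; run it only when dry_run is False."""
    cmd = compact_cmd(mon_id)
    if not dry_run:
        # Requires a working ceph CLI and admin keyring on this host.
        subprocess.run(cmd, check=True)
    return cmd


# Hypothetical usage, "a" being an assumed mon ID:
# compact_mon("a", dry_run=False)
```

While PGs are still degraded the store will grow again after the
compaction, so this only buys time.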
Wido