ceph-mon store.db disk usage increase on OSD-Host fail

List overview All Threads
Download

newer

older

RGWReshardLock::lock failed to...

Single machine / multiple monitors

Hartwig Hauschild

5 Mar 2020 5 Mar '20

11:25 p.m.

Hi, I'm (still) testing upgrading from Luminous to Nautilus and ran into the following situation: The lab-setup I'm testing in has three OSD-Hosts. If one of those hosts dies the store.db in /var/lib/ceph/mon/ on all my Mon-Nodes starts to rapidly grow in size until either the OSD-host comes back up or disks are full. On another cluster that's still on Luminous I don't see any growth at all. Is that a difference in behaviour between Luminous and Nautilus or is that caused by the lab-setup only having three hosts and one lost host causing all PGs to be degraded at the same time? -- Cheers, Hardy

Show replies by date

Hartwig Hauschild

10 Mar 10 Mar

6:48 p.m.

Hi, I've done a bit more testing ... Am 05.03.2020 schrieb Hartwig Hauschild:

...

This also happens when I take one single OSD offline - /var/lib/ceph/mon/ grows from around 100MB to ~2GB in about 5 Minutes, then I aborted the test. Since we've had an OSD-Host fail over a weekend I know that growing won't stop until the disk is full and that usually happens in around 20 Minutes, then taking up 17GB of diskspace.

...

On another cluster that's still on Luminous I don't see any growth at all.

Retested that cluster as well, observing the size on disk of /var/lib/ceph/mon/ suggests, that there's writes and deletes / compactions going on as it kept floating within +- 5% of the original size.

...

Is that a difference in behaviour between Luminous and Nautilus or is that caused by the lab-setup only having three hosts and one lost host causing all PGs to be degraded at the same time?

I've read somewhere in the docs that I should provide ample space (tens of GB) for the store.db, found on the ML and Bugtracker that ~100GB might not be a bad idea and that large clusters may require space on order of magnitude greater. Is there some sort of formula I can use to approximate the space required? Also: is the db supposed to grow this fast in Nautilus when it did not do that in Luminous? Is that behaviour configurable somewhere? -- Cheers, Hardy

Wido den Hollander

6:59 p.m.

On 3/10/20 10:48 AM, Hartwig Hauschild wrote:

...

Hi, I've done a bit more testing ... Am 05.03.2020 schrieb Hartwig Hauschild:

On another cluster that's still on Luminous I don't see any growth at all.

Retested that cluster as well, observing the size on disk of /var/lib/ceph/mon/ suggests, that there's writes and deletes / compactions going on as it kept floating within +- 5% of the original size.

Is that a difference in behaviour between Luminous and Nautilus or is that caused by the lab-setup only having three hosts and one lost host causing all PGs to be degraded at the same time?

I don't know about a formula, but make sure you have enough space. MONs are dedicated nodes in most production environments, so I usually install a 400 ~ 1000GB SSD just to make sure they don't run out of space.

...

Also: is the db supposed to grow this fast in Nautilus when it did not do that in Luminous? Is that behaviour configurable somewhere?

The MONs need to cache the OSDMaps when not all PGs are active+clean thus their database grows. You can compact RocksDB in the meantime, but it won't last for ever. Just make sure the MONs have enough space. Wido >

Hartwig Hauschild

12 Mar 12 Mar

3:44 p.m.

Am 10.03.2020 schrieb Wido den Hollander:

...

On 3/10/20 10:48 AM, Hartwig Hauschild wrote: > Hi, > > I've done a bit more testing ... > > Am 05.03.2020 schrieb Hartwig Hauschild: >> Hi, >>

[ snipped ]

...

That seems fair.

...

Also: is the db supposed to grow this fast in Nautilus when it did not do that in Luminous? Is that behaviour configurable somewhere?

XuYun

4:59 p.m.

We got the same problem today while we were adding memory to OSD nodes, and it decreased monitor’s performance a lot. I noticed that the db kept increasing after an OSD is shutdown, so I guess that it is caused by the warning reports collected by mgr insights module. When I disabled the mgr insights module, the db size reduced to 1xx MB from 3x GB.

...

2020年3月12日下午2:44，Hartwig Hauschild <ml-ceph(a)hauschild.it> 写道： Am 10.03.2020 schrieb Wido den Hollander:

On 3/10/20 10:48 AM, Hartwig Hauschild wrote: > Hi, > > I've done a bit more testing ... > > Am 05.03.2020 schrieb Hartwig Hauschild: >> Hi, >>

[ snipped ]

That seems fair.

Also: is the db supposed to grow this fast in Nautilus when it did not do that in Luminous? Is that behaviour configurable somewhere?

Do you happen to know if that behaved differently in previous releases? I'm just asking because I have not found anything about this yet and may need to explain that it's different now. -- Cheers, Hardy _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

Hartwig Hauschild

5:43 p.m.

Am 12.03.2020 schrieb XuYun:

...

Works like a charm, disk-usage is where I expected it to be. Thanks. Should we ever meet in person: I owe you a drink.

...

2020年3月12日下午2:44，Hartwig Hauschild <ml-ceph(a)hauschild.it> 写道： Am 10.03.2020 schrieb Wido den Hollander:

On 3/10/20 10:48 AM, Hartwig Hauschild wrote: > Hi, > > I've done a bit more testing ... > > Am 05.03.2020 schrieb Hartwig Hauschild: >> Hi, >>

[ snipped ]

That seems fair.

Also: is the db supposed to grow this fast in Nautilus when it did not do that in Luminous? Is that behaviour configurable somewhere?

-- Cheers, Hardy

Wido den Hollander

5:36 p.m.

On 3/12/20 7:44 AM, Hartwig Hauschild wrote:

...

Am 10.03.2020 schrieb Wido den Hollander:

On 3/10/20 10:48 AM, Hartwig Hauschild wrote: > Hi, > > I've done a bit more testing ... > > Am 05.03.2020 schrieb Hartwig Hauschild: >> Hi, >>

[ snipped ]

That seems fair.

Also: is the db supposed to grow this fast in Nautilus when it did not do that in Luminous? Is that behaviour configurable somewhere?

Do you happen to know if that behaved differently in previous releases? I'm just asking because I have not found anything about this yet and may need to explain that it's different now.

It actually became better in recent releases. Nautilus didn't became worse. Hammer and Jewel were very bad with this and they grew to hundreds of GB on large(r) clusters. So no, I'm not aware of any changes. Wido

Hartwig Hauschild

5:46 p.m.

Am 12.03.2020 schrieb Wido den Hollander:

...

On 3/12/20 7:44 AM, Hartwig Hauschild wrote:

Am 10.03.2020 schrieb Wido den Hollander:

On 3/10/20 10:48 AM, Hartwig Hauschild wrote: > Hi, > > I've done a bit more testing ... > > Am 05.03.2020 schrieb Hartwig Hauschild: >> Hi, >>

[ snipped ]

That seems fair.

Also: is the db supposed to grow this fast in Nautilus when it did not do that in Luminous? Is that behaviour configurable somewhere?

Do you happen to know if that behaved differently in previous releases? I'm just asking because I have not found anything about this yet and may need to explain that it's different now.

Fair enough. Disabling the insights-module as XuYun pointed out brought the nautilus-cluster back to the same behaviour luminous is showing here, so I'll check whether we really need the module and how to work around the disk-usage. -- Cheers, Hardy

1508

days inactive

1515

days old

ceph-users@ceph.io

Manage subscription

7 comments

3 participants

tags (0)

participants (3)

Hartwig Hauschild
Wido den Hollander
XuYun