This turned out to be really easy to fix.
1. Find the correct number of extents that fit on the physical
devices. In my case, 13262.
2. Extend the DB LVs accordingly (in my case, it was all of them,
identically):
find /dev/ceph-*/osd-db-* | xargs -I{} lvextend -l 13626 {}
3. Restart the OSDs, and they seem to automagically expand the bluefs DB
part:
Dec 13 20:23:11 mimer-osd14 ceph-5406fed0-d52b-11ec-beff-7ed30a54847b-osd-547[3004851]: debug 2023-12-13T19:23:11.723+0000 7f5126e01200 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-547/block.db size 53 GiB
reporting 53 GiB instead of the 17 GiB I had before.
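For reference, the extent arithmetic can be sketched like this (the drive size below is a made-up example, not our actual hardware, so the resulting extent count will differ from ours):

```shell
# Sketch with assumed numbers: how many 4 MiB extents each DB LV can
# get when one NVMe PV is split evenly among 16 of the 48 spinning
# drives (48 HDDs shared across 3 NVMEs per host).
pe_size_mib=4                                  # LVM default physical extent size
pv_gib=3200                                    # hypothetical NVMe PV size in GiB
pv_extents=$(( pv_gib * 1024 / pe_size_mib ))  # extents available on the PV
dbs_per_nvme=$(( 48 / 3 ))                     # 16 DB LVs per NVMe
extents_per_db=$(( pv_extents / dbs_per_nvme ))
echo "$extents_per_db"                         # with these example numbers: 51200
```

The real extent count per PV can be read with "pvs -o pv_pe_count" instead of computing it from the nominal size.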
I would recommend that anyone with multiple drives used for block_db
double-check their LVM sizes.
Best regards, Mikael
On Wed, Dec 13, 2023 at 2:37 PM Mikael Öhman <micketeer(a)gmail.com> wrote:
Hi
Our hosts have 3 NVMEs and 48 spinning drives each.
We found that ceph orch made the default LVM size for the block_db 1/3 of
the total size of the NVMEs.
I suspect that ceph only considered one of the NVMEs when determining the
size, based on the closely related issue:
https://tracker.ceph.com/issues/54541
We started having some bluefs spillover events now, so I'm looking for a
way to fix this.
The best idea I have so far is to manually specify "block_db_size" in the
osd_spec and then recreate the entire block_db. Though I'm not sure whether
that means we'll just hit the same issue
https://tracker.ceph.com/issues/54541
again.
There would also be a lot of data to move in order to do this across a
total of 588 OSDs. Maybe there is a way to just remove and re-add a
(bigger) block_db?
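For the osd_spec route, something like the following is what I had in mind (the service id and the size are placeholders, not recommendations):

```shell
# Hypothetical cephadm OSD service spec: HDDs as data devices, NVMEs as
# DB devices, with an explicit block_db_size instead of the default.
cat <<'EOF' > osd_spec.yaml
service_type: osd
service_id: hdd_with_nvme_db
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
  block_db_size: 100G
EOF
# Would then be applied with:
# ceph orch apply osd -i osd_spec.yaml
```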
I would appreciate any suggestions or tips.
Best regards, Mikael