On 23/09/2020 04:09, Alexander E. Patrakov wrote:
Sometimes this doesn't help. For data recovery purposes, the most
helpful step if you get the "bluefs enospc" error is to add a separate
db device, like this:
systemctl disable --now ceph-osd@${OSDID}
truncate -s 32G /junk/osd.${OSDID}-recover/block.db
sgdisk -n 0:0:0 /junk/osd.${OSDID}-recover/block.db
ceph-bluestore-tool \
    bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-${OSDID} \
    --dev-target /junk/osd.${OSDID}-recover/block.db \
    --bluestore-block-db-size=31G --bluefs-log-compact-min-size=31G
Of course you can use a real block device instead of just a file.
After that, export all PGs using ceph-objectstore-tool and re-import
them into a fresh OSD, then destroy or purge the full one.
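In case it helps, that export/import step could look roughly like this
(just a sketch, not a tested recipe: ${NEWID} and /junk/export stand for
whatever replacement OSD id and staging directory you use, and both the
full and the fresh OSD must be stopped):

# list the PGs held by the full OSD
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSDID} \
    --op list-pgs > /junk/export/pgs.txt
# export every PG to its own file
while read PG; do
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSDID} \
        --pgid ${PG} --op export --file /junk/export/${PG}.export
done < /junk/export/pgs.txt
# import them into the freshly created, not yet started OSD
while read PG; do
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${NEWID} \
        --op import --file /junk/export/${PG}.export
done < /junk/export/pgs.txt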
Here is why these options are needed:
--bluestore-block-db-size=31G: ceph-bluestore-tool refuses to do
anything unless this option is set to some value.
--bluefs-log-compact-min-size=31G: makes absolutely sure that log
compaction doesn't happen, because it would hit "bluefs enospc" again.
Oh, you went this way... I solved my 'pocket ceph' needs by exporting
disk images (from files) via iscsi and mounting them back to localhost.
That gives me perfect 'scsi' devices which behave exactly as in
production. I have a little playbook (iscsi_loopback) to set it up on
random scrap (including VMs) for development purposes. After iscsi is
loopback-mounted, all other code works exactly the same as it would in
production.
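For comparison, a minimal manual sketch of that loopback idea, assuming
LIO/targetcli and open-iscsi (the IQN, file names and sizes below are
made up, and the actual iscsi_loopback playbook may well do it
differently):

# backing file and fileio backstore
truncate -s 15G /srv/iscsi/osd0.img
targetcli /backstores/fileio create name=osd0 file_or_dev=/srv/iscsi/osd0.img
# iSCSI target with one LUN, open to the local initiator (demo mode)
targetcli /iscsi create iqn.2020-09.local.test:osd0
targetcli /iscsi/iqn.2020-09.local.test:osd0/tpg1/luns \
    create /backstores/fileio/osd0
targetcli /iscsi/iqn.2020-09.local.test:osd0/tpg1 \
    set attribute generate_node_acls=1 demo_mode_write_protect=0 \
    authentication=0
# log in from the same host; the disk shows up as a normal /dev/sdX
iscsiadm -m discovery -t sendtargets -p 127.0.0.1
iscsiadm -m node -T iqn.2020-09.local.test:osd0 -p 127.0.0.1 --login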
I've hit this issue a few times on small 10 GB OSDs, so I moved to 15 GB
and it became less frequent. I have never had this in real-hardware
tests with realistic disk sizes (>>100 GB per OSD).