Bluestore fsck/repair can detect and fix leaks at the Bluestore level, but I
doubt that is where your issue lies.
To be honest, I don't understand from the overview why you think
there are any leaks at all.
Not sure whether this is relevant, but in my experience space "leaks"
are sometimes caused by the 64K allocation unit combined with tons of small
files or a large number of small EC overwrites.
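
As a rough illustration of why the allocation unit matters, here is a minimal sketch of the rounding arithmetic. It assumes a 64 KiB allocation unit and a made-up 20 KiB object in an EC 2+1 pool; the helper names are hypothetical, and it ignores EC stripe padding, blob sharing and compression, so treat the numbers as an order-of-magnitude estimate only.

```python
def allocated_bytes(stored: int, alloc_unit: int = 64 * 1024) -> int:
    """Round a stored size up to a whole number of allocation units."""
    if stored <= 0:
        return 0
    return -(-stored // alloc_unit) * alloc_unit  # ceiling division


def ec_shard_size(object_size: int, k: int) -> int:
    """Per-shard payload for a k+m EC pool (stripe padding ignored)."""
    return -(-object_size // k)


# Hypothetical example: a 20 KiB object in an EC 2+1 pool.
shard = ec_shard_size(20 * 1024, k=2)      # payload per shard
per_shard_alloc = allocated_bytes(shard)   # rounded up to the allocation unit
total_alloc = per_shard_alloc * 3          # 2 data shards + 1 parity shard

print(f"shard payload: {shard} B, allocated per shard: {per_shard_alloc} B, "
      f"raw total: {total_alloc} B")
```

Under these assumptions a 20 KiB object occupies 192 KiB of raw space, which is the kind of gap that looks like a leak at the cluster level.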
To verify whether this applies, you might want to inspect the bluestore
performance counters (bluestore_stored vs. bluestore_allocated) to
estimate your losses due to the large allocation unit.
A significant difference on multiple OSDs might indicate that the overhead is
caused by the coarse allocation granularity. Compression might make this
analysis less straightforward, though.
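
A minimal sketch of that comparison, assuming the OSD's perf counters have been dumped as JSON (e.g. via the admin socket's "perf dump") and that the two counters sit under a "bluestore" section. The function name and the sample numbers are made up for illustration; with compression enabled the ratio is harder to interpret.

```python
import json


def allocation_overhead(perf_dump: dict) -> dict:
    """Compare bluestore_allocated with bluestore_stored for one OSD.

    Assumes both counters live under perf_dump["bluestore"].
    """
    bs = perf_dump["bluestore"]
    stored = bs["bluestore_stored"]
    allocated = bs["bluestore_allocated"]
    return {
        "stored": stored,
        "allocated": allocated,
        "overhead_bytes": allocated - stored,
        # A ratio well above 1.0 hints at allocation-granularity overhead.
        "ratio": allocated / stored if stored else float("nan"),
    }


# Made-up sample values, not taken from a real cluster.
sample = json.loads(
    '{"bluestore": {"bluestore_allocated": 3221225472,'
    ' "bluestore_stored": 2147483648}}'
)
result = allocation_overhead(sample)
print(result)
```

Running this against dumps from several OSDs and comparing the ratios would show whether the overhead is systematic across the cluster.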
Thanks,
Igor
On 3/26/2020 1:19 AM, vitalif(a)yourcmc.ru wrote:
> I have a question regarding this problem - is it possible to rebuild
> bluestore allocation metadata? I could try it to test if it's an
> allocator problem...
>
>> Hi.
>>
>> I'm experiencing some kind of a space leak in Bluestore. I use EC,
>> compression and snapshots. First I thought that the leak was caused by
>> "virtual clones" (issue #38184). However, then I got rid of most of
>> the snapshots, but continued to experience the problem.
>>
>> I suspected something when I added a new disk to the cluster and free
>> space in the cluster didn't increase (!).
>>
>> So to track down the issue I moved one PG (34.1a) using upmaps from
>> osd11,6,0 to osd6,0,7 and then back to osd11,6,0.
>>
>> It ate +59 GB after the first move and +51 GB after the second. As I
>> understand it, this proves that it's not #38184: devirtualization of
>> virtual clones couldn't eat additional space after the SECOND rebalance
>> of the same PG.
>>
>> The PG has ~39000 objects, it is EC 2+1 and compression is
>> enabled. The compression ratio is about 2.7 in my setup, so the PG should
>> use ~90 GB of raw space.
>>
>> Before and after moving the PG I stopped osd0, mounted it with
>> ceph-objectstore-tool with debug bluestore = 20/20 and opened the
>> 34.1a***/all directory. It seems to dump all object extents into the
>> log in that case. So now I have two logs with all allocated extents
>> for osd0 (I hope all extents are there). I parsed both logs and added
>> all compressed blob sizes together ("get_ref Blob ... 0x20000 -> 0x...
>> compressed"). But they add up to ~39 GB before the first rebalance
>> (34.1as2), ~22 GB after it (34.1as1) and ~41 GB again after the second
>> move (34.1as2), which doesn't indicate a leak.
>>
>> But the raw space usage still exceeds the initial usage by a lot. So
>> it's clear that there's a leak somewhere.
>>
>> What additional details can I provide for you to identify the bug?
>>
>> I posted the same message in the issue tracker:
>> https://tracker.ceph.com/issues/44731