Could you please run: ceph daemon <osd-id> calc_objectstore_db_histogram
and share the output?
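For example, with osd.12 as a placeholder id, run on the node hosting
that OSD:

    ceph daemon osd.12 calc_objectstore_db_histogram

That should show how the keys break down by prefix/type, which would
tell us how much of the ~21 GB is really allocator ('b') data versus
onodes, omap, osdmaps, etc.
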
On 8/31/2020 4:33 PM, Wido den Hollander wrote:
>
>
> On 31/08/2020 12:31, Igor Fedotov wrote:
>> Hi Wido,
>>
>> The 'b' prefix relates to the free list manager, which keeps all the
>> free extents for the main device in a bitmap. Its records have a fixed
>> size, hence you can easily estimate the overall size of this type of
>> data.
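>>
>> A rough sketch of the math (assuming the default
>> bluestore_freelist_blocks_per_key = 128, i.e. a 16-byte bitmap value
>> per record, plus a small key): even ~750,000 such records add up to
>> only a few tens of MB, nowhere near 21 GB.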
>>
>
> Yes, so I figured.
>
>> But I doubt it takes that much. I presume the DB just lacks proper
>> compaction, which would have happened eventually, but it looks like
>> you interrupted the process by taking the OSD offline.
>>
>> Maybe try a manual compaction with ceph-kvstore-tool?
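>>
>> For instance, something along these lines (with the OSD stopped, and
>> the path adjusted to the OSD in question):
>>
>>     ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact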
>>
>
> This cluster is suffering from a lot of spillovers, so we tested by
> marking one OSD as out.
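>
> (The spillover shows up as the BLUEFS_SPILLOVER health warning; per OSD
> the numbers can be read from the BlueFS perf counters, e.g. roughly
> "ceph daemon osd.<id> perf dump bluefs" and comparing slow_used_bytes
> with db_used_bytes.)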
>
> After being marked out it still had this large DB. A compaction didn't
> help; the RocksDB database stayed just as large.
>
> New OSDs coming into the cluster aren't suffering from this; their
> RocksDB is only a couple of MB in size.
>
> Old OSDs installed with Luminous and now upgraded to Nautilus are
> suffering from this.
>
> It kind of seems like garbage data stays behind in RocksDB and is never
> cleaned up.
>
> Wido
>
>>
>> Thanks,
>>
>> Igor
>>
>>
>>
>> On 8/31/2020 10:57 AM, Wido den Hollander wrote:
>>> Hello,
>>>
>>> On a Nautilus 14.2.8 cluster I am seeing large RocksDB databases with
>>> many slow DB bytes in use.
>>>
>>> To investigate this further I marked one OSD as out and waited for
>>> all the backfilling to complete.
>>>
>>> Once the backfilling was completed I exported BlueFS and
>>> investigated the RocksDB using 'ceph-kvstore-tool'. This resulted in
>>> 22GB of data.
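>>> (Roughly: "ceph-bluestore-tool bluefs-export --path
>>> /var/lib/ceph/osd/ceph-<id> --out-dir /tmp/bluefs" with the OSD
>>> stopped, then pointing ceph-kvstore-tool at the exported db
>>> directory; the paths here are just examples.)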
>>>
>>> Listing all the keys in the RocksDB shows there are 747,000 keys in
>>> the DB. A small portion are osdmaps, but the vast majority are keys
>>> prefixed with 'b'.
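>>> (Counted with something along the lines of: "ceph-kvstore-tool
>>> rocksdb <exported-db> list | awk '{print $1}' | sort | uniq -c".)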
>>>
>>> I dumped the stats of the RocksDB and this shows me:
>>>
>>> L1: 1/0: 439.32 KB
>>> L2: 1/0: 2.65 MB
>>> L3: 5/0: 14.36 MB
>>> L4: 127/0: 7.22 GB
>>> L5: 217/0: 13.73 GB
>>> Sum: 351/0: 20.98 GB
>>>
>>> So there is almost 21GB of data in this RocksDB database. Why? Where
>>> is this coming from?
>>>
>>> Throughout this cluster OSDs are suffering from a large amount of
>>> slow DB bytes in use and I can't figure out why.
>>>
>>> Has anybody seen this, or does anyone have a clue what is going on?
>>>
>>> I have an external copy of this RocksDB database available for
>>> further investigation.
>>>
>>> Thank you,
>>>
>>> Wido