For the record, here is a link to the corresponding ticket:
https://tracker.ceph.com/issues/44924
On 4/2/2020 6:28 PM, Igor Fedotov wrote:
> So this OSD has 32M of shared blobs and fsck loads them all into
> memory while processing. Hence the RAM consumption.
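>
> (Back-of-envelope, assuming very roughly ~1 KiB of in-memory state
> per shared blob record: 32M records come to ~32 GiB, in line with
> the 35-52 GB peaks reported further down the thread.)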
>
>
> I'm afraid there is no simple way to fix that, will create a ticket
> though.
>
>
> And a side question:
>
> 1) Do you use erasure coding and/or compression for the rbd pool?
>
> These stats look suspicious:
>
> POOL  ID  STORED   (DATA)   (OMAP)   OBJECTS  USED     (DATA)   (OMAP)   %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY   USED COMPR  UNDER COMPR
> rbd   1   245 TiB  245 TiB  9.0 MiB  50.26M   151 TiB  151 TiB  9.0 MiB  90.03  12 TiB     N/A            N/A          50.26M  35 TiB      144 TiB
>
> Stored - 245 TiB, Used - 151 TiB
>
> Can't imagine any explanation other than applied compression.
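>
> To check where the compression comes from, e.g. (a quick sketch,
> adjust the pool name to yours):
>
> # pool type (replicated vs. erasure) and any per-pool compression options
> ceph osd pool ls detail | grep "'rbd'"
> # per-pool compression mode (errors out if it was never set on the pool)
> ceph osd pool get rbd compression_mode
> # OSD-wide bluestore compression mode from the config database
> ceph config get osd bluestore_compression_mode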
>
>
> Thanks,
>
> Igor
>
>
>
> On 4/2/2020 5:59 PM, Jack wrote:
>> Here it is
>>
>> On 4/2/20 3:48 PM, Igor Fedotov wrote:
>>> And may I have the output for:
>>>
>>> ceph daemon osd.N calc_objectstore_db_histogram
>>>
>>> This will collect some stats on record types in the OSD's DB.
>>>
>>>
>>> On 4/2/2020 4:13 PM, Jack wrote:
>>>> (fsck / quick-fix, same story)
>>>>
>>>> On 4/2/20 3:12 PM, Jack wrote:
>>>>> Hi,
>>>>>
>>>>> A simple fsck eats the same amount of memory
>>>>>
>>>>> Cluster usage: rbd with a bit of rgw
>>>>>
>>>>> Here is the ceph df detail
>>>>> All OSDs are single rusty devices
>>>>>
>>>>> On 4/2/20 2:19 PM, Igor Fedotov wrote:
>>>>>> Hi Jack,
>>>>>>
>>>>>> could you please try the following: stop one of the already
>>>>>> converted OSDs and do a quick-fix/fsck/repair against it using
>>>>>> ceph-bluestore-tool:
>>>>>>
>>>>>> ceph-bluestore-tool --path <path to osd> --command quick-fix|fsck|repair
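>>>>>>
>>>>>> For example, for OSD 12 on the default data path (id and path are
>>>>>> placeholders, adjust to your deployment):
>>>>>>
>>>>>> systemctl stop ceph-osd@12
>>>>>> /usr/bin/time -v ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-12 --command fsck
>>>>>>
>>>>>> GNU time's -v output includes "Maximum resident set size", i.e.
>>>>>> the peak memory usage of the run.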
>>>>>>
>>>>>> Does it cause similar memory usage?
>>>>>>
>>>>>> You can stop experimenting if quick-fix reproduces the issue.
>>>>>>
>>>>>>
>>>>>> Also, could you please describe your cluster and its usage a bit:
>>>>>> rgw/rbd/cephfs? If possible, please share 'ceph df detail'
>>>>>> output. Do you have a standalone DB volume on SSD/NVMe?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Igor
>>>>>>
>>>>>>
>>>>>> On 4/1/2020 6:28 PM, Jack wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> As the upgrade documentation says:
>>>>>>>> Note that the first time each OSD starts, it will do a format
>>>>>>>> conversion to improve the accounting for “omap” data. This may
>>>>>>>> take a few minutes to as much as a few hours (for an HDD with
>>>>>>>> lots of omap data). You can disable this automatic conversion
>>>>>>>> with:
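>>>>>>>
>>>>>>> For reference, the command the release notes give there is
>>>>>>> "ceph config set osd bluestore_fsck_quick_fix_on_mount false".
>>>>>>>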
>>>>>>> What the documentation does not say is that this process takes
>>>>>>> a lot of memory.
>>>>>>>
>>>>>>> I am upgrading a rusty cluster from Nautilus; you can see the
>>>>>>> RAM consumption in the attachment.
>>>>>>>
>>>>>>> First, we have a 3 TB OSD conversion: it took ~15 min and 19 GB
>>>>>>> of memory.
>>>>>>>
>>>>>>> Then, we have a larger 6 TB OSD conversion: it took more than
>>>>>>> 2 hours and 35 GB of memory.
>>>>>>>
>>>>>>> Finally, we have the largest 10 TB OSD: only 1h15, but 52 GB of
>>>>>>> memory.
>>>>>>>
>>>>>>>