Thank you, Paul.
> The thresholds were recently reduced by a factor of 10. I guess you
> have a lot of (open) files? Maybe use more active MDS servers?
We'll consider adding more MDS servers, although the workload hasn't
been an issue yet.
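If we do go that route, I assume it would just be a matter of raising max_mds for our filesystem (named "cephfs") and making sure enough standby daemons are available, roughly like this (the value 2 is only an example):

ceph01:~ # ceph fs set cephfs max_mds 2
ceph01:~ # ceph fs status cephfs

The second command is just to verify that a second rank actually became active.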
> Or increase the thresholds, I wouldn't worry at all about 200k omap
> keys if you are running on reasonable hardware.
> The usual argument for a low number of omap keys is recovery time, but
> if you are running a metadata-heavy workload on something that has
> problems recovering 200k keys in less than a few seconds, then you are
> doing something wrong anyways.
We haven't had any issues with MDS failovers and/or recovery yet, so I
guess higher thresholds would be fine.
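If we do raise the threshold, I assume something along these lines would do it cluster-wide (500000 is just a made-up example value, and osd.9 is only queried to check that the running daemon picked it up):

ceph01:~ # ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 500000
ceph02:~ # ceph daemon osd.9 config get osd_deep_scrub_large_omap_object_key_threshold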
To get rid of the warning (for a week) it was sufficient to issue a
deep-scrub on the affected PG while the listomapkeys output was lower
than 200k. Maybe we were just "lucky" until now because the
deep-scrubs are issued outside of business hours, so the number of
open files should be lower.
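For completeness, this is roughly what that looked like (pool, object and PG are the ones from my original mail quoted below):

ceph01:~ # rados -p cephfs-metadata listomapkeys mds0_openfiles.0 | wc -l
ceph01:~ # ceph pg deep-scrub 36.6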
Anyway, thank you for your input. It seems as if this is not a problem
at the moment.
Regards,
Eugen
Quoting Paul Emmerich <paul.emmerich(a)croit.io>:
> The thresholds were recently reduced by a factor of 10. I guess you
> have a lot of (open) files? Maybe use more active MDS servers?
>
> Or increase the thresholds, I wouldn't worry at all about 200k omap
> keys if you are running on reasonable hardware.
> The usual argument for a low number of omap keys is recovery time, but
> if you are running a metadata-heavy workload on something that has
> problems recovering 200k keys in less than a few seconds, then you are
> doing something wrong anyways.
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
>
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Tue, Oct 1, 2019 at 9:10 AM Eugen Block <eblock(a)nde.ag> wrote:
>>
>> Hi all,
>>
>> we have a new issue in our Nautilus cluster.
>> The large omap warning seems to be more common for RGW usage, but we
>> currently only use CephFS and RBD. I found one thread [1] regarding the
>> metadata pool, but it doesn't really help in our case.
>>
>> The deep-scrub of PG 36.6 brought up this message (deep-scrub finished
>> with "ok"):
>>
>> 2019-09-30 20:18:22.548401 osd.9 (osd.9) 275 : cluster [WRN] Large
>> omap object found. Object: 36:654134d2:::mds0_openfiles.0:head Key
>> count: 238621 Size (bytes): 9994510
>>
>>
>> I checked xattr (none) and omapheader:
>>
>> ceph01:~ # rados -p cephfs-metadata listxattr mds0_openfiles.0
>> ceph01:~ # rados -p cephfs-metadata getomapheader mds0_openfiles.0
>> header (42 bytes) :
>> 00000000  13 00 00 00 63 65 70 68  20 66 73 20 76 6f 6c 75  |....ceph fs volu|
>> 00000010  6d 65 20 76 30 31 31 01  01 0d 00 00 00 74 c3 12  |me v011......t..|
>> 00000020  00 00 00 00 00 01 00 00  00 00                    |..........|
>> 0000002a
>>
>> ceph01:~ # ceph fs volume ls
>> [
>> {
>> "name": "cephfs"
>> }
>> ]
>>
>>
>> The respective OSD has default thresholds regarding large_omap:
>>
>> ceph02:~ # ceph daemon osd.9 config show | grep large_omap
>> "osd_deep_scrub_large_omap_object_key_threshold":
"200000",
>> "osd_deep_scrub_large_omap_object_value_sum_threshold":
"1073741824",
>>
>>
>> Can anyone point me to a solution for this?
>>
>> Best regards,
>> Eugen
>>
>>
>> [1]
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-March/033813.html
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io