On Mon, May 3, 2021 at 9:20 AM Magnus Harlander <magnus(a)harlan.de> wrote:
On 03.05.21 at 00:44, Ilya Dryomov wrote:
On Sun, May 2, 2021 at 11:15 PM Magnus Harlander <magnus(a)harlan.de> wrote:
Hi,
I know there is a thread about problems with mounting cephfs with 5.11 kernels.
...
Hi Magnus,
What is the output of "ceph config dump"?
Instead of providing those lines, can you run
"ceph osd getmap 64281 -o osdmap.64281" and attach the osdmap.64281 file?
Thanks,
Ilya
Hi Ilya,
[root@s1 ~]# ceph config dump
WHO     MASK  LEVEL     OPTION                                 VALUE     RO
global        basic     device_failure_prediction_mode         local
global        advanced  ms_bind_ipv4                           false
mon           advanced  auth_allow_insecure_global_id_reclaim  false
mon           advanced  mon_lease                              8.000000
mgr           advanced  mgr/devicehealth/enable_monitoring     true
getmap output is attached,
I see the problem, but I don't understand the root cause yet. It is
related to the two missing OSDs:
May 02 22:54:05 islay kernel: libceph: no match of type 1 in addrvec
May 02 22:54:05 islay kernel: libceph: corrupt full osdmap (-2) epoch 64281 off 3154 (00000000a90fe1d7 of 000000000083f4bd-00000000c03bdc9b)
max_osd 12
osd.0 up in ... [v2:192.168.200.141:6804/3027,v1:192.168.200.141:6805/3027] ...
exists,up 631bc170-45fd-4948-9a5e-4c278569c0bc
osd.1 up in ... [v2:192.168.200.140:6811/3066,v1:192.168.200.140:6813/3066] ...
exists,up 660a762c-001d-4160-a9ee-d0acd078e776
osd.2 up in ... [v2:192.168.200.141:6815/3008,v1:192.168.200.141:6816/3008] ...
exists,up e4d94d3a-ec58-46a1-b61c-c47dd39012ed
osd.3 up in ... [v2:192.168.200.140:6800/3067,v1:192.168.200.140:6801/3067] ...
exists,up 26d25060-fd99-4d15-a1b2-ebb77646671e
osd.4 up in ... [v2:192.168.200.140:6804/3049,v1:192.168.200.140:6806/3049] ...
exists,up 238f197d-ecbc-4588-8a99-6a63c9bb1a17
osd.5 up in ... [v2:192.168.200.140:6816/3073,v1:192.168.200.140:6817/3073] ...
exists,up a9dcb26f-0f1c-4067-a26b-a29939285e0b
osd.6 up in ... [v2:192.168.200.141:6808/3020,v1:192.168.200.141:6809/3020] ...
exists,up f399b47d-063f-4b2f-bd93-289377dc9945
osd.7 up in ... [v2:192.168.200.141:6800/3023,v1:192.168.200.141:6801/3023] ...
exists,up 3557ceca-7bd8-401e-abd3-59bee168e8f6
osd.8 up in ... [v2:192.168.200.141:6812/3017,v1:192.168.200.141:6813/3017] ...
exists,up 7f9cad3f-163d-4bb7-85b2-fffd46982fff
osd.9 up in ... [v2:192.168.200.140:6805/3053,v1:192.168.200.140:6807/3053] ...
exists,up c543b12a-f9bf-4b83-af16-f6b8a3926e69
The kernel client is failing to parse addrvec entries for non-existent
osd10 and osd11. It is probably being too stringent, but before fixing
it I'd like to understand what happened to those OSDs. It looks like
they were removed but not completely.
What led to their removal? What commands were used?
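
For anyone who wants to check their own cluster for the same gap (max_osd
larger than the set of OSDs actually listed), here is a minimal Python
sketch. It only scans text in the format quoted above, i.e. a "max_osd N"
line followed by "osd.N ..." lines. The function name missing_osd_ids and
the sample string are made up for illustration; this is not how the kernel
client decodes the map.

```python
import re


def missing_osd_ids(dump_text):
    """Return OSD ids in [0, max_osd) that have no 'osd.N' line in the text.

    Hypothetical helper: it assumes the text contains a 'max_osd N' line
    and one 'osd.N ...' line per listed OSD, as in the excerpt above.
    """
    match = re.search(r"^max_osd\s+(\d+)", dump_text, re.MULTILINE)
    if match is None:
        raise ValueError("no max_osd line found")
    max_osd = int(match.group(1))

    # Collect every id that appears at the start of a line as 'osd.N '.
    present = {
        int(n) for n in re.findall(r"^osd\.(\d+)\s", dump_text, re.MULTILINE)
    }
    return [i for i in range(max_osd) if i not in present]


# Sample shaped like the excerpt above: max_osd 12, but only osd.0..osd.9.
sample = "max_osd 12\n" + "".join(
    f"osd.{i} up in ... exists,up\n" for i in range(10)
)

if __name__ == "__main__":
    print(missing_osd_ids(sample))
```

On the sample, the ids reported are 10 and 11, which are the two slots that
have no osd line even though max_osd is 12.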
Thanks,
Ilya