On Sun, Apr 25, 2021 at 11:42 AM Ilya Dryomov <idryomov(a)gmail.com> wrote:
On Sun, Apr 25, 2021 at 12:37 AM Markus Kienast <mark(a)trickkiste.at> wrote:
I am seeing these messages when booting from RBD and booting hangs there.
libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping
However, Ceph health is OK, so I have no idea what is going on. When I
reboot my 3-node cluster, it works again for about two weeks.
How can I find out more about this issue and dig deeper? There has
also been at least one earlier report of this issue on this mailing
list, "[ceph-users] Strange Data Issue - Unexpected client hang on
OSD I/O Error", but no solution was presented.
That report is from 2018, so I have no idea whether this is still an
issue for Dyweni, the original reporter. Dyweni, if you read this, I
would be happy to hear how you solved the problem.
Hi Markus,
What versions of ceph and the kernel are in use?
Are you also seeing I/O errors and "missing primary copy of ..., will
try copies on ..." messages in the OSD logs (in this case osd2)?
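
To collect that information, something like the following could be used to scan saved log text for both messages. This is only a minimal sketch: the function names are made up for illustration, the kernel pattern is modelled on the single message quoted in this thread, and the OSD check only looks for the "missing primary copy of" substring, so the real log formats may differ.

```python
import re

# Pattern modelled on the kernel client message quoted in this thread
# (assumed to appear on a single line in the kernel log).
LIBCEPH_RE = re.compile(
    r"libceph: get_reply osd(?P<osd>\d+) tid (?P<tid>\d+) "
    r"data (?P<data>\d+) > preallocated (?P<prealloc>\d+), skipping"
)

# Substring of the OSD log message asked about above.
OSD_MARKER = "missing primary copy of"


def scan_kernel_log(text):
    """Return one dict of integer fields per oversized-reply message."""
    return [
        {key: int(value) for key, value in match.groupdict().items()}
        for match in LIBCEPH_RE.finditer(text)
    ]


def scan_osd_log(text):
    """Return the OSD log lines that mention a missing primary copy."""
    return [line for line in text.splitlines() if OSD_MARKER in line]
```

Feeding the output of dmesg to scan_kernel_log() and the contents of the osd2 log to scan_osd_log() would show whether the two messages show up together.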
For the sake of the archives, the issue reported in "[ceph-users]
Strange Data Issue - Unexpected client hang on OSD I/O Error" has
been fixed in 12.2.12, 13.2.5 and 14.2.0:
https://tracker.ceph.com/issues/37680
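
As a rough way to check a given release against those three fix versions, a sketch like the one below could be used. The helper names are hypothetical, version strings are assumed to be plain "X.Y.Z", and two things are assumptions rather than statements from the tracker: that releases older than 12.x never received the fix, and that every release series after 14.x includes it.

```python
# First fixed release per major series, as listed above.
FIXED_IN = {12: (12, 2, 12), 13: (13, 2, 5), 14: (14, 2, 0)}


def parse_version(version):
    """Turn a plain "X.Y.Z" string into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split(".")[:3])


def has_fix(version):
    """Return True if the given Ceph release should contain the fix."""
    parsed = parse_version(version)
    major = parsed[0]
    if major < 12:
        return False  # assumption: never backported to older series
    if major > 14:
        return True   # assumption: later series inherit the 14.2.0 fix
    return parsed >= FIXED_IN[major]
```

For example, has_fix("12.2.11") is False while has_fix("13.2.5") is True.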
I also tried to reply to that thread but it didn't go through because
the old ceph-users(a)lists.ceph.com mailing list is decommissioned.
Thanks,
Ilya