On Tue, Feb 21, 2023 at 1:01 AM Xiubo Li <xiubli(a)redhat.com> wrote:
On 20/02/2023 22:28, Kuhring, Mathias wrote:
Hey Dan, hey Ilya
I know this issue is two years old already, but we are having similar
Do you know if the fixes ever got backported to RHEL kernels?
It was backported to RHEL 8 a long time ago, as of kernel-4.18.0-154.el8.
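For anyone who wants to check a client against that threshold, here is a minimal sketch that compares an el8 kernel release string with 4.18.0-154.el8. Only the threshold comes from this thread. The function names are made up for illustration, and the parsing assumes the usual `<version>-<build>.el8...` layout of RHEL kernel release strings. Anything that is not an el8 kernel returns None rather than a guess.

```python
import re

# Threshold quoted in this thread; everything else below is illustrative.
FIXED_IN = "4.18.0-154.el8"


def parse_el8_release(release):
    """Return (version_tuple, build_tuple) for an el8 kernel release, else None."""
    m = re.match(r"^(?:kernel-)?(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el8", release)
    if m is None:
        return None
    version = tuple(int(p) for p in m.group(1).split("."))
    build = tuple(int(p) for p in m.group(2).split("."))
    return version, build


def has_el8_backport(release):
    """True/False for el8 kernels, None when the release is not an el8 kernel."""
    parsed = parse_el8_release(release)
    if parsed is None:
        return None
    # Tuples compare element by element, so (372, 9, 1) sorts after (154,).
    return parsed >= parse_el8_release(FIXED_IN)


if __name__ == "__main__":
    import platform

    release = platform.release()
    print(release, "->", has_el8_backport(release))
```

Run on a client node, this prints the running kernel release and whether it is at or past the threshold. It says nothing about el7 or el9 kernels.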
Not looking for el7 but rather el8 fixes.
Wondering if the patches were backported and we shouldn't actually see
Or if you could maybe resolve them with a kernel upgrade.
Most active clients are currently on kernel versions such as:
While the cluster runs with kernel 3.10.0-1160.42.2.el7.x86_64 and
ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy
It doesn't seem to have been backported to el7 yet.
"Yet" might be misleading here -- I don't think there is/was ever
a plan to backport these fixes to RHEL 7.
> Not sure, if the cluster kernel is actually relevant here for OSD <>
> kernel client connection.
If you are seeing page allocation failures only on the kernel client
nodes, then it's not relevant.
Unless the stack trace is the same as in the original tracker or
Dan's paste (note ceph_osdmap_decode() -> osdmap_set_max_osd() ->
krealloc() sequence), you are hitting a different issue. Pasting the
entire splat(s) from the kernel log would be a good start.
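As a rough way to pre-check a saved kernel log for that sequence, here is a minimal sketch. It assumes the log has been dumped to text (for example from dmesg), that each splat starts at a line containing "page allocation failure", and that the call trace lists the most recent call first, so krealloc appears above ceph_osdmap_decode. The helper names are made up for illustration. This does not replace pasting the full splat.

```python
# Function names from the message above, innermost call first,
# since kernel call traces list the most recent call at the top.
EXPECTED_INNER_TO_OUTER = ["krealloc", "osdmap_set_max_osd", "ceph_osdmap_decode"]


def split_splats(log_text):
    """Split log text into chunks that each begin at a 'page allocation failure' line.

    A chunk runs until the next such line, so it may include unrelated
    log lines that follow the trace.
    """
    splats, current = [], None
    for line in log_text.splitlines():
        if "page allocation failure" in line:
            if current is not None:
                splats.append(current)
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        splats.append(current)
    return splats


def matches_known_trace(splat_lines):
    """True if the expected function names appear, in order, within the chunk."""
    idx = 0
    for line in splat_lines:
        if idx < len(EXPECTED_INNER_TO_OUTER) and EXPECTED_INNER_TO_OUTER[idx] in line:
            idx += 1
    return idx == len(EXPECTED_INNER_TO_OUTER)


def find_matching_splats(log_text):
    """Return only the chunks whose trace contains the expected call sequence."""
    return [s for s in split_splats(log_text) if matches_known_trace(s)]
```

An empty result from find_matching_splats() suggests the failures come from a different code path, in which case the full splat is still what is needed to say more.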