I'm not sure what you are looking for in the CPU graph. If it's load or a similar metric, you
will not see these lock-ups there. You need to search the syslog for soft lockup warnings. If these
warnings are there, they might give a clue as to which hardware component is causing the problem.
They look something like "BUG: soft lockup - CPU#X stuck for ...".
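
A minimal sketch of such a search, in case it helps. It assumes the warning text matches the form quoted above; the exact prefix and suffix vary between kernel versions, so the pattern only anchors on the common part. The function names and the default log path are my own choices for illustration, not anything Ceph or the kernel provides.

```python
import re

# Matches the common part of the kernel warning; the CPU number is captured.
# Assumption: the message wording follows the form quoted in this mail.
SOFT_LOCKUP_RE = re.compile(r"BUG: soft lockup - CPU#(\d+)")


def find_soft_lockups(lines):
    """Return a list of (cpu_number, line) for every soft lockup warning."""
    hits = []
    for line in lines:
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            hits.append((int(match.group(1)), line.rstrip("\n")))
    return hits


def scan_file(path="/var/log/syslog"):
    """Scan one log file. The default path is an assumption (Ubuntu-style)."""
    with open(path, errors="replace") as handle:
        return find_soft_lockups(handle)
```

Calling scan_file() on each OSD host and comparing the timestamps of the hits with the times the OSDs flapped should show whether the two are related.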
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: J-P Methot <jp.methot(a)planethoster.info>
Sent: 18 January 2023 17:38:28
To: Frank Schilder; ceph-users
Subject: Re: [ceph-users] Re: Flapping OSDs on pacific 16.2.10
There's nothing in the CPU graph that suggests soft lock-ups at these
times. However, thank you for pointing out that the disk I/O scheduler
could have an impact. Ubuntu seems to be on mq-deadline by default, so
we just switched to none, as I believe it fits our workload best. I
don't know if this will fix our issue, but I think it's worth testing.
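
For reference, a rough sketch of how the active scheduler can be checked per device. It assumes the usual Linux sysfs layout, /sys/block/<dev>/queue/scheduler, where the file lists the available schedulers and marks the active one with square brackets. The function names are hypothetical and only for illustration.

```python
import glob
import os


def parse_scheduler(text):
    """Split the contents of a queue/scheduler file into (active, available)."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # the bracketed entry is the active scheduler
            active = name
        available.append(name)
    return active, available


def list_schedulers(pattern="/sys/block/*/queue/scheduler"):
    """Map each block device name to its (active, available) schedulers."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        # .../<dev>/queue/scheduler -> <dev>
        device = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as handle:
            result[device] = parse_scheduler(handle.read())
    return result
```

Switching is done by writing the scheduler name (for example none) into the same file as root. That change does not survive a reboot unless it is also made persistent, for instance through a udev rule.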
On 1/18/23 11:17, Frank Schilder wrote:
Do you have CPU soft lock-ups around these times? We
had these timeouts due to using the cfq/bfq disk schedulers with SSDs. The osd_op_tp
thread timeout is typical when CPU lockups happen. Could be a sporadic problem with the
disk IO path.
--
Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.