There's nothing in the CPU graph that suggests soft lock-ups at these
times. However, thank you for pointing out that the disk I/O scheduler
could have an impact. Ubuntu seems to default to mq-deadline, so we
just switched to none, which I believe fits our workload best. I don't
know if this will fix our issue, but I think it's worth testing.
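
For anyone who wants to check which scheduler their disks are actually
using, here is a minimal sketch. It assumes the standard Linux sysfs
layout, where /sys/block/<device>/queue/scheduler lists the available
schedulers with the active one in square brackets. The function names
are just mine for illustration, not part of any Ceph or Ubuntu tool.

```python
import re
from pathlib import Path


def active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if absent."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def list_schedulers(sys_block=Path("/sys/block")):
    """Map each block device name to its active I/O scheduler.

    Assumes the sysfs layout <sys_block>/<device>/queue/scheduler.
    Devices whose scheduler file cannot be read are skipped.
    """
    result = {}
    for path in sorted(Path(sys_block).glob("*/queue/scheduler")):
        try:
            # path is <device>/queue/scheduler, so the device is two levels up
            result[path.parent.parent.name] = active_scheduler(path.read_text())
        except OSError:
            continue
    return result


if __name__ == "__main__":
    for device, scheduler in list_schedulers().items():
        print(f"{device}: {scheduler}")
```

Writing a scheduler name such as none into that same file as root
changes it at runtime, but the change does not survive a reboot unless
it is also made persistent, for example with a udev rule.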

On 1/18/23 11:17, Frank Schilder wrote:
> Do you have CPU soft lock-ups around these times? We had these
> timeouts due to using the cfq/bfq disk schedulers with SSDs. The
> osd_op_tp thread timeout is typical when CPU lockups happen. Could be
> a sporadic problem with the disk IO path.

--
Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.