[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

9 Apr 2021

On Fri, Apr 9, 2021 at 7:24 PM Robert LeBlanc <robert(a)leblancnet.us> wrote:
...

> On Fri, Apr 9, 2021 at 11:05 AM Dan van der Ster <dan(a)vanderster.com> wrote:
> >
> > Hi Robert,
> >
> > Have you checked a log with debug_mon=20 yet to try to see what it's doing?
>
> I've posted the logs with debug_mon=20 for a period during high CPU
> here https://owncloud.leblancnet.us/owncloud/index.php/s/OtHsBAYN9r5eSbU
>
> You can look near the end of the log for the verbose logging. I'm not
> sure what to look for in there, nothing sticks out to me. I did
> disable cephx in the config file to see if that would help, but we
> still have the 100% CPU.

Thanks. I didn't see anything ultra obvious to me.

But I did notice the nearfull warnings so I wonder if this cluster is
churning through osdmaps? Did you see a large increase in inbound or
outbound network traffic on this mon following the upgrade?
Totally speculating here, but maybe there is an issue where you have
some old clients, which can't decode an incremental osdmap from a
nautilus mon, so the single mon is busy serving up these maps to the
clients.
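
One rough way to check whether osdmaps are churning is to sample the
osdmap epoch a few times and see how fast it advances. The sketch below
is only an illustration: it assumes the `ceph` CLI is on the PATH with a
working keyring, and the helper names (osdmap_epoch, epochs_per_minute,
sample_churn) are made up for this example.

```python
import json
import subprocess
import time


def osdmap_epoch():
    """Return the current osdmap epoch from `ceph osd dump`.

    Assumes the `ceph` CLI is installed and can reach the cluster.
    """
    out = subprocess.check_output(["ceph", "osd", "dump", "--format", "json"])
    return json.loads(out)["epoch"]


def epochs_per_minute(samples):
    """Average osdmap epochs per minute over (unix_time, epoch) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, e0), (t1, e1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (e1 - e0) * 60.0 / (t1 - t0)


def sample_churn(count=6, interval=10.0, read_epoch=osdmap_epoch):
    """Take `count` epoch readings `interval` seconds apart; return the rate."""
    samples = []
    for i in range(count):
        samples.append((time.time(), read_epoch()))
        if i < count - 1:
            time.sleep(interval)
    return epochs_per_minute(samples)
```

Calling sample_churn() on a host with admin access gives an epochs/min
figure. A rate near zero would suggest osdmap churn is not what is
keeping the mon busy; what counts as "high" depends on the cluster.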

Does the mon load decrease if you stop the osdmap churn, e.g. by
setting norebalance (if rebalancing is indeed ongoing)?

Could you also share a log with debug_ms = 1 covering about a minute
while the mon CPU is busy?

-- dan

> Thank you,
> Robert LeBlanc
