Thanks, Dan.
On the first MON, the command doesn’t even return, but I was able to get a dump from the
one I restarted most recently. The oldest ops look like this:
{
"description": "log(1000 entries from seq 17876238 at
2021-02-25T15:13:20.306487+0100)",
"initiated_at": "2021-02-25T20:40:34.698932+0100",
"age": 183.762551121,
"duration": 183.762599201,
"type_data": {
"events": [
{
"time": "2021-02-25T20:40:34.698932+0100",
"event": "initiated"
},
{
"time": "2021-02-25T20:40:34.698636+0100",
"event": "throttled"
},
{
"time": "2021-02-25T20:40:34.698932+0100",
"event": "header_read"
},
{
"time": "2021-02-25T20:40:34.701407+0100",
"event": "all_read"
},
{
"time": "2021-02-25T20:40:34.701455+0100",
"event": "dispatched"
},
{
"time": "2021-02-25T20:40:34.701458+0100",
"event": "mon:_ms_dispatch"
},
{
"time": "2021-02-25T20:40:34.701459+0100",
"event": "mon:dispatch_op"
},
{
"time": "2021-02-25T20:40:34.701459+0100",
"event": "psvc:dispatch"
},
{
"time": "2021-02-25T20:40:34.701490+0100",
"event": "logm:wait_for_readable"
},
{
"time": "2021-02-25T20:40:34.701491+0100",
"event": "logm:wait_for_readable/paxos"
},
{
"time": "2021-02-25T20:40:34.701496+0100",
"event": "paxos:wait_for_readable"
},
{
"time": "2021-02-25T20:40:34.989198+0100",
"event": "callback finished"
},
{
"time": "2021-02-25T20:40:34.989199+0100",
"event": "psvc:dispatch"
},
{
"time": "2021-02-25T20:40:34.989208+0100",
"event": "logm:preprocess_query"
},
{
"time": "2021-02-25T20:40:34.989208+0100",
"event": "logm:preprocess_log"
},
{
"time": "2021-02-25T20:40:34.989278+0100",
"event": "forward_request_leader"
},
{
"time": "2021-02-25T20:40:34.989344+0100",
"event": "forwarded"
},
{
"time": "2021-02-25T20:41:58.658022+0100",
"event": "resend forwarded message to leader"
},
{
"time": "2021-02-25T20:42:27.735449+0100",
"event": "resend forwarded message to leader"
}
],
"info": {
"seq": 41550,
"src_is_mon": false,
"source": "osd.104 v2:XXX:6864/16579",
"forwarded_to_leader": true
}
Any idea what that might be about? Almost looks like this:
https://tracker.ceph.com/issues/24180
I set debug_mon to 0, but I keep getting a lot of log spill in journals. It’s about 1-2
messages per second, mostly RocksDB stuff, but nothing that actually looks serious or even
log-worthy. I noticed that before that despite logging being set to warning level, the
cluster log keeps being written to the MON log. But it shouldn’t cause such massive
stability issues, should it? The date on the log op is also weird. 15:13+0100 was hours
ago.
Here’s my log config:
global advanced clog_to_syslog_level
warning
global basic err_to_syslog
true
global basic log_to_file
false
global basic log_to_stderr
false
global basic log_to_syslog
true
global advanced mon_cluster_log_file_level
error
global advanced mon_cluster_log_to_file
false
global advanced mon_cluster_log_to_stderr
false
global advanced mon_cluster_log_to_syslog
false
global advanced mon_cluster_log_to_syslog_level
warning
Ceph version is 15.2.8.
Janek
On 25. Feb 2021, at 20:33, Dan van der Ster
<dan(a)vanderster.com> wrote:
ceph daemon mon.`hostname -s` ops
That should show you the accumulating ops.
.. dan
On Thu, Feb 25, 2021, 8:23 PM Janek Bevendorff <janek.bevendorff(a)uni-weimar.de
<mailto:janek.bevendorff@uni-weimar.de>> wrote:
Hi,
All of a sudden, we are experiencing very concerning MON behaviour. We have five MONs and
all of them have thousands up to tens of thousands of slow ops, the oldest one blocking
basically indefinitely (at least the timer keeps creeping up). Additionally, the MON
stores keep inflating heavily. Under normal circumstances we have about 450-550MB there.
Right now its 27GB and growing (rapidly).
I tried restarting all MONs, I disabled auto-scaling (just in case) and checked the
system load and hardware. I also restarted the MGR and MDS daemons, but to no avail.
Is there any way I can debug this properly? I can’t seem to find how I can actually view
what ops are causing this and what client (if any) may be responsible for it.
Thanks
Janek
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io <mailto:ceph-users@ceph.io>
To unsubscribe send an email to ceph-users-leave(a)ceph.io
<mailto:ceph-users-leave@ceph.io>