After some more digging: when restarted, all three MDSs enter state
up:rejoin but never progress past it.
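In case it's useful, this is how I'm watching the rank states (run from the
rook toolbox pod; myfs-a etc. are our MDS names, so adjust for your cluster):

    ceph fs status myfs
    ceph mds stat
    # per-daemon state over the admin socket, inside the MDS pod:
    ceph daemon mds.myfs-a status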
Also, mds.0 (not the one with the trimming problem) consistently logs
mds.0.cache failed to open ino 0x101 err -116/0
mds.0.cache failed to open ino 0x102 err -116/0
on every restart.
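Decoding that error: errno 116 is ESTALE ("Stale file handle") on Linux,
and if I'm reading the MDS source right, inode 0x100+rank is the per-rank
MDS directory, so 0x101/0x102 would be the mdsdirs of ranks 1 and 2. Quick
errno check (plain Python, nothing cluster-specific):

    python3 -c "import os; print(os.strerror(116))"
    # Stale file handle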
On Thu, Jul 8, 2021 at 6:29 AM Zachary Ulissi <zulissi(a)gmail.com> wrote:
We're running a rook-ceph cluster that has gotten stuck in "1 MDSs behind
on trimming".
* 1 filesystem, three active MDS servers, each with a standby
* Quite a few files (20M objects) and daily snapshots. This might be part
of the problem?
* Ceph pacific 16.2.4
* `ceph health detail` doesn't provide much help (see below)
* num_segments is very slowly increasing over time
* Restarting all of the MDSs returns to the same point.
* moderate CPU usage for each MDS server (~30% for the stuck one, ~80% of
a core for the others)
* logs for the stuck MDS look clean; it hits rejoin_joint_start and then
just shows the standard "updating MDS map to version XXX" messages
* `ceph daemon mds.x ops` shows no active ops on each of the MDS servers
* `mds_log_max_segments` is set to 128; setting it higher makes the warning
go away, but the filesystem remains degraded, and setting it back to 128
shows num_segments has not changed.
* I've tried playing around with other MDS settings based on various posts
on this list and elsewhere, to no avail
* `cephfs-journal-tool journal inspect` for each rank says journal
integrity is fine (exact invocations below).
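For reference, the exact invocations (myfs is our filesystem name; 256 is
just the value I happened to try):

    # journal integrity, one rank at a time
    cephfs-journal-tool --rank myfs:0 journal inspect
    cephfs-journal-tool --rank myfs:1 journal inspect
    cephfs-journal-tool --rank myfs:2 journal inspect

    # raising the segment limit silences the warning, but nothing trims
    ceph config set mds mds_log_max_segments 256
    ceph config set mds mds_log_max_segments 128

    # journal/segment counters on the stuck MDS, over the admin socket
    ceph daemon mds.myfs-d perf dump mds_log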
Something similar happened last week, and (probably by accident, while
removing/adding nodes?) I got the MDSs to start recovering and the
filesystem went back to healthy.
I'm at a bit of a loss for what else to try.
Thanks!
Zack
`ceph health detail`
HEALTH_WARN mons are allowing insecure global_id reclaim; 1 filesystem is
degraded; 1 MDSs behind on trimming; mon x is low on available space
[WRN] AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure
global_id reclaim
mon.x has auth_allow_insecure_global_id_reclaim set to true
mon.ad has auth_allow_insecure_global_id_reclaim set to true
mon.af has auth_allow_insecure_global_id_reclaim set to true
[WRN] FS_DEGRADED: 1 filesystem is degraded
fs myfs is degraded
[WRN] MDS_TRIM: 1 MDSs behind on trimming
mds.myfs-d(mds.2): Behind on trimming (340/128) max_segments: 128,
num_segments: 340
[WRN] MON_DISK_LOW: mon x is low on available space
mon.x has 22% avail
`ceph config get mds`
WHO     MASK  LEVEL     OPTION                              VALUE        RO
global        basic     log_file                                         *
global        basic     log_to_file                         false
mds           basic     mds_cache_memory_limit              17179869184
mds           advanced  mds_cache_trim_decay_rate           1.000000
mds           advanced  mds_cache_trim_threshold            1048576
mds           advanced  mds_log_max_segments                128
mds           advanced  mds_recall_max_caps                 5000
mds           advanced  mds_recall_max_decay_rate           2.500000
global        advanced  mon_allow_pool_delete               true
global        advanced  mon_allow_pool_size_one             true
global        advanced  mon_cluster_log_file
global        advanced  mon_pg_warn_min_per_osd             0
global        advanced  osd_pool_default_pg_autoscale_mode  on
global        advanced  osd_scrub_auto_repair               true
global        advanced  rbd_default_features                3
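For completeness, the non-default trim/recall values in that dump map to
plain `ceph config set` calls of this shape (values copied from the dump
above):

    ceph config set mds mds_log_max_segments 128
    ceph config set mds mds_cache_trim_threshold 1048576
    ceph config set mds mds_cache_trim_decay_rate 1.0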