This does not explain the incomplete and inactive PGs. Are you hitting
https://tracker.ceph.com/issues/46847 (see also the thread "Ceph does not recover from
OSD restart")? In that case, temporarily stopping and restarting all new OSDs might
help.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Zhenshi Zhou <deaderzzs(a)gmail.com>
Sent: 29 October 2020 08:30:25
To: Frank Schilder
Cc: ceph-users
Subject: Re: [ceph-users] monitor sst files continue growing
After I added OSDs to the cluster, recovery and backfill have not finished yet.
Zhenshi Zhou <deaderzzs@gmail.com> wrote on Thursday, 29 October 2020 at 3:29 PM:
I stopped the MGR myself because it was using too much memory.
As for the PG status, I added some OSDs to this cluster, and it
Frank Schilder <frans@dtu.dk> wrote on Thursday, 29 October 2020 at 3:27 PM:
Your problem is the overall cluster health. The MONs store cluster history information
that will be trimmed once the cluster reaches HEALTH_OK. Restarting the MONs only makes things
worse right now. The health status is a mess, no MGR, a bunch of PGs inactive, etc. This
is what you need to resolve. How did your cluster end up like this?
It looks like all OSDs are up and in. You need to find out
- why there are inactive PGs
- why there are incomplete PGs
This usually happens when OSDs go missing.
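To make the two questions above concrete: a PG state is a "+"-joined set of flags, and a PG counts as inactive when the "active" flag is missing. The following is only a minimal sketch under that assumption. The function name classify_pgs and the sample PG ids and states are made up for illustration and are not taken from this cluster; on a real cluster the states would come from commands such as "ceph pg dump_stuck inactive" or "ceph pg ls incomplete".

```python
def classify_pgs(pg_states):
    """Split PGs into inactive and incomplete ones.

    pg_states maps a PG id to its state string, e.g. "active+clean".
    A PG is treated as inactive when the "active" flag is absent.
    """
    inactive, incomplete = [], []
    for pgid, state in sorted(pg_states.items()):
        flags = set(state.split("+"))
        if "active" not in flags:
            inactive.append(pgid)
        if "incomplete" in flags:
            incomplete.append(pgid)
    return inactive, incomplete


# Hypothetical sample data, not from the cluster discussed in this thread.
sample = {
    "1.0": "active+clean",
    "1.1": "incomplete",
    "1.2": "down+peering",
    "1.3": "active+undersized+degraded+remapped+backfilling",
}
inactive, incomplete = classify_pgs(sample)
print("inactive:", inactive)
print("incomplete:", incomplete)
```

An incomplete PG also shows up in the inactive list, since it cannot serve I/O either; the second list just singles out the ones that need a closer look with "ceph pg <pgid> query".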
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Zhenshi Zhou <deaderzzs@gmail.com>
Sent: 29 October 2020 07:37:19
To: ceph-users
Subject: [ceph-users] monitor sst files continue growing
Hi all,
My cluster is in a bad state. The SST files in /var/lib/ceph/mon/xxx/store.db
keep growing, and the cluster reports that the mons are using a lot of disk space.
I set "mon compact on start = true" and restarted one of the monitors. It
started compacting, but that has been running for a long time and seems to have no end.
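For anyone who wants to track the growth over time, here is a rough sketch that sums the size of the *.sst files in a monitor store directory. It only looks at the filesystem and does not talk to Ceph; the function name sst_usage is made up for illustration, and the directory passed in is whatever your store.db path is.

```python
from pathlib import Path


def sst_usage(store_dir):
    """Return (file_count, total_bytes) for the *.sst files directly in store_dir."""
    sizes = [
        p.stat().st_size
        for p in Path(store_dir).glob("*.sst")
        if p.is_file()
    ]
    return len(sizes), sum(sizes)
```

Calling sst_usage() on the store.db directory at intervals (for example from cron) and logging the result shows whether the store is still growing or has started to shrink after trimming.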
[image.png]