On 2020-10-06 13:05, Igor Fedotov wrote:
On 10/6/2020 1:04 PM, Kristof Coucke wrote:
Another strange thing is going on:
No client software is using the system any longer, so we would expect
that all IOs are related to the recovery (fixing of the degraded PG).
However, the disks that are reaching high IO are not members of the
PGs that are being fixed.
So something is heavily using those disks, but I can't find the process
right away. I've read that old client processes can keep connecting to
an OSD to retrieve data for a specific PG even though that PG is no
longer available on that disk.
I bet it's rather PG removal happening in the background...
^^ This, and probably the accompanying RocksDB housekeeping that goes
with it, since removing PGs alone shouldn't be a big deal at all.
Especially with very small files (and a lot of them), you probably have
a lot of OMAP / META data (ceph osd df will tell you).
If that's indeed the case, then there is a (way) quicker option to get
out of this situation: offline compaction of the OSDs. Compaction runs
orders of magnitude faster offline than while the OSDs are still online.
To check whether this hypothesis is true: are the OSD servers where the
PGs were previously located (and not the new hosts) under CPU stress?
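
One way to answer the CPU question on a given host is to sample the CPU time of each ceph-osd process over a short interval. The following is only a minimal sketch, assuming a Linux host with /proc and pgrep available; the function names cpu_ticks and sample_cpu are hypothetical helpers, not part of Ceph.

```shell
# Total CPU time (utime + stime, in clock ticks) used so far by one PID.
cpu_ticks() {
    # Drop "pid (comm) " first so a command name containing spaces cannot
    # shift the fields; utime and stime are then the 12th and 13th fields.
    sed 's/^.*) //' "/proc/$1/stat" 2>/dev/null | awk '{ print $12 + $13 }'
}

# Sample every process named $1 (default: ceph-osd) over $2 whole seconds
# (default: 5) and print "<pid> <percent of one core>" for each of them.
sample_cpu() {
    name="${1:-ceph-osd}"
    interval="${2:-5}"
    hz=$(getconf CLK_TCK)
    samples=""
    for pid in $(pgrep -x "$name"); do
        samples="$samples $pid:$(cpu_ticks "$pid")"
    done
    sleep "$interval"
    for s in $samples; do
        pid="${s%%:*}"
        before="${s#*:}"
        after=$(cpu_ticks "$pid")
        # Skip processes that vanished during the sampling window.
        [ -n "$before" ] && [ -n "$after" ] || continue
        echo "$pid $(( (after - before) * 100 / (hz * interval) ))"
    done
}
```

Running sample_cpu on one of the old hosts and on one of the new hosts gives a rough comparison; values near or above 100 mean an OSD is keeping at least one core busy.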
Offline compaction per host:
systemctl stop ceph-osd.target
for osd in /var/lib/ceph/osd/*; do
    ceph-kvstore-tool bluestore-kv "$osd" compact &
done
wait    # block until every compaction on this host has finished
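
The per-host procedure above could also be wrapped in a small function that waits for all compactions and reports failures. This is a sketch under assumptions, not a tested procedure: OSD_ROOT, KVSTORE_TOOL and compact_all_osds are hypothetical names, and the restart step in the usage comment is an addition that the original commands do not include.

```shell
# Hypothetical knobs, overridable from the environment.
OSD_ROOT="${OSD_ROOT:-/var/lib/ceph/osd}"
KVSTORE_TOOL="${KVSTORE_TOOL:-ceph-kvstore-tool}"

# Compact every OSD directory under $OSD_ROOT in parallel.
# All OSDs on the host must already be stopped.
compact_all_osds() {
    pids=""
    for dir in "$OSD_ROOT"/*/; do
        [ -d "$dir" ] || continue
        "$KVSTORE_TOOL" bluestore-kv "${dir%/}" compact &
        pids="$pids $!"
    done
    # Wait for each background job and count the ones that failed.
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    [ "$failed" -eq 0 ] || { echo "$failed compaction(s) failed" >&2; return 1; }
}

# Usage on one OSD host (assumption: restart only after a clean run):
#   systemctl stop ceph-osd.target
#   compact_all_osds && systemctl start ceph-osd.target
```

Running the compactions in the background keeps the parallelism of the one-liner, while the explicit wait makes sure the OSDs are not started again before every compaction has finished.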
Gr. Stefan