On Mon, Aug 31, 2020 at 5:02 AM Stefan Kooman <stefan(a)bit.nl> wrote:
Hi list,
We had some stuck ops on our MDS. To figure out why, we looked
up the documentation. The first thing it mentions is the following:
ceph daemon mds.<name> dump cache /tmp/dump.txt
Our MDS had 170 GB in cache at that moment.
Turns out that is a sure way to get your active MDS replaced by a standby.
Is this supposed to work on an MDS with a large cache size? If not, then a
big warning against running this on MDSes with large caches
would be appropriate.
Gr. Stefan
P.S. I think our only option at that point was to get the active MDS
restarted, but still.
Yes, there should be a note in the docs about that. It looks like a new PR
has been opened in response to this issue:
https://github.com/ceph/ceph/pull/36823
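
Until the docs carry such a note, one way to avoid the problem is to check the cache size before dumping. The following is only a minimal sketch, not what the PR does. It assumes that `ceph daemon mds.<name> cache status` prints JSON with a `pool.bytes` field; the 1 GiB threshold and all function names are hypothetical and chosen for illustration only.

```python
import json
import subprocess

# Hypothetical limit: refuse to dump caches larger than 1 GiB.
# Pick a value that suits your own cluster; this is not a Ceph default.
DEFAULT_LIMIT_BYTES = 1 * 1024**3


def cache_bytes(status_json: str) -> int:
    """Extract the cache size in bytes from `cache status` output.

    Assumes output shaped like {"pool": {"items": N, "bytes": M}}.
    """
    status = json.loads(status_json)
    return int(status["pool"]["bytes"])


def safe_to_dump(status_json: str, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True only if the reported cache size is within the limit."""
    return cache_bytes(status_json) <= limit_bytes


def dump_cache_if_small(mds_name: str, path: str,
                        limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Dump the MDS cache to `path` unless the cache exceeds the limit.

    Must run on the host that has the MDS admin socket.
    Returns True if the dump was run, False if it was skipped.
    """
    daemon = f"mds.{mds_name}"
    status_json = subprocess.run(
        ["ceph", "daemon", daemon, "cache", "status"],
        check=True, capture_output=True, text=True,
    ).stdout
    if not safe_to_dump(status_json, limit_bytes):
        return False
    subprocess.run(["ceph", "daemon", daemon, "dump", "cache", path], check=True)
    return True
```

With a guard like this, a 170 GB cache such as the one described above would be skipped rather than dumped, leaving the active MDS alone.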
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D