I've figured it out, but the result worries me.
The solution is "mon_osd_min_down_reporters = 1"
With a "two node" cluster, "replicated 2", and "chooseleaf host",
only one host is left to report a failure when the other goes down,
so the reporter count has to be set to 1. But if that node
malfunctions, this could become a serious problem.
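For reference, the change would look roughly like this in ceph.conf.
This is only a sketch: placing it in the [global] section is an
assumption (it mirrors where mon_osd_down_out_interval was set in the
quoted message below), not something I have verified as the best place.

```ini
[global]
# Assumption: set in [global], like mon_osd_down_out_interval in the quoted message.
# Number of reporters the MONs require before marking an OSD down.
mon_osd_min_down_reporters = 1
```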
Is there a better solution?
On Thu, 20 May 2021 at 22:04, morphin <morphinwithyou(a)gmail.com>
wrote:
>
> Hello
>
> I have a weird problem on a 3-node cluster running Nautilus 14.2.9.
> When I simulate a power failure, the OSDs are not marked DOWN and the
> MDS stops responding.
> If I manually mark the OSDs down, the MDS becomes active again.
>
> BTW: only 2 nodes have OSDs. The third node is only for the MON.
>
> I've set mon_osd_down_out_interval = 0.3 in the global section of
> ceph.conf and restarted all MONs, but when I check it with "ceph daemon
> mon.ID config show" I see mon_osd_down_out_interval: "600". I don't
> understand why it's still "600", and honestly I don't even know whether
> it has any effect on my problem.
>
> Where should I check?