Hi Dan,
I have now added it to ceph.conf and restarted all MONs. The running config now shows:
# ceph config show mon.ceph-01 | grep -e NAME -e mon_osd_down_out_subtree_limit
NAME                            VALUE  SOURCE  OVERRIDES  IGNORES
mon_osd_down_out_subtree_limit  host   file    mon[host]
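For reference, this is the kind of entry I added to /etc/ceph/ceph.conf (shown here under [global]; the exact section choice is illustrative, a [mon] section should work as well):

```ini
# /etc/ceph/ceph.conf -- read at daemon start; changing this option
# requires a MON restart, it cannot be modified at runtime.
[global]
        mon_osd_down_out_subtree_limit = host
```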
The config DB entry moved from the IGNORES column to OVERRIDES, that is, it is still not used.
Looks like a priority bug to me. On startup, the config DB setting should have higher
priority than source "default" (and lower than "file", as is the case).
Should I open a tracker ticket?
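To make the expected behaviour concrete, here is a toy sketch of the precedence I describe above (my own illustration in Python, not Ceph's actual code): the mon config DB should beat the compiled-in default, and the local file should beat both.

```python
# Toy model of config-source precedence -- purely illustrative,
# NOT Ceph's implementation. Expected order, lowest to highest:
# compiled-in default < mon config DB < local config file.
PRECEDENCE = ["default", "mon", "file"]

def effective_value(sources):
    """Return (value, source) for the highest-priority source present.

    sources maps a source name ("default", "mon", "file") to its value.
    """
    winner = max(sources, key=PRECEDENCE.index)
    return sources[winner], winner

# Expected: with only the default and the mon config DB set,
# the mon DB entry should win after a restart.
print(effective_value({"default": "rack", "mon": "host"}))   # ('host', 'mon')

# With ceph.conf ("file") also carrying the value, "file" wins,
# which matches the OVERRIDES column in the output above.
print(effective_value({"default": "rack", "mon": "host", "file": "host"}))
```

The observed behaviour is as if the mon DB entry lost to "default" in the first case, which is what makes me suspect a priority bug.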
I tested a shutdown of all OSDs on a host and it works now as expected and desired.
Thanks!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Frank Schilder <frans(a)dtu.dk>
Sent: 15 July 2020 10:15:12
To: Dan van der Ster
Cc: Anthony D'Atri; ceph-users
Subject: [ceph-users] Re: mon_osd_down_out_subtree_limit not working?
Setting it in ceph.conf is exactly what I wanted to avoid :). I will give it a try though.
I guess this should become an issue in the tracker?
Is it, by any chance, required to restart *all* daemons, or should restarting the MONs be enough?
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Dan van der Ster <dan(a)vanderster.com>
Sent: 15 July 2020 10:10:44
To: Frank Schilder
Cc: Anthony D'Atri; ceph-users
Subject: Re: [ceph-users] Re: mon_osd_down_out_subtree_limit not working?
Hrmm that is strange.
We set it via /etc/ceph/ceph.conf, not the config framework. Maybe try that?
-- dan
On Wed, Jul 15, 2020 at 9:59 AM Frank Schilder <frans(a)dtu.dk> wrote:
Hi Dan,
it still does not work. When I execute
# ceph config set global mon_osd_down_out_subtree_limit host
2020-07-15 09:17:11.890 7f36cf7fe700 -1 set_mon_vals failed to set
mon_osd_down_out_subtree_limit = host: Configuration option
'mon_osd_down_out_subtree_limit' may not be modified at runtime
I now get an error saying that the option cannot be modified at runtime. However, restarting
all monitors still does not apply the value:
# ceph config show mon.ceph-01 | grep -e NAME -e mon_osd_down_out_subtree_limit | sed -e "s/ */\t/g"
NAME                            VALUE  SOURCE   OVERRIDES  IGNORES
mon_osd_down_out_subtree_limit  rack   default             mon
so the setting in the config database is still ignored. Any ideas? I cannot shut down
the entire cluster for something that simple.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Dan van der Ster <dan(a)vanderster.com>
Sent: 14 July 2020 17:38:27
To: Frank Schilder
Cc: Anthony D'Atri; ceph-users
Subject: Re: [ceph-users] Re: mon_osd_down_out_subtree_limit not working?
Seems that
ceph config set mon mon_osd_down_out_subtree_limit
isn't working. (I've seen this sort of config namespace issue in the past).
I'd try `ceph config set global mon_osd_down_out_subtree_limit host`
then restart the mon and check `ceph daemon mon.ceph-01 config get
mon_osd_down_out_subtree_limit` again.
-- dan
On Tue, Jul 14, 2020 at 1:35 PM Frank Schilder <frans(a)dtu.dk> wrote:
>
> Hi Dan,
>
> thanks for your reply. There is still a problem.
>
> Firstly, I did indeed forget to restart the mon, even though I looked at the help for
> mon_osd_down_out_subtree_limit and it says that it requires a restart. Stupid me. Well, now
> I did a restart and it still doesn't work. Here is the situation:
>
> # ceph config dump | grep subtree
> mon  advanced  mon_osd_down_out_subtree_limit  host        *
> mon  advanced  mon_osd_reporter_subtree_level  datacenter
>
> # ceph config get mon.ceph-01 mon_osd_down_out_subtree_limit
> host
>
> # ceph daemon mon.ceph-01 config get mon_osd_down_out_subtree_limit
> {
> "mon_osd_down_out_subtree_limit": "rack"
> }
>
> # ceph config show mon.ceph-01 | grep subtree
> mon_osd_down_out_subtree_limit  rack        default           mon
> mon_osd_reporter_subtree_level  datacenter  mon
>
> The default overrides the mon config database setting. What is going on here? I
> restarted all 3 monitors.
>
> Best regards and thanks for your help,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <dan(a)vanderster.com>
> Sent: 14 July 2020 10:53:13
> To: Frank Schilder
> Cc: Anthony D'Atri; ceph-users
> Subject: Re: [ceph-users] Re: mon_osd_down_out_subtree_limit not working?
>
> mon_osd_down_out_subtree_limit has been working well here. Did you
> restart the mon's after making that config change?
> Can you do this just to make sure it took effect?
>
> ceph daemon mon.`hostname -s` config get mon_osd_down_out_subtree_limit
>
> -- dan
>
> On Tue, Jul 14, 2020 at 8:57 AM Frank Schilder <frans(a)dtu.dk> wrote:
> >
> > Yes. After the time-out of 600 secs the OSDs got marked down, all PGs got
> > remapped and recovery/rebalancing started as usual. In the past, I did service on servers
> > with the flag noout set and would expect that mon_osd_down_out_subtree_limit=host has the
> > same effect when shutting down an entire host. Unfortunately, in my case these two
> > settings behave differently.
> >
> > If I understand the documentation correctly, the OSDs should not get marked out
> > automatically.
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Anthony D'Atri <anthony.datri(a)gmail.com>
> > Sent: 14 July 2020 04:32:05
> > To: Frank Schilder
> > Subject: Re: [ceph-users] mon_osd_down_out_subtree_limit not working?
> >
> > Did it start rebalancing?
> >
> > > On Jul 13, 2020, at 4:29 AM, Frank Schilder <frans(a)dtu.dk> wrote:
> > >
> > > if I shut down all OSDs on this host, these OSDs should not be marked out
> > > automatically after mon_osd_down_out_interval(=600) seconds. I did a test today and,
> > > unfortunately, the OSDs do get marked as out. Ceph status was showing 1 host down as
> > > expected.
> > _______________________________________________
> > ceph-users mailing list -- ceph-users(a)ceph.io
> > To unsubscribe send an email to ceph-users-leave(a)ceph.io