Thanks Dan, that looks like a really neat method & script for a few use cases.
We've actually used several of the scripts in that repo over the years, so many
thanks for sharing.
That method will definitely help in the scenario in which a set of unnecessary pg remaps
has been triggered and can be caught early and reverted. I'm still a little concerned
about the possibility of, for example, a brief network glitch occurring at night and then
waking up to a full, unbalanced cluster, especially with NVMe clusters that can rapidly
remap and rebalance (and for which we also have a greater impetus to squeeze out as much
available capacity as possible with upmap, due to cost per TB). It's just a risk I
hadn't previously considered and was wondering if others have either run into it or
felt any need to plan around it.
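One partial safeguard we're considering (just a sketch, nothing we've battle-tested) is
making the mons slower to mark osds out across short glitches, and setting noout ahead of
known risk windows. The interval below is arbitrary; the default
mon_osd_down_out_interval of 600s is what our ~25 min outage comfortably exceeded:

    # before planned maintenance or known risk windows
    ceph osd set noout

    # and/or in ceph.conf on the mons, raise the out-marking delay
    [mon]
    mon osd down out interval = 1800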
Cheers,
Dylan
From: Dan van der Ster <dan(a)vanderster.com>
Sent: Friday, 1 May 2020 5:53 PM
To: Dylan McCulloch <dmc(a)unimelb.edu.au>
Cc: ceph-users <ceph-users(a)ceph.io>
Subject: Re: [ceph-users] upmap balancer and consequences of osds briefly marked out
Hi,
You're correct that all the relevant upmap entries are removed when an
OSD is marked out.
You can try to use this script, which will recreate them and get the
cluster back to HEALTH_OK quickly:
https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-rema…
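Roughly speaking, the idea is to pin each remapped PG back to wherever its data
currently sits (the acting set), so the pending backfill becomes a no-op and the
balancer can then dismantle the exceptions gradually. A simplified sketch of that
idea, not the script itself (it ignores erasure-coded pools and error handling,
and the JSON layout of pg dump varies a bit across releases):

    #!/usr/bin/env python
    # Simplified sketch: pin remapped PGs back to their acting set.
    import json
    import subprocess

    out = subprocess.check_output(
        ['ceph', 'pg', 'dump', 'pgs_brief', '--format=json'])
    pgs = json.loads(out)
    # some releases wrap the list, e.g. under a 'pg_stats' key
    if isinstance(pgs, dict):
        pgs = pgs.get('pg_stats', [])

    for pg in pgs:
        up, acting = pg['up'], pg['acting']
        if up == acting:
            continue
        # naive positional pairing; fine as an illustration for
        # replicated pools, but don't run this as-is in production
        pairs = []
        for u, a in zip(up, acting):
            if u != a:
                pairs += [str(u), str(a)]
        if pairs:
            subprocess.check_call(
                ['ceph', 'osd', 'pg-upmap-items', pg['pgid']] + pairs)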
Cheers, Dan
On Fri, May 1, 2020 at 9:36 AM Dylan McCulloch <dmc(a)unimelb.edu.au> wrote:
Hi all,
We're using the upmap balancer, which has made a huge improvement in evenly distributing
data on our osds and has provided a substantial increase in usable capacity.
Currently on ceph version: 12.2.13 luminous
We ran into a firewall issue recently which led to a large number of osds being briefly
marked 'down' & 'out'. The osds came back 'up' & 'in' after about 25 mins and the
cluster was fine, but it had to perform a significant amount of backfilling/recovery
despite there being no end-user client I/O during that period.
Presumably the large number of remapped pgs and backfills was due to pg_upmap_items
being removed from the osdmap when the osds were marked out, with those pgs subsequently
redistributed according to the default CRUSH placement.
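(A quick way to confirm that presumption, assuming I'm reading the osdmap JSON
correctly, is to count the exception entries before and after:

    # count upmap exceptions currently in the osdmap
    import json, subprocess
    dump = json.loads(subprocess.check_output(
        ['ceph', 'osd', 'dump', '--format=json']))
    print(len(dump.get('pg_upmap_items', [])), 'pg_upmap_items entries')

In our case the count presumably dropped for every pg touching an out-marked osd.)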
As a result of the brief outage our cluster became significantly imbalanced again with
several osds very close to full.
Is there any reasonable mitigation for that scenario?
The auto-balancer will not perform optimizations while there are degraded pgs, so it
would only start reapplying pg upmap exceptions after initial recovery is complete (at
which point capacity may be dangerously reduced).
Similarly, as admins, we normally only apply changes when the cluster is in a healthy
state, but if the same issue were to occur again, would it be advisable to manually apply
balancer plans while initial recovery is still taking place?
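For reference, the manual route would presumably be the usual plan workflow ('myplan'
is just a placeholder name), though whether the module will even build a plan while
pgs are degraded is exactly the open question:

    ceph balancer eval               # current distribution score (lower is better)
    ceph balancer optimize myplan
    ceph balancer show myplan        # review the proposed upmaps before applying
    ceph balancer execute myplan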
I guess my concern from this experience is that making use of the capacity gained by
using the upmap balancer appears to carry some risk: it's possible for a brief outage
to remove those space efficiencies relatively quickly and potentially result in full
osds/a full cluster before the automatic balancer is able to resume and redistribute
pgs using upmap.
Curious whether others have any thoughts or experience regarding this.
Cheers,
Dylan