I have to say I am reading quite a few interesting strategies in this
thread, and I'd like to take a moment to compare them:
1) Adding OSDs one by one
- least amount of PG rebalancing at any given time
- will potentially re-rebalance data that has just been distributed
when the next OSD is phased in
- limits the impact if you have a bug in the hdd/ssd series
The biggest problem with this approach is that you will re-rebalance
the same data over and over again, which slows down the process
significantly.
2) Reweighted phase-in
- start slowly by reweighting the new OSD to a small fraction of its
potential
- allows you to see how the new OSD performs
- needs manual interaction to grow the weight
- possibly delays the phase-in for longer than necessary
We use this approach when phasing in multiple, larger OSDs that are
from a newer / less well known series of disks.
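As a sketch of that workflow (the OSD id and the weights below are
made-up example values; the final weight depends on your disk size):

```shell
# Bring the new OSD in at a fraction of its final CRUSH weight
# (osd.42 and all weights here are hypothetical examples):
ceph osd crush reweight osd.42 0.5

# Watch recovery settle and check how the new OSD performs ...
ceph -s
ceph osd perf

# ... then grow the weight step by step up to the disk's full weight:
ceph osd crush reweight osd.42 1.5
ceph osd crush reweight osd.42 3.63
```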
3) noin / norebalance based phase-in
- interesting approach to delay rebalancing until the "proper/final"
new storage is in place
- unclear how much of a difference it makes if you insert the new set
of OSDs within a short timeframe (i.e. adding the 1st OSD at minute 0,
the 2nd at minute 1, etc.)
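For reference, the flag-based variant could look roughly like this (a
sketch only; the flags are standard Ceph cluster flags, but the OSD ids
are made-up and the exact sequence depends on your deployment tooling):

```shell
# Keep newly started OSDs from being marked "in", and pause data movement:
ceph osd set noin
ceph osd set norebalance

# ... create and start all of the new OSDs here ...

# Mark the new OSDs in (hypothetical ids), then allow a single rebalance:
ceph osd in osd.42 osd.43 osd.44
ceph osd unset noin
ceph osd unset norebalance
```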
4) All at once / randomly
- least amount of manual tuning
- in a way, something one "would expect" Ceph to do right (but in
practice it doesn't always)
- might (likely will) cause short-term re-adjustments
- might cause client I/O slowdown (see the next point)
5) Generally slowing down recovery
What we actually do at datacenterlight.ch is slow down phase-ins by
default via the following tunings:
# Restrain recovery operations so that normal cluster I/O is not affected
[osd]
osd max backfills = 1
osd recovery max active = 1
osd recovery op priority = 2
This works well in about 90% of the cases for us.
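If you don't want to edit ceph.conf and restart the OSDs, the same
throttles can also be applied at runtime via the cluster's config
database (available on Mimic and later); something like:

```shell
# Apply the recovery throttles cluster-wide to all OSDs:
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
ceph config set osd osd_recovery_op_priority 2
```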
Quite an interesting thread, thanks everyone for sharing!
Cheers,
Nico
Anthony D'Atri <anthony.datri(a)gmail.com> writes:

> Hi,
>
>> as far as I understand it, you get no real benefit with doing them
>> one by one, as each OSD add can cause a lot of data to be moved to a
>> different OSD, even though you just rebalanced it.
>
> Less than with older releases, but yeah.
>
> I’ve known someone who advised against doing them in parallel because
> one would — for a time — have PGs with multiple remaps in the acting
> set. The objection may have been paranoia, I’m not sure.
>
> One compromise is to upweight the new OSDs one node at a time, so the
> churn is limited to one failure domain at a time.
>
> — aad
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
--
Sustainable and modern Infrastructures by ungleich.ch