Hi Thomas,
How does your crush map/tree look?
If your crush failure domain is by host, then your 96x 8T disks will only be as useful as
your 1.6T disks, because the smallest failure domain is your limiting factor.
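You can sanity-check that quickly (both commands only read cluster state, nothing is changed):

    # dump the CRUSH hierarchy (root/chassis/host/osd buckets and weights)
    ceph osd tree
    # show which bucket type your crush rule(s) actually choose replicas across
    ceph osd crush rule dump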
So you can either redistribute your disks so that each host has 16x 8T + 32x 1.6T, or you
could group your 1.6T nodes into buckets (chassis, perhaps), move the 8T nodes into their
own chassis, and then set your failure domain to chassis; that would likely lead to a much
more even distribution.
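Very roughly, the CLI for the chassis variant would look something like the below -- untested,
and every bucket/host name is a placeholder, so substitute your real hostnames and root from
'ceph osd tree':

    # create chassis buckets and hang them off the root
    ceph osd crush add-bucket chassis-a chassis
    ceph osd crush add-bucket chassis-b chassis
    ceph osd crush add-bucket chassis-c chassis
    ceph osd crush add-bucket chassis-d chassis
    ceph osd crush move chassis-a root=default
    ceph osd crush move chassis-b root=default
    ceph osd crush move chassis-c root=default
    ceph osd crush move chassis-d root=default

    # pair up the 1.6T hosts, give each 8T host its own chassis
    ceph osd crush move node-a chassis=chassis-a
    ceph osd crush move node-b chassis=chassis-a
    ceph osd crush move node-c chassis=chassis-b
    ceph osd crush move node-d chassis=chassis-b
    ceph osd crush move node-e chassis=chassis-c
    ceph osd crush move node-f chassis=chassis-d

    # new replicated rule that picks replicas across chassis, then point the pool(s) at it
    ceph osd crush rule create-replicated rep-by-chassis default chassis
    ceph osd pool set <pool> crush_rule rep-by-chassis

Keep in mind that every one of those moves, and the rule switch at the end, will trigger data
movement.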
I imagine right now your 1.6T disks are nearfull, and your 8T disks are anything
but.
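'ceph osd df' should confirm that at a glance -- the %USE and VAR columns will show the 1.6T
OSDs well above average and the 8T OSDs well below:

    ceph osd df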
Be careful with something like this, however, because you will probably run into some IOPS
discrepancies due to the difference in spindles per TB across the 'chassis'.
Hope that helps.
Reed
On Sep 23, 2019, at 4:07 AM, Thomas
<74cmonty(a)gmail.com> wrote:
Hi,
I'm facing several issues with my ceph cluster (2x MDS, 6x OSD nodes).
Here I would like to focus on the issue with pgs backfill_toofull.
I assume this is related to the fact that the data distribution on my
OSDs is not balanced.
This is the current ceph status:
root@ld3955:~# ceph -s
  cluster:
    id:     6b1b5117-6e08-4843-93d6-2da3cf8a6bae
    health: HEALTH_ERR
            1 MDSs report slow metadata IOs
            78 nearfull osd(s)
            1 pool(s) nearfull
            Reduced data availability: 2 pgs inactive, 2 pgs peering
            Degraded data redundancy: 304136/153251211 objects degraded (0.198%), 57 pgs degraded, 57 pgs undersized
            Degraded data redundancy (low space): 265 pgs backfill_toofull
            3 pools have too many placement groups
            74 slow requests are blocked > 32 sec
            80 stuck requests are blocked > 4096 sec

  services:
    mon: 3 daemons, quorum ld5505,ld5506,ld5507 (age 98m)
    mgr: ld5505(active, since 3d), standbys: ld5506, ld5507
    mds: pve_cephfs:1 {0=ld3976=up:active} 1 up:standby
    osd: 368 osds: 368 up, 367 in; 302 remapped pgs

  data:
    pools:   5 pools, 8868 pgs
    objects: 51.08M objects, 195 TiB
    usage:   590 TiB used, 563 TiB / 1.1 PiB avail
    pgs:     0.023% pgs not active
             304136/153251211 objects degraded (0.198%)
             1672190/153251211 objects misplaced (1.091%)
             8564 active+clean
             196  active+remapped+backfill_toofull
             57   active+undersized+degraded+remapped+backfill_toofull
             35   active+remapped+backfill_wait
             12   active+remapped+backfill_wait+backfill_toofull
             2    active+remapped+backfilling
             2    peering

  io:
    recovery: 18 MiB/s, 4 objects/s
Currently I'm using 6 OSD nodes.
Node A
48x 1.6TB HDD
Node B
48x 1.6TB HDD
Node C
48x 1.6TB HDD
Node D
48x 1.6TB HDD
Node E
48x 7.2TB HDD
Node F
48x 7.2TB HDD
Question:
Is it advisable to distribute the drives equally over all nodes?
If yes, how should this be executed w/o ceph disruption?
Regards
Thomas
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io