The new drives have a larger capacity than the first drives I added to the
cluster, but they're all SAS HDDs.
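As the tree below confirms, every OSD sits under the same root (default) with the same device class (hdd). For checking that without dumping the full tree, the CRUSH map can also be queried directly (stock Ceph CLI; output omitted here):

cephuser@ceph01:~$ ceph osd crush class ls             # lists the device classes defined in the CRUSH map
cephuser@ceph01:~$ ceph osd crush tree --show-shadow   # shows the per-class shadow hierarchy under each root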
cephuser@ceph01:~$ ceph osd df tree
ID  CLASS  WEIGHT     REWEGHT   SIZE     RAW USE  DATA     OMAP     META     AVAIL     %USE   VAR   PGS  STATUS  TYPE NAME
-1         122.79410         -  123 TiB   42 TiB   41 TiB  217 GiB  466 GiB    81 TiB  33.86  1.00    -          root default
-3          40.93137         -   41 TiB   14 TiB   14 TiB   72 GiB  154 GiB    27 TiB  33.86  1.00    -          host ceph01
 0  hdd      2.72849   0.95001  2.7 TiB  2.2 TiB  2.1 TiB  7.4 GiB   24 GiB   569 GiB  79.64  2.35  218      up  osd.0
 1  hdd      2.72849   1.00000  2.7 TiB  2.1 TiB  2.0 TiB  7.6 GiB   23 GiB   694 GiB  75.16  2.22  196      up  osd.1
 2  hdd      2.72849   1.00000  2.7 TiB  1.6 TiB  1.6 TiB  8.8 GiB   18 GiB   1.1 TiB  60.39  1.78  199      up  osd.2
 3  hdd      2.72849   0.95001  2.7 TiB  2.2 TiB  2.1 TiB  8.3 GiB   23 GiB   583 GiB  79.13  2.34  202      up  osd.3
 4  hdd      2.72849   1.00000  2.7 TiB  2.1 TiB  2.0 TiB  8.4 GiB   22 GiB   692 GiB  75.22  2.22  214      up  osd.4
 5  hdd      2.72849   1.00000  2.7 TiB  1.7 TiB  1.7 TiB  8.5 GiB   19 GiB   1.0 TiB  62.39  1.84  195      up  osd.5
 6  hdd      2.72849   1.00000  2.7 TiB  2.0 TiB  2.0 TiB  8.5 GiB   21 GiB   709 GiB  74.62  2.20  217      up  osd.6
22  hdd      5.45799   1.00000  5.5 TiB  4.2 GiB  165 MiB  2.0 GiB  2.1 GiB   5.5 TiB   0.08  0.00   23      up  osd.22
23  hdd      5.45799   1.00000  5.5 TiB  2.7 GiB  161 MiB  1.5 GiB  1.0 GiB   5.5 TiB   0.05  0.00   23      up  osd.23
27  hdd      5.45799   1.00000  5.5 TiB   23 GiB   17 GiB  5.0 GiB  1.3 GiB   5.4 TiB   0.42  0.01   63      up  osd.27
28  hdd      5.45799   1.00000  5.5 TiB   10 GiB  2.8 GiB  6.0 GiB  1.3 GiB   5.4 TiB   0.18  0.01   82      up  osd.28
-5          40.93137         -   41 TiB   14 TiB   14 TiB   71 GiB  157 GiB    27 TiB  33.89  1.00    -          host ceph02
 7  hdd      2.72849   1.00000  2.7 TiB  2.1 TiB  2.1 TiB  9.6 GiB   23 GiB   652 GiB  76.66  2.26  221      up  osd.7
 8  hdd      2.72849   0.95001  2.7 TiB  2.4 TiB  2.4 TiB  7.6 GiB   26 GiB   308 GiB  88.98  2.63  220      up  osd.8
 9  hdd      2.72849   1.00000  2.7 TiB  2.1 TiB  2.0 TiB  8.5 GiB   23 GiB   679 GiB  75.71  2.24  214      up  osd.9
10  hdd      2.72849   1.00000  2.7 TiB  2.0 TiB  1.9 TiB  7.5 GiB   21 GiB   777 GiB  72.18  2.13  208      up  osd.10
11  hdd      2.72849   1.00000  2.7 TiB  2.0 TiB  2.0 TiB  6.1 GiB   22 GiB   752 GiB  73.10  2.16  191      up  osd.11
12  hdd      2.72849   1.00000  2.7 TiB  1.5 TiB  1.5 TiB  9.1 GiB   18 GiB   1.2 TiB  56.45  1.67  188      up  osd.12
13  hdd      2.72849   1.00000  2.7 TiB  1.7 TiB  1.7 TiB  7.9 GiB   19 GiB  1024 GiB  63.37  1.87  193      up  osd.13
25  hdd      5.45799   1.00000  5.5 TiB  4.9 GiB  165 MiB  3.7 GiB  1.0 GiB   5.5 TiB   0.09  0.00   42      up  osd.25
26  hdd      5.45799   1.00000  5.5 TiB  2.9 GiB  157 MiB  1.6 GiB  1.2 GiB   5.5 TiB   0.05  0.00   26      up  osd.26
29  hdd      5.45799   1.00000  5.5 TiB   24 GiB   18 GiB  4.2 GiB  1.2 GiB   5.4 TiB   0.43  0.01   58      up  osd.29
30  hdd      5.45799   1.00000  5.5 TiB   21 GiB   14 GiB  5.6 GiB  1.3 GiB   5.4 TiB   0.38  0.01   71      up  osd.30
-7          40.93137         -   41 TiB   14 TiB   14 TiB   73 GiB  156 GiB    27 TiB  33.83  1.00    -          host ceph03
14  hdd      2.72849   1.00000  2.7 TiB  2.1 TiB  2.1 TiB  6.9 GiB   23 GiB   627 GiB  77.56  2.29  205      up  osd.14
15  hdd      2.72849   1.00000  2.7 TiB  2.0 TiB  1.9 TiB  6.8 GiB   21 GiB   793 GiB  71.62  2.12  189      up  osd.15
16  hdd      2.72849   1.00000  2.7 TiB  1.9 TiB  1.9 TiB  8.7 GiB   21 GiB   813 GiB  70.89  2.09  209      up  osd.16
17  hdd      2.72849   1.00000  2.7 TiB  2.1 TiB  2.1 TiB  8.6 GiB   23 GiB   609 GiB  78.19  2.31  216      up  osd.17
18  hdd      2.72849   1.00000  2.7 TiB  1.7 TiB  1.7 TiB  9.1 GiB   19 GiB   1.0 TiB  62.40  1.84  209      up  osd.18
19  hdd      2.72849   0.95001  2.7 TiB  2.2 TiB  2.2 TiB  9.1 GiB   24 GiB   541 GiB  80.65  2.38  210      up  osd.19
20  hdd      2.72849   1.00000  2.7 TiB  1.8 TiB  1.8 TiB  8.4 GiB   19 GiB   969 GiB  65.32  1.93  200      up  osd.20
21  hdd      5.45799   1.00000  5.5 TiB  3.7 GiB  161 MiB  2.2 GiB  1.3 GiB   5.5 TiB   0.07  0.00   28      up  osd.21
24  hdd      5.45799   1.00000  5.5 TiB  4.9 GiB  177 MiB  3.6 GiB  1.1 GiB   5.5 TiB   0.09  0.00   37      up  osd.24
31  hdd      5.45799   1.00000  5.5 TiB  8.9 GiB  2.7 GiB  5.0 GiB  1.2 GiB   5.4 TiB   0.16  0.00   59      up  osd.31
32  hdd      5.45799   1.00000  5.5 TiB  6.0 GiB  182 MiB  4.7 GiB  1.1 GiB   5.5 TiB   0.11  0.00   70      up  osd.32
                       TOTAL    123 TiB   42 TiB   41 TiB  217 GiB  466 GiB    81 TiB  33.86
MIN/MAX VAR: 0.00/2.63  STDDEV: 37.27
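A note on reading the table: VAR is each OSD's utilization relative to the cluster mean, so the original 2.7 TiB drives at VAR ~2.2 hold more than twice the average while the new 5.5 TiB drives are still nearly empty, and the 0.95001 entries in the REWEIGHT column suggest the fullest OSDs have already been nudged down manually. A dry run of utilization-based reweighting reports what Ceph would change without touching anything; this is the stock CLI, and 120 is just an illustrative threshold meaning "act on OSDs more than 20% over mean utilization":

cephuser@ceph01:~$ ceph osd test-reweight-by-utilization 120   # report only; applies no changes
cephuser@ceph01:~$ ceph osd reweight-by-utilization 120        # the same calculation, applied for real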
On Thu, Aug 27, 2020 at 8:43 AM Eugen Block <eblock(a)nde.ag> wrote:
Hi,
Are the new OSDs in the same root, and are they the same device class? Can
you share the output of 'ceph osd df tree'?
Quoting Dallas Jones <djones(a)tech4learning.com>:
My 3-node Ceph cluster (14.2.4) has been running fine for months. However,
my data pool came close to full a couple of weeks ago, so I added 12 new
OSDs, roughly doubling the capacity of the cluster. Yet the pool size has
not changed, and the health of the cluster has changed for the worse.
The dashboard shows the following cluster status:
- PG_DEGRADED_FULL: Degraded data redundancy (low space): 2 pgs backfill_toofull
- POOL_NEARFULL: 6 pool(s) nearfull
- OSD_NEARFULL: 1 nearfull osd(s)
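The detailed health output names the specific OSD, pools, and PGs behind each of these warnings (standard command; output omitted here):

cephuser@ceph01:~$ ceph health detail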
Output from ceph -s:
  cluster:
    id:     e5a47160-a302-462a-8fa4-1e533e1edd4e
    health: HEALTH_ERR
            1 nearfull osd(s)
            6 pool(s) nearfull
            Degraded data redundancy (low space): 2 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 5w)
    mgr: ceph01(active, since 4w), standbys: ceph03, ceph02
    mds: cephfs:1 {0=ceph01=up:active} 2 up:standby
    osd: 33 osds: 33 up (since 43h), 33 in (since 43h); 1094 remapped pgs
    rgw: 3 daemons active (ceph01, ceph02, ceph03)

  data:
    pools:   6 pools, 1632 pgs
    objects: 134.50M objects, 7.8 TiB
    usage:   42 TiB used, 81 TiB / 123 TiB avail
    pgs:     213786007/403501920 objects misplaced (52.983%)
             1088 active+remapped+backfill_wait
             538  active+clean
             4    active+remapped+backfilling
             2    active+remapped+backfill_wait+backfill_toofull

  io:
    recovery: 477 KiB/s, 330 keys/s, 29 objects/s
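Two things stand out in that status: with 52.983% of objects misplaced, the rebalance onto the new OSDs still has a long way to go, and the two backfill_toofull PGs are ones whose backfill-target OSDs have crossed the backfillfull ratio. The ratios currently in force can be read from the OSD map (stock commands; the grep is only a convenience filter):

cephuser@ceph01:~$ ceph osd dump | grep ratio   # prints full_ratio, backfillfull_ratio, nearfull_ratio
cephuser@ceph01:~$ ceph -s                      # re-run periodically to watch the misplaced percentage fall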
Can someone steer me in the right direction for how to get my cluster healthy again?
Thanks in advance!
-Dallas