So, should I just do nothing and wait for Ceph to recover?
In theory there should be enough disk space (more disks arriving tomorrow),
but I fear there might be an issue when the backups get exported overnight
to this S3. Currently the max_avail lingers around 13 TB, and I hope that
the data will go to other PGs than the ones that are currently on
filled OSDs.
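To keep an eye on that I'll watch the per-pool MAX AVAIL, roughly like this
(grepping for our data pool, adjust the pattern as needed):
  ceph df | grep -E 'MAX AVAIL|buckets.data'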
On Tue, Mar 23, 2021 at 6:58 PM Dan van der Ster <dan(a)vanderster.com> wrote:
Hi,
backfill_toofull is not a bad thing when the cluster is really full
like yours. You should expect some of the most full OSDs to eventually
start decreasing in usage, as the PGs are moved to the new OSDs. Those
backfill_toofull states should then resolve themselves as the OSD
usage flattens out.
Keep an eye on the usage of the backfillfull and nearfull OSDs though
-- if they do eventually go above the full_ratio (95% by default),
then writes to those OSDs would stop.
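For reference, the current ratios can be checked (and, as a last resort,
temporarily nudged while you add capacity) roughly like this -- the values
here are only examples, not a recommendation:
  ceph osd dump | grep ratio
  ceph osd set-backfillfull-ratio 0.91
  ceph osd set-full-ratio 0.96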
But if on the other hand you're suffering from lots of slow ops or
anything else visible to your users, then you could try to take some
actions to slow down the rebalancing. Just let us know if that's the
case and we can see about changing osd_max_backfills, some weights or
maybe using the upmap-remapped tool.
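For example, if osd_max_backfills was raised at some point to speed things
up, dialing it back down is usually the first lever (ceph config set works
on nautilus):
  ceph config get osd osd_max_backfills
  ceph config set osd osd_max_backfills 1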
-- Dan
On Tue, Mar 23, 2021 at 6:07 PM Boris Behrens <bb(a)kervyn.de> wrote:
Ok, I should have listened to you :)
In the last week we added more storage, but the issue got worse instead.
Today I realized that the PGs had grown to around 90 GiB each (the BYTES
column in ceph pg ls said 95705749636), and the autoscaler log kept
mentioning 2048 PGs for this pool. We were at 72% utilization for the
cluster (ceph osd df tree, first line), so I increased the PGs to 2048.
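(Roughly: ceph osd pool set eu-central-1.rgw.buckets.data pg_num 2048 --
on nautilus the pgp_num then follows along on its own.)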
Now I am in a world of trouble.
The free space in the cluster went down, I am at 45% misplaced objects, and
we already added 20x4TB disks just to avoid running completely full.
The utilization is still going up and the overall free space in the cluster
seems to keep shrinking. This is what my ceph status looks like, and now I
really need help to get that thing back to normal:
[root@s3db1 ~]# ceph status
  cluster:
    id:     dca79fff-ffd0-58f4-1cff-82a2feea05f4
    health: HEALTH_WARN
            4 backfillfull osd(s)
            17 nearfull osd(s)
            37 pool(s) backfillfull
            13 large omap objects
            Low space hindering backfill (add storage if this doesn't resolve itself): 570 pgs backfill_toofull
  services:
    mon: 3 daemons, quorum ceph-s3-mon1,ceph-s3-mon2,ceph-s3-mon3 (age 44m)
    mgr: ceph-mgr2(active, since 15m), standbys: ceph-mgr3, ceph-mgr1
    mds: 3 up:standby
    osd: 110 osds: 110 up (since 28m), 110 in (since 28m); 1535 remapped pgs
    rgw: 3 daemons active (eu-central-1, eu-msg-1, eu-secure-1)
  task status:
  data:
    pools:   37 pools, 4032 pgs
    objects: 116.23M objects, 182 TiB
    usage:   589 TiB used, 206 TiB / 795 TiB avail
    pgs:     160918554/348689415 objects misplaced (46.150%)
             2497 active+clean
             779  active+remapped+backfill_wait
             538  active+remapped+backfill_wait+backfill_toofull
             186  active+remapped+backfilling
             32   active+remapped+backfill_toofull
  io:
    client:   27 MiB/s rd, 69 MiB/s wr, 497 op/s rd, 153 op/s wr
    recovery: 1.5 GiB/s, 922 objects/s
On Tue, Mar 16, 2021 at 9:34 AM Boris Behrens <bb(a)kervyn.de> wrote:
>
> Hi Dan,
>
> my EC profile looks very "default" to me.
> [root@s3db1 ~]# ceph osd erasure-code-profile ls
> default
> [root@s3db1 ~]# ceph osd erasure-code-profile get default
> k=2
> m=1
> plugin=jerasure
> technique=reed_sol_van
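> If I read the profile right, k=2/m=1 means each object is split into 2
> data chunks plus 1 coding chunk across 3 OSDs, so raw usage is roughly
> 1.5x the stored data and only one OSD failure can be tolerated. To
> double-check which pools actually use it:
>
> [root@s3db1 ~]# ceph osd pool ls detail | grep erasure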
>
> I don't understand the output, but the balancing got worse overnight:
>
> [root@s3db1 ~]# ceph-scripts/tools/ceph-pool-pg-distribution 11
> Searching for PGs in pools: ['11']
> Summary: 1024 PGs on 84 osds
>
> Num OSDs with X PGs:
> 15: 8
> 16: 7
> 17: 6
> 18: 10
> 19: 1
> 32: 10
> 33: 4
> 34: 6
> 35: 8
> 65: 5
> 66: 5
> 67: 4
> 68: 10
> [root@s3db1 ~]# ceph-scripts/tools/ceph-pg-histogram --normalize --pool=11
> # NumSamples = 84; Min = 4.12; Max = 5.09
> # Mean = 4.553355; Variance = 0.052415; SD = 0.228942; Median 4.561608
> # each ∎ represents a count of 1
> 4.1244 - 4.2205 [ 8]: ∎∎∎∎∎∎∎∎
> 4.2205 - 4.3166 [ 6]: ∎∎∎∎∎∎
> 4.3166 - 4.4127 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎
> 4.4127 - 4.5087 [ 10]: ∎∎∎∎∎∎∎∎∎∎
> 4.5087 - 4.6048 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎
> 4.6048 - 4.7009 [ 19]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
> 4.7009 - 4.7970 [ 6]: ∎∎∎∎∎∎
> 4.7970 - 4.8931 [ 8]: ∎∎∎∎∎∎∎∎
> 4.8931 - 4.9892 [ 4]: ∎∎∎∎
> 4.9892 - 5.0852 [ 1]: ∎
> [root@s3db1 ~]# ceph osd df tree | sort -nk 17 | tail
> 14 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 724 GiB 19 GiB 0 B 724 GiB 80.56 1.07 56 up osd.14
> 19 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 466 MiB 7.9 GiB 708 GiB 81.25 1.08 53 up osd.19
> 4 hdd 3.63689 1.00000 3.6 TiB 3.0 TiB 698 GiB 703 MiB 0 B 698 GiB 81.27 1.08 48 up osd.4
> 24 hdd 3.63689 1.00000 3.6 TiB 3.0 TiB 695 GiB 640 MiB 0 B 695 GiB 81.34 1.08 46 up osd.24
> 75 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 440 MiB 8.1 GiB 704 GiB 81.35 1.08 48 up osd.75
> 71 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 3.0 TiB 7.5 MiB 8.0 GiB 663 GiB 82.44 1.09 47 up osd.71
> 76 hdd 3.68750 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 251 MiB 9.0 GiB 617 GiB 83.65 1.11 50 up osd.76
> 33 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 399 MiB 8.1 GiB 618 GiB 83.85 1.11 55 up osd.33
> 35 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 317 MiB 8.8 GiB 617 GiB 83.87 1.11 50 up osd.35
> 34 hdd 3.73630 1.00000 3.7 TiB 3.2 TiB 3.1 TiB 451 MiB 8.7 GiB 545 GiB 85.75 1.14 54 up osd.34
>
> On Mon, Mar 15, 2021 at 5:23 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>>
>> Hi,
>>
>> How wide are your EC profiles? If they are really wide, you might be
>> reaching the limits of what is physically possible. Also, I'm not sure
>> that upmap in 14.2.11 is very smart about *improving* existing upmap
>> rules for a given PG, in the case that a PG already has an upmap-items
>> entry but it would help the distribution to add more mapping pairs to
>> that entry. What this means, is that it might sometimes be useful to
>> randomly remove some upmap entries and see if the balancer does a
>> better job when it replaces them.
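>> If you go that route, something like this lists the entries and removes
>> them (a few at a time, so the balancer can react):
>>
>>    ceph osd dump | grep pg_upmap_items
>>    ceph osd rm-pg-upmap-items <pgid>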
>>
>> But before you do that, I re-remembered that looking at the total PG
>> numbers is not useful -- you need to check the PGs per OSD for the
>> eu-central-1.rgw.buckets.data pool only.
>>
>> We have a couple tools that can help with this:
>>
>> 1. To see the PGs per OSD for a given pool:
>>
https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-pool-pg-dis…
>>
>> E.g.: ./ceph-pool-pg-distribution 11 # to see the distribution of
>> your eu-central-1.rgw.buckets.data pool.
>>
>> The output looks like this on my well balanced clusters:
>>
>> # ceph-scripts/tools/ceph-pool-pg-distribution 15
>> Searching for PGs in pools: ['15']
>> Summary: 256 pgs on 56 osds
>>
>> Num OSDs with X PGs:
>> 13: 16
>> 14: 40
>>
>> You should expect a trimodal for your cluster.
>>
>> 2. You can also use another script from that repo to see the PGs per
>> OSD normalized to crush weight:
>> ceph-scripts/tools/ceph-pg-histogram --normalize --pool=15
>>
>> This might explain what is going wrong.
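>> (If you don't want to clone the scripts, a rough equivalent for the
>> per-pool PG count per OSD is something like
>>    ceph pg dump pgs_brief 2>/dev/null | awk '$1 ~ /^11\./ {print $3}' | tr -d '[]' | tr ',' '\n' | sort -n | uniq -c
>> where column 3 is the UP set -- double-check the column layout on your release.)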
>>
>> Cheers, Dan
>>
>>
>> On Mon, Mar 15, 2021 at 3:04 PM Boris Behrens <bb(a)kervyn.de> wrote:
>> >
>> > Absolutely:
>> > [root@s3db1 ~]# ceph osd df tree
>> > ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
>> > -1 673.54224 - 674 TiB 496 TiB 468 TiB 97 GiB 1.2 TiB 177 TiB 73.67 1.00 - root default
>> > -2 58.30331 - 58 TiB 42 TiB 38 TiB 9.2 GiB 99 GiB 16 TiB 72.88 0.99 - host s3db1
>> > 23 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 714 MiB 25 GiB 3.7 TiB 74.87 1.02 194 up osd.23
>> > 69 hdd 14.55269 1.00000 15 TiB 11 TiB 11 TiB 1.6 GiB 40 GiB 3.4 TiB 76.32 1.04 199 up osd.69
>> > 73 hdd 14.55269 1.00000 15 TiB 11 TiB 11 TiB 1.3 GiB 34 GiB 3.8 TiB 74.15 1.01 203 up osd.73
>> > 79 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.3 TiB 1.8 GiB 0 B 1.3 TiB 65.44 0.89 47 up osd.79
>> > 80 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.3 TiB 2.2 GiB 0 B 1.3 TiB 65.34 0.89 48 up osd.80
>> > 81 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.3 TiB 1.1 GiB 0 B 1.3 TiB 65.38 0.89 47 up osd.81
>> > 82 hdd 3.63689 1.00000 3.6 TiB 2.5 TiB 1.1 TiB 619 MiB 0 B 1.1 TiB 68.46 0.93 41 up osd.82
>> > -11 50.94173 - 51 TiB 37 TiB 37 TiB 3.5 GiB 98 GiB 14 TiB 71.90 0.98 - host s3db10
>> > 63 hdd 7.27739 1.00000 7.3 TiB 5.3 TiB 5.3 TiB 647 MiB 14 GiB 2.0 TiB 72.72 0.99 94 up osd.63
>> > 64 hdd 7.27739 1.00000 7.3 TiB 5.3 TiB 5.2 TiB 668 MiB 14 GiB 2.0 TiB 72.23 0.98 93 up osd.64
>> > 65 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 227 MiB 14 GiB 2.1 TiB 71.16 0.97 100 up osd.65
>> > 66 hdd 7.27739 1.00000 7.3 TiB 5.4 TiB 5.4 TiB 313 MiB 14 GiB 1.9 TiB 74.25 1.01 92 up osd.66
>> > 67 hdd 7.27739 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 584 MiB 14 GiB 2.1 TiB 70.63 0.96 96 up osd.67
>> > 68 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 720 MiB 14 GiB 2.1 TiB 71.72 0.97 101 up osd.68
>> > 70 hdd 7.27739 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 425 MiB 14 GiB 2.1 TiB 70.59 0.96 97 up osd.70
>> > -12 50.99052 - 51 TiB 38 TiB 37 TiB 2.1 GiB 97 GiB 13 TiB 73.77 1.00 - host s3db11
>> > 46 hdd 7.27739 1.00000 7.3 TiB 5.6 TiB 5.6 TiB 229 MiB 14 GiB 1.7 TiB 77.05 1.05 97 up osd.46
>> > 47 hdd 7.27739 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 159 MiB 13 GiB 2.2 TiB 70.00 0.95 89 up osd.47
>> > 48 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 279 MiB 14 GiB 2.1 TiB 71.82 0.97 98 up osd.48
>> > 49 hdd 7.27739 1.00000 7.3 TiB 5.5 TiB 5.4 TiB 276 MiB 14 GiB 1.8 TiB 74.90 1.02 95 up osd.49
>> > 50 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 336 MiB 14 GiB 2.0 TiB 72.13 0.98 93 up osd.50
>> > 51 hdd 7.27739 1.00000 7.3 TiB 5.7 TiB 5.6 TiB 728 MiB 15 GiB 1.6 TiB 77.76 1.06 98 up osd.51
>> > 72 hdd 7.32619 1.00000 7.3 TiB 5.3 TiB 5.3 TiB 147 MiB 13 GiB 2.0 TiB 72.75 0.99 95 up osd.72
>> > -37 58.55478 - 59 TiB 44 TiB 44 TiB 4.4 GiB 122 GiB 15 TiB 75.20 1.02 - host s3db12
>> > 19 hdd 3.68750 1.00000 3.7 TiB 2.9 TiB 2.9 TiB 454 MiB 8.2 GiB 780 GiB 79.35 1.08 53 up osd.19
>> > 71 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 7.1 MiB 8.0 GiB 734 GiB 80.56 1.09 47 up osd.71
>> > 75 hdd 3.68750 1.00000 3.7 TiB 2.9 TiB 2.9 TiB 439 MiB 8.2 GiB 777 GiB 79.43 1.08 48 up osd.75
>> > 76 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 3.0 TiB 241 MiB 8.9 GiB 688 GiB 81.77 1.11 50 up osd.76
>> > 77 hdd 14.60159 1.00000 15 TiB 11 TiB 11 TiB 880 MiB 30 GiB 3.6 TiB 75.44 1.02 201 up osd.77
>> > 78 hdd 14.60159 1.00000 15 TiB 10 TiB 10 TiB 1015 MiB 28 GiB 4.2 TiB 71.26 0.97 193 up osd.78
>> > 83 hdd 14.60159 1.00000 15 TiB 11 TiB 11 TiB 1.4 GiB 30 GiB 3.8 TiB 73.76 1.00 203 up osd.83
>> > -3 58.49872 - 58 TiB 42 TiB 36 TiB 8.2 GiB 89 GiB 17 TiB 71.71 0.97 - host s3db2
>> > 1 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 3.2 GiB 37 GiB 3.7 TiB 74.58 1.01 196 up osd.1
>> > 3 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 566 MiB 0 B 1.3 TiB 64.11 0.87 50 up osd.3
>> > 4 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 771 GiB 695 MiB 0 B 771 GiB 79.30 1.08 48 up osd.4
>> > 5 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.2 TiB 482 MiB 0 B 1.2 TiB 66.51 0.90 49 up osd.5
>> > 6 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 1.8 GiB 0 B 1.3 TiB 64.00 0.87 42 up osd.6
>> > 7 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 639 MiB 26 GiB 4.0 TiB 72.44 0.98 192 up osd.7
>> > 74 hdd 14.65039 1.00000 15 TiB 10 TiB 10 TiB 907 MiB 26 GiB 4.2 TiB 71.32 0.97 193 up osd.74
>> > -4 58.49872 - 58 TiB 43 TiB 36 TiB 34 GiB 85 GiB 16 TiB 72.69 0.99 - host s3db3
>> > 2 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 980 MiB 26 GiB 3.8 TiB 74.36 1.01 203 up osd.2
>> > 9 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 8.4 GiB 33 GiB 3.9 TiB 73.51 1.00 186 up osd.9
>> > 10 hdd 14.65039 1.00000 15 TiB 10 TiB 10 TiB 650 MiB 26 GiB 4.2 TiB 71.64 0.97 201 up osd.10
>> > 12 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 754 MiB 0 B 1.3 TiB 64.17 0.87 44 up osd.12
>> > 13 hdd 3.63689 1.00000 3.6 TiB 2.8 TiB 813 GiB 2.4 GiB 0 B 813 GiB 78.17 1.06 58 up osd.13
>> > 14 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 797 GiB 19 GiB 0 B 797 GiB 78.60 1.07 56 up osd.14
>> > 15 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 2.2 GiB 0 B 1.3 TiB 63.96 0.87 41 up osd.15
>> > -5 58.49872 - 58 TiB 43 TiB 36 TiB 6.7 GiB 97 GiB 15 TiB 74.04 1.01 - host s3db4
>> > 11 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 940 MiB 26 GiB 4.0 TiB 72.49 0.98 196 up osd.11
>> > 17 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1022 MiB 26 GiB 3.6 TiB 75.23 1.02 204 up osd.17
>> > 18 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 945 MiB 45 GiB 3.8 TiB 74.16 1.01 193 up osd.18
>> > 20 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1020 GiB 596 MiB 0 B 1020 GiB 72.62 0.99 57 up osd.20
>> > 21 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1023 GiB 1.9 GiB 0 B 1023 GiB 72.54 0.98 41 up osd.21
>> > 22 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1023 GiB 797 MiB 0 B 1023 GiB 72.54 0.98 53 up osd.22
>> > 24 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 766 GiB 618 MiB 0 B 766 GiB 79.42 1.08 46 up osd.24
>> > -6 58.89636 - 59 TiB 43 TiB 43 TiB 3.0 GiB 108 GiB 16 TiB 73.40 1.00 - host s3db5
>> > 0 hdd 3.73630 1.00000 3.7 TiB 2.7 TiB 2.6 TiB 92 MiB 7.2 GiB 1.1 TiB 71.16 0.97 45 up osd.0
>> > 25 hdd 3.73630 1.00000 3.7 TiB 2.7 TiB 2.6 TiB 2.4 MiB 7.3 GiB 1.1 TiB 71.23 0.97 41 up osd.25
>> > 26 hdd 3.73630 1.00000 3.7 TiB 2.8 TiB 2.7 TiB 181 MiB 7.6 GiB 935 GiB 75.57 1.03 45 up osd.26
>> > 27 hdd 3.73630 1.00000 3.7 TiB 2.7 TiB 2.6 TiB 5.1 MiB 7.0 GiB 1.1 TiB 71.20 0.97 47 up osd.27
>> > 28 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 977 MiB 26 GiB 3.8 TiB 73.85 1.00 197 up osd.28
>> > 29 hdd 14.65039 1.00000 15 TiB 11 TiB 10 TiB 872 MiB 26 GiB 4.1 TiB 71.98 0.98 196 up osd.29
>> > 30 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 943 MiB 27 GiB 3.6 TiB 75.51 1.03 202 up osd.30
>> > -7 58.89636 - 59 TiB 44 TiB 43 TiB 13 GiB 122 GiB 15 TiB 74.97 1.02 - host s3db6
>> > 32 hdd 3.73630 1.00000 3.7 TiB 2.8 TiB 2.7 TiB 27 MiB 7.6 GiB 940 GiB 75.42 1.02 55 up osd.32
>> > 33 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 376 MiB 8.2 GiB 691 GiB 81.94 1.11 55 up osd.33
>> > 34 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 450 MiB 8.5 GiB 620 GiB 83.79 1.14 54 up osd.34
>> > 35 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 316 MiB 8.4 GiB 690 GiB 81.98 1.11 50 up osd.35
>> > 36 hdd 14.65039 1.00000 15 TiB 11 TiB 10 TiB 489 MiB 25 GiB 4.1 TiB 71.69 0.97 208 up osd.36
>> > 37 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 11 GiB 38 GiB 4.0 TiB 72.41 0.98 195 up osd.37
>> > 38 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.1 GiB 26 GiB 3.7 TiB 74.88 1.02 204 up osd.38
>> > -8 58.89636 - 59 TiB 44 TiB 43 TiB 3.8 GiB 111 GiB 15 TiB 74.16 1.01 - host s3db7
>> > 39 hdd 3.73630 1.00000 3.7 TiB 2.8 TiB 2.7 TiB 19 MiB 7.5 GiB 936 GiB 75.54 1.03 39 up osd.39
>> > 40 hdd 3.73630 1.00000 3.7 TiB 2.6 TiB 2.5 TiB 144 MiB 7.1 GiB 1.1 TiB 69.87 0.95 39 up osd.40
>> > 41 hdd 3.73630 1.00000 3.7 TiB 2.7 TiB 2.7 TiB 219 MiB 7.6 GiB 1011 GiB 73.57 1.00 55 up osd.41
>> > 42 hdd 3.73630 1.00000 3.7 TiB 2.6 TiB 2.5 TiB 593 MiB 7.1 GiB 1.1 TiB 70.02 0.95 47 up osd.42
>> > 43 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 500 MiB 27 GiB 3.7 TiB 74.67 1.01 204 up osd.43
>> > 44 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.1 GiB 27 GiB 3.7 TiB 74.62 1.01 193 up osd.44
>> > 45 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.2 GiB 29 GiB 3.6 TiB 75.16 1.02 204 up osd.45
>> > -9 51.28331 - 51 TiB 39 TiB 39 TiB 4.9 GiB 107 GiB 12 TiB 76.50 1.04 - host s3db8
>> > 8 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.5 TiB 474 MiB 14 GiB 1.7 TiB 76.37 1.04 98 up osd.8
>> > 16 hdd 7.32619 1.00000 7.3 TiB 5.7 TiB 5.7 TiB 783 MiB 15 GiB 1.6 TiB 78.39 1.06 100 up osd.16
>> > 31 hdd 7.32619 1.00000 7.3 TiB 5.7 TiB 5.6 TiB 441 MiB 14 GiB 1.6 TiB 77.70 1.05 91 up osd.31
>> > 52 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.5 TiB 939 MiB 14 GiB 1.7 TiB 76.29 1.04 102 up osd.52
>> > 53 hdd 7.32619 1.00000 7.3 TiB 5.4 TiB 5.4 TiB 848 MiB 18 GiB 1.9 TiB 74.30 1.01 98 up osd.53
>> > 54 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.6 TiB 1.0 GiB 16 GiB 1.7 TiB 76.99 1.05 106 up osd.54
>> > 55 hdd 7.32619 1.00000 7.3 TiB 5.5 TiB 5.5 TiB 460 MiB 15 GiB 1.8 TiB 75.46 1.02 105 up osd.55
>> > -10 51.28331 - 51 TiB 37 TiB 37 TiB 3.8 GiB 96 GiB 14 TiB 72.77 0.99 - host s3db9
>> > 56 hdd 7.32619 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 846 MiB 13 GiB 2.1 TiB 71.16 0.97 104 up osd.56
>> > 57 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.6 TiB 513 MiB 15 GiB 1.7 TiB 76.53 1.04 96 up osd.57
>> > 58 hdd 7.32619 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 604 MiB 13 GiB 2.1 TiB 71.23 0.97 98 up osd.58
>> > 59 hdd 7.32619 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 414 MiB 13 GiB 2.2 TiB 70.03 0.95 88 up osd.59
>> > 60 hdd 7.32619 1.00000 7.3 TiB 5.5 TiB 5.5 TiB 227 MiB 14 GiB 1.8 TiB 75.54 1.03 97 up osd.60
>> > 61 hdd 7.32619 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 456 MiB 13 GiB 2.2 TiB 70.01 0.95 95 up osd.61
>> > 62 hdd 7.32619 1.00000 7.3 TiB 5.5 TiB 5.4 TiB 843 MiB 14 GiB 1.8 TiB 74.93 1.02 110 up osd.62
>> > TOTAL 674 TiB 496 TiB 468 TiB 97 GiB 1.2 TiB 177 TiB 73.67
>> > MIN/MAX VAR: 0.87/1.14 STDDEV: 4.22
>> >
>> > On Mon, Mar 15, 2021 at 3:02 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>> >>
>> >> OK thanks. Indeed "prepared 0/10 changes" means it thinks things are balanced.
>> >> Could you again share the full ceph osd df tree?
>> >>
>> >> On Mon, Mar 15, 2021 at 2:54 PM Boris Behrens <bb(a)kervyn.de> wrote:
>> >> >
>> >> > Hi Dan,
>> >> >
>> >> > I've set the autoscaler to warn, but it does not actually warn at the moment, so I'm not touching it for now.
>> >> >
>> >> > this is what the log says in minute intervals:
>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/active
>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/sleep_interval
>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/begin_time
>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/end_time
>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/begin_weekday
>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/end_weekday
>> >> > 2021-03-15 13:51:00.971 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/pool_ids
>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr[balancer] Optimize plan auto_2021-03-15_13:51:00
>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/mode
>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr[balancer] Mode upmap, max misplaced 0.050000
>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr[balancer] do_upmap
>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/upmap_max_iterations
>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr get_config get_config key: mgr/balancer/upmap_max_deviation
>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr[balancer] pools ['eu-msg-1.rgw.data.root', 'eu-msg-1.rgw.buckets.non-ec', 'eu-central-1.rgw.users.keys', 'eu-central-1.rgw.gc', 'eu-central-1.rgw.buckets.data', 'eu-central-1.rgw.users.email', 'eu-msg-1.rgw.gc', 'eu-central-1.rgw.usage', 'eu-msg-1.rgw.users.keys', 'eu-central-1.rgw.buckets.index', 'rbd', 'eu-msg-1.rgw.log', 'whitespace-again-2021-03-10_2', 'eu-msg-1.rgw.buckets.index', 'eu-msg-1.rgw.meta', 'eu-central-1.rgw.log', 'default.rgw.gc', 'eu-central-1.rgw.buckets.non-ec', 'eu-msg-1.rgw.usage', 'whitespace-again-2021-03-10', 'fra-1.rgw.meta', 'eu-central-1.rgw.users.uid', 'eu-msg-1.rgw.users.email', 'fra-1.rgw.control', 'eu-msg-1.rgw.users.uid', 'eu-msg-1.rgw.control', '.rgw.root', 'eu-msg-1.rgw.buckets.data', 'default.rgw.control', 'fra-1.rgw.log', 'default.rgw.data.root', 'whitespace-again-2021-03-10_3', 'default.rgw.log', 'eu-central-1.rgw.meta', 'eu-central-1.rgw.data.root', 'default.rgw.users.uid', 'eu-central-1.rgw.control']
>> >> > 2021-03-15 13:51:01.224 7f307d5fd700 4 mgr[balancer] prepared 0/10 changes
>> >> >
>> >> > On Mon, Mar 15, 2021 at 2:15 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>> >> >>
>> >> >> I suggest just disabling the autoscaler until your balancing is understood.
>> >> >>
>> >> >> What does your active mgr log say (with debug_mgr 4/5)?
>> >> >> grep balancer /var/log/ceph/ceph-mgr.*.log
>> >> >>
>> >> >> -- Dan
>> >> >>
>> >> >> On Mon, Mar 15, 2021 at 1:47 PM Boris Behrens <bb(a)kervyn.de> wrote:
>> >> >> >
>> >> >> > Hi,
>> >> >> > this unfortunately did not solve my problem. I still have some OSDs that fill up to 85%.
>> >> >> >
>> >> >> > According to the logging, the autoscaler might want to add more PGs to one bucket and reduce almost all other buckets to 32.
>> >> >> > 2021-03-15 12:19:58.825 7f307f601700 4 mgr[pg_autoscaler] Pool 'eu-central-1.rgw.buckets.data' root_id -1 using 0.705080476146 of space, bias 1.0, pg target 1974.22533321 quantized to 2048 (current 1024)
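>> >> >> > If I read the autoscaler's math right, that target is roughly 0.705 (space used) * 100 (default mon_target_pg_per_osd) * 84 OSDs / 3 (pool size, k+m) = ~1974, which then gets rounded up to the next power of two, 2048 -- so the number itself at least looks consistent, assuming those defaults.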
>> >> >> >
>> >> >> > Why the balancing does not happen is still nebulous to me.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Sat, Mar 13, 2021 at 4:37 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>> >> >> >>
>> >> >> >> OK
>> >> >> >> Btw, you might need to fail over to a new mgr... I'm not sure if the current active will read that new config.
>> >> >> >>
>> >> >> >> .. dan
>> >> >> >>
>> >> >> >>
>> >> >> >> On Sat, Mar 13, 2021, 4:36 PM Boris Behrens <bb(a)kervyn.de> wrote:
>> >> >> >>>
>> >> >> >>> Hi,
>> >> >> >>>
>> >> >> >>> ok thanks. I just changed the value and reweighted everything back to 1. Now I'll let it sync over the weekend and check how it looks on Monday.
>> >> >> >>> We tried to keep the systems' total storage as balanced as possible. New systems will come with 8TB disks, but for the existing ones we added 16TB disks to offset the 4TB ones, and we needed a lot of storage fast because of a DC move. If you have any recommendations I would be happy to hear them.
>> >> >> >>>
>> >> >> >>> Cheers
>> >> >> >>> Boris
>> >> >> >>>
>> >> >> >>> On Sat, Mar 13, 2021 at 4:20 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>> >> >> >>>>
>> >> >> >>>> Thanks.
>> >> >> >>>>
>> >> >> >>>> Decreasing the max deviation to 2 or 1 should help in your case. This option controls when the balancer stops trying to move PGs around -- by default it stops when the deviation from the mean is 5. Yes, this is too large IMO -- all of our clusters have this set to 1.
>> >> >> >>>>
>> >> >> >>>> And given that you have some OSDs with more than 200 PGs, you definitely shouldn't increase the num PGs.
>> >> >> >>>>
>> >> >> >>>> But anyway, with your mixed device sizes it might be challenging to get a perfectly uniform distribution. Give it a try with 1 though, and let us know how it goes.
>> >> >> >>>>
>> >> >> >>>> .. Dan
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> On Sat, Mar 13, 2021, 4:11 PM Boris Behrens <bb(a)kervyn.de> wrote:
>> >> >> >>>>>
>> >> >> >>>>> Hi Dan,
>> >> >> >>>>>
>> >> >> >>>>> upmap_max_deviation is default (5) in our cluster. Is 1 the recommended deviation?
>> >> >> >>>>>
>> >> >> >>>>> I added the whole ceph osd df tree. (I need to remove some OSDs and re-add them as bluestore with SSD, so 69, 73 and 82 are a bit off now. I also reweighted to try to mitigate the %USE.)
>> >> >> >>>>>
>> >> >> >>>>> I will increase the mgr debugging to see what the problem is.
>> >> >> >>>>>
>> >> >> >>>>> [root@s3db1 ~]# ceph osd df tree
>> >> >> >>>>> ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
>> >> >> >>>>> -1 673.54224 - 659 TiB 491 TiB 464 TiB 96 GiB 1.2 TiB 168 TiB 74.57 1.00 - root default
>> >> >> >>>>> -2 58.30331 - 44 TiB 22 TiB 17 TiB 5.7 GiB 38 GiB 22 TiB 49.82 0.67 - host s3db1
>> >> >> >>>>> 23 hdd 14.65039 1.00000 15 TiB 1.8 TiB 1.7 TiB 156 MiB 4.4 GiB 13 TiB 12.50 0.17 101 up osd.23
>> >> >> >>>>> 69 hdd 14.55269 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 11 up osd.69
>> >> >> >>>>> 73 hdd 14.55269 1.00000 15 TiB 10 TiB 10 TiB 6.1 MiB 33 GiB 4.2 TiB 71.15 0.95 107 up osd.73
>> >> >> >>>>> 79 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 747 GiB 2.0 GiB 0 B 747 GiB 79.94 1.07 52 up osd.79
>> >> >> >>>>> 80 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1.0 TiB 1.9 GiB 0 B 1.0 TiB 71.61 0.96 58 up osd.80
>> >> >> >>>>> 81 hdd 3.63689 1.00000 3.6 TiB 2.2 TiB 1.5 TiB 1.1 GiB 0 B 1.5 TiB 60.07 0.81 55 up osd.81
>> >> >> >>>>> 82 hdd 3.63689 1.00000 3.6 TiB 1.9 TiB 1.7 TiB 536 MiB 0 B 1.7 TiB 52.68 0.71 30 up osd.82
>> >> >> >>>>> -11 50.94173 - 51 TiB 38 TiB 38 TiB 3.7 GiB 100 GiB 13 TiB 74.69 1.00 - host s3db10
>> >> >> >>>>> 63 hdd 7.27739 1.00000 7.3 TiB 5.5 TiB 5.5 TiB 616 MiB 14 GiB 1.7 TiB 76.04 1.02 92 up osd.63
>> >> >> >>>>> 64 hdd 7.27739 1.00000 7.3 TiB 5.5 TiB 5.5 TiB 820 MiB 15 GiB 1.8 TiB 75.54 1.01 101 up osd.64
>> >> >> >>>>> 65 hdd 7.27739 1.00000 7.3 TiB 5.3 TiB 5.3 TiB 109 MiB 14 GiB 2.0 TiB 73.17 0.98 105 up osd.65
>> >> >> >>>>> 66 hdd 7.27739 1.00000 7.3 TiB 5.8 TiB 5.8 TiB 423 MiB 15 GiB 1.4 TiB 80.38 1.08 98 up osd.66
>> >> >> >>>>> 67 hdd 7.27739 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 572 MiB 14 GiB 2.2 TiB 70.10 0.94 100 up osd.67
>> >> >> >>>>> 68 hdd 7.27739 1.00000 7.3 TiB 5.3 TiB 5.3 TiB 630 MiB 13 GiB 2.0 TiB 72.88 0.98 107 up osd.68
>> >> >> >>>>> 70 hdd 7.27739 1.00000 7.3 TiB 5.4 TiB 5.4 TiB 648 MiB 14 GiB 1.8 TiB 74.73 1.00 102 up osd.70
>> >> >> >>>>> -12 50.99052 - 51 TiB 39 TiB 39 TiB 2.9 GiB 99 GiB 12 TiB 77.24 1.04 - host s3db11
>> >> >> >>>>> 46 hdd 7.27739 1.00000 7.3 TiB 5.7 TiB 5.7 TiB 102 MiB 15 GiB 1.5 TiB 78.91 1.06 97 up osd.46
>> >> >> >>>>> 47 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 61 MiB 13 GiB 2.1 TiB 71.47 0.96 96 up osd.47
>> >> >> >>>>> 48 hdd 7.27739 1.00000 7.3 TiB 6.1 TiB 6.1 TiB 853 MiB 15 GiB 1.2 TiB 83.46 1.12 109 up osd.48
>> >> >> >>>>> 49 hdd 7.27739 1.00000 7.3 TiB 5.7 TiB 5.7 TiB 708 MiB 15 GiB 1.5 TiB 78.96 1.06 98 up osd.49
>> >> >> >>>>> 50 hdd 7.27739 1.00000 7.3 TiB 5.9 TiB 5.8 TiB 472 MiB 15 GiB 1.4 TiB 80.40 1.08 102 up osd.50
>> >> >> >>>>> 51 hdd 7.27739 1.00000 7.3 TiB 5.9 TiB 5.9 TiB 729 MiB 15 GiB 1.3 TiB 81.70 1.10 110 up osd.51
>> >> >> >>>>> 72 hdd 7.32619 1.00000 7.3 TiB 4.8 TiB 4.8 TiB 91 MiB 12 GiB 2.5 TiB 65.82 0.88 89 up osd.72
>> >> >> >>>>> -37 58.55478 - 59 TiB 46 TiB 46 TiB 5.0 GiB 124 GiB 12 TiB 79.04 1.06 - host s3db12
>> >> >> >>>>> 19 hdd 3.68750 1.00000 3.7 TiB 3.1 TiB 3.1 TiB 462 MiB 8.2 GiB 559 GiB 85.18 1.14 55 up osd.19
>> >> >> >>>>> 71 hdd 3.68750 1.00000 3.7 TiB 2.9 TiB 2.8 TiB 3.9 MiB 7.8 GiB 825 GiB 78.14 1.05 50 up osd.71
>> >> >> >>>>> 75 hdd 3.68750 1.00000 3.7 TiB 3.1 TiB 3.1 TiB 576 MiB 8.3 GiB 555 GiB 85.29 1.14 57 up osd.75
>> >> >> >>>>> 76 hdd 3.68750 1.00000 3.7 TiB 3.2 TiB 3.1 TiB 239 MiB 9.3 GiB 501 GiB 86.73 1.16 50 up osd.76
>> >> >> >>>>> 77 hdd 14.60159 1.00000 15 TiB 11 TiB 11 TiB 880 MiB 30 GiB 3.6 TiB 75.57 1.01 202 up osd.77
>> >> >> >>>>> 78 hdd 14.60159 1.00000 15 TiB 11 TiB 11 TiB 1.0 GiB 30 GiB 3.4 TiB 76.65 1.03 196 up osd.78
>> >> >> >>>>> 83 hdd 14.60159 1.00000 15 TiB 12 TiB 12 TiB 1.8 GiB 31 GiB 2.9 TiB 80.04 1.07 223 up osd.83
>> >> >> >>>>> -3 58.49872 - 58 TiB 43 TiB 38 TiB 8.1 GiB 91 GiB 16 TiB 73.15 0.98 - host s3db2
>> >> >> >>>>> 1 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 3.1 GiB 38 GiB 3.6 TiB 75.52 1.01 194 up osd.1
>> >> >> >>>>> 3 hdd 3.63689 1.00000 3.6 TiB 2.2 TiB 1.4 TiB 418 MiB 0 B 1.4 TiB 60.94 0.82 52 up osd.3
>> >> >> >>>>> 4 hdd 3.63689 0.89999 3.6 TiB 3.2 TiB 401 GiB 845 MiB 0 B 401 GiB 89.23 1.20 53 up osd.4
>> >> >> >>>>> 5 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 437 MiB 0 B 1.3 TiB 62.88 0.84 51 up osd.5
>> >> >> >>>>> 6 hdd 3.63689 1.00000 3.6 TiB 2.0 TiB 1.7 TiB 1.8 GiB 0 B 1.7 TiB 54.51 0.73 47 up osd.6
>> >> >> >>>>> 7 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 493 MiB 26 GiB 3.8 TiB 73.90 0.99 185 up osd.7
>> >> >> >>>>> 74 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.1 GiB 27 GiB 3.5 TiB 76.27 1.02 208 up osd.74
>> >> >> >>>>> -4 58.49872 - 58 TiB 43 TiB 37 TiB 33 GiB 86 GiB 15 TiB 74.05 0.99 - host s3db3
>> >> >> >>>>> 2 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 850 MiB 26 GiB 4.0 TiB 72.78 0.98 203 up osd.2
>> >> >> >>>>> 9 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 8.3 GiB 33 GiB 3.6 TiB 75.62 1.01 189 up osd.9
>> >> >> >>>>> 10 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 663 MiB 28 GiB 3.5 TiB 76.34 1.02 211 up osd.10
>> >> >> >>>>> 12 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.2 TiB 633 MiB 0 B 1.2 TiB 66.22 0.89 44 up osd.12
>> >> >> >>>>> 13 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 720 GiB 2.3 GiB 0 B 720 GiB 80.66 1.08 66 up osd.13
>> >> >> >>>>> 14 hdd 3.63689 1.00000 3.6 TiB 3.1 TiB 552 GiB 18 GiB 0 B 552 GiB 85.18 1.14 60 up osd.14
>> >> >> >>>>> 15 hdd 3.63689 1.00000 3.6 TiB 2.0 TiB 1.7 TiB 2.1 GiB 0 B 1.7 TiB 53.72 0.72 44 up osd.15
>> >> >> >>>>> -5 58.49872 - 58 TiB 45 TiB 37 TiB 7.2 GiB 99 GiB 14 TiB 76.37 1.02 - host s3db4
>> >> >> >>>>> 11 hdd 14.65039 1.00000 15 TiB 12 TiB 12 TiB 897 MiB 28 GiB 2.8 TiB 81.15 1.09 205 up osd.11
>> >> >> >>>>> 17 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.2 GiB 27 GiB 3.6 TiB 75.38 1.01 211 up osd.17
>> >> >> >>>>> 18 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 965 MiB 44 GiB 4.0 TiB 72.86 0.98 188 up osd.18
>> >> >> >>>>> 20 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 796 GiB 529 MiB 0 B 796 GiB 78.63 1.05 66 up osd.20
>> >> >> >>>>> 21 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1.1 TiB 2.1 GiB 0 B 1.1 TiB 70.32 0.94 47 up osd.21
>> >> >> >>>>> 22 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 802 GiB 882 MiB 0 B 802 GiB 78.47 1.05 58 up osd.22
>> >> >> >>>>> 24 hdd 3.63689 1.00000 3.6 TiB 2.8 TiB 856 GiB 645 MiB 0 B 856 GiB 77.01 1.03 47 up osd.24
>> >> >> >>>>> -6 58.89636 - 59 TiB 44 TiB 44 TiB 2.4 GiB 111 GiB 15 TiB 75.22 1.01 - host s3db5
>> >> >> >>>>> 0 hdd 3.73630 1.00000 3.7 TiB 2.4 TiB 2.3 TiB 70 MiB 6.6 GiB 1.3 TiB 65.00 0.87 48 up osd.0
>> >> >> >>>>> 25 hdd 3.73630 1.00000 3.7 TiB 2.4 TiB 2.3 TiB 5.3 MiB 6.6 GiB 1.4 TiB 63.86 0.86 41 up osd.25
>> >> >> >>>>> 26 hdd 3.73630 1.00000 3.7 TiB 2.9 TiB 2.8 TiB 181 MiB 7.6 GiB 862 GiB 77.47 1.04 48 up osd.26
>> >> >> >>>>> 27 hdd 3.73630 1.00000 3.7 TiB 2.3 TiB 2.2 TiB 7.0 MiB 6.1 GiB 1.5 TiB 61.00 0.82 48 up osd.27
>> >> >> >>>>> 28 hdd 14.65039 1.00000 15 TiB 12 TiB 12 TiB 937 MiB 30 GiB 2.8 TiB 81.19 1.09 203 up osd.28
>> >> >> >>>>> 29 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 536 MiB 26 GiB 3.8 TiB 73.95 0.99 200 up osd.29
>> >> >> >>>>> 30 hdd 14.65039 1.00000 15 TiB 12 TiB 11 TiB 744 MiB 28 GiB 3.1 TiB 79.07 1.06 207 up osd.30
>> >> >> >>>>> -7 58.89636 - 59 TiB 44 TiB 44 TiB 14 GiB 122 GiB 14 TiB 75.41 1.01 - host s3db6
>> >> >> >>>>> 32 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 16 MiB 8.2 GiB 622 GiB 83.74 1.12 65 up osd.32
>> >> >> >>>>> 33 hdd 3.73630 0.79999 3.7 TiB 3.0 TiB 2.9 TiB 14 MiB 8.1 GiB 740 GiB 80.67 1.08 52 up osd.33
>> >> >> >>>>> 34 hdd 3.73630 0.79999 3.7 TiB 2.9 TiB 2.8 TiB 449 MiB 7.7 GiB 877 GiB 77.08 1.03 52 up osd.34
>> >> >> >>>>> 35 hdd 3.73630 0.79999 3.7 TiB 2.3 TiB 2.2 TiB 133 MiB 7.0 GiB 1.4 TiB 62.18 0.83 42 up osd.35
>> >> >> >>>>> 36 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 544 MiB 26 GiB 4.0 TiB 72.98 0.98 220 up osd.36
>> >> >> >>>>> 37 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 11 GiB 38 GiB 3.6 TiB 75.30 1.01 200 up osd.37
>> >> >> >>>>> 38 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.2 GiB 28 GiB 3.3 TiB 77.43 1.04 217 up osd.38
>> >> >> >>>>> -8 58.89636 - 59 TiB 47 TiB 46 TiB 3.9 GiB 116 GiB 12 TiB 78.98 1.06 - host s3db7
>> >> >> >>>>> 39 hdd 3.73630 1.00000 3.7 TiB 3.2 TiB 3.2 TiB 19 MiB 8.5 GiB 499 GiB 86.96 1.17 43 up osd.39
>> >> >> >>>>> 40 hdd 3.73630 1.00000 3.7 TiB 2.6 TiB 2.5 TiB 144 MiB 7.0 GiB 1.2 TiB 68.33 0.92 39 up osd.40
>> >> >> >>>>> 41 hdd 3.73630 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 218 MiB 7.9 GiB 732 GiB 80.86 1.08 64 up osd.41
>> >> >> >>>>> 42 hdd 3.73630 1.00000 3.7 TiB 2.5 TiB 2.4 TiB 594 MiB 7.0 GiB 1.2 TiB 67.97 0.91 50 up osd.42
>> >> >> >>>>> 43 hdd 14.65039 1.00000 15 TiB 12 TiB 12 TiB 564 MiB 28 GiB 2.9 TiB 80.32 1.08 213 up osd.43
>> >> >> >>>>> 44 hdd 14.65039 1.00000 15 TiB 12 TiB 11 TiB 1.3 GiB 28 GiB 3.1 TiB 78.59 1.05 198 up osd.44
>> >> >> >>>>> 45 hdd 14.65039 1.00000 15 TiB 12 TiB 12 TiB 1.2 GiB 30 GiB 2.8 TiB 81.05 1.09 214 up osd.45
>> >> >> >>>>> -9 51.28331 - 51 TiB 41 TiB 41 TiB 4.9 GiB 108 GiB 10 TiB 79.75 1.07 - host s3db8
>> >> >> >>>>> 8 hdd 7.32619 1.00000 7.3 TiB 5.8 TiB 5.8 TiB 472 MiB 15 GiB 1.5 TiB 79.68 1.07 99 up osd.8
>> >> >> >>>>> 16 hdd 7.32619 1.00000 7.3 TiB 5.9 TiB 5.8 TiB 785 MiB 15 GiB 1.4 TiB 80.25 1.08 97 up osd.16
>> >> >> >>>>> 31 hdd 7.32619 1.00000 7.3 TiB 5.5 TiB 5.5 TiB 438 MiB 14 GiB 1.8 TiB 75.36 1.01 87 up osd.31
>> >> >> >>>>> 52 hdd 7.32619 1.00000 7.3 TiB 5.7 TiB 5.7 TiB 844 MiB 15 GiB 1.6 TiB 78.19 1.05 113 up osd.52
>> >> >> >>>>> 53 hdd 7.32619 1.00000 7.3 TiB 6.2 TiB 6.1 TiB 792 MiB 18 GiB 1.1 TiB 84.46 1.13 109 up osd.53
>> >> >> >>>>> 54 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.6 TiB 959 MiB 15 GiB 1.7 TiB 76.73 1.03 115 up osd.54
>> >> >> >>>>> 55 hdd 7.32619 1.00000 7.3 TiB 6.1 TiB 6.1 TiB 699 MiB 16 GiB 1.2 TiB 83.56 1.12 122 up osd.55
>> >> >> >>>>> -10 51.28331 - 51 TiB 39 TiB 39 TiB 4.7 GiB 100 GiB 12 TiB 76.05 1.02 - host s3db9
>> >> >> >>>>> 56 hdd 7.32619 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 840 MiB 13 GiB 2.1 TiB 71.06 0.95 105 up osd.56
>> >> >> >>>>> 57 hdd 7.32619 1.00000 7.3 TiB 6.1 TiB 6.0 TiB 1.0 GiB 16 GiB 1.2 TiB 83.17 1.12 102 up osd.57
>> >> >> >>>>> 58 hdd 7.32619 1.00000 7.3 TiB 6.0 TiB 5.9 TiB 43 MiB 15 GiB 1.4 TiB 81.56 1.09 105 up osd.58
>> >> >> >>>>> 59 hdd 7.32619 1.00000 7.3 TiB 5.9 TiB 5.9 TiB 429 MiB 15 GiB 1.4 TiB 80.64 1.08 94 up osd.59
>> >> >> >>>>> 60 hdd 7.32619 1.00000 7.3 TiB 5.4 TiB 5.3 TiB 226 MiB 14 GiB 2.0 TiB 73.25 0.98 101 up osd.60
>> >> >> >>>>> 61 hdd 7.32619 1.00000 7.3 TiB 4.8 TiB 4.8 TiB 1.1 GiB 12 GiB 2.5 TiB 65.84 0.88 103 up osd.61
>> >> >> >>>>> 62 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.6 TiB 1.0 GiB 15 GiB 1.7 TiB 76.83 1.03 126 up osd.62
>> >> >> >>>>> TOTAL 674 TiB 501 TiB 473 TiB 96 GiB 1.2 TiB 173 TiB 74.57
>> >> >> >>>>> MIN/MAX VAR: 0.17/1.20 STDDEV: 10.25
>> >> >> >>>>>
>> >> >> >>>>>
>> >> >> >>>>>
>> >> >> >>>>> On Sat, Mar 13, 2021 at 3:57 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>> >> >> >>>>>>
>> >> >> >>>>>> No, increasing num PGs won't help substantially.
>> >> >> >>>>>>
>> >> >> >>>>>> Can you share the entire output of ceph osd df tree ?
>> >> >> >>>>>>
>> >> >> >>>>>> Did you already set
>> >> >> >>>>>>
>> >> >> >>>>>> ceph config set mgr mgr/balancer/upmap_max_deviation 1
>> >> >> >>>>>>
>> >> >> >>>>>>
>> >> >> >>>>>> ??
>> >> >> >>>>>> And I recommend debug_mgr 4/5 so you can see some basic upmap balancer logging.
>> >> >> >>>>>>
>> >> >> >>>>>> .. Dan
>> >> >> >>>>>>
>> >> >> >>>>>>
>> >> >> >>>>>>
>> >> >> >>>>>>
>> >> >> >>>>>>
>> >> >> >>>>>>
>> >> >> >>>>>> On Sat, Mar 13, 2021, 3:49 PM Boris Behrens <bb(a)kervyn.de> wrote:
>> >> >> >>>>>>>
>> >> >> >>>>>>> Hello people,
>> >> >> >>>>>>>
>> >> >> >>>>>>> I am still struggling with the balancer
>> >> >> >>>>>>> (https://www.mail-archive.com/ceph-users@ceph.io/msg09124.html)
>> >> >> >>>>>>> Now I've read some more and think that I might not have enough PGs.
>> >> >> >>>>>>> Currently I have 84 OSDs and 1024 PGs for the main pool (3008 total). I have the autoscaler enabled, but it doesn't tell me to increase the PGs.
>> >> >> >>>>>>>
>> >> >> >>>>>>> What do you think?
>> >> >> >>>>>>>
>> >> >> >>>>>>> --
>> >> >> >>>>>>> The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.
>> >> >> >>>>>>> _______________________________________________
>> >> >> >>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>> >> >> >>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>> >> >> >>>>>
>> >> >> >>>>>
>> >> >> >>>>>
>> >> >> >>>>> --
>> >> >> >>>>> The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.
>> >> >> >>>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> --
>> >> >> >>> The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.
>> >
>> >
>> >
>> > --
>> > The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.
>
>
>
> --
> The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.
--
The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.
--
The self-help group "UTF-8 problems" will meet in the large hall this time, as an exception.