I have a cluster running 15.2.1 that was originally running 14.x. The cluster is running the
balancer module in upmap mode (I have tried crush-compat in the past).
Most OSDs are at around the same % used, give or take 0.x. However, one OSD is a good
few % below the rest and a few are above average by 1 or 2 %. I have been trying
to get the balancer to fix this.
I have tried running a manual osdmaptool command on an export of my map, but it lists no
fixes, although it does show the underfull OSD in its output (overfull
3,4,5,6,7,8,9,10,11,12,13,14,15,18,19,20 underfull [36]).
The debug output is just lots of:
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 trying 2.55
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 2.55 [12,3,7,6,33,34,30,35,21,18] -> [12,3,7,6,33,34,30,35,21,16]
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 will try adding new remapping pair 18 -> 16 for 2.55 NOT selected osd
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 stddev 528.667 -> 528.667
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 Overfull search osd.7 target 170.667 deviation 9.33327
Is there anything I can do to move data from the overfull OSDs onto the underfull OSD and
balance out the last bit?
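
For anyone wading through similar output, here is a minimal sketch (not part of osdmaptool) that condenses the debug lines into the candidate remapping pairs, the before/after stddev values, and the overfull-search lines. The regular expressions are assumptions based only on the five sample lines quoted above, and `summarize` is a hypothetical helper name.

```python
import re

# Sample lines copied from the debug output quoted above.
LOG = """\
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 trying 2.55
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 2.55 [12,3,7,6,33,34,30,35,21,18] -> [12,3,7,6,33,34,30,35,21,16]
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 will try adding new remapping pair 18 -> 16 for 2.55 NOT selected osd
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 stddev 528.667 -> 528.667
2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 Overfull search osd.7 target 170.667 deviation 9.33327
"""

# Patterns inferred from the sample lines only (assumption, not a documented format).
PAIR_RE = re.compile(r"remapping pair (\d+) -> (\d+) for (\S+)")
STDDEV_RE = re.compile(r"stddev ([\d.]+) -> ([\d.]+)")
OVERFULL_RE = re.compile(r"Overfull search osd\.(\d+) target ([\d.]+) deviation ([\d.]+)")


def summarize(log):
    """Return (pairs, stddevs, overfull) extracted from debug text.

    pairs:    list of (pgid, from_osd, to_osd)
    stddevs:  list of (before, after)
    overfull: list of (osd, target, deviation)
    """
    pairs = [(m.group(3), int(m.group(1)), int(m.group(2)))
             for m in PAIR_RE.finditer(log)]
    stddevs = [(float(m.group(1)), float(m.group(2)))
               for m in STDDEV_RE.finditer(log)]
    overfull = [(int(m.group(1)), float(m.group(2)), float(m.group(3)))
                for m in OVERFULL_RE.finditer(log)]
    return pairs, stddevs, overfull


if __name__ == "__main__":
    pairs, stddevs, overfull = summarize(LOG)
    for pgid, src, dst in pairs:
        print(f"pg {pgid}: candidate remap osd.{src} -> osd.{dst}")
    for before, after in stddevs:
        print(f"stddev {before} -> {after} (changed: {before != after})")
    for osd, target, dev in overfull:
        print(f"osd.{osd}: target {target}, deviation {dev}")
```

For the sample, the only candidate is 18 -> 16 for PG 2.55, and the stddev is identical before and after (528.667), which matches the tool listing no changes.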
I have attached the crushmap in case anyone can see any issues there:
ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
26  hdd    0.00999  1.00000   10 GiB   1.1 GiB  143 MiB  40 MiB   984 MiB  8.9 GiB  11.40  0.16  33   up
27  hdd    0.00999  1.00000   10 GiB   1.2 GiB  164 MiB  29 MiB   995 MiB  8.8 GiB  11.61  0.16  32   up
28  hdd    0.00999  1.00000   10 GiB   1.1 GiB  149 MiB  28 MiB   996 MiB  8.9 GiB  11.46  0.16  31   up
39  hdd    0.00999  1.00000   10 GiB   1.1 GiB  152 MiB  31 MiB   993 MiB  8.8 GiB  11.49  0.16  33   up
40  hdd    0.00999  1.00000   10 GiB   1.1 GiB  142 MiB  33 MiB   991 MiB  8.9 GiB  11.39  0.16  31   up
41  hdd    0.00999  1.00000   10 GiB   1.2 GiB  162 MiB  28 MiB   996 MiB  8.8 GiB  11.58  0.16  32   up
3   hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  723 KiB  19 GiB   2.4 TiB  73.63  1.01  257  up
4   hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.6 TiB  5.4 MiB  19 GiB   2.4 TiB  73.23  1.00  257  up
5   hdd    9.09599  1.00000   9.1 TiB  6.9 TiB  6.9 TiB  601 KiB  20 GiB   2.2 TiB  76.22  1.04  265  up
6   hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  624 KiB  20 GiB   2.4 TiB  73.98  1.01  256  up
7   hdd    9.09599  1.00000   9.1 TiB  6.9 TiB  6.9 TiB  4.9 MiB  20 GiB   2.2 TiB  75.63  1.03  265  up
8   hdd    9.09599  1.00000   9.1 TiB  6.9 TiB  6.9 TiB  591 KiB  20 GiB   2.2 TiB  76.05  1.04  265  up
9   hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  2.7 MiB  19 GiB   2.4 TiB  73.69  1.01  257  up
10  hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  1.0 MiB  20 GiB   2.4 TiB  73.53  1.00  256  up
11  hdd    9.09599  1.00000   9.1 TiB  6.6 TiB  6.6 TiB  5.7 MiB  19 GiB   2.5 TiB  72.64  0.99  251  up
12  hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  1.6 MiB  20 GiB   2.4 TiB  73.95  1.01  257  up
13  hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  2.3 MiB  19 GiB   2.4 TiB  73.43  1.00  257  up
14  hdd    9.09599  1.00000   9.1 TiB  6.8 TiB  6.8 TiB  1.5 MiB  20 GiB   2.3 TiB  74.56  1.02  261  up
15  hdd    9.09599  1.00000   9.1 TiB  6.8 TiB  6.8 TiB  1.9 MiB  20 GiB   2.3 TiB  74.81  1.02  262  up
16  hdd    9.09599  1.00000   9.1 TiB  6.8 TiB  6.8 TiB  2.0 MiB  20 GiB   2.3 TiB  74.46  1.02  261  up
17  hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  761 KiB  19 GiB   2.4 TiB  73.43  1.00  256  up
18  hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  1.8 MiB  19 GiB   2.4 TiB  73.50  1.00  257  up
19  hdd    9.09599  1.00000   9.1 TiB  6.8 TiB  6.7 TiB  3.9 MiB  19 GiB   2.3 TiB  74.25  1.01  261  up
20  hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  950 KiB  19 GiB   2.4 TiB  73.70  1.01  257  up
21  hdd    9.09599  1.00000   9.1 TiB  6.8 TiB  6.7 TiB  2.4 MiB  20 GiB   2.3 TiB  74.42  1.02  260  up
22  hdd    9.09599  1.00000   9.1 TiB  6.7 TiB  6.7 TiB  840 KiB  20 GiB   2.4 TiB  73.59  1.01  256  up
29  hdd    9.09599  1.00000   9.1 TiB  6.5 TiB  6.4 TiB  289 KiB  19 GiB   2.6 TiB  71.03  0.97  249  up
30  hdd    9.09599  1.00000   9.1 TiB  6.5 TiB  6.5 TiB  2.1 MiB  19 GiB   2.6 TiB  71.85  0.98  253  up
31  hdd    9.09599  1.00000   9.1 TiB  6.5 TiB  6.5 TiB  1.2 MiB  19 GiB   2.6 TiB  71.69  0.98  251  up
32  hdd    9.09599  1.00000   9.1 TiB  6.6 TiB  6.6 TiB  26 KiB   19 GiB   2.5 TiB  72.71  0.99  255  up
33  hdd    9.09599  1.00000   9.1 TiB  6.5 TiB  6.5 TiB  737 KiB  19 GiB   2.6 TiB  71.88  0.98  252  up
34  hdd    9.09599  1.00000   9.1 TiB  6.6 TiB  6.6 TiB  823 KiB  19 GiB   2.5 TiB  72.24  0.99  253  up
35  hdd    9.09599  1.00000   9.1 TiB  6.4 TiB  6.4 TiB  1.1 MiB  18 GiB   2.7 TiB  70.86  0.97  248  up
36  hdd    9.09599  1.00000   9.1 TiB  6.1 TiB  6.1 TiB  1.5 MiB  18 GiB   3.0 TiB  67.01  0.92  236  up
37  hdd    9.09599  1.00000   9.1 TiB  6.6 TiB  6.6 TiB  1.7 MiB  19 GiB   2.5 TiB  72.82  0.99  256  up
38  hdd    9.09599  1.00000   9.1 TiB  6.5 TiB  6.5 TiB  2.5 MiB  19 GiB   2.6 TiB  71.95  0.98  253  up
0   hdd    0.00999  1.00000   10 GiB   1.2 GiB  161 MiB  29 MiB   995 MiB  8.8 GiB  11.58  0.16  32   up
1   hdd    0.00999  1.00000   10 GiB   1.2 GiB  154 MiB  35 MiB   989 MiB  8.8 GiB  11.51  0.16  33   up
2   hdd    0.00999  1.00000   10 GiB   1.1 GiB  141 MiB  29 MiB   995 MiB  8.9 GiB  11.38  0.16  31   up
TOTAL 273 TiB 200 TiB 199 TiB 336 MiB 588 GiB 73 TiB 73.21
Thanks
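
To put numbers on the imbalance, here is a rough sketch that takes the PGS column for the 30 equally weighted 9.1 TiB OSDs from the table above and computes each OSD's deviation from the mean PG count. Note the assumptions: this counts PGs across all pools, whereas the osdmaptool output in the original post appears to work against a per-pool target, and both the helper names and the cutoff of 5 PGs are hypothetical, chosen only for illustration.

```python
# PGS column for the 9.1 TiB OSDs, copied from the `ceph osd df` table above.
PGS = {
    3: 257, 4: 257, 5: 265, 6: 256, 7: 265, 8: 265, 9: 257, 10: 256,
    11: 251, 12: 257, 13: 257, 14: 261, 15: 262, 16: 261, 17: 256,
    18: 257, 19: 261, 20: 257, 21: 260, 22: 256, 29: 249, 30: 253,
    31: 251, 32: 255, 33: 252, 34: 253, 35: 248, 36: 236, 37: 256,
    38: 253,
}


def pg_deviation(pgs):
    """Return (mean PG count, {osd: pg_count - mean}).

    Only meaningful when all OSDs have the same CRUSH weight,
    which is the case for the OSDs listed in PGS.
    """
    mean = sum(pgs.values()) / len(pgs)
    return mean, {osd: count - mean for osd, count in pgs.items()}


def outliers(deviation, threshold):
    """OSD ids whose PG count is more than `threshold` PGs away from the
    mean, most underfull first. `threshold` is an illustrative cutoff."""
    picked = (osd for osd, d in deviation.items() if abs(d) > threshold)
    return sorted(picked, key=lambda osd: deviation[osd])


if __name__ == "__main__":
    mean, deviation = pg_deviation(PGS)
    print(f"mean PGs per OSD: {mean}")
    for osd in outliers(deviation, 5):
        print(f"osd.{osd}: {PGS[osd]} PGs ({deviation[osd]:+.0f})")
```

With these figures the mean works out to exactly 256 PGs per OSD. osd.36 sits 20 PGs below that, while osd.5, osd.7 and osd.8 sit 9 above, which lines up with the %USE spread in the table.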
---- On Tue, 05 May 2020 14:23:54 +0800 Ashley Merrick <singapore(a)amerrick.co.uk> wrote ----