Hi,
I have tested the PG-upmap offline optimization with one of my pools: ssd.
This pool is unbalanced; here's the output of ceph osd df tree before the
optimization:
root@ld3955:~# ceph osd df tree class ssd
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 1532.88501 - 27 TiB 27 GiB 4.4 GiB 816 KiB 23 GiB 27 TiB 0.10 1.00 - root default
-46 353.82300 - 1.1 TiB 3.7 GiB 702 MiB 144 KiB 3.0 GiB 1.1 TiB 0.33 3.35 - host ld4257
20 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 44 KiB 1024 MiB 370 GiB 0.33 3.35 25 up osd.20
21 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 20 KiB 1024 MiB 370 GiB 0.33 3.35 27 up osd.21
22 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 80 KiB 1024 MiB 370 GiB 0.33 3.35 26 up osd.22
-34 356.22299 - 4.6 TiB 4.9 GiB 936 MiB 228 KiB 4.0 GiB 4.6 TiB 0.10 1.06 - host ld4464
23 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 44 KiB 1024 MiB 370 GiB 0.33 3.35 29 up osd.23
24 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 40 KiB 1024 MiB 370 GiB 0.33 3.35 31 up osd.24
25 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 52 KiB 1024 MiB 370 GiB 0.33 3.35 31 up osd.25
26 ssd 3.48999 1.00000 3.5 TiB 1.2 GiB 234 MiB 92 KiB 1024 MiB 3.5 TiB 0.03 0.35 265 up osd.26
-37 356.22299 - 4.6 TiB 4.9 GiB 936 MiB 152 KiB 4.0 GiB 4.6 TiB 0.10 1.06 - host ld4465
27 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 20 KiB 1024 MiB 370 GiB 0.33 3.35 24 up osd.27
28 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 20 KiB 1024 MiB 370 GiB 0.33 3.35 28 up osd.28
29 ssd 0.37099 1.00000 371 GiB 1.2 GiB 234 MiB 20 KiB 1024 MiB 370 GiB 0.33 3.35 22 up osd.29
30 ssd 3.48999 1.00000 3.5 TiB 1.2 GiB 234 MiB 92 KiB 1024 MiB 3.5 TiB 0.03 0.35 258 up osd.30
-3 116.65399 - 4.2 TiB 3.5 GiB 491 MiB 76 KiB 3.0 GiB 4.2 TiB 0.08 0.82 - host ld5505
8 ssd 3.48999 1.00000 3.5 TiB 1.2 GiB 164 MiB 20 KiB 1024 MiB 3.5 TiB 0.03 0.33 288 up osd.8
9 ssd 0.37199 1.00000 372 GiB 1.2 GiB 164 MiB 24 KiB 1024 MiB 371 GiB 0.31 3.16 28 up osd.9
10 ssd 0.37199 1.00000 372 GiB 1.2 GiB 164 MiB 32 KiB 1024 MiB 371 GiB 0.31 3.16 31 up osd.10
-7 116.65399 - 4.2 TiB 3.5 GiB 491 MiB 72 KiB 3.0 GiB 4.2 TiB 0.08 0.82 - host ld5506
11 ssd 0.37199 1.00000 372 GiB 1.2 GiB 164 MiB 24 KiB 1024 MiB 371 GiB 0.31 3.16 36 up osd.11
12 ssd 3.48999 1.00000 3.5 TiB 1.2 GiB 164 MiB 32 KiB 1024 MiB 3.5 TiB 0.03 0.33 260 up osd.12
13 ssd 0.37199 1.00000 372 GiB 1.2 GiB 164 MiB 16 KiB 1024 MiB 371 GiB 0.31 3.16 28 up osd.13
-10 116.65399 - 4.2 TiB 3.5 GiB 491 MiB 80 KiB 3.0 GiB 4.2 TiB 0.08 0.82 - host ld5507
14 ssd 0.37199 1.00000 372 GiB 1.2 GiB 164 MiB 24 KiB 1024 MiB 371 GiB 0.31 3.16 24 up osd.14
15 ssd 0.37199 1.00000 372 GiB 1.2 GiB 164 MiB 32 KiB 1024 MiB 371 GiB 0.31 3.16 26 up osd.15
16 ssd 3.48999 1.00000 3.5 TiB 1.2 GiB 164 MiB 24 KiB 1024 MiB 3.5 TiB 0.03 0.33 259 up osd.16
-13 116.65399 - 4.2 TiB 3.5 GiB 490 MiB 64 KiB 3.0 GiB 4.2 TiB 0.08 0.82 - host ld5508
17 ssd 0.37199 1.00000 372 GiB 1.2 GiB 164 MiB 28 KiB 1024 MiB 371 GiB 0.31 3.16 19 up osd.17
18 ssd 0.37199 1.00000 372 GiB 1.2 GiB 163 MiB 8 KiB 1024 MiB 371 GiB 0.31 3.16 24 up osd.18
19 ssd 3.48999 1.00000 3.5 TiB 1.2 GiB 164 MiB 28 KiB 1024 MiB 3.5 TiB 0.03 0.33 259 up osd.19
TOTAL 27 TiB 27 GiB 4.4 GiB 823 KiB 23 GiB 27 TiB 0.10
MIN/MAX VAR: 0.33/3.35 STDDEV: 0.20
The output of osdmaptool implies many modifications affecting osd.11 and
osd.12; that is, the optimizer wants to shift PGs from osd.12 to osd.11.
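For context, an offline optimization like this is typically generated with
osdmaptool's --upmap mode; a sketch of the workflow (filenames here are
illustrative, not necessarily the ones used):

```shell
# Export the current osdmap and let osdmaptool compute upmap entries
# for the "ssd" pool only; out_ssd.txt then contains the
# "ceph osd pg-upmap-items ..." commands that are sourced below.
ceph osd getmap -o osdmap.bin
osdmaptool osdmap.bin --upmap out_ssd.txt --upmap-pool ssd
```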
root@ld3955:~# source out_ssd.txt
set 66.41 pg_upmap_items mapping to [12->13]
set 66.4e pg_upmap_items mapping to [22->20]
set 66.7c pg_upmap_items mapping to [28->29]
set 66.9f pg_upmap_items mapping to [12->11]
set 66.147 pg_upmap_items mapping to [12->11]
set 66.1b1 pg_upmap_items mapping to [12->11]
set 66.203 pg_upmap_items mapping to [12->11]
set 66.257 pg_upmap_items mapping to [28->30]
set 66.27d pg_upmap_items mapping to [28->30]
set 66.300 pg_upmap_items mapping to [12->11]
set 66.354 pg_upmap_items mapping to [28->29]
set 66.35b pg_upmap_items mapping to [28->30]
set 66.38a pg_upmap_items mapping to [12->11]
set 66.3d0 pg_upmap_items mapping to [28->30]
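To quantify this, the proposed mappings above can be tallied per source and
destination OSD (a small sketch; each "[a->b]" pair remaps one PG from osd.a
to osd.b):

```python
# Tally the proposed upmap moves per source and destination OSD,
# using the exact lines printed by sourcing out_ssd.txt.
import re
from collections import Counter

out_ssd = """
set 66.41 pg_upmap_items mapping to [12->13]
set 66.4e pg_upmap_items mapping to [22->20]
set 66.7c pg_upmap_items mapping to [28->29]
set 66.9f pg_upmap_items mapping to [12->11]
set 66.147 pg_upmap_items mapping to [12->11]
set 66.1b1 pg_upmap_items mapping to [12->11]
set 66.203 pg_upmap_items mapping to [12->11]
set 66.257 pg_upmap_items mapping to [28->30]
set 66.27d pg_upmap_items mapping to [28->30]
set 66.300 pg_upmap_items mapping to [12->11]
set 66.354 pg_upmap_items mapping to [28->29]
set 66.35b pg_upmap_items mapping to [28->30]
set 66.38a pg_upmap_items mapping to [12->11]
set 66.3d0 pg_upmap_items mapping to [28->30]
"""

moved_from, moved_to = Counter(), Counter()
for a, b in re.findall(r"\[(\d+)->(\d+)\]", out_ssd):
    moved_from[int(a)] += 1
    moved_to[int(b)] += 1

print(dict(moved_from))  # {12: 7, 22: 1, 28: 6}
print(dict(moved_to))    # {13: 1, 20: 1, 29: 2, 11: 6, 30: 4}
```

So osd.12 loses 7 PGs and osd.11 gains 6, which matches the before/after PGS
columns for those OSDs.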
However, this makes no sense, as osd.11 already has more PGs than the other
OSDs. In fact, the OSD with the fewest PGs is osd.17.
Why is the optimizer not shifting PGs to osd.17?
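For what it's worth, here is a rough check of how loaded those OSDs look
relative to their CRUSH weight, using the "before" numbers (a sketch; note
the caveat in the comment):

```python
# PGs per unit of CRUSH weight for selected OSDs, from the "before"
# table. Caveat: the PGS column of "ceph osd df tree" counts PGs from
# *all* pools, while the upmap optimizer balances one pool at a time,
# so its per-pool view can legitimately differ from these totals.
osds = {  # osd name: (CRUSH weight, total PGs) copied from the table
    "osd.8":  (3.48999, 288),
    "osd.11": (0.37199, 36),
    "osd.12": (3.48999, 260),
    "osd.17": (0.37199, 19),
}
ratios = {name: pgs / weight for name, (weight, pgs) in osds.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} PGs per weight unit")
```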
Here's the output of ceph osd df tree after the optimization:
root@ld3955:~# ceph osd df tree class ssd
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 1532.88501 - 27 TiB 27 GiB 3.2 GiB 816 KiB 23 GiB 27 TiB 0.10 1.00 - root default
-46 353.82300 - 1.1 TiB 3.6 GiB 502 MiB 144 KiB 3.0 GiB 1.1 TiB 0.33 3.40 - host ld4257
20 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 44 KiB 1024 MiB 370 GiB 0.33 3.49 25 up osd.20
21 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 20 KiB 1024 MiB 370 GiB 0.33 3.44 27 up osd.21
22 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 80 KiB 1024 MiB 370 GiB 0.31 3.28 26 up osd.22
-34 356.22299 - 4.6 TiB 4.8 GiB 668 MiB 228 KiB 4.0 GiB 4.6 TiB 0.10 1.07 - host ld4464
23 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 44 KiB 1024 MiB 370 GiB 0.33 3.44 29 up osd.23
24 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 40 KiB 1024 MiB 370 GiB 0.31 3.28 31 up osd.24
25 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 52 KiB 1024 MiB 370 GiB 0.33 3.49 31 up osd.25
26 ssd 3.48999 1.00000 3.5 TiB 1.2 GiB 167 MiB 92 KiB 1024 MiB 3.5 TiB 0.03 0.34 265 up osd.26
-37 356.22299 - 4.6 TiB 4.8 GiB 669 MiB 152 KiB 4.0 GiB 4.6 TiB 0.10 1.07 - host ld4465
27 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 20 KiB 1024 MiB 370 GiB 0.31 3.28 24 up osd.27
28 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 20 KiB 1024 MiB 370 GiB 0.33 3.49 28 up osd.28
29 ssd 0.37099 1.00000 371 GiB 1.2 GiB 167 MiB 20 KiB 1024 MiB 370 GiB 0.31 3.28 23 up osd.29
30 ssd 3.48999 1.00000 3.5 TiB 1.2 GiB 167 MiB 92 KiB 1024 MiB 3.5 TiB 0.03 0.36 257 up osd.30
-3 116.65399 - 4.2 TiB 3.3 GiB 350 MiB 76 KiB 3.0 GiB 4.2 TiB 0.08 0.81 - host ld5505
8 ssd 3.48999 1.00000 3.5 TiB 1.1 GiB 117 MiB 20 KiB 1024 MiB 3.5 TiB 0.03 0.33 288 up osd.8
9 ssd 0.37199 1.00000 372 GiB 1.1 GiB 117 MiB 24 KiB 1024 MiB 371 GiB 0.30 3.13 28 up osd.9
10 ssd 0.37199 1.00000 372 GiB 1.1 GiB 117 MiB 32 KiB 1024 MiB 371 GiB 0.30 3.13 31 up osd.10
-7 116.65399 - 4.2 TiB 3.3 GiB 350 MiB 72 KiB 3.0 GiB 4.2 TiB 0.08 0.81 - host ld5506
11 ssd 0.37199 1.00000 372 GiB 1.1 GiB 117 MiB 24 KiB 1024 MiB 371 GiB 0.30 3.13 41 up osd.11
12 ssd 3.48999 1.00000 3.5 TiB 1.1 GiB 117 MiB 32 KiB 1024 MiB 3.5 TiB 0.03 0.33 254 up osd.12
13 ssd 0.37199 1.00000 372 GiB 1.1 GiB 117 MiB 16 KiB 1024 MiB 371 GiB 0.30 3.13 29 up osd.13
-10 116.65399 - 4.2 TiB 3.3 GiB 350 MiB 80 KiB 3.0 GiB 4.2 TiB 0.08 0.81 - host ld5507
14 ssd 0.37199 1.00000 372 GiB 1.1 GiB 117 MiB 24 KiB 1024 MiB 371 GiB 0.30 3.13 24 up osd.14
15 ssd 0.37199 1.00000 372 GiB 1.1 GiB 117 MiB 32 KiB 1024 MiB 371 GiB 0.30 3.13 26 up osd.15
16 ssd 3.48999 1.00000 3.5 TiB 1.1 GiB 117 MiB 24 KiB 1024 MiB 3.5 TiB 0.03 0.33 259 up osd.16
-13 116.65399 - 4.2 TiB 3.3 GiB 350 MiB 64 KiB 3.0 GiB 4.2 TiB 0.08 0.81 - host ld5508
17 ssd 0.37199 1.00000 372 GiB 1.1 GiB 117 MiB 28 KiB 1024 MiB 371 GiB 0.30 3.13 19 up osd.17
18 ssd 0.37199 1.00000 372 GiB 1.1 GiB 117 MiB 8 KiB 1024 MiB 371 GiB 0.30 3.13 24 up osd.18
19 ssd 3.48999 1.00000 3.5 TiB 1.1 GiB 117 MiB 28 KiB 1024 MiB 3.5 TiB 0.03 0.33 259 up osd.19
TOTAL 27 TiB 27 GiB 3.2 GiB 823 KiB 23 GiB 27 TiB 0.10
MIN/MAX VAR: 0.33/3.49 STDDEV: 0.19
Regards
Thomas