I mistakenly set target_size_ratio on several pools that share the same raw capacity. After I adjusted it, a large number of PGs went into the backfill state, but OSD utilization keeps growing. How should I adjust things from here?
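For clarity, target_size_ratio is set per pool with "ceph osd pool set"; the ratios now visible in the autoscale-status output below correspond to commands of roughly this shape (shown only to make the change explicit, with values taken from the output):

    # per-pool target ratios on the hdd root, as they stand now
    ceph osd pool set deeproute-replica-hdd-pool target_size_ratio 0.1
    ceph osd pool set os-dsglczutvqsgowpz.rgw.buckets.data target_size_ratio 80
    ceph osd pool set cephfs-replicated-pool target_size_ratio 20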
[root@node01 smd]# ceph osd pool autoscale-status
POOL                                      SIZE    TARGET SIZE  RATE                RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
device_health_metrics                     291.9M               3.0                 48289G        0.0000                                 1.0   1                   on         False
deeproute-replica-hdd-pool                12613M               3.0                 785.8T        0.0010  0.1000        0.0010           1.0   32                  on         False
deeproute-replica-ssd-pool                12352G               3.0                 48289G        0.9901  50.0000       0.9901           1.0   1024                on         False
.rgw.root                                 5831                 3.0                 48289G        0.0099  0.5000        0.0099           1.0   8                   on         False
default.rgw.log                           182                  3.0                 48289G        0.0000                                 1.0   32                  on         False
default.rgw.control                       0                    3.0                 48289G        0.0000                                 1.0   32                  on         False
default.rgw.meta                          0                    3.0                 48289G        0.0000                                 4.0   8                   on         False
os-dsglczutvqsgowpz.rgw.control           0                    3.0                 16096G        0.1667  0.5000        0.1667           1.0   64                  on         False
os-dsglczutvqsgowpz.rgw.meta              104.2k               3.0                 16096G        0.1667  0.5000        0.1667           1.0   64                  on         False
os-dsglczutvqsgowpz.rgw.buckets.index     57769M               3.0                 16096G        0.1667  0.5000        0.1667           1.0   32                  on         False
os-dsglczutvqsgowpz.rgw.buckets.non-ec    496.5M               3.0                 16096G        0.1667  0.5000        0.1667           1.0   32                  on         False
os-dsglczutvqsgowpz.rgw.log               309.0M               3.0                 16096G        0.1667  0.5000        0.1667           1.0   32                  on         False
os-dsglczutvqsgowpz.rgw.buckets.data      147.7T               1.3333333730697632  785.8T        0.7992  80.0000       0.7992           1.0   1024                on         False
cephfs-metadata                           3231M                3.0                 16096G        0.0006                                 4.0   32                  on         False
cephfs-replicated-pool                    23137G               3.0                 785.8T        0.1998  20.0000       0.1998           1.0   128                 on         False
.nfs                                      100.1k               3.0                 48289G        0.0000                                 1.0   32                  on         False
os-dsglczutvqsgowpz.rgw.otp               0                    3.0                 16096G        0.1667  0.5000        0.1667           1.0   8                   on         False
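If I understand the autoscaler correctly, the EFFECTIVE RATIO column is just each pool's TARGET RATIO normalized over all pools that share the same CRUSH root, which matches the numbers above:

    hdd root (785.8T):  0.1 + 80 + 20 = 100.1
        deeproute-replica-hdd-pool              0.1 / 100.1  ~ 0.0010
        os-dsglczutvqsgowpz.rgw.buckets.data     80 / 100.1  ~ 0.7992
        cephfs-replicated-pool                   20 / 100.1  ~ 0.1998
    ssd root (48289G):  50 + 0.5 = 50.5
        deeproute-replica-ssd-pool               50 / 50.5   ~ 0.9901
        .rgw.root                               0.5 / 50.5   ~ 0.0099

So the hdd-backed pools are now being told to expect roughly 0.1%, 80%, and 20% of the hdd root, respectively.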
[root@node01 pg]# ceph osd df | egrep -i "name|hdd"
  1  hdd  10.91399  1.00000  11 TiB  5.3 TiB  5.2 TiB  17 KiB   55 GiB  5.6 TiB  48.26  1.33  195  up
 22  hdd  10.91399  1.00000  11 TiB  8.6 TiB  8.6 TiB   8 KiB   77 GiB  2.3 TiB  79.11  2.17  212  up
 31  hdd  10.91399  1.00000  11 TiB  3.1 TiB  3.0 TiB  12 KiB   29 GiB  7.8 TiB  28.11  0.77  197  up
 51  hdd  10.91399  1.00000  11 TiB  3.9 TiB  3.8 TiB  10 KiB   38 GiB  7.1 TiB  35.28  0.97  186  up
 60  hdd  10.91399  1.00000  11 TiB  927 GiB  916 GiB  14 KiB   11 GiB   10 TiB   8.29  0.23  167  up
 70  hdd  10.91399  1.00000  11 TiB  2.3 TiB  2.2 TiB  13 KiB  5.2 GiB  8.7 TiB  20.63  0.57  177  up
 78  hdd  10.91399  1.00000  11 TiB  3.2 TiB  3.2 TiB  17 KiB   31 GiB  7.7 TiB  29.56  0.81  185  up
 96  hdd  10.91399  1.00000  11 TiB  3.9 TiB  3.8 TiB  11 KiB   38 GiB  7.1 TiB  35.31  0.97  195  up
  9  hdd  10.91399  1.00000  11 TiB  3.0 TiB  2.9 TiB  17 KiB   14 GiB  8.0 TiB  27.11  0.75  183  up
 19  hdd  10.91399  1.00000  11 TiB  4.0 TiB  4.0 TiB  11 KiB   47 GiB  6.9 TiB  36.66  1.01  192  up
 29  hdd  10.91399  1.00000  11 TiB  5.3 TiB  5.2 TiB  14 KiB   40 GiB  5.6 TiB  48.40  1.33  202  up
 47  hdd  10.91399  1.00000  11 TiB  736 GiB  734 GiB  10 KiB  2.1 GiB   10 TiB   6.59  0.18  172  up
 56  hdd  10.91399  1.00000  11 TiB  3.9 TiB  3.8 TiB  10 KiB   38 GiB  7.0 TiB  35.56  0.98  184  up
 65  hdd  10.91399  1.00000  11 TiB  6.1 TiB  6.1 TiB   9 KiB   57 GiB  4.8 TiB  56.22  1.55  214  up
 88  hdd  10.91399  1.00000  11 TiB  4.6 TiB  4.6 TiB  13 KiB   47 GiB  6.3 TiB  42.50  1.17  194  up
 98  hdd  10.91399  1.00000  11 TiB  7.6 TiB  7.5 TiB  14 KiB   60 GiB  3.3 TiB  69.31  1.90  210  up
  2  hdd  10.91399  1.00000  11 TiB  1.6 TiB  1.6 TiB  16 KiB   19 GiB  9.3 TiB  14.38  0.40  182  up
 30  hdd  10.91399  1.00000  11 TiB  4.6 TiB  4.5 TiB  13 KiB   39 GiB  6.3 TiB  41.87  1.15  204  up
 40  hdd  10.91399  1.00000  11 TiB  1.7 TiB  1.7 TiB  10 KiB   13 GiB  9.2 TiB  15.43  0.42  168  up
 48  hdd  10.91399  1.00000  11 TiB  1.5 TiB  1.5 TiB  14 KiB   11 GiB  9.4 TiB  13.78  0.38  168  up
 58  hdd  10.91399  1.00000  11 TiB  7.7 TiB  7.7 TiB  15 KiB   75 GiB  3.2 TiB  70.85  1.95  216  up
 74  hdd  10.91399  1.00000  11 TiB  6.2 TiB  6.2 TiB  13 KiB   65 GiB  4.7 TiB  57.22  1.57  215  up
 83  hdd  10.91399  1.00000  11 TiB  2.4 TiB  2.4 TiB  18 KiB   21 GiB  8.5 TiB  21.81  0.60  183  up
 92  hdd  10.91399  1.00000  11 TiB  3.1 TiB  3.0 TiB  15 KiB   29 GiB  7.8 TiB  28.13  0.77  183  up
 12  hdd  10.91399  1.00000  11 TiB  5.3 TiB  5.2 TiB   9 KiB   40 GiB  5.6 TiB  48.40  1.33  203  up
 23  hdd  10.91399  1.00000  11 TiB  4.7 TiB  4.7 TiB  19 KiB   41 GiB  6.2 TiB  43.50  1.20  188  up
 32  hdd  10.91399  1.00000  11 TiB  2.4 TiB  2.3 TiB   6 KiB   21 GiB  8.6 TiB  21.66  0.60  186  up
 43  hdd  10.91399  1.00000  11 TiB  2.3 TiB  2.2 TiB  12 KiB  5.0 GiB  8.7 TiB  20.62  0.57  177  up
 52  hdd  10.91399  1.00000  11 TiB  3.8 TiB  3.8 TiB  14 KiB   30 GiB  7.1 TiB  34.67  0.95  187  up
 71  hdd  10.91399  1.00000  11 TiB  3.1 TiB  3.1 TiB  14 KiB   29 GiB  7.8 TiB  28.35  0.78  174  up
 95  hdd  10.91399  1.00000  11 TiB  5.5 TiB  5.4 TiB  10 KiB   63 GiB  5.4 TiB  50.29  1.38  197  up
104  hdd  10.91399  1.00000  11 TiB  3.8 TiB  3.8 TiB  11 KiB   38 GiB  7.1 TiB  34.93  0.96  180  up
  3  hdd  10.91399  1.00000  11 TiB  2.4 TiB  2.3 TiB  12 KiB   21 GiB  8.6 TiB  21.65  0.60  179  up
 24  hdd  10.91399  1.00000  11 TiB  2.3 TiB  2.3 TiB  10 KiB   20 GiB  8.6 TiB  21.16  0.58  170  up
 33  hdd  10.91399  1.00000  11 TiB  7.0 TiB  6.9 TiB  13 KiB   73 GiB  3.9 TiB  64.03  1.76  214  up
 45  hdd  10.91399  1.00000  11 TiB  3.2 TiB  3.1 TiB  13 KiB   23 GiB  7.8 TiB  28.92  0.79  188  up
 69  hdd  10.91399  1.00000  11 TiB  3.0 TiB  3.0 TiB  16 KiB   21 GiB  7.9 TiB  27.48  0.76  180  up
 79  hdd  10.91399  1.00000  11 TiB  6.2 TiB  6.1 TiB  23 KiB   57 GiB  4.8 TiB  56.44  1.55  206  up
 89  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB  16 KiB   12 GiB  8.7 TiB  20.30  0.56  180  up
 99  hdd  10.91399  1.00000  11 TiB  4.6 TiB  4.6 TiB  15 KiB   47 GiB  6.3 TiB  42.46  1.17  194  up
  5  hdd  10.91399  1.00000  11 TiB  2.3 TiB  2.3 TiB  18 KiB   20 GiB  8.6 TiB  20.91  0.57  183  up
 15  hdd  10.91399  1.00000  11 TiB  7.6 TiB  7.5 TiB   8 KiB   59 GiB  3.3 TiB  69.31  1.90  215  up
 25  hdd  10.91399  1.00000  11 TiB  1.6 TiB  1.6 TiB  11 KiB   12 GiB  9.3 TiB  14.73  0.40  176  up
 44  hdd  10.91399  1.00000  11 TiB  3.1 TiB  3.1 TiB   9 KiB   36 GiB  7.8 TiB  28.74  0.79  189  up
 63  hdd  10.91399  1.00000  11 TiB  3.0 TiB  3.0 TiB   9 KiB   21 GiB  7.9 TiB  27.49  0.76  189  up
 72  hdd  10.91399  1.00000  11 TiB  4.0 TiB  4.0 TiB  14 KiB   40 GiB  6.9 TiB  36.71  1.01  195  up
 81  hdd  10.91399  1.00000  11 TiB  6.9 TiB  6.9 TiB  14 KiB   66 GiB  4.0 TiB  63.43  1.74  209  up
 91  hdd  10.91399  1.00000  11 TiB  2.3 TiB  2.3 TiB  15 KiB   20 GiB  8.6 TiB  20.91  0.57  179  up
  7  hdd  10.91399  1.00000  11 TiB  4.7 TiB  4.7 TiB  16 KiB   55 GiB  6.2 TiB  43.10  1.18  198  up
 37  hdd  10.91399  1.00000  11 TiB  5.4 TiB  5.4 TiB  14 KiB   56 GiB  5.5 TiB  49.68  1.37  212  up
 46  hdd  10.91399  1.00000  11 TiB  832 GiB  822 GiB  15 KiB   10 GiB   10 TiB   7.45  0.20  175  up
 55  hdd  10.91399  1.00000  11 TiB  6.1 TiB  6.1 TiB  20 KiB   56 GiB  4.8 TiB  56.20  1.54  198  up
 64  hdd  10.91399  1.00000  11 TiB  3.8 TiB  3.8 TiB  25 KiB   30 GiB  7.1 TiB  34.63  0.95  190  up
 84  hdd  10.91399  1.00000  11 TiB  830 GiB  820 GiB  11 KiB   10 GiB   10 TiB   7.43  0.20  177  up
 94  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB   9 KiB   12 GiB  8.7 TiB  20.33  0.56  176  up
102  hdd  10.91399  1.00000  11 TiB  3.2 TiB  3.2 TiB  11 KiB   31 GiB  7.7 TiB  29.52  0.81  173  up
  6  hdd  10.91399  1.00000  11 TiB  4.6 TiB  4.6 TiB  21 KiB   47 GiB  6.3 TiB  42.12  1.16  200  up
 18  hdd  10.91399  1.00000  11 TiB  6.2 TiB  6.1 TiB  11 KiB   64 GiB  4.7 TiB  56.83  1.56  210  up
 41  hdd  10.91399  1.00000  11 TiB  3.1 TiB  3.0 TiB  20 KiB   28 GiB  7.8 TiB  28.09  0.77  188  up
 49  hdd  10.91399  1.00000  11 TiB  904 GiB  893 GiB  21 KiB   11 GiB   10 TiB   8.09  0.22  182  up
 59  hdd  10.91399  1.00000  11 TiB  2.4 TiB  2.4 TiB   8 KiB   21 GiB  8.5 TiB  22.33  0.61  188  up
 68  hdd  10.91399  1.00000  11 TiB  3.9 TiB  3.8 TiB  17 KiB   38 GiB  7.1 TiB  35.31  0.97  190  up
 97  hdd  10.91399  1.00000  11 TiB  3.1 TiB  3.0 TiB  17 KiB   29 GiB  7.8 TiB  28.12  0.77  195  up
105  hdd  10.91399  1.00000  11 TiB  4.0 TiB  3.9 TiB  13 KiB   39 GiB  7.0 TiB  36.25  1.00  193  up
  8  hdd  10.91399  1.00000  11 TiB  6.9 TiB  6.8 TiB  16 KiB   58 GiB  4.1 TiB  62.80  1.73  209  up
 17  hdd  10.91399  1.00000  11 TiB  4.7 TiB  4.7 TiB  13 KiB   54 GiB  6.2 TiB  43.10  1.18  196  up
 27  hdd  10.91399  1.00000  11 TiB  803 GiB  793 GiB  10 KiB  9.8 GiB   10 TiB   7.19  0.20  178  up
 38  hdd  10.91399  1.00000  11 TiB  2.3 TiB  2.3 TiB  14 KiB   20 GiB  8.6 TiB  20.90  0.57  177  up
 57  hdd  10.91399  1.00000  11 TiB  2.4 TiB  2.4 TiB  13 KiB   28 GiB  8.5 TiB  22.05  0.61  181  up
 67  hdd  10.91399  1.00000  11 TiB  4.8 TiB  4.7 TiB   9 KiB   48 GiB  6.1 TiB  43.90  1.21  195  up
 93  hdd  10.91399  1.00000  11 TiB  3.2 TiB  3.2 TiB   8 KiB   37 GiB  7.7 TiB  29.45  0.81  186  up
103  hdd  10.91399  1.00000  11 TiB  4.4 TiB  4.4 TiB  15 KiB   23 GiB  6.5 TiB  40.62  1.12  191  up
[root@node01 smd]# ceph -s
  cluster:
    id:     f7f1c8ba-f793-436b-bb73-0964108a30c1
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 8w)
    mgr: a(active, since 3w), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 108 osds: 108 up (since 19h), 108 in (since 4M); 104 remapped pgs
    rgw: 9 daemons active (9 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   17 pools, 5787 pgs
    objects: 160.99M objects, 181 TiB
    usage:   304 TiB used, 545 TiB / 849 TiB avail
    pgs:     119776974/1218292354 objects misplaced (9.832%)
             5683 active+clean
             102  active+remapped+backfill_wait
             2    active+remapped+backfilling

  io:
    client:   9.4 MiB/s rd, 87 MiB/s wr, 1.79k op/s rd, 2.34k op/s wr
    recovery: 11 MiB/s, 7 objects/s

  progress:
    Global Recovery Event (0s)
      [............................]
Liang Zheng <zhengliang0901(a)gmail.com> wrote on Thu, Mar 16, 2023 at 17:04:
Hi all,
I have a 9-node cluster running *Pacific 16.2.10*. OSDs live on all 9 nodes, each node having 4 x 1.8T SSDs and 8 x 10.9T HDDs, for a total of 108 OSDs. We created three CRUSH roots, as follows (a rough sketch of the corresponding rule commands follows this list):
1. The HDDs of all nodes (8 x 9 = 72) form one large CRUSH root used for the data pools; object storage and CephFS share this root.
2. Three of the four SSDs on each node form a second root, used for RBD block storage.
3. The remaining SSD on each node forms a third root, used for the CephFS metadata pool and the object storage index pools.
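For reference, a replicated CRUSH rule tied to one of these roots and to a device class looks roughly like the sketch below (rule name here is only an example, not necessarily what exists in this cluster):

    # sketch: replicated rule rooted at the hdd root, host failure domain, hdd device class
    ceph osd crush rule create-replicated hdd-data-rule root-6041a4dc-7c9a-44ed-999c-a847cca81012 host hdd
    # bind a pool to that rule
    ceph osd pool set cephfs-replicated-pool crush_rule hdd-data-rule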
[root@node01 smd]# ceph osd tree
ID    CLASS  WEIGHT     TYPE NAME                                              STATUS  REWEIGHT  PRI-AFF
 -92         15.71910   root root-1c31624a-ad18-445e-8e42-86b71c1fd76f
-112          1.74657     host node01-fa2cdf3e-7212-4b5f-b62a-3ab1e803547f
  13    ssd   1.74657       osd.13      up   1.00000  1.00000
-103          1.74657     host node02-4e232f27-fe4b-4d0e-bd2a-67d5006a0cdd
  34    ssd   1.74657       osd.34      up   1.00000  1.00000
-109          1.74657     host node03-3ae63d7a-9f65-4bea-b2ba-ff3fe342753d
  28    ssd   1.74657       osd.28      up   1.00000  1.00000
-118          1.74657     host node04-37a3f92a-f6d8-41f9-a774-3069fc2f50b8
  54    ssd   1.74657       osd.54      up   1.00000  1.00000
-106          1.74657     host node05-f667fa27-cc13-4b93-ad56-5dc4c31ffd77
  53    ssd   1.74657       osd.53      up   1.00000  1.00000
 -91          1.74657     host node06-3808c8f6-8e10-47c7-8456-62c1e0e800ed
  61    ssd   1.74657       osd.61      up   1.00000  1.00000
 -97          1.74657     host node07-78216b0d-0999-44e8-905d-8737a5f6f51f
  50    ssd   1.74657       osd.50      up   1.00000  1.00000
-115          1.74657     host node08-947bd556-fb06-497d-8f2c-c4a679d2b06f
  86    ssd   1.74657       osd.86      up   1.00000  1.00000
-100          1.74657     host node09-d9ae9046-0716-454f-ba0c-b03cf9986ba8
  85    ssd   1.74657       osd.85      up   1.00000  1.00000
 -38        785.80701   root root-6041a4dc-7c9a-44ed-999c-a847cca81012
 -85         87.31189     host node01-e5646053-2cf8-4ba5-90d5-bb1a63b1234c
   1    hdd  10.91399       osd.1       up   1.00000  1.00000
  22    hdd  10.91399       osd.22      up   0.90002  1.00000
  31    hdd  10.91399       osd.31      up   1.00000  1.00000
  51    hdd  10.91399       osd.51      up   1.00000  1.00000
  60    hdd  10.91399       osd.60      up   1.00000  1.00000
  70    hdd  10.91399       osd.70      up   1.00000  1.00000
  78    hdd  10.91399       osd.78      up   1.00000  1.00000
  96    hdd  10.91399       osd.96      up   1.00000  1.00000
 -37         87.31189     host node02-be9925fd-60de-4147-81eb-720d7145715f
   9    hdd  10.91399       osd.9       up   1.00000  1.00000
  19    hdd  10.91399       osd.19      up   1.00000  1.00000
  29    hdd  10.91399       osd.29      up   1.00000  1.00000
  47    hdd  10.91399       osd.47      up   1.00000  1.00000
  56    hdd  10.91399       osd.56      up   1.00000  1.00000
  65    hdd  10.91399       osd.65      up   1.00000  1.00000
  88    hdd  10.91399       osd.88      up   1.00000  1.00000
  98    hdd  10.91399       osd.98      up   1.00000  1.00000
 -52         87.31189     host node03-7828653d-6033-4e88-92b0-d8709b0ab218
   2    hdd  10.91399       osd.2       up   1.00000  1.00000
  30    hdd  10.91399       osd.30      up   1.00000  1.00000
  40    hdd  10.91399       osd.40      up   1.00000  1.00000
  48    hdd  10.91399       osd.48      up   1.00000  1.00000
  58    hdd  10.91399       osd.58      up   1.00000  1.00000
  74    hdd  10.91399       osd.74      up   1.00000  1.00000
  83    hdd  10.91399       osd.83      up   1.00000  1.00000
  92    hdd  10.91399       osd.92      up   1.00000  1.00000
 -46         87.31189     host node04-e986c3fc-a21b-44ff-9b02-b60b82ee63d7
  12    hdd  10.91399       osd.12      up   1.00000  1.00000
  23    hdd  10.91399       osd.23      up   1.00000  1.00000
  32    hdd  10.91399       osd.32      up   1.00000  1.00000
  43    hdd  10.91399       osd.43      up   1.00000  1.00000
  52    hdd  10.91399       osd.52      up   1.00000  1.00000
  71    hdd  10.91399       osd.71      up   1.00000  1.00000
  95    hdd  10.91399       osd.95      up   1.00000  1.00000
 104    hdd  10.91399       osd.104     up   1.00000  1.00000
 -88         87.31189     host node05-fe31d85f-b3b9-4393-b24b-030dbcdfacea
   3    hdd  10.91399       osd.3       up   1.00000  1.00000
  24    hdd  10.91399       osd.24      up   1.00000  1.00000
  33    hdd  10.91399       osd.33      up   1.00000  1.00000
  45    hdd  10.91399       osd.45      up   1.00000  1.00000
  69    hdd  10.91399       osd.69      up   1.00000  1.00000
  79    hdd  10.91399       osd.79      up   1.00000  1.00000
  89    hdd  10.91399       osd.89      up   1.00000  1.00000
  99    hdd  10.91399       osd.99      up   1.00000  1.00000
 -55         87.31189     host node06-6f16ba4b-0082-472a-b243-b1a058070918
   5    hdd  10.91399       osd.5       up   1.00000  1.00000
  15    hdd  10.91399       osd.15      up   1.00000  1.00000
  25    hdd  10.91399       osd.25      up   1.00000  1.00000
  44    hdd  10.91399       osd.44      up   1.00000  1.00000
  63    hdd  10.91399       osd.63      up   1.00000  1.00000
  72    hdd  10.91399       osd.72      up   1.00000  1.00000
  81    hdd  10.91399       osd.81      up   1.00000  1.00000
  91    hdd  10.91399       osd.91      up   1.00000  1.00000
 -43         87.31189     host node07-5dee846a-2814-4e04-bcfd-ff689d49795c
   7    hdd  10.91399       osd.7       up   1.00000  1.00000
  37    hdd  10.91399       osd.37      up   1.00000  1.00000
  46    hdd  10.91399       osd.46      up   1.00000  1.00000
  55    hdd  10.91399       osd.55      up   1.00000  1.00000
  64    hdd  10.91399       osd.64      up   1.00000  1.00000
  84    hdd  10.91399       osd.84      up   1.00000  1.00000
  94    hdd  10.91399       osd.94      up   1.00000  1.00000
 102    hdd  10.91399       osd.102     up   1.00000  1.00000
 -58         87.31189     host node08-2d6b7ab3-2067-4e94-b77c-24d6e626e396
   6    hdd  10.91399       osd.6       up   1.00000  1.00000
  18    hdd  10.91399       osd.18      up   1.00000  1.00000
  41    hdd  10.91399       osd.41      up   1.00000  1.00000
  49    hdd  10.91399       osd.49      up   1.00000  1.00000
  59    hdd  10.91399       osd.59      up   1.00000  1.00000
  68    hdd  10.91399       osd.68      up   1.00000  1.00000
  97    hdd  10.91399       osd.97      up   1.00000  1.00000
 105    hdd  10.91399       osd.105     up   1.00000  1.00000
 -49         87.31189     host node09-e6bf0642-e3b4-48f0-9d0e-3d87ceacced8
   8    hdd  10.91399       osd.8       up   1.00000  1.00000
  17    hdd  10.91399       osd.17      up   1.00000  1.00000
  27    hdd  10.91399       osd.27      up   1.00000  1.00000
  38    hdd  10.91399       osd.38      up   1.00000  1.00000
  57    hdd  10.91399       osd.57      up   1.00000  1.00000
  67    hdd  10.91399       osd.67      up   1.00000  1.00000
  93    hdd  10.91399       osd.93      up   1.00000  1.00000
 103    hdd  10.91399       osd.103     up   1.00000  1.00000
  -8         47.16115   root root-ea7a1878-722e-49d5-8a91-c618a6aefe29
 -13          5.24013     host node01-4c465825-6bd1-42a5-b087-51a82cb2865c
   0    ssd   1.74657       osd.0       up   0.95001  1.00000
  11    ssd   1.74699       osd.11      up   1.00000  1.00000
  16    ssd   1.74657       osd.16      up   1.00000  1.00000
 -22          5.24013     host node02-0e3e418d-9129-44ee-8453-53680171270e
   4    ssd   1.74657       osd.4       up   1.00000  1.00000
  14    ssd   1.74657       osd.14      up   0.95001  1.00000
  82    ssd   1.74699       osd.82      up   1.00000  1.00000
 -28          5.24013     host node03-274397b4-8559-4a17-a9eb-1c16920ad432
  10    ssd   1.74657       osd.10      up   1.00000  1.00000
  76    ssd   1.74657       osd.76      up   1.00000  1.00000
 100    ssd   1.74699       osd.100     up   0.95001  1.00000
 -31          5.24013     host node04-df05c3cc-08ef-426e-85b8-088cb8c1b4e2
  20    ssd   1.74657       osd.20      up   1.00000  1.00000
  39    ssd   1.74657       osd.39      up   0.95001  1.00000
  90    ssd   1.74699       osd.90      up   0.95001  1.00000
 -25          5.24013     host node05-d07e7e92-8290-49ba-b09c-ad34bddb1eae
  26    ssd   1.74657       osd.26      up   0.95001  1.00000
  62    ssd   1.74699       osd.62      up   1.00000  1.00000
  73    ssd   1.74657       osd.73      up   1.00000  1.00000
 -16          5.24013     host node06-a7a40371-a128-4ab1-90a2-62c99e040036
  42    ssd   1.74657       osd.42      up   0.95001  1.00000
  75    ssd   1.74657       osd.75      up   0.95001  1.00000
 107    ssd   1.74699       osd.107     up   1.00000  1.00000
  -7          5.24013     host node07-971dee6e-dec4-4c0a-86d1-54d0b23832bd
  21    ssd   1.74699       osd.21      up   1.00000  1.00000
  66    ssd   1.74657       osd.66      up   0.95001  1.00000
  77    ssd   1.74657       osd.77      up   1.00000  1.00000
 -19          5.24013     host node08-9a47282b-9530-4cc7-9e29-7f6c0b2f5184
  35    ssd   1.74699       osd.35      up   1.00000  1.00000
  80    ssd   1.74657       osd.80      up   1.00000  1.00000
 101    ssd   1.74657       osd.101     up   1.00000  1.00000
 -34          5.24013     host node09-f09a8013-2426-4ced-b9c7-02c06ca9d6fc
  36    ssd   1.74699       osd.36      up   0.95001  1.00000
  87    ssd   1.74657       osd.87      up   1.00000  1.00000
 106    ssd   1.74657       osd.106     up   1.00000  1.00000
After Ceph had been running normally for a while, I found that data is distributed very unevenly across the OSDs, even though the PG autoscaler is turned on in my environment. My HDDs are identical, but their utilization ranges from 6.14% to 74.35%.
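Those min/max numbers come from the %USE column of "ceph osd df"; assuming the plain-text column layout shown below, a quick way to pull them out is:

    # %USE is the 17th whitespace-separated field of an hdd row in plain "ceph osd df" output
    ceph osd df | awk '$2 == "hdd" {print $17}' | sort -n | sed -n '1p;$p'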
root@node09:/# ceph osd df | grep hdd
ID   CLASS  WEIGHT    REWEIGHT  SIZE    RAW USE  DATA     OMAP    META     AVAIL    %USE   VAR   PGS  STATUS
  1  hdd  10.91399  1.00000  11 TiB  5.3 TiB  5.2 TiB  15 KiB   53 GiB  5.7 TiB  48.15  1.38  174  up
 22  hdd  10.91399  0.90002  11 TiB  8.1 TiB  8.0 TiB  12 KiB   71 GiB  2.8 TiB  74.35  2.13  160  up
 31  hdd  10.91399  1.00000  11 TiB  3.0 TiB  2.9 TiB  13 KiB   27 GiB  8.0 TiB  27.14  0.78  179  up
 51  hdd  10.91399  1.00000  11 TiB  3.7 TiB  3.7 TiB   9 KiB   36 GiB  7.2 TiB  34.12  0.98  173  up
 60  hdd  10.91399  1.00000  11 TiB  973 GiB  962 GiB  14 KiB   11 GiB   10 TiB   8.71  0.25  167  up
 70  hdd  10.91399  1.00000  11 TiB  2.0 TiB  2.0 TiB  11 KiB  3.9 GiB  8.9 TiB  18.42  0.53  173  up
 78  hdd  10.91399  1.00000  11 TiB  3.0 TiB  2.9 TiB  12 KiB   28 GiB  7.9 TiB  27.18  0.78  173  up
 96  hdd  10.91399  1.00000  11 TiB  3.7 TiB  3.7 TiB  10 KiB   36 GiB  7.2 TiB  34.15  0.98  176  up
  9  hdd  10.91399  1.00000  11 TiB  2.8 TiB  2.8 TiB  16 KiB   13 GiB  8.1 TiB  25.38  0.73  178  up
 19  hdd  10.91399  1.00000  11 TiB  3.8 TiB  3.8 TiB  13 KiB   43 GiB  7.1 TiB  35.04  1.00  177  up
 29  hdd  10.91399  1.00000  11 TiB  5.1 TiB  5.0 TiB  14 KiB   38 GiB  5.9 TiB  46.38  1.33  182  up
 47  hdd  10.91399  1.00000  11 TiB  686 GiB  684 GiB   9 KiB  1.4 GiB   10 TiB   6.14  0.18  172  up
 56  hdd  10.91399  1.00000  11 TiB  3.7 TiB  3.7 TiB  12 KiB   36 GiB  7.2 TiB  34.16  0.98  174  up
 65  hdd  10.91399  1.00000  11 TiB  5.9 TiB  5.9 TiB   7 KiB   54 GiB  5.0 TiB  54.27  1.55  183  up
 88  hdd  10.91399  1.00000  11 TiB  4.5 TiB  4.4 TiB  12 KiB   44 GiB  6.4 TiB  41.17  1.18  173  up
 98  hdd  10.91399  1.00000  11 TiB  7.3 TiB  7.2 TiB  13 KiB   56 GiB  3.7 TiB  66.49  1.90  179  up
  2  hdd  10.91399  1.00000  11 TiB  1.5 TiB  1.5 TiB  12 KiB   18 GiB  9.4 TiB  14.02  0.40  177  up
 30  hdd  10.91399  1.00000  11 TiB  4.4 TiB  4.4 TiB  10 KiB   37 GiB  6.5 TiB  40.29  1.15  180  up
 40  hdd  10.91399  1.00000  11 TiB  1.4 TiB  1.4 TiB  10 KiB   11 GiB  9.5 TiB  13.14  0.38  172  up
 48  hdd  10.91399  1.00000  11 TiB  1.4 TiB  1.4 TiB  12 KiB   11 GiB  9.5 TiB  13.17  0.38  168  up
 58  hdd  10.91399  1.00000  11 TiB  7.5 TiB  7.4 TiB  14 KiB   70 GiB  3.5 TiB  68.31  1.96  182  up
 74  hdd  10.91399  1.00000  11 TiB  6.0 TiB  6.0 TiB  10 KiB   61 GiB  4.9 TiB  55.15  1.58  181  up
 83  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB  22 KiB   19 GiB  8.7 TiB  20.13  0.58  174  up
 92  hdd  10.91399  1.00000  11 TiB  3.0 TiB  2.9 TiB  11 KiB   27 GiB  8.0 TiB  27.15  0.78  171  up
 12  hdd  10.91399  1.00000  11 TiB  5.1 TiB  5.0 TiB   9 KiB   38 GiB  5.9 TiB  46.37  1.33  176  up
 23  hdd  10.91399  1.00000  11 TiB  4.4 TiB  4.4 TiB  22 KiB   37 GiB  6.5 TiB  40.24  1.15  173  up
 32  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB  10 KiB   19 GiB  8.7 TiB  20.13  0.58  176  up
 43  hdd  10.91399  1.00000  11 TiB  2.0 TiB  2.0 TiB  10 KiB  4.4 GiB  8.9 TiB  18.41  0.53  176  up
 52  hdd  10.91399  1.00000  11 TiB  3.6 TiB  3.6 TiB  12 KiB   29 GiB  7.3 TiB  33.27  0.95  171  up
 71  hdd  10.91399  1.00000  11 TiB  3.0 TiB  2.9 TiB  15 KiB   27 GiB  8.0 TiB  27.13  0.78  170  up
 95  hdd  10.91399  1.00000  11 TiB  5.4 TiB  5.3 TiB  10 KiB   60 GiB  5.6 TiB  49.04  1.40  173  up
104  hdd  10.91399  1.00000  11 TiB  3.7 TiB  3.7 TiB   8 KiB   36 GiB  7.2 TiB  34.13  0.98  171  up
  3  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB   9 KiB   19 GiB  8.7 TiB  20.14  0.58  173  up
 24  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB   9 KiB   19 GiB  8.7 TiB  20.12  0.58  170  up
 33  hdd  10.91399  1.00000  11 TiB  6.8 TiB  6.7 TiB  14 KiB   69 GiB  4.1 TiB  62.15  1.78  181  up
 45  hdd  10.91399  1.00000  11 TiB  2.9 TiB  2.8 TiB  10 KiB   20 GiB  8.0 TiB  26.27  0.75  178  up
 69  hdd  10.91399  1.00000  11 TiB  2.9 TiB  2.8 TiB  15 KiB   20 GiB  8.0 TiB  26.25  0.75  173  up
 79  hdd  10.91399  1.00000  11 TiB  5.9 TiB  5.9 TiB  28 KiB   54 GiB  5.0 TiB  54.25  1.55  178  up
 89  hdd  10.91399  1.00000  11 TiB  2.1 TiB  2.1 TiB  16 KiB   11 GiB  8.8 TiB  19.25  0.55  173  up
 99  hdd  10.91399  1.00000  11 TiB  4.5 TiB  4.4 TiB  13 KiB   44 GiB  6.4 TiB  41.14  1.18  174  up
  5  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB  14 KiB   19 GiB  8.7 TiB  20.12  0.58  172  up
 15  hdd  10.91399  1.00000  11 TiB  7.3 TiB  7.2 TiB   7 KiB   56 GiB  3.7 TiB  66.48  1.90  177  up
 25  hdd  10.91399  1.00000  11 TiB  1.4 TiB  1.4 TiB  11 KiB   10 GiB  9.5 TiB  13.13  0.38  176  up
 44  hdd  10.91399  1.00000  11 TiB  3.1 TiB  3.0 TiB   9 KiB   34 GiB  7.9 TiB  28.03  0.80  179  up
 63  hdd  10.91399  1.00000  11 TiB  2.9 TiB  2.8 TiB  10 KiB   20 GiB  8.0 TiB  26.26  0.75  178  up
 72  hdd  10.91399  1.00000  11 TiB  3.7 TiB  3.7 TiB  12 KiB   36 GiB  7.2 TiB  34.13  0.98  180  up
 81  hdd  10.91399  1.00000  11 TiB  6.7 TiB  6.6 TiB  11 KiB   62 GiB  4.2 TiB  61.30  1.76  181  up
 91  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB  16 KiB   19 GiB  8.7 TiB  20.11  0.58  176  up
  7  hdd  10.91399  1.00000  11 TiB  4.6 TiB  4.5 TiB  13 KiB   52 GiB  6.3 TiB  42.03  1.20  180  up
 37  hdd  10.91399  1.00000  11 TiB  5.3 TiB  5.2 TiB  14 KiB   52 GiB  5.7 TiB  48.16  1.38  182  up
 46  hdd  10.91399  1.00000  11 TiB  784 GiB  775 GiB  14 KiB  8.8 GiB   10 TiB   7.01  0.20  173  up
 55  hdd  10.91399  1.00000  11 TiB  5.9 TiB  5.9 TiB  20 KiB   54 GiB  5.0 TiB  54.25  1.55  177  up
 64  hdd  10.91399  1.00000  11 TiB  3.6 TiB  3.6 TiB  18 KiB   28 GiB  7.3 TiB  33.23  0.95  176  up
 84  hdd  10.91399  1.00000  11 TiB  784 GiB  775 GiB  10 KiB  9.0 GiB   10 TiB   7.01  0.20  176  up
 94  hdd  10.91399  1.00000  11 TiB  2.1 TiB  2.1 TiB   8 KiB   12 GiB  8.8 TiB  19.28  0.55  169  up
102  hdd  10.91399  1.00000  11 TiB  3.0 TiB  2.9 TiB  10 KiB   28 GiB  8.0 TiB  27.13  0.78  170  up
  6  hdd  10.91399  1.00000  11 TiB  4.5 TiB  4.4 TiB  23 KiB   44 GiB  6.4 TiB  41.14  1.18  181  up
 18  hdd  10.91399  1.00000  11 TiB  6.0 TiB  6.0 TiB   9 KiB   61 GiB  4.9 TiB  55.15  1.58  183  up
 41  hdd  10.91399  1.00000  11 TiB  3.0 TiB  2.9 TiB  15 KiB   27 GiB  8.0 TiB  27.12  0.78  179  up
 49  hdd  10.91399  1.00000  11 TiB  785 GiB  775 GiB  19 KiB  9.5 GiB   10 TiB   7.02  0.20  176  up
 59  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB   8 KiB   19 GiB  8.7 TiB  20.12  0.58  178  up
 68  hdd  10.91399  1.00000  11 TiB  3.7 TiB  3.7 TiB  13 KiB   36 GiB  7.2 TiB  34.15  0.98  172  up
 97  hdd  10.91399  1.00000  11 TiB  3.0 TiB  2.9 TiB  15 KiB   27 GiB  8.0 TiB  27.13  0.78  178  up
105  hdd  10.91399  1.00000  11 TiB  3.7 TiB  3.7 TiB  13 KiB   36 GiB  7.2 TiB  34.14  0.98  179  up
  8  hdd  10.91399  1.00000  11 TiB  6.6 TiB  6.5 TiB  13 KiB   55 GiB  4.3 TiB  60.40  1.73  181  up
 17  hdd  10.91399  1.00000  11 TiB  4.6 TiB  4.5 TiB  14 KiB   52 GiB  6.3 TiB  42.03  1.20  177  up
 27  hdd  10.91399  1.00000  11 TiB  783 GiB  774 GiB   8 KiB  9.4 GiB   10 TiB   7.01  0.20  176  up
 38  hdd  10.91399  1.00000  11 TiB  2.2 TiB  2.2 TiB  15 KiB   19 GiB  8.7 TiB  20.11  0.58  174  up
 57  hdd  10.91399  1.00000  11 TiB  2.3 TiB  2.3 TiB  12 KiB   26 GiB  8.6 TiB  21.01  0.60  177  up
 67  hdd  10.91399  1.00000  11 TiB  4.5 TiB  4.4 TiB   9 KiB   44 GiB  6.4 TiB  41.14  1.18  176  up
 93  hdd  10.91399  1.00000  11 TiB  3.1 TiB  3.0 TiB   7 KiB   35 GiB  7.9 TiB  28.01  0.80  171  up
103  hdd  10.91399  1.00000  11 TiB  4.2 TiB  4.2 TiB  10 KiB   22 GiB  6.7 TiB  38.52  1.10  176  up
Three pools (deeproute-replica-hdd-pool, os-dsglczutvqsgowpz.rgw.buckets.data, and cephfs-replicated-pool) are set to use the same device class, hdd. I noticed that the *effective ratio* of these three pools is very different, and I am not sure whether that affects data rebalancing.
[root@node01 smd]# ceph osd pool autoscale-status
POOL                                      SIZE    TARGET SIZE  RATE                RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
device_health_metrics                     278.5M               3.0                 48289G        0.0000                                 1.0   1                   on         False
deeproute-replica-hdd-pool                12231M               3.0                 785.8T        0.9901  50.0000       0.9901           1.0   4096                on         False
deeproute-replica-ssd-pool                12333G               3.0                 48289G        0.9901  50.0000       0.9901           1.0   1024                on         False
.rgw.root                                 5831                 3.0                 48289G        0.0099  0.5000        0.0099           1.0   8                   on         False
default.rgw.log                           182                  3.0                 48289G        0.0000                                 1.0   32                  on         False
default.rgw.control                       0                    3.0                 48289G        0.0000                                 1.0   32                  on         False
default.rgw.meta                          0                    3.0                 48289G        0.0000                                 4.0   8                   on         False
os-dsglczutvqsgowpz.rgw.control           0                    3.0                 16096G        0.1667  0.5000        0.1667           1.0   64                  on         False
os-dsglczutvqsgowpz.rgw.meta              99.81k               3.0                 16096G        0.1667  0.5000        0.1667           1.0   64                  on         False
os-dsglczutvqsgowpz.rgw.buckets.index     59437M               3.0                 16096G        0.1667  0.5000        0.1667           1.0   32                  on         False
os-dsglczutvqsgowpz.rgw.buckets.non-ec    498.7M               3.0                 16096G        0.1667  0.5000        0.1667           1.0   32                  on         False
os-dsglczutvqsgowpz.rgw.log               1331M                3.0                 16096G        0.1667  0.5000        0.1667           1.0   32                  on         False
os-dsglczutvqsgowpz.rgw.buckets.data      143.1T               1.3333333730697632  785.8T        0.2429  0.5000        0.0099           1.0   32                  on         False
cephfs-metadata                           3161M                3.0                 16096G        0.0006                                 4.0   32                  on         False
cephfs-replicated-pool                    21862G               3.0                 785.8T        0.0815                                 1.0   32                  on         False
.nfs                                      84599                3.0                 48289G        0.0000                                 1.0   32                  on         False
os-dsglczutvqsgowpz.rgw.otp               0                    3.0                 16096G        0.1667  0.5000
[root@node01 smd]# ceph df
--- RAW STORAGE ---
CLASS    SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd      786 TiB  545 TiB  240 TiB  240 TiB   30.60
ssd       63 TiB   25 TiB   38 TiB   38 TiB   59.95
TOTAL    849 TiB  571 TiB  278 TiB  278 TiB   32.77

--- POOLS ---
POOL                                     ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics                     1     1  280 MiB      135  841 MiB   0.02    1.8 TiB
deeproute-replica-hdd-pool               11  4096   12 GiB    9.44k   36 GiB   0.02     69 TiB
deeproute-replica-ssd-pool               12  1024   12 TiB    3.43M   37 TiB  87.62    1.8 TiB
.rgw.root                                25     8  5.7 KiB       20  228 KiB      0    1.8 TiB
default.rgw.log                          26    32    182 B        2   24 KiB      0    1.8 TiB
default.rgw.control                      27    32      0 B        8      0 B      0    1.8 TiB
default.rgw.meta                         28     8      0 B        0      0 B      0    1.8 TiB
os-dsglczutvqsgowpz.rgw.control          29    64      0 B        8      0 B      0    4.9 TiB
os-dsglczutvqsgowpz.rgw.meta             30    64  100 KiB      456  4.3 MiB      0    4.9 TiB
os-dsglczutvqsgowpz.rgw.buckets.index    31    32   58 GiB   10.40k  174 GiB   1.15    4.9 TiB
os-dsglczutvqsgowpz.rgw.buckets.non-ec   32    32  500 MiB  247.78k  4.3 GiB   0.03    4.9 TiB
os-dsglczutvqsgowpz.rgw.log              33    32  689 MiB      432  2.0 GiB   0.01    4.9 TiB
os-dsglczutvqsgowpz.rgw.buckets.data     34    32  130 TiB  134.96M  174 TiB  45.69    155 TiB
cephfs-metadata                          35    32  3.1 GiB    1.67M  9.3 GiB   0.06    4.9 TiB
cephfs-replicated-pool                   36    32   21 TiB    8.25M   64 TiB  23.79     69 TiB
.nfs                                     37    32   89 KiB        9  341 KiB      0    1.8 TiB
os-dsglczutvqsgowpz.rgw.otp              38     8      0 B        0      0 B      0    4.9 TiB
[root@node01 smd]# ceph -s
cluster:
id: f7f1c8ba-f793-436b-bb73-0964108a30c1
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 7w)
mgr: a(active, since 2w), standbys: b
mds: 1/1 daemons up, 1 hot standby
osd: 108 osds: 108 up (since 5h), 108 in (since 4M); 1 remapped pgs
rgw: 9 daemons active (9 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 17 pools, 5561 pgs
objects: 148.37M objects, 162 TiB
usage: 276 TiB used, 572 TiB / 849 TiB avail
pgs: 2786556/1119434807 objects misplaced (0.249%)
5559 active+clean
1 active+remapped+backfilling
1 active+clean+scrubbing+deep
io:
client: 212 MiB/s rd, 27 MiB/s wr, 4.04k op/s rd, 2.01k op/s wr
recovery: 4.0 MiB/s, 4 objects/s
progress:
Global Recovery Event (2d)
[===========================.] (remaining: 39m)
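For completeness, one thing I am considering but have not done yet is enabling the mgr balancer module; a minimal sketch, assuming upmap mode is usable on this cluster, would be:

    # see whether the balancer is active and which mode it is in
    ceph balancer status
    # upmap mode requires luminous-or-newer clients
    ceph osd set-require-min-compat-client luminous
    ceph balancer mode upmap
    ceph balancer on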
So my question is: what should I adjust to make the OSD data more evenly distributed? Thanks!
Best Regards,
Liang Zheng