Dear Ceph experts,
We recently upgraded our Ceph cluster from Octopus (15.2.17) to Pacific (first 16.2.14,
then 16.2.15).
Immediately after the upgrade, warnings appeared saying that all of our pools (except
device_health_metrics) have too many placement groups.
The warnings appear to be generated by the autoscaler, but it offers no suggestions.
While investigating, I set the bulk flag to true on the pools intended to store large
amounts of data (just to see what would change).
After that, most pools are reported as having too few placement groups, but the autoscaler
still makes no suggestions (the NEW PG_NUM column is empty):
# ceph osd pool autoscale-status
POOL                   SIZE    TARGET SIZE  RATE                RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
metadata               22567M               3.0                 57255G        0.0012                                 1.0   32                  warn       False
rbd_hdd02_ec_meta      20817k               3.0                 365.3T        0.0000                                 1.0   32                  warn       False
device_health_metrics  1002M                3.0                 205.4T        0.0000                                 1.0   1                   on         False
rbd_ssd_ec_meta        0                    3.0                 57255G        0.0000                                 1.0   32                  warn       False
cmd_spool              45094G               1.3333333730697632  205.4T        0.2859                                 1.0   512                 warn       True
data                   60496G               3.0                 365.3T        0.4851                                 1.0   512                 warn       True
rbd                    2625G                3.0                 205.4T        0.0374                                 1.0   64                  warn       True
kedr_spool             6900G   10240G       1.3333333730697632  205.4T        0.0649                                 1.0   128                 warn       True
gcf_spool              1119M                1.3333333730697632  205.4T        0.0000                                 1.0   32                  warn       True
rbd_ssd                107.8G               3.0                 57255G        0.0056                                 1.0   32                  warn       True
data_ssd               775.8G               3.0                 57255G        0.0407                                 1.0   128                 warn       True
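As a sanity check on the RATIO column, the following Python sketch recomputes it for a few pools from the autoscale-status output. It assumes RATIO = max(SIZE, TARGET SIZE) * RATE / RAW CAPACITY, which is how I read the autoscaler documentation, and that the units are binary (1T = 1024G). The function and variable names are my own, and the inputs are the rounded values as printed, so only approximate agreement is expected.

```python
# Recompute the RATIO column from the printed autoscale-status values.
# Assumptions (not confirmed against the Ceph source):
#   ratio = max(size, target_size) * rate / raw_capacity
#   units are binary, i.e. 1T = 1024G

G_PER_T = 1024.0


def capacity_ratio(size_g, rate, raw_capacity_g, target_size_g=0.0):
    """Return the fraction of raw capacity a pool is expected to consume."""
    effective_size_g = max(size_g, target_size_g)
    return effective_size_g * rate / raw_capacity_g


# (pool, size in G, target size in G, rate, raw capacity in G, printed ratio)
POOLS = [
    ("cmd_spool", 45094.0, 0.0, 1.3333333730697632, 205.4 * G_PER_T, 0.2859),
    ("data", 60496.0, 0.0, 3.0, 365.3 * G_PER_T, 0.4851),
    ("rbd", 2625.0, 0.0, 3.0, 205.4 * G_PER_T, 0.0374),
    ("kedr_spool", 6900.0, 10240.0, 1.3333333730697632, 205.4 * G_PER_T, 0.0649),
    ("data_ssd", 775.8, 0.0, 3.0, 57255.0, 0.0407),
]


def check_pools(pools, tolerance=5e-4):
    """Return (pool, recomputed ratio, printed ratio, within tolerance) per pool."""
    results = []
    for name, size_g, target_g, rate, raw_g, printed in pools:
        ratio = capacity_ratio(size_g, rate, raw_g, target_g)
        results.append((name, ratio, printed, abs(ratio - printed) <= tolerance))
    return results


if __name__ == "__main__":
    for name, ratio, printed, ok in check_pools(POOLS):
        status = "ok" if ok else "MISMATCH"
        print(f"{name:12s} recomputed={ratio:.4f} printed={printed:.4f} {status}")
```

With these inputs the recomputed values agree with the printed column to within rounding. For kedr_spool, it is the 10240G target size rather than the 6900G actual size that reproduces the printed 0.0649.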
The only indications are the warnings in 'ceph status' and 'ceph health detail':
# ceph health detail
HEALTH_WARN 6 pools have too few placement groups; 4 pools have too many placement groups
[WRN] POOL_TOO_FEW_PGS: 6 pools have too few placement groups
Pool data has 512 placement groups, should have 2048
Pool rbd has 64 placement groups, should have 512
Pool kedr_spool has 128 placement groups, should have 256
Pool gcf_spool has 32 placement groups, should have 256
Pool rbd_ssd has 32 placement groups, should have 2048
Pool data_ssd has 128 placement groups, should have 2048
[WRN] POOL_TOO_MANY_PGS: 4 pools have too many placement groups
Pool metadata has 32 placement groups, should have 32
Pool rbd_hdd02_ec_meta has 32 placement groups, should have 32
Pool rbd_ssd_ec_meta has 32 placement groups, should have 32
Pool cmd_spool has 512 placement groups, should have 512
When I changed the number of placement groups of pool kedr_spool from 128 to 256, as
suggested in the warning, the data rebalanced successfully,
but the pool is now reported as having too many placement groups:
Pool kedr_spool has 256 placement groups, should have 256.
What I find strangest about these warnings is that the current number of PGs is the same
as the number the pool should have.
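To make the pattern explicit, here is a minimal Python sketch (my own helper, not part of Ceph) that scans the 'ceph health detail' text quoted in this mail and lists the warnings whose current and suggested PG counts are identical. The name contradictory_pools and the regular expressions are assumptions based only on the message format shown here.

```python
import re

# Text copied from the 'ceph health detail' output quoted in this mail.
HEALTH_DETAIL = """\
HEALTH_WARN 6 pools have too few placement groups; 4 pools have too many placement groups
[WRN] POOL_TOO_FEW_PGS: 6 pools have too few placement groups
Pool data has 512 placement groups, should have 2048
Pool rbd has 64 placement groups, should have 512
Pool kedr_spool has 128 placement groups, should have 256
Pool gcf_spool has 32 placement groups, should have 256
Pool rbd_ssd has 32 placement groups, should have 2048
Pool data_ssd has 128 placement groups, should have 2048
[WRN] POOL_TOO_MANY_PGS: 4 pools have too many placement groups
Pool metadata has 32 placement groups, should have 32
Pool rbd_hdd02_ec_meta has 32 placement groups, should have 32
Pool rbd_ssd_ec_meta has 32 placement groups, should have 32
Pool cmd_spool has 512 placement groups, should have 512
"""

# Patterns assumed from the output above, not taken from the Ceph source.
CHECK_RE = re.compile(r"\[WRN\] (\w+):")
POOL_RE = re.compile(r"Pool (\S+) has (\d+) placement groups, should have (\d+)")


def contradictory_pools(text):
    """Return (check, pool, pg_num) for warnings where current == suggested."""
    found = []
    check = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = CHECK_RE.match(line)
        if header:
            # Remember which health check the following pool lines belong to.
            check = header.group(1)
            continue
        pool = POOL_RE.match(line)
        if pool and int(pool.group(2)) == int(pool.group(3)):
            found.append((check, pool.group(1), int(pool.group(2))))
    return found


if __name__ == "__main__":
    for check, pool, pg_num in contradictory_pools(HEALTH_DETAIL):
        print(f"{check}: {pool} already has the suggested {pg_num} PGs")
```

On the output quoted here, every POOL_TOO_MANY_PGS entry falls into this category, while none of the POOL_TOO_FEW_PGS entries do.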
What causes these warnings, and how can I resolve them?
Best regards,
Dmitriy