Hello ceph community,
I have some questions about the pg autoscaler. I have a cluster with several pools. One
of them is a CephFS pool, which is behaving in an expected / sane way, and another is an
RBD pool with an EC profile of k=2, m=2.
The cluster has about 60 drives across about 10 failure domains. (The failure domain is
set to “chassis”; some chassis have 4 blades each, and the rest have 1 host per chassis.)
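In case the CRUSH side is relevant, these are commands I can run and paste output from
(ec22 and crush_rule 1 show up in the pool detail further down):
root@vis-mgmt:~# ceph osd tree
root@vis-mgmt:~# ceph osd crush rule dump
root@vis-mgmt:~# ceph osd erasure-code-profile get ec22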
The RBD EC pool has 66 TiB stored across 128 PGs. Each PG has about 500k objects in it,
which seems like quite a lot. When rebalancing, this EC pool is always the long pole.
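To put rough numbers on that (assuming the 66 TiB figure is logical data, before the
k=2, m=2 overhead): 66 TiB / 128 PGs is roughly 0.5 TiB of data per PG, and 128 PGs x
~500k objects is on the order of 64 million objects in this one pool.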
The confusing part is that I am getting inconsistent output from the autoscaler status.
For example:
root@vis-mgmt:~# ceph osd pool autoscale-status | grep rbd_ec
rbd_ec 67241G 2.0 856.7T 0.1533 1.0 64 on False
Which tells me PG_NUM is 64 (a lie).
root@vis-mgmt:~# ceph osd pool ls detail | grep rbd_ec
pool 4 'rbd_ec' erasure profile ec22 size 4 min_size 3 crush_rule 1 object_hash
rjenkins pg_num 128 pgp_num 120 pg_num_target 64 pgp_num_target 64 autoscale_mode on
last_change 83396 lfor 0/83395/83393 flags hashpspool,ec_overwrites,selfmanaged_snaps
stripe_width 8192 application rbd
This tells me I have 128 PGs (correct), but with a pgp_num that is not a power of 2 (120).
Also, I am not sure what pg_num_target and pgp_num_target are, and why they are different
from pg_num and pgp_num.
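If anyone wants more data, these are commands I can run and paste output from (my
understanding, which may be wrong, is that "ceph osd pool get" reports the live values
rather than the targets):
root@vis-mgmt:~# ceph osd pool get rbd_ec pg_num
root@vis-mgmt:~# ceph osd pool get rbd_ec pgp_num
root@vis-mgmt:~# ceph osd pool get rbd_ec pg_autoscale_mode
root@vis-mgmt:~# ceph config get mon mon_target_pg_per_osd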
Is there anything I can look into to find out if the autoscaler is working correctly for
this pool? Are there any other tweaks I need to make? It seems to me that with that
capacity it ought to have more than 128 PGs…
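The rough math behind that hunch, assuming the autoscaler does something like
capacity_ratio x OSD count x mon_target_pg_per_osd / pool size (with the default
mon_target_pg_per_osd of 100, I believe): 0.1533 x 60 x 100 / 4 ≈ 230, which would
round to 256 rather than the 64 it is currently targeting. I may be misreading the
heuristic, of course.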
Thank you!
George