Hi guys!
Based on our observations of the balancer's impact on overall cluster
performance, we have drawn some conclusions that we would like to
discuss with you.
- A newly created pool should be balanced before being handed over
to the user. This, I believe, is quite evident.
- When replacing a disk, it is advisable to swap it directly for a new
one. As soon as the OSD replacement occurs, the balancer should be
invoked to realign any PGs that were misplaced during the disk outage
and recovery.
Perhaps an even better method is to pause recovery and backfilling
before removing the disk, remove the disk itself, promptly add a new
one, and then resume recovery and backfilling. It is essential to
perform all of this as quickly as possible (using a script).
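The pause-replace-resume sequence could be scripted roughly like this (a sketch only: the OSD id is a placeholder, and the exact redeploy step depends on how your OSDs were provisioned):

```shell
# Pause data movement so the swap does not trigger two rebalances
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover

# Mark the old OSD destroyed so its id and CRUSH weight are reused
# (osd id 12 is a placeholder for the failing disk)
ceph osd destroy 12 --yes-i-really-mean-it

# ...physically swap the disk and redeploy osd.12 on the new device...

# Resume recovery and backfilling once the new OSD is up
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset norebalance
```

Keeping the window between the first and last step short (as noted above, via a script) is what minimizes the amount of misplaced data the balancer has to fix afterwards.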
P.S. We are using the community balancer developed by Jonas Jelten because
the built-in one does not meet our requirements.
What are your thoughts on this?
Michal
Hi,
I have an Openstack platform deployed with Yoga and ceph-ansible pacific on
Rocky 8.
Now I need to do an upgrade to Openstack zed with octopus on Rocky 9.
This is the upgrade path I have traced:
- upgrade my nodes to Rocky 9 keeping Openstack yoga with ceph-ansible
pacific.
- convert ceph pacific from ceph-ansible to cephadm.
- stop Openstack platform yoga
- upgrade ceph pacific to octopus
- upgrade Openstack yoga to zed.
Any thoughts or guidelines to keep in mind regarding the Ceph
conversion and upgrade?
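For the ceph-ansible to cephadm step, the adoption sequence usually looks like the following (a sketch; the daemon names and hostnames are placeholders, and the cephadm adoption docs for your exact Pacific release should be the authority):

```shell
# On each node, check that cephadm can see the legacy (ceph-ansible) daemons
cephadm ls

# Adopt the monitors and managers first (one node shown; repeat per host)
cephadm adopt --style legacy --name "mon.$(hostname)"
cephadm adopt --style legacy --name "mgr.$(hostname)"

# Then adopt every OSD on every host (osd.0 is a placeholder id)
cephadm adopt --style legacy --name osd.0
```

After adoption, RGW and MDS daemons are typically redeployed by the orchestrator rather than adopted, so plan a short window for those.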
PS: on my Ceph cluster I have RBD, RGW and CephFS pools.
Regards.
Hi community,
I have multiple buckets that were deleted, but their bucket lifecycle
entries still exist. How can I delete them with radosgw-admin? The users
who owned these buckets no longer exist, so nobody can access the buckets
to delete the lifecycle configuration.
root@ceph:~# radosgw-admin lc list
[
{
"bucket": ":r30203:f3fec4b6-a248-4f3f-be75-b8055e61233a.33081.8",
"started": "Wed, 06 Dec 2023 10:43:55 GMT",
"status": "COMPLETE"
},
{
"bucket": ":r30304:f3fec4b6-a248-4f3f-be75-b8055e61233a.33081.13",
"started": "Wed, 06 Dec 2023 10:43:54 GMT",
"status": "COMPLETE"
},
{
"bucket": ":ec3204cam04:f3fec4b6-a248-4f3f-be75-b8055e61233a.31736.1",
"started": "Wed, 06 Dec 2023 10:44:30 GMT",
"status": "COMPLETE"
},
{
"bucket": ":r30105:f3fec4b6-a248-4f3f-be75-b8055e61233a.33081.5",
"started": "Wed, 06 Dec 2023 10:44:40 GMT",
"status": "COMPLETE"
},
{
"bucket": ":r30303:f3fec4b6-a248-4f3f-be75-b8055e61233a.33081.14",
"started": "Wed, 06 Dec 2023 10:44:40 GMT",
"status": "COMPLETE"
},
{
"bucket": ":ec3201cam02:f3fec4b6-a248-4f3f-be75-b8055e61233a.56439.2",
"started": "Wed, 06 Dec 2023 10:43:56 GMT",
"status": "COMPLETE"
},
Thanks to the community.
Hi All,
Looking for some help/explanation around erasure code pools, etc.
I set up a 3-node Ceph (Quincy) cluster with each box holding 7 OSDs
(HDDs) and each box running Monitor, Manager, and iSCSI Gateway. For the
record the cluster runs beautifully, without resource issues, etc.
I created an Erasure Code Profile, etc:
~~~
ceph osd erasure-code-profile set my_ec_profile plugin=jerasure k=4 m=2 \
    crush-failure-domain=osd
ceph osd crush rule create-erasure my_ec_rule my_ec_profile
ceph osd crush rule create-replicated my_replicated_rule default host
~~~
My Crush Map is:
~~~
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class hdd
device 19 osd.19 class hdd
device 20 osd.20 class hdd
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 zone
type 10 region
type 11 root
# buckets
host ceph_1 {
id -3 # do not change unnecessarily
id -4 class hdd # do not change unnecessarily
# weight 38.09564
alg straw2
hash 0 # rjenkins1
item osd.0 weight 5.34769
item osd.1 weight 5.45799
item osd.2 weight 5.45799
item osd.3 weight 5.45799
item osd.4 weight 5.45799
item osd.5 weight 5.45799
item osd.6 weight 5.45799
}
host ceph_2 {
id -5 # do not change unnecessarily
id -6 class hdd # do not change unnecessarily
# weight 38.09564
alg straw2
hash 0 # rjenkins1
item osd.7 weight 5.34769
item osd.8 weight 5.45799
item osd.9 weight 5.45799
item osd.10 weight 5.45799
item osd.11 weight 5.45799
item osd.12 weight 5.45799
item osd.13 weight 5.45799
}
host ceph_3 {
id -7 # do not change unnecessarily
id -8 class hdd # do not change unnecessarily
# weight 38.09564
alg straw2
hash 0 # rjenkins1
item osd.14 weight 5.34769
item osd.15 weight 5.45799
item osd.16 weight 5.45799
item osd.17 weight 5.45799
item osd.18 weight 5.45799
item osd.19 weight 5.45799
item osd.20 weight 5.45799
}
root default {
id -1 # do not change unnecessarily
id -2 class hdd # do not change unnecessarily
# weight 114.28693
alg straw2
hash 0 # rjenkins1
item ceph_1 weight 38.09564
item ceph_2 weight 38.09564
item ceph_3 weight 38.09564
}
# rules
rule replicated_rule {
id 0
type replicated
step take default
step chooseleaf firstn 0 type host
step emit
}
rule my_replicated_rule {
id 1
type replicated
step take default
step chooseleaf firstn 0 type host
step emit
}
rule my_ec_rule {
id 2
type erasure
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default
step choose indep 3 type host
step chooseleaf indep 2 type osd
step emit
}
# end crush map
~~~
Finally I create a pool:
~~~
ceph osd pool create my_pool 32 32 erasure my_ec_profile my_ec_rule
ceph osd pool application enable my_meta_pool rbd
rbd pool init my_meta_pool
rbd pool init my_pool
rbd create --size 16T my_pool/my_disk_1 --data-pool my_pool \
    --image-feature journaling
~~~
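For reference, RBD images cannot keep their metadata (omap) in an EC pool, so the usual pattern is a small replicated metadata pool with the EC pool supplied as `--data-pool` (a sketch; it assumes `my_meta_pool` is created with the replicated rule shown earlier, since the paste above enables the application on it without showing its creation):

```shell
# Replicated pool for RBD metadata (omap is not supported on EC pools)
ceph osd pool create my_meta_pool 32 32 replicated my_replicated_rule
ceph osd pool application enable my_meta_pool rbd
rbd pool init my_meta_pool

# RBD on an EC data pool requires partial overwrites
ceph osd pool set my_pool allow_ec_overwrites true

# Image metadata lives in the replicated pool; data objects go to the EC pool
rbd create --size 16T my_meta_pool/my_disk_1 --data-pool my_pool
```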
So all this is to have some VMs (oVirt VMs, for the record) with
automatic fail-over in the case of a Ceph node loss - i.e. I was trying to
"replicate" a 3-disk RAID 5 array across the Ceph nodes, so that I could
lose a node and still have a working set of VMs.
However, I took one of the Ceph nodes down (gracefully) for some
maintenance the other day and I lost *all* the VMs (i.e. oVirt complained
that there was no active pool). As soon as I brought the downed node back
up everything was good again.
So my question is: what did I do wrong with my config?
Should I, for example, change the EC profile to `k=2, m=1`? And how is
that practically different from `k=4, m=2` - yes, the latter spreads the
pool over more disks, but it should still only put 2 chunks on each node,
shouldn't it?
Thanks in advance
Cheers
Dulux-Oz
Closing the loop (blocked waiting for Neha's input): how are we using Gibba
on a day-to-day basis? Is it only used for checking reef point releases?
- To be discussed again next week, as Neha had a conflict
[Nizam] http://old.ceph.com/pgcalc is not working anymore; is there any
replacement for the pgcalc page on the new Ceph site?
- Last attempt at a fix https://github.com/ceph/ceph.io/issues/265
- Nizam will take a look; the PR needs some CSS work
- We may not even need this link due to the autoscaler, etc.
- Perhaps add some kind of "banner" to the pgcalc page to make it clear
that users should look to the autoscaler
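For what it's worth, the core pgcalc arithmetic is small enough to inline (a sketch of the commonly cited rule of thumb - roughly 100 target PGs per OSD divided by the replica count, rounded up to a power of two - not an official replacement for the page or for the autoscaler):

```shell
#!/bin/sh
# Rough pgcalc: (osds * target_pgs_per_osd) / replicas,
# rounded up to the next power of two.
pg_count() {
    osds=$1; replicas=$2; target=${3:-100}
    raw=$(( osds * target / replicas ))
    pgs=1
    while [ "$pgs" -lt "$raw" ]; do
        pgs=$(( pgs * 2 ))
    done
    echo "$pgs"
}

pg_count 21 3   # 21 OSDs, 3x replication -> 1024
```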
Update on 18.2.1 bluestore issue?
- Fix has been raised; also not as serious as initially suspected
- Fix is in testing
--
Laura Flores
She/Her/Hers
Software Engineer, Ceph Storage <https://ceph.io>
Chicago, IL
lflores@ibm.com | lflores@redhat.com
M: +17087388804