Is there a way to enable the LUKS encryption format on a snapshot that was created from an unencrypted image, without losing data? I've seen in https://docs.ceph.com/en/quincy/rbd/rbd-encryption/ that "Any data written to the image prior to its format may become unreadable, though it may still occupy storage resources," and I observed that to be the case when running `encryption format` on an image that already has data in it. However, is there any way to take a snapshot of an unencrypted image and enable encryption on the snapshot (or on a new image cloned from the snapshot)?
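For concreteness, the sequence I have in mind is something like the following (pool, image, and passphrase-file names are placeholders):

```shell
# Snapshot the unencrypted image, protect it, and clone it
rbd snap create mypool/unencrypted-image@snap1
rbd snap protect mypool/unencrypted-image@snap1
rbd clone mypool/unencrypted-image@snap1 mypool/encrypted-clone

# Then attempt to enable LUKS2 on the clone -- this is the step I'm
# unsure about, given the documentation's warning that data written
# before formatting may become unreadable
rbd encryption format mypool/encrypted-clone luks2 /path/to/passphrase.txt
```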
We are running a 3-node Ceph cluster with version 17.2.6.
For CephFS snapshots we have configured the following snap schedule with
/PATH 2h 72h15d6m
But we observed that a maximum of 50 snapshots are preserved. When a new
snapshot is created, the oldest (the 51st) is deleted.
Is there a limit for maximum cephfs snapshots or maybe this is a bug?
I have found the setting "mds_max_snaps_per_dir", which defaults to 100,
but I think this is not related to my problem?
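For reference, this is how I checked the setting, and how I would raise it if it does turn out to be relevant (the value 150 is just an example):

```shell
# Show the current per-directory snapshot limit (100 by default)
ceph config get mds mds_max_snaps_per_dir

# Raise it, in case the snap-schedule retention is capped by it
# (just a guess on my part)
ceph config set mds mds_max_snaps_per_dir 150
```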
I followed the steps to repair the journal and MDS that I found here on the
list. I hit a bug that prevented my MDS from starting, so I took the long
way of reading the data back.
Everything went fine and I can even mount one of my CephFS now. That's a
But when I start a scrub, I just get return code -116 and no scrub is
initiated. I didn't find that code in the docs. Can you help me?
[ceph: root@ceph06 /]# ceph tell mds.mds01.ceph06.huavsw scrub start
2023-04-29T10:46:36.926+0000 7ff676ff5700 0 client.79389355
ms_handle_reset on v2:192.168.23.66:6800/1133836262
2023-04-29T10:46:36.953+0000 7ff676ff5700 0 client.79389361
ms_handle_reset on v2:192.168.23.66:6800/1133836262
(I get the same error, no matter what kind of scrub I start)
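In case it helps others reading along: my assumption is that the -116 is a negated Linux errno value, which can be decoded like this:

```shell
# Decode errno 116 via Python's strerror (on Linux)
python3 -c 'import os; print(os.strerror(116))'
# prints: Stale file handle
```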
The cluster is running Pacific and was deployed with cephadm in containers.
The task is to import OSDs after a host OS reinstallation.
All OSDs are SSDs with DB/WAL and data colocated.
I did some research, but was not able to find a working solution.
Does anyone have experience with this?
What needs to be done before the host OS reinstallation, and what afterwards?
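One thing I found while researching is `ceph cephadm osd activate`, which sounds like it is meant for exactly this case, though I have not verified it myself (the hostname is a placeholder):

```shell
# After reinstalling the OS and re-adding the host to the cluster,
# ask cephadm to scan the host for existing OSDs and activate them
ceph cephadm osd activate <hostname>
```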
I've started playing with Lua scripting and would like to ask if anyone knows of a way to drop or close a user request in the preRequest context.
I would like to block creating buckets with dots in the name, but the use-case could be blocking certain operations, etc.
I was able to come up with something like this:

if string.find(Request.HTTP.URI, '%.') then
    Request.Response.HTTPStatusCode = 400
    Request.Response.HTTPStatus = "InvalidBucketName"
    Request.Response.Message = "Dots are not allowed."
end
This works fine, but the bucket is still created, which is exactly what I don't want. As a dirty workaround, I thought about changing the bucket name here to an already existing bucket, but `Request.Bucket.Name = "taken"` doesn't seem to work: the log gives me the error "attempt to index a nil value (field 'Bucket')".
Any help is much appreciated.
Dear Ceph folks,
I would like to ask your advice on the following topic: We have a 6-node Ceph cluster (for RGW usage only) running on Luminous 12.2.12, and we will now add 10 new nodes. Our plan is to phase out the old 6 nodes and run the RGW Ceph cluster with the new 10 nodes on Nautilus.
I can think of two ways to achieve the above goal. The first method would be: 1) Upgrade the current 6-node cluster from Luminous 12.2.12 to Nautilus 14.2.22; 2) Expand the cluster with the 10 new nodes, and then re-balance; 3) After rebalance completes, remove the 6 old nodes from the cluster
The second method would skip upgrading the old 6 nodes from Luminous to Nautilus, since those nodes will be phased out anyway. The trade-off is that we would have to run a hybrid cluster, with 6 nodes on Luminous 12.2.12 and 10 nodes on Nautilus, until re-balancing completes and the 6 old nodes can be removed from the cluster.
Any suggestions, advice, or best practice would be highly appreciated.
I'm using a dockerized Ceph 17.2.6 under Ubuntu 22.04.
Presumably I'm missing a very basic thing, since this seems a very simple
question: how can I run cephfs-top in my environment? It is not included
in the Docker image that "cephadm shell" uses.
And calling the version found in the source code always fails with "[errno
13] RADOS permission denied", even when using "--cluster" with the correct
ID, "--conffile" and "--id".
The auth user client.fstop exists, and "ceph fs perf stats" runs.
What am I missing?
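For completeness, this is how I understand the client.fstop user is supposed to be created per the cephfs-top docs, so I assume these are the caps it expects:

```shell
# Verify the existing fstop user's caps
ceph auth get client.fstop

# The docs create the user with read access to mon, mds, osd, and mgr
ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r'
```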
we run a ceph cluster in stretch mode with one pool. We know about this bug:
Can anyone tell me what happens when a pool gets to 100% full? At the moment raw OSD usage is about 54%, but Ceph throws a "POOL_BACKFILLFULL" error:
$ ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 63 TiB 29 TiB 34 TiB 34 TiB 54.19
TOTAL 63 TiB 29 TiB 34 TiB 34 TiB 54.19
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.mgr 1 1 415 MiB 105 1.2 GiB 0.04 1.1 TiB
vm_stretch_live 2 64 15 TiB 4.02M 34 TiB 95.53 406 GiB
So the pool warning / calculation looks like just a bug, because it thinks it's at 50% of the total size. I know Ceph will stop IO / set OSDs to read-only if they hit the "backfillfull_ratio" ... but what will happen if the pool gets to 100% full?
Will IO still be possible?
No limits / quotas are set on the pool ...
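For what it's worth, this is how I understand the relevant fullness thresholds can be inspected (the defaults are typically nearfull 0.85, backfillfull 0.90, full 0.95):

```shell
# Show the cluster-wide fullness thresholds from the OSD map
ceph osd dump | grep ratio
```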
I have a Ceph 16.2.12 cluster with uniform hardware, same drive make/model,
etc. A particular OSD is showing higher latency than usual in `ceph osd
perf`, usually mid to high tens of milliseconds while other OSDs show low
single digits, although its drive's I/O stats don't look different from
those of other drives. The workload is mainly random 4K reads and writes,
the cluster is being used as Openstack VM storage.
Is there a way to trace which particular PG, pool, and disk image or object
cause this OSD's excessive latency? Is there a way to tell Ceph to
I would appreciate any advice or pointers.
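So far the only idea I have is to look at slow operations on that specific daemon, assuming osd.N is the slow one (N is a placeholder):

```shell
# Dump recently completed slow operations on the suspect OSD
ceph daemon osd.N dump_historic_ops

# Also inspect operations currently in flight on that daemon
ceph daemon osd.N dump_ops_in_flight
```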
We have an RGW cluster running Luminous (12.2.11) that has one object with an extremely large OMAP database in the index pool. Running listomapkeys on the object initially returned 390 million keys. Through bilog trim commands, we've whittled that down to about 360 million. This is a bucket index for a regrettably unsharded bucket. There are only about 37K objects actually in the bucket, but through years of neglect, the bilog has grown completely out of control. We've hit some major problems trying to deal with this particular OMAP object. We just crashed 4 OSDs when a bilog trim caused enough churn to knock one of the OSDs housing this PG out of the cluster temporarily. The OSD disks are 6.4TB NVMe, but are split into 4 partitions, each housing its own OSD daemon (collocated journal).
We want to be rid of this large OMAP object, but are running out of options to deal with it. Resharding outright does not seem like a viable option, as we believe the deletion would deadlock OSDs and could cause impact. Continuing to run `bilog trim` 1000 records at a time is what we've done so far, but this also seems to be affecting performance/stability. We are seeking options to remove this problematic object without creating additional problems. It is quite likely this bucket is abandoned, so we could remove the data, but I fear even the deletion of such a large OMAP could bring OSDs down and risk metadata loss (the other bucket indexes on that same PG).
Any insight available would be highly appreciated.
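For context, this is roughly how we have been counting keys and trimming (the pool, object, and bucket names below are placeholders for ours):

```shell
# Count the omap keys on the bucket index object
# (this is where the ~360 million figure comes from)
rados -p .rgw.buckets.index listomapkeys .dir.<bucket-id> | wc -l

# Trim the bucket index log; we have been doing this in small batches
radosgw-admin bilog trim --bucket=<bucket-name>
```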