Hi all,
I have a problem exporting two different sub-folder CephFS kernel mounts via nfsd to the same IP address. The top-level structure on the CephFS is something like /A/S1 and /A/S2. On a file server I mount /A/S1 and /A/S2 as two different file systems under /mnt/S1 and /mnt/S2 using the CephFS kernel client. Then these two mounts are exported with lines like these in /etc/exports:
/mnt/S1 -options NET
/mnt/S2 -options IP
IP is an element of NET, meaning that the host at IP should be the only host able to access both /mnt/S1 and /mnt/S2. What we observe is that any attempt to mount the export /mnt/S1 on the host at IP results in /mnt/S2 being mounted instead.
My first guess was a clash of fsids: the CephFS simply reports the same fsid for both mounts to nfsd and, hence, nfsd thinks both mount points contain the same file system. So I modified the second export line to
/mnt/S2 -options,fsid=100 IP
to no avail. The two folders are completely disjoint, with neither symlinks nor hard links between them, so it should be safe to export them as two different file systems.
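For completeness, spelled out in the usual Linux /etc/exports syntax, what I have now looks roughly like this (NET, IP and the option list are placeholders here, and fsid=100 was picked arbitrarily):

```
/mnt/S1  NET(rw,sync,no_subtree_check)
/mnt/S2  IP(rw,sync,no_subtree_check,fsid=100)
```

followed by exportfs -ra on the file server - and the host at IP still gets /mnt/S2 when asking for /mnt/S1.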
Exporting such constructs to non-overlapping networks/IPs works as expected - even when exporting subdirectories of a directory (like exporting /A/B and /A/B/C from the same file server to strictly different IPs). It seems to be the same-IP configuration that breaks expectations.
Am I missing a magic -yes-i-really-know-what-i-am-doing hack here? The file server is on AlmaLinux release 8.7 (Stone Smilodon) and all ceph packages match the latest Octopus version running on our cluster.
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hello all,
I am seeing some weird behavior on my CephFS.
On May 29th I noticed a drop of 50 TB in my data pool. It has been followed by a decrease in space usage in the metadata pool since then.
Since May 29th, and still ongoing as I write, the metadata pool has lost 1 TB of its initial 1.8 TB.
Regarding the number of objects, it was 8.4 million and is now 7.8 million.
I assume that my users deleted a lot of files on that day (our CephFS consists of very small files of about 4 MB each).
But such a huge decrease in the metadata pool got me really concerned.
I thought that maybe it was MDS lazy deletion happening, but I am not sure.
Does anyone have any thoughts about this?
Do you know more about lazy deletion? There is not much documentation about it online.
Do you know of any way (a command, or a log file to search) to see the current lazy deletion operations?
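Is looking at the MDS perf counters the right place for this? Something like the following is what I had in mind (mds.<id> is a placeholder for the active MDS, run on its host, and I am not sure these are the right counters):

```
# purge queue counters on the active MDS
ceph daemon mds.<id> perf dump purge_queue
# stray-related counters in the MDS cache section
ceph daemon mds.<id> perf dump mds_cache
```

in the hope that the purge queue and stray counters reflect the ongoing deletions.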
I noticed that the read_ops, read_bytes, write_ops and write_bytes reported
by the command rados df detail are negative for the metadata pool.
My cluster is running Nautilus.
Any help would be appreciated,
Best regards,
Nate
Hi guys,
I have been awake for 36 hours trying to restore a broken Ceph pool (2 PGs incomplete).
My VMs are all broken; some boot, some don't...
I also have 5 removed disks with data from that pool "in my hands" - don't ask...
So my question: is it possible to restore the data from these removed disks and "add" it to the others for healing?
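Would something along these lines with ceph-objectstore-tool be the right direction? A sketch only (the paths and the PG id are placeholders, the source and target OSDs have to be stopped, and I have not verified these exact steps):

```
# export the incomplete PG from one of the removed disks (data path mounted read-only)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-OLD \
    --op export --pgid 2.1a --file /tmp/pg2.1a.export
# import it into a stopped OSD that is still part of the cluster
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-NEW \
    --op import --file /tmp/pg2.1a.export
```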
Best regards
Ben
Hi,
a cluster has ms_bind_msgr1 set to false in the config database.
Newly created MONs still listen on port 6789 and add themselves to the monmap as providing messenger v1.
How do I change that?
Shouldn't the MONs use the config for ms_bind_msgr1?
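For context, roughly what I am looking at (output omitted):

```
# the setting is present in the config database
ceph config get mon ms_bind_msgr1
# but the monmap still lists v1 (port 6789) addresses for the newly created MONs
ceph mon dump
```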
Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
Hi,
Is it possible to disable ACLs in favor of bucket policies (on a bucket or globally)?
The goal is to forbid users from using any bucket/object ACLs and to only allow bucket policies.
There seems to be no documentation on this that applies to Ceph RGW.
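The closest I can get to expressing the intent is a per-bucket policy that denies the ACL calls, something like the sketch below (standard S3 action names, aws CLI pointed at the RGW endpoint; the bucket name and endpoint are placeholders, and this is per bucket rather than the global switch I am really after):

```
cat > deny-acl.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Principal": {"AWS": ["*"]},
    "Action": ["s3:PutBucketAcl", "s3:PutObjectAcl"],
    "Resource": ["arn:aws:s3:::mybucket", "arn:aws:s3:::mybucket/*"]
  }]
}
EOF
aws --endpoint-url http://rgw.example.com:8080 s3api put-bucket-policy \
    --bucket mybucket --policy file://deny-acl.json
```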
Apologies if I am sending this to the wrong mailing list.
Regards,
Rasool
Hello.
I think I found a bug in cephadm/ceph orch:
Redeploying a container image (tested with alertmanager) after removing a custom `mgr/cephadm/container_image_alertmanager` value deploys the previous container image and not the default container image.
I'm running `cephadm` from ubuntu 22.04 pkg 17.2.5-0ubuntu0.22.04.3 and
`ceph` version 17.2.6.
Here is an example. Node clrz20-08 is the node alertmanager is running on, clrz20-01 is the node I'm controlling Ceph from:
* Get alertmanager version
```
root@clrz20-08:~# cephadm ls | jq '.[] | select(.service_name == "alertmanager") | .container_image_name'
"quay.io/prometheus/alertmanager:v0.23.0"
```
* Set alertmanager image
```
root@clrz20-01:~# ceph config set mgr mgr/cephadm/container_image_alertmanager quay.io/prometheus/alertmanager
root@clrz20-01:~# ceph config get mgr mgr/cephadm/container_image_alertmanager
quay.io/prometheus/alertmanager
```
* Redeploy alertmanager
```
root@clrz20-01:~# ceph orch redeploy alertmanager
Scheduled to redeploy alertmanager.clrz20-08 on host 'clrz20-08'
```
* Get alertmanager version
```
root@clrz20-08:~# cephadm ls | jq '.[] | select(.service_name == "alertmanager") | .container_image_name'
"quay.io/prometheus/alertmanager:latest"
```
* Remove alertmanager image setting, revert to default:
```
root@clrz20-01:~# ceph config rm mgr mgr/cephadm/container_image_alertmanager
root@clrz20-01:~# ceph config get mgr mgr/cephadm/container_image_alertmanager
quay.io/prometheus/alertmanager:v0.23.0
```
* Redeploy alertmanager
```
root@clrz20-01:~# ceph orch redeploy alertmanager
Scheduled to redeploy alertmanager.clrz20-08 on host 'clrz20-08'
```
* Get alertmanager version
```
root@clrz20-08:~# cephadm ls | jq '.[] | select(.service_name == "alertmanager") | .container_image_name'
"quay.io/prometheus/alertmanager:latest"
```
-> `mgr/cephadm/container_image_alertmanager` now reports the default `quay.io/prometheus/alertmanager:v0.23.0`, but redeploy still uses `quay.io/prometheus/alertmanager:latest`. This looks like a bug.
* Set alertmanager image explicitly to the default value
```
root@clrz20-01:~# ceph config set mgr mgr/cephadm/container_image_alertmanager quay.io/prometheus/alertmanager:v0.23.0
root@clrz20-01:~# ceph config get mgr mgr/cephadm/container_image_alertmanager
quay.io/prometheus/alertmanager:v0.23.0
```
* Redeploy alertmanager
```
root@clrz20-01:~# ceph orch redeploy alertmanager
Scheduled to redeploy alertmanager.clrz20-08 on host 'clrz20-08'
```
* Get alertmanager version
```
root@clrz20-08:~# cephadm ls | jq '.[] | select(.service_name == "alertmanager") | .container_image_name'
"quay.io/prometheus/alertmanager:v0.23.0"
```
-> Explicitly setting `mgr/cephadm/container_image_alertmanager` to the default value works around the issue.
Bests,
Daniel
Hi,
I usually install the SRPM and then build from ceph.spec like this:
rpmbuild -bb /root/rpmbuild/SPECS/ceph.spec --without ceph_test_package
But it takes a long time and produces many packages that I don't need. Is there a way to optimize this build process so that it only builds the packages I need, for example ceph-radosgw?
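For example, would building just the one target from the unpacked source tree be a reasonable substitute? A sketch of what I mean (I have not verified that radosgw is the exact target name in this release, and this obviously skips the RPM packaging step):

```
# inside the unpacked ceph source tree from the SRPM
./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo
cd build
make -j"$(nproc)" radosgw    # or: ninja radosgw
```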
Thanks.