Hi all,
I'm fairly new to Ceph and I'm understanding, day by day, why the
official support is so expensive :)
I'm setting up a Ceph NFS cluster; the recipe I followed can be found
here below.
#######################
--> cluster creation
cephadm bootstrap --mon-ip 10.20.20.81 --cluster-network 10.20.20.0/24 --fsid $FSID --initial-dashboard-user adm \
  --initial-dashboard-password 'Hi_guys' --dashboard-password-noupdate --allow-fqdn-hostname --ssl-dashboard-port 443 \
  --dashboard-crt /etc/ssl/wildcard.it/wildcard.it.crt --dashboard-key /etc/ssl/wildcard.it/wildcard.it.key \
  --allow-overwrite --cleanup-on-failure
cephadm shell --fsid $FSID -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
cephadm add-repo --release reef && cephadm install ceph-common
--> adding hosts and setting labels
for IP in $(grep ceph /etc/hosts | awk '{print $1}') ; do ssh-copy-id -f -i /etc/ceph/ceph.pub root@$IP ; done
ceph orch host add cephstage01 10.20.20.81 --labels _admin,mon,mgr,prometheus,grafana
ceph orch host add cephstage02 10.20.20.82 --labels _admin,mon,mgr,prometheus,grafana
ceph orch host add cephstage03 10.20.20.83 --labels _admin,mon,mgr,prometheus,grafana
ceph orch host add cephstagedatanode01 10.20.20.84 --labels osd,nfs,prometheus
ceph orch host add cephstagedatanode02 10.20.20.85 --labels osd,nfs,prometheus
ceph orch host add cephstagedatanode03 10.20.20.86 --labels osd,nfs,prometheus
--> network setup and daemon deployment
ceph config set mon public_network 10.20.20.0/24,192.168.7.0/24
ceph orch apply mon --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
ceph orch apply mgr --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
ceph orch apply prometheus --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"
ceph orch apply grafana --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"
ceph orch apply node-exporter
ceph orch apply alertmanager
ceph config set mgr mgr/cephadm/secure_monitoring_stack true
--> disks and osd setup
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ssh root@$IP "hostname && wipefs -a -f /dev/sdb && wipefs -a -f /dev/sdc" ; done
ceph config set mgr mgr/cephadm/device_enhanced_scan true
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch device ls --hostname=$IP --wide --refresh ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch device zap $IP /dev/sdb ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch device zap $IP /dev/sdc ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch daemon add osd $IP:/dev/sdb ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch daemon add osd $IP:/dev/sdc ; done
--> ganesha nfs cluster
ceph mgr module enable nfs
ceph fs volume create vol1
ceph nfs cluster create nfs-cephfs "cephstagedatanode01,cephstagedatanode02,cephstagedatanode03" --ingress --virtual-ip 192.168.7.80 --ingress-mode default
ceph nfs export create cephfs --cluster-id nfs-cephfs --pseudo-path /mnt --fsname vol1
--> nfs mount
mount -t nfs -o nfsvers=4.1,proto=tcp 192.168.7.80:/mnt /mnt/ceph
Is my recipe correct?
The cluster is made up of 3 mon/mgr nodes and 3 osd/nfs nodes; in each of
the latter I installed one 3 TB SSD for the data and one 300 GB SSD for
the journaling.
My problems are:
- Although I can mount the export, I can't write to it
- I can't understand how to use the sdc disks for journaling (see the
sketch right after this list)
- I can't understand the concept of "pseudo path" (see the note after
the JSON below)
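For the second point, a rough and untested sketch of a drivegroup-style
OSD service spec that would put the data on /dev/sdb and the RocksDB/WAL
on /dev/sdc instead of creating a standalone OSD on sdc; the service_id
and file name are arbitrary, the "osd" label is the one set in the recipe
above:
cat > osd_spec.yaml <<EOF
service_type: osd
service_id: data-on-sdb-db-on-sdc
placement:
  label: osd
spec:
  data_devices:
    paths:
      - /dev/sdb
  db_devices:
    paths:
      - /dev/sdc
EOF
ceph orch apply -i osd_spec.yaml --dry-run   # preview the proposed OSD layout before applying it for real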
Here below you can find the JSON output of the export.
--> check
ceph nfs export ls nfs-cephfs
ceph nfs export info nfs-cephfs /mnt
------------------------------------
json file
---------
{
  "export_id": 1,
  "path": "/",
  "cluster_id": "nfs-cephfs",
  "pseudo": "/mnt",
  "access_type": "RW",
  "squash": "none",
  "security_label": true,
  "protocols": [
    4
  ],
  "transports": [
    "TCP"
  ],
  "fsal": {
    "name": "CEPH",
    "user_id": "nfs.nfs-cephfs.1",
    "fs_name": "vol1"
  },
  "clients": []
}
------------------------------------
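On the "pseudo path" question, as far as I understand it: in the export
above, "path" is the directory inside the CephFS volume that is actually
exported (here "/", i.e. the root of vol1), while "pseudo" is the name
under which the export appears in the NFSv4 namespace, i.e. what clients
mount. A rough sketch of a second export that would make the difference
visible, assuming a subdirectory /projects already exists inside vol1:
ceph nfs export create cephfs --cluster-id nfs-cephfs --pseudo-path /projects --fsname vol1 --path /projects
mount -t nfs -o nfsvers=4.1,proto=tcp 192.168.7.80:/projects /mnt/projects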
Thanks in advance
Rob
Hi,
We are testing rbd-mirroring. There seems to be a permission error with
the rbd-mirror user. Using this user to query the mirror pool status gives:
failed to query services: (13) Permission denied
And results in the following output:
health: UNKNOWN
daemon health: UNKNOWN
image health: OK
images: 3 total
    2 replaying
    1 stopped
The command used was: rbd --id rbd-mirror mirror pool status rbd
So basically, the health and daemon health cannot be obtained due to
permission errors, but the image status can.
When the command is run with admin permissions the health and daemon
health are returned without issue.
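For reference, the caps of the two users can be compared like this
(client.rbd-mirror is the id used in the command above):
ceph auth get client.rbd-mirror
ceph auth get client.admin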
I tested this on Reef 18.2.2.
Is this expected behavior? If not, I will create a tracker ticket for it.
Gr. Stefan
Hi,
On quay.io I can find a lot of Grafana versions for Ceph (https://quay.io/repository/ceph/grafana?tab=tags). How can I find out which version should be used when I upgrade my cluster to 17.2.x? Can I simply take the latest Grafana version, or is there a specific Grafana version I need to use?
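For reference, the image that cephadm deploys appears to be controlled by
an mgr option; a sketch of how to check and override it (the option name
is from recent cephadm releases, and <tag> is just a placeholder, not a
recommendation):
ceph config get mgr mgr/cephadm/container_image_grafana
ceph config set mgr mgr/cephadm/container_image_grafana quay.io/ceph/grafana:<tag>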
Hi,
I'm trying to estimate the possible impact when large PGs are
split. Here's one example of such a PG:
PG_STAT  OBJECTS  BYTES         OMAP_BYTES*  OMAP_KEYS*  LOG   DISK_LOG  UP
86.3ff   277708   414403098409  0            0           3092  3092      [187,166,122,226,171,234,177,163,155,34,81,239,101,13,117,8,57,111]
Their main application is RGW on EC (currently 1024 PGs on 240 OSDs),
8TB HDDs backed by SSDs. There are 6 RGWs running behind HAProxies. It
took me a while to convince them to do a PG split and now they're
trying to assess how big the impact could be. The fullest OSD is
already at 85% usage, the least filled one at 59%, so there is
definitely room for better balancing, which will be necessary until
the new hardware arrives. The current distribution is around 100 PGs
per OSD, which usually would be fine, but since the PGs are that large,
a difference of only a few PGs has a huge impact on OSD utilization.
I'm targeting 2048 PGs for that pool for now, and will probably do
another split when the new hardware has been integrated.
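As a rough sketch, the split itself would be started with something like
the following (pool name shortened to <pool>; target_max_misplaced_ratio
is, as far as I know, the mgr knob that limits how much data is allowed
to be misplaced at once, default 0.05):
ceph osd pool set <pool> pg_num 2048
ceph config set mgr target_max_misplaced_ratio 0.05
ceph osd pool ls detail | grep <pool>    # watch pg_num approach pg_num_target over time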
Any comments are appreciated!
Eugen
Colleagues, thank you for the advice to check the operability of the MGRs. In fact, it is also strange: we checked our nodes for network issues (IP connectivity, sockets, ACLs, DNS) and found nothing wrong, but suddenly just restarting all MGRs solved the problem with the stale PGs and the hanging ceph commands!
So, we are back at the starting point: ceph is working, except that the MDS daemons crash. But now we see some additional errors in the MDS logs when trying to start the daemon:
dir 0x1000dd10fa0 object missing on disk; some files may be lost (/volumes/csi/csi-vol-2eb40f89-f2e1-11ee-b657-3aa98da4c4a6/1080803d-1277-4ad8-ae80-a004bd3a5699/gallery/pc-12083932925583528732)
dir 0x1000dd10f9d object missing on disk; some files may be lost (/volumes/csi/csi-vol-2eb40f89-f2e1-11ee-b657-3aa98da4c4a6/1080803d-1277-4ad8-ae80-a004bd3a5699/cadserver-filevault/project-files/661fb14d341d3746ea5c2a8f
I promised to create the bug report, so I will do that a bit later. But should I also try to do something more on my side? What I did exactly last time:
cephfs-journal-tool journal reset
cephfs-table-tool all reset session
cephfs-data-scan scan_extents
cephfs-data-scan scan_inodes
cephfs-data-scan scan_links
cephfs-data-scan cleanup
And one more question: is it possible to access the CephFS content directly, without an MDS?
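On that last question, as far as I understand it: the file data are plain
RADOS objects in the CephFS data pool, named <inode-in-hex>.<stripe-offset>,
so they can be read without an MDS, but the file names and the directory
tree only exist in the metadata managed by the MDS. A rough sketch,
assuming the data pool is called cephfs_data:
rados -p cephfs_data ls | head
rados -p cephfs_data get <inode-hex>.00000000 /tmp/first_chunk    # <inode-hex> is a placeholder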
Hi,
We're testing with rbd-mirror (snapshot mode) and are trying to get status
updates about snapshots as fast as possible. We want to use rbd-mirror as
a migration tool between two clusters and keep the downtime during the
migration as short as possible. Therefore we have tuned the following
parameters and set them to 1 second (default 30 seconds):
rbd_mirror_pool_replayers_refresh_interval
rbd_mirror_image_state_check_interval
rbd_mirror_sync_point_update_age
However, on the destination cluster, the "last_update:" field is only
updated every 30 seconds. Is this tunable?
The goal is to determine when the last snapshot made on the source
has made it to the target, so that a demote (source) and promote (target)
can be initiated.
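A sketch of what could be polled per image on the target instead of the
whole pool (pool/image names are placeholders, and the exact JSON layout
may differ between releases):
rbd mirror image status <pool>/<image> --format json
On the source, a mirror snapshot can also be triggered on demand instead
of waiting for the schedule:
rbd mirror image snapshot <pool>/<image>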
Gr. Stefan
On 4/26/24 15:47, Vahideh Alinouri wrote:
> The result of this command shows one of the servers in the cluster,
> but I have node-exporter daemons on all servers.
The default service specification looks like this:
service_type: node-exporter
service_name: node-exporter
placement:
  host_pattern: '*'
If you apply this YAML code the orchestrator should deploy one
node-exporter daemon to each host of the cluster.
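A sketch of how this spec could be applied, assuming it is saved to a
file named node-exporter.yaml:
ceph orch apply -i node-exporter.yaml
or, equivalently for this simple case, directly on the command line:
ceph orch apply node-exporter '*'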
Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
Hi,
Similar case as the previously fixed https://tracker.ceph.com/issues/48382 - https://github.com/ceph/ceph/pull/47308.
Confirmed on cephadm-deployed Ceph 18.2.2/17.2.7 with OpenStack Antelope/Yoga.
I'm getting a "404 NoSuchBucket" error with public buckets enabled via the Swift/Keystone integration - everything else works fine.
With rgw_swift_account_in_url = true and proper endpoints ("https://rgw.test/swift/v1/AUTH_%(project_id)s"),
ticking public access in Horizon properly sets the ACL on the bucket according to the swift client:
swift -v stat test-bucket
URL: https://rgw.test/swift/v1/AUTH_daksjhdkajdshda/testbucket
Auth Token:
Account: AUTH_daksjhdkajdshda
Container: testbucket
Objects: 1
Bytes: 1021036
Read ACL: .r:*,.rlistings
Write ACL:
Sync To:
Sync Key:
X-Timestamp: 1710947159.41219
X-Container-Bytes-Used-Actual: 1024000
X-Storage-Policy: default-placement
X-Storage-Class: STANDARD
Last-Modified: Thu, 21 Mar 2024 10:30:05 GMT
X-Trans-Id: tx00000092ac12312312312-1231231231-1701e5-default
X-Openstack-Request-Id: tx00000092ac12312312312-1231231231-1701e5-default
Accept-Ranges: bytes
Content-Type: text/plain; charset=utf-8
However, I am still getting the 404 NoSuchBucket error.
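A sketch of how the anonymous access could be tested, using the URL from
the stat output above (<object> is a placeholder for an object name in
the bucket):
curl -i https://rgw.test/swift/v1/AUTH_daksjhdkajdshda/testbucket/<object>
which returns 404 NoSuchBucket instead of the object.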
Could someone using the latest version of Ceph with Swift/Keystone integration please test public buckets? Thank you.
Best regards,
Bartosz Bezak