Hi - this just started happening in the past few days on Ceph Pacific 16.2.4 deployed via cephadm (Podman containers)
The dashboard is returning
No active ceph-mgr instance is currently running the dashboard. A failover may be in progress. Retrying in 5 seconds...
And ceph status returns
  cluster:
    id:     fe3a7cb0-69ca-11eb-8d45-c86000d08867
    health: HEALTH_WARN
            Module 'dashboard' has failed dependency: cannot import name 'AuthManager'
            clock skew detected on mon.cube

  services:
    mon: 3 daemons, quorum story,cube,rhel1 (age 46h)
    mgr: cube.tvlgnp(active, since 47h), standbys: rhel1.zpzsjc, story.gffann
    mds: 2/2 daemons up, 1 standby
    osd: 13 osds: 13 up (since 46h), 13 in (since 46h)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   11 pools, 497 pgs
    objects: 1.50M objects, 2.1 TiB
    usage:   6.2 TiB used, 32 TiB / 38 TiB avail
    pgs:     497 active+clean

  io:
    client: 255 B/s rd, 2.7 KiB/s wr, 0 op/s rd, 0 op/s wr
The only thing that has happened on the cluster is that one of the servers was rebooted. No configuration changes were performed.
Any suggestions?
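Would forcing a mgr failover be a reasonable first step? A sketch of what I'm considering (assuming cephadm shell access; only the last command changes anything):

# list mgr modules and see whether the dashboard reports an error
ceph mgr module ls
# inspect the mgr daemons cephadm knows about
ceph orch ps | grep mgr
# fail over to a standby mgr to see if the dashboard recovers
ceph mgr fail cube.tvlgnp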
Thanks,
rob
Hi,
I am trying to connect my new Ceph cluster (Octopus) to my Kubernetes
system. To do so, I followed the setup guide from the official documentation:
https://docs.ceph.com/en/octopus/rbd/rbd-kubernetes/
The csi-rbdplugin-provisioner is running successfully on all my Kubernetes
worker nodes (as far as I can see).
Now I am trying to deploy the nginx example:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph
---
apiVersion: v1
kind: Pod
metadata:
  name: csi-rbd-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: ceph-pvc
        readOnly: false
A persistent volume is created:
$ kubectl get pv
....
pvc-5c64ed45-adde-4fe7-9b38-9d4c7a8f7d34   1Gi   RWO   Delete   Bound   default/ceph-pvc   ceph   7m15s
and also the persistent volume claim:
$ kubectl get pvc
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-pvc   Bound    pvc-5c64ed45-adde-4fe7-9b38-9d4c7a8f7d34   1Gi        RWO            ceph           8m14s
But the pod is not deployed, because of the following error message:
MountVolume.MountDevice failed for volume "pvc-5c64ed45-adde-4fe7-9b38-9d4c7a8f7d34" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name rbd.csi.ceph.com not found in the list of registered CSI drivers
Can someone help me understand the meaning of this error message?
Do I need to install something else? Maybe a ceph-fs plugin...?
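Is the node plugin perhaps not registered? A sketch of the checks I would run (the app label is my assumption based on the ceph-csi manifests):

# list the CSI drivers registered in the cluster
kubectl get csidrivers.storage.k8s.io
# check whether the rbd node-plugin pods are running on every worker
kubectl get pods -l app=csi-rbdplugin -o wide
# per-node driver registration
kubectl get csinodes -o yaml | grep -B2 -A2 rbd.csi.ceph.com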
Thanks for any help
===
Ralph
--
Hi all
We are getting a new Samba fileserver that will mount our CephFS and export some shares. What would be a good network setup for that server?
Should I configure two interfaces - one for the SMB share export towards our workstations and desktops, and one towards the Ceph cluster?
Or would it be "ok" for all traffic to be on one interface?
The server has 40G ports.
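If I went the two-interface route, I assume the Samba side would be pinned to the client-facing NIC roughly like this (a sketch; the interface name is a placeholder), while the CephFS mount would simply reach the mons over the other interface:

# /etc/samba/smb.conf (sketch; eth0 = client-facing NIC)
[global]
    interfaces = lo eth0
    bind interfaces only = yes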
Thanks for your suggestions and feedback. Regards, Götz
I cannot enable cephadm because it cannot find the remoto lib.
This happens even after I installed it using "pip3 install remoto", and then installed it
from the deb package built from the git sources at
https://github.com/alfredodeza/remoto/
If I type "import remoto" in a python3 prompt, it works.
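Could this be an interpreter mismatch, i.e. the mgr running under a different Python than the one pip3 installed into? A sketch of how one might check (the systemd unit name is my assumption for a package-based install):

# where did pip3 put remoto, and which interpreter resolves it?
pip3 show remoto
python3 -c "import remoto; print(remoto.__file__)"
# look for the import error the mgr itself logs (unit name may differ)
journalctl -u ceph-mgr@$(hostname -s) | grep -i remoto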
--
Alfrenovsky
I am running the ceph-ansible playbook to install Ceph stable-6.0
(Pacific).
When running the sample yml file supplied by the GitHub repo, it runs
fine up until the "ceph-mon : check if monitor initial keyring already
exists" step, where it hangs for 30-40 minutes before failing.
From my understanding, ceph-ansible should create this keyring and
use it for communication between monitors, so does anyone know why the
playbook would have a hard time with this step?
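Would rerunning with more verbosity be the right way to see what the task is actually waiting on? A sketch (the keyring path is my assumption based on standard mon data dirs):

# re-run with high verbosity to see the exact command the task executes
ansible-playbook -vvv -i hosts site.yml
# on a mon host, test whether the mon responds at all
ceph --connect-timeout 10 --name mon. \
    --keyring /var/lib/ceph/mon/ceph-$(hostname -s)/keyring status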
Thanks in advance!
Hi
We're running Ceph Nautilus 14.2.21 (going to latest Octopus in a few
weeks) as the volume and instance backend for our OpenStack VMs. Our
clusters run somewhere between 500 - 1000 OSDs on SAS HDDs with NVMes
as journal and DB devices.
Currently we do not have our VMs capped on IOPS and throughput. We
regularly get slow-ops warnings (once or twice per day) and wonder
whether there are other users with roughly the same setup who do
throttle their OpenStack VMs.
- What kind of numbers are used in the field for IOPS and throughput
limiting? (A sketch of the kind of setup I mean follows below.)
- As a side question, is there an easy way to get rid of the slow-ops
warning besides restarting the involved OSD? Otherwise the warning seems
to stay forever.
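On the first question, the kind of throttling I have in mind is front-end QoS via Cinder volume types, roughly like this (names and values are hypothetical):

# create a QoS spec enforced on the hypervisor side (front-end)
openstack volume qos create hdd-capped --consumer front-end \
    --property total_iops_sec=500 --property total_bytes_sec=104857600
# attach it to an existing volume type (name is hypothetical)
openstack volume qos associate hdd-capped hdd-volumes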
Regards
Marcel
Hello,
I have installed and bootstrapped a Ceph manager node via cephadm with the
options:
--initial-dashboard-user admin --initial-dashboard-password [PASSWORD] --dashboard-password-noupdate
Everything works fine. I also have the Grafana board to monitor my
cluster. But access to Grafana is open for anonymous users because
of the grafana.ini template with the option:
[auth.anonymous]
enabled = true
I can't figure out how to tweak the default grafana.ini file. Can
someone tell me how to do this?
I tried to do this with the commands:
# ceph config-key set mgr/cephadm/services/grafana/grafana.ini \
-i /tmp//grafana.ini.j2
# ceph orch reconfig grafana
But it had no effect. I also do not really understand where I should
place the grafana.ini file on my host.
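For completeness, the change I am trying to get into the rendered grafana.ini is just this (sketch), and I wonder whether a redeploy rather than a reconfig is needed to re-render the template (that part is my assumption):

# desired content in the custom template
[auth.anonymous]
enabled = false

# force a redeploy so the new template is rendered
# ceph orch redeploy grafana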
Thanks for any help
===
Ralph
--
Hi,
we are currently running into an issue where an rbd ls for a namespace returns ENOENT for some of the images in that namespace.
/usr/bin/rbd --conf=XXX --id XXX ls 'mypool/28ef9470-76eb-4f77-bc1b-99077764ff7c' -l --format=json
2021-06-09 11:03:34.916 7f2225ffb700 -1 librbd::io::AioCompletion: 0x55cacccc2390 fail: (2) No such file or directory
2021-06-09 11:03:34.916 7f2225ffb700 -1 librbd::io::AioCompletion: 0x55caccd2b920 fail: (2) No such file or directory
2021-06-09 11:03:34.920 7f2225ffb700 -1 librbd::io::AioCompletion: 0x55caccd9b4e0 fail: (2) No such file or directory
rbd: error opening 34810ac2-3112-4fef-938c-b76338b0eeaf.raw: (2) No such file or directory
rbd: error opening c9882583-6dd5-4eca-bb82-3e81f7d63fa9.raw: (2) No such file or directory
rbd: error opening 5d5251d1-f017-4382-845c-65e504683742.raw: (2) No such file or directory
2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 0x55cacce07b00 fail: (2) No such file or directory
rbd: error opening c625b898-ec34-4446-9455-d2b70d9e378f.raw: (2) No such file or directory
2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 0x55caccd7cce0 fail: (2) No such file or directory
rbd: error opening 990c4bbe-6a7b-4adf-aab8-432e18d79e58.raw: (2) No such file or directory
2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 0x55cacce336f0 fail: (2) No such file or directory
rbd: error opening 7382eb5b-a3eb-41e2-89b6-512f7b1d86c0.raw: (2) No such file or directory
[{"image":"108600c6-2312-4d61-9f5b-35b351112512.raw","size":31457280000,"format":2,"lock_type":"exclusive"},{"image":"1292ef0c-2333-44f1-be30-39105f7d176e.raw","size":262149242880,"format":2,"lock_type":"exclusive"},{"image":"8cda5c3f-cdbd-42f4-918f-1480354e7965.raw","size":262149242880,"format":2,"lock_type":"exclusive"}]
rbd: listing images failed: (2) No such file or directory
This state was triggered when the images that now show "No such file or directory" were deleted with rbd rm, but the operation was interrupted (the rbd process was killed) due to a timeout.
What is the best way to recover from this, and how do we properly clean up?
Release is nautilus 14.2.20
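Would simply re-running the removal be the right approach? A sketch of what I have in mind (names taken from the output above; whether rbd rm resumes a partial delete is my assumption):

# retry the interrupted delete for one of the affected images
rbd --conf=XXX --id XXX rm \
    'mypool/28ef9470-76eb-4f77-bc1b-99077764ff7c/34810ac2-3112-4fef-938c-b76338b0eeaf.raw'
# if that still fails, inspect the leftover objects in the namespace
rados --conf=XXX --id XXX -p mypool --namespace 28ef9470-76eb-4f77-bc1b-99077764ff7c ls | grep 34810ac2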
Thanks,
Peter
Hello,
I replaced an OSD disk on one of my Nautilus OSD nodes, which created a new OSD number. Now ceph shows one cephadm stray daemon (the old osd.1 which I replaced), which I can't remove, as you can see below:
# ceph health detail
HEALTH_WARN 1 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
stray daemon osd.1 on host ceph1e not managed by cephadm
# ceph orch daemon rm osd.1 --force
Error EINVAL: Unable to find daemon(s) ['osd.1']
Is there another command I am missing?
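Would the right cleanup be to purge the stale OSD id from the cluster maps instead? A sketch of what I'm considering (assuming osd.1 is fully down and out; purge is destructive):

# remove the stale id from crush, auth and the osd map in one step
ceph osd purge 1 --yes-i-really-mean-it
# then check cephadm's inventory for that host again
ceph orch ps ceph1e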
Best regards,
Mabi