May 2021 - ceph-users - lists.ceph.io

CephFS Subvolume Snapshot data corruption?

by Andras Sali

Hi, We experienced a strange issue with a CephFS snapshot becoming partially unreadable. The snapshot was created about 2 months ago and we started a read operation from it. For a while everything was working fine with all directories accessible, however after some point clients (FUSE, v15.2.9) started complaining about I/O error on directories that were working fine previously. When listing the top level contents of the snapshot, the directories that show I/O error (e.g. home) are reporting missing metadata when listing via 'ls': d????????? ? ? ? ? ? home Creating new snapshots is working properly and otherwise the whole cluster (Ceph v15.2.9) is reporting a healthy status. Has anyone experienced an issue like this before? We tried to restart the MDS servers, however this didn't solve the issue. As additional context, during the read time from the snapshot other snaptrim operations (for other subvolumes) were ongoing, but not for the one affected by the data error. Any insight into what might cause this and how to avoid / recover from such a situation would be much appreciated. Thank you and kind regards, Andras

2 years, 11 months

1
0
0 0

one ODS out-down after upgrade to v16.2.3

by Milosz Szewczak

Hi, I updated cluster as usual to newest version 16.2.3 from 16.2.1 ceph orch upgrade start --ceph-version 16.2.3 all went fine.... except on of OSD.54 stop working on host i see in logs ( after try to manual start ) ``` root@ceph-nvme01:/var/lib/ceph/77fc6eb4-7146-11eb-aa58-55847fcdb1f1/osd.54# /usr/bin/docker run --rm --ipc=host --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init --name ceph-77fc6eb4-7146-11eb-aa58-55847fcdb1f1-osd.54-activate -e CONTAINER_IMAGE=docker.io/ceph/ceph@sha256:c820cef23fb93518d5b35683d6301bae36511e52e0f8cd1495fd58805b849383 -e NODE_NAME=ceph-nvme01 -e CEPH_USE_RANDOM_NONCE=1 -v /var/run/ceph/77fc6eb4-7146-11eb-aa58-55847fcdb1f1:/var/run/ceph:z -v /var/log/ceph/77fc6eb4-7146-11eb-aa58-55847fcdb1f1:/var/log/ceph:z -v /var/lib/ceph/77fc6eb4-7146-11eb-aa58-55847fcdb1f1/crash:/var/lib/ceph/crash:z -v /var/lib/ceph/77fc6eb4-7146-11eb-aa58-55847fcdb1f1/osd.54:/var/lib/ceph/osd/ceph-54:z -v /var/lib/ceph/77fc6eb4-7146-11eb-aa58-55847fcdb1f1/osd.54/config:/etc/ceph/ceph.conf:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm docker.io/ceph/ceph@sha256:c820cef23fb93518d5b35683d6301bae36511e52e0f8cd1495fd58805b849383 lvm activate 54 07b257ee-bb09-4c4e-9f70-9f0da56253c7 --no-systemd Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-54 Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-e5ddebd9-52f5-481d-9470-fb2c94fe79e3/osd-block-07b257ee-bb09-4c4e-9f70-9f0da56253c7 --path /var/lib/ceph/osd/ceph-54 --no-mon-config stderr: failed to read label for /dev/ceph-e5ddebd9-52f5-481d-9470-fb2c94fe79e3/osd-block-07b257ee-bb09-4c4e-9f70-9f0da56253c7: (2) No such file or directory --> RuntimeError: command returned non-zero exit status: 1 ``` ----- ``` root@ceph-nvme01:/var/lib/ceph/77fc6eb4-7146-11eb-aa58-55847fcdb1f1/osd.54# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT loop0 7:0 0 55.5M 1 loop /snap/core18/1988 loop2 7:2 0 70.4M 1 loop /snap/lxd/19647 loop3 7:3 0 55.5M 1 loop /snap/core18/1997 loop4 7:4 0 32.3M 1 loop /snap/snapd/11588 loop5 7:5 0 32.3M 1 loop /snap/snapd/11402 loop6 7:6 0 67.6M 1 loop /snap/lxd/20326 sda 8:0 1 447.1G 0 disk ├─sda1 8:1 1 512M 0 part └─sda2 8:2 1 445G 0 part └─md0 9:0 0 445G 0 raid1 / sdb 8:16 1 447.1G 0 disk ├─sdb1 8:17 1 512M 0 part /boot/efi └─sdb2 8:18 1 445G 0 part └─md0 9:0 0 445G 0 raid1 / nvme1n1 259:0 0 2.9T 0 disk └─ceph--257763b1--ef44--43e8--a8a7--f046f5a69fd4-osd--block--1086e697--7949--417b--9ce2--1459db52bbe2 253:0 0 2.9T 0 lvm nvme0n1 259:1 0 2.9T 0 disk └─ceph--7488e6b6--5883--4208--8951--0e3e5cc1c903-osd--block--aea644af--7a01--49c9--8169--cb6ec24d8b78 253:1 0 2.9T 0 lvm nvme2n1 259:5 0 2.9T 0 disk └─ceph--de73e328--c694--4f46--9a5f--0a7e356cab85-osd--block--b3a93422--57c0--4be3--ad61--836bb7763fae 253:4 0 2.9T 0 lvm nvme4n1 259:6 0 2.9T 0 disk └─ceph--08d2cbac--cb88--40cf--ad6e--88fc5ca3491c-osd--block--35ec77e0--db35--4a43--bb25--4666ce0ea756 253:2 0 2.9T 0 lvm nvme3n1 259:7 0 2.9T 0 disk └─ceph--f28ba826--b16c--4e4e--b0de--f556274bbc07-osd--block--399337e3--f2f8--4561--8e02--2e9830b31ec5 253:3 0 2.9T 0 lvm nvme7n1 259:8 0 2.9T 0 disk └─ceph--0c92fca4--5ff6--49f6--b994--694253ac9225-osd--block--973a39c9--f330--46f4--afc9--fb11f2862d5d 253:9 0 2.9T 0 lvm nvme10n1 259:9 0 2.9T 0 disk └─ceph--e5ddebd9--52f5--481d--9470--fb2c94fe79e3-osd--block--07b257ee--bb09--4c4e--9f70--9f0da56253c7 253:6 0 2.9T 0 lvm nvme6n1 259:10 0 2.9T 0 disk └─ceph--caf0ef69--b335--4b6e--a560--ecd4cd74fc9b-osd--block--f0bfe9f0--26f8--4dd4--95eb--3ef5a728cbf9 253:8 0 2.9T 0 lvm nvme8n1 259:11 0 2.9T 0 disk └─ceph--a4da070c--9db2--408e--9fae--b7b61023177f-osd--block--66b6ce4e--8ce3--4960--980d--e617c7378763 253:10 0 2.9T 0 lvm nvme9n1 259:12 0 2.9T 0 disk └─ceph--21562262--bc8b--4002--8575--7b6729ffef84-osd--block--9c427785--3dbe--4def--ab30--a19579fa63ce 253:11 0 2.9T 0 lvm nvme11n1 259:13 0 2.9T 0 disk └─ceph--2de30cc9--9d2a--4f61--818e--ce91a90c3489-osd--block--c4bc16fb--a645--42a6--a2d2--b09c812843e7 253:7 0 2.9T 0 lvm nvme5n1 259:15 0 2.9T 0 disk └─ceph--aebd9d97--b693--404e--9a81--55df782a15af-osd--block--2fffb8f0--f41d--423c--becd--88339631de13 253:5 0 2.9T 0 lvm ``` Additional i saw in GUI that all other OSD have *ceph_version* ceph version 16.2.3 (381b476cb3900f9a92eb95d03b4850b953cfd79a) pacific (stable) but failied one have *ceph_version* ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable) so look like that it didn't updated well. How to fix this error ? Milosz

2 years, 11 months

1
0
0 0

Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

by Boris Behrens

Hi Amit, I just pinged the mons from every system and they are all available. Am Mo., 10. Mai 2021 um 21:18 Uhr schrieb Amit Ghadge <amitg.b14(a)gmail.com>: > We seen slowness due to unreachable one of them mgr service, maybe here > are different, you can check monmap/ ceph.conf mon entry and then verify > all nodes are successfully ping. > > > -AmitG > > > On Tue, 11 May 2021 at 12:12 AM, Boris Behrens <bb(a)kervyn.de> wrote: > >> Hi guys, >> >> does someone got any idea? >> >> Am Mi., 5. Mai 2021 um 16:16 Uhr schrieb Boris Behrens <bb(a)kervyn.de>: >> >> > Hi, >> > since a couple of days we experience a strange slowness on some >> > radosgw-admin operations. >> > What is the best way to debug this? >> > >> > For example creating a user takes over 20s. >> > [root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user >> > --display-name=test-bb-user >> > 2021-05-05 14:08:14.297 7f6942286840 1 robust_notify: If at first you >> > don't succeed: (110) Connection timed out >> > 2021-05-05 14:08:14.297 7f6942286840 0 ERROR: failed to distribute >> cache >> > for eu-central-1.rgw.users.uid:test-bb-user >> > 2021-05-05 14:08:24.335 7f6942286840 1 robust_notify: If at first you >> > don't succeed: (110) Connection timed out >> > 2021-05-05 14:08:24.335 7f6942286840 0 ERROR: failed to distribute >> cache >> > for eu-central-1.rgw.users.keys:**** >> > { >> > "user_id": "test-bb-user", >> > "display_name": "test-bb-user", >> > .... >> > } >> > real 0m20.557s >> > user 0m0.087s >> > sys 0m0.030s >> > >> > First I thought that rados operations might be slow, but adding and >> > deleting objects in rados are fast as usual (at least from my >> perspective). >> > Also uploading to buckets is fine. >> > >> > We changed some things and I think it might have to do with this: >> > * We have a HAProxy that distributes via leastconn between the 3 >> radosgw's >> > (this did not change) >> > * We had three times a daemon with the name "eu-central-1" running (on >> the >> > 3 radosgw's) >> > * Because this might have led to our data duplication problem, we have >> > split that up so now the daemons are named per host (eu-central-1-s3db1, >> > eu-central-1-s3db2, eu-central-1-s3db3) >> > * We also added dedicated rgw daemons for garbage collection, because >> the >> > current one were not able to keep up. >> > * So basically ceph status went from "rgw: 1 daemon active >> (eu-central-1)" >> > to "rgw: 14 daemons active (eu-central-1-s3db1, eu-central-1-s3db2, >> > eu-central-1-s3db3, gc-s3db12, gc-s3db13...) >> > >> > >> > Cheers >> > Boris >> > >> >> >> -- >> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im >> groÃƒ¼en Saal. >> _______________________________________________ >> ceph-users mailing list -- ceph-users(a)ceph.io >> To unsubscribe send an email to ceph-users-leave(a)ceph.io >> > -- Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groÃƒ¼en Saal.

2 years, 11 months

1
0
0 0

Building ceph clusters with 8TB SSD drives?

by Matt Larson

Is anyone trying Ceph clusters containing larger (4-8TB) SSD drives? 8TB SSDs are described here ( https://www.anandtech.com/show/16136/qlc-8tb-ssd-review-samsung-870-qvo-sab… ) and make use QLC NAND flash memory to reach the costs and capacity. Currently, the 8TB Samsung 870 SSD is $800/ea at some online retail stores. SATA form-factor SSDs can reach read/write rates of 560/520 MB/s, while not as great as nVME drives is still a multiple faster than 7200 RPM drives. SSDs now appear to have much lower failure rates than HDs in 2021 ( https://www.techspot.com/news/89590-backblaze-latest-storage-reliability-fi… ). Are there any major caveats to considering working with larger SSDs for data pools? Thanks, Matt -- Matt Larson, PhD Madison, WI 53705 U.S.A.

2 years, 12 months

5
4
0 0

How to deploy ceph with ssd ?

by codignotto

I'm deploying 6 ceph servers with 128GB of memory each, 12 SSDs of 1 Tb on each server, 10Gb network cards connected to 10Gb port switches. I'm following this documentation https://docs.ceph.com/en/octopus/cephadm/install/ But I don't know if this is the best way to get the most out of the disks, I will use it with RBD only and deliver it to a proxmox cluster. Do you have any more complete documentation? Some tuning tips for the best SSD speed process? Many Tks

2 years, 12 months

1
0
0 0

v16.2.2 Pacific released

by David Galloway

This is the second backport release in the Pacific stable series. For a detailed release notes with links & changelog please refer to the official blog entry at https://ceph.io/releases/v16-2-2-pacific-released Notable Changes --------------- * Cephadm now supports an *ingress* service type that provides load balancing and HA (via haproxy and keepalived on a virtual IP) for RGW service. The experimental *rgw-ha* service has been removed. Getting Ceph ------------ * Git at git://github.com/ceph/ceph.git * Tarball at http://download.ceph.com/tarballs/ceph-16.2.2.tar.gz * For packages, see http://docs.ceph.com/docs/master/install/get-packages/ * Release git sha1: e8f22dde28889481f4dda2beb8a07788204821d3

2 years, 12 months

4
3
0 0

Host crash undetected by ceph health check

by Frank Schilder

Dear cephers, today it seems I observed an impossible event for the first time: an OSD host crashed, but the ceph health monitoring did not recognise the crash. Not a single OSD was marked down and IO simply stopped, waiting for the crashed OSDs to respond. All that was reported was slow ops, slow meta data IO, MDS behind on trimming, but no OSD fail. I have rebooted these machines a lot of times and have never seen the health check fail to recognise that instantly. The only difference I see is that these were clean shut-downs, not crashes (I believe the OSDs mark themselves as down). For debugging this problem, can anyone provide me with a pointer when this could be the result of a misconfiguration? Thanks and best regards, ================= Frank Schilder AIT Risø Campus Bygning 109, rum S14

2 years, 12 months

1
1
0 0

radosgw-admin user create takes a long time (with failed to distribute cache message)

by Boris Behrens

Hi, since a couple of days we experience a strange slowness on some radosgw-admin operations. What is the best way to debug this? For example creating a user takes over 20s. [root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user --display-name=test-bb-user 2021-05-05 14:08:14.297 7f6942286840 1 robust_notify: If at first you don't succeed: (110) Connection timed out 2021-05-05 14:08:14.297 7f6942286840 0 ERROR: failed to distribute cache for eu-central-1.rgw.users.uid:test-bb-user 2021-05-05 14:08:24.335 7f6942286840 1 robust_notify: If at first you don't succeed: (110) Connection timed out 2021-05-05 14:08:24.335 7f6942286840 0 ERROR: failed to distribute cache for eu-central-1.rgw.users.keys:**** { "user_id": "test-bb-user", "display_name": "test-bb-user", .... } real 0m20.557s user 0m0.087s sys 0m0.030s First I thought that rados operations might be slow, but adding and deleting objects in rados are fast as usual (at least from my perspective). Also uploading to buckets is fine. We changed some things and I think it might have to do with this: * We have a HAProxy that distributes via leastconn between the 3 radosgw's (this did not change) * We had three times a daemon with the name "eu-central-1" running (on the 3 radosgw's) * Because this might have led to our data duplication problem, we have split that up so now the daemons are named per host (eu-central-1-s3db1, eu-central-1-s3db2, eu-central-1-s3db3) * We also added dedicated rgw daemons for garbage collection, because the current one were not able to keep up. * So basically ceph status went from "rgw: 1 daemon active (eu-central-1)" to "rgw: 14 daemons active (eu-central-1-s3db1, eu-central-1-s3db2, eu-central-1-s3db3, gc-s3db12, gc-s3db13...) Cheers Boris

2 years, 12 months

1
1
0 0

Weird PG Acting Set

by Lazuardi Nasution

Hi, Suddenly we have a recovery_unfound situation. I find that PG acting set is missing some OSDs which are up. Why can't OSD 3 and 71 on following PG query result be members of PG acting set? Currently, we use v15.2.8. How to recover from this situation? { "snap_trimq": "[]", "snap_trimq_len": 0, "state": "active+forced_recovery+recovery_unfound+undersized+degraded+remapped", "epoch": 237505, "up": [ 3, 237, 71, 132, 115, 56 ], "acting": [ 2147483647, 237, 2147483647, 132, 115, 56 ], "backfill_targets": [ "3(0)", "71(2)" ], "acting_recovery_backfill": [ "3(0)", "56(5)", "71(2)", "115(4)", "132(3)", "237(1)" ], Best regards.

2 years, 12 months

2
3
0 0

Performance compare between CEPH multi replica and EC

by zp_8483

Hi all, How much EC performance will degrade compared to multi replica when using the same hardware configuration. Is there any offical data?

2 years, 12 months

2
1
0 0

2024

2023

2022

2021

2020

2019

ceph-users May 2021