Hi,
We experienced a strange issue with a CephFS snapshot becoming partially
unreadable.
The snapshot was created about 2 months ago, and we started a read operation
from it. For a while everything worked fine, with all directories
accessible, but at some point clients (FUSE, v15.2.9) started
reporting I/O errors on directories that had been readable
previously.
When listing the top-level contents of the snapshot, the directories that
return an I/O error (e.g. home) show missing metadata when listed via
'ls':
d????????? ? ? ? ? ? home
Creating new snapshots works properly, and otherwise the whole cluster
(Ceph v15.2.9) reports a healthy status.
Has anyone experienced an issue like this before? We tried restarting the
MDS servers, but this didn't solve the issue.
As additional context, while we were reading from the snapshot, snaptrim
operations were ongoing for other subvolumes, but not for the one affected
by the error.
Any insight into what might cause this, and how to avoid or recover from
such a situation, would be much appreciated.
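In case it is useful, here is a sketch of the diagnostics we are considering
next; the fs name "cephfs", the mount path and the snapshot name below are
placeholders, not our real ones:

# confirm the MDS ranks are active and look for recorded metadata damage
ceph fs status
ceph tell mds.cephfs:0 damage ls
# forward scrub of the live file system in repair mode
ceph tell mds.cephfs:0 scrub start / recursive,repair
# try to reproduce the EIO on the affected directory in the snapshot
stat /mnt/cephfs/.snap/<snapname>/home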
Thank you and kind regards,
Andras
Hi Amit,
I just pinged the mons from every system and they are all available.
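For reference, this is roughly how I checked; the mon host names below are
placeholders for our actual ones:

# list the mon addresses the cluster itself knows about
ceph mon dump
# ping each mon host from every node
for h in mon1 mon2 mon3; do ping -c 1 "$h"; done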
On Mon, 10 May 2021 at 21:18, Amit Ghadge <amitg.b14(a)gmail.com> wrote:
> We have seen slowness when one of the mgr services was unreachable; maybe
> your case is different. You can check the mon entries in the monmap /
> ceph.conf and then verify that all nodes can be pinged successfully.
>
>
> -AmitG
>
>
> On Tue, 11 May 2021 at 12:12 AM, Boris Behrens <bb(a)kervyn.de> wrote:
>
>> Hi guys,
>>
>> does anyone have an idea?
>>
>> On Wed, 5 May 2021 at 16:16, Boris Behrens <bb(a)kervyn.de> wrote:
>>
>> > Hi,
>> > for a couple of days we have been experiencing strange slowness on some
>> > radosgw-admin operations.
>> > What is the best way to debug this?
>> >
>> > For example creating a user takes over 20s.
>> > [root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user
>> > --display-name=test-bb-user
>> > 2021-05-05 14:08:14.297 7f6942286840 1 robust_notify: If at first you
>> > don't succeed: (110) Connection timed out
>> > 2021-05-05 14:08:14.297 7f6942286840 0 ERROR: failed to distribute cache
>> > for eu-central-1.rgw.users.uid:test-bb-user
>> > 2021-05-05 14:08:24.335 7f6942286840 1 robust_notify: If at first you
>> > don't succeed: (110) Connection timed out
>> > 2021-05-05 14:08:24.335 7f6942286840 0 ERROR: failed to distribute cache
>> > for eu-central-1.rgw.users.keys:****
>> > {
>> > "user_id": "test-bb-user",
>> > "display_name": "test-bb-user",
>> > ....
>> > }
>> > real 0m20.557s
>> > user 0m0.087s
>> > sys 0m0.030s
>> >
>> > First I thought that rados operations might be slow, but adding and
>> > deleting objects in rados is as fast as usual (at least from my
>> > perspective).
>> > Also uploading to buckets is fine.
>> >
>> > We changed some things and I think it might have to do with this:
>> > * We have a HAProxy that distributes via leastconn between the 3
>> > radosgw's (this did not change)
>> > * We had the same daemon name "eu-central-1" running three times (on the
>> > 3 radosgw hosts)
>> > * Because this might have led to our data duplication problem, we have
>> > split that up, so now the daemons are named per host (eu-central-1-s3db1,
>> > eu-central-1-s3db2, eu-central-1-s3db3)
>> > * We also added dedicated rgw daemons for garbage collection, because
>> > the existing ones were not able to keep up.
>> > * So basically ceph status went from "rgw: 1 daemon active
>> > (eu-central-1)" to "rgw: 14 daemons active (eu-central-1-s3db1,
>> > eu-central-1-s3db2, eu-central-1-s3db3, gc-s3db12, gc-s3db13...)"
>> >
>> >
>> > Cheers
>> > Boris
>> >
>>
>>
>> --
>> The self-help group "UTF-8 Problems" will meet in the large hall this
>> time, as an exception.
>
--
The self-help group "UTF-8 Problems" will meet in the large hall this
time, as an exception.
Is anyone trying Ceph clusters containing larger (4-8 TB) SSD drives?
8 TB SSDs are described here (
https://www.anandtech.com/show/16136/qlc-8tb-ssd-review-samsung-870-qvo-sab…
) and make use of QLC NAND flash memory to reach those costs and capacities.
Currently, the 8 TB Samsung 870 SSD is $800/ea at some online retail stores.
SATA form-factor SSDs can reach read/write rates of 560/520 MB/s, which,
while not as great as NVMe drives, is still several times faster than
7200 RPM drives.
SSDs now appear to have much lower failure rates than HDDs in 2021 (
https://www.techspot.com/news/89590-backblaze-latest-storage-reliability-fi…
).
Are there any major caveats to considering working with larger SSDs for
data pools?
Thanks,
Matt
--
Matt Larson, PhD
Madison, WI 53705 U.S.A.
I'm deploying 6 Ceph servers with 128 GB of memory each, 12 SSDs of 1 TB in
each server, and 10 Gb network cards connected to 10 Gb switch ports. I'm
following this documentation:
https://docs.ceph.com/en/octopus/cephadm/install/
But I don't know if this is the best way to get the most out of the disks.
I will use it with RBD only and deliver it to a Proxmox cluster. Do you
have any more complete documentation, or any tuning tips for getting the
best speed out of the SSDs?
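For reference, the deployment so far boils down to roughly the following
steps from that guide; the mon IP and host names below are placeholders for
illustration:

# bootstrap the first node and add the remaining hosts
cephadm bootstrap --mon-ip 10.0.0.1
ceph orch host add ceph-node2
# ...repeat for the other hosts...
# create one OSD on every unused SSD the orchestrator can see
ceph orch apply osd --all-available-devices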
Many Tks
Dear cephers,
today it seems I observed an impossible event for the first time: an OSD host crashed, but the Ceph health monitoring did not recognise the crash. Not a single OSD was marked down, and IO simply stopped, waiting for the crashed OSDs to respond. All that was reported was slow ops, slow metadata IO, and MDS behind on trimming, but no OSD failure. I have rebooted these machines many times and have never seen the health check fail to notice it instantly. The only difference I can see is that those were clean shut-downs, not crashes (I believe the OSDs mark themselves down on shutdown).
For debugging this problem, can anyone provide me with a pointer when this could be the result of a misconfiguration?
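For my own debugging I plan to check the settings that govern OSD failure
detection, plus the cluster flags; this is only a sketch of what I intend to
look at, not a known fix:

# any flags (e.g. nodown) that would prevent OSDs from being marked down?
ceph osd dump | grep flags
# heartbeat / down-reporting settings and any overrides
ceph config get osd osd_heartbeat_grace
ceph config get mon mon_osd_min_down_reporters
ceph config dump | grep -i -e heartbeat -e nodown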
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi,
for a couple of days we have been experiencing strange slowness on some
radosgw-admin operations.
What is the best way to debug this?
For example creating a user takes over 20s.
[root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user
--display-name=test-bb-user
2021-05-05 14:08:14.297 7f6942286840 1 robust_notify: If at first you
don't succeed: (110) Connection timed out
2021-05-05 14:08:14.297 7f6942286840 0 ERROR: failed to distribute cache
for eu-central-1.rgw.users.uid:test-bb-user
2021-05-05 14:08:24.335 7f6942286840 1 robust_notify: If at first you
don't succeed: (110) Connection timed out
2021-05-05 14:08:24.335 7f6942286840 0 ERROR: failed to distribute cache
for eu-central-1.rgw.users.keys:****
{
"user_id": "test-bb-user",
"display_name": "test-bb-user",
....
}
real 0m20.557s
user 0m0.087s
sys 0m0.030s
First I thought that rados operations might be slow, but adding and
deleting objects in rados is as fast as usual (at least from my perspective).
Also uploading to buckets is fine.
We changed some things and I think it might have to do with this:
* We have a HAProxy that distributes via leastconn between the 3 radosgw's
(this did not change)
* We had the same daemon name "eu-central-1" running three times (on the
3 radosgw hosts)
* Because this might have led to our data duplication problem, we have
split that up, so now the daemons are named per host (eu-central-1-s3db1,
eu-central-1-s3db2, eu-central-1-s3db3)
* We also added dedicated rgw daemons for garbage collection, because the
existing ones were not able to keep up.
* So basically ceph status went from "rgw: 1 daemon active (eu-central-1)"
to "rgw: 14 daemons active (eu-central-1-s3db1, eu-central-1-s3db2,
eu-central-1-s3db3, gc-s3db12, gc-s3db13...)"
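Given the "failed to distribute cache" errors above, one thing I want to
check is whether the old "eu-central-1" instances are still registered as
watchers on the control pool's notify objects; a sketch of the commands (the
pool name follows our zone naming, and the notify object names may differ):

# list the notify objects in the RGW control pool
rados -p eu-central-1.rgw.control ls
# show which radosgw instances are watching one of them; stale watchers
# from the removed daemons would explain the notify timeouts
rados -p eu-central-1.rgw.control listwatchers notify.0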
Cheers
Boris
Hi,
Suddenly we have a recovery_unfound situation. I find that the PG's acting
set is missing some OSDs which are up. Why can't OSDs 3 and 71 in the
following PG query result be members of the PG's acting set? We currently
use v15.2.8. How can we recover from this situation?
{
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "state": "active+forced_recovery+recovery_unfound+undersized+degraded+remapped",
    "epoch": 237505,
    "up": [
        3,
        237,
        71,
        132,
        115,
        56
    ],
    "acting": [
        2147483647,
        237,
        2147483647,
        132,
        115,
        56
    ],
    "backfill_targets": [
        "3(0)",
        "71(2)"
    ],
    "acting_recovery_backfill": [
        "3(0)",
        "56(5)",
        "71(2)",
        "115(4)",
        "132(3)",
        "237(1)"
    ],
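For completeness, commands along these lines can be used to inspect the
situation; the pg id below is a placeholder, and mark_unfound_lost is listed
only as the documented last resort, we have not run it:

# which objects does the PG consider unfound, and where might they still be?
ceph pg 2.1f list_unfound
ceph pg 2.1f query | less        # check the "might_have_unfound" section
# confirm the backfill targets are actually up and in
ceph osd tree | grep -e "osd.3 " -e "osd.71 "
# last resort only (destructive): declare the unfound objects lost
# ceph pg 2.1f mark_unfound_lost revert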
Best regards.