Hello,
For some time now I've been struggling with the time it takes CompleteMultipartUpload to finish on one of my RGW clusters.
I have a customer with ~8M objects in one bucket uploading quite large files, from 100 GB to around 800 GB.
I've noticed that when they upload ~200 GB files, the requests start timing out on the LB we have in front of the RGW.
When I went through the logs, I noticed that a CompleteMultipartUpload request took around 700 s to finish, which seemed OK-ish, though the number is quite large.
However, when they started uploading 750 GB files, the time to complete the multipart upload grew to around 2500 s - more than 40 minutes - which seems like way too much.
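For reference, one way to measure this from the client side is to time the completion call itself; a minimal sketch assuming awscli, where the endpoint, bucket, key, upload ID, and parts file are all hypothetical:

time aws s3api complete-multipart-upload \
    --endpoint-url https://rgw.example.com \
    --bucket my-bucket --key big-file.bin \
    --upload-id "$UPLOAD_ID" \
    --multipart-upload file://parts.json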
Do you have similar experiences? Is there anything we can do to improve this? How much time does CompleteMultipartUpload take on your clusters?
The cluster is running on version 17.2.6.
Regards,
Ondrej
Hello Anthony,
The replicated index pool has about 20TiB of free space and we are using Intel P5510 NVMe Enterprise SSDs so I guess the HW shouldn’t be the issue.
Yes, I'm able to change the timeout on our LB, but I'm not sure I want to set it to 40 minutes or more…
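For what it's worth, raising such a timeout is a small change on the LB side; a sketch assuming HAProxy (the thread doesn't say which LB is in use), with purely illustrative values:

defaults
    timeout server 45m    # allow long-running CompleteMultipartUpload calls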
Ondrej
> On 5. 2. 2024, at 20:09, Anthony D'Atri <anthony.datri(a)gmail.com> wrote:
>
> Do you have sufficient capacity in the non-ec pool? Is it on fast media?
>
> You should be able to increase the timeout on your LB.
~~~
Hello,
I think the /dev/rbd* devices are filtered "out", or not filtered "in", by the filter
option in the devices section of /etc/lvm/lvm.conf.
So pvscan (and pvs, vgs and lvs) doesn't look at your devices.
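For example, an explicit accept rule for rbd devices in that section might look like this (a sketch; the stock default accepts everything, so a rule like this only matters when something more restrictive is in place):

filter = [ "a|^/dev/rbd|", "a|^/dev/sd|", "r|.*|" ]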
~~~
Hi Gilles,
So the LVM filter in the lvm.conf file is set to the default of `filter = [ "a|.*|" ]`, which accepts every block device, so no luck there :-(
~~~
For Ceph-based LVM volumes, you would do this to import:
1. Map every one of the RBDs to the host.
2. Include this in /etc/lvm/lvm.conf:
   types = [ "rbd", 1024 ]
3. Run:
   pvscan
   vgscan
   pvs
   vgs
4. If you see the VG:
   vgimportclone -n <make a name for VG> /dev/rbd0 /dev/rbd1 ... --import
Now you should be able to vgchange -a y <your VG> and see the LVs.
~~~
Hi Alex,
I did the above as you suggested - the RBD devices (3 of them, none of which were originally part of an LVM setup on the Ceph servers - at least, not one set up manually by me) still do not show up in pvscan, etc.
So I still can't mount any of them (not without re-creating a filesystem, anyway, and thus losing the data I'm trying to read/import) - they all return the same error message (see original post).
Anyone got any other ideas? <hopeful tone in voice> :-)
Cheers
Dulux-Oz
Hi,
I have a small cluster with some faulty disks in it, and I want to clone
the data from the faulty disks onto new ones.
The cluster is currently down, and I am unable to run things like
ceph-bluestore-tool fsck, but ceph-bluestore-tool bluefs-export does
appear to be working.
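For reference, the working invocation is along these lines (a sketch; the OSD data path and output directory here are hypothetical, and the OSD must be stopped first):

ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-12 --out-dir /mnt/bluefs-export-12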
Any help would be appreciated
Many thanks
Carl
Hello, Ceph users,
I would like to use my secondary Ceph cluster for backing up RBD OpenNebula
volumes from my primary cluster using mirroring in image+snapshot mode.
Because it is for backups only, not a cold standby, I would like to use
erasure coding on the secondary side to save disk space.
Is it supported at all?
I tried to create a pool:
secondary# ceph osd pool create one-mirror erasure k6m2
secondary# ceph osd pool set one-mirror allow_ec_overwrites true
set pool 13 allow_ec_overwrites to true
secondary# rbd mirror pool enable --site-name secondary one-mirror image
2024-02-02T11:00:34.123+0100 7f95070ad5c0 -1 librbd::api::Mirror: mode_set: failed to allocate mirroring uuid: (95) Operation not supported
When I created a replicated pool instead, this step worked:
secondary# ceph osd pool create one-mirror-repl replicated
secondary# rbd mirror pool enable --site-name secondary one-mirror-repl image
secondary#
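For comparison, the usual way to combine RBD with erasure coding is to keep the image metadata in a replicated pool and place only the data in the EC pool; a sketch of that pattern (the pool and image names here are made up, and I haven't verified how it interacts with mirroring):

secondary# ceph osd pool create one-mirror-meta replicated
secondary# ceph osd pool create one-mirror-data erasure k6m2
secondary# ceph osd pool set one-mirror-data allow_ec_overwrites true
secondary# rbd create --size 10G --data-pool one-mirror-data one-mirror-meta/test-image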
So, is RBD mirroring supported with erasure-coded pools at all? Thanks!
-Yenya
--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| https://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
We all agree on the necessity of compromise. We just can't agree on
when it's necessary to compromise. --Larry Wall
Hi,
Can anyone shed light on this please?
Our cluster crashed, and I have now managed to get everything back up
and running; the OSDs have nearly rebalanced, but I am seeing issues with RGW.
2024-02-05T01:29:56.272+0000 7f7237e75f40 20 rados->read ofs=0 len=0
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 rados_obj.operate() r=-2 bl.length=0
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 realm
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 rados->read ofs=0 len=0
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 rados_obj.operate() r=-2 bl.length=0
2024-02-05T01:29:56.276+0000 7f7237e75f40 4 RGWPeriod::init failed to init realm id : (2) No such file or directory
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 rados->read ofs=0 len=0
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 rados_obj.operate() r=-2 bl.length=0
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 rados->read ofs=0 len=0
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 rados_obj.operate() r=0 bl.length=17
2024-02-05T01:29:56.276+0000 7f7237e75f40 20 rados->read ofs=0 len=0
The .rgw.root and .rgw.index pools are both marked incomplete, and one PG in each
was restored from a bad disk. Both pools are now showing a status
of peering_blocked_by_history_les_bound.
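For what it's worth, the option often brought up for peering_blocked_by_history_les_bound is osd_find_best_info_ignore_history_les; this is a sketch only (the OSD ID is hypothetical), and the flag can discard writes, so treat it strictly as a last resort on the affected PG's primary:

ceph config set osd.12 osd_find_best_info_ignore_history_les true
ceph osd down 12    # force the stuck PGs to re-peer
# once the PGs go active, remove the override again:
ceph config rm osd.12 osd_find_best_info_ignore_history_les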
I do have some other PGs with important data that can be recovered from the
disks, but it is not essential that this is done straight away. I need to get
RGW running so I can delete old data and free up some space to allow
backfilling to complete.
Version is 18.2.1 running under cephadm
  data:
    pools:   19 pools, 801 pgs
    objects: 9.23M objects, 4.7 TiB
    usage:   9.8 TiB used, 2.7 TiB / 12 TiB avail
    pgs:     2.122% pgs not active
             435947/18456424 objects degraded (2.362%)
             559225/18456424 objects misplaced (3.030%)
             758 active+clean
             17  incomplete
             12  active+undersized+degraded+remapped+backfill_toofull
             12  active+remapped+backfill_toofull
             2   active+clean+scrubbing+deep
If anyone can suggest a known way of recovering from this your advice would
be appreciated.
Kind regards
Carl.
Hi All,
I am in the process of implementing multi-site RGW instance and have successfully set up a POC and confirmed the functionality.
I am working on metrics and alerting for this service, and I am not seeing metrics available for the output shown by
radosgw-admin sync status --rgw-realm=<<realm-name>>
Sample output:
[@cepha-cn02 ~]# radosgw-admin sync status --rgw-realm=<<realm-name>>
          realm a207b396-8d1b-408b-851e-10ad545861b7 (realm-name)
      zonegroup 77e8924b-05e3-4d86-b887-aedd7fe5306c (zonegroup-name)
           zone a26c27b2-d6ac-4eab-a4ce-1036ce2d37dc (zone-name)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 8c7d69db-85ae-45f4-b4ec-f712fad4af07 (zone-name)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
I'd like to measure, track, and alert on shard status during sync operations.
Is there a way to expose these metrics? I'm struggling to find guidance or details.
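In the meantime, one workaround is to scrape the CLI output into node_exporter's textfile collector; a rough sketch (the realm name, metric names, and output path are all made up for illustration):

#!/bin/sh
# Hedged sketch: count sync sections reporting shards behind or recovering
# and expose them as Prometheus textfile metrics.
OUT=$(radosgw-admin sync status --rgw-realm=realm-name)
BEHIND=$(printf '%s\n' "$OUT" | grep -c 'behind')
RECOVERING=$(printf '%s\n' "$OUT" | grep -c 'recovering')
{
  echo "rgw_sync_sections_behind $BEHIND"
  echo "rgw_sync_sections_recovering $RECOVERING"
} > /var/lib/node_exporter/textfile_collector/rgw_sync.prom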
Thanks in advance
Rhys
Rhys Powell (He/Him)
KORE | Senior Systems Engineer
rpowell(a)korewireless.com
Hi All,
All of this is using the latest version of RL and Ceph Reef
I've got an existing RBD image (with data on it - not "critical", as I've
got a backup, but it's rather large, so I was hoping to avoid the restore
scenario).
The RBD image used to be served out via a (Ceph) iSCSI Gateway, but we
are now looking to use the plain old kernel module.
The RBD image has been mapped (rbd map) to the client's /dev/rbd0 device.
So now I'm trying a straight `mount /dev/rbd0 /mount/old_image/` as a test.
What I'm getting back is `mount: /mount/old_image/: unknown filesystem type 'LVM2_member'.`
All my Google-fu is telling me that to solve this issue I need to
reformat the image with a new filesystem - which would mean "losing"
the data.
So my question is: how can I get to this data using the RBD kernel module
(the iSCSI Gateway is no longer available, so that's not an option), or am I
stuck with the restore option?
Or is there something I'm missing (which would not surprise me in the
least)? :-)
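For what it's worth, the LVM2_member signature suggests the image holds an LVM physical volume rather than a bare filesystem, so the usual activation sequence may apply; a hedged sketch, with the VG/LV names entirely made up:

pvscan --cache /dev/rbd0          # let LVM discover the PV on the mapped device
vgs                               # note which VG the PV belongs to
vgchange -a y old_image_vg        # activate its logical volumes
lvs                               # find the LV carrying the filesystem
mount /dev/old_image_vg/data /mount/old_image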
Thanks in advance (as always, you guys and gals are really, really helpful)
Cheers
Dulux-Oz