Hi all,
I'm still searching for orphan objects and came across a strange bug:
there is a huge multipart upload in progress (around 4 TB), and listing the
rados objects in the bucket loops endlessly over the multipart upload.
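For reference, we're listing the backing rados objects roughly like this (the bucket name is an example); the radoslist output is where the loop shows up:

  # list the rados objects backing the bucket
  radosgw-admin bucket radoslist --bucket=mybucket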
--
As an exception, the "UTF-8 Problems" self-help group will meet in the large hall this time.
Hi, we're running 15.2.7 and our cluster is warning us about LARGE_OMAP_OBJECTS (1 large omap objects).
Here is what the distribution looks like for the bucket in question, and as you can see all but 3 of the keys reside in shard 2.
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.0 1
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.8 0
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.9 0
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.7 0
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.1 0
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.4 0
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.3 1
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.2 262384
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.6 0
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.5 0
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.12 0
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.10 1
.dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.11 0
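(For anyone wanting to reproduce the numbers: we counted the omap keys per index shard roughly like this; the index pool name is an example.)

  for obj in $(rados -p default.rgw.buckets.index ls | grep '5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221'); do
      echo "$obj $(rados -p default.rgw.buckets.index listomapkeys "$obj" | wc -l)"
  done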
osd_deep_scrub_large_omap_object_key_threshold is set to 200000 by default, hence the warning observed for this bucket.
Dynamic resharding is enabled, and the bucket is not in the process of being resharded.
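For completeness, this is roughly how we verified that (bucket name is a placeholder):

  radosgw-admin reshard status --bucket=mybucket   # per-shard status, should be "not-resharding"
  radosgw-admin reshard list                       # the bucket should not appear here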
Versioning is not in use for this bucket, so we're not affected by https://tracker.ceph.com/issues/46456.
Can anyone help us understand why all the keys are getting mapped to a single shard? Is there a bug here, or is this expected behaviour?
Could it be related to the fact that the bucket contains large multipart uploads? Object names look like this:
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5900
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5901
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5902
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5903
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5904
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5905
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5906
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5907
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5908
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~0W-YhP3F7qc70Ad8JoBIugKzu225qs2.5909
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~2uuwqny_HicO6kx_lPmWEf0zoyvdm_9.7152
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~2uuwqny_HicO6kx_lPmWEf0zoyvdm_9.7153
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~2uuwqny_HicO6kx_lPmWEf0zoyvdm_9.7154
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~2uuwqny_HicO6kx_lPmWEf0zoyvdm_9.7155
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~2uuwqny_HicO6kx_lPmWEf0zoyvdm_9.7156
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~2uuwqny_HicO6kx_lPmWEf0zoyvdm_9.7157
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~2uuwqny_HicO6kx_lPmWEf0zoyvdm_9.7158
_multipart_TOOTHROT/anonymised/TOOTHROT-DISK1-8c59002f-cffd-4f74-a680-147383ab8d78.vhdx.2~2uuwqny_HicO6kx_lPmWEf0zoyvdm_9.7159
Hi,
I caught up with Sage's talk on what to expect in Pacific (
https://www.youtube.com/watch?v=PVtn53MbxTc ) and there was no mention
of ceph-ansible at all.
Is it going to continue to be supported? We use it (and uncontainerised
packages) for all our clusters, so I'd be a bit alarmed if it was going
to go away...
Regards,
Matthew
--
The Wellcome Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
Hi,
After an unscheduled power outage, our Ceph (Octopus) cluster reports a
healthy state with "ceph status". However, when we run "ceph orch status",
the command hangs forever.
Are there other commands that we can run for a more thorough health check
of the cluster?
After looking at:
https://docs.ceph.com/en/octopus/rados/operations/health-checks/
I also ran "ceph crash ls-new", but it hangs forever as well.
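Since both hanging commands are served by mgr modules, the next thing we are looking at is the active mgr itself, roughly like this (the daemon name in the restart is an example):

  ceph mgr stat                      # which mgr is active, and is it available?
  ceph mgr module ls | head          # which modules are enabled
  systemctl restart ceph-mgr@node1   # restart the active mgr on its host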
Any ideas?
Our Ceph cluster is currently used as backend storage for our OpenStack
cluster, and we are also having issues with storage volumes attached to
VMs, but we don't know how to narrow down the root cause.
Any feedback is highly appreciated.
Best regards,
Sebastian
Dear Ceph users,
I am currently constructing a small hyperconverged Proxmox cluster with
Ceph as storage. So far I have always had 3 nodes, which I linked directly
together via two bonded 10G network interfaces for the Ceph storage, so I
never needed any switches.
This new cluster has more nodes, so I am considering using a 10G switch
for the storage network. As I have no experience with such a setup, I
wonder if there are any specific issues that I should think of (latency...)?
As the whole cluster should not be too expensive, I am currently
thinking of the following solution:
2* CRS317-1G-16s+RM switches:
https://mikrotik.com/product/crs317_1g_16s_rm#fndtn-testresults
SFP+ Cables like these:
https://www.fs.com/de/products/48883.html
Some network interface for each node with two SFP+ ports, e.g.:
https://ark.intel.com/content/www/de/de/ark/products/39776/intel-ethernet-c…
Each node would connect one port to each switch, configured as an
active/backup (master/slave) bond so that the switches are redundant.
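Concretely, each node would get an active-backup bond across the two switches, roughly like this (interface names and the address are examples, Proxmox-style /etc/network/interfaces):

  auto bond0
  iface bond0 inet static
      address 10.10.10.11/24
      bond-slaves enp5s0f0 enp5s0f1   # one port to each switch
      bond-mode active-backup
      bond-miimon 100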
What do you think of this setup - or is there any information /
recommendation for an optimized setup of a 10G storage network?
Best Regards,
Hermann
--
hermann(a)qwer.tk
PGP/GPG: 299893C7 (on keyservers)
I'm new to Ceph and have just deployed my first cluster using ceph-ansible, and I'm running into some issues that I hope someone can point me in the right direction on.
I have 5 servers to start with: 5 OSD nodes, 3 monitors, 2 managers, and 4 iSCSI gateways. My intention is to use this environment as iSCSI storage for ESXi.
When I went to add an iSCSI target, an error popped up in the browser (it went away too fast to read), and now whenever I click on the "Targets" tab of the iSCSI Dashboard page, I continuously get errors popping up:
500 - Internal Server Error
The server encountered an unexpected condition which prevented it from fulfilling the request.
I’m unable to add any targets now. The log files show this:
2021-05-19T12:58:05.294-0400 7f2db6115700 0 [dashboard ERROR exception] Internal Server Error
Traceback (most recent call last):
File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 46, in dashboard_exception_handler
return handler(*args, **kwargs)
File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
return self.callable(*self.args, **self.kwargs)
File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 694, in inner
ret = func(*args, **kwargs)
File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 907, in wrapper
return func(*vpath, **params)
File "/usr/share/ceph/mgr/dashboard/controllers/iscsi.py", line 266, in list
IscsiTarget._set_info(target)
File "/usr/share/ceph/mgr/dashboard/controllers/iscsi.py", line 990, in _set_info
raise e
File "/usr/share/ceph/mgr/dashboard/controllers/iscsi.py", line 980, in _set_info
target_iqn)
File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 531, in func_wrapper
**kwargs)
File "/usr/share/ceph/mgr/dashboard/services/iscsi_client.py", line 254, in get_targetinfo
return request()
File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 326, in __call__
data, raw_content, headers)
File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 449, in do_request
resp.content)
dashboard.rest_client.RequestException: iscsi REST API failed request with status code 503
(b'{\n "message": "failed, gateway(s) unavailable:cxcto-c240-j27-01(UNKNOWN'
b' state)"\n}\n')
2021-05-19T12:58:05.294-0400 7f2db6115700 0 [dashboard ERROR request] [10.117.244.166:58270] [GET] [500] [0.398s] [admin] [513.0B] /api/iscsi/target
2021-05-19T12:58:05.294-0400 7f2db6115700 0 [dashboard ERROR request] [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "7bc691e8-e814-48c1-a69c-57ea1f3f3dbd"} ']
Clearly it’s showing an error indicating that the gateway is “unavailable”.
The output from gwcli shows this:
[root@cxcto-c240-j27-01 ~]# ./gwcli
1 gateway is inaccessible - updates will be disabled
/> ls
o- / ......................................................................................................................... [...]
o- cluster ......................................................................................................... [Clusters: 1]
| o- ceph ............................................................................................................ [HEALTH_OK]
| o- pools .......................................................................................................... [Pools: 4]
| | o- device_health_metrics ................................................. [(x3), Commit: 0.00Y/15911599M (0%), Used: 0.00Y]
| | o- iscsi ............................................................... [(x3), Commit: 0.00Y/15911599M (0%), Used: 262511b]
| | o- rbd ................................................................... [(x3), Commit: 0.00Y/15911599M (0%), Used: 1650b]
| | o- test .................................................................. [(x3), Commit: 0.00Y/15911599M (0%), Used: 0.00Y]
| o- topology .............................................................................................. [OSDs: 110,MONs: 3]
o- disks ....................................................................................................... [0.00Y, Disks: 0]
o- iscsi-targets ............................................................................... [DiscoveryAuth: CHAP, Targets: 1]
o- iqn.2001-07.com.ceph:1621437640904 ................................................................ [Auth: None, Gateways: 1]
o- disks .......................................................................................................... [Disks: 0]
o- gateways ............................................................................................ [Up: 0/1, Portals: 1]
| o- cxcto-c240-j27-01 ....................................................................... [10.122.242.196 (UNAUTHORIZED)]
o- host-groups .................................................................................................. [Groups : 0]
o- hosts ....................................................................................... [Auth: ACL_ENABLED, Hosts: 0]
In this case it says "UNAUTHORIZED".
When I created the target, I had selected all 4 iSCSI gateways, but it looks like something happened during the addition process that has left it in a weird state.
The dashboard seems to think that all the gateways are up and running:
[screenshot: Dashboard iSCSI overview showing all gateways up]
Notice that it only shows the target on the one node, which is the node reporting the error.
I've tried deleting the target from gwcli, but that fails, and I'm not really sure where to look next. The "UNAUTHORIZED" in the gwcli output makes me wonder if there is some kind of authorization issue, but I'm not sure what that would be.
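For what it's worth, this is what I plan to check next on the gateway nodes (a sketch; service and file names as per the ceph-iscsi docs):

  systemctl status rbd-target-api rbd-target-gw   # the REST API and gateway services
  grep -E 'api_user|api_password|trusted_ip_list' /etc/ceph/iscsi-gateway.cfg
  # these settings should be identical on all gateways; a mismatch is one
  # known cause of UNAUTHORIZED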
This is what I see if I navigate to the Targets tab on the iSCSI page:
[screenshot: the Targets tab of the iSCSI page]
Any thoughts or guidance are greatly appreciated.
-Paul
Hermann,
I think there was a discussion on recommended switches not too long ago.
You should be able to find it in the mailing list archives.
I think network latency is usually very minor compared to Ceph's
dependency on CPU and disk latency, so for a simple cluster I wouldn't
worry about it too much.
I have found that fs.com's DAC cables get stuck a lot, so I don't use them
anymore; I usually buy Dell or Mellanox cables.
Regarding network cards, I've found the Intel cards to be not that great,
due to bugs with LACP bonds, the embedded LLDP agent getting in the way,
and other issues. So I'm using Mellanox cards instead, but Broadcom should
also work.
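(If you do end up with the Intel cards: on the i40e driver the firmware
LLDP agent can at least be disabled, something like this; the interface
name is an example.)

  ethtool --set-priv-flags enp5s0f0 disable-fw-lldp on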
hope it helps!
best regards,
Max
On Wed, May 19, 2021 at 1:48 PM <ceph-users-request(a)ceph.io> wrote:
> [quoted message trimmed]
Hi guys,
We have recently been testing rbd-nbd on a Ceph N (Nautilus) cluster. After
mapping an RBD image, running mkfs, and mounting the nbd device, rbd-nbd
and dmesg show the following errors during read/write testing.
rbd-nbd log:
2021-05-18 11:35:08.034 7efdb8ff9700 20 []rbd-nbd: reader_entry:
waiting for nbd request
...
2021-05-18 11:35:08.066 7efdb8ff9700 -1 []rbd-nbd: failed to read nbd
request header: (33) Numerical argument out of domain
2021-05-18 11:35:08.066 7efdb3fff700 20 []rbd-nbd: writer_entry: no io
requests, terminating
2021-05-18 11:35:08.066 7efdea8d1a00 20 []librbd::ImageState:
0x564a2be2b3c0 unregister_update_watcher: handle=0
2021-05-18 11:35:08.066 7efdea8d1a00 20 []librbd::ImageState:
0x564a2be2b4b0 ImageUpdateWatchers::unregister_watcher: handle=0
2021-05-18 11:35:08.066 7efdea8d1a00 20 []librbd::ImageState:
0x564a2be2b4b0 ImageUpdateWatchers::unregister_watcher: completing
unregister
2021-05-18 11:35:08.066 7efdea8d1a00 10 []rbd-nbd: ~NBDServer: terminating
2021-05-18 11:35:08.066 7efdea8d1a00 20 []librbd::ImageState:
0x564a2be2b3c0 close
dmesg:
[Tue May 18 11:35:07 2021] EXT4-fs (nbd0): mounted filesystem with
ordered data mode. Opts: discard
[Tue May 18 11:35:07 2021] block nbd0: shutting down sockets
[Tue May 18 11:35:09 2021] blk_update_request: I/O error, dev nbd0,
sector 75592 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
client host info:
centos7.x
kernel 5.4.109
It looks like the kernel nbd device shut down its socket for some
reason, but we haven't figured out why. BTW, we have tried turning the
rbd cache on and off, using different filesystems (ext4/xfs), and using
an EC pool as well as a replicated pool, but the error remains. It is
easier for us to reproduce when we batch map, mkfs, and mount rbd-nbd
devices on several hosts simultaneously.
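For reference, a single-host reproduction looks roughly like this
(pool/image names and the fio job are examples):

  rbd create testpool/img1 --size 100G
  rbd-nbd map testpool/img1                  # returns e.g. /dev/nbd0
  mkfs.ext4 /dev/nbd0
  mount -o discard /dev/nbd0 /mnt/img1
  fio --name=t --directory=/mnt/img1 --rw=randrw --bs=4k --size=1G --numjobs=4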
Thanks for any suggestions.
Regards,
Zhi Zhang (David)
Contact: zhang.david2011(a)gmail.com
zhangz.david(a)outlook.com
Hi
In the last couple of weeks we've been getting BlueFS spillover warnings
on multiple (>10) OSDs, e.g.
BLUEFS_SPILLOVER BlueFS spillover detected on 1 OSD(s)
osd.327 spilled over 58 MiB metadata from 'db' device (30 GiB used
of 66 GiB) to slow device
I know this can be corrected with "ceph tell osd.$osd compact" or
ignored by setting "bluestore_warn_on_bluefs_spillover=false", but my
concern is that these warnings have only recently started.
Could this be a sign of something nasty heading our way that I'm not
aware of? Is there a performance penalty for just ignoring it, rather
than compacting?
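In case the numbers are useful for diagnosis, this is how I'm reading the
per-OSD BlueFS usage (a sketch, run on the OSD's host; jq is only for
readability):

  ceph daemon osd.327 perf dump | jq '.bluefs | {db_total_bytes, db_used_bytes, slow_used_bytes}'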
Many thanks for any pointers.
Cheers
Toby
--
Toby Darling, Scientific Computing (2N249)
MRC Laboratory of Molecular Biology
Hello,
On my Octopus cluster with 6 nodes (3 mon/mgr, 3 OSD), I would like to re-install the operating system of the first mon/mgr node. For that purpose I tried "ceph orch host rm mynode", but then I got the following two health warnings:
2 stray daemon(s) not managed by cephadm
1 stray host(s) with 2 daemon(s) not managed by cephadm
So I did not proceed with the re-installation and added the node back.
What would be the correct procedure for doing that? I can live with the mons/mgrs running at reduced quorum for the duration of the re-installation; I just need my CephFS to stay available. Right now there are 2 mon daemons running, 1 active mgr, and 2 standby mgrs.
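My current guess, which I have not dared to try yet, is to move the daemons off the host first, roughly like this (hostnames are examples for my other nodes):

  ceph orch apply mon --placement="node2,node3"   # temporarily 2 mons, without mynode
  ceph orch apply mgr --placement="node2,node3"
  ceph orch host rm mynode                        # should no longer leave strays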
Thank you,
Mabi