I'd like to say that it was something smart, but it was a bit of luck.
I logged in on a hypervisor (we run OSDs and OpenStack hypervisors on the
same hosts) to deal with another issue, and while checking the system I
noticed that one of the OSDs was using a lot more CPU than the others. It
made me think that the increased IOPS could be putting a strain on some of the
OSDs without impacting the whole cluster, so I decided to increase pg_num to
spread the operations across more OSDs, and it did the trick. The qlen metric
went back to something similar to what we had before the problems started.
We're going to look into adding CPU/RAM monitoring for all the OSDs next.
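For reference, the change itself was just a pg_num bump on the busy pool, along
these lines (pool name and target value are placeholders; the earlier change on
the index pool quoted below went from 256 to 512):

  # spread the operations over more PGs, and therefore more OSDs
  ceph osd pool set <pool> pg_num <new_pg_num>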
Gauvain
On Fri, Dec 22, 2023 at 2:58 PM Drew Weaver <drew.weaver(a)thenap.com> wrote:
> Can you say how you determined that this was a problem?
>
> -----Original Message-----
> From: Gauvain Pocentek <gauvainpocentek(a)gmail.com>
> Sent: Friday, December 22, 2023 8:09 AM
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] Re: RGW requests piling up
>
> Hi again,
>
> It turns out that our rados cluster wasn't that happy: the rgw index pool
> wasn't able to handle the load. Scaling the PG number helped (256 to 512),
> and the RGW is back to normal behaviour.
>
> There is still a huge number of read IOPS on the index, and we'll try to
> figure out what's happening there.
>
> Gauvain
>
> On Thu, Dec 21, 2023 at 1:40 PM Gauvain Pocentek <
> gauvainpocentek(a)gmail.com>
> wrote:
>
> > Hello Ceph users,
> >
> > We've been having an issue with RGW for a couple days and we would
> > appreciate some help, ideas, or guidance to figure out the issue.
> >
> > We run a multi-site setup which has been working pretty well so far.
> > We don't actually have data replication enabled yet, only metadata
> > replication. In the master region we've started to see requests piling
> > up in the rgw process, leading to very slow operations and failures
> > all over the place (clients time out before getting responses from
> > rgw). The workaround for now is to restart the rgw containers regularly.
> >
> > We made a mistake and forcefully deleted a bucket on a secondary
> > zone; this might be the trigger, but we are not sure.
> >
> > Other symptoms include:
> >
> > * Increased memory usage of the RGW processes (we bumped the container
> > limits from 4G to 48G to cater for that)
> > * Lots of read IOPS on the index pool (4 or 5 times more compared to
> > what we were seeing before)
> > * The prometheus ceph_rgw_qlen and ceph_rgw_qactive metrics (number of
> > active requests) seem to show that the number of concurrent requests
> > increases with time, although we don't see more requests coming in on
> > the load-balancer side.
> >
> > The current thought is that the RGW process doesn't close the requests
> > properly, or that some requests just hang. After a restart of the
> > process things look OK but the situation turns bad fairly quickly
> > (after 1 hour we start to see many timeouts).
> >
> > The rados cluster seems completely healthy; it is also used for rbd
> > volumes, and we haven't seen any degradation there.
> >
> > Has anyone experienced that kind of issue? Anything we should be
> > looking at?
> >
> > Thanks for your help!
> >
> > Gauvain
> >
>
Hi ceph-users,
I'm not sure if this mail got sent correctly; my colleague seems not to have received it.
Either way, we've managed to replicate this issue with a local http_check test. The ceph-mgr seems to go down on every first visit, and works perfectly fine after a couple of re-visits.
Has anyone else seen this issue before?
Thanks in advance,
- Demian
From: Demian Romeijn <dromeijn(a)tuxis.nl>
To: <ceph-users(a)ceph.io>
Sent: 12/22/2023 2:14 PM
Subject: ceph-dashboard odd behavior when visiting through haproxy
I'm currently trying to set up a ceph-dashboard using the official documentation on how to do so.
I've managed to log in by just visiting the URL & port, and by visiting it through haproxy. However, using haproxy to visit the site results in odd behavior.
On my first login, nothing loads on the page, and after ~5s it times out and sends me back to the log-in screen.
After logging back on to the dashboard, everything loads and functions as expected. I can refresh my browser as many times as I want and it still keeps on working.
After some time, usually ~30 minutes or so of inactivity, the problem arises again.
Haproxy tells us the server is down for about ~10 seconds; running a simple HTTP check gives the following as well: CRITICAL - Socket timeout after 10 seconds.
In the ceph-mgr logs there isn't any special error other than: [dashboard ERROR frontend.error] (https://*redacted*/#/login): Http failure response for https://*redacted*/ui-api/orchestrator/get_name: 401 OK None
It seems as if the ceph dashboard is "overloaded"; changing the haproxy config (which follows the official ceph documentation on how to set it up) to do health checks less often makes the problem happen less often.
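For context, the relevant part of the haproxy backend looks roughly like this (server name, address and interval are illustrative, not our exact config):

  backend ceph_dashboard
      option httpchk GET /
      # less frequent health checks made the problem show up less often
      server ceph-mgr-1 192.0.2.10:8443 check check-ssl inter 30s verify none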
Anything I might've overlooked that could sort out the issue?
Hi community,
I am running a Ceph cluster with RBD block storage on 6 nodes, using an
erasure-coded pool (4+2) with min_size 4.
Three OSDs are down and a PG is in the down state, so some pools can't serve
writes. Assuming those three OSDs can never be started again and the PG stays
stuck in the down state, how can I delete or recreate the PG to replace the
down one, or is there another way to allow the pool to read/write data again?
Thanks to the community.
*Tran Thanh Phong*
Email: tranphong079(a)gmail.com
Skype: tranphong079
Hi,
I just upgraded from 17.2.6 to 18.2.1 and have some issues with the mds.
The mds started crashing with:
2023-12-27T13:21:30.491+0100 7f717b5886c0 1 mds.f9sn015 Updating MDS map to version 2689280 from mon.5
2023-12-27T13:21:30.491+0100 7f717b5886c0 1 mds.0.2689276 handle_mds_map i am now mds.0.2689276
2023-12-27T13:21:30.491+0100 7f717b5886c0 1 mds.0.2689276 handle_mds_map state change up:clientreplay --> up:active
2023-12-27T13:21:30.491+0100 7f717b5886c0 1 mds.0.2689276 active_start
2023-12-27T13:21:30.524+0100 7f717b5886c0 1 mds.0.2689276 cluster recovered.
2023-12-27T13:21:30.551+0100 7f7176d7f6c0 -1 /var/tmp/portage/sys-cluster/ceph-18.2.1-r2/work/ceph-18.2.1/src/mds/Server.cc: In function 'CInode* Server::prepare_new_inode(MDRequestRef&, CDir*, inodeno_t, unsigned int, const file_layout_t*)' thread 7f7176d7f6c0 time 2023-12-27T13:21:30.548697+0100
/var/tmp/portage/sys-cluster/ceph-18.2.1-r2/work/ceph-18.2.1/src/mds/Server.cc: 3441: FAILED ceph_assert(_inode->gid != (unsigned)-1)
and I could not bring it back up again. As a workaround I was able to start
the 17.2.6 mds and it somehow recovered.
Then I started the 18.2.1 mds again, which soon after startup finds this corruption:
[
    {
        "damage_type": "dentry",
        "id": 4247331390,
        "ino": 1,
        "frag": "*",
        "dname": "lost+found",
        "snap_id": "head",
        "path": "/lost+found"
    }
]
There are a few corrupted files in some other directories (leftovers from
several releases ago that I never managed to fix), and if I start an mds
scrub there, the mds crashes again, maybe because of the corrupted lost+found.
If I try to remove lost+found, the mds crashes again.
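For reference, the damage listing and the scrub were triggered with something
along these lines (fs name and path are placeholders; this is a sketch rather
than an exact transcript):

  ceph tell mds.<fs_name>:0 damage ls
  ceph tell mds.<fs_name>:0 scrub start <path> recursive,repair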
Do you have any hint on how to recover from this?
Best regards,
Andrej
--
_____________________________________________________________
prof. dr. Andrej Filipcic, E-mail: Andrej.Filipcic(a)ijs.si
Department of Experimental High Energy Physics - F9
Jozef Stefan Institute, Jamova 39, P.o.Box 3000
SI-1001 Ljubljana, Slovenia
Tel.: +386-1-477-3674 Fax: +386-1-477-3166
-------------------------------------------------------------
We are running rook-ceph deployed as an operator in Kubernetes, with rook
version 1.10.8 and ceph 17.2.5.
It's working fine, but every 3-4 days we see an OSD daemon crash and then
restart without any problem; we are also seeing flapping OSDs, i.e. OSDs
going up and down.
Recently the daemon crash happened for 2 OSDs at the same time on different
nodes, with the below error in the crash info:
-305> 2023-12-17T14:50:14.413+0000 7f53b5f91700 -1 *** Caught signal (Aborted) **
in thread 7f53b5f91700 thread_name:tp_osd_tp
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)
1: /lib64/libpthread.so.0(+0x12cf0) [0x7f53d93ddcf0]
2: gsignal()
3: abort()
4: /lib64/libc.so.6(+0x21d79) [0x7f53d8025d79]
5: /lib64/libc.so.6(+0x47456) [0x7f53d804b456]
6: (MOSDRepOp::encode_payload(unsigned long)+0x2d0) [0x55acc0f81730]
7: (Message::encode(unsigned long, int, bool)+0x2e) [0x55acc140ec2e]
8: (ProtocolV2::send_message(Message*)+0x25e) [0x55acc16a5aae]
9: (AsyncConnection::send_message(Message*)+0x18e) [0x55acc167dc4e]
10: (OSDService::send_message_osd_cluster(int, Message*, unsigned int)+0x2bd) [0x55acc0b4b11d]
11: (ReplicatedBackend::issue_op(hobject_t const&, eversion_t const&, unsigned long, osd_reqid_t, eversion_t, eversion_t, hobject_t, hobject_t, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, std::optional<pg_hit_set_history_t>&, ReplicatedBackend::InProgressOp*, ceph::os::Transaction&)+0x6c8) [0x55acc0f69368]
12: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&, std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0x5e7) [0x55acc0f6c907]
13: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x50d) [0x55acc0c92ebd]
14: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0xd25) [0x55acc0cf0295]
15: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x288d) [0x55acc0cf78fd]
16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1c0) [0x55acc0b56900]
17: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x6d) [0x55acc0e552ad]
18: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x115f) [0x55acc0b69dbf]
19: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x435) [0x55acc12c78c5]
20: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55acc12c9fe4]
21: /lib64/libpthread.so.0(+0x81ca) [0x7f53d93d31ca]
22: clone()
It also has the below errors before the crash:
scrub-queue::*remove_from_osd_queue* removing pg[2.4f0] failed. State was:
unregistering
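For completeness, the crash details above come from the crash module and can be
re-fetched with something like (the crash id is a placeholder):

  ceph crash ls
  ceph crash info <crash_id>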
Please help us troubleshoot and fix this issue.
We already posted on the ceph tracker, but there has been no reply there for 3-4 days.
Hi,
how can I increase the file deletion speed? All files were deleted from
cephfs on my pool, but ceph df still shows 50% usage of the pool. I know
about delayed deletion (https://docs.ceph.com/en/latest), but is there
some way to speed this up a little?
I significantly increased mds_max_purge_ops and mds_max_purge_files, but
this has not helped.
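For reference, I raised them roughly like this (the values are only what I
experimented with, not a recommendation):

  ceph config set mds mds_max_purge_ops 32768
  ceph config set mds mds_max_purge_files 1024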
Thanks for any response.
Svoboda Miroslav
Howdy,
I am going to be replacing an old cluster pretty soon and I am looking for a few suggestions.
#1 cephadm or ceph-ansible for management?
#2 Since the whole... CentOS thing... what distro appears to be the most straightforward to use with Ceph? I was going to try and deploy it on Rocky 9.
That is all I have.
Thanks,
-Drew
Hello,
in our cluster we have one node with SSDs, which are in use, but one of them
does not show up in "ceph orch device ls". Everything else looks OK. For better
understanding, the disk name is /dev/sda, and it's osd.138:
~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 1 7T 0 disk
~# wipefs /dev/sda
DEVICE OFFSET TYPE UUID LABEL
sda 0x0 ceph_bluestore
~# ceph osd tree
-9 15.42809 host ceph06
138 ssd 6.98630 osd.138 up 1.00000 1.00000
The file ceph-osd.138.log does not look unusual to me.
ceph-volume.log shows that the SSD is found by the "lsblk" command during
volume processing.
It is not possible to add the SSD with
# ceph orch daemon add osd ceph06:/dev/sda
The error message in this case asks whether the device is already in use, even
if the SSD is fully wiped via "wipefs -a" or by overwriting the entire disk
with the dd command. But it is possible to add it to the cluster by using
the option "--method raw".
Do you have an idea what happened here and how I can debug this behaviour?
Hi all,
I am completely new to ceph and I come from gluster.
We have had our eyes on ceph for several years,
and as the gluster project seems to be slowing down we
now think it is time to start looking into ceph.
I have manually configured a ceph cluster with ceph fs on debian
bookworm.
What is the difference between installing with cephadm and a manual
install; are there any benefits that you miss with a manual install?
There are also a couple of other things that I cannot figure out from
reading the documentation.
Most of our files are small and, from my understanding, replication is
then recommended, right?
The plan is to set ceph up like this:
1 x "admin node"
2 x "storage nodes"
The admin node will run mon, mgr and mds.
The storage nodes will run mon, mgr, mds and 8x osd (8 disks).
This works well to set up, but what I cannot get my head around is
how things are replicated over nodes and disks.
In ceph.conf I set the following:
osd pool default size = 2
osd pool default min size = 1
So the idea is that we always have 2 copies of the data.
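As I understand it, the same defaults can also be set on a running cluster
through the config database; a sketch:

  ceph config set global osd_pool_default_size 2
  ceph config set global osd_pool_default_min_size 1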
I do not seem to be able to figure out how the replication behaves
when things start to fail.
If the admin node goes down, one of the data nodes will
run the mon, mgr and mds. This will slow things down but
will be fine until we have a new admin node in place again.
(or is there something I am missing here?)
If just one data node goes down we will still not lose any
data, and that is fine until we have a new server.
But what if one data node goes down and one disk of the other
data node breaks, will I lose data then?
Or how many disks can I lose before I lose data?
This is what I cannot get my head around: how to think
when disaster strikes, and how much hardware I can lose before
I lose data.
Or have I got it all wrong?
Is it a bad idea with just 2 file servers; are more servers required?
The second thing I have a problem with is snapshots.
I managed to create a snapshot in the root with the command:
ceph fs subvolume snapshot create <vol_name> / <snap_name>
But it fails if I try to create a snapshot in any
other directory than the root.
Secondly, if I try to create a snapshot from the
client with:
mkdir /mnt-ceph/.snap/my_snapshot
I get the same error in all directories:
Permission denied.
I have not found any solution to this;
am I missing something here as well?
Any config missing?
Many thanks for your support!!
Best regards
Marcus