- ceph-users - lists.ceph.io

by Erich Weiler

Hello, We are tracking PR #56805: https://github.com/ceph/ceph/pull/56805 And the resolution of this item would potentially fix a pervasive and ongoing issue that needs daily attention in our cephfs cluster. I was wondering if it would be included in 18.2.3 which I *think* should be released soon? Is there any way of knowing if that is true? Thanks again, erich

1 week, 2 days

4
6
0 0

RGWs stop processing requests after upgrading to Reef

by Iain Stott

Hi, We have recently upgraded one of our clusters from Quincy 17.2.6 to Reef 18.2.1, since then we have had 3 instances of our RGWs stop processing requests. We have 3 hosts that run a single instance of RGW on each, and all 3 just seem to stop processing requests at the same time causing our storage to become unavailable. A restart or redeploy of the RGW service brings them back ok. The cluster was deployed using ceph ansible, but since we have adopted it to cephadm which is how the upgrade was performed. We have enabled debug logging as there was nothing out of the ordinary in normal logs and are currently sifting through them from the last crash. We are just wondering if it possible to run Quincy RGWs instead of Reef as we didn't have this issue prior to the upgrade? We have 3 clusters in a multisite setup, we are holding off on upgrading the other 2 clusters due to this issue. Thanks Iain Iain Stott OpenStack Engineer Iain.Stott(a)thg.com [THG Ingenuity Logo]<https://www.thg.com> www.thg.com<https://www.thg.com/> [LinkedIn]<https://www.linkedin.com/company/thgplc/?originalSubdomain=uk> [Instagram] <https://www.instagram.com/thg> [X] <https://twitter.com/thgplc?lang=en>

1 week, 3 days

2
2
0 0

Have a problem with haproxy/keepalived/ganesha/docker

by Ruslan Nurabayev

Hello! I've installed my 5-node CEPH cluster next install NFS server by command: ceph nfs cluster create nfshacluster 5 --ingress --virtual_ip 192.168.171.48/26 --ingress-mode haproxy-protocol. I don't understand fully how this must be works but when i stop NFS daemon even on one of this nodes I've see that writing on NFS shares is disappear (testing via vdbench). As i understand it is wrong and IO from stopped daemon must switching to another NFS daemon without any impact on IO. Can someone help me with troubleshoot this issue? Or explain how done full-fledged Active-Active HA NFS Cluster for production use. Thanks! Руслан Нурабаев Старший инженер Сектор ИТ платформы Отдел развития опорной сети Департамент развития сети +77012119272 Ruslan.Nurabayev(a)kcell.kz -----Original Message----- From: ceph-users-request(a)ceph.io <ceph-users-request(a)ceph.io> Sent: Thursday, April 11, 2024 15:07 To: Ruslan Nurabayev <Ruslan.Nurabayev(a)kcell.kz> Subject: Welcome to the "ceph-users" mailing list [You don't often get email from ceph-users-request(a)ceph.io. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] Welcome to the "ceph-users" mailing list! To post to this list, send your email to: ceph-users(a)ceph.io You can unsubscribe or make adjustments to your options via email by sending a message to: ceph-users-request(a)ceph.io with the word 'help' in the subject or body (don't include the quotes), and you will get back a message with instructions. You will need your password to change your options, but for security purposes, this password is not included here. If you have forgotten your password you will need to reset it via the web UI. ________________________________ **************************************************************************************** Осы хабарлама және онымен берілетін кез келген файлдар құпия болып табылады және олар мекенжайда көрсетілген жеке немесе заңды тұлғалардың пайдалануына ғана арналған. Егер сіз болжамды алушы болып табылмайтын болсаңыз, осы арқылы осындай ақпаратты кез келген таратуға, жіберуге, көшіруге немесе пайдалануға қатаң тыйым салынатыны және осы электрондық хабарлама дереу жойылуға тиіс екендігін хабарлаймыз. KCELL осы хабарламадағы кез келген ақпараттың дәлдігіне немесе толықтығына қатысты ешқандай кепілдік бермейді және сол арқылы онда қамтылған ақпарат үшін немесе оны беру, қабылдау, сақтау немесе қандай да бір түрде пайдалану үшін кез келген жауапкершілікті болдырмайды. Осы хабарламада айтылған пікірлер тек жіберушіге ғана тиесілі және KCELL пікірін де білдіруі міндетті емес. Бұл электрондық хабарлама барлық танымал компьютерлік вирустарға тексерілді. **************************************************************************************** Данное сообщение и любые передаваемые с ним файлы являются конфиденциальными и предназначены исключительно для использования физическими или юридическими лицами, которым они адресованы. Если вы не являетесь предполагаемым получателем, настоящим уведомляем о том, что любое распространение, пересылка, копирование или использование такой информации строго запрещено, и данное электронное сообщение должно быть немедленно удалено. KCELL не дает никаких гарантий относительно точности или полноты любой информации, содержащейся в данном сообщении, и тем самым исключает любую ответственность за информацию, содержащуюся в нем, или за ее передачу, прием, хранение или использование каким-либо образом. Мнения, выраженные в данном сообщении, принадлежат только отправителю и не обязательно отражают мнение KCELL. Данное электронное сообщение было проверено на наличие всех известных компьютерных вирусов. **************************************************************************************** This e-mail and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you are not the intended recipient you are hereby notified that any dissemination, forwarding, copying or use of any of the information is strictly prohibited, and the e-mail should immediately be deleted. KCELL makes no warranty as to the accuracy or completeness of any information contained in this message and hereby excludes any liability of any kind for the information contained therein or for the information transmission, reception, storage or use of such in any way whatsoever. The opinions expressed in this message belong to sender alone and may not necessarily reflect the opinions of KCELL. This e-mail has been scanned for all known computer viruses. ****************************************************************************************

1 week, 3 days

5
6
0 0

Multiple MDS Daemon needed?

by Erich Weiler

Hi All, We have a slurm cluster with 25 clients, each with 256 cores, each mounting a cephfs filesystem as their main storage target. The workload can be heavy at times. We have two active MDS daemons and one standby. A lot of the time everything is healthy but we sometimes get warnings about MDS daemons being slow on requests, behind on trimming, etc. I realize their may be a bug in play, but also, I was wondering if we simply didn't have enough MDS daemons to handle the load. Is there a way to know if adding a MDS daemon would help? We could add a third active MDS if needed. But I don't want to start adding a bunch of MDS's if that won't help. The OSD servers seem fine. It's mainly the MDS instances that are complaining. We are running reef 18.2.1. For reference, when things look healthy: # ceph fs status slugfs slugfs - 34 clients ====== RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS 0 active slugfs.pr-md-03.mclckv Reqs: 273 /s 2759k 2636k 362k 1079k 1 active slugfs.pr-md-01.xdtppo Reqs: 194 /s 868k 674k 67.3k 351k POOL TYPE USED AVAIL cephfs_metadata metadata 127G 3281G cephfs_md_and_data data 0 98.3T cephfs_data data 740T 196T STANDBY MDS slugfs.pr-md-02.sbblqq MDS version: ceph version 18.2.1 (7fe91d5d5842e04be3b4f514d6dd990c54b29c76) reef (stable) # ceph -s cluster: id: 58bde08a-d7ed-11ee-9098-506b4b4da440 health: HEALTH_OK services: mon: 5 daemons, quorum pr-md-01,pr-md-02,pr-store-01,pr-store-02,pr-md-03 (age 5d) mgr: pr-md-01.jemmdf(active, since 5w), standbys: pr-md-02.emffhz mds: 2/2 daemons up, 1 standby osd: 46 osds: 46 up (since 8d), 46 in (since 4w) data: volumes: 1/1 healthy pools: 4 pools, 1313 pgs objects: 271.17M objects, 493 TiB usage: 744 TiB used, 384 TiB / 1.1 PiB avail pgs: 1307 active+clean 4 active+clean+scrubbing 2 active+clean+scrubbing+deep io: client: 39 MiB/s rd, 108 MiB/s wr, 1.96k op/s rd, 54 op/s wr But when things are in "warning" mode, it looks like this: # ceph -s cluster: id: 58bde08a-d7ed-11ee-9098-506b4b4da440 health: HEALTH_WARN 1 filesystem is degraded 1 clients failing to advance oldest client/flush tid 1 MDSs report slow requests 1 MDSs behind on trimming services: mon: 5 daemons, quorum pr-md-01,pr-md-02,pr-store-01,pr-store-02,pr-md-03 (age 5d) mgr: pr-md-01.jemmdf(active, since 5w), standbys: pr-md-02.emffhz mds: 2/2 daemons up, 1 standby osd: 46 osds: 46 up (since 8d), 46 in (since 4w) data: volumes: 1/1 healthy pools: 4 pools, 1313 pgs objects: 271.28M objects, 494 TiB usage: 746 TiB used, 382 TiB / 1.1 PiB avail pgs: 1307 active+clean 5 active+clean+scrubbing 1 active+clean+scrubbing+deep io: client: 55 MiB/s rd, 2.6 MiB/s wr, 15 op/s rd, 46 op/s wr And this: # ceph health detail HEALTH_WARN 2 clients failing to advance oldest client/flush tid; 2 MDSs report slow requests; 1 MDSs behind on trimming [WRN] MDS_CLIENT_OLDEST_TID: 2 clients failing to advance oldest client/flush tid mds.slugfs.pr-md-01.xdtppo(mds.0): Client phoenix-06.prism failing to advance its oldest client/flush tid. client_id: 125780 mds.slugfs.pr-md-02.sbblqq(mds.1): Client phoenix-00.prism failing to advance its oldest client/flush tid. client_id: 99385 [WRN] MDS_SLOW_REQUEST: 2 MDSs report slow requests mds.slugfs.pr-md-01.xdtppo(mds.0): 4 slow requests are blocked > 30 secs mds.slugfs.pr-md-02.sbblqq(mds.1): 67 slow requests are blocked > 30 secs [WRN] MDS_TRIM: 1 MDSs behind on trimming mds.slugfs.pr-md-02.sbblqq(mds.1): Behind on trimming (109410/250) max_segments: 250, num_segments: 109410 The "cure" is the restart the active MDS daemons, one at a time. Then everything becomes healthy again, for a time. We also have the following MDS config items in play: mds_cache_memory_limit = 8589934592 mds_cache_trim_decay_rate = .6 mds_log_max_segments = 250 Thanks for any pointers! cheers, erich

1 week, 3 days

2
1
0 0

Re: Best practice and expected benefits of using separate WAL and DB devices with Bluestore

by Darren Soothill

Hi Niklaus, Lots of questions here but let me tray and get through some of them. Personally unless a cluster is for deep archive then I would never suggest configuring or deploying a cluster without Rocks DB and WAL on NVME. There are a number of benefits to this in terms of performance and recovery. Small writes go to the NVME first before being written to the HDD and it makes many recovery operations far more efficient. As to how much faster it makes things that very much depends on the type of workload you have on the system. Lots of small writes will make a significant difference. Very large writes not as much of a difference. Things like compactions of the RocksDB database are a lot faster as they are now running from NVME and not from the HDD. We normally work with a upto 1:12 ratio so 1 NVME for every 12 HDD’s. This is assuming the NVME’s being used are good mixed use enterprise NVME’s with power loss protection. As to failures yes a failure of the NVME would mean a loss of 12 OSD’s but this is no worse than a failure of an entire node. This is something Ceph is designed to handle. I certainly wouldn’t be thinking about putting the NVME’s into raid sets as that will degrade the performance of them when you are trying to get better performance. Darren Soothill Looking for help with your Ceph cluster? Contact us at https://croit.io/ croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht Munich HRB 231263 Web: https://croit.io/ | YouTube: https://goo.gl/PGE1Bx

1 week, 4 days

3
3
0 0

RGW: Cannot write to bucket anymore

by Malte Stroem

Hello, there is one bucket for a user in our Ceph cluster who is suddenly not able to write to one of his buckets. Reading works fine. All other buckets work fine. If we copy the bucket to another bucket on the same cluster, the error stays. Writing is not possible in the new bucket, too. Interesting: If we copy the contents of the bucket to a bucket in another Ceph cluster the error is gone. So now we know how to solve this but we do not finde the root cause. I checked the policies, lifecycle and versioning. Nothing. The user has FULL_CONTROL. Same settings for the user's other buckets he can still write to. Wenn setting debugging to higher numbers all I can see is something like this while trying to write to the bucket: s3:put_obj reading permissions s3:put_obj init op s3:put_obj verifying op mask s3:put_obj verifying op permissions op->ERRORHANDLER: err_no=-13 new_err_no=-13 cache get: name=default.rgw.log++script.postrequest. : hit (negative entry) s3:put_obj op status=0 s3:put_obj http status=403 1 ====== req done req=0x7fe8bb60a710 op status=0 http_status=403 latency=0.000000000s ====== I still think there is something with a policy or so. When we copy the bucket to another bucket in the same cluster, at first, while copying you can write to the new bucket but when the copy job progresses at one point writing is not possible anymore. But what is it? Best, Malte

1 week, 4 days

2
4
0 0

Upgrading Ceph 15 to 18

by Malte Stroem

Hello, we'd like to upgrade our cluster from the latest Ceph 15 to Ceph 18. It's running with cephadm. What's the right way to do it? Latest Ceph 15 to latest 16 and then to the latest 17 and then the latest 18? Does that work? Or is it possible to jump from the latest Ceph 16 to the latest Ceph 18? Latest Ceph 15 -> latest Ceph 16 -> latest Ceph 18. Best, Malte

1 week, 4 days

3
4
0 0

MDS daemons crash

by Alexey GERASIMOV

Hello all! Hope that anybody can help us. The initial point: Ceph cluster v15.2 (installed and controlled by the Proxmox) with 3 nodes based on physical servers rented from a cloud provider. The volumes provided by Ceph using CephFS and RBD also. We run 2 MDS daemons but use max_mds=1 so one daemon was in active state, and another in standby. On Thursday some of the applications stopped working. After the investigation it was clear that we have a problem with Ceph, more precisely with СephFS - both MDS daemons suddenly crashed. We tried to restart them and found that they crashed again immediately after the start. The crash information: 2024-04-17T17:47:42.841+0000 7f959ced9700 1 mds.0.29134 recovery_done -- successful recovery! 2024-04-17T17:47:42.853+0000 7f959ced9700 1 mds.0.29134 active_start 2024-04-17T17:47:42.881+0000 7f959ced9700 1 mds.0.29134 cluster recovered. 2024-04-17T17:47:43.825+0000 7f959aed5700 -1 ./src/mds/OpenFileTable.cc: In function 'void OpenFileTable::commit(MDSContext*, uint64_t, int)' thread 7f959aed5700 time 2024-04-17T17:47:43.831243+0000 ./src/mds/OpenFileTable.cc: 549: FAILED ceph_assert(count > 0) Next hours we read tons of articles, studied the documentation, and checked the cluster status in general by the various diagnostic commands - but didn't find anything wrong. At evening we decided to upgrade our Ceph cluster; so, we upgraded it to v16, and finally to v17.2.7. Unfortunately, it didn't solve the problem, MDS continue to crash with the same error. The only difference that we found is the "1 MDSs report damaged metadata" in the output of ceph -s - see it below. I supposed that it may be the well-known bug, but couldn't find the same one on https://tracker.ceph.com - there are several bugs associated with file OpenFileTable.cc but not related to ceph_assert(count > 0) We tried to check the source code of OpenFileTable.cc also, here is a fragment of it, in function OpenFileTable::_journal_finish int omap_idx = anchor.omap_idx; unsigned& count = omap_num_items.at(omap_idx); ceph_assert(count > 0); So, we guess that the object map is empty for some object in Ceph, and it is unexpected behavior. But again, we found nothing wrong in our cluster... Next, we started with https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/ article - tried to reset the journal (despite that it was Ok all the time) and wipe the sessions using cephfs-table-tool all reset session command. No result... Now I decided to continue following this article and run cephfs-data-scan scan_extents command, we started it on Friday but it is still working (2 from 3 workers finished, so I'm waiting for the last one; may be I need more workers for the next command cephfs-data-scan scan_inodes that I plan to run ). But I have a doubt that it will solve the issue because, again, we guess that we have no problem with our objects in Ceph but with metadata only... Is it the new bug? or something else? What should we try additionally to run our MDS daemon? Any idea is welcome! The important outputs: ceph -s cluster: id: 4cd1c477-c8d0-4855-a1f1-cb71d89427ed health: HEALTH_ERR 1 MDSs report damaged metadata insufficient standby MDS daemons available 83 daemons have recently crashed 3 mgr modules have recently crashed services: mon: 3 daemons, quorum asrv-dev-stor-2,asrv-dev-stor-3,asrv-dev-stor-1 (age 22h) mgr: asrv-dev-stor-2(active, since 22h), standbys: asrv-dev-stor-1 mds: 1/1 daemons up osd: 18 osds: 18 up (since 22h), 18 in (since 29h) data: volumes: 1/1 healthy pools: 5 pools, 289 pgs objects: 29.72M objects, 5.6 TiB usage: 21 TiB used, 47 TiB / 68 TiB avail pgs: 287 active+clean 2 active+clean+scrubbing+deep io: client: 2.5 KiB/s rd, 172 KiB/s wr, 261 op/s rd, 195 op/s wr ceph fs dump e29480 enable_multiple, ever_enabled_multiple: 0,1 default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2} legacy client fscid: 1 Filesystem 'cephfs' (1) fs_name cephfs epoch 29480 flags 12 joinable allow_snaps allow_multimds_snaps created 2022-11-25T15:56:08.507407+0000 modified 2024-04-18T16:52:29.970504+0000 tableserver 0 root 0 session_timeout 60 session_autoclose 300 max_file_size 1099511627776 required_client_features {} last_failure 0 last_failure_osd_epoch 14728 compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2} max_mds 1 in 0 up {0=156636152} failed damaged stopped data_pools [5] metadata_pool 6 inline_data disabled balancer standby_count_wanted 1 [mds.asrv-dev-stor-1{0:156636152} state up:active seq 6 laggy since 2024-04-18T16:52:29.970479+0000 addr [v2:172.22.2.91:6800/2487054023,v1:172.22.2.91:6801/2487054023] compat {c=[1],r=[1],i=[7ff]}] cephfs-journal-tool --rank=cephfs:0 journal inspect Overall journal integrity: OK ceph pg dump summary version 41137 stamp 2024-04-18T21:17:59.133536+0000 last_osdmap_epoch 0 last_pg_scan 0 PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG sum 29717605 0 0 0 0 6112544251872 13374192956 28493480 1806575 1806575 OSD_STAT USED AVAIL USED_RAW TOTAL sum 21 TiB 47 TiB 21 TiB 68 TiB ceph pg dump pools POOLID OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG 8 31771 0 0 0 0 131337887503 2482 140 401246 401246 7 839707 0 0 0 0 3519034650971 736 61 399328 399328 6 1319576 0 0 0 0 421044421 13374189738 28493279 206749 206749 5 27526539 0 0 0 0 2461702171417 0 0 792165 792165 2 12 0 0 0 0 48497560 0 0 6991 6991 --- Best regards, Alexey Gerasimov System Manager www.opencascade.com<http://www.opencascade.com/> www.capgemini.com<http://www.capgemini.com> [cid:image001.png@01DA93D0.B001CE80]

1 week, 4 days

1
0
0 0

Working ceph cluster reports large amount of pgs in state unknown/undersized and objects degraded

by Tobias Langner

We operate a tiny ceph cluster (v16.2.7) across three machines, each running two OSDs and one of each mds, mgr, and mon. The cluster serves one main erasure-coded (2+1) storage pool and a few other management-related pools. The cluster has been running smoothly for several months. A few weeks ago we noticed a health warning reporting backfillfull/nearfull osds and pools. Here is the output of `ceph -s` at this point (extraced from logs): -------------------------------------------------------------------------------- cluster: health: HEALTH_WARN 1 backfillfull osd(s) 2 nearfull osd(s) Reduced data availability: 163 pgs inactive, 1 pg peering Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull Degraded data redundancy: 1486709/10911157 objects degraded (13.626%), 68 pgs degraded, 68 pgs undersized 162 pgs not scrubbed in time 6 pool(s) backfillfull services: mon: 3 daemons, quorum mon.101,mon.102,mon.100 (age 5m) mgr: mgr-102(active, since 54m), standbys: mgr-101, mgr-100 mds: 1/1 daemons up, 1 standby, 1 hot standby osd: 6 osds: 6 up (since 4m), 6 in (since 2w); 7 remapped pgs data: volumes: 1/1 healthy pools: 6 pools, 338 pgs objects: 3.64M objects, 14 TiB usage: 13 TiB used, 1.7 TiB / 15 TiB avail pgs: 47.929% pgs unknown 0.296% pgs not active 1486709/10911157 objects degraded (13.626%) 52771/10911157 objects misplaced (0.484%) 162 unknown 106 active+clean 67 active+undersized+degraded 1 active+undersized+degraded+remapped+backfill_toofull 1 remapped+peering 1 active+remapped+backfill_toofull -------------------------------------------------------------------------------- I now see the large amount of pgs in state unknown and the fact that a significant fraction of objects is degraded despite all osds being up, but we didn't notice this back then. Because the cluster continued to act fine from the perspective of the mounted filesystem, we didn't really notice the potential problem and did not intervene. From then one, things have mostly gone downwards. Now, `ceph -s` reports the following: -------------------------------------------------------------------------------- cluster: health: HEALTH_WARN noout flag(s) set Reduced data availability: 117 pgs inactive Degraded data redundancy: 2095625/12121767 objects degraded (17.288%), 114 pgs degraded, 114 pgs undersized 117 pgs not scrubbed in time services: mon: 3 daemons, quorum mon.101,mon.102,mon.100 (age 15h) mgr: mgr-102(active, since 7d), standbys: mgr-100, mgr-101 mds: 1/1 daemons up, 1 standby, 1 hot standby osd: 6 osds: 6 up (since 55m), 6 in (since 3w) flags noout data: volumes: 1/1 healthy pools: 6 pools, 338 pgs objects: 4.04M objects, 15 TiB usage: 12 TiB used, 2.8 TiB / 15 TiB avail pgs: 34.615% pgs unknown 2095625/12121767 objects degraded (17.288%) 117 unknown 114 active+undersized+degraded 107 active+clean -------------------------------------------------------------------------------- Note in particular the still very large number of pgs in state unknown, which hasn't changed in days. Same goes for the degraded pgs. Also, the cluster should have around 37TiB storage available but now it only reports 15 TiB. We did a bit of digging around but couldn't really get to the bottom of the unknown pgs and how we can recover from that. One other data point is that the command `ceph osd df tree` gets stuck on two of the three machines and one the one where it returns something, it looks like this: -------------------------------------------------------------------------------- ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME -1 47.67506 - 0 B 0 B 0 B 0 B 0 B 0 B 0 0 - root default -13 18.26408 - 0 B 0 B 0 B 0 B 0 B 0 B 0 0 - datacenter dc.100 -5 18.26408 - 0 B 0 B 0 B 0 B 0 B 0 B 0 0 - host osd-100 3 hdd 10.91409 1.00000 0 B 0 B 0 B 0 B 0 B 0 B 0 0 91 up osd.3 5 hdd 7.34999 1.00000 0 B 0 B 0 B 0 B 0 B 0 B 0 0 48 up osd.5 -9 14.69998 - 0 B 0 B 0 B 0 B 0 B 0 B 0 0 - datacenter dc.101 -7 14.69998 - 0 B 0 B 0 B 0 B 0 B 0 B 0 0 - host osd-101 0 hdd 7.34999 1.00000 0 B 0 B 0 B 0 B 0 B 0 B 0 0 83 up osd.0 1 hdd 7.34999 1.00000 0 B 0 B 0 B 0 B 0 B 0 B 0 0 86 up osd.1 -11 14.71100 - 15 TiB 12 TiB 12 TiB 77 MiB 21 GiB 2.6 TiB 82.00 1.00 - datacenter dc.102 -17 7.35550 - 7.4 TiB 6.3 TiB 6.2 TiB 16 MiB 11 GiB 1.1 TiB 85.16 1.04 - host osdroid-102-1 4 hdd 7.35550 1.00000 7.4 TiB 6.3 TiB 6.2 TiB 16 MiB 11 GiB 1.1 TiB 85.16 1.04 114 up osd.4 -15 7.35550 - 7.4 TiB 5.8 TiB 5.7 TiB 61 MiB 10 GiB 1.6 TiB 78.83 0.96 - host osdroid-102-2 2 hdd 7.35550 1.00000 7.4 TiB 5.8 TiB 5.7 TiB 61 MiB 10 GiB 1.6 TiB 78.83 0.96 107 up osd.2 TOTAL 15 TiB 12 TiB 12 TiB 77 MiB 21 GiB 2.6 TiB 82.00 MIN/MAX VAR: 0/1.04 STDDEV: 66.97 -------------------------------------------------------------------------------- The odd part here is that for some reason only osd.2 and osd.4 seem to contribute size to the cluster. Interestingly, accessing content from the storage pool works mostly without issues, which shouldn't work if 4 out of 6 OSDs weren't properly up. Even more odd is that while `ceph health detail` reports a lot of pgs in state unknown, undersized, and degraded, inspecting the respective pgs with `ceph pg <pdid> query` results in active+clean for *all* of them... I'm not sure which of the two pieces of information I am supposed to trust... Any ideas what we can do to get our cluster back into a sane state? I'm happy to provide more logs or command output, please let me know. Thanks!

1 week, 5 days

4
5
0 0

Mysterious Space-Eating Monster

by duluxoz

Hi All, *Something* is chewing up a lot of space on our `\var` partition to the point where we're getting warnings about the Ceph monitor running out of space (ie > 70% full). I've been looking, but I can't find anything significant (ie log files aren't too big, etc) BUT there seem to be a hell of a lot (15) of sub-directories (with GUIDs for names) under the `/var/lib/containers/storage/overlay/` folder, all ending with `merged` - ie `/var/lib/containers/storage/overlay/{{GUID}}/`merged`. Is this normal, or is something going wrong somewhere, or am I looking in the wrong place? Also, if this is the issue, can I delete these folders? Sorry for asking such a noob Q, but the Cephadm/Podman stuff is extremely new to me :-) Thanks in advance Cheers Dulux-Oz

1 week, 6 days

3
2
0 0

2024

2023

2022

2021

2020

2019

ceph-users