Dear All,
I'm trying to recover failed MDS metadata by following the link below but am having trouble. Thanks in advance.
Question 1: How do I scan 2 data pools with scan_extents (cmd 1)? The command didn't work with two pools specified. Should I scan one and then the other?
Question 2: For scan_inodes (cmd 2), should I specify only the first data pool, as the document says? I'm concerned that if the 2nd pool is not scanned, that will cause metadata loss.
My fs name: cephfs, data pools: cephfs_hdd, cephfs_ssd
cmd 1: cephfs-data-scan scan_extents --filesystem cephfs cephfs_hdd cephfs_ssd
cmd 2: cephfs-data-scan scan_inodes --filesystem cephfs cephfs_hdd
cephfs-data-scan scan_extents [<data pool> [<extra data pool> ...]]
cephfs-data-scan scan_inodes [<data pool>]
cephfs-data-scan scan_links
Note, the data pool parameters for ‘scan_extents’, ‘scan_inodes’ and ‘cleanup’ commands are optional, and usually the tool will be able to detect the pools automatically. Still you may override this. The ‘scan_extents’ command needs all data pools to be specified, while ‘scan_inodes’ and ‘cleanup’ commands need only the main data pool.
https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/
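For reference, my reading of the quoted usage applied to the two pools above would be the following (unverified; this is exactly what I'm unsure about):
cephfs-data-scan scan_extents --filesystem cephfs cephfs_hdd cephfs_ssd   # all data pools?
cephfs-data-scan scan_inodes --filesystem cephfs cephfs_hdd               # main data pool only?
cephfs-data-scan scan_links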
--
Best Regards,
Justin Li
IT Support/Systems Administrator
Justin.Li2030(a)Gmail.com
http://www.linkedin.com/in/justinli7
Hi,
lately, we have had some issues with our MDSs (Ceph version 16.2.10
Pacific).
Some of them are related to the MDS being behind on trimming.
I checked the documentation and found the following information (
https://docs.ceph.com/en/pacific/cephfs/health-messages/):
> CephFS maintains a metadata journal that is divided into *log segments*.
The length of journal (in number of segments) is controlled by the setting
mds_log_max_segments, and when the number of segments exceeds that setting
the MDS starts writing back metadata so that it can remove (trim) the
oldest segments. If this writeback is happening too slowly, or a software
bug is preventing trimming, then this health message may appear. The
threshold for this message to appear is controlled by the config option
mds_log_warn_factor, the default is 2.0.
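To see how far the journal actually is from that threshold, I believe one can compare the live segment count with the configured limits via the admin socket (the daemon name below is a placeholder):
```
ceph daemon mds.<name> perf dump mds_log | grep '"seg"'    # current number of log segments
ceph daemon mds.<name> config get mds_log_max_segments     # trim target
ceph daemon mds.<name> config get mds_log_warn_factor      # warning threshold = target * factor
```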
Some resources on the web (https://www.suse.com/support/kb/doc/?id=000019740)
indicated that a solution would be to change the `mds_log_max_segments`.
Which I did:
```
ceph --cluster floki tell mds.* injectargs '--mds_log_max_segments=400000'
```
Of course, the warning disappeared, but I have a feeling that I just hid the problem. Pushing the value to 400'000 when the default is 512 is a lot.
Why is the trimming not taking place? How can I troubleshoot this further?
Best,
Emmanuel
Hi all,
I have an annoying problem with a specific ceph fs client. We have a file server on which we re-export kernel mounts via samba (all mounts with noshare option). On one of these re-exports we have recurring problems. Today I caught it with
2023-05-10T13:39:50.963685+0200 mds.ceph-23 (mds.1) 1761 : cluster [WRN] client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 61.705410 seconds ago
and I wanted to look up what path the inode 0x20011d3e5cb points to. Unfortunately, the command
ceph tell "mds.*" dump inode 0x20011d3e5cb
crashes an MDS in a way that it restarts itself, but doesn't seem to come back clean (it does not fail over to a stand-by). If I repeat the command above, it crashes the MDS again. Execution on other MDS daemons succeeds, for example:
# ceph tell "mds.ceph-24" dump inode 0x20011d3e5cb
2023-05-10T14:14:37.091+0200 7fa47ffff700 0 client.210149523 ms_handle_reset on v2:192.168.32.88:6800/3216233914
2023-05-10T14:14:37.124+0200 7fa4857fa700 0 client.210374440 ms_handle_reset on v2:192.168.32.88:6800/3216233914
dump inode failed, wrong inode number or the inode is not cached
The caps recall gets the client evicted at some point but it doesn't manage to come back clean. On a single ceph fs mount point I see this
# ls /shares/samba/rit-oil
ls: cannot access '/shares/samba/rit-oil': Stale file handle
All other mount points are fine, just this one acts up. A "mount -o remount /shares/samba/rit-oil" crashed the entire server and I had to do a cold reboot. On reboot I see this message: https://imgur.com/a/bOSLxBb , which only occurs on this one file server (we are running a few of those). Does this point to a more serious problem, like a file system corruption? Should I try an fs scrub on the corresponding path?
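In case I do, my understanding is that a read-only recursive scrub would be started roughly like this (the path inside the file system and the daemon name are placeholders; not tested on this cluster):
ceph tell mds.ceph-24 scrub start /path/inside/cephfs recursive
ceph tell mds.ceph-24 scrub status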
Some info about the system:
The file server's kernel version is quite recent, updated two weeks ago:
$ uname -r
4.18.0-486.el8.x86_64
# cat /etc/redhat-release
CentOS Stream release 8
Our ceph cluster is octopus latest and we use the packages from the octopus el8 repo on this server.
We have several such shares and they all work fine. It is only on one share where we have persistent problems with the mount point hanging or the server freezing and crashing.
After working hours I will try a proper fail of the "broken" MDS to see if I can execute the dump inode command without it crashing everything.
In the meantime, any hints would be appreciated. I see that we have an exceptionally large MDS log for the problematic one; any hint what to look for would be helpful, as it contains a lot from the recovery operations:
# pdsh -w ceph-[08-17,23-24] ls -lh "/var/log/ceph/ceph-mds.ceph-??.log"
ceph-23: -rw-r--r--. 1 ceph ceph 15M May 10 14:28 /var/log/ceph/ceph-mds.ceph-23.log *** huge ***
ceph-24: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-24.log
ceph-10: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-10.log
ceph-13: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-13.log
ceph-08: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-08.log
ceph-15: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-15.log
ceph-17: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-17.log
ceph-14: -rw-r--r--. 1 ceph ceph 16K May 10 14:28 /var/log/ceph/ceph-mds.ceph-14.log
ceph-09: -rw-r--r--. 1 ceph ceph 16K May 10 14:28 /var/log/ceph/ceph-mds.ceph-09.log
ceph-16: -rw-r--r--. 1 ceph ceph 15K May 10 14:28 /var/log/ceph/ceph-mds.ceph-16.log
ceph-11: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-11.log
ceph-12: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-12.log
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hello,
I have a Quincy (17.2.6) cluster, looking to create a multi-zone /
multi-region RGW service and have a few questions with respect to published
docs - https://docs.ceph.com/en/quincy/radosgw/multisite/.
In general, I understand the process as:
1. Create a new REALM, ZONEGROUP, ZONE:
radosgw-admin realm create --rgw-realm=my_new_realm [--default]
radosgw-admin zonegroup create --rgw-zonegroup=my_country --endpoints=http://rgw1:80 --rgw-realm=my_new_realm --master --default
radosgw-admin zone create --rgw-zonegroup=my_country --rgw-zone=my-region \
    --master --default \
    --endpoints={http://fqdn}[,{http://fqdn}]
## Question:
If I have multiple RGWs deployed on my cluster, do I specify all of them as individual endpoints, or does specifying one RGW automatically propagate the config to all of them?
2. Create SYSTEM user
radosgw-admin user create --uid="synchronization-user" --display-name="Synchronization User" --system
radosgw-admin zone modify --rgw-zone={zone-name} --access-key={access-key} --secret={secret}
radosgw-admin period update --commit
## Question:
Is the SYSTEM user used only for replication? Will creating a new REALM, ZONEGROUP, ZONE reset any administrative access to management of RGWs through the ceph-dashboard?
3. Remove DEFAULT REALM, ZONEGROUP, ZONE and supporting pools
radosgw-admin zonegroup remove --rgw-zonegroup=default --rgw-zone=default
radosgw-admin period update --commit
radosgw-admin zone delete --rgw-zone=default
radosgw-admin period update --commit
radosgw-admin zonegroup delete --rgw-zonegroup=default
radosgw-admin period update --commit
ceph osd pool rm default.rgw.control default.rgw.control --yes-i-really-really-mean-it
ceph osd pool rm default.rgw.data.root default.rgw.data.root --yes-i-really-really-mean-it
ceph osd pool rm default.rgw.gc default.rgw.gc --yes-i-really-really-mean-it
ceph osd pool rm default.rgw.log default.rgw.log --yes-i-really-really-mean-it
ceph osd pool rm default.rgw.users.uid default.rgw.users.uid --yes-i-really-really-mean-it
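If I understand correctly, these pool deletions also require the monitors to allow pool removal in the first place, e.g.:
ceph config set mon mon_allow_pool_delete true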
4. UPDATING CEPH CONFIG FILE / RGW CONFIG VIA CEPH ORCH
# QUESTION:
Since I’m using ceph orch, would I simply set the rgw_zone property via CLUSTER -> CONFIGURATION on the ceph-dashboard?
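What I have in mind is either an RGW service spec through the orchestrator or a plain config setting; just a sketch, the service name and placement below are placeholders:
ceph orch apply rgw my_rgw_service --realm=my_new_realm --zone=my-region --placement="2 rgw1 rgw2"
# or, alternatively, via the config database:
ceph config set client.rgw rgw_realm my_new_realm
ceph config set client.rgw rgw_zone my-region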
Thank you.
Hi all,
on an NFS re-export of a ceph-fs (kernel client) I observe a very strange error. I'm un-tarring a larger package (1.2G) and after some time I get these errors:
ln: failed to create hard link 'file name': Read-only file system
The strange thing is that this seems only temporary. When I used "ln src dst" for manual testing, the command failed as above. However, after that I tried "ln -v src dst" and this command created the hard link with exactly the same path arguments. During the period when the error occurs, I can't see any FS in read-only mode, neither on the NFS client nor the NFS server. Funny thing is that file creation and writes still work; it's only the hard-link creation that fails.
For details, the set-up is:
file-server: mount ceph-fs at /shares/path, export /shares/path as nfs4 to other server
other server: mount /shares/path as NFS
More precisely, on the file-server:
fstab: MON-IPs:/shares/folder /shares/nfs/folder ceph defaults,noshare,name=NAME,secretfile=sec.file,mds_namespace=FS-NAME,_netdev 0 0
exports: /shares/nfs/folder -no_root_squash,rw,async,mountpoint,no_subtree_check DEST-IP
On the host at DEST-IP:
fstab: FILE-SERVER-IP:/shares/nfs/folder /mnt/folder nfs defaults,_netdev 0 0
Both the file server and the client server are virtual machines. The file server is on CentOS 8 Stream (4.18.0-338.el8.x86_64) and the client machine is on AlmaLinux 8 (4.18.0-425.13.1.el8_7.x86_64).
When I change the NFS export from "async" to "sync" everything works. However, that's a rather bad workaround and not a solution. Although this looks like an NFS issue, I'm afraid it is a problem with hard links and ceph-fs. It looks like a race with scheduling and executing operations on the ceph-fs kernel mount.
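For anyone who wants to try to reproduce it, a loop along these lines on the NFS client (using the /mnt/folder mount from the fstab above; the test directory name is a placeholder) exercises the same create-then-hardlink pattern as tar:
mkdir -p /mnt/folder/linktest && cd /mnt/folder/linktest
for i in $(seq 1 1000); do
  echo test > "src.$i"
  ln "src.$i" "dst.$i" || echo "link failed for src.$i"
done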
Has anyone seen something like that?
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Dear Ceph community,
on our way towards getting our cluster to a current Ceph release, we updated all hosts and clients to Nautilus 14.2.22. But despite setting `ceph osd set-require-min-compat-client nautilus`, the release reported by `ceph features` is still "luminous".
Is this supposed to be like this? If not, does anyone have an idea what might be missing to make the features being reported as "nautilus" as well?
```
~ # ceph mon dump
epoch 66
fsid b67bad36-3273-11e3-a2ed-0200000311bf
last_changed 2022-12-12 12:20:39.244333
created 2013-10-11 14:57:32.291514
min_mon_release 14 (nautilus)
0: [v2:172.20.4.10:3300/0,v1:172.20.4.10:6789/0] mon.host1
1: [v2:172.20.4.100:3300/0,v1:172.20.4.100:6789/0] mon.host2
2: [v2:172.20.4.101:3300/0,v1:172.20.4.101:6789/0] mon.host3
dumped monmap epoch 66
~ # ceph features
{
    "mon": [
        {
            "features": "0x3ffddff8ffecffff",
            "release": "luminous",
            "num": 3
        }
    ],
    "osd": [
        {
            "features": "0x3ffddff8ffecffff",
            "release": "luminous",
            "num": 14
        }
    ],
    "client": [
        {
            "features": "0x3ffddff8ffecffff",
            "release": "luminous",
            "num": 137
        }
    ],
    "mgr": [
        {
            "features": "0x3ffddff8ffecffff",
            "release": "luminous",
            "num": 3
        }
    ]
}
```
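For completeness, one way to double-check that the min-compat setting was actually applied (output omitted here):
```
~ # ceph osd dump | grep min_compat_client
```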
All the best
--
Oliver Schmidt · os(a)flyingcircus.io · Systems Engineer
Flying Circus Internet Operations GmbH · http://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
Sorry Patrick, the last email was rejected because of the attachment size limit. Here is a link for you to download the log. Thanks.
https://drive.google.com/drive/folders/1bV_X7vyma_-gTfLrPnEV27QzsdmgyK4g?us…
Justin Li
Senior Technical Officer
School of Information Technology
Faculty of Science, Engineering and Built Environment
For ICT Support please see https://www.deakin.edu.au/sebeicthelp
Deakin University
Melbourne Burwood Campus, 221 Burwood Highway, Burwood, VIC 3125
+61 3 9246 8932
Justin.li(a)deakin.edu.au
http://www.deakin.edu.au/
Deakin University CRICOS Provider Code 00113B
Important Notice: The contents of this email are intended solely for the named addressee and are confidential; any unauthorised use, reproduction or storage of the contents is expressly prohibited. If you have received this email in error, please delete it and any attachments immediately and advise the sender by return email or telephone.
Deakin University does not warrant that this email and any attachments are error or virus free.
-----Original Message-----
From: Justin Li
Sent: Wednesday, May 24, 2023 8:21 AM
To: Patrick Donnelly <pdonnell(a)redhat.com>
Cc: ceph-users(a)ceph.io
Subject: RE: [ceph-users] [Help appreciated] ceph mds damaged
Hi Patrick,
I attached two logs here. Those two servers are one of the monitors and one of the MDSs. Let me know if you need more logs. Thanks.
Justin Li
Senior Technical Officer
School of Information Technology
Faculty of Science, Engineering and Built Environment
For ICT Support please see https://www.deakin.edu.au/sebeicthelp
Deakin University
Melbourne Burwood Campus, 221 Burwood Highway, Burwood, VIC 3125
+61 3 9246 8932
Justin.li(a)deakin.edu.au
http://www.deakin.edu.au/
Deakin University CRICOS Provider Code 00113B
-----Original Message-----
From: Patrick Donnelly <pdonnell(a)redhat.com>
Sent: Wednesday, May 24, 2023 7:35 AM
To: Justin Li <justin.li(a)deakin.edu.au>
Cc: ceph-users(a)ceph.io
Subject: Re: [ceph-users] [Help appreciated] ceph mds damaged
Hello Justin,
On Tue, May 23, 2023 at 4:55 PM Justin Li <justin.li(a)deakin.edu.au> wrote:
>
> Dear All,
>
> After an unsuccessful upgrade to Pacific, the MDS daemons were offline and could not get back on. I checked the MDS log and found the entries below. See the cluster info below as well. I'd appreciate it if anyone can point me in the right direction. Thanks.
>
>
> MDS log:
>
> 2023-05-24T06:21:36.831+1000 7efe56e7d700 1 mds.0.cache.den(0x600 1005480d3b2) loaded already corrupt dentry: [dentry #0x100/stray0/1005480d3b2 [19ce,head] rep@0,-2.0 NULL (dversion lock) pv=0 v=2154265030 ino=(nil) state=0 0x556433addb80]
>
> -5> 2023-05-24T06:21:36.831+1000 7efe56e7d700 -1 mds.0.damage notify_dentry Damage to dentries in fragment * of ino 0x600 is fatal because it is a system directory for this rank
>
> -4> 2023-05-24T06:21:36.831+1000 7efe56e7d700 5 mds.beacon.posco set_want_state: up:active -> down:damaged
>
> -3> 2023-05-24T06:21:36.831+1000 7efe56e7d700 5 mds.beacon.posco Sending beacon down:damaged seq 5339
>
> -2> 2023-05-24T06:21:36.831+1000 7efe56e7d700 10 monclient: _send_mon_message to mon.ceph-3 at v2:10.120.0.146:3300/0
>
> -1> 2023-05-24T06:21:37.659+1000 7efe60690700 5 mds.beacon.posco received beacon reply down:damaged seq 5339 rtt 0.827966
>
> 0> 2023-05-24T06:21:37.659+1000 7efe56e7d700 1 mds.posco respawn!
>
>
> Cluster info:
> root@ceph-1:~# ceph -s
> cluster:
> id: e2b93a76-2f97-4b34-8670-727d6ac72a64
> health: HEALTH_ERR
> 1 filesystem is degraded
> 1 filesystem is offline
> 1 mds daemon damaged
>
> services:
> mon: 3 daemons, quorum ceph-1,ceph-2,ceph-3 (age 26h)
> mgr: ceph-3(active, since 15h), standbys: ceph-1, ceph-2
> mds: 0/1 daemons up, 3 standby
> osd: 135 osds: 133 up (since 10h), 133 in (since 2w)
>
> data:
> volumes: 0/1 healthy, 1 recovering; 1 damaged
> pools: 4 pools, 4161 pgs
> objects: 230.30M objects, 276 TiB
> usage: 836 TiB used, 460 TiB / 1.3 PiB avail
> pgs: 4138 active+clean
> 13 active+clean+scrubbing
> 10 active+clean+scrubbing+deep
>
>
>
> root@ceph-1:~# ceph health detail
> HEALTH_ERR 1 filesystem is degraded; 1 filesystem is offline; 1 mds daemon damaged
> [WRN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
> [ERR] MDS_ALL_DOWN: 1 filesystem is offline
> fs cephfs is offline because no MDS is active for it.
> [ERR] MDS_DAMAGE: 1 mds daemon damaged
> fs cephfs mds.0 is damaged
Do you have a complete log you can share? Try:
https://docs.ceph.com/en/quincy/man/8/ceph-post-file/
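(ceph-post-file is run against the file(s) directly; the description text and log path below are just examples:)
ceph-post-file -d 'mds damaged after pacific upgrade' /var/log/ceph/ceph-mds.<name>.log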
To get your upgrade to complete, you may set:
ceph config set mds mds_go_bad_corrupt_dentry false
--
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
I recently upgraded to Quincy, and toggled on the BULK flag of a few pools. As a result, my cluster has been spending the last several days shuffling data while growing the pool pg counts. That in turn has resulted in a steadily increasing number of pgs being flagged PG_NOT_DEEP_SCRUBBED. And that has resulted in my getting hundreds of alert emails about pgs not being deep scrubbed, because I get a new email whenever the count changes.
I tried using "ceph health mute PG_NOT_DEEP_SCRUBBED --sticky", but all that did (in terms of the email alerts) was make the emails say "HEALTH_OK" instead of "HEALTH WARN", which is less than helpful.
I haven't found a way to stop the cluster from sending me these alert emails other than turning off email notifications entirely. If there is one, I'd love to know what it is. If not, I feel like there ought to be one, either as part of muting the health warning, or as a separate toggle. Hundreds of emails over what is expected behavior is rather silly.
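(For what it's worth, a health mute can also be given a TTL so it expires on its own once the backfill settles, e.g. the line below; it doesn't address the email noise itself, though:)
ceph health mute PG_NOT_DEEP_SCRUBBED 7d --sticky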
-----
Edward Huyer
Golisano College of Computing and Information Sciences
Rochester Institute of Technology
Golisano 70-2373
152 Lomb Memorial Drive
Rochester, NY 14623
585-475-6651
erhvks(a)rit.edu<mailto:erhvks@rit.edu>
Obligatory Legalese:
The information transmitted, including attachments, is intended only for the person(s) or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and destroy any copies of this information.
Hi all,
we did a major update from Pacific to Quincy (17.2.5) a month ago
without any problems.
Now we have tried a minor update from 17.2.5 to 17.2.6 (ceph orch upgrade). It gets stuck at the MDS upgrade phase. At this point the cluster tries to scale down the MDSs (ceph fs set max_mds 1). We waited a few hours.
We are running two active mds with 1 standby. No subdir pinning
configured. CephFS data pool: 575 TB
While upgrading, the rank 1 MDS remains in the stopping state. During this state, clients are not able to reconnect. So we paused the upgrade, set max_mds back to 2, and failed rank 1. After that, the standby became active.
In the logs of the MDS (rank 1, in the stopping state) we can see: waiting for strays to migrate
In our second try, we evicted all clients first, without success.
We make daily snapshots on / and rotate them via snapshot scheduler
after one week.
Is there a way to get rid of stray entries without scaling down the MDSs, or do we have to wait longer?
We had about the same amount of strays before we did the major upgrade.
So, it is a bit curious.
Current output from ceph perf dump
Rank0:
"num_strays": 417304,
"num_strays_delayed": 3,
"num_strays_enqueuing": 0,
"strays_created": 567879,
"strays_enqueued": 561803,
"strays_reintegrated": 13751,
"strays_migrated": 4,
Rank1:
ceph daemon mds.fdi-cephfs.ceph-service-13.rwdkqs perf dump | grep stray
"num_strays": 172528,
"num_strays_delayed": 0,
"num_strays_enqueuing": 0,
"strays_created": 418365,
"strays_enqueued": 396142,
"strays_reintegrated": 67406,
"strays_migrated": 4,
Any help would be appreciated.
best regards
Henning