Hi,
FYI - This might be pedantic, but there does not seem to be any difference
between using these two sets of commands:
- ceph osd pause / ceph osd unpause
- ceph osd set pause / ceph osd unset pause
I can see that they both set/unset the pauserd,pausewr flags, but since
they don't report anything else, I assume they do exactly the same thing.
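For anyone who wants to verify this, toggling the flag and checking the
flags line of the osd map shows it (a quick sketch, output abbreviated):
ceph osd set pause
ceph osd dump | grep flags    # should now include pauserd,pausewr
ceph osd unset pause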
I also assumed it only stopped reads/writes to the OSDs, but I found this
openattic post
<https://openattic.org/posts/how-to-do-a-ceph-cluster-maintenanceshutdown/>
which had this comment:
Pausing the cluster means that you can't see when OSDs come back up again
and no map update will happen.
I didn't know that, but it seems pretty useful to know.
Pausing is mentioned in posts about shutting down a Ceph cluster for
maintenance, but it's often listed as optional.
Does anyone know what the original intended purpose of pausing is, and
when/why you would use it?
Also - can I assume that pause will let any in-flight read/write ops on the
OSDs complete before taking effect?
Thanks,
Tom
Hi,
We deployed a Pacific cluster (16.2.12) with cephadm. We see the following
error during rbd map:
[Wed May 3 08:59:11 2023] libceph: mon2 (1)[2a00:da8:ffef:1433::]:6789
session established
[Wed May 3 08:59:11 2023] libceph: another match of type 1 in addrvec
[Wed May 3 08:59:11 2023] libceph: corrupt full osdmap (-22) epoch 200 off
1042 (000000009876284d of 000000000cb24b58-0000000080b70596)
[Wed May 3 08:59:11 2023] osdmap: 00000000: 08 07 7d 10 00 00 09 01 5d 09
00 00 a2 22 3b 86 ..}.....]....";.
[Wed May 3 08:59:11 2023] osdmap: 00000010: e4 f5 11 ed 99 ee 47 75 ca 3c
ad 23 c8 00 00 00 ......Gu.<.#....
[Wed May 3 08:59:11 2023] osdmap: 00000020: 21 68 4a 64 98 d2 5d 2e 84 fd
50 64 d9 3a 48 26 !hJd..]...Pd.:H&
[Wed May 3 08:59:11 2023] osdmap: 00000030: 02 00 00 00 01 00 00 00 00 00
00 00 1d 05 71 01 ..............q.
....
The Linux kernel is 6.1.13, and the important detail is that we are using
IPv6 addresses to connect to the Ceph nodes.
We were able to map the rbd from a client with kernel 5.10, but in the prod
environment we are not allowed to use that kernel.
What could be the reason for this behavior on newer kernels, and how can we
troubleshoot it?
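Would it also make sense to fetch the exact osdmap epoch the kernel rejects
and inspect it offline? Something like this (epoch 200 is taken from the log
above, the output path is arbitrary):
# ceph osd getmap 200 -o /tmp/osdmap.200
# osdmaptool /tmp/osdmap.200 --print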
Here is the output of ceph osd dump:
# ceph osd dump
epoch 200
fsid a2223b86-e4f5-11ed-99ee-4775ca3cad23
created 2023-04-27T12:18:41.777900+0000
modified 2023-05-02T12:09:40.642267+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 34
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client jewel
require_osd_release pacific
stretch_mode_enabled false
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 183
flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application
mgr_devicehealth
pool 2 'idp' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins
pg_num 32 pgp_num 32 autoscale_mode on last_change 48 flags
hashpspool,selfmanaged_snaps stripe_width 0 application rbd
max_osd 3
osd.0 up in weight 1 up_from 176 up_thru 182 down_at 172
last_clean_interval [170,171)
[v2:[2a00:da8:ffef:1431::]:6800/805023868,v1:[2a00:da8:ffef:1431::]:6801/805023868,v2:
0.0.0.0:6802/805023868,v1:0.0.0.0:6803/805023868]
[v2:[2a00:da8:ffef:1431::]:6804/805023868,v1:[2a00:da8:ffef:1431::]:6805/805023868,v2:
0.0.0.0:6806/805023868,v1:0.0.0.0:6807/805023868] exists,up
e8fd0ee2-ea63-4d02-8f36-219d36869078
osd.1 up in weight 1 up_from 136 up_thru 182 down_at 0
last_clean_interval [0,0)
[v2:[2a00:da8:ffef:1432::]:6800/2172723816,v1:[2a00:da8:ffef:1432::]:6801/2172723816,v2:
0.0.0.0:6802/2172723816,v1:0.0.0.0:6803/2172723816]
[v2:[2a00:da8:ffef:1432::]:6804/2172723816,v1:[2a00:da8:ffef:1432::]:6805/2172723816,v2:
0.0.0.0:6806/2172723816,v1:0.0.0.0:6807/2172723816] exists,up
0b7b5628-9273-4757-85fb-9c16e8441895
osd.2 up in weight 1 up_from 182 up_thru 182 down_at 178
last_clean_interval [123,177)
[v2:[2a00:da8:ffef:1433::]:6800/887631330,v1:[2a00:da8:ffef:1433::]:6801/887631330,v2:
0.0.0.0:6802/887631330,v1:0.0.0.0:6803/887631330]
[v2:[2a00:da8:ffef:1433::]:6804/887631330,v1:[2a00:da8:ffef:1433::]:6805/887631330,v2:
0.0.0.0:6806/887631330,v1:0.0.0.0:6807/887631330] exists,up
21f8d0d5-6a3f-4f78-96c8-8ec4e4f78a01
Thank you.
--
Kamil Madac
This problem of file systems being inaccessible after an upgrade to clients
other than client.admin dates back to v14 and carries on through v17. It
also applies to any case where pool names other than the defaults are
specified for new file systems. Solved because Curt remembered a link on
this list. (Thanks Curt!) Here's what the official ceph docs ought to have
provided, for others who hit this. YMMV:
IF
you have ceph file systems which have data and meta data pools that
were specified in the 'ceph fs new' command (meaning not left to the
defaults which create them for you),
OR
you have an existing ceph file system and are upgrading to a new
major version of ceph
THEN
for the documented 'ceph fs authorize...' commands to do as
documented (and avoid strange 'operation not permitted' errors when
doing file I/O, or similar security-related problems, for all users
other than client.admin), you must first run:
ceph osd pool application set <your metadata pool name> cephfs
metadata <your ceph fs filesystem name>
and
ceph osd pool application set <your data pool name> cephfs data
<your ceph fs filesystem name>
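To double-check the result, something along these lines should now
report the cephfs application with your filesystem name attached
(pool name is a placeholder):
ceph osd pool application get <your data pool name>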
Otherwise, when the OSDs get a request to read or write data (not
the directory info, but file data) they won't know which ceph file
system name to look up, never mind the names you may have chosen for
the pools, as the 'defaults' themselves changed across the major
releases, from
data pool=fsname
metadata pool=fsname_metadata
to
data pool=fsname.data and
metadata pool=fsname.meta
as the ceph revisions came and went. Any setup that just used
'client.admin' for all mounts didn't see the problem, as the admin
key gave blanket permission.
A temporary 'fix' is to change mount requests to use 'client.admin'
and its associated key. A less drastic but still half-fix is to change
the osd cap for your user to just 'caps osd = "allow rw"', deleting
the '"tag cephfs data=...."' part.
The only documentation I could find for this upgrade security-related
ceph-ending catastrophe was in the NFS docs, not the cephfs docs:
https://docs.ceph.com/en/latest/cephfs/nfs/
and the genius-level, much appreciated pointer from Curt here:
On 5/2/23 14:21, Curt wrote:
> This thread might be of use, it's an older version of ceph 14, but
> might still apply,
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/23FDDSYBCDV…
> ?
>
> On Tue, May 2, 2023 at 11:06 PM Harry G Coin <hgcoin(a)gmail.com> wrote:
>
> In 17.2.6 is there a security requirement that pool names
> supporting a
> ceph fs filesystem match the filesystem name.data for the data and
> name.meta for the associated metadata pool? (multiple file systems
> are
> enabled)
>
> I have filesystems from older versions with the data pool name
> matching
> the filesystem and appending _metadata for that,
>
> and even older filesystems with the pool name as in 'library' and
> 'library_metadata' supporting a filesystem called 'libraryfs'
>
> The pools all have the cephfs tag.
>
> But using the documented:
>
> ceph fs authorize libraryfs client.basicuser / rw
>
> command allows the root user to mount and browse the library
> directory
> tree, but fails with 'operation not permitted' when even reading
> any file.
>
> However, changing the client.basicuser osd auth to 'allow rw'
> instead of
> 'allow rw tag...' allows normal operations.
>
> So:
>
> [client.basicuser]
> key = <key stuff>==
> caps mds = "allow rw fsname=libraryfs"
> caps mon = "allow r fsname=libraryfs"
> caps osd = "allow rw"
>
> works, but the same with
>
> caps osd = "allow rw tag cephfs data=libraryfs"
>
> leads to the 'operation not permitted' on read, or write or any
> actual
> access.
>
> It remains a puzzle. Help appreciated!
>
> Were there upgrade instructions about that, any help pointing me
> to them?
>
> Thanks
>
> Harry Coin
> Rock Stable Systems
>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>
Hi
Grateful if someone could clarify some things about CephFS Scrubs:
1) Am I right that a command such as "ceph tell mds.cephfs:0 scrub start
/ recursive" only triggers a forward scrub (not a backward scrub)?
2) I couldn't find any reference to forward scrubs being done
automatically and was wondering whether I should do them using cron? But
then I saw an undated (but I think a little elderly) presentation by
Greg Farnum that states that "forward scrub...runs continuously in the
background". Is that still correct (for Quincy), and if so what controls
the frequency?
3) Are backward scrubs always manual, using the 3 cephfs-data-scan phases?
4) Are regular backward scrubs recommended, or only if there is
indication of a problem? (With due regard to the amount of time they may
take...)
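For reference, the commands I have in mind for 1) are along these lines
(filesystem name and rank as in my example above):
ceph tell mds.cephfs:0 scrub start / recursive
ceph tell mds.cephfs:0 scrub status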
Thanks for any advice.
Regards, Chris
The radosgw has been configured like this:
[client.rgw.ceph1]
host = ceph1
rgw_frontends = beast port=8080 ssl_port=443 ssl_certificate=/root/ssl/ca.crt ssl_private_key=/root/ssl/ca.key
#rgw_frontends = beast port=8080 ssl_port=443 ssl_certificate=/root/ssl/ca.crt ssl_private_key=config://rgw/cert/default/ca.key
admin_socket = /var/run/ceph/ceph-client.rgw.ceph1
but I'm getting this error:
failed to add ssl_private_key=/root/ssl/ca.key: No such file or directory
I also tried to import the key into the ceph db and provide the path with config://, but that doesn't work either.
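What I tried was roughly this, to match the commented-out frontends line
above (the exact key path is my best guess):
ceph config-key set rgw/cert/default/ca.key -i /root/ssl/ca.key
and then ssl_private_key=config://rgw/cert/default/ca.key in rgw_frontends.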
Anyone have any idea? Thanks
Hi all,
our monitors have enjoyed democracy since the beginning. However, I don't share their sudden excitement about voting:
2/9/23 4:42:30 PM[INF]overall HEALTH_OK
2/9/23 4:42:30 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:42:26 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-26 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-25 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:40:00 PM[INF]overall HEALTH_OK
2/9/23 4:30:00 PM[INF]overall HEALTH_OK
2/9/23 4:24:34 PM[INF]overall HEALTH_OK
2/9/23 4:24:34 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:24:29 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-03 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-26 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-25 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:24:04 PM[INF]overall HEALTH_OK
2/9/23 4:24:03 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:23:59 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:23:59 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:20:00 PM[INF]overall HEALTH_OK
2/9/23 4:10:00 PM[INF]overall HEALTH_OK
2/9/23 4:00:00 PM[INF]overall HEALTH_OK
2/9/23 3:50:00 PM[INF]overall HEALTH_OK
2/9/23 3:43:13 PM[INF]overall HEALTH_OK
2/9/23 3:43:13 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 3:43:08 PM[INF]mon.ceph-01 calling monitor election
2/9/23 3:43:08 PM[INF]mon.ceph-26 calling monitor election
2/9/23 3:43:08 PM[INF]mon.ceph-25 calling monitor election
We moved a switch from one rack to another, and after the switch came back up, the monitors frequently bitch about who is the alpha. How do I get them to focus more on their daily duties again?
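If it helps the diagnosis, I believe the connectivity scores used by the
election strategy can be dumped with something like this (command from
memory, mon name is just the current leader from the logs):
ceph daemon mon.ceph-01 connection scores dump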
Thanks for any help!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hello, I have a question: what happens when I delete a PG on which I have
set a particular OSD as primary using the pg-upmap-primary command?
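For context, I believe the relevant commands are along these lines (the pg
id is just an example):
ceph osd pg-upmap-primary 2.a <osd-id>
ceph osd rm-pg-upmap-primary 2.a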
--
Nguetchouang Ngongang Kevin
ENS de Lyon
https://perso.ens-lyon.fr/kevin.nguetchouang/
help
------------------ Original Message ------------------
From: "ceph-users" <ceph-users-request(a)ceph.io>
Sent: Thursday, May 4, 2023, 4:40 PM
To: "ceph-users" <ceph-users(a)ceph.io>
Subject: ceph-users Digest, Vol 107, Issue 20
Send ceph-users mailing list submissions to
ceph-users(a)ceph.io
To subscribe or unsubscribe via email, send a message with subject or
body 'help' to
ceph-users-request(a)ceph.io
You can reach the person managing the list at
ceph-users-owner(a)ceph.io
When replying, please edit your Subject line so it is more specific
than "Re: Contents of ceph-users digest..."
Today's Topics:
1. Initialization timeout, failed to initialize (Vitaly Goot)
2. Re: MDS crash on FAILED ceph_assert(cur->is_auth())
(Peter van Heusden)
3. Re: MDS "newly corrupt dentry" after patch version upgrade
(Janek Bevendorff)
4. Best practice for expanding Ceph cluster (huxiaoyu(a)horebdata.cn)
5. Re: 16.2.13 pacific QE validation status (Guillaume Abrioux)
----------------------------------------------------------------------
Date: Thu, 04 May 2023 01:50:12 -0000
From: "Vitaly Goot" <vitaly.goot(a)gmail.com>
Subject: [ceph-users] Initialization timeout, failed to initialize
To: ceph-users(a)ceph.io
Message-ID: <168316501216.1713.9594013921879975501@mailman-web>
Content-Type: text/plain; charset="utf-8"
playing with multi-site zones for the Ceph Object Gateway
ceph version: 17.2.5
my setup: 3-zone multi-site; 3-way full sync mode;
each zone has 3 machines -> RGW+MON+OSD
running a load test: 3000 concurrent uploads of 1M objects
after about 3-4 minutes of load the RGW machines get stuck; in 2 zones out of 3, RGW is not responding (e.g. curl $RGW:80)
an attempt to restart RGW ends up with `Initialization timeout, failed to initialize`
here is a backtrace from gdb showing where it hangs after the restart:
(gdb) inf thr
Id Target Id Frame
* 1 Thread 0x7fa7d3abbcc0 (LWP 30791) "radosgw" futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffc7f7a2438) at ../sysdeps/nptl/futex-internal.h:183
...
(gdb) bt
#0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffc7f7a2438) at ../sysdeps/nptl/futex-internal.h:183
#1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7ffc7f7a2488, cond=0x7ffc7f7a2410) at pthread_cond_wait.c:508
#2 __pthread_cond_wait (cond=cond@entry=0x7ffc7f7a2410, mutex=0x7ffc7f7a2488) at pthread_cond_wait.c:647
#3 0x00007fa7d7097e42 in ceph::condition_variable_debug::wait (this=this@entry=0x7ffc7f7a2410, lock=...) at ../src/common/mutex_debug.h:148
#4 0x00007fa7d7953cba in ceph::condition_variable_debug::wait<librados::IoCtxImpl::operate(const object_t&, ObjectOperation*, ceph::real_time*, int)::<lambda()> > (pred=..., lock=..., this=0x7ffc7f7a2410) at ../src/librados/IoCtxImpl.cc:672
#5 librados::IoCtxImpl::operate (this=this@entry=0x558347c21010, oid=..., o=0x558347e12310, pmtime=<optimized out>, flags=<optimized out>) at ../src/librados/IoCtxImpl.cc:672
#6 0x00007fa7d792bd55 in librados::v14_2_0::IoCtx::operate (this=this@entry=0x558347e44760, oid="notify.0", o=o@entry=0x7ffc7f7a2690, flags=flags@entry=0) at ../src/librados/librados_cxx.cc:1536
#7 0x00007fa7d9490ad1 in rgw_rados_operate (dpp=<optimized out>, ioctx=..., oid="notify.0", op=op@entry=0x7ffc7f7a2690, y=..., flags=0) at ../src/rgw/rgw_tools.cc:277
#8 0x00007fa7d9627e0f in RGWSI_RADOS::Obj::operate (this=this@entry=0x558347e44710, dpp=<optimized out>, op=op@entry=0x7ffc7f7a2690, y=..., flags=flags@entry=0) at ../src/rgw/services/svc_rados.h:112
#9 0x00007fa7d96209a5 in RGWSI_Notify::init_watch (this=this@entry=0x558347c49530, dpp=<optimized out>, y=...) at ../src/rgw/services/svc_notify.cc:214
#10 0x00007fa7d962161b in RGWSI_Notify::do_start (this=0x558347c49530, y=..., dpp=<optimized out>) at ../src/rgw/services/svc_notify.cc:277
#11 0x00007fa7d8f17bcf in RGWServiceInstance::start (this=0x558347c49530, y=..., dpp=<optimized out>) at ../src/rgw/rgw_service.cc:331
#12 0x00007fa7d8f1a260 in RGWServices_Def::init (this=this@entry=0x558347de90a0, cct=<optimized out>, have_cache=<optimized out>, raw=raw@entry=false, run_sync=<optimized out>, y=..., dpp=<optimized out>) at /usr/include/c++/9/bits/unique_ptr.h:360
#13 0x00007fa7d8f1cc40 in RGWServices::do_init (this=this@entry=0x558347de90a0, _cct=<optimized out>, have_cache=<optimized out>, raw=raw@entry=false, run_sync=<optimized out>, y=..., dpp=<optimized out>) at ../src/rgw/rgw_service.cc:284
#14 0x00007fa7d92a7b1f in RGWServices::init (dpp=<optimized out>, y=..., run_sync=<optimized out>, have_cache=<optimized out>, cct=<optimized out>, this=0x558347de90a0) at ../src/rgw/rgw_service.h:153
#15 RGWRados::init_svc (this=this@entry=0x558347de8dc0, raw=raw@entry=false, dpp=<optimized out>) at ../src/rgw/rgw_rados.cc:1380
#16 0x00007fa7d930f241 in RGWRados::initialize (this=0x558347de8dc0, dpp=<optimized out>) at ../src/rgw/rgw_rados.cc:1400
#17 0x00007fa7d944f85f in RGWRados::initialize (dpp=<optimized out>, _cct=0x558347c6a320, this=<optimized out>) at ../src/rgw/rgw_rados.h:586
#18 StoreManager::init_storage_provider (dpp=<optimized out>, dpp@entry=0x7ffc7f7a2e90, cct=cct@entry=0x558347c6a320, svc="rados", use_gc_thread=use_gc_thread@entry=true, use_lc_thread=use_lc_thread@entry=true, quota_threads=quota_threads@entry=true, run_sync_thread=true, run_reshard_thread=true, use_cache=true,
use_gc=true) at ../src/rgw/rgw_sal.cc:55
#19 0x00007fa7d8e7367a in StoreManager::get_storage (use_gc=true, use_cache=true, run_reshard_thread=true, run_sync_thread=true, quota_threads=true, use_lc_thread=true, use_gc_thread=true, svc="rados", cct=0x558347c6a320, dpp=0x7ffc7f7a2e90) at /usr/include/c++/9/bits/basic_string.h:267
#20 radosgw_Main (argc=<optimized out>, argv=<optimized out>) at ../src/rgw/rgw_main.cc:372
#21 0x0000558347883f56 in main (argc=<optimized out>, argv=<optimized out>) at ../src/rgw/radosgw.cc:12
Any suggestions on what the problem could be and how to reset RGW so that it will be able to start normally?
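From the backtrace, the restart seems to block in RGWSI_Notify::init_watch
on the "notify.0" object, which should live in the zone's control pool, so
one thing I could check is whether plain rados operations on that object
still complete (pool name below is a guess based on default naming):
rados -p <zone>.rgw.control ls
rados -p <zone>.rgw.control listwatchers notify.0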
------------------------------
Date: Thu, 4 May 2023 09:13:56 +0200
From: Peter van Heusden <pvh(a)sanbi.ac.za>
Subject: [ceph-users] Re: MDS crash on FAILED
ceph_assert(cur->is_auth())
Cc: ceph-users(a)ceph.io
Message-ID:
<CAK1reXhEDjfKuLmuyus0RT09mwecRmP=LGLcoSKWeZ+pu+YXJQ(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
Hi Emmaneul
It was a while ago, but as I recall I evicted all clients and that allowed
me to restart the MDS servers. There was something clearly "broken" in how
at least one of the clients was interacting with the system.
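From memory the eviction was along these lines (session ids will of course
differ, so list them first):
ceph tell mds.<name> client ls
ceph tell mds.<name> client evict id=<session-id>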
Peter
On Thu, 4 May 2023 at 07:18, Emmanuel Jaep <emmanuel.jaep(a)gmail.com> wrote:
> Hi,
>
> did you finally figure out what happened?
> I do have the same behavior and we can't get the mds to start again...
>
> Thanks,
>
> Emmanuel
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>
------------------------------
Date: Thu, 4 May 2023 09:15:38 +0200
From: Janek Bevendorff <janek.bevendorff(a)uni-weimar.de>
Subject: [ceph-users] Re: MDS "newly corrupt dentry" after patch
version upgrade
To: Patrick Donnelly <pdonnell(a)redhat.com>
Cc: ceph-users <ceph-users(a)ceph.io>
Message-ID: <591f410c-aabc-72af-36d0-478ce8d09028(a)uni-weimar.de>
Content-Type: text/plain; charset=UTF-8; format=flowed
After running the tool for 11 hours straight, it exited with the
following exception:
Traceback (most recent call last):
File "/home/webis/first-damage.py", line 156, in <module>
traverse(f, ioctx)
File "/home/webis/first-damage.py", line 84, in traverse
for (dnk, val) in it:
File "rados.pyx", line 1389, in rados.OmapIterator.__next__
File "rados.pyx", line 318, in rados.decode_cstr
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 8:
invalid start byte
Does that mean that the last inode listed in the output file is corrupt?
Any way I can fix it?
The output file has 14 million lines. We have about 24.5 million objects
in the metadata pool.
Janek
On 03/05/2023 14:20, Patrick Donnelly wrote:
> On Wed, May 3, 2023 at 4:33 AM Janek Bevendorff
> <janek.bevendorff(a)uni-weimar.de> wrote:
>> Hi Patrick,
>>
>>> I'll try that tomorrow and let you know, thanks!
>> I was unable to reproduce the crash today. Even with
>> mds_abort_on_newly_corrupt_dentry set to true, all MDS booted up
>> correctly (though they took forever to rejoin with logs set to 20).
>>
>> To me it looks like the issue has resolved itself overnight. I had run a
>> recursive scrub on the file system and another snapshot was taken, in
>> case any of those might have had an effect on this. It could also be the
>> case that the (supposedly) corrupt journal entry has simply been
>> committed now and hence doesn't trigger the assertion any more. Is there
>> any way I can verify this?
> You can run:
>
> https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py
>
> Just do:
>
> python3 first-damage.py --memo run.1 <meta pool>
>
> No need to do any of the other steps if you just want a read-only check.
>
--
Bauhaus-Universität Weimar
Bauhausstr. 9a, R308
99423 Weimar, Germany
Phone: +49 3643 58 3577
www.webis.de
------------------------------
Date: Thu, 4 May 2023 10:38:30 +0200
From: "huxiaoyu(a)horebdata.cn" <huxiaoyu(a)horebdata.cn>
Subject: [ceph-users] Best practice for expanding Ceph cluster
To: ceph-users <ceph-users(a)ceph.io>
Message-ID: <985B22F000D9DE88+2023050410382589829410(a)horebdata.cn>
Content-Type: text/plain; charset="us-ascii"
Dear Ceph folks,
I am writing to ask for advice on best practice for expanding a Ceph cluster. We are running an 8-node Ceph cluster with RGW, and would like to add another 10 nodes, each of which has 10x 12TB HDDs. The current 8 nodes hold ca. 400TB of user data.
I am wondering whether to add the 10 nodes in one shot and let the cluster rebalance, or to divide the expansion into 5 steps, each adding 2 nodes and rebalancing step by step. I do not know what the advantages or disadvantages would be of the one-shot scheme vs. 5 batches of 2 nodes added step by step.
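To make the step-by-step variant concrete, one approach I have seen
described is to bring the new OSDs in with zero initial crush weight and
raise the weight gradually (option and command names as I recall them, so
please double-check):
ceph config set osd osd_crush_initial_weight 0
ceph osd crush reweight osd.<id> <target-weight>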
Any suggestions, experience sharing or advice are highly appreciated.
thanks a lot in advance,
Samuel
huxiaoyu(a)horebdata.cn
------------------------------
Date: Thu, 4 May 2023 10:40:02 +0200
From: Guillaume Abrioux <gabrioux(a)redhat.com>
Subject: [ceph-users] Re: 16.2.13 pacific QE validation status
To: Laura Flores <lflores(a)redhat.com>
Cc: Yuri Weinstein <yweinste(a)redhat.com>, Radoslaw Zarzynski
<rzarzyns(a)redhat.com>, dev <dev(a)ceph.io>, ceph-users
<ceph-users(a)ceph.io>
Message-ID:
<CANqTTH5ba9qf3xStCcCZr24n5GPyq0Eeimw3Seha1MZ6wna5nA(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
ceph-volume approved https://jenkins.ceph.com/job/ceph-volume-test/553/
On Wed, 3 May 2023 at 22:43, Guillaume Abrioux <gabrioux(a)redhat.com> wrote:
> The failure seen in ceph-volume tests isn't related.
> That being said, it needs to be fixed to have a better view of the current
> status.
>
> On Wed, 3 May 2023 at 21:00, Laura Flores <lflores(a)redhat.com> wrote:
>
>> upgrade/octopus-x (pacific) is approved. Went over failures with Adam
>> King and it was decided they are not release blockers.
>>
>> On Wed, May 3, 2023 at 1:53 PM Yuri Weinstein <yweinste(a)redhat.com>
>> wrote:
>>
>>> upgrade/octopus-x (pacific) - Laura
>>> ceph-volume - Guillaume
>>>
>>> + 2 PRs are the remaining issues
>>>
>>> Josh FYI
>>>
>>> On Wed, May 3, 2023 at 11:50 AM Radoslaw Zarzynski <rzarzyns(a)redhat.com>
>>> wrote:
>>> >
>>> > rados approved.
>>> >
>>> > Big thanks to Laura for helping with this!
>>> >
>>> > On Thu, Apr 27, 2023 at 11:21 PM Yuri Weinstein <yweinste(a)redhat.com>
>>> wrote:
>>> > >
>>> > > Details of this release are summarized here:
>>> > >
>>> > > https://tracker.ceph.com/issues/59542#note-1
>>> > > Release Notes - TBD
>>> > >
>>> > > Seeking approvals for:
>>> > >
>>> > > smoke - Radek, Laura
>>> > > rados - Radek, Laura
>>> > > rook - Sébastien Han
>>> > > cephadm - Adam K
>>> > > dashboard - Ernesto
>>> > >
>>> > > rgw - Casey
>>> > > rbd - Ilya
>>> > > krbd - Ilya
>>> > > fs - Venky, Patrick
>>> > > upgrade/octopus-x (pacific) - Laura (look the same as in 16.2.8)
>>> > > upgrade/pacific-p2p - Laura
>>> > > powercycle - Brad (SELinux denials)
>>> > > ceph-volume - Guillaume, Adam K
>>> > >
>>> > > Thx
>>> > > YuriW
>>> > > _______________________________________________
>>> > > Dev mailing list -- dev(a)ceph.io
>>> > > To unsubscribe send an email to dev-leave(a)ceph.io
>>> >
>>> _______________________________________________
>>> Dev mailing list -- dev(a)ceph.io
>>> To unsubscribe send an email to dev-leave(a)ceph.io
>>>
>>
>>
>> --
>>
>> Laura Flores
>>
>> She/Her/Hers
>>
>> Software Engineer, Ceph Storage <https://ceph.io>
>>
>> Chicago, IL
>>
>> lflores(a)ibm.com | lflores(a)redhat.com <lflores(a)redhat.com>
>> M: +17087388804
>>
>>
>> _______________________________________________
>> Dev mailing list -- dev(a)ceph.io
>> To unsubscribe send an email to dev-leave(a)ceph.io
>>
>
>
> --
>
> *Guillaume Abrioux*, Senior Software Engineer
>
--
*Guillaume Abrioux*, Senior Software Engineer
------------------------------
Subject: Digest Footer
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
------------------------------
End of ceph-users Digest, Vol 107, Issue 20
*******************************************
I am running Ceph 15.2.13 on CentOS 7.9.2009 and recently my MDS servers
have started failing with the error message
In function 'void Server::handle_client_open(MDRequestRef&)' thread
7f0ca9908700 time 2021-06-28T09:21:11.484768+0200
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/15.2.13/rpm/el7/BUILD/ceph-15.2.13/src/mds/Server.cc:
4149: FAILED ceph_assert(cur->is_auth())
Complete log is:
https://gist.github.com/pvanheus/4da555a6de6b5fa5e46cbf74f5500fbd
ceph status output is:
# ceph status
cluster:
id: ed7b2c16-b053-45e2-a1fe-bf3474f90508
health: HEALTH_WARN
30 OSD(s) experiencing BlueFS spillover
insufficient standby MDS daemons available
1 MDSs report slow requests
2 mgr modules have failed dependencies
4347046/326505282 objects misplaced (1.331%)
6 nearfull osd(s)
23 pgs not deep-scrubbed in time
23 pgs not scrubbed in time
8 pool(s) nearfull
services:
mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 22m)
mgr: ceph-mon1(active, since 11w), standbys: ceph-mon2, ceph-mon3
mds: SANBI_FS:2 {0=ceph-mon1=up:active(laggy or
crashed),1=ceph-mon2=up:stopping}
osd: 54 osds: 54 up (since 2w), 54 in (since 11w); 50 remapped pgs
data:
pools: 8 pools, 833 pgs
objects: 42.37M objects, 89 TiB
usage: 159 TiB used, 105 TiB / 264 TiB avail
pgs: 4347046/326505282 objects misplaced (1.331%)
782 active+clean
49 active+clean+remapped
1 active+clean+scrubbing+deep
1 active+clean+remapped+scrubbing
io:
client: 29 KiB/s rd, 427 KiB/s wr, 37 op/s rd, 48 op/s wr
When restarting an MDS it goes through the states replay, reconnect and
resolve, and finally sets itself to active before this crash happens.
Any advice on what to do?
Thanks,
Peter
P.S. apologies if you received this email more than once - I have had some
trouble figuring out the correct mailing list to use.