September 2019 - ceph-users

by Robert LeBlanc

The question was posed, "What if we want to backup our RGW data to tape?" Anyone doing this? Any suggestions? We could probably just catch any PUT requests and queue them to be written to tape. Our dataset is so large, that traditional backup solutions don't seem feasible (GFS), so probably a single copy (or two copies on different tapes at the same time) when the object is created. Bonus points for being near-line. Thanks, Robert LeBlanc ---------------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

4 years, 7 months

3
3
0 0

HEALTH_WARN due to large omap object wont clear even after trim

by shubjero

Hey all, Yesterday our cluster went in to HEALTH_WARN due to 1 large omap object in the .usage pool (I've posted about this in the past). Last time we resolved the issue by trimming the usage log below the alert threshold but this time it seems like the alert wont clear even after trimming and (this time) disabling the usage log entirely. ceph health detail HEALTH_WARN 1 large omap objects LARGE_OMAP_OBJECTS 1 large omap objects 1 large objects found in pool '.usage' Search the cluster log for 'Large omap object found' for more details. I've bounced ceph-mon, ceph-mgr, radosgw and even issued osd scrub on the two osd's that hold pg's for the .usage pool but the alert wont clear. It's been over 24 hours since I trimmed the usage log. Any suggestions? Jared Baker Cloud Architect, OICR

4 years, 7 months

3
5
0 0

How to set timeout on Rados gateway request

by Hanyu Liu

Hi, We are looking for a way to set timeout on requests to rados gateway. If a request takes too long time, just kill it. 1. Is there a command that can set the timeout? 2. This parameter looks interesting. Can I know what the "open threads" means? rgw op thread timeout Description: The timeout in seconds for open threads. Type: Integer Default: 600 *(from https://docs.ceph.com/docs/nautilus/radosgw/config-ref/ <https://docs.ceph.com/docs/nautilus/radosgw/config-ref/>)* Thanks, Hanyu

4 years, 7 months

2
1
0 0

handle_connect_reply_2 connect got BADAUTHORIZER when running ceph pg <id> query

by Thomas

Hi, ceph health status reports unknown objects. All objects reside on same osd.9 When I execute ceph pg <id> query I get this (endless) output: 2019-09-20 14:47:35.922 7f937144f700 0 --1- 10.97.206.91:0/2060489821 >> v1:10.97.206.93:7054/15812 conn(0x7f935407c120 0x7f935407b120 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER ^CTraceback (most recent call last): File "/usr/bin/ceph", line 1263, in <module> retval = main() File "/usr/bin/ceph", line 1179, in main prefix='get_command_descriptions') File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1459, in json_command inbuf, timeout, verbose) File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1329, in send_command_retry return send_command(*args, **kwargs) File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1381, in send_command cluster.pg_command, pgid, cmd, inbuf, timeout=timeout) File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1311, in run_in_thread t.join(timeout=timeout) File "/usr/lib/python2.7/threading.py", line 951, in join self.__block.wait(delay) File "/usr/lib/python2.7/threading.py", line 359, in wait _sleep(delay) KeyboardInterrupt 2019-09-20 14:47:35.950 7f937144f700 0 --1- 10.97.206.91:0/2060489821 >> v1:10.97.206.93:7054/15812 conn(0x7f935407f4b0 0x7f935407b920 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER 2019-09-20 14:47:35.950 7f937144f700 0 --1- 10.97.206.91:0/2060489821 >> v1:10.97.206.93:7054/15812 conn(0x7f935407c120 0x7f935407b120 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER 2019-09-20 14:47:35.950 7f937144f700 0 --1- 10.97.206.91:0/2060489821 >> v1:10.97.206.93:7054/15812 conn(0x7f935407f4b0 0x7f935407b920 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER How can I fix this issue with pg / osd.9? THX

4 years, 7 months

1
0
0 0

Re: ceph fs crashes on simple fio test

by Robert LeBlanc

On Tue, Sep 10, 2019 at 1:11 PM Frank Schilder <frans(a)dtu.dk> wrote: > Hi Robert, > > I have meta data on SSD (3xrep) and data on 8+2 EC on spinning disks, so > the speed difference is orders of magnitudes. Our usage is quite meta data > heavy, so this suits us well. In particular since EC pools are high > throughput with large IO sizes. > > As long as one uses fio with direct=1 (probably also if using sync=1 > and/or fsync=1), everything is fine and behaves as you describe. IOPs > fluctuate but adjust to media speed. No problems at all. > > As mentioned in my last update (I cut it out below), the destructive fio > command runs with direct=0 and neither sync=1 nor fsync=1. This test just > writes as fast as it can (to buffers) without waiting for acks. I would > expect that a ceph client would translate that to synced or direct IO, > which would be fine. > > But it doesn't. Instead, it pushes the IO also as fast as possible to the > cluster. I have seen 40kops write on the EC pool (on 100+ HDDs) that can > handle maybe 1kops write in total. The queues were constantly increasing at > an incredible rate (several hundred ops per second). I hope with the change > of cut_off=high that heartbeats will not get lost any more, but this will > still destabilize our ceph cluster quite dramatically. > Changing the cut_off to high will not allow heartbeats to not get lost (heartbeats have a priority far above the high mark). What cut_off = high does is put replication ops into the main queue instead of the strict priority queue. That way an OSD doesn't get DDOSed from it's peers and is never able to service it's own clients. When I did my fio testing, was on FireFly/Hammer and on RBD, so I can't talk specifically to newer versions and CephFS. We haven't had time to set up our test cluster, so I can't run benches at the moment. > My problem is not so much that such an IO pattern could occur in > reasonable software, but > - that someone might try just for fun, and that > - the number of 500+ clients might occasionally produce such a workload by > aggregation. > > I find it somewhat alarming that a storage system that promises data > integrity and reliability can be taken down with a publicly available > benchmark tool in a matter of a few dozen seconds by ordinary users. > Potentially with damaging effects. I guess something similar could be > achieved with a modified rogue client. > > I would expect that a storage cluster should have basic self-defence > mechanisms that prevent this kind of overload or DOS attack by throttling > clients with crazy IO requests. Are there any settings that can be enabled > to prevent this from happening? > ---------------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

4 years, 7 months

2
1
0 0

Re: ceph mdss keep on crashing after update to 14.2.3

by Sasha Litvak

Any chance for a fix soon? In 14.2.5 ? On Thu, Sep 19, 2019 at 8:44 PM Yan, Zheng <ukernel(a)gmail.com> wrote: > On Thu, Sep 19, 2019 at 11:37 PM Dan van der Ster <dan(a)vanderster.com> > wrote: > > > > You were running v14.2.2 before? > > > > It seems that that ceph_assert you're hitting was indeed added > > between v14.2.2. and v14.2.3 in this commit > > > https://github.com/ceph/ceph/commit/12f8b813b0118b13e0cdac15b19ba8a7e127730b > > > > There's a comment in the tracker for that commit which says the > > original fix was incomplete > > (https://tracker.ceph.com/issues/39987#note-5) > > > > So perhaps nautilus needs > > > https://github.com/ceph/ceph/pull/28459/commits/0a1e92abf1cfc8bddf526cbf5bc… > > ?? > > > > You are right. Sorry for the bug. For now, please got back to 14.2.2 > (just mds) or complie ceph-mds from source > > Yan, Zheng > > > Did you already try going back to v14.2.2 (on the MDS's only) ?? > > > > -- dan > > > > On Thu, Sep 19, 2019 at 4:59 PM Kenneth Waegeman > > <kenneth.waegeman(a)ugent.be> wrote: > > > > > > Hi all, > > > > > > I updated our ceph cluster to 14.2.3 yesterday, and today the mds are > crashing one after another. I'm using two active mds. > > > > > > I've made a tracker ticket, but I was wondering if someone else also > has seen this issue yet? > > > > > > -27> 2019-09-19 15:42:00.196 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8887 lookup > #0x100166004d4/WindowsPhone-MSVC-CXX.cmake 2019-09-19 15:42:00.203132 > caller_uid=0, caller_gid=0{0,}) v4 > > > -26> 2019-09-19 15:42:00.196 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865372:5815 lookup > #0x20005a6eb3a/selectable.cpython-37.pyc 2019-09-19 15:42:00.204970 > caller_uid=0, caller_gid=0{0,}) v4 > > > -25> 2019-09-19 15:42:00.196 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8888 lookup > #0x100166004d4/WindowsPhone.cmake 2019-09-19 15:42:00.206381 caller_uid=0, > caller_gid=0{0,}) v4 > > > -24> 2019-09-19 15:42:00.206 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8889 lookup > #0x100166004d4/WindowsStore-MSVC-C.cmake 2019-09-19 15:42:00.209703 > caller_uid=0, caller_gid=0{0,}) v4 > > > -23> 2019-09-19 15:42:00.206 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8890 lookup > #0x100166004d4/WindowsStore-MSVC-CXX.cmake 2019-09-19 15:42:00.213200 > caller_uid=0, caller_gid=0{0,}) v4 > > > -22> 2019-09-19 15:42:00.216 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8891 lookup > #0x100166004d4/WindowsStore.cmake 2019-09-19 15:42:00.216577 caller_uid=0, > caller_gid=0{0,}) v4 > > > -21> 2019-09-19 15:42:00.216 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8892 lookup > #0x100166004d4/Xenix.cmake 2019-09-19 15:42:00.220230 caller_uid=0, > caller_gid=0{0,}) v4 > > > -20> 2019-09-19 15:42:00.216 7f0369aeb700 2 mds.1.cache Memory > usage: total 4603496, rss 4167920, heap 323836, baseline 323836, 501 / > 1162471 inodes have caps, 506 caps, 0.00043528 caps per inode > > > -19> 2019-09-19 15:42:00.216 7f03652e2700 5 mds.1.log > _submit_thread 30520209420029~9062 : EUpdate scatter_writebehind [metablob > 0x1000bd8ac7b, 2 dirs] > > > -18> 2019-09-19 15:42:00.216 7f03652e2700 5 mds.1.log > _submit_thread 30520209429111~10579 : EUpdate scatter_writebehind [metablob > 0x1000bf26309, 9 dirs] > > > -17> 2019-09-19 15:42:00.216 7f03652e2700 5 mds.1.log > _submit_thread 30520209439710~2305 : EUpdate scatter_writebehind [metablob > 0x1000bf2745b.001*, 2 dirs] > > > -16> 2019-09-19 15:42:00.216 7f03652e2700 5 mds.1.log > _submit_thread 30520209442035~1845 : EUpdate scatter_writebehind [metablob > 0x1000c233753, 2 dirs] > > > -15> 2019-09-19 15:42:00.216 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8893 lookup > #0x100166004d4/eCos.cmake 2019-09-19 15:42:00.223360 caller_uid=0, > caller_gid=0{0,}) v4 > > > -14> 2019-09-19 15:42:00.216 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865319:2381 lookup > #0x1001172f39d/microsoft-cp1251 2019-09-19 15:42:00.224940 caller_uid=0, > caller_gid=0{0,}) v4 > > > -13> 2019-09-19 15:42:00.226 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8894 lookup > #0x100166004d4/gas.cmake 2019-09-19 15:42:00.226624 caller_uid=0, > caller_gid=0{0,}) v4 > > > -12> 2019-09-19 15:42:00.226 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865319:2382 readdir > #0x1001172f3d7 2019-09-19 15:42:00.228673 caller_uid=0, caller_gid=0{0,}) v4 > > > -11> 2019-09-19 15:42:00.226 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8895 lookup > #0x100166004d4/kFreeBSD.cmake 2019-09-19 15:42:00.229668 caller_uid=0, > caller_gid=0{0,}) v4 > > > -10> 2019-09-19 15:42:00.226 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8896 lookup > #0x100166004d4/syllable.cmake 2019-09-19 15:42:00.232746 caller_uid=0, > caller_gid=0{0,}) v4 > > > -9> 2019-09-19 15:42:00.236 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865333:8897 readdir > #0x10016601379 2019-09-19 15:42:00.240672 caller_uid=0, caller_gid=0{0,}) v4 > > > -8> 2019-09-19 15:42:00.236 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865356:3604574 readdir > #0x2000090d630 2019-09-19 15:42:00.241832 caller_uid=0, caller_gid=0{0,}) v4 > > > -7> 2019-09-19 15:42:00.266 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865356:3604575 readdir > #0x2000090d631 2019-09-19 15:42:00.272158 caller_uid=0, caller_gid=0{0,}) v4 > > > -6> 2019-09-19 15:42:00.326 7f03652e2700 5 mds.1.log > _submit_thread 30520209443900~3089 : EUpdate scatter_writebehind [metablob > 0x20005af5c63, 3 dirs] > > > -5> 2019-09-19 15:42:00.326 7f03652e2700 5 mds.1.log > _submit_thread 30520209447009~10579 : EUpdate scatter_writebehind [metablob > 0x1000bf26309, 9 dirs] > > > -4> 2019-09-19 15:42:00.326 7f03652e2700 5 mds.1.log > _submit_thread 30520209457608~2305 : EUpdate scatter_writebehind [metablob > 0x1000bf2745b.001*, 2 dirs] > > > -3> 2019-09-19 15:42:00.326 7f03652e2700 5 mds.1.log > _submit_thread 30520209459933~1030 : EUpdate check_inode_max_size [metablob > 0x20005af5c74, 1 dirs] > > > -2> 2019-09-19 15:42:00.326 7f036c2f0700 4 mds.1.server > handle_client_request client_request(client.37865372:5816 setattr size=1138 > #0x20005a6eb67 2019-09-19 15:42:00.333015 caller_uid=0, caller_gid=0{0,}) v4 > > > -1> 2019-09-19 15:42:00.336 7f036c2f0700 -1 > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.3/rpm/el7/BUILD/ceph-14.2.3/src/mds/MDCache.cc: > In function 'CInode* MDCache::cow_inode(CInode*, snapid_t)' thread > 7f036c2f0700 time 2019-09-19 15:42:00.333567 > > > > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.3/rpm/el7/BUILD/ceph-14.2.3/src/mds/MDCache.cc: > 1498: FAILED ceph_assert(!lock->get_num_wrlocks()) > > > > > > ceph version 14.2.3 (0f776cf838a1ae3130b2b73dc26be9c95c6ccc39) > nautilus (stable) > > > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x14a) [0x7f0375773ac2] > > > 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char > const*, char const*, ...)+0) [0x7f0375773c90] > > > 3: (()+0x1ee48d) [0x55a7e4ccf48d] > > > 4: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, > snapid_t, CInode**, CDentry::linkage_t*)+0x823) [0x55a7e4ccfcb3] > > > 5: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, > snapid_t)+0xbc) [0x55a7e4cd042c] > > > 6: (Locker::_do_cap_update(CInode*, Capability*, int, snapid_t, > boost::intrusive_ptr<MClientCaps const> const&, > boost::intrusive_ptr<MClientCaps> const&, bool*)+0xfb6) [0x55a7e4d957e6] > > > 7: (Locker::handle_client_caps(boost::intrusive_ptr<MClientCaps > const> const&)+0x2059) [0x55a7e4d9c8e9] > > > 8: (Locker::dispatch(boost::intrusive_ptr<Message const> > const&)+0xe7) [0x55a7e4daaf97] > > > 9: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message > const> const&)+0x304) [0x55a7e4c089e4] > > > 10: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, > bool)+0x6eb) [0x55a7e4c0b1bb] > > > 11: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message > const> const&)+0x40) [0x55a7e4c0b8d0] > > > 12: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> > const&)+0xfc) [0x55a7e4bf82ac] > > > 13: (DispatchQueue::entry()+0x12a9) [0x7f0375968dd9] > > > 14: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f0375a183dd] > > > 15: (()+0x7dd5) [0x7f0373642dd5] > > > 16: (clone()+0x6d) [0x7f03722f302d] > > > > > > 0> 2019-09-19 15:42:00.336 7f036c2f0700 -1 *** Caught signal > (Aborted) ** > > > in thread 7f036c2f0700 thread_name:ms_dispatch > > > > > > ceph version 14.2.3 (0f776cf838a1ae3130b2b73dc26be9c95c6ccc39) > nautilus (stable) > > > 1: (()+0xf5d0) [0x7f037364a5d0] > > > 2: (gsignal()+0x37) [0x7f037222b2c7] > > > 3: (abort()+0x148) [0x7f037222c9b8] > > > 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x199) [0x7f0375773b11] > > > 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char > const*, char const*, ...)+0) [0x7f0375773c90] > > > 6: (()+0x1ee48d) [0x55a7e4ccf48d] > > > 7: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, > snapid_t, CInode**, CDentry::linkage_t*)+0x823) [0x55a7e4ccfcb3] > > > 8: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, > snapid_t)+0xbc) [0x55a7e4cd042c] > > > 9: (Locker::_do_cap_update(CInode*, Capability*, int, snapid_t, > boost::intrusive_ptr<MClientCaps const> const&, > boost::intrusive_ptr<MClientCaps> const&, bool*)+0xfb6) [0x55a7e4d957e6] > > > 10: (Locker::handle_client_caps(boost::intrusive_ptr<MClientCaps > const> const&)+0x2059) [0x55a7e4d9c8e9] > > > 11: (Locker::dispatch(boost::intrusive_ptr<Message const> > const&)+0xe7) [0x55a7e4daaf97] > > > 12: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message > const> const&)+0x304) [0x55a7e4c089e4] > > > 13: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, > bool)+0x6eb) [0x55a7e4c0b1bb] > > > 14: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message > const> const&)+0x40) [0x55a7e4c0b8d0] > > > 15: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> > const&)+0xfc) [0x55a7e4bf82ac] > > > 16: (DispatchQueue::entry()+0x12a9) [0x7f0375968dd9] > > > 17: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f0375a183dd] > > > 18: (()+0x7dd5) [0x7f0373642dd5] > > > 19: (clone()+0x6d) [0x7f03722f302d] > > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is > needed to interpret this. > > > > > > > > > Thanks! > > > > > > K > > > > > > _______________________________________________ > > > ceph-users mailing list > > > ceph-users(a)lists.ceph.com > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > _______________________________________________ > ceph-users mailing list > ceph-users(a)lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >

4 years, 7 months

1
0
0 0

RGW: realm reloader and slow requests

by Eric Choi

Hello, We recently upgraded from Luminous to Nautilus, after the upgrade, we are seeing this sporadic "lock-up" behavior on the RGW side. What I noticed from the log is that it seems to coincide with rgw realm reloader. What we are seeing is that realm reloader tries to pause frontends, and the last request will take 1 minute or 2 at worst cases, and for that time period RGW is completely locked up, unable to take new requests. 1. Is this an expected behavior? 2. is there a way to disable "rgw realm reloader" or reduce frequency? We are not using multi-site feature and we don't change our realm at all. Thanks!

4 years, 7 months

1
0
1 0

unsubscribe

by Oliver Liebel

4 years, 7 months

1
0
0 0

Stability of cephfs snapshot in nautilus

by pinepinebrook

Is snapshot of cephfs in version nautilus is production ready? When I take a snapshot, what will happen if somebody change the content in that directory before the snapshot finish. Will there be any conflict or something to notice about when I take a snapshot on a busy directory (probably 20 clients r/w in in that directory)

4 years, 7 months

1
0
0 0

Ceph deployment tool suggestions

by Shain Miley

Hello, We currently have an existing 20 node Ceph cluster (17 osd nodes with 3 mon nodes). When this was originally configured, much of the OS install was done manually and the cluster was mainly deployed using ceph-deploy. We are going to be replacing 9 of the 1st gen nodes with 6 newer more dense nodes in the near future. This time around I would like to see about automating the process as much as possible (both OS and Ceph installs). I was wondering if anyone had any suggestions about the best tool to use for this as we are not setting up a cluster from scratch and whatever tool we decide to use, would have to work within the context of an already in production cluster. Some of these tools seem best suited to work when you are setting up a cluster for the very first time, however we are not in that position at this point and I want to make sure I am using something that is flexible enough for our environment going forward. Thanks in advance, Shain -- NPR | Shain Miley | Manager of Infrastructure, Digital Media | smiley(a)npr.org | 202.513.3649

4 years, 7 months

4
3
0 0

2024

2023

2022

2021

2020

2019

ceph-users September 2019