August 2019 - ceph-users

How to tune the ceph balancer in nautilus

by Manuel Lausch

Hi, I playing around with the ceph balancer in luminous and nautilus. While tuning some balancer settings I experienced some problems with nautilus. In Luminous I cold configure the max_misplaced value like this: ceph config-key set mgr/balancer/max_misplaced 0.002 With the same command in nautilus I get this Warning: WARNING: it looks like you might be trying to set a ceph-mgr module configuration key. Since Ceph 13.0.0 (Mimic), mgr module configuration is done with `config set`, and new values set using `config-key set` will be ignored. After investigating the nautilus documentation (https://docs.ceph.com/docs/nautilus/rados/operations/balancer/#throttling) I tried this: ceph config set mgr mgr/balancer/max_misplaced 0.002 and get this error: Error EINVAL: unrecognized config option 'mgr/balancer/max_misplaced' My question is, how can I configure this parameters? In general the whole "ceph config" foo confuses me a bit. ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable) Regards Manuel

4 years, 8 months

1
0
0 0

rgw luminous 12.2.12

by Marc Roos

FYI I just had an issue with radosgw / civetweb. Wanted to upload 40 MB file, it started with poor transfer speed, which was decreasing over time to 20KB/s when I stopped the transfer. I had to kill radosgw and start it to get 'normal' operation back.

4 years, 8 months

1
0
0 0

"Signature check failed" from certain clients

by Peter Sarossy

hey folks, I spent the past 2 hours digging through the forums and similar sources with no luck.. I use ceph storage for docker stacks, and this issue has taken the whole thing down as I cannot mount their volumes back... Starting yesterday, some of my nodes cannot mount the filesystem and it just hangs on the mount command, while the logs are full of the messages below. It doesn't matter which MDS node is active for some clients it just works, and for others it doesn't. Any hints? 2019-08-15 02:20:55.230 7f6e80e47700 0 --1- [v2:10.0.0.8:6800/859279545,v1:10.0.0.8:6801/859279545] >> v1:10.0.0.1:0/844039356 conn(0x564928535000 0x56492851b800 :6801 s=READ_FOOTER_AND_DISPATCH pgs=13406 cs=3203 l=0).handle_message_footer Signature check failed 2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1 Message signature does not match contents. 2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1Signature on message: 2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1 sig: 1045888080092928376 2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1Locally calculated signature: 2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1 sig_check:13982737427498638198 2019-08-15 02:20:56.254 7f6e7fe45700 0 Signature failed. 2019-08-15 02:20:56.254 7f6e7fe45700 0 --1- [v2:10.0.0.8:6800/859279545,v1:10.0.0.8:6801/859279545] >> v1:10.0.0.1:0/844039356 conn(0x564928535000 0x56492851b800 :6801 s=READ_FOOTER_AND_DISPATCH pgs=13410 cs=3205 l=0).handle_message_footer Signature check failed 2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1 Message signature does not match contents. 2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1Signature on message: 2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1 sig: 1045888080092928376 2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1Locally calculated signature: 2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1 sig_check:13982737427498638198 2019-08-15 02:20:57.246 7f6e80646700 0 Signature failed. 2019-08-15 02:20:57.246 7f6e80646700 0 --1- [v2:10.0.0.8:6800/859279545,v1:10.0.0.8:6801/859279545] >> v1:10.0.0.1:0/844039356 conn(0x564928535000 0x56492851b800 :6801 s=READ_FOOTER_AND_DISPATCH pgs=13414 cs=3207 l=0).handle_message_footer Signature check failed Cheers, Peter

4 years, 8 months

2
1
0 0

Failing heartbeats when no backfill is running

by Lorenz Kiefner

Dear ceph-users, I'm having trouble with heartbeats, there are a lot of "heartbeat_check: no reply from..."-messages in my logs when there is no backfilling or repairing running (yes, it's failing when all PGs are active+clean). Only a few OSDs are failing, even when there are several OSDs on the same host. Doesn't look like a network issue to me. When I set the flags "nobackfill" and "norecover" there are no heartbeat issues. My cluster is kind of heterogenous: it's ARMv7 and x86_64, connected mostly via VPN. Some hosts are Debian Stretch, so I'm still using Ceph Luminous (12.2.12). Is there someone having the same issue? What could be the next steps to debug? Any ideas? Thanks for any help! Lorenz

4 years, 8 months

3
4
0 0

Re: Fw: Ceph-Deploy

by Paul Emmerich

You cannot add OSDs to a cluster that is offline. -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Wed, Aug 14, 2019 at 5:43 PM Cory Mueller <corymueller(a)hotmail.com> wrote: > > Good Morning, > > > I was wondering if you were aware of a way to add more OSD nodes to an existing OFFLINE cluster. > > Ive tried to use ceph-deploy new > ceph-deploy admin > ceph-deploy install > > and none of them allow me to join the cluster. I tried install the ceph packages beforehand and that still didnt work. The one part that im noticing causing the most issues is the ASC key. > I installed the asc key the same way i did on the previous 5 nodes before, yet when I take it offline to connect it to cluster, it still tries to install one from online which wont work in a offline environment. > > I tried copying the asc key from the other nodes in the cluster that are working, and that didnt change anything either. > > > I was hoping you had a few steps that I should be following to bypass this. > > Thank you in advance > Cory Mueller > > > ________________________________________ > From: Sage Weil <sweil(a)redhat.com> > Sent: August 14, 2019 2:20 PM > To: Cory Mueller > Subject: Re: Ceph-Deploy > > Try emailing ceph-users(a)ceph.io > > > _______________________________________________ > ceph-users mailing list -- ceph-users(a)ceph.io > To unsubscribe send an email to ceph-users-leave(a)ceph.io

4 years, 8 months

1
0
0 0

rbd image usage per osd

by Frank R

I have an all RBD pool/cluster. I am interested in tracking how much disk space is being used by each RBD image on every OSD drive. The OSDs are Filestore. Does anyone know of any existing scripts that accomplish this task? If not, what commands can be used to generate this info?

4 years, 8 months

2
1
0 0

Fw: Ceph-Deploy

by Cory Mueller

Good Morning, I was wondering if you were aware of a way to add more OSD nodes to an existing OFFLINE cluster. Ive tried to use ceph-deploy new ceph-deploy admin ceph-deploy install and none of them allow me to join the cluster. I tried install the ceph packages beforehand and that still didnt work. The one part that im noticing causing the most issues is the ASC key. I installed the asc key the same way i did on the previous 5 nodes before, yet when I take it offline to connect it to cluster, it still tries to install one from online which wont work in a offline environment. I tried copying the asc key from the other nodes in the cluster that are working, and that didnt change anything either. I was hoping you had a few steps that I should be following to bypass this. Thank you in advance Cory Mueller ________________________________________ From: Sage Weil <sweil(a)redhat.com> Sent: August 14, 2019 2:20 PM To: Cory Mueller Subject: Re: Ceph-Deploy Try emailing ceph-users(a)ceph.io

4 years, 8 months

1
0
0 0

BlueStore _txc_add_transaction errors (possibly related to bug #38724)

by Florian Haas

Hi everyone, it seems there have been several reports in the past related to BlueStore OSDs crashing from unhandled errors in _txc_add_transaction: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-April/034444.html http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032172.html http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-December/031960.ht… http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-December/031964.ht… Bug #38724 tracks this, has been fixed in master with https://github.com/ceph/ceph/pull/27929, and is pending backports (and, I dare say, is *probably* misclassified as being only minor, as this does cause potential data loss as soon as it affects enough OSDs simultaneously): https://tracker.ceph.com/issues/38724 We just ran into a similar issue with a couple of BlueStore OSDs that we recently added to a Luminous (12.2.12) cluster that was upgraded from Jewel, and hence, still largely runs on FileStore. I say similar because evidently other people reporting this problem have been running into ENOENT (No such file or directory) or ENOTEMPTY (Directory not empty); for us it's interestingly E2BIG (Argument list too long): https://tracker.ceph.com/issues/38724#note-26 So I'm wondering if someone could shed light on these questions: * Is this the same issue as that which https://github.com/ceph/ceph/pull/27929 fixes? * Thus, since https://github.com/ceph/ceph/pull/29115 (the Nautilus backport for that fix) has been merged, but is not yet included in a release, do *Nautilus* users get a fix in the upcoming 14.2.3 release, and once they update, would this bug go away with no further intervention required? * For users on *Luminous*, since https://tracker.ceph.com/issues/39694 (the Luminous version of 38724) says "non-trivial backport", is it fair to say that a fix might still take a while for that release? * Finally, are Luminous users safe from this bug if they keep using, or revert to, FileStore? Thanks in advance for your thoughts! Please keep Erik CC'd on your reply. Cheers, Florian

4 years, 8 months

4
6
0 0

Ceph Tech Talk for August 22nd

by Mike Perez

Hi everyone, We had a change with our original planned tech talk for August 22nd. This is short notice, but if you have a presentation topic you want to give and discuss with the community, now is a good opportunity. https://ceph.com/ceph-tech-talks/ -- Mike Perez (thingee)

4 years, 8 months

1
0
0 0

Planning Ceph User Survey for 2019

by Mike Perez

Hi everyone, It's that time of the year again for us to form this year's user survey. The user survey gives the Ceph community insight with how people are using Ceph and where we should be spending our efforts. You can see last year's survey in this blog post: https://ceph.com/ceph-blog/ceph-user-survey-2018-results To start, let's use the Ceph pad to look at the previous set of questions and the answer options that will be available. https://pad.ceph.com/p/user-survey-2019 Let the discussing begin! -- Mike Perez (thingee)

4 years, 8 months

1
0
0 0

2024

2023

2022

2021

2020

2019

ceph-users August 2019