Hi,
I have been playing around with the Ceph balancer in Luminous and Nautilus.
While tuning some balancer settings I ran into problems with Nautilus.
In Luminous I could configure the max_misplaced value like this:
ceph config-key set mgr/balancer/max_misplaced 0.002
With the same command in Nautilus I get this warning:
WARNING: it looks like you might be trying to set a ceph-mgr module
configuration key. Since Ceph 13.0.0 (Mimic), mgr module configuration
is done with `config set`, and new values set using `config-key set`
will be ignored.
After investigating the Nautilus documentation
(https://docs.ceph.com/docs/nautilus/rados/operations/balancer/#throttling)
I tried this:
ceph config set mgr mgr/balancer/max_misplaced 0.002
and got this error:
Error EINVAL: unrecognized config option 'mgr/balancer/max_misplaced'
My question is: how can I configure this parameter? In general the
whole "ceph config" foo still confuses me a bit.
ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus
(stable)
Regards
Manuel
FYI, I just had an issue with radosgw/civetweb. I wanted to upload a 40 MB
file; the transfer started out slow and kept getting slower, down to 20 KB/s
by the time I stopped it. I had to kill radosgw and start it again to get
'normal' operation back.
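On a systemd deployment that restart would be something like the following
(the exact unit name depends on how the rgw instance was deployed):

systemctl restart ceph-radosgw@rgw.<instance>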
hey folks,
I spent the past 2 hours digging through the forums and similar sources with no luck...
I use Ceph storage for Docker stacks, and this issue has taken the whole thing down, as I cannot mount their volumes again...
Starting yesterday, some of my nodes cannot mount the filesystem: the mount command just hangs, while the logs fill up with the messages below. It doesn't matter which MDS node is active; for some clients it just works, and for others it doesn't.
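For reference, the clients mount it roughly like this (monitor address, user
name and secret file are placeholders, not my real values):

mount -t ceph <mon-ip>:6789:/ /mnt/docker -o name=docker,secretfile=/etc/ceph/docker.secret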
Any hints?
2019-08-15 02:20:55.230 7f6e80e47700 0 --1- [v2:10.0.0.8:6800/859279545,v1:10.0.0.8:6801/859279545] >> v1:10.0.0.1:0/844039356 conn(0x564928535000 0x56492851b800 :6801 s=READ_FOOTER_AND_DISPATCH pgs=13406 cs=3203 l=0).handle_message_footer Signature check failed
2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1 Message signature does not match contents.
2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1Signature on message:
2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1 sig: 1045888080092928376
2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1Locally calculated signature:
2019-08-15 02:20:56.254 7f6e7fe45700 0 SIGN: MSG 1 sig_check:13982737427498638198
2019-08-15 02:20:56.254 7f6e7fe45700 0 Signature failed.
2019-08-15 02:20:56.254 7f6e7fe45700 0 --1- [v2:10.0.0.8:6800/859279545,v1:10.0.0.8:6801/859279545] >> v1:10.0.0.1:0/844039356 conn(0x564928535000 0x56492851b800 :6801 s=READ_FOOTER_AND_DISPATCH pgs=13410 cs=3205 l=0).handle_message_footer Signature check failed
2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1 Message signature does not match contents.
2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1Signature on message:
2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1 sig: 1045888080092928376
2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1Locally calculated signature:
2019-08-15 02:20:57.246 7f6e80646700 0 SIGN: MSG 1 sig_check:13982737427498638198
2019-08-15 02:20:57.246 7f6e80646700 0 Signature failed.
2019-08-15 02:20:57.246 7f6e80646700 0 --1- [v2:10.0.0.8:6800/859279545,v1:10.0.0.8:6801/859279545] >> v1:10.0.0.1:0/844039356 conn(0x564928535000 0x56492851b800 :6801 s=READ_FOOTER_AND_DISPATCH pgs=13414 cs=3207 l=0).handle_message_footer Signature check failed
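For what it's worth, the next thing I plan to compare is the cephx signing
configuration on the cluster vs. the clients, with something like this
(assuming these commands still show the cephx options on my version):

ceph config dump | grep -i cephx
ceph daemon mds.<id> config show | grep -i cephx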
Cheers,
Peter
Dear ceph-users,
I'm having trouble with heartbeats: there are a lot of "heartbeat_check:
no reply from ..." messages in my logs when there is no backfilling or
repair running (yes, it is failing even when all PGs are active+clean).
Only a few OSDs are affected, even when there are several OSDs on the
same host, so it doesn't look like a network issue to me.
When I set the flags "nobackfill" and "norecover" there are no heartbeat
issues.
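(Concretely, I set them with

ceph osd set nobackfill
ceph osd set norecover

and clear them again with ceph osd unset <flag>.)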
My cluster is kind of heterogeneous: it's ARMv7 and x86_64, connected
mostly via VPN. Some hosts are on Debian Stretch, so I'm still running
Ceph Luminous (12.2.12).
Is anyone else seeing the same issue? What would be the next steps to
debug this? Any ideas?
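One thing I could try myself is turning up messenger logging on an affected
OSD to see whether the heartbeat pings actually leave the host (assuming
injectargs still works this way on 12.2.12):

ceph tell osd.<id> injectargs '--debug_ms 1'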
Thanks for any help!
Lorenz
You cannot add OSDs to a cluster that is offline.
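If by "offline" you mean the nodes have no internet access (rather than the
cluster being down): ceph-deploy can be pointed at a local package mirror.
If I remember the flags correctly, it is something like this (the URLs are
placeholders for your own mirror):

ceph-deploy install --repo-url http://mirror.example/ceph --gpg-url http://mirror.example/release.asc <node>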
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Wed, Aug 14, 2019 at 5:43 PM Cory Mueller <corymueller(a)hotmail.com> wrote:
>
> Good Morning,
>
>
> I was wondering if you were aware of a way to add more OSD nodes to an existing OFFLINE cluster.
>
> I've tried to use ceph-deploy new
> ceph-deploy admin
> ceph-deploy install
>
> and none of them allow me to join the cluster. I tried installing the ceph packages beforehand and that still didn't work. The one part that I'm noticing is causing the most issues is the asc key (the release GPG key).
> I installed the asc key the same way I did on the previous 5 nodes, yet when I take the node offline to connect it to the cluster, it still tries to install the key from online, which won't work in an offline environment.
>
> I tried copying the asc key from the other nodes in the cluster that are working, and that didn't change anything either.
>
>
> I was hoping you had a few steps that I should be following to bypass this.
>
> Thank you in advance
> Cory Mueller
>
>
> ________________________________________
> From: Sage Weil <sweil(a)redhat.com>
> Sent: August 14, 2019 2:20 PM
> To: Cory Mueller
> Subject: Re: Ceph-Deploy
>
> Try emailing ceph-users(a)ceph.io
>
>
I have an all-RBD pool/cluster. I am interested in tracking how much disk
space is being used by each RBD image on every OSD drive.
The OSDs are Filestore.
Does anyone know of any existing scripts that accomplish this task?
If not, what commands can be used to generate this info?
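If it helps, my rough idea so far is something like the following (untested;
I'm assuming the usual Filestore name escaping, where '_' in object names
becomes '\u' in the on-disk filenames):

# get the image's object prefix, e.g. rbd_data.1f2a74b0dc51
rbd info <pool>/<image> | grep block_name_prefix
# on each OSD host, sum the on-disk object sizes for that prefix:
find /var/lib/ceph/osd/ceph-*/current -name 'rbd\\udata.<prefix-id>.*' -printf '%s\n' | awk '{sum+=$1} END {print sum}'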
Good Morning,
I was wondering if you were aware of a way to add more OSD nodes to an existing OFFLINE cluster.
I've tried to use ceph-deploy new
ceph-deploy admin
ceph-deploy install
and none of them allow me to join the cluster. I tried installing the ceph packages beforehand and that still didn't work. The one part that I'm noticing is causing the most issues is the asc key (the release GPG key).
I installed the asc key the same way I did on the previous 5 nodes, yet when I take the node offline to connect it to the cluster, it still tries to install the key from online, which won't work in an offline environment.
I tried copying the asc key from the other nodes in the cluster that are working, and that didn't change anything either.
I was hoping you had a few steps that I should be following to bypass this.
Thank you in advance
Cory Mueller
________________________________________
From: Sage Weil <sweil(a)redhat.com>
Sent: August 14, 2019 2:20 PM
To: Cory Mueller
Subject: Re: Ceph-Deploy
Try emailing ceph-users(a)ceph.io
Hi everyone,
There has been a change to our originally planned tech talk for August 22nd.
This is short notice, but if you have a presentation topic you would like to
give and discuss with the community, now is a good opportunity.
https://ceph.com/ceph-tech-talks/
--
Mike Perez (thingee)
Hi everyone,
It's that time of the year again for us to put together this year's user survey.
The user survey gives the Ceph community insight into how people are using
Ceph and where we should be spending our efforts. You can see last year's
survey results in this blog post:
https://ceph.com/ceph-blog/ceph-user-survey-2018-results
To start, let's use the Ceph pad to look at the previous set of questions
and the answer options that will be available.
https://pad.ceph.com/p/user-survey-2019
Let the discussion begin!
--
Mike Perez (thingee)