Hi All,
So, I am trying to create a site-specific zonegroup at my 2nd site's Ceph
cluster. After creating the zonegroup and a placeholder master zone at my
master site, I go to do a period update and commit, and this is what it
returns to me:
(hostname) ~ $ radosgw-admin period commit
2019-11-14 22:27:41.023 7f87e87359c0 1 Cannot find zone
id=c7403ff7-d141-46b1-a185-67edb10c23e1 (name=hou-placeholder-zone),
switching to local zonegroup configuration
2019-11-14 22:27:41.023 7f87e87359c0 -1 Cannot find zone
id=c7403ff7-d141-46b1-a185-67edb10c23e1 (name=hou-placeholder-zone)
2019-11-14 22:27:41.023 7f87e87359c0 0 ERROR: failed to start notify
service ((22) Invalid argument
2019-11-14 22:27:41.023 7f87e87359c0 0 ERROR: failed to init services
(ret=(22) Invalid argument)
couldn't init storage provider
(hostname) ~ $ radosgw-admin zone list
{
    "default_info": "c7403ff7-d141-46b1-a185-67edb10c23e1",
    "zones": [
        "hou-placeholder-zone",
        "hou",
        "hou-ec-1"
    ]
}
(hostname) ~ $
It's like it doesn't see that the zone actually exists, and won't let me
commit the period. Any ideas as to what's going on here?
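For reference, the sequence of checks this boils down to, as I understand it (the zonegroup name "hou-zg" below is only a placeholder for illustration, not my actual name), is:

  radosgw-admin zone get --rgw-zone=hou-placeholder-zone
  radosgw-admin zonegroup get --rgw-zonegroup=hou-zg
  radosgw-admin zonegroup add --rgw-zonegroup=hou-zg --rgw-zone=hou-placeholder-zone
  radosgw-admin period update --commit

i.e. confirm the zone object exists on its own, confirm the zonegroup actually lists it, and if it doesn't, add it and retry the commit.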
Thanks,
Mac
There are some bad links to the mailing list subscribe/unsubscribe/archives on this page that should get updated:
https://ceph.io/resources/
The subscribe/unsubscribe/archives links point to the old lists on vger and lists.ceph.com, not to the new lists on lists.ceph.io:
ceph-devel
subscribe => mailto:majordomo@vger.kernel.org?body=subscribe ceph-devel
unsubscribe => mailto:majordomo@vger.kernel.org?body=unsubscribe ceph-devel
archives => http://marc.info/?l=ceph-devel
ceph-users
subscribe => mailto:ceph-users-join@lists.ceph.com
unsubscribe => mailto:ceph-users-leave@lists.ceph.com
archives => http://lists.ceph.com/pipermail/ceph-users-ceph.com/
Bryan
On Thursday, November 14, 2019, Nathan Cutler <ncutler(a)suse.com> wrote:
>
> Hi Dan:
>
> > I might have found the reason why several of our clusters (and maybe
> > Bryan's too) are getting stuck not trimming osdmaps.
> > It seems that when an osd fails, the min_last_epoch_clean gets stuck
> > forever (even long after HEALTH_OK), until the ceph-mons are
> > restarted.
> >
> > I've updated the ticket: https://tracker.ceph.com/issues/41154
>
> Did you mean to write https://tracker.ceph.com/issues/37875 here?
Oops yes that's it. Not sure where that other link came from. Thanks!
Dan
> Nathan
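(For anyone who wants to check whether their mons are actually trimming osdmaps, one quick sanity check, just a sketch assuming jq is available, is to watch the committed osdmap range over time:

  ceph report 2>/dev/null | jq '.osdmap_first_committed, .osdmap_last_committed'

If osdmap_first_committed stays put while osdmap_last_committed keeps growing long after HEALTH_OK, trimming looks stuck.)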
Hi all,
When increasing the number of placement groups for a pool by a large amount
(say, from 2048 to 4096), is it better to go in small steps or all at once?
This is a filestore cluster.
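For context, the stepwise version would look something like this (just a sketch, assuming a pool named "mypool" and steps of 256):

  ceph osd pool set mypool pg_num 2304
  ceph osd pool set mypool pgp_num 2304

then wait for the cluster to settle back to HEALTH_OK and repeat with the next increment up to 4096, versus setting 4096 directly with a single pair of commands.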
Thanks,
Frank
I have a problem with my MDS, which is in a crash loop. With the help of Yan, Zheng I have run a few attempts to save it, but it seems it is not going the way it should.
I am reading through this documentation:
https://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
Suppose I use the last step to get it back, i.e. creating an alternative metadata pool for recovery.
If that works, how do I remove the broken metadata pool and rename the new one to match the broken one?
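(For the last part, the generic pool commands I know of are

  ceph osd pool rename <recovery-metadata-pool> <old-name>
  ceph osd pool delete <broken-metadata-pool> <broken-metadata-pool> --yes-i-really-really-mean-it

but since CephFS references its pools by ID rather than by name, I am not sure a plain rename/delete is actually safe after the recovery procedure, hence the question.)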
Thanks,
- Karsten
Good day
We currently have 12 nodes in 4 racks (3x4) and are getting another 3 nodes to
complete the 5th rack, running version 12.2.12, deployed with ceph-ansible &
Docker containers.
With the 3 new nodes (1 rack bucket) we would like to use a
non-containerised setup, since our long-term plan is to move away from OSD
containers completely.
How would one go forward running a hybrid environment in the interim until
we've rebuilt all our existing osd nodes?
I assume I would require a separate /ceph-ansible directory with site.yml
alongside my existing site-docker.yml.
Would I require separate inventory files, one for the existing 12 nodes and
one for the 3 new nodes? How would I connect the new nodes to the current
controller (mds, mgr, mon) container nodes?
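What I had in mind is something along these lines (just a sketch; the hostnames and the per-host containerized_deployment override are my own assumption, and I don't know whether ceph-ansible actually supports mixing this per host within one inventory):

  [mons]
  mon-01
  mon-02
  mon-03

  [osds]
  osd-node-01 containerized_deployment=true
  osd-node-13 containerized_deployment=false

The alternative I can think of is keeping two separate inventories, one per deployment style, both pointing at the same mon hosts.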
Regards
--
Jeremi-Ernst Avenant, Mr.
Cloud Infrastructure Specialist
Inter-University Institute for Data Intensive Astronomy
5th Floor, Department of Physics and Astronomy,
University of Cape Town
Tel: 021 959 4137
Web: www.idia.ac.za
E-mail (IDIA): jeremi(a)idia.ac.za
Rondebosch, Cape Town, 7600, South Africa
Hi, I'm trying to make our system a bit more fault tolerant, and I struggle a bit with letting clients reconnect if they have lost contact for a while.
When there is a temporary network problem, I would like clients to block I/O, wait for a connection, and resume.
Do I have any options other than just increasing mds_session_autoclose?
Is there a downside to using a very large value here (like a full day)? I expect all clients to be connected at all times anyway when things are running normally.
What I see right now (if the disconnect is sufficiently long) is that the ceph client releases the I/O block, and you get permission denied on all I/O operations on the existing mount point.
Re-mounting it works, but this also requires killing off all active sessions blocking the unmount. Basically, it's just overall bad when this happens, and I would prefer almost any other option.
I can see that the client tries a reconnect when this happens:
Nov 12 11:53:24 hebbe01-3 kernel: libceph: mds0 10.43.20.3:6800 connection reset
Nov 12 11:53:24 hebbe01-3 kernel: libceph: reset on mds0
Nov 12 11:53:24 hebbe01-3 kernel: ceph: mds0 closed our session
Nov 12 11:53:24 hebbe01-3 kernel: ceph: mds0 reconnect start
Nov 12 11:53:24 hebbe01-3 kernel: ceph: mds0 reconnect denied
Nov 12 11:56:55 hebbe01-3 kernel: libceph: mds0 10.43.20.3:6800 socket closed (con state NEGOTIATING)
Nov 12 11:56:55 hebbe01-3 kernel: ceph: mds0 rejected session
but the logs on the MDS server show the reconnect being denied because the MDS is not in a "reconnect" state.
So, if I understand this correctly, reconnecting is only available in the case where the MDS server itself was restarted?
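(Concretely, what I am considering is something along the lines of

  ceph config set mds mds_session_autoclose 86400

assuming the cluster is new enough for the centralized config store; otherwise the same option would go into ceph.conf on the MDS hosts.)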
Best regards, Mikael
On Wed, Nov 13, 2019 at 04:32:45PM +0000, Arash Shams wrote:
> Hi everybody
> I'm using Nginx in front of radosgw, and I generate the request ID header on nginx. Can I pass the same value to radosgw and tell it to use this header instead of generating a new one?
>
> nginx sample :
>
> more_set_input_headers "x-amz-request-id: $txid"
You'll have to patch RGW's source to do this:
1. Copy the header from request to response.
2. Do NOT store the header as metadata in PutObject.
--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : robbat2(a)gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
With FileStore you can get the number of OSD maps for an OSD by using a simple find command:
# rpm -q ceph
ceph-12.2.12-0.el7.x86_64
# find /var/lib/ceph/osd/ceph-420/current/meta/ -name 'osdmap*' | wc -l
42486
Does anyone know of an equivalent command that can be used with BlueStore?
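(One approach I have been considering, though I have not verified the object naming under BlueStore, is to stop the OSD and list its meta objects with ceph-objectstore-tool, something like

  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-420 --op meta-list | grep -c osdmap

but that requires taking the OSD offline, so I am hoping there is something better.)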
Thanks,
Bryan
Hi everybody
I'm using Nginx in front of radosgw, and I generate the request ID header on nginx. Can I pass the same value to radosgw and tell it to use this header instead of generating a new one?
nginx sample :
more_set_input_headers "x-amz-request-id: $txid"
Thanks