Hi All,
So, I am trying to create a site-specific zonegroup at my 2nd site's Ceph
cluster. After creating the zonegroup and a placeholder master zone at my
master site, I go to do a period update and commit, and this is what it
returns to me:
(hostname) ~ $ radosgw-admin period commit
2019-11-14 22:27:41.023 7f87e87359c0 1 Cannot find zone
id=c7403ff7-d141-46b1-a185-67edb10c23e1 (name=hou-placeholder-zone),
switching to local zonegroup configuration
2019-11-14 22:27:41.023 7f87e87359c0 -1 Cannot find zone
id=c7403ff7-d141-46b1-a185-67edb10c23e1 (name=hou-placeholder-zone)
2019-11-14 22:27:41.023 7f87e87359c0 0 ERROR: failed to start notify
service ((22) Invalid argument
2019-11-14 22:27:41.023 7f87e87359c0 0 ERROR: failed to init services
(ret=(22) Invalid argument)
couldn't init storage provider
(hostname) ~ $ radosgw-admin zone list
{
    "default_info": "c7403ff7-d141-46b1-a185-67edb10c23e1",
    "zones": [
        "hou-placeholder-zone",
        "hou",
        "hou-ec-1"
    ]
}
(hostname) ~ $
It's like it doesn't see that the zone actually exists, and won't let me
commit the period. Any ideas as to what's going on here?
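For reference, the sequence of checks this boils down to, as I understand it (the zonegroup name "hou-zg" below is only a placeholder for illustration, not my actual name), is:

  radosgw-admin zone get --rgw-zone=hou-placeholder-zone
  radosgw-admin zonegroup get --rgw-zonegroup=hou-zg
  radosgw-admin zonegroup add --rgw-zonegroup=hou-zg --rgw-zone=hou-placeholder-zone
  radosgw-admin period update --commit

i.e. confirm the zone object exists on its own, confirm the zonegroup actually lists it, and if it doesn't, add it and retry the commit.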
Thanks,
Mac
There are some bad links to the mailing list subscribe/unsubscribe/archives on this page that should get updated:
https://ceph.io/resources/
The subscribe/unsubscribe/archives links point to the old lists on vger and lists.ceph.com, not to the new lists on lists.ceph.io:
ceph-devel
subscribe => mailto:majordomo@vger.kernel.org?body=subscribe ceph-devel
unsubscribe => mailto:majordomo@vger.kernel.org?body=unsubscribe ceph-devel
archives => http://marc.info/?l=ceph-devel
ceph-users
subscribe => mailto:ceph-users-join@lists.ceph.com
unsubscribe => mailto:ceph-users-leave@lists.ceph.com
archives => http://lists.ceph.com/pipermail/ceph-users-ceph.com/
Bryan
On Thursday, November 14, 2019, Nathan Cutler <ncutler(a)suse.com> wrote:
>
> Hi Dan:
>
> > I might have found the reason why several of our clusters (and maybe
> > Bryan's too) are getting stuck not trimming osdmaps.
> > It seems that when an osd fails, the min_last_epoch_clean gets stuck
> > forever (even long after HEALTH_OK), until the ceph-mons are
> > restarted.
> >
> > I've updated the ticket: https://tracker.ceph.com/issues/41154
>
> Did you mean to write https://tracker.ceph.com/issues/37875 here?
Oops yes that's it. Not sure where that other link came from. Thanks!
Dan
> Nathan
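(For anyone who wants to check whether their mons are actually trimming osdmaps, one quick sanity check, just a sketch assuming jq is available, is to watch the committed osdmap range over time:

  ceph report 2>/dev/null | jq '.osdmap_first_committed, .osdmap_last_committed'

If osdmap_first_committed stays put while osdmap_last_committed keeps growing long after HEALTH_OK, trimming looks stuck.)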
Hi all,
When increasing the number of placement groups for a pool by a large amount
(say, from 2048 to 4096), is it better to go in small steps or all at once?
This is a filestore cluster.
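For context, the stepwise version would look something like this (just a sketch, assuming a pool named "mypool" and steps of 256):

  ceph osd pool set mypool pg_num 2304
  ceph osd pool set mypool pgp_num 2304

then wait for the cluster to settle back to HEALTH_OK and repeat with the next increment up to 4096, versus setting 4096 directly with a single pair of commands.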
Thanks,
Frank
I have a problem with my MDS, which is in a crash loop. With the help of Yan, Zheng I have run a few attempts to save it, but it seems it is not going the way it should.
I am reading through this documentation:
https://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
Suppose I use the last step to get it back, i.e. creating an alternative metadata pool for recovery.
If that works, how do I remove the broken metadata pool and rename the new one to match the broken one?
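(For the last part, the generic pool commands I know of are

  ceph osd pool rename <recovery-metadata-pool> <old-name>
  ceph osd pool delete <broken-metadata-pool> <broken-metadata-pool> --yes-i-really-really-mean-it

but since CephFS references its pools by ID rather than by name, I am not sure a plain rename/delete is actually safe after the recovery procedure, hence the question.)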
Thanks,
- Karsten
Good day
We currently have 12 nodes in 4 racks (3x4) and are getting another 3 nodes to
complete the 5th rack, running version 12.2.12, deployed with ceph-ansible &
Docker containers.
With the 3 new nodes (1 rack bucket) we would like to use a
non-containerised setup, since our long-term plan is to move away from OSD
containers completely.
How would one go forward running a hybrid environment in the interim until
we've rebuilt all our existing osd nodes?
I assume I would require a separate /ceph-ansible directory with site.yml
alongside my existing site-docker.yml.
Would I require separate inventory files, one for the existing 12 nodes and
one for the 3 new nodes? How would I connect the new nodes to the current
controller (mds, mgr, mon) container nodes?
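What I had in mind is something along these lines (just a sketch; the hostnames and the per-host containerized_deployment override are my own assumption, and I don't know whether ceph-ansible actually supports mixing this per host within one inventory):

  [mons]
  mon-01
  mon-02
  mon-03

  [osds]
  osd-node-01 containerized_deployment=true
  osd-node-13 containerized_deployment=false

The alternative I can think of is keeping two separate inventories, one per deployment style, both pointing at the same mon hosts.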
Regards
--
Jeremi-Ernst Avenant, Mr.
Cloud Infrastructure Specialist
Inter-University Institute for Data Intensive Astronomy
5th Floor, Department of Physics and Astronomy,
University of Cape Town
Tel: 021 959 4137
Web: www.idia.ac.za
E-mail (IDIA): jeremi(a)idia.ac.za
Rondebosch, Cape Town, 7600, South Africa
Hi, I'm trying to make our system a bit more fault tolerant, and I struggle a bit with letting clients reconnect if they have lost contact for a while.
When there is a temporary network problem, I would like clients to block I/O, wait for a connection, and resume.
Do I have any options other than just increasing mds_session_autoclose?
Is there a downside to using a very large value here (like a full day)? I expect all clients to be connected at all times anyway when things are running normally.
What I see right now (if the disconnect is sufficiently long) is that the ceph client releases the I/O block, and you get permission denied on all I/O operations on the existing mount point.
Re-mounting it works, but this also requires killing off all active sessions blocking the unmount. Basically, it's just overall bad when this happens, and I would prefer almost any other option.
I can see that the client tries a reconnect when this happens:
Nov 12 11:53:24 hebbe01-3 kernel: libceph: mds0 10.43.20.3:6800 connection reset
Nov 12 11:53:24 hebbe01-3 kernel: libceph: reset on mds0
Nov 12 11:53:24 hebbe01-3 kernel: ceph: mds0 closed our session
Nov 12 11:53:24 hebbe01-3 kernel: ceph: mds0 reconnect start
Nov 12 11:53:24 hebbe01-3 kernel: ceph: mds0 reconnect denied
Nov 12 11:56:55 hebbe01-3 kernel: libceph: mds0 10.43.20.3:6800 socket closed (con state NEGOTIATING)
Nov 12 11:56:55 hebbe01-3 kernel: ceph: mds0 rejected session
but the logs on the MDS server show the reconnect being denied because the MDS is not in a "reconnect" state.
So, if I understand this correctly, reconnecting is only available in the case where the MDS server itself was restarted?
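(Concretely, what I am considering is something along the lines of

  ceph config set mds mds_session_autoclose 86400

assuming the cluster is new enough for the centralized config store; otherwise the same option would go into ceph.conf on the MDS hosts.)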
Best regards, Mikael
On Wed, Nov 13, 2019 at 04:32:45PM +0000, Arash Shams wrote:
> Hi everybody
> I'm using Nginx in front of radosgw, and I generate the request ID header on nginx. Can I pass the same value to radosgw and tell it to use this header instead of generating a new one?
>
> nginx sample :
>
> more_set_input_headers "x-amz-request-id: $txid"
You'll have to patch RGW's source to do this:
1. Copy the header from request to response.
2. Do NOT store the header as metadata in PutObject.
--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : robbat2(a)gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
With FileStore you can get the number of OSD maps for an OSD by using a simple find command:
# rpm -q ceph
ceph-12.2.12-0.el7.x86_64
# find /var/lib/ceph/osd/ceph-420/current/meta/ -name 'osdmap*' | wc -l
42486
Does anyone know of an equivalent command that can be used with BlueStore?
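(One approach I have been considering, though I have not verified the object naming under BlueStore, is to stop the OSD and list its meta objects with ceph-objectstore-tool, something like

  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-420 --op meta-list | grep -c osdmap

but that requires taking the OSD offline, so I am hoping there is something better.)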
Thanks,
Bryan
Hi everybody
I'm using Nginx in front of radosgw, and I generate the request ID header on nginx. Can I pass the same value to radosgw and tell it to use this header instead of generating a new one?
nginx sample :
more_set_input_headers "x-amz-request-id: $txid"
Thanks