I see, maybe you want to look at these instructions. I don't know whether you are running Rook, but the point about keeping the container alive by using `sleep` is important. Then you can get into the container with `exec` and do what you need to.
https://rook.io/docs/rook/v1.4/ceph-disaster-recovery.html#restoring-mon-qu…
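If you are on Rook, the trick in those docs is roughly the following (a sketch; the namespace and the deployment name rook-ceph-mon-a are illustrative):

  kubectl -n rook-ceph patch deployment rook-ceph-mon-a -p \
    '{"spec": {"template": {"spec": {"containers": [{"name": "mon",
      "command": ["sleep", "infinity"], "args": []}]}}}}'
  kubectl -n rook-ceph exec -it deploy/rook-ceph-mon-a -- bash

The patch swaps the mon entrypoint for sleep so the pod stays up without the daemon running, which is exactly what you need for monmap surgery.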
> On Oct 12, 2020, at 4:16 PM, Gaël THEROND <gael.therond(a)bitswalk.com> wrote:
>
> Hi Brian!
>
> Thanks a lot for your quick answer, it was fast!
>
> Yes, I’ve read this doc, yet I can’t perform the appropriate commands as my OSDs are up and running.
>
> As my mon is a container, if I try to use ceph-mon --extract-monmap it won’t work while the mon process is running, and if I stop it the container will be restarted and I’ll be kicked out of it.
>
> I can’t retrieve anything from ceph mon getmap as the quorum isn’t forming.
>
> Yep, I know that I would need three nodes, and I’ve recently had a third node available for this lab.
>
> Unfortunately it’s a lab cluster, and one of my colleagues just took the third node for testing purposes... I told you, a series of unfortunate events :-)
>
> I can’t get rid of the cluster as I can’t lose the OSDs’ data.
>
> G.
>
> On Tue, Oct 13, 2020 at 00:01, Brian Topping <brian.topping(a)gmail.com> wrote:
> Hi there!
>
> This isn’t a difficult problem to fix. For clarity: the monmap is just one part of the monitor database. You generally have all the details correct, though.
>
> Have you looked at the process in https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#… ?
>
> Please do make sure you are working on the copy of the monitor database with the newest epoch. After removing the other monitors and getting your cluster back online, you can re-add monitors at will.
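> A minimal sketch of that documented procedure, assuming mon IDs "a"
> (the survivor) and "b" (the one to remove); run it with the mon
> daemon stopped:
>
>   ceph-mon -i a --extract-monmap /tmp/monmap   # grab the newest-epoch copy
>   monmaptool /tmp/monmap --rm b                # drop the dead mon
>   ceph-mon -i a --inject-monmap /tmp/monmap    # write it back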
>
> Also note that a quorum is defined as "one-half the total number of nodes, plus one". In your case, quorum is defined by both nodes! Taking either down would cause this problem. So you need an odd number of nodes to be able to take one down, for instance during a rolling upgrade.
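> (Concretely: with 2 monitors, quorum = floor(2/2) + 1 = 2, so both must be up; with 3, quorum = floor(3/2) + 1 = 2, so one node can safely go down.)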
>
> Hope that helps!
>
> Brian
Hi everyone,
Because of a series of unfortunate events, I have a container-based Ceph
cluster (Nautilus) in bad shape.
It's a lab cluster whose control plane is made of only 2 nodes (I know,
it's bad :-)); each of these nodes runs a containerized mon, mgr and
rados-gw daemon.
They were installed using ceph-ansible, if that's relevant for anyone.
However, while I was performing an upgrade on the first node, the second
went down too (electrical power outage).
As soon as I saw that, I stopped all running processes on the node being
upgraded.
For now, if I try to restart my second node, the cluster isn't available,
as the quorum requires two nodes.
The container starts and the node elects itself as leader, but all ceph
commands hang forever, which is perfectly normal, as the quorum is still
waiting for one more member to complete the election process.
So, my question is: as I can't (to my knowledge) extract the monmap in
this intermediate state, and as my first node will still be considered a
known mon and will try to rejoin if started properly, can I just copy
/etc/ceph.conf and /var/lib/mon/<host>/keyring from the last living node
(the second one) into the corresponding places on the first node? My mon
keys were initially the same for both mons, and if I'm not making any
mistakes, my first node, being blank, will create a default store, join
the existing cluster and retrieve the appropriate monmap from the
remaining node, right?
If not, is there a process for saving/extracting the monmap when using
container-based Ceph? I can perfectly well exec into the remaining node
if that makes any difference.
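In case it helps, here is a rough sketch of what such an extraction could
look like on a ceph-ansible/docker node (the unit name, image tag and
paths are assumptions for illustration; the mon must be stopped while you
extract):

  systemctl stop ceph-mon@$(hostname -s)
  docker run --rm --net=host \
    -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph \
    --entrypoint ceph-mon ceph/daemon:latest-nautilus \
    -i $(hostname -s) --extract-monmap /var/lib/ceph/monmap.bin
  systemctl start ceph-mon@$(hostname -s)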
Thanks a lot!
Dear all,
occasionally, I find messages like
Health check update: Long heartbeat ping times on front interface seen, longest is 1043.153 msec (OSD_SLOW_PING_TIME_FRONT)
in the cluster log. Unfortunately, I seem unable to find out which OSDs were affected (a posteriori). I cannot find related messages in any OSD log, and the messages I do find in /var/log/messages contain neither IP addresses nor OSD IDs.
Is there a way to find out which OSDs/hosts were the problem after the health status is back to healthy?
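A couple of places one might look (a sketch; the paths are assumptions, and dump_osd_network only exists in recent Nautilus releases, 14.2.6+ if I remember right):

  # the mon's cluster log usually keeps the detailed health lines
  grep -i 'OSD_SLOW_PING_TIME\|heartbeat' /var/log/ceph/ceph.log
  # while a warning is active: per-OSD ping times via the admin socket
  ceph daemon osd.0 dump_osd_network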
Thanks!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Diving into the various logs and searching for answers, I came across
the following:
PG_DEGRADED Degraded data redundancy: 2101057/10339536570 objects degraded
(0.020%), 3 pgs degraded, 3 pgs undersized
pg 1.4b is stuck undersized for 63114.227655, current state
active+undersized+degraded+remapped+backfilling, last acting
[62,20,33,25,97,2,159,2147483647,88]
pg 1.115 is stuck undersized for 67017.759147, current state
active+undersized+degraded+remapped+backfilling, last acting
[2147483647,6,28,48,171,160,51,7,84]
pg 1.1ec is stuck undersized for 67017.772311, current state
active+undersized+degraded+remapped+backfilling, last acting
[65,82,2147483647,161,6,36,105,106,48]
Note the OSD ID 2147483647 in the acting sets... That doesn't seem correct.
Any ideas?
I've seen this PR that reverts the Ubuntu version from 20.04 to 18.04
because of some failures!
Are there any updates on this?
https://github.com/ceph/ceph/pull/35110
On Mon, Oct 12, 2020 at 4:11 AM Robert Ruge <robert.ruge(a)deakin.edu.au>
wrote:
> I am using Ubuntu 20.04 LTS for a five node 1PB cephfs setup with no
> problems that I can attribute to using Ubuntu 20.
>
> Regards
> Robert Ruge
>
>
>
> -----Original Message-----
> From: Seena Fallah <seenafallah(a)gmail.com>
> Sent: Monday, 12 October 2020 11:35 AM
> To: ceph-users <ceph-users(a)ceph.io>
> Subject: [ceph-users] Re: Ubuntu 20 with octopus
>
> The main reason I asked is that I don’t see any Ubuntu 20 in this doc
> https://docs.ceph.com/en/latest/start/os-recommendations/
>
> On Mon, Oct 12, 2020 at 4:01 AM Seena Fallah <seenafallah(a)gmail.com>
> wrote:
>
> > Hi all,
> >
> > Does anyone have a production cluster running Ubuntu 20 (Focal), or any
> > suggestions, or know of any bugs that prevent deploying Ceph Octopus on Ubuntu 20?
> >
> > Thanks.
> >
Hi all,
We've been having trouble with our Ceph cluster for over a week now.
Short info regarding our situation:
- Original cluster had 10 OSD nodes, each having 16 OSDs
- Expansion was necessary, so another 6 nodes have been added
- Version: 14.2.11
Last week we saw heavily loaded OSD servers; with help from this list we
identified the disk load as being too high due to compaction of the
RocksDB. Taking the disk offline and running ceph-kvstore-tool
bluestore-kv /var/lib/ceph/osd/xxxx compact does take the load away
temporarily, as mentioned.
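That offline compaction flow, as a minimal sketch (OSD id 12 is illustrative):

  systemctl stop ceph-osd@12
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 compact
  systemctl start ceph-osd@12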
Most of the new disks still have a weight of 0 as we want to get the system
stable first, but there is something I simply don't understand.
When we set the following flags prior to taking the disk offline for
compaction: noout, norecover, nobackfill and norebalance, we still see a
rise in degraded PGs after the OSD is marked as "up" again.
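For reference, setting and later clearing those flags looks like this:

  ceph osd set noout
  ceph osd set norecover
  ceph osd set nobackfill
  ceph osd set norebalance
  # ... maintenance ...
  ceph osd unset noout    # and likewise for the other three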
Last night, a flapping OSD was also temporarily marked as "down", I
assume because it was heavily loaded, again causing a rise in degraded PGs.
I know there is a "nodown" flag, but I've never used it. Reading the
docs, they state these flags are "temporary" and that the blocked action
will be performed anyway afterwards...
So I have a few questions:
1. Why is the cluster marking PGs as "degraded" and indicating degraded
data redundancy, when this was not the case before? The count rises
(2196398/10339524249 objects degraded (0.021%)) and I simply cannot
understand why it keeps going up...
2. The "nodown" flag: can I use this to prevent the flapping? I don't
want to get into a deeper mess. As far as I understand, it would help in
our case, as the OSDs are heavily used.
3. Is it a good idea to start adding the other disks as well, slowly
increasing their weight (see the sketch below)?
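(For question 3, a gradual introduction could look like this sketch; the OSD id and step sizes are illustrative:)

  ceph osd crush reweight osd.160 0.2
  # wait for backfill to settle, then step up
  ceph osd crush reweight osd.160 0.5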
Thanks,
Kristof
Hello,
I found that the documentation on the Internet is inconsistent on the
question of whether I can safely have two instances of CephFS in my
cluster. For the record, I don't use snapshots.
FOSDEM 19 presentation by Sage Weil:
https://archive.fosdem.org/2019/schedule/event/ceph_project_status_update/a…
Slide 25 is specifically devoted to this topic, and declares
multi-volume support as stable.
But, https://docs.ceph.com/en/nautilus/cephfs/experimental-features/
declares that multiple filesystems in the same cluster are an
experimental feature, and the "latest" version of the same doc makes
the same claim.
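(For what it's worth, the Nautilus docs gate the feature behind a flag whose guard reflects that experimental status; a sketch:)

  ceph fs flag set enable_multiple true --yes-i-really-mean-it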
What should I believe - the presentation or the official docs?
--
Alexander E. Patrakov
CV: http://pc.cd/PLz7
Hello,
What is the necessity of enabling the application on a pool? As per the
documentation, we need to enable the application before using the pool.
However, in my case I have a single pool on the cluster, used for RBD. I
am able to run all RBD operations on the pool even if I don't enable the
application.
So what is the downside of not enabling the application on a pool?
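For reference, the enable step being discussed is a one-liner (the pool name is illustrative):

  ceph osd pool application enable mypool rbd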
Regards,
Shridhar