Hello,
Thank you for the help. This is done and everything is working now.
Best Regards
Mateusz Skała
On 13.10.2020, at 14:59, Gaël THEROND
<gael.therond(a)bitswalk.com> wrote:
If you’ve got all nodes up and running fine now, here is what I did on my
own cluster just this morning (a rough command sketch follows the list).
1°/- Ensure all MONs have the same /etc/ceph/ceph.conf file.
2°/- Your MONs often share the same keyring; if so, ensure you’ve got
the right keyring in both places: /etc/ceph/ceph.mon.keyring and
/var/lib/ceph/mon/<clustername>-<hostname>/keyring
3°/- Delete the store and kv of your UNHEALTHY mons, found under
/var/lib/ceph/mon/<clustername>-<hostname>/; they will be rebuilt when the
mon process restarts.
4°/- Start the last healthy monitor and wait for it to complain that it
cannot acquire a global_id.
5°/- Start the remaining MONs.
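Here is a rough shell sketch of those five steps, assuming a cluster named
"ceph", systemd-managed mons, and hypothetical hostnames mon1 (healthy),
mon2 (unhealthy), mon3; adapt the unit names and paths to your deployment
(e.g. docker restart instead of systemctl for containerized mons):

    # 1°/- Compare ceph.conf checksums across all MON hosts.
    for h in mon1 mon2 mon3; do ssh "$h" md5sum /etc/ceph/ceph.conf; done

    # 2°/- Check that both keyring locations match on each MON.
    diff /etc/ceph/ceph.mon.keyring /var/lib/ceph/mon/ceph-mon2/keyring

    # 3°/- On the unhealthy MON only: stop it and delete its store
    #      (rebuilt from the healthy MON when the process restarts).
    systemctl stop ceph-mon@mon2
    rm -rf /var/lib/ceph/mon/ceph-mon2/store.db

    # 4°/- Start the last healthy MON first and watch its log for the
    #      global_id complaint.
    systemctl start ceph-mon@mon1
    journalctl -fu ceph-mon@mon1

    # 5°/- Then start the remaining MONs.
    systemctl start ceph-mon@mon2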
You should see the quorum trigger a new election as soon as each mon detects
that it is part of an already existing cluster and retrieves the appropriate
data (store/kv/etc.) from the remaining healthy MON.
This procedure can fail if your unhealthy MONs don’t have the appropriate
keyring.
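Once all the mons are back up, a quick way to verify that the quorum actually
re-formed, using the standard ceph CLI from any node with an admin keyring:

    # Cluster health summary, including which mons are in quorum.
    ceph -s

    # Detailed quorum membership and the current election epoch.
    ceph quorum_status --format json-pretty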
On Tue, 13 Oct 2020 at 12:56, Mateusz Skała <mateusz.skala(a)gmail.com>
wrote:
Hi,
Thanks for responding. All monitors went down; 2 of 3 are actually up now,
but probably not in quorum. A quick summary of what happened beforehand:
1. a few PGs without scrub and deep-scrub, 2 mons in the cluster
2. added one monitor (via Ansible); Ansible restarted the OSDs
3. the OS filesystem filled up on every node (because of multiple SST files)
4. all pods with monitors went down
5. added a new filesystem for the monitors and moved the data from the OS
filesystem to it
6. 2 monitors started (the last one with a failure), but they are not
responding to any commands
Regards
Mateusz Skała
On Tue, 13 Oct 2020 at 11:25, Gaël THEROND <gael.therond(a)bitswalk.com>
wrote:
> This error means your quorum didn’t form.
>
> How many mon nodes do you usually have, and how many went down?
>
> On Tue, 13 Oct 2020 at 10:56, Mateusz Skała <mateusz.skala(a)gmail.com>
> wrote:
>
>> Hello Community,
>> I have problems with ceph-mons in Docker. The Docker pods are starting,
>> but I get a lot of "e6 handle_auth_request failed to assign global_id”
>> messages in the log. 2 mons are up, but I can’t run any ceph commands.
>> Regards
>> Mateusz
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io