This sounds a lot like an old thread of mine:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/M5ZKF7PTEO2…
See the discussion about mon_sync_max_payload_size, and the PR that
fixed this at some point in nautilus.
Our workaround was:
ceph config set mon mon_sync_max_payload_size 4096
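If the config database route doesn't work for you on mimic, the same value
can also go into ceph.conf on the mon hosts (restarting the mons one at a
time), or be injected into the running mons. I haven't verified these exact
invocations on 13.2.10, so treat them as a sketch:

  # in ceph.conf on each mon host
  [mon]
      mon sync max payload size = 4096

  # or at runtime, into the mons that are still up
  ceph tell mon.* injectargs '--mon_sync_max_payload_size=4096'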
Hope that helps,
Dan
On Wed, Jan 6, 2021 at 8:18 PM Frank Schilder <frans@dtu.dk> wrote:
>
> Dear Dan,
>
> thanks for your fast response.
>
> Version: mimic 13.2.10.
>
> Here is the mon_status of the "new" MON during syncing:
>
> [root@ceph-01 ~]# ceph daemon mon.ceph-01 mon_status
> {
>     "name": "ceph-01",
>     "rank": 0,
>     "state": "synchronizing",
>     "election_epoch": 0,
>     "quorum": [],
>     "features": {
>         "required_con": "144115188346404864",
>         "required_mon": [
>             "kraken",
>             "luminous",
>             "mimic",
>             "osdmap-prune"
>         ],
>         "quorum_con": "0",
>         "quorum_mon": []
>     },
>     "outside_quorum": [
>         "ceph-01"
>     ],
>     "extra_probe_peers": [],
>     "sync_provider": [],
>     "sync": {
>         "sync_provider": "mon.2 192.168.32.67:6789/0",
>         "sync_cookie": 33302773774,
>         "sync_start_version": 38355711
>     },
>     "monmap": {
>         "epoch": 3,
>         "fsid": "e4ece518-f2cb-4708-b00f-b6bf511e91d9",
>         "modified": "2019-03-14 23:08:34.717223",
>         "created": "2019-03-14 22:18:15.088212",
>         "features": {
>             "persistent": [
>                 "kraken",
>                 "luminous",
>                 "mimic",
>                 "osdmap-prune"
>             ],
>             "optional": []
>         },
>         "mons": [
>             {
>                 "rank": 0,
>                 "name": "ceph-01",
>                 "addr": "192.168.32.65:6789/0",
>                 "public_addr": "192.168.32.65:6789/0"
>             },
>             {
>                 "rank": 1,
>                 "name": "ceph-02",
>                 "addr": "192.168.32.66:6789/0",
>                 "public_addr": "192.168.32.66:6789/0"
>             },
>             {
>                 "rank": 2,
>                 "name": "ceph-03",
>                 "addr": "192.168.32.67:6789/0",
>                 "public_addr": "192.168.32.67:6789/0"
>             }
>         ]
>     },
>     "feature_map": {
>         "mon": [
>             {
>                 "features": "0x3ffddff8ffacfffb",
>                 "release": "luminous",
>                 "num": 1
>             }
>         ],
>         "mds": [
>             {
>                 "features": "0x3ffddff8ffacfffb",
>                 "release": "luminous",
>                 "num": 2
>             }
>         ],
>         "client": [
>             {
>                 "features": "0x2f018fb86aa42ada",
>                 "release": "luminous",
>                 "num": 1
>             },
>             {
>                 "features": "0x3ffddff8eeacfffb",
>                 "release": "luminous",
>                 "num": 1
>             },
>             {
>                 "features": "0x3ffddff8ffacfffb",
>                 "release": "luminous",
>                 "num": 17
>             }
>         ]
>     }
> }
>
> I'm a bit surprised that the other 2 MONs don't remain in quorum until this
> MON has caught up. Is there any way to monitor the syncing progress? Right
> now I need to interrupt regularly to allow some I/O, but I have no clue how
> long I need to wait.
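>
> (The only crude progress check I could come up with is to compare the size
> of the new MON's store against one of the healthy MONs while it syncs,
> assuming the default data path; I don't know how reliable that is:
>
>   du -sh /var/lib/ceph/mon/ceph-ceph-01/store.db   # on the syncing MON
>   du -sh /var/lib/ceph/mon/ceph-ceph-03/store.db   # on an in-quorum MON
>
> The sizes should converge as the sync proceeds.)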
>
> Thanks for your help!
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <dan@vanderster.com>
> Sent: 06 January 2021 20:16:44
> To: Frank Schilder
> Cc: Ceph Users
> Subject: Re: [ceph-users] Re: Storage down due to MON sync very slow
>
> Which version of Ceph are you running?
>
> .. dan
>
>
> On Wed, Jan 6, 2021, 8:14 PM Frank Schilder <frans@dtu.dk> wrote:
> In the output of the MON I see slow ops warnings:
>
> debug 2021-01-06 20:12:48.854 7f1a3d29f700 -1 mon.ceph-01@0(synchronizing) e3
> get_health_metrics reporting 20 slow ops, oldest is log(1 entries from seq 1
> at 2021-01-06 20:00:12.014861)
>
> There appears to be no progress on this operation; it is stuck.
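>
> Assuming the MON op tracker is exposed over the admin socket in this release
> (I haven't checked that on 13.2.10), the following should list the ops behind
> that warning and show whether they are moving at all:
>
>   ceph daemon mon.ceph-01 ops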
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder <frans@dtu.dk>
> Sent: 06 January 2021 20:11:25
> To: ceph-users@ceph.io
> Subject: [ceph-users] Storage down due to MON sync very slow
>
> Dear all,
>
> I had to restart one of our 3 MONs with an empty MON DB directory. It is in
> the synchronizing state right now, but I'm not sure whether it is making any
> progress. The cluster is completely unresponsive even though I have 2 healthy
> MONs. Is there any way to sync the DB directory faster and/or without
> downtime?
>
> Thanks a lot!
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-leave@ceph.io