Replying to own mail...
On Tue, Jun 15, 2021 at 7:54 PM Dan van der Ster <dan(a)vanderster.com> wrote:
Hi Ilya,
We're now hitting this on CentOS 8.4.
The "setmaxosd" workaround fixed access to one of our clusters, but
isn't working for another, where we have gaps in the osd ids, e.g.
# ceph osd getmaxosd
max_osd = 553 in epoch 691642
# ceph osd tree | sort -n -k1 | tail
541 ssd 0.87299 osd.541 up 1.00000 1.00000
543 ssd 0.87299 osd.543 up 1.00000 1.00000
548 ssd 0.87299 osd.548 up 1.00000 1.00000
552 ssd 0.87299 osd.552 up 1.00000 1.00000
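For what it's worth, the gap ids (everything below max_osd that `ceph osd ls` doesn't report) can be enumerated with a small sketch like the one below. The ceph commands are stubbed with sample data here so the logic is visible; on a real cluster you'd substitute the commented-out commands:

```shell
#!/bin/bash
# Sketch: list osd ids below max_osd that have no osd (the "gaps").
# Stubbed inputs; on a real cluster use something like:
#   max_osd=$(ceph osd getmaxosd | awk '{print $3}')   # "max_osd = 553 in epoch ..."
#   existing_ids=$(ceph osd ls)
max_osd=10
existing_ids=$(printf '%s\n' 0 1 2 4 5 8 9)

# ids in 0..max_osd-1 that are not in existing_ids; comm needs both
# inputs sorted the same way, and the final sort -n restores numeric order
comm -23 <(seq 0 $((max_osd - 1)) | sort) <(echo "$existing_ids" | sort) | sort -n
# prints: 3 6 7 (one id per line)
```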
Is there another workaround for this?
The following seems to have fixed this cluster:
1. Fill all gaps with: ceph osd new `uuid`
   (after this step alone, the cluster is still not mountable)
2. Purge all the gap osds: ceph osd purge <id>
I filled/purged a couple hundred gap osds, and now the cluster can be mounted.
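In case it helps anyone else, the fill-then-purge sequence can be scripted. The sketch below only echoes the commands so they can be reviewed before running; `uuidgen` stands in for the `uuid` command used above, and note that a real `ceph osd purge` also needs `--yes-i-really-mean-it`:

```shell
#!/bin/bash
# Sketch of the fill-then-purge workaround; commands are echoed for
# review instead of executed. gap_ids would come from comparing
# `ceph osd getmaxosd` with `ceph osd ls` on the real cluster.
gap_ids="3 6 7"   # example gap ids

# Step 1: fill every gap so the osd ids become contiguous.
for id in $gap_ids; do
    echo "ceph osd new \$(uuidgen)"
done

# Step 2: purge the placeholder osds again.
for id in $gap_ids; do
    echo "ceph osd purge $id --yes-i-really-mean-it"
done
```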
Cheers!
Dan
P.S. The bugzilla is not public:
https://bugzilla.redhat.com/show_bug.cgi?id=1972278
>
> Cheers, dan
>
>
> On Mon, May 3, 2021 at 12:32 PM Ilya Dryomov <idryomov(a)gmail.com> wrote:
> >
> > On Mon, May 3, 2021 at 12:27 PM Magnus Harlander <magnus(a)harlan.de> wrote:
> > >
> > > Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
> > >
> > > ceph osd setmaxosd 10
> > >
> > > Bingo! Mount works again.
> > >
> > > Veeeery strange things are going on here (-:
> > >
> > > Thanx a lot for now!! If I can help to track it down, please let me know.
> >
> > Good to know it helped! I'll think about this some more and probably
> > plan to patch the kernel client to be less stringent and not choke on
> > this sort of misconfiguration.
> >
> > Thanks,
> >
> > Ilya
> > _______________________________________________
> > ceph-users mailing list -- ceph-users(a)ceph.io
> > To unsubscribe send an email to ceph-users-leave(a)ceph.io