netstat -anp | grep LISTEN | grep mgr
Is it bound to 127.0.0.1?
(Also check the other daemons.)
If so, this is another case of
https://tracker.ceph.com/issues/49938
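
For example, a quick loop to check the bind addresses of every ceph
daemon on a node (rough sketch; it assumes the process names are
ceph-mon/ceph-mgr/ceph-mds/ceph-osd, as in a standard nautilus install):

  # list the listening sockets per daemon type; anything bound to
  # 127.0.0.1 instead of the public/cluster address is suspect
  for d in ceph-mon ceph-mgr ceph-mds ceph-osd; do
    echo "== $d =="
    netstat -anp | grep LISTEN | grep "$d"
  done

If a daemon did bind to loopback it will need a restart after the
address problem is fixed; the tracker ticket above has the details.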
-- dan
On Thu, Mar 25, 2021 at 8:34 PM Simon Oosthoek <s.oosthoek@science.ru.nl> wrote:
>
> Hi
>
> I'm in a bit of a panic :-(
>
> Recently we started attempting to configure a radosgw for our ceph
> cluster, which until now was only doing cephfs (and rbd was working as
> well). We were messing about with ceph-ansible, as this was how we
> originally installed the cluster. Anyway, it installed nautilus 14.2.18
> on the radosgw, and I thought it would be good to pull the rest of the
> cluster up to that level as well using our tried and tested ceph
> upgrade script (it basically updates all ceph nodes one by one and
> checks whether ceph is ok again before moving on to the next).
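>
> In pseudo-shell, the per-node loop is roughly like this (simplified
> sketch; the node names, package manager and restart command are
> placeholders, not the real script):
>
>   for node in cephosd01 cephosd02 cephosd03; do
>     # upgrade the ceph packages and restart the daemons on that node
>     ssh "$node" "yum -y update 'ceph*' && systemctl restart ceph.target"
>     # wait for the cluster to report HEALTH_OK again before
>     # touching the next node
>     until ceph health | grep -q HEALTH_OK; do
>       sleep 60
>     done
>   done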
>
> After the 3rd mon/mgr was done, all PGs were unavailable :-(
> Obviously the script is not continuing, but ceph is also broken now...
>
> Deceptively, the message is just: HEALTH_WARN Reduced data
> availability: 5568 pgs inactive
>
> That's all PGs!
>
> As a desperate measure I tried upgrading one ceph OSD node, but that
> broke as well; the osd service on that node gets an interrupt from the
> kernel...
>
> the versions are now like:
> 20:29 [root@cephmon1 ~]# ceph versions
> {
>     "mon": {
>         "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
>     },
>     "mgr": {
>         "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
>     },
>     "osd": {
>         "ceph version 14.2.15 (afdd217ae5fb1ed3f60e16bd62357ca58cc650e5) nautilus (stable)": 156
>     },
>     "mds": {
>         "ceph version 14.2.15 (afdd217ae5fb1ed3f60e16bd62357ca58cc650e5) nautilus (stable)": 2
>     },
>     "overall": {
>         "ceph version 14.2.15 (afdd217ae5fb1ed3f60e16bd62357ca58cc650e5) nautilus (stable)": 158,
>         "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 6
>     }
> }
>
>
> 12 OSDs are down
>
> # ceph -s
>   cluster:
>     id:     b489547c-ba50-4745-a914-23eb78e0e5dc
>     health: HEALTH_WARN
>             Reduced data availability: 5568 pgs inactive
>
>   services:
>     mon: 3 daemons, quorum cephmon3,cephmon1,cephmon2 (age 50m)
>     mgr: cephmon1(active, since 53m), standbys: cephmon3, cephmon2
>     mds: cephfs:1 {0=cephmds2=up:active} 1 up:standby
>     osd: 168 osds: 156 up (since 28m), 156 in (since 18m); 1722 remapped pgs
>
>   data:
>     pools:   12 pools, 5568 pgs
>     objects: 0 objects, 0 B
>     usage:   0 B used, 0 B / 0 B avail
>     pgs:     100.000% pgs unknown
>              5568 unknown
>
>   progress:
>     Rebalancing after osd.103 marked in
>       [..............................]
>
>