[ceph-users] Re: Advice needed: stuck cluster halfway upgraded, comms issues and MON space usage

22 Mar 2021

So, we started the mons and mgr up again, and here are the relevant logs,
including ceph versions. We've also turned off all of the firewalls on
all of the nodes, so we know that there can't be network issues [and,
indeed, all of our management of the OSDs happens via logins from the
service nodes or to each other].

...
  ceph status 

  cluster:
    id:     a1148af2-6eaf-4486-a27e-a05a78c2b378
    health: HEALTH_WARN
            pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set
            1 nearfull osd(s)
            3 pool(s) nearfull
            Reduced data availability: 2048 pgs inactive
            mons cephs01,cephs02,cephs03 are using a lot of disk space

  services:
    mon: 3 daemons, quorum cephs01,cephs02,cephs03 (age 61s)
    mgr: cephs01(active, since 76s)
    osd: 329 osds: 329 up (since 63s), 328 in (since 4d); 466 remapped pgs
         flags pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover

  data:
    pools:   3 pools, 2048 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             2048 unknown

...
  ceph health detail 
HEALTH_WARN pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set; 1 nearfull osd(s); 3 pool(s) nearfull; Reduced data availability: 2048 pgs inactive; mons cephs01,cephs02,cephs03 are using a lot of disk space
OSDMAP_FLAGS pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set
OSD_NEARFULL 1 nearfull osd(s)
    osd.63 is near full
POOL_NEARFULL 3 pool(s) nearfull
    pool 'dteam' is nearfull
    pool 'atlas' is nearfull
    pool 'atlas-localgroup' is nearfull
PG_AVAILABILITY Reduced data availability: 2048 pgs inactive
    pg 13.1ef is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f0 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f1 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f2 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f3 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f4 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f5 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f6 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f7 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f8 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1f9 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1fa is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1fb is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1fc is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1fd is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1fe is stuck inactive for 89.322981, current state unknown, last acting []
    pg 13.1ff is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1ec is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f0 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f1 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f2 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f3 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f4 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f5 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f6 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f7 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f8 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f9 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1fa is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1fb is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1fc is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1fd is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1fe is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1ff is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1ed is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f0 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f1 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f2 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f3 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f4 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f5 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f6 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f7 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f8 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1f9 is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1fa is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1fb is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1fc is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1fd is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1fe is stuck inactive for 89.322981, current state unknown, last acting []
    pg 15.1ff is stuck inactive for 89.322981, current state unknown, last acting []
MON_DISK_BIG mons cephs01,cephs02,cephs03 are using a lot of disk space
    mon.cephs01 is 96 GiB >= mon_data_size_warn (15 GiB)
    mon.cephs02 is 96 GiB >= mon_data_size_warn (15 GiB)
    mon.cephs03 is 96 GiB >= mon_data_size_warn (15 GiB)
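
For anyone wanting to condense a paste like the one above, here is a minimal Python sketch that counts the listed stuck PGs per pool id and compares each mon store size with the warning threshold. The function name `summarize_health` and the regular expressions are illustrative assumptions based only on the line shapes in this output, not part of any Ceph tooling. The patterns key on the start of each line, so they should match whether or not a mail client has wrapped the line.

```python
import re
from collections import Counter

# Patterns written against the two kinds of detail lines in the paste above.
PG_RE = re.compile(r"pg (\d+)\.([0-9a-f]+) is stuck inactive")
MON_RE = re.compile(r"mon\.(\S+) is (\d+) GiB >= mon_data_size_warn \((\d+) GiB\)")


def summarize_health(text):
    """Return (stuck PG count per pool id, {mon: (size_gib, warn_gib)})."""
    stuck = Counter()
    mons = {}
    for line in text.splitlines():
        m = PG_RE.search(line)
        if m:
            stuck[int(m.group(1))] += 1
            continue
        m = MON_RE.search(line)
        if m:
            mons[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return dict(stuck), mons


# A few lines copied from the health detail output, as sample input.
sample = """\
    pg 13.1ef is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1ec is stuck inactive for 89.322981, current state unknown, last acting []
    pg 14.1f0 is stuck inactive for 89.322981, current state unknown, last acting []
    mon.cephs01 is 96 GiB >= mon_data_size_warn (15 GiB)
"""

if __name__ == "__main__":
    stuck, mons = summarize_health(sample)
    print(stuck)
    for name, (size, warn) in mons.items():
        print(f"{name}: {size} GiB, {size / warn:.1f}x the warning threshold")
```

On the sample this reports one listed PG in pool 13, two in pool 14, and a mon store at 6.4x the 15 GiB threshold.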

...
  ceph versions 
{
    "mon": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)": 1,
        "ceph version 14.2.15 (afdd217ae5fb1ed3f60e16bd62357ca58cc650e5) nautilus (stable)": 188,
        "ceph version 14.2.16 (762032d6f509d5e7ee7dc008d80fe9c87086603c) nautilus (stable)": 18,
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 122
    },
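
To put a number on how far through the OSD upgrade the cluster is, the "osd" section above can be tallied with a short Python sketch. The dictionary is copied by hand from the output above; the helper name `summarize` and the choice of 14.2.18 as the target (the version the mons and mgr report) are assumptions made for illustration.

```python
# Counts copied from the "osd" section of the `ceph versions` output above.
osd_versions = {
    "ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)": 1,
    "ceph version 14.2.15 (afdd217ae5fb1ed3f60e16bd62357ca58cc650e5) nautilus (stable)": 188,
    "ceph version 14.2.16 (762032d6f509d5e7ee7dc008d80fe9c87086603c) nautilus (stable)": 18,
    "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 122,
}


def summarize(versions, target):
    """Return (count per short version, total daemons, daemons not on target)."""
    by_short = {}
    for full, count in versions.items():
        # The key looks like "ceph version 14.2.18 (<hash>) nautilus (stable)",
        # so the third whitespace-separated token is the short version.
        short = full.split()[2]
        by_short[short] = by_short.get(short, 0) + count
    total = sum(by_short.values())
    behind = total - by_short.get(target, 0)
    return by_short, total, behind


by_short, total, behind = summarize(osd_versions, "14.2.18")
print(by_short)
print(f"{behind} of {total} OSDs are not yet on 14.2.18")
```

With the counts above this gives 207 of 329 OSDs still on an older point release, which matches the 329 OSDs reported by ceph status.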

...
As a note, the log where the mgr explodes (which precipitated all of this)
definitely shows the problem occurring on the 12th [when 14.2.17 dropped],
but things didn't "break" until we tried upgrading OSDs to 14.2.18...

Sam

On Mon, 22 Mar 2021 at 12:20, Sam Skipsey <aoanla(a)gmail.com> wrote:

...
  Hi Dan:

 Thanks for the reply - at present, our mons and mgrs are off [because of
 the unsustainable nature of the filesystem usage]. We'll try putting them
 on again for long enough to get "ceph status" out of them, but it may not
 tell us much, because the mgr was unable to actually talk to anything, or
 reply, at that point.

 (And thanks for the link to the bug tracker - I guess this mismatch of
 expectations is why the devs are so keen to move to containerised
 deployments where there is no co-location of different types of server, as
 it means they don't need to worry as much about the assumptions about when
 it's okay to restart a service on package update. Disappointing that it
 seems stale after 2 years...)

 Sam

 On Mon, 22 Mar 2021 at 12:11, Dan van der Ster <dan(a)vanderster.com> wrote:

  Hi Sam,

 The daemons restart (for *some* releases) because of this:
 https://tracker.ceph.com/issues/21672
 In short, if the selinux module changes, and if you have selinux
 enabled, then midway through yum update, there will be a systemctl
 restart ceph.target issued.

 For the rest -- I think you should focus on getting the PGs all
 active+clean as soon as possible, because the degraded and remapped
 states are what leads to mon / osdmap growth.
 This kind of scenario is why we wrote this tool:

 https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-rema…
 It will use pg-upmap-items to force the PGs to the OSDs where they are
 currently residing.

 But there is some clarification needed before you go ahead with that.
 Could you share `ceph status`, `ceph health detail`?
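
As a rough illustration of the pg-upmap-items idea described above (and not a reproduction of the linked script), the Python sketch below builds `ceph osd pg-upmap-items` commands that map each PG from the OSDs in its "up" set to the OSDs in its "acting" set, so the data is treated as already being in the right place. The function name `upmap_commands`, the input layout, and the OSD ids in the example are all hypothetical; pairing the two sets position by position is a simplifying assumption, and the sketch only prints commands rather than running anything.

```python
def upmap_commands(pgs):
    """Build pg-upmap-items commands mapping each PG's up OSDs to its acting OSDs.

    `pgs` maps a PG id string to an (up, acting) pair of OSD id lists.
    Sets are compared position by position, which is a simplification.
    """
    cmds = []
    for pgid, (up, acting) in sorted(pgs.items()):
        # Keep only the positions where the up and acting OSDs differ.
        pairs = [(u, a) for u, a in zip(up, acting) if u != a]
        if not pairs:
            continue  # PG is already where CRUSH wants it
        args = " ".join(f"{u} {a}" for u, a in pairs)
        cmds.append(f"ceph osd pg-upmap-items {pgid} {args}")
    return cmds


# Hypothetical example data: one remapped PG and one that needs nothing.
example = {
    "13.1ef": ([5, 12, 40], [5, 63, 40]),
    "13.1f0": ([1, 2, 3], [1, 2, 3]),
}

for cmd in upmap_commands(example):
    print(cmd)
```

Note that this needs known up and acting sets as input; in the health output from this cluster the PGs report "last acting []", so there would be nothing to feed it until the PG states are known again.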

 Cheers, Dan

 On Mon, Mar 22, 2021 at 12:05 PM Sam Skipsey <aoanla(a)gmail.com> wrote:

 Hi everyone:

 I posted to the list on Friday morning (UK time), but apparently my email
 is still in moderation (I have an email from the list bot telling me that
 it's held for moderation but no updates).

 Since this is a bit urgent - we have ~3PB of storage offline - I'm posting
 again.

 To save retyping the whole thing, I will direct you to a copy of the email
 I wrote on Friday:

 http://aoanla.pythonanywhere.com/Logs/EmailToCephUsers.txt

 (Since that was sent, we did successfully add big SSDs to the MON hosts so
 they don't fill up their disks with store.dbs).

 I would appreciate any advice - assuming this also doesn't get stuck in
 moderation queues.

 --
 Sam Skipsey (he/him, they/them)

