How to fix 1 pg marked as stale+active+clean
pg 30.4 is stuck stale for 175342.419261, current state
stale+active+clean, last acting [31]
I had just one OSD go down (31). Why is Ceph not auto-healing in this
'simple' case?
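A minimal diagnostic sketch of how to check this, assuming the c01 host
prompt used later in the thread; the pool id 30 is read off the pg id
prefix:

[@c01 ~]# ceph pg dump_stuck stale
[@c01 ~]# ceph osd pool ls detail | grep 'pool 30 '

If the pool turns out to be replicated with size 1 and its only OSD is
gone, there is no surviving copy for Ceph to heal from.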
-----Original Message-----
To: ceph-users
Subject: [ceph-users] How to fix 1 pg stale+active+clean
How to fix 1 pg marked as stale+active+clean
pg 30.4 is stuck stale for 175342.419261, current state
stale+active+clean, last acting [31]
This is an edge case and probably not something that would be done in production, so I
suspect the answer is “lol, no,” but here goes:
I have three nodes running Nautilus courtesy of Proxmox. One of them is a self-built Ryzen
5 3600 system, and the other two are salvaged i5 Skylake desktops that I have pressed into
service as virtualization and storage nodes. I want to replace the i5 systems with
machines that are identical to the Ryzen 5 system. What I want to know is whether it’s
possible to just take the devices that are currently hosting the OSDs, together with the
hard drive that is hosting Proxmox, move them into the new machine, power up and have
everything working. I don’t *think* the device names should change. What does everyone
think about this possibly insane plan? (Yes, I will back up all my important data before
trying this.)
Thanks,
J
As far as Ceph is concerned, as long as there are no separate
journal/blockdb/wal devices, you absolutely can transfer OSDs between
hosts. If there are separate journal/blockdb/wal devices, you can
still do it, provided they move with the OSDs.
With Nautilus and up, make sure the OSD bootstrap key is on the new
host, and run 'ceph-volume lvm activate --all'. It will scan the
logical volumes, identify the Ceph OSDs, and start them on the new host.
There are no other "gotchas" that I remember.
I cannot speak to Proxmox, however.
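A minimal sketch of that sequence on the new host, assuming the default
bootstrap-osd keyring location and borrowing c01 (a host name that
appears elsewhere in this thread) as the source host; adjust names and
paths for your deployment:

# copy the bootstrap-osd keyring from a surviving cluster host
scp c01:/var/lib/ceph/bootstrap-osd/ceph.keyring /var/lib/ceph/bootstrap-osd/
# detect the OSD logical volumes by their LVM tags and start the daemons
ceph-volume lvm activate --all

Afterwards 'ceph osd tree' should show the transplanted OSDs up under
the new host.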
--
Adam
On Sat, Apr 11, 2020 at 12:45 PM Jarett DeAngelis <jarett(a)reticulum.us> wrote:
>
> This is an edge case and probably not something that would be done in production, so
I suspect the answer is “lol, no,” but here goes:
>
> I have three nodes running Nautilus courtesy of Proxmox. One of them is a self-built
Ryzen 5 3600 system, and the other two are salvaged i5 Skylake desktops that I have
pressed into service as virtualization and storage nodes. I want to replace the i5 systems
with machines that are identical to the Ryzen 5 system. What I want to know is whether
it’s possible to just take the devices that are currently hosting the OSDs, together with
the hard drive that is hosting Proxmox, move them into the new machine, power up and have
everything working. I don’t *think* the device names should change. What does everyone
think about this possibly insane plan? (Yes, I will back up all my important data before
trying this.)
>
> Thanks,
> J
The cause of the stale pg is fs_data.r1, a 1-replica pool. It should be
empty, but ceph df shows 128 KiB used.
I have already marked the OSD as lost and removed it from the CRUSH
map.
PG_AVAILABILITY Reduced data availability: 1 pg stale
pg 30.4 is stuck stale for 407878.113092, current state
stale+active+clean, last acting [31]
[@c01 ~]# ceph pg map 30.4
osdmap e72814 pg 30.4 (30.4) -> up [29] acting [29]
[@c01 ~]# ceph pg 30.4 query
Error ENOENT: i don't have pgid 30.4
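The two outputs are consistent: the osdmap says where pg 30.4 should map
now (osd.29), but osd.29 never received a copy, because the only replica
lived on the lost osd.31. With a 1-replica pool there is nothing left to
recover from, which is why no auto-healing happens. If the pool's
contents are genuinely expendable, one possible (and destructive) way
out is to recreate the pg as empty; a hedged sketch, noting that recent
releases guard the command behind --yes-i-really-mean-it:

[@c01 ~]# ceph osd force-create-pg 30.4 --yes-i-really-mean-it

This discards whatever pg 30.4 held and recreates it empty on the
current acting set, so it is only appropriate when the data is already
written off.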
-----Original Message-----
To: ceph-users
Subject: [ceph-users] Re: How to fix 1 pg stale+active+clean
I had just one OSD go down (31). Why is Ceph not auto-healing in this
'simple' case?
-----Original Message-----
To: ceph-users
Subject: [ceph-users] How to fix 1 pg stale+active+clean
How to fix 1 pg marked as stale+active+clean
pg 30.4 is stuck stale for 175342.419261, current state
stale+active+clean, last acting [31]
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io