W dniu 2020-11-04 01:18, m.sliwinski(a)lh.pl napisał(a):
Just in case - result of ceph report is here:
http://paste.ubuntu.com/p/D7yfr3pzr4/
> Hi
>
> We have a weird issue with our ceph cluster - almost all PGs assigned
> to one specific pool became stuck, locking out all operations without
> reporting any errors.
> Story:
> We have 3 different pools, hdd-backed, ssd-backed and nvme-backed.
> The ssd pool worked fine for a few months.
> Today one of the hosts assigned to the nvme pool restarted, triggering
> recovery in that pool. It went fast and the cluster returned to OK state.
> During these events, or shortly after them, the ssd pool became
> unresponsive. It was impossible to either read from or write to it.
> We decided to slowly restart first the OSDs assigned to it, then, as that
> didn't help, all the mons - without breaking quorum, of course.
> At this moment both the nvme and hdd pools are working fine; the ssd one
> is stuck in recovery.
> All OSDs in that ssd pool use a large amount of CPU and are exchanging
> approx. 1 Mpps per OSD server between each other.
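
For anyone trying to reproduce the diagnosis, a sketch of the standard ceph CLI checks one might run at this point (the pool name "ssd" and the PG id are placeholders, not taken from the report above):

```shell
# Hypothetical diagnostics for a stuck pool; "ssd" and "2.1f" are example names.
ceph health detail            # lists any PGs the cluster itself reports as stuck
ceph pg dump_stuck unclean    # PGs stuck in a non-active+clean state
ceph osd pool stats ssd       # per-pool client and recovery I/O rates
# Query one affected PG directly to see its peering/recovery state
# (replace 2.1f with a real PG id from the dump above):
ceph pg 2.1f query
```

These would show whether the PGs are stuck in peering, whether recovery is actually making progress, and which OSDs each stuck PG maps to.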