I have 104 PGs that have been stuck in the unknown state for a long time.
[root@node-1 /]# ceph -s
  cluster:
    id:     653c6c1a-607e-4a62-bb92-dfe2f0d7afb6
    health: HEALTH_ERR
            1 osds down
            Reduced data availability: 104 pgs inactive
            24 slow requests are blocked > 32 sec. Implicated osds 0,1,2,8,9,10
            14 stuck requests are blocked > 4096 sec. Implicated osds 5,6

  services:
    mon: 3 daemons, quorum node-1,node-2,node-3
    mgr: node-1(active), standbys: node-2, node-3
    osd: 12 osds: 11 up, 12 in
         flags nodeep-scrub
    rbd-mirror: 1 daemon active

  data:
    pools:   7 pools, 360 pgs
    objects: 1.80k objects, 3.91GiB
    usage:   17.6GiB used, 7.96TiB / 7.98TiB avail
    pgs:     28.889% pgs unknown
             256 active+clean
             104 unknown

  io:
    client:   1.56MiB/s wr, 0op/s rd, 83op/s wr

[root@node-1 /]# ceph health detail
HEALTH_ERR Reduced data availability: 104 pgs inactive; 30 slow requests are blocked > 32 sec. Implicated osds 0,1,2,4,8,9,10; 14 stuck requests are blocked > 4096 sec. Implicated osds 5,6
PG_AVAILABILITY Reduced data availability: 104 pgs inactive
    pg 1.0 is stuck inactive for 2857.069686, current state unknown, last acting []
    pg 1.1 is stuck inactive for 2857.069686, current state unknown, last acting []
    pg 1.2 is stuck inactive for 2857.069686, current state unknown, last acting []
    pg 1.3 is stuck inactive for 2857.069686, current state unknown, last acting []
    pg 1.4 is stuck inactive for 2857.069686, current state unknown, last acting []
    pg 1.5 is stuck inactive for 2857.069686, current state unknown, last acting []
    pg 1.6 is stuck inactive for 2857.069686, current state unknown, last acting []
    pg 1.7 is stuck inactive for 2857.069686, current state unknown, last acting []
    pg 2.0 is stuck inactive for 2857.069686, current state unknown, last acting []
......
[root@node-1 /]# ceph pg dump_stuck inactive
ok
PG_STAT STATE   UP UP_PRIMARY ACTING ACTING_PRIMARY
3.1d    unknown []         -1     []             -1
3.1c    unknown []         -1     []             -1
3.1b    unknown []         -1     []             -1
3.1a    unknown []         -1     []             -1
3.19    unknown []         -1     []             -1
......
My pool size is 3.
[root@node-1 /]# ceph pg 3.1d query
Error ENOENT: i don't have pgid 3.1d
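
In case it helps narrow things down, here is a minimal sketch for grouping the stuck PGs by pool. It relies only on the fact that a PG id has the form <pool-id>.<hex-seed> (so 3.1d belongs to pool 3) and reads the `ceph pg dump_stuck inactive` listing shown above from stdin. The function name count_stuck_by_pool is a hypothetical helper, not a ceph command.

```shell
#!/bin/sh
# Tally stuck PGs per pool from `ceph pg dump_stuck inactive` output on stdin.
# count_stuck_by_pool is a hypothetical helper name, not part of ceph.
count_stuck_by_pool() {
    # Only lines whose first field looks like a PG id (<pool-id>.<hex-seed>)
    # are counted; this skips the "ok" line, the PG_STAT header and the
    # "......" elision marker.
    awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ {
             split($1, id, ".")      # id[1] is the pool id
             count[id[1]]++
         }
         END {
             for (pool in count) print "pool " pool ": " count[pool]
         }' | sort -k2,2n            # order the result by pool id
}

# Usage on a live cluster (not run here):
#   ceph pg dump_stuck inactive 2>/dev/null | count_stuck_by_pool
```

Comparing the resulting pool ids against `ceph osd pool ls detail` should show whether the unknown PGs are confined to a few pools or spread across all of them.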