Hi cephists,
We have a 10-node cluster running Nautilus 14.2.9.
All objects are on an EC pool. We have the mgr balancer plugin in upmap mode
doing its rebalancing:
```
health: HEALTH_OK
pgs:
  1985 active+clean
  190  active+remapped+backfilling
  65   active+remapped+backfill_wait
io:
  client:   0 B/s wr, 0 op/s rd, 0 op/s wr
  recovery: 770 MiB/s, 463 objects/s
```
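For reference, the balancer was enabled along these lines (standard Nautilus
balancer commands; exact invocation from memory, so treat it as a sketch):
```
ceph balancer mode upmap    # move PGs via pg-upmap-items entries instead of reweighting
ceph balancer on            # let the mgr compute and apply plans automatically
ceph balancer status        # reports the active mode and whether a plan is executing
```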
We restarted osd.0 on one of our OSD nodes, and this was the status
immediately afterwards:
```
health: HEALTH_WARN
1 osds down
Degraded data redundancy: 4531479/531067647 objects degraded (0.853%), 109 pgs degraded
```
Then the OSD became UP again:
```
health: HEALTH_WARN
Degraded data redundancy: 4963207/531067545 objects degraded (0.935%), 120 pgs degraded
```
And after a minute or so it settled on:
```
health: HEALTH_WARN
Degraded data redundancy: 295515/531067347 objects degraded (0.056%), 10 pgs degraded, 10 pgs undersized
```
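If the per-PG detail would help, this is what we would pull it with (standard
Nautilus commands):
```
ceph health detail     # names the degraded/undersized PGs behind the HEALTH_WARN
ceph pg ls degraded    # per-PG state plus up/acting sets and object counts
ceph pg ls undersized
```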
The upmap balancer was running during the osd.0 restart; the restart itself
was successful, without any issues.
This left us wondering: how could a simple OSD restart cause
degraded PGs? Could this be related to the upmap balancer running?
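A related question: should we have guarded the restart with cluster flags? We
are thinking of something along these lines (a hypothetical sequence, assuming
the stock ceph-osd systemd unit; we have not verified it avoids the degraded
objects):
```
ceph osd set noout              # don't mark the OSD out while it is down
ceph osd set norebalance        # pause rebalancing of remapped PGs
systemctl restart ceph-osd@0    # restart the daemon on its host
# wait for the OSD to rejoin and PGs to settle, then:
ceph osd unset norebalance
ceph osd unset noout
```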
Thanks!
--
Vyteni