Exactly my thought too: the node in question has a most-full outlier, so when that OSD is
out, the most-full is … less full.
ceph osd df | sort -nk8 | tail
Or using:
ceph osd df | egrep -v 'WEIGHT|TOTAL|MIN|ID|nan' \
    | sed -e 's/ssd//' -e 's/hdd//' \
    | awk '{print 1, $7}' \
    | histogram.py -a -b 200 -m 0 -x 100 -p --no-mvsd
(that commandline could be done more elegantly)
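One way to tidy it up is to drop histogram.py and bucket the utilization column with awk alone. A sketch, assuming the %USE values have already been extracted (the sample numbers below stand in for real `ceph osd df` output; the column position varies between releases, so check yours):

```shell
# Stand-in for: ceph osd df | awk '/^ *[0-9]/ {print $7}'  (%USE column)
printf '%s\n' 61.2 64.8 59.9 85.4 62.1 |
awk '{ b = int($1/10)*10; count[b]++ }
     END { for (b in count) printf "%3d-%3d%%: %d\n", b, b+9, count[b] }' |
sort -n
```

That prints one line per 10% bucket with the number of OSDs in it, which is usually enough to spot a single outlier OSD.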
— Anthony
I have seen this when there is one OSD on the node being rebooted that
is using more space than the others. Max avail for the pool is based
on the fullest OSD as far as I know.
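A toy sketch of that logic (an approximation, not Ceph's exact formula; the real calculation also honors the full ratio and the CRUSH rule): the pool can only grow until its most-loaded OSD fills, so one outlier caps MAX AVAIL for the whole pool. The weights and free-space figures below are made up for illustration:

```shell
# Hypothetical per-OSD data: "crush_weight free_TB". The OSD with the
# least free space per unit weight caps the projection for the pool.
printf '%s\n' '1.0 2.0' '1.0 2.1' '1.0 0.9' |
awk -v replicas=3 '
  { sumw += $1; ratio = $2 / $1
    if (NR == 1 || ratio < min) min = ratio }
  END { printf "MAX AVAIL ~= %.2f TB\n", min * sumw / replicas }'
```

Drop the 0.9-TB outlier line and the projection rises from 0.90 TB to 1.33 TB, which is the same effect you see while that OSD's node is rebooting.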
On Sun, Jun 14, 2020 at 4:29 PM KervyN <bb(a)kervyn.de> wrote:
Does anyone have any ideas on this?
The mgr nodes are separate, pg_autoscaler is also not active (I don't know what the impact
would be on a 1 PB cluster), and it also happens when I turn off an OSD service on any node.
It's the latest Ceph Nautilus.
Cheers
- Boris
On 28.05.2020 at 23:42, Boris Behrens
<bb(a)kervyn.de> wrote:
Dear people on this mailing list,
I've got the "problem" that our MAX AVAIL value increases by about
5-10 TB when I reboot a whole OSD node. After the reboot the value
goes back to normal.
I would love to know WHY.
Under normal circumstances I would ignore this behavior, but because I
am very new to the whole ceph software I would like to know why stuff
like this happens.
What I read is that this value is calculated from the most-filled OSD.
I've set noout and norebalance while the node is offline and I unset
both values after the reboot.
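For reference, the flag handling described above looks like this (standard Nautilus-era commands; adjust to your own maintenance procedure):

```shell
# Before taking the node down: stop the cluster from marking its OSDs
# out and from rebalancing while they are offline.
ceph osd set noout
ceph osd set norebalance

# ... reboot the node, wait for its OSDs to rejoin ...

# Once the node is back and PGs are active again, clear both flags.
ceph osd unset noout
ceph osd unset norebalance
```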
We are currently on nautilus.
Cheers and thanks in advance
Boris
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io