On Wed, Sep 2, 2020 at 11:38 PM Dan van der Ster <dan(a)vanderster.com> wrote:
Hi Joao & Kefu,
I have a question about
https://github.com/ceph/ceph/commit/e62269c8929e414284ad0773c4a3c82e43735e4e
which was backported and released into v14.2.10.
My understanding is that the intention was to ignore the osd_epoch of
down osds, so that we can trim osdmaps up to the min of (a) the lowest
per-pool clean epoch and (b) the lowest clean epoch of all up osds.
(See [1] and [2] for motivation).
Before this commit, get_min_last_epoch_clean would loop over *all* osd
epochs and lower the floor if needed.
Now after the commit we only check the epochs of the *out* osds.
Isn't that logic inverted? Shouldn't we be looping over all the *in* osds? [3]
This commit has passed by many eyes already so I must be confused...
Please help :-/
Hi Dan, thanks for pointing this out! Indeed! I created
https://tracker.ceph.com/issues/47290 to track this issue, and I will
create a fix based on your one-liner change.
(I ask because we already have evidence running 14.2.11 that maps are
still not trimmed when we mark out a broken osd -- we had to restart
the mon leader to provoke the trimming).
Thanks,
Dan
[1]
https://tracker.ceph.com/issues/37875#note-6
[2]
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/6KSOLVLWR6HZOVUY7U…
[3]
@@ -2251,7 +2251,7 @@ epoch_t OSDMonitor::get_min_last_epoch_clean() const
   // don't trim past the oldest reported osd epoch
   for (auto [osd, epoch] : osd_epochs) {
     if (epoch < floor &&
-        osdmap.is_out(osd)) {
+        osdmap.is_in(osd)) {
       floor = epoch;
     }
   }
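
To make the effect of the one-line change above concrete, here is a minimal standalone sketch. It is not the actual OSDMonitor code: ToyOSDMap, floor_checking_out and floor_checking_in are hypothetical names made up for illustration, and the toy map only records which OSDs are marked "in". The two loops differ only in the predicate, mirroring the "-" and "+" lines of the diff.

```cpp
#include <cstdint>
#include <map>
#include <set>

using epoch_t = uint32_t;

// Hypothetical stand-in for OSDMap: it only tracks which OSDs are "in".
struct ToyOSDMap {
  std::set<int> in_osds;
  bool is_in(int osd) const { return in_osds.count(osd) > 0; }
  bool is_out(int osd) const { return !is_in(osd); }
};

// Predicate as in the "-" line: only *out* OSDs can lower the floor.
epoch_t floor_checking_out(epoch_t floor,
                           const std::map<int, epoch_t>& osd_epochs,
                           const ToyOSDMap& osdmap) {
  for (const auto& [osd, epoch] : osd_epochs) {
    if (epoch < floor && osdmap.is_out(osd)) {
      floor = epoch;
    }
  }
  return floor;
}

// Predicate as in the "+" line: only *in* OSDs can lower the floor.
epoch_t floor_checking_in(epoch_t floor,
                          const std::map<int, epoch_t>& osd_epochs,
                          const ToyOSDMap& osdmap) {
  for (const auto& [osd, epoch] : osd_epochs) {
    if (epoch < floor && osdmap.is_in(osd)) {
      floor = epoch;
    }
  }
  return floor;
}
```

With osds 0 and 1 marked in at epoch 100, osd 2 marked out at a stale epoch 40, and a starting floor of 100, floor_checking_out returns 40 (the out osd still holds back trimming), while floor_checking_in returns 100.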
--
Regards
Kefu Chai