Hi,
This problem also occurred in my customer's environment, so I would like to get it
fixed.
To make the discussion easier, I will restate the problem and the current solution.
(Mykola has already described the idea behind the solution. I apologize if anything below
differs from what Mykola intended.)
In master:
Problem: The primary OSD crashes in a situation where it does not need to. (I think this is a bug.)
Solution: Remove the ceph_assert shown in the diff below.
---------------------------------------------------------------
diff --git a/src/osd/PrimaryLogPG.cc b/src/osd/PrimaryLogPG.cc
index 626e8ccefb..12956424bd 100644
--- a/src/osd/PrimaryLogPG.cc
+++ b/src/osd/PrimaryLogPG.cc
@@ -13079,7 +13079,6 @@ void PrimaryLogPG::_clear_recovery_state()
   last_backfill_started = hobject_t();
   set<hobject_t>::iterator i = backfills_in_flight.begin();
   while (i != backfills_in_flight.end()) {
-    ceph_assert(recovering.count(*i));
     backfills_in_flight.erase(i++);
   }
---------------------------------------------------------------
The reasoning is as follows.
- The code above assumes that every object in backfills_in_flight is also in
recovering.
- However, in the current implementation of on_failed_pull[1], in the non-primary OSD case
an unfound object remains only in backfills_in_flight, because it is unconditionally
removed from recovering[2].
Therefore, this ceph_assert no longer matches the current implementation of
on_failed_pull.
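To illustrate the mismatch, here is a minimal standalone sketch. It is not Ceph code:
ToyPG and its member functions are hypothetical names I made up for this mail, and
std::string stands in for hobject_t. It only models the two sets as described above.

```cpp
// Simplified model, NOT Ceph code. ToyPG is a hypothetical stand-in for the
// relevant PrimaryLogPG state; std::string stands in for hobject_t.
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

struct ToyPG {
  std::set<std::string> recovering;
  std::set<std::string> backfills_in_flight;

  // A backfill operation starts: the object is tracked in both sets.
  void start_backfill(const std::string& oid) {
    recovering.insert(oid);
    backfills_in_flight.insert(oid);
  }

  // Models the non-primary case of on_failed_pull as described in this mail
  // (an assumption of the sketch): the object is always dropped from
  // recovering but stays in backfills_in_flight.
  void failed_pull_non_primary(const std::string& oid) {
    recovering.erase(oid);
  }

  // Counts the entries that would trip ceph_assert(recovering.count(*i)).
  std::size_t assert_violations() const {
    std::size_t n = 0;
    for (const auto& oid : backfills_in_flight) {
      if (recovering.count(oid) == 0) {
        ++n;
      }
    }
    return n;
  }

  // Models the loop in _clear_recovery_state with the assert removed.
  void clear_backfills_in_flight() {
    auto i = backfills_in_flight.begin();
    while (i != backfills_in_flight.end()) {
      backfills_in_flight.erase(i++);
    }
    assert(backfills_in_flight.empty());  // postcondition of the loop
  }
};
```

In this model, calling start_backfill("obj1") and then failed_pull_non_primary("obj1")
makes assert_violations() return 1. That is the state in which the assert in the real
code would abort the OSD, while the loop without the assert simply empties the set.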
I think this ceph_assert should be removed, but I would like to hear opinions from the
community.
[1]:
https://github.com/ceph/ceph/blob/813933f81e3d682a0b1ae6dd906e38e78c4859a4/…
[2]:
https://github.com/ceph/ceph/blob/813933f81e3d682a0b1ae6dd906e38e78c4859a4/…
In nautilus:
Problem: The backfill_unfound state is cleared when the OSD is restarted. (This is also a
bug.)
As a result, a user may mistakenly believe the problem has been resolved, which can lead to
unexpected trouble.
Solution: Keep unfound objects in backfills_in_flight, as on_failed_pull does, in the
non-primary OSD case.
The following commit[3] exists, but its changes are wide-ranging, so I think only the
minimum change needed to fix the problem should be committed directly to nautilus.
[3]:
https://github.com/ceph/ceph/commit/8a8947d2a32d6390cb17099398e7f2212660c9a1
In addition, once this problem is fixed, the primary OSD crash described above will start
to occur, so the master commit described above also needs to be backported.
I plan to send PRs next week, so please let me know before then if the community has any
opinions.
--
Jin