On 10 Apr 2024, at 01:00, Eugen Block <eblock(a)nde.ag> wrote:
I appreciate your message, it really sounds tough (9 months, really?!). But thanks for
the reassurance :-)
Yes, the whole "make this project great again" effort took 16 months, I think. This is
my work.
The first problem, after reaching 1M objects in a PG, was deletion [1]. It's simply
impossible to delete objects for a 'stray' PG.
The second was that the code that handles nearfull & backfillfull just doesn't work for
such an OSD [2], because the code uses the DATA field (the objects) instead of the RAW
field (DATA + the RocksDB database) for its computations.
The third was a minor, but WTF, statistics metric issue [3].
And last but not least (and still present in master): when a lock on an object is
acquired, this crashes the replica OSDs in the acting set if the object is absent on the
primary OSD [4]. This may ruin client IO until the OSDs restart and recover.
As of now, not all of the collection_list fixes have been merged [5], but since 14.2.22
things are much better than before...
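
To illustrate the second problem (DATA vs. RAW in the nearfull/backfillfull check), here is
a minimal sketch with made-up numbers. It is not Ceph source code: the function name and
the sizes are hypothetical, and only the 0.85 / 0.90 thresholds correspond to the default
nearfull_ratio and backfillfull_ratio. It shows how an OSD with a large RocksDB database
can look fine when only DATA is counted, yet already be past backfillfull when RAW is used.

```python
# Illustrative only: hypothetical numbers and helper, not taken from Ceph.
NEARFULL_RATIO = 0.85      # default nearfull_ratio
BACKFILLFULL_RATIO = 0.90  # default backfillfull_ratio

GIB = 1024 ** 3


def fullness_state(used_bytes: int, total_bytes: int) -> str:
    """Classify an OSD by comparing its used/total ratio to the thresholds."""
    ratio = used_bytes / total_bytes
    if ratio >= BACKFILLFULL_RATIO:
        return "backfillfull"
    if ratio >= NEARFULL_RATIO:
        return "nearfull"
    return "ok"


total = 1000 * GIB       # OSD capacity (assumed)
data = 800 * GIB         # DATA: the objects only (assumed)
rocksdb = 120 * GIB      # RocksDB database, large because of the many objects (assumed)
raw = data + rocksdb     # RAW as described above: DATA + RocksDB database

state_from_data = fullness_state(data, total)  # ratio 0.80
state_from_raw = fullness_state(raw, total)    # ratio 0.92

print(f"check based on DATA: {state_from_data}")
print(f"check based on RAW:  {state_from_raw}")
```

With these assumed sizes the DATA-based check reports the OSD as fine while the RAW-based
check reports backfillfull, which is the kind of mismatch described in [2].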
They don’t have any other options so we’ll have to
start that process anyway, probably tomorrow. We’ll see how it goes…
Yes, you just have to start, and then we’ll see
Thanks,
k
[1] https://tracker.ceph.com/issues/47044 +
    https://tracker.ceph.com/issues/45765 ->
    https://tracker.ceph.com/issues/50466
[2] https://tracker.ceph.com/issues/50533
[3] https://tracker.ceph.com/issues/52512
[4] https://tracker.ceph.com/issues/52513
[5] https://tracker.ceph.com/issues/58274