FYI: I have radosgw-admin gc list --include-all running every three
minutes for a day, but the list has stayed empty. I haven't seen any
further data loss either, though. I will keep it running until the next
time I see an object vanish.
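
In case anyone else wants to collect the same data, here is a minimal
sketch of what a cron job could call. Only the radosgw-admin command is
taken from this thread; the function name, the output directory, and the
file naming are assumptions made for illustration.

```python
import datetime
import pathlib
import subprocess

# The command discussed in this thread; everything else is illustrative.
GC_LIST_CMD = ["radosgw-admin", "gc", "list", "--include-all"]


def snapshot_gc_list(out_dir, cmd=GC_LIST_CMD):
    """Run cmd once and save its stdout to a timestamped file in out_dir.

    Returns the path of the file that was written. Raises
    subprocess.CalledProcessError if the command exits non-zero.
    """
    out_dir = pathlib.Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # UTC timestamp so snapshots sort chronologically by file name.
    now = datetime.datetime.now(datetime.timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")

    result = subprocess.run(cmd, capture_output=True, text=True, check=True)

    out_file = out_dir / f"gc-list-{stamp}.txt"
    out_file.write_text(result.stdout)
    return out_file
```

Calling snapshot_gc_list() with some log directory (hypothetical, pick
your own) from a script that cron starts every three minutes leaves one
file per run, so a non-empty list can be matched against the time an
object vanished.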
On 17/11/2020 09:22, Janek Bevendorff wrote:
I have run radosgw-admin gc list (without --include-all) a few times
already, but the list was always empty. I will create a cron job
running it every few minutes and writing out the results.
On 17/11/2020 02:22, Eric Ivancich wrote:
> I’m wondering if anyone experiencing this bug would mind running
> `radosgw-admin gc list --include-all` on a schedule and saving the
> results. I’d like to know whether these tail objects are getting
> removed by the gc process. If we find that that’s the case then
> there’s the issue of how they got on the gc list.
>
> Eric
>
>
>> On Nov 16, 2020, at 3:48 AM, Janek Bevendorff
>> <janek.bevendorff@uni-weimar.de> wrote:
>>
>> As noted in the bug report, the issue has affected only multipart
>> objects at this time. I have added some more remarks there.
>>
>> And yes, multipart objects tend to have 0 byte head objects in
>> general. The affected objects are simply missing all shadow objects,
>> leaving us with nothing but the empty head object and some metadata.
>>
>>
>> On 13/11/2020 20:14, Eric Ivancich wrote:
>>> Thank you for the answers to those questions, Janek.
>>>
>>> And in case anyone hasn’t seen it, we do have a tracker for this issue:
>>>
>>> https://tracker.ceph.com/issues/47866
>>>
>>> We may want to move most of the conversation to the comments there,
>>> so everything’s together.
>>>
>>> I do want to follow up on your answer to Question 4, Janek:
>>>
>>>> On Nov 13, 2020, at 12:22 PM, Janek Bevendorff
>>>> <janek.bevendorff@uni-weimar.de> wrote:
>>>>>
>>>>> 4. Is anyone experiencing this issue willing to run their RGWs
>>>>> with 'debug_ms=1'? That would allow us to see a request from an
>>>>> RGW to either remove a tail object or decrement its reference
>>>>> counter (and when its counter reaches 0 it will be deleted).
>>>>
>>>> I haven't had any new data loss in the last few days, at least as
>>>> far as I can tell. I read 1 byte from every object but didn't
>>>> compare checksums, so I cannot say whether all objects are
>>>> complete, but at least all of them are there.
>>>>
>>> With multipart uploads I believe this is a sufficient test, as the
>>> first bit of data is in the first tail object, and it’s tail
>>> objects that seem to be disappearing.
>>>
>>> However, if the object is not uploaded via multipart and if it does
>>> have tail (_shadow_) objects, then the initial data is stored in
>>> the head object, so this test would not be truly diagnostic. Such
>>> an object could be created, for example, by uploading a large file
>>> with `s3cmd put --disable-multipart …`.
>>>
>>> Eric
>>>
>>> --
>>> J. Eric Ivancich
>>> he / him / his
>>> Red Hat Storage
>>> Ann Arbor, Michigan, USA
>
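
The one-byte read check discussed in the quoted messages could be
sketched roughly as follows. The thread does not say which tool was used
for it, so this assumes a boto3-style S3 client whose get_object()
accepts Bucket, Key and Range; the function name and the byte_range
parameter are hypothetical.

```python
def find_unreadable_objects(client, bucket, keys, byte_range="bytes=0-0"):
    """Return (key, reason) pairs for objects whose requested byte
    cannot be read.

    client is assumed to behave like a boto3 S3 client. byte_range must
    select exactly one byte. Zero-byte objects cannot satisfy a range
    request and will therefore show up as failures too.
    """
    failures = []
    for key in keys:
        try:
            resp = client.get_object(Bucket=bucket, Key=key, Range=byte_range)
            data = resp["Body"].read()
            if len(data) != 1:
                failures.append((key, f"expected 1 byte, got {len(data)}"))
        except Exception as exc:  # e.g. a client error for a missing key
            failures.append((key, repr(exc)))
    return failures
```

As Eric points out, reading the first byte only exercises a tail object
for multipart uploads. For other objects one could try a suffix range
such as byte_range="bytes=-1" to request the last byte instead; whether
that byte actually lives in a tail object is an assumption that depends
on the object's size.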