Ah ok, glad to see it’s on the radar! Just to add onto this with our own findings:

1. RadosGW reports all of the leftover multipart pieces, however they are not visible with s3cmd or any other client-side application (a rough way to spot this is sketched below).
2. Somewhere along the line the pieces seem to get lost track of internally and are never marked for automatic removal by Ceph; however, setting the bucket shards to 0 and running a bucket check command fixes the issue.

# radosgw-admin bucket check --check-objects --fix --bucket sgbackup1

But this only works when bucket shards are set to 0. If the bucket check command is run on a bucket with > 0 shards, it fails to remove the data.
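
For anyone else trying to spot affected buckets, a rough check is to put the RGW-side accounting next to what a client sees (the bucket name here is just a placeholder, and this assumes s3cmd is pointed at the same RGW endpoint):

# radosgw-admin bucket stats --bucket sgbackup1
# s3cmd du s3://sgbackup1

If the size and object counts under "usage" in bucket stats come out noticeably larger than what s3cmd du reports, that difference is roughly the orphaned multipart data.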


The ticket points out that the remaining orphans have a different ID than the pieces that were recombined. The fact that these pieces aren't visible to clients accessing the bucket over S3 suggests it's an issue with the way Ceph tracks the pieces internally. The method for cataloging multipart objects seems either not to take shards into account, or to have forgotten the system "shadow" tags added onto the upload ID to differentiate them.
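
To actually see the leftover pieces at the RADOS level, this is roughly what we've been doing (the pool name is our default data pool and the marker placeholder comes out of bucket stats, so adjust for your setup; the __multipart_/__shadow_ prefixes are simply what the leftovers look like on our clusters):

# radosgw-admin bucket stats --bucket sgbackup1 | grep marker
# rados -p default.rgw.buckets.data ls | grep '<bucket marker>' | grep -E '__(multipart|shadow)_'

The upload IDs embedded in those names should also make the differing IDs mentioned in the ticket visible.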

To recap, we are able to successfully remove the orphaned objects with a bucket check command only on buckets with 0 shards: 

# radosgw-admin bucket check --check-objects --fix --bucket sgbackup1

Setting bucket shards to a lower (non-zero) count doesn't change anything. After resharding and then running a bucket check, the orphaned data remains.

[root@os1-sin1 ~]# radosgw-admin reshard add --bucket vgood-test --num-shards 7 --yes-i-really-mean-it
[root@os1-sin1 ~]# radosgw-admin reshard list
[
    {
        "time": "2020-09-24T17:14:42.189517Z",
        "tenant": "",
        "bucket_name": "vgood-test",
        "bucket_id": "d8c6ebd1-2bab-414d-9d6b-73bf9bc8fc5a.12045805.1",
        "new_instance_id": "",
        "old_num_shards": 11,
        "new_num_shards": 7
    }
]
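
(For reference, the shard count a bucket ends up with can be confirmed afterwards; on our Octopus build bucket stats prints a num_shards field, and radosgw-admin bucket limit check shows the per-bucket shard counts as well:)

# radosgw-admin bucket stats --bucket vgood-test | grep num_shards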

However, setting bucket shards to 0 and then running the bucket check command removed the orphaned data.

[root@os1-sin1 ~]# radosgw-admin reshard add --bucket vgood-test --num-shards 0 --yes-i-really-mean-it
[root@os1-sin1 ~]# radosgw-admin reshard list
[
    {
        "time": "2020-09-24T17:23:34.843021Z",
        "tenant": "",
        "bucket_name": "vgood-test",
        "bucket_id": "d8c6ebd1-2bab-414d-9d6b-73bf9bc8fc5a.14335315.1",
        "new_instance_id": "",
        "old_num_shards": 7,
        "new_num_shards": 0
    }
]
[root@os1-sin1 ~]# radosgw-admin reshard process
2020-09-24T13:23:50.895-0400 7f24a0e47200  1 execute INFO: reshard of bucket "vgood-test" from "vgood-test:d8c6ebd1-2bab-414d-9d6b-73bf9bc8fc5a.14335315.1" to "vgood-test:d8c6ebd1-2bab-414d-9d6b-73bf9bc8fc5a.14335720.1" completed successfully
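
So, pulling the workaround together, the only sequence that reliably cleans things up for us is (the same commands as above, just in order):

# radosgw-admin reshard add --bucket vgood-test --num-shards 0 --yes-i-really-mean-it
# radosgw-admin reshard process
# radosgw-admin bucket check --check-objects --fix --bucket vgood-test

followed by re-running the bucket stats / rados ls comparison from earlier to confirm the leftover pieces are actually gone. (Presumably the bucket then has to be resharded back up afterwards, which makes this even less practical at scale.)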

Is there any word on where this behavior might be originating from? 

I updated the ticket with this additional info, and we'd be glad to contribute any resources we can to help get a patch in. We're facing the same issues described in the ticket: running these cleanup commands might be feasible for smaller buckets, but it's very unwieldy at the cluster sizes we're running, and we're losing a few terabytes of capacity in the meantime.


- Gavin 


On Mar 26, 2021, at 10:26 AM, Casey Bodley <cbodley@redhat.com> wrote:

On Thu, Mar 25, 2021 at 10:21 AM Gavin Chen <gchen@linode.com> wrote:

Hi all,

We’re running into what seems to be a recurring bug in RGW when handling multipart uploads. The RGWs seem to be orphaning upload parts, which then take up space and can be difficult to find since they do not show up in client-side S3 tools. RGW shows that the pieces from the multipart upload are essentially orphaned and still stored in the cluster even after the upload has finished and the piecemeal object has been recombined. We’re currently running Octopus (15.2.8) and are able to reliably reproduce the bug.

Interestingly, when looking at the bucket through s3cmd or boto, it shows the correct bucket usage with just the completed multipart object present, and the smaller pieces from the upload don't appear. The bug seems related to bucket index sharding, as it can be fixed by setting the shards to 0 and running a bucket check command; running the same command on buckets with sharding enabled doesn't do anything and the orphans remain in the cluster.

Looking at the issues backlog it seems like this was a problem even in much earlier releases dating all the way back to Hammer. https://tracker.ceph.com/issues/16767

We can confirm that this bug still persists in Octopus and Nautilus. A current manual workaround is to reset bucket sharding to 0 and run a bucket check command. However, this is impractical, since one would need to know which bucket is affected (which can only be determined through RGW, as the S3 tools don't show the orphaned pieces), and bucket sharding would need to be set to 0 for the fix to work.

Has anyone else come across this bug? The comments on the issue ticket show it's been a consistent problem over the years, but unfortunately with no movement. The bug was assigned 3 years ago, but it looks like a fix was never implemented.


- Gavin
_______________________________________________
Dev mailing list -- dev@ceph.io
To unsubscribe send an email to dev-leave@ceph.io

Thanks for the link. We've been tracking this one in
https://tracker.ceph.com/issues/44660 and are still working on it.