RE: [ceph-users] Adventures with large RGW buckets

2 Aug 2019

Hi Greg / Eric,

What about allowing a bucket's objects to be deleted with a lifecycle policy?

You can set an object lifetime of 1 day; that expiration work is done at the cluster
level. Then delete the objects younger than 1 day and remove the bucket.

That sometimes speeds up deletes, as the work is done by the RGWs.

There should be something like a background delete option, because deleting a bucket
with millions of objects takes weeks.
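
As a rough illustration of the 1-day lifecycle idea, here is a minimal sketch using the S3 API through boto3. The bucket name, endpoint URL, and rule ID are hypothetical placeholders, credentials are assumed to come from the environment, and whether expiration actually outpaces a plain bucket delete depends on the cluster.

```python
def one_day_expiration_policy(rule_id="expire-everything"):
    """Build an S3 lifecycle configuration that expires every object after 1 day.

    The rule ID is an arbitrary, hypothetical label.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": ""},    # empty prefix: match all objects
                "Status": "Enabled",
                "Expiration": {"Days": 1},   # 1 day of object life
            }
        ]
    }


def apply_policy(bucket, endpoint_url):
    """Attach the policy to a bucket on an RGW endpoint (both names are placeholders)."""
    import boto3  # imported here so the policy builder works without boto3 installed

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=one_day_expiration_policy(),
    )
```

Something like `apply_policy("big-bucket", "http://rgw.example.com:8080")` would attach the rule; the bucket itself still has to be removed once it is empty.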

Regards

-----Original Message-----
From: ceph-users <ceph-users-bounces(a)lists.ceph.com> On Behalf Of Gregory Farnum
Sent: Thursday, 1 August 2019 22:48
To: Eric Ivancich <ivancich(a)redhat.com>
Cc: Ceph Users <ceph-users(a)lists.ceph.com>; dev(a)ceph.io
Subject: Re: [ceph-users] Adventures with large RGW buckets

On Thu, Aug 1, 2019 at 12:06 PM Eric Ivancich <ivancich(a)redhat.com> wrote:
...

 Hi Paul,

 I’ll interleave responses below.

 On Jul 31, 2019, at 2:02 PM, Paul Emmerich <paul.emmerich(a)croit.io> wrote:

 What could the bucket deletion of the future look like? Would it be
 possible to put all objects in buckets into RADOS namespaces and 
 implement some kind of efficient namespace deletion on the OSD level 
 similar to how pool deletions are handled at a lower level?

 I’ll raise that with other RGW developers. I’m unfamiliar with how RADOS namespaces are
handled.

I expect RGW could do this, but unfortunately deleting namespaces at the RADOS level is
not practical. People keep asking and maybe in some future world it will be cheaper, but a
namespace is effectively just part of the object name (and I don't think it's even
the first thing they sort by for the key entries in metadata tracking!), so deleting a
namespace would be equivalent to deleting a snapshot[1] but with the extra cost that
namespaces can be created arbitrarily on every write operation (so our solutions for
handling snapshots without it being ludicrously expensive wouldn't apply). Deleting a
namespace from the OSD-side using map updates would require the OSD to iterate through
just about all the objects they have and examine them for deletion.

Is it cheaper than doing it over the network? Sure. Is it cheap enough we're willing to
let a single user request generate that kind of cluster IO on an unconstrained interface?
Absolutely not.
-Greg
[1]: Deleting snapshots is only feasible because every OSD maintains a sorted secondary
index from snapid->set<objects>. This is only possible because snapids are issued
by the monitors and clients cooperate in making sure they can't get reused after being
deleted.
Namespaces are generated by clients and there are no constraints on their use, reuse, or
relationship to each other. We could maybe work around these problems, but it'd be
building a fundamentally different interface than what namespaces currently are.
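
To make the contrast in [1] concrete, here is a toy Python model. It is not Ceph code; the class and method names are invented for illustration, and a plain dict stands in for the sorted secondary index. A snapshot's objects come from one index lookup, while deleting a namespace has to examine every object held.

```python
from collections import defaultdict


class ToyObjectStore:
    """Toy stand-in for the objects held by one OSD (hypothetical, not Ceph code)."""

    def __init__(self):
        self.objects = set()                # (namespace, name) pairs
        self.snap_index = defaultdict(set)  # snapid -> set of objects

    def write(self, namespace, name, snapid=None):
        key = (namespace, name)
        self.objects.add(key)
        if snapid is not None:
            # The index is maintained on every write that belongs to a snapshot.
            self.snap_index[snapid].add(key)

    def objects_for_snap(self, snapid):
        """Index lookup: cost depends only on how many objects the snapshot has."""
        return self.snap_index.pop(snapid, set())

    def delete_namespace(self, namespace):
        """No index exists for namespaces, so every object must be examined."""
        examined = 0
        doomed = []
        for key in self.objects:
            examined += 1
            if key[0] == namespace:
                doomed.append(key)
        self.objects.difference_update(doomed)
        return examined, len(doomed)


store = ToyObjectStore()
for i in range(1000):
    store.write("other", f"obj{i}")
for i in range(3):
    store.write("bucket-a", f"obj{i}", snapid=7)

snap_objects = store.objects_for_snap(7)             # touches 3 entries
examined, deleted = store.delete_namespace("bucket-a")
print(len(snap_objects), examined, deleted)
```

In this toy, the namespace delete walks all 1003 objects to remove 3, which is the per-OSD scan described above.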
_______________________________________________
ceph-users mailing list
ceph-users(a)lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
