Packing's obviously a good idea for storing these kinds of artifacts
in Ceph, and hacking through the existing librbd might indeed be
easier than building something up from raw RADOS, especially if you
want to use stuff like rbd-mirror.
My main concern would just be, as Dan points out, that we don't test
rbd with extremely large images, and we know deleting that image will
take a looooong time. I don't know of other issues off the top of my
head, and in the worst case you could always fall back to manipulating
it with raw librados if there is an issue.
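For a rough sense of scale, here is a back-of-the-envelope calculation. It assumes the default 4 MiB RBD object size and reads "100TB" as 100 TiB; neither is stated in the thread, so treat the numbers as illustrative only. A full image of that size maps to tens of millions of RADOS objects, which is roughly what a removal would have to walk through.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2
KIB = 1024

image_size = 100 * TIB     # "100TB" from the thread, read as TiB (assumption)
object_size = 4 * MIB      # assumption: default RBD object size
artifact_size = 3 * KIB    # ~3KB average artifact size from the original post

# Number of backing RADOS objects if the image is completely filled.
objects = image_size // object_size

# Average-sized artifacts that fit in one backing object.
artifacts_per_object = object_size // artifact_size

print(objects)               # 26214400
print(artifacts_per_object)  # 1365
```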
But you might also check in on the status of Danny Al-Gaaf's rados
email project. Email and these artifacts seemingly have a lot in
common.
-Greg
On Mon, Feb 1, 2021 at 12:52 PM Loïc Dachary <loic(a)dachary.org> wrote:
Hi Dan,
On 01/02/2021 21:13, Dan van der Ster wrote:
Hi Loïc,
We've never managed 100TB+ in a single RBD volume. I can't think of
anything, but perhaps there are some unknown limitations when they get so
big.
It should be easy enough to use rbd bench to create and fill a massive test
image to validate everything works well at that size.
Good idea! I'll look for a cluster with 100TB of free space and post my findings.
Also, I assume you'll be doing the IO from just one client? Multiple
readers/writers to a single volume could get complicated.
Yes.
Otherwise, yes RBD sounds very convenient for what you need.
It is inspired by
https://static.usenix.org/event/osdi10/tech/full_papers/Beaver.pdf which suggests an
ad-hoc implementation to pack immutable objects together. I think RBD already provides
the underlying logic, even though it is not specialized for this use case. RGW also packs
small objects together and would be a good candidate, but it provides more flexibility to
modify/delete objects, and I assume it will be slower to write N objects with RGW than to
write them sequentially on an RBD image. I have not tried it, though, and maybe I should.
To be continued.
Cheers, Dan
On Sat, Jan 30, 2021, 4:01 PM Loïc Dachary <loic(a)dachary.org> wrote:
Bonjour,
In the context of Software Heritage (a noble mission to preserve all source
code)[0], artifacts have an average size of ~3KB and there are billions of
them. They never change and are never deleted. To save space it would make
sense to write them, one after the other, in an ever-growing RBD volume
(more than 100TB). An index, located somewhere else, would record the
offset and size of the artifacts in the volume.
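The scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Pack` class, its method names, and the use of SHA-256 as the index key are all hypothetical, and an in-memory `io.BytesIO` stands in for the RBD volume (any seekable binary file object, such as a mapped block device, would fit the same interface). The index is a plain dict here, whereas the proposal keeps it somewhere else.

```python
import hashlib
import io


class Pack:
    """Append-only pack: artifacts are written back to back and never modified."""

    def __init__(self, volume, end=0):
        # `volume` is any seekable binary file object (assumption: a mapped
        # RBD device or a plain file would be used in practice).
        self.volume = volume
        # Offset of the first free byte. It is tracked explicitly because a
        # block device has a fixed size, so "end of file" is not the write end.
        self.end = end
        # key -> (offset, size); stored elsewhere in the actual proposal.
        self.index = {}

    def append(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()  # hypothetical choice of key
        if key not in self.index:
            self.volume.seek(self.end)
            self.volume.write(data)
            self.index[key] = (self.end, len(data))
            self.end += len(data)
        return key

    def read(self, key: str) -> bytes:
        offset, size = self.index[key]
        self.volume.seek(offset)
        return self.volume.read(size)


pack = Pack(io.BytesIO())
k1 = pack.append(b"int main(void) { return 0; }\n")
k2 = pack.append(b"print('hello')\n")
print(pack.read(k1) == b"int main(void) { return 0; }\n")
```

Since artifacts never change and are never deleted, the sketch needs no free-space management: a read is a single seek plus one read of `size` bytes.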
I wonder if someone has already implemented this idea successfully? And if
not... does anyone see a reason why it would be a bad idea?
Cheers
[0]
https://docs.softwareheritage.org/
--
Loïc Dachary, Artisan Logiciel Libre
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io