Hi,
> I guess what you are suggesting is something like k+m
> with m>=k+2,
> for example k=4, m=6. Then, one can distribute 5 shards per DC and
> sustain the loss of an entire DC while still having full access to
> redundant storage.
That's exactly what I mean, yes.
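A rough sketch of what that could look like (profile, rule and pool
names are just examples, and the CRUSH rule, which would go into a
decompiled crushmap, assumes two datacenter buckets under the default
root):

    ceph osd erasure-code-profile set ec-4-6 k=4 m=6 \
        crush-failure-domain=host

    # CRUSH rule placing 5 of the 10 shards in each datacenter
    rule ec_two_dc {
        id 5
        type erasure
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 2 type datacenter
        step chooseleaf indep 5 type host
        step emit
    }

    ceph osd pool create ec-archive 128 128 erasure ec-4-6 ec_two_dc

The rule picks both datacenters and then 5 hosts in each, so losing
one DC still leaves k+1 = 5 shards available.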
> Now, a long time ago I was in a lecture about
> error-correcting codes
> (Reed-Solomon codes). From what I remember, the computational
> complexity of these codes explodes at least exponentially with m.
> Out of curiosity, how does m>3 perform in practice? What's the CPU
> requirement per OSD?
Such a setup would usually be considered for archiving purposes, so
the performance requirements aren't very high. But so far we haven't
heard any complaints performance-wise.
I don't have details on CPU requirements at hand right now.
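If you want a rough number yourself, a quick (unscientific) test could
look like this, watching ceph-osd CPU usage on the OSD nodes while it
runs (the pool name is just an example):

    # drive 4M writes against a test pool on the EC profile
    rados bench -p ec-test 60 write -b 4M -t 16

    # on an OSD node, watch the ceph-osd processes
    top -p $(pgrep -d',' ceph-osd)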
Regards,
Eugen
Quoting Frank Schilder <frans(a)dtu.dk>:
> Dear Eugen,
>
> I guess what you are suggesting is something like k+m
> with m>=k+2,
> for example k=4, m=6. Then, one can distribute 5 shards per DC and
> sustain the loss of an entire DC while still having full access to
> redundant storage.
>
> Now, a long time ago I was in a lecture about
> error-correcting codes
> (Reed-Solomon codes). From what I remember, the computational
> complexity of these codes explodes at least exponentially with m.
> Out of curiosity, how does m>3 perform in practice? What's the CPU
> requirement per OSD?
>
> Best regards,
>
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Eugen Block <eblock(a)nde.ag>
> Sent: 27 March 2020 08:33:45
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] Re: Combining erasure coding and replication?
>
> Hi Brett,
>
>> Our concern with Ceph is the cost of having three replicas. Storage
>> may be cheap but I’d rather not buy ANOTHER 5 PB for a third replica
>> if there are ways to do this more efficiently. Site-level redundancy
>> is important to us so we can’t simply create an erasure-coded volume
>> across two buildings – if we lose power to a building, the entire
>> array would become unavailable.
>
> can you elaborate on that? Why is EC not an option? We have installed
> several clusters with two datacenters resilient to losing a whole DC
> (and additional disks if required). So it's basically a matter of
> choosing the right EC profile. Or did I misunderstand something?
>
>
> Quoting Brett Randall <brett.randall(a)gmail.com>:
>
>> Hi all
>>
>> Had a fun time trying to join this list; hopefully you don’t get
>> this message 3 times!
>>
>> On to Ceph… We are looking at setting up our first ever Ceph cluster
>> to replace Gluster as our media asset storage and production system.
>> The Ceph cluster will have 5 PB of usable storage. Whether we use it
>> as object-storage, or put CephFS in front of it, is still TBD.
>>
>> Obviously we’re keen to protect this data well. Our current Gluster
>> setup utilises RAID-6 on each of the nodes and then we have a single
>> replica of each brick. The Gluster bricks are split between
>> buildings so that the replica is guaranteed to be in another
>> premises. By doing it this way, we guarantee that we can have a
>> decent number of disk or node failures (even an entire building)
>> before we lose both connectivity and data.
>>
>> Our concern with Ceph is the cost of having three replicas. Storage
>> may be cheap but I’d rather not buy ANOTHER 5 PB for a third replica
>> if there are ways to do this more efficiently. Site-level redundancy
>> is important to us so we can’t simply create an erasure-coded volume
>> across two buildings – if we lose power to a building, the entire
>> array would become unavailable. Likewise, we can’t simply have a
>> single replica – our fault tolerance would drop way down on what it
>> is right now.
>>
>> Is there a way to use both erasure coding AND replication at the
>> same time in Ceph to mimic the architecture we currently have in
>> Gluster? I know we COULD just create RAID6 volumes on each node and
>> use the entire volume as a single OSD, but that this is not the
>> recommended way to use Ceph. So is there some other way?
>>
>> Apologies if this is a nonsensical question, I’m still trying to
>> wrap my head around Ceph, CRUSH maps, placement rules, volume types,
>> etc etc!
>>
>> TIA
>>
>> Brett
>>