Hello,
* 2 x Xeon Silver 4212 (12C/24T)
I would choose single-CPU AMD EPYC systems instead: lower price, better
performance. Supermicro has some good AMD systems as well.
* 16 x 10 TB nearline SAS HDD (8 bays for future needs)
Don't waste money here either. No real gain. Better to invest it in more or
faster (SSD) disks.
* 4 x 40G QSFP+
With 24x spinning media, even a single 40G link will be enough. Again, a lot
of money for no real gain.
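As a rough sanity check (assuming ~200 MB/s best-case sequential throughput per nearline HDD, which mixed or random workloads won't come close to):

```python
# Back-of-the-envelope check: can 24 spinning disks saturate a 40G link?
# ~200 MB/s per disk is an assumed, optimistic sequential figure.
DISKS_PER_NODE = 24
MB_PER_S_PER_DISK = 200

total_mb_s = DISKS_PER_NODE * MB_PER_S_PER_DISK   # 4800 MB/s aggregate
total_gbit_s = total_mb_s * 8 / 1000              # convert MB/s -> Gbit/s

print(f"Aggregate disk throughput: {total_gbit_s:.1f} Gbit/s")
# Even in this best case the node barely reaches one 40G link,
# so 4 x 40G per node buys nothing for spinning media.
```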
* 2 x 40G per server for ceph network (LACP/VPC for HA)
> * 2 x 40G per server for public network (LACP/VPC for HA)
Use VLANs if you really want to separate the networks. We often see new
customers coming in with problems on such configurations, and from our
experience we don't suggest configuring Ceph that way.
* ZFS on RBD, exposed via samba shares (cluster with failover)
Maybe, just maybe, think about simply running Samba on top of CephFS to
export the data. No need for all the overhead and the possible bugs you would
otherwise encounter.
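If you go that route, a minimal share via Samba's vfs_ceph module can look roughly like this (a sketch only; the share name and the `samba` cephx user are assumptions, see the vfs_ceph manpage for details):

```ini
[data]
   path = /
   read only = no
   vfs objects = ceph
   # vfs_ceph talks to the cluster directly, so kernel share modes
   # cannot be used on this non-local filesystem.
   kernel share modes = no
   ceph:config_file = /etc/ceph/ceph.conf
   ceph:user_id = samba
```

For failover you can then run this on several gateways behind CTDB, with no RBD or ZFS layer in between.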
* We're used to run mons and mgrs daemons on a few of our OSD nodes,
> without any issue so far : is this a bad idea for a big cluster ?
We always do so and have never had a problem with it. Just make sure the MONs
have enough resources for your workload.
* We thought using cache tiering on an SSD pool, but a large part of the PB
> is used on a daily basis, so we expect the cache to be not so effective
> and really expensive ?
Cache tiering tends to be error-prone; we have seen a lot of cluster
meltdowns due to it in the last 7 years. Just go for an all-flash cluster, or
use DB/WAL devices to improve performance.
* Could a 2x10G network be enough ?
Yes ;), though it may slow down recovery a bit. However, I don't believe that
will be a problem in the scenario you describe.
* ZFS on Ceph ? Any thoughts ?
just don't ;)
* What about CephFS ? We'd like to use RBD diff for backups but it looks
impossible to use snapshot diff with Cephfs ?
Please see
https://docs.ceph.com/docs/master/dev/cephfs-snapshots/
If you have questions or want some consulting to get the best Ceph cluster
for the job, please feel free to contact us.
--
Martin Verges
Managing director
Mobile: +49 174 9335695
E-Mail: martin.verges(a)croit.io
Chat:
https://t.me/MartinVerges
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web:
https://croit.io
YouTube:
https://goo.gl/PGE1Bx
On Tue, Dec 3, 2019 at 9:07 PM Fabien Sirjean <fsirjean(a)eddie.fdn.fr> wrote:
> Hi Ceph users !
> After years of using Ceph, we plan to build soon a new cluster bigger than
> what we've done in the past. As the project is still in reflection, I'd like to
> have your thoughts on our planned design : any feedback is welcome :)
> ## Requirements
> * ~1 PB usable space for file storage, extensible in the future
> * The files are mostly "hot" data, no cold storage
> * Purpose : storage for big files being essentially used on windows
> workstations (10G access)
> * Performance is better :)
> ## Global design
> * 8+3 Erasure Coded pool
> * ZFS on RBD, exposed via samba shares (cluster with failover)
> ## Hardware
> * 1 rack (multi-site would be better, of course...)
> * OSD nodes : 14 x supermicro servers
> * 24 usable bays in 2U rackspace
> * 16 x 10 TB nearline SAS HDD (8 bays for future needs)
> * 2 x Xeon Silver 4212 (12C/24T)
> * 128 GB RAM
> * 4 x 40G QSFP+
> * Networking : 2 x Cisco N3K 3132Q or 3164Q
> * 2 x 40G per server for ceph network (LACP/VPC for HA)
> * 2 x 40G per server for public network (LACP/VPC for HA)
> * QSFP+ DAC cables
> ## Sizing
> If we've done the maths well, we expect to have :
> * 2.24 PB of raw storage, extensible to 3.36 PB by adding HDD
> * 1.63 PB expected usable space with 8+3 EC, extensible to 2.44 PB
> * ~1 PB of usable space if we want to keep the OSD use under 66% to allow
> losing nodes without problem, extensible to 1.6 PB (same condition)
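The quoted figures check out; a quick sketch of the arithmetic (decimal TB-to-PB conversion, 8/11 storage efficiency for an 8+3 EC profile):

```python
# Re-check the quoted capacity maths: 8+3 erasure coding stores
# 11 chunks for every 8 data chunks, i.e. 8/11 usable efficiency.
NODES = 14
DISKS_NOW, DISKS_MAX = 16, 24   # populated bays now / total bays
TB_PER_DISK = 10
EC_EFFICIENCY = 8 / 11
TARGET_FILL = 0.66              # keep OSDs under 66% full

raw_now = NODES * DISKS_NOW * TB_PER_DISK / 1000   # 2.24 PB
raw_max = NODES * DISKS_MAX * TB_PER_DISK / 1000   # 3.36 PB
usable_now = raw_now * EC_EFFICIENCY               # ~1.63 PB
usable_max = raw_max * EC_EFFICIENCY               # ~2.44 PB
safe_now = usable_now * TARGET_FILL                # ~1.08 PB
safe_max = usable_max * TARGET_FILL                # ~1.61 PB

print(f"raw: {raw_now:.2f} -> {raw_max:.2f} PB")
print(f"usable (8+3 EC): {usable_now:.2f} -> {usable_max:.2f} PB")
print(f"at 66% fill: {safe_now:.2f} -> {safe_max:.2f} PB")
```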
> ## Reflections
> * We're used to run mons and mgrs daemons on a few of our OSD nodes, without
> any issue so far : is this a bad idea for a big cluster ?
> * We thought using cache tiering on an SSD pool, but a large part of the PB is
> used on a daily basis, so we expect the cache to be not so effective and
> really expensive ?
> * Could a 2x10G network be enough ?
> * ZFS on Ceph ? Any thoughts ?
> * What about CephFS ? We'd like to use RBD diff for backups but it looks
> impossible to use snapshot diff with Cephfs ?
> Thanks for reading, and sharing your experiences !
> F.
>
_______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io