Added a fifth OSD node. Cluster now looks something like:
3x mons (2x 10G, 2x E5-2690 V2, 256GB RAM)
5x OSD (2x 10G, 2x E5-2690 V2, 256GB-385GB RAM, 12x Samsung SM1625 SSDs)
Random write latency went up to 16ms average with the addition of the fifth node and
the move to k=3,m=2.
What kind of latencies are people seeing in their EC clusters?
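For reference, a k=3,m=2 RBD setup like the one described above would typically be created along these lines (a sketch only: the profile, pool, and image names are placeholders, and PG counts need sizing for the actual cluster; RBD on an EC pool needs overwrites enabled on the data pool, with image metadata kept in a replicated pool):

```
# Placeholder names; adjust PG counts and CRUSH settings to your cluster.
ceph osd erasure-code-profile set ec32 plugin=isa k=3 m=2 crush-failure-domain=host
ceph osd pool create rbd-data 128 128 erasure ec32
ceph osd pool set rbd-data allow_ec_overwrites true   # required for RBD on EC
ceph osd pool create rbd-meta 32 32 replicated
ceph osd pool application enable rbd-data rbd
ceph osd pool application enable rbd-meta rbd
rbd create rbd-meta/testimg --size 100G --data-pool rbd-data
```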
From: "Anthony Brandelli (abrandel)" <abrandel(a)cisco.com>
Date: Thursday, February 13, 2020 at 10:17 AM
To: Martin Verges <martin.verges(a)croit.io>
Cc: "ceph-users(a)ceph.io" <ceph-users(a)ceph.io>
Subject: Re: [ceph-users] EC Pools w/ RBD - IOPs
I should mention this is solely meant as a test cluster, and unfortunately I only have
four OSD nodes in it. I guess I’ll go see if I can dig up another node so I can better
mirror what might eventually go to production.
I would imagine that latency is only going to increase as we increase k though, no?
From: Martin Verges <martin.verges(a)croit.io>
Date: Thursday, February 13, 2020 at 10:10 AM
To: "Anthony Brandelli (abrandel)" <abrandel(a)cisco.com>
Cc: "ceph-users(a)ceph.io" <ceph-users(a)ceph.io>
Subject: Re: [ceph-users] EC Pools w/ RBD - IOPs
Hello,
Please do not even think about using an EC pool with k=2, m=1. See other posts on this
list; just don't.
EC works quite well, and we have a lot of users running EC-based VMs, often with Proxmox
(RBD) or VMware (iSCSI) hypervisors.
Performance depends on the hardware and is definitely slower than replicated pools, but it
is cost efficient and more than OK for most workloads. If you split generic VMs from
databases (or similar workloads), you can save a lot of money with EC.
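The cost argument is easy to quantify: the usable fraction of raw capacity is 1/size for a replicated pool and k/(k+m) for an EC pool. A quick sketch, using the profiles mentioned in this thread:

```shell
# Usable fraction of raw capacity: 1/size for replication, k/(k+m) for EC.
awk 'BEGIN {
  printf "3x replica usable fraction: %.2f\n", 1/3
  printf "EC k=2,m=1 usable fraction: %.2f\n", 2/3
  printf "EC k=3,m=2 usable fraction: %.2f\n", 3/5
}'
```

So even the conservative k=3,m=2 profile nearly doubles usable capacity versus 3x replication, while still tolerating two failures.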
--
Martin Verges
Managing director
Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive Training at
https://croit.io/training/4-days-ceph-in-depth-training.
Mobile: +49 174 9335695
E-Mail: martin.verges@croit.io<mailto:martin.verges@croit.io>
Chat: https://t.me/MartinVerges
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx
On Thu, Feb 13, 2020 at 17:52, Anthony Brandelli (abrandel)
<abrandel@cisco.com<mailto:abrandel@cisco.com>> wrote:
Hi Ceph Community,
Wondering what experiences, good or bad, you have with EC pools for IOPS-intensive
workloads (i.e., 4K-ish random I/O from things like VMware ESXi). I realize that EC pools
are a tradeoff between more usable capacity and higher latency/lower IOPS, but in my
testing the tradeoff for small I/O seems to be much worse than I had anticipated.
On an all-flash 3x replicated pool we're seeing 45k random read and 35k random write IOPS,
testing with fio on a client living on an iSCSI LUN presented to an ESXi host. Average
latencies for these ops are 4.2ms and 5.5ms, which is respectable at an I/O depth of 32.
Take this same setup with an EC pool (k=2, m=1, tested with both ISA and jerasure; ISA
does give better performance for our use case) and we see 30k random read and 16k random
write IOPS. Random reads see 6.5ms average, while random writes suffer with 12ms
average.
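For anyone wanting to reproduce the comparison, the workload above corresponds to an fio invocation along these lines (the job name and device path are placeholders; /dev/sdX stands for the iSCSI LUN as seen by the test client):

```
# 4K random writes at queue depth 32 against the iSCSI-backed device.
fio --name=ec-randwrite --filename=/dev/sdX \
    --ioengine=libaio --direct=1 \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
    --runtime=60 --time_based --group_reporting
```

Swapping --rw=randwrite for --rw=randread gives the corresponding read test.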
Are others using EC pools seeing similar hits to random writes with small I/Os? Any way to
improve this?
Thanks,
Anthony
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io>
To unsubscribe send an email to ceph-users-leave@ceph.io<mailto:ceph-users-leave@ceph.io>