Just to clarify, it is better to separate the different performance cases:
1. Regular IO performance (IOPS / throughput): this should be good.
2. vMotion between datastores that are both managed by Ceph: this will be
good, as VAAI xcopy will be used.
3. vMotion between a Ceph datastore and an external datastore: this will be
bad, and it seems to be the case you are testing. It is bad because between
two different storage systems (the IQNs are served by different targets),
VAAI xcopy cannot be used and VMware falls back to its own data mover. It
moves data using a 64k block size, which gives low performance. To add some
flavor, it does indeed use 32 threads, but unfortunately they use co-located
addresses, which does not work well in Ceph: they hit the same rbd object,
which gets serialized due to PG locks, so you will not get any
parallelization. Your speed will mostly be determined by a serial stream of
64k writes, so with 1 ms write latency on an SSD cluster you will get around
64 MB/s; it will be slightly higher as the extra threads have some small effect.
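As a rough sanity check of that number (a tiny Python sketch, assuming the
writes are effectively serial at the 64k block size and ~1 ms latency
mentioned above):

    # single serial stream: throughput = io_size / latency
    io_size_bytes = 64 * 1024     # 64 KiB, the block size VMware's data mover uses
    write_latency_s = 0.001       # assumed ~1 ms per write on an SSD-backed cluster
    throughput_mb_s = io_size_bytes / write_latency_s / 1e6
    print(round(throughput_mb_s, 1))  # ~65.5 MB/s, close to the ~60 MB/s observed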
Note your esxtop does show 32 active IOs under ACTV. The QUED of zero is not
the queue depth, but rather the "queued" IO that ESX would suspend in case
your active count reaches the adapter maximum (128).
This is just to clarify: if case 3 is not your primary concern then I would
forget about it and benchmark 1 and 2 if they are relevant. Else, if 3 is
important, I am not sure you can do much, as it is happening within VMware.
Maybe there is a way to map the external IQN so it is served by the same
target serving the Ceph IQN; then there could be a chance that xcopy gets
activated. Mike would probably know if this has any chance of working :)
/Maged
On 25/10/2019 22:01, Ryan wrote:
esxtop is showing a queue length of 0
Storage motion to ceph
DEVICE                                PATH/WORLD/PARTITION DQLEN WQLEN ACTV QUED %USD LOAD   CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
naa.6001405ec60d8b82342404d929fbbd03  -                      128     -   32    0   25 0.25  1442.32    0.18  1440.50     0.00    89.78    21.32     0.01    21.34     0.01
Storage motion from ceph
DEVICE                                PATH/WORLD/PARTITION DQLEN WQLEN ACTV QUED %USD LOAD   CMDS/s READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
naa.6001405ec60d8b82342404d929fbbd03  -                      128     -   32    0   25 0.25  4065.38 4064.83     0.36   253.52     0.00     7.57     0.01     7.58     0.00
I tried using fio like you mentioned but it was hanging with
[r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS] and the ETA kept climbing. I ended
up using rbd bench on the ceph iscsi gateway. With a 64K write
workload I'm seeing 400MB/s transfers.
rbd create test --size 100G --image-feature layering
rbd map test
mkfs.ext4 /dev/rbd/rbd/test
mount /dev/rbd/rbd/test test
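# same test against a second image whose data is placed on the separate rbd_ec data pool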
rbd create testec --size 100G --image-feature layering --data-pool rbd_ec
rbd map testec
mkfs.ext4 /dev/rbd/rbd/testec
mount /dev/rbd/rbd/testec testec
[root@ceph-iscsi1 mnt]# rbd bench --image test --io-size 64K --io-type
write --io-total 10G
bench type write io_size 65536 io_threads 16 bytes 10737418240
pattern sequential
SEC OPS OPS/SEC BYTES/SEC
1 6368 6377.59 417961796.64
2 12928 6462.27 423511630.71
3 19296 6420.18 420752986.78
4 26320 6585.61 431594792.67
5 33296 6662.37 436624891.04
6 40128 6754.67 442673957.25
7 46784 6765.75 443400452.26
8 53280 6809.02 446236110.93
9 60032 6739.67 441691068.73
10 66784 6698.91 439019550.77
11 73616 6690.88 438493253.66
12 80016 6654.35 436099640.00
13 85712 6485.07 425005611.11
14 91088 6202.49 406486113.46
15 96896 6021.17 394603137.62
16 102368 5741.19 376254347.24
17 107568 5501.57 360550910.38
18 113728 5603.17 367209502.58
19 120144 5820.48 381451245.32
20 126496 5917.60 387816078.53
21 132768 6089.71 399095466.00
22 139040 6306.98 413334431.09
23 145104 6276.42 411331743.63
24 151440 6256.67 410036891.68
25 157808 6261.12 410328554.98
26 163456 6140.03 402392725.36
elapsed: 26 ops: 163840 ops/sec: 6271.36 bytes/sec: 410999626.38
[root@ceph-iscsi1 mnt]# rbd bench --image testec --io-size 64K
--io-type write --io-total 10G
bench type write io_size 65536 io_threads 16 bytes 10737418240
pattern sequential
SEC OPS OPS/SEC BYTES/SEC
1 7392 7415.38 485974266.41
2 14464 7243.59 474715656.29
3 22000 7341.08 481104853.50
4 29408 7352.29 481839517.16
5 37296 7459.38 488857889.75
6 44864 7494.36 491150574.57
7 52848 7676.76 503104281.98
8 60784 7756.76 508347136.11
9 68608 7835.26 513491609.52
10 76784 7902.30 517885290.67
11 84544 7935.96 520091129.45
12 92432 7916.76 518832844.57
13 100064 7855.96 514848275.43
14 107040 7692.52 504136734.09
15 114320 7499.66 491497933.56
16 121744 7436.99 487390477.85
17 129664 7438.92 487517345.01
18 136704 7326.50 480149408.39
19 144960 7587.00 497221460.09
20 153264 7796.56 510955233.33
21 160832 7814.44 512126854.90
elapsed: 21 ops: 163840 ops/sec: 7659.97 bytes/sec: 502004079.43
On Fri, Oct 25, 2019 at 11:54 AM Mike Christie <mchristi@redhat.com> wrote:
On 10/24/2019 11:47 PM, Ryan wrote:
> I'm using CentOS 7.7.1908 with kernel 3.10.0-1062.1.2.el7.x86_64. The
> workload was a VMware Storage Motion from a local SSD backed datastore

Ignore my comments. I thought you were just doing fio like tests in the vm.

> to the ceph backed datastore. Performance was measured using dstat on
> the iscsi gateway for network traffic and ceph status as this cluster
> is basically idle. I changed max_data_area_mb to 256 and cmdsn_depth
> to 128. This appears to have given a slight improvement of maybe 10MB/s.
>
> Moving VM to the ceph backed datastore
>   io:
>     client: 124 KiB/s rd, 76 MiB/s wr, 95 op/s rd, 1.26k op/s wr
>
> Moving VM off the ceph backed datastore
>   io:
>     client: 344 MiB/s rd, 625 KiB/s wr, 5.54k op/s rd, 62 op/s wr
If you run esxtop while running your test what do you see for the number of
commands in the iscsi LUN's queue?
> I'm going to test bonnie++ with an rbd volume mounted directly on the
To try and isolate whether it's iscsi or rbd, you need to run fio with the
librbd io engine. We know krbd is going to be the fastest. ceph-iscsi uses
librbd so it is a better baseline. If you are not familiar with fio you can
just do something like:

fio --group_reporting --ioengine=rbd --direct=1 --name=librbdtest \
    --numjobs=32 --bs=512k --iodepth=128 --size=10G --rw=write \
    --rbdname=name_of_your_image --pool=name_of_pool
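If you want something closer to the vMotion write pattern, one option is a
64k sequential write run at a lower queue depth; the image, pool, and client
names here are placeholders:

fio --group_reporting --ioengine=rbd --name=vmotion-like \
    --numjobs=1 --bs=64k --iodepth=32 --size=10G --rw=write \
    --rbdname=name_of_your_image --pool=name_of_pool --clientname=admin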
> iscsi gateway. Also will test bonnie++ inside a VM on a ceph backed
> datastore.
On Thu, Oct 24, 2019 at 7:15 PM Mike Christie <mchristi@redhat.com> wrote:
On 10/24/2019 12:22 PM, Ryan wrote:
> I'm in the process of testing the iscsi target feature of ceph. The
> cluster is running ceph 14.2.4 and ceph-iscsi 3.3. It consists of 5
What kernel are you using?
> hosts with 12 SSD OSDs per host. Some basic testing moving VMs to a ceph
> backed datastore is only showing 60MB/s transfers. However moving these
> back off the datastore is fast at 200-300MB/s.
What is the workload and what are you using to measure the throughput?
If you are using fio, what arguments are you using? And, could you change
the ioengine to rbd and re-run the test from the target system so we can
check if rbd is slow or iscsi?
For small IOs, 60 is about right.
For 128-512K IOs you should be able to get around 300 MB/s for writes and
600 for reads.

1. Increase max_data_area_mb. This is a kernel buffer lio/tcmu uses to pass
data between the kernel and tcmu-runner. The default is only 8MB.

In gwcli cd to your disk and do:

# reconfigure max_data_area_mb %N

where N is between 8 and 2048 MBs.
2. The Linux kernel target only allows 64 commands per iscsi session by
default. We increase that to 128, but you can increase this to 512.

In gwcli cd to the target dir and do:

reconfigure cmdsn_depth 512
3. I think ceph-iscsi and lio work better with higher queue depths so if
you are using fio you want higher numjobs and/or iodepths.
>
> What should I be looking at to track down the write performance issue?
> In comparison with the Nimble Storage arrays I can see 200-300MB/s in
> both directions.
>
> Thanks,
> Ryan
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io