Hi Reed,
Have you tried just starting multiple rsync processes simultaneously to transfer different
directories? Distributed systems like Ceph often benefit from more parallelism.
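For example, a rough sketch with xargs (the mount points /mnt/rbd and /mnt/cephfs here are
just placeholders for your actual source and destination):

    # run up to 8 rsyncs at once, one per top-level directory; tune -P to taste
    find /mnt/rbd -mindepth 1 -maxdepth 1 -type d | xargs -P8 -I{} rsync -a {} /mnt/cephfs/

GNU parallel works just as well if you prefer it over xargs.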
Weiwen Hu
On May 28, 2021, at 03:54, Reed Dier <reed.dier(a)focusvq.com> wrote:
Hoping someone may be able to help point out where my bottleneck(s) may be.
I have an 80TB kRBD image on an EC8:2 pool, with an XFS filesystem on top of that.
This was not an ideal scenario; rather, it was a rescue mission to dump a large, aging
RAID array before it was too late, so I'm working with the hand I was dealt.
To further complicate matters, the main directory structure consists of lots and lots of
small files in deep directories.
My goal is to rsync (or otherwise copy) data from the RBD to CephFS, but it's just
unbearably slow and will take ~150 days to transfer ~35TB, which is far from ideal.
15.41G  79%  4.36MB/s  0:56:09  (xfr#23165, ir-chk=4061/27259)

avg-cpu:  %user   %nice  %system  %iowait  %steal   %idle
           0.17    0.00     1.34    13.23    0.00   85.26

Device   r/s   rMB/s  rrqm/s  %rrqm  r_await  rareq-sz    w/s  wMB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dMB/s  drqm/s  %drqm  d_await  dareq-sz  aqu-sz  %util
rbd0   124.00   0.66    0.00   0.00    17.30      5.48  50.00   0.17    0.00   0.00    31.70      3.49  0.00   0.00    0.00   0.00     0.00      0.00    3.39  96.40
Rsync progress and iostat (during the rsync) from the RBD to a local SSD, to remove any
bottlenecks from doubling back to CephFS.
About 16G in 1h, not exactly blazing, and this is just 5 of the ~7000 directories I'm
looking to offload to CephFS.
Currently running Ceph 15.2.11, and the host is Ubuntu 20.04 (5.4.0-72-generic) with a single
E5-2620, 64GB of memory, and a 4x10GbT bond talking to Ceph; iperf proves the network out.
The pool is EC 8:2 across about 16 hosts and 240 OSDs, 24 of those being 8TB 7.2k SAS and the
other 216 being 2TB 7.2k SATA, so there are quite a few spindles in play here.
There are only 128 PGs in this pool, but it's the only RBD image in the pool. The autoscaler
recommends going to 512, but I was hoping to avoid the performance overhead of the PG splits
if possible, given perf is bad enough as is.
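(For reference, if I do give in to the autoscaler, the change itself is just a pool setting;
a sketch assuming it is pointing at the EC data pool named in the rbd info below:

    ceph osd pool autoscale-status
    ceph osd pool set rbd-ec82-pool pg_num 512

but that kicks off exactly the PG splitting I'd like to avoid while the copy is running.)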
Examining the main directory structure, it looks like there are ~7000 files per directory,
about 60% of which are <1MiB, totaling nearly 5GiB per directory in all.
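(Rough numbers, pulled with something along these lines against a sample directory; the path
is a placeholder.)

    find /mnt/rbd/somedir -type f | wc -l                  # total files per directory
    find /mnt/rbd/somedir -type f -size -1048576c | wc -l  # files under 1 MiB (byte count avoids find's 1M rounding)
    du -sh /mnt/rbd/somedir                                 # total size per directory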
My fstab for this is:
xfs _netdev,noatime 0 0
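(Device and mount point got trimmed above; for context, a hypothetical full entry using the
udev symlink that rbdmap creates would look like the following, with pool, image, and mount
point as placeholders.)

    /dev/rbd/<pool>/<image>   /mnt/rbd   xfs   _netdev,noatime   0 0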
I tried to increase the read_ahead_kb to 4M from 128K at
/sys/block/rbd0/queue/read_ahead_kb to match the object/stripe size of the EC pool, but
that doesn't appear to have had much of an impact.
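(That was just a sysfs write, roughly the following; the value is in KiB, and it resets
whenever the image is remapped.)

    echo 4096 > /sys/block/rbd0/queue/read_ahead_kb   # 4096 KiB = 4 MiB readahead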
The only other thing I can think of to try would be increasing the queue depth in the rbdmap
up from 128, so that's my next bullet to fire.
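(That would look something like the following at map time, or the equivalent options= entry
in /etc/ceph/rbdmap; pool/image are placeholders and 1024 is just a guess at a sensible step up.)

    rbd map <pool>/<image> -o queue_depth=1024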
Attaching xfs_info in case there are any useful nuggets:
meta-data=/dev/rbd0              isize=256    agcount=81, agsize=268435455 blks
         =                       sectsz=512   attr=2, projid32bit=0
         =                       crc=0        finobt=0, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=21483470848, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=0
log      =internal log           bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
And rbd info:
rbd image 'rbd-image-name':
    size 85 TiB in 22282240 objects
    order 22 (4 MiB objects)
    snapshot_count: 0
    id: a09cac2b772af5
    data_pool: rbd-ec82-pool
    block_name_prefix: rbd_data.29.a09cac2b772af5
    format: 2
    features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, data-pool
    op_features:
    flags:
    create_timestamp: Mon Apr 12 18:44:38 2021
    access_timestamp: Mon Apr 12 18:44:38 2021
    modify_timestamp: Mon Apr 12 18:44:38 2021
Any other ideas or hints are greatly appreciated.
Thanks,
Reed
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io