Hi Frank,
out of curiosity, can you share the recovery rates you are seeing?
I would appreciate it, thanks!
On 12/03 09:44, Frank Schilder wrote:
Hi Janne,
I looked at it already. The recovery rate is unbearably slow and I would like to increase
it. The % of misplaced objects is decreasing unnecessarily slowly.
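(For reference, I am tracking this with something like:

    watch -n 10 ceph -s

and reading the recovery objects/sec and the misplaced percentage from the status output.)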
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Janne Johansson <icepic.dz(a)gmail.com>
Sent: 03 December 2020 10:41:29
To: Frank Schilder
Cc: ceph-users(a)ceph.io
Subject: Re: [ceph-users] Increase number of objects in flight during recovery
On Thu, 3 Dec 2020 at 10:11, Frank Schilder <frans@dtu.dk> wrote:
I have the opposite problem as discussed in "slow down keys/s in recovery": I need to
increase the number of objects in flight during rebalance. All remapped PGs are already
in state backfilling, but it looks like no more than 8 objects/sec are transferred per PG
at a time. The pool sits on high-performance SSDs and could easily handle a transfer of
100 or more objects/sec simultaneously. Is there any way to increase the number of
transfers/sec or the number of simultaneous transfers? Increasing the options
osd_max_backfills and osd_recovery_max_active has no effect.
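For concreteness, this is roughly what I tried at runtime (the values here are just
examples, not a recommendation):

    ceph config set osd osd_max_backfills 8
    ceph config set osd osd_recovery_max_active 16

    # or injected without persisting:
    ceph tell osd.* injectargs '--osd_max_backfills 8 --osd_recovery_max_active 16'

Neither change had a visible effect on the per-PG transfer rate.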
Background: The pool in question (con-fs2-meta2) is the default data pool of a CephFS
file system and exclusively stores the kind of metadata that goes into this pool.
Storage consumption is reported as 0, but the number of objects is huge:
I don't run CephFS, so it might not map 100%, but I think that pools in which Ceph
stores file/object metadata (radosgw pools in my case) will show completely "false"
numbers while recovering. I think this is because there are tons of metadata entries
attached to 0-sized objects. Recovery then looks like it is doing one object per second
or so, while in fact it is moving hundreds of metadata entries for that one object; the
recovery counters just don't show this. It also made old ceph df and rados df say "this
pool is almost empty", but when you try to dump or move the pool, it takes far longer
than moving an almost-empty pool should, and the pool dump gets huge.
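If you want to sanity-check this, a rough sketch (pool name taken from your case; the
object name is a placeholder):

    # list a few of the 0-sized objects in the pool
    rados -p con-fs2-meta2 ls | head

    # count the omap keys / xattrs attached to one of them
    rados -p con-fs2-meta2 listomapkeys <object-name> | wc -l
    rados -p con-fs2-meta2 listxattr <object-name>

If those counts are large, recovery is moving far more per object than the object
counter suggests.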
I would take a look at the iostat output for those OSD drives and see whether they are
actually doing 8 IOPS or a lot more.
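Something along these lines, with the device names replaced by your actual OSD data
devices:

    iostat -x 1 sdc sdd

and then compare r/s and w/s against the ~8 objects/sec reported by the recovery status.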
--
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
--
David Caro
SRE - Cloud Services
Wikimedia Foundation <https://wikimediafoundation.org/>
PGP Signature: 7180 83A2 AC8B 314F B4CE 1171 4071 C7E1 D262 69C3
"Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment."