Yes, octopus. -- Frank
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Szabo, Istvan (Agoda) <Istvan.Szabo(a)agoda.com>
Sent: Wednesday, December 13, 2023 6:13 AM
To: Frank Schilder; ceph-users(a)ceph.io
Subject: Re: [ceph-users] Re: increasing number of (deep) scrubs
Hi,
You are on octopus right?
Istvan Szabo
Staff Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@agoda.com<mailto:istvan.szabo@agoda.com>
---------------------------------------------------
________________________________
From: Frank Schilder <frans(a)dtu.dk>
Sent: Tuesday, December 12, 2023 7:33 PM
To: ceph-users(a)ceph.io <ceph-users(a)ceph.io>
Subject: [ceph-users] Re: increasing number of (deep) scrubs
________________________________
Hi all,
if you follow this thread, please see the update in "How to configure something like
osd_deep_scrub_min_interval?"
(
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/YUHWQCDAKP5…).
I found out how to tune the scrub machine and I posted a quick update in the other thread,
because the solution was not to increase the number of scrubs, but to tune parameters.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Frank Schilder
Sent: Monday, January 9, 2023 9:14 AM
To: Dan van der Ster
Cc: ceph-users(a)ceph.io
Subject: Re: [ceph-users] increasing number of (deep) scrubs
Hi Dan,
thanks for your answer. I don't have a problem with increasing osd_max_scrubs (=1 at
the moment) as such. I would simply prefer a somewhat finer-grained way of controlling
scrubbing than doubling or tripling it right away.
Some more info: these 2 pools are data pools for a large FS. Unfortunately, we have a
large percentage of small files, which is a pain for recovery and seemingly also for deep
scrubbing. Our OSDs are about 25% used, and I already had to increase the warning interval
to 2 weeks. With all the warning grace parameters, this means we manage to deep scrub
everything about once a month. I need to plan for 75% utilisation, and a 3-month period is
a bit far on the risky side.
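For context, assuming the warning interval above refers to osd_deep_scrub_interval (from which the "not deep-scrubbed in time" warning is derived), the 2-week setting could be applied at runtime roughly like this; the values are illustrative, not recommendations:

```shell
# 2 weeks in seconds; applies cluster-wide to all OSDs at runtime.
ceph config set osd osd_deep_scrub_interval 1209600
# Keep the hard cap for shallow scrubs consistent with it.
ceph config set osd osd_scrub_max_interval 1209600
```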
A large percentage of our data is cold. Client reads will not do the check for us; we
need to combat bit-rot proactively.
The reasons I'm interested in parameters that initiate more scrubs, while also converting
more of them into deep scrubs, are that
1) scrubs seem to complete very fast. I almost never catch a PG in state
"scrubbing"; I usually only see "deep scrubbing".
2) I suspect the low deep-scrub count is due to a low number of deep scrubs being
scheduled, not due to conflicting per-OSD deep-scrub reservations. With our OSD count and
the distribution over 12 servers, I would expect a peak of at least 50% of OSDs being
active in scrubbing instead of the 25% peak I'm seeing now. It ought to be possible to
schedule more PGs for deep scrub than currently are.
3) Every OSD having only 1 deep scrub active seems to have no measurable impact on user
IO. If I could just get more PGs scheduled, with 1 deep scrub per OSD, it would already
help a lot. Once this is working, I can eventually increase osd_max_scrubs as the OSDs
fill up. For now, I would just like (deep) scrub scheduling to look a bit harder and
schedule more eligible PGs per unit of time.
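To quantify points 1) and 2), one can count shallow vs deep scrub states from `ceph pg dump pgs_brief`, where column 2 is the PG state. A small sketch; the canned sample lines (with made-up PG IDs) stand in for real output so it runs without a cluster:

```shell
# On a live cluster, replace the printf with:
#   ceph pg dump pgs_brief 2>/dev/null
printf '%s\n' \
  '14.3f active+clean' \
  '14.40 active+clean+scrubbing' \
  '15.7a active+clean+scrubbing+deep' \
  '15.7b active+clean+scrubbing+deep' |
awk '$2 ~ /scrubbing\+deep/ { deep++ }
     $2 ~ /scrubbing/ && $2 !~ /deep/ { shallow++ }
     END { printf "shallow=%d deep=%d\n", shallow, deep }'
```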
If we can get deep scrubbing up to an average of 42 PGs completing per hour while keeping
osd_max_scrubs=1 to maintain the current IO impact, we should be able to complete a deep
scrub with 75% full OSDs in about 30 days. This is the current tail-time with 25%
utilisation. I believe a deep scrub of a PG in these pools currently takes 2-3 hours. It's
just a gut feeling from some repair and deep-scrub commands; I would need to check the
logs for more precise numbers.
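One way to read these numbers, as a back-of-envelope check: assuming per-PG deep-scrub time grows roughly 3x going from 25% to 75% fill, and taking the 2024 + 8192 PGs of the two pools:

```shell
# How long a full deep-scrub cycle would take at 75% fill, if 42 PGs/hour
# can complete at the current (25% fill) per-PG speed.
awk 'BEGIN {
  pgs  = 2024 + 8192  # PGs in the two FS data pools
  rate = 42           # completions/hour at the current per-PG speed
  fill = 3            # 75% vs 25% full: roughly 3x data, 3x time per PG
  hours = pgs / (rate / fill)
  printf "%.0f hours ~= %.1f days for a full deep-scrub cycle\n", hours, hours / 24
}'
```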
Increasing osd_max_scrubs would then be a further option, and not the only one, to push
for more deep scrubbing. My expectation is that values of 2-3 are fine due to the
increasingly high percentage of cold data, for which no interference with client IO will
happen.
Hope that makes sense and there is a way beyond bumping osd_max_scrubs to increase the
number of scheduled and executed deep scrubs.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Dan van der Ster <dvanders(a)gmail.com>
Sent: 05 January 2023 15:36
To: Frank Schilder
Cc: ceph-users(a)ceph.io
Subject: Re: [ceph-users] increasing number of (deep) scrubs
Hi Frank,
What is your current osd_max_scrubs, and why don't you want to increase it?
With 8+2 and 8+3 pools, each scrub occupies the scrub slot on 10 or 11
OSDs, so at a minimum it could take 3-4x as long to scrub
the data as it would for replicated pools.
If you want scrubs to complete in time, you need to increase the
number of scrub slots accordingly.
On the other hand, IMHO the 1-week deadline for deep scrubs is often
much too ambitious for large clusters -- increasing the scrub
intervals is one solution, or I find it simpler to increase
mon_warn_pg_not_scrubbed_ratio and mon_warn_pg_not_deep_scrubbed_ratio
until you find a ratio that works for your cluster.
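A sketch of the ratio approach; the defaults are 0.5 and 0.75 to my knowledge, and the raised values below are examples only, to be tuned per cluster:

```shell
# Warn only once a PG exceeds (1 + ratio) x its configured scrub interval.
ceph config set mon mon_warn_pg_not_scrubbed_ratio 1.0
ceph config set mon mon_warn_pg_not_deep_scrubbed_ratio 1.5
```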
Of course, all of this can impact detection of bit-rot, which can anyway
be covered by client reads if most data is accessed periodically.
But if the cluster is mostly idle or objects are generally not read,
then it would be preferable to increase the scrub slots via osd_max_scrubs.
Cheers, Dan
On Tue, Jan 3, 2023 at 2:30 AM Frank Schilder <frans(a)dtu.dk> wrote:
Hi all,
we are using 16T and 18T spinning drives as OSDs, and I'm observing that they are not
scrubbed as often as I would like. It looks like too few scrubs are scheduled for these
large OSDs. My estimate is as follows: we have 852 spinning OSDs backing an 8+2 pool with
2024 PGs and an 8+3 pool with 8192 PGs. On average I see something like 10 PGs of pool 1
and 12 PGs of pool 2 (deep) scrubbing. This amounts to only 232 out of 852 OSDs scrubbing,
and it seems to be due to a conservative rate of (deep) scrubs being scheduled. The PGs
(deep) scrub fairly quickly.
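The 232-OSD figure follows from the EC widths, since each scrubbing PG occupies one slot on every OSD holding one of its shards; a quick check of the arithmetic:

```shell
# 10 scrubbing PGs x 10 shards (8+2) + 12 scrubbing PGs x 11 shards (8+3)
awk 'BEGIN {
  busy  = 10 * (8+2) + 12 * (8+3)  # scrubbing PGs x shards per PG
  total = 852
  printf "%d of %d OSDs busy (%.0f%%)\n", busy, total, 100 * busy / total
}'
```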
I would like to gently increase the number of scrubs scheduled for these drives and *not*
the number of concurrent scrubs per OSD. I'm looking at parameters like:
osd_scrub_backoff_ratio
osd_deep_scrub_randomize_ratio
I'm wondering if lowering osd_scrub_backoff_ratio to 0.5 and, maybe, increasing
osd_deep_scrub_randomize_ratio to 0.2 would have the desired effect? Are there other
parameters to look at that allow gradual changes in the number of scrubs going on?
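For completeness, the experiment described above would look something like this; the 0.5 / 0.2 values are the ones proposed here, not tested recommendations (the defaults are 0.66 and 0.15, as far as I know):

```shell
# Back off on fewer scheduling ticks (default 0.66 skips ~66% of ticks):
ceph config set osd osd_scrub_backoff_ratio 0.5
# Promote a larger fraction of scheduled scrubs to deep scrubs (default 0.15):
ceph config set osd osd_deep_scrub_randomize_ratio 0.2
```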
Thanks a lot for your help!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
________________________________