Hi All
>On 09/10/2019 09:07, Florian Haas wrote:
>[...]
>the question about resharding the usage log still stands. (The untrimmed usage log, in my case, would have blown past the old 2M-keys threshold, too.)
>
>Cheers, Florian
Is there any new wisdom about resharding the usage log for a single user? Since Nautilus we get a HEALTH_WARN about three weeks into each month, because the usage data of one single user reaches the large-omap warning threshold - which I have already increased to 1M keys. At the start of each month we truncate the usage data, so we are safe again for a while.
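For reference, the monthly truncation is nothing fancy - roughly the standard per-user usage trim; a minimal sketch (user ID and date below are placeholders for our values):

  # drop the accumulated usage records for one user up to the given date
  radosgw-admin usage trim --uid=<user-id> --end-date=2020-02-01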
Cheers,
ingo
--
Ingo Reimann
Teamleiter Technik
[ https://www.dunkel.de/ ]
Dunkel GmbH
Philipp-Reis-Straße 2
65795 Hattersheim
Fon: +49 6190 889-100
Fax: +49 6190 889-399
eMail: support(a)dunkel.de
http://www.Dunkel.de/ Amtsgericht Frankfurt/Main
HRB: 37971
Geschäftsführer: Axel Dunkel
Ust-ID: DE 811622001
Hi everyone,
Quick reminder that the early-bird registration for Cephalocon Seoul (Mar
3-5) ends tonight! We also have the hotel booking link and code up on the
site (finally--sorry for the delay).
https://ceph.io/cephalocon/seoul-2020/
Hope to see you there!
sage
Hi,
As I said in an earlier mail to this list, we rebalanced ~60% of the
CephFS metadata pool to NVMe-backed devices: roughly 422 million
objects (1.2 billion including replicas), with 512 PGs allocated to the
pool. While rebalancing we suffered from quite a few SLOW_OPS. Memory,
CPU and device IOPS capacity were not limiting factors as far as we can
tell (plenty available ... nowhere near max capacity). We saw quite a
few slow ops with the following events:
"time": "2019-12-19 09:41:02.712010",
"event": "reached_pg"
},
{
"time": "2019-12-19 09:41:02.712014",
"event": "waiting for rw locks"
},
{
"time": "2019-12-19 09:41:02.881939",
"event": "reached_pg"
... and this repeated hundreds of times, taking ~30 seconds for the op to complete
Does this indicate PG lock contention?
If so ... would we need to provide more PGs to the metadata pool to avoid this?
The metadata pool is only ~166 MiB in size ... but with loads of omap
data. Most advice on PG planning is concerned with the _amount_ of
data, but the metadata pool (and this might also be true for RGW index
pools) seems to be a special case.
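In case it helps others reading along: if PG lock contention is indeed the issue, bumping the PG count on the metadata pool would be something like the following (pool name and target value are just examples, not a recommendation):

  ceph osd pool set cephfs_metadata pg_num 1024
  # on releases before Nautilus, pgp_num has to be raised separately
  ceph osd pool set cephfs_metadata pgp_num 1024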
Thanks for your insights,
Gr. Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info(a)bit.nl
Hi,
After the upgrade to 13.2.8 deep-scrub has a big impact on client IO:
loads of SLOW_OPS and high latency. We hardly ever had SLOW_OPS, but
since the upgrade the impact is so big that we even have OSDs marking
each other out (OSD op thread timeout) multiple times during the scrub
window. Plenty of CPU / RAM / IOPS left, hardly any load on these OSD
servers. Has anything changed in this release that could explain
this behaviour?
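For what it's worth, the knobs we would look at first to soften the scrub impact are roughly these (example values only, not what we actually run):

  ceph config set osd osd_max_scrubs 1          # concurrent scrubs per OSD
  ceph config set osd osd_scrub_sleep 0.1       # seconds to sleep between scrub chunks
  ceph config set osd osd_scrub_begin_hour 23   # confine (deep-)scrubs to a quiet window
  ceph config set osd osd_scrub_end_hour 6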
Besides this, the impact of rebalancing is very severe as well. With only
the balancer remapping a couple of PGs at a time there are loads of
(MDS_)SLOW_OPS. This morning the cephfs metadata pool got rebalanced ...
and that triggered a lot of SLOW_OPS. One particular OSD was pegged at
1000% CPU for more than half an hour (not doing that much IO): that's 10
cores going full throttle! After a restart this issue was gone.
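Similarly, to soften the rebalance impact we would experiment with throttling recovery / backfill along these lines (again just example values):

  ceph config set osd osd_max_backfills 1          # concurrent backfills per OSD
  ceph config set osd osd_recovery_max_active 1    # concurrent recovery ops per OSD
  ceph config set osd osd_recovery_sleep_hdd 0.1   # seconds between recovery ops on HDD OSDs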
Thanks,
Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info(a)bit.nl
Hi,
We upgraded our cluster from Jewel to Luminous, and it turned out that more than 80% of the objects are misplaced. Since our cluster holds 130 TB of data, backfilling seems to take forever. We didn't modify the CRUSH map at all. Any thoughts on this issue?
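If it helps to narrow this down: would the CRUSH tunables profile be the place to look, i.e. the output of the following?

  ceph osd crush show-tunables   # prints the tunables profile currently in effect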
br,
Xu Yun
It seems that people are now split between the new and old list servers.
Regardless of which one, I am missing a number of messages that
appear on the archive pages but never seem to make it to my inbox. And no,
they are not in my junk folder. I wonder if some of my questions are not
getting a response because people don't receive them. Any other reason,
like people choosing not to answer, is perfectly acceptable. Does anyone else
have difficulty communicating with the user list from a Gmail account?
Hi,
Pretty certain not. I hit that exact issue. The workaround suggested to
me was an init container running as root to change the ownership. That
works OK, but is very hacky.
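In case it is useful: the init container essentially just runs something like this against the mounted volume before the main container starts (the path and IDs are placeholders - use whatever your workload expects):

  # executed once, as root, from an init container
  chown -R 1000:1000 /mnt/cephfs-volume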
--
Kevin Thorpe
VP of Enterprise Platform
W | www.predictx.com
P | +44 (0)20 3005 6750 | +44 (0)808 204 0344
A | 7th Floor, 52 Grosvenor Gardens, London SW1W 0AU
On Mon, 20 Jan 2020 at 16:47, Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
> Is it possible to mount a cephfs with a specific uid or gid? To make it
> available to a 'non-root' user?
>
Hi,
Is there any logic / filtering for which PGs to backfill at any given
time that takes into account the OSD a PG lives on?
Our cluster is currently backfilling a complete pool (512 PGs), and of
the 7 PGs in active+remapped+backfilling, 4 are on the same OSD. This
stresses that OSD far more than needed. It would be nice if the
selection criteria for which PGs to backfill (and / or recover)
included the OSD as a criterion, in order to spread the load across
different OSDs.
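For reference, this is roughly how we spot the overlap - the same OSD keeps showing up in the up / acting sets of the backfilling PGs:

  ceph pg dump pgs_brief 2>/dev/null | grep backfilling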
Gr. Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info(a)bit.nl