Hi All
>On 09/10/2019 09:07, Florian Haas wrote:
>[...]
>the question about resharding the usage log still stands. (The untrimmed usage log, in my case, would have blown past the old 2M-keys threshold, too.)
>
>Cheers, Florian
Is there any new wisdom about resharding the usage log for a single user? Since Nautilus we get a HEALTH_WARN about three weeks into each month, because the usage data of one single user reaches the large-omap warning threshold - which I have already increased to 1M keys. At the start of each month we truncate the usage data, so we are safe again for a while.
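For reference, the monthly truncation is nothing fancy - roughly the standard per-user usage trim; a minimal sketch (user ID and date below are placeholders for our values):

  # drop the accumulated usage records for one user up to the given date
  radosgw-admin usage trim --uid=<user-id> --end-date=2020-02-01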
Cheers,
ingo
--
Ingo Reimann
Teamleiter Technik
[ https://www.dunkel.de/ ]
Dunkel GmbH
Philipp-Reis-Straße 2
65795 Hattersheim
Fon: +49 6190 889-100
Fax: +49 6190 889-399
eMail: support(a)dunkel.de
http://www.Dunkel.de/ Amtsgericht Frankfurt/Main
HRB: 37971
Geschäftsführer: Axel Dunkel
Ust-ID: DE 811622001
Hi everyone,
Quick reminder that the early-bird registration for Cephalocon Seoul (Mar
3-5) ends tonight! We also have the hotel booking link and code up on the
site (finally--sorry for the delay).
https://ceph.io/cephalocon/seoul-2020/
Hope to see you there!
sage
Hi,
As I said in an earlier mail to this list, we rebalanced ~60% of the
CephFS metadata pool to NVMe-backed devices: roughly 422 million
objects (1.2 billion including replicas), with 512 PGs allocated to the
pool. While rebalancing we suffered from quite a few SLOW_OPS. Memory,
CPU and device IOPS capacity were not limiting factors as far as we can
tell (plenty available ... nowhere near max capacity). We saw quite a
few slow ops with the following events:
"time": "2019-12-19 09:41:02.712010",
"event": "reached_pg"
},
{
"time": "2019-12-19 09:41:02.712014",
"event": "waiting for rw locks"
},
{
"time": "2019-12-19 09:41:02.881939",
"event": "reached_pg"
... and this repeated hundreds of times, taking ~30 seconds for the op to complete
Does this indicate PG lock contention?
If so ... would we need to provide more PGs to the metadata pool to avoid this?
The metadata pool is only ~166 MiB in size ... but with loads of omap
data. Most advice on PG planning is concerned with the _amount_ of
data, but the metadata pool (and this might also be true for RGW index
pools) seems to be a special case.
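In case it helps others reading along: if PG lock contention is indeed the issue, bumping the PG count on the metadata pool would be something like the following (pool name and target value are just examples, not a recommendation):

  ceph osd pool set cephfs_metadata pg_num 1024
  # on releases before Nautilus, pgp_num has to be raised separately
  ceph osd pool set cephfs_metadata pgp_num 1024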
Thanks for your insights,
Gr. Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info(a)bit.nl
Hi,
After the upgrade to 13.2.8 deep-scrub has a big impact on client IO:
loads of SLOW_OPS and high latency. We hardly ever had SLOW_OPS, but
since the upgrade the impact is so big that we even have OSDs marking
each other out (OSD op thread timeout) multiple times during the scrub
window. Plenty of CPU / RAM / IOPS left, hardly any load on these OSD
servers. Has anything changed in this release that could explain
this behaviour?
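For what it's worth, the knobs we would look at first to soften the scrub impact are roughly these (example values only, not what we actually run):

  ceph config set osd osd_max_scrubs 1          # concurrent scrubs per OSD
  ceph config set osd osd_scrub_sleep 0.1       # seconds to sleep between scrub chunks
  ceph config set osd osd_scrub_begin_hour 23   # confine (deep-)scrubs to a quiet window
  ceph config set osd osd_scrub_end_hour 6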
Besides this, the impact of rebalancing is very severe as well. With only
the balancer remapping a couple of PGs at a time there are loads of
(MDS_)SLOW_OPS. This morning the cephfs metadata pool got rebalanced ...
and that triggered a lot of SLOW_OPS. One particular OSD was pegged at
1000% CPU for more than half an hour (not doing that much IO): that's 10
cores going full throttle! After a restart this issue was gone.
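Similarly, to soften the rebalance impact we would experiment with throttling recovery / backfill along these lines (again just example values):

  ceph config set osd osd_max_backfills 1          # concurrent backfills per OSD
  ceph config set osd osd_recovery_max_active 1    # concurrent recovery ops per OSD
  ceph config set osd osd_recovery_sleep_hdd 0.1   # seconds between recovery ops on HDD OSDs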
Thanks,
Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info(a)bit.nl
Hi,
We upgraded our cluster from Jewel to Luminous, and it turned out that more than 80% of the objects are misplaced. Since our cluster holds 130 TB of data, backfilling seems to take forever. We didn't modify the CRUSH map at all. Any thoughts on this issue?
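If it helps to narrow this down: would the CRUSH tunables profile be the place to look, i.e. the output of the following?

  ceph osd crush show-tunables   # prints the tunables profile currently in effect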
br,
Xu Yun
It seems that people are now split between the new and old list servers.
Regardless of which one, I am missing a number of messages that
appear on the archive pages but never seem to make it to my inbox. And no,
they are not in my junk folder. I wonder if some of my questions are not
getting a response because people don't receive them. Any other reason,
like people choosing not to answer, is perfectly acceptable. Does anyone else
have difficulty communicating with the user list from a Gmail account?
Hi,
Pretty certain not. I hit that exact issue. The workaround suggested to
me was an init container running as root to change the ownership. That
works OK, but is very hacky.
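In case it is useful: the init container essentially just runs something like this against the mounted volume before the main container starts (the path and IDs are placeholders - use whatever your workload expects):

  # executed once, as root, from an init container
  chown -R 1000:1000 /mnt/cephfs-volume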
--
Kevin Thorpe
VP of Enterprise Platform
W | www.predictx.com
P | +44 (0)20 3005 6750 | +44 (0)808 204 0344
A | 7th Floor, 52 Grosvenor Gardens, London SW1W 0AU
On Mon, 20 Jan 2020 at 16:47, Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
> Is it possible to mount a cephfs with a specific uid or gid? To make it
> available to a 'non-root' user?
>
Hi,
Is there any logic / filtering for which PGs to backfill at any given
time that takes into account the OSD a PG lives on?
Our cluster is currently backfilling a complete pool (512 PGs), and of
the 7 PGs in active+remapped+backfilling, 4 are on the same OSD. This
stresses that OSD far more than needed. It would be nice if the
selection criteria for which PGs to backfill (and / or recover)
included the OSD as a criterion, in order to spread the load across
different OSDs.
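For reference, this is roughly how we spot the overlap - the same OSD keeps showing up in the up / acting sets of the backfilling PGs:

  ceph pg dump pgs_brief 2>/dev/null | grep backfilling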
Gr. Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info(a)bit.nl