Hello, thank you for your response.
Erasure coding is getting better, and we really cannot afford the
storage overhead of 3x replication.
Anyway, as I understand it, the problem is also present with
replication, just less amplified (blocks are not divided between OSDs,
just replicated in full).
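To make the amplification concrete, here is a back-of-the-envelope model I put together (my own sketch with assumed values, not numbers measured on our cluster) of how bluestore_min_alloc_size interacts with EC chunking:

```python
import math

def allocated_bytes(obj_size, k, m, min_alloc=64 * 1024):
    """Space actually allocated for one object on an EC k+m pool,
    assuming each of the k data chunks (and m coding chunks) gets
    rounded up to min_alloc on its OSD."""
    chunk = math.ceil(obj_size / k)                      # bytes per data chunk
    per_osd = math.ceil(chunk / min_alloc) * min_alloc   # rounded to allocation unit
    return per_osd * (k + m)

def amplification(obj_size, k, m, min_alloc=64 * 1024):
    """Allocated size divided by the ideal EC footprint obj_size * (k+m)/k."""
    ideal = obj_size * (k + m) / k
    return allocated_bytes(obj_size, k, m, min_alloc) / ideal

# A 16 KiB object on a 4+2 pool with 64k min_alloc: each 4 KiB data
# chunk still consumes a full 64 KiB allocation on its OSD.
print(amplification(16 * 1024, 4, 2))                       # 16.0
# A 10 MiB object is barely affected:
print(round(amplification(10 * 1024 * 1024, 4, 2), 3))      # 1.0
```

Under this model the waste only matters for objects smaller than roughly min_alloc x k, which matches the big-files use case below being mostly safe.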
On 2021-02-02 16:50, Steven Pine wrote:
> You are unlikely to avoid the space amplification bug by using larger
> block sizes. I honestly do not recommend using an EC pool, it is
> generally less performant and EC pools are not as well supported by
> the ceph development community.
>
> On Tue, Feb 2, 2021 at 5:11 AM Gilles Mocellin
> <gilles.mocellin(a)nuagelibre.org> wrote:
>
>> Hello,
>>
>> As we know, with 64k for bluestore_min_alloc_size_hdd (I'm only
>> using HDDs), in certain conditions, especially with erasure coding,
>> space is wasted when writing objects smaller than 64k x k (EC: k+m).
>>
>> Every object is divided into k chunks, each written to a different OSD.
>>
>> My main use case is big (40 TB) RBD images mounted as XFS filesystems
>> on Linux servers, exposed to our backup software.
>> So it's mainly big files.
>>
>> My thought, but I'd like some other points of view, is that I could
>> deal with the amplification by using bigger block sizes on my XFS
>> filesystems, instead of reducing bluestore_min_alloc_size_hdd on all
>> OSDs.
>>
>> What do you think?
>
> --
>
> Steven Pine
>
> E steven.pine(a)webair.com | P 516.938.4100 x
>
> Webair | 501 Franklin Avenue Suite 200, Garden City NY, 11530
>
> webair.com
>
Hi,
After upgrading from 15.2.5 to 15.2.8, I see this health error.
Has anyone seen this? "ceph log last cephadm" doesn't show anything
about it. How can I trace it?
Thanks!
Tony
Hi,
I added a host with "ceph orch host add ceph-osd-5 10.6.10.84 ceph-osd".
I can see the host with "ceph orch host ls", but no devices are listed
by "ceph orch device ls ceph-osd-5". I tried "ceph orch device zap
ceph-osd-5 /dev/sdc --force", which works fine. Why are no devices
listed? What am I missing here?
Thanks!
Tony
Hi,
With 3 replicas, a PG has 3 OSDs. If all 3 of those OSDs are down,
the PG becomes unknown. Is that right?
If those 3 OSDs are replaced and marked in and up, will that PG
eventually come back to active? Or does anything else have to be done
to fix it?
Thanks!
Tony
We have a fairly old cluster that has over time been upgraded to Nautilus. While digging through some things, we found 3 bucket indexes without a corresponding bucket. They should have been deleted but were somehow left behind. When we try to delete a bucket index, it is not allowed because the bucket is not found. The bucket index list command works fine without the bucket, though. Is there a way to delete the indexes? Or maybe somehow relink the bucket so it can be deleted again?
Thanks,
Kevin
Hi Davor,
Use "ceph orch ls osd --format yaml" to get more info about the problems
deploying the OSD service; that will probably give you clues about what is
happening. Share the output if you cannot solve the problem :-)
The same command can be used for other services like the node-exporter,
although in that case I think that the problem was a bug fixed a few days
ago.
https://github.com/ceph/ceph/pull/38946
The fix was backported to pacific last week.
BR
--
Juan Miguel Olmo Martínez
Senior Software Engineer
Red Hat <https://www.redhat.com/>
jolmomar(a)redhat.com
<https://www.redhat.com/>
Hi Eugen Block,
Useful tips for creating OSDs:
1. Check devices availability in your cluster hosts:
# ceph orch device ls
2. Devices not available:
This usually means that you have created LVs on these devices (I mean the
devices are not cleaned). A "ceph orch zap <device>" will fix that.
3. The OSD does not start. Check its status with:
ceph orch ls osd --format yaml
--
Juan Miguel Olmo Martínez
Senior Software Engineer
Red Hat <https://www.redhat.com/>
jolmomar(a)redhat.com
<https://www.redhat.com/>
Hi
I've got an old cluster running ceph 10.2.11 with filestore backend. Last
week a PG was reported inconsistent with a scrub error
# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 38.20 is active+clean+inconsistent, acting [1778,1640,1379]
1 scrub errors
I first tried 'ceph pg repair' but nothing seemed to happen, then
# rados list-inconsistent-obj 38.20 --format=json-pretty
showed that the problem was on OSD 1379. The logs showed that that OSD
had read errors, so I decided to mark it out for replacement. Later on I
removed it from the crush map and deleted the OSD. My thought was that
the missing replica would get backfilled onto another OSD and everything
would be OK again. The PG got another OSD assigned, but the health error
stayed:
# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 38.20 is active+clean+inconsistent, acting [1778,1640,1384]
1 scrub errors
Now I get an error on:
# rados list-inconsistent-obj 38.20 --format=json-pretty
No scrub information available for pg 38.20
error 2: (2) No such file or directory
And if I try
# ceph pg deep-scrub 38.20
instructing pg 38.20 on osd.1778 to deep-scrub
the deep scrub does not get scheduled. The same goes for
# ceph daemon osd.1778 trigger_scrub 38.20
on the storage node.
Nothing appears in the logs concerning the scrubbing of PG 38.20. I do
see in the logs that other PGs get (deep) scrubbed according to the
automatic scheduling.
There is no recovery going on, but just to be sure I set
# ceph daemon osd.1778 config set osd_scrub_during_recovery true
Also, the load limit is set way higher than the actual system load.
I checked the other OSDs and there are no scrubs going on on them when I
schedule the deep scrub.
I found some reports of people who had the same problem, but no solution
was found (for example https://tracker.ceph.com/issues/15781). Even in
mimic and luminous there were similar cases.
- Does anyone know which logging I should increase in order to get more
information on why my deep scrub does not get scheduled?
- Is there a way in jewel to see the list of scheduled scrubs and their
dates for an OSD?
- Does anyone have advice on how to proceed in clearing this PG error?
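On the logging question, this is the sort of thing I would try first (a sketch of jewel-era debug knobs, not a verified recipe; the subsystem levels you actually need may differ):

```ini
; ceph.conf fragment on the storage node hosting osd.1778, or set the
; same options at runtime, e.g.:
;   ceph daemon osd.1778 config set debug_osd 20/20
[osd]
debug osd = 20/20        ; scrub scheduling decisions show up at high debug_osd levels
debug filestore = 10/10  ; backend activity, relevant for a filestore cluster
```

Remember to lower the levels again afterwards; debug_osd 20 is extremely verbose.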
Thanks for any help
Marcel
Hello everyone,
Could someone please let me know which modern kernel disk scheduler is recommended for SSD and HDD OSDs? The information in the manuals is pretty dated and refers to schedulers which have been deprecated in recent kernels.
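Not an authoritative answer, but on blk-mq kernels the scheduler is visible and settable per device through sysfs. A defensive sketch for inspecting what you currently have (device names will differ on your hosts):

```shell
#!/bin/sh
# Print the rotational flag and the scheduler line (the active one is
# shown in brackets) for every block device that exposes a scheduler.
# Tolerates hosts/containers with no such devices.
for sched in /sys/block/*/queue/scheduler; do
    [ -e "$sched" ] || continue
    dev=$(basename "$(dirname "$(dirname "$sched")")")
    rot=$(cat "${sched%scheduler}rotational" 2>/dev/null)
    printf '%s rotational=%s scheduler=%s\n' "$dev" "$rot" "$(cat "$sched")"
done
# Changing it at runtime (not persistent across reboots), e.g. for an HDD:
#   echo mq-deadline > /sys/block/sdX/queue/scheduler
```

With the legacy schedulers (cfq, deadline, noop) gone from recent kernels, the usual blk-mq choices are none, mq-deadline, bfq, and kyber; a common starting point is none for NVMe/SSD and mq-deadline for HDD, then benchmark with your own workload.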
Thanks
Andrei
Hi,
we're mainly using CephFS to give access to storage.
At all times we can see that all clients combined use "X MiB/s" and "Y
op/s" for read and write, using the CLI or the Ceph dashboard.
With a tool like iftop, I can get a bit of insight into which clients
most data 'flows' to, but it isn't really precise.
Is there any way to get MiB/s and op/s numbers per CephFS client?
Thanks,
Erwin