I observe the same issue after adding two new OSD hosts to an almost empty mimic cluster.
> Let's try to restrict discussion to the original thread
> "backfill_toofull while OSDs are not full" and get a tracker opened up
> for this issue.
Is this the issue you are referring to: https://tracker.ceph.com/issues/41255 ?
I have a number of larger rebalance operations ahead and will probably see this for a couple of days. If there is any information (logs etc.) I can provide, please let me know. Status right now is:
[root@ceph-01 ~]# ceph status
  cluster:
    id:     e4ece518-f2cb-4708-b00f-b6bf511e91d9
    health: HEALTH_ERR
            15227159/90990337 objects misplaced (16.735%)
            Degraded data redundancy (low space): 64 pgs backfill_toofull
            too few PGs per OSD (29 < min 30)

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs-1/1/1 up {0=ceph-12=up:active}, 1 up:standby-replay
    osd: 208 osds: 208 up, 208 in; 273 remapped pgs

  data:
    pools:   7 pools, 790 pgs
    objects: 9.45 M objects, 17 TiB
    usage:   21 TiB used, 1.4 PiB / 1.4 PiB avail
    pgs:     15227159/90990337 objects misplaced (16.735%)
             517 active+clean
             190 active+remapped+backfill_wait
             64  active+remapped+backfill_wait+backfill_toofull
             19  active+remapped+backfilling

  io:
    client:   893 KiB/s rd, 6.3 MiB/s wr, 208 op/s rd, 306 op/s wr
    recovery: 298 MiB/s, 156 objects/s
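For completeness, per-OSD fill levels (which should confirm that no OSD is anywhere near full) can be pulled with the standard command below; I can provide that output as well if it helps:

ceph osd df tree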
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi,
I have created OSDs on HDDs without putting the DB on a faster drive.
To improve performance, I now have a single 3.8 TB SSD.
I modified /etc/ceph/ceph.conf by adding this in [global]:
bluestore_block_db_size = 53687091200
This should create a RocksDB of 50 GiB (53687091200 = 50 × 1024³ bytes).
Then I tried to move the DB to a new, unformatted device (the SSD):
root@ld5505:~# ceph-bluestore-tool bluefs-bdev-new-db –-path
/var/lib/ceph/osd/ceph-76 --dev-target /dev/sdbk
too many positional options have been specified on the command line
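(I notice the first dash of –-path above may have been pasted as a Unicode en dash, which the option parser would treat as a positional argument and which would explain this error; with plain ASCII dashes the same command reads:

ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-76 --dev-target /dev/sdbk
)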
Checking the contents of /var/lib/ceph/osd/ceph-76, it appears that
there's no link to block.db:
root@ld5505:~# ls -l /var/lib/ceph/osd/ceph-76/
total 52
-rw-r--r-- 1 ceph ceph 418 Aug 27 11:08 activate.monmap
lrwxrwxrwx 1 ceph ceph 93 Aug 27 11:08 block ->
/dev/ceph-8cd045dc-9eb2-47ad-9668-116cf425a66a/osd-block-9c51bde1-3c75-4767-8808-f7e7b58b8f97
-rw-r--r-- 1 ceph ceph 2 Aug 27 11:08 bluefs
-rw-r--r-- 1 ceph ceph 37 Aug 27 11:08 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 Aug 27 11:08 fsid
-rw------- 1 ceph ceph 56 Aug 27 11:08 keyring
-rw-r--r-- 1 ceph ceph 8 Aug 27 11:08 kv_backend
-rw-r--r-- 1 ceph ceph 21 Aug 27 11:08 magic
-rw-r--r-- 1 ceph ceph 4 Aug 27 11:08 mkfs_done
-rw-r--r-- 1 ceph ceph 41 Aug 27 11:08 osd_key
-rw-r--r-- 1 ceph ceph 6 Aug 27 11:08 ready
-rw-r--r-- 1 ceph ceph 3 Aug 27 11:08 require_osd_release
-rw-r--r-- 1 ceph ceph 10 Aug 27 11:08 type
-rw-r--r-- 1 ceph ceph 3 Aug 27 11:08 whoami
root@ld5505:~# more /var/lib/ceph/osd/ceph-76/bluefs
1
Questions:
How can I add a DB device on this new SSD for every single existing OSD?
How can I increase the DB size later in case it's insufficient?
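Would something along these lines be the right direction? (A sketch only, assuming LVM is used to carve one logical volume per OSD out of the SSD; volume group and LV names are made up, and I guess the OSD has to be stopped while its DB is moved.)

# one 50 GB LV per OSD on the SSD
vgcreate db-vg /dev/sdbk
lvcreate -L 50G -n db-76 db-vg
# attach it as a new DB device
ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-76 --dev-target /dev/db-vg/db-76
# growing later: extend the LV, then let BlueFS pick up the new size
lvextend -L +10G /dev/db-vg/db-76
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-76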
THX
Hi,
I am using Ceph version 13.2.6 (Mimic) on a test setup, trying out CephFS.
My ceph health status is showing a warning.
"ceph health"
HEALTH_WARN Degraded data redundancy: 1197023/7723191 objects degraded
(15.499%)
"ceph health detail"
HEALTH_WARN Degraded data redundancy: 1197128/7723191 objects degraded
(15.500%)
PG_DEGRADED Degraded data redundancy: 1197128/7723191 objects degraded
(15.500%)
pg 2.0 is stuck undersized for 1076.454929, current state
active+undersized+
pg 2.2 is stuck undersized for 1076.456639, current state
active+undersized+
pg 2.3 is stuck undersized for 1076.456113, current state
active+undersized+
pg 2.7 is stuck undersized for 1076.456342, current state
active+undersized+
pg 2.8 is stuck undersized for 1076.455920, current state
active+undersized+
pg 2.a is stuck undersized for 1076.486412, current state
active+undersized+
pg 2.b is stuck undersized for 1076.485975, current state
active+undersized+
pg 2.f is stuck undersized for 1076.486953, current state
active+undersized+
pg 2.10 is stuck undersized for 1076.486763, current state
active+undersized
pg 2.12 is stuck undersized for 1076.486539, current state
active+undersized
pg 2.13 is stuck undersized for 1075.419199, current state
active+undersized
pg 2.17 is stuck undersized for 1076.455424, current state
active+undersized
pg 2.18 is stuck undersized for 1075.419639, current state
active+undersized
pg 2.1a is stuck undersized for 1076.455966, current state
active+undersized
pg 2.1b is stuck undersized for 1076.486677, current state
active+undersized
pg 2.1f is stuck undersized for 1076.455572, current state
active+undersized
How do I bring the health status back to OK?
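For reference, I can share the output of the usual checks. (I understand stuck undersized often means CRUSH cannot find enough OSDs for a pool's replica count, e.g. the pool size is larger than the number of hosts when the failure domain is host:)

ceph osd tree
ceph osd pool ls detail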
regards
Amudhan
We're mainly using CephFS with the CentOS/RHEL 7 kernel client, and I'm
pondering whether I should set "bluestore compression mode" to passive or
aggressive with this client to get compression on (preferably) only
compressible objects.
Is there any list of CephFS clients that send compressible hints?
If not, is there any other way to detect this, other than checking a client
by writing some compressible and some incompressible data?
The documentation is rather sparse on this matter, as far as I can see.
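The manual test I have in mind would look roughly like this (the pool name is just an example, and I am assuming the per-OSD BlueStore perf counters are the right place to look):

# passive compresses only data hinted compressible,
# aggressive compresses everything not hinted incompressible
ceph osd pool set cephfs_data compression_mode aggressive
# write some all-zero (compressible) data through the client, then see
# whether the compression counters move on one of the backing OSDs:
ceph daemon osd.0 perf dump | grep -i compress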
I was a little bit afraid that I would be deleting this snapshot without
any result. How do I fix this error (pg repair is not working)?
pg 17.36 is active+clean+inconsistent, acting [7,29,12]
2019-08-30 10:40:04.580470 7f9b3f061700 -1 log_channel(cluster) log
[ERR] : repair 17.36
17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:head : expected
clone 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4 1 missing
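In case it is useful, I can also post the detailed inconsistency report from the standard command (pg id taken from above):

rados list-inconsistent-obj 17.36 --format=json-pretty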
Hi,
I upgraded from Mimic to Nautilus a while ago and enabled the pg_autoscaler.
When pg_autoscaler was activated I got a HEALTH_WARN regarding:
POOL_TARGET_SIZE_BYTES_OVERCOMMITTED 1 subtrees have overcommitted pool target_size_bytes
Pools ['cephfs_data_reduced', 'cephfs_data', 'cephfs_metadata'] overcommit available storage by 1.460x due to target_size_bytes 0 on pools []
POOL_TARGET_SIZE_RATIO_OVERCOMMITTED 1 subtrees have overcommitted pool target_size_ratio
Pools ['cephfs_data_reduced', 'cephfs_data', 'cephfs_metadata'] overcommit available storage by 1.460x due to target_size_ratio 0.000 on pools []
Both target_size_bytes and target_size_ratio are set to 0 on all the pools, so I started to wonder why this error message appears.
My autoscale-status looks like this:
POOL                 SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
cephfs_metadata      16708M               4.0   34465G        0.0019                1.0   8                   warn
cephfs_data_reduced  15506G               2.0   34465G        0.8998                1.0   375                 warn
cephfs_data          6451G                3.0   34465G        0.5616                1.0   250                 warn
So the ratios sum to 1.4633 (0.0019 + 0.8998 + 0.5616).
Isn't a combined ratio of 1.0 across all pools equal to full?
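(For what it's worth, RATIO looks like SIZE × RATE / RAW CAPACITY, i.e. replicated bytes against raw capacity — e.g. for cephfs_data: 6451G × 3.0 / 34465G ≈ 0.5616 — so I assume a sum above 1.0 means the raw capacity would be overcommitted if every pool kept its current share, rather than that the cluster is full right now.)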
I also enabled the Dashboard and saw that the PG Status showed "645% clean" PGs.
This cluster was originally installed with Jewel, so could it be some legacy setting causing this?
Hi,
First, do not panic :)
Secondly, verify that the number of PGs per pool is suited to the workload and to the cluster.
Third, if I understood correctly, there were too few PGs in the pool, so the data sits essentially on a few PGs => wait for the data to be distributed correctly.
If possible, there is also pg-upmap for a finer distribution of PGs.
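A minimal sketch of enabling upmap balancing (this assumes all clients are at least Luminous, which must be declared first):

ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on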
On Thu, Aug 29, 2019 at 11:20 PM Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
>
> I have this error. I have found the rbd image with the
> block_name_prefix:1f114174b0dc51, how can I identify what snapshot this
> is? (Is it a snapshot?)
>
> 2019-08-29 16:16:49.255183 7f9b3f061700 -1 log_channel(cluster) log
> [ERR] : deep-scrub 17.36
> 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:head : expected
> clone 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4 1 missing
> 2019-08-29 16:24:54.806912 7f9b3f061700 -1 log_channel(cluster) log
> [ERR] : 17.36 deep-scrub 1 errors
If you already found the image, just "rbd snap ls <image name>". Is
there a snapshot with SNAPID 4?
Thanks,
Ilya
Hi,
I am trying to get detailed information about the RBD images used by
OpenStack (read/write operations, throughput, ...).
On the mailing list I found instructions that this is possible using an
admin socket of the client [1]. So I enabled the socket on one of my
hosts according to [2]. The manual states that the socket should be
there once I restart the VM. At some point it actually does appear, but
it vanishes within a second or two. If I keep monitoring the directory I
see it appearing for roughly 1-2 seconds per minute.
The socket looks like this:
root@compute01:/var/run/ceph/guests# ls -l
srwxr-xr-x 1 cinder cinder 0 Aug 29 17:54
ceph-client.cinder.2772108.94507439454256.asok
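During the brief window when the socket exists, I assume it can be queried in the usual way (the perf counters should include RBD read/write ops and bytes):

ceph --admin-daemon /var/run/ceph/guests/ceph-client.cinder.2772108.94507439454256.asok perf dump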
Does anyone know what I am doing wrong?
Or is there another way to get information about which RBD image is
causing the most load on a cluster?
Regards,
Georg
[1]
http://webcache.googleusercontent.com/search?q=cache%3Ahttp%3A%2F%2Flists.c…
By the way, the mail archive from before 2019 seems to be inaccessible;
I am using Google's cache as a fallback.
[2] https://docs.ceph.com/docs/mimic/rbd/rbd-openstack/#configuring-nova
Hi
A colleague and I are talking about organizing an event in Denmark for the
Danish Ceph community, and we would like to get a feel for how many Ceph
users there are in Denmark, and of those, who would be interested in a
Danish Ceph event.
Regards,
Torben