May 2021 - ceph-users - lists.ceph.io

by Reed Dier

I guess I should probably have been more clear, this is one pool of many, so the other OSDs aren't idle. So I don't necessarily think that the PG bump would be the worst thing to try, but its definitely not as bad as I may have made it sound. Thanks, Reed > On May 27, 2021, at 11:37 PM, Anthony D'Atri <anthony.datri(a)gmail.com> wrote: > > That gives you a PG ratio of …. 5.3 ??? > > Run `ceph osd df` ; I wouldn’t be surprised if some of your drives have 0 PGs on them, for sure I would suspect that they aren’t even at all. > > There are bottlenecks in the PG code, and in the OSD code — one reason why with NVMe clusters it’s common to split each drive into at least 2 OSDs. With spinners you don’t want to do that, but you get the idea. > > The pg autoscaler is usually out of its Vulcan mind. 512 would give you a ratio of just 21. > > Prior to 12.2.1 conventional wisdom was a PG ratio of 100-200 on spinners. > > 2048 PGs would give you a ratio of 85, which current (retconned) guidance would call good. I’d probably go to 4096 but 2048 would be way better than 128. > > I strongly suspect that PG splitting would still get you done faster than the way it is, esp. if you’re running BlueStore OSDs. > > Try bumping pg_num up to say 262 and see how bad it is, and if when pgp_num catches up if your ingest rate isn’t a bit higher than it was before. > >> EC8:2, across about 16 hosts, 240 OSDs, with 24 of those being 8TB 7.2k SAS, and the other 216 being 2TB 7.2K SATA. So there are quite a few spindles in play here. >> Only 128 PGs, in this pool, but its the only RBD image in this pool. Autoscaler recommends going to 512, but was hoping to avoid the performance overhead of the PG splits if possible, given perf is bad enough as is. > >

2 years, 11 months

1
0
0 0

Messed up placement of MDS

by mabi

Hello, I am trying to place the two MDS daemons for CephFS on dedicated nodes. For that purpose I tried out a few different "cephadm orch apply ..." commands with a label but at the end it looks like I messed up with the placement as I now have two mds service_types as you can see below: # ceph orch ls --service-type mds --export service_type: mds service_id: ceph1fs service_name: mds.ceph1fs placement: count: 2 hosts: - ceph1g - ceph1a --- service_type: mds service_id: label:mds service_name: mds.label:mds placement: count: 2 This second entry at the bottom seems totally wrong and I would like to remove it but I haven't found how to remove it totally. Any ideas? Ideally I just want to place two MDS daemons on node ceph1a and ceph1g. Regards, Mabi

2 years, 11 months

1
1
0 0

Ceph osd will not start.

by Peter Childs

I'm attempting to get get ceph up and running, and currently feel like I'm going around in circles. I'm attempting to use cephadm and Pacific, currently on debian buster, mostly because centos7 ain't supported any more and cenotos8 ain't support by some of my hardware. Anyway I have a few nodes with 59x 7.2TB disks but for some reason the osd daemons don't start, the disks get formatted and the osd are created but the daemons never come up. They are probably the wrong spec for ceph (48gb of memory and only 4 cores) but I was expecting them to start and be either dirt slow or crash later, anyway I've got upto 30 of them, so I was hoping on getting at least get 6PB of raw storage out of them. As yet I've not spotted any helpful error messages. This is for a archive / slow ceph cluster so I'm not expecting speed. Thanks in advance. Peter.

2 years, 11 months

4
8
0 0

rebalancing after node more

by Rok Jaklič

Hi, I have removed one node, but now ceph seems to stuck in: Degraded data redundancy: 67/2393 objects degraded (2.800%), 12 pgs degraded, 12 pgs undersized How to "force" rebalancing? Or should I just wait a little bit more? Kind regards, rok

2 years, 11 months

2
5
0 0

cephadm: How to replace failed HDD where DB is on SSD

by Kai Stian Olstad

Hi The server run 15.2.9 and has 15 HDD and 3 SSD. The OSDs was created with this YAML file hdd.yml -------- service_type: osd service_id: hdd placement: host_pattern: 'pech-hd-*' data_devices: rotational: 1 db_devices: rotational: 0 The result was that the 3 SSD is added to 1 VG with 15 LV on it. # vgs | egrep "VG|dbs" VG #PV #LV #SN Attr VSize VFree ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b 3 15 0 wz--n- <5.24t 48.00m One of the osd failed and I run rm with replace # ceph orch osd rm 178 --replace and the result is # ceph osd tree | grep "ID|destroyed" ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF 178 hdd 12.82390 osd.178 destroyed 0 1.00000 But I'm not able to replace the disk with the same YAML file as shown above. # ceph orch apply osd -i hdd.yml --dry-run ################ OSDSPEC PREVIEWS ################ +---------+------+------+------+----+-----+ |SERVICE |NAME |HOST |DATA |DB |WAL | +---------+------+------+------+----+-----+ +---------+------+------+------+----+-----+ I guess this is the wrong way to do it, but I can't find the answer in the documentation. So how can I replace this failed disk in Cephadm? -- Kai Stian Olstad

2 years, 11 months

3
10
0 0

How to add back stray OSD daemon after node re-installation

by mabi

Hello, I have by mistake re-installed the OS of an OSD node of my Octopus cluster (managed by cephadm). Luckily the OSD data is on a separate disk and did not get affected by the re-install. Now I have the following state: health: HEALTH_WARN 1 stray daemon(s) not managed by cephadm 1 osds down 1 host (1 osds) down To fix that I tried to run: # ceph orch daemon add osd ceph1f:/dev/sda Created no osd(s) on host ceph1f; already created? That did not work, so I tried: # ceph cephadm osd activate ceph1f no valid command found; 10 closest matches: ... Error EINVAL: invalid command Did not work either. So I wanted to ask how can I "adopt" back an OSD disk to my cluster? Thanks for your help. Regards, Mabi

2 years, 11 months

2
10
0 0

MDS stuck in up:stopping state

by Martin Rasmus Lundquist Hansen

After scaling the number of MDS daemons down, we now have a daemon stuck in the "up:stopping" state. The documentation says it can take several minutes to stop the daemon, but it has been stuck in this state for almost a full day. According to the "ceph fs status" output attached below, it still holds information about 2 inodes, which we assume is the reason why it cannot stop completely. Does anyone know what we can do to finally stop it? cephfs - 71 clients ====== RANK STATE MDS ACTIVITY DNS INOS 0 active ceph-mon-01 Reqs: 0 /s 15.7M 15.4M 1 active ceph-mon-02 Reqs: 48 /s 19.7M 17.1M 2 stopping ceph-mon-03 0 2 POOL TYPE USED AVAIL cephfs_metadata metadata 652G 185T cephfs_data data 1637T 539T STANDBY MDS ceph-mon-03-mds-2 MDS version: ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)

2 years, 11 months

3
7
0 0

cephfs:: store files on different pools?

by Adrian Sevcenco

Hi! is is (technically) possible to instruct cephfs to store files < 1Mib on a (replicate) pool and the others files on another (ec) pool? And even more, is it possible to take the same kind of decision on the path of the file? (let's say that critical files with names like r"/critical_path/critical_.*" i want them in a 6x replication ssd pool) Thank you! Adrian

2 years, 11 months

2
1
0 0

Re: cephadm: How to replace failed HDD where DB is on SSD

by Eugen Block

Could you share the output of lsblk -o name,rota,size,type from the affected osd node? My spec file is for a tiny lab cluster, in your case the db drive size should be something like '5T:6T' to specify a range. How large are the HDDs? Also maybe you should use the option 'filter_logic: AND', but I'm not sure if that's already the default, I remember that there were issues in Nautilus because the default was OR. I tried this just recently with a version similar to this, I believe it was 15.2.8 and it worked for me, but again, it's just a tiny virtual lab cluster. Zitat von Kai Stian Olstad <ceph+list(a)olstad.com>: > On 26.05.2021 11:16, Eugen Block wrote: >> Yes, the LVs are not removed automatically, you need to free up the >> VG, there are a couple of ways to do so, for example remotely: >> >> pacific1:~ # ceph orch device zap pacific4 /dev/vdb --force >> >> or directly on the host with: >> >> pacific1:~ # cephadm ceph-volume lvm zap --destroy /dev/<CEPH_VG>/<DB_LV> > > Thanks, > > I used the cephadm command and deleted the LV and the VG now has free space > > # vgs | egrep "VG|dbs" > VG #PV #LV #SN > Attr VSize VFree > ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b 3 14 0 > wz--n- <5.24t 357.74g > > But it doesn't seams to be able to use it, because it can find anyting > > # ceph orch apply osd -i hdd.yml --dry-run > ################ > OSDSPEC PREVIEWS > ################ > +---------+------+-------------+----------+----+-----+ > |SERVICE |NAME |HOST |DATA |DB |WAL | > +---------+------+-------------+----------+----+-----+ > +---------+------+-------------+----------+----+-----+ > > I tried adding size as you have in your configuration > db_devices: > rotational: 0 > size: '30G:' > > Still it was unable to create the OSD. > > If I removed the : so it is 30GB exact size, it did find the disk, > but DB is not placed on a SSD since I do not have one with 30 GB > exact size > ################ > OSDSPEC PREVIEWS > ################ > +---------+------+-------------+----------+----+-----+ > |SERVICE |NAME |HOST |DATA |DB |WAL | > +---------+------+-------------+----------+----+-----+ > |osd |hdd |pech-hd-7 |/dev/sdt |- |- | > +---------+------+-------------+----------+----+-----+ > > > To me I looks like Cephadm can't use/find the free space on the VG > and use that as a new LV for the OSD. > > > -- > Kai Stian Olstad > _______________________________________________ > ceph-users mailing list -- ceph-users(a)ceph.io > To unsubscribe send an email to ceph-users-leave(a)ceph.io

2 years, 11 months

2
3
0 0

best practice balance mode in HAproxy in front of RGW?

by Boris Behrens

Hello togehter, is there any best practive on the balance mode when I have a HAproxy in front of my rgw_frontend? currently we use "balance leastconn". Cheers Boris

2 years, 11 months

2
4
0 0

2024

2023

2022

2021

2020

2019

ceph-users May 2021