I'm attempting to get Ceph up and running, and currently feel like I'm
going around in circles.
I'm attempting to use cephadm and Pacific, currently on Debian Buster,
mostly because CentOS 7 isn't supported any more and CentOS 8 isn't supported
by some of my hardware.
Anyway, I have a few nodes with 59x 7.2TB disks, but for some reason the OSD
daemons don't start: the disks get formatted and the OSDs are created, but
the daemons never come up.
The nodes are probably the wrong spec for Ceph (48GB of memory and only 4
cores), but I was expecting the OSDs to start and either be dirt slow or
crash later. Anyway, I've got up to 30 of these nodes, so I was hoping to
get at least 6PB of raw storage out of them.
As yet I've not spotted any helpful error messages.
This is for an archive / slow Ceph cluster, so I'm not expecting speed.
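A minimal sketch of where cephadm keeps the OSD logs, for digging out the actual startup error (osd.0 is just an example id, <fsid> a placeholder for the cluster fsid):

# cephadm ls
(lists every daemon cephadm has deployed on that host, including ones stuck in an error state)
# cephadm logs --name osd.0
(shows the journald log of that daemon, the same as "journalctl -u ceph-<fsid>@osd.0.service")

Also, with 59 OSDs and 48GB of RAM the default osd_memory_target of 4GiB per OSD is far more than the node has, so the OOM killer may be taking the daemons down right after start; lowering it, e.g. "ceph config set osd osd_memory_target 2147483648", might be worth a try.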
Thanks in advance.
Peter.
Hi,
I have removed one node, but now Ceph seems to be stuck in:
Degraded data redundancy: 67/2393 objects degraded (2.800%), 12 pgs
degraded, 12 pgs undersized
How to "force" rebalancing? Or should I just wait a little bit more?
Kind regards,
rok
Hi
The server runs 15.2.9 and has 15 HDDs and 3 SSDs.
The OSDs were created with this YAML file
hdd.yml
--------
service_type: osd
service_id: hdd
placement:
  host_pattern: 'pech-hd-*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0
The result was that the 3 SSDs were put into 1 VG with 15 LVs on it.
# vgs | egrep "VG|dbs"
  VG                                                   #PV #LV #SN Attr   VSize  VFree
  ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b    3  15   0 wz--n- <5.24t 48.00m
One of the OSDs failed and I ran rm with replace
# ceph orch osd rm 178 --replace
and the result is
# ceph osd tree | grep "ID|destroyed"
ID  CLASS WEIGHT   TYPE NAME STATUS    REWEIGHT PRI-AFF
178 hdd   12.82390 osd.178   destroyed        0 1.00000
But I'm not able to replace the disk with the same YAML file as shown
above.
# ceph orch apply osd -i hdd.yml --dry-run
################
OSDSPEC PREVIEWS
################
+---------+------+------+------+----+-----+
|SERVICE |NAME |HOST |DATA |DB |WAL |
+---------+------+------+------+----+-----+
+---------+------+------+------+----+-----+
I guess this is the wrong way to do it, but I can't find the answer in
the documentation.
So how can I replace this failed disk in Cephadm?
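A minimal sketch of the clean-up step that comes up later in this thread: the leftover DB LV from the destroyed OSD has to be freed before the orchestrator will consider the SSD again. Host, device and LV names are placeholders:

# ceph orch device zap <host> <device-path> --force
or, directly on the host:
# cephadm ceph-volume lvm zap --destroy /dev/<CEPH_VG>/<DB_LV>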
--
Kai Stian Olstad
Hello,
I have by mistake re-installed the OS of an OSD node of my Octopus cluster (managed by cephadm). Luckily the OSD data is on a separate disk and did not get affected by the re-install.
Now I have the following state:
health: HEALTH_WARN
1 stray daemon(s) not managed by cephadm
1 osds down
1 host (1 osds) down
To fix that I tried to run:
# ceph orch daemon add osd ceph1f:/dev/sda
Created no osd(s) on host ceph1f; already created?
That did not work, so I tried:
# ceph cephadm osd activate ceph1f
no valid command found; 10 closest matches:
...
Error EINVAL: invalid command
That did not work either. So I wanted to ask: how can I "adopt" an OSD disk back into my cluster?
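A minimal diagnostic sketch (assuming cephadm is installed on the re-installed host) to check whether ceph-volume still recognizes the OSD on the untouched disk:

# cephadm ceph-volume lvm list
(should print the existing OSD with its osd id and osd fsid if the LVM metadata on the data disk is intact)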
Thanks for your help.
Regards,
Mabi
After scaling the number of MDS daemons down, we now have a daemon stuck in the
"up:stopping" state. The documentation says it can take several minutes to stop the
daemon, but it has been stuck in this state for almost a full day. According to
the "ceph fs status" output attached below, it still holds information about 2
inodes, which we assume is the reason why it cannot stop completely.
Does anyone know what we can do to finally stop it?
cephfs - 71 clients
======
RANK STATE MDS ACTIVITY DNS INOS
0 active ceph-mon-01 Reqs: 0 /s 15.7M 15.4M
1 active ceph-mon-02 Reqs: 48 /s 19.7M 17.1M
2 stopping ceph-mon-03 0 2
POOL TYPE USED AVAIL
cephfs_metadata metadata 652G 185T
cephfs_data data 1637T 539T
STANDBY MDS
ceph-mon-03-mds-2
MDS version: ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)
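A minimal sketch of commands for checking what the stopping rank (ceph-mon-03 above) is still waiting on; the assumption is that a client still holds caps on the remaining two inodes:

# ceph tell mds.ceph-mon-03 dump_ops_in_flight
# ceph tell mds.ceph-mon-03 session ls

Evicting (or unmounting/remounting) the client that still references those inodes is often what lets an up:stopping rank finish.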
Hi! Is it (technically) possible to instruct CephFS to store files < 1MiB on a (replicated) pool
and the other files on another (EC) pool?
And even more, is it possible to take the same kind of decision based on the path of the file?
(Let's say critical files with names like r"/critical_path/critical_.*" should go to a 6x replicated SSD pool.)
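For the path-based part, a minimal sketch using CephFS file layouts, assuming a 6x replicated SSD-backed pool called critical_ssd already exists and the filesystem is mounted at /mnt/cephfs (both names are examples):

# ceph fs add_data_pool cephfs critical_ssd
# setfattr -n ceph.dir.layout.pool -v critical_ssd /mnt/cephfs/critical_path

New files created under that directory are then stored in critical_ssd (existing files keep their old layout). Choosing a pool by file size, on the other hand, is not something file layouts can express as far as I know.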
Thank you!
Adrian
Could you share the output of
lsblk -o name,rota,size,type
from the affected osd node?
My spec file is for a tiny lab cluster; in your case the db drive size
should be something like '5T:6T' to specify a range.
How large are the HDDs? Also, maybe you should use the option
'filter_logic: AND', but I'm not sure if that's already the default; I
remember there were issues in Nautilus because the default was OR.
I tried this just recently with a version similar to yours, I believe it
was 15.2.8, and it worked for me, but again, it's just a tiny virtual
lab cluster.
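A sketch of what I mean, based on the drive group docs; the size range and the filter_logic placement are assumptions and need to be adjusted to your actual SSD sizes:

service_type: osd
service_id: hdd
placement:
  host_pattern: 'pech-hd-*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0
  size: '5T:6T'
filter_logic: AND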
Quoting Kai Stian Olstad <ceph+list(a)olstad.com>:
> On 26.05.2021 11:16, Eugen Block wrote:
>> Yes, the LVs are not removed automatically, you need to free up the
>> VG, there are a couple of ways to do so, for example remotely:
>>
>> pacific1:~ # ceph orch device zap pacific4 /dev/vdb --force
>>
>> or directly on the host with:
>>
>> pacific1:~ # cephadm ceph-volume lvm zap --destroy /dev/<CEPH_VG>/<DB_LV>
>
> Thanks,
>
> I used the cephadm command and deleted the LV and the VG now has free space
>
> # vgs | egrep "VG|dbs"
>   VG                                                   #PV #LV #SN Attr   VSize  VFree
>   ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b    3  14   0 wz--n- <5.24t 357.74g
>
> But it doesn't seem to be able to use it, because it can't find anything.
>
> # ceph orch apply osd -i hdd.yml --dry-run
> ################
> OSDSPEC PREVIEWS
> ################
> +---------+------+-------------+----------+----+-----+
> |SERVICE |NAME |HOST |DATA |DB |WAL |
> +---------+------+-------------+----------+----+-----+
> +---------+------+-------------+----------+----+-----+
>
> I tried adding size as you have in your configuration
> db_devices:
>   rotational: 0
>   size: '30G:'
>
> Still it was unable to create the OSD.
>
> If I removed the ':' so it is an exact size of 30GB, it did find the disk,
> but the DB is not placed on an SSD since I do not have one of exactly
> 30 GB.
> ################
> OSDSPEC PREVIEWS
> ################
> +---------+------+-------------+----------+----+-----+
> |SERVICE |NAME |HOST |DATA |DB |WAL |
> +---------+------+-------------+----------+----+-----+
> |osd |hdd |pech-hd-7 |/dev/sdt |- |- |
> +---------+------+-------------+----------+----+-----+
>
>
> To me it looks like Cephadm can't use/find the free space on the VG
> and use it as a new LV for the OSD.
>
>
> --
> Kai Stian Olstad
Hello everyone,
is there any best practice for the balance mode when I have an HAProxy
in front of my rgw_frontend?
Currently we use "balance leastconn".
Cheers
Boris
Hi,
Is there a way to manage specific pools with the Python lib without the admin keyring?
Not sure why it only works with the admin keyring but not with a client keyring :/
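A minimal sketch of a keyring restricted to a single pool, which the Python bindings can use instead of client.admin (client name and pool name are examples):

# ceph auth get-or-create client.app mon 'allow r' osd 'allow rw pool=mypool' -o /etc/ceph/ceph.client.app.keyring

The Python client then connects with rados.Rados(conffile='/etc/ceph/ceph.conf', name='client.app') and can read/write objects in mypool; anything that needs cluster-wide administration (creating pools, changing settings) still needs broader mon/mgr caps, which is probably why only the admin keyring worked for you.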
Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo(a)agoda.com<mailto:istvan.szabo@agoda.com>
---------------------------------------------------