Hi everyone
I'm new to Ceph; a four-day training session in France using Octopus on
VMs convinced me to build my first cluster.
For testing I currently have 4 identical old nodes, each with 3 HDDs and
2 network interfaces, running AlmaLinux 8 (el8). I tried to replay the
training session, but it failed and broke the web interface because
podman 4.2 is not compatible with Octopus.
So I tried to deploy Pacific with the cephadm tool on my first node (mostha1),
which will also let me test an upgrade later:
dnf -y install
https://download.ceph.com/rpm-16.2.13/el8/noarch/cephadm-16.2.13-0.el8.noar…
monip=$(getent ahostsv4 mostha1 |head -n 1| awk '{ print $1 }')
cephadm bootstrap --mon-ip $monip --initial-dashboard-password xxxxx \
--initial-dashboard-user admceph \
--allow-fqdn-hostname --cluster-network 10.1.0.0/16
This was successful.
But running "*c**eph orch device ls*" do not show any HDD even if I have
/dev/sda (used by the OS), /dev/sdb and /dev/sdc
The web interface shows a row capacity which is an aggregate of the
sizes of the 3 HDDs for the node.
I've also tried to reset /dev/sdb but cephadm do not see it:
[ceph: root@mostha1 /]# ceph orch device zap
mostha1.legi.grenoble-inp.fr /dev/sdb --force
Error EINVAL: Device path '/dev/sdb' not found on host
'mostha1.legi.grenoble-inp.fr'
On my first attempt with Octopus, I was able to list the available HDDs
with this command. Before moving to Pacific, the OS on this node was
reinstalled from scratch.
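In case it matters, this is roughly what I plan to try next, assuming that
leftover partition or LVM metadata is what is hiding the disks from cephadm
(commands run directly on the node; /dev/sdb as above):
# wipe any leftover filesystem/partition signatures on the disk
wipefs -a /dev/sdb
sgdisk --zap-all /dev/sdb
# ask the orchestrator to rescan the devices
ceph orch device ls --refresh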
Any advice for a Ceph beginner?
Thanks
Patrick
Hi,
I am trying to move a node/host under a new SSD root and getting the error
below. Has anyone seen it and does anyone know the fix? The pg_num and
pgp_num are the same for all pools, so that is not the issue.
[root@hbmon1 ~]# ceph osd crush move hbssdhost1 root=ssd
Error ERANGE: (34) Numerical result out of range
[root@hbmon1 ~]#
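If it helps, I can post the output of the following from the mon host
(commands listed here just so you know what I can provide):
# current CRUSH hierarchy and rules
ceph osd crush tree
ceph osd crush rule dump
# pool settings, in case something there matters
ceph osd dump | grep pool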
Thanks,
Pardhiv
Hi all,
I am trying to trigger deep scrubbing on demand in Ceph Reef (18.2.0) on a
set of files that I randomly write to CephFS. I have tried both invoking
deep-scrub on CephFS using ceph tell and deep scrubbing a particular PG
directly. Unfortunately, none of that seems to be working for me. I am
monitoring the ceph status output, but it never shows any scrubbing
information. Can anyone please help me out with this? In a nutshell, I need
Ceph to deep scrub on demand, whenever I want. I am using the default
scrubbing configuration (exact commands below). Thanks all.
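For reference, the commands I am running look roughly like this (the fs name
"cephfs" and the PG id are just examples, not my exact values):
# metadata scrub of the whole CephFS tree via the MDS
ceph tell mds.cephfs:0 scrub start / recursive,force
# deep scrub of a single data PG
ceph pg deep-scrub 3.1f
# watching for scrub activity
ceph -s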
Best Regards,
Jayjeet Chakraborty
Ph.D. Student
Department of Computer Science and Engineering
University of California, Santa Cruz
Email: jayjeetc(a)ucsc.edu
Details of this release are summarized here:
https://tracker.ceph.com/issues/63219#note-2
Release Notes - TBD
Issue https://tracker.ceph.com/issues/63192 appears to be causing several runs to fail.
Should it be fixed for this release?
Seeking approvals/reviews for:
smoke - Laura
rados - Laura, Radek, Travis, Ernesto, Adam King
rgw - Casey
fs - Venky
orch - Adam King
rbd - Ilya
krbd - Ilya
upgrade/quincy-p2p - Known issue IIRC, Casey pls confirm/approve
client-upgrade-quincy-reef - Laura
powercycle - Brad pls confirm
ceph-volume - Guillaume pls take a look
Please reply to this email with approval and/or trackers of known
issues/PRs to address them.
Josh, Neha - gibba and LRC upgrades -- N/A for quincy now after reef release.
Thx
YuriW
Hi Ceph users,
currently I'm using the Lua scripting feature in radosgw to send "put_obj" and "get_obj" request stats to a MongoDB instance.
So far it's working quite well, but I'm missing a field that is very important for our traffic stats.
I'm looking for the HTTP_REMOTE-ADDR field, which is available in the ops log, but I couldn't find it here: https://docs.ceph.com/en/quincy/radosgw/lua-scripting/#request-fields
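For reference, we load the script with something like this (the context and
file name below are just examples of how we do it):
# upload the Lua script into the postrequest context
radosgw-admin script put --infile=./stats.lua --context=postrequest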
Does someone know how to get this field via lua script?
Cheers
Stephan
Hi,
We have a Ceph cluster running the Reef release. We want to buy some
enterprise SSDs for it; the drive size we plan to use is 1.92 TB.
For that, we have selected an Intel model. Please give us your review of
this model, and if you have any other model preference, please share it with us.
Thank you
Brand: Intel
SSD: 1.92TB 2.5'' Enterprise SATA, 6Gb/s
Model: D3-S4510
Regards,
Nafiz Imtiaz
Assistant Manager, Product Development
IT Division
Bangladesh Export Import Company Ltd.
Hi,
I am getting an error while adding a new node with BlueStore OSDs to the
cluster. The OSD is created without being attached to any host and stays
down; trying to bring it up didn't work. The same method works in other
clusters without any issue. Any idea what the problem is?
Ceph Version: ceph version 12.2.11
(26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
Ceph Health: OK
2023-10-25 20:40:40.867878 7f1f478cde40 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1698266440867866, "job": 1, "event": "recovery_started",
"log_files": [270]}
2023-10-25 20:40:40.867883 7f1f478cde40 4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/db_impl_open.cc:482]
Recovering log #270 mode 0
2023-10-25 20:40:40.867904 7f1f478cde40 4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/version_set.cc:2395]
Creating manifest 272
2023-10-25 20:40:40.869553 7f1f478cde40 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1698266440869548, "job": 1, "event": "recovery_finished"}
2023-10-25 20:40:40.870924 7f1f478cde40 4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/db_impl_open.cc:1063] DB
pointer 0x55c9061ba000
2023-10-25 20:40:40.870964 7f1f478cde40 1
bluestore(/var/lib/ceph/osd/ceph-721) _open_db opened rocksdb path db
options
compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152
2023-10-25 20:40:40.871234 7f1f478cde40 1 freelist init
2023-10-25 20:40:40.871293 7f1f478cde40 1
bluestore(/var/lib/ceph/osd/ceph-721) _open_alloc opening allocation
metadata
2023-10-25 20:40:40.871314 7f1f478cde40 1
bluestore(/var/lib/ceph/osd/ceph-721) _open_alloc loaded 3.49TiB in 1
extents
2023-10-25 20:40:40.874700 7f1f478cde40 0 <cls>
/build/ceph-U0cfoi/ceph-12.2.11/src/cls/cephfs/cls_cephfs.cc:197: loading
cephfs
2023-10-25 20:40:40.874721 7f1f478cde40 0 _get_class not permitted to load
sdk
2023-10-25 20:40:40.874955 7f1f478cde40 0 _get_class not permitted to load
kvs
2023-10-25 20:40:40.875638 7f1f478cde40 0 _get_class not permitted to load
lua
2023-10-25 20:40:40.875724 7f1f478cde40 0 <cls>
/build/ceph-U0cfoi/ceph-12.2.11/src/cls/hello/cls_hello.cc:296: loading
cls_hello
2023-10-25 20:40:40.875776 7f1f478cde40 0 osd.721 0 crush map has features
288232575208783872, adjusting msgr requires for clients
2023-10-25 20:40:40.875780 7f1f478cde40 0 osd.721 0 crush map has features
288232575208783872 was 8705, adjusting msgr requires for mons
2023-10-25 20:40:40.875784 7f1f478cde40 0 osd.721 0 crush map has features
288232575208783872, adjusting msgr requires for osds
2023-10-25 20:40:40.875837 7f1f478cde40 0 osd.721 0 load_pgs
2023-10-25 20:40:40.875840 7f1f478cde40 0 osd.721 0 load_pgs opened 0 pgs
2023-10-25 20:40:40.875844 7f1f478cde40 0 osd.721 0 using weightedpriority
op queue with priority op cut off at 64.
2023-10-25 20:40:40.877401 7f1f478cde40 -1 osd.721 0 log_to_monitors
{default=true}
2023-10-25 20:40:40.888408 7f1f478cde40 -1 osd.721 0
mon_cmd_maybe_osd_create fail: '(34) Numerical result out of range': (34)
Numerical result out of range
2023-10-25 20:40:40.891367 7f1f478cde40 -1 osd.721 0
mon_cmd_maybe_osd_create fail: '(34) Numerical result out of range': (34)
Numerical result out of range
2023-10-25 20:40:40.891409 7f1f478cde40 -1 osd.721 0 init unable to
update_crush_location: (34) Numerical result out of range
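If useful, I can also share the output of the following from a mon node
(just listing what I can provide, nothing has been changed there yet):
# max_osd value in the OSD map
ceph osd getmaxosd
# current OSD/CRUSH layout
ceph osd tree | head -40
ceph osd crush dump | head -40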
Thanks,
Pardhiv
Hi,
I'm struggling to add some hosts to our Quincy cluster with cephadm.
"ceph orch host add <host> <addr>" fails with the famous "missing 2
required positional arguments: 'hostname' and 'addr'" error because of bug
https://tracker.ceph.com/issues/59081, but looking at the cephadm messages
with "ceph -W cephadm", I can see:
--------
Log: Opening SSH connection to 10.81.22.183, port 22
[conn=736] Connected to SSH server at 10.81.22.183, port 22
[conn=736] Local address: 10.81.22.151, port 53640
[conn=736] Peer address: 10.81.22.183, port 22
[conn=736] Login timeout expired
[conn=736] Aborting connection
Traceback (most recent call last): (removed)
cephadm.ssh.HostConnectionError: Failed to connect to jc-rgw3
(10.81.22.183). Login timeout expired
--------
It is very strange to me because "ssh -i /tmp/cephadm_identity_xxx
10.81.22.183" works fine when executed in the active mgr container.
The host I'm trying to add is an RGW that has 3 active network
connections: the Ceph public network, our intranet network (used for
managing the server) and the network of the application that will use
the RGW. The problem seems to be somewhat related to this network
configuration, as the main cluster servers (MONs, OSDs), which only have
the 2 Ceph networks and the intranet one, don't suffer from the same
problem. In particular, what is strange is that I can successfully add
the host if I use its intranet address rather than the Ceph public
network one (10.81.22.183) in the cephadm command, as shown below.
I have 3 hosts sharing the same network configuration and having the
same problem.
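To be explicit, the two forms of the command look like this (the intranet
address is omitted here):
# fails with the login timeout above (Ceph public network address)
ceph orch host add jc-rgw3 10.81.22.183
# works (same host, intranet address)
ceph orch host add jc-rgw3 <intranet address>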
Any hint or suggestion to troubleshoot this problem further would be
highly appreciated!
Best regards,
Michel