Hi everyone
I'm new to Ceph; a four-day training session in France using Octopus on
VMs convinced me to build my first cluster.
For testing I currently have 4 identical old nodes, each with 3 HDDs and
2 network interfaces, running AlmaLinux 8 (el8). I tried to replay the
training session, but it failed and broke the web interface because
podman 4.2 is not compatible with Octopus.
So I tried to deploy Pacific with the cephadm tool on my first node (mostha1),
which will also let me test an upgrade later:
dnf -y install
https://download.ceph.com/rpm-16.2.13/el8/noarch/cephadm-16.2.13-0.el8.noar…
monip=$(getent ahostsv4 mostha1 |head -n 1| awk '{ print $1 }')
cephadm bootstrap --mon-ip $monip --initial-dashboard-password xxxxx \
--initial-dashboard-user admceph \
--allow-fqdn-hostname --cluster-network 10.1.0.0/16
This was successful.
But running "*c**eph orch device ls*" do not show any HDD even if I have
/dev/sda (used by the OS), /dev/sdb and /dev/sdc
The web interface shows a row capacity which is an aggregate of the
sizes of the 3 HDDs for the node.
I've also tried to reset /dev/sdb but cephadm do not see it:
[ceph: root@mostha1 /]# ceph orch device zap
mostha1.legi.grenoble-inp.fr /dev/sdb --force
Error EINVAL: Device path '/dev/sdb' not found on host
'mostha1.legi.grenoble-inp.fr'
On my first attempt with Octopus, I was able to list the available HDDs
with this command. Before moving to Pacific, the OS on this node was
reinstalled from scratch.
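In case it matters, this is roughly what I plan to try next, assuming that
leftover partition or LVM metadata is what is hiding the disks from cephadm
(commands run directly on the node; /dev/sdb as above):
# wipe any leftover filesystem/partition signatures on the disk
wipefs -a /dev/sdb
sgdisk --zap-all /dev/sdb
# ask the orchestrator to rescan the devices
ceph orch device ls --refresh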
Any advice for a Ceph beginner?
Thanks
Patrick
Hi,
I am trying to move a node/host under a new SSD root and getting the error
below. Has anyone seen it and does anyone know the fix? The pg_num and
pgp_num are the same for all pools, so that is not the issue.
[root@hbmon1 ~]# ceph osd crush move hbssdhost1 root=ssd
Error ERANGE: (34) Numerical result out of range
[root@hbmon1 ~]#
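If it helps, I can post the output of the following from the mon host
(commands listed here just so you know what I can provide):
# current CRUSH hierarchy and rules
ceph osd crush tree
ceph osd crush rule dump
# pool settings, in case something there matters
ceph osd dump | grep pool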
Thanks,
Pardhiv
Hi all,
I am trying to trigger deep scrubbing on demand in Ceph Reef (18.2.0) on a
set of files that I randomly write to CephFS. I have tried both invoking
deep-scrub on CephFS using ceph tell and deep scrubbing a particular PG
directly. Unfortunately, none of that seems to be working for me. I am
monitoring the ceph status output, but it never shows any scrubbing
information. Can anyone please help me out with this? In a nutshell, I need
Ceph to deep scrub on demand, whenever I want. I am using the default
scrubbing configuration (exact commands below). Thanks all.
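For reference, the commands I am running look roughly like this (the fs name
"cephfs" and the PG id are just examples, not my exact values):
# metadata scrub of the whole CephFS tree via the MDS
ceph tell mds.cephfs:0 scrub start / recursive,force
# deep scrub of a single data PG
ceph pg deep-scrub 3.1f
# watching for scrub activity
ceph -s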
Best Regards,
Jayjeet Chakraborty
Ph.D. Student
Department of Computer Science and Engineering
University of California, Santa Cruz
Email: jayjeetc(a)ucsc.edu
Details of this release are summarized here:
https://tracker.ceph.com/issues/63219#note-2
Release Notes - TBD
Issue https://tracker.ceph.com/issues/63192 appears to be causing several runs to fail.
Should it be fixed for this release?
Seeking approvals/reviews for:
smoke - Laura
rados - Laura, Radek, Travis, Ernesto, Adam King
rgw - Casey
fs - Venky
orch - Adam King
rbd - Ilya
krbd - Ilya
upgrade/quincy-p2p - Known issue IIRC, Casey pls confirm/approve
client-upgrade-quincy-reef - Laura
powercycle - Brad pls confirm
ceph-volume - Guillaume pls take a look
Please reply to this email with approval and/or trackers of known
issues/PRs to address them.
Josh, Neha - gibba and LRC upgrades -- N/A for quincy now after reef release.
Thx
YuriW
Hi Ceph users,
currently I'm using the Lua scripting feature in radosgw to send "put_obj" and "get_obj" request stats to a MongoDB instance.
So far it's working quite well, but I'm missing a field that is very important for our traffic stats.
I'm looking for the HTTP_REMOTE-ADDR field, which is available in the ops log, but I couldn't find it here: https://docs.ceph.com/en/quincy/radosgw/lua-scripting/#request-fields
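For reference, we load the script with something like this (the context and
file name below are just examples of how we do it):
# upload the Lua script into the postrequest context
radosgw-admin script put --infile=./stats.lua --context=postrequest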
Does someone know how to get this field via lua script?
Cheers
Stephan
Hi,
We have a Ceph cluster running the Reef release. We want to buy some
enterprise SSDs for it; the drive size we plan to use is 1.92 TB.
For that, we have selected an Intel model. Please give us your review of
this model, and if you have any other model preference, please share it with us.
Thank you
Brand: Intel
SSD: 1.92TB 2.5'' Enterprise SATA, 6Gb/s
Model: D3-S4510
Regards,
Nafiz Imtiaz
Assistant Manager, Product Development
IT Division
Bangladesh Export Import Company Ltd.
Hi,
I am getting an error while adding a new node with BlueStore OSDs to the
cluster. The OSD is created without being attached to any host and stays
down; trying to bring it up didn't work. The same method works in other
clusters without any issue. Any idea what the problem is?
Ceph Version: ceph version 12.2.11
(26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
Ceph Health: OK
2023-10-25 20:40:40.867878 7f1f478cde40 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1698266440867866, "job": 1, "event": "recovery_started",
"log_files": [270]}
2023-10-25 20:40:40.867883 7f1f478cde40 4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/db_impl_open.cc:482]
Recovering log #270 mode 0
2023-10-25 20:40:40.867904 7f1f478cde40 4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/version_set.cc:2395]
Creating manifest 272
2023-10-25 20:40:40.869553 7f1f478cde40 4 rocksdb: EVENT_LOG_v1
{"time_micros": 1698266440869548, "job": 1, "event": "recovery_finished"}
2023-10-25 20:40:40.870924 7f1f478cde40 4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/db_impl_open.cc:1063] DB
pointer 0x55c9061ba000
2023-10-25 20:40:40.870964 7f1f478cde40 1
bluestore(/var/lib/ceph/osd/ceph-721) _open_db opened rocksdb path db
options
compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152
2023-10-25 20:40:40.871234 7f1f478cde40 1 freelist init
2023-10-25 20:40:40.871293 7f1f478cde40 1
bluestore(/var/lib/ceph/osd/ceph-721) _open_alloc opening allocation
metadata
2023-10-25 20:40:40.871314 7f1f478cde40 1
bluestore(/var/lib/ceph/osd/ceph-721) _open_alloc loaded 3.49TiB in 1
extents
2023-10-25 20:40:40.874700 7f1f478cde40 0 <cls>
/build/ceph-U0cfoi/ceph-12.2.11/src/cls/cephfs/cls_cephfs.cc:197: loading
cephfs
2023-10-25 20:40:40.874721 7f1f478cde40 0 _get_class not permitted to load
sdk
2023-10-25 20:40:40.874955 7f1f478cde40 0 _get_class not permitted to load
kvs
2023-10-25 20:40:40.875638 7f1f478cde40 0 _get_class not permitted to load
lua
2023-10-25 20:40:40.875724 7f1f478cde40 0 <cls>
/build/ceph-U0cfoi/ceph-12.2.11/src/cls/hello/cls_hello.cc:296: loading
cls_hello
2023-10-25 20:40:40.875776 7f1f478cde40 0 osd.721 0 crush map has features
288232575208783872, adjusting msgr requires for clients
2023-10-25 20:40:40.875780 7f1f478cde40 0 osd.721 0 crush map has features
288232575208783872 was 8705, adjusting msgr requires for mons
2023-10-25 20:40:40.875784 7f1f478cde40 0 osd.721 0 crush map has features
288232575208783872, adjusting msgr requires for osds
2023-10-25 20:40:40.875837 7f1f478cde40 0 osd.721 0 load_pgs
2023-10-25 20:40:40.875840 7f1f478cde40 0 osd.721 0 load_pgs opened 0 pgs
2023-10-25 20:40:40.875844 7f1f478cde40 0 osd.721 0 using weightedpriority
op queue with priority op cut off at 64.
2023-10-25 20:40:40.877401 7f1f478cde40 -1 osd.721 0 log_to_monitors
{default=true}
2023-10-25 20:40:40.888408 7f1f478cde40 -1 osd.721 0
mon_cmd_maybe_osd_create fail: '(34) Numerical result out of range': (34)
Numerical result out of range
2023-10-25 20:40:40.891367 7f1f478cde40 -1 osd.721 0
mon_cmd_maybe_osd_create fail: '(34) Numerical result out of range': (34)
Numerical result out of range
2023-10-25 20:40:40.891409 7f1f478cde40 -1 osd.721 0 init unable to
update_crush_location: (34) Numerical result out of range
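If useful, I can also share the output of the following from a mon node
(just listing what I can provide, nothing has been changed there yet):
# max_osd value in the OSD map
ceph osd getmaxosd
# current OSD/CRUSH layout
ceph osd tree | head -40
ceph osd crush dump | head -40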
Thanks,
Pardhiv
Hi,
I'm struggling to add some hosts to our Quincy cluster with cephadm.
"ceph orch host add <host> <addr>" fails with the famous "missing 2
required positional arguments: 'hostname' and 'addr'" error because of bug
https://tracker.ceph.com/issues/59081, but looking at the cephadm messages
with "ceph -W cephadm", I can see:
--------
Log: Opening SSH connection to 10.81.22.183, port 22
[conn=736] Connected to SSH server at 10.81.22.183, port 22
[conn=736] Local address: 10.81.22.151, port 53640
[conn=736] Peer address: 10.81.22.183, port 22
[conn=736] Login timeout expired
[conn=736] Aborting connection
Traceback (most recent call last): (removed)
cephadm.ssh.HostConnectionError: Failed to connect to jc-rgw3
(10.81.22.183). Login timeout expired
--------
It is very strange to me because "ssh -i /tmp/cephadm_identity_xxx
10.81.22.183" works fine when executed in the active mgr container.
The host I'm trying to add is an RGW that has 3 active network
connections: the Ceph public network, our intranet network (used for
managing the server) and the network of the application that will use
the RGW. The problem seems to be somewhat related to this network
configuration, as the main cluster servers (MONs, OSDs), which only have
the 2 Ceph networks and the intranet one, don't suffer from the same
problem. In particular, what is strange is that I can successfully add
the host if I use its intranet address rather than the Ceph public
network one (10.81.22.183) in the cephadm command, as shown below.
I have 3 hosts sharing the same network configuration and having the
same problem.
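To be explicit, the two forms of the command look like this (the intranet
address is omitted here):
# fails with the login timeout above (Ceph public network address)
ceph orch host add jc-rgw3 10.81.22.183
# works (same host, intranet address)
ceph orch host add jc-rgw3 <intranet address>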
Any hint or suggestion to troubleshoot this problem further would be
highly appreciated!
Best regards,
Michel