Hello,
I noticed a couple unanswered questions on this topic from a while back.
It seems, however, worth asking whether adjusting either or both of the
subject attributes could improve performance with large HDD OSDs (mine are
12TB SAS).
In the previous posts on this topic the writers indicated that they had
experimented with increasing either or both of osd_op_num_shards and
osd_op_num_threads_per_shard and had seen performance improvements. Like
me, the writers wondered about any limitations or pitfalls relating to
such adjustments.
Since I would rather not take chances with a 500TB production cluster I am
asking for guidance from this list.
BTW, my cluster is currently running Nautilus 14.2.6 (stock Debian
packages).
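For reference, the sort of adjustment I have in mind looks roughly like the
following (untested on my end; I am assuming the _hdd variants are what apply
to spinning OSDs and that new values are only picked up when an OSD restarts):

  # current effective values on one OSD
  ceph config show osd.0 osd_op_num_shards_hdd
  ceph config show osd.0 osd_op_num_threads_per_shard_hdd

  # cluster-wide override, e.g. raising the shard count
  ceph config set osd osd_op_num_shards_hdd 8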
Thank you.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
Have you checked for disk failure? dmesg, smartctl etc. ?
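For example (device names below are just placeholders, adjust to your setup):

  dmesg -T | grep -iE 'error|nvme|ata'
  smartctl -a /dev/nvme0
  smartctl -a /dev/sda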
Quoting "Robert W. Eckert" <rob(a)rob.eckert.name>:
> I worked through that workflow- but it seems like the one monitor
> will run for a while - anywhere from an hour to a day, then just stop.
>
> This machine is running on AMD hardware (3600X CPU on X570 chipset)
> while my other two are running on old intel.
>
> I did find this in the service logs
>
> 2021-04-30T16:02:40.135+0000 7f5d0a94f700 -1 rocksdb: submit_common
> error: Corruption: block checksum mismatch: expected 395334538, got
> 4289108204 in /var/lib/ceph/mon/ceph-cube/store.db/073501.sst
> offset 36769734 size 84730 code = 2 Rocksdb transaction:
>
> I am attaching the output of
> journalctl -u ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08867(a)mon.cube.service
>
> The error appears to be here:
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -61>
> 2021-04-30T16:02:38.700+0000 7f5d21332700 4 mon.cube(a)-1(???).mgr
> e702 active server:
> [v2:192.168.2.199:6834/1641928541,v1:192.168.2.199:6835/1641928541](2184157)
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -60>
> 2021-04-30T16:02:38.700+0000 7f5d21332700 4 mon.cube(a)-1(???).mgr
> e702 mkfs or daemon transitioned to available, loading commands
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -59>
> 2021-04-30T16:02:38.701+0000 7f5d21332700 4 set_mon_vals no
> callback set
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -58>
> 2021-04-30T16:02:38.701+0000 7f5d21332700 10 set_mon_vals
> client_cache_size = 32768
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -57>
> 2021-04-30T16:02:38.701+0000 7f5d21332700 10 set_mon_vals
> container_image =
> docker.io/ceph/ceph@sha256:15b15fb7a708970f1b734285ac08aef45dcd76e86866af37412d041e00853743
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -56>
> 2021-04-30T16:02:38.701+0000 7f5d21332700 10 set_mon_vals
> log_to_syslog = true
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -55>
> 2021-04-30T16:02:38.701+0000 7f5d21332700 10 set_mon_vals
> mon_data_avail_warn = 10
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -54>
> 2021-04-30T16:02:38.701+0000 7f5d21332700 10 set_mon_vals
> mon_warn_on_insecure_global_id_reclaim_allowed = true
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -53>
> 2021-04-30T16:02:38.701+0000 7f5d21332700 4 set_mon_vals no
> callback set
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -52>
> 2021-04-30T16:02:38.702+0000 7f5d21332700 2 auth: KeyRing::load:
> loaded key file /var/lib/ceph/mon/ceph-cube/keyring
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -51>
> 2021-04-30T16:02:38.702+0000 7f5d1095b700 3 rocksdb:
> [db_impl/db_impl_compaction_flush.cc:2808] Compaction error:
> Corruption: block checksum mismatch: expected 395334538, got
> 4289108204 in /var/lib/ceph/mon/ceph-cube/store.db/073501.sst
> offset 36769734 size 84730
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -50>
> 2021-04-30T16:02:38.702+0000 7f5d21332700 5 asok(0x56327d226000)
> register_command compact hook 0x56327e028700
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -49>
> 2021-04-30T16:02:38.702+0000 7f5d1095b700 4 rocksdb: (Original Log
> Time 2021/04/30-16:02:38.703267) [compaction/compaction_job.cc:760]
> [default] compacted to: base level 6 level multiplier 10.00 max
> bytes base 268435456 files[5 0 0 0 0 0 2] max score 0.00, MB/sec:
> 11035.6 rd, 0.0 wr, level 6, files in(5, 2) out(1) MB in(32.1,
> 126.7) out(0.0), read-write-amplify(5.0) write-amplify(0.0)
> Corruption: block checksum mismatch: expected 395334538, got
> 4289108204 in /var/lib/ceph/mon/ceph-cube/store.db/073501.sst
> offset 36769734 size 84730, records in: 7670, records dropped: 6759
> output_compres
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -48>
> 2021-04-30T16:02:38.702+0000 7f5d1095b700 4 rocksdb: (Original Log
> Time 2021/04/30-16:02:38.703283) EVENT_LOG_v1 {"time_micros":
> 1619798558703277, "job": 3, "event": "compaction_finished",
> "compaction_time_micros": 15085, "compaction_time_cpu_micros":
> 11937, "output_level": 6, "num_output_files": 1,
> "total_output_size": 12627499, "num_input_records": 7670,
> "num_output_records": 911, "num_subcompactions": 1,
> "output_compression": "NoCompression",
> "num_single_delete_mismatches": 0, "num_single_delete_fallthrough":
> 0, "lsm_state": [5, 0, 0, 0, 0, 0, 2]}
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -47>
> 2021-04-30T16:02:38.702+0000 7f5d1095b700 2 rocksdb:
> [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background
> compaction error: Corruption: block checksum mismatch: expected
> 395334538, got 4289108204 in
> /var/lib/ceph/mon/ceph-cube/store.db/073501.sst offset 36769734
> size 84730, Accumulated background error counts: 1
> Apr 30 12:02:40 cube.robeckert.us conmon[41474]: debug -46>
> 2021-04-30T16:02:38.702+0000 7f5d21332700 5 asok(0x56327d226000)
> register_command smart hook 0x56327e028700
>
>
> This is running the latest pacific container, but I was seeing the
> same issue in octopus.
>
> The container runs under podman on RHEL 8, and
> /var/lib/ceph/mon/ceph-cube is mapped to
> /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.cube.service
> on the NVMe boot drive, which has plenty of space.
>
> To recover I run a script that will stop the monitor on another
> host, copy the store.db directory then start up, and it syncs right
> up.
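> Roughly, the script does something like this (hostnames and paths here are
> illustrative, not the exact script):
>
>   FSID=fe3a7cb0-69ca-11eb-8d45-c86000d08867
>   # stop the broken mon and a healthy donor mon
>   systemctl stop ceph-$FSID@mon.cube.service
>   ssh story systemctl stop ceph-$FSID@mon.story.service
>   # replace the corrupted store.db with a copy from the donor
>   rm -rf /var/lib/ceph/$FSID/mon.cube/store.db
>   scp -r root@story:/var/lib/ceph/$FSID/mon.story/store.db /var/lib/ceph/$FSID/mon.cube/
>   # start both again; mon.cube then resyncs from the quorum
>   ssh story systemctl start ceph-$FSID@mon.story.service
>   systemctl start ceph-$FSID@mon.cube.service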
>
>
>
> Thanks,
> Rob
>
>
>
>
>
> -----Original Message-----
> From: Sebastian Wagner <sewagner(a)redhat.com>
> Sent: Thursday, April 29, 2021 7:44 AM
> To: Eugen Block <eblock(a)nde.ag>; ceph-users(a)ceph.io
> Subject: [ceph-users] Re: one of 3 monitors keeps going down
>
> Right, here are the docs for that workflow:
>
> https://docs.ceph.com/en/latest/cephadm/mon/#mon-service
>
> On 29.04.21 at 13:13, Eugen Block wrote:
>> Hi,
>>
>> instead of copying MON data to this one did you also try to redeploy
>> the MON container entirely so it gets a fresh start?
>>
>>
>> Quoting "Robert W. Eckert" <rob(a)rob.eckert.name>:
>>
>>> Hi,
>>> On a daily basis, one of my monitors goes down
>>>
>>> [root@cube ~]# ceph health detail
>>> HEALTH_WARN 1 failed cephadm daemon(s); 1/3 mons down, quorum
>>> rhel1.robeckert.us,story [WRN] CEPHADM_FAILED_DAEMON: 1 failed
>>> cephadm daemon(s)
>>> daemon mon.cube on cube.robeckert.us is in error state [WRN]
>>> MON_DOWN: 1/3 mons down, quorum rhel1.robeckert.us,story
>>> mon.cube (rank 2) addr
>>> [v2:192.168.2.142:3300/0,v1:192.168.2.142:6789/0] is down (out of
>>> quorum) [root@cube ~]# ceph --version ceph version 15.2.11
>>> (e3523634d9c2227df9af89a4eac33d16738c49cb)
>>> octopus (stable)
>>>
>>> I have a script that will copy the mon data from another server and
>>> it restarts and runs well for a while.
>>>
>>> It is always the same monitor, and when I look at the logs the only
>>> thing I really see is the cephadm log showing it down
>>>
>>> 2021-04-28 10:07:26,173 DEBUG Running command: /usr/bin/podman
>>> --version
>>> 2021-04-28 10:07:26,217 DEBUG /usr/bin/podman: stdout podman version
>>> 2.2.1
>>> 2021-04-28 10:07:26,222 DEBUG Running command: /usr/bin/podman
>>> inspect --format
>>> {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index
>>> .Config.Labels "io.ceph.version"}}
>>> ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08867-osd.2
>>> 2021-04-28 10:07:26,326 DEBUG /usr/bin/podman: stdout
>>> fab17e5242eb4875e266df19ca89b596a2f2b1d470273a99ff71da2ae81eeb3c,docker.io/ceph/ceph:v15,5b724076c58f97872fc2f7701e8405ec809047d71528f79da452188daf2af72e,2021-04-26 17:13:15.54183375 -0400 EDT,
>>> 2021-04-28 10:07:26,328 DEBUG Running command: systemctl is-enabled
>>> ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08867(a)mon.cube
>>>
>>> 2021-04-28 10:07:26,334 DEBUG systemctl: stdout enabled
>>> 2021-04-28 10:07:26,335 DEBUG Running command: systemctl is-active
>>> ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08867(a)mon.cube
>>>
>>> 2021-04-28 10:07:26,340 DEBUG systemctl: stdout failed
>>> 2021-04-28 10:07:26,340 DEBUG Running command: /usr/bin/podman
>>> --version
>>> 2021-04-28 10:07:26,395 DEBUG /usr/bin/podman: stdout podman version
>>> 2.2.1
>>> 2021-04-28 10:07:26,402 DEBUG Running command: /usr/bin/podman
>>> inspect --format
>>> {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index
>>> .Config.Labels "io.ceph.version"}}
>>> ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08867-mon.cube
>>> 2021-04-28 10:07:26,526 DEBUG /usr/bin/podman: stdout
>>> 04e7c673cbacf5160427b0c3eb2f0948b2f15d02c58bd1d9dd14f975a84cfc6f,docker.io/ceph/ceph:v15,5b724076c58f97872fc2f7701e8405ec809047d71528f79da452188daf2af72e,2021-04-28 08:54:57.614847512 -0400 EDT,
>>>
>>> I don't know if it matters, but this server is an AMD 3600XT while
>>> my other two servers which have had no issues are intel based.
>>>
>>> The root file system was originally on a SSD, and I switched to NVME,
>>> so I eliminated controller or drive issues. (I didn't see anything
>>> in dmesg anyway)
>>>
>>> If someone could point me in the right direction on where to
>>> troubleshoot next, I would appreciate it.
>>>
>>> Thanks,
>>> Rob Eckert
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an
>>> email to ceph-users-leave(a)ceph.io
>>
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an
>> email to ceph-users-leave(a)ceph.io
>>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an
> email to ceph-users-leave(a)ceph.io
Hello folks,
I am new to ceph and at the moment I am doing some performance tests with a 4 node ceph-cluster (pacific, 16.2.1).
Node hardware (4 identical nodes):
* DELL 3620 workstation
* Intel Quad-Core i7-6700 @ 3.4 GHz
* 8 GB RAM
* Debian Buster (base system, installed on a dedicated Patriot Burst 120 GB SATA SSD)
* HP 530SFP+ 10 GBit dual-port NIC (tested with iperf at 9.4 GBit/s from node to node)
* 1 x Kingston KC2500 M.2 NVMe PCIe SSD (500 GB, NO power loss protection!)
* 3 x Seagate Barracuda SATA disk drives (7200 rpm, 500 GB)
After bootstrapping a containerized (Docker) Ceph cluster, I did some performance tests on the NVMe storage by creating a storage pool called "ssdpool", backed by 4 OSDs on the (single) NVMe device of each node. A first write-performance test yields
=============
root@ceph1:~# rados bench -p ssdpool 10 write -b 4M -t 16 --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_ceph1_78
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 30 14 55.997 56 0.0209977 0.493427
2 16 53 37 73.9903 92 0.0264305 0.692179
3 16 76 60 79.9871 92 0.559505 0.664204
4 16 99 83 82.9879 92 0.609332 0.721016
5 16 116 100 79.9889 68 0.686093 0.698084
6 16 132 116 77.3224 64 1.19715 0.731808
7 16 153 137 78.2741 84 0.622646 0.755812
8 16 171 155 77.486 72 0.25409 0.764022
9 16 192 176 78.2076 84 0.968321 0.775292
10 16 214 198 79.1856 88 0.401339 0.766764
11 1 214 213 77.4408 60 0.969693 0.784002
Total time run: 11.0698
Total writes made: 214
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 77.3272
Stddev Bandwidth: 13.7722
Max bandwidth (MB/sec): 92
Min bandwidth (MB/sec): 56
Average IOPS: 19
Stddev IOPS: 3.44304
Max IOPS: 23
Min IOPS: 14
Average Latency(s): 0.785372
Stddev Latency(s): 0.49011
Max latency(s): 2.16532
Min latency(s): 0.0144995
=============
... and I think that 80 MB/s throughput is a very poor result in conjunction with NVMe devices and 10 GBit NICs.
A bare write test (with the fsync=0 option) of the NVMe drives yields a write throughput of roughly 800 MB/s per device ... the second test (with fsync=1) drops performance to 200 MB/s.
=============
root@ceph1:/home/mschmid# fio --rw=randwrite --name=IOPS-write --bs=1024k --direct=1 --filename=/dev/nvme0n1 --numjobs=4 --ioengine=libaio --iodepth=32 --refill_buffers --group_reporting --runtime=30 --time_based --fsync=0
IOPS-write: (g=0): rw=randwrite, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32...
fio-3.12
Starting 4 processes
Jobs: 4 (f=4): [w(4)][100.0%][w=723MiB/s][w=722 IOPS][eta 00m:00s]
IOPS-write: (groupid=0, jobs=4): err= 0: pid=31585: Thu Apr 29 15:15:03 2021
write: IOPS=740, BW=740MiB/s (776MB/s)(21.8GiB/30206msec); 0 zone resets
slat (usec): min=16, max=810, avg=106.48, stdev=30.48
clat (msec): min=7, max=1110, avg=172.09, stdev=120.18
lat (msec): min=7, max=1110, avg=172.19, stdev=120.18
clat percentiles (msec):
| 1.00th=[ 32], 5.00th=[ 48], 10.00th=[ 53], 20.00th=[ 63],
| 30.00th=[ 115], 40.00th=[ 161], 50.00th=[ 169], 60.00th=[ 178],
| 70.00th=[ 190], 80.00th=[ 220], 90.00th=[ 264], 95.00th=[ 368],
| 99.00th=[ 667], 99.50th=[ 751], 99.90th=[ 894], 99.95th=[ 986],
| 99.99th=[ 1036]
bw ( KiB/s): min=22528, max=639744, per=25.02%, avg=189649.94, stdev=113845.69, samples=240
iops : min= 22, max= 624, avg=185.11, stdev=111.18, samples=240
lat (msec) : 10=0.01%, 20=0.19%, 50=6.43%, 100=20.29%, 250=61.52%
lat (msec) : 500=8.21%, 750=2.85%, 1000=0.47%
cpu : usr=11.87%, sys=2.05%, ctx=13141, majf=0, minf=45
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.3%, 32=99.4%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=0,22359,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=740MiB/s (776MB/s), 740MiB/s-740MiB/s (776MB/s-776MB/s), io=21.8GiB (23.4GB), run=30206-30206msec
Disk stats (read/write):
nvme0n1: ios=0/89150, merge=0/0, ticks=0/15065724, in_queue=15118720, util=99.75%
=============
Furthermore, an IOPS test on the NVMe device with block size 4k shows roughly 1000 IOPS with fsync=1 and 35000 IOPS with fsync=0.
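The 4k tests were of the same form as the fio run above, roughly like this (reconstructed from memory, so treat the exact flags as approximate):
=============
fio --rw=randwrite --name=IOPS-write-4k --bs=4k --direct=1 --filename=/dev/nvme0n1 --numjobs=4 --ioengine=libaio --iodepth=32 --refill_buffers --group_reporting --runtime=30 --time_based --fsync=1
=============
(and the same command with --fsync=0 for the second number)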
To my question: as CPU and network load seem to be low during my tests, I would like to know which bottleneck can cause such a huge performance drop between the bare hardware performance of the NVMe drives and the write speeds in the rados benchmark. Could the missing power loss protection (fsync=1) be the problem, or what throughput should one expect to be normal in such a setup?
Thanks for every advice!
Best regards,
Michael
Dear gents,
to get familiar with the cephadm upgrade path and with cephadm in general (we
heavily use old-style "ceph-deploy" Octopus-based production clusters), we
decided to do some tests with a vanilla cluster running 15.2.11 on CentOS 8 on
top of vSphere. Deployment of the Octopus cluster went very well and we are
excited about this new technique and all the possibilities. No errors, no
clues... :-)
Unfortunately the upgrade to Pacific (16.2.0 or 16.2.1) fails every time, with
either the original Docker images or quay.ceph.io/ceph-ci/ceph:pacific. We use
a small setup (3 mons, 2 mgrs, some OSDs). This is the upgrade behaviour:
The upgrade of both MGRs seems to be OK, but we get this:
2021-04-29T15:35:19.903111+0200 mgr.c0n00.vnxaqu [DBG] daemon
mgr.c0n00.vnxaqu container digest correct
2021-04-29T15:35:19.903206+0200 mgr.c0n00.vnxaqu [DBG] daemon
mgr.c0n00.vnxaqu deployed by correct version
2021-04-29T15:35:19.903298+0200 mgr.c0n00.vnxaqu [DBG] daemon
mgr.c0n01.gstlmw container digest correct
2021-04-29T15:35:19.903378+0200 mgr.c0n00.vnxaqu [DBG] daemon
mgr.c0n01.gstlmw *not deployed by correct version*
After this the upgrade process gets stuck completely, although we still have a
running cluster (minus one monitor daemon):
[root@c0n00 ~]# ceph -s
cluster:
id: 5541c866-a8fe-11eb-b604-005056b8f1bf
health: HEALTH_WARN
* 3 hosts fail cephadm check*
services:
mon: 2 daemons, quorum c0n00,c0n02 (age 68m)
mgr: c0n00.bmtvpr(active, since 68m), standbys: c0n01.jwfuca
osd: 4 osds: 4 up (since 63m), 4 in (since 62m)
[..]
progress:
Upgrade to 16.2.1-257-g717ce59b (0s)
[=...........................]
{
"target_image": "
quay.ceph.io/ceph-ci/ceph@sha256:d0f624287378fe63fc4c30bccc9f82bfe0e42e62381c0a3d0d3d86d985f5d788",
"in_progress": true,
"services_complete": [
"mgr"
],
"progress": "2/19 ceph daemons upgraded",
"message": "Error: UPGRADE_EXCEPTION: Upgrade: failed due to an
unexpected exception"
}
[root@c0n00 ~]# ceph orch ps
NAME HOST PORTS STATUS REFRESHED AGE
VERSION IMAGE ID CONTAINER ID
alertmanager.c0n00 c0n00 running (56m) 4m ago 16h
0.20.0 0881eb8f169f 30d9eff06ce2
crash.c0n00 c0n00 running (56m) 4m ago 16h
15.2.11 9d01da634b8f 91d3e4d0e14d
crash.c0n01 c0n01 host is offline 16h ago 16h
15.2.11 9d01da634b8f 0ff4a20021df
crash.c0n02 c0n02 host is offline 16h ago 16h
15.2.11 9d01da634b8f 0253e6bb29a0
crash.c0n03 c0n03 host is offline 16h ago 16h
15.2.11 9d01da634b8f 291ce4f8b854
grafana.c0n00 c0n00 running (56m) 4m ago 16h
6.7.4 80728b29ad3f 46d77b695da5
mgr.c0n00.bmtvpr c0n00 *:8443,9283 running (56m) 4m ago 16h
16.2.1-257-g717ce59b 3be927f015dd 94a7008ccb4f
mgr.c0n01.jwfuca c0n01 host is offline 16h ago 16h
16.2.1-257-g717ce59b 3be927f015dd 766ada65efa9
mon.c0n00 c0n00 running (56m) 4m ago 16h
15.2.11 9d01da634b8f b9f270cd99e2
mon.c0n02 c0n02 host is offline 16h ago 16h
15.2.11 9d01da634b8f a90c21bfd49e
node-exporter.c0n00 c0n00 running (56m) 4m ago 16h
0.18.1 e5a616e4b9cf eb1306811c6c
node-exporter.c0n01 c0n01 host is offline 16h ago 16h
0.18.1 e5a616e4b9cf 093a72542d3e
node-exporter.c0n02 c0n02 host is offline 16h ago 16h
0.18.1 e5a616e4b9cf 785531f5d6cf
node-exporter.c0n03 c0n03 host is offline 16h ago 16h
0.18.1 e5a616e4b9cf 074fac77e17c
osd.0 c0n02 host is offline 16h ago 16h
15.2.11 9d01da634b8f c075bd047c0a
osd.1 c0n01 host is offline 16h ago 16h
15.2.11 9d01da634b8f 616aeda28504
osd.2 c0n03 host is offline 16h ago 16h
15.2.11 9d01da634b8f b36453730c83
osd.3 c0n00 running (56m) 4m ago 16h
15.2.11 9d01da634b8f e043abf53206
prometheus.c0n00 c0n00 running (56m) 4m ago 16h
2.18.1 de242295e225 7cb50c04e26a
After some digging into the daemon logs we found tracebacks (please see below).
We also noticed that we can successfully reach each host via ssh -F .... !!!
We've done tcpdumps while upgrading and every SYN gets its SYN-ACK... ;-)
Because we get no errors when deploying a fresh Octopus cluster with cephadm
(from https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm, and cephadm
prepare-host is always OK), could it be a missing Python lib or something that
cephadm itself doesn't check?
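In case it helps, these are the checks we plan to run next (just what we assume
cephadm relies on, so please correct us if there is a better way):

  # from the active mgr host
  ceph cephadm check-host c0n02
  ceph health detail

  # on the affected host itself
  cephadm check-host
  python3 --version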
Thank you for any hint.
Christoph Ackermann
Traceback:
Traceback (most recent call last):
File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line
48, in bootstrap_exec
s = io.read(1)
File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402,
in read
raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/share/ceph/mgr/cephadm/serve.py", line 1166, in
_remote_connection
conn, connr = self.mgr._get_connection(addr)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1202, in
_get_connection
sudo=True if self.ssh_user != 'root' else False)
File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line
34, in __init__
self.gateway = self._make_gateway(hostname)
File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line
44, in _make_gateway
self._make_connection_string(hostname)
File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in
makegateway
gw = gateway_bootstrap.bootstrap(io, spec)
File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line
102, in bootstrap
bootstrap_exec(io, spec)
File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line
53, in bootstrap_exec
raise HostNotFound(io.remoteaddress)
execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-61otabz_ -i
/tmp/cephadm-identity-rt2nm0t4 root@c0n02
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/share/ceph/mgr/cephadm/utils.py", line 73, in do_work
return f(*arg)
File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 60, in
create_from_spec_one
replace_osd_ids=osd_id_claims.get(host, []), env_vars=env_vars
File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 75, in
create_single_host
out, err, code = self._run_ceph_volume_command(host, cmd,
env_vars=env_vars)
File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 295, in
_run_ceph_volume_command
error_ok=True)
File "/usr/share/ceph/mgr/cephadm/serve.py", line 1003, in _run_cephadm
with self._remote_connection(host, addr) as tpl:
File "/lib64/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/usr/share/ceph/mgr/cephadm/serve.py", line 1197, in
_remote_connection
raise OrchestratorError(msg) from e
orchestrator._interface.OrchestratorError: Failed to connect to c0n02
(c0n02).
Please make sure that the host is reachable and accepts connections using
the cephadm SSH key
To add the cephadm SSH key to the host:
> ceph cephadm get-pub-key > ~/ceph.pub
> ssh-copy-id -f -i ~/ceph.pub root@c0n02
To check that the host is reachable:
> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
> chmod 0600 ~/cephadm_private_key
> ssh -F ssh_config -i ~/cephadm_private_key root@c0n02
Hello,
I upgraded my Octopus test cluster, which has 5 hosts, because one of the nodes (a mon/mgr node) was still on version 15.2.10 while all the others were on 15.2.11.
For the upgrade I used the following command:
ceph orch upgrade start --ceph-version 15.2.11
The upgrade worked correctly and I did not see any errors in the logs, but the host version in the Ceph dashboard (under the navigation Cluster -> Hosts) still shows 15.2.10 for that specific node.
The output of "ceph versions", shows that every component is on 15.2.11 as you can see below:
{
"mon": {
"ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)": 3
},
"mgr": {
"ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)": 2
},
"osd": {
"ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)": 2
},
"mds": {},
"overall": {
"ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)": 7
}
}
So why is it still stuck on 15.2.10 in the dashboard?
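Or, asked differently: is there a way to force a refresh of the host metadata?
I was thinking of something along these lines, but that is just a guess on my
side:

  ceph orch ps --refresh
  ceph mgr fail <active-mgr>   # <active-mgr> being a placeholder for the active mgr name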
Best regards,
Mabi
Good thought. The storage for the monitor data is a RAID-0 over three
NVMe devices. Watching iostat, they are completely idle, maybe 0.8% to
1.4% for a second every minute or so.
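For completeness, this is roughly what I'm watching and what I plan to try next
(the manual compaction is just an idea at this point, not something I've
confirmed helps):

  iostat -x 1                                # per-device utilization on the mon host
  du -sh /var/lib/ceph/mon/ceph-*/store.db   # mon DB size
  ceph tell mon.<id> compact                 # trigger a manual rocksdb compaction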
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Apr 8, 2021 at 7:48 PM Zizon Qiu <zzdtsv(a)gmail.com> wrote:
>
> Could it be related to some kind of disk issue on the node that mon is located on, which may occasionally
> slow down IO and, in turn, rocksdb?
>
>
> On Fri, Apr 9, 2021 at 4:29 AM Robert LeBlanc <robert(a)leblancnet.us> wrote:
>>
>> I found this thread that matches a lot of what I'm seeing. I see the
>> ms_dispatch thread going to 100%, but I'm at a single MON, the
>> recovery is done and the rocksdb MON database is ~300MB. I've tried
>> all the settings mentioned in that thread with no noticeable
>> improvement. I was hoping that once the recovery was done (backfills
>> to reformatted OSDs) that it would clear up, but not yet. So any other
>> ideas would be really helpful. Our MDS is functioning, but stalls a
>> lot because the mons miss heartbeats.
>>
>> mon_compact_on_start = true
>> rocksdb_cache_size = 1342177280
>> mon_lease = 30
>> mon_osd_cache_size = 200000
>> mon_sync_max_payload_size = 4096
>>
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>
>> On Thu, Apr 8, 2021 at 1:11 PM Stefan Kooman <stefan(a)bit.nl> wrote:
>> >
>> > On 4/8/21 6:22 PM, Robert LeBlanc wrote:
>> > > I upgraded our Luminous cluster to Nautilus a couple of weeks ago and
>> > > converted the last batch of FileStore OSDs to BlueStore about 36 hours
>> > > ago. Yesterday our monitor cluster went nuts and started constantly
>> > > calling elections because monitor nodes were at 100% and wouldn't
>> > > respond to heartbeats. I reduced the monitor cluster to one to prevent
>> > > the constant elections and that let the system limp along until the
>> > > backfills finished. There are large amounts of time where ceph commands
>> > > hang with the CPU is at 100%, when the CPU drops I see a lot of work
>> > > getting done in the monitor logs which stops as soon as the CPU is at
>> > > 100% again.
>> >
>> >
>> > Try reducing mon_sync_max_payload_size=4096. I have seen Frank Schilder
>> > advise this several times because of monitor issues. Also recently for a
>> > cluster that got upgraded from Luminous -> Mimic -> Nautilus.
>> >
>> > Worth a shot.
>> >
>> > Otherwise I'll try to look in depth and see if I can come up with
>> > something smart (for now I need to go catch some sleep).
>> >
>> > Gr. Stefan
ceph pool size 1 (for temporary and expendable data) still using 2X storage?
Hey Ceph Users!
With all the buzz around chia coin, I want to dedicate a few TB to
storage mining, really just to play with the chia CLI tool, and learn
how it all works.
As the whole concept is about dedicating disk space to large
calculation outputs, the data is meaningless.
For this reason, I am hoping to use a pool with size 1 and min_size 1,
and I did set one up.
However, as a proxmox user, I noticed that this pool appears to still
use 2X storage space, or at a minimum, the pool's maximum size is
limited to 50% of total storage space (not that I plan on maxing out
my storage for this.)
I suspect there is a novice-user failsafe which ensures foolishly
configured size=1 is automatically treated as size=2...
Can anyone point me towards how best to leverage my ceph cluster to
store expendable data at size=1 without wasting x2 actual disk space?
My cluster is perfectly balanced, so I am reluctant to pull an osd
out, generally don't have any other disks on hand, and don't plan to
spend money on additional storage for this endeavour. I do want to
ensure I am not wasting more space than I am expecting though.
(Small hobby cluster, if it matters)
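For reference, this is roughly how I set the pool up and how I'm reading the
numbers (the pool name is made up for this example; I believe the
mon_allow_pool_size_one / --yes-i-really-mean-it parts are only needed on
Octopus and newer):

  ceph config set global mon_allow_pool_size_one true
  ceph osd pool create chia 64 64
  ceph osd pool set chia size 1 --yes-i-really-mean-it
  ceph osd pool set chia min_size 1
  ceph osd pool get chia size   # reports size: 1
  ceph df detail                # to cross-check what the Proxmox UI shows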
Josh
Hi Reed,
Thank you so much for the input and support. We have tried using the setting
you suggested, but could not see any impact on the current system:
*"ceph fs set cephfs allow_standby_replay true"* did not create any *impact
on the failover time*.
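If it helps, I assume the effect of the flag can be verified with something
like the following (one MDS should show up in the standby-replay state);
please correct me if that check is wrong:

  ceph fs get cephfs | grep standby
  ceph fs status cephfs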
Furthermore, we have tried some more scenarios in our test setup:
*Scenario 1:*
[image: image.png]
- In this scenario we looked at the logs on the new node that the MDS
fails over to, i.e. in this case, if we reboot cephnode2, the new active
MDS will be cephnode1. We checked the logs on cephnode1 in two cases:
- 1. *Normal reboot of cephnode2 while keeping the I/O operation in
progress:*
- We see that logging at cephnode1 starts immediately but then waits for
some time (around 15 seconds, apparently a beacon timeout) plus an
additional 6-7 seconds, during which the MDS on cephnode1 becomes active
and I/O resumes. Refer to the logs:
- 2021-04-29T15:49:42.480+0530 7fa747690700 1 mds.cephnode1 Updating
MDS map to version 505 from mon.2
2021-04-29T15:49:42.482+0530 7fa747690700 1 mds.0.505 handle_mds_map
i am now mds.0.505
2021-04-29T15:49:42.482+0530 7fa747690700 1 mds.0.505 handle_mds_map
state change up:boot --> up:replay
2021-04-29T15:49:42.482+0530 7fa747690700 1 mds.0.505 replay_start
2021-04-29T15:49:42.482+0530 7fa747690700 1 mds.0.505 recovery set
is
2021-04-29T15:49:42.482+0530 7fa747690700 1 mds.0.505 waiting for
osdmap 486 (which blacklists prior instance)
2021-04-29T15:49:55.686+0530 7fa74568c700 1 mds.beacon.cephnode1 MDS
connection to Monitors appears to be laggy; 15.9769s since last
acked beacon
2021-04-29T15:49:55.686+0530 7fa74568c700 1 mds.0.505 skipping
upkeep work because connection to Monitors appears laggy
2021-04-29T15:49:57.533+0530 7fa749e95700 0 mds.beacon.cephnode1
MDS is no longer laggy
2021-04-29T15:49:59.599+0530 7fa740e83700 0 mds.0.cache creating
system inode with ino:0x100
2021-04-29T15:49:59.599+0530 7fa740e83700 0 mds.0.cache creating
system inode with ino:0x1
2021-04-29T15:50:00.456+0530 7fa73f680700 1 mds.0.505 Finished
replaying journal
2021-04-29T15:50:00.456+0530 7fa73f680700 1 mds.0.505 making mds
journal writeable
2021-04-29T15:50:00.959+0530 7fa747690700 1 mds.cephnode1 Updating
MDS map to version 506 from mon.2
2021-04-29T15:50:00.959+0530 7fa747690700 1 mds.0.505 handle_mds_map
i am now mds.0.505
2021-04-29T15:50:00.959+0530 7fa747690700 1 mds.0.505 handle_mds_map
state change up:replay --> up:reconnect
2021-04-29T15:50:00.959+0530 7fa747690700 1 mds.0.505 reconnect_start
2021-04-29T15:50:00.959+0530 7fa747690700 1 mds.0.505 reopen_log
2021-04-29T15:50:00.959+0530 7fa747690700 1 mds.0.server
reconnect_clients -- 2 sessions
2021-04-29T15:50:00.964+0530 7fa747690700 0 log_channel(cluster) log
[DBG] : reconnect by client.6892 v1:10.0.4.96:0/1646469259 after
0.00499997
2021-04-29T15:50:00.972+0530 7fa747690700 0 log_channel(cluster) log
[DBG] : reconnect by client.6990 v1:10.0.4.115:0/2776266880 after
0.0129999
2021-04-29T15:50:00.972+0530 7fa747690700 1 mds.0.505 reconnect_done
2021-04-29T15:50:02.005+0530 7fa747690700 1 mds.cephnode1 Updating
MDS map to version 507 from mon.2
2021-04-29T15:50:02.005+0530 7fa747690700 1 mds.0.505 handle_mds_map
i am now mds.0.505
2021-04-29T15:50:02.005+0530 7fa747690700 1 mds.0.505 handle_mds_map
state change up:reconnect --> up:rejoin
2021-04-29T15:50:02.005+0530 7fa747690700 1 mds.0.505 rejoin_start
2021-04-29T15:50:02.008+0530 7fa747690700 1 mds.0.505
rejoin_joint_start
2021-04-29T15:50:02.040+0530 7fa740e83700 1 mds.0.505 rejoin_done
2021-04-29T15:50:03.050+0530 7fa747690700 1 mds.cephnode1 Updating
MDS map to version 508 from mon.2
2021-04-29T15:50:03.050+0530 7fa747690700 1 mds.0.505 handle_mds_map
i am now mds.0.505
2021-04-29T15:50:03.050+0530 7fa747690700 1 mds.0.505 handle_mds_map
state change up:rejoin --> up:clientreplay
2021-04-29T15:50:03.050+0530 7fa747690700 1 mds.0.505 recovery_done
-- successful recovery!
2021-04-29T15:50:03.050+0530 7fa747690700 1 mds.0.505
clientreplay_start
2021-04-29T15:50:03.094+0530 7fa740e83700 1 mds.0.505
clientreplay_done
2021-04-29T15:50:04.081+0530 7fa747690700 1 mds.cephnode1 Updating
MDS map to version 509 from mon.2
2021-04-29T15:50:04.081+0530 7fa747690700 1 mds.0.505 handle_mds_map
i am now mds.0.505
2021-04-29T15:50:04.081+0530 7fa747690700 1 mds.0.505 handle_mds_map
state change up:clientreplay --> up:active
2021-04-29T15:50:04.081+0530 7fa747690700 1 mds.0.505 active_start
2021-04-29T15:50:04.085+0530 7fa747690700 1 mds.0.505 cluster
recovered.
- 2. *Hard reset/power-off of cephnode2 while keeping the I/O operation in
progress:*
- In this case we see that the logs at cephnode1 (on which the new MDS
will be activated) only start appearing 15+ seconds after the power-off.
- Time at which the power-off was done: 2021-04-29-16-17-37
- *Time at which the logs started to show on cephnode1* (see the logs
below), i.e. logging started roughly 15 seconds after the hardware reset:
- 2021-04-29T16:17:51.983+0530 7f5ba3a38700 1 mds.cephnode1 Updating
MDS map to version 518 from mon.0
2021-04-29T16:17:51.984+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map i am now mds.0.518
2021-04-29T16:17:51.984+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map state change up:boot --> up:replay
2021-04-29T16:17:51.984+0530 7f5ba3a38700 1 mds.0.518
replay_start
2021-04-29T16:17:51.984+0530 7f5ba3a38700 1 mds.0.518
recovery set is
2021-04-29T16:17:51.984+0530 7f5ba3a38700 1 mds.0.518 waiting
for osdmap 504 (which blacklists prior instance)
2021-04-29T16:17:54.044+0530 7f5b9ca2a700 0 mds.0.cache
creating system inode with ino:0x100
2021-04-29T16:17:54.045+0530 7f5b9ca2a700 0 mds.0.cache
creating system inode with ino:0x1
2021-04-29T16:17:55.025+0530 7f5b9ba28700 1 mds.0.518 Finished
replaying journal
2021-04-29T16:17:55.025+0530 7f5b9ba28700 1 mds.0.518 making
mds journal writeable
2021-04-29T16:17:56.060+0530 7f5ba3a38700 1 mds.cephnode1
Updating MDS map to version 519 from mon.0
2021-04-29T16:17:56.060+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map i am now mds.0.518
2021-04-29T16:17:56.060+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map state change up:replay --> up:reconnect
2021-04-29T16:17:56.060+0530 7f5ba3a38700 1 mds.0.518
reconnect_start
2021-04-29T16:17:56.060+0530 7f5ba3a38700 1 mds.0.518
reopen_log
2021-04-29T16:17:56.060+0530 7f5ba3a38700 1 mds.0.server
reconnect_clients -- 2 sessions
2021-04-29T16:17:56.068+0530 7f5ba3a38700 0
log_channel(cluster) log [DBG] : reconnect by client.6990 v1:
10.0.4.115:0/2776266880 after 0.00799994
2021-04-29T16:17:56.069+0530 7f5ba3a38700 0
log_channel(cluster) log [DBG] : reconnect by client.6892 v1:
10.0.4.96:0/1646469259 after 0.00899994
2021-04-29T16:17:56.069+0530 7f5ba3a38700 1 mds.0.518
reconnect_done
2021-04-29T16:17:57.099+0530 7f5ba3a38700 1 mds.cephnode1
Updating MDS map to version 520 from mon.0
2021-04-29T16:17:57.099+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map i am now mds.0.518
2021-04-29T16:17:57.099+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map state change up:reconnect --> up:rejoin
2021-04-29T16:17:57.099+0530 7f5ba3a38700 1 mds.0.518
rejoin_start
2021-04-29T16:17:57.103+0530 7f5ba3a38700 1 mds.0.518
rejoin_joint_start
2021-04-29T16:17:57.472+0530 7f5b9d22b700 1 mds.0.518
rejoin_done
2021-04-29T16:17:58.138+0530 7f5ba3a38700 1 mds.cephnode1
Updating MDS map to version 521 from mon.0
2021-04-29T16:17:58.138+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map i am now mds.0.518
2021-04-29T16:17:58.138+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map state change up:rejoin --> up:clientreplay
2021-04-29T16:17:58.138+0530 7f5ba3a38700 1 mds.0.518
recovery_done -- successful recovery!
2021-04-29T16:17:58.138+0530 7f5ba3a38700 1 mds.0.518
clientreplay_start
2021-04-29T16:17:58.157+0530 7f5b9d22b700 1 mds.0.518
clientreplay_done
2021-04-29T16:17:59.178+0530 7f5ba3a38700 1 mds.cephnode1
Updating MDS map to version 522 from mon.0
2021-04-29T16:17:59.178+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map i am now mds.0.518
2021-04-29T16:17:59.178+0530 7f5ba3a38700 1 mds.0.518
handle_mds_map state change up:clientreplay --> up:active
2021-04-29T16:17:59.178+0530 7f5ba3a38700 1 mds.0.518
active_start
2021-04-29T16:17:59.181+0530 7f5ba3a38700 1 mds.0.518 cluster
recovered.
*In both test cases above* we saw an extra delay of *around 15 seconds*
plus 8-10 seconds (a total of 21-25 seconds for failover in case of
power-off/reboot).
*Query:* Is there any specific config that can be tweaked/tried to reduce
the time it takes for the cluster to notice the failure and activate the
standby MDS node?
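Our own guess, for what it is worth: the ~15 seconds looks suspiciously close
to the default mds_beacon_grace of 15 seconds, so we assume something like the
following might reduce the detection time, but we have not tried it yet and do
not know the side effects:

  ceph config get mon mds_beacon_grace    # default 15, matching the delay we see
  ceph config set mon mds_beacon_grace 10
  ceph config set mds mds_beacon_grace 10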
*Scenario 2:*
- *Only stop MDS Daemon Service on Active Node*
- In this scenario, when we only stopped the systemctl service for the MDS
daemon on the active node, we got a very good *reading of around 5-7
seconds* for failover.
- Deployment Mode: 2 Node MDS Setup with max_mds=1
- CEPH MDS Setup: Active-Standby MDS
- Test Case: Active Node MDS Daemon stop
- I/O Resume Duration (Seconds): 5-7
- Node affected: cephnode1
Please suggest/advise whether we can change any configuration to achieve a
minimal failover duration in the first two scenarios.
Best Regards,
Lokendra
On Thu, Apr 29, 2021 at 1:47 AM Reed Dier <reed.dier(a)focusvq.com> wrote:
> I don't have anything of merit to add to this, but it would be an
> interesting addition to your testing to see if active+standby-replay makes
> any difference with test-case1.
>
> I don't think it would be applicable to any of the other use-cases, as a
> standby-replay MDS is bound to a single rank, meaning its bound to a single
> active MDS, and can't function as a standby for active:active.
>
> https://docs.ceph.com/en/latest/cephfs/standby/#configuring-standby-replay
>
>
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/c…
>
> Good luck and look forward to hearing feedback/more results.
>
> Reed
>
> On Apr 27, 2021, at 8:40 AM, Lokendra Rathour <lokendrarathour(a)gmail.com>
> wrote:
>
> Hi Team,
> We have set up a two-node Ceph cluster using the *Native CephFS Driver*, with *details
> as follows:*
>
> - 3 Node / 2 Node MDS Cluster
> - 3 Node Monitor Quorum
> - 2 Node OSD
> - 2 Nodes for Manager
>
>
> Cephnode3 has only mon and MDS (only for test cases 4-7); the other two nodes,
> i.e. cephnode1 and cephnode2, have (mgr, mds, mon, rgw).
>
>
> We have tested the following failover scenarios for the Native CephFS Driver by
> mounting one sub-volume on a VM or client with continuous I/O
> operations (directory creation every 1 second)*:*
>
> <image.png>
>
>
> In the table above we have a few queries:
>
> - Refer to test case 2 and test case 7: both are similar test cases, the
> only difference being the number of Ceph MDS daemons, yet the time for the two
> test cases is different. It should be zero, but the time comes out as 17
> seconds for test case 7.
> - Is there any configurable parameter/configuration change we need
> to make in the Ceph cluster to get the failover time reduced to a few
> seconds?
>
> In the current default deployment we are getting around 35-40
> seconds.
>
>
>
>
>
>
> Best Regards,
>
> --
> ~ Lokendra
> www.inertiaspeaks.com
> www.inertiagroups.com
> skype: lokendrarathour
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>
>
>
--
~ Lokendra
www.inertiaspeaks.com
www.inertiagroups.com
skype: lokendrarathour