May 2021 - ceph-users - lists.ceph.io

cephadm stalled after adjusting placement

by Bryan Stillwell

I'm looking for help in figuring out why cephadm isn't making any progress after I told it to redeploy an mds daemon with: ceph orch daemon redeploy mds.cephfs.aladdin.kgokhr ceph/ceph:v15.2.12 The output from 'ceph -W cephadm' just says: 2021-05-14T16:24:46.628084+0000 mgr.paris.glbvov [INF] Schedule redeploy daemon mds.cephfs.aladdin.kgokhr However, the mds never gets redeployed. I do see this warning in 'ceph health detail' which might have something to do with it: Module 'cephadm' has failed: 'NoneType' object has no attribute 'target_id' What steps can I do to figure out why cephadm is hung? Thanks, Bryan

3 years

1
0
0 0

Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

by David Orman

Hi, We are seeing the mgr attempt to apply our OSD spec on the various hosts, then block. When we investigate, we see the mgr has executed cephadm calls like so, which are blocking: root 1522444 0.0 0.0 102740 23216 ? S 17:32 0:00 \_ /usr/bin/python3 /var/lib/ceph/XXXXX/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90 --image docker.io/ceph/ceph@sha256:694ba9cdcbe6cb7d25ab14b34113c42c2d1af18d4c79c7ba4d1f62cf43d145fe ceph-volume --fsid XXXXX -- lvm list --format json This occurs on all hosts in the cluster, following starting/restarting/failing over a manager. It's blocking an in-progress upgrade post-manager updates on one cluster, currently. Looking at the cephadm logs on the host(s) in question, we see the last entry appears to be truncated, like: 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.db_uuid": "1n2f5v-EEgO-1Kn6-hQd2-v5QF-AN9o-XPkL6b", 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.encrypted": "0", 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.osd_fsid": "XXXX", 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.osd_id": "205", 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.osdspec_affinity": "osd_spec", 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.type": "block", The previous entry looks like this: 2021-05-10 17:32:06,469 INFO /usr/bin/podman: "ceph.db_uuid": "TMTPD5-MLqp-06O2-raqp-S8o5-TfRG-hbFmpu", 2021-05-10 17:32:06,469 INFO /usr/bin/podman: "ceph.encrypted": "0", 2021-05-10 17:32:06,469 INFO /usr/bin/podman: "ceph.osd_fsid": "XXXX", 2021-05-10 17:32:06,469 INFO /usr/bin/podman: "ceph.osd_id": "195", 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.osdspec_affinity": "osd_spec", 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.type": "block", 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.vdo": "0" 2021-05-10 17:32:06,470 INFO /usr/bin/podman: }, 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "type": "block", 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "vg_name": "ceph-ffd1a4a7-316c-4c85-acde-06459e26f2c4" 2021-05-10 17:32:06,470 INFO /usr/bin/podman: } 2021-05-10 17:32:06,470 INFO /usr/bin/podman: ], We'd like to get to the bottom of this, please let us know what other information we can provide. Thank you, David

3 years

2
4
0 0

mon vanished after cephadm upgrade

by Ashley Merrick

I had a 3 mon CEPH cluster, after updating from 15.2.x to 16.2.x one of my mon's is showing as a stopped state in the Ceph Dashboard.And checking the cephadm logs on the server in question I can see "/usr/bin/docker: Error: No such object: ceph-30449cba-44e4-11eb-ba64-dda10beff041-mon.sn-m01"There is a few OSD services running on the same physical server and they all are starting/running fine via docker.I tried to do a cephadm apply mon to push a new mon to the same host, but it seems to not do anything, nothing shows in the same log file on sn-m01Also ceph -s shows full health and no errors and has no trace of the "failed" mon (not sure if this is expected), only in the ceph dashboard under services can I see the stopped not running mon. Sent via MXlogin

3 years

2
2
0 0

RGW segmentation fault on Pacific 16.2.1 with multipart upload

by Daniel Iwan

Hi I have started to see segfaults during multiplart upload to one of the buckets File is about 60MB in size Upload of the same file to a brand new bucket works OK Command used aws --profile=tester --endpoint=$HOST_S3_API --region="" s3 cp ./pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack s3://tester-bucket/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack For some reason log shows upload to tester-bucket-2 ??? Bucket tester-bucket-2 is owned by the same user TESTER. I'm using Ceph 16.2.1 (recently upgraded from Octopus). Installed with cephadm in Docker OS Ubuntu 18.04.5 LTS Logs show as below May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:46.891+0000 7ffb0e25e700 1 ====== starting new request req=0x7ffa8e15d620 ===== May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:46.907+0000 7ffb0b258700 1 ====== req done req=0x7ffa8e15d620 op status=0 http_status=200 latency=0.011999841s ====== May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:46.907+0000 7ffb0b258700 1 beast: 0x7ffa8e15d620: 11.1.150.14 - TESTER [11/May/2021:11:00:46.891 +0000] "POST /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploads HTTP/1.1" 200 296 - "aws-cli/2.1.23 Python/3.7.3 Linux/4.19.128-microsoft-standard exe/x86_64.ubuntu.18 p May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.055+0000 7ffb09254700 1 ====== starting new request req=0x7ffa8e15d620 ===== May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+0000 7ffb51ae5700 1 ====== starting new request req=0x7ffa8e0dc620 ===== May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+0000 7ffb4eadf700 1 ====== starting new request req=0x7ffa8e05b620 ===== May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+0000 7ffb46acf700 1 ====== starting new request req=0x7ffa8df59620 ===== May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+0000 7ffb44acb700 1 ====== starting new request req=0x7ffa8ded8620 ===== May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.355+0000 7ffb3dabd700 1 ====== starting new request req=0x7ffa8dfda620 ===== May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.359+0000 7ffb1d27c700 1 ====== starting new request req=0x7ffa8de57620 ===== May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:47.359+0000 7ffb22a87700 1 ====== starting new request req=0x7ffa8ddd6620 ===== May 11 11:00:48 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:48.275+0000 7ffb2d29c700 1 ====== req done req=0x7ffa8e15d620 op status=0 http_status=200 latency=1.219983697s ====== May 11 11:00:48 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:48.275+0000 7ffb2d29c700 1 beast: 0x7ffa8e15d620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.055 +0000] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=8 HTTP/1.1" 200 2485288 - "aws-cli/2.1.23 Python/3.7.3 Linux May 11 11:00:54 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:54.695+0000 7ffad89f3700 1 ====== req done req=0x7ffa8ddd6620 op status=0 http_status=200 latency=7.335902214s ====== May 11 11:00:54 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:54.695+0000 7ffad89f3700 1 beast: 0x7ffa8ddd6620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.359 +0000] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=6 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux May 11 11:00:56 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:56.871+0000 7ffb11a65700 1 ====== req done req=0x7ffa8e0dc620 op status=0 http_status=200 latency=9.515872955s ====== May 11 11:00:56 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:56.871+0000 7ffb11a65700 1 beast: 0x7ffa8e0dc620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=7 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux May 11 11:00:59 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:59.491+0000 7ffac89d3700 1 ====== req done req=0x7ffa8dfda620 op status=0 http_status=200 latency=12.135838509s ====== May 11 11:00:59 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:00:59.491+0000 7ffac89d3700 1 beast: 0x7ffa8dfda620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=2 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux May 11 11:01:02 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:02.891+0000 7ffb68312700 1 ====== req done req=0x7ffa8e05b620 op status=0 http_status=200 latency=15.535793304s ====== May 11 11:01:02 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:02.891+0000 7ffb68312700 1 beast: 0x7ffa8e05b620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=4 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.299+0000 7ffb70b23700 1 ====== req done req=0x7ffa8df59620 op status=0 http_status=200 latency=15.943787575s ====== May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.299+0000 7ffb70b23700 1 beast: 0x7ffa8df59620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=3 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.647+0000 7ffb8534c700 1 ====== req done req=0x7ffa8ded8620 op status=0 http_status=200 latency=16.291782379s ====== May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.647+0000 7ffb8534c700 1 beast: 0x7ffa8ded8620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=1 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.835+0000 7ffabe9bf700 1 ====== req done req=0x7ffa8de57620 op status=0 http_status=200 latency=16.475780487s ====== May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.835+0000 7ffabe9bf700 1 beast: 0x7ffa8de57620: 11.1.150.14 - TESTER [11/May/2021:11:00:47.359 +0000] "PUT /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=5 HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.875+0000 7ffabf1c0700 1 ====== starting new request req=0x7ffa8de57620 ===== May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.895+0000 7ffaa0983700 1 ====== req done req=0x7ffa8de57620 op status=0 http_status=200 latency=0.019999731s ====== May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:03.895+0000 7ffaa0983700 1 beast: 0x7ffa8de57620: 11.1.150.14 - TESTER [11/May/2021:11:01:03.875 +0000] "POST /tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF HTTP/1.1" 200 400 - "aws-cli/2.1.23 Python/3.7.3 Linux/4.19.128-micros May 11 11:01:06 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:06.147+0000 7ffac31c8700 1 failed to read header: The socket was closed due to a timeout May 11 11:01:06 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:06.147+0000 7ffac31c8700 1 ====== req done http_status=400 ====== May 11 11:01:16 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:16.667+0000 7ffab51ac700 1 failed to read header: The socket was closed due to a timeout May 11 11:01:16 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:16.667+0000 7ffab51ac700 1 ====== req done http_status=400 ====== May 11 11:01:17 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:17.687+0000 7ffaa598d700 1 failed to read header: The socket was closed due to a timeout May 11 11:01:17 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:17.687+0000 7ffaa598d700 1 ====== req done http_status=400 ====== May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:18.179+0000 7ffa9e97f700 1 ====== starting new request req=0x7ffbd40b8620 ===== May 11 11:01:18 ceph-om-vm-node1 bash[27881]: *** Caught signal (Segmentation fault) ** May 11 11:01:18 ceph-om-vm-node1 bash[27881]: in thread 7ffac89d3700 thread_name:radosgw May 11 11:01:18 ceph-om-vm-node1 bash[27881]: ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable) May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 1: /lib64/libpthread.so.0(+0x12b20) [0x7ffbc8558b20] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 2: (rgw_bucket::rgw_bucket(rgw_bucket const&)+0x23) [0x7ffbd3393403] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 3: (rgw::sal::RGWObject::get_obj() const+0x20) [0x7ffbd33c1bb0] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 4: (RGWInitMultipart::verify_permission(optional_yield)+0x6c) [0x7ffbd36abf4c] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 5: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, bool)+0x898) [0x7ffbd3373f58] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 6: (process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 7: /lib64/libradosgw.so.2(+0x49510d) [0x7ffbd32ca10d] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 8: /lib64/libradosgw.so.2(+0x496b74) [0x7ffbd32cbb74] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 9: /lib64/libradosgw.so.2(+0x496dde) [0x7ffbd32cbdde] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 10: make_fcontext() May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug 2021-05-11T11:01:18.183+0000 7ffac89d3700 -1 *** Caught signal (Segmentation fault) ** May 11 11:01:18 ceph-om-vm-node1 bash[27881]: in thread 7ffac89d3700 thread_name:radosgw May 11 11:01:18 ceph-om-vm-node1 bash[27881]: ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable) May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 1: /lib64/libpthread.so.0(+0x12b20) [0x7ffbc8558b20] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 2: (rgw_bucket::rgw_bucket(rgw_bucket const&)+0x23) [0x7ffbd3393403] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 3: (rgw::sal::RGWObject::get_obj() const+0x20) [0x7ffbd33c1bb0] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 4: (RGWInitMultipart::verify_permission(optional_yield)+0x6c) [0x7ffbd36abf4c] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 5: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, bool)+0x898) [0x7ffbd3373f58] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 6: (process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 7: /lib64/libradosgw.so.2(+0x49510d) [0x7ffbd32ca10d] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 8: /lib64/libradosgw.so.2(+0x496b74) [0x7ffbd32cbb74] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 9: /lib64/libradosgw.so.2(+0x496dde) [0x7ffbd32cbdde] May 11 11:01:18 ceph-om-vm-node1 bash[27881]: 10: make_fcontext() May 11 11:01:18 ceph-om-vm-node1 bash[27881]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. May 11 11:01:18 ceph-om-vm-node1 bash[27881]: --- begin dump of recent events --- May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2732> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command assert hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2731> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command abort hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2730> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command leak_some_memory hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2729> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command perfcounters_dump hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2728> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command 1 hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2727> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command perf dump hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2726> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command perfcounters_schema hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2725> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command perf histogram dump hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2724> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command 2 hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2723> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command perf schema hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2722> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command perf histogram schema hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2721> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command perf reset hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2720> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command config show hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2719> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command config help hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2718> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command config set hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2717> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command config unset hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2716> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command config get hook 0x55f96fdf4580 May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug -2715> 2021-05-11T10:57:05.234+0000 7ffbd4103440 5 asok(0x55f96fef0000) register_command config diff hook 0x55f96fdf4580 Regards Daniel

3 years

2
2
0 0

Zabbix module Octopus 15.2.3

by Gert Wieberdink

Trying to configure Zabbix module in Octopus 15.2.3. CentOS 8.1 environment. Installed zabbix40-agent for CentOS 8.1 (from epel repository). This will also install zabbix_sender. After enabling the Zabbix module in Ceph, I configured my Zabbix host and Zabbix identifier. # ceph zabbix config-set zabbix_host <zabbix-fqdn> # ceph zabbix config-set zabbix_identifier <ident> # ceph zabbix config-show Error EINVAL: Traceback (most recent call last): File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command return self.handle_command(inbuf, cmd) File "/usr/share/ceph/mgr/zabbix/module.py", line 407, in handle_command return 0, json.dumps(self.config, index=4, sort_keys=True), '' File "/lib64/python3.6/json/__init__.py", line 238, in dumps **kw).encode(obj) TypeError: __init__() got an unexpected keyword argument 'index' # ceph -v ceph version 15.2.3 (d289bbdec69ed7c1f516e0a093594580a76b78d0) octopus (stable) # ceph health detail HEALTH_OK Anyone found a solution? rgds, -gw

3 years

5
4
0 0

Limit memory of ceph-mgr

by mabi

Hello, I just noticed on my small Octopus cluster that the ceph-mgr on a mgr/mon node uses 3.6GB of resident memory (RES) as you can see below from the top output: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2704 167 20 0 5030528 3.6g 35796 S 6.6 47.2 23:08.18 ceph-mgr 2699 167 20 0 1291504 884796 23672 S 4.6 11.1 13:23.63 ceph-mon Is there a way to limit the memory usage of ceph-mgr just like one can do with ceph OSD (osd_memory_target)? I tried something like mgr_memory_target but that parameter does not exist. Thanks, Mabi

3 years

1
0
0 0

v14.2.21 Nautilus released

by David Galloway

This is a hotfix release addressing a number of security issues and regressions. We recommend all users update to this release. For a detailed release notes with links & changelog please refer to the official blog entry at https://ceph.io/releases/v14-2-21-nautilus-released Getting Ceph ------------ * Git at git://github.com/ceph/ceph.git * Tarball at https://download.ceph.com/tarballs/ceph-14.2.21.tar.gz * For packages, see https://docs.ceph.com/docs/master/install/get-packages/ * Release git sha1: 5ef401921d7a88aea18ec7558f7f9374ebd8f5a6

3 years

3
2
0 0

How to "out" a mon/mgr node with orchestrator

by mabi

Hello, I need to re-install one node of my Octopus cluster (installed with cephadm) which is a mon/mgr node and did not find in the documentation how to do that with the new ceph orchestrator commands. So my question would be what are the "ceph orch" commands I need to run in order to "out" nicely the mgr and mon services from that specific node? I have a standby manager and 3 mons in total so from the redundancy it should be no problem to take that one node out for re-installing it. Best regards, Mabi

3 years

1
0
0 0

DNS and /etc/hosts in Pacific Release

by Paul Cuzner

Hi, One of the recent changes that Ceph Pacific has introduced is the removal of support for /etc/hosts on the ceph cluster nodes, that use *podman* as the container engine (CentOS8+) This means that name resolution from within the ceph containers now relies on either DNS, or the host to ip mapping that is created when adding a host to the cluster with the orchestrator CLI (orch host add). The exclusion of /etc/hosts has been implemented using a --no-hosts setting on the "podman run" command. Installations that use docker are unaffected. So if you're planning to use Ceph Pacific with podman *and* need /etc/hosts to work, it would be great to hear from you! Cheers, Paul Cuzner

3 years

1
0
0 0

Osd can not goto up/in status on arm64

by 赵贺东

Hi ceph-users, I deploy ceph on arm64 by cephadm. Mon/mgr seems work well. But for osd, it can not work well, that means I can add osd to my cluster but osd can not goto up&in status. I can not find helpful info from ceph log. Can anyone help me, or how can I find the reason why osd can not goto up/in status. Before add osd, raw disk is available. Add osd by "ceph orch apply osd --all-available-devices” Hardware: NXP LS1043A processor，64bit OS: Ubuntu 18.04.5 LTS (GNU/Linux 4.19.26 aarch64) Ceph:15.2.9/15.2.11/16.2.3

3 years

1
0
0 0

2024

2023

2022

2021

2020

2019

ceph-users May 2021