Hello,
today I came across some strange behaviour.
After stopping an OSD, I am not able to restart (or stop and then start) a radosgw
daemon.
The boot process hangs until I have started the OSD again.
Specs:
3 ceph nodes
2 radosgw
nautilus 14.2.13
CentOS7
Steps:
* stop the radosgw daemon on rgw
* stop one OSD on a Ceph node
* start the radosgw daemon on rgw
* the rgw daemon hangs during the boot process
2020-11-20 09:58:35.412 7f8a0b7c4900 0 framework: civetweb
2020-11-20 09:58:35.412 7f8a0b7c4900 0 framework conf key: port, val:
10.220.196.31:80
2020-11-20 09:58:35.412 7f8a0b7c4900 0 framework conf key: num_threads,
val: 100
2020-11-20 09:58:35.412 7f8a0b7c4900 0 deferred set uid:gid to 167:167
(ceph:ceph)
2020-11-20 09:58:35.412 7f8a0b7c4900 0 ceph version 14.2.13
(1778d63e55dbff6cedb071ab7d367f8f52a8699f) nautilus (stable), process
radosgw, pid 8145
--> HANGS here, no further log entries
* As soon as I start the OSD again, the boot sequence continues and
everything works.
2020-11-20 09:58:35.412 7f8a0b7c4900 0 framework: civetweb
2020-11-20 09:58:35.412 7f8a0b7c4900 0 framework conf key: port, val:
10.220.196.31:80
2020-11-20 09:58:35.412 7f8a0b7c4900 0 framework conf key: num_threads,
val: 100
2020-11-20 09:58:35.412 7f8a0b7c4900 0 deferred set uid:gid to 167:167
(ceph:ceph)
2020-11-20 09:58:35.412 7f8a0b7c4900 0 ceph version 14.2.13
(1778d63e55dbff6cedb071ab7d367f8f52a8699f) nautilus (stable), process
radosgw, pid 8145
2020-11-20 10:00:23.895 7f8a0b7c4900 0 starting handler: civetweb
2020-11-20 10:00:23.913 7f8a0b7c4900 1 mgrc service_daemon_register
rgw.rgw1 metadata {arch=x86_64,ceph_release=nautilus,ceph_version=ceph
version 14.2.13 (1778d63e55dbff6cedb071ab7d367f8f52a8699f) nautilus
(stable),ceph_version_short=14.2.13,cpu=Intel Xeon E3-12xx v2 (Ivy
Bridge, IBRS),distro=centos,distro_description=CentOS Linux 7
(Core),distro_version=7,frontend_config#0=civetweb port=10.220.196.31:80
num_threads=100,frontend_type#0=civetweb,hostname=rgw1,kernel_description=#1
SMP Tue Oct 20 16:53:08 UTC
2020,kernel_version=3.10.0-1160.2.2.el7.x86_64,mem_swap_kb=1048572,mem_total_kb=1881836,num_handles=1,os=Linux,pid=8145,zone_id=ecf200a8-2c4a-4c96-96d8-4fcff5b2c8c3,zone_name=default,zonegroup_id=01d11ed1-6157-4c26-addf-ecba49820e20,zonegroup_name=default}
2020-11-20 10:00:25.551 7f89d3aac700 1 civetweb: 0x557f6a6f2000:
10.220.199.4 - - [20/Nov/2020:10:00:25 +0100] "GET / HTTP/1.0" 200 416 - -
2020-11-20 10:00:25.582 7f89d3aac700 1 civetweb: 0x557f6a6f2000:
10.220.199.3 - - [20/Nov/2020:10:00:25 +0100] "GET / HTTP/1.0" 200 416 - -
2020-11-20 10:00:27.555 7f89d3aac700 1 civetweb: 0x557f6a6f2000:
10.220.199.4 - - [20/Nov/2020:10:00:27 +0100] "GET / HTTP/1.0" 200 416 - -
2020-11-20 10:00:27.586 7f89d3aac700 1 civetweb: 0x557f6a6f2000:
10.220.199.3 - - [20/Nov/2020:10:00:27 +0100] "GET / HTTP/1.0" 200 416 - -
regards
Bernhard
On Fri, 20 Nov 2020 at 10:17, Bernhard Krieger <b.krieger(a)lagis.at> wrote:
> Hello,
> today I came across some strange behaviour.
> After stopping an OSD, I am not able to restart (or stop and then start) a radosgw
> daemon.
> The boot process hangs until I have started the OSD again.
> Specs:
> 3 ceph nodes
What is the ceph status while one OSD is down? If this makes all PGs
inactive or worse, then any client that wants to write anything to the
cluster will block until the PGs are writable again.
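
One way to check this while the OSD is down is to look at the PG states in the JSON form of the status output. The following is only a rough sketch: it assumes the output of `ceph status --format json` contains a `pgmap.pgs_by_state` list with `state_name` and `count` fields (as far as I know that is how Nautilus reports it), and the function name and the sample data are made up for illustration.

```python
import json


def inactive_pg_count(status_json: str) -> int:
    """Count PGs whose state string does not contain the 'active' flag.

    Assumes the JSON layout of `ceph status --format json`, i.e.
    pgmap.pgs_by_state = [{"state_name": "...", "count": N}, ...].
    """
    status = json.loads(status_json)
    states = status.get("pgmap", {}).get("pgs_by_state", [])
    return sum(
        entry["count"]
        for entry in states
        # State names are '+'-joined flags, e.g. "active+undersized+degraded".
        if "active" not in entry["state_name"].split("+")
    )


# Hypothetical sample, not taken from the cluster in this thread.
SAMPLE = json.dumps({
    "pgmap": {
        "num_pgs": 128,
        "pgs_by_state": [
            {"state_name": "active+undersized+degraded", "count": 96},
            {"state_name": "undersized+degraded+peered", "count": 32},
        ],
    }
})

print(inactive_pg_count(SAMPLE))
```

If this returns anything other than 0 for the real status output, some PGs are not serving I/O, and a client such as radosgw would be expected to block on them until they become active again.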
--
May the most significant bit of your life be positive.