Is the public_network definition in ceph.conf just used to determine which
interface or IP to use for the public network, or does it need to
encompass the public IP addresses of all cluster nodes?
Specifically, can the public_network be defined differently for different
OSD nodes as long as they have layer 3 connectivity?
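For illustration, a minimal sketch of what I mean (the subnets are
hypothetical):

    # ceph.conf on OSD node A
    [global]
    public_network = 192.168.1.0/24

    # ceph.conf on OSD node B, reachable from node A via layer 3 routing
    [global]
    public_network = 192.168.2.0/24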
thx
Frank
Hi Jason, dongsheng
I found a problem using rbd_open_by_id when the connection times out (errno = 110, ceph version 12.2.8; there is no change to rbd_open_by_id in the master branch).
int r = ictx->state->open(false);
if (r < 0) {   // r = -110 (ETIMEDOUT)
  delete ictx; // crash; the stack is shown below
} else {
  *image = (rbd_image_t)ictx;
}
The stack is:
(gdb) bt
Program terminated with signal 11, Segmentation fault.
#0 Mutex::Lock (this=this@entry=0x8, no_lockdep=no_lockdep@entry=false)
at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/Mutex.cc:92
#1 0x0000ffff765eb814 in Locker (m=..., this=<synthetic pointer>)
at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/Mutex.h:115
#2 PerfCountersCollection::remove (this=0x0, l=0xffff85bf0588 <main_arena+88>)
at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/perf_counters.cc:62
#3 0x0000ffff757f26c4 in librbd::ImageCtx::perf_stop (this=0x2065d2f0)
at /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/ImageCtx.cc:426
#4 0x0000ffff757f6e48 in librbd::ImageCtx::~ImageCtx (this=0x2065d2f0, __in_chrg=<optimized out>)
at /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/ImageCtx.cc:239
#5 0x0000ffff757e8e70 in rbd_open_by_id (p=p@entry=0x2065e780,
id=id@entry=0xffff85928bb4 "20376aa706c0", image=image@entry=0xffff75b9b0b8,
snap_name=snap_name@entry=0x0) at /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/librbd.cc:2692
#6 0x0000ffff75aed524 in __pyx_pf_3rbd_5Image___init__ (
__pyx_v_read_only=0xffff85f02bb8 <_Py_ZeroStruct>, __pyx_v_snapshot=0xffff85f15938 <_Py_NoneStruct>,
__pyx_v_by_id=0xffff85f02ba0 <_Py_TrueStruct>, __pyx_v_name=0xffff85928b90,
__pyx_v_ioctx=0xffff7f0283d0, __pyx_v_self=0xffff75b9b0a8)
at /var/lib/jenkins/workspace/ceph_L/ceph/build/src/pybind/rbd/pyrex/rbd.c:12662
#7 __pyx_pw_3rbd_5Image_1__init__ (__pyx_v_self=0xffff75b9b0a8, __pyx_args=<optimized out>,
__pyx_kwds=<optimized out>)
at /var/lib/jenkins/workspace/ceph_L/ceph/build/src/pybind/rbd/pyrex/rbd.c:12378
#8 0x0000ffff85e02924 in type_call () from /lib64/libpython2.7.so.1.0
#9 0x0000ffff85dac254 in PyObject_Call () from /lib64/libpython2.7.so.1.0
Meanwhile, I have some questions about rbd_open: if state->open fails (return value r < 0), is there a risk of a memory leak for ictx?
extern "C" int rbd_open(rados_ioctx_t p, const char *name, rbd_image_t *image,
const char *snap_name)
{
librados::IoCtx io_ctx;
librados::IoCtx::from_rados_ioctx_t(p, io_ctx);
TracepointProvider::initialize<tracepoint_traits>(get_cct(io_ctx));
librbd::ImageCtx *ictx = new librbd::ImageCtx(name, "", snap_name, io_ctx,
false);
tracepoint(librbd, open_image_enter, ictx, ictx->name.c_str(), ictx->id.c_str(), ictx->snap_name.c_str(), ictx->read_only);
int r = ictx->state->open(false);
if (r >= 0) { // if r < 0, Is there a risk of memory leaks?
*image = (rbd_image_t)ictx;
}
tracepoint(librbd, open_image_exit, r);
return r;
}
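For context, here is the failure path I would have expected if the caller
owned ictx; a minimal sketch, assuming (perhaps wrongly, given the crash
above) that ImageState::open() does not tear down the ImageCtx itself on
error:

    int r = ictx->state->open(false);
    if (r < 0) {
      // Hypothetical cleanup: only safe if a failed ImageState::open()
      // leaves the ImageCtx alive. The segfault in rbd_open_by_id above
      // suggests the opposite, i.e. that open() already destroys ictx on
      // error; in that case this delete would be a double free, and
      // rbd_open would not leak after all.
      delete ictx;
    } else {
      *image = (rbd_image_t)ictx;
    }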
Can you give me some advice? Many thanks.
yangjun(a)cmss.chinamobile.com
Hello,
We are planning to upgrade our cluster from Jewel to Nautilus. From my understanding, the monitors' leveldb and the OSDs' filestore will not be converted to rocksdb and bluestore automatically. So do you suggest converting them manually after upgrading the software? Is there any document or guidance available?
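For example, is a per-OSD replacement along these lines the recommended way
to get from filestore to bluestore? (A rough sketch based on my reading so
far; <osd-id> and <device> are placeholders.)

    ceph osd out <osd-id>
    while ! ceph osd safe-to-destroy osd.<osd-id> ; do sleep 60 ; done
    systemctl stop ceph-osd@<osd-id>
    umount /var/lib/ceph/osd/ceph-<osd-id>
    ceph-volume lvm zap <device>
    ceph osd destroy <osd-id> --yes-i-really-mean-it
    ceph-volume lvm create --bluestore --data <device> --osd-id <osd-id>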
Br,
Xu Yun
Hi,
I want to set up the Ceph iSCSI Gateway, and I am following this
documentation: <https://docs.ceph.com/docs/master/rbd/iscsi-overview/>.
In the "Setup" step of "Configuring the iSCSI target using the command
line interface" <https://docs.ceph.com/docs/master/rbd/iscsi-target-cli/>,
I cannot start the service rbd-target-api.
There's no error message in status or anywhere else:
root@ld5505:~# systemctl status rbd-target-api
● rbd-target-api.service - Ceph iscsi target configuration API
   Loaded: loaded (/lib/systemd/system/rbd-target-api.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2019-12-04 13:47:51 CET; 3min 16s ago
  Process: 4143457 ExecStart=/usr/bin/rbd-target-api (code=exited, status=1/FAILURE)
 Main PID: 4143457 (code=exited, status=1/FAILURE)

Dec 04 13:47:51 ld5505 systemd[1]: rbd-target-api.service: Service RestartSec=100ms expired, scheduling restart.
Dec 04 13:47:51 ld5505 systemd[1]: rbd-target-api.service: Scheduled restart job, restart counter is at 3.
Dec 04 13:47:51 ld5505 systemd[1]: Stopped Ceph iscsi target configuration API.
Dec 04 13:47:51 ld5505 systemd[1]: rbd-target-api.service: Start request repeated too quickly.
Dec 04 13:47:51 ld5505 systemd[1]: rbd-target-api.service: Failed with result 'exit-code'.
Dec 04 13:47:51 ld5505 systemd[1]: Failed to start Ceph iscsi target configuration API.
Can you please advise how to troubleshoot this issue?
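Would something like the following be a sensible way to dig deeper, i.e.
running the daemon in the foreground and pulling the recent journal
entries?

    /usr/bin/rbd-target-api
    journalctl -u rbd-target-api -n 50 --no-pager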
THX
Hi,
The crushmap produced by ceph osd getcrushmap in ceph version 14.2.4
has more info than is described in
https://docs.ceph.com/docs/cuttlefish/rados/operations/crush-map/
There is a second id per bucket:
host a1-df {
        id -3           # do not change unnecessarily
        id -4 class hdd # do not change unnecessarily
        # weight 1.819
        alg straw2
        hash 0          # rjenkins1
        item osd.0 weight 1.819
}
What exactly is this second id which, by the way, comes with a device
class specified?
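In case it is relevant: is the following the right place to look? It seems
to list per-device-class "shadow" entries, which is my guess as to where
the extra id comes from:

    ceph osd crush tree --show-shadow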
Regards,
Rodrigo Severo
Hi all,
How is the following situation handled with bluestore:
1. You have a 200GB OSD (no separate DB/WAL devices).
2. The metadata grows past 30G for some reason, and RocksDB wants to create
a 300GB level but can't.
Where is the metadata over 30G stored?
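For reference, I have been checking the DB usage on an OSD with something
like this (osd.0 is just an example; run it on the node hosting the OSD):

    ceph daemon osd.0 perf dump bluefs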
We utilize 4 iSCSI gateways in a cluster and have noticed the following
during patching cycles, when we sequentially reboot single iSCSI gateways:
"gwcli" often hangs on the still-up iSCSI GWs, but sometimes it still
functions and gives the message:
"1 gateway is inaccessible - updates will be disabled"
This got me thinking about what the course of action would be should an
iSCSI gateway fail permanently or semi-permanently, say due to a hardware
issue. What would be the best course of action to instruct the remaining
iSCSI gateways that one of them is no longer available, and that they
should allow updates again and take ownership of the now-defunct node's
LUNs?
I'm guessing that pulling down the RADOS config object, rewriting it, and
re-put'ing it, followed by an rbd-target-api restart, might do the trick,
but I am hoping there is a more "in-band" and less potentially devastating
way to do this.
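Concretely, by "pulling down the RADOS config object" I mean something like
the following, assuming the default object name gateway.conf in the rbd
pool (which may differ in other setups):

    rados -p rbd get gateway.conf /tmp/gateway.conf
    # edit /tmp/gateway.conf to remove the dead gateway
    rados -p rbd put gateway.conf /tmp/gateway.conf
    systemctl restart rbd-target-api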
Thanks for any insights.
Respectfully,
*Wes Dillingham*
wes(a)wesdillingham.com
LinkedIn <http://www.linkedin.com/in/wesleydillingham>
On Thu, Dec 5, 2019 at 16:38, Marc Roos
<M.Roos(a)f1-outsourcing.eu> wrote:
> ceph-users(a)lists.ceph.com is the old one; why that is, I also do not know.
OK, Marc. Thanks for the information.
Rodrigo