On 20.01.20 at 21:43, Yaarit Hatuka wrote:
On Mon, Jan 20, 2020 at 9:47 AM Sage Weil <sweil@redhat.com> wrote:
[Adding Sebastian, dev@ceph.io]
Some things to improve with the OSD create path!
On Mon, 20 Jan 2020, Yaarit Hatuka wrote:
Here are a few insights from this debugging process - I hope I got it right:
1. Adding the device with "/dev/disk/by-id/...." did not work for me; it
failed in pybind/mgr/cephadm/module.py at:
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/module.py#L…
"if len(list(set(devices) & set(osd['devices']))) == 0"
because osd['devices'] has the devices listed as "/dev/sdX", but
set(devices) has them by their by-id path (which is the syntax given in
the example in the docs, which I followed).
It took me a couple of days to debug this :-)
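As an aside, here is a minimal sketch of why that intersection comes out empty (the device names below are invented for illustration, not taken from the actual cluster), and how resolving the by-id symlink first could make the comparison work:

```python
import os

# Hypothetical device names, for illustration only.
requested = ["/dev/disk/by-id/wwn-0x5000c500a1b2c3d4"]  # what the user passed
reported = ["/dev/sdb"]                                  # what osd['devices'] holds

# The raw-string comparison never matches, even though both may refer
# to the very same disk:
print(set(requested) & set(reported))  # -> set()

# /dev/disk/by-id entries are symlinks to the /dev/sdX node, so resolving
# them (e.g. with os.path.realpath) before comparing would collapse both
# spellings to the same path:
resolved = {os.path.realpath(d) for d in requested}
```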
I'm really looking forward to Joshua's PR getting merged.
2. I think that cephadm should be more verbose by default. When creating an
OSD it only writes "Created osd(s) on host 'mira027.front.sepia.ceph.com'"
(even in case creation failed...). It would help if it output the different
stages, so that the user can see where it stopped in case of error.
Might be an idea to also print out the ceph-volume command line that was
executed.
3. ceph status shows that the OSD was added even if the orchestrator failed
to add it (but it's marked down and out).
IIUC this is ceph-volume's failure path not cleaning up? Is this the
failure you saw when you passed the /dev/disk/by-id device path?
It seems like ceph-volume completed successfully all this time, but
since I always passed /dev/disk/by-id and not /dev/sdX to 'ceph
orchestrator osd create', this intersection was always empty:
set(devices) & set(osd['devices']) [1]
The other part of the condition was also true, so the 'continue' happened
every time. Therefore the orchestrator does not even try to call:
self._create_daemon('osd', ...) [2]
Not sure why the OSD count is incremented, though.
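To make the skip explicit, here is a minimal stand-in for the matching loop (not the actual module code; the host and device names are made up), showing that with a by-id path the 'continue' always fires and the _create_daemon('osd', ...) call is never reached:

```python
def pick_osd_hosts(requested_devices, osd_specs):
    """Stand-in for the orchestrator's device-matching loop."""
    created = []
    for osd in osd_specs:
        if len(list(set(requested_devices) & set(osd["devices"]))) == 0:
            continue  # empty intersection: this host is silently skipped
        # In the real module this is roughly where self._create_daemon('osd', ...)
        # would be invoked.
        created.append(osd["host"])
    return created

# A by-id path never equals the /dev/sdX names, so no daemon creation is attempted:
print(pick_osd_hosts(["/dev/disk/by-id/wwn-0xdeadbeef"],
                     [{"host": "mira027", "devices": ["/dev/sdb"]}]))  # -> []
```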
4. I couldn't find the logs that cephadm produces. I searched for them on
both the source (mira010) and the target (mira027) machines in
/var/log/ceph/<fsid>/* and couldn't find any print from either the cephadm
mgr module or the cephadm script. I also looked at /var/log/*.
Where are they hiding?
The ceph-volume.log is the one to look at.
I looked at ceph-volume.log, but couldn't find any orchestrator /
cephadm module log messages there... like this one:
self.log.info("create_mgr({}, mgr.{}): starting".format(host, name)) [3]
I found
```
fsid="$(ceph status --format json | jq -r .fsid)"
for name in $(cephadm ls | jq -r '.[].name') ; do
    journalctl -u "ceph-$fsid@$name.service" > "$name"
done
```
to be sometimes helpful for gathering the journald logs.
5. After ceph-volume creates its LVs, the host's
lvdisplay/vgdisplay/pvdisplay showed nothing. I had to run "pvscan --cache"
on the host in order for those commands to output the current state. This
may confuse the user.
6. I think it's also a good idea to have another cephadm feature,
"cephadm shell --host=<host>", to open a cephadm shell on a remote host.
I wanted to run "ceph-volume lvm zap" on one of the remote hosts, and to
do that I sshed over, copied the cephadm script and ran "cephadm shell".
It would be cool if we could do that from the original machine.
The cephadm script doesn't know how to ssh. We could probably teach it,
though, for something like this... but it might be simpler for the user
to just 'scp cephadm $host:', as that's basically what cephadm would do
to "install" itself remotely?
sage
[1] https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/module.py#L…
[2] https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/module.py#L…
[3] https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/module.py#L…
--
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer