In the end it looks like I might be able to get the node up to about 30
osds before it stops creating any more.
Or rather, it formats the disks but freezes up when starting the daemons.
I suspect I'm missing something I can tune to get it working better.
If I could see any error messages, that might help, but I've yet to spot
anything.
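
The places I know to look for clues are something like the following
(osd.0 below is just a placeholder for whichever daemon is stuck):

  ceph log last cephadm        # recent cephadm/orchestrator log entries
  cephadm logs --name osd.0    # that daemon's journal on the host itself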
Peter.
On Wed, 26 May 2021, 10:57 Eugen Block, <eblock(a)nde.ag> wrote:
> If I add the osd daemons one at a time with
>   ceph orch daemon add osd drywood12:/dev/sda
> it does actually work.
Great!
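
If you end up adding them one at a time, a small shell loop keeps it
manageable (the device range is assumed, an untested sketch):

  for dev in /dev/sd{a..l}; do
      ceph orch daemon add osd drywood12:$dev
  done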
> I suspect what's happening is when my rule for creating osds runs and
> creates them all at once, it overloads cephadm and it can't cope.
It's possible, I guess.
> I suspect what I might need to do, at least to work around the issue, is
> set "limit:" and bring it up until it stops working.
It's worth a try, yes, although the docs state you should try to avoid
it. It's possible that it doesn't work properly; in that case, create a
bug report. ;-)
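
For example, adding a limit to the data_devices filter of your spec
(quoted below) might look like this; the value 6 is just an arbitrary
starting point, and I haven't verified the behaviour myself:

  service_type: osd
  service_name: osd.drywood-disks
  placement:
    host_pattern: 'drywood*'
  spec:
    data_devices:
      size: "7TB:"
      limit: 6
    objectstore: bluestore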
> I did work out how to get ceph-volume to nearly work manually.
>   cephadm shell
>   ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring
>   ceph-volume lvm create --data /dev/sda --dmcrypt
> but given I've now got "add osd" to work, I suspect I just need to
> fine-tune my osd creation rules so they do not try to create too many
> osds on the same node at the same time.
I agree, no need to do it manually if there is an automated way,
especially if you're trying to bring up dozens of OSDs.
Zitat von Peter Childs <pchilds(a)bcs.org>:
> After a bit of messing around, I managed to get it somewhat working.
>
> If I add the osd daemons one at a time with
>   ceph orch daemon add osd drywood12:/dev/sda
> it does actually work.
>
> I suspect what's happening is when my rule for creating osds runs and
> creates them all at once, it overloads cephadm and it can't cope.
>
> service_type: osd
> service_name: osd.drywood-disks
> placement:
>   host_pattern: 'drywood*'
> spec:
>   data_devices:
>     size: "7TB:"
>   objectstore: bluestore
>
> I suspect what I might need to do, at least to work around the issue, is
> set "limit:" and bring it up until it stops working.
>
> I did work out how to get ceph-volume to nearly work manually.
>   cephadm shell
>   ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring
>   ceph-volume lvm create --data /dev/sda --dmcrypt
> but given I've now got "add osd" to work, I suspect I just need to
> fine-tune my osd creation rules so they do not try to create too many
> osds on the same node at the same time.
>
>
>
> On Wed, 26 May 2021 at 08:25, Eugen Block <eblock(a)nde.ag> wrote:
>
>> Hi,
>>
>> I believe your current issue is due to a missing keyring for
>> client.bootstrap-osd on the OSD node. But even after fixing that
>> you probably still won't be able to deploy an OSD manually with
>> ceph-volume because 'ceph-volume activate' is not supported with
>> cephadm [1]. I just tried that in a virtual environment, it fails when
>> activating the systemd-unit:
>>
>> ---snip---
>> [2021-05-26 06:47:16,677][ceph_volume.process][INFO ] Running
>> command: /usr/bin/systemctl enable
>> ceph-volume@lvm-8-1a8fc8ae-8f4c-4f91-b044-d5636bb52456
>> [2021-05-26 06:47:16,692][ceph_volume.process][INFO ] stderr Failed
>> to connect to bus: No such file or directory
>> [2021-05-26 06:47:16,693][ceph_volume.devices.lvm.create][ERROR ] lvm
>> activate was unable to complete, while creating the OSD
>> Traceback (most recent call last):
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py", line 32, in create
>>     Activate([]).activate(args)
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
>>     return func(*a, **kw)
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 294, in activate
>>     activate_bluestore(lvs, args.no_systemd)
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 214, in activate_bluestore
>>     systemctl.enable_volume(osd_id, osd_fsid, 'lvm')
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py", line 82, in enable_volume
>>     return enable(volume_unit % (device_type, id_, fsid))
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py", line 22, in enable
>>     process.run(['systemctl', 'enable', unit])
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/process.py", line 153, in run
>>     raise RuntimeError(msg)
>> RuntimeError: command returned non-zero exit status: 1
>> [2021-05-26 06:47:16,694][ceph_volume.devices.lvm.create][INFO ] will
>> rollback OSD ID creation
>> [2021-05-26 06:47:16,697][ceph_volume.process][INFO ] Running
>> command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd
>> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.8
>> --yes-i-really-mean-it
>> [2021-05-26 06:47:17,597][ceph_volume.process][INFO ] stderr purged osd.8
>> ---snip---
>>
>> There's a workaround described in [2], but that's not really an option
>> for dozens of OSDs. I think your best approach is to get cephadm to
>> activate the OSDs for you.
>> You wrote you didn't find any helpful error messages, but did cephadm
>> even try to deploy OSDs? What does your osd spec file look like? Did
>> you explicitly run 'ceph orch apply osd -i specfile.yml'? This should
>> trigger cephadm and you should see at least some output like this:
>>
>> Mai 26 08:21:48 pacific1 conmon[31446]: 2021-05-26T06:21:48.466+0000
>> 7effc15ff700 0 log_channel(cephadm) log [INF] : Applying service
>> osd.ssd-hdd-mix on host pacific2...
>> Mai 26 08:21:49 pacific1 conmon[31009]: cephadm
>> 2021-05-26T06:21:48.469611+0000 mgr.pacific1.whndiw (mgr.14166) 1646 :
>> cephadm [INF] Applying service osd.ssd-hdd-mix on host pacific2...
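>>
>> To check what a spec would do before applying it, there is also a dry
>> run (a sketch, adjust the file name to yours):
>>
>>   ceph orch apply -i specfile.yml --dry-run
>>   ceph log last cephadm
>>
>> The first command previews what cephadm would deploy, the second shows
>> the recent orchestrator log entries.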
>>
>> Regards,
>> Eugen
>>
>> [1] https://tracker.ceph.com/issues/49159
>> [2] https://tracker.ceph.com/issues/46691
>
>
> Zitat von Peter Childs <pchilds(a)bcs.org>:
>
> > Not sure what I'm doing wrong, I suspect it's the way I'm running
> > ceph-volume.
> >
> > root@drywood12:~# cephadm ceph-volume lvm create --data /dev/sda --dmcrypt
> > Inferring fsid 1518c8e0-bbe4-11eb-9772-001e67dc85ea
> > Using recent ceph image ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
> > /usr/bin/docker: Running command: /usr/bin/ceph-authtool --gen-print-key
> > /usr/bin/docker: Running command: /usr/bin/ceph-authtool --gen-print-key
> > /usr/bin/docker: --> RuntimeError: No valid ceph configuration file was loaded.
> > Traceback (most recent call last):
> >   File "/usr/sbin/cephadm", line 8029, in <module>
> >     main()
> >   File "/usr/sbin/cephadm", line 8017, in main
> >     r = ctx.func(ctx)
> >   File "/usr/sbin/cephadm", line 1678, in _infer_fsid
> >     return func(ctx)
> >   File "/usr/sbin/cephadm", line 1738, in _infer_image
> >     return func(ctx)
> >   File "/usr/sbin/cephadm", line 4514, in command_ceph_volume
> >     out, err, code = call_throws(ctx, c.run_cmd(), verbosity=verbosity)
> >   File "/usr/sbin/cephadm", line 1464, in call_throws
> >     raise RuntimeError('Failed command: %s' % ' '.join(command))
> > RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
> > --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk
> > --init -e CONTAINER_IMAGE=ceph/ceph@sha256:54e95ae1e11404157d7b329d0t
> >
> > root@drywood12:~# cephadm shell
> > Inferring fsid 1518c8e0-bbe4-11eb-9772-001e67dc85ea
> > Inferring config /var/lib/ceph/1518c8e0-bbe4-11eb-9772-001e67dc85ea/mon.drywood12/config
> > Using recent ceph image ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
> > root@drywood12:/# ceph-volume lvm create --data /dev/sda --dmcrypt
> > Running command: /usr/bin/ceph-authtool --gen-print-key
> > Running command: /usr/bin/ceph-authtool --gen-print-key
> > Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd
> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> > 70054a5c-c176-463a-a0ac-b44c5db0987c
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable to find
> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or
> > directory
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > AuthRegistry(0x7fdef405b378) no keyring found at
> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable to find
> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or
> > directory
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > AuthRegistry(0x7fdef405ef20) no keyring found at
> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable to find
> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or
> > directory
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > AuthRegistry(0x7fdef8f0bea0) no keyring found at
> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef2d9d700 -1 monclient(hunting):
> > handle_auth_bad_method server allowed_methods [2] but i only support [1]
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef259c700 -1 monclient(hunting):
> > handle_auth_bad_method server allowed_methods [2] but i only support [1]
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef1d9b700 -1 monclient(hunting):
> > handle_auth_bad_method server allowed_methods [2] but i only support [1]
> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 monclient:
> > authenticate NOTE: no keyring found; disabled cephx authentication
> > stderr: [errno 13] RADOS permission denied (error connecting to the
> > cluster)
> > --> RuntimeError: Unable to create a new OSD id
> > root@drywood12:/# lsblk /dev/sda
> > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> > sda 8:0 0 7.3T 0 disk
> >
> > As far as I can see cephadm gets a little further than this, as the disks
> > have lvm volumes on them; it's just that the osd daemons are not created
> > or started. So maybe I'm invoking ceph-volume incorrectly.
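> >
> > In case anyone retries this: I believe the leftover LVs can be listed
> > and wiped before another attempt, along these lines (device name
> > assumed):
> >
> >   cephadm ceph-volume lvm list
> >   cephadm ceph-volume lvm zap /dev/sda --destroy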
> >
> >
> > On Tue, 25 May 2021 at 06:57, Peter Childs <pchilds(a)bcs.org> wrote:
> >
> >>
> >>
> >> On Mon, 24 May 2021, 21:08 Marc, <Marc(a)f1-outsourcing.eu> wrote:
> >>
> >>> >
> >>> > I'm attempting to use cephadm and Pacific, currently on debian buster,
> >>> > mostly because centos7 ain't supported any more and centos8 ain't
> >>> > supported by some of my hardware.
> >>>
> >>> Who says centos7 is not supported any more? Afaik centos7/el7 is being
> >>> supported till its EOL 2024. By then maybe a good alternative for
> >>> el8/stream has surfaced.
> >>>
> >>
> >> Not supported by ceph Pacific, it's our os of choice otherwise.
> >>
> >> My testing says the versions of podman, docker and python3 available
> >> do not work with Pacific.
> >>
> >> Given I've needed to upgrade docker on buster, can we please have a list
> >> of versions that work with cephadm, and maybe even have cephadm say "no,
> >> please upgrade" unless you're running the right version or better.
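> >>
> >> For what it's worth, I believe "cephadm check-host" does some of this
> >> validation (container runtime, systemd, time sync), though as far as I
> >> can tell it doesn't enforce minimum versions:
> >>
> >>   cephadm check-host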
> >>
> >>
> >>
> >>> > Anyway I have a few nodes with 59x 7.2TB disks but for some reason the
> >>> > osd daemons don't start; the disks get formatted and the osds are
> >>> > created but the daemons never come up.
> >>>
> >>> what if you try with
> >>> ceph-volume lvm create --data /dev/sdi --dmcrypt ?
> >>>
> >>
> >> I'll have a go.
> >>
> >>
> >>> > They are probably the wrong spec for ceph (48gb of memory and only 4
> >>> > cores)
> >>>
> >>> You can always start with just configuring a few disks per node. That
> >>> should always work.
> >>>
> >>
> >> That was my thought too.
> >>
> >> Thanks
> >>
> >> Peter
> >>
> >>
> >>> > but I was expecting them to start and be either dirt slow or crash
> >>> > later. Anyway I've got up to 30 of them, so I was hoping to get at
> >>> > least 6PB of raw storage out of them.
> >>> >
> >>> > As yet I've not spotted any helpful error messages.
> >>> >
> >>>
> >>
>
>
>
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io