Hello,
We're planning another batch of OSD nodes for our cluster. Our prior nodes
have been 8 x 12TB SAS drives plus 500GB of NVMe per HDD. Due to market
conditions and the ongoing drive shortage, those 12TB SAS drives are hard
to come by.
Our integrator has offered an option of 8 x 14TB SATA drives (still
enterprise-grade). For Ceph, will the switch from SAS to SATA bring a
performance difference that I should be concerned about?
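Not an answer, but a sketch of how one might quantify the difference before committing: assuming fio is installed and /dev/sdX is a scratch evaluation drive (this write test destroys data on it), single-queue synchronous 4k writes approximate the worst case Ceph puts on an HDD. Run the same test on a sample SAS and a sample SATA drive and compare:

```
fio --name=synctest --filename=/dev/sdX --direct=1 --sync=1 \
    --rw=write --bs=4k --numjobs=1 --iodepth=1 \
    --runtime=60 --time_based
```

With your WAL/DB on NVMe the HDDs see less of this pattern, so treat the numbers as a floor rather than a prediction.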
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
Hello,
I had an OSD drop out a couple days ago. This is 14.2.16, Bluestore, HDD +
NVMe, non-container. The HDD sort of went away. I powered down the node,
reseated the drive, and it came back. However, the OSD won't start.
'systemctl --failed' shows that the lvm2 pvscan unit failed, preventing
the OSD unit from starting.
Running the pvscan activate command manually with --verbose gave
'device-mapper: reload ioctl on (253:7) failed: Read-only file system'. I
have been looking at this for a while, but I can't figure out what is
read-only that is causing the problem. The full output of the pvscan is:
# pvscan --cache --activate ay --verbose '8:48'
pvscan devices on command line.
activation/auto_activation_volume_list configuration setting not
defined: All logical volumes will be auto-activated.
Activating logical volume
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79/osd-block-425faf92-449e-4b57-98f2-a90a7f60e2a4.
activation/volume_list configuration setting not defined: Checking only
host tags for
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79/osd-block-425faf92-449e-4b57-98f2-a90a7f60e2a4.
Creating
ceph--block--b1fea172--71a4--463e--a3e3--8cdcc1bc7b79-osd--block--425faf92--449e--4b57--98f2--a90a7f60e2a4
Loading table for
ceph--block--b1fea172--71a4--463e--a3e3--8cdcc1bc7b79-osd--block--425faf92--449e--4b57--98f2--a90a7f60e2a4
(253:7).
device-mapper: reload ioctl on (253:7) failed: Read-only file system
Removing
ceph--block--b1fea172--71a4--463e--a3e3--8cdcc1bc7b79-osd--block--425faf92--449e--4b57--98f2--a90a7f60e2a4
(253:7)
Activated 0 logical volumes in volume group
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79.
0 logical volume(s) in volume group
"ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79" now active
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79: autoactivation failed.
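In case someone else hits this: 'Read-only file system' from the reload ioctl usually means the underlying block device itself came back write-protected, not that a mounted filesystem is read-only. A diagnostic sketch (the blockdev/pvscan lines are shown as comments since they need the affected host; /dev/sdd is a stand-in for the 8:48 device):

```shell
# Show the kernel read-only flag for each disk (1 = write-protected):
for d in /sys/block/*/ro; do
    [ -e "$d" ] && printf '%s: %s\n' "$d" "$(cat "$d")"
done
# If the affected disk (8:48 -> e.g. /dev/sdd) shows 1 and the drive is
# actually healthy, clear the flag and retry the activation:
#   blockdev --setrw /dev/sdd
#   pvscan --cache --activate ay --verbose '8:48'
```

dmesg around the hotplug time usually says why the kernel set the flag (write-protect, media errors).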
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
I am seeing huge RAM usage while my bucket delete churns through
left-over multiparts. I realize there are *many* of them, being
aborted 1000 at a time, like this:
2021-06-03 07:29:06.408 7f9b7f633240 0 abort_bucket_multiparts
WARNING : aborted 254000 incomplete multipart uploads
...my first run ended with radosgw-admin going out-of-memory, so it seems
some part of this keeps data around, or forgets to free old parts of the
lists after they have been cancelled? I also don't know whether restarting
means it iterates over the same ones again, or whether this log line means
254k are already processed, so that if I need to restart, it will at least
have kept a few hours of progress.
At this point, RES is 2.1g so roughly 10000 bytes per "entry" if it is
linear somehow.
ceph 13.2.10 in this case.
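For what it's worth, the per-entry estimate is in the right ballpark; a quick sketch of the arithmetic (assuming RES is 2.1 GiB and growth is linear in aborted entries):

```shell
# bytes resident per aborted multipart entry, integer arithmetic
res_bytes=$(( 21 * 1024 * 1024 * 1024 / 10 ))  # 2.1 GiB
entries=254000
echo $(( res_bytes / entries ))                # 8877, i.e. ~9 KB per entry
```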
--
May the most significant bit of your life be positive.
Dear Ceph users,
I want to set up Ceph in our production environment on Ubuntu.
Could anyone who has done this share the tutorial they used?
Best regards.
hi folks,
i am glad to announce that, with the help of Sven and Chris, the
ceph-client homebrew formula for macOS has been updated with a recent
master commit. this formula provides some essential client-side tools and
libraries which let us talk to a ceph cluster from a macOS machine,
like:
ceph
ceph-conf
ceph-fuse
rados
rbd
in addition to the executables and libraries, the formula also packages
the header files. so, if you are up for doing some development with
librados, librbd and libcephfs, you can also get the necessary bits
from the formula. see
https://github.com/mulbc/homebrew-ceph-client
it comes with a pre-bottled formula for Big Sur on amd64.
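for anyone wanting to try it, installation should look like the usual tap flow (the formula name here is my assumption from the repository name - check the tap's README):

```
brew tap mulbc/ceph-client
brew install ceph-client   # formula name assumed; see the tap's README
ceph --version
```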
the last release[0] of the formula was packaged 3 years ago with
mimic-13.2.2. please note, this is not official - do not expect frequent
updates or support from the Ceph project.
cheers,
--
[0] https://ceph-users.ceph.narkive.com/bnyGalkH/ceph-client-libraries-for-osx
--
Regards
Kefu Chai
Hi all, I have observed that the MDS Cache Configuration has 18 parameters:
mds_cache_memory_limit
mds_cache_reservation
mds_health_cache_threshold
mds_cache_trim_threshold
mds_cache_trim_decay_rate
mds_recall_max_caps
mds_recall_max_decay_threshold
mds_recall_max_decay_rate
mds_recall_global_max_decay_threshold
mds_recall_warning_threshold
mds_recall_warning_decay_rate
mds_session_cap_acquisition_throttle
mds_session_cap_acquisition_decay_rate
mds_session_max_caps_throttle_ratio
mds_cap_acquisition_throttle_retry_request_timeout
mds_session_cache_liveness_magnitude
mds_session_cache_liveness_decay_rate
mds_max_caps_per_client
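To make the first three a bit less cryptic, here is my understanding of how they interact, sketched with what I believe are the defaults (mds_cache_memory_limit = 4 GiB, mds_cache_reservation = 0.05, mds_health_cache_threshold = 1.5 - please double-check against your release):

```shell
limit=$(( 4 * 1024 * 1024 * 1024 ))  # mds_cache_memory_limit: 4 GiB
# mds_cache_reservation = 0.05: the MDS tries to keep 5% of the limit
# free, so it aims to stay under:
echo $(( limit * 95 / 100 ))         # 4080218931 bytes (~3.8 GiB)
# mds_health_cache_threshold = 1.5: a health warning is raised once
# actual usage exceeds:
echo $(( limit * 3 / 2 ))            # 6442450944 bytes (6 GiB)
```

The recall/decay parameters then govern how aggressively the MDS asks clients to release caps when it is over its target.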
I find the Ceph documentation in this section a bit cryptic and I have
tried to find some resources that talk about how to tune these
parameters, but without success.
Does anyone have experience adjusting these parameters according to the
characteristics of the Ceph cluster itself, the hardware, and the MDS
workload?
Regards!
--
*******************************************************
Andrés Rojas Guerrero
Unidad Sistemas Linux
Area Arquitectura Tecnológica
Secretaría General Adjunta de Informática
Consejo Superior de Investigaciones Científicas (CSIC)
Pinar 19
28006 - Madrid
Tel: +34 915680059 -- Ext. 990059
email: a.rojas(a)csic.es
ID comunicate.csic.es: @50852720l:matrix.csic.es
*******************************************************
Hi,
is it normal that 'radosgw-admin user info --uid=user ...' takes around 3s
or more?
Other radosgw-admin commands are also taking quite a lot of time.
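In case it helps narrow it down, a sketch of how one might check where the time goes (these are the standard Ceph debug flags; timing just uses the shell built-in):

```
time radosgw-admin user info --uid=user
# re-run with verbose rgw/messenger logging to see which calls stall:
radosgw-admin user info --uid=user --debug-rgw=20 --debug-ms=1 2>debug.log
```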
Kind regards,
Rok
Ceph 16.2.4. I was having an issue where I put a server into maintenance mode, and afterwards the containers for the iSCSI gateway were not running, so I decided to redeploy the service. This put all the servers running iSCSI into a state where it looked like ceph orch was trying to delete the container but was stuck. My only recourse was to reboot the servers. I ended up doing a 'ceph orch rm iscsi.iscsi' to remove the services entirely and then tried to redeploy. When I do this, I see the following in the cephadm logs on the servers where the iSCSI gateway is being deployed:
2021-06-01 19:48:15,110 INFO Deploy daemon iscsi.iscsi.cxcto-c240-j27-02.zeypah ...
2021-06-01 19:48:15,111 DEBUG Running command: /bin/docker run --rm --ipc=host --net=host --entrypoint stat --init -e CONTAINER_IMAGE=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -e NODE_NAME=cxcto-c240-j27-02.cisco.com -e CEPH_USE_RANDOM_NONCE=1 docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -c %u %g /var/lib/ceph
2021-06-01 19:48:15,529 DEBUG stat: 167 167
Later in the logs I see:
2021-06-01 19:48:25,933 DEBUG Running command: /bin/docker inspect --format {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index .Config.Labels "io.ceph.version"}} ceph-a67d529e-ba7f-11eb-940b-5c838f8013a5-iscsi.iscsi.cxcto-c240-j27-02.zeypah
2021-06-01 19:48:25,984 DEBUG /bin/docker:
2021-06-01 19:48:25,984 DEBUG /bin/docker: Error: No such object: ceph-a67d529e-ba7f-11eb-940b-5c838f8013a5-iscsi.iscsi.cxcto-c240-j27-02.zeypah
Obviously no such object because the container creation failed.
If I try to run that command that is in the logs manually, I get:
[root@cxcto-c240-j27-02 ceph]# /bin/docker run --rm --ipc=host --net=host --entrypoint stat --init -e CONTAINER_IMAGE=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -e NODE_NAME=cxcto-c240-j27-02.cisco.com -e CEPH_USE_RANDOM_NONCE=1 docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -c %u %g /var/lib/ceph
stat: cannot stat '%g': No such file or directory
167
So the 167 seems to line up with what's showing up in the logs. I'm not clear on what the deal is with the %g. What is supposed to be in that placeholder? Any thoughts on why this is failing?
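For what it's worth, I believe %u and %g are just stat's format sequences for the owner's uid and gid; cephadm most likely passes '%u %g' as a single argument, and the log rendering drops the quoting, so re-running the line verbatim makes the shell split it into two words and stat then treats '%g' as a filename. A quick sketch:

```shell
# %u = owner uid, %g = owner gid in stat's -c format string.
dir=${CEPH_DIR:-/var/lib/ceph}   # fall back if run outside a ceph host
[ -d "$dir" ] || dir=/tmp
stat -c '%u %g' "$dir"           # quoted: prints "uid gid", e.g. "167 167"
stat -c %u %g "$dir" || true     # unquoted: '%g' is treated as a filename
```

So the manual failure is probably a quoting artifact, and the "167 167" in the cephadm log suggests that step actually succeeded (167 is the ceph uid/gid inside the container).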
Right now all my iSCSI gateways are down and basically my whole environment is down as a result 🙁
-Paul
Peter,
We're seeing the same issues as you are. We have 2 new hosts (Intel(R)
Xeon(R) Gold 6248R CPU @ 3.00GHz w/ 48 cores, 384GB RAM, and 60x 10TB SED
drives each) and we have tried both 15.2.13 and 16.2.4.
Cephadm does NOT properly deploy and activate OSDs on Ubuntu 20.04.2 with
Docker.
It seems to be a bug in cephadm and a product regression, as we have 4
nearly identical nodes on CentOS running Nautilus (240 x 10TB SED drives)
and had no problems.
FWIW we had no luck yet with one-by-one OSD daemon additions through ceph
orch either. We also reproduced the issue easily in a virtual lab using
small virtual disks on a single ceph VM with 1 mon.
We are now looking into whether we can get past this with a manual buildout.
If you, or anyone, has hit the same stumbling block and gotten past it, I
would really appreciate some guidance.
Thanks,
Marco
On Thu, May 27, 2021 at 2:23 PM Peter Childs <pchilds(a)bcs.org> wrote:
> In the end it looks like I might be able to get the node up to about 30
> OSDs before it stops creating any more.
>
> Or more it formats the disks but freezes up starting the daemons.
>
> I suspect I'm missing something I can tune to get it working better.
>
> If I could see any error messages that might help, but I'm yet to spot
> anything.
>
> Peter.
>
> On Wed, 26 May 2021, 10:57 Eugen Block, <eblock(a)nde.ag> wrote:
>
> > > If I add the osd daemons one at a time with
> > >
> > > ceph orch daemon add osd drywood12:/dev/sda
> > >
> > > It does actually work,
> >
> > Great!
> >
> > > I suspect what's happening is that when my rule for creating osds runs
> > > and creates them all at once, it overloads cephadm and it can't cope.
> >
> > It's possible, I guess.
> >
> > > I suspect what I might need to do, at least to work around the issue,
> > > is set "limit:" and raise it until it stops working.
> >
> > It's worth a try, yes, although the docs state you should try to avoid
> > it; it's possible that it doesn't work properly - in that case, create
> > a bug report. ;-)
> >
> > > I did work out how to get ceph-volume to nearly work manually.
> > >
> > > cephadm shell
> > > ceph auth get client.bootstrap-osd -o
> > > /var/lib/ceph/bootstrap-osd/ceph.keyring
> > > ceph-volume lvm create --data /dev/sda --dmcrypt
> > >
> > > but given I've now got "add osd" to work, I suspect I just need to
> > > fine-tune my osd creation rules, so it does not try to create too many
> > > osds on the same node at the same time.
> >
> > I agree, no need to do it manually if there is an automated way,
> > especially if you're trying to bring up dozens of OSDs.
> >
> >
> > Zitat von Peter Childs <pchilds(a)bcs.org>:
> >
> > > After a bit of messing around. I managed to get it somewhat working.
> > >
> > > If I add the osd daemons one at a time with
> > >
> > > ceph orch daemon add osd drywood12:/dev/sda
> > >
> > > It does actually work,
> > >
> > > I suspect what's happening is that when my rule for creating osds runs
> > > and creates them all at once, it overloads cephadm and it can't cope.
> > >
> > > service_type: osd
> > > service_name: osd.drywood-disks
> > > placement:
> > >   host_pattern: 'drywood*'
> > > spec:
> > >   data_devices:
> > >     size: "7TB:"
> > >   objectstore: bluestore
> > >
> > > I suspect what I might need to do, at least to work around the issue,
> > > is set "limit:" and raise it until it stops working.
> > >
> > > I did work out how to get ceph-volume to nearly work manually.
> > >
> > > cephadm shell
> > > ceph auth get client.bootstrap-osd -o
> > > /var/lib/ceph/bootstrap-osd/ceph.keyring
> > > ceph-volume lvm create --data /dev/sda --dmcrypt
> > >
> > > but given I've now got "add osd" to work, I suspect I just need to
> > > fine-tune my osd creation rules, so it does not try to create too many
> > > osds on the same node at the same time.
> > >
> > >
> > >
> > > On Wed, 26 May 2021 at 08:25, Eugen Block <eblock(a)nde.ag> wrote:
> > >
> > >> Hi,
> > >>
> > >> I believe your current issue is due to a missing keyring for
> > >> client.bootstrap-osd on the OSD node. But even after fixing that
> > >> you probably still won't be able to deploy an OSD manually with
> > >> ceph-volume because 'ceph-volume activate' is not supported with
> > >> cephadm [1]. I just tried that in a virtual environment, it fails when
> > >> activating the systemd-unit:
> > >>
> > >> ---snip---
> > >> [2021-05-26 06:47:16,677][ceph_volume.process][INFO ] Running
> > >> command: /usr/bin/systemctl enable
> > >> ceph-volume@lvm-8-1a8fc8ae-8f4c-4f91-b044-d5636bb52456
> > >> [2021-05-26 06:47:16,692][ceph_volume.process][INFO ] stderr Failed
> > >> to connect to bus: No such file or directory
> > >> [2021-05-26 06:47:16,693][ceph_volume.devices.lvm.create][ERROR ] lvm
> > >> activate was unable to complete, while creating the OSD
> > >> Traceback (most recent call last):
> > >> File
> > >> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py",
> > >> line 32, in create
> > >> Activate([]).activate(args)
> > >> File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py",
> > >> line 16, in is_root
> > >> return func(*a, **kw)
> > >> File
> > >>
> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py",
> > >> line
> > >> 294, in activate
> > >> activate_bluestore(lvs, args.no_systemd)
> > >> File
> > >>
> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py",
> > >> line
> > >> 214, in activate_bluestore
> > >> systemctl.enable_volume(osd_id, osd_fsid, 'lvm')
> > >> File
> > >> "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py",
> > >> line 82, in enable_volume
> > >> return enable(volume_unit % (device_type, id_, fsid))
> > >> File
> > >> "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py",
> > >> line 22, in enable
> > >> process.run(['systemctl', 'enable', unit])
> > >> File "/usr/lib/python3.6/site-packages/ceph_volume/process.py",
> > >> line 153, in run
> > >> raise RuntimeError(msg)
> > >> RuntimeError: command returned non-zero exit status: 1
> > >> [2021-05-26 06:47:16,694][ceph_volume.devices.lvm.create][INFO ] will
> > >> rollback OSD ID creation
> > >> [2021-05-26 06:47:16,697][ceph_volume.process][INFO ] Running
> > >> command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd
> > >> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.8
> > >> --yes-i-really-mean-it
> > >> [2021-05-26 06:47:17,597][ceph_volume.process][INFO ] stderr purged
> > osd.8
> > >> ---snip---
> > >>
> > >> There's a workaround described in [2] that's not really an option for
> > >> dozens of OSDs. I think your best approach is to get cephadm to
> > >> activate the OSDs for you.
> > >> You wrote you didn't find any helpful error messages, but did cephadm
> > >> even try to deploy OSDs? What does your osd spec file look like? Did
> > >> you explicitly run 'ceph orch apply osd -i specfile.yml'? This should
> > >> trigger cephadm and you should see at least some output like this:
> > >>
> > >> Mai 26 08:21:48 pacific1 conmon[31446]: 2021-05-26T06:21:48.466+0000
> > >> 7effc15ff700 0 log_channel(cephadm) log [INF] : Applying service
> > >> osd.ssd-hdd-mix on host pacific2...
> > >> Mai 26 08:21:49 pacific1 conmon[31009]: cephadm
> > >> 2021-05-26T06:21:48.469611+0000 mgr.pacific1.whndiw (mgr.14166) 1646 :
> > >> cephadm [INF] Applying service osd.ssd-hdd-mix on host pacific2...
> > >>
> > >> Regards,
> > >> Eugen
> > >>
> > >> [1] https://tracker.ceph.com/issues/49159
> > >> [2] https://tracker.ceph.com/issues/46691
> > >>
> > >>
> > >> Zitat von Peter Childs <pchilds(a)bcs.org>:
> > >>
> > >> > Not sure what I'm doing wrong, I suspect it's the way I'm running
> > >> > ceph-volume.
> > >> >
> > >> > root@drywood12:~# cephadm ceph-volume lvm create --data /dev/sda
> > >> --dmcrypt
> > >> > Inferring fsid 1518c8e0-bbe4-11eb-9772-001e67dc85ea
> > >> > Using recent ceph image ceph/ceph@sha256
> > >> > :54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
> > >> > /usr/bin/docker: Running command: /usr/bin/ceph-authtool
> > --gen-print-key
> > >> > /usr/bin/docker: Running command: /usr/bin/ceph-authtool
> > --gen-print-key
> > >> > /usr/bin/docker: --> RuntimeError: No valid ceph configuration file
> > was
> > >> > loaded.
> > >> > Traceback (most recent call last):
> > >> > File "/usr/sbin/cephadm", line 8029, in <module>
> > >> > main()
> > >> > File "/usr/sbin/cephadm", line 8017, in main
> > >> > r = ctx.func(ctx)
> > >> > File "/usr/sbin/cephadm", line 1678, in _infer_fsid
> > >> > return func(ctx)
> > >> > File "/usr/sbin/cephadm", line 1738, in _infer_image
> > >> > return func(ctx)
> > >> > File "/usr/sbin/cephadm", line 4514, in command_ceph_volume
> > >> > out, err, code = call_throws(ctx, c.run_cmd(),
> > verbosity=verbosity)
> > >> > File "/usr/sbin/cephadm", line 1464, in call_throws
> > >> > raise RuntimeError('Failed command: %s' % ' '.join(command))
> > >> > RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
> > >> > --net=host --entrypoint /usr/sbin/ceph-volume --privileged
> > >> --group-add=disk
> > >> > --init -e CONTAINER_IMAGE=ceph/ceph@sha256
> :54e95ae1e11404157d7b329d0t
> > >> >
> > >> > root@drywood12:~# cephadm shell
> > >> > Inferring fsid 1518c8e0-bbe4-11eb-9772-001e67dc85ea
> > >> > Inferring config
> > >> >
> > /var/lib/ceph/1518c8e0-bbe4-11eb-9772-001e67dc85ea/mon.drywood12/config
> > >> > Using recent ceph image ceph/ceph@sha256
> > >> > :54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
> > >> > root@drywood12:/# ceph-volume lvm create --data /dev/sda --dmcrypt
> > >> > Running command: /usr/bin/ceph-authtool --gen-print-key
> > >> > Running command: /usr/bin/ceph-authtool --gen-print-key
> > >> > Running command: /usr/bin/ceph --cluster ceph --name
> > client.bootstrap-osd
> > >> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> > >> > 70054a5c-c176-463a-a0ac-b44c5db0987c
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
> to
> > >> find
> > >> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
> > file
> > >> or
> > >> > directory
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > >> > AuthRegistry(0x7fdef405b378) no keyring found at
> > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
> to
> > >> find
> > >> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
> > file
> > >> or
> > >> > directory
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > >> > AuthRegistry(0x7fdef405ef20) no keyring found at
> > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
> to
> > >> find
> > >> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
> > file
> > >> or
> > >> > directory
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > >> > AuthRegistry(0x7fdef8f0bea0) no keyring found at
> > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef2d9d700 -1
> > monclient(hunting):
> > >> > handle_auth_bad_method server allowed_methods [2] but i only support
> > [1]
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef259c700 -1
> > monclient(hunting):
> > >> > handle_auth_bad_method server allowed_methods [2] but i only support
> > [1]
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef1d9b700 -1
> > monclient(hunting):
> > >> > handle_auth_bad_method server allowed_methods [2] but i only support
> > [1]
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 monclient:
> > >> > authenticate NOTE: no keyring found; disabled cephx authentication
> > >> > stderr: [errno 13] RADOS permission denied (error connecting to the
> > >> > cluster)
> > >> > --> RuntimeError: Unable to create a new OSD id
> > >> > root@drywood12:/# lsblk /dev/sda
> > >> > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> > >> > sda 8:0 0 7.3T 0 disk
> > >> >
> > >> > As far as I can see cephadm gets a little further than this, as the
> > >> > disks have lvm volumes on them; it's just that the OSD daemons are
> > >> > not created or started.
> > >> > So maybe I'm invoking ceph-volume incorrectly.
> > >> >
> > >> >
> > >> > On Tue, 25 May 2021 at 06:57, Peter Childs <pchilds(a)bcs.org> wrote:
> > >> >
> > >> >>
> > >> >>
> > >> >> On Mon, 24 May 2021, 21:08 Marc, <Marc(a)f1-outsourcing.eu> wrote:
> > >> >>
> > >> >>> >
> > >> >>> > I'm attempting to use cephadm and Pacific, currently on debian
> > >> >>> > buster, mostly because centos7 ain't supported any more and
> > >> >>> > centos8 ain't supported by some of my hardware.
> > >> >>>
> > >> >>> Who says centos7 is not supported any more? Afaik centos7/el7 is
> > >> >>> being supported till its EOL in 2024. By then maybe a good
> > >> >>> alternative for el8/stream will have surfaced.
> > >> >>>
> > >> >>
> > >> >> Not supported by ceph Pacific, it's our os of choice otherwise.
> > >> >>
> > >> >> My testing says the versions of podman, docker and python3
> > >> >> available there do not work with Pacific.
> > >> >>
> > >> >> Given I've needed to upgrade docker on buster, can we please have
> > >> >> a list of versions that work with cephadm, and maybe even have
> > >> >> cephadm say "no, please upgrade" unless you're running the right
> > >> >> version or better.
> > >> >>
> > >> >>
> > >> >>
> > >> >>> > Anyway I have a few nodes with 59x 7.2TB disks but for some
> > >> >>> > reason the osd daemons don't start; the disks get formatted and
> > >> >>> > the osds are created, but the daemons never come up.
> > >> >>>
> > >> >>> what if you try with
> > >> >>> ceph-volume lvm create --data /dev/sdi --dmcrypt ?
> > >> >>>
> > >> >>
> > >> >> I'll have a go.
> > >> >>
> > >> >>
> > >> >>> > They are probably the wrong spec for ceph (48GB of memory and
> > >> >>> > only 4 cores)
> > >> >>>
> > >> >>> You can always start with just configuring a few disks per node.
> > That
> > >> >>> should always work.
> > >> >>>
> > >> >>
> > >> >> That was my thought too.
> > >> >>
> > >> >> Thanks
> > >> >>
> > >> >> Peter
> > >> >>
> > >> >>
> > >> >>> > but I was expecting them to start and be either dirt slow or
> > >> >>> > crash later. anyway, I've got up to 30 of them, so I was hoping
> > >> >>> > to get at least 6PB of raw storage out of them.
> > >> >>> >
> > >> >>> > As yet I've not spotted any helpful error messages.
> > >> >>> >
> > >> >>> _______________________________________________
> > >> >>> ceph-users mailing list -- ceph-users(a)ceph.io
> > >> >>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
> > >> >>>
> > >> >>
> > >>
> > >>
> > >>
> >
> >
> >
>
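For anyone finding this thread later: the throttle discussed above is the 'limit' filter in the OSD service spec. A sketch based on Peter's spec (the value 10 is arbitrary, and the docs recommend avoiding 'limit' where possible):

```yaml
service_type: osd
service_name: osd.drywood-disks
placement:
  host_pattern: 'drywood*'
spec:
  data_devices:
    size: "7TB:"
    limit: 10        # consume at most 10 matching devices (per host)
  objectstore: bluestore
```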
This morning I tried adding a mon node to my home Ceph cluster with the following command:
ceph orch daemon add mon ether
This seemed to work at first, but then it decided to remove it fairly quickly, which broke the cluster because the mon. keyring was also removed:
2021-06-01T14:16:11.523210+0000 mgr.paris.glbvov [INF] Deploying daemon mon.ether on ether
2021-06-01T14:16:43.621759+0000 mgr.paris.glbvov [INF] Safe to remove mon.ether: not in monmap (['paris', 'excalibur'])
2021-06-01T14:16:43.622135+0000 mgr.paris.glbvov [INF] Removing monitor ether from monmap...
2021-06-01T14:16:43.641365+0000 mgr.paris.glbvov [INF] Removing daemon mon.ether from ether
2021-06-01T14:16:46.610283+0000 mgr.paris.glbvov [INF] Removing key for mon.
Digging into this, it seems like this line might need to check for 'mon.' and not 'mon':
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/ce…
Anyways, does anyone know how to import the mon. keyring again once it has been removed?
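(Not an authoritative answer, but a surviving monitor should still have a copy of the mon. key in its on-disk keyring, so something like this sketch may work - cephadm layout assumed; FSID and the mon name 'paris' are placeholders:)

```
# copy the keyring from a still-running mon's data directory
cp /var/lib/ceph/FSID/mon.paris/keyring /tmp/mon.keyring
# re-import the [mon.] entry into the cluster auth database
ceph auth import -i /tmp/mon.keyring
```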
Thanks,
Bryan