Hello,
We're planning another batch of OSD nodes for our cluster. Our prior nodes
have been 8 x 12TB SAS drives plus 500GB of NVMe per HDD. Due to market
conditions and the ongoing drive shortage, those 12TB SAS drives are hard
to come by.
Our integrator has offered an option of 8 x 14TB SATA drives (still
enterprise-grade). For Ceph, will the switch from SAS to SATA bring a
performance difference that I should be concerned about?
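Not an answer, but a sketch of how one might quantify the difference before committing: assuming fio is installed and /dev/sdX is a scratch evaluation drive (this write test destroys data on it), single-queue synchronous 4k writes approximate the worst case Ceph puts on an HDD. Run the same test on a sample SAS and a sample SATA drive and compare:

```
fio --name=synctest --filename=/dev/sdX --direct=1 --sync=1 \
    --rw=write --bs=4k --numjobs=1 --iodepth=1 \
    --runtime=60 --time_based
```

With your WAL/DB on NVMe the HDDs see less of this pattern, so treat the numbers as a floor rather than a prediction.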
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
Hello,
I had an OSD drop out a couple days ago. This is 14.2.16, Bluestore, HDD +
NVMe, non-container. The HDD sort of went away. I powered down the node,
reseated the drive, and it came back. However, the OSD won't start.
'systemctl --failed' shows that the lvm2 pvscan unit failed, preventing
the OSD unit from starting.
Running the pvscan activate command manually with --verbose gave
'device-mapper: reload ioctl on (253:7) failed: Read-only file system'. I
have been looking at this for a while, but I can't figure out what is
read-only that is causing the problem. The full output of the pvscan is:
# pvscan --cache --activate ay --verbose '8:48'
pvscan devices on command line.
activation/auto_activation_volume_list configuration setting not
defined: All logical volumes will be auto-activated.
Activating logical volume
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79/osd-block-425faf92-449e-4b57-98f2-a90a7f60e2a4.
activation/volume_list configuration setting not defined: Checking only
host tags for
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79/osd-block-425faf92-449e-4b57-98f2-a90a7f60e2a4.
Creating
ceph--block--b1fea172--71a4--463e--a3e3--8cdcc1bc7b79-osd--block--425faf92--449e--4b57--98f2--a90a7f60e2a4
Loading table for
ceph--block--b1fea172--71a4--463e--a3e3--8cdcc1bc7b79-osd--block--425faf92--449e--4b57--98f2--a90a7f60e2a4
(253:7).
device-mapper: reload ioctl on (253:7) failed: Read-only file system
Removing
ceph--block--b1fea172--71a4--463e--a3e3--8cdcc1bc7b79-osd--block--425faf92--449e--4b57--98f2--a90a7f60e2a4
(253:7)
Activated 0 logical volumes in volume group
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79.
0 logical volume(s) in volume group
"ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79" now active
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79: autoactivation failed.
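In case someone else hits this: 'Read-only file system' from the reload ioctl usually means the underlying block device itself came back write-protected, not that a mounted filesystem is read-only. A diagnostic sketch (the blockdev/pvscan lines are shown as comments since they need the affected host; /dev/sdd is a stand-in for the 8:48 device):

```shell
# Show the kernel read-only flag for each disk (1 = write-protected):
for d in /sys/block/*/ro; do
    [ -e "$d" ] && printf '%s: %s\n' "$d" "$(cat "$d")"
done
# If the affected disk (8:48 -> e.g. /dev/sdd) shows 1 and the drive is
# actually healthy, clear the flag and retry the activation:
#   blockdev --setrw /dev/sdd
#   pvscan --cache --activate ay --verbose '8:48'
```

dmesg around the hotplug time usually says why the kernel set the flag (write-protect, media errors).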
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
I am seeing huge RAM usage while my bucket delete churns through
left-over multiparts. I realize there are *many* of them, being
aborted 1000 at a time, like this:
2021-06-03 07:29:06.408 7f9b7f633240 0 abort_bucket_multiparts
WARNING : aborted 254000 incomplete multipart uploads
...my first run ended with radosgw-admin going out-of-memory, so it seems
some part of this keeps data around, or forgets to free old parts of the
lists after they have been cancelled? I also don't know whether restarting
means it iterates over the same ones again, or whether this log line means
254k are already processed, so that if I need to restart, it will at least
have kept a few hours of progress.
At this point, RES is 2.1g so roughly 10000 bytes per "entry" if it is
linear somehow.
ceph 13.2.10 in this case.
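For what it's worth, the per-entry estimate is in the right ballpark; a quick sketch of the arithmetic (assuming RES is 2.1 GiB and growth is linear in aborted entries):

```shell
# bytes resident per aborted multipart entry, integer arithmetic
res_bytes=$(( 21 * 1024 * 1024 * 1024 / 10 ))  # 2.1 GiB
entries=254000
echo $(( res_bytes / entries ))                # 8877, i.e. ~9 KB per entry
```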
--
May the most significant bit of your life be positive.
Dear Ceph users,
I want to set up Ceph in our production environment on Ubuntu.
Could anyone who has done this share the tutorial they used?
Best regards.
hi folks,
i am glad to announce that, with the help of Sven and Chris, the
ceph-client homebrew formula for macOS has been updated with a recent
master commit. this formula provides some essential client-side tools and
libraries which let us talk to a ceph cluster from a macOS machine,
like:
ceph
ceph-conf
ceph-fuse
rados
rbd
in addition to the executables and libraries, the formula also packages
the header files. so, if you are up for doing some development with
librados, librbd and libcephfs, you can also get the necessary bits
from the formula. see
https://github.com/mulbc/homebrew-ceph-client
it comes with a pre-bottled formula for Big Sur on amd64.
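for anyone wanting to try it, installation should look like the usual tap flow (the formula name here is my assumption from the repository name - check the tap's README):

```
brew tap mulbc/ceph-client
brew install ceph-client   # formula name assumed; see the tap's README
ceph --version
```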
the last release[0] of the formula was packaged 3 years ago with
mimic-13.2.2. please note, this is not official - do not expect frequent
updates or support from the Ceph project.
cheers,
--
[0] https://ceph-users.ceph.narkive.com/bnyGalkH/ceph-client-libraries-for-osx
--
Regards
Kefu Chai
Hi all, I have observed that the MDS Cache Configuration has 18 parameters:
mds_cache_memory_limit
mds_cache_reservation
mds_health_cache_threshold
mds_cache_trim_threshold
mds_cache_trim_decay_rate
mds_recall_max_caps
mds_recall_max_decay_threshold
mds_recall_max_decay_rate
mds_recall_global_max_decay_threshold
mds_recall_warning_threshold
mds_recall_warning_decay_rate
mds_session_cap_acquisition_throttle
mds_session_cap_acquisition_decay_rate
mds_session_max_caps_throttle_ratio
mds_cap_acquisition_throttle_retry_request_timeout
mds_session_cache_liveness_magnitude
mds_session_cache_liveness_decay_rate
mds_max_caps_per_client
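To make the first three a bit less cryptic, here is my understanding of how they interact, sketched with what I believe are the defaults (mds_cache_memory_limit = 4 GiB, mds_cache_reservation = 0.05, mds_health_cache_threshold = 1.5 - please double-check against your release):

```shell
limit=$(( 4 * 1024 * 1024 * 1024 ))  # mds_cache_memory_limit: 4 GiB
# mds_cache_reservation = 0.05: the MDS tries to keep 5% of the limit
# free, so it aims to stay under:
echo $(( limit * 95 / 100 ))         # 4080218931 bytes (~3.8 GiB)
# mds_health_cache_threshold = 1.5: a health warning is raised once
# actual usage exceeds:
echo $(( limit * 3 / 2 ))            # 6442450944 bytes (6 GiB)
```

The recall/decay parameters then govern how aggressively the MDS asks clients to release caps when it is over its target.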
I find the Ceph documentation in this section a bit cryptic and I have
tried to find some resources that talk about how to tune these
parameters, but without success.
Does anyone have experience adjusting these parameters according to the
characteristics of the Ceph cluster itself, the hardware, and the MDS
workload?
Regards!
--
*******************************************************
Andrés Rojas Guerrero
Unidad Sistemas Linux
Area Arquitectura Tecnológica
Secretaría General Adjunta de Informática
Consejo Superior de Investigaciones Científicas (CSIC)
Pinar 19
28006 - Madrid
Tel: +34 915680059 -- Ext. 990059
email: a.rojas(a)csic.es
ID comunicate.csic.es: @50852720l:matrix.csic.es
*******************************************************
Hi,
is it normal that 'radosgw-admin user info --uid=user ...' takes around 3s
or more?
Other radosgw-admin commands are also taking quite a lot of time.
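In case it helps narrow it down, a sketch of how one might check where the time goes (these are the standard Ceph debug flags; timing just uses the shell built-in):

```
time radosgw-admin user info --uid=user
# re-run with verbose rgw/messenger logging to see which calls stall:
radosgw-admin user info --uid=user --debug-rgw=20 --debug-ms=1 2>debug.log
```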
Kind regards,
Rok
Ceph 16.2.4. I was having an issue where I put a server into maintenance mode, and afterwards the containers for the iSCSI gateway were not running, so I decided to redeploy the service. This put all the servers running iSCSI into a state where it looked like ceph orch was trying to delete the container but was stuck. My only recourse was to reboot the servers. I ended up doing a 'ceph orch rm iscsi.iscsi' to remove the services entirely and then tried to redeploy. When I do this, I see the following in the cephadm logs on the servers where the iSCSI gateway is being deployed:
2021-06-01 19:48:15,110 INFO Deploy daemon iscsi.iscsi.cxcto-c240-j27-02.zeypah ...
2021-06-01 19:48:15,111 DEBUG Running command: /bin/docker run --rm --ipc=host --net=host --entrypoint stat --init -e CONTAINER_IMAGE=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -e NODE_NAME=cxcto-c240-j27-02.cisco.com -e CEPH_USE_RANDOM_NONCE=1 docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -c %u %g /var/lib/ceph
2021-06-01 19:48:15,529 DEBUG stat: 167 167
Later in the logs I see:
2021-06-01 19:48:25,933 DEBUG Running command: /bin/docker inspect --format {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index .Config.Labels "io.ceph.version"}} ceph-a67d529e-ba7f-11eb-940b-5c838f8013a5-iscsi.iscsi.cxcto-c240-j27-02.zeypah
2021-06-01 19:48:25,984 DEBUG /bin/docker:
2021-06-01 19:48:25,984 DEBUG /bin/docker: Error: No such object: ceph-a67d529e-ba7f-11eb-940b-5c838f8013a5-iscsi.iscsi.cxcto-c240-j27-02.zeypah
Obviously no such object because the container creation failed.
If I try to run that command that is in the logs manually, I get:
[root@cxcto-c240-j27-02 ceph]# /bin/docker run --rm --ipc=host --net=host --entrypoint stat --init -e CONTAINER_IMAGE=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -e NODE_NAME=cxcto-c240-j27-02.cisco.com -e CEPH_USE_RANDOM_NONCE=1 docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -c %u %g /var/lib/ceph
stat: cannot stat '%g': No such file or directory
167
So the 167 seems to line up with what's showing up in the logs. I'm not clear on what the deal is with the %g. What is supposed to be in that placeholder? Any thoughts on why this is failing?
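For what it's worth, I believe %u and %g are just stat's format sequences for the owner's uid and gid; cephadm most likely passes '%u %g' as a single argument, and the log rendering drops the quoting, so re-running the line verbatim makes the shell split it into two words and stat then treats '%g' as a filename. A quick sketch:

```shell
# %u = owner uid, %g = owner gid in stat's -c format string.
dir=${CEPH_DIR:-/var/lib/ceph}   # fall back if run outside a ceph host
[ -d "$dir" ] || dir=/tmp
stat -c '%u %g' "$dir"           # quoted: prints "uid gid", e.g. "167 167"
stat -c %u %g "$dir" || true     # unquoted: '%g' is treated as a filename
```

So the manual failure is probably a quoting artifact, and the "167 167" in the cephadm log suggests that step actually succeeded (167 is the ceph uid/gid inside the container).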
Right now all my iSCSI gateways are down and basically my whole environment is down as a result 🙁
-Paul
Peter,
We're seeing the same issues as you are. We have 2 new hosts (Intel(R)
Xeon(R) Gold 6248R CPU @ 3.00GHz w/ 48 cores, 384GB RAM, and 60x 10TB SED
drives each) and we have tried both 15.2.13 and 16.2.4.
Cephadm does NOT properly deploy and activate OSDs on Ubuntu 20.04.2 with
Docker.
It seems to be a bug in cephadm and a product regression, as we have 4
nearly identical nodes on CentOS running Nautilus (240 x 10TB SED drives)
and had no problems.
FWIW we had no luck yet with one-by-one OSD daemon additions through ceph
orch either. We also reproduced the issue easily in a virtual lab using
small virtual disks on a single ceph VM with 1 mon.
We are now looking into whether we can get past this with a manual buildout.
If you, or anyone, has hit the same stumbling block and gotten past it, I
would really appreciate some guidance.
Thanks,
Marco
On Thu, May 27, 2021 at 2:23 PM Peter Childs <pchilds(a)bcs.org> wrote:
> In the end it looks like I might be able to get the node up to about 30
> OSDs before it stops creating any more.
>
> Or more it formats the disks but freezes up starting the daemons.
>
> I suspect I'm missing something I can tune to get it working better.
>
> If I could see any error messages that might help, but I'm yet to spot
> anything.
>
> Peter.
>
> On Wed, 26 May 2021, 10:57 Eugen Block, <eblock(a)nde.ag> wrote:
>
> > > If I add the osd daemons one at a time with
> > >
> > > ceph orch daemon add osd drywood12:/dev/sda
> > >
> > > It does actually work,
> >
> > Great!
> >
> > > I suspect what's happening is that when my rule for creating osds runs
> > > and creates them all at once, it overloads cephadm and it can't cope.
> >
> > It's possible, I guess.
> >
> > > I suspect what I might need to do, at least to work around the issue,
> > > is set "limit:" and raise it until it stops working.
> >
> > It's worth a try, yes, although the docs state you should try to avoid
> > it; it's possible that it doesn't work properly - in that case, create
> > a bug report. ;-)
> >
> > > I did work out how to get ceph-volume to nearly work manually.
> > >
> > > cephadm shell
> > > ceph auth get client.bootstrap-osd -o
> > > /var/lib/ceph/bootstrap-osd/ceph.keyring
> > > ceph-volume lvm create --data /dev/sda --dmcrypt
> > >
> > > but given I've now got "add osd" to work, I suspect I just need to
> > > fine-tune my osd creation rules, so it does not try to create too many
> > > osds on the same node at the same time.
> >
> > I agree, no need to do it manually if there is an automated way,
> > especially if you're trying to bring up dozens of OSDs.
> >
> >
> > Zitat von Peter Childs <pchilds(a)bcs.org>:
> >
> > > After a bit of messing around. I managed to get it somewhat working.
> > >
> > > If I add the osd daemons one at a time with
> > >
> > > ceph orch daemon add osd drywood12:/dev/sda
> > >
> > > It does actually work,
> > >
> > > I suspect what's happening is that when my rule for creating osds runs
> > > and creates them all at once, it overloads cephadm and it can't cope.
> > >
> > > service_type: osd
> > > service_name: osd.drywood-disks
> > > placement:
> > >   host_pattern: 'drywood*'
> > > spec:
> > >   data_devices:
> > >     size: "7TB:"
> > >   objectstore: bluestore
> > >
> > > I suspect what I might need to do, at least to work around the issue,
> > > is set "limit:" and raise it until it stops working.
> > >
> > > I did work out how to get ceph-volume to nearly work manually.
> > >
> > > cephadm shell
> > > ceph auth get client.bootstrap-osd -o
> > > /var/lib/ceph/bootstrap-osd/ceph.keyring
> > > ceph-volume lvm create --data /dev/sda --dmcrypt
> > >
> > > but given I've now got "add osd" to work, I suspect I just need to
> > > fine-tune my osd creation rules, so it does not try to create too many
> > > osds on the same node at the same time.
> > >
> > >
> > >
> > > On Wed, 26 May 2021 at 08:25, Eugen Block <eblock(a)nde.ag> wrote:
> > >
> > >> Hi,
> > >>
> > >> I believe your current issue is due to a missing keyring for
> > >> client.bootstrap-osd on the OSD node. But even after fixing that
> > >> you probably still won't be able to deploy an OSD manually with
> > >> ceph-volume because 'ceph-volume activate' is not supported with
> > >> cephadm [1]. I just tried that in a virtual environment, it fails when
> > >> activating the systemd-unit:
> > >>
> > >> ---snip---
> > >> [2021-05-26 06:47:16,677][ceph_volume.process][INFO ] Running
> > >> command: /usr/bin/systemctl enable
> > >> ceph-volume@lvm-8-1a8fc8ae-8f4c-4f91-b044-d5636bb52456
> > >> [2021-05-26 06:47:16,692][ceph_volume.process][INFO ] stderr Failed
> > >> to connect to bus: No such file or directory
> > >> [2021-05-26 06:47:16,693][ceph_volume.devices.lvm.create][ERROR ] lvm
> > >> activate was unable to complete, while creating the OSD
> > >> Traceback (most recent call last):
> > >> File
> > >> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py",
> > >> line 32, in create
> > >> Activate([]).activate(args)
> > >> File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py",
> > >> line 16, in is_root
> > >> return func(*a, **kw)
> > >> File
> > >>
> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py",
> > >> line
> > >> 294, in activate
> > >> activate_bluestore(lvs, args.no_systemd)
> > >> File
> > >>
> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py",
> > >> line
> > >> 214, in activate_bluestore
> > >> systemctl.enable_volume(osd_id, osd_fsid, 'lvm')
> > >> File
> > >> "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py",
> > >> line 82, in enable_volume
> > >> return enable(volume_unit % (device_type, id_, fsid))
> > >> File
> > >> "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py",
> > >> line 22, in enable
> > >> process.run(['systemctl', 'enable', unit])
> > >> File "/usr/lib/python3.6/site-packages/ceph_volume/process.py",
> > >> line 153, in run
> > >> raise RuntimeError(msg)
> > >> RuntimeError: command returned non-zero exit status: 1
> > >> [2021-05-26 06:47:16,694][ceph_volume.devices.lvm.create][INFO ] will
> > >> rollback OSD ID creation
> > >> [2021-05-26 06:47:16,697][ceph_volume.process][INFO ] Running
> > >> command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd
> > >> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.8
> > >> --yes-i-really-mean-it
> > >> [2021-05-26 06:47:17,597][ceph_volume.process][INFO ] stderr purged
> > osd.8
> > >> ---snip---
> > >>
> > >> There's a workaround described in [2] that's not really an option for
> > >> dozens of OSDs. I think your best approach is to get cephadm to
> > >> activate the OSDs for you.
> > >> You wrote you didn't find any helpful error messages, but did cephadm
> > >> even try to deploy OSDs? What does your osd spec file look like? Did
> > >> you explicitly run 'ceph orch apply osd -i specfile.yml'? This should
> > >> trigger cephadm and you should see at least some output like this:
> > >>
> > >> Mai 26 08:21:48 pacific1 conmon[31446]: 2021-05-26T06:21:48.466+0000
> > >> 7effc15ff700 0 log_channel(cephadm) log [INF] : Applying service
> > >> osd.ssd-hdd-mix on host pacific2...
> > >> Mai 26 08:21:49 pacific1 conmon[31009]: cephadm
> > >> 2021-05-26T06:21:48.469611+0000 mgr.pacific1.whndiw (mgr.14166) 1646 :
> > >> cephadm [INF] Applying service osd.ssd-hdd-mix on host pacific2...
> > >>
> > >> Regards,
> > >> Eugen
> > >>
> > >> [1] https://tracker.ceph.com/issues/49159
> > >> [2] https://tracker.ceph.com/issues/46691
> > >>
> > >>
> > >> Zitat von Peter Childs <pchilds(a)bcs.org>:
> > >>
> > >> > Not sure what I'm doing wrong, I suspect it's the way I'm running
> > >> > ceph-volume.
> > >> >
> > >> > root@drywood12:~# cephadm ceph-volume lvm create --data /dev/sda
> > >> --dmcrypt
> > >> > Inferring fsid 1518c8e0-bbe4-11eb-9772-001e67dc85ea
> > >> > Using recent ceph image ceph/ceph@sha256
> > >> > :54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
> > >> > /usr/bin/docker: Running command: /usr/bin/ceph-authtool
> > --gen-print-key
> > >> > /usr/bin/docker: Running command: /usr/bin/ceph-authtool
> > --gen-print-key
> > >> > /usr/bin/docker: --> RuntimeError: No valid ceph configuration file
> > was
> > >> > loaded.
> > >> > Traceback (most recent call last):
> > >> > File "/usr/sbin/cephadm", line 8029, in <module>
> > >> > main()
> > >> > File "/usr/sbin/cephadm", line 8017, in main
> > >> > r = ctx.func(ctx)
> > >> > File "/usr/sbin/cephadm", line 1678, in _infer_fsid
> > >> > return func(ctx)
> > >> > File "/usr/sbin/cephadm", line 1738, in _infer_image
> > >> > return func(ctx)
> > >> > File "/usr/sbin/cephadm", line 4514, in command_ceph_volume
> > >> > out, err, code = call_throws(ctx, c.run_cmd(),
> > verbosity=verbosity)
> > >> > File "/usr/sbin/cephadm", line 1464, in call_throws
> > >> > raise RuntimeError('Failed command: %s' % ' '.join(command))
> > >> > RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
> > >> > --net=host --entrypoint /usr/sbin/ceph-volume --privileged
> > >> --group-add=disk
> > >> > --init -e CONTAINER_IMAGE=ceph/ceph@sha256
> :54e95ae1e11404157d7b329d0t
> > >> >
> > >> > root@drywood12:~# cephadm shell
> > >> > Inferring fsid 1518c8e0-bbe4-11eb-9772-001e67dc85ea
> > >> > Inferring config
> > >> >
> > /var/lib/ceph/1518c8e0-bbe4-11eb-9772-001e67dc85ea/mon.drywood12/config
> > >> > Using recent ceph image ceph/ceph@sha256
> > >> > :54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
> > >> > root@drywood12:/# ceph-volume lvm create --data /dev/sda --dmcrypt
> > >> > Running command: /usr/bin/ceph-authtool --gen-print-key
> > >> > Running command: /usr/bin/ceph-authtool --gen-print-key
> > >> > Running command: /usr/bin/ceph --cluster ceph --name
> > client.bootstrap-osd
> > >> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> > >> > 70054a5c-c176-463a-a0ac-b44c5db0987c
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
> to
> > >> find
> > >> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
> > file
> > >> or
> > >> > directory
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > >> > AuthRegistry(0x7fdef405b378) no keyring found at
> > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
> to
> > >> find
> > >> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
> > file
> > >> or
> > >> > directory
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > >> > AuthRegistry(0x7fdef405ef20) no keyring found at
> > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
> to
> > >> find
> > >> > a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
> > file
> > >> or
> > >> > directory
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
> > >> > AuthRegistry(0x7fdef8f0bea0) no keyring found at
> > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef2d9d700 -1
> > monclient(hunting):
> > >> > handle_auth_bad_method server allowed_methods [2] but i only support
> > [1]
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef259c700 -1
> > monclient(hunting):
> > >> > handle_auth_bad_method server allowed_methods [2] but i only support
> > [1]
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef1d9b700 -1
> > monclient(hunting):
> > >> > handle_auth_bad_method server allowed_methods [2] but i only support
> > [1]
> > >> > stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 monclient:
> > >> > authenticate NOTE: no keyring found; disabled cephx authentication
> > >> > stderr: [errno 13] RADOS permission denied (error connecting to the
> > >> > cluster)
> > >> > --> RuntimeError: Unable to create a new OSD id
> > >> > root@drywood12:/# lsblk /dev/sda
> > >> > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> > >> > sda 8:0 0 7.3T 0 disk
> > >> >
> > >> > As far as I can see cephadm gets a little further than this, as the
> > >> > disks have lvm volumes on them; it's just that the OSD daemons are
> > >> > not created or started.
> > >> > So maybe I'm invoking ceph-volume incorrectly.
> > >> >
> > >> >
> > >> > On Tue, 25 May 2021 at 06:57, Peter Childs <pchilds(a)bcs.org> wrote:
> > >> >
> > >> >>
> > >> >>
> > >> >> On Mon, 24 May 2021, 21:08 Marc, <Marc(a)f1-outsourcing.eu> wrote:
> > >> >>
> > >> >>> >
> > >> >>> > I'm attempting to use cephadm and Pacific, currently on debian
> > >> >>> > buster, mostly because centos7 ain't supported any more and
> > >> >>> > centos8 ain't supported by some of my hardware.
> > >> >>>
> > >> >>> Who says centos7 is not supported any more? Afaik centos7/el7 is
> > >> >>> being supported till its EOL in 2024. By then maybe a good
> > >> >>> alternative for el8/stream will have surfaced.
> > >> >>>
> > >> >>
> > >> >> Not supported by ceph Pacific, it's our os of choice otherwise.
> > >> >>
> > >> >> My testing says the versions of podman, docker and python3
> > >> >> available there do not work with Pacific.
> > >> >>
> > >> >> Given I've needed to upgrade docker on buster, can we please have
> > >> >> a list of versions that work with cephadm, and maybe even have
> > >> >> cephadm say "no, please upgrade" unless you're running the right
> > >> >> version or better.
> > >> >>
> > >> >>
> > >> >>
> > >> >>> > Anyway I have a few nodes with 59x 7.2TB disks but for some
> > >> >>> > reason the osd daemons don't start; the disks get formatted and
> > >> >>> > the osds are created, but the daemons never come up.
> > >> >>>
> > >> >>> what if you try with
> > >> >>> ceph-volume lvm create --data /dev/sdi --dmcrypt ?
> > >> >>>
> > >> >>
> > >> >> I'll have a go.
> > >> >>
> > >> >>
> > >> >>> > They are probably the wrong spec for ceph (48GB of memory and
> > >> >>> > only 4 cores)
> > >> >>>
> > >> >>> You can always start with just configuring a few disks per node.
> > That
> > >> >>> should always work.
> > >> >>>
> > >> >>
> > >> >> That was my thought too.
> > >> >>
> > >> >> Thanks
> > >> >>
> > >> >> Peter
> > >> >>
> > >> >>
> > >> >>> > but I was expecting them to start and be either dirt slow or
> > >> >>> > crash later. anyway, I've got up to 30 of them, so I was hoping
> > >> >>> > to get at least 6PB of raw storage out of them.
> > >> >>> >
> > >> >>> > As yet I've not spotted any helpful error messages.
> > >> >>> >
> > >> >>> _______________________________________________
> > >> >>> ceph-users mailing list -- ceph-users(a)ceph.io
> > >> >>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
> > >> >>>
> > >> >>
> > >>
> > >>
> > >>
> >
> >
> >
>
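For anyone finding this thread later: the throttle discussed above is the 'limit' filter in the OSD service spec. A sketch based on Peter's spec (the value 10 is arbitrary, and the docs recommend avoiding 'limit' where possible):

```yaml
service_type: osd
service_name: osd.drywood-disks
placement:
  host_pattern: 'drywood*'
spec:
  data_devices:
    size: "7TB:"
    limit: 10        # consume at most 10 matching devices (per host)
  objectstore: bluestore
```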
This morning I tried adding a mon node to my home Ceph cluster with the following command:
ceph orch daemon add mon ether
This seemed to work at first, but then it decided to remove it fairly quickly, which broke the cluster because the mon. keyring was also removed:
2021-06-01T14:16:11.523210+0000 mgr.paris.glbvov [INF] Deploying daemon mon.ether on ether
2021-06-01T14:16:43.621759+0000 mgr.paris.glbvov [INF] Safe to remove mon.ether: not in monmap (['paris', 'excalibur'])
2021-06-01T14:16:43.622135+0000 mgr.paris.glbvov [INF] Removing monitor ether from monmap...
2021-06-01T14:16:43.641365+0000 mgr.paris.glbvov [INF] Removing daemon mon.ether from ether
2021-06-01T14:16:46.610283+0000 mgr.paris.glbvov [INF] Removing key for mon.
Digging into this, it seems like this line might need to check for 'mon.' and not 'mon':
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/ce…
Anyways, does anyone know how to import the mon. keyring again once it has been removed?
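(Not an authoritative answer, but a surviving monitor should still have a copy of the mon. key in its on-disk keyring, so something like this sketch may work - cephadm layout assumed; FSID and the mon name 'paris' are placeholders:)

```
# copy the keyring from a still-running mon's data directory
cp /var/lib/ceph/FSID/mon.paris/keyring /tmp/mon.keyring
# re-import the [mon.] entry into the cluster auth database
ceph auth import -i /tmp/mon.keyring
```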
Thanks,
Bryan