I do not believe it was in 16.2.4. I will build
another patched version of
the image tomorrow based on that version. I do agree; I feel this breaks
new deploys as well as existing ones, and hope a point release that
includes the fix will come soon.
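Presumably it will just be the same trivial Dockerfile as before, rebased
(untested until I actually build it):

$ cat Dockerfile
FROM docker.io/ceph/ceph:v16.2.4
COPY process.py /lib/python3.6/site-packages/remoto/process.py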
On May 31, 2021, at 15:33, Marco Pizzolo <marcopizzolo(a)gmail.com> wrote:
David,
What I can confirm is that if this fix is already in 16.2.4 and 15.2.13,
then there's another issue resulting in the same situation, as it continues
to happen in the latest available images.
We are going to try and see if we can install a 15.2.x release and
subsequently upgrade using a fixed image. We were not finding a good way
to bootstrap directly with a custom image, but maybe we missed something.
The cephadm bootstrap command didn't seem to support an image path.
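If the top-level --image flag happens to be honored by bootstrap (an
assumption on our part, not something we've verified), something like this
might let us skip the install-then-upgrade dance:

# cephadm --image docker.io/ormandj/ceph:v16.2.3-mgrfix bootstrap --mon-ip <monitor-ip>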
Thanks for your help thus far. I'll update later today or tomorrow when
we get the chance to go the upgrade route.
It seems tragic that when a show-stopping, immediately reproducible issue
such as this occurs, adopters are left to flounder for so long. Ceph
has had a tremendously positive impact for us since we began using it
in Luminous/Mimic, but situations such as this are hard to look past. It's
really unfortunate, as our existing production clusters have been rock solid
thus far, but this does shake one's confidence, and I would wager that I'm
not alone.
Marco
On Mon, May 31, 2021 at 3:57 PM David Orman <ormandj(a)corenode.com> wrote:
Does the image we built fix the problem for you?
That's how we worked
around it. Unfortunately, it even bites you with fewer OSDs if you have
DB/WAL on other devices, we have 24 rotational drives/OSDs, but split
DB/WAL onto multiple NVMEs. We're hoping the remoto fix (since it's
merged upstream and pushed) will land in the next point release of
16.x (and it sounds like 15.x), since this is a blocking issue without
using patched containers. I guess testing isn't done against clusters
with these kinds of configurations, as we can replicate it on any of
our dev/test clusters with this type of drive configuration. We
weren't able to upgrade any clusters/deploy new hosts on any clusters,
so it caused quite an issue until we figured out the problem and
resolved it.
If you want to build your own images, this is the simple Dockerfile we
used to get beyond this issue:
$ cat Dockerfile
FROM docker.io/ceph/ceph:v16.2.3
COPY process.py /lib/python3.6/site-packages/remoto/process.py
The process.py is the patched version we submitted here:
https://github.com/alfredodeza/remoto/pull/63/commits/6f98078a1479de1f246f9…
(merged upstream).
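If you go this route, build the image, push it somewhere your hosts can
pull from, and point the orchestrator at it; roughly (registry and tag are
placeholders):

$ docker build -t <registry>/ceph:v16.2.3-remotofix .
$ docker push <registry>/ceph:v16.2.3-remotofix
$ ceph orch upgrade start --image <registry>/ceph:v16.2.3-remotofix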
Hope this helps,
David
On Mon, May 31, 2021 at 11:43 AM Marco Pizzolo <marcopizzolo(a)gmail.com>
wrote:
Unfortunately Ceph 16.2.4 is still not working for us. We continue to
have issues where the 26th OSD is not fully created and started. We've
confirmed that we do get the flock as described in:
https://tracker.ceph.com/issues/50526
-----
I have verified in our labs a way to reproduce the problem easily:
0. Please stop the cephadm orchestrator. On your bootstrap node:
# cephadm shell
# ceph mgr module disable cephadm
1. On one of the hosts where you want to create OSDs and which has a big
amount of devices, see if you have a "cephadm" file lock, for example:
# lslocks | grep cephadm
python3 1098782 FLOCK 0B WRITE 0 0 0
/run/cephadm/9fa2b396-adb5-11eb-a2d3-bc97e17cf960.lock
If that is the case, just kill the process to start with a "clean"
situation.
2. Go to the folder /var/lib/ceph/<your_ceph_cluster_fsid>; you will find
there a file called "cephadm.xxxxxxxxxxxxxx". Execute:
# python3 cephadm.xxxxxxxxxxxxxx ceph-volume inventory
3. If the problem is present in your cephadm file, the command will block
and you will see a cephadm file lock again.
4. If the modification is not present, change your cephadm.xxxxxxxxxx file
to include the modification I did (it just removes the verbosity parameter
in the call_throws call):
https://github.com/ceph/ceph/blob/2f4dc3147712f1991242ef0d059690b5fa3d8463/…
Then go to step 1 to clean the file lock and try again. With the
modification in place it must work.
-----
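For reference, the modification described in step 4 appears to boil down to
a one-line change in cephadm's command_ceph_volume (sketched from the
tracker's description and the traceback quoted further down this thread;
exact line numbers will vary by release):

-    out, err, code = call_throws(ctx, c.run_cmd(), verbosity=verbosity)
+    out, err, code = call_throws(ctx, c.run_cmd())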
For us, the manual execution takes a few seconds but does come back, and
there are no file locks; however, we remain unable to add any further
OSDs.
Furthermore, this is happening during the creation of a new Pacific
cluster, post bootstrap, adding one OSD daemon at a time and
allowing each OSD to be created, set in, and brought up.
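(To be clear, "one at a time" means literally running, for each device,
something like the following, host and device names illustrative:
# ceph orch daemon add osd ceph01:/dev/sda
and waiting for that OSD to be up before adding the next.)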
How is everyone else managing to get past this, or are we the only ones
(aside from David) using >25 OSDs per host?
Our luck has been the same with 15.2.13 and 16.2.4, using both Docker and
Podman on Ubuntu 20.04.2.
>
> Thanks,
> Marco
>
> On Sun, May 30, 2021 at 7:33 AM Peter Childs <pchilds(a)bcs.org> wrote:
>>
>> I've actually managed to get a little further with my problem.
>>
>> As I've said before these servers are slightly distorted in config.
>>
>> 63 drives and only 48GB of memory.
>>
>> Once I create about 15-20 OSDs it continues to format the disks but
>> won't actually create the containers or start any services.
>
> Worse than that, on reboot the disks disappear; they don't just stop
> working, they are not detected by Linux at all, which makes me think I'm
> hitting some kernel limit.
>
> At this point I'm going to cut my losses, give up, and use the smaller,
> slightly more powerful 30x drive systems I have (with 256GB memory),
> maybe transplanting the larger disks if I need more capacity.
>
> Peter
>
> On Sat, 29 May 2021, 23:19 Marco Pizzolo, <marcopizzolo(a)gmail.com> wrote:
>>
>> Thanks David
>> We will investigate the bugs as per your suggestion, and then will look
>> to test with the custom image.
>>
>> Appreciate it.
>>
>> On Sat, May 29, 2021, 4:11 PM David Orman <ormandj(a)corenode.com> wrote:
>>>
>>> You may be running into the same issue we ran into (make sure to read
>>> the first issue, there's a few mingled in there), for which we
>>> submitted a patch:
>>>
>>> https://tracker.ceph.com/issues/50526
>>> https://github.com/alfredodeza/remoto/issues/62
>>>
>>> If you're brave (YMMV, test first non-prod), we pushed an image with
>>> the issue we encountered fixed as per above here:
>>> https://hub.docker.com/repository/docker/ormandj/ceph/tags?page=1 . We
>>> 'upgraded' to this when we encountered the mgr hanging on us after
>>> updating ceph to v16 and experiencing this issue using: "ceph orch
>>> upgrade start --image docker.io/ormandj/ceph:v16.2.3-mgrfix". I've not
>>> tried to bootstrap a new cluster with a custom image, and I don't know
>>> when 16.2.4 will be released with this change (hopefully) integrated,
>>> as remoto accepted the patch upstream.
>>>
>>> I'm not sure if this is your exact issue; see the bug reports and check
>>> whether the lock/behavior matches. If so, then it may help you
>>> out. The only change in that image is that patch to remoto being
>>> overlaid on the default 16.2.3 image.
>>>
>>> On Fri, May 28, 2021 at 1:15 PM Marco Pizzolo <marcopizzolo(a)gmail.com> wrote:
>>> >
>>> > Peter,
>>> >
>>> > We're seeing the same issues as you are. We have 2 new hosts, Intel(R)
>>> > Xeon(R) Gold 6248R CPU @ 3.00GHz w/ 48 cores, 384GB RAM, and 60x 10TB
>>> > SED drives, and we have tried both 15.2.13 and 16.2.4.
>>> >
>>> > Cephadm does NOT properly deploy and activate OSDs on Ubuntu 20.04.2
>>> > with Docker.
>>> >
>>> > It seems to be a bug in cephadm and a product regression, as we have 4
>>> > near identical nodes on CentOS running Nautilus (240 x 10TB SED drives)
>>> > and had no problems.
>>> >
>>> > FWIW we had no luck yet with one-by-one OSD daemon additions through
>>> > ceph orch either. We also reproduced the issue easily in a virtual lab
>>> > using small virtual disks on a single ceph VM with 1 mon.
>>> >
>>> > We are now looking into whether we can get past this with a manual
>>> > buildout.
>>> >
>>> > If you, or anyone, has hit the same stumbling block and gotten past it,
>>> > I would really appreciate some guidance.
>>> >
>>> > Thanks,
>>> > Marco
>>> >
>>> > On Thu, May 27, 2021 at 2:23 PM Peter Childs <pchilds(a)bcs.org> wrote:
>>> >
>>> > > In the end it looks like I might be able to get the node up to about
>>> > > 30 OSDs before it stops creating any more.
>>> > >
>>> > > Or rather, it formats the disks but freezes up starting the daemons.
>>> > >
>>> > > I suspect I'm missing something I can tune to get it working better.
>>> > >
>>> > > If I could see any error messages that might help, but I've yet to
>>> > > spot anything.
>>> > >
>>> > > Peter.
>>> > >
>>> > > On Wed, 26 May 2021, 10:57 Eugen Block, <eblock(a)nde.ag> wrote:
>>> > >
>>> > > > > If I add the osd daemons one at a time with
>>> > > > >
>>> > > > > ceph orch daemon add osd drywood12:/dev/sda
>>> > > > >
>>> > > > > It does actually work,
>>> > > >
>>> > > > Great!
>>> > > >
>>> > > > > I suspect what's happening is that when my rule for creating
>>> > > > > OSDs runs and creates them all at once it ties up the orch: it
>>> > > > > overloads cephadm and it can't cope.
>>> > > >
>>> > > > It's possible, I guess.
>>> > > >
>>> > > > > I suspect what I might need to do, at least to work around the
>>> > > > > issue, is set "limit:" and bring it up until it stops working.
>>> > > >
>>> > > > It's worth a try, yes, although the docs state you should try to
>>> > > > avoid it. It's possible that it doesn't work properly; in that
>>> > > > case create a bug report. ;-)
>>> > > >
>>> > > > > I did work out how to get ceph-volume to nearly work manually:
>>> > > > >
>>> > > > > cephadm shell
>>> > > > > ceph auth get client.bootstrap-osd -o
>>> > > > > /var/lib/ceph/bootstrap-osd/ceph.keyring
>>> > > > > ceph-volume lvm create --data /dev/sda --dmcrypt
>>> > > > >
>>> > > > > but given I've now got "add osd" to work, I suspect I just need
>>> > > > > to fine-tune my OSD creation rules so it does not try to create
>>> > > > > too many OSDs on the same node at the same time.
>>> > > >
>>> > > > I agree, no need to do it manually if there is an automated way,
>>> > > > especially if you're trying to bring up dozens of OSDs.
>>> > > >
>>> > > >
>>> > > > Zitat von Peter Childs <pchilds(a)bcs.org>:
>>> > > >
>>> > > > > After a bit of messing around, I managed to get it somewhat
>>> > > > > working.
>>> > > > >
>>> > > > > If I add the osd daemons one at a time with
>>> > > > >
>>> > > > > ceph orch daemon add osd drywood12:/dev/sda
>>> > > > >
>>> > > > > It does actually work,
>>> > > > >
>>> > > > > I suspect what's happening is that when my rule for creating
>>> > > > > OSDs runs and creates them all at once it ties up the orch: it
>>> > > > > overloads cephadm and it can't cope.
>>> > > > >
>>> > > > > service_type: osd
>>> > > > > service_name: osd.drywood-disks
>>> > > > > placement:
>>> > > > >   host_pattern: 'drywood*'
>>> > > > > spec:
>>> > > > >   data_devices:
>>> > > > >     size: "7TB:"
>>> > > > >   objectstore: bluestore
>>> > > > >
>>> > > > > I suspect what I might need to do, at least to work around the
>>> > > > > issue, is set "limit:" and bring it up until it stops working.
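>>> > > > > Something like this, I'd guess (untested, and the number is
>>> > > > > just a starting point to tune):
>>> > > > >
>>> > > > > spec:
>>> > > > >   data_devices:
>>> > > > >     size: "7TB:"
>>> > > > >     limit: 10
>>> > > > >   objectstore: bluestore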
>>> > > > >
>>> > > > > I did work out how to get ceph-volume to nearly work manually:
>>> > > > >
>>> > > > > cephadm shell
>>> > > > > ceph auth get client.bootstrap-osd -o
>>> > > > > /var/lib/ceph/bootstrap-osd/ceph.keyring
>>> > > > > ceph-volume lvm create --data /dev/sda --dmcrypt
>>> > > > >
>>> > > > > but given I've now got "add osd" to work, I suspect I just need
>>> > > > > to fine-tune my OSD creation rules so it does not try to create
>>> > > > > too many OSDs on the same node at the same time.
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > On Wed, 26 May 2021 at 08:25, Eugen Block <eblock(a)nde.ag> wrote:
>>> > > > >
>>> > > > >> Hi,
>>> > > > >>
>>> > > > >> I believe your current issue is due to a missing keyring for
>>> > > > >> client.bootstrap-osd on the OSD node. But even after fixing that
>>> > > > >> you probably still won't be able to deploy an OSD manually with
>>> > > > >> ceph-volume, because 'ceph-volume activate' is not supported with
>>> > > > >> cephadm [1]. I just tried that in a virtual environment; it fails
>>> > > > >> when activating the systemd unit:
>>> > > > >>
>>> > > > >> ---snip---
>>> > > > >> [2021-05-26 06:47:16,677][ceph_volume.process][INFO  ] Running
>>> > > > >> command: /usr/bin/systemctl enable
>>> > > > >> ceph-volume@lvm-8-1a8fc8ae-8f4c-4f91-b044-d5636bb52456
>>> > > > >> [2021-05-26 06:47:16,692][ceph_volume.process][INFO  ] stderr Failed
>>> > > > >> to connect to bus: No such file or directory
>>> > > > >> [2021-05-26 06:47:16,693][ceph_volume.devices.lvm.create][ERROR ] lvm
>>> > > > >> activate was unable to complete, while creating the OSD
>>> > > > >> Traceback (most recent call last):
>>> > > > >>   File
>>> > > > >> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py",
>>> > > > >> line 32, in create
>>> > > > >>     Activate([]).activate(args)
>>> > > > >>   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py",
>>> > > > >> line 16, in is_root
>>> > > > >>     return func(*a, **kw)
>>> > > > >>   File
>>> > > > >> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py",
>>> > > > >> line 294, in activate
>>> > > > >>     activate_bluestore(lvs, args.no_systemd)
>>> > > > >>   File
>>> > > > >> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py",
>>> > > > >> line 214, in activate_bluestore
>>> > > > >>     systemctl.enable_volume(osd_id, osd_fsid, 'lvm')
>>> > > > >>   File
>>> > > > >> "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py",
>>> > > > >> line 82, in enable_volume
>>> > > > >>     return enable(volume_unit % (device_type, id_, fsid))
>>> > > > >>   File
>>> > > > >> "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py",
>>> > > > >> line 22, in enable
>>> > > > >>     process.run(['systemctl', 'enable', unit])
>>> > > > >>   File "/usr/lib/python3.6/site-packages/ceph_volume/process.py",
>>> > > > >> line 153, in run
>>> > > > >>     raise RuntimeError(msg)
>>> > > > >> RuntimeError: command returned non-zero exit status: 1
>>> > > > >> [2021-05-26 06:47:16,694][ceph_volume.devices.lvm.create][INFO  ] will
>>> > > > >> rollback OSD ID creation
>>> > > > >> [2021-05-26 06:47:16,697][ceph_volume.process][INFO  ] Running
>>> > > > >> command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd
>>> > > > >> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.8
>>> > > > >> --yes-i-really-mean-it
>>> > > > >> [2021-05-26 06:47:17,597][ceph_volume.process][INFO  ] stderr purged
>>> > > > >> osd.8
>>> > > > >> ---snip---
>>> > > > >>
>>> > > > >> There's a workaround described in [2] that's not really an option
>>> > > > >> for dozens of OSDs. I think your best approach is to get cephadm
>>> > > > >> to activate the OSDs for you.
>>> > > > >> You wrote you didn't find any helpful error messages, but did
>>> > > > >> cephadm even try to deploy OSDs? What does your OSD spec file look
>>> > > > >> like? Did you explicitly run 'ceph orch apply osd -i specfile.yml'?
>>> > > > >> This should trigger cephadm and you should see at least some
>>> > > > >> output like this:
>>> > > > >>
>>> > > > >> Mai 26 08:21:48 pacific1 conmon[31446]: 2021-05-26T06:21:48.466+0000
>>> > > > >> 7effc15ff700  0 log_channel(cephadm) log [INF] : Applying service
>>> > > > >> osd.ssd-hdd-mix on host pacific2...
>>> > > > >> Mai 26 08:21:49 pacific1 conmon[31009]: cephadm
>>> > > > >> 2021-05-26T06:21:48.469611+0000 mgr.pacific1.whndiw (mgr.14166) 1646 :
>>> > > > >> cephadm [INF] Applying service osd.ssd-hdd-mix on host pacific2...
>>> > > > >>
>>> > > > >> Regards,
>>> > > > >> Eugen
>>> > > > >>
>>> > > > >> [1] https://tracker.ceph.com/issues/49159
>>> > > > >> [2] https://tracker.ceph.com/issues/46691
>>> > > > >>
>>> > > > >>
>>> > > > >> Zitat von Peter Childs <pchilds(a)bcs.org>:
>>> > > > >>
>>> > > > >> > Not sure what I'm doing wrong; I suspect it's the way I'm running
>>> > > > >> > ceph-volume.
>>> > > > >> >
>>> > > > >> > root@drywood12:~# cephadm ceph-volume lvm create --data /dev/sda
>>> > > > >> > --dmcrypt
>>> > > > >> > Inferring fsid 1518c8e0-bbe4-11eb-9772-001e67dc85ea
>>> > > > >> > Using recent ceph image ceph/ceph@sha256
>>> > > > >> > :54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
>>> > > > >> > /usr/bin/docker: Running command: /usr/bin/ceph-authtool
>>> > > > >> > --gen-print-key
>>> > > > >> > /usr/bin/docker: Running command: /usr/bin/ceph-authtool
>>> > > > >> > --gen-print-key
>>> > > > >> > /usr/bin/docker: --> RuntimeError: No valid ceph configuration
>>> > > > >> > file was loaded.
>>> > > > >> > Traceback (most recent call last):
>>> > > > >> >   File "/usr/sbin/cephadm", line 8029, in <module>
>>> > > > >> >     main()
>>> > > > >> >   File "/usr/sbin/cephadm", line 8017, in main
>>> > > > >> >     r = ctx.func(ctx)
>>> > > > >> >   File "/usr/sbin/cephadm", line 1678, in _infer_fsid
>>> > > > >> >     return func(ctx)
>>> > > > >> >   File "/usr/sbin/cephadm", line 1738, in _infer_image
>>> > > > >> >     return func(ctx)
>>> > > > >> >   File "/usr/sbin/cephadm", line 4514, in command_ceph_volume
>>> > > > >> >     out, err, code = call_throws(ctx, c.run_cmd(),
>>> > > > >> > verbosity=verbosity)
>>> > > > >> >   File "/usr/sbin/cephadm", line 1464, in call_throws
>>> > > > >> >     raise RuntimeError('Failed command: %s' % ' '.join(command))
>>> > > > >> > RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
>>> > > > >> > --net=host --entrypoint /usr/sbin/ceph-volume --privileged
>>> > > > >> > --group-add=disk
>>> > > > >> > --init -e CONTAINER_IMAGE=ceph/ceph@sha256:54e95ae1e11404157d7b329d0t
>>> > > > >> >
>>> > > > >> > root@drywood12:~# cephadm shell
>>> > > > >> > Inferring fsid 1518c8e0-bbe4-11eb-9772-001e67dc85ea
>>> > > > >> > Inferring config
>>> > > > >> > /var/lib/ceph/1518c8e0-bbe4-11eb-9772-001e67dc85ea/mon.drywood12/config
>>> > > > >> > Using recent ceph image ceph/ceph@sha256
>>> > > > >> > :54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949
>>> > > > >> > root@drywood12:/# ceph-volume lvm create --data /dev/sda --dmcrypt
>>> > > > >> > Running command: /usr/bin/ceph-authtool --gen-print-key
>>> > > > >> > Running command: /usr/bin/ceph-authtool --gen-print-key
>>> > > > >> > Running command: /usr/bin/ceph --cluster ceph --name
>>> > > > >> > client.bootstrap-osd --keyring
>>> > > > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
>>> > > > >> > 70054a5c-c176-463a-a0ac-b44c5db0987c
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
>>> > > > >> > to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2)
>>> > > > >> > No such file or directory
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
>>> > > > >> > AuthRegistry(0x7fdef405b378) no keyring found at
>>> > > > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
>>> > > > >> > to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2)
>>> > > > >> > No such file or directory
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
>>> > > > >> > AuthRegistry(0x7fdef405ef20) no keyring found at
>>> > > > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 auth: unable
>>> > > > >> > to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2)
>>> > > > >> > No such file or directory
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1
>>> > > > >> > AuthRegistry(0x7fdef8f0bea0) no keyring found at
>>> > > > >> > /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef2d9d700 -1
>>> > > > >> > monclient(hunting): handle_auth_bad_method server allowed_methods
>>> > > > >> > [2] but i only support [1]
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef259c700 -1
>>> > > > >> > monclient(hunting): handle_auth_bad_method server allowed_methods
>>> > > > >> > [2] but i only support [1]
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef1d9b700 -1
>>> > > > >> > monclient(hunting): handle_auth_bad_method server allowed_methods
>>> > > > >> > [2] but i only support [1]
>>> > > > >> >  stderr: 2021-05-25T07:46:18.188+0000 7fdef8f0d700 -1 monclient:
>>> > > > >> > authenticate NOTE: no keyring found; disabled cephx authentication
>>> > > > >> >  stderr: [errno 13] RADOS permission denied (error connecting to
>>> > > > >> > the cluster)
>>> > > > >> > --> RuntimeError: Unable to create a new OSD id
>>> > > > >> > root@drywood12:/# lsblk /dev/sda
>>> > > > >> > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>>> > > > >> > sda 8:0 0 7.3T 0 disk
>>> > > > >> >
>>> > > > >> > As far as I can see cephadm gets a little further than this, as
>>> > > > >> > the disks have LVM volumes on them; it's just that the OSD daemons
>>> > > > >> > are not created or started.
>>> > > > >> > So maybe I'm invoking ceph-volume incorrectly.
>>> > > > >> >
>>> > > > >> >
>>> > > > >> > On Tue, 25 May 2021 at 06:57, Peter Childs <pchilds(a)bcs.org> wrote:
>>> > > > >> >
>>> > > > >> >>
>>> > > > >> >>
>>> > > > >> >> On Mon, 24 May 2021, 21:08 Marc, <Marc(a)f1-outsourcing.eu> wrote:
>>> > > > >> >>
>>> > > > >> >>> >
>>> > > > >> >>> > I'm attempting to use cephadm and Pacific, currently on Debian
>>> > > > >> >>> > buster, mostly because CentOS 7 ain't supported any more and
>>> > > > >> >>> > CentOS 8 ain't supported by some of my hardware.
>>> > > > >> >>>
>>> > > > >> >>> Who says CentOS 7 is not supported any more? AFAIK centos7/el7
>>> > > > >> >>> is being supported till its EOL in 2024. By then maybe a good
>>> > > > >> >>> alternative for el8/stream will have surfaced.
>>> > > > >> >>>
>>> > > > >> >>
>>> > > > >> >> Not supported by Ceph Pacific; it's our OS of choice otherwise.
>>> > > > >> >>
>>> > > > >> >> My testing says the available versions of podman, docker and
>>> > > > >> >> python3 do not work with Pacific.
>>> > > > >> >>
>>> > > > >> >> Given I've needed to upgrade docker on buster, can we please
>>> > > > >> >> have a list of versions that work with cephadm, and maybe even
>>> > > > >> >> have cephadm say "no, please upgrade" unless you're running the
>>> > > > >> >> right version or better.
>>> > > > >> >>
>>> > > > >> >>
>>> > > > >> >>
>>> > > > >> >>> > Anyway I have a few nodes with 59x 7.2TB disks, but for some
>>> > > > >> >>> > reason the OSD daemons don't start; the disks get formatted
>>> > > > >> >>> > and the OSDs are created but the daemons never come up.
>>> > > > >> >>>
>>> > > > >> >>> what if you try with
>>> > > > >> >>> ceph-volume lvm create --data /dev/sdi --dmcrypt ?
>>> > > > >> >>>
>>> > > > >> >>
>>> > > > >> >> I'll have a go.
>>> > > > >> >>
>>> > > > >> >>
>>> > > > >> >>> > They are probably the wrong spec for Ceph (48GB of memory
>>> > > > >> >>> > and only 4 cores)
>>> > > > >> >>>
>>> > > > >> >>> You can always start with just configuring a few disks per
>>> > > > >> >>> node. That should always work.
>>> > > > >> >>>
>>> > > > >> >>
>>> > > > >> >> That was my thought too.
>>> > > > >> >>
>>> > > > >> >> Thanks
>>> > > > >> >>
>>> > > > >> >> Peter
>>> > > > >> >>
>>> > > > >> >>
>>> > > > >> >>> > but I was expecting them to start and be either dirt slow or
>>> > > > >> >>> > crash later; anyway I've got up to 30 of them, so I was hoping
>>> > > > >> >>> > to get at least 6PB of raw storage out of them.
>>> > > > >> >>> >
>>> > > > >> >>> > As yet I've not spotted any helpful error messages.