Currently I'm using the default podman that ships with CentOS 7 (version 1.6.4), which I fear is the issue:
/bin/podman: stderr WARNING: The same type, major and minor should not be
used for multiple devices.
That warning looks to be part of the issue, and I've heard it is a known problem in older versions of podman.
I can't see a RAM issue or a CPU issue; it looks like it's probably an issue with podman mounting overlays, so maybe upgrading podman beyond what's available with CentOS 7 is the first plan. A shame CentOS 8 is a non-project now :(
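Before upgrading, it may be worth confirming which podman version and storage driver are actually in use; a minimal check, assuming the YAML-style `podman info` output of the 1.x series (newer releases print the same key in lowercase, which the case-insensitive grep covers):

```shell
# Show the podman version in use
podman version

# Show the storage (graph) driver -- "overlay" on a recent kernel is the
# expected answer; older setups may report vfs or fuse-overlayfs instead
podman info | grep -i graphdrivername
```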
Peter.
On Sat, 27 Feb 2021 at 19:37, David Orman <ormandj(a)corenode.com> wrote:
Podman is fine (preferably 3.0+). What were those variables set to before? With most recent distributions and kernels we've not noticed a problem with the defaults. Did you notice errors that led you to change them? We have many clusters of 21 nodes, 24 HDDs each, multiple NVMEs serving as WAL/DB, which were on 15.2.7 and prior but are now all on 15.2.9, running in podman 3.0.1 (which fixes issues with the 2.2 series on upgrade). We have less RAM (128G) per node without issues.
On the OSDs that will not start - what error(s) do you see? You can inspect the OSDs with "podman logs <id>" if they've started inside of podman but just aren't joining the cluster; if they haven't, then looking at the systemctl status for the service or journalctl will normally give more insight. Hopefully the root cause of your problems can be identified so it can be addressed directly.
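The inspection steps above can be sketched as a few commands. `<fsid>` and `<id>` are placeholders for your cluster FSID and OSD number; the `ceph-<fsid>@osd.<id>` unit name is the usual cephadm convention, but verify what's actually on your host with `systemctl list-units 'ceph*'`:

```shell
# Find the OSD container, including ones that exited immediately
podman ps -a | grep osd

# If the container started at all, its daemon log is the first stop
podman logs <container-id>

# If it never started, ask systemd why
systemctl status 'ceph-<fsid>@osd.<id>.service'

# Full journal for the unit, limited to the recent window
journalctl -u 'ceph-<fsid>@osd.<id>.service' --since "1 hour ago"
```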
On Sat, Feb 27, 2021 at 11:34 AM Peter Childs <pchilds(a)bcs.org> wrote:
I'm new to ceph, and I've been trying to set up a new cluster with 16 computers with 30 disks and 6 SSDs each (plus boot disks), 256G of memory, and IB networking. (OK, it's currently 15 computers, but never mind.)
When I take them over about 10 OSDs each they start having problems starting the OSDs up. I can normally fix this by rebooting them, and it will continue again for a while; it is possible to get them up to the full complement with a bit of poking around. (Once it's working it's fine, unless you start adding services or moving the OSDs around.)
Is there anything I can change to make it a bit more stable?
I've already set
fs.aio-max-nr = 1048576
kernel.pid_max = 4194303
fs.file-max = 500000
which made it a bit better, but I feel it could be even better.
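For what it's worth, one common way to make those sysctl settings survive a reboot is a drop-in file under /etc/sysctl.d; the `90-ceph.conf` filename here is just an assumption, any name works:

```shell
# Persist the three sysctls from the message above
cat <<'EOF' > /etc/sysctl.d/90-ceph.conf
fs.aio-max-nr = 1048576
kernel.pid_max = 4194303
fs.file-max = 500000
EOF

# Reload all sysctl configuration files and apply the values now
sysctl --system
```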
I'm currently trying to upgrade to 15.2.9 from the default cephadm version of Octopus. The upgrade is going very, very slowly. I'm currently using podman, if that helps; I'm not sure if docker would be better? (I've mainly used singularity when I've handled containers before.)
Thanks in advance
Peter Childs
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io