Thank you for your reply.
Our cluster has been running in production for two years without problems, so we
have not upgraded.
I checked the memory on the host. Very little free memory is left. Could the thread
creation failure be related to this?
In addition to KVM virtual machines, the host runs 22 OSDs.
free -m
              total        used        free      shared  buff/cache   available
Mem:         515420      178212        4323         729      332884      335360
Swap:          8191        8145          46
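
To make these numbers easier to track, here is a small sketch that condenses `free -m` output into two percentages. It assumes the newer `free` layout with an "available" column (as in the output above); `summarize_free` is a name I made up for illustration. Note that a low "free" value on its own is not necessarily a problem, since most of buff/cache can usually be reclaimed; the nearly exhausted swap is probably the more interesting figure.

```shell
#!/bin/sh
# Summarize `free -m` output read from stdin (hypothetical helper).
# Prints available memory and used swap as integer percentages (rounded down).
# Assumes the "available" value is the 7th field of the "Mem:" line.
summarize_free() {
    awk '
        $1 == "Mem:"  { mem_total = $2; mem_avail = $7 }
        $1 == "Swap:" { swap_total = $2; swap_used = $3 }
        END {
            if (mem_total > 0)
                printf "mem_available_pct=%d\n", int(mem_avail * 100 / mem_total)
            if (swap_total > 0)
                printf "swap_used_pct=%d\n", int(swap_used * 100 / swap_total)
        }
    '
}

# Usage on a live host:
#   free -m | summarize_free
```

Fed the output above, this reports about 65% of memory as available and about 99% of swap in use.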
sysctl:
kernel.pid_max=4194303
kernel.threads-max=2097152
vm.max_map_count=524288
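
Since the assert seems to fire when thread creation returns an error, I also want to compare the number of threads on the host with these limits. The following is only a sketch under my own assumptions: the function names are invented for illustration, the host-wide count is taken from the fourth field of /proc/loadavg, and per-user limits (ulimit -u) or the per-process vm.max_map_count are not covered here.

```shell
#!/bin/sh
# Report how close a thread count is to the smallest of several limits.
# usage: thread_headroom CURRENT LIMIT [LIMIT...]
thread_headroom() {
    current=$1
    shift
    min=$1
    for limit in "$@"; do
        if [ "$limit" -lt "$min" ]; then
            min=$limit
        fi
    done
    echo "threads=$current limit=$min used_pct=$((current * 100 / min))"
}

# Total scheduling entities (threads) on the host: the part after the
# slash in field 4 of /proc/loadavg. Linux only.
total_threads() {
    awk '{ split($4, a, "/"); print a[2] }' /proc/loadavg
}

# Sum per-process thread counts given one number per line on stdin.
sum_threads() {
    awk '{ s += $1 } END { print s + 0 }'
}

# Usage on a live host:
#   thread_headroom "$(total_threads)" \
#       "$(sysctl -n kernel.threads-max)" "$(sysctl -n kernel.pid_max)"
#   ps -C ceph-osd -o nlwp= | sum_threads    # threads held by all ceph-osd processes
```

If the total is far below both limits, the per-user and per-process limits are the next things I would check.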
But really, why are you still running Hammer? Later releases handle a large number of
OSDs *much* better.
> On Jun 1, 2020, at 7:08 PM, 展荣臻(信泰) <zhanrzh_xt(a)teamsun.com.cn> wrote:
>
> Hi all,
> We have a Hammer Ceph cluster with 3 monitors and 324 OSDs. The OSD daemons and
> KVM are colocated on the same nodes.
> The cluster has been running for 2 years. Recently we added ~700 OSDs to the
> cluster, using the following process:
> 1. ceph osd create
> 2. mkdir -p /var/lib/ceph/osd/ceph-$osd
> 3. mkfs.xfs -f /dev/$disk
> 4. mount -o inode64,noatime /dev/$disk /var/lib/ceph/osd/ceph-$osd
> 5. ceph-osd -i 0 --mkfs --mkkey
> 6. ceph auth add osd.$osd osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-$osd/keyring
> 7. ceph osd crush create-or-move $osd host=kvm101 root=default
> Maybe we did this too frequently. After adding 122 OSDs, osd.1 through osd.8 failed:
>
> 2020-05-14 16:48:29.881021 7f6727fb9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f6727fb9700 time 2020-05-14 16:48:29.870051
> common/Thread.cc: 129: FAILED assert(ret == 0)
>
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbc8b55]
> 2: (Thread::create(unsigned long)+0x8a) [0xbac50a]
> 3: (Pipe::accept()+0x37fb) [0xca6c3b]
> 4: (Pipe::reader()+0x1a0f) [0xcaa75f]
> 5: (Pipe::Reader::entry()+0xd) [0xcb351d]
> 6: (()+0x7dc5) [0x7f67a45ebdc5]
> 7: (clone()+0x6d) [0x7f67a30cc1cd]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> ulimit -u
> 2061600
> open files 32768
>
>
> Does anyone know what's going on? Why did thread creation fail?
>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io