I have also found the below errors in dmesg:
[332884.028810] systemd-journald[6240]: Failed to parse kernel command line, ignoring: Cannot allocate memory
[332885.054147] systemd-journald[6240]: Out of memory.
[332894.844765] systemd[1]: systemd-journald.service: Main process exited, code=exited, status=1/FAILURE
[332897.199736] systemd[1]: systemd-journald.service: Failed with result 'exit-code'.
[332906.503076] systemd[1]: Failed to start Journal Service.
[332937.909198] systemd[1]: ceph-crash.service: Main process exited, code=exited, status=1/FAILURE
[332939.308341] systemd[1]: ceph-crash.service: Failed with result 'exit-code'.
[332949.545907] systemd[1]: systemd-journald.service: Service has no hold-off time, scheduling restart.
[332949.546631] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 7.
[332949.546781] systemd[1]: Stopped Journal Service.
[332949.566402] systemd[1]: Starting Journal Service...
[332950.190332] systemd[1]: ceph-osd@1.service: Main process exited, code=killed, status=6/ABRT
[332950.190477] systemd[1]: ceph-osd@1.service: Failed with result 'signal'.
[332950.842297] systemd-journald[6249]: File /var/log/journal/8f2559099bf54865adc95e5340d04447/system.journal corrupted or uncleanly shut down, renaming and replacing.
[332951.019531] systemd[1]: Started Journal Service.
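
For reference, these are the checks I intend to run on the affected node to confirm the memory exhaustion (the OSD id in the last command is only an example from my setup):

free -h
dmesg | grep -i "out of memory"
ps -eo pid,rss,comm --sort=-rss | head    # largest resident processes first
sudo ceph daemon osd.1 dump_mempools      # per-daemon memory pools, if the OSD still answers on its admin socket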
On Tue, Sep 10, 2019 at 3:04 PM Amudhan P <amudhan83(a)gmail.com> wrote:
Hi,
I am using Ceph version 13.2.6 (Mimic) on a test setup, trying out CephFS.
My current setup:
3 nodes; one node contains two bricks and the other two nodes contain a single brick each.
The volume is 3-replica, and I am trying to simulate a node failure.
I powered down one host, and on the other systems I started getting the message
"-bash: fork: Cannot allocate memory" when running any command, and the systems stopped
responding to commands.
What could be the reason for this?
At this stage, I am able to read some of the data stored in the volume, while some reads
are just waiting on IO.
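
For the reads that hang, I am checking which OSDs are blocking the requests with the standard commands (nothing here is specific to my setup):

sudo ceph health detail    # lists slow/blocked requests and the OSDs involved
sudo ceph osd blocked-by   # shows which OSDs other OSDs are waiting on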
Output from "sudo ceph -s":
  cluster:
    id:     7c138e13-7b98-4309-b591-d4091a1742b4
    health: HEALTH_WARN
            1 osds down
            2 hosts (3 osds) down
            Degraded data redundancy: 5313488/7970232 objects degraded (66.667%), 64 pgs degraded

  services:
    mon: 1 daemons, quorum mon01
    mgr: mon01(active)
    mds: cephfs-tst-1/1/1 up {0=mon01=up:active}
    osd: 4 osds: 1 up, 2 in

  data:
    pools:   2 pools, 64 pgs
    objects: 2.66 M objects, 206 GiB
    usage:   421 GiB used, 3.2 TiB / 3.6 TiB avail
    pgs:     5313488/7970232 objects degraded (66.667%)
             64 active+undersized+degraded

  io:
    client: 79 MiB/s rd, 24 op/s rd, 0 op/s wr
Output from "sudo ceph osd df":
ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
 0   hdd 1.81940        0     0 B     0 B     0 B     0    0   0
 3   hdd 1.81940        0     0 B     0 B     0 B     0    0   0
 1   hdd 1.81940  1.00000 1.8 TiB 211 GiB 1.6 TiB 11.34 1.00   0
 2   hdd 1.81940  1.00000 1.8 TiB 210 GiB 1.6 TiB 11.28 1.00  64
                    TOTAL 3.6 TiB 421 GiB 3.2 TiB 11.31
MIN/MAX VAR: 1.00/1.00 STDDEV: 0.03
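
One guess I am looking at, assuming the OSDs are BlueStore: I believe each OSD tries to use around 4 GiB by default (osd_memory_target), which may be more than the node has free during recovery. As an experiment only, the value could be lowered with something like:

sudo ceph config set osd osd_memory_target 2147483648   # ~2 GiB per OSD, example value only
sudo ceph config dump | grep osd_memory_target          # confirm the setting was stored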
Regards,
Amudhan