++ Adding ceph-users-confirm+4555fdc6282a38c849f4d27a40339f1b7e4bde74@ceph.io
++ Adding dev@ceph.io
Thanks & Regards
Arihant Jain
On Mon, 27 Nov, 2023, 7:48 am AJ_ sunny, <jains8550@gmail.com> wrote:
Hi team,
After making the above changes I am still hitting the issue where the
machine continuously shuts down on its own.
In the nova-compute logs the only footprint I am getting is this:
Logs:-

2023-10-16 08:48:10.971 7 WARNING nova.compute.manager
[req-c7b731db-2b61-400e-917f-8645c9984696 f226d81a45dd46488fb2e19515848
316d215042914de190f5f9e1c8466bf0 default default] [instance:
4b04d3f1-1fbd-4b63-b693-a0ef316ecff3] Received unexpected event
network-vif-plugged-f191f6c8-dff5-4c1b-94b3-8d91aa6ff5ac for instance with
vm_state active and task_state None.

2023-10-21 22:42:44.589 7 INFO nova.compute.manager [-] [instance:
4b04d3f1-1fbd-4b63-b693-a0ef316ecff3] VM Stopped (Lifecycle Event)

2023-10-21 22:42:44.683 7 INFO nova.compute.manager
[req-1d99b87b-7ff7-462d-ab18-fbdec6bda71d -] [instance:
4b04d3f1-1fbd-4b63-b693-a0ef316ecff3] During _sync_instance_power_state the
DB power_state (1) does not match the vm_power_state from the hypervisor (4).
Updating power_state in the DB to match the hypervisor.

2023-10-21 22:42:44.811 7 WARNING nova.compute.manager
[req-1d99b87b-7ff7-462d-ab18-fbdec6bda71d ----] [instance:
4b04d3f1-1fbd-4b63-b693-a0ef316ecff3] Instance shutdown by itself. Calling
the stop API. Current vm_state: active, current task_state: None, original
DB power_state: 1, current VM power_state: 4

2023-10-21 22:42:44.977 7 INFO nova.compute.manager
[req-1d99b87b-7ff7-462d-ab18-fbdec6bda71d -] [instance:
4b04d3f1-1fbd-4b63-b693-a0ef316ecff3] Instance is already powered off in the
hypervisor when stop is called.
In this architecture we are using Ceph as the backend storage for Nova,
Glance & Cinder.
When the machine goes down on its own and I try to start it, it goes into
error, i.e. the VM console shows I/O ERROR during boot, so first I need to
rebuild the volume's object map on the Ceph side and then start the machine:

rbd object-map rebuild <volume-id>
openstack server start <server-id>
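A fuller version of that check/rebuild sequence would be something like the
following (the "volumes" pool name is only an example, adjust for your
deployment):

# check whether Ceph has flagged the object map as invalid
rbd info volumes/<volume-id> | grep flags
# verify and then rebuild the object map
rbd object-map check volumes/<volume-id>
rbd object-map rebuild volumes/<volume-id>
# then start the instance again
openstack server start <server-id>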
So this issue is showing two faces: one from the Ceph side and another in
the nova-compute log.
Can someone please help me fix this issue ASAP?
Thanks & Regards
Arihant Jain
On Tue, 24 Oct, 2023, 4:56 pm , <smooney@redhat.com> wrote:
> On Tue, 2023-10-24 at 10:11 +0530, AJ_ sunny wrote:
> > Hi team,
> >
> > The VM is not being shut off by the owner from inside; it automatically
> > goes to shutdown, i.e. the libvirt lifecycle stop event is triggering.
> > In my nova.conf configuration I am using ram_allocation_ratio = 1.5,
> > and previously I tried setting sync_power_state_interval = -1 in
> > nova.conf, but I am still facing the same problem.
> > OOM might be causing this issue.
> > Can you please give me some idea to fix this issue if OOM is the cause?
> the general answer is swap.
>
> nova should always be deployed with swap even if you do not have
> overcommit enabled.
> there are a few reasons for this, the first being that python allocates
> memory differently if any swap is available; even 1G is enough to have it
> not try to commit all memory. so when swap is available the nova/neutron
> agents will use much less resident memory even without using any of the
> swap space.
>
> we have some docs about this downstream:
>
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17…
>
> if you are being ultra conservative we recommend allocating (ram *
> allocation ratio) in swap, so in your case allocate 1.5 times your ram as
> swap. we would expect the actual usage of swap to be a small fraction of
> that however, so we also provide a formula:
>
> overcommit_ratio = NovaRAMAllocationRatio - 1
> Minimum swap size (MB) = (total_RAM * overcommit_ratio) + RHEL_min_swap
> Recommended swap size (MB) = total_RAM * (overcommit_ratio +
> percentage_of_RAM_to_use_for_swap)
>
> so say your host had 64G of ram with an allocation ratio of 1.5 and a
> minimum swap percentage of 25%, the conservative swap recommendation
> would be:
>
> (64 * (0.5 + 0.25)) + distro_min_swap
> (64 * 0.75) + 4G = 52G of recommended swap
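>
> as a quick sanity check of that formula (a throwaway one-liner, with the
> values from this example plugged in):
>
> python3 -c "total=64; ratio=1.5; pct=0.25; print(total * ((ratio - 1) + pct) + 4)"
> # prints 52.0 (G of recommended swap)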
>
> if you're wondering why we add a min swap percentage and a distro min
> swap, it's basically to account for the qemu and host OS memory overhead
> as well as the memory used by the nova/neutron agents and libvirt/ovs.
>
>
> if you're not using memory overcommit my general recommendation is: if
> you have less than 64G of ram allocate 16G, and if you have more than
> 256G of ram allocate 64G, and you should be fine. when you do use memory
> overcommit you must have at least enough swap to account for the qemu
> overhead of all instances + the overcommitted memory.
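>
> if a host currently has no swap at all, a minimal way to add some is a
> plain swap file (sizes here are only an example):
>
> fallocate -l 16G /swapfile
> chmod 600 /swapfile
> mkswap /swapfile
> swapon /swapfile
> # plus a matching /etc/fstab entry to make it persistent across reboots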
>
>
> the other common cause of OOM errors is if you are using numa affinity
> and the guests don't request hw:mem_page_size=<something>. without
> setting a mem_page_size request we don't do numa-aware memory placement.
> the kernel OOM system works on a per numa node basis; numa affinity does
> not support memory overcommit either, so that is likely not your issue.
> i just said i would mention it to cover all bases.
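>
> for reference, requesting an explicit page size on a flavor looks
> something like this (the flavor name is only illustrative):
>
> openstack flavor set m1.numa --property hw:mem_page_size=small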
>
> regards
> sean
>
>
>
> >
> >
> > Thanks & Regards
> > Arihant Jain
> >
> > On Mon, 23 Oct, 2023, 11:29 pm , <smooney@redhat.com> wrote:
> >
> > > On Mon, 2023-10-23 at 13:19 -0400, Jonathan Proulx wrote:
> > > >
> > > > I've seen similar log traces with overcommitted memory when the
> > > > hypervisor runs out of physical memory and the OOM killer gets the
> > > > VM process.
> > > >
> > > > This is an unusual configuration (I think) but if the VM owner
> > > > claims they didn't power down the VM internally you might look at
> > > > the local hypervisor logs to see if the VM process crashed or was
> > > > killed for some other reason.
> > > yep, OOM events are one common cause of this.
> > >
> > > nova is basically just saying "hey, you said this vm should be active
> > > but it's not, i'm going to update the db to reflect reality." you can
> > > turn that off with
> > >
> > > https://docs.openstack.org/nova/latest/configuration/config.html#workaround…
> > > or
> > > https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.sy…
> > >
> > > either disable the sync by setting the interval to -1,
> > > or disable handling of the virt lifecycle events.
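> > >
> > > i.e. something like this in nova.conf (option names per the config
> > > docs linked above):
> > >
> > > [DEFAULT]
> > > sync_power_state_interval = -1
> > >
> > > # or:
> > > [workarounds]
> > > handle_virt_lifecycle_events = False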
> > >
> > > i would recommend the sync_power_state_interval approach, but again
> > > if vms are stopping and you don't know why, you likely should
> > > discover why rather than just turning off the update of the nova db
> > > to reflect the actual state.
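> > >
> > > a quick way to check whether the OOM killer is involved is to grep
> > > the kernel log on the hypervisor, e.g.:
> > >
> > > journalctl -k | grep -i -E "out of memory|oom"
> > > # or: dmesg | grep -i oom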
> > >
> > > >
> > > > -Jon
> > > >
> > > > On Mon, Oct 23, 2023 at 02:02:26PM +0100, smooney@redhat.com wrote:
> > > > :On Mon, 2023-10-23 at 17:45 +0530, AJ_ sunny wrote:
> > > > :> Hi team,
> > > > :>
> > > > :> I am using openstack kolla ansible on the wallaby version and
> > > > :> currently I am facing an issue with a virtual machine: the vm is
> > > > :> shut off by itself, and from the log it seems the libvirt
> > > > :> lifecycle stop event is triggering again and again
> > > > :>
> > > > :> Logs:-
> > > > :> [log excerpt snipped; same nova-compute log quoted at the top of
> > > > :> this thread]
> > > > :
> > > > :that sounds like the guest os shut down the vm,
> > > > :i.e. something in the guest ran sudo poweroff.
> > > > :then nova detected the vm was stopped by the user and updated its
> > > > :db to match that.
> > > > :
> > > > :that is the expected behavior when you have the power sync enabled.
> > > > :it is enabled by default.
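> > > > :
> > > > :a quick way to confirm that from the hypervisor side is the qemu
> > > > :log for the instance (path below is the libvirt default):
> > > > :
> > > > :grep -i shutdown /var/log/libvirt/qemu/<instance-name>.log
> > > > :
> > > > :and inside the guest, "last -x shutdown reboot" or the previous
> > > > :boot's journal (journalctl -b -1) should show who initiated it.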
> > > > :>
> > > > :>
> > > > :> Thanks & Regards
> > > > :> Arihant Jain
> > > > :> +91 8299719369
> > > > :
> > > >
> > >
> > >
>
>