Virtual Machine Restart or VMM Panic "CoreDump error line 2160, error Cannot allocate memory"

vSphere HA can restart virtual machines for several reasons.

Have you ever seen HA restart a VM when all of the following were true:

No host failure and no HA heartbeat failure.
No VM monitoring configured under HA virtual machine options.
Only one VM on a host (part of an HA cluster with 5 hosts) was restarted, while the rest of the VMs stayed alive and healthy.
No host isolation occurred.

Event message in vCenter:

“vSphere HA restarted this virtual machine”. No related events were found.

Analyzing fdm.log on the host did not provide enough information about why this specific VM was restarted.

Analyzing vmkernel.log turned up the following:

2016-12-23T13:53:35.921Z cpu25:46625)UserDump: 1907: Dumping cartel 46625(from world 46625) to file /vmfs/volumes/56692f88-ae779548-ad65-0025b501a007/vmname/vmx-zdump.000 ...
2016-12-23T13:53:45.615Z cpu1:46625)UserDump: 2031: Userworld coredump complete.
2016-12-23T13:53:45.626Z cpu6:46625)WARNING: World: vm 46625: 3973: VMMWorld group leader = 46626, members = 4

According to these logs, a core dump (vmx-zdump.000) of the VMX process was written to the VM's directory on the datastore, which points to high resource usage as the trigger for the core dump.
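To see how long the VM was stalled while the dump was written, the two UserDump lines can be paired by world ID and their timestamps subtracted. The following is a minimal Python sketch, assuming the vmkernel.log line format matches the excerpt above; the function names are hypothetical and not part of any VMware tool.

```python
import re
from datetime import datetime

# Patterns based only on the vmkernel.log excerpt shown above (an assumption
# about the format, not an official specification).
START_RE = re.compile(r"^(\S+) cpu\d+:(\d+)\)UserDump: \d+: Dumping cartel \d+")
DONE_RE = re.compile(r"^(\S+) cpu\d+:(\d+)\)UserDump: \d+: Userworld coredump complete")


def parse_ts(ts):
    """Parse a timestamp such as 2016-12-23T13:53:35.921Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def coredump_durations(log_text):
    """Return {world_id: seconds} for each dump with a start and a completion line."""
    started = {}
    durations = {}
    for line in log_text.splitlines():
        line = line.strip()
        m = START_RE.match(line)
        if m:
            started[m.group(2)] = parse_ts(m.group(1))
            continue
        m = DONE_RE.match(line)
        if m and m.group(2) in started:
            delta = parse_ts(m.group(1)) - started.pop(m.group(2))
            durations[m.group(2)] = delta.total_seconds()
    return durations
```

For the excerpt above, world 46625 took roughly 9.7 seconds to dump, which matches the gap between the two timestamps.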

Digging further into vmware.log revealed more errors:

2016-12-23T13:53:35.814Z| vmx| I120: VERIFY bora/lib/misc/strutil.c:1079
2016-12-23T13:53:45.615Z| vmx| W110: A core file is available in "/vmfs/volumes/56692f88-ae779548-ad65-0025b501a007/VMNAME/vmx-zdump.000"
2016-12-23T13:53:45.615Z| vmx| W110: Writing monitor corefile "/vmfs/volumes/56692f88-ae779548-ad65-0025b501a007/VMNAME/vmmcores.gz"
2016-12-23T13:53:45.620Z| vmx| W110: CoreDump error line 2160, error Cannot allocate memory
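When a VM has many rotated vmware.log files, it can help to filter out just the core-dump lines. Here is a small Python sketch that does this, assuming the `timestamp| thread| level: message` layout seen in the excerpt above; the marker strings are copied from that excerpt and the function name is hypothetical.

```python
import re

# Marker strings taken from the vmware.log excerpt above.
MARKERS = (
    "A core file is available in",
    "Writing monitor corefile",
    "CoreDump error line",
)

# Matches lines such as: 2016-12-23T13:53:45.620Z| vmx| W110: <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\|\s*(?P<thread>[^|]+)\|\s*(?P<level>\w+):\s*(?P<msg>.*)$"
)


def find_coredump_events(log_text):
    """Return (timestamp, message) tuples for lines that mention a core dump."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and any(marker in m.group("msg") for marker in MARKERS):
            events.append((m.group("ts"), m.group("msg")))
    return events
```

Reading the file with `open(path, errors="replace").read()` and passing the text to `find_coredump_events` is one way to use it; the path to the log is whatever your VM's directory happens to be.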

There are two specific reasons this can happen.

Reason 1:

The virtual machine is running the SAP host agent, which disrupts VMX memory handling and causes the guest OS to hang or the VM to fail with a core dump.

This happens because the SAP host agent tries to retrieve metrics from the ESXi host through the virtual machine, which causes a memory leak on the ESXi host and eventually exhausts the available memory.

Affected products: ESXi 5.5 U3b, ESXi 6.0, ESXi 6.0 U1, and ESXi 6.0 U1a

This issue is fixed in ESXi 6.0 U1b.

Resolution:

As a temporary workaround, vMotion the virtual machine to another host. For a permanent fix, upgrade ESXi to 6.0 U1b.

Reference:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2137310

Reason 2:

This reason applies when you additionally see the errors below in the ESXi vobd.log file.

"2016-12-23T13:53:45.615Z: [UserWorldCorrelator] 2379432939456us: [vob.uw.core.dumped] /bin/vmx(46625) /vmfs/volumes/56692f88-ae779548-ad65-0025b501a007/vmname/vmx-zdump.000
2016-12-23T13:53:45.615Z: [UserWorldCorrelator] 2379441994799us: [esx.problem.application.core.dumped] An application (/bin/vmx) running on ESXi host has crashed (1 time(s) so far). A core file may have been created at /vmfs/volumes/56692f88-ae779548-ad65-0025b501a007/vmname/vmx-zdump.000."
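To check whether the same application has crashed more than once on a host, the `esx.problem.application.core.dumped` entries can be pulled out of vobd.log together with the crash counter. The following is a rough Python sketch, assuming the message wording matches the excerpt above; the function name and the returned dictionary keys are hypothetical.

```python
import re

# Pattern built from the vobd.log message shown above (an assumption about
# the wording, not an official format description).
CRASH_RE = re.compile(
    r"\[esx\.problem\.application\.core\.dumped\] "
    r"An application \((?P<app>[^)]+)\) running on ESXi host has crashed "
    r"\((?P<count>\d+) time\(s\) so far\)\. "
    r"A core file may have been created at (?P<core>\S+)"
)


def find_app_crashes(log_text):
    """Return one dict per crash entry: application, crash count, core path."""
    crashes = []
    for line in log_text.splitlines():
        m = CRASH_RE.search(line)
        if m:
            crashes.append({
                "app": m.group("app"),
                "count": int(m.group("count")),
                # Drop the sentence-ending period and any closing quote.
                "core": m.group("core").rstrip('."'),
            })
    return crashes
```

A count that keeps climbing for `/bin/vmx` on the same host, across different VMs, would be consistent with a host-level problem rather than a single misbehaving guest.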

The errors above indicate a faulty CPU in the ESXi host.

You can follow the KB below to check which CPU core reports an error while the thread is being cloned.

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002771

Resolution:

Check and replace the faulty CPU if needed.
