Lost heartbeat errors can have many different root causes.
In ESX 3.5 Update 2 and later, memory leaks have been reported in Pegasus (cimserver). Over time, these leaks cause the cimserver process to consume the available memory and swap space. Some memory leaks might appear only on specific hardware models or might depend on how frequently the CIM server is queried. When memory and swap are exhausted, the service console fails, and ESX fails with a Lost Heartbeat error message.
Note: This issue affects ESX 3.5 Update 2, Update 3, Update 4, and Update 5. Other versions of ESX and ESXi are not affected. ESX 4.0 and ESXi 4.0 are not affected because the Pegasus component is not used.
To work around this issue, periodically restart the Pegasus service so that the excess memory it holds is freed.
To schedule a daily service restart at midnight:
Log in to the ESX host as root at the console or via SSH. For more information, see Unable to connect to an ESX host using Secure Shell (SSH) (1003807).
At the root shell prompt, run the following command to edit the root crontab:
crontab -e
Note: This opens a vi editor.
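As a minimal sketch of what the scheduled entry might look like, the snippet below builds a crontab line that runs at midnight every day and prints it so it can be pasted into the editor. The service name pegasus and the /sbin/service path are assumptions, not taken from this article; confirm them on the host before saving the entry. The snippet does not modify the crontab itself.

```shell
# Hypothetical sketch: the service name "pegasus" and the /sbin/service path
# are assumptions; verify both on the ESX host before using this entry.

# Crontab field order: minute hour day-of-month month day-of-week command
# "0 0 * * *" means minute 0 of hour 0 (midnight), every day.
CRON_ENTRY='0 0 * * * /sbin/service pegasus restart'

# Print the entry so it can be pasted into the editor opened by `crontab -e`.
echo "$CRON_ENTRY"
```

After saving the file and exiting the editor, running crontab -l as root lists the installed entries, which is a quick way to confirm that the new line was accepted.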