ESXi 4.0 host configures HA but displays a health check error after upgrading to vCenter Server 4.1
Before the upgrade to vCenter Server 4.1, HA was configured and working on the ESXi hosts. After the upgrade, you experience these symptoms:
The ESXi host reconfigures successfully for HA, but immediately displays an error.
In the Summary tab, you see the error:
HA agent on <hostname> in cluster <clustername> has an error: error while running health check script
The /var/log/vmware/vpx/vpxa.log file on the host contains messages similar to:
cmd=monitornodes -domain=vmware failed with error 3
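To confirm the symptom on a copy of vpxa.log (for example, one taken from a support bundle), a short script can search for the message. The following is a minimal sketch, not a VMware tool: the function name find_ha_health_errors and the regular expression are assumptions based only on the sample message above, and the exact wording, domain name, and error code may differ on your hosts.

```python
import re
import sys

# Assumed pattern, loosened from the sample message in this article.
# The domain name and the error code may differ on other hosts.
HA_HEALTH_ERROR = re.compile(r"cmd=monitornodes\b.*failed with error (\d+)")


def find_ha_health_errors(lines):
    """Return (line_number, error_code, text) for each matching log line."""
    matches = []
    for number, line in enumerate(lines, start=1):
        found = HA_HEALTH_ERROR.search(line)
        if found:
            matches.append((number, int(found.group(1)), line.strip()))
    return matches


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # The path is supplied by the caller, for example a local copy of vpxa.log.
        with open(sys.argv[1], errors="replace") as log:
            for number, code, text in find_ha_health_errors(log):
                print(f"line {number}: error {code}: {text}")
    else:
        print("usage: pass the path to a copy of vpxa.log")
```

If the script reports no matches, the host may still be affected. Check the Summary tab for the health check error as well.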
This issue occurs if the HA agents on the ESXi hosts are not upgraded properly. On hosts that already show the problem, the agents must be replaced with correct ones.
This issue is resolved in vSphere 4.1 Update 1 and vSphere 4.0 Update 3. For information on updating your vCenter Server and ESXi host to vSphere 4.1 Update 1, see Upgrading vCenter Server, Update Manager and ESX/ESXi to vSphere 4.1 Update 1 (1034497).
To work around this issue, replace the agents using one of these options:
Re-install HA agents via vCenter Server:
Put the affected host into maintenance mode.
Remove the host from the vCenter Server inventory.
Without rebooting the host, add the host back into an HA cluster within vCenter Server.
Exit maintenance mode.
Re-install HA agents manually on the host:
In the vSphere Client connected to vCenter Server, right-click the host and choose Disconnect.
Log in to the ESXi host using Tech Support Mode. For more information, see Tech Support Mode for Emergency Support (1003677).
Run these commands to uninstall the vCenter and HA agents from the ESXi host: