VMware vSphere High Availability (HA) is a super cool feature. It is different from vMotion. With vMotion, a running guest is migrated from one ESXi host to another. The condition for vMotion is that both hosts must be powered on. If an ESXi host fails while a production virtual machine is running on it, we won't be able to migrate that VM to another host. This is where vSphere HA comes into play. vSphere HA is a business continuity feature: if a host that is configured for HA fails, the selected virtual machines are started automatically on another ESXi host.
So, for example, we have a DNS server running on ESXi-HOST1, but we want to upgrade the hardware of ESXi-HOST1, so we migrate (vMotion) the DNS server to ESXi-HOST2. This is a manual process, but it's very quick and gives zero downtime. The limitation is that both hosts need to be operational. High availability, as its name suggests, is a real high-availability feature. If any host goes down, the machines configured for vSphere HA will automatically be restarted on another host that is configured for vSphere HA. So this is a very neat feature that ensures high uptime and maintains business continuity.
So in a nutshell: vSphere High Availability (HA) restarts the VMs on another host in case of a host failure.
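To make the nutshell version concrete, here is a toy Python model of the failover idea. It is only a sketch under my own assumptions: the `Host` class, the `fail_over` function, the VM names and the "pick the host with the fewest VMs" placement rule are all made up for illustration and are not how vSphere actually chooses a target host.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for an ESXi host in an HA cluster."""
    name: str
    ha_enabled: bool = True
    powered_on: bool = True
    vms: list = field(default_factory=list)  # names of VMs running here


def fail_over(failed, hosts, protected):
    """Restart the protected VMs of a failed host on surviving HA hosts.

    Returns a dict mapping each restarted VM to its new host name.
    """
    failed.powered_on = False
    survivors = [h for h in hosts
                 if h is not failed and h.powered_on and h.ha_enabled]
    placements = {}
    if not failed.ha_enabled or not survivors:
        return placements  # nothing can be restarted
    for vm in list(failed.vms):
        if vm not in protected:
            continue  # only VMs selected for HA are restarted
        # Assumed placement rule: the survivor running the fewest VMs.
        target = min(survivors, key=lambda h: len(h.vms))
        failed.vms.remove(vm)
        target.vms.append(vm)
        placements[vm] = target.name
    return placements


host1 = Host("ESXi-HOST1", vms=["dns01", "test01"])
host2 = Host("ESXi-HOST2")
print(fail_over(host1, [host1, host2], protected={"dns01"}))
# prints {'dns01': 'ESXi-HOST2'}
```

Note that the unprotected VM (`test01`) stays down with the failed host, which matches the idea that only the machines configured for HA get restarted.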
vSphere High Availability (HA) – My VCP 5 Notes :-
- It also protects against VM-level and application-level failures.
- Previously vSphere HA used AAM (Automated Availability Manager), but it now uses FDM (Fault Domain Manager).
FDM uses :-
Management network and storage devices for communication.
It also supports IPv6.
It also copes well with network partitions and network isolation.
The vSphere HA agent runs on each ESXi host. This agent is separate from the vpxa agent, which handles communication with vCenter Server. The directory location for the FDM is
and configuration files are stored at
As mentioned previously, VMware vSphere HA works in master/slave mode. An election takes place to choose the master, and the remaining hosts become slaves (it's a deep-dive topic for HA). The master performs the following roles:-
– Monitors the slave hosts and restarts VMs if a slave host fails.
– Monitors the power state of all protected VMs.
– Manages the list of hosts that are members of the cluster and manages the process of adding and removing hosts.
– Manages the list of protected VMs. The list is updated after each power on/off. The requests are made from vCenter Server.
– Caches the cluster configuration.
– Sends heartbeats to the slaves.
– Reports state information to vCenter Server.
The Role of Slaves
– The slaves watch the state of the VMs running locally on their own host.
– They monitor the health of the master and participate in the election of a new master if the master fails.
– They run VM health monitoring.
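The "monitor the health of the master" part can be sketched as a simple heartbeat timeout check on the slave side. This is a minimal sketch, not FDM's real logic: the `MasterMonitor` class and the 5 second timeout are my own hypothetical choices, and what happens after the master is declared dead is not modeled.

```python
class MasterMonitor:
    """Hypothetical slave-side tracker for heartbeats sent by the master."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s  # assumed value, not a real FDM setting
        self.last_seen = None       # time of the last heartbeat, in seconds

    def heartbeat(self, now):
        """Record a heartbeat received from the master at time `now`."""
        self.last_seen = now

    def master_alive(self, now):
        """True if a heartbeat arrived within the timeout window."""
        if self.last_seen is None:
            return False
        return now - self.last_seen <= self.timeout_s


monitor = MasterMonitor(timeout_s=5.0)
monitor.heartbeat(now=10.0)
print(monitor.master_alive(now=14.0))  # prints True
print(monitor.master_alive(now=16.0))  # prints False
```

Timestamps are passed in explicitly instead of reading the clock, which keeps the example deterministic and easy to test.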
Role of vCenter in VMware vSphere High Availability (HA)
– It scans for a master when a host is isolated or partitioned, or when a master reports that it cannot reach a slave.
So how does vSphere HA avoid false triggers during network isolation?
– Network isolation, or a failure of the management network, could stop the heartbeats from reaching the master. On its own, that would cause an unnecessary restart of the VMs on another host. So in that case HA uses the datastore heartbeat. The slave creates a special binary file called HOSTmax-poweron to tell the master that it is still powered on and that there is no need to restart the VMs on another host.
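The decision described above boils down to combining two signals: the network heartbeat and the datastore heartbeat. Here is a small sketch of that reasoning in Python. The names `HostState`, `classify_host` and `should_restart_vms` are my own, and the model is deliberately simplified: it follows these notes and ignores things like the configured isolation response.

```python
from enum import Enum


class HostState(Enum):
    HEALTHY = "healthy"
    ISOLATED_OR_PARTITIONED = "isolated or partitioned"
    FAILED = "failed"


def classify_host(network_heartbeat, datastore_heartbeat):
    """Classify a slave from the master's point of view.

    network_heartbeat: heartbeats arrive over the management network.
    datastore_heartbeat: the slave's power-on file is still being
    updated on the shared datastore.
    """
    if network_heartbeat:
        return HostState.HEALTHY
    if datastore_heartbeat:
        # Network is gone but the host is clearly still running.
        return HostState.ISOLATED_OR_PARTITIONED
    return HostState.FAILED


def should_restart_vms(state):
    """Only a host that is really down triggers a restart elsewhere."""
    return state is HostState.FAILED


for net, ds in [(True, True), (False, True), (False, False)]:
    state = classify_host(net, ds)
    print(net, ds, state.value, should_restart_vms(state))
```

The middle case is the interesting one: without the datastore heartbeat, the master could not tell it apart from a real host failure.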
HA also no longer uses DNS, so this is a nice improvement when going from vSphere 4.1 to vSphere 5.
The requirements for vSphere High Availability (HA) :-
It has the same requirements as vMotion. Make sure you have shared storage and an identical network configuration.
If you are using a distributed switch, all the hosts should be connected to the vDS (the distributed switch).
On a quick note, an easy way to verify whether vSphere HA will work is to test vMotion. If vMotion works, then HA will also work.
Another quick note: when doing network maintenance, stop the HA services first, or else the machines will be restarted.
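The shared storage and identical network requirements can also be sketched as a quick pre-check. This is only an illustration on plain Python data: the `ha_precheck` function and the inventory layout (host name mapped to sets of datastore and network names) are hypothetical, and in a real environment you would pull this information from vCenter rather than type it in.

```python
def ha_precheck(hosts):
    """Return a list of problems that would get in the way of HA.

    hosts maps a host name to {"datastores": set, "networks": set}.
    """
    problems = []
    names = sorted(hosts)
    if not names:
        return ["no hosts given"]
    # Requirement 1: at least one datastore visible to every host.
    shared = set.intersection(*(hosts[n]["datastores"] for n in names))
    if not shared:
        problems.append("no datastore is shared by all hosts")
    # Requirement 2: every host has the same set of networks.
    reference = hosts[names[0]]["networks"]
    for name in names[1:]:
        diff = hosts[name]["networks"] ^ reference
        if diff:
            problems.append(f"{name}: network mismatch {sorted(diff)}")
    return problems


inventory = {
    "ESXi-HOST1": {"datastores": {"ds1", "ds2"},
                   "networks": {"Management", "VM Network"}},
    "ESXi-HOST2": {"datastores": {"ds2"},
                   "networks": {"Management"}},
}
print(ha_precheck(inventory))
# prints ["ESXi-HOST2: network mismatch ['VM Network']"]
```

An empty list means both checks passed. It does not replace the real vMotion test, it just shows what that test is implicitly verifying.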