I have known about network partitions for a long time, mostly thanks to the vSphere Clustering Deep Dive, one of the best vSphere books around. But I was having a little trouble figuring out how HA determines whether a machine is isolated or partitioned. Since this is a critical point in determining failure on ToR configurations, I dug into it. The book mentioned above gave a really good background and overview. It also alluded to some answers. Here are the basics:
The ESXi host master (determined by the machine with the most datastores, then by MoID) communicates with each host via the management network. When this fails, it tries to ping the host's management IP. Assuming that you are running 5.0, it then tries to identify whether the host has released its lock on the datastores (the storage heartbeat). If the datastore heartbeat fails, the master assumes the host has failed and restarts its VMs. If the storage heartbeat is successful, one of the following two states applies:
- Host is isolated (all alone: it cannot see any other ESXi hosts and will use its host isolation response after the timeout)
- Host is partitioned (can see other ESXi hosts but not the master)
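The decision flow above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this post, not how the HA agent is actually implemented. The names `HostState`, `classify_host`, and `isolation_response_applies` are made up for the example, and the three booleans collapse what are really separate checks made from different vantage points (the master's heartbeat and ping, the datastore heartbeat, and the host's own view of its peers).

```python
from enum import Enum


class HostState(Enum):
    REACHABLE = "reachable"
    FAILED = "failed"
    ISOLATED = "isolated"
    PARTITIONED = "partitioned"


def classify_host(management_ok: bool,
                  datastore_heartbeat_ok: bool,
                  sees_other_hosts: bool) -> HostState:
    """Classify a host using the checks described in the post (hypothetical helper).

    management_ok: the master still reaches the host over the management
        network (heartbeat or ping succeeded).
    datastore_heartbeat_ok: the host is still heartbeating to the datastores.
    sees_other_hosts: the host can still see at least one other ESXi host.
    """
    if management_ok:
        return HostState.REACHABLE
    if not datastore_heartbeat_ok:
        # No network contact and no storage heartbeat: treated as failed,
        # so its VMs are restarted elsewhere.
        return HostState.FAILED
    if sees_other_hosts:
        # Alive and can see peers, but not the master.
        return HostState.PARTITIONED
    # Alive but cannot see anyone.
    return HostState.ISOLATED


def isolation_response_applies(state: HostState) -> bool:
    # The host isolation response is only used for isolation, not partition.
    return state is HostState.ISOLATED


if __name__ == "__main__":
    state = classify_host(management_ok=False,
                          datastore_heartbeat_ok=True,
                          sees_other_hosts=True)
    print(state.value, isolation_response_applies(state))
```

The ordering of the checks matters: the storage heartbeat is only consulted once management traffic has failed, and the isolated/partitioned split only matters once the host is known to be alive.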
If a host is partitioned, then after the timeout (timing listed in the book above) a master election takes place and a new master is selected inside the partition. The new master will attempt to protect the virtual machines in its new protection domain. This is not always possible, because the original master may still be protecting the virtual machines (holding a lock on them via the file system; the lock is on the .vmx file, not the .vmdk). Any new virtual machine may or may not be protected. Because of these issues, you need to plan your design and host isolation response carefully.
In a partitioned state the host isolation response is not triggered. It is only used for isolations.
Hi,
How does the admission control policy behave with a network partition?
Thanks
This depends on the admission control policy used. Admission control is a construct of vCenter and not ESXi, so if a partition cannot talk to vCenter then no admission control is enforced. If the partition can talk to vCenter and the policy is slot size, the same equation is used, taking into account the sizing of all cluster members (or the manual slot size) before allowing power-on operations. Percent based just reserves a percent on each host, nothing more, so it works the same as in non-partition situations. The key point is that admission control requires vCenter, and it really only stops power-on operations; it does not stop restarts performed by HA.
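To make the percent-based case concrete, here is a rough Python sketch of the check as described above. It is a simplification for illustration only, not the calculation vCenter actually performs. The function `power_on_allowed` and its parameters are hypothetical, and capacity is expressed as whole percentages of a single resource.

```python
def power_on_allowed(vcenter_reachable: bool,
                     used_pct: int,
                     requested_pct: int,
                     reserved_pct: int) -> bool:
    """Hypothetical percent-based admission check.

    used_pct: capacity already consumed, as a percentage.
    requested_pct: capacity the new VM would consume.
    reserved_pct: capacity held back for failover by the policy.
    """
    if not vcenter_reachable:
        # Admission control lives in vCenter; without it nothing is enforced.
        return True
    # Allow the power-on only if the reserved share stays untouched.
    return used_pct + requested_pct <= 100 - reserved_pct


if __name__ == "__main__":
    # 25% reserved, 70% in use, VM needs 10%: blocked while vCenter is reachable.
    print(power_on_allowed(True, 70, 10, 25))
    # The same request from a partition that cannot reach vCenter is not checked.
    print(power_on_allowed(False, 70, 10, 25))
```

Note that this only models manual power-on requests. Restarts that HA itself performs after a failure are not subject to this check.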