Deep Dive: vSphere Network Load Balancing

Load balancing is a hot topic in vSphere. As the load per physical host increases, so does the need for more bandwidth. Traditionally this was done with etherchannel or LACP, which bond multiple links together so they look and act like a single link. This also helps avoid loops.

What the heck is a loop?

A loop occurs any time two layer 2 (Ethernet) endpoints have multiple connections to each other.

 

It is possible to create a bridged loop with two virtual switches if care is not taken, but virtual switches will not create loops by default. On the physical switch side, protocols like spanning tree (STP) were created to solve this problem. STP disables a link if a loop is detected, and if the enabled link goes down, STP turns the disabled link back on. This works for redundancy, but it does nothing if link 1 is not a big enough pipe to handle the full load. VMware provides a number of load balancing algorithms to provide more bandwidth.

Options

  • Route Based on Originating virtual port (Default)
  • Route Based on IP Hash
  • Route Based on Source MAC Hash
  • Route Based on Physical NIC Load (LBT)
  • Use Explicit Failover Order

 

To explain each of these options, assume we have an ESXi host with two physical network cards called nic1 and nic2. It’s important to understand that the load balancing options can be configured at the virtual switch or port group level, allowing several different load balancing policies on the same server.

Route Based on Originating virtual port (Default)

The physical nic to be used is determined by the ID of the virtual port to which the VM is connected. Each virtual machine is connected to a virtual switch, which has a number of virtual ports, and each port has a number. This number is the virtual port ID. Once assigned, the port does not change unless the virtual machine moves to a different ESXi host. I don’t know the exact method used, but I assume it’s something as simple as odds and evens for two nics: everything odd goes to uplink 1 while everything even goes to uplink 0. This method has the lowest virtual switch processing overhead and works with any network configuration. It does not require any special physical switch configuration. As you can see, though, it does not really load balance. Let’s assume you have a lot of port groups, each with only one virtual machine on port 0. In this case all virtual machines would use the same uplink, leaving the other unused.
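The odds-and-evens guess above can be sketched in a few lines. This is only an illustration of that guess, not VMware's actual algorithm: the modulo rule and the function name `uplink_for_port` are assumptions made for this example.

```python
def uplink_for_port(port_id: int, uplinks: list[str]) -> str:
    """Pick an uplink from the virtual port ID.

    ASSUMPTION: a plain modulo rule, matching the odds-and-evens guess
    in the text. The real selection logic is not documented here.
    """
    return uplinks[port_id % len(uplinks)]


uplinks = ["nic1", "nic2"]

# Consecutive port IDs alternate between the two uplinks.
for port_id in range(6):
    print(port_id, uplink_for_port(port_id, uplinks))

# The worst case from the text: every virtual machine sits on port 0,
# so every one of them lands on the same uplink.
all_on_port_zero = [uplink_for_port(0, uplinks) for _ in range(4)]
print(all_on_port_zero)
```

Under this assumption the distribution depends only on which port IDs happen to be in use, not on how much traffic each virtual machine generates.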

Route Based on IP Hash

The physical nic to be used is determined by a hash of the source and destination IP addresses. This method provides load balancing across multiple physical network cards from a single virtual machine. It’s the only method that allows a single virtual machine to use the bandwidth of multiple physical nics. It has one major drawback: the physical switches must be configured to use etherchannel (802.3ad link aggregation) so they present both network links as a single link to avoid problems. This is a major design choice. It also does not provide perfect load balancing. Let’s assume that you have an application server that does 80% of its traffic with a database server. Their communication will always happen across the same link, because their hash will always assign them the same link. They will never use the bandwidth of two links. In addition, this method uses a lot of CPU.

  • When using etherchannel only a single switch may be used
  • Beacon probing is not supported on IP Hash
  • vDS is required for LACP
  • Troubleshooting is difficult because each destination/source combination may take a different path.  (Some virtual machine paths may work while others will not, in an inconsistent pattern.)
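To see why one source/destination pair always stays on one link, here is a minimal sketch of an IP hash. The exact hash is not given in this article, so the XOR-then-modulo rule below is a simplified model chosen for illustration, and `uplink_for_flow` and the IP addresses are made up for the example.

```python
import ipaddress


def uplink_for_flow(src: str, dst: str, uplinks: list[str]) -> str:
    """Pick an uplink from a hash of source and destination IPv4 addresses.

    ASSUMPTION: the hash is modelled as (src XOR dst) modulo the number
    of uplinks. This is a simplification for illustration only.
    """
    src_int = int(ipaddress.IPv4Address(src))
    dst_int = int(ipaddress.IPv4Address(dst))
    return uplinks[(src_int ^ dst_int) % len(uplinks)]


uplinks = ["nic1", "nic2"]
app_server = "10.0.0.10"   # hypothetical application server
db_server = "10.0.0.20"    # hypothetical database server
other_host = "10.0.0.21"   # hypothetical second destination

# The same pair always hashes to the same uplink, however much traffic it carries.
print(uplink_for_flow(app_server, db_server, uplinks))

# A different destination can hash to the other uplink, which is how one
# virtual machine ends up using more than one physical nic.
print(uplink_for_flow(app_server, other_host, uplinks))
```

The choice is a pure function of the two addresses, so the 80% of traffic between the application and database servers can never be spread across both links.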

Route Based on Source MAC Hash

The physical nic to be used is determined by a hash created from the virtual machine’s source MAC address. This method provides a more balanced approach to load balancing than originating virtual port. Each virtual machine will always use only a single link, but load will be distributed. This method has a low CPU overhead and does not require any physical switch configuration.
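A rough sketch of the idea follows. The real hash function is not described in this article, so treating the MAC address as an integer and taking it modulo the number of uplinks is an assumption, and `uplink_for_mac` and the MAC addresses are hypothetical.

```python
def uplink_for_mac(mac: str, uplinks: list[str]) -> str:
    """Pick an uplink from the virtual machine's source MAC address.

    ASSUMPTION: the hash is modelled as the MAC, read as a 48-bit
    integer, modulo the number of uplinks.
    """
    mac_int = int(mac.replace(":", ""), 16)
    return uplinks[mac_int % len(uplinks)]


uplinks = ["nic1", "nic2"]

# Two hypothetical virtual machines with adjacent MAC addresses.
print(uplink_for_mac("00:50:56:aa:bb:01", uplinks))
print(uplink_for_mac("00:50:56:aa:bb:02", uplinks))
```

Because the MAC address does not change, each virtual machine stays on one link, while different virtual machines spread across the available links.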

Route Based on Physical NIC Load (LBT, Distributed Virtual Switch Required)

The physical nic to be used is determined by load. The nics are used in order (nic1 then nic2). No traffic is moved to nic2 until nic1 is utilized above 75% of capacity for 30 seconds. Once that happens, traffic flows are moved to the next available nic, where they stay until another LBT event moves them. LBT requires the dVS and adds some CPU overhead. It does not allow a single virtual machine to use more than 100% of a single link’s speed, but it does balance traffic among all links during times of contention.
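The 75% for 30 seconds rule can be modelled as a simple check over recent utilization samples. This is a toy model, not how the dVS is implemented: the 5-second sampling interval, the function names, and the choice to move exactly one flow to the least-loaded nic are all assumptions for illustration.

```python
THRESHOLD = 0.75      # fraction of link capacity, from the text
WINDOW_SECONDS = 30   # how long the threshold must be exceeded, from the text


def should_rebalance(samples: list[float], sample_interval: int = 5) -> bool:
    """True if every sample in the last WINDOW_SECONDS exceeded THRESHOLD.

    ASSUMPTION: utilization is sampled every `sample_interval` seconds.
    """
    needed = WINDOW_SECONDS // sample_interval
    recent = samples[-needed:]
    return len(recent) == needed and all(u > THRESHOLD for u in recent)


def rebalance(flows: dict[str, str], load: dict[str, float], busy_nic: str) -> None:
    """Move one flow off busy_nic to the least-loaded other nic (in place)."""
    target = min((nic for nic in load if nic != busy_nic), key=load.get)
    for vm, nic in flows.items():
        if nic == busy_nic:
            flows[vm] = target
            return


# Three hypothetical virtual machines, all initially on nic1.
flows = {"vm1": "nic1", "vm2": "nic1", "vm3": "nic1"}
load = {"nic1": 0.90, "nic2": 0.00}

# Six 5-second samples, all above 75%, cover the full 30-second window.
nic1_samples = [0.80, 0.82, 0.90, 0.85, 0.88, 0.90]

if should_rebalance(nic1_samples):
    rebalance(flows, load, "nic1")

print(flows)
```

A single dip below the threshold inside the window resets the condition in this model, which matches the idea that LBT only reacts to sustained contention.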

Use Explicit Failover Order

The physical nic to be used is the highest one on the list of available nics. The others will not be used unless the first nic is unavailable. This method does no load balancing and should only be used in very special cases (like multi-nic vMotion).
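In code, this policy is just "first nic on the list that is up". The sketch below uses a hypothetical `active_uplink` helper and a made-up link-state dictionary.

```python
from typing import Optional


def active_uplink(failover_order: list[str], link_up: dict[str, bool]) -> Optional[str]:
    """Return the first nic in the failover order whose link is up."""
    for nic in failover_order:
        if link_up.get(nic, False):
            return nic
    return None  # no uplink is available


order = ["nic1", "nic2"]

# Normal operation: nic1 carries everything, nic2 sits idle.
print(active_uplink(order, {"nic1": True, "nic2": True}))

# nic1 fails: traffic moves to nic2.
print(active_uplink(order, {"nic1": False, "nic2": True}))
```

Nothing in this selection looks at load, which is why the policy provides redundancy but no balancing.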

 

Design Advice

Which one should you use? It depends on your needs. Recently a friend told me they never changed the default because they never get close to saturating a single link. This approach has merit, and I wish more people understood their network metrics, but you may need to plan for the future. There are two questions I use to determine which to use:

  • Do you have any virtual machines that alone require more than a single link’s bandwidth? (If yes, then the only option is IP Hash with LACP or etherchannel)
  • Do you have vDSs? (If yes, then use Route Based on Physical NIC Load; if no, then use the default or Source MAC Hash)

Simply put, LBT is a lot more manageable and easier to configure.
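The two questions above amount to a small decision function. The function name and return strings below are made up for illustration; the logic simply restates the two questions in the order they are asked.

```python
def recommend_policy(vm_needs_more_than_one_link: bool, has_vds: bool) -> str:
    """Restate the two design questions as a decision function."""
    if vm_needs_more_than_one_link:
        # Only IP Hash lets a single virtual machine use several nics.
        return "Route Based on IP Hash (with LACP or etherchannel)"
    if has_vds:
        return "Route Based on Physical NIC Load (LBT)"
    return "Route Based on Originating virtual port (default) or Source MAC Hash"


print(recommend_policy(vm_needs_more_than_one_link=False, has_vds=True))
print(recommend_policy(vm_needs_more_than_one_link=False, has_vds=False))
```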

2 Replies to “Deep Dive: vSphere Network Load Balancing”

  1. Loops and STP are a complete non-issue in ESXi context. Load balancing is, as you point out, but the first loop picture is completely pointless, as such a thing will never happen with ESXi. The ESXi networking stack will never accept a packet from the outside and send it back to the outside. Hence, no loops, ever.

    1. Bert,

      Thanks for the comment. You are correct. The picture shown was originally to be used for the IP Hash discussion. I have cleaned up the article a little. Thanks for spotting my mistaken wording and correcting me.

      Joseph
