VMware has lots of great options and features, and filtering through all the best practices combined with legacy knowledge can be a real challenge. I envy people starting with VMware now: they don't carry knowledge of all the things that were broken in 3.5, 4.0, 4.1, and so on. It's been a great journey, but you have to be careful not to let legacy knowledge influence today's designs. In this design I will provide a radically simple solution to networking with VMware.
You have been given a VMware cluster running on HP blades. Each blade has a total of 20Gbps of potential bandwidth that can be divided any way you want. You should make management of this solution easy and provide as much bandwidth as possible to each traffic type. You have the following traffic types:
- Management
- vMotion
- Fault Tolerance
- Virtual machine
- Backup
Your storage is Fibre Channel and not in scope for the network design. Your chassis is connected to two upstream switches that are stacked. You cannot configure the switches beyond assigning VLANs.
This design takes into account the following assumptions:
- EtherChannel and LAG are not desired or available
- You have Enterprise Plus licensing and vCenter
Physical NIC/switch Design:
We want a simple solution with maximum available bandwidth, which means we should use two 10Gb NICs on our blades. The switch-port configuration for each NIC should be identical (the exact same VLANs) and include the VLANs for management, FT, vMotion, backup, and all virtual machines, each with its own VLAN ID for security purposes. This solution provides the following benefits:
- Maximum bandwidth available to all traffic types
- Easy configuration on the switch and NICs (identical configuration)
The one major drawback to this solution: some environments require physical separation of traffic, with each traffic type segregated onto its own NICs.
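The identical-uplink idea above can be expressed as a quick sanity check. The VLAN IDs below are hypothetical examples for illustration, not values from this design:

```python
# Sketch of the per-uplink VLAN layout: both 10Gb NICs trunk the
# exact same VLANs, so either uplink can carry any traffic type if
# the other fails. VLAN IDs are hypothetical examples.
TRAFFIC_VLANS = {
    "management": 10,
    "vmotion": 20,
    "ft": 30,
    "backup": 40,
    "virtual_machines": 50,
}

# Identical switch-port configuration for each NIC.
uplink_vlans = {
    "uplink1": set(TRAFFIC_VLANS.values()),
    "uplink2": set(TRAFFIC_VLANS.values()),
}

assert uplink_vlans["uplink1"] == uplink_vlans["uplink2"]
```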
Virtual Switch Design:
On the virtual switch side we will use a vDS. In the past there have been major concerns with using a vDS for management and vCenter, since a number of chicken-and-egg scenarios come into play. If you still have concerns, make the vCenter port group ephemeral so it does not need vCenter to allocate ports. Otherwise, the vDS brings a lot to the table over standard switches, including:
- Centralized consistent configuration
- Traffic Shaping with NIOC
- Load based teaming
- vDS automatic health check
The first thing to understand about traffic shaping in VMware is that NIOC shares only affect traffic leaving the host (egress) and are enforced independently on each host. We use a numeric value known as a share to enforce traffic shaping, and by default these share values only come into play during times of contention. This ability ensures that nothing uses 100% of a link while other neighbors want access to it. It is a unique and awesome feature that automates traffic policing in VMware solutions. You can read about the default NIOC pools here. I suggest you leave the default pools in place with their default values and then add a custom pool for backup. Shares are assigned a value from 1 to 100. Another design factor is that pools with no active traffic are left out of the share algorithm. For example, assume the following share values:
- Management: 10
- FT: 25
- vMotion: 25
- Virtual machine: 50
You would assume that the total shares would be 10+25+25+50 = 110, but if you are not generating any FT traffic then it's 10+25+50 = 85. Either way, the total bandwidth is divided by this number, so the worst case (100% contention with all traffic types active) gives each pool the following:
- Management (20/110 × 10) ≈ 1.8 Gbps
- FT (20/110 × 25) ≈ 4.5 Gbps
- vMotion (20/110 × 25) ≈ 4.5 Gbps
- Virtual machine (20/110 × 50) ≈ 9.1 Gbps
And remember this is per host. You will want to adjust the default settings to fit your requirements and traffic patterns.
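The share math above can be checked with a short calculation. This is just an illustration of how shares divide bandwidth under contention, not a VMware API:

```python
def pool_bandwidth_gbps(shares, link_gbps, active=None):
    """Worst-case bandwidth per NIOC pool under full contention.

    Pools with no active traffic drop out of the share total, so the
    remaining pools split the bandwidth among themselves.
    """
    active = set(shares) if active is None else set(active)
    total = sum(s for pool, s in shares.items() if pool in active)
    return {pool: link_gbps * shares[pool] / total for pool in active}

# Share values from the example above; 20Gbps total per host.
shares = {"management": 10, "ft": 25, "vmotion": 25, "vm": 50}

# All pools active: bandwidth is divided over 110 shares.
all_active = pool_bandwidth_gbps(shares, 20)

# FT idle: the remaining pools split 20Gbps over 85 shares.
no_ft = pool_bandwidth_gbps(shares, 20, active={"management", "vmotion", "vm"})
```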
This design has some real advantages:
- The vMotion network is seen as 10GbE, which means each host can run 8 concurrent vMotions
- No more wasted bandwidth
- Easy to setup and forget about
Load balancing algorithms in vSphere each have their own personality and physical requirements, but we want simple above everything else, so we choose load-based teaming (LBT), known as "Route Based on Physical NIC Load" on the vDS. This is a great choice for Enterprise Plus customers. It watches the utilization of each uplink; once an uplink passes roughly 75% utilization over its evaluation window, some of the traffic is moved to another link. This configuration works with any number of uplinks without any configuration on the physical switch. We avoid loops because any given virtual NIC uses only one uplink at a time; for example, virtual machine 1 may use uplink1 exclusively while virtual machine 2 uses uplink2. With this load balancing method we don't have to assign different uplink priorities to port groups in order to balance traffic; just let LBT handle it. It is 100% fire and forget. If you find you need more bandwidth, just add more uplinks to the switch and you will be using them.
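The LBT behavior described above can be sketched roughly as follows. This is not VMware's actual algorithm: the threshold constant, evaluation window, and choice of which vNIC to migrate are all simplified assumptions, kept only to illustrate the rebalancing idea:

```python
THRESHOLD = 0.75  # approximate utilization at which the vDS rebalances

def rebalance(pinning, loads_gbps, uplinks, capacity_gbps):
    """Move vNICs off overloaded uplinks onto the least-loaded one.

    pinning: vnic -> uplink it is currently pinned to.
    loads_gbps: vnic -> current traffic in Gbps.
    """
    # Tally current usage per uplink.
    usage = {u: 0.0 for u in uplinks}
    for vnic, up in pinning.items():
        usage[up] += loads_gbps[vnic]

    # If a vNIC sits on an uplink over the threshold, re-pin it to
    # the least-loaded uplink (each vNIC still uses one uplink only).
    for vnic, up in sorted(pinning.items()):
        if usage[up] / capacity_gbps > THRESHOLD:
            target = min(usage, key=usage.get)
            if target != up:
                usage[up] -= loads_gbps[vnic]
                usage[target] += loads_gbps[vnic]
                pinning[vnic] = target
    return pinning

# Two VMs land on uplink1 and push it to 90%; LBT moves one of them.
pinning = rebalance(
    {"vm1": "uplink1", "vm2": "uplink1"},
    {"vm1": 8.0, "vm2": 1.0},
    ["uplink1", "uplink2"],
    10.0,
)
```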
Radically simple networking
It’s simple and it works. Here is a simple diagram of the solution:
Once set up, it scales and provides for all your needs. It's consistent, clean, and designed around possible failures. It allows all traffic types to use as much network as needed unless contention is present. Just think of it as DRS for networking. I just wish I could handle my physical switches this way… maybe some day with NSX.