Network IO Control failing to shape traffic with multi-nic vMotion

I have always used Network IO Control (NIOC) to shape my traffic on a virtual switch (Enterprise Plus required). It does a great job of balancing traffic when contention comes into play. Unfortunately, it cannot shape traffic as it comes into the virtual switch. It can only shape traffic going out. A new friend (@VMPrime) pointed this out to me at the perfect time. He was knowledgeable and encouraging, an all-around great guy. He pointed out that NIOC only has an effect on traffic flows exiting the host. Once traffic arrives at another host, the sending host's NIOC has no effect on it. I remember reading about this, but the terminology was a little fuzzy from a VMware perspective.

Joe provided a simple scenario where that lack of control could be a problem. Assume that we have a two-host cluster with each host running two 10 Gb NICs. We have VLANs for management and virtual machines, and we have set up multi-NIC vMotion as shown in diagram 1 below. We have NIOC set up with shares to protect each traffic type during contention. Assume that the network utilization of host A is 2 Gb, while the network utilization of host B is 15 Gb.

Assuming that host B has capacity for all of host A's virtual workloads, I put host A into maintenance mode. Host A now uses up to 18 Gb of network to transfer the running state of its virtual machines to host B. Host A's NIOC kicks in, preserving 2 Gb for virtual machines and allocating 18 Gb to vMotion toward host B. We are now shoving 18 Gb into host B, whose virtual machines need 15 Gb, over only 20 Gb of links. Now both sides are contending for bandwidth. We might have availability issues on host B, and the vMotion itself might fail.
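The arithmetic behind this scenario is easy to sketch. The snippet below is a minimal illustration in Python, not anything NIOC itself runs. The function name `ingress_oversubscription` and its parameters are made up for this post, and the numbers are the ones from the example above, in Gb.

```python
def ingress_oversubscription(nic_count, nic_speed_gb, local_vm_gb, incoming_vmotion_gb):
    """Return how many Gb of demand exceed the receiving host's link capacity (0 if none)."""
    capacity = nic_count * nic_speed_gb          # total link capacity of the receiving host
    demand = local_vm_gb + incoming_vmotion_gb   # traffic its own VMs need plus arriving vMotion
    return max(0, demand - capacity)


# Host B: two 10 Gb NICs, 15 Gb of VM traffic, 18 Gb of vMotion arriving from host A
excess = ingress_oversubscription(2, 10, 15, 18)
print(excess)  # 13
```

Host B is being asked to carry 13 Gb more than its links can handle, and nothing on host B's side of NIOC can push back on the incoming vMotion.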

How do we solve this issue?

This is exactly why we have network limits. Unlike CPU and memory limits, NIOC limits can really help with this exact issue. Putting a limit on vMotion of, for example, 2.5 Gb per link means vMotion could never use more than 5 Gb per host. Will the vMotion still have an effect on the receiving host? Maybe. It's a cost-benefit analysis. You have to weigh your options, and you might have to adjust your limit lower.
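As a rough way to pick a starting value, you can work backward from the busiest host in the cluster. The sketch below is only an illustration under some loud assumptions: the helper `max_vmotion_limit_per_link` is hypothetical, it assumes the receiving host's virtual machine traffic stays flat, and it assumes vMotion is the only other traffic arriving at that host.

```python
def max_vmotion_limit_per_link(nic_count, nic_speed_gb, receiver_vm_gb):
    """Largest per-link vMotion limit (Gb) that keeps the receiving host within line rate."""
    headroom = nic_count * nic_speed_gb - receiver_vm_gb  # capacity left after VM traffic
    return max(0, headroom) / nic_count                   # spread evenly across vMotion links


# Two 10 Gb NICs per host, busiest host needs 15 Gb for its virtual machines
print(max_vmotion_limit_per_link(2, 10, 15))  # 2.5
```

With the numbers from the scenario this lands on 2.5 Gb per link, which leaves the receiving host running right at line rate with no slack. That is why you may end up adjusting the limit lower.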



