I want to let you in on a little secret of NSX called Traceflow. It was made available in the 6.2 release and I am in love with it. To explain my love, let’s start with a history lesson:
History Lesson (Get off my lawn kids time)
Back in the old days (pretty much right now in every enterprise) you had a bunch of switches, routers and firewalls. When a server was having a problem communicating with another server you had to trace its MAC address through every hop manually. You might be lucky and use a SIEM to identify if a firewall was dropping the traffic. Understanding each hop of the traffic is a pain. It takes time and can be very complex in enterprise implementations.
Enter NSX
NSX does some complex routing, switching and firewalling. In the past, your only visibility into the process came from articles like mine. With Traceflow you can prove your theory and identify data paths. It still has no visibility beyond the NSX world into the physical network. Hopefully some day we will have that too. Traceflow can get you pretty close.
Where is this traceflow of which you speak?
Log in to vCenter, select Networking & Security, and it’s on the right side, most of the way down. It allows you to select a source and a destination, then inject packets. The NSX components report back as the injected packet passes by, allowing you to trace the flow of communication.
Show me some meat
Sounds good. Let’s assume we have two virtual machines, 172.16.0.2 and 172.16.0.3, both on VNI (think VLAN) 5000. They are on the same ESXi host. There are no firewall rules blocking traffic. Here is the output from Traceflow:
Look at that. The injected packet came from 172.16.0.2, hit the vNIC firewall, then was forwarded directly to 172.16.0.3’s vNIC firewall and into the machine. This is simple and exactly what we expect. Let’s do the exact same thing, except move the second machine to another ESXi host:
Now we have added the VTEP (VXLAN tunnel endpoint) connection between the ESXi hosts. VTEP communication is layer 3 between the ESXi hosts, so VNI 5000 is stretched across them whether the hosts are far apart or right next to each other.
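To make the VNI a little less abstract, here is a minimal Python sketch of the 8-byte VXLAN header a VTEP places in front of the original Ethernet frame, following the layout in RFC 7348. The function names are my own for illustration, and this is not how NSX builds the header internally. It only shows where VNI 5000 lives in the encapsulated packet.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header that precedes the inner Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def vni_from_header(header: bytes) -> int:
    """Recover the VNI from the first 8 bytes of a VXLAN header."""
    _, word2 = struct.unpack("!II", header[:8])
    return word2 >> 8


header = vxlan_header(5000)
print(header.hex())             # flags, reserved bits, then the VNI
print(vni_from_header(header))  # the receiving VTEP reads the VNI back out
```

The receiving host uses that VNI to decide which logical switch the inner frame belongs to, which is why the two virtual machines still see each other as layer 2 neighbors.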
Neat meat, but it really only shows layer 2 communication, and that’s easy
How about some routing then. Two virtual machines: 172.16.0.2 on VNI 5000 and 172.16.10.2 on VNI 5001, both on the same ESXi host:
Look at that. Now we see the logical router in the mix, taking the traffic from logical switch LS-172.16.0 and routing it to logical switch LS-172.16.10. Suddenly the flow of traffic is not a mystery.
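If you want to predict whether a logical router should show up in a trace before you run it, the check is just a subnet comparison. Here is a small Python sketch. Note the assumption: I never stated the subnet mask above, so the /24 prefix here is a guess for illustration, and `needs_routing` is a made-up helper name, not anything from NSX.

```python
import ipaddress

PREFIX_LEN = 24  # assumption: the subnet mask is not stated in the article


def needs_routing(src: str, dst: str, prefix_len: int = PREFIX_LEN) -> bool:
    """Return True when dst is outside src's subnet, so a router must be in the path."""
    src_network = ipaddress.ip_interface(f"{src}/{prefix_len}").network
    return ipaddress.ip_address(dst) not in src_network


# Same subnet: plain layer 2 forwarding on the logical switch.
print(needs_routing("172.16.0.2", "172.16.0.3"))
# Different subnets: expect the logical router in the Traceflow output.
print(needs_routing("172.16.0.2", "172.16.10.2"))
```

If the prediction and the trace disagree, that is a good hint that a mask or gateway is misconfigured on one of the machines.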
What about if the firewall is blocking the traffic?
I assumed you would ask so here is a new firewall rule I added:
And the traceflow:
Yep, my packet was dropped, and Traceflow tells me where and which rule number blocked it.
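The idea of "which rule dropped my packet" can be modeled in a few lines of Python. This is a toy sketch, assuming rules are evaluated top-down and the first match wins. The rule IDs, the addresses in the rules, and the `Rule` and `first_match` names are all hypothetical and are not taken from my environment or from the NSX API.

```python
from dataclasses import dataclass
import ipaddress
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    rule_id: int  # hypothetical rule number
    src: str      # source network in CIDR notation
    dst: str      # destination network in CIDR notation
    action: str   # "allow" or "block"


def first_match(rules: List[Rule], src: str, dst: str) -> Optional[Rule]:
    """Walk the rules in order and return the first one covering src and dst."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for rule in rules:
        if (src_ip in ipaddress.ip_network(rule.src)
                and dst_ip in ipaddress.ip_network(rule.dst)):
            return rule
    return None


# Hypothetical rule table: one specific block rule above a catch-all allow.
RULES = [
    Rule(1001, "172.16.0.2/32", "172.16.10.2/32", "block"),
    Rule(1002, "0.0.0.0/0", "0.0.0.0/0", "allow"),
]

hit = first_match(RULES, "172.16.0.2", "172.16.10.2")
print(f"{hit.action} by rule {hit.rule_id}")
```

Traceflow gives you the same answer from the real data path instead of from your reading of the rule table, which is the whole point.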
What is the only problem with traceflow?
It does not show the traffic flow on my physical network. That part should be very simple, though: all my NSX traffic is routed, so there should be no complex layer 2 stretches or lots of VLANs to keep in place. It’s just routed communication that can start at the top of rack with the correct design.