What does apply to mean in NSX Firewall?

When I first started using NSX I ran into this little problem.   What does apply to mean and how should I use it?

Background

I believe the background for the apply to is from physical firewalls.   They allowed you to apply rules to a specific interface.   Applying to an interface had the following effects:

  • Limit the number of rules that have to be processed
  • Allow specific fine-grained controls

Applying rules to specific interfaces had a few issues:

  • You had to have a good understanding of the network topology in order to apply rules correctly
  • New interfaces may be missed by rules

You also had the ability to apply the rule to all interfaces that existed. On the surface, if you had enough hardware to apply the rules everywhere, it worked great. In practice, tons of interfaces that didn’t need the rules now had them. There were a few other problems:

  • New interfaces would have no rules, and all rules would have to be applied to them
  • These rules exist only on a single firewall, so rule creation is specific to that firewall

NSX Firewall

The NSX firewall takes a similar approach to firewall application. All firewall rules are created in NSX Manager and stored inside the NSX Manager database. By default rules are applied to the “distributed firewall”. This applies the rules to every virtual machine’s vNIC, regardless of the virtual machine’s location. This creates the same problem as applying rules on every interface: each vNIC has a long list of rules to attempt to match.

This is where the apply to tag becomes interesting.   In order to explain I’ll use a simple example:

Two virtual machines: 172.16.0.2 on VNI 5000 and 172.16.20.2 on VNI 5002.

My default firewall rule set allows them to communicate without any issues. Let’s assume I want to block all traffic between these machines, so I create the following rule:

pic1

Source:  172.16.0.2 virtual machine

Destination: 172.16.20.2 virtual machine

Service: Any

Action: Block

Apply to: Distributed firewall (default)

Using Traceflow we can identify where it was blocked:

pic2

You can clearly see that the default of distributed firewall applied the drop action at the source. This is really great because it limits the traffic on the physical wire. Since the object is known as a managed object in NSX, the rule is enforced as early as possible. If you have a physical entity that is not managed by NSX, the rule will be applied at the destination. This is hard to prove because Traceflow cannot provide visibility into physical entities.

What does apply to do?

Simply put, it tells NSX where to apply the firewall rule. Let’s examine some of the options for my rule above:

  • Host
  • Cluster
  • Virtual machine
  • IP or MAC set
  • etc.

It provides the full list of objects that DFW rules can be made with, including dynamic sets and tags. This is really powerful. For the sake of this example, let’s apply the rule to the destination virtual machine instead of the DFW.

pic1

Using traceflow we can see the results:

pic2

My attempted connection was dropped at the destination, where I applied the firewall rule. You can also see how, between steps 7 and 8, the packet left host 3 and went across my physical network to host 1 (the black hole of visibility).

Why use the apply to feature?

  • Reduce the number of rules applied to each vNIC
  • Enforce the rule at a specific location (think situations with VM overlap or rule overlap)
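
To make the first point concrete, here is a toy model of how the apply to field decides which rules end up on which vNIC. It is only a sketch: the Rule class, the DFW marker and the VM names are all made up for illustration and are not the NSX API.

```python
from dataclasses import dataclass

# Hypothetical marker meaning "apply to the whole distributed firewall".
DFW = "DFW"


@dataclass(frozen=True)
class Rule:
    name: str
    source: str
    destination: str
    action: str
    applied_to: frozenset  # VM names, or {DFW} for every vNIC


def rules_for_vnic(vm, rules):
    """Return the rules that would be pushed to this VM's vNIC."""
    return [r for r in rules if DFW in r.applied_to or vm in r.applied_to]


# Illustrative rule set: two scoped rules and one applied everywhere.
rules = [
    Rule("block-a-to-b", "vm-a", "vm-b", "block", frozenset({"vm-b"})),
    Rule("allow-web", "any", "vm-web", "allow", frozenset({"vm-web"})),
    Rule("default-allow", "any", "any", "allow", frozenset({DFW})),
]

for vm in ("vm-a", "vm-b", "vm-web"):
    print(vm, [r.name for r in rules_for_vnic(vm, rules)])
```

In this model vm-a only carries the default rule, while vm-b also carries the block rule that was scoped to it. If every rule were applied to the DFW, all three vNICs would carry all three rules.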

Apply to does add to the complexity of the environment and of troubleshooting, but it can limit the scope of each rule. This is where careful planning and understanding of the environment can really help. Arkin can help as well, but that’s another day’s post.

Greatest tool for NSX!

I want to let you in on a little secret of NSX called Traceflow. It was made available in the 6.2 release and I am in love with it. In order to explain my love, let’s do a history lesson:

History Lesson (Get off my lawn kids time)

Back in the old days (pretty much right now in every enterprise) you had a bunch of switches, routers and firewalls.   When a server was having a problem communicating with another server you had to trace its MAC address through every hop manually.   You might be lucky and use a SIEM to identify if a firewall was dropping the traffic.   Understanding each hop of the traffic is a pain.    It takes time and can be very complex in enterprise implementations.

Enter NSX

NSX does some complex routing, switching and firewalling.   Your visibility into the process in the past was articles like mine.   With traceflow you can prove your theory and identify data paths.    It still does not have visibility beyond the NSX world and into the physical.   Hopefully some day we will have that too.   Traceflow can get you pretty close.

Where is this traceflow of which you speak?

Log in to vCenter, select Networking and Security, and it’s on the right side most of the way down. It allows you to select a source and a destination, then inject packets. The NSX components report back as the injected packet passes by, allowing you to trace the flow of communication.

Show me some meat

Sounds good. Let’s assume we have two virtual machines, 172.16.0.2 and 172.16.0.3, both on VNI (think VLAN) 5000. They are on the same ESXi host. There are no firewall rules blocking traffic. Here is the output from Traceflow:

first_same_host

Look at that. The injected packet came from 172.16.0.2, hit the vNIC firewall, then was forwarded directly to 172.16.0.3’s vNIC firewall and into the machine. This is simple and exactly what we expect. Let’s do the exact same thing, except move the second machine to another ESXi host:

second_diff_hosts

Now we have added the VTEP (VXLAN tunnel endpoint) connection between ESXi hosts. VTEP communication is layer 3 between ESXi hosts, which stretches VNI 5000 between hosts whether they are far apart or right next to each other.

Neat meat but it really only shows layer 2 communication that’s easy

How about some routing then.  Two virtual machines 172.16.0.2 VNI 5000 and 172.16.10.2 VNI 5001.   Each on the same ESXi host:

Third_usingtwo_networks

Look at that: now we see the logical router in the mix, taking the traffic from logical switch LS-172.16.0 and routing it to logical switch LS-172.16.10. Suddenly the flow of traffic is not a mystery.

What about if the firewall is blocking the traffic?

I assumed you would ask so here is a new firewall rule I added:

rule

And the traceflow:

after_fw_rule_added_1

Yep my packet was dropped and it tells me where and what rule number blocked it.

What is the only problem with traceflow?

That it does not show the traffic flow on my physical network. The physical side should be very simple: given that all my NSX traffic is routed, we should not have complex layer 2 stretches or lots of VLANs to keep in place. It’s just routed communication that can start at top of rack with the correct design.

Network Protocol: Mac Address

Mac Addresses – Data Link Layer

Each physical network interface card (NIC) has a unique identifier assigned to it. This unique identifier is called a MAC address. A MAC address contains a vendor ID and a serial number and is made up of 12 hexadecimal characters. MAC addresses are part of the Data Link layer of the OSI model and are used heavily in Ethernet node to node transmissions. Each NIC responds to two addresses: its own unique address and the broadcast address of ff:ff:ff:ff:ff:ff. In Linux you can display your MAC address by using the command ifconfig -a. It will display something similar to this:

ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:0E:A6:7A:19:E1
inet addr:192.168.10.10  Bcast:192.168.10.255  Mask:255.255.255.0
inet6 addr: fe80::20e:a6ff:fe7a:19e1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:42708883 errors:0 dropped:0 overruns:0 frame:0
TX packets:167206053 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3679921735 (3509.4 Mb)  TX bytes:879524352 (838.7 Mb)
Interrupt:201 Base address:0xe000

The HWaddr is the current MAC address. Most modern NICs support MAC address spoofing, and it can be done in almost any operating system. Since MAC addresses can be spoofed quickly, they are not a good method for secure authentication of machines. A lot of wireless routers use MAC addresses as a method of access control; this alone should not be the method of access control. In Linux, ifconfig allows you to change your MAC address. For example:

ifconfig eth0 hw ether 00:01:02:72:45:C2

Would allow me to change the MAC address of eth0 to 00:01:02:72:45:C2.
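
The vendor ID and serial number split is easy to see in code. This is a minimal sketch that takes an HWaddr string like the one in the ifconfig output and returns the two halves; the function names are my own and it assumes the usual six-octet, colon or dash separated format.

```python
import string


def normalize(mac):
    """Lower-case a MAC address and use colons as the separator."""
    return mac.strip().replace("-", ":").lower()


def split_mac(mac):
    """Return (vendor_id, serial): the first and last three octets."""
    octets = normalize(mac).split(":")
    valid = len(octets) == 6 and all(
        len(o) == 2 and all(c in string.hexdigits for c in o) for o in octets
    )
    if not valid:
        raise ValueError(f"not a 6-octet MAC address: {mac!r}")
    return ":".join(octets[:3]), ":".join(octets[3:])


def is_broadcast(mac):
    """True for the all-ones address that every NIC responds to."""
    return normalize(mac) == "ff:ff:ff:ff:ff:ff"


vendor, serial = split_mac("00:0E:A6:7A:19:E1")
print(vendor, serial)
```

The vendor half is the part you would look up to find out who made the NIC, which stops meaning anything once the address has been spoofed.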

Network Protocol: IP

IP – Network Layer

Internet Protocol is the end to end addressing system used on the Internet, together with the Domain Name System, to identify unique hosts.

Host and Net

An IP address is made up of two parts: a host and a network. The host identifies the individual computer while the network provides routing information. An example would be:

  • 192.168.0.1 with a subnet mask of 255.255.255.0

This network contains a total of 256 addresses (192.168.0.0 through 192.168.0.255)

  • The network is 192.168.0
  • The Host is 1
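
The same split can be checked with Python's standard ipaddress module. This is a small sketch; split_address is a name I made up, and it assumes the usual 255.255.255.0 mask for this example network.

```python
import ipaddress


def split_address(address, netmask):
    """Return (network, host part, total addresses in the network)."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    network = iface.network
    # The host part is whatever remains once the network bits are masked off.
    host_part = int(iface.ip) & int(network.hostmask)
    return str(network), host_part, network.num_addresses


print(split_address("192.168.0.1", "255.255.255.0"))
```

For 192.168.0.1 this gives the network 192.168.0.0/24, host 1, and 256 addresses in the network.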

Subnet Masks

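Subnet masks are used to divide a block of IP addresses into smaller logical networks. They are used by CIDR to handle routing. A mask splits the address at a bit boundary: the bits set to 1 mark the network portion and the bits set to 0 mark the host portion, so a longer mask leaves a smaller host range. Because the split is made on the binary form of the address, it does not have to fall on a dotted decimal boundary. This type of subnetting is identified with a slash notation.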

Subnet Mask Slash Notation

Dotted Decimal     Binary                                 Slash   Available Addresses
255.255.255.0      11111111 11111111 11111111 00000000    /24     256
255.255.255.128    11111111 11111111 11111111 10000000    /25     128
255.255.255.192    11111111 11111111 11111111 11000000    /26     64
255.255.255.224    11111111 11111111 11111111 11100000    /27     32
255.255.255.240    11111111 11111111 11111111 11110000    /28     16
255.255.255.248    11111111 11111111 11111111 11111000    /29     8
255.255.255.252    11111111 11111111 11111111 11111100    /30     4
255.255.252.0      11111111 11111111 11111100 00000000    /22     1024

Slash Notation

Slash notation represents where the netmask ends and the host mask begins, based on bit position in the binary form. For example, /24 means the first 24 bits are the network portion.

Calculating NetMask IP Ranges

The easiest way to calculate netmask ranges is using the binary notation. If the netmask is 255.255.255.224, this is represented in binary by 11111111 11111111 11111111 11100000. If you examine the last octet using binary notation, you find that you have 32 addresses available: 1*128 + 1*64 + 1*32 + 0*16 + 0*8 + 0*4 + 0*2 + 0*1 = 224, and 256 – 224 = 32.
This becomes a bit more complicated when the mask extends beyond the last octet. For example, with the 255.255.252.0 or /22 subnet the last two octets are 11111100 00000000, and we have to work each octet out separately: 11111100 = 252 and 00000000 = 0. We subtract each of those numbers from 256, giving 256 – 252 = 4 and 256 – 0 = 256, and then multiply the results: 4 * 256 = 1024.
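
The subtract-from-256-and-multiply method works for any valid mask, so it is easy to script. A minimal sketch follows: addresses_in_mask is my own name, and the second function is only there to cross-check the answer against Python's standard ipaddress module.

```python
import ipaddress


def addresses_in_mask(netmask):
    """Multiply (256 - octet) across the four octets of a dotted mask."""
    total = 1
    for octet in netmask.split("."):
        total *= 256 - int(octet)
    return total


def addresses_via_stdlib(netmask):
    """Same answer from the standard library, used as a cross-check."""
    return ipaddress.ip_network(f"0.0.0.0/{netmask}").num_addresses


for mask in ("255.255.255.224", "255.255.252.0"):
    print(mask, addresses_in_mask(mask), addresses_via_stdlib(mask))
```

Octets of 255 contribute a factor of 1, which is why only the last octet mattered in the /27 example.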

Gateway and Broadcast

One overhead of subnetting is the gateway and broadcast addresses. Traffic leaves your subnet through a single address called the gateway, which provides the route to and from all your other IP addresses. That address belongs to the router, so you cannot assign it to another device. Normally the first usable address in your range is the gateway. You are also required to have a broadcast address that sends messages to all devices in your subnet; this is the last address in the range.
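
Here is a rough sketch of that overhead for one subnet, again using the standard ipaddress module. The 192.168.10.0/27 network is only an example, subnet_overhead is a made-up name, and treating the first usable address as the gateway is the convention described above, not a rule.

```python
import ipaddress


def subnet_overhead(cidr):
    """Show the addresses a subnet gives up before devices get any.

    Assumes a prefix of /30 or shorter, and the convention that the
    first usable address is the gateway.
    """
    net = ipaddress.ip_network(cidr)
    gateway = next(net.hosts())
    return {
        "network": str(net.network_address),
        "gateway": str(gateway),
        "broadcast": str(net.broadcast_address),
        # Everything except the network, gateway and broadcast addresses.
        "assignable": net.num_addresses - 3,
    }


print(subnet_overhead("192.168.10.0/27"))
```

A /27 has 32 addresses, so after the network address, the gateway and the broadcast address there are 29 left for devices.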

Private IP Addresses

Some groups of IP addresses are reserved for home or private networks; these addresses are not routable on the Internet.

Private IP Addresses

Starting IP    Ending IP          Number of Addresses
10.0.0.0       10.255.255.255     16777216
172.16.0.0     172.31.255.255     1048576
192.168.0.0    192.168.255.255    65536
127.0.0.1      127.0.0.1          1 (loopback, see localhost below)

Machines using private IP addresses are able to access the Internet using Network Address Translation (NAT).
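
If you want to check the size of the three private ranges, or whether an address falls inside one, a short sketch with the standard ipaddress module does it. The CIDR blocks are the standard way of writing the three ranges; the function names are my own.

```python
import ipaddress

# The three private ranges written as CIDR blocks.
PRIVATE_BLOCKS = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")


def describe(block):
    """Return (first address, last address, number of addresses)."""
    net = ipaddress.ip_network(block)
    return str(net[0]), str(net[-1]), net.num_addresses


def in_private_block(address):
    """True if the address falls inside one of the three blocks above."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(b) for b in PRIVATE_BLOCKS)


for block in PRIVATE_BLOCKS:
    print(block, *describe(block))

print(in_private_block("172.20.1.5"), in_private_block("8.8.8.8"))
```

Each range is a power of two in size (2^24, 2^20 and 2^16 addresses), because every address from .0 through .255 counts in each octet.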

localhost

The address 127.0.0.1 refers to your local machine’s loopback interface. You can ping this address to test that the network stack on your machine is working. It is not routable outside your machine.

Network Protocols

Protocols are agreed upon standards. Without protocols computers would be unable to talk to each other. In networking terms protocols are a lot like languages. Imagine if this website was in German and you only spoke English. Without access to a dictionary there would be no chance to understand this website. When computers talk to each other they are required to speak the same language or have a translator (dictionary).

When working with networking protocols, a model is used to define function and role; this is known as the OSI (Open Systems Interconnection) model. The model dates back to 1977, and since a lot has changed since then it can be hard to fit newer protocols into it. There are two different versions of the model in common use, a 7 layer and a 5 layer. Since the 7 layer incorporates the 5 layer, this article will explain the 7 layer model. From the top down the layers are: Application, Presentation, Session, Transport, Network, Data Link, and Physical. Each layer provides a method for communication with its adjacent layers.

Layer 7: Application

The application layer interfaces directly with the application. It provides the data in the form that the application expects. It also sends requests for information to the presentation layer. Examples of Layer 7 are:

  • A web browser
  • A mail client
  • An FTP client

Layer 6: Presentation Layer

The presentation layer takes the request or information from either of its adjacent layers and translates it into a usable form. It will break information from the application layer into encapsulated sessions for the session layer. It will also re-assemble the session layer data for the application to use. Common examples are:

  • SSL
  • TLS

Layer 5: Session Layer

The session layer controls dialogue and connections (sessions) between computers. It handles communication between the local and remote applications. It provides for full-duplex, half-duplex, or simplex operation, and establishes checkpointing, adjournment, termination, and restart procedures. Common examples are:

  • NetBIOS

Layer 4: Transport Layer

The transport layer provides transparent transfer of data between end systems, controlling the reliability of a link through flow control, segmentation/desegmentation, and error control. The transport layer may also be responsible for resending lost packets. Common Transport layer protocols are:

  • TCP
  • UDP

Layer 3: Network Layer

The Network layer handles addressing and routing for end to end communication, and can carry quality of service information. This is the layer at which most routers operate. Common examples are:

  • IP
  • IPSec
  • IPX
  • RIP
  • ARP
  • ICMP

Layer 2: Data Link Layer

The data link layer provides a method to deal with errors that happen in the physical layer. It also provides node to node communication, unlike the network layer, which provides end to end communication. Data can also be broken up to accommodate the needs of the physical layer. Common examples are:

  • Ethernet (802.3)
  • Wireless (802.11 a/b/g)
  • Frame Relay
  • Token Ring

Layer 1: Physical Layer

The physical layer defines the electrical and physical requirements for communication. This may include wave modulation, fiber optic cables, Cat 5 cable and phone lines.

Brocade Zoning via scripting

Update: If you are looking for instructions for FOS 7 go here.

From time to time I have to handle some storage zoning. I mostly use Brocade Fibre Channel switches. They are pretty easy to zone via scripts, which leaves you with your whole zone configuration documented and rebuildable at a moment’s notice. Before I get into the scripts I should mention that I do end to end zoning via WWID, not port based zoning. In other words, I zone my server HBA to my storage system. Port based zoning means we zone the port that the HBA is on to the port that contains the storage system. Port based zoning requires that everything is always plugged into the same port, and it can be hard to rebuild quickly without the correct documentation. Comments in Brocade scripts are preceded by an exclamation mark (!).

!!NOTE: This is fabric A
!! Make all the aliases for systems
alicreate "Storage_HBA1_A", "50:01:43:81:02:45:DE:45"
alicreate "Storage_HBA2_A", "50:01:43:81:02:45:DE:47"
alicreate "Server_HBA1_A", "50:01:23:45:FE:34:52:12"
alicreate "Server2_HBA1_A", "50:01:23:45:FE:35:52:15"
alicreate "Server2_HBA2_A", "50:01:23:45:FE:35:52:17"

!! Make the zones

zonecreate "Z_server_to_Storage_HBA1_A", "Server_HBA1_A; Storage_HBA1_A"

zonecreate "Z_server_to_Storage_HBA2_A", "Server_HBA1_A; Storage_HBA2_A"

zonecreate "Z_server2_to_Storage_HBA1_A", "Server2_HBA1_A; Storage_HBA1_A"

zonecreate "Z_server2_to_Storage_HBA2_A", "Server2_HBA1_A; Storage_HBA2_A"

!!NOTE: effective config and zone members on SWITCHA_config Fabric
cfgcreate "SWITCHA_config", "Z_server_to_Storage_HBA1_A; Z_server_to_Storage_HBA2_A; Z_server2_to_Storage_HBA1_A; Z_server2_to_Storage_HBA2_A"

cfgsave
cfgenable "SWIT

Load it into your switch and you’re ready to go!
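
If you keep the aliases and zones as data, you can generate a script like this one instead of typing it by hand. The following is only a sketch: build_zoning_script and its arguments are my own invention, and it emits the same alicreate, zonecreate, cfgcreate, cfgsave and cfgenable commands used above. It also refuses to build a zone that refers to an alias you never defined, which is an easy mistake to make by hand.

```python
def build_zoning_script(config_name, aliases, zones):
    """Return Brocade zoning commands as one string.

    aliases maps alias name to WWN; zones maps zone name to a list
    of alias names.
    """
    lines = ["!! Make all the aliases for systems"]
    for alias, wwn in aliases.items():
        lines.append(f'alicreate "{alias}", "{wwn}"')

    lines.append("!! Make the zones")
    for zone, members in zones.items():
        unknown = [m for m in members if m not in aliases]
        if unknown:
            raise ValueError(f"zone {zone} uses undefined aliases: {unknown}")
        member_list = "; ".join(members)
        lines.append(f'zonecreate "{zone}", "{member_list}"')

    zone_list = "; ".join(zones)
    lines.append(f'cfgcreate "{config_name}", "{zone_list}"')
    lines.append("cfgsave")
    lines.append(f'cfgenable "{config_name}"')
    return "\n".join(lines)


# Example data using names and WWNs from the script above.
aliases = {
    "Storage_HBA1_A": "50:01:43:81:02:45:DE:45",
    "Server_HBA1_A": "50:01:23:45:FE:34:52:12",
}
zones = {"Z_server_to_Storage_HBA1_A": ["Server_HBA1_A", "Storage_HBA1_A"]}
script = build_zoning_script("SWITCHA_config", aliases, zones)
print(script)
```

Keeping the data in one place also means fabric B can be generated from the same code with a different set of WWNs.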

Scripting out vSwitches in VMware

Virtual switches are a fun topic in ESX. They are unique on each ESX node and not shared across the cluster. This problem was addressed in ESX 4.0 with distributed virtual switches (DVS), which allow you to create switches in vCenter and push them to all nodes. Unfortunately DVS is available only in the plus licenses, which cost about $1000 more per processor. Those of us without DVS are forced to script out vSwitches. The process is pretty simple but has to be done in the right order from the service console:

  1. Create the vSwitch
  2. Create port groups
  3. Assign VLAN tags to port groups if required
  4. Apply security policy
  5. Link a nic to the switch
  6. Create a service console if required
  7. Assign ip addresses if required
  8. Enable vmotion if required
1.
# Create New vSwitches
# create a vSwitch with 56 ports for our service console
esxcfg-vswitch -a vSwitch0
# create a vSwitch with 56 ports for the vmkernel network
esxcfg-vswitch -a vSwitch1
# create a vSwitch with 1024 ports for VM's
esxcfg-vswitch -a vSwitch2:1024

2.
# Create Base port groups
# Service console port group
esxcfg-vswitch --add-pg="Service Console" vSwitch0
# Vmkernel port group
esxcfg-vswitch --add-pg="Vmkernel" vSwitch1
# Port group for FT
esxcfg-vswitch --add-pg="FT" vSwitch1
# Port group for VM's in VLAN 801
esxcfg-vswitch --add-pg="VM - 801" vSwitch2
# Port group for VM's in VLAN 802
esxcfg-vswitch --add-pg="VM - 802" vSwitch2
3.
# Assign VLAN's to port groups
esxcfg-vswitch -p "VM - 801" -v 801 vSwitch2
esxcfg-vswitch -p "VM - 802" -v 802 vSwitch2
4.
# Default settings on ESX allow MAC changes and sniffing; fix this with these commands
vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-macchange=false vSwitch0
vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-promisc=false vSwitch0
vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-forgedxmit=false vSwitch0

vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-macchange=false vSwitch1
vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-promisc=false vSwitch1
vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-forgedxmit=false vSwitch1

vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-macchange=false vSwitch2
vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-promisc=false vSwitch2
vmware-vim-cmd hostsvc/net/vswitch_setpolicy --securepolicy-forgedxmit=false vSwitch2
5.
# Link primary nic to switch
esxcfg-vswitch --link=vmnic0 vSwitch0
esxcfg-vswitch --link=vmnic6 vSwitch1
esxcfg-vswitch --link=vmnic2 vSwitch1
#Link VMnetwork to vSwitch2
esxcfg-vswitch --link=vmnic1 vSwitch2
esxcfg-vswitch --link=vmnic3 vSwitch2
esxcfg-vswitch --link=vmnic5 vSwitch2
esxcfg-vswitch --link=vmnic7 vSwitch2

6.
esxcfg-vswif -a vswif0 -i 192.168.10.45 -n 255.255.255.0 -p "Service Console"

7.
esxcfg-vmknic -a -i 192.168.20.10 -n 255.255.255.0 -p "Vmkernel" vmkernel
esxcfg-vmknic -a -i 192.168.20.40 -n 255.255.255.0 -p "FT"

8.

vmware-vim-cmd hostsvc/vmotion/vnic_set vmk0

The only thing I missed was setting a default order on the NICs if you have multiple NICs. For example, my vSwitch1 has two port groups with 2 vmnics, and I can choose to force a preferred vmnic for each group:

# Force Vmkernel to use vmnic6 unless it's unavailable
vmware-vim-cmd /hostsvc/net/portgroup_set --nicorderpolicy-active=vmnic6 vSwitch1 "Vmkernel"
vmware-vim-cmd /hostsvc/net/portgroup_set --nicorderpolicy-standby=vmnic2 vSwitch1 "Vmkernel"
vmware-vim-cmd /hostsvc/net/portgroup_set --nicorderpolicy-active=vmnic2 vSwitch1 "FT"
vmware-vim-cmd /hostsvc/net/portgroup_set --nicorderpolicy-standby=vmnic6 vSwitch1 "FT"
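
Step 4 repeats the same three security policy commands for every vSwitch, so it is a natural candidate for generating rather than typing. This is a small sketch that only builds the command strings shown in step 4; the function and list names are my own, and it does not run anything on the host.

```python
# The three vSwitches and three policy settings used in step 4.
SWITCHES = ["vSwitch0", "vSwitch1", "vSwitch2"]
POLICIES = ["macchange", "promisc", "forgedxmit"]


def security_policy_commands(switches, policies=POLICIES):
    """Build one vswitch_setpolicy command per switch and policy."""
    return [
        "vmware-vim-cmd hostsvc/net/vswitch_setpolicy "
        f"--securepolicy-{policy}=false {switch}"
        for switch in switches
        for policy in policies
    ]


commands = security_policy_commands(SWITCHES)
print("\n".join(commands))
```

Adding a fourth vSwitch then means adding one name to the list instead of copying three more lines.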

VLAN Tagging in Linux

Recently I have been doing some reworking on networking at work. One of the new requirements is that every network connection be on a tagged VLAN. This is a pretty simple process in Red Hat Linux, with multiple ways to do it. Text files are my favorite way to make these changes, so let’s assume that I want the VLAN to be 455 with the NIC eth0.

  • Navigate to your networking scripts:  /etc/sysconfig/network-scripts
  • Copy your current eth0 configuration cp ifcfg-eth0 ifcfg-eth0.455
  • Open the file:
DEVICE=eth0
IPADDR=192.168.10.10
NETMASK=255.255.255.0
BOOTPROTO=static
HWADDR=00:e0:4c:87:e2:36
MTU=1500
ONBOOT=yes
BROADCAST=192.168.10.255
NETWORK=192.168.10.0
DNS1=192.168.10.1
  • Modify the device name to read eth0.455
  • Add the line VLAN=yes to the end of the file
  • Save and exit
  • Shut down the old interface (make sure you’re on the console)
  • ifdown eth0
  • Bring up new VLAN
  • ifup eth0.455
  • Delete old interface rm ifcfg-eth0

That’s all you have to do, and your operating system will be tagging all outbound traffic with VLAN 455 and only reading traffic from VLAN 455.
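
If you have to do this on more than a couple of machines, the two file edits (rename the device and add VLAN=yes) are easy to script. This is a minimal sketch that works on the text of an ifcfg file; to_vlan_config is a made-up helper, and copying the file, ifdown and ifup are still up to you.

```python
def to_vlan_config(ifcfg_text, vlan_id):
    """Return ifcfg text with DEVICE renamed to <dev>.<vlan> and VLAN=yes added."""
    out = []
    for line in ifcfg_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "DEVICE":
            line = f"DEVICE={value.strip()}.{vlan_id}"
        elif sep and key.strip() == "VLAN":
            continue  # drop any existing VLAN line so it is not duplicated
        out.append(line)
    out.append("VLAN=yes")
    return "\n".join(out) + "\n"


# A shortened version of the example file above.
original = "DEVICE=eth0\nIPADDR=192.168.10.10\nNETMASK=255.255.255.0\nONBOOT=yes\n"
print(to_vlan_config(original, 455))
```

Write the result to ifcfg-eth0.455 and carry on with the ifdown and ifup steps as before.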