VMware NSX: how to firewall between IPs, and issues

The first thing everyone does with NSX is try to create firewall rules between IP addresses.  I consider this a mistake because the DFW can key off much better markers than IP addresses.  Either way, at some point you will want to use IP addresses in your rules.  This post describes how to set up firewall rules between IP addresses.

Setup:

I have two Linux machines each on their own subnet:

Linux1 – 172.16.1.10 – 172.16.1.0/24 network

Linux3 – 172.16.10.10 – 172.16.10.0/24 network

Routing is set up between the hosts so they can connect to each other.  I would like to block all traffic except SSH between these subnets.  We are going to assume that both of these networks exist in NSX.

NSX Setup:

First we have to set up an IP set in NSX Manager.  This is, surprisingly, a set of IP addresses.

  • Log in to the vSphere web client
  • Click networking and security
  • Select your NSX Manager and expand it
  • Select Manage -> grouping objects
  • On the lower pane select IP Sets
  • Press the green plus button to add a new set
  • Set up one IP set for each subnet: 172.16.1.0/24 and 172.16.10.0/24

Tale of multiple cities:

Here is where NSX gets interesting: you have multiple ways to block access.  First, a little understanding of firewall constructs in NSX:

  • Security Groups – these are groups of machines and other constructs.  They can include IP sets, MAC sets, and dynamic name-based wildcard matches.  They can contain whole datacenters or a single virtual machine, and membership can be very dynamic using boolean conditions.
  • Security Policies – these are groups of firewall rules and introspection services that are applied to security groups.  Each firewall rule in a policy assumes that the policy is assigned to one or more security groups, so either your source or your destination needs to be the policy’s assigned security group.  The opposite side (source/destination) needs to be either a security group or any.

Remember we want the following rules:

  • SSH between 172.16.1.0/24 and 172.16.10.0/24 should be allowed in both directions
  • Everything else between them should be blocked

Within these constructs there are a number of possible options for the firewalls:

  • Option 1 – rules in this order
    • Firewall rule allowing ssh between source: assigned policy group and destination: 172.16.10.0/24
    • Firewall rule allowing ssh between source: 172.16.10.0/24 and destination: assigned policy group
    • Firewall rule blocking any between source: assigned policy group and destination: 172.16.10.0/24
    • Firewall rule blocking any between source: 172.16.10.0/24 and destination: assigned policy group
    • Assign the security policy to 172.16.1.0/24
  • Option 2 – Security Groups
    • Firewall rule allowing ssh between source: assigned policy group and destination: assigned policy group
    • Firewall rule blocking any between source: assigned policy group and destination: assigned policy group
    • Assign the security policy to 172.16.1.0/24 and 172.16.10.0/24
  • Option 3 – Two policies
    • Policy 1
      • Firewall rule allowing ssh between source: assigned policy group and destination: 172.16.10.0/24
      • Firewall rule blocking any between source: assigned policy group and destination: 172.16.10.0/24
      • Assign the security policy to 172.16.1.0/24
    • Policy 2
      • Firewall rule allowing ssh between source: assigned policy group and destination: 172.16.1.0/24
      • Firewall rule blocking any between source: assigned policy group and destination: 172.16.1.0/24
      • Assign the security policy to 172.16.10.0/24
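
To make the "rules in this order" part of Option 1 concrete, here is a minimal Python sketch of first-match rule evaluation.  It is only a model for reasoning about rule order: the `Rule` class and `evaluate` function are made up for illustration, it is not how NSX stores or processes rules, and it ignores protocol and connection state.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network, ip_address, ip_network
from typing import List, Optional

# The two subnets from the setup section.
NET_A = ip_network("172.16.1.0/24")   # the group the policy is assigned to
NET_B = ip_network("172.16.10.0/24")


@dataclass(frozen=True)
class Rule:
    """Illustrative rule: action, source net, destination net, destination port."""
    action: str              # "allow" or "block"
    src: IPv4Network
    dst: IPv4Network
    port: Optional[int]      # None means "any"

    def matches(self, src_ip: str, dst_ip: str, dst_port: int) -> bool:
        return (
            ip_address(src_ip) in self.src
            and ip_address(dst_ip) in self.dst
            and (self.port is None or self.port == dst_port)
        )


# Option 1, in the same order as the list above.
OPTION_1: List[Rule] = [
    Rule("allow", NET_A, NET_B, 22),
    Rule("allow", NET_B, NET_A, 22),
    Rule("block", NET_A, NET_B, None),
    Rule("block", NET_B, NET_A, None),
]


def evaluate(rules: List[Rule], src_ip: str, dst_ip: str, dst_port: int,
             default: str = "block") -> str:
    """Return the action of the first matching rule, else the default action."""
    for rule in rules:
        if rule.matches(src_ip, dst_ip, dst_port):
            return rule.action
    return default


if __name__ == "__main__":
    print(evaluate(OPTION_1, "172.16.1.10", "172.16.10.10", 22))  # SSH, Linux1 to Linux3
    print(evaluate(OPTION_1, "172.16.1.10", "172.16.10.10", 80))  # anything else
```

If you move the two block rules above the allow rules in this model, SSH is blocked too, which is why the order matters.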

The first question anyone will ask is: why would I not use option 2?  It’s smaller and easier to read, and it does accomplish the same goal, but it lacks granularity in design.  What if you had a third subnet, 172.16.20.0/24, and you only wanted it to access 172.16.1.0/24?  Option 1 could easily do this, while option 2 would mistakenly open up 172.16.10.0/24.  This is the heart of firewall design: layer rules to create granularity.  I am not a master of the firewall, but I do have a few suggestions:

  • Outbound firewall rules sound great, but right away they will kill you in complexity
  • Protect the end points: apply rules to the destination (think of applying rules to the web server instead of every PC).  If you need source-based rules, apply them on the destination
  • Use naming conventions that describe the purpose of the rule, for example Allow-SSH-Into-Production
  • Consider using a DROP all as your default rule and then applying only allow rules in security groups
  • Rules that are part of the default and not created in Service Composer don’t show up in the GUI, so don’t use them beyond the default DROP; apply everything else as a security policy

Let’s do Option 1

  • Return to networking and security and select service composer
  • Select security groups and create a security group for each IP Set

  • Repeat for the other subnet
  • Click on security policies
  • Create a new policy containing the Option 1 firewall rules, in the order listed above

  • Now that you have it built, you just need to apply it to a security group
  • Click on the text of your Security Policy
  • Select Manage -> Security Groups
  • Click edit and add 172.16.1.0/24

Now your rules should work.  You can test with ping and SSH.  Using the same dialogs you can create option 2 or 3.  The same rules you use for firewalls on physical entities need to apply to the DFW.  You need to think before you create, or you will end up with firewall sprawl.
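
If you would rather script the test than type ssh by hand, a small TCP connect check is enough.  This is a minimal sketch using only the Python standard library; the function name `tcp_port_open` is made up for this post.  Keep in mind that a `False` result only tells you the connection did not succeed: it cannot tell a firewall block apart from a port that nothing is listening on.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Run from Linux1 (172.16.1.10) against Linux3 (172.16.10.10).
    for port in (22, 80):
        state = "open" if tcp_port_open("172.16.10.10", port) else "not reachable"
        print(f"172.16.10.10:{port} is {state}")
```

With the Option 1 policy applied, port 22 should connect and other ports should not.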

Deep Dive: How does NSX Distributed Firewall work

This is a continuation of my posts on NSX features; you can find other posts on the Deep Dive page.  My favorite feature of VMware NSX is the distributed firewall.  It provides some long-overdue security features.  At one time I worked in an environment where we wanted to ensure that every type of traffic was filtered by a firewall.  This was an attempt to increase security.  We wanted to ensure that there was no east <-> west traffic between hosts, so every machine was in its own subnet: each virtual machine was deployed alone inside a /27 subnet.  Every communication required a trip to the firewall, which was also serving as a router.

This model worked but made us very firewall centric.  Everything required multiple firewall changes.  Basic provisioning took weeks because of the constant need for more firewall changes.  In addition, we wanted secondary controls, so each host ran its own host-based firewall as well.  This model caused a few major design constraints: you had to buy larger firewalls to handle all the routing, and you had to take your firewall guys to lunch all the time to avoid mega rage.

Enter the distributed firewall

The distributed firewall applies firewall rules in the ESXi kernel at each virtual machine’s network interface (vNIC), just outside the guest OS.  This has a few advantages:

  • No one inside the guest OS can change firewall rules
  • Only traffic that should be on the network is on the network; everything else gets blocked before leaving the virtual machine (think mega cost savings and less garbage traffic)
  • You can inspect each packet before it gets to the network and take action (lots of third-party plugins will be able to do this)
  • You can scale out your firewall capacity by adding more hosts in a modular fashion that matches your server growth

The firewall has an API for third-party solutions like virus scanners or IDS.  This allows them to be part of the data stream in real time.

Components of Distributed firewall (DFW)

The DFW has a management plane, control plane and data plane which should be familiar to network admins.

  • Management Plane – done via the vCenter plugin or API access to the NSX Manager.  This allows you to use any vCenter object as the source or destination (datacenter, VM name, vNIC, etc.).  It also allows you to define IP ranges for more traditional firewalling between IPs
  • Control Plane – done by the NSX Manager.  It takes changes from vCenter, stores them in a central database, and then pushes the rules down to each ESXi host.  (The database is /etc/vmware/vsfwd/vsipfw_ruleset.dat on each ESXi host)
  • Data Plane – ESXi hosts are the data plane, doing the actual work of the firewall.  All firewall functions take place in kernel modules on the ESXi hosts.  Remember that enforcement is done locally and at the destination, reducing the traffic on the wire.

Each vNIC gets its own instance of DFW, put into place and managed by a set of daemons called vsfwd.

How does it work?

Each firewall rule is created and applied via the NSX Manager GUI or API.  When published, NSX Manager pushes all rules down to each ESXi host, and each host creates a file on disk which holds all the firewall rules.  The ESXi host applies the rules to each instance of DFW.  When a change happens in vCenter (remember the management plane: a new vNIC, a VLAN change, etc.), the firewall rules are re-consulted.  IP-based rules require VMware Tools to identify the IP address or addresses of the server.

How about vMotion?

Since the rules are applied to the virtual machine container, they move with the virtual machine when vMotion is used, so there is no effect.

How about HA events?

Rules are loaded off disk and applied to virtual machines.

What about if NSX Manager is not available?

Rules are loaded off disk.  New systems will get the rule set that applies to them.  For example, if my new server is called Web-Machine12 and I have rules that are applied to all VMs named Web-*, then it will get them from disk.  This encourages the use of naming standards.
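
The Web-* idea is easy to picture in a few lines of Python.  This is only an illustration of why naming standards pay off, not how NSX matches names internally: it uses shell-style wildcards from the standard library, and the rule names in the example are made up.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def rules_for_vm(vm_name: str, rules_by_pattern: Dict[str, List[str]]) -> List[str]:
    """Collect the rules of every name pattern that the VM name matches."""
    matched: List[str] = []
    for pattern, rules in rules_by_pattern.items():
        if fnmatchcase(vm_name, pattern):  # case-sensitive wildcard match
            matched.extend(rules)
    return matched


if __name__ == "__main__":
    # Hypothetical patterns and rule names, for illustration only.
    rules_by_pattern = {
        "Web-*": ["Allow-HTTP-Into-Web"],
        "DB-*": ["Allow-SQL-From-Web"],
    }
    print(rules_for_vm("Web-Machine12", rules_by_pattern))
    print(rules_for_vm("Machine12", rules_by_pattern))
```

A machine that does not follow the naming standard matches nothing here, so it ends up with only the default rule.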

How about if I create a new virtual machine and it does not have any rules?

At the bottom is a default rule (some vote for allow all, others for deny all; I vote deny all), so your machine will have deny all.

Groups and Policies

DFW has the concept of Security Groups (yep, like it sounds): groups of similar systems.  These can be hard-coded to specific entities or dynamic, using regular expressions on any vCenter entity.  There are also Security Policies: groups of like-minded rules to be processed in order.  So you define the scope of the rules in Security Groups and define what is done in Security Policies.  It can be a one-to-many reference on both sides (a security group can have many policies, and a policy can have many groups), providing the ability to layer rules.

How do I track my firewall drops / accepts?

This is the first thing your firewall guys are going to ask for... and I don’t like the answer right now.  Drops and accepts are logged to the ESXi host’s syslog, so you need to centralize your host logs and do some searches to gather the firewall logs into one place.  If you search your host logs for “vsip_pkt” (in 6.1 they changed this to “dfwpktlogs:”), you will find the firewall drops / accepts.
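
Once the host logs are in one place, the search itself is simple.  Here is a minimal Python sketch that keeps only the lines containing either marker.  It assumes nothing about the rest of the log line format, and the file path in the usage example is a placeholder for wherever your central syslog lands.

```python
from typing import Iterable, List

# The two markers mentioned above: the older one and the 6.1 one.
MARKERS = ("vsip_pkt", "dfwpktlogs:")


def dfw_lines(lines: Iterable[str]) -> List[str]:
    """Return only the log lines that contain a DFW packet-log marker."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in MARKERS)
    ]


if __name__ == "__main__":
    # Placeholder path: point this at your centralized ESXi syslog file.
    with open("/path/to/central/esxi-syslog.log", errors="replace") as log:
        for entry in dfw_lines(log):
            print(entry)
```

From here you can feed the matching lines into whatever log tool your firewall guys already use.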