VMware predefined NIOC settings: what do they mean?

Recently I was setting up a new vSphere 5.5 cluster with NIOC and noticed all the new predefined NIOC categories:

Some are obvious but others are a little more questionable.  After a great discussion with VMware support I found out the following:

  • NFS traffic – Traffic from ESXi's own NFS client (datastore traffic only, not NFS traffic generated inside guests)
  • Management traffic – ESXi management traffic only, such as connections between vCenter and ESXi hosts
  • vMotion traffic – vMotion and its heartbeats
  • vSphere Storage Area Network traffic – I had a lot of questions on this one, but it turned out to be simple: vSAN traffic only
  • vSphere Replication traffic – Traffic from the vSphere Replication appliance only; no other replication traffic
  • iSCSI traffic – As expected, ESXi iSCSI traffic using a hardware or software initiator
  • Virtual machine traffic – Traffic out of guest virtual machines
  • Fault Tolerance traffic – Traffic specific to VMware FT

Those are all the predefined pools… but what if I create a user-defined category and assign it to my NFS port group? Which one assigns the NIOC shares? Simple: the one with the larger share.
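
Since it all comes down to shares, here is a minimal Python sketch of how share-based allocation plays out under contention (the pool names, share values, and the 10 Gbit/s uplink speed are all assumptions for illustration, not vSphere internals):

```python
# Minimal sketch of NIOC share math: under contention, each active
# traffic type gets uplink bandwidth proportional to its shares.
# Pool names and the 10 Gbit/s uplink are assumptions for illustration.

UPLINK_GBPS = 10.0

def allocate(active_pools):
    """Split uplink bandwidth across contending pools by share count."""
    total_shares = sum(active_pools.values())
    return {name: UPLINK_GBPS * shares / total_shares
            for name, shares in active_pools.items()}

# NFS (100 shares) and virtual machine traffic (50 shares) contending:
print(allocate({"nfs": 100, "virtual_machine": 50}))
# {'nfs': 6.666..., 'virtual_machine': 3.333...}
```

Remember that shares only matter when the uplink is saturated; an idle pool's shares are simply left out of the split.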

How does storage multipathing work?

Every week I spend some time answering questions on the VMware forums.  It also provides me great ideas for blog posts, just like this one.   It started with a simple question: how does multipathing work?   It came along with a lot of well-thought-out specific questions.   I tried to answer them, but figured it would be best handled with some diagrams and a blog post.    I will focus this post on Fibre Channel multipathing.  First it's important to understand that Fibre Channel is nothing more than Layer 2 communication using frames to push SCSI commands.   Fibre Channel switches are tuned to pass these frames as fast as possible.

Types of Arrays

There are really three types of connectivity with Fibre Channel (FC) arrays:

  • Active/Active – I/O can be sent to a LUN via any of the array's storage processors (SPs) and ports.  Normally this is implemented in larger arrays with lots of cache: writes are sent to cache and then destaged to disk, and since everything lands in cache first, the SP and port used do not matter.
  • Active/Passive – I/O is sent down a single SP and port that owns the LUN.  If I/O is sent down any other path it is denied by the array.
  • Pseudo Active/Active – I/O can be sent down any SP and port, but one SP and port combination owns the LUN.  Traffic sent to the owner of the LUN is much faster than traffic sent to non-owners.

The most common implementation of pseudo active/active is Asymmetric Logical Unit Access (ALUA), defined in the SCSI-3 standard.  In ALUA the SP identifies the owner of a LUN with SCSI sense codes.

Access States

ALUA has a few possible access states for any SP and port combination:

  • Active/Optimized (AO) – the SP and port that own the LUN; the best possible path to use for performance
  • Active/Non-Optimized (ANO) – an SP and port that can be used to access the LUN, but slower than the AO path
  • Transitioning – the LUN is changing from one state to another and is not available for I/O; not used by most ALUA arrays today
  • Standby – not active but available; not used by most ALUA arrays today
  • Unavailable – SP and port not available

In an active/active array the following states exist:

  • Active – all SPs and ports should be in this state
  • Unavailable – SP and port not available

In an active/passive array the following states exist:

  • Active – the single SP and port that own and serve the LUN
  • Standby – an SP and port available to take over if the active one goes away
  • Transitioning – switching to Active or Standby

In ALUA arrays you also have target port groups (TPGs), which are sets of SP ports that share the same state for a LUN.  For example, all the ports on a single SP may form one TPG, since the LUN is owned by that SP.
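
To make the relationship between access states and TPGs concrete, here is a minimal Python sketch.  The SP and port names are hypothetical, and a real array reports all of this over SCSI rather than as objects:

```python
# Minimal sketch of ALUA access states and target port groups (TPGs).
# SP/port names are hypothetical; a real array reports this via SCSI.
from dataclasses import dataclass
from enum import Enum

class AccessState(Enum):
    ACTIVE_OPTIMIZED = "AO"        # owning SP/port, fastest path
    ACTIVE_NON_OPTIMIZED = "ANO"   # usable but slower path
    TRANSITIONING = "transitioning"
    STANDBY = "standby"
    UNAVAILABLE = "unavailable"

@dataclass
class TargetPortGroup:
    name: str             # e.g. all ports on one SP
    ports: list
    state: AccessState    # every port in the group shares this state

# LUN1 owned by SPa: SPa's ports are AO, SPb's ports are ANO.
lun1_tpgs = [
    TargetPortGroup("SPa", ["SPa:0", "SPa:1"], AccessState.ACTIVE_OPTIMIZED),
    TargetPortGroup("SPb", ["SPb:0", "SPb:1"], AccessState.ACTIVE_NON_OPTIMIZED),
]
```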

How does your host know what the state is?

Great question.  A host and an array communicate state using SCSI commands.   There are lots of commands in the standard; I will show the three ALUA management commands, since they are the most interesting:

  • Inquiry – ask the device a SCSI question
  • Report Target Port Groups – reports which TPG has the optimized path
  • Set Target Port Groups – asks the array to switch target port group ownership (see the failover sketch below)

 

This brings up some fun scenarios: who can initiate these commands, and when?  All of the following assume an ALUA array.

Setup:

So we have a server with two HBAs, each connected to a SAN switch.  In turn, the SPs are connected to the SAN switches.  SPa owns LUN1 as AO and SPb owns LUN2 as AO.

[Diagram: host with two HBAs, one to each SAN switch; SPa and SPb connected behind the switches; SPa owns LUN1 (AO), SPb owns LUN2 (AO)]

 

Consider the following failures:

  • HBA1 fails – Assuming the pathing software in the OS is set up correctly (more on this later), the operating system accesses LUN1 via the ANO path to SPb and storage access continues.  It then issues a Set Target Port Groups command asking SPb to take over LUN1.  The request is fulfilled, and the array reports the new target port groups to all attached systems so that they use SPb as the AO path for LUN1.
  • SPa fails – Again assuming the OS pathing is set up correctly, access to LUN1 via SPa fails, so the OS fails over to SPb and initiates the LUN failover.

This setup is designed just to show the interaction; in a real environment you would want SAN switches A and B each connected to both SPa and SPb, if possible, for redundancy.
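
Here is a minimal Python sketch of that first failover sequence.  The method names mirror the SCSI commands above, but the logic is heavily simplified and every name is illustrative:

```python
# Simplified sketch of an ALUA failover: a path dies, the host falls back
# to the non-optimized SP and asks the array to move LUN ownership.
# All names are illustrative; real hosts do this via SCSI, not method calls.

class AluaArray:
    def __init__(self):
        self.owner = {"LUN1": "SPa", "LUN2": "SPb"}  # AO side per LUN

    def report_target_port_groups(self, lun):
        """Which SP currently offers the Active/Optimized path."""
        return self.owner[lun]

    def set_target_port_groups(self, lun, new_owner):
        """Host asks the array to move AO ownership of a LUN."""
        self.owner[lun] = new_owner

def hba1_fails(array):
    # Host loses its path to SPa, so it keeps I/O flowing on the ANO
    # path through SPb, then requests an ownership change.
    lun = "LUN1"
    print("I/O to", lun, "continues via ANO path on SPb")
    array.set_target_port_groups(lun, "SPb")
    print("AO path for", lun, "is now", array.report_target_port_groups(lun))

hba1_fails(AluaArray())
# I/O to LUN1 continues via ANO path on SPb
# AO path for LUN1 is now SPb
```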

How does ESXi deal with paths?

ESXi has three possible path states:

  • Active
  • Standby
  • Dead – cable unplugged, bad connection, or switch failure

It will always try to access the LUN via any available path.

Why does path selection policy matter?

The path selection policy can make a huge difference.  For example, if you have an ALUA array you would not use the round robin path selection policy: doing so would cause at least half your I/Os to go down the ANO paths, which would be slow.   ESXi supports three policies out of the box:

  • Fixed – honors the preferred (AO) path as long as it is available; most commonly used with ALUA arrays
  • Most Recently Used (MRU) – ignores the preferred path and uses the most recently used path until it is dead (used with active/passive arrays)
  • Round Robin (RR) – sends a fixed number of I/Os or bytes down a path, then switches to the next path; ignores AO; normally used with active/active arrays

The number of I/Os or bytes sent before switching paths in RR is configurable, but defaults to 1,000 I/Os and 10,485,760 bytes.
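
As a rough illustration of the RR mechanics (not ESXi's actual implementation), here is a Python sketch that rotates paths after a fixed number of I/Os; the path labels and the small limit are just for demonstration:

```python
# Rough sketch of round-robin path selection: send a fixed number of
# I/Os down one path, then rotate to the next. Not ESXi's real code;
# the 1000-I/O default mirrors the documented value above.

class RoundRobinSelector:
    def __init__(self, paths, iops_limit=1000):
        self.paths = paths          # e.g. ["HBA1->SPa", "HBA2->SPb"]
        self.iops_limit = iops_limit
        self.current = 0
        self.sent = 0

    def next_path(self):
        """Pick the path for the next I/O, rotating after iops_limit."""
        if self.sent >= self.iops_limit:
            self.current = (self.current + 1) % len(self.paths)
            self.sent = 0
        self.sent += 1
        return self.paths[self.current]

sel = RoundRobinSelector(["HBA1->SPa", "HBA2->SPb"], iops_limit=2)
print([sel.next_path() for _ in range(5)])
# ['HBA1->SPa', 'HBA1->SPa', 'HBA2->SPb', 'HBA2->SPb', 'HBA1->SPa']
```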

Which path selection policy should you use?  That depends on your storage array, and you should work with your vendor to understand their best practices.  In addition, a number of vendors offer their own multipathing software that you should consider (for example, EMC's PowerPath).

 

VMUG Virtual Event

I am a huge fan of the VMUG organization.  I attend two different VMUGs in central Ohio, and from time to time I have been known to speak at the events; in fact, I will be speaking on design at the Central Ohio VMUG on Feb. 25th.  So last week I attended the VMUG virtual day-long event.  VMware has been playing around with this type of event for about six months now.  The concept is simple: avoid the traveling show and create a virtual event, with live streams, Q&A sessions, and of course vendor booths.

This type of medium is getting more popular with all kinds of companies as a way to reduce expense.  Of course, the key is that they still need to bring value to the table.  Traditionally, these types of shows attract people for a few reasons:

  • SWAG (Stuff we all get)
  • Get out of the office
  • Chance to talk to a lot of vendors in one place
  • Chance to learn about companies' products (marketing)

The virtual event does offer prizes, but not a lot of swag beyond white papers.   Personally, I enjoy going to these events because I can interact with other people; I find the lunches to be the most useful part.   In that respect the virtual event is really missing out.  I can talk to vendors or VMware employees via chat, but my ability to talk to other customers is limited.   I have always felt that VMUG is really about the other customers, and I hope they find a way to create that community at these events as well.  Here is the good news: even if you could not attend, you can stream all the presentations until Feb. 24th right here.

I did enjoy seeing more vSAN best practices.  I am really looking forward to seeing more vSAN deployments.