Puppet

A quick Puppet config tip. I often run into cases where I need a node to be in multiple groups in my nodes.pp. I have solved this by creating a generic template that is assigned to every node and using a conditional like this (assume I want to push to machines named node1, node3, and node5):

 

if ($hostname == 'node1') or ($hostname == 'node3') or ($hostname == 'node5') {
  # do something here, such as pushing a file
}

 

Jumbo Frames per port group

A new friend recently pointed out that jumbo frames can be enabled on a per-port-group basis, just like VLAN tagging. I had never noticed that, and it makes a huge difference in my designs. Of course, jumbo frames also have to be enabled upstream on the switches. Take a look at VMware's article on the matter here.

vCloud Director: my virtual machine cannot get HTTP to work unless it's on the same node as vShield Edge

The name says most of it. You have an org vApp that cannot get HTTP to work. DNS works and telnet works, but true HTTP connections will not. I spent a lot of time troubleshooting this issue with VMware support, and the end result was that somewhere upstream my MTU was not set to 1600, which breaks VXLAN. You can test this by moving the VM to the same ESXi host as the vShield Edge. If it works while on that node, it's an MTU issue on VXLAN, trust me.

How do you prove it? Log in to the ESXi host, locate the virtual IP for VXLAN on another node (assume it's 192.168.10.31), and use this command:

vmkping -s 1547 192.168.10.31 -d

If it hangs, then you have MTU issues somewhere along the path.
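To see why a payload of that size makes a useful test, here is a minimal Python sketch of the arithmetic. It assumes a standard 20-byte IPv4 header with no options and an 8-byte ICMP header, and that -s sets the ICMP payload size while -d forbids fragmentation. The function names are hypothetical helpers for illustration, not part of any VMware tool.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes


def packet_size(payload):
    """Size of the IP packet produced by a ping with this payload (-s)."""
    return payload + IPV4_HEADER + ICMP_HEADER


def fits(payload, mtu):
    """True if the ping fits in a single packet when fragmentation is off (-d)."""
    return packet_size(payload) <= mtu


def max_payload(mtu):
    """Largest ping payload that still fits within the given MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


if __name__ == "__main__":
    print(packet_size(1547))   # 1575
    print(fits(1547, 1500))    # False: a default-MTU hop drops it
    print(fits(1547, 1600))    # True: passes when every hop is at 1600
    print(max_payload(1600))   # 1572
```

So the 1547-byte ping can only get through if every hop allows more than the default 1500, which is exactly what VXLAN needs.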

vSphere Issue: Guest Unable to collect IPv4 routing table

Yesterday morning I showed up for work and found a virtual machine all buggered up due to some application errors in Java. My only recourse was a timely reboot. During the reboot I found it stuck on the following:

Guest Unable to collect IPv4 routing table

This was a first for me. I originally assumed it was caused by an operating system issue (which it is; see RHBA-2013:1290), but there are really only workarounds at this time. You can follow the official route, which is documented here, or use a nice workaround that works every time here. I have tested both and they work, but I hate modifying the vmx file. This is a really nasty bug that needs to be fixed, and I hope VMware will fix it soon.

VMware vSphere Network Partitions

I have known about network partitions for a long time, mostly thanks to the vSphere Clustering Deep Dive, one of the best vSphere books around. But I was having a little trouble figuring out how HA determines whether a machine is isolated or partitioned. Since this is a critical point in determining failure on ToR configurations, I dug into it. The book mentioned above gave a really good background and overview. It also alluded to some answers. Here are the basics:

The ESXi host master (determined by which machine has the most datastores, then by MOID) communicates with each host via the management network. When this fails, it tries to ping the host's management IP. Assuming that you are running 5.0, it then tries to identify whether the host has released its lock on the datastores (the storage heartbeat). If the datastore heartbeat fails, it assumes the host has failed and restarts its VMs. If the storage heartbeat is successful, the host is in one of the following two states:

  • Host is isolated (all alone; it cannot see any other ESXi hosts and will use its host isolation response after the timeout)
  • Host is partitioned (it can see other ESXi hosts but not the master)

If a host is partitioned, then after the timeout (timing listed in the book above) a master election takes place and a new master is selected inside the partition. The new master will attempt to protect the virtual machines in its new protection domain. This is not always possible, because the original master may still be protecting the virtual machines (holding a lock on them via the file system; the lock is on the .vmx file, not the .vmdk). Any new virtual machine may or may not be protected. Because of these issues, you need to plan your design and host isolation response carefully.

In a partitioned state the host isolation response is not enacted; it is only used for isolations.
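The decision flow described above can be summarized in a small Python sketch. This is only my simplified reading of the behavior, not the actual HA agent logic; the names HostState, classify_host and uses_isolation_response are made up for illustration.

```python
from enum import Enum


class HostState(Enum):
    HEALTHY = "healthy"
    FAILED = "failed"
    ISOLATED = "isolated"
    PARTITIONED = "partitioned"


def classify_host(master_reachable, datastore_heartbeat, sees_other_hosts):
    """Classify a host using the three checks described in the text."""
    if master_reachable:
        return HostState.HEALTHY
    if not datastore_heartbeat:
        # No network contact and no storage heartbeat: treat as failed,
        # so its VMs get restarted elsewhere.
        return HostState.FAILED
    if sees_other_hosts:
        # Alive and can talk to some hosts, but not to the master.
        return HostState.PARTITIONED
    # Alive but cannot see any other ESXi host.
    return HostState.ISOLATED


def uses_isolation_response(state):
    """The host isolation response applies to isolated hosts only."""
    return state is HostState.ISOLATED


if __name__ == "__main__":
    state = classify_host(False, True, True)
    print(state.value, uses_isolation_response(state))
```

Walking a ToR switch failure through these three inputs is a quick way to check which state each host would land in before settling on an isolation response.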