How to migrate current workloads into NSX

I get this question all the time.

How do I migrate my existing VLAN backed workloads into NSX?

The answer is pretty simple, but it comes with some design concerns. To explain the process, let’s make some assumptions:

  • You have two virtual machines (VM1, VM2)
  • Both are on the same subnet, which is backed by VLAN 310
  • The subnet assigned to the VLAN is 10.40.10.0/24
  • Subnet 10.40.10.0/24 is routed by physical_router1

 

The environment is shown below:

Let’s assume that our NSX network is also built out at this time as follows:

  • Edge Services gateway ESG_1 provides routing between physical and virtual using OSPF area 10 to peer with physical_router1
  • ESG_1 connects to a distributed logical router (DLR_1)
  • Virtual networks backed by VXLAN operate behind DLR_1
  • ESG_1 is advertising 10.60.10.0/24, which runs on VNI5000

The setup is visualized below:

Ok, so how do we get the virtual machines on VLAN310 behind DLR_1 so we can take advantage of NSX routing?

#1 Create a new destination network

Create a new logical switch, which will be VNI5001 (the number is assigned by NSX). At this point, don’t assign it a gateway on DLR_1.

#2 Deploy a new Edge gateway

Deploy a new ESG, which we will call ESG_2, with just management interfaces.

#3 Create a bridge

Use ESG_2 to bridge VLAN310 and VNI5001. There are a number of constraints with bridges, which I will cover after the process steps.

#4 Test VNI5001

Put a test virtual machine into 10.40.10.0/24 on VNI5001 and test connectivity to VM1 and VM2.

 

#5 Move virtual machines into VNI5001

Switch the network interface for VM1 and VM2 into VNI5001. They will drop a single ping and should continue to work.

#6 Change routing

Here is the disruptive part. Currently, traffic to VM1 goes from physical_router1 to the switch on VLAN310, then through ESG_2 into VNI5001, which is not an ideal path. We need 10.40.10.0/24 to be advertised by ESG_1 instead. We can do this by removing ESG_2 (which interrupts the network to VM1 and VM2) and adding a gateway for 10.40.10.0/24 on DLR_1 for VNI5001. ESG_1 will then advertise the new subnet to physical_router1. Assuming the advertisement is accepted because the old route has been removed, traffic will resume.
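To see why the old route has to go before traffic resumes, here is a minimal Python sketch of how a router might choose between a directly connected route and one learned over OSPF. The administrative distance values (0 for connected, 110 for OSPF) are common Cisco defaults and are only an assumption about physical_router1. The `Route` type, the interface label `vlan310`, and the address 10.40.10.25 are made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Assumed administrative distances (common Cisco defaults); lower wins.
ADMIN_DISTANCE = {"connected": 0, "ospf": 110}


@dataclass(frozen=True)
class Route:
    prefix: str    # e.g. "10.40.10.0/24"
    source: str    # "connected" or "ospf"
    next_hop: str  # hypothetical label for where traffic is sent


def best_route(routes, destination):
    """Pick a route: longest matching prefix first, then lowest distance."""
    dest = ip_address(destination)
    candidates = [r for r in routes if dest in ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ip_network(r.prefix).prefixlen, ADMIN_DISTANCE[r.source]),
    )


# Bridge phase: physical_router1 still owns the gateway on VLAN 310.
before = [Route("10.40.10.0/24", "connected", "vlan310")]
# If the old gateway is left in place, the connected route still wins.
overlap = before + [Route("10.40.10.0/24", "ospf", "ESG_1")]
# Once the old gateway is removed, only the OSPF route remains.
after = [Route("10.40.10.0/24", "ospf", "ESG_1")]

for name, table in [("before", before), ("overlap", overlap), ("after", after)]:
    print(name, "->", best_route(table, "10.40.10.25").next_hop)
```

In this model the OSPF route through ESG_1 is only used once the connected route is gone, which matches the order of operations in step #6.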

Bridge mode allows you to migrate into virtual networking without IP address changes. It does cause an interruption. One might wonder whether you could just run bridge mode forever. There are performance and latency concerns to consider with that plan.

Design considerations for bridge mode:

  • An ESG used to provide an L2 bridge maps to a single VLAN, so each bridge requires a new ESG
  • If the ESG fails, everything on the virtual networking side fails, because the ESG is the single bridging point
  • Performance can be impacted, because all traffic crossing the bridge has to go through the ESG bridge before reaching the destination VM
  • If redundancy beyond VMware HA is a concern, active/passive ESGs are supported
  • The L2 VLAN must be present on all ESXi hosts that may run the ESG with the bridge

 

The design considerations above do not address VLANs that contain both physical and virtual machines. A bridge can provide communication between physical and virtual machines. This may seem like a good solution, but it requires careful design and attention to performance. Single points of failure or configuration problems on the physical network can cause the whole solution to fail.

You can read more about bridges in VMware’s documentation.

Cross site vMotion requires VMware switch technology

Cross site vMotion is a feature that really shows the power of the VMware platform. When combined with NSX, you can move live running virtual machines across long distances. It’s a huge advantage for customers looking to balance workloads or avoid potential disasters. I learned today that this feature requires VMware’s virtual standard switch or distributed switch; it will not work on any third-party switches today.

VSS = Virtual Standard Switch

VDS = Virtual Distributed Switch

There are only certain supported migration paths:

VSS -> VSS

VSS -> VDS

VDS -> VDS

Notice that VDS -> VSS is not supported.
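If you want to check a planned move before attempting it, the list above can be captured in a few lines of Python. This is only a sketch of the rule as stated in this post; the function name `is_supported_path` is my own and is not part of any VMware tool.

```python
# Supported (source, destination) switch types, exactly as listed above.
SUPPORTED_PATHS = {("VSS", "VSS"), ("VSS", "VDS"), ("VDS", "VDS")}


def is_supported_path(source, destination):
    """Return True when the source -> destination switch pairing is in the list."""
    return (source.upper(), destination.upper()) in SUPPORTED_PATHS


for src, dst in [("VSS", "VDS"), ("VDS", "VSS")]:
    status = "supported" if is_supported_path(src, dst) else "not supported"
    print(f"{src} -> {dst}: {status}")
```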

Repoint 6.x vCenter to a new PSC

Since a vCenter has a connection to a single PSC, it’s important to understand how to move between PSCs and how to deploy new ones when old ones have failed. This article details that process.

 

Once installed, check that vCenter is working.

Then log in via SSH and check which PSC is being used:

 

Let’s repoint it to psc2.griffiths.local

cmsso-util repoint --repoint-psc psc2.griffiths.local

 

Now we are pointing to psc2 at site1. In 6.0 you were able to repoint a vCenter to a PSC at a different site. This is no longer available in 6.5 (remember this: trying to repoint across sites can cause some really bad stuff in 6.5).

As you can see, we have repointed vCenter from psc1 to psc2 at the same site:

So what do you do when all your PSCs at a site have failed? (First off, don’t have a single PSC at a site.) Or this:

Install a new PSC pointing to a PSC at a remaining site. We will use psc3 at site2 to create a new psc5 at site1. To test this, I shut down psc1 and psc2 to simulate failures.


So we are creating: 

 

After the PSC is installed, it will replicate with psc3.griffiths.local only. We can then repoint vc1 to psc5 and rebuild the missing PSCs at site1. First we have to make sure psc5 was deployed correctly by visiting its web page:

Now we can repoint the vc to psc5 at site1.  

 

Log in to the vCenter web client to test that authentication works via psc5

And it’s working

 

William Lam posted a script that automatically repoints vCenter to a new PSC when a failure is detected. It’s here:

http://www.virtuallyghetto.com/2015/12/how-to-automatically-repoint-failover-vcsa-to-another-replicated-platform-services-controller-psc.html

 

For those who don’t want to read the script, it’s very simple: it runs on the vCenter appliance and checks the PSC web page for a return code of 200. If the check fails 3 times, it switches to another PSC. It runs as an automated task every x minutes.
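To make that logic concrete, here is a rough Python sketch of the same idea. It is not William Lam’s script, just my own illustration of "three failed checks, then switch". The `FailoverMonitor` class and the `probe` callback are hypothetical names, and the example run uses a fake probe so it needs no network access.

```python
import urllib.request


def psc_is_healthy(url, timeout=5.0):
    """Return True when the given PSC URL answers with HTTP 200.

    Assumption: the caller supplies the full URL to check. A PSC with an
    untrusted certificate fails verification here and counts as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return False


class FailoverMonitor:
    """Count consecutive failed checks and pick the next PSC after a threshold."""

    def __init__(self, pscs, probe=psc_is_healthy, threshold=3):
        self.pscs = list(pscs)  # ordered candidates; the first one is active
        self.active = 0
        self.probe = probe      # called with the active entry from pscs
        self.threshold = threshold
        self.failures = 0

    def check_once(self):
        """Run one check; return the new PSC if a failover is due, else None."""
        if self.probe(self.pscs[self.active]):
            self.failures = 0
            return None
        self.failures += 1
        if self.failures < self.threshold:
            return None
        self.failures = 0
        self.active = (self.active + 1) % len(self.pscs)
        return self.pscs[self.active]


# Simulated run: psc1 is down and psc2 is up.
down = {"psc1.griffiths.local"}
monitor = FailoverMonitor(
    ["psc1.griffiths.local", "psc2.griffiths.local"],
    probe=lambda host: host not in down,
)
results = [monitor.check_once() for _ in range(4)]
print(results)
```

In a real task, a non-None result from `check_once()` would be the point where you run the `cmsso-util repoint --repoint-psc` command shown earlier against the returned PSC.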

 

Remove an old PSC

Log in to any PSC and type the following command:

 

cmsso-util unregister --node-pnid OLD_PSC_Name --username administrator@sso_domainname

 

So to remove psc1.griffiths.local I would type:

 

cmsso-util unregister --node-pnid psc1.griffiths.local --username administrator@vsphere.local
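If you script the cleanup, it helps to build the command from variables so the double-hyphen flags cannot get mangled by copy and paste. The sketch below only assembles and prints the command; it does not run anything. The helper name `unregister_command` is hypothetical, and it uses only the two flags shown above.

```python
import shlex


def unregister_command(old_psc, sso_domain="vsphere.local"):
    """Build the argument list for unregistering a PSC (nothing is executed)."""
    if not old_psc or any(ch.isspace() for ch in old_psc):
        raise ValueError("old_psc must be a single host name")
    return [
        "cmsso-util", "unregister",
        "--node-pnid", old_psc,
        "--username", f"administrator@{sso_domain}",
    ]


argv = unregister_command("psc1.griffiths.local")
print(shlex.join(argv))
```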

 

 

 

Setting up replication on the platform services controller

In the previous set of articles I discussed the following:

In this article I will discuss how to create the reference design from the following:

 

The reference replication design should be:

So we need to add two new replication agreements.

PSC1 <-> PSC3

First log in to psc1.griffiths.local and enable the shell. Then change directory to /usr/lib/vmware-vmdir/bin

Look at the current agreements

Add an agreement with psc3.griffiths.local using the following command:

./vdcrepadmin -f createagreement -2 -h psc3.griffiths.local -u Administrator -H psc1.griffiths.local

You can see that we now have two agreements. Rinse and repeat for PSC2 <-> PSC4 and we have the VVD topology. Notice in this example that I did everything from PSC1 while referencing different PSCs:
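To sanity-check a topology like this on paper, you can model the agreements as pairs and look for any PSC that has only one partner. This is a small Python sketch. The starting chain below is a hypothetical layout I picked for illustration, so substitute the agreements your own PSCs actually report.

```python
from collections import defaultdict


def partners(agreements):
    """Map each PSC to the set of PSCs it has a two-way agreement with."""
    graph = defaultdict(set)
    for a, b in agreements:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def weakly_linked(agreements):
    """Return the PSCs with fewer than two replication partners, sorted by name."""
    links = partners(agreements)
    return sorted(node for node, peers in links.items() if len(peers) < 2)


# Hypothetical starting point: each PSC was joined to the previous one.
starting = [("psc1", "psc2"), ("psc2", "psc3"), ("psc3", "psc4")]
# The two agreements added in this article.
added = [("psc1", "psc3"), ("psc2", "psc4")]

print("before:", weakly_linked(starting))
print("after: ", weakly_linked(starting + added))
```

In this example the two end nodes start with a single partner each, and adding the two agreements leaves every PSC with at least two.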

Now you have your replication agreements. Looking at the vdcrepadmin command, you have the following options: