Public cloud has forced change

Readers will immediately cry foul at this title. Public cloud adoption is not huge, and even in the most die-hard cloud-only shops it’s only around 40%. I believe public cloud has had, and will continue to have, a transformative effect on private cloud. The presence of a second option has forced the incumbent option’s hand. I will not detail the challenges of public cloud adoption; that is a blog post for another day. I want to focus on the elements that public cloud’s presence has forced into our private on-prem clouds:

  • Life cycle management for the hypervisor is now table stakes – gone are the days of hypervisor-specific teams – you can roll that cost into the operational budget on public cloud. Quite simply, upgrading / maintaining / tweaking the hypervisor needs to become easier and cost a lot less OpEx.
  • Delivery of traditional IT services needs to become transparent and quick – the buzzword agility applies – the business does not care how many engineers it takes to screw in the server, they just want it now.
  • Consumers of IT services don’t like limits or scale issues – all on-prem offerings need to have some form of elasticity.
  • No one really wants or needs IaaS (Infrastructure as a Service); they want a platform, because only platforms provide value to developers, who in turn provide value to the business. A platform has to include multiple servers, networking and networking constructs, security, and authentication.
  • Cost is important like never before… if you don’t control and understand your cost, comparisons will be made to public cloud and you will lose.
  • Service catalogs are only useful if they change and are responsive to business needs (think application development life cycle).
  • Infrastructure people need to learn from development – the future is automated and created by developers who understand infrastructure – you can try to stand still but it will not last.
  • IT shops now want to spend IT budget incrementally; very few shops want to buy IT in one large spend every three years.

I want to be clear: I believe public and especially hybrid cloud should be part of every IT strategy. It’s a critical reality in our world. I also believe that private cloud is here to stay for many years, but expectations will continue to change based upon public cloud experience.

The real question for me is how the new edge of IoT will force public cloud’s hand.

Operational aspects of HCI video

While attending VMworld 2017 I presented on some of the operational aspects of hyper-converged infrastructure.   I believe the key takeaways were:

  • Hyper-converged has to be more than just storage to gain the real benefits
  • Hyper-converged has a different scalability model (more linear)
  • Hyper-converged requires a different organizational structure to be successful
  • Hyper-converged performance and availability are policy based instead of location based

You can watch the video here:

 

One question I was asked after the presentation was about scalability. It was a really good question so I wanted to answer it here. Let’s assume that you start with a 3 node cluster. Over the next three years you add 12 nodes for a total of 15 nodes. At some point newer hardware types become available. Does that mean you now need to buy 15 brand new nodes, making a major investment instead of growing incrementally?

  • The answer is yes and no. At some point you do have to make a new three node investment to start a cluster on the newer hardware, but you should do it long before the end of life of your current cluster so you can organically grow the new hardware before the old hardware reaches end of life. This should be taken into account in your growth models.
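To make the growth-model point concrete, here is a minimal Python sketch of a rolling refresh plan. It is only an illustration: the five-year node lifetime, the one-year lead time, and the purchase schedule (3 nodes, then 4 nodes a year for three years) are assumptions, not figures from the presentation.

```python
def refresh_plan(purchases, lifetime_years=5, seed_nodes=3, lead_years=1):
    """Return {year: nodes_to_buy} for a rolling refresh of an HCI cluster.

    purchases maps purchase year -> number of nodes bought that year.
    All default values are illustrative assumptions.
    """
    # Work out in which year each batch of nodes reaches end of life.
    retirements = {}
    for year, count in purchases.items():
        end_of_life = year + lifetime_years
        retirements[end_of_life] = retirements.get(end_of_life, 0) + count

    # Seed the new cluster ahead of the first retirement.
    first_retirement = min(retirements)
    plan = {first_retirement - lead_years: seed_nodes}

    # The seed nodes count against the first replacements that come due.
    spare = seed_nodes
    for year in sorted(retirements):
        due = retirements[year]
        covered = min(spare, due)
        spare -= covered
        if due > covered:
            plan[year] = plan.get(year, 0) + (due - covered)
    return plan


# Start with 3 nodes, then add 4 nodes a year for three years (15 total).
print(refresh_plan({0: 3, 1: 4, 2: 4, 3: 4}))
```

Under these assumptions the largest single purchase is four nodes, and the only up-front commitment is the three node seed cluster bought a year before the first hardware retires.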

Thanks to all who attended.

How to operate IT with full Velocity

I was honored to be able to present this last week at VMworld 2017. I have always been a huge supporter of vBrownBag and was really happy to present for them at VMworld again this year. One of my presentations was on how to operate IT with full velocity, as a follow-up to my post on how to make IT agile. The session was recorded and posted on YouTube. You can view the brief (12 minute) talk here:

Please let me know if you have any feedback or thoughts.

How can you make IT agile?

Every single day I hear the new magic word from IT groups: I need improved agility. It reminds me of how people talked about going to the cloud. Agility is a capability, not a destination. It’s hard to measure, which is the first challenge. I believe when customers ask for agility they are actually asking for business relevance. If you are closely aligned with the business you should be able to respond to the business as needed.

In order to illustrate my point I am going to use a story from my childhood. As a young child of eight I used to play the video game Test Drive.

This early computer car simulator allowed you to drive very high-end sports cars in exotic locations using your keyboard. One day my father was watching me drive. I was frustrated that I could not beat my computer opponents in the race. My father wisely said “Son, the problem is you wait until you are already in the turn to begin to turn. As a driver I turn long before I get into the turn.” This wise counsel has stuck with me in life. If you don’t start to turn gradually in advance then you have to slow down to make the turn. In my case I was having to brake down to 10 kph in order to make the turn. I was being reactive to the turn instead of proactive.

I think this illustrates a common challenge with IT. IT is focused on building the best sports car and assumes that if the car is fast enough it will be able to meet the business needs. Without visibility into the business, how can your sports car make the turns without a massive slowdown? I firmly believe that change is constant and to be expected, even more so with IT. So the challenge is: how do you slow down on building a shiny sports car and still maintain velocity?

 

Signs of IT focus on sports car

How can you tell that an organization has been too focused on building the best sports car? I suggest the following may be signs of a problem:

  • Business wants IT to cut costs year after year while the business is growing
  • Digital initiatives are creating a bi-modal IT (leaving traditional IT behind)

 

How does IT become business focused?

Focusing on the business has been a challenge for traditional IT. It has normally been buffered from the business units by development. Development continues to add value to the business by changing to meet its needs. Developers have the ability to change with velocity because they talk to the business. In order to solve this issue traditional IT needs to have a business focus. Here are some suggestions:

  • Read your company’s 10-K and understand what is important to the C-level
  • Spend time talking to development and the business and understand how each project affects revenue – once you understand the revenue potential of a project use that data to market your impact
  • Get a real understanding of cost – you need to understand the CapEx and OpEx cost of projects and actions so you can project them to the business
  • Start to track SLAs and report on them
  • Track other critical metrics and report on them
  • Marketing of your service is critical – no one cares about your shiny car, they want to know capabilities aligned with the business

Taking these actions will gradually put you in a place where the business includes you in the discussion. Your role once aligned with the business is to say yes to revenue projects and guide them into cost-effective IT solutions to the need.

Never use brakes

When I was first a driver at 16 I was on the freeway. I would accelerate quickly and brake a lot. My mother suggested that I was driving incorrectly. Her father has a simple driving goal: “You should never have to use brakes. You should anticipate slow downs in advance and only use brakes when you come to a stop.” My grandfather has never driven in Paris, Italy or New York! His advice aligns with the goal of understanding the business: once aligned, we should not have to come to a complete stop. Instead we gradually adjust to meet needs.

Hard work

Being aligned with the business is a lot of work.   It requires that you create IT as a service instead of reactive IT.  Proactive IT:

  • Has a road map plan for the next 18 months (but allows for turns)
  • Has robust historical data on business-critical metrics and cost to make business-informed choices
  • Understands how every project aligns to revenue
  • Spends more time planning than implementing
  • Has robust standards for the service and aligns to them

Let me know what you think… am I up in the night?

Upgrading to vSphere 6.5 FAQ

I was recently involved in recording a series of Webinars to help customers understand how to upgrade to vSphere 6.5.   You can see the on demand recordings here:

https://vts.inxpo.com/scripts/Server.nxp?LASCmd=AI:4;F:APIUTILS!51004&PageID=747A2F8A-E3DD-451B-8172-0F8F16EB464B

A number of live questions were asked and I figured I would highlight some frequently asked questions from the series:

Architecture

Q. Is having three platform services controllers and three vCenters, each vCenter pointing to its own PSC, supported?

A. Yes, 100% supported, up to 10 PSCs and 10 VCs total, pointing in any combination you want. If you want Enhanced Linked Mode the PSCs will have to be external.

Q. Is there a manual step to make the load balancer switch to the secondary PSC?

A. Both PSCs are active but only one PSC at a time services requests. Assume we have two PSCs, PSC1 and PSC2, and the load balancer points to PSC1. If PSC1 fails, the load balancer points all traffic to PSC2 and traffic resumes.
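The failover behaviour described in this answer can be sketched in a few lines of Python. This is a simplified model of an active/passive selection, not the configuration of any particular load balancer; the node names and the health map are hypothetical.

```python
def pick_psc(priority_order, healthy):
    """Return the first healthy PSC in priority order, or None if all are down.

    priority_order: list of node names, preferred node first.
    healthy: dict of node name -> bool, e.g. from a health monitor.
    """
    for node in priority_order:
        if healthy.get(node, False):
            return node
    return None


nodes = ["PSC1", "PSC2"]
print(pick_psc(nodes, {"PSC1": True, "PSC2": True}))   # normal operation
print(pick_psc(nodes, {"PSC1": False, "PSC2": True}))  # PSC1 has failed
```

All requests go to one node at a time, and the switch to the second node happens as soon as the health check for the first one fails.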

Q. What is the link for the decision tree to choose platform services controller topologies?

A. https://blogs.vmware.com/vsphere/2016/04/platform-services-controller-topology-decision-tree.html

Q. Do you need external PSC if using products such as site recovery manager?

A. The *only* reason you need an external PSC in v6.5 is if you want to use Enhanced Linked Mode (ELM).

Q. Why should we use the vCenter appliance on 6.5 instead of windows?

A. There are a number of features only available to the appliance including: native vCenter HA, native backup and restore, single click upgrade and simplified support models.

 

Predictive DRS

Q. What are the added requirements on the vCenter server for predictive DRS?

A. You will need to install vROps – at least the Standard edition.

 

Recovery

Q. What happens if the PSC is ‘down’? What functionality do you lose?

A. If a PSC is not functioning, new authentication attempts to vCenter will not work. Already authenticated sessions will remain connected.

Q. When using VCHA how many vCenter licenses are required for the three machines?

A. A single vCenter license for a VCHA setup of three machines.

Q. Can the vCenter appliance backups be scheduled to run on a regular basis?

A. Yes. You can set the tool up to run a one-time backup or to run on a schedule.

 

Security

Q. Is there a hardening guide for vSphere 6.5?

A. Absolutely we just released the hardening guide for vSphere 6.5 at http://www.vmware.com/security/hardening-guides.html.

Q. Can you still encrypt VMs with 3rd party vendors?

A. Of course – those APIs are still available to those vendors.

Q. Will the vmotion encryption slow down the vmotion?

A. Less than 5% but yes. You’ll have to account for time to encrypt / decrypt.

Q. What KMS servers are supported?

A. We support any KMIP 1.1 compliant key management server.

Q. Where are the keys stored for VM encryption?

A. Encryption keys are stored in whatever KMIP 1.1 compliant KMS you decide to deploy. The keys never persist in vCenter and simply pass through to the cluster hosting the workload. The actual key encrypting the VM is stored inside the vmx file, itself encrypted using the KMIP key. Should you lose your vCenter you would simply re-connect with your KMS infrastructure.

Upgrade

Q. Is there any way to change SSO domain in 6.5 after initial installation?

A. Unfortunately, no. If you need to change your SSO domain you must do it in v5.5 before you upgrade (it is also not possible in v6.0).

Q. If you are upgrading from 6.0 to 6.5 with multiple PSC & VCSA on same SSO domain across 2 sites can you upgrade PSC’s over multiple days/weeks & then VCSA’s over days/weeks. Or does it all need to be done in one window?

A. Our official answer is: Mixed-version environments are not supported for production. Use these environments only during the period when an environment is in transition between vCenter Server versions.

Q. Does the upgrade from 6.0 to 6.5 keep your root certificate store?

A. Yes it does – the upgrade does not affect your certificate store.

Q. Is there a converter from vCenter 6.0 on Windows with MSSQL to the 6.5 appliance?

A. The migration tool from 6.0 Windows vCenter to 6.5 vCenter Server Appliance is included as part of the vCenter 6.5 Appliance ISO.

Q. If we want to move vCenter from embedded to external SSO what is the best path?

A. I’d recommend you perform your upgrade to the vCenter appliance using the migration wizard and then, post migration, deploy a new PSC appliance joined to the embedded PSC and repoint your vCenter to this new PSC.

 

Let me know if you have additional questions.

How to migrate current workloads into NSX

I get this question all the time.

How do I migrate my existing VLAN backed workloads into NSX?

The answer is pretty simple but it has some design concerns.   In order to explain the process let’s make some assumptions:

  • You have two virtual machines (VM1, VM2)
  • They are both on the same subnet, which is backed by VLAN 310
  • The subnet assigned to the VLAN is 10.40.10.0/24
  • Subnet 10.40.10.0/24 is routed by physical_router1

 

The environment is shown below:

Let’s assume that our NSX network is also built out at this time as follows:

  • Edge Services gateway ESG_1 provides routing between physical and virtual using OSPF area 10 to peer with physical_router1
  • ESG_1 connects to a distributed logical router (DLR_1)
  • Virtual networks backed by VXLAN operate behind DLR_1
  • ESG_1 is advertising 10.60.10.0/24, which runs on VNI5000

The setup is visualized below:

OK, so how do we get the virtual machines on VLAN310 behind DLR_1 so we can take advantage of all of the NSX routing advantages?

#1 Create a new destination network

Create a new logical switch, which will be VNI5001 (the number is assigned by NSX). At this point don’t assign it a gateway on DLR_1.

#2 Deploy a new Edge gateway

Deploy a new ESG, which we will call ESG_2, with just management interfaces.

#3 Create a bridge

Use ESG_2 to bridge VLAN310 and VNI5001. There are a number of constraints with bridges, which I will mention after the process steps.

#4 Test VNI5001

Put a test virtual machine into 10.40.10.0/24 on VNI5001 and test connectivity to VM1 and VM2.
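Before powering on the test virtual machine it is worth confirming that the address you picked really belongs to the bridged subnet and does not collide with VM1 or VM2. The following Python sketch uses only the standard `ipaddress` module; the VM addresses in the example are hypothetical, since the post only states the subnet.

```python
import ipaddress


def check_test_ip(candidate, subnet, in_use):
    """Classify a candidate IP for the test VM on the bridged network."""
    net = ipaddress.ip_network(subnet)
    ip = ipaddress.ip_address(candidate)
    if ip not in net:
        return "outside subnet"
    if ip in (net.network_address, net.broadcast_address):
        return "reserved address"
    if ip in {ipaddress.ip_address(a) for a in in_use}:
        return "already in use"
    return "ok"


# Hypothetical addresses for VM1 and VM2 on 10.40.10.0/24.
existing = ["10.40.10.11", "10.40.10.12"]
print(check_test_ip("10.40.10.50", "10.40.10.0/24", existing))
print(check_test_ip("10.60.10.50", "10.40.10.0/24", existing))
```

The second call uses an address from the 10.60.10.0/24 network behind VNI5000, which would not be reachable across the bridge without routing.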

 

#5 Move virtual machines into VNI5001

Switch the network interface for VM1 and VM2 into VNI5001. They will drop a single ping and should continue to work.

#6 Change routing

Here is the interruptive part. Currently traffic to VM1 goes from physical_router1 to the switch on VLAN310, through ESG_2 and into VNI5001, which is not an ideal path. We need to switch 10.40.10.0/24 to be advertised by ESG_1. We can do this by removing ESG_2 (which interrupts the network to VM1 and VM2) and adding a gateway for 10.40.10.0/24 on DLR_1 for VNI5001. ESG_1 will then advertise the new subnet to physical_router1. Assuming the route is accepted, which requires that the old route has been removed, traffic will resume.

Bridge mode allows you to migrate into virtual networking without IP address changes. It does cause an interruption. One might wonder whether you could just run bridge mode forever. There are performance and latency concerns to consider with this plan.

Design considerations to bridge mode:

  • An ESG used to provide an L2 bridge maps to a single VLAN, so for each bridge you require a new ESG
  • If the ESG fails, anything on the virtual networking side will fail because it’s the single point of bridging
  • Performance can be impacted, because all traffic crossing the bridge has to go into the ESG bridge and then to the destination VM
  • If redundancy beyond VMware HA is a concern, active / passive ESGs are supported
  • The L2 VLAN must be present on all ESXi hosts that may run the ESG with the bridge

 

Even with these design considerations on the books, this process did not address VLANs that contain both physical and virtual machines. A bridge can provide communication between physical and virtual. This may seem like a good solution but it requires careful design and performance considerations. Single points of failure or configuration challenges on the physical network can cause the whole solution to fail.

You can read more about bridges in VMware’s documentation.

Cross site vMotion requires VMware switch technology

Cross site vMotion is a feature that really shows the power of the VMware platform. When combined with NSX you can move live running virtual machines across long distances. It’s a huge advantage for customers looking to balance workloads or avoid potential disasters. I learned today that this feature requires VMware’s virtual standard switch or distributed switch; it will not work on any third party switches today.

VSS = Virtual Standard Switch

VDS = Virtual distributed switch

There are only certain supported migration paths:

VSS -> VSS

VSS -> VDS

VDS -> VDS

Notice that VDS -> VSS is not supported.
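If you script migrations, the list above is easy to encode as a pre-flight check. This is a minimal Python sketch of that lookup; the function name is made up for illustration and the set contains only the three paths listed in this post.

```python
# Supported source -> destination switch types, as listed above.
SUPPORTED_PATHS = {
    ("VSS", "VSS"),
    ("VSS", "VDS"),
    ("VDS", "VDS"),
}


def is_supported_path(source, destination):
    """Return True if a cross site vMotion between these switch types is listed as supported."""
    return (source.upper(), destination.upper()) in SUPPORTED_PATHS


print(is_supported_path("VSS", "VDS"))
print(is_supported_path("VDS", "VSS"))
```

A check like this only captures the switch-type rule; it says nothing about the other requirements for cross site vMotion.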

Double your storage capacity without buying a new storage shelf

I spent a good portion of my career moving storage from one array to another.   The driver is normally something like this:

  • Cost of older array (life cycle time)
  • New capacity, speed or feature

So off we went on another interruptive migration of LUNs and data. At one point I was sold on physical storage virtualization appliances. They stood in front of the array and allowed me to move data between arrays without interruption to the WWID or application. I loved them; what a great solution. Then storage vMotion became available and 95% of the workloads were running in VMware. I no longer needed the storage virtualization appliance and my life became very VMware focused.

 

New Storage paradigm

With the advent of all flash arrays and HCI (all flash or mixed), performance (speed) has almost gone away as a reason for moving data off arrays. Most arrays offer the same features, replication capability aside. So now we are migrating to new arrays / storage shelves because of capacity or life cycle issues. Storage arrays and their storage shelves have a real challenge with linear growth. They expect you to make a bet on the next three years’ capacity. HCI allows a much better linear growth model for storage.

My HCI Gripe

My greatest gripe with HCI solutions is that everyone needs more storage, but that does not always mean you need more compute. Vendors that provide hardware locked (engineered) platforms suffer from this challenge. The small box provides 10TB, the medium 20TB and the large 40TB. Which do I buy if I need 30TB? I am once again stuck in the making-a-bet problem from arrays (at least it’s a smaller bet). The software based platforms, including VSAN (full disclosure – at time of writing I work for VMware and have run VSAN in my home for three years), have the advantage of offering better mixed sizing and linear growth.
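The 30TB question can be put into numbers with a short Python sketch. It assumes you buy a single fixed-size box from the 10/20/40TB line-up used in the example above; real vendor line-ups and usable-versus-raw capacity will differ.

```python
def smallest_fit(need_tb, sizes_tb=(10, 20, 40)):
    """Return (box_size, overbuy) for the smallest fixed-size box that covers the need.

    Returns None when no single box is large enough.
    """
    fits = [size for size in sorted(sizes_tb) if size >= need_tb]
    if not fits:
        return None
    size = fits[0]
    return size, size - need_tb


print(smallest_fit(30))
```

For a 30TB need the smallest box that fits is the 40TB model, so a quarter of the purchased capacity is a bet on future growth.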

What about massive growth?

What happens when you need to double your storage with HCI and you don’t have spare drive bays available? Do you buy a new set of compute and migrate to it? That’s just a replacement of the storage array model… Recently at some meetings a friend from the Storage and Availability group let me know the VSAN solution to this problem. Quite simply, replace the drives in your compute with larger drives in a rolling fashion. You should create uniform clusters, but it’s totally possible to replace all current drives with new double capacity drives. Double the size of your storage for only the cost of the drives. (Doubling the size of cache is a more complex operation.) Once the new capacity is available and the host is out of maintenance mode, data is migrated by VSAN on to the new disks.

What is the process?

It’s documented in chapter 11 of the VSAN administration guide : https://pubs.vmware.com/vsphere-60/topic/com.vmware.ICbase/PDF/virtual-san-600-administration-guide.pdf

A high level overview of the steps (please use the official documentation):

  1. Put the host into maintenance mode
  2. Remove the disk from the disk group
  3. Replace the disk you removed with the new capacity drive
  4. Rescan for drives
  5. Add disk back into the disk group
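To plan a rolling replacement it helps to know how much raw capacity stays online at each step. The Python sketch below is a deliberately simplified model: it assumes one whole host is offline at a time, identical hosts, and raw capacity only (no failures-to-tolerate or slack space overhead). The cluster size and drive sizes in the example are hypothetical.

```python
def rolling_replacement(hosts, drives_per_host, old_tb, new_tb):
    """Return a list of (step, raw_capacity_online_tb) for a host-by-host drive swap."""
    timeline = []
    for upgraded in range(hosts):
        # One host is in maintenance mode; `upgraded` hosts already have new drives.
        untouched = hosts - upgraded - 1
        online = (upgraded * new_tb + untouched * old_tb) * drives_per_host
        timeline.append((f"host {upgraded + 1} in maintenance", online))
    timeline.append(("complete", hosts * drives_per_host * new_tb))
    return timeline


# Hypothetical 4 host cluster, 2 capacity drives per host, 2TB drives replaced by 4TB drives.
for step, capacity in rolling_replacement(4, 2, 2, 4):
    print(step, capacity)
```

In this example the cluster starts at 16TB raw, dips to 12TB while the first host is in maintenance mode, and ends at 32TB, so the data has to fit in that lowest figure before you begin.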

 

Migrating off a distributed virtual switch to standard switch Article 2

Normally people want to migrate from virtual standard switches to distributed switches. I am a huge fan of the distributed switch and feel it should be used everywhere. The distributed switch becomes a challenge when you want to migrate hosts to a new vCenter. I have seen a lot of migrations to new vCenters via detaching the ESXi hosts and connecting them to the new vCenter. This process works great assuming you are not using the distributed switch. Removing or working with VMs on a ghosted VDS is a real challenge. So remove it before you migrate to a new vCenter.

In this multi-article solution I’ll provide some steps to migrate off a VDS to a VSS.

Article 2: Migrating the host off the VDS. In the last article we moved all the virtual machines off the VDS to a VSS. We now need to migrate vMotion and management off the VDS to a VSS. This step will interrupt management of the ESXi host. Virtual machines will not be interrupted, but management will be. You must have console access to the ESXi host for this to work. Steps at a glance:

  1. Confirm that a switch port exists for management and vMotion
  2. Remove vMotion, etc. from VDS and add to VSS
  3. Remove management from VDS and add to VSS
  4. Confirm settings

Confirm that a switch port exists for management and vMotion

Before you begin, examine the VSS to confirm that the management and vMotion port groups were created correctly by Article 1’s script. Once you’re sure the VLAN settings for the port groups are correct you can move to the next step. You may want to confirm your host isolation settings; it’s possible these steps will cause an HA failure if you take too long to switch over and don’t have independent datastore networking. Best practice would be to disable HA or switch the isolation response to leave powered on.

Remove vMotion, etc. from VDS and add to VSS

Log in to the ESXi host via console or SSH. (Comments are preceded with #)

# Use the following command to identify virtual adapters on your DVS
esxcfg-vswitch -l

# Sample output from my home lab
DVS Name         Num Ports   Used Ports  Configured Ports  MTU     Uplinks
dvSwitch         1792        7           512               1600    vmnic1

  DVPort ID           In Use      Client
  675                 0
  676                 1           vmnic1
  677                 0
  678                 0
  679                 1           vmk0
  268                 1           vmk1
  139                 1           vmk2

# We can see we have three virtual adapters on our host. Use the following command to identify their use and IP addresses
esxcfg-vmknic -l

# Sample output from my home lab, with some details cut out to make it more readable
Interface  Port Group/DVPort   IP Family IP Address
vmk0       679                 IPv4      192.168.10.16
vmk1       268                 IPv4      192.168.10.26
vmk2       139                 IPv4      192.168.10.22

Align your vmk# with vCenter to identify which adapter provides which function (vmk0 management, vmk1 vMotion, vmk2 FT).
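If you have many hosts to work through, you may prefer to capture this mapping programmatically rather than by eye. The Python sketch below parses the trimmed sample output shown in this post. It is an assumption that your captured text has the same four columns; the full `esxcfg-vmknic -l` output has more columns and port group names containing spaces would need different handling.

```python
SAMPLE = """\
Interface  Port Group/DVPort   IP Family IP Address
vmk0       679                 IPv4      192.168.10.16
vmk1       268                 IPv4      192.168.10.26
vmk2       139                 IPv4      192.168.10.22
"""


def parse_vmknics(text):
    """Map each vmk interface to its DVPort ID and IP address."""
    adapters = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a vmk name; the header row does not.
        if len(fields) >= 4 and fields[0].startswith("vmk"):
            adapters[fields[0]] = {"dvport": fields[1], "ip": fields[3]}
    return adapters


for name, details in parse_vmknics(SAMPLE).items():
    print(name, details["dvport"], details["ip"])
```

The DVPort ID and IP address are exactly the two values each remove and add command needs.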

 

# We can now move all adapters other than management, which in my case is vmk0
# We will start with vmk1 on dvSwitch on port 268
esxcfg-vmknic -d -v 268 -s "dvSwitch"

# Then add vmk1 to vSwitch0
esxcfg-vmknic -a -i 192.168.10.26 -n 255.255.255.0 -p PG-vMotion

# Remove FT (vmk2 on port 139) from the DVS and add it to the VSS
esxcfg-vmknic -d -v 139 -s "dvSwitch"
esxcfg-vmknic -a -i 192.168.10.22 -n 255.255.255.0 -p PG-FT

 

Remove management from VDS and add to VSS

Remove management (this stage will interrupt management access to the ESXi host – make sure you have console access). You might want to pretype the add command in the console before you execute the remove. If you are having trouble getting the shell on an ESXi host do the following:

  • Log in to the console and go to Troubleshooting Options -> Enable ESXi Shell
  • Press Alt-Ctrl-F1 to enter the shell and log in

Remove management:

esxcfg-vmknic -d -v 679 -s "dvSwitch"

Add management to VSS:

esxcfg-vmknic -a -i 192.168.10.16 -n 255.255.255.0 -p PG-Mgmt
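Because the management step is done blind from the console, it can help to generate every remove/add pair ahead of time and paste them in order. This Python sketch only builds the command strings, using the same flags as the commands in this post; it does not run anything on the host. The adapter list reproduces the home lab values above, and the helper names are made up for illustration.

```python
def migration_commands(dvport, ip, netmask, portgroup, dvs="dvSwitch"):
    """Return the (remove, add) command strings for one VMkernel adapter."""
    remove = f'esxcfg-vmknic -d -v {dvport} -s "{dvs}"'
    add = f"esxcfg-vmknic -a -i {ip} -n {netmask} -p {portgroup}"
    return remove, add


def build_plan(adapters, netmask="255.255.255.0"):
    """Order the adapters so that management is migrated last."""
    ordered = sorted(adapters, key=lambda adapter: adapter[0] == "management")
    plan = []
    for _role, dvport, ip, portgroup in ordered:
        plan.extend(migration_commands(dvport, ip, netmask, portgroup))
    return plan


# (role, DVPort ID, IP address, destination port group) from the home lab example.
ADAPTERS = [
    ("management", 679, "192.168.10.16", "PG-Mgmt"),
    ("vMotion", 268, "192.168.10.26", "PG-vMotion"),
    ("FT", 139, "192.168.10.22", "PG-FT"),
]

for command in build_plan(ADAPTERS):
    print(command)
```

Sorting on the management flag keeps the risky step at the end, so a typo in an earlier command cannot leave you without management access.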

 

Confirm settings

Ping the host to ensure management networking has returned. Ensure the host returns to vCenter by waiting 2 minutes. After you move the host to a new vCenter you can remove the old VDS via:

  • Go to the host in vCenter and select the DVS; it should provide a remove button.

 

 

 

Migrating off a distributed virtual switch to standard switch Article 1

Normally people want to migrate from virtual standard switches to distributed switches. I am a huge fan of the distributed switch and feel it should be used everywhere. The distributed switch becomes a challenge when you want to migrate hosts to a new vCenter. I have seen a lot of migrations to new vCenters via detaching the ESXi hosts and connecting them to the new vCenter. This process works great assuming you are not using the distributed switch. Removing or working with VMs on a ghosted VDS is a real challenge. So remove it before you migrate to a new vCenter.

In this multi-article solution I’ll provide some steps to migrate off a VDS to a VSS.

It’s important to understand that, assuming networking is correct, this process should not interrupt customer virtual machines. The move from a distributed switch to a standard switch will at most lose a ping. When you assign a new network adapter a gratuitous ARP is sent out the new adapter. If you only have two network adapters this process does remove network adapter redundancy while moving.

Step 1: Create a VSS with the same port groups

You need to create a standard switch with port groups on the correct VLAN IDs. You can do this manually, but one of the challenges of the standard switch is that the port group name must be exactly the same, including case, to avoid vMotion errors (one great reason for the VDS). So we will use a script to create the standard switch and port groups, using PowerCLI (sorry Orchestrator friends, I didn’t do it in Orchestrator this time).

Code:

#Import modules for PowerCLI
Import-Module -Name VMware.VimAutomation.Core
Import-Module -Name VMware.VimAutomation.Vds

#Variables to change
$standardSwitchName = "StandardSwitch"
$dvSwitchName = "dvSwitch"
$cluster = "Basement"
$vCenter = "192.168.10.14"

#Connect to vCenter
Connect-VIServer -Server $vCenter

#Get all DVS port groups with their VLAN IDs
$dvsPGs = Get-VirtualSwitch -Name $dvSwitchName | Get-VirtualPortGroup | Select-Object Name, @{N="VLANId";E={$_.ExtensionData.Config.DefaultPortConfig.Vlan.VlanId}}, NumPorts

#Get all ESXi hosts in a cluster
$vmhosts = Get-Cluster -Name $cluster | Get-VMHost

#Loop ESXi hosts
foreach ($vmhost in $vmhosts)
{
    #Create new VSS
    $vswitch = New-VirtualSwitch -VMHost $vmhost -Name $standardSwitchName -Confirm:$false

    #Loop port groups and create
    foreach ($dvsPG in $dvsPGs)
    {
        #Only create port groups whose VLAN ID is a single number; the DVUplink port group returns an array
        if ($dvsPG.VLANId -is [int])
        {
            New-VirtualPortGroup -Name $dvsPG.Name -VirtualSwitch $vswitch -VlanId $dvsPG.VLANId -Confirm:$false
        }
    }
}

 

Explained:

  • Provide variables
  • Connect to vCenter
  • Get all port groups into $dvsPGs
  • Get all ESXi hosts
  • Loop through ESXi hosts one at a time
  • Create the new standard switch
  • Loop through port groups and create them with the same name and VLAN ID as on the DVS

This will create a virtual standard switch with the same VLAN and port group configuration as your DVS.

I like to be able to validate that the source and destination are configured the same, so this PowerCLI script provides the checking:

Code:

#Validation check DVS vs VSS for differences
#Reuses $dvSwitchName, $standardSwitchName and $cluster from the previous script

$dvsPGs = Get-VirtualSwitch -Name $dvSwitchName | Get-VirtualPortGroup | Select-Object Name, @{N="VLANId";E={$_.ExtensionData.Config.DefaultPortConfig.Vlan.VlanId}}, NumPorts

#Get all ESXi hosts in a cluster
$vmhosts = Get-Cluster -Name $cluster | Get-VMHost

#Loop ESXi hosts
foreach ($vmhost in $vmhosts)
{
    #Get VSS port groups for this host
    $VSSPortGroups = $vmhost | Get-VirtualSwitch -Name $standardSwitchName | Get-VirtualPortGroup

    #Loop on DVS port groups
    foreach ($dvsPG in $dvsPGs)
    {
        #Skip the DVUplink port group, whose VLAN ID is an array
        if ($dvsPG.VLANId -is [int])
        {
            #Look for a VSS port group with exactly the same name
            $match = $false
            foreach ($VSSPortGroup in $VSSPortGroups)
            {
                #-ceq is case sensitive, which matters because the names must match exactly
                if ($dvsPG.Name -ceq $VSSPortGroup.Name)
                {
                    $match = $true
                }
            }
            if ($match -eq $false)
            {
                #Report the DVS port group that has no counterpart on this host
                Write-Host "Did not find a match for DVS:" $dvsPG.Name "on" $vmhost.Name
            }
        }
    }
}

 

Explained:

  • Get the VDS port groups
  • Get all ESXi hosts
  • Loop through the hosts
  • Get the port groups on the standard switch
  • Loop through the DVS port groups and look for a match on the standard switch
  • If missing then output the missing element
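The comparison itself is easy to reason about outside of PowerCLI. The Python sketch below shows the same idea on plain dictionaries of port group name to VLAN ID, and additionally flags VLAN mismatches, which the script above does not check. The port group data is hypothetical.

```python
def compare_port_groups(dvs, vss):
    """Compare DVS and VSS port groups given as {name: vlan_id} dictionaries.

    Returns (missing, mismatched): names absent from the VSS (case sensitive)
    and names present on both but with a different VLAN ID.
    """
    missing = sorted(name for name in dvs if name not in vss)
    mismatched = sorted(
        name for name in dvs if name in vss and dvs[name] != vss[name]
    )
    return missing, mismatched


# Hypothetical data: one name differs only by case, one VLAN ID is wrong.
dvs = {"PG-Mgmt": 10, "PG-vMotion": 20, "PG-FT": 30}
vss = {"PG-Mgmt": 10, "pg-vmotion": 20, "PG-FT": 31}

missing, mismatched = compare_port_groups(dvs, vss)
print("Missing on VSS:", missing)
print("VLAN mismatch:", mismatched)
```

Dictionary lookups in Python are case sensitive, so a port group that differs only by case is reported as missing, which is the behaviour you want given the exact-name requirement.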

 

 

Now we need to give the standard switch an uplink (this is critical, otherwise VMs will lose their network when moved).

 

Once it has an uplink you can use the following script to move all virtual machines:

 

Code:

#Move virtual machines to new adapters
#Reuses $standardSwitchName from the first script

$vms = Get-VM

foreach ($vm in $vms)
{
    #Grab the virtual switch for the VM's host
    $vss = Get-VirtualSwitch -Name $standardSwitchName -VMHost $vm.VMHost

    #Check that the virtual switch has at least one physical adapter
    if ($vss.ExtensionData.Pnic.Count -gt 0)
    {
        $adapters = $vm | Get-NetworkAdapter

        #Loop through adapters
        foreach ($adapter in $adapters)
        {
            #Get VSS port groups of the same name; this returns the port group on all hosts
            $VSSPortGroups = Get-VirtualPortGroup -Name $adapter.NetworkName -VirtualSwitch $standardSwitchName

            #Loop the returned port groups
            foreach ($VSSPortGroup in $VSSPortGroups)
            {
                #Search for the port group on our host
                if ([string]$VSSPortGroup.VMHostId -eq [string]$vm.VMHost.Id)
                {
                    #Change network adapter to standard switch
                    Set-NetworkAdapter -NetworkAdapter $adapter -Portgroup $VSSPortGroup -Confirm:$false
                }
            }
        }
    }
}

 

Explained:

  • Uses the same variables as the previous script
  • Get all virtual machines (you could use get-vm “name-of-vm” to test a single VM)
  • Loop through all virtual machines one at a time
  • Get the VSS for the VM (host specific)
  • Check for at least one physical uplink to the switch (gut / sanity check)
  • Loop through the adapters on a virtual machine
  • For each adapter, find the VSS port group with the same name as its VDS port group and switch the adapter
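Before running a move across every virtual machine, a dry run that lists which adapters would move and which would be skipped can save surprises. This Python sketch models that decision on plain data; the inventory below is hypothetical and would in practice be exported from vCenter.

```python
def plan_moves(vm_adapters, vss_port_groups):
    """Split adapters into those that can move and those that are blocked.

    vm_adapters: list of (vm_name, host_name, network_name) tuples.
    vss_port_groups: {host_name: set of standard switch port group names}.
    """
    moves, blocked = [], []
    for vm, host, network in vm_adapters:
        # An adapter can move only if its host has a same-named VSS port group.
        if network in vss_port_groups.get(host, set()):
            moves.append((vm, host, network))
        else:
            blocked.append((vm, host, network))
    return moves, blocked


# Hypothetical inventory: esx02 is missing the PG-App port group.
adapters = [
    ("VM1", "esx01", "PG-App"),
    ("VM2", "esx02", "PG-App"),
]
port_groups = {"esx01": {"PG-App", "PG-Mgmt"}, "esx02": {"PG-Mgmt"}}

moves, blocked = plan_moves(adapters, port_groups)
print("Would move:", moves)
print("Blocked:", blocked)
```

Anything in the blocked list points to a host where the standard switch port groups were not created or do not match by name.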