PowerShell functions for tags

Some quick PowerShell (PowerCLI) functions for working with tags on virtual machines. Enjoy:

# Add a tag to a VM
function add_tag_to_vm($VM, $TAG)
{
    Get-VM -Name $VM | New-TagAssignment -Tag $TAG
}

# Add a tag to all virtual machines in a folder
function add_tag_to_all_vm_in_folder($FOLDER, $TAG)
{
    Get-Folder $FOLDER | Get-VM | New-TagAssignment -Tag $TAG
}

# List all VMs with a specific tag
function get_vms_with_tag($TAG)
{
    return Get-VM -Tag $TAG
}
# Check for the presence of a tag on a VM
# Returns the VM object if it carries the tag, otherwise nothing
function check_vm_for_tag($VM, $TAG)
{
    Get-VM -Name $VM | Where-Object { (Get-TagAssignment -Entity $_).Tag.Name -contains $TAG }
}



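# Remove a tag from a VM
function remove_tag_from_vm($VM, $TAG)
{
    $myVM = Get-VM -Name $VM
    # Get-TagAssignment filters by entity (or category), so match the tag name explicitly
    $myTagAssignment = Get-TagAssignment -Entity $myVM | Where-Object { $_.Tag.Name -eq $TAG }
    if ($myTagAssignment) {
        Remove-TagAssignment -TagAssignment $myTagAssignment -Confirm:$false
    }
}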

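# Remove a tag from all VMs in a folder
function remove_tag_from_all_vm_in_folder($FOLDER, $TAG)
{
    $myVM = Get-Folder $FOLDER | Get-VM
    # Match on the tag name; -Category would remove every tag in a category instead
    $myTagAssignment = Get-TagAssignment -Entity $myVM | Where-Object { $_.Tag.Name -eq $TAG }
    if ($myTagAssignment) {
        Remove-TagAssignment -TagAssignment $myTagAssignment -Confirm:$false
    }
}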

Operational aspects of HCI video

While attending VMworld 2017 I presented on some of the operational aspects of hyper-converged infrastructure. I believe the key takeaways were:

  • Hyper-converged has to be about more than just storage to deliver its real benefits
  • Hyper-converged has a different scalability model (more linear)
  • Hyper-converged requires a different organizational structure to be successful
  • Hyper-converged performance and availability are policy based instead of location based

You can watch the video here:

 

One question I was asked after the presentation was about scalability. It was a really good question, so I wanted to answer it here. Let’s assume that you start with a 3 node cluster. Over three years you add 12 nodes, for a total of 15 nodes. At some point newer hardware types become available. Does that mean you now need to buy 15 brand new nodes, turning incremental growth into a major investment?

  • The answer is yes and no. At some point you do have to make the three node investment in new hardware, but you should do it long before your current cluster reaches end of life so you can grow the new hardware organically. This should be taken into account in your growth models.

Thanks to all who attended.

How to operate IT with full Velocity

I was honored to be able to present this last week at VMworld 2017. I have always been a huge supporter of vBrownBag and was really happy to present for them at VMworld again this year. One of my presentations was on how to operate IT with full velocity, a follow-up to my post on how to make IT agile. The session was recorded and posted on YouTube. You can view the brief (12 minute) talk here:

Please let me know if you have any feedback or thoughts.

Upgrading to vSphere 6.5 FAQ

I was recently involved in recording a series of webinars to help customers understand how to upgrade to vSphere 6.5. You can see the on-demand recordings here:

https://vts.inxpo.com/scripts/Server.nxp?LASCmd=AI:4;F:APIUTILS!51004&PageID=747A2F8A-E3DD-451B-8172-0F8F16EB464B

A number of live questions were asked and I figured I would highlight some frequently asked questions from the series:

Architecture

Q. Is having three Platform Services Controllers and three vCenters, with each vCenter pointing to its own PSC, supported?

A. Yes, 100% supported, up to 10 PSCs and 10 vCenters in total, pointing in any combination you want. If you want Enhanced Linked Mode the PSCs will have to be external.

Q. Is there a manual step to make the load balancer switch to the secondary PSC?

A. Both PSCs are active, but only one PSC at a time services requests. Assume we have two PSCs, PSC1 and PSC2, and the load balancer points to PSC1. If PSC1 fails, the load balancer points all traffic to PSC2 and traffic resumes.

Q. What is the link for the decision tree to choose platform services controller topologies?

A. https://blogs.vmware.com/vsphere/2016/04/platform-services-controller-topology-decision-tree.html

Q. Do you need external PSC if using products such as site recovery manager?

A. The *only* reason you need an external PSC in v6.5 is if you want to use Enhanced Linked Mode (ELM).

Q. Why should we use the vCenter appliance on 6.5 instead of Windows?

A. There are a number of features only available to the appliance including: native vCenter HA, native backup and restore, single click upgrade and simplified support models.

 

Predictive DRS

Q. What are the added requirements on the vCenter server for predictive DRS?

A. You will need to install vROps, at least the Standard edition.

 

Recovery

Q. What happens if the PSC is ‘down’? What functionality do you lose?

A. If a PSC is not functioning new authentication attempts to vCenter will not work. Already authenticated sessions will remain connected.

Q. When using VCHA how many vCenter licenses are required for the three machines?

A. A single vCenter license for a VCHA setup of three machines.

Q. Can the vCenter appliance backups be scheduled to run on a regular basis?

A. Yes. You can set the tool up to run a one-time backup or to run on a schedule.

 

Security

Q. Is there a hardening guide for vSphere 6.5?

A. Absolutely. We just released the hardening guide for vSphere 6.5 at http://www.vmware.com/security/hardening-guides.html.

Q. Can you still encrypt VMs with 3rd party vendors?

A. Of course – those APIs are still available to those vendors.

Q. Will vMotion encryption slow down the vMotion?

A. Yes, but by less than 5%. You’ll have to account for the time to encrypt and decrypt.

Q. What KMS servers are supported?

A. We support any KMIP 1.1 compliant key management server.

Q. Where are the keys stored for VM encryption?

A. Encryption keys are stored in whatever KMIP 1.1 compliant KMS you decided to deploy. The keys never persist in vCenter and simply pass through to the cluster hosting the workload. The actual key encrypting the VM is stored in the vmx file, itself encrypted using the KMIP key. Should you lose your vCenter, you would simply reconnect to your KMS infrastructure.

Upgrade

Q. Is there any way to change SSO domain in 6.5 after initial installation?

A. Unfortunately no. If you need to change your SSO domain you must do it in v5.5 before you upgrade (it is also not possible in v6.0).

Q. If you are upgrading from 6.0 to 6.5 with multiple PSCs and VCSAs in the same SSO domain across 2 sites, can you upgrade the PSCs over multiple days/weeks and then the VCSAs over days/weeks? Or does it all need to be done in one window?

A. Our official answer is: Mixed-version environments are not supported for production. Use these environments only during the period when an environment is in transition between vCenter Server versions.

Q. Does the upgrade from 6.0 to 6.5 keep your root certificate store?

A. Yes it does. The upgrade does not affect your certificate store.

Q. Do we have a converter from vCenter 6.0 on Windows with MSSQL to the 6.5 Appliance?

A. The migration tool from 6.0 Windows vCenter to 6.5 vCenter Server Appliance is included as part of the vCenter 6.5 Appliance ISO.

Q. If we want to move vCenter from embedded to external SSO what is the best path?

A. I’d recommend you perform your upgrade to the vCenter appliance using the migration wizard. Post migration, deploy a new PSC appliance joined to the embedded PSC and repoint your vCenter to this new PSC.

 

Let me know if you have additional questions.

How to migrate current workloads into NSX

I get this question all the time.

How do I migrate my existing VLAN backed workloads into NSX?

The answer is pretty simple but it has some design concerns.   In order to explain the process let’s make some assumptions:

  • You have two virtual machines (VM1, VM2)
  • Both are on the same subnet, which is backed by VLAN 310
  • The subnet assigned to the VLAN is 10.40.10.0/24
  • Subnet 10.40.10.0/24 is routed by physical_router1

 

The environment is shown below:

Let’s assume that our NSX network is also built out at this time as follows:

  • Edge Services gateway ESG_1 provides routing between physical and virtual using OSPF area 10 to peer with physical_router1
  • ESG_1 connects to a distributed logical router (DLR_1)
  • Virtual networks backed by VXLAN operate behind DLR_1
  • ESG_1 is advertising 10.60.10.0/24, which runs on VNI5000

The setup is visualized below:

OK, so how do we get the virtual machines on VLAN310 behind DLR_1 so we can take advantage of all of the NSX routing advantages?

#1 Create a new destination network

Create a new logical switch, which will be VNI5001 (the number is assigned by NSX). At this point don’t assign it a gateway on DLR_1.

#2 Deploy a new Edge gateway

Deploy a new ESG, which we will call ESG_2, with just management interfaces.

#3 Create a bridge

Use ESG_2 to bridge VLAN310 and VNI5001. There are a number of constraints with bridges, which I will mention after the process steps.

#4 Test VNI5001

Put a test virtual machine into 10.40.10.0/24 on VNI5001 and test connectivity to VM1 and VM2.

 

#5 Move virtual machines into VNI5001

Switch the network interfaces for VM1 and VM2 into VNI5001. They should drop a single ping and then continue to work.

#6 Change routing

Here is the interruptive part. Currently routing to VM1 goes from physical_router1 to the switch on VLAN310, then through ESG_2 into VNI5001, which is not an ideal path. We need to switch 10.40.10.0/24 to be advertised by ESG_1. We can do this by removing ESG_2 (which interrupts the network to VM1 and VM2) and adding a gateway for 10.40.10.0/24 on DLR_1 for VNI5001. ESG_1 will then advertise the new subnet to physical_router1. Assuming the route is accepted because the old route has been removed, traffic will resume.

Bridge mode allows you to migrate into virtual networking without IP address changes. It does cause an interruption. One might wonder whether you could just run bridge mode forever. There are performance and latency concerns to consider with that plan.

Design considerations to bridge mode:

  • An ESG used to provide an L2 bridge maps to a single VLAN, so each bridge requires a new ESG
  • If the ESG fails, anything on the virtual networking side will fail because the ESG is the single point of bridging
  • Performance can be impacted because all traffic crossing the bridge has to route into the ESG bridge and then to the destination VM
  • If redundancy beyond VMware HA is a concern, active/passive ESGs are supported
  • The L2 VLAN must be present on all ESXi hosts that may run the ESG with the bridge

 

These design considerations do not address VLANs that contain both physical and virtual machines. A bridge can provide communication between physical and virtual. This may seem like a good solution, but it requires careful design and performance considerations. Single points of failure or configuration challenges on the physical network can cause the whole solution to fail.

You can read more about bridges in VMware’s documentation here.

Cross site vMotion requires VMware switch technology

Cross site vMotion is a feature that really shows the power of the VMware platform. When combined with NSX you can move live, running virtual machines across long distances. It’s a huge advantage for customers looking to balance workloads or avoid potential disasters. I learned today that this feature requires VMware’s virtual standard switch or distributed switch; it will not work on any third party switches today.

VSS = Virtual Standard Switch

VDS = Virtual distributed switch

There are only certain supported migration paths:

VSS -> VSS

VSS -> VDS

VDS -> VDS

Notice that VDS -> VSS is not supported.
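The supported paths above can be captured as a small lookup, which is handy if you want a script to refuse an unsupported move before attempting it. This is a minimal sketch in Python; the function name and the plain string labels are my own illustration, not part of any VMware tooling.

```python
# Supported source -> destination virtual switch combinations, as listed above.
SUPPORTED_PATHS = {
    ("VSS", "VSS"),
    ("VSS", "VDS"),
    ("VDS", "VDS"),
}


def is_supported_path(source: str, destination: str) -> bool:
    """Return True if a cross site vMotion between these switch types is supported."""
    return (source.upper(), destination.upper()) in SUPPORTED_PATHS
```

Anything not in the set, including VDS -> VSS and any third party switch label, comes back as unsupported.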

Repoint 6.x vCenter to a new PSC

Since a vCenter has a connection to a single PSC, it’s important to understand how to move between PSCs and how to deploy new ones when old ones have failed. This article details that mobility and process.

 

Once installed, check for a working vCenter

Then log in via SSH and check which PSC is being used

 

Let’s repoint it to psc2.griffiths.local

cmsso-util repoint --repoint-psc psc2.griffiths.local

 

Now we are pointing to psc2 at site1. In 6.0 you were able to repoint a vCenter to a PSC at a different site. This is no longer available in 6.5 (yep, no longer possible; remember this, because trying to repoint across sites can cause some really bad stuff in 6.5).

As you can see, we have repointed the vCenter from psc1 to psc2 at the same site:

So what do you do when all your PSCs at a site have failed? (First off, don’t have a single PSC at a site.) Or this:

Install a new PSC pointing to a remaining PSC at another site. We will use psc3 at site2 to create a new psc5 at site1. In order to test this I shut down psc1 and psc2 to simulate failures.


So we are creating:

 

After the PSC is installed it will replicate with psc3.griffiths.local only. We can then repoint vc1 to psc5 and rebuild the missing PSCs at site1. First we have to make sure psc5 was deployed correctly by visiting its web page:

Now we can repoint the vCenter to psc5 at site1.

 

Login to the vCenter web client to test working authentication via psc5

And it’s working

 

William Lam posted a script to automatically change vCenter to a new PSC when a failure is detected. It’s here:

http://www.virtuallyghetto.com/2015/12/how-to-automatically-repoint-failover-vcsa-to-another-replicated-platform-services-controller-psc.html

 

For those who don’t want to read the script, it’s very simple: it runs on the vCenter appliance and checks the PSC web page for a return code of 200. If the check fails 3 times it switches to another PSC. It runs as an automated task every x minutes.
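To make the idea concrete, here is a rough Python sketch of that decision logic. It is not William Lam’s script. The function names, the choice to retry three times within a single run, and the URL you probe are all my own assumptions for illustration. The probe is passed in as a function so the logic can be exercised without a live PSC.

```python
import ssl
import urllib.error
import urllib.request
from typing import Callable, Optional, Sequence


def http_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response was received."""
    context = ssl.create_default_context()
    # Assumption: lab PSCs often present self-signed certificates, so verification is off.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def choose_psc(
    active: str,
    standby: Sequence[str],
    probe: Callable[[str], Optional[int]],
    attempts: int = 3,
) -> str:
    """Return the PSC the vCenter should point at after this health check."""
    # Keep the active PSC if any of the attempts returns 200.
    for _ in range(attempts):
        if probe(active) == 200:
            return active
    # The active PSC failed every attempt: pick the first standby that answers 200.
    for candidate in standby:
        if probe(candidate) == 200:
            return candidate
    # Nothing is healthy, so leave the configuration alone.
    return active
```

A scheduled task could call `choose_psc("psc1.griffiths.local", ["psc2.griffiths.local"], lambda host: http_status(f"https://{host}/"))` (the URL path here is a placeholder) and, if the result differs from the active PSC, run the `cmsso-util repoint --repoint-psc` command shown earlier against the returned host.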

 

Remove an old PSC

Log in to any PSC and type the following command:

 

cmsso-util unregister --node-pnid OLD_PSC_Name --username administrator@sso_domainname

 

So to remove psc1.griffiths.local I would type:

 

cmsso-util unregister --node-pnid psc1.griffiths.local --username administrator@vsphere.local

 

 

 

Setting up replication on the platform services controller

In the previous set of articles I discussed the following:

In this article I will discuss how to create the reference design from the following:

 

The reference replication design should be:

So we need to add two new replication agreements.

PSC1 <-> PSC3

First log in to psc1.griffiths.local and enable the shell. Then change directory to /usr/lib/vmware-vmdir/bin

Look at the current agreements

Add an agreement with psc3.griffiths.local using the following command:

./vdcrepadmin -f createagreement -2 -h psc3.griffiths.local -u Administrator -H psc1.griffiths.local

You can see that we now have two agreements. Rinse and repeat for PSC2 <-> PSC4 and we have the VVD topology. Notice that in this example I did everything from PSC1 while referencing different PSCs:

Now you have your replication agreements. Looking at the vdcrepadmin command you have the following options:

vdcrepadmin -f showpartners

vdcrepadmin -f showpartnerstatus

vdcrepadmin -f showservers

vdcrepadmin -f createagreement [-2]

vdcrepadmin -f removeagreement [-2]

vdcrepadmin -f isfirstcycledone

This essentially allows you to:

  • Show replication partners – showpartners
  • Show replication status (including latest replication number) – showpartnerstatus
  • Show all PSC’s in domain – showservers
  • Create replication agreement – createagreement
  • Remove replication agreement (don’t remove all) – removeagreement
  • Check that the PSC has done initial replication – isfirstcycledone

So there you have it. Health is checked via showpartnerstatus.
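When planning which agreements to add, it can help to diff the current topology against the target one. The following is a minimal Python sketch. The current and target pairs are values you type in yourself (for example from showpartners output), nothing is read from a live PSC, and the helper names are hypothetical. The generated string mirrors the createagreement call shown above; applying the same argument order to other pairs is my assumption.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def missing_agreements(current: Iterable[Pair], target: Iterable[Pair]) -> List[Pair]:
    """Return the two-way agreements in target that are absent from current."""
    # An agreement between A and B is the same as one between B and A.
    have = {frozenset(pair) for pair in current}
    want = {frozenset(pair) for pair in target}
    return sorted(tuple(sorted(pair)) for pair in want - have)


def createagreement_command(pair: Pair, user: str = "Administrator") -> str:
    """Format a two-way createagreement call in the same shape as the one above."""
    local, remote = pair
    return f"./vdcrepadmin -f createagreement -2 -h {remote} -u {user} -H {local}"
```

For example, with a current agreement of psc1 <-> psc2 and a target that also includes psc1 <-> psc3, `missing_agreements` returns only the psc1/psc3 pair, and `createagreement_command` formats it as the command used earlier in this article.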

 

Installing Platform services controller

In the previous post I discussed the reference architecture and design tips for the PSC.  Here are all the posts in the series:

In this article I will set up four Platform Services Controllers at two sites. In my home lab I didn’t want to create complexity by using two different routed networks, so I left them on the same subnet but created two sites inside the domain. By the end of this article I will have created:

Four PSCs at two sites. So let’s get into the install:

I will use the 6.0 appliance for this article. Before starting I set up DNS for all the names, both forward and reverse.

Connect to the ESXi host:

Name here is the name that will show up in vCenter

Choose a PSC

Creating a new SSO domain and calling it vsphere.local with the site name of site1

I changed the IP after this screenshot, so it should read 192.168.10.160. Because it’s a demo I am syncing time with my ESXi host; in a real deployment you want to sync with NTP.

After the installer is complete, try to visit the web page for the newly installed PSC. If you don’t get this page, remove the PSC and try again. Error handling is not the best.

 

 

I enabled SSH (if you forgot, log in to the console and enable it there), so I SSH’ed into the new PSC to perform some checks. Remember to use shell.set --enabled True to enable the bash shell, then type shell to enter it.

Change directory to /usr/lib/vmware-vmdir/bin/ and execute the vdcrepadmin command as shown to identify all PSCs in the domain

The command showpartners shows the replication partners (which don’t exist yet)

psc2.griffiths.local

We are going to join the current domain, so I will not repeat screenshots that are redundant. The key thing to notice here is that I am creating a replication agreement between psc1 and psc2 that is bi-directional. (Does the PSC support uni-directional replication? No. Can you do it? Yes.)

Here we are joining the PSC domain and entering the name of its initial replication partner:

Notice how it pulls the site names out of psc1

 

 

Save and go to the web site to check after deployment

Log in via SSH. Notice here that I have used -h (host) to point to psc1.griffiths.local; any PSC can be queried from any other. showservers now shows both PSCs.

 

 

showpartners (the replication agreements) shows that psc1.griffiths.local is in a replication agreement with psc2. If you ran the command on host psc2 it would show psc1.griffiths.local as the replication partner.

Now we want to add a second site and another PSC and replicate it with psc2

Key item: I have entered psc2 as the PSC to join, not psc1. If I had entered psc1 here it would replicate with psc1 and ignore psc2.

Once deployed go to web page to confirm working deployment

SSH in to test connections. Notice the replication agreements for psc2 now show two partners. The showservers command also shows the site name for each PSC.

Here you can see the full replication agreements

PSC4

Skipping known steps and joining to psc3 site2

 

Once deployed check for working PSC via web portal

 

Login via ssh and check replication

 

Replication with only psc3.griffiths.local

We now have:

As expected. This is not quite the ring, but closer.

 

Long Distance Cross vCenter vMotion requirements

The ability to move virtual machines long distances between two datacenters while running seems like the key example of the power of abstraction.   VMware has enabled this feature but it has a number of requirements that make the cost of ownership a little high.    All of these requirements are listed in VMware KB articles but you have to mine them for the details to ensure you are compatible.   Having recently been stung by these requirements I thought I would collect them into a single location.

Assumptions:

The following assumptions are made:

  • You are running two vCenters one at each site
  • You are running virtual distributed switches at each site

KB Articles mined for the data

Requirements

  • The source and destination vCenter server instances and ESXi hosts must be running version 6.0 or later.
  • Requires Enterprise Plus licensing
  • When initiating the moves in the web client both source and destination vCenter instances must be in Enhanced Linked mode and in the same vCenter Single Sign-On domain (When using API this is not a requirement)
  • Both vCenter Servers must be time synced for SSO to work
  • For migration of compute resources only, both ESXi hosts must be connected to the shared virtual machine storage.
  • When using the vSphere APIs/SDK, both vCenter Server instances may exist in separate vSphere Single Sign-On domains. Additional parameters are required when performing a non-federated cross vCenter Server vMotion.
  • MAC addresses must not conflict (different vCenter IDs will ensure this)
  • vMotion cannot take place from distributed switch to standard switch
  • vMotion cannot take place between distributed switches of different versions (source and destination vDS must be the same version)
  • RTT (round-trip time) latency of 150 milliseconds or less, between hosts
  • You must create a routable network for cold migration traffic (the Provisioning traffic type on a VMkernel adapter)

 

These requirements can really bite you if you are not careful. Notice there are no constraints on vMotioning from a standard switch to a distributed switch, which helps you get around version differences. The truth is that vMotion is a miracle of engineering, and cross vCenter vMotion is an even better miracle, but it comes at a cost. Essentially, in the best case scenario you have to have two vCenters in Enhanced Linked Mode, on the same version of ESXi, with the same hardware type or in EVC, and with the same version of distributed switches. It’s a lot of asks to enable the feature, and something to consider if you’re planning on using long distance cross vCenter vMotion.
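Several of the requirements above are mechanical enough to check in a script before you attempt a move. Here is a rough Python sketch of such a pre-flight check. The `Site` fields and function name are my own invention, the values are ones you supply by hand rather than anything queried from vCenter, and only a subset of the listed requirements is covered.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Site:
    """Facts about one side of the migration (all values supplied by you)."""
    vcenter_version: Tuple[int, int]   # e.g. (6, 5)
    esxi_version: Tuple[int, int]
    license_edition: str               # e.g. "Enterprise Plus"
    sso_domain: str
    switch_type: str                   # "VSS" or "VDS"
    vds_version: Optional[str] = None  # only meaningful for a VDS


def preflight(src: Site, dst: Site, rtt_ms: float, via_web_client: bool = True) -> List[str]:
    """Return a list of requirement violations; an empty list means none were found."""
    problems: List[str] = []
    for label, site in (("source", src), ("destination", dst)):
        if site.vcenter_version < (6, 0) or site.esxi_version < (6, 0):
            problems.append(f"{label}: vCenter and ESXi must be 6.0 or later")
        if site.license_edition != "Enterprise Plus":
            problems.append(f"{label}: Enterprise Plus licensing is required")
    # The same-SSO-domain rule applies to the web client, not to API-driven moves.
    if via_web_client and src.sso_domain != dst.sso_domain:
        problems.append("web client moves need both vCenters in the same SSO domain")
    if src.switch_type == "VDS" and dst.switch_type == "VSS":
        problems.append("distributed switch to standard switch is not allowed")
    if (src.switch_type, dst.switch_type) == ("VDS", "VDS") and src.vds_version != dst.vds_version:
        problems.append("source and destination VDS must be the same version")
    if rtt_ms > 150:
        problems.append("round-trip latency must be 150 ms or less")
    return problems
```

Returning a list of problems rather than a single true/false makes it easier to see every blocker at once, which matters when each fix (licensing, switch upgrades, SSO domains) has its own lead time.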