Two simple things to improve your life

Warning: this is an end-of-year personal post and has very little to do with technology, so if you are looking for a technology post feel free to skip it. Two things have been rattling around in my head of late and I wanted to share them as an end-of-year post. These two things have improved my life many times over.

Perception determines reality

I do not wish to diminish the real life challenges you each face.  I have lived long enough to understand that each of us faces monstrous mountains throughout our lives.   Some of us face challenges with family, friends, relationships, actions of others, health and many other things that are real. 

When we face these mountains, our outlook truly can change the outcomes. I learned this in a simple way many years ago. I was a young missionary serving a dedicated two years spreading the gospel. I was 22 years old and in Michigan, 2,000 miles away from any family. I had been a missionary for over 14 months and was well experienced in the constant negative response we received to our work. I was assigned to train a new missionary, and it was Christmas time. It’s a particularly hard time to be away from family and virtually cut off (we were only allowed to communicate via mail once a week). We had a particularly hard area full of rich people (they are generally not receptive to our message). There were many days it poured freezing rain or snow while we traveled by bike. It was cold and dark.

A week before Christmas we discussed getting a Christmas tree and determined that our monthly food budget of $115 could not afford a tree. So we pressed on with the long days of work (9 am – 8 pm knocking on doors, six days a week). My new missionary friend always had a great attitude; nothing fazed him. We joked and talked every day while we seemed to be accomplishing nothing. One night I was preparing for bed and came out into the front room to discover my companion coloring boxes green. When I asked him what he was doing, he simply stated that he was making a Christmas tree. Later that night he stacked the boxes of various sizes on top of each other to roughly resemble a tree. He then took a red marker and drew balls on the tree. Satisfied with the results, he went to bed.

He and I spent three months together knocking on doors for 11 hours a day and didn’t get the opportunity to teach a single lesson. By all results we had an epic fail. Looking back 16 years later, I can clearly say I learned one of life’s most important lessons: it is not the results that count, it is how you face them.

I have had many failures in my life since then, but I have been able to remember the lessons he taught me: make the best of what you have and don’t let any external event get you down. You might have to make a Christmas tree out of boxes, or lower your expectations and consider getting out of bed the crowning achievement of the day, but your perception determines your reality, not any external event.

 

By small and simple things great things are brought to pass

When I was young I was convinced that I needed to find some great event to prove my nobility, and that great things happen in a single moment. It is simply not true. While people are noble and at times do great things, I suggest this is just an extension of the many small things they have done for many years. A friend once called it healthy life habits practiced regularly. I have learned that I do my best work in small bursts practiced regularly. Simple things like choosing to go to bed on time make me a better father the next day; practiced consistently for a lifetime, they make me a better father for life. Choosing to allocate time to service makes me less selfish for a day; practiced each day, it makes me a less selfish person. Other examples include: reading a book to improve myself, spending one-on-one time with my children, driving a little slower and letting people merge, choosing to do the dishes, reading religious words, turning off my phone, etc. I am convinced that by doing all these small, simple things consistently I will find I have become a great and noble man at the end of my life.

 

 

Operational aspects of HCI

Hyperconverged infrastructure (HCI) is natively software defined, providing a shift of operations away from the traditional storage management paradigm. Many of my customers have struggled with the paradigm shift when adopting HCI for storage. HCI has been very successful in addressing specific use cases. Many of these use cases have been successful because they represent workloads that have not traditionally been managed by storage teams, for example VDI. Adoption of HCI beyond these use cases requires large organizations to implement people and process transformation to be successful. Discussions with customers have shown that fear about the operational transition has created a lack of adoption. The net gain of HCI in the datacenter is a significant reduction in the total cost of ownership for storage.

 

What is your storage strategy?

When looking at your storage strategy you are likely to see a mixture of solutions to meet your needs. I have found that the following questions help people identify their requirements, which ultimately lead to a strategy:

  • What storage requirements do your applications have and how are they measured?

  • How is storage involved in your business continuity, disaster recovery, backup and availability strategy?

  • What data security requirements does your organization have?

  • What is your storage strategy for the cloud?

 

Once you identify your storage requirements, the strategy can be aligned with functional needs. Storage functions that may be important to your organization include:

  • Capacity

  • Performance

  • Redundancy

  • Data security

  • Ease of management

  • Cost

  • Replication capabilities

Assigning measurements to these functions allows you to identify the correct storage “profiles” to be used in your organization. These profiles can then be aligned with your storage strategy.

 

Differences in HCI

HCI does present some differences from many traditional storage arrays. Four common elements of difference are capacity management, scalability, policy-based management and roles.

 

Capacity Management

Capacity management in most traditional systems requires a measurement based on historical usage metrics. Historical data is taken into account, then a “bet” is made about required capacity for the next X years. The large “bet” on storage array capacity and performance does not allow IT to be agile to business changes. Growth beyond the initial implementation is possible by adding storage shelves or buying new arrays. HCI by contrast takes a linear model. You can scale up and out incrementally. You add capacity by adding drives to your current servers, or add servers to increase available controllers and drive bays. I find that customers who adopt HCI are:

  • Able to procure storage in incremental blocks instead of via large capital expense “bets”

  • Able to have a predictable outcome on capacity management

  • Able to adopt new technology faster

  • Able to utilize storage resources without depreciation “bets”

 

Once a storage team becomes aligned with HCI-based capacity management, they find that storage capacity growth is no longer a “flak jacket” exercise. The business can accept that their new project requires some incremental increase in cost instead of requiring a large CapEx spend. The integrated nature of HCI means that compute capacity sizing is integrated in part with storage capacity. This simplified capacity management allows the IT budget to stretch farther. Best practices for HCI include:

  • Design for scale, but build incrementally

  • The overall capacity management process is the same as for traditional arrays, but lead times are shorter and purchases potentially more frequent

  • Choose servers with maximum available drive bays

 

Traditional storage capacity management requires procurement at roughly 60% usage to allow for growth. In large environments this means that large amounts of capacity will never be used, increasing the total cost per GB of storage used. HCI’s lower capacity expansion cost should allow large organizations to utilize 80% or more of capacity before buying expansions.
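To make the 60% versus 80% comparison concrete, here is a minimal sketch of the arithmetic. It assumes the same price per raw GB in both cases; the price of 1.00 is a made-up placeholder, not a figure from any vendor, and the function name is my own.

```python
def cost_per_used_gb(price_per_raw_gb: float, utilization: float) -> float:
    """Effective cost of each GB actually consumed at a given utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be greater than 0 and at most 1")
    return price_per_raw_gb / utilization


RAW_PRICE = 1.00  # hypothetical price per raw GB, for illustration only

traditional = cost_per_used_gb(RAW_PRICE, 0.60)  # expand at roughly 60% usage
hci = cost_per_used_gb(RAW_PRICE, 0.80)          # expand at roughly 80% usage
savings = 1 - hci / traditional                  # relative reduction per used GB

print(f"traditional: {traditional:.2f} per used GB")
print(f"HCI:         {hci:.2f} per used GB")
print(f"reduction:   {savings:.0%}")
```

Under these assumptions, moving the expansion trigger from 60% to 80% lowers the cost per used GB by a quarter before any difference in hardware pricing is considered.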

 

Some capacity metrics that you should monitor include:

  • Total available space

  • Used space

  • Used capacity breakdown (VMs, swap, snapshots, etc.)

  • Dedupe and compression savings
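As a rough sketch of how the metrics above fit together, the following groups them into one small report. The class and field names are hypothetical, the numbers are invented, and in practice the values would come from whatever monitoring your HCI platform exposes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CapacityReport:
    """Hypothetical container for the capacity metrics listed above."""
    total_gb: float                      # total available space
    used_breakdown_gb: Dict[str, float]  # used space per category
    logical_written_gb: float            # data size before dedupe and compression

    @property
    def used_gb(self) -> float:
        return sum(self.used_breakdown_gb.values())

    @property
    def used_fraction(self) -> float:
        return self.used_gb / self.total_gb

    @property
    def savings_ratio(self) -> float:
        """Logical data divided by physical space used (1.5 means 1.5:1)."""
        return self.logical_written_gb / self.used_gb

    def needs_expansion(self, threshold: float = 0.80) -> bool:
        """True once usage reaches the chosen procurement threshold."""
        return self.used_fraction >= threshold


# Invented example values
report = CapacityReport(
    total_gb=10_000,
    used_breakdown_gb={"vms": 6_000, "swap": 500, "snapshots": 1_000},
    logical_written_gb=11_250,
)
print(report.used_gb, report.used_fraction, report.savings_ratio)
print(report.needs_expansion())      # checked against the 80% threshold
print(report.needs_expansion(0.60))  # checked against the 60% threshold
```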

 

Scalability

A common concern with HCI is scalability. Independent scalability is touted as one of the primary benefits of traditional three-tier infrastructure: compute, storage, and networking. When considering the scalability of traditional storage systems, the following are considered:

  • Capacity in TB

  • Required IOPS

  • Throughput of storage systems (link speed)

  • Throughput of controllers

 

The adoption of flash drives has changed the scalability pain point: IOPS are no longer a concern for most enterprises. Flash drives have increased the pressure on link speed and controller throughput, forcing architecture changes in traditional arrays. When adopting HCI, controllers and link speed become distributed, removing both bottlenecks and leaving only capacity to be considered, assuming all-flash arrays. HCI addresses capacity scalability in two ways: adding drives and increasing the capacity of existing drives. It is considered a best practice when implementing HCI to get servers with as many drive bays as possible. This allows you to increase capacity across the cluster by adding drives. The explosive adoption of HCI and flash has driven manufacturers to provide increasingly larger capacity drives. With VMware VSAN you can replace existing drives with larger drives without interrupting operations. Customers can double storage capacity without adding compute nodes. HCI scales in a distributed fashion for linear growth. Some best practices to consider around scalability are:

  • Consider using traditional servers instead of blades to increase the available drive bays

  • Consider using all flash drives to remove all potential performance concerns

  • HCI does implement a flash cache which greatly improves performance without having to implement all flash
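The linear growth described in this section can be sketched with simple arithmetic. The node counts, drive counts and drive sizes below are invented for illustration, and the calculation covers raw capacity only, before any policy overhead or dedupe and compression savings.

```python
def cluster_raw_tb(nodes: int, drives_per_node: int, drive_tb: float) -> float:
    """Raw cluster capacity, assuming identical nodes and identical drives."""
    return nodes * drives_per_node * drive_tb


# Hypothetical starting point: 4 nodes with 12 bays each, half of them populated
baseline = cluster_raw_tb(nodes=4, drives_per_node=6, drive_tb=2.0)

# Three ways to double capacity
fill_empty_bays = cluster_raw_tb(nodes=4, drives_per_node=12, drive_tb=2.0)
larger_drives = cluster_raw_tb(nodes=4, drives_per_node=6, drive_tb=4.0)
more_nodes = cluster_raw_tb(nodes=8, drives_per_node=6, drive_tb=2.0)

print(baseline, fill_empty_bays, larger_drives, more_nodes)
```

The first two options double capacity without adding compute nodes, which is why servers with spare drive bays matter; only the third adds controllers and compute along with the storage.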

 

Policy Based Management

In many traditional arrays, availability and performance are tied to the logical unit number (LUN). These capabilities are set in stone at time of creation. In order to change these capabilities, moving the data is required. This type of allocation creates challenges for capacity management and increases the number of day-two operations required to meet business needs. HCI takes a policy-based approach and removes the constraints of LUNs. There is a single datastore provided by HCI, radically simplifying traditional storage management. Policies define availability and performance requirements, and the HCI system enforces the policies. To increase the performance of a specific workload, a new policy is defined and assigned to the workload. The HCI system works to ensure policy compliance without interruption to the workload. Policy-based management provides large operational efficiencies. An IDC study has shown that HCI can lower the OpEx cost of storage by 50% or more. In VSAN there are two key elements in a policy: stripe count and failures to tolerate (FTT). Stripe count denotes the number of drives an object is striped across, thus improving performance. Each object will have its data spread across X number of disks on the same compute node. Failures to tolerate denotes the number of compute nodes that can fail before data access is affected. An FTT setting of 1 is essentially a mirror: each object must have one duplicate copy on another node. An FTT of 2 keeps two duplicate copies, so the data is stored on three nodes in total. FTT has a direct effect on the amount of storage used in the HCI implementation. Policies should be designed to meet the business needs of the application. A few best practices to consider:

  • Do not use FTT of 0 unless you truly don’t care about loss of the data (stateless services)

  • Depending on the type of disks backing the HCI solution additional stripes may not provide performance boosts
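The effect of FTT on consumed storage can be sketched as follows. This assumes plain mirroring, where every tolerated failure adds one more full copy; it ignores witness components, erasure coding and any dedupe or compression savings. The function names are my own.

```python
def mirrored_raw_gb(object_gb: float, ftt: int) -> float:
    """Raw space consumed by an object when each tolerated failure adds a full copy."""
    if ftt < 0:
        raise ValueError("ftt cannot be negative")
    return object_gb * (ftt + 1)


def usable_gb(raw_gb: float, ftt: int) -> float:
    """Usable space left from a raw pool if every object uses the same FTT."""
    if ftt < 0:
        raise ValueError("ftt cannot be negative")
    return raw_gb / (ftt + 1)


for ftt in (0, 1, 2):
    print(f"FTT={ftt}: a 100 GB object consumes {mirrored_raw_gb(100, ftt):.0f} GB raw")
```

Under these assumptions a 100 GB object consumes 100, 200 and 300 GB of raw capacity at FTT 0, 1 and 2, which is why the policy should match what the application actually needs.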

Some general VSAN performance guidance is provided below:

Some general VSAN availability guidance is provided below:

The policies should align with organizational application requirements. Management by policy provides the greatest flexibility and reduces the management cost.

 

Roles

 

Many organizations have struggled to adopt HCI because of the change in skills and process required to be successful. In the best-case scenario, HCI bridges the worlds of compute, storage, networking and security together into a single platform. This single platform provides operational synergy and encourages standards. Organizations that have been successful in adopting HCI have learned that it requires a cross-functional skill set. Siloed teams, the current reality in many organizations, struggle to adopt HCI. Creating cross-functional teams with blended skills accelerates adoption of HCI.

 

Some best practices for successful HCI adoption include:

  • Cross functional training

  • Blended teams

  • Rotating subject matter experts who are expected to own a product but train others

  • Outcome-oriented teams and compensation instead of activity-oriented

 

Many of my customers have adopted a plan, build, run methodology. In these organizations it is recommended that teams at each tier be blended. I recommend that members of each silo of plan, build and run rotate through plan, build and run to better understand each role.

 

Benefits of HCI

HCI can provide many benefits required by modern datacenters. I have observed that customers who successfully adopt HCI achieve the following outcomes:

  • Hyper Scalability

  • Operational agility

  • Operational efficiency

  • Simplified operations and support

  • Improved availability and performance

 

I truly believe it’s time to adopt HCI in your datacenter and realize the operational and cost benefits.

 

 

Basic NSX Setup using the REST API

In a previous article I used the GUI to deploy a basic NSX setup; I wanted to do the same thing using the REST API. Remember, the manual process took me about 20 minutes to complete; the REST API calls took one minute. I am defining the network setup via code. Please review this article for specifics on design here.

 

To publish new NSX configurations on Edges you need to POST to https://nsx-manager-address/api/4.0/edges with a Content-Type of application/xml.

If you start from a dump of a current edge, you need to modify a few things inside the body:

  • Add a cli password section
  • Modify the name of the Edge if it’s a duplicate name (or remove the old)
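The two modifications above could be scripted rather than edited by hand. This is a minimal sketch using Python's standard xml.etree.ElementTree; the function name, the new edge name ESG-4 and the cut-down sample dump are all hypothetical, and the element names follow the dumps shown in this article.

```python
import xml.etree.ElementTree as ET


def prepare_edge_body(dumped_xml: str, new_name: str, cli_password: str) -> str:
    """Rename a dumped edge and make sure cliSettings carries a password."""
    edge = ET.fromstring(dumped_xml)

    # Change the edge name so it does not clash with the existing edge.
    name = edge.find("name")
    if name is None:
        name = ET.SubElement(edge, "name")
    name.text = new_name

    # Add the cli password, creating cliSettings if the dump has none.
    cli = edge.find("cliSettings")
    if cli is None:
        cli = ET.SubElement(edge, "cliSettings")
    password = cli.find("password")
    if password is None:
        password = ET.Element("password")
        user = cli.find("userName")
        # Place the password right after userName when that element exists.
        position = list(cli).index(user) + 1 if user is not None else len(cli)
        cli.insert(position, password)
    password.text = cli_password

    return ET.tostring(edge, encoding="unicode")


# Cut-down, hypothetical dump used only to exercise the function
sample = (
    "<edge><name>ESG-3</name><cliSettings>"
    "<remoteAccess>true</remoteAccess><userName>admin</userName>"
    "</cliSettings></edge>"
)
body = prepare_edge_body(sample, "ESG-4", "yourpassword")
print(body)
```

Environment-specific values such as the resource pool, datastore and host IDs would still need to be changed for your own environment.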

 

Add the cli password section with your password, as in the <cliSettings> element of the XML below (you can change it via the GUI once deployed).

Then use the following code to deploy ESG-3 (the resource pool etc. are unique to my environment, so you have to change them too):

<?xml version="1.0" encoding="UTF-8"?>
<edge>
<id>edge-73</id>
<version>5</version>
<status>deployed</status>
<datacenterMoid>datacenter-21</datacenterMoid>
<datacenterName>Home</datacenterName>
<tenant>default</tenant>
<name>ESG-3</name>
<fqdn>esg3</fqdn>
<enableAesni>true</enableAesni>
<enableFips>false</enableFips>
<vseLogLevel>emergency</vseLogLevel>
<vnics>
<vnic>
<label>vNic_0</label>
<name>Uplink</name>
<addressGroups>
<addressGroup>
<primaryAddress>192.168.10.223</primaryAddress>
<subnetMask>255.255.255.0</subnetMask>
<subnetPrefixLength>24</subnetPrefixLength>
</addressGroup>
</addressGroups>
<mtu>1500</mtu>
<type>uplink</type>
<isConnected>true</isConnected>
<index>0</index>
<portgroupId>dvportgroup-106</portgroupId>
<portgroupName>DV-VM</portgroupName>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>false</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_1</label>
<name>vnic1</name>
<addressGroups>
<addressGroup>
<primaryAddress>10.0.0.1</primaryAddress>
<subnetMask>255.255.255.0</subnetMask>
<subnetPrefixLength>24</subnetPrefixLength>
</addressGroup>
</addressGroups>
<mtu>1500</mtu>
<type>internal</type>
<isConnected>true</isConnected>
<index>1</index>
<portgroupId>virtualwire-45</portgroupId>
<portgroupName>Transport-10.0.0.0</portgroupName>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_2</label>
<name>vnic2</name>
<addressGroups />
<mtu>1500</mtu>
<type>internal</type>
<isConnected>false</isConnected>
<index>2</index>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_3</label>
<name>vnic3</name>
<addressGroups />
<mtu>1500</mtu>
<type>internal</type>
<isConnected>false</isConnected>
<index>3</index>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_4</label>
<name>vnic4</name>
<addressGroups />
<mtu>1500</mtu>
<type>internal</type>
<isConnected>false</isConnected>
<index>4</index>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_5</label>
<name>vnic5</name>
<addressGroups />
<mtu>1500</mtu>
<type>internal</type>
<isConnected>false</isConnected>
<index>5</index>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_6</label>
<name>vnic6</name>
<addressGroups />
<mtu>1500</mtu>
<type>internal</type>
<isConnected>false</isConnected>
<index>6</index>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_7</label>
<name>vnic7</name>
<addressGroups />
<mtu>1500</mtu>
<type>internal</type>
<isConnected>false</isConnected>
<index>7</index>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_8</label>
<name>vnic8</name>
<addressGroups />
<mtu>1500</mtu>
<type>internal</type>
<isConnected>false</isConnected>
<index>8</index>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
<vnic>
<label>vNic_9</label>
<name>vnic9</name>
<addressGroups />
<mtu>1500</mtu>
<type>internal</type>
<isConnected>false</isConnected>
<index>9</index>
<enableProxyArp>false</enableProxyArp>
<enableSendRedirects>true</enableSendRedirects>
</vnic>
</vnics>
<appliances>
<applianceSize>compact</applianceSize>
<appliance>
<highAvailabilityIndex>0</highAvailabilityIndex>
<vcUuid>500cf09e-2945-2df1-ca4f-1accbf151185</vcUuid>
<vmId>vm-1033</vmId>
<resourcePoolId>domain-c861</resourcePoolId>
<resourcePoolName>Office</resourcePoolName>
<datastoreId>datastore-998</datastoreId>
<datastoreName>SYN9-NFS-GEN-VOL1</datastoreName>
<hostId>host-863</hostId>
<hostName>esx1.griffiths.local</hostName>
<vmFolderId>group-v22</vmFolderId>
<vmFolderName>vm</vmFolderName>
<vmHostname>esg3-0</vmHostname>
<vmName>ESG-3-0</vmName>
<deployed>true</deployed>
<cpuReservation>
<limit>-1</limit>
<reservation>0</reservation>
</cpuReservation>
<memoryReservation>
<limit>-1</limit>
<reservation>0</reservation>
</memoryReservation>
<edgeId>edge-73</edgeId>
<configuredResourcePool>
<id>domain-c861</id>
<name>Office</name>
<isValid>true</isValid>
</configuredResourcePool>
<configuredDataStore>
<id>datastore-998</id>
<name>SYN9-NFS-GEN-VOL1</name>
<isValid>true</isValid>
</configuredDataStore>
<configuredHost>
<id>host-882</id>
<name>esx3.griffiths.local</name>
<isValid>true</isValid>
</configuredHost>
</appliance>
<deployAppliances>true</deployAppliances>
</appliances>
<cliSettings>
<remoteAccess>true</remoteAccess>
<userName>admin</userName>
<password>yourpassword</password>
<sshLoginBannerText>*************************************************************************** NOTICE TO USERS This computer system is the private property of its owner, whether individual, corporate or government. It is for authorized use only. Users (authorized or unauthorized) have no explicit or implicit expectation of privacy. Any or all uses of this system and all files on this system may be intercepted, monitored, recorded, copied, audited, inspected, and disclosed to your employer, to authorized site, government, and law enforcement personnel, as well as authorized officials of government agencies, both domestic and foreign. By using this system, the user consents to such interception, monitoring, recording, copying, auditing, inspection, and disclosure at the discretion of such personnel or officials. Unauthorized or improper use of this system may result in civil and criminal penalties and administrative or disciplinary action, as appropriate. By continuing to use this system you indicate your awareness of and consent to these terms and conditions of use. LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this warning. ****************************************************************************</sshLoginBannerText>
<passwordExpiry>99999</passwordExpiry>
</cliSettings>
<features>
<l2Vpn>
<version>2</version>
<enabled>false</enabled>
<logging>
<enable>true</enable>
<logLevel>notice</logLevel>
</logging>
</l2Vpn>
<featureConfig />
<firewall>
<version>3</version>
<enabled>false</enabled>
<globalConfig>
<tcpPickOngoingConnections>false</tcpPickOngoingConnections>
<tcpAllowOutOfWindowPackets>false</tcpAllowOutOfWindowPackets>
<tcpSendResetForClosedVsePorts>true</tcpSendResetForClosedVsePorts>
<dropInvalidTraffic>true</dropInvalidTraffic>
<logInvalidTraffic>false</logInvalidTraffic>
<tcpTimeoutOpen>30</tcpTimeoutOpen>
<tcpTimeoutEstablished>21600</tcpTimeoutEstablished>
<tcpTimeoutClose>30</tcpTimeoutClose>
<udpTimeout>60</udpTimeout>
<icmpTimeout>10</icmpTimeout>
<icmp6Timeout>10</icmp6Timeout>
<ipGenericTimeout>120</ipGenericTimeout>
<enableSynFloodProtection>false</enableSynFloodProtection>
<logIcmpErrors>false</logIcmpErrors>
<dropIcmpReplays>false</dropIcmpReplays>
</globalConfig>
<defaultPolicy>
<action>deny</action>
<loggingEnabled>false</loggingEnabled>
</defaultPolicy>
<firewallRules>
<firewallRule>
<id>131075</id>
<ruleTag>131075</ruleTag>
<name>routing</name>
<ruleType>internal_high</ruleType>
<enabled>true</enabled>
<loggingEnabled>false</loggingEnabled>
<description>routing</description>
<action>accept</action>
<application>
<service>
<protocol>ospf</protocol>
<port>any</port>
<sourcePort>any</sourcePort>
</service>
</application>
</firewallRule>
<firewallRule>
<id>131073</id>
<ruleTag>131073</ruleTag>
<name>default rule for ingress traffic</name>
<ruleType>default_policy</ruleType>
<enabled>true</enabled>
<loggingEnabled>false</loggingEnabled>
<description>default rule for ingress traffic</description>
<action>deny</action>
</firewallRule>
</firewallRules>
</firewall>
<dns>
<version>2</version>
<enabled>false</enabled>
<cacheSize>16</cacheSize>
<listeners>
<vnic>any</vnic>
</listeners>
<dnsViews>
<dnsView>
<viewId>view-0</viewId>
<name>vsm-default-view</name>
<enabled>true</enabled>
<viewMatch>
<ipAddress>any</ipAddress>
<vnic>any</vnic>
</viewMatch>
<recursion>false</recursion>
</dnsView>
</dnsViews>
<logging>
<enable>false</enable>
<logLevel>info</logLevel>
</logging>
</dns>
<sslvpnConfig>
<version>2</version>
<enabled>false</enabled>
<logging>
<enable>true</enable>
<logLevel>notice</logLevel>
</logging>
<advancedConfig>
<enableCompression>false</enableCompression>
<forceVirtualKeyboard>false</forceVirtualKeyboard>
<randomizeVirtualkeys>false</randomizeVirtualkeys>
<preventMultipleLogon>false</preventMultipleLogon>
<clientNotification />
<enablePublicUrlAccess>false</enablePublicUrlAccess>
<timeout>
<forcedTimeout>0</forcedTimeout>
<sessionIdleTimeout>10</sessionIdleTimeout>
</timeout>
</advancedConfig>
<clientConfiguration>
<autoReconnect>true</autoReconnect>
<upgradeNotification>false</upgradeNotification>
</clientConfiguration>
<layoutConfiguration>
<portalTitle>VMware</portalTitle>
<companyName>VMware</companyName>
<logoExtention>jpg</logoExtention>
<logoUri>/api/4.0/edges/edge-73/sslvpn/config/layout/images/portallogo</logoUri>
<logoBackgroundColor>56A2D4</logoBackgroundColor>
<titleColor>996600</titleColor>
<topFrameColor>000000</topFrameColor>
<menuBarColor>999999</menuBarColor>
<rowAlternativeColor>FFFFFF</rowAlternativeColor>
<bodyColor>FFFFFF</bodyColor>
<rowColor>F5F5F5</rowColor>
</layoutConfiguration>
<authenticationConfiguration>
<passwordAuthentication>
<authenticationTimeout>1</authenticationTimeout>
<primaryAuthServers />
<secondaryAuthServer />
</passwordAuthentication>
</authenticationConfiguration>
</sslvpnConfig>
<routing>
<version>4</version>
<enabled>true</enabled>
<routingGlobalConfig>
<routerId>192.168.10.223</routerId>
<ecmp>false</ecmp>
<logging>
<enable>false</enable>
<logLevel>info</logLevel>
</logging>
</routingGlobalConfig>
<staticRouting>
<defaultRoute>
<vnic>0</vnic>
<mtu>1500</mtu>
<gatewayAddress>192.168.10.1</gatewayAddress>
<adminDistance>1</adminDistance>
</defaultRoute>
<staticRoutes />
</staticRouting>
<ospf>
<enabled>true</enabled>
<ospfAreas>
<ospfArea>
<areaId>2</areaId>
<type>normal</type>
<authentication>
<type>none</type>
</authentication>
</ospfArea>
</ospfAreas>
<ospfInterfaces>
<ospfInterface>
<vnic>0</vnic>
<areaId>2</areaId>
<helloInterval>10</helloInterval>
<deadInterval>40</deadInterval>
<priority>128</priority>
<cost>1</cost>
<mtuIgnore>false</mtuIgnore>
</ospfInterface>
</ospfInterfaces>
<redistribution>
<enabled>false</enabled>
<rules />
</redistribution>
<gracefulRestart>true</gracefulRestart>
<defaultOriginate>false</defaultOriginate>
</ospf>
</routing>
<highAvailability>
<version>2</version>
<enabled>false</enabled>
<declareDeadTime>15</declareDeadTime>
<logging>
<enable>false</enable>
<logLevel>info</logLevel>
</logging>
<security>
<enabled>false</enabled>
</security>
</highAvailability>
<syslog>
<version>1</version>
<enabled>false</enabled>
</syslog>
<featureConfig />
<loadBalancer>
<version>1</version>
<enabled>false</enabled>
<enableServiceInsertion>false</enableServiceInsertion>
<accelerationEnabled>false</accelerationEnabled>
<monitor>
<monitorId>monitor-1</monitorId>
<type>tcp</type>
<interval>5</interval>
<timeout>15</timeout>
<maxRetries>3</maxRetries>
<name>default_tcp_monitor</name>
</monitor>
<monitor>
<monitorId>monitor-2</monitorId>
<type>http</type>
<interval>5</interval>
<timeout>15</timeout>
<maxRetries>3</maxRetries>
<method>GET</method>
<url>/</url>
<name>default_http_monitor</name>
</monitor>
<monitor>
<monitorId>monitor-3</monitorId>
<type>https</type>
<interval>5</interval>
<timeout>15</timeout>
<maxRetries>3</maxRetries>
<method>GET</method>
<url>/</url>
<name>default_https_monitor</name>
</monitor>
<logging>
<enable>false</enable>
<logLevel>info</logLevel>
</logging>
</loadBalancer>
<gslb>
<version>1</version>
<enabled>false</enabled>
<logging>
<enable>false</enable>
<logLevel>info</logLevel>
</logging>
</gslb>
<ipsec>
<version>1</version>
<enabled>false</enabled>
<logging>
<enable>true</enable>
<logLevel>warning</logLevel>
</logging>
<sites />
<global>
<psk>******</psk>
<caCertificates />
<crlCertificates />
</global>
</ipsec>
<dhcp>
<version>2</version>
<enabled>false</enabled>
<staticBindings />
<ipPools />
<logging>
<enable>false</enable>
<logLevel>info</logLevel>
</logging>
</dhcp>
<nat>
<version>2</version>
<enabled>true</enabled>
<natRules />
</nat>
<bridges>
<version>2</version>
<enabled>false</enabled>
</bridges>
<featureConfig />
</features>
<autoConfiguration>
<enabled>true</enabled>
<rulePriority>high</rulePriority>
</autoConfiguration>
<type>gatewayServices</type>
<isUniversal>false</isUniversal>
<hypervisorAssist>false</hypervisorAssist>
<queryDaemon>
<enabled>false</enabled>
<port>5666</port>
</queryDaemon>
</edge>
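A body like the one above could be submitted with a few lines of Python. This sketch uses only the standard library and only builds the request; it assumes HTTP basic authentication against the NSX Manager, and the host name, user name and password are placeholders. The send is left commented out so nothing is deployed by accident.

```python
import base64
import urllib.request


def build_edge_request(manager: str, user: str, password: str,
                       body_xml: str) -> urllib.request.Request:
    """Build (but do not send) a POST of an edge definition to NSX Manager."""
    # Assumption: the manager accepts HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{manager}/api/4.0/edges",
        data=body_xml.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder values; substitute your own manager address, credentials and XML body
request = build_edge_request(
    "nsx-manager-address", "admin", "yourpassword", "<edge>...</edge>"
)
print(request.get_method(), request.full_url)

# To actually deploy, uncomment the lines below. A lab manager with a
# self-signed certificate would also need a suitable SSL context.
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```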

 

The same approach works for the LDR:

<?xml version="1.0" encoding="UTF-8"?>

<edge>

<id>edge-72</id>

<version>6</version>

<status>deployed</status>

<datacenterMoid>datacenter-21</datacenterMoid>

<datacenterName>Home</datacenterName>

<tenant>default</tenant>

<name>LDR-3</name>

<fqdn>ldr3</fqdn>

<enableAesni>false</enableAesni>

<enableFips>false</enableFips>

<vseLogLevel>emergency</vseLogLevel>

<appliances>

<applianceSize>compact</applianceSize>

<appliance>

<highAvailabilityIndex>0</highAvailabilityIndex>

<vcUuid>500c4666-b908-cf53-a9f5-322d2fac48d3</vcUuid>

<vmId>vm-1032</vmId>

<resourcePoolId>domain-c861</resourcePoolId>

<resourcePoolName>Office</resourcePoolName>

<datastoreId>datastore-998</datastoreId>

<datastoreName>SYN9-NFS-GEN-VOL1</datastoreName>

<hostId>host-882</hostId>

<hostName>esx3.griffiths.local</hostName>

<vmFolderId>group-v22</vmFolderId>

<vmFolderName>vm</vmFolderName>

<vmHostname>ldr3-0</vmHostname>

<vmName>LDR-3-0</vmName>

<deployed>true</deployed>

<cpuReservation>

<limit>-1</limit>

<reservation>1000</reservation>

</cpuReservation>

<memoryReservation>

<limit>-1</limit>

<reservation>512</reservation>

</memoryReservation>

<edgeId>edge-72</edgeId>

<configuredResourcePool>

<id>domain-c861</id>

<name>Office</name>

<isValid>true</isValid>

</configuredResourcePool>

<configuredDataStore>

<id>datastore-998</id>

<name>SYN9-NFS-GEN-VOL1</name>

<isValid>true</isValid>

</configuredDataStore>

<configuredHost>

<id>host-882</id>

<name>esx3.griffiths.local</name>

<isValid>true</isValid>

</configuredHost>

</appliance>

<deployAppliances>true</deployAppliances>

</appliances>

<cliSettings>

<remoteAccess>true</remoteAccess>

<userName>admin</userName>

<password>yourpassword</password>

<sshLoginBannerText>

***************************************************************************

NOTICE TO USERS

 

 

 

This computer system is the private property of its owner, whether

individual, corporate or government. It is for authorized use only.

Users (authorized or unauthorized) have no explicit or implicit

expectation of privacy.

 

Any or all uses of this system and all files on this system may be

intercepted, monitored, recorded, copied, audited, inspected, and

disclosed to your employer, to authorized site, government, and law

enforcement personnel, as well as authorized officials of government

agencies, both domestic and foreign.

 

By using this system, the user consents to such interception, monitoring,

recording, copying, auditing, inspection, and disclosure at the

discretion of such personnel or officials. Unauthorized or improper use

of this system may result in civil and criminal penalties and

administrative or disciplinary action, as appropriate. By continuing to

use this system you indicate your awareness of and consent to these terms

and conditions of use. LOG OFF IMMEDIATELY if you do not agree to the

conditions stated in this warning.

 

****************************************************************************</sshLoginBannerText>

<passwordExpiry>99999</passwordExpiry>

</cliSettings>

<features>

<syslog>

<version>1</version>

<enabled>false</enabled>

</syslog>

<featureConfig/>

<firewall>

<version>4</version>

<enabled>false</enabled>

<globalConfig>

<tcpPickOngoingConnections>false</tcpPickOngoingConnections>

<tcpAllowOutOfWindowPackets>false</tcpAllowOutOfWindowPackets>

<tcpSendResetForClosedVsePorts>true</tcpSendResetForClosedVsePorts>

<dropInvalidTraffic>true</dropInvalidTraffic>

<logInvalidTraffic>false</logInvalidTraffic>

<tcpTimeoutOpen>30</tcpTimeoutOpen>

<tcpTimeoutEstablished>21600</tcpTimeoutEstablished>

<tcpTimeoutClose>30</tcpTimeoutClose>

<udpTimeout>60</udpTimeout>

<icmpTimeout>10</icmpTimeout>

<icmp6Timeout>10</icmp6Timeout>

<ipGenericTimeout>120</ipGenericTimeout>

<enableSynFloodProtection>false</enableSynFloodProtection>

<logIcmpErrors>false</logIcmpErrors>

<dropIcmpReplays>false</dropIcmpReplays>

</globalConfig>

<defaultPolicy>

<action>deny</action>

<loggingEnabled>false</loggingEnabled>

</defaultPolicy>

<firewallRules>

<firewallRule>

<id>131075</id>

<ruleTag>131075</ruleTag>

<name>routing</name>

<ruleType>internal_high</ruleType>

<enabled>true</enabled>

<loggingEnabled>false</loggingEnabled>

<description>routing</description>

<action>accept</action>

<application>

<service>

<protocol>ospf</protocol>

<port>any</port>

<sourcePort>any</sourcePort>

</service>

</application>

</firewallRule>

<firewallRule>

<id>131073</id>

<ruleTag>131073</ruleTag>

<name>default rule for ingress traffic</name>

<ruleType>default_policy</ruleType>

<enabled>true</enabled>

<loggingEnabled>false</loggingEnabled>

<description>default rule for ingress traffic</description>

<action>deny</action>

</firewallRule>

</firewallRules>

</firewall>

<routing>

<version>4</version>

<enabled>true</enabled>

<routingGlobalConfig>

<routerId>10.0.0.2</routerId>

<ecmp>false</ecmp>

<logging>

<enable>false</enable>

<logLevel>info</logLevel>

</logging>

</routingGlobalConfig>

<staticRouting>

<defaultRoute>

<vnic>2</vnic>

<mtu>1500</mtu>

<description></description>

<gatewayAddress>10.0.0.1</gatewayAddress>

<adminDistance>1</adminDistance>

</defaultRoute>

<staticRoutes/>

</staticRouting>

<ospf>

<enabled>true</enabled>

<protocolAddress>10.0.0.3</protocolAddress>

<forwardingAddress>10.0.0.2</forwardingAddress>

<ospfAreas>

<ospfArea>

<areaId>2</areaId>

<type>normal</type>

<authentication>

<type>none</type>

</authentication>

</ospfArea>

</ospfAreas>

<ospfInterfaces>

<ospfInterface>

<vnic>2</vnic>

<areaId>2</areaId>

<helloInterval>10</helloInterval>

<deadInterval>40</deadInterval>

<priority>128</priority>

<cost>1</cost>

<mtuIgnore>false</mtuIgnore>

</ospfInterface>

</ospfInterfaces>

<redistribution>

<enabled>true</enabled>

<rules>

<rule>

<id>0</id>

<from>

<ospf>false</ospf>

<bgp>false</bgp>

<static>false</static>

<connected>true</connected>

</from>

<action>permit</action>

</rule>

</rules>

</redistribution>

<gracefulRestart>true</gracefulRestart>

</ospf>

</routing>

<dhcp>

<version>2</version>

<enabled>false</enabled>

<staticBindings/>

<ipPools/>

<logging>

<enable>false</enable>

<logLevel>info</logLevel>

</logging>

</dhcp>

<bridges>

<version>2</version>

<enabled>false</enabled>

</bridges>

<highAvailability>

<version>2</version>

<enabled>false</enabled>

<declareDeadTime>15</declareDeadTime>

<logging>

<enable>false</enable>

<logLevel>info</logLevel>

</logging>

<security>

<enabled>false</enabled>

</security>

</highAvailability>

</features>

<autoConfiguration>

<enabled>true</enabled>

<rulePriority>high</rulePriority>

</autoConfiguration>

<type>distributedRouter</type>

<isUniversal>false</isUniversal>

<mgmtInterface>

<label>vNic_0</label>

<name>mgmtInterface</name>

<addressGroups>

<addressGroup>

<primaryAddress>192.168.10.224</primaryAddress>

<subnetMask>255.255.255.0</subnetMask>

<subnetPrefixLength>24</subnetPrefixLength>

</addressGroup>

</addressGroups>

<mtu>1500</mtu>

<index>0</index>

<connectedToId>dvportgroup-106</connectedToId>

<connectedToName>DV-VM</connectedToName>

</mgmtInterface>

<interfaces>

<interface>

<label>138900000002/vNic_2</label>

<name>UpLink</name>

<addressGroups>

<addressGroup>

<primaryAddress>10.0.0.2</primaryAddress>

<subnetMask>255.255.255.0</subnetMask>

<subnetPrefixLength>24</subnetPrefixLength>

</addressGroup>

</addressGroups>

<mtu>1500</mtu>

<type>uplink</type>

<isConnected>true</isConnected>

<isSharedNetwork>false</isSharedNetwork>

<connectedToId>virtualwire-45</connectedToId>

<connectedToName>Transport-10.0.0.0</connectedToName>

</interface>

<interface>

<label>13890000000a</label>

<name>GW-10.0.1</name>

<addressGroups>

<addressGroup>

<primaryAddress>10.0.1.1</primaryAddress>

<subnetMask>255.255.255.0</subnetMask>

<subnetPrefixLength>24</subnetPrefixLength>

</addressGroup>

</addressGroups>

<mtu>1500</mtu>

<type>internal</type>

<isConnected>true</isConnected>

<isSharedNetwork>false</isSharedNetwork>

<connectedToId>virtualwire-46</connectedToId>

<connectedToName>LS-10.0.1</connectedToName>

</interface>

</interfaces>

<edgeAssistId>5001</edgeAssistId>

<lrouterUuid>3914608b-a1a9-41e2-8251-7da1557c38e1</lrouterUuid>

<queryDaemon>

<enabled>false</enabled>

<port>5666</port>

</queryDaemon>

</edge>

 

Does understanding the cost of IT really matter?

Welcome to my new form of clickbait titles. I have been thinking about this for a while. I get to see a lot of different enterprise environments as a solution architect for VMware. It’s been great to experience all these customers’ challenges and help them on their journey. Years ago I was very focused on helping organizations understand the cost of IT. It was very important to me to identify the total cost of resources. IT has long been able to quantify capital expenses due to hardware costs. All you have to do is take your bill of materials and divide it by the logical resource element (GBs of storage, RAM, etc.). Operating expenses have long plagued IT organizations because staff normally multitask and dislike tracking individual work. This whole exercise was supposed to help you have a seat at the business table by talking the language of business (money). I have written this article to argue why I believe cost justification means you are losing the battle with your business.

Why does cost justification mean you are losing the battle

When I first started in IT I was in charge of personal computers for Columbus GA Auto Accident Law firm. Every three years we had a bake-off between major PC vendors where they would parade their latest prize pig in front of us. In the end the winner was determined by who was willing to go the lowest on price. There was no loyalty; there was only price. There was no value add. Then a strange thing happened. My customer became interested in PC features, which were reflected in aesthetic preference. They didn’t care about the hardware specifications.

They wanted MacBooks and clean desks that all matched.

They wanted something that looked nice and was a status symbol. They didn’t care about price. Suddenly management no longer cared about price; they wanted the MacBook. I did a justification related to the cost of IT repairs on Macs in a last-ditch effort to head off the Apple invasion. I didn’t win.

Quite simply put, there is always going to be another prize pig that will beat you on price by either reducing features or buying at an economy of scale that you cannot match. So when IT is 100% based on cost justification you are always going to lose. Please don’t assume that I am saying cost does not matter; it does. But it cannot be the only factor in a choice because the farm is worldwide now. When your business is left with a justification of cost between private and public cloud, you might lose.

Perception is reality

MacBooks are better, right? Well, it depends on who you ask. Perception determines reality. My customers perceived that MacBooks are better because all the cool kids were using them… The truth is the hardware is pretty good but expensive compared to other x86 machines. Most IT professionals love technology. They cannot wait to be on the bleeding edge of technology. Infrastructure, however, has become on the whole very risk averse for a number of reasons:

  • When something fails they are the first to be called
  • Mountains of technical debt
  • Weak development unit testing
  • Sheer sprawl of divergence in the environment

This risk aversion has made infrastructure people the old man of technology yelling “get off my lawn” while all the cool kids have MacBooks and text “get off my lawn.” It’s time for infrastructure to return to their roots and embrace change once again or be left behind. It’s a software defined world… learn how to use an API today. Learn about the cloud providers because they are part of your future.

Digital Transformation

Digital is thrown around too much… it’s the cloud of the 2000s (not to be confused with the cloud of the 2010s). I believe digital transformation is defined as IT aligning with the business. If you start your alignment by assigning a bill or trying to use cost as leverage, you cannot assume it will help the relationship. The key element of any relationship is not power manipulation (cost); it’s achieving mutual shared goals. It’s working together to solve problems as a single entity. At VMworld 2018 I spoke along with Craig Fletcher about this relationship and we compared it to a marriage. I suggest that a lot of infrastructure teams are heading for a divorce. You, IT, have to take the first step to fix the relationship… because if you don’t the business will go find another suitor (can you spell AWS?). Allow me to suggest four things to consider in your digital journey:

  • Spend more time understanding how your efforts impact revenue or not
  • Create a state of IT anonymous survey
  • Create cross functional teams to address business challenges
  • Embrace change and cross cloud capabilities

 

Basic NSX Network virtualization setup

This post will go over the basic setup for network virtualization in NSX.  This is nothing new or exciting but I figured I would share as more users are deploying NSX in their home labs these days.   I will assume that you already have the environment prepared by deploying the manager and controllers and all your ESXi hosts are prepared.

We are going to set up the subnet of 10.0.0.0/17 to be virtually routed as shown below:

This requires the following:

  • Static route on Linksys EA6200 router to point 10.0.0.0/17 to 192.168.10.223 (because my Linksys does not support any dynamic routing protocols)
  •  A logical switch called Transport-10.0.0.0 between the border ESG and the Logical distributed router
  • OSPF configured between ESG-3 and LDR-3
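
As a quick sanity check on the addressing plan, here is a minimal Python sketch that confirms the transport and inside subnets used in this walkthrough all fall inside the 10.0.0.0/17 block covered by the static route. It uses only the standard ipaddress module; the function and constant names are my own and purely illustrative.

```python
import ipaddress

# Aggregate block that the static route on the physical router points at the ESG.
ROUTED_BLOCK = ipaddress.ip_network("10.0.0.0/17")

# Subnets used in this walkthrough (transport link plus two inside networks).
LAB_SUBNETS = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]


def covered_by_static_route(subnet, block=ROUTED_BLOCK):
    """Return True when every address in subnet falls inside block."""
    return ipaddress.ip_network(subnet).subnet_of(block)


if __name__ == "__main__":
    for subnet in LAB_SUBNETS:
        print(subnet, covered_by_static_route(subnet))
```

A /17 starting at 10.0.0.0 ends at 10.0.127.255, so any new logical switch you add later needs to stay below 10.0.128.0 to be reachable through the same static route.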

 

Creation of the LDR-3 (pictures to follow steps)

  1. First we need to create a logical switch by choosing Logical Switches, select green + button, Input Name (Transport-10.0.0.0) and description and click ok
  2. Select NSX Edges in Navigator pane, select green + button
  3. In Name and description pane: Install Type: Logical (distributed) router, Name: LDR-3, Hostname ldr3, leave deploy NSX Edge selected, Next
  4. In settings, type your password, I like to enable ssh, click next
  5. In configure deployment: Press the green + to deploy a NSX Edge Appliance, Select correct resource pool, datastore, host, and folder, click ok
  6. Click Next
  7. In Configure interfaces
  8. Select connected to for HA interface: Port group DV-VM, press + below HA and add 192.168.10.224
  9. press the green + button under interfaces
  10. In Add NSX Edge Interface: Name Uplink, Connected to: Transport-10.0.0.0, Press green + to add IP: 10.0.0.2 subnet 24, Click ok
  11. Click Next
  12. In Default gateway settings:  Set the gateway IP as 10.0.0.1 and click next
  13. Ignore the Firewall and HA settings click next
  14. Click finish to deploy LDR

 

 

Creation of the ESG-3 (pictures to follow steps)

  1. Back at the NSX Edge section in Navigator
  2. Press the Green + sign
  3. In Name and description: Choose Edge Services gateway, Name: ESG-3, Hostname esg3 and select Next (in Production you might want high availability or ECMP)
  4. In Settings:  Type Admin password and enable ssh, Next
  5. In Configure deployment: Press Green + sign, Select resource pool, datastore and host then ok and Next
  6. In Configure Interfaces: press the green + sign
  7. Name: Uplink, Connected To: DV-VM, Press green + to add interface: 192.168.10.223 subnet 24, click ok
  8. Click Next
  9. In Default gateway settings insert default gateway of 192.168.10.1 then next
  10. Ignore firewall and HA settings and next
  11. Click Finish to deploy appliance

Configure Physical router

This is unique per router; in mine I added a static route for the subnet:

 

Configure LDR

We need to add at least one inside network and configure OSPF.

  1. In the Logical Switch section we are going to add a switch for 10.0.1.0/24 called LS-10.0.1
  2. In Logical Switch Section: Green + button, Name LS-10.0.1 then OK
  3. Go to NSX Edges in Navigator
  4. Double click on LDR-3
  5. We need to add an interface for the new network: select Manage, Settings, Interfaces
  6. Select Green +
  7. Name: GW-10.0.1, Connected To: LS-10.0.1, Green + button to add interface 10.0.1.1 subnet 24
  8. Select Routing tab, global Configuration
  9. Go to Dynamic Routing configuration and click edit
  10. Make sure the uplink interface is chosen then click ok
  11. Press Publish Changes button
  12. Click on OSPF button
  13. Remove all current area definitions (51) with red X then publish changes
  14. Click green + on area definitions and add area 2 (just type 2 in area button leave rest default)
  15. Press green + in area to interface mapping button
  16. Make sure Uplink is selected and area 2 and press OK
  17. Press Edit button next to OSPF configuration and enable OSPF, For protocol address choose a free IP 10.0.0.3, forwarding is 10.0.0.2
  18. Publish Changes
  19. Go to firewall section
  20. Disable firewall
  21. Publish changes

 

Configure ESG-3

  1. Return to Networking & Security main section
  2. Select NSX Edges and double click on ESG-3
  3. Select Manage, Settings, Interfaces
  4. We need to add an interface for the transport between LDR and ESG
  5. Select vnic1 and press Edit button
  6. Connected to: Transport-10.0.0.0, IP: 10.0.0.1 subnet 24
  7. Select Routing
  8. In global configuration: Select edit next to dynamic routing configuration, ensure uplink is selected and press ok
  9. Publish changes
  10. Click on OSPF
  11. Remove current area definitions with red X and publish changes
  12. Add a new area for area 2 leaving everything else default
  13. In the area to interface mapping make sure you choose vnic1 (internal link) and area 2
  14. Select OSPF Configuration and Enable OSPF
  15. Publish Changes
  16. Select Firewall section and disable firewall and publish changes
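
If the adjacency does not come up, the usual suspects are the settings both sides have to agree on: area ID, hello and dead intervals, and authentication type. The following Python sketch compares those settings for two interfaces. The LDR-3 values come from the configuration XML earlier on this page; the ESG-3 values are an assumption (the steps above leave everything at the defaults), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OspfInterface:
    """The OSPF settings that two neighbors have to agree on."""
    area_id: int
    hello_interval: int
    dead_interval: int
    auth_type: str = "none"


def adjacency_mismatches(a, b):
    """Return the names of the settings that differ between the two sides."""
    fields = ("area_id", "hello_interval", "dead_interval", "auth_type")
    return [name for name in fields if getattr(a, name) != getattr(b, name)]


# LDR-3 values taken from the XML dump; ESG-3 values assumed to be the defaults.
ldr3 = OspfInterface(area_id=2, hello_interval=10, dead_interval=40)
esg3 = OspfInterface(area_id=2, hello_interval=10, dead_interval=40)

if __name__ == "__main__":
    print(adjacency_mismatches(ldr3, esg3) or "settings match")
```

For example, forgetting to replace the default area 51 on one side would show up here as a mismatch on area_id.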

 

Validate Configuration

Let’s validate configuration three ways: Confirming OSPF settings on ESG-3, Adding a new subnet, ping test

Confirming on ESG-3

  1. Login to ESG-3 via SSH (username admin password set during deployment)
  2. Type the following to see current routes (show ip route)  ensure that the E2 learned route is showing:

Adding a new subnet

  1. Stay logged into the ESG-3
  2. Switch to the Networking and security console, navigate to Logical switches
  3. Press green + to add a switch for LS-10.0.2
  4. Select NSX Edges, double click on LDR-3
  5. Go to Manage and settings
  6. Select Interfaces and press green +
  7. Name: GW-10.0.2, Internal, Connected to:  LS-10.0.2, IP 10.0.2.1 subnet 24
  8. Return to the ESG-3 ssh session and run the command show ip route to see 10.0.2.0/24

Test Via ping

  1. Attempt to ping either gateway on the LDR (10.0.1.1 or 10.0.2.1)

 

Additional commands on ESG-3

Here are some commands that will help you in troubleshooting OSPF

show ip ospf neighbor – show other members of the areas

show ip ospf database – understand current ospf database

 

PowerShell get the latest VI events

From time to time it’s nice to search the VIEvent log for something specific.   I have found using PowerShell allows you to do this very quickly.   If for example I was looking for all HA events I might use the following code:

 

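# Load PowerCLI and connect to vCenter
Import-Module -Name VMware.VimAutomation.Core

Connect-VIServer 192.168.10.14

# Get-VIEvent returns only the most recent events by default; add -MaxSamples to go further back.
# -clike is case sensitive, so "*HA*" does not also match words such as "has" or "changed".
$ViEvents = Get-VIEvent | Where-Object {$_.FullFormattedMessage -clike "*HA*"} | Select-Object *

# Show how many events matched, then browse them in a grid
$ViEvents.Count
$ViEvents | Out-GridView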

Does Cloud + REST API spell the end of GUI

Fun question:  Does API spell the end of the GUI?

I started my career as a Solaris and Linux administrator mostly because I felt that working in Windows Server took away most of my control. I loved configuring a web server in text and having full control. I loved having to understand what each variable did so I could tune my web server to meet my needs. It was a great job which led into configuration management with Puppet. Full control and text once again…

This evening I was working with the REST API for NSX on a side project, and to confirm the results of my query I just used REST… I got my answer in a millisecond… I could not have refreshed the GUI that quickly. It was so easy and it reminded me of the good old Linux days long forgotten as an architect.

Make no mistake, it’s a coder’s world out there; infrastructure folks need to get comfortable with APIs and code. The future is a process of automating different units together using APIs. Working with REST has taught me so much about the platform. You start to understand how the solution was built. It exposes workflows that help you build efficiency…

I suggest that if you really want to understand your product you need to learn its API. If it does not have an API consider a different product. I know GUIs will be around but I do believe they will continue to have less value in enterprise deployments. Strap on your code and join the power users.

Advice to VCDX candidates from a Double VCDX

“Sometimes it’s the journey that teaches you a lot about your destination.”  – Drake

Update: I have updated the wording on the constraints section to reflect a Twitter comment from  thanks for the fix to wording and reading.

The VMware Certified Design Expert certification (VCDX) represents the highest tier of VMware’s certifications. I recently contributed to a panel of VCDXs at VMworld. Candidates considering the VCDX certification had the opportunity to ask the panel questions. The questions illustrated that candidates were concerned about the Herculean effort required to achieve the certification. I wanted to take this opportunity to provide some guidance I have learned as a mentor. I believe anyone can become a VCDX. It does require some hard work but it is very achievable.

 

Requirements, Constraints, Assumptions and Risks

Becoming a VMware certified design expert does not mean you have to be the most technical person in the room.   It does mean you have to know how to align technology to business needs.    My experience has taught me that I can tell if a proposal for VCDX will be successful right away based upon requirements, constraints, assumptions and risks.   The ability to gather business and technical requirements is a key skill for any design expert.   Your technical requirements should be aligned to the business requirements. It’s important to understand the difference between business and technical requirements:

  • Business Requirements – Defines how the delivered product provides value. Other words often used are outcomes, or expected benefits.  For example, the solution must meet regulatory compliance.
  • Technical Requirements – Defines the technical “must haves” to achieve the outcome. For example, the solution must be able to fail over and fail back from a disaster and support an RTO of four hours.

Many VCDX documents are solely focused on technical requirements and miss the “why” that drives the design.   Understanding the difference between requirements and constraints is another challenge for many candidates:

  • Requirements – Things the design must meet, such as: establish an RTO of four hours or provide capacity for twenty percent growth for the next three years.
  • Constraints – Things that form limits or boundaries that apply to the design. For example, a specific vendor relationship or reuse of current hardware. Constraints should be met by the design unless they are handled through conflict resolution.

Once you have established your requirements and constraints you are left with assumptions and risks:

  • Assumptions – things you believe to be true but cannot verify. For example, storage usage will grow at the same rate as compute usage or the sample data provided represents reality.
  • Risks – are simply risks to the project meeting business requirements. If you identify risks they should be provided in this section.   Every project has risks.   For example, staff skills or timelines.

 

Correctly creating requirements and constraints that align with the elements of design is critical to a successful submission. Identification of assumptions and risks provides important protections to the architect. The goal of a VCDX design is to align technology to meet the requirements and constraints, not to provide the best technology mix.

 

Elements of Design

When working with infrastructure, VMware has designated five elements that should be considered in each design choice.  Each design choice should be evaluated against the elements of design for impact.  I personally like to use the acronym RAMPS to help me remember these elements:

  • Recoverability – A choice’s effect on disaster recovery
  • Availability – A choice’s effect on SLA
  • Manageability – A choice’s effect on management cost
  • Performance – A choice’s effect on performance
  • Security – A choice’s effect on security

It is not uncommon for availability, recoverability, security or performance to have a negative impact on manageability. Not all choices can have a net benefit to all elements of design. The tie breaker with these conflicts should be the requirements. Conflicts between design elements may exist even after evaluating the requirements. This allows for a conflict resolution section. Conflict resolution is where the customer of the solution acknowledges the conflict and mitigates the conflict in some form. Make sure your design has conflicts. Each requirement and constraint should be aligned to an element of design. When gathering business requirements, consider the RAMPS impact of each requirement to help gather a full list of requirements and constraints. Each technical requirement or constraint should be aligned to a single element of RAMPS.

 

Fun with Formats

Every single candidate struggles with document format. The VCDX requires far more detail than most designs in enterprise. Format paralysis has slowed if not stopped many candidates. My suggestion is to identify an outline that aligns with the blueprint:

  • Overview
  • Requirements, constraints, assumptions and risks
  • Conceptual architecture
  • Logical architecture
  • Physical architecture
  • Security
  • Appendix

 

Each of the different layers of architecture should address the sub elements: compute, storage, networking, applications, recovery, virtual machine, management, etc…   You cannot provide lip service to conceptual and logical architecture.   They must be developed just like physical architecture.    Design choices should be justified against RAMPS, with conflicts identified.   The secret is to determine a format and start writing, don’t get stuck on format.   In the end, the format is not as important as the content assuming the reviewer can locate the items required in the blueprint.

 

Time Management

Every candidate struggles with time. We have family, friends, hobbies, faith and work conflicting with the VCDX goal. My advice is to set a goal with a timeline. Agree upon a set time each day. Exercise discipline to work on the VCDX during that time and you will achieve your goal. For me, I used 8:00 – 9:00 PM each night. It was after my kids’ bedtime and before spending time with my wife. I had to sacrifice computer game time, social media time and blogging time, but after six months I was done. This model has worked for me to achieve two VCDX certifications and put me on the path to my third. I’d like to end where I began. I believe everyone can achieve this certification with hard work. To start, get a mentor by visiting vcdx.vmware.com and searching for a mentor, including me.

Redeploy NSX Edges to a different cluster / datacenter

First Issue my bad

I ran into an interesting issue in my home lab. I recently replaced all my older HP servers with Intel NUCs. I could not be happier with the results. Once I replaced all the ESXi hosts I mounted the storage and started up my virtual machines including vCenter. Once vCenter and NSX Manager were available I moved all the ESXi hosts to the distributed switch. This normal process was complicated by NSX. I should have added the ESXi hosts to the transport zone, allowing NSX to join the distributed switch. Failure to do this made the NSX VXLAN process fail. I could not prepare the hosts. Ultimately I removed the VXLAN entries from the distributed switch and then re-prepared, which re-created the VXLAN entries on the switch. (This is not a good idea in production, so follow the correct path.)

Second Issue nice to know

This process generated a second issue: the original cluster and datacenter on which my NSX edges used to live were gone. I assumed that I could just redeploy NSX edges from the manager. While this is true, the configuration assumes that it will be deploying the Edges to the same datacenter, resource pool and potentially the same host as when they were created. So if I have a failure and expect to just bring up NSX Manager and redeploy to a new cluster, it will not work. You have to adjust the parameters for the edges; you can do this via the API or GUI. I wanted to demonstrate the API method:

I needed to change the resource pool, datastore, and host for my Edge. I identified my Edge via the identifier name in the GUI (edge-8 for me), grabbed my favorite REST tool (Postman) and formed a query on the current state:

GET https://{nsx-manager-ip}/api/4.0/edges/edge-8/appliances

This returned the configuration for this edge device.  If you need to identify all edges just do

GET https://{nsx-manager-ip}/api/4.0/edges

Then I needed the VMware identifier for resource pool, datastore and host. This can all be gathered via the REST API but I went for PowerShell because it was faster for me. I used the following commands in PowerCLI:
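
If you would rather script these two queries than click through a REST tool, here is a minimal Python sketch that builds the same GET requests with the standard library. It assumes the NSX Manager accepts HTTP basic authentication; the hostname, user name and password in the example are placeholders, and the function name is my own.

```python
import base64
import urllib.request


def build_edges_request(manager, username, password, edge_id=None):
    """Build (but do not send) a GET for all edges, or for one edge's appliances."""
    path = "/api/4.0/edges"
    if edge_id is not None:
        path += "/" + edge_id + "/appliances"
    # Basic authentication header (assumed to be what the manager expects).
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request("https://" + manager + path, method="GET")
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Accept", "application/xml")
    return request


if __name__ == "__main__":
    # Placeholder hostname and credentials; replace with your own lab values.
    req = build_edges_request("nsx-manager.example.local", "admin", "changeme", "edge-8")
    print(req.get_method(), req.full_url)
```

Sending the request is a call to urllib.request.urlopen(req); a lab manager with a self-signed certificate would also need an SSL context that trusts it.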

 

get-vmhost | fl - returned host-881

get-resourcepool | fl - returned domain-c861

get-datastore | fl - returned datastore-865

 

Once identified I was ready to form my adjusted query:

 

<appliances>
<applianceSize>compact</applianceSize>
<appliance>
<highAvailabilityIndex>0</highAvailabilityIndex>
<vcUuid>500cfc30-5b2a-6bae-32a3-360e0315ccd3</vcUuid>
<vmId>vm-924</vmId>
<resourcePoolId>domain-c861</resourcePoolId>
<resourcePoolName>domain-c861</resourcePoolName>
<datastoreId>datastore-865</datastoreId>
<datastoreName>datastore-865</datastoreName>
<hostId>host-881</hostId>
<vmFolderId>group-v122</vmFolderId>
<vmFolderName>NSX</vmFolderName>
<vmHostname>esg1-0</vmHostname>
<vmName>ESG-1-0</vmName>
<deployed>true</deployed>
<cpuReservation>
<limit>-1</limit>
<reservation>1000</reservation>
</cpuReservation>
<memoryReservation>
<limit>-1</limit>
<reservation>512</reservation>
</memoryReservation>
<edgeId>edge-9</edgeId>
<configuredResourcePool>
<id>domain-c26</id>
<name>domain-c26</name>
<isValid>false</isValid>
</configuredResourcePool>
<configuredDataStore>
<id>datastore-31</id>
<isValid>false</isValid>
</configuredDataStore>
<configuredHost>
<id>host-29</id>
<isValid>false</isValid>
</configuredHost>
<configuredVmFolder>
<id>group-v122</id>
<name>NSX</name>
<isValid>true</isValid>
</configuredVmFolder>
</appliance>
<deployAppliances>true</deployAppliances>
</appliances>

I used a PUT against https://{nsx-manager-ip}/api/4.0/edges/{edgeId}/appliances with the above body and a Content-Type of application/xml. Then I was able to redeploy my edge devices without any challenge.
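
Editing the XML by hand gets tedious with more than one edge, so here is a rough Python sketch of the same edit: take the body returned by the GET and swap in the new resource pool, datastore and host identifiers. The sample body is trimmed for brevity and the function name is hypothetical; in practice you would feed it the full GET response.

```python
import xml.etree.ElementTree as ET


def retarget_appliances(xml_text, resource_pool_id, datastore_id, host_id):
    """Return the appliances XML with every appliance pointed at new placement IDs."""
    root = ET.fromstring(xml_text)
    new_values = {
        "resourcePoolId": resource_pool_id,
        "datastoreId": datastore_id,
        "hostId": host_id,
    }
    for appliance in root.iter("appliance"):
        for tag, value in new_values.items():
            element = appliance.find(tag)
            if element is None:
                # Refuse to guess where a missing element belongs in the body.
                raise ValueError("appliance is missing <" + tag + ">")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Trimmed sample body using the stale IDs from the old cluster.
SAMPLE = """<appliances>
<applianceSize>compact</applianceSize>
<appliance>
<highAvailabilityIndex>0</highAvailabilityIndex>
<resourcePoolId>domain-c26</resourcePoolId>
<datastoreId>datastore-31</datastoreId>
<hostId>host-29</hostId>
</appliance>
<deployAppliances>true</deployAppliances>
</appliances>"""

if __name__ == "__main__":
    print(retarget_appliances(SAMPLE, "domain-c861", "datastore-865", "host-881"))
```

The returned string is what you would send as the body of the PUT.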

Powershell Functions for tags

Some quick PowerShell functions for working with tags in vSphere. Enjoy:

#Add a tag to a VM
function add_tag_to_vm($VM, $TAG)
{
 Get-VM -Name $VM | New-TagAssignment -Tag $TAG
}

#Add a tag to all virtual machines in a folder
function add_tag_to_all_vm_in_folder($FOLDER, $TAG)
{
 Get-Folder $FOLDER | Get-VM | New-TagAssignment -Tag $TAG
}

#List all VMs with a specific tag
function get_vms_with_tag($TAG)
{
 return Get-VM -Tag $TAG
}

#Check for the presence of a tag on a VM (returns the VM if it has the tag, nothing otherwise)
function check_vm_for_tag($VM, $TAG)
{
 return Get-VM -Name $VM -Tag $TAG
}

#Remove a tag from a VM
function remove_tag_from_vm($VM, $TAG)
{
 $myVM = Get-VM -Name $VM
 # Get-TagAssignment cannot filter by tag name, so match on the assigned tag's name
 $myTagAssignment = Get-TagAssignment -Entity $myVM | Where-Object { $_.Tag.Name -eq $TAG }
 if ($myTagAssignment) { Remove-TagAssignment -TagAssignment $myTagAssignment -Confirm:$false }
}

#Remove a tag from all VMs in a folder
function remove_tag_from_all_vm_in_folder($FOLDER, $TAG)
{
 $myVM = Get-Folder $FOLDER | Get-VM
 $myTagAssignment = Get-TagAssignment -Entity $myVM | Where-Object { $_.Tag.Name -eq $TAG }
 if ($myTagAssignment) { Remove-TagAssignment -TagAssignment $myTagAssignment -Confirm:$false }
}