EtherChannel, LACP, and VMware

Recently I have had some discussions about using LACP and static EtherChannel with VMware.  The conversations have mainly revolved around how to get it set up and what the different use cases are.  The biggest question was what exactly the difference is between the two.  Are they the same thing with different names, or are they actually different?


EtherChannel and LACP are used to accomplish the same thing, but they go about it in slightly different ways.  Both form a link aggregation group (LAG) out of multiple physical links so that networking devices can be connected with redundant links that behave as a single logical link.  Without aggregation those redundant links would create a loop, which normally has to be handled by the Spanning Tree Protocol blocking ports.  So what is the real difference between the two?  LACP is a negotiation protocol with two modes, active and passive: a channel forms as long as at least one side is active (active/active or active/passive), while two passive sides will never form one.  Static EtherChannel does no negotiation at all; both sides simply have to be configured on with matching settings, and if they are not, the channel will not come up properly.  Seems fairly simple but…

The reason all of this matters is that the virtual switches in VMware cannot form a loop.  So by setting up LACP or EtherChannel you are mostly just increasing your operational cost and the complexity of the network.  It requires close coordination with the networking team to ensure that LACP or EtherChannel is configured with exactly the same settings on both ends.  LACP and EtherChannel offer different forms of load balancing, accomplished by hashing on values such as the source IP or source MAC; there are quite a few options to choose from.  Once the hash is calculated, the packet is sent down whichever link the hash maps to.  This creates a constraint: every packet of that flow is sent down the same link, and will keep using it until that link fails and the flow is forced onto another one.  So if two VMs are communicating over a LAG, all of their traffic could be going across just one link, leaving the other links underutilized.  The distributed switch and the physical switch must be set up with the same settings or a link will not be established.  LACP is only available with the Distributed Switch, which requires Enterprise Plus licensing.
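
If you do go the hash route, it is worth confirming what the host is actually configured for before troubleshooting uneven traffic.  As a quick sanity check, here is a minimal sketch from the ESXi shell; vSwitch0 is just an assumed name, so adjust it for your environment:

# Show the load balancing (teaming) policy and active uplinks for a standard vSwitch
esxcli network vswitch standard policy failover get -v vSwitch0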

If you are able to use the Distributed Switch, it also supports Load Based Teaming.  LBT is the only true load balancing method: it moves traffic across the uplinks based on the actual utilization of each link.  This is a far superior load balancing feature, and if you are already paying for it you should be using it.  There is also the myth that bonding two 10Gb links will give you 20Gb of throughput.  As I discussed earlier, the limitation is that a vNIC can only utilize one link at a time; it cannot break a single stream across two links for increased throughput.  You only really gain the throughput advantage when multiple VMs are using the uplinks.


As a best practice you should always use trunk ports down to your hypervisor hosts.  This allows the host to use multiple VLANs, as opposed to placing the switch ports into access mode and allowing only one VLAN; customers who do that often end up re-configuring their network later on, and it is always a pain.  I generally recommend setting up each port on the physical switch as a standard trunk carrying all of the VLANs that you need.  Then, on the virtual switch, build out all of your port groups and tag the traffic there with the VLAN needed for each port group.  By doing this and using LBT you have a simple yet efficient design.
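
To illustrate the port group side of that, here is a minimal sketch from the ESXi shell for a standard vSwitch; the vSwitch, port group name, and VLAN ID are assumptions for the example (on a Distributed Switch you would do the same thing through vCenter):

# Create a port group and tag it with the VLAN for that traffic
esxcli network vswitch standard portgroup add -v vSwitch0 -p VM-VLAN100
esxcli network vswitch standard portgroup set -p VM-VLAN100 --vlan-id 100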

Now there is one caveat to all of this: vSAN does not support LBT, but it does support LACP, and if you have vSAN you are licensed for the Distributed Switch.  LACP has one advantage over LBT, and that is failover time, the time it takes for a dead link to be detected and traffic sent to another link.  LACP failover is faster than that of LBT, and that difference could be what prevents a failed write with vSAN, which limits any downtime.  In production, hopefully there will not be many links going offline anyway.

VMworld 2018!!!

It is finally that time of year.  The greatest time of year.  It is time for VMworld!!!  August 26-30 is when everyone packs up and spends a week in Las Vegas with some of the greatest minds in virtualization.


VMworld is a great opportunity to learn about some of the latest technology in the industry.  The show floor will be packed with tons of vendors, some you have heard of and some you haven't.  You may find the vendor that has just the solution you have been looking for.  All the vendors will have lots of information about the various products and solutions they offer, and it is a great idea to talk to as many as you can.  It is always a great opportunity to learn something new, and they usually have some great prizes and swag!

The sessions will be excellent as always, presented by some of the smartest people you will ever meet.  You can take a look at all the sessions here.  If you can't make it to VMworld, most of the sessions will be posted on YouTube shortly after.

They will also be offering training sessions on the various VMware products, and if you are ready for it you can take one of the certification tests.  Maybe finally get that VCP or VCAP you have been working on.

The best part of all of this is the networking, and the lifelong friends you will make.  Through VMworld and various other social events I have met many great people and friends.  It is a great community to be a part of, and I hope this year I will be able to meet up with as many people as I can at the various events.


Stretched vSAN Cluster on Ravello

Stretched clustering is something I have wanted to set up in my home lab for a while, but it is not feasible with my current hardware.  Recently I was selected to be a part of the vExpert program for the third year.  One of the perks of this is the use of Ravello's cloud.  They have recently made a lot of advancements that have greatly increased performance, and they have also added a bare metal option which makes the performance even better.  I am skipping most of the steps to set up vSAN itself and only including what is different for a stretched cluster.

The high level architecture of a stretched vSAN cluster is simple.

[Diagram: high-level stretched vSAN cluster architecture]

  • Two physically separated clusters.  This is accomplished using Ravello availability groups.
  • A vCenter to manage it all.
  • An external witness.  This is needed for quorum, which allows an entire site to fail and the VMs to fail over.
  • Less than 5 ms latency between the two sites.  This is needed because all writes have to be acknowledged at the second site.
  • 200 ms RTT maximum latency between the clusters and the witness.

If this was a production setup there would be a few things to keep in mind.

  • All writes need to be acknowledged at the second site, so that could add up to 5 ms of latency to every write.
  • You can use layer 2 or layer 3 networking between the clusters.  You would want at least 10Gb for the inter-site connection.
  • You can use layer 2 or layer 3 networking to the witness, with at least 100Mbps of bandwidth.

Deploying on Ravello

[Screenshot: Ravello blueprint of the deployment]

For the architecture of this deployment we will need the following:

  • Management
  • Cluster Group 1 (an availability group simulates a separate data center)
  • Cluster Group 2 (an availability group simulates a separate data center)
  • vSAN network and Management/Data network

Management

There needs to be a DNS server and a vCenter.  I used Windows Server 2016 to set up both the DNS server and the Domain Controller.  I used the vCenter Server Appliance 6.5, which I deployed to a separate management ESXi host.

Cluster Groups

These consist of two ESXi 6.5 hosts each.  They use availability groups to keep them physically separated to simulate the stretched cluster.  Group 1 uses AG1 and Group 2 uses AG2.

[Screenshot: Ravello availability group assignments]

Network

 

I manually set up the DNS entries on the Server 2016 DNS server, and the two networks consist of the following (a sketch of the host-side vSAN VMkernel setup follows the list):

  • 10.0.0.0/16 Data/Management
  • 10.10.0.0/16 vSAN
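
On each ESXi host the vSAN network needs a VMkernel port tagged for vSAN traffic.  Below is a minimal sketch from the ESXi shell for a standard vSwitch setup; the port group name and IP address are assumptions for this lab (on a Distributed Switch you would add the VMkernel adapter through vCenter instead):

# Create a VMkernel interface on the vSAN port group and assign it an address
esxcli network ip interface add -i vmk1 -p vSAN
esxcli network ip interface ipv4 set -i vmk1 -I 10.10.0.11 -N 255.255.0.0 -t static
# Tag the interface for vSAN traffic
esxcli vsan network ip add -i vmk1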

Witness

The witness is an easy-to-deploy OVF.  It creates a nested ESXi host that runs on top of a physical host.  Its networking consists of the following:

  • vmk0 Management Traffic
  • vmk1 vSAN Traffic

Once the OVF is deployed, add the new witness host into vCenter.  You will see it in vCenter as a blue ESXi host.
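
If you want to confirm the witness appliance's networking, you can also SSH into the witness (it is just a nested ESXi host) and check which VMkernel interfaces are tagged for vSAN; this is a quick sanity check rather than a required step:

# List the VMkernel interfaces tagged for vSAN traffic on this host
esxcli vsan network list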

[Screenshot: the witness appearing as a blue ESXi host in vCenter]

Creating the Cluster

Now that everything is set up and online, it is time to create the cluster.  All four hosts need to be in one cluster in vCenter.  Go to the cluster settings, start the vSAN setup, and choose to configure a stretched cluster.

[Screenshot: the configure stretched cluster option in the vSAN setup]

Now break out the two fault domains so they correspond to the availability groups set up on Ravello.

[Screenshot: fault domains mapped to the two availability groups]

After the disks are claimed you have a stretched vSAN cluster that provides high availability across two data centers.  An entire site or a single node can go down, and your VMs can fail over and keep on running.
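
As a quick check, you can also confirm the cluster state from the ESXi shell of any of the hosts; a minimal example:

# Show vSAN cluster membership and this host's role
esxcli vsan cluster get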

 

ESXi 6.0 to 6.5 Upgrade Failed

The Problem

I am currently running vCenter 6.5 with a mix of 6.0 and 6.5 clusters.  I uploaded the latest Dell customized ESXi 6.5 image to Update Manager and had no issues updating my first cluster from 6.0 to 6.5.  In the past I have had some weird issues with Update Manager, but since 6.5 integrated it into vCenter it has been a lot more stable.  I then proceeded to migrate the next cluster to 6.5 and received this weird error.

[Screenshot: Update Manager remediation error]

I then tried to mount the ISO to the host and install it that way, but I got a much more detailed error.

[Screenshot: detailed error from the ESXi installer]

The Solution

1. SSH into the host and run the following command to see the list of installed VIBs:

esxcli software vib list

2. Remove the conflicting VIB:

esxcli software vib remove --vibname=scsi-mpt3sas

3. Reboot!
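
After the reboot, a quick sanity check from the same SSH session is to confirm the VIB is really gone; this assumes the conflicting VIB was scsi-mpt3sas, as it was in my case:

# Should return nothing once the conflicting VIB has been removed
esxcli software vib list | grep -i mpt3sas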

Now that the conflicting VIB has been removed you can proceed with installing the updates.

vSAN Storage Policies

I get a lot of questions about vSAN and its storage policies.  "What exactly does FTT mean?"  "What should I set the stripe to?"  The default storage policy with vSAN is FTT=1 and Stripe=1.  FTT stands for Failures To Tolerate.  Stripe is how many capacity drives an object is written across.
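
If you want to see the host-level defaults for yourself, here is a minimal check from the ESXi shell of any vSAN host; note this shows the defaults applied per object class, which is separate from the SPBM policy you edit in vCenter:

# Show the default vSAN policy applied to objects with no explicit policy
esxcli vsan policy getdefault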

FTT=1 in a 2-node configuration results in a mirror of all data: you can lose one drive or one node, at the cost of 200% storage usage.  With 4 or more all-flash nodes you can instead use RAID 5 erasure coding, where the data is distributed across nodes with a single parity, which brings the overhead down to roughly 133%.

FTT=2 requires 6 nodes and lets you lose 2 drives or 2 nodes.  This is accomplished by using RAID 6, which uses double parity and results in 150% storage usage.

If you want to check the status, go to Cluster > Monitor > vSAN > Virtual Objects.  From here you can see the FTT level and which disks each object lives on.  From the picture you can see that with the 2-node vSAN cluster the objects are on both nodes, resulting in RAID 1 mirroring.

[Screenshot: vSphere Web Client, vSAN Virtual Objects view]

Now let's break down what each setting is.

[Screenshot: vSphere Web Client, vSAN storage policy rule settings]

Striping breaks an object apart so it is written across multiple capacity disks.  In an all-flash environment there is still one cache drive per disk group, but it is used only to cache writes; reads are served from the capacity drives.  In a hybrid configuration reads are cached on the SSD, and if the data is not in the cache it has to be retrieved from the slower spinning disks, which hurts performance.  Breaking an object apart and writing it across multiple disks can therefore increase read performance.  I would recommend leaving the stripe at 1 unless you encounter performance issues.  Also note that the largest a single component can be is 255GB; if an object grows beyond that size it is broken up into multiple components across multiple disks anyway.

Force provisioning allows an object to be provisioned on the datastore even if the datastore is not capable of meeting the storage policy, for example when the policy is set for FTT=2 but the cluster only has 4 nodes and is therefore only capable of FTT=1.

Object Space Reservation controls how much of an object is thick provisioned.  By default all storage is thin provisioned with vSAN.  You can change this by increasing the percentage anywhere from 0% to 100%; at 100% the object is fully thick provisioned.  The only caveat is that with deduplication and compression enabled it must be either 0% or 100%.  By default the VM swap object is 100% reserved, but there is a command-line setting you can change if you need to reclaim that space (see the example below).
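
For reference, this is the kind of host advanced option I am referring to.  The option path below is from memory, so treat it as an assumption and verify it on your build before relying on it:

# Make vSAN VM swap objects thin instead of 100% reserved (assumed option name; run on each host)
esxcli system settings advanced set -o /VSAN/SwapThickProvisionDisabled -i 1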

Flash Read Cache Reservation reserves a set amount of read cache for an object.  The catch is that the cache drive is a finite resource.  Say the cache drive can provide at most 800GB: with a 10% reservation on 80 VMs that each have 100GB of storage, that is 10GB reserved per VM and the entire cache is spoken for.  When you power on the 81st VM, the cache drive will not be able to give that VM any read cache.  That is why it is best not to change the default unless you have a technical reason to.

 

How To Install vSAN Updates

VMware is making a lot of great progress with vSAN.  One of the biggest pain points with the technology is the daunting HCL.  VMware spends a lot of time working with hardware vendors to validate the various hardware and firmware versions for vSAN.  In the past this meant manually checking to verify you were running the right firmware version.  Now vSAN 6.6 will automatically check whether you are running the correct firmware, and if not, you can download and install the firmware automatically.  I found one simple issue with this: the buttons are not very clear about what they do.  As you can see from the image below, it looks like those buttons would refresh the page.  The arrow points to the button that actually means "update all".  Selecting it will apply the update to all of your hosts, either all at once or through a rolling update.

[Screenshot: vSAN firmware update buttons in the vSphere Web Client]

Strange Issues With Microsoft Clustering and ESXi

I have some legacy applications that require Microsoft Clustering, running on ESXi 6.0.  Using Microsoft Clustering on top of VMware does not give you many benefits: things like HA and moving workloads across nodes are already available through virtualization.  What clustering does do is create more places for things to break and cause downtime.  Really the only benefit I see with clustering in a virtualized environment is the ability to restart a server for system updates without taking the application down.

RDMs are required for using Microsoft Clustering across hosts.  An RDM (Raw Device Mapping) gives the VM control of the LUN as if it were directly connected to it.  To set this up you need to add a second SCSI controller and set its bus sharing to physical mode.  Each disk must then use the same SCSI controller settings on every VM in the cluster.  The downside is that you lose features such as snapshots and vMotion; when using RDMs in physical mode you should treat those VMs as if they were physical hosts.
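
The vSphere client creates the RDM pointer file for you when you add the disk, but for reference the same thing can be done from the ESXi shell.  This is only a sketch; the device ID, datastore, and VM names are assumptions:

# Create a physical-mode (pass-through) RDM pointer file for a LUN
vmkfstools -z /vmfs/devices/disks/naa.<device-id> /vmfs/volumes/datastore1/SQLNODE1/SQLNODE1_1.vmdk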

[Screenshot: SCSI controller settings for the shared RDM disks]

The problem occurred when one of the clustered nodes was rebooted.  The node never came back online, and when checking the console it looked like the Windows OS was gone.  I powered off the VM and removed the mapped RDMs, and when powering the VM back on, Windows booted up fine.  I found that very strange, so I powered it off again and added the drives back.  That is when I got the error "invalid device backing".  A VMware KB references the issue and basically says there is a problem with inconsistent LUNs.  The only problem was that I did have consistent LUNs.  I put in a ticket with GSS, and first-level support was not able to help; they had to bring in a storage expert.  He quickly found the issue: the LUN ID had changed.  I am not sure how that occurred, and it was not anything I could change.  When you add the drives to a VM's configuration, a mapping is created from the VM to the LUN.  When the LUN ID changed, the mapping did not.  The only fix was to remove the RDMs from all VMs in that cluster and then add them back.
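
If you hit something similar, it helps to compare what the RDM pointer file maps to against what the host currently sees.  A minimal sketch from the ESXi shell; the datastore path and VMDK name are assumptions for the example:

# Show which device (vml ID) an RDM pointer file maps to
vmkfstools -q /vmfs/volumes/datastore1/SQLNODE1/SQLNODE1_1.vmdk
# List the storage paths, including the LUN number each device is presented on
esxcli storage core path list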

vCenter Fails after Time Zone Change

We recently changed our NTP server, and I needed to update all of our hosts and vCenters.  I have a handy PowerShell script to update the ESXi hosts, but that script does not work on the vCenter servers.  I logged into the appliance on port 5480 to get to the vCenter Server Appliance management interface.  I logged in as root and noticed that the time zone was set to UTC.  I am in the Central time zone, so I wanted to change it from UTC.  It turns out that if you do that, it breaks everything.  I had to learn this the hard way: once I changed the time zone I was no longer able to log into vCenter, and I had to go back and change the time zone back to UTC to regain access.

