Etherchannel, LACP and VMware

Recently I have had some discussions about using LACP and static EtherChannel with VMware.  The conversations have mainly revolved around how to get them set up and what the different use cases are.  The biggest question was what exactly the difference between the two is.  Are they the same thing with different names, or are they actually different?


EtherChannel and LACP are used to accomplish the same thing, but they go about it in slightly different ways.  Both form a link aggregation group (LAG) out of multiple physical links between networking devices, bundling links that would otherwise create a loop and be blocked by the Spanning Tree Protocol.   So what is the real difference between the two?  LACP negotiates the bundle and has two modes, active and passive: as long as at least one side is set to active, a channel will form, but if both sides are passive nothing happens.  Static EtherChannel does not negotiate at all; both sides must be hard-coded on with matching settings, or no working channel will form.  Seems fairly simple but…
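To make that negotiation difference concrete, here is a tiny Python sketch.  It is purely illustrative, not vendor code; the mode names mirror the active/passive/on settings described above.

```python
# Minimal sketch (not vendor code): when does a port channel come up?

def lacp_channel_forms(side_a: str, side_b: str) -> bool:
    """LACP negotiates: at least one side must actively send LACPDUs."""
    return "active" in (side_a, side_b)

def static_channel_forms(side_a: str, side_b: str) -> bool:
    """Static EtherChannel does not negotiate: both sides must be hard-set on."""
    return side_a == "on" and side_b == "on"

print(lacp_channel_forms("active", "passive"))   # True  - the active side starts negotiation
print(lacp_channel_forms("passive", "passive"))  # False - nobody sends LACPDUs
print(static_channel_forms("on", "on"))          # True  - both sides bundle unconditionally
```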

The reason all of this matters is that VMware virtual switches cannot form a loop, so by setting up LACP or EtherChannel you are just increasing your operational cost and the complexity of the network.  It requires greater coordination with the networking team to ensure that LACP or EtherChannel is set up with exactly the same settings on both sides.   LACP and EtherChannel offer different forms of load balancing.  This is accomplished by hashing on values such as source IP or source MAC, and there are quite a few options to choose from.  Once the hash is calculated, the packet is sent down the link determined by that hash.  This creates a constraint, because every packet of that flow is sent down the same link and will keep using it until the link fails and traffic is forced onto another one.  So if two VMs are communicating over a LAG, all of their traffic could be going across just one link, leaving the other links underutilized.  The distributed switch and the physical switch must be set up with the same settings or a link will not be established, and LACP is only available with the distributed switch, which requires Enterprise Plus licensing.
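As a rough illustration of why one flow sticks to one link, here is a hedged Python sketch of hash-based uplink selection.  The hash inputs and the uplink names (vmnic0/vmnic1) are just assumptions for the example; the real policy decides which header fields go into the hash.

```python
# Sketch of hash-based link selection in a LAG: the same flow always
# hashes to the same uplink, regardless of how busy that uplink is.
import hashlib

def pick_uplink(src_ip: str, dst_ip: str, uplinks: list) -> str:
    key = f"{src_ip}-{dst_ip}".encode()
    index = int(hashlib.md5(key).hexdigest(), 16) % len(uplinks)
    return uplinks[index]

uplinks = ["vmnic0", "vmnic1"]
# Every packet of this VM-to-VM conversation lands on the same uplink,
# so the other link can sit idle no matter how busy this one gets.
print(pick_uplink("10.0.0.10", "10.0.0.20", uplinks))
print(pick_uplink("10.0.0.10", "10.0.0.20", uplinks))  # same result every time
```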

If you are able to use the distributed switch, it also supports Load Based Teaming (LBT).  LBT is the only true load balancing method: it spreads traffic across all links based on the actual utilization of each link.  This is a far superior load balancing feature, and if you are already paying for it you should be using it.  There is also the myth that bonding two 10Gb links will give you 20Gb of throughput.  As I discussed earlier, the limitation is that a vNIC can only utilize one link at a time; it cannot split a stream across two links for increased throughput.  You only really gain the aggregate throughput with multiple VMs utilizing the links.
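To show the idea behind LBT, here is a simplified Python sketch, not VMware's actual algorithm.  VMware documents roughly a 75% utilization trigger evaluated over 30-second intervals, but treat the numbers and names here as assumptions for illustration only.

```python
# Rough sketch of the Load Based Teaming idea: periodically check uplink
# utilization and move a VM port off any uplink that is saturated.

THRESHOLD = 0.75  # assumed saturation trigger

def rebalance(port_to_uplink: dict, uplink_util: dict) -> dict:
    """Move one VM port off any uplink that is over the threshold."""
    for uplink, util in uplink_util.items():
        if util <= THRESHOLD:
            continue
        least_loaded = min(uplink_util, key=uplink_util.get)
        if least_loaded == uplink:
            continue  # nowhere better to move traffic
        for port, assigned in port_to_uplink.items():
            if assigned == uplink:
                port_to_uplink[port] = least_loaded
                break  # move a single port, then wait for the next interval
    return port_to_uplink

ports = {"vm-a": "vmnic0", "vm-b": "vmnic0", "vm-c": "vmnic1"}
utilization = {"vmnic0": 0.90, "vmnic1": 0.20}
print(rebalance(ports, utilization))
# -> {'vm-a': 'vmnic1', 'vm-b': 'vmnic0', 'vm-c': 'vmnic1'}
```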


As a best practice you should always use trunk ports down to your hypervisor hosts.  This allows the host to utilize multiple VLANs, as opposed to placing the switch ports into access mode and allowing only one VLAN; customers who do that often end up re-configuring their network later on, and it is always a pain.  I generally recommend setting up each port on the physical switch as a standard trunk carrying all the VLANs you need.  Then, on the virtual switch, build out all of your portgroups and tag the traffic there with the VLAN needed for each portgroup.  By doing this and using LBT you have a simple yet efficient design, as sketched below.
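As a small illustration of that design, the sketch below uses made-up portgroup names and VLAN IDs to show the relationship: the portgroups tag VLANs on the virtual switch, and the physical trunk ports simply need to allow every VLAN those portgroups use.

```python
# Hypothetical portgroups on the virtual switch and their VLAN tags.
portgroups = {
    "Management": 10,
    "vMotion": 20,
    "VM-Production": 30,
    "VM-DMZ": 40,
}

# The trunk ports facing the hosts must carry the union of these VLANs.
allowed_vlans = sorted(set(portgroups.values()))
print("trunk must allow VLANs:", ",".join(str(v) for v in allowed_vlans))
# -> trunk must allow VLANs: 10,20,30,40
```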

Now there is one caveat to all of this: vSAN does not support LBT, but it does support LACP, and if you have vSAN you are licensed for the distributed switch.  LACP has one advantage over LBT, and that is failover time, the time it takes for a dead link to be detected and traffic moved to another link.  LACP failover is faster than LBT's, and that difference could be what stands between a completed write and a failed write with vSAN.  That can limit downtime, although in a production environment there will hopefully not be many links going offline.

Nutanix Node Running Low On Storage

I manage a few Nutanix clusters and they are all-flash, so the normal tiering of data does not apply.  In a hybrid cluster, which has both spinning and solid state drives, the SSDs are used for read and write cache, and "cold" data is moved down to the slower spinning drives as needed.   The other day one of the nodes' local drives was running out of free space, and it made me wonder: what happens if they fill up?

Nutanix tries to keep everything local to the node.  This provides low-latency reads, since the data does not have to cross the network, but writes still do.  The reason for this is that you want at least two copies of the data: one local and one remote.  So when writes happen, they are written synchronously to the local node and to a remote node.  Those remote copies are spread across all nodes in the cluster, so in the event of a lost node every remaining node can participate in rebuilding that data.
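Here is a conceptual Python sketch of that write placement, keeping two copies of the data.  This is not Nutanix code, and the node names are made up.

```python
# Conceptual sketch: each write gets one local copy plus one copy on a
# remote node, and the remote copies end up spread around the cluster.
import random

NODES = ["node-1", "node-2", "node-3", "node-4"]

def place_write(local_node: str) -> tuple:
    remote_candidates = [n for n in NODES if n != local_node]
    remote_node = random.choice(remote_candidates)  # replicas get spread around
    return (local_node, remote_node)

print(place_write("node-1"))  # e.g. ('node-1', 'node-3')
```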

When the drives do fill up, nothing dramatic happens.  Everything keeps working and there is no downtime.  The local drives become read-only, and new writes are then written to at least two different nodes elsewhere in the cluster, ensuring data redundancy.

To check the current utilization of your drives, go to Hardware > Table > Disk.


So it is best practice to "right size" your workloads.  Try to make sure that the VMs' storage needs can be met by the local drives.  HCI is a great technology; it just has a few different caveats to consider when designing for your workloads.

If you want a deeper dive, check out Josh Odgers' post about it.

2 Node vSAN Design for a Remote Site

I was recently asked to design a solution for a remote site.  The requirements were that it had to be cheap, run a few virtual machines, provide failover capability, and have shared storage.  The workloads are going to be very light, so there is no need for powerful servers.  I had a few options.  Technically one server could run the entire workload, but that does not allow for any failure, so I needed at least two servers.  This provides a failover capacity of only one: bare minimum, but acceptable for this use case.  These two servers would need some kind of shared storage.  One option would be a small storage array such as the Dell EMC VNXe.  I have used these previously, and they were a great solution for their time, but times are changing and I think hyperconvergence is the future.  vSAN 6.5 added a lot of new features that make it a perfect fit.

Previously, any hyperconverged solution needed three nodes.  Three nodes are used for quorum: if one of the three goes down, the other two can check with each other and verify that the node actually went down.  To get away with using only two nodes, you use an external witness.  This witness can run on a separate server at the site or back at the main data center.
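A minimal sketch of why the witness vote matters in a two-node setup is below; it assumes one vote per node plus one for the witness, which is the usual quorum idea rather than anything vSAN-specific.

```python
# Sketch: the witness provides a third vote so a surviving node can tell
# "my peer is down" apart from "I am the one that is isolated".

def has_quorum(votes_visible: int, total_votes: int = 3) -> bool:
    return votes_visible > total_votes // 2

# Node A loses contact with node B but can still reach the witness:
print(has_quorum(votes_visible=2))  # True  - node A keeps serving the VMs
# Node A is isolated from both node B and the witness:
print(has_quorum(votes_visible=1))  # False - node A must stop to avoid split brain
```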

With vSAN you have one SSD per disk group (DG) to be used for cache.  Since this had to be a cheap solution, my main constraint was cost, and everything had to be a minimal design that gets the job done.  Each server would have one DG with an 800GB SSD and four 4TB 7.2K HDDs.  This allows for FTT=1, meaning only one host can be lost.  There is some risk with this design: during maintenance one of the hosts would be in maintenance mode, leaving a single point of failure, because only the one DG on the remaining online host would be available.  That is an acceptable risk given the cost constraint.
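For a rough sense of the capacity this gives, here is some back-of-the-envelope math for the layout above.  It ignores vSAN overheads such as slack space and metadata, so treat the result as an approximation rather than a sizing guide.

```python
# Rough capacity math for two hosts with four 4TB HDDs each under FTT=1.

hosts = 2
hdds_per_host = 4
hdd_size_tb = 4

raw_tb = hosts * hdds_per_host * hdd_size_tb  # 32 TB raw across the cluster
usable_tb = raw_tb / 2                        # FTT=1 mirrors every object
print(raw_tb, usable_tb)                      # 32 16.0
```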

One of my favorite new features in 6.5 is direct connect.  With this you can directly connect two hosts to each other instead of running through a switch.  Each of these servers has two 1Gb ports and two 10Gb ports, and the remote site switch infrastructure is only 1Gb.  1Gb can be a serious limitation for storage, and I wanted to avoid that.  With direct connect you can cable the two hosts to each other so all storage traffic goes across that link, leaving the 1Gb ports for VM traffic.

As you can tell, this is a bare-minimum design for vSAN and hyperconvergence, but it meets all the requirements: cost, availability, and shared storage.  In the event of a host going down, HA can restart all the VMs on the second node with minimal downtime.  This provides the optimal solution for the requirements of the design.
