vSphere 6.5 Update 1 is the Update You’ve Been Looking For!

With this update release, VMware builds upon the already robust industry-leading virtualization platform and further improves the IT experience for its customers. vSphere 6.5 has now been running in production environments for over 8 months and many of the discovered issues have been fixed in patches and subsequently rolled into this release.


VMware Social Media Advocacy

ESXi 6.0 to 6.5 Upgrade Failed

The Problem

I am currently running vCenter 6.5 with a mix of 6.0 and 6.5 clusters.  I uploaded the latest Dell customized ESXi 6.5 image to Update Manager and had no issues updating my first cluster from 6.0 to 6.5.  In the past I have had some weird issues with Update Manager, but since it was integrated into vCenter in 6.5 it has been a lot more stable.  I then proceeded to migrate the next cluster to 6.5 and received a vague error.

I then tried to mount the ISO to the host and install it that way, but this time got a much more detailed error about a conflicting VIB.

The Solution

  1. SSH into the host and run the following command to see the list of installed VIBs:

esxcli software vib list

  2. Remove the conflicting VIB:

esxcli software vib remove --vibname=scsi-mpt3sas

  3. Reboot!

Now that the conflicting VIB has been removed you can proceed with installing the updates.

How To Setup A Nutanix Storage Container

Nutanix storage uses Storage Pools and Storage Containers.  The Storage Pool is the aggregated disks of all or some of the nodes.  You can create multiple Storage Pools depending on business needs, but Nutanix recommends a single Storage Pool.  Within the Storage Pool are Storage Containers.  Each container has data reduction settings that can be tuned to get the optimal balance of data reduction and performance.

Creating The Container

Once the cluster is set up with a Storage Pool created, we are ready to create a Storage Container.

  1. Name the Container.
  2. Select the Storage Pool.
  3. Choose which hosts to add.

That all looks really simple until the Advanced button is clicked.  This is where the Geek Knobs can be tweaked.

Advanced Settings

There are quite a few options to choose from, and the right setting for each depends on the use case.

  1. Replication Factor – Keep 2 copies of data in the cluster, or 3, depending on the use case.
  2. Reserved Capacity – How much guaranteed storage needs to be reserved for this container.  All Containers share the Storage Pool, so this is used to guarantee the capacity is always available.
  3. Advertised Capacity – How much storage the connected hosts will see.  This can be used to control actual usage on the Container side.
  4. Compression – A delay setting of 0 results in inline compression.  It can be set to a higher number to defer compression to post-process for better write performance.
  5. Deduplication – Cache deduplication can be used to optimize performance and use less storage.  Capacity deduplication will deduplicate all data globally across the cluster.  Capacity deduplication is post-process, and if enabled after a Container is created, only new writes will be deduplicated.
  6. Erasure Coding – Requires at least 4 nodes.  It is more efficient than the simple replication factor: instead of full copies of data, it uses parity to rebuild anything that is lost.  Enabling this setting will have some performance impact.
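The capacity impact of the Replication Factor and Erasure Coding choices above can be sketched with some quick math.  This is a simplified model: the `rf_usable`/`ec_usable` helper names and the 4/1 erasure-coding strip are illustrative assumptions, and real overheads also include metadata and reserved space.

```python
# Rough capacity math for RF vs. erasure coding (illustrative only).

def rf_usable(raw_tb, rf):
    """Usable capacity when every write is kept rf times (RF2/RF3)."""
    return raw_tb / rf

def ec_usable(raw_tb, data_blocks, parity_blocks):
    """Usable capacity with erasure coding, e.g. a 4/1 strip."""
    return raw_tb * data_blocks / (data_blocks + parity_blocks)

raw = 100  # TB of raw capacity in the Storage Pool (example value)

print(rf_usable(raw, 2))     # RF2: 50.0 TB usable (2 copies of all data)
print(ec_usable(raw, 4, 1))  # EC 4/1: 80.0 TB usable
```

The same raw pool goes a lot further with erasure coding, which is why it is attractive once you have the node count to support it.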

Summary

As you can see, the settings you choose can have a big impact on performance.  As always, architecture matters: you will have to evaluate the needs of your workload, and a better understanding of how everything works underneath results in a better performing system.

vSAN Storage Policies

I get a lot of questions about vSAN and its storage policies.  “What exactly does FTT mean?”, “What should I set the stripe to?”.  The default storage policy with vSAN is FTT=1 and Stripe=1.  FTT means Failures To Tolerate.  Stripe is how many drives an object is written across.

FTT=1 in a 2 node configuration results in a mirror of all data; you can lose one drive or one node, and it results in 200% storage usage.  In an all-flash configuration of 4 nodes or more, FTT=1 can instead use RAID 5, which distributes data across nodes with a single parity at roughly 133% storage usage.

FTT=2 requires 6 nodes, and you can lose 2 drives or 2 nodes.  This is accomplished using RAID 6, which has a parity of 2 and results in 150% storage usage.
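Those storage-usage percentages can be expressed as simple multipliers.  A quick sketch (the `consumed_gb` helper is mine; the 3+1 and 4+2 layouts are the standard vSAN erasure-coding shapes assumed here):

```python
# Storage consumed per GB of VM data under the policies described above.

def consumed_gb(vm_gb, policy):
    multipliers = {
        "RAID1_FTT1": 2.0,    # full mirror: 200% usage
        "RAID5_FTT1": 4 / 3,  # 3 data + 1 parity: ~133% usage
        "RAID6_FTT2": 1.5,    # 4 data + 2 parity: 150% usage
    }
    return vm_gb * multipliers[policy]

print(consumed_gb(100, "RAID1_FTT1"))  # 200.0 GB consumed
print(consumed_gb(100, "RAID6_FTT2"))  # 150.0 GB consumed
```

This is why erasure coding is popular on larger clusters: FTT=2 via RAID 6 costs far less than a triple mirror would.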

If you want to check the status, go to Cluster > Monitor > vSAN > Virtual Objects.  From here you can see the FTT and which disks are involved.  From the picture you can see that with the 2 node vSAN cluster, the objects are on both nodes, resulting in RAID 1 mirroring.

(Screenshot: vSphere Web Client, Virtual Objects view)

Now let’s break down what each setting does.

(Screenshot: vSphere Web Client, storage policy rules)

Striping breaks an object apart so it is written across multiple disks.  In an all-flash environment there is still one cache drive per disk group, but it is used only to cache writes; the rest of the drives serve reads.  In a hybrid configuration reads are cached on the SSD, but if the data is not in the cache it is retrieved from the slower spinning disks, which hurts performance.  By breaking the object apart and writing it across multiple disks, striping can increase read performance.  I would recommend leaving the stripe at 1 unless you encounter performance issues.  The largest size a component can be is 255GB; if an object grows beyond that, it is broken up into multiple components across multiple disks.
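The 255GB split is easy to reason about with a one-liner.  A small sketch (the `component_count` helper is a name of my own, not a vSAN API):

```python
import math

# Each vSAN component caps at 255 GB, so a larger object is split
# across multiple components (and therefore multiple disks).

def component_count(vmdk_gb, max_component_gb=255):
    return math.ceil(vmdk_gb / max_component_gb)

print(component_count(200))  # 1 component
print(component_count(600))  # 3 components
```

So a 600GB VMDK is already spread across at least three components even with Stripe=1, which is part of why raising the stripe setting rarely helps large objects.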

Force provisioning allows an object to be provisioned on the datastore even if the datastore is not capable of meeting the storage policy.  For example, you might set FTT=2 while the cluster is only 4 nodes, so it is only capable of FTT=1.

Object Space Reservation controls how much of an object is thick provisioned.  By default all storage is thin provisioned with vSAN.  You can change this by increasing the percentage, anywhere between 0% and 100%; at 100% the object is fully thick provisioned.  The only caveat is that with deduplication and compression it is either 0% or 100%.  By default the VM swap file is 100% reserved, but there is a command line setting you can change if you need to save that space.

Flash Read Cache Reservation reserves the amount of read cache you want guaranteed for objects.  The maximum amount of storage the cache drive can use is 800GB.  If you have 80 VMs each reserving cache for 100GB of storage, the entire cache drive is consumed, and when you power on the 81st VM the cache drive will not be able to give that VM any read cache.  That is why it is best not to change the default unless you have a technical reason to.
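To see how that example adds up, here is the arithmetic behind it.  The 10% reservation is an assumed policy value chosen to make the numbers line up with the 80-VM example; the `reserved_cache_gb` helper is mine:

```python
# How a per-VM read cache reservation eats an 800 GB cache tier.
# ASSUMPTION: a 10% flash read cache reservation in the storage policy.

def reserved_cache_gb(vm_count, vm_gb, reserve_pct):
    return vm_count * vm_gb * reserve_pct / 100

print(reserved_cache_gb(80, 100, 10))  # 800.0 GB -> cache fully reserved
```

With the full 800GB reserved, VM number 81 gets no read cache at all, which is exactly the failure mode the paragraph warns about.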

 

How To Install vSAN Updates

VMware is making a lot of great progress with vSAN.  One of the biggest pain points with the technology is the daunting HCL.  VMware spends a lot of time working with hardware vendors to validate the various hardware and firmware versions with vSAN.  In the past this meant manually checking that you were running the right firmware version.  Now vSAN 6.6 will automatically check whether you are running the correct firmware version, and if not, you can download and install the firmware automatically.  I found one simple issue with this: the buttons are not very clear about what they do.  As you can see from the below image, it looks like those buttons would refresh the page.  The arrow points to the button that “updates all”.  Selecting it will apply the update to all your hosts, either all at once or through a rolling update.

(Screenshot: the vSAN firmware update buttons in the vSphere Web Client)

Storage Resiliency in Nutanix. (Think about the Architecture!)

Hyperconverged is a great technology, but it does have its caveats.  You have to understand the architecture and design your environment appropriately.  Recently I had a Nutanix cluster that had lost Storage Resiliency.  Storage Resiliency is lost when there is not enough free storage available to rebuild in the event of the loss of a node.  When data is written, it is written locally and to a remote node.  This provides Data Resiliency, but at the cost of increased storage usage.  It is essentially the same idea as RAID in traditional storage.

I had 3 nodes that were getting close to 80% usage on the storage container.  80% is fairly full, and if one node went down, the VMs running on that host would not be able to fail over, because the loss of one node would not leave enough storage for them to HA to.  Essentially, whatever was running on that host would be lost, including what is on its drives.  I really wish they would add a feature that stops you from using more storage than resiliency allows.
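The rule of thumb behind that 80% warning can be written down directly.  A sketch, assuming equal-sized nodes and RF2 (the `max_safe_usage_pct` name is mine):

```python
# To keep N+1 storage resiliency, used capacity must still fit on the
# remaining nodes after one node fails.

def max_safe_usage_pct(node_count):
    """Highest pool usage (%) that still allows a full rebuild after
    losing one node, assuming equal-sized nodes."""
    return (node_count - 1) / node_count * 100

print(round(max_safe_usage_pct(3), 1))  # 66.7 -> 80% used is already unsafe
print(round(max_safe_usage_pct(4), 1))  # 75.0
```

On a 3-node cluster anything past roughly two-thirds full has already sacrificed resiliency, which is why 80% usage was a problem well before the container filled up.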

I had two options to remedy this.  I could either add more storage, which would also require the purchase of another node, or I could turn off replication.  Each cluster was replicating to the other, resulting in double the storage usage.  With replication the RPO was 1 hour, but there were also backups, which gave an RPO of 24 hours.  An RPO of 24 hours was deemed acceptable, so replication was disabled.  The freed space was not available instantly; Curator still needed to run background jobs to make the new storage available.


A lot of the time users will just look at the CPU overcommitment ratio or the memory usage and forget about the storage.  They are still thinking in the traditional 3-tier world.  Like any technology, you need to understand how everything works underneath.  At the end of the day, architecture is what matters.

X-IO Technologies Axellio At SFD13

This is part of a series of posts from my time at Storage Field Day 13.  You can find all the related content about it here.

Background

X-IO has been around for a while.  It has recently been going through some troubling times, along with the storage industry as a whole.  They had gone dark, and I had not seen much from them since.  Now, like a phoenix rising, they are ready to show off their new product at Storage Field Day 13.

They were founded in 2002 as the Seagate Advanced Storage Group with the goal of building a scalable storage array with zero trade-offs.  The team included engineers from Digital Equipment Corporation, Hewlett-Packard and StorageWorks.  This eventually led, in 2006, to the X-IO Intelligent Storage Element (ISE).  Then in 2007 they were purchased by Xiotech, based out of Minneapolis, and in 2008 the ISE-1 product was introduced.  In 2012 they moved to Colorado Springs, which is where they held the SFD presentation.  Current products include the iglu and ISE series.


Axellio is not for your general data center workloads.  It is being built to solve specific problems, problems I did not even know existed.  One of the example use cases was the oil industry.  A survey ship travels across the ocean surveying the ocean floor, creating petabytes of data that gets stored on the ship.  Not all of that data can be processed locally, so it is later migrated to a data center somewhere else to be processed.  This is just one use case that Axellio can help solve.

The platform itself I would call a form of converged system.  Normally a converged system includes the storage, compute, and software.  Axellio includes the compute and storage, but not the software; it is up to the customer or partner to implement their own software stack on the hardware.  Maybe sometime in the future they will include the software as well.

Hardware

The Axellio hardware is a 2U appliance which incorporates 2 servers, or nodes.  Each node has 2 Intel Xeon E5-26xx CPUs and 1TB of RAM or NVRAM, giving one appliance 4 CPUs and 2TB of RAM or NVRAM.  With current CPUs that is up to 88 cores and 176 threads.  Each appliance can hold up to 72 dual-port NVMe 2.5″ SSDs, which gives up to 460TB of storage, achieved using 6 trays with 12 drives each.  Offload modules can also be added, such as an Intel Phi for parallel compute or an Nvidia K2 GPU for VDI.  The appliances attach to Ethernet switches ranging from 4x 10Gb and 4x 40Gb up to 4x 100Gb.
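The per-appliance totals quoted above can be sanity-checked with some quick arithmetic.  The 22-core part is an assumption on my side to reach the stated 88 cores; X-IO did not specify the exact SKU:

```python
# Sanity-check the Axellio per-appliance totals.
# ASSUMPTION: 22-core E5-26xx parts (to reach the quoted 88 cores).

nodes, cpus_per_node, cores_per_cpu = 2, 2, 22
trays, drives_per_tray = 6, 12

total_cores = nodes * cpus_per_node * cores_per_cpu
print(total_cores, total_cores * 2)  # 88 cores, 176 threads
print(trays * drives_per_tray)       # 72 NVMe drives
```

The numbers line up with the spec sheet: 4 sockets, 88 cores, 176 threads, and 72 drive slots per 2U box.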


Architecture (What Really Matters)

FabricExpress is a PCIe interconnect that allows the two nodes to connect directly to the 72 NVMe drives.  Because the drives are NVMe, they connect to the CPUs directly over PCIe lanes, creating very fast local storage for the nodes.  Normally in a converged system, storage traffic has to cross an external switch, which always adds latency.

Axellio Performance

When it comes to performance, Axellio does not disappoint: 12 million IOPS with 35 microsecond latency, at 60 GB/s sustained bandwidth with 4KB writes.  That’s a lot of power in a little box.  Below is an image of further tests that utilized only 48 drives.

(Photo: benchmark results using only 48 of the 72 drives)

Brandon’s Take

One of the most exciting parts of the event was being able to go in the back and see where they build this product.  I saw all kinds of machinery that I had no idea how to use or what it was for.  You can open up a product once it’s shipped to you, but it’s an entirely different thing to see the different stages of it being built.  It really makes you appreciate everything that goes into creating these products.

Axellio is being built to solve a problem that I did not know much about before the event.  The problems it can solve could be very lucrative for X-IO and for the businesses that use it.  They are one of the few companies doing something different at a time when all the storage products are starting to look the same.

Fellow Delegate Posts

SFD13 PRIMER – X-IO AXELLIO EDGE COMPUTING PLATFORM  by Max Mortillaro

Axellio, next gen, IO intensive server for RT analytics by X-IO Technologies by Ray Lucchesi

Full Disclosure

X-IO provided us the ability to attend the session.  This included a shirt, USB drive and some Thai food.

Storage Field Day Is Almost Here


I am getting really excited about Storage Field Day coming up.  This is going to be my first Field Day experience, and it is going to bring many brand new experiences for me.  I am ready to see what the vendors have to present, and ready to learn many new things.


I have been on vacation for the last week, so when I got back I was really surprised by some recent changes.  Seagate is no longer going to be presenting, and in their place will be NetApp.  NetApp is a company I really did not know much about until its acquisition of SolidFire.  I had been following SolidFire for some time before NetApp acquired them, and they have recently announced a hyper-converged product.  The HCI market is becoming much more competitive, with an increasing number of vendors in it.  All the major storage vendors have an HCI offering, such as Dell EMC with VxRail, HPE with SimpliVity, and Cisco with HyperFlex.  It only makes sense that NetApp would get into the market.  I am curious to see how their product will differentiate itself from all of its competitors.

Check back next week for an update on NetApp and all the other vendors presenting at Storage Field Day 13.  I plan to have many more posts covering everything presented at Storage Field Day.
