Storage Field Day Is Almost Here


I am getting really excited about Storage Field Day coming up.  This is going to be my first Field Day experience, and it is going to bring many brand new experiences for me.  I am ready to see what the vendors have to present, and ready to learn many new things.


I have been on vacation for the last week, so when I got back I was really surprised by some recent changes.  Seagate is no longer going to be presenting, and in their place will be NetApp.  NetApp is a company that I really did not know much about until its acquisition of SolidFire.  I had been following SolidFire for some time before NetApp acquired them, and NetApp has recently announced a hyper-converged product.  The HCI market is becoming much more competitive, with an increasing number of vendors in it.  All the major storage vendors have an HCI offering, such as Dell EMC with VxRail, HPE with SimpliVity, and Cisco with HyperFlex.  It only makes sense that NetApp would get into the market.  I am curious to see how their product will differentiate itself from all of its competitors.

Check back next week for an update on NetApp and all the other vendors presenting at Storage Field Day 13.  I plan to have many more posts to cover everything that will be presented at Storage Field Day.

Nutanix Node Running Low On Storage

I manage a few Nutanix clusters, and they are all flash, so the normal tiering of data does not apply. In hybrid mode, which has both spinning and solid state drives, the SSDs are used for read and write cache, and only “cold” data is moved down to the slower spinning drives as needed.  The other day the local drives on one of the nodes were running out of free space.  It made me wonder: what happens if they do fill up?

Nutanix tries to keep everything local to the node.  This provides low latency reads since there is no network for the data to cross, but the writes still have to go across the network.  The reason for this is that you want at least two copies of the data: one local and one remote.  So when a write happens, it is written synchronously to the local node and to a remote node.  The remote copies are spread across all nodes in the cluster, so in the event of a lost node the cluster can use all of the remaining nodes to rebuild that data.

When the drives do fill up, nothing really happens.  Everything keeps working and there is no downtime.  The local drives become read only.  Writes will then go to at least two other nodes, ensuring data redundancy.
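To make that behavior easier to picture, here is a rough sketch in Python. It is only a toy model under my own assumptions, not Nutanix's actual placement logic: the function name `place_replicas`, the node names, and the free space numbers are all made up for illustration. It prefers the local node for one copy, and once the local node has no room it puts both copies on remote nodes.

```python
def place_replicas(local, free_gb, size_gb, copies=2):
    """Pick the nodes that receive each copy of a write (toy model only)."""
    # Nodes that still have enough free space to hold this write
    candidates = [node for node, free in free_gb.items() if free >= size_gb]

    chosen = []
    # Prefer the local node so later reads do not cross the network
    if local in candidates:
        chosen.append(local)

    # Fill the remaining copies from the remote nodes with the most free space
    remotes = sorted(
        (node for node in candidates if node != local),
        key=lambda node: free_gb[node],
        reverse=True,
    )
    chosen.extend(remotes[: copies - len(chosen)])

    if len(chosen) < copies:
        raise RuntimeError("not enough nodes with free space for all copies")
    return chosen


# Hypothetical four-node cluster; node A's local drives are full
full_local = place_replicas("A", {"A": 0, "B": 500, "C": 300, "D": 100}, 10)
# Same cluster, but node A still has room
normal = place_replicas("A", {"A": 50, "B": 500, "C": 300, "D": 100}, 10)
print(full_local, normal)
```

In the first case both copies land on remote nodes, which is why reads for that data later have to cross the network.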

To check the current utilization of your drives, go to Hardware > Table > Disk.


So it is best practice to try to “right size” your workloads.  Try to make sure that the VMs will have their storage needs met by the local drives.  HCI is a great technology; it just has a few different caveats to consider when designing for your workloads.
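A quick way to sanity check this is to add up what the VMs on each node need and compare it to the local capacity. The sketch below is just that arithmetic in Python; the node names, VM names, and sizes are hypothetical, and real numbers would have to come from your own cluster.

```python
def nodes_over_local_capacity(vm_storage_gb, local_capacity_gb):
    """Return {node: shortfall in GB} for nodes whose VMs outgrow local storage."""
    over = {}
    for node, vms in vm_storage_gb.items():
        needed = sum(vms.values())           # total storage the node's VMs need
        capacity = local_capacity_gb[node]   # usable local capacity on that node
        if needed > capacity:
            over[node] = needed - capacity
    return over


# Hypothetical example values, in GB
vm_storage_gb = {
    "node-1": {"sql01": 3000, "file01": 2500},
    "node-2": {"web01": 200, "web02": 200},
}
local_capacity_gb = {"node-1": 5000, "node-2": 5000}

shortfall = nodes_over_local_capacity(vm_storage_gb, local_capacity_gb)
print(shortfall)
```

Any node that shows up in the result has VMs that cannot keep all of their data local.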

If you want a deeper dive, check out Josh Odgers' post about it.

Delegate For Storage Field Day 13


I am very excited to say that I have been chosen as a delegate for Storage Field Day 13 in Denver, Colorado, June 14-16. Storage Field Day is just one of the many events of the Tech Field Day program. Tech Field Day brings in various vendors and leaders in tech to present on various topics. The delegates can then interact and ask all kinds of questions. I have been following the Tech Field Day program for a few years now, ever since my good friend Michael Wood introduced it to me.

I first became interested in becoming a delegate after my friend Thom Greene was invited to join Tech Field Day.  Up to this point I had never met anyone who had been a part of it.  After talking to him, it seemed like something I really would like to be a part of.  I know it is going to be an amazing and humbling experience.  I am going to be with some of the smartest people in the industry.  Just take a look at the event page, and look at everyone who is going to be there.
As of now there are six presenters:

I already know a little about all of these companies, but I am eager to learn more. I am looking forward to the deep dives and whiteboard sessions detailing all of the technology that they offer.

I will be following Twitter, so if you have anything you want me to find out, tweet at me @brandongraves08.  I will try to ask as many questions as I can.  You can also follow #SFD13 to ask questions and keep up with the event.

Strange Issues With Microsoft Clustering and ESXi

I have some legacy applications that require Microsoft Clustering, and they are running on ESXi 6.0.  Using Microsoft Clustering on top of VMware does not give you many benefits.  Things like HA and moving workloads across nodes are already available using virtualization.  What clustering does do is create more places for things to break and give you downtime.  Really the only benefit I see with clustering in a virtualized environment is the ability to restart a server for system updates.

RDMs are required for using Microsoft Clustering.  An RDM (Raw Device Mapping) gives the VM control of the LUN as if it were directly connected to it. To set this up you need to add a second SCSI controller and set its SCSI bus sharing to physical.  Each disk must then use the same SCSI controller settings on every VM in the cluster. The negative side to doing this is that you lose features such as snapshots and vMotion.  When using RDMs in physical mode you should treat those VMs as if they were physical hosts.


The problem occurred when one of the clustered nodes was rebooted.  The node never came back online, and when I checked the console it looked like the Windows OS was gone.  I powered off the VM and removed the mapped RDMs.  When I powered the VM back on, Windows booted up fine.  I found that very strange, so I powered it off again and added the drives back.  That is when I got the error “invalid device backing”.  A VMware KB references the issue, and it basically says the cause is inconsistent LUNs.  The only problem was that I did have consistent LUNs.

I put in a ticket with GSS, and first-level support was not able to help.  They had to get a storage expert to help out.  He quickly found the issue, which was that the LUN ID had changed.  I am not sure how that occurred, but it was not anything I could change.  When you add the drives to a VM, its config records a mapping from the VM to the LUN.  When the LUN ID changed, the mapping did not.  The only fix was to remove the RDMs from all VMs in that cluster and then add them back.
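The idea behind the fix can be sketched as a simple comparison: what LUN ID did each VM record when the RDM was added, and what LUN ID does the device have now? The Python below is only an illustration under my own assumptions. The VM names, device names, and LUN IDs are made up, and in real life you would have to collect the recorded and current values from your own hosts and storage.

```python
def find_stale_mappings(recorded, current):
    """Return (vm, device, recorded_id, current_id) for every mismatch."""
    stale = []
    for vm, disks in recorded.items():
        for device, recorded_id in disks.items():
            # None means the device is no longer presented at all
            current_id = current.get(device)
            if current_id != recorded_id:
                stale.append((vm, device, recorded_id, current_id))
    return stale


# Hypothetical data: LUN IDs each clustered VM recorded when its RDMs were added
recorded = {
    "sqlnode1": {"naa.example0001": 10, "naa.example0002": 11},
    "sqlnode2": {"naa.example0001": 10, "naa.example0002": 11},
}
# Hypothetical data: LUN IDs the devices are presented with today
current = {"naa.example0001": 10, "naa.example0002": 12}

stale = find_stale_mappings(recorded, current)
for vm, device, old_id, new_id in stale:
    print(f"{vm}: {device} recorded LUN {old_id}, now LUN {new_id}")
```

Every VM that shows up in the list would need its RDM removed and added back, which matches what I ended up doing for the whole cluster.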

Weathervane, a benchmarking tool for…

Weathervane, a benchmarking tool for virtualized infrastructure and clouds – now open source!


Weathervane is a performance benchmarking tool developed at VMware. It lets you assess the performance of your virtualized or cloud environment by driving a load against a realistic application and capturing relevant performance metrics. You might use it to compare the performance characteristics of two different environments, or to understand the performance impact of some change in an existing environment.


VMware Social Media Advocacy

ECC Errors On Nutanix

When logging into a Nutanix cluster I see that I have 2 critical alerts.


With a quick search I found KB 3357.  I SSHed into one of the CVMs running on my cluster and ran the following command as one line.

ncc health_checks hardware_checks ipmi_checks ipmi_sel_correctable_ecc_errors_check

Looking over the output I quickly found this line.

[Screenshot of the NCC output]

I forwarded all the information to support, and I will replace the faulty memory module when the new one arrives.  Luckily, so far I have not seen any problems from this memory issue, and I really liked how quick and easy it was to track it down using Nutanix.

vCenter Fails after Time Zone Change

We recently changed our NTP server, and I needed to update all our hosts and vCenters.  I have a handy PowerShell script to update the ESXi hosts, but that script does not work on the vCenter servers.  I logged into the vCenter management interface on port 5480 as root and noticed that the time zone was UTC.  I am in the Central time zone, so I wanted to change it from UTC.  It turns out that if you do that, it breaks everything.  I had to learn this the hard way: once I changed the time zone, I was not able to log into vCenter.  I had to go back and change the time zone to UTC to regain access.

