Strange Issues With Microsoft Clustering and ESXi

I have some legacy applications that require Microsoft Clustering and are running on ESXi 6.0.  Using Microsoft Clustering on top of VMware does not give you many benefits.  Features like HA and moving workloads across nodes are already provided by virtualization.  What clustering does do is create more places for things to break and cause downtime.  Really, the only benefit I see to clustering in a virtualized environment is the ability to restart a server for system updates.

RDMs are required for Microsoft Clustering.  An RDM (Raw Device Mapping) gives the VM control of the LUN as if it were directly connected to it.  To set this up you need to add a second SCSI controller and set its bus sharing to physical mode.  Each shared disk must then use the same SCSI controller settings on every VM in the cluster.  The downside is that you lose features such as snapshots and vMotion.  When using RDMs in physical compatibility mode you should treat those VMs as if they were physical hosts.
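For reference, here is a rough PowerCLI sketch of that setup on the first node.  The VM name, the naa device ID, and the controller type are just placeholders, and the other nodes would attach the same RDM pointer disk to a matching physical-mode controller.

# Assumes an active Connect-VIServer session; names and device IDs below are examples only
$vm  = Get-VM -Name "SQL-NODE1"
# Attach the LUN as a physical-mode RDM (use the LUN's real naa ID)
$rdm = New-HardDisk -VM $vm -DiskType RawPhysical -DeviceName "/vmfs/devices/disks/naa.600000000000000000000001"
# Put the RDM on its own SCSI controller with physical bus sharing
New-ScsiController -HardDisk $rdm -Type VirtualLsiLogicSAS -BusSharingMode Physical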


The problem occurred when one of the clustered nodes was rebooted.  The node never came back online, and when checking the console it looked like the Windows OS was gone.  I powered off the VM and removed the mapped RDMs.  When I powered the VM back on, Windows booted up fine.  I found that very strange, so I powered it off again and added the drives back.  That is when I got the error "invalid device backing".  A VMware KB article references the issue, and it basically says there is an inconsistency with the LUNs.  The only problem was that my LUNs were consistent.  I put in a ticket with GSS, and first level support was not able to help; they had to bring in a storage expert.  He quickly found the issue: the LUN ID had changed.  I am not sure how that occurred, but it was not anything I could have changed.  When you add an RDM to a VM, the VM's configuration stores a mapping from the VM to the LUN.  When the LUN ID changed, the mapping did not.  The only fix was to remove the RDMs from all VMs in that cluster and then add them back.
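If you run into the same thing, a quick way to compare what the VM thinks it is mapped to with what the host actually sees is a couple of PowerCLI one-liners.  This is only a sketch, and the VM and host names are examples.

# Device IDs the VM's RDMs point at
Get-HardDisk -VM "SQL-NODE1" -DiskType RawPhysical | Select-Object Name, ScsiCanonicalName
# LUNs the host currently presents, with their runtime names (the L# at the end is the LUN ID)
Get-ScsiLun -VmHost "esxi01" -LunType disk | Select-Object CanonicalName, RuntimeName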

vCenter Fails after Time Zone Change

We recently changed our NTP server, and I needed to update all of our hosts and vCenters.  I have a handy PowerShell script to update the ESXi hosts, but that script does not work on the vCenter servers.  I logged into the appliance on port 5480 to get to the vCenter management interface, logged in as root, and noticed that the time zone was UTC.  I am in the Central time zone, so I wanted to change it from UTC.  It turns out that if you do that, it breaks everything.  I had to learn this the hard way: once I changed the time zone, I was no longer able to log into vCenter.  I had to go back and change the time zone back to UTC to regain access.
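This is not my exact script, but a minimal sketch of what pointing the ESXi hosts at the new NTP server looks like in PowerCLI.  The vCenter and NTP server names are placeholders.

# Point every ESXi host at the new NTP server and restart ntpd
Connect-VIServer -Server "vcenter.example.local"
foreach ($esx in Get-VMHost) {
    Get-VMHostNtpServer -VMHost $esx | ForEach-Object { Remove-VMHostNtpServer -VMHost $esx -NtpServer $_ -Confirm:$false }
    Add-VMHostNtpServer -VMHost $esx -NtpServer "ntp1.example.local"
    Get-VMHostService -VMHost $esx | Where-Object { $_.Key -eq "ntpd" } | Restart-VMHostService -Confirm:$false
}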

Only Default Printer Mapping Over With View 6.2

I recently had an issue where only the default printer was being mapped over from the local Windows 7 PC to the View Client session.  It did not make any sense to me: I had 10 printers mapped, yet only 1 was showing up.  It turns out it's a limitation of Windows 7.  If all the printers use the same driver and port, then you will only see the default printer under the Devices and Printers page.  If you right-click that printer, it will list all the printers you have mapped, and when you try to print something, it will also list all of your printers.
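If you want to confirm the other printers really are mapped, you can list them all with their drivers and ports from PowerShell on the Windows 7 machine (using WMI, since Get-Printer is not available on Windows 7).

# List every installed printer with the driver and port it uses
Get-WmiObject -Class Win32_Printer | Select-Object Name, DriverName, PortName, Default | Format-Table -AutoSize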

New Year New Goals for 2017

For the year 2017 I have 3 certifications I would like to achieve, not only to advance my career, but also to further my knowledge and my passion for technology.  Some people feel that certifications are not really necessary or serve no real purpose, and that you shouldn't need a certification just to prove you know something.  To me, certifications are a great way to containerize what you should learn about a subject.  I feel that by not pursuing a certification, I would not get as deep into learning a technology as I should.

AWS certification is my first goal of the year.  I think it is going to be a very important skill going forward in IT.  The cloud is everywhere and constantly growing, and AWS is currently the market leader.  I think of cloud as an automated way to run a data center.  When you need to accomplish something such as deploying a VM or provisioning networks, you do it through automated tasks.  There are some real private clouds, but in general it seems most private data centers are still doing this the old-fashioned way: manually deploying new VMs or configuring the network by hand.  That is why AWS is so important; they have already designed an automated way for you to deploy your workloads, leaving you to architect and design how to run your workload on top of it.  The real skill is knowing how to use AWS and understanding the entire compute stack.  By understanding the entire stack you can really go anywhere in IT.

Next on my list is the VCAP6-DCV Implementation.  VCAP stands for VMware Certified Advanced Professional.  Before you can pursue these certifications you must first have earned your VCP, or VMware Certified Professional.  I have a real passion for virtualization, and I love everything that goes along with it.  My long-term goal is to accomplish the VCDX, but I know that is still very far off.  There are many steps to that goal, and the VCAP is just one of many along the journey.  With the Implementation certification I will show that I understand how to fully deploy vSphere into the data center.  On the surface the deployment does not really seem all that difficult until you realize how many settings, or "nerd knobs", there are.  To accomplish this I hope to get some real off-site training.  If that happens, it will be the first time in my career that I have had actual training on something.  I always find it ironic that companies are willing to spend millions of dollars on equipment, but not 5,000 dollars on actual training.  I will also use Pluralsight, which I have a free subscription to for being in the vExpert program.  Thank you, Pluralsight, for giving that to us.  Finally, I will read lots of blogs and white papers.  The VCAP tests cover so much that you really have to learn all you can before taking them.  Passing shows that you have real knowledge and are a subject matter expert.

Finally, my last goal for the year is the VCAP6-DCV Design.  Design is probably one of the hardest parts of IT.  When you ask what the best way to do something is, the infamous answer in IT is "it depends", because it's not always one size fits all.  Best practice has its place, but the real knowledge is knowing the best way to do something, not just the best practice.  I think this test will be the hardest for me.  My career has always been focused on the doing and not the designing.  It will be a learning curve, but it will be a good challenge and will really further my skills.

Accomplishing these 2 VCAP tests will give me the VCIX, or VMware Certified Implementation Expert, proving that I have the knowledge to deploy vSphere in the data center.  Hopefully I will be able to accomplish all 3 of these goals within 2017.  If I do, then maybe I will move on to the VCIX-DTM or some other challenge.  If you have any career advice, please leave a comment below.  Thanks for reading this post, and have a good 2017.

vCenter Server Resource Missing or Invalid

Recently I was trying to deploy an OVA file.  After selecting the storage, I received an error saying the resource was missing or invalid.

I tried downloading the OVA again, but still had the same issue.  Luckily it was an easy fix.  I was trying to deploy the OVA at the cluster level.  What I needed to do was deploy the OVA at the ESXi host level in vCenter.  Hopefully in a future vSphere update this will no longer be an issue.
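If you would rather skip the wizard, deploying the OVA straight to a specific host with PowerCLI works as well.  This is only a sketch, and the file path, host, and datastore names are examples.

# Deploy the OVA directly to a specific ESXi host rather than the cluster object
Import-VApp -Source "C:\ova\appliance.ova" -Name "MyAppliance" -VMHost (Get-VMHost "esxi01.example.local") -Datastore (Get-Datastore "Datastore01")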

VMware Quick Tip: Installing a VIB

Usually you use VMware Update Manager to install VIBs such as Dell's OpenManage.  You can also SSH into a host and use ESXCLI.

First, upload the offline bundle .zip file to your datastore.  Then you will need to find the mount point.

esxcli storage filesystem list

Now install the VIB.

esxcli software vib install -d "/vmfs/volumes/Datastore/DirectoryName/PatchName.zip"

Now verify

esxcli software vib list
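If you would rather not enable SSH, the same steps can be run remotely through PowerCLI's Get-EsxCli wrapper.  This is a hedged sketch: the host name and depot path are placeholders, and I am assuming the "depot" argument mirrors the -d/--depot option above.

# Remote equivalent of the esxcli commands above
$esxcli = Get-EsxCli -VMHost (Get-VMHost "esxi01.example.local") -V2
$esxcli.software.vib.install.Invoke(@{ depot = "/vmfs/volumes/Datastore/DirectoryName/PatchName.zip" })
$esxcli.software.vib.list.Invoke() | Select-Object Name, Version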

 

2 Node vSAN Design for a Remote Site

I was recently asked to design a solution for a remote site.  The requirements were that it had to be cheap, run a few virtual machines, have failover capability, and have shared storage.  The workloads are going to be very light, so there is no need for powerful servers.  I had a few options.  Technically one server could run the entire workload, but that does not allow for any failure, so I needed at least two servers.  This provides a failover capacity of only 1: the bare minimum, but acceptable for this use case.  These two servers would need some kind of shared storage.  One option would be a small storage array such as the Dell EMC VNXe.  I have used these previously, and they were a great solution for the time, but times are changing and I think hyperconvergence is the future.  vSAN 6.5 added a lot of new features that make it a perfect fit here.

Previously, any hyperconverged solution needed 3 nodes.  The 3 nodes are used to verify that everything is online: if 1 of the 3 nodes goes down, the other two can check with each other to confirm that the node actually went down.  To get away with using only 2 nodes, you use an external witness.  This external witness can run on a separate server at the site or at the main data center.

With vSAN you have one SSD per disk group (DG) to be used for cache.  Since this had to be a cheap solution, my area of constraint was cost, and everything had to be a minimal design that still got the job done.  Each server would have 1 DG with an 800GB SSD and 4 x 4TB 7.2k HDDs.  This allows for FTT=1, meaning only 1 host can be lost.  There is some risk with this design: during maintenance, one of the hosts will be in maintenance mode, leaving a single point of failure, because only 1 DG would be available on the one online host.  That is an acceptable risk given the cost constraint.
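As a rough PowerCLI sketch of building that disk group (the host name is a placeholder, and the filter just splits SSD from non-SSD, so double check the canonical names before claiming disks on a real host):

# Claim the single SSD for cache and the four HDDs for capacity on one host
$esx  = Get-VMHost "vsan-node1.example.local"
$luns = Get-ScsiLun -VmHost $esx -LunType disk
$ssd  = $luns | Where-Object { $_.IsSsd }
$hdds = $luns | Where-Object { -not $_.IsSsd }
New-VsanDiskGroup -VMHost $esx -SsdCanonicalName $ssd.CanonicalName -DataDiskCanonicalName $hdds.CanonicalName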

One of my favorite new features in 6.5 is direct connect.  With this you can directly connect two hosts to each other instead of running through a switch.  Each of these servers has 2 x 1GbE ports and 2 x 10GbE ports, and the remote site's switch infrastructure is only 1GbE.  1GbE can be a serious limitation for storage, and I wanted to avoid that.  With direct connect you cable the two hosts to each other, and all storage traffic then goes across that link, leaving the 1GbE ports for VM traffic.
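Here is a minimal PowerCLI sketch of tagging a vmkernel port for vSAN traffic on the direct connect link.  The switch name, port group, NICs, and IP are assumptions for the example.

# Bind a standard switch to the 10GbE uplinks used for the direct connection
$esx = Get-VMHost "vsan-node1.example.local"
$vss = New-VirtualSwitch -VMHost $esx -Name "vSwitch-vsan" -Nic "vmnic2","vmnic3"
# Create a vmkernel port on that switch and tag it for vSAN traffic
New-VMHostNetworkAdapter -VMHost $esx -VirtualSwitch $vss -PortGroup "vsan-direct" -IP "192.168.100.1" -SubnetMask "255.255.255.0" -VsanTrafficEnabled $true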

As you can tell, this is a bare minimum design for vSAN and hyperconvergence, but it meets all the requirements: cost, availability, and shared storage.  In the event of a host going down, HA can restart all the VMs on the second node, providing minimal downtime.  This provides the optimal solution for the requirements of the design.

 
