X-IO Technologies Axellio At SFD13

This is part of a series of posts from my time at Storage Field Day 13.  You can find all of the related content here.

Background

X-IO has been around for a while.  It has recently been going through some troubling times, along with the storage industry as a whole.  They had gone dark, and I had not seen much from them since then.  Like a phoenix rising, they are now ready to show off their new product at Storage Field Day 13.

They were founded in 2002 as the Seagate Advanced Storage Group with the goal of building a scalable storage array with zero trade-offs.  The team included engineers from Digital Equipment Corporation, Hewlett-Packard, and StorageWorks.  This work eventually led to the X-IO Intelligent Storage Element (ISE) in 2006.  Then in 2007 they were purchased by Xiotech, based out of Minneapolis.  In 2008 the ISE-1 product was introduced.  Then in 2012 they moved to Colorado Springs, which is where they held the SFD presentation.  Current products include the iglu and ISE series.


Axellio is not for your general data center workloads.  It is being built to solve specific problems, problems that I did not even know existed.  One of the example use cases was for the oil industry.  Currently a survey ship will travel across the ocean surveying the ocean floor.  This creates petabytes of data that gets stored on the ship, and not all of it can be processed locally.  The data then has to be migrated to a data center somewhere else to be processed.  This is just one use case that Axellio can help solve.

The platform itself is what I would call a form of converged system.  Normally a converged system includes the storage, compute, and software.  Axellio includes the compute and storage, but not the software.  It is up to the customer or partner to implement their own software stack to run on the hardware.  Maybe sometime in the future they will also include the software.

Hardware

The Axellio hardware is a 2U appliance which incorporates 2 servers, or nodes.  Each node has 2 Intel Xeon E5-26xx CPUs and 1 TB of RAM or NVRAM.  That gives us 4 CPUs and 2 TB of RAM or NVRAM in one appliance.  With the current CPUs that works out to up to 88 cores and 176 threads.  Each appliance can hold up to 72 dual-port 2.5″ NVMe SSDs, which gives us up to 460 TB of storage.  This is achieved using 6 trays with 12 drives each.  Offload modules can also be added, such as Intel Xeon Phi for parallel compute or an Nvidia K2 GPU for VDI.  The appliances can be attached to Ethernet switches ranging from 4x 10GbE to 4x 40GbE and up to 4x 100GbE.


Architecture (What Really Matters)

FabricExpress is a PCIe interconnect that allows the two nodes to connect directly to the 72 NVMe drives.  Because they are NVMe drives, they connect to the CPUs directly over PCIe lanes.  This creates super fast local storage for the nodes.  Normally in a converged system there would be an external switch that storage traffic has to cross, which always adds latency.

Axellio Performance

When it comes to performance, Axellio does not disappoint: 12 million IOPS at 35 microseconds of latency, with 60 GB/s of sustained bandwidth at 4 KB writes.  That's a lot of power in a little box.  Below is an image of some additional tests that were only utilizing 48 drives.

[Image: additional performance test results using only 48 drives]

Brandon’s Take

One of the most exciting parts of the event was that I was able to go in the back and see where they were building this product.  I could see all kinds of machinery that I had no idea how to use or what it was used for.  You can open up a product once it is shipped to you, but it is an entirely different thing to see the different stages of it being built.  It really makes you appreciate everything that goes into creating these products.

Axellio is being built to solve a problem that I did not know much about before the event.  The problems it can solve can be very lucrative for X-IO and for the businesses that use it.  They are one of the few companies doing something different at a time when all the storage products are starting to look the same.

Fellow Delegate Posts

SFD13 PRIMER – X-IO AXELLIO EDGE COMPUTING PLATFORM  by Max Mortillaro

Axellio, next gen, IO intensive server for RT analytics by X-IO Technologies by Ray Lucchesi

Full Disclosure

X-IO provided us the ability to attend the session.  This included a shirt, USB drive and some Thai food.


Storage Field Day Is Almost Here


I am getting really excited about Storage Field Day coming up.  This is going to be my first Field Day, and it is going to bring many brand new experiences for me.  I am ready to see what the vendors have to present and ready to learn many new things.


I have been on vacation for the last week, so when I got back I was really surprised by some recent changes.  Seagate is no longer going to be presenting, and in their place will be NetApp.  NetApp is a company that I really did not know much about until its acquisition of SolidFire.  I had been following SolidFire for some time before NetApp acquired them, and they have recently announced a hyper-converged product.  The HCI market is becoming much more competitive with an increasing number of vendors in it.  All the major storage vendors have an HCI offering, such as Dell EMC with VxRail, HPE with SimpliVity, and Cisco with HyperFlex.  It only makes sense that NetApp would get into the market.  I am curious to see how their product will differentiate itself from all of its competitors.

Check back next week for an update on NetApp and all the other vendors presenting at Storage Field Day 13.  I plan to have many more posts covering everything that will be presented at Storage Field Day.

Nutanix Node Running Low On Storage

I manage a few Nutanix clusters that are all flash, so the normal tiering of data does not apply.  In a hybrid model, which has both spinning and solid state drives, the SSDs are used for read and write cache, and only "cold" data gets moved down to the slower spinning drives as needed.  The other day the local drives on one of the nodes were running out of free space.  It made me wonder: what happens if they do fill up?

Nutanix tries to keep everything local to the node.  This provides low latency reads since there is no network for the data to cross, but writes still have to go across the network.  The reason for this is that you want at least two copies of the data: one local and one remote.  So when writes happen, they are written synchronously to the local node and to a remote node.  Writes are spread across all nodes in the cluster, so in the event of a lost node the cluster can use all of the remaining nodes to rebuild that data.

When the drives do fill up, nothing really happens.  Everything keeps working and there is no downtime.  The local drives become read only, and writes are then written to at least two different nodes, ensuring data redundancy.

To check the current utilization of your drives, go to Hardware > Table > Disk.

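If you would rather pull those numbers from a script, the same information is exposed through the Prism REST API.  Below is a minimal PowerShell sketch, assuming the v2 "disks" endpoint, a PowerShell 7 session, and a cluster reachable at prism.example.com; the address, credentials, and the exact stat field names are assumptions on my part, so verify them against the response from your own cluster.

# Minimal sketch (assumptions noted above): query Prism's v2 REST API for per-disk usage.
$cluster = "prism.example.com"                     # hypothetical Prism address
$cred    = Get-Credential                          # a Prism account with viewer rights

$uri = "https://$($cluster):9440/PrismGateway/services/rest/v2.0/disks"

# -Authentication and -SkipCertificateCheck require PowerShell 7; Prism usually has a self-signed cert.
$disks = Invoke-RestMethod -Uri $uri -Credential $cred -Authentication Basic -SkipCertificateCheck

# Report used vs. capacity per disk so nearly full drives stand out (field names assumed).
$disks.entities | ForEach-Object {
    [pscustomobject]@{
        Disk       = $_.id
        UsedGB     = [math]::Round($_.usage_stats.'storage.usage_bytes' / 1GB, 1)
        CapacityGB = [math]::Round($_.usage_stats.'storage.capacity_bytes' / 1GB, 1)
    }
} | Sort-Object UsedGB -Descending | Format-Table -AutoSize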

So it is best practice to try to "right size" your workloads and make sure that the VMs will have their storage needs met by the local drives.  HCI is a great technology; it just has a few different caveats to consider when designing for your workloads.

If you want a deeper dive, check out Josh Odgers' post about it.

Delegate For Storage Field Day 13


I am very excited to say that I have been chosen as a delegate for Storage Field Day 13 in Denver, Colorado, June 14-16. Storage Field Day is just one of the many events of the Tech Field Day program. Tech Field Day brings in vendors and leaders in tech to present on a variety of topics. The delegates can then interact and ask all kinds of questions. I have been following the Tech Field Day program for a few years now, ever since my good friend Michael Wood introduced it to me.

I first became interested in becoming a delegate after my friend Thom Greene was invited to join Tech Field Day.  Up to that point I had never met anyone who had been a part of it.  After talking to him, it seemed like something I really would like to be a part of.  I know it is going to be an amazing and humbling experience.  I am going to be with some of the smartest people in the industry.  Just take a look at the event page and look at everyone that is going to be there.
As of now there are six presenters:

I already know a little about all of these companies, but I am eager to learn more. I am looking forward to the deep dives and white board sessions detailing all of the technology that they offer.

I will be following Twitter, so if you have anything you want me to find out, tweet at me @brandongraves08 and I will try to ask as many questions as I can.  You can also follow #SFD13 to ask questions and keep up with the event.

Strange Issues With Microsoft Clustering and ESXi

I have some legacy applications that require Microsoft Clustering and are running on ESXi 6.0.  Using Microsoft Clustering on top of VMware does not give you many benefits.  Things like HA and moving workloads across nodes are already available through virtualization.  What clustering does do is create more places for things to break and give you downtime.  Really, the only benefit I see with clustering in a virtualized environment is the ability to restart a server for system updates.

RDMs are required for using Microsoft Clustering.  An RDM (Raw Device Mapping) gives the VM control of the LUN as if it were directly connected to it. To set this up you need to add a second SCSI controller and set its bus sharing to physical mode.  Each disk must then share the same SCSI controller settings on every VM in the cluster. The downside of doing this is that you lose features such as snapshots and vMotion.  When using RDMs in physical mode, you should treat those VMs as if they were physical hosts.

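For reference, here is a rough PowerCLI sketch of that setup for one of the nodes.  The VM name and the LUN's naa ID are placeholders I made up, so treat this as an outline of the steps rather than a finished script.

# Rough sketch: attach a LUN as a physical-mode RDM and put it on its own physical-bus-sharing controller.
$vm  = Get-VM -Name "MSCS-NODE1"                                   # hypothetical cluster node
$naa = "naa.60000000000000000000000000000001"                      # hypothetical LUN device name

# Add the LUN to the VM as a physical-mode RDM.
$rdm = New-HardDisk -VM $vm -DiskType RawPhysical -DeviceName "/vmfs/devices/disks/$naa"

# Move the RDM onto a new SCSI controller with physical bus sharing so the other node can share the disk.
New-ScsiController -HardDisk $rdm -Type VirtualLsiLogicSAS -BusSharingMode Physical

The same controller and disk settings then need to be repeated on the other node in the cluster, pointing at the same LUN.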

The problem occurred when one of the clustered nodes was rebooted.  The node never came back online, and when checking the console it looked like the Windows OS was gone.  I powered off the VM and removed the mapped RDMs.  When powering the VM back on, Windows booted up fine.  I found that very strange, so I powered it off again and added the drives back.  That is when I got the error "invalid device backing".  A VMware KB article references the issue, and it basically says there is a problem with inconsistent LUNs.  The only problem was that I did have consistent LUNs.  I put in a ticket with GSS, and the first level of support was not able to help.  They had to get a storage expert to help out. He quickly found the issue: the LUN ID had changed. I am not sure how that occurred, but it was not anything I could change.  When you add the drives to a VM's configuration, it creates a mapping from the VM to the LUN.  When the LUN ID changed, the mapping did not.  The only fix was to remove the RDMs from all of the VMs in that cluster and then add them back.
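
A quick way to see what each VM actually has mapped is to dump the RDM backing information with PowerCLI and compare the ScsiCanonicalName against the LUN's current naa ID.  A small sketch, with a made-up VM name pattern:

# List every physical-mode RDM on the cluster nodes and the device each one is backed by.
Get-VM -Name "MSCS-NODE*" |
    Get-HardDisk -DiskType RawPhysical |
    Select-Object Parent, Name, ScsiCanonicalName, DeviceName, CapacityGB |
    Format-Table -AutoSize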

Weathervane, a benchmarking tool for…

Weathervane, a benchmarking tool for virtualized infrastructure and clouds – now open source!


Weathervane is a performance benchmarking tool developed at VMware. It lets you assess the performance of your virtualized or cloud environment by driving a load against a realistic application and capturing relevant performance metrics. You might use it to compare the performance characteristics of two different environments, or to understand the performance impact of some change in an existing environment.


VMware Social Media Advocacy

ECC Errors On Nutanix

When logging into a Nutanix cluster, I saw that I had 2 critical alerts.


With a quick search I found KB 3357.  I SSHed into one of the CVMs running on my cluster and ran the following command as one line.

ncc health_checks hardware_checks ipmi_checks ipmi_sel_correctable_ecc_errors_check

Looking over the output, I quickly found the line reporting correctable ECC errors on one of the memory modules.


I forwarded all the information to support and will replace the faulty memory module when it arrives.  Luckily, so far I have not seen any issues from this memory problem, and I really liked how quick and easy it was to pin down using Nutanix.

vCenter Fails after Time Zone Change

We recently changed our NTP server, and I needed to update all of our hosts and vCenters.  I have a handy PowerShell script to update the ESXi hosts, but that script does not work on the vCenter servers.  I logged into the server on port 5480 to gain access to the vCenter appliance management interface.  I logged in as root and noticed that the time zone was UTC.  I am in the Central time zone, so I wanted to change it from UTC.  It turns out that if you do that, it breaks everything.  I had to learn this the hard way; once I changed the time zone, I was not able to log into vCenter.  I then had to go back and change the time zone back to UTC to regain access.
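
For the hosts themselves, the PowerCLI side is straightforward.  Here is a minimal sketch of the kind of script I am talking about; the vCenter name and NTP server are placeholders, so adjust them before trying anything like this.

# Minimal sketch: point every ESXi host at a new NTP server and restart ntpd.
Connect-VIServer -Server "vcenter.example.com"       # hypothetical vCenter

$newNtp = "ntp.example.com"                          # hypothetical new NTP server

foreach ($vmhost in Get-VMHost) {
    # Remove the old NTP servers and add the new one.
    Get-VMHostNtpServer -VMHost $vmhost |
        ForEach-Object { Remove-VMHostNtpServer -VMHost $vmhost -NtpServer $_ -Confirm:$false }
    Add-VMHostNtpServer -VMHost $vmhost -NtpServer $newNtp | Out-Null

    # Restart the NTP daemon so the change takes effect.
    Get-VMHostService -VMHost $vmhost |
        Where-Object { $_.Key -eq "ntpd" } |
        Restart-VMHostService -Confirm:$false
}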
