We had the chance to sit down and have a chat with the Hedvig EMEA guys last week. They gave us a very good presentation on what Hedvig can bring and what they are working on. As we recently got to know Hedvig and their software defined storage solution, we were pretty amazed with their view on SDS and their long list of supported platforms and enterprise storage features and services. Although it is pretty hard to explain all the goods Hedvig brings in one post, we will give it a try! 🙂
Not too long ago, Hedvig Inc came out of stealth after having been in development since June 2012. They are taking a slightly different approach to general availability (GA) than other SDS start-ups. When their software goes GA with version 1.0, it will be a fully developed, full-featured solution that is already running in production at several enterprise early-adopter customers! Version 1.0 is likely to be released next week (week 23)!
Okay, so let us focus on what makes Hedvig unique. They introduce themselves using the quote below.
Put simply: Hedvig gets better and smarter as it scales. Hedvig defies conventional wisdom, transforming commodity hardware into the most advanced storage solution available today. Hedvig accelerates data to value by collapsing disparate storage systems into a single platform, creating a virtualized storage pool that provisions storage with a few clicks, scales to petabytes, and runs seamlessly in both private and public clouds.
When looking into their view on SDS, you notice that Hedvig founder Avinash Lakshman has a lot of experience building large-scale distributed database systems. He co-created Cassandra and co-invented Amazon Dynamo, both of which helped shape what is now known as NoSQL. The Hedvig solution leverages that experience (of course), using it to build a large-scale, distributed, elastic storage system on commodity server hardware. No strict hardware compatibility list (HCL) is enforced, although there are recommendations available for best performance and capacity sizing.
If you are not ready to say goodbye to your traditional storage arrays just yet, Hedvig can connect to LUNs or mount points provided by your traditional array(s) for an ‘easier’ transition from traditional storage to fully distributed storage. 🙂
Hedvig not only runs on hypervisors (VMware, Hyper-V, KVM and Xen), but also on bare metal. Next to that, it supports file- and block-based storage protocols (SMB3 support is being developed), and even object-based storage (OpenStack Swift, Amazon S3)! As for OpenStack support: next to Swift, Cinder (the plugin for block-based storage) is also fully supported. That makes it possible for the vDisk policies (explanation follows) to be automatically exported to OpenStack!
The two deployment models are hyper-scale (scale compute and storage independently) and hyper-converged (scale compute with storage). Within these deployment models (which can run together in the same cluster!), the Hedvig storage proxies and the Hedvig storage nodes are deployed.
The storage proxy presents storage to the bare metal server or hypervisor using file (NFS v2, v3, v4) or block-based (iSCSI) storage protocols, except for Swift and S3, which are spoken ‘natively’. The proxy also handles the client-side caching mechanism and dedupe caching.
The storage node is the actual software piece which forms the distributed elastic storage cluster.
Both communicate with each other using custom RPC calls. The storage proxy will translate RPC to the configured storage protocol.
Hedvig uses Virtual Disk (vDisk) policies to define a logical storage space. Using vDisks, multi-tenancy can be introduced within your storage environment by using a vDisk per customer or department. The virtual disk management GUI (or CLI) will let you create (batches of) virtual disks using specific options as shown in the screenshot below. Did I mention that a REST API is also fully integrated?
Note that the replication policy is rack-aware and DC-aware! All the enterprise storage features you expect are configurable: inline compression, inline global deduplication and the replication factor (up to 6 replicas!) using the configured replication policy and datacenters. Storage features like read cache, dedup and compression are done in RAM.
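To make the rack-aware and DC-aware idea a bit more concrete, here is a minimal Python sketch of replica placement across failure domains. It is not Hedvig's actual algorithm (which was not shown to us); the policy labels, the Node tuple and the place_replicas function are all hypothetical names made up for this illustration. Only the limit of 6 replicas comes from the presentation.

```python
from collections import namedtuple

# Hypothetical description of a storage node and where it lives.
Node = namedtuple("Node", ["name", "datacenter", "rack"])

def place_replicas(nodes, replication_factor, policy):
    """Pick one node per failure domain until enough replicas are placed.

    policy is one of "agnostic", "rack_aware" or "dc_aware"
    (illustrative labels, not Hedvig's configuration keywords).
    """
    if not 1 <= replication_factor <= 6:
        raise ValueError("replication factor must be between 1 and 6")
    domain_of = {
        "agnostic": lambda n: n.name,                    # any distinct node
        "rack_aware": lambda n: (n.datacenter, n.rack),  # distinct racks
        "dc_aware": lambda n: n.datacenter,              # distinct datacenters
    }[policy]
    chosen, used_domains = [], set()
    for node in nodes:
        domain = domain_of(node)
        if domain not in used_domains:
            chosen.append(node)
            used_domains.add(domain)
        if len(chosen) == replication_factor:
            return chosen
    raise ValueError("not enough distinct failure domains for this policy")

nodes = [
    Node("n1", "dc1", "r1"), Node("n2", "dc1", "r1"), Node("n3", "dc1", "r2"),
    Node("n4", "dc2", "r1"), Node("n5", "dc3", "r1"),
]
print([n.name for n in place_replicas(nodes, 3, "dc_aware")])
```

With a DC-aware policy, every replica ends up in a different datacenter, so losing a whole site still leaves the other copies reachable.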
The tricky part in my opinion is the block size. I’m not a big fan of getting to choose your block size yourself. I get it, but I’m no fan. Especially when deduplication only works when you configure 512B or 4K (default) block sizes!
The clustered file system option is meant for file systems that are accessed by multiple hosts, for example.
It is possible to create a vDisk with HDD spindles while using client-side caching (write-through) or auto-tiering for higher performance, or… create an all-flash virtual disk.
Write performance is optimized by a round-robin algorithm that uses the 2 best-responding hosts within the cluster for the write IO. After the initial write and ack, the writes are also distributed evenly across the cluster. Further optimization for writes is done by stream detection for random versus sequential IO. If sequential data streams are detected, the cluster node passes that data directly to the disks to save IO on SSD or flash.
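Why the block size matters for dedupe is easiest to see with a toy example. The Python sketch below does generic fixed-block deduplication by hashing every block; it is an illustration of the general technique and not how Hedvig implements it. The dedup_stats and sector helpers are names I made up.

```python
import hashlib

def dedup_stats(data: bytes, block_size: int):
    """Return (total_blocks, unique_blocks) for fixed-size chunking."""
    unique = set()
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        unique.add(hashlib.sha256(block).digest())  # fingerprint per block
        total += 1
    return total, len(unique)

def sector(i: int) -> bytes:
    """A 512-byte sector filled with one byte value (test data only)."""
    return bytes([i]) * 512

# Eight different sectors form a 4 KiB pattern that is repeated four times.
pattern = b"".join(sector(i) for i in range(8))
aligned = pattern * 4
# The same data, but pushed 512 bytes out of 4 KiB alignment.
shifted = sector(9) + aligned

print(dedup_stats(aligned, 4096))  # (4, 1)
print(dedup_stats(shifted, 4096))  # (5, 3)
print(dedup_stats(shifted, 512))   # (33, 9)
```

Smaller blocks keep finding the duplicates when data is not nicely aligned, at the price of far more fingerprints to track. That trade-off is probably why the choice is left to the admin.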
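Stream detection sounds fancy, but the basic idea can be sketched in a few lines of Python. This is my own simplified take, assuming a stream counts as sequential once a number of consecutive requests each start where the previous one ended. The is_sequential function and the min_run threshold are assumptions, not Hedvig internals.

```python
def is_sequential(requests, min_run=4):
    """Detect a sequential stream in a list of (offset, length) requests.

    Returns True once min_run back-to-back contiguous requests are seen.
    min_run is a hypothetical tuning knob and should be at least 2.
    """
    run = 1
    for (prev_off, prev_len), (off, _length) in zip(requests, requests[1:]):
        # Contiguous if this request starts exactly where the last one ended.
        run = run + 1 if off == prev_off + prev_len else 1
        if run >= min_run:
            return True
    return False

streaming = [(0, 4096), (4096, 4096), (8192, 4096), (12288, 4096)]
scattered = [(0, 4096), (81920, 4096), (4096, 4096), (40960, 4096)]
print(is_sequential(streaming), is_sequential(scattered))
```

A node that sees the first pattern could send the data straight to spinning disk, which handles sequential writes well, and keep the flash tier free for random IO.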
Read performance is obtained by using all storage nodes within the cluster, leveraging the fastest responding cluster nodes for low latency reads. A real-time updated table containing the latency from all hosts in the cluster is leveraged for this. Also, the client side caching will lower latency dramatically.
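A latency table like that could look roughly like the Python sketch below. It assumes a smoothed (exponentially weighted) latency per node, which is a common way to do this but not something Hedvig confirmed. LatencyTable, record, fastest and alpha are hypothetical names.

```python
class LatencyTable:
    """Keeps a smoothed latency per node and returns the quickest ones."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha    # weight of the newest sample (assumption)
        self.latency_ms = {}  # node name -> smoothed latency in ms

    def record(self, node, sample_ms):
        old = self.latency_ms.get(node)
        if old is None:
            self.latency_ms[node] = sample_ms
        else:
            self.latency_ms[node] = (1 - self.alpha) * old + self.alpha * sample_ms

    def fastest(self, replicas, count=1):
        """Return up to count replica nodes, lowest latency first."""
        known = [n for n in replicas if n in self.latency_ms]
        return sorted(known, key=self.latency_ms.__getitem__)[:count]

table = LatencyTable(alpha=0.5)
for node, ms in [("node-a", 10), ("node-b", 2), ("node-c", 5)]:
    table.record(node, ms)
print(table.fastest(["node-a", "node-b", "node-c"]))
```

Asking for count=2 instead of 1 gives the two best-responding nodes, which is the same kind of lookup the write path would need.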
Other IO optimization features used within Hedvig are auto-tiering and auto-balancing.
When it comes to cluster resiliency, Hedvig has a few tricks up their sleeves. First of all, writes are distributed to all the cluster nodes taking the configured replication options into account. The ack will be sent to the application after a majority quorum of checksummed acks (2 acks in case of 3 replicas) for data protection.
Using the replication options, you get a lot of replication models to choose from. For example: the most resilient configuration for replication factor and policy is said to be a deployment using 3 replicas in 3 datacenters. That gives the customer the most resilient setup, one in which non-disruptive updates can be performed. A logical overview of such a configuration will look like this:
Next to the replication possibilities, you get to create zero-impact, virtually unlimited snapshots and zero-impact thin clones. Those clones can have different characteristics from their parent vDisk!
The snapshot mechanism captures a space-efficient point-in-time state of a Virtual Disk using metadata only!
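The majority quorum rule is simple arithmetic: more than half of the replicas must acknowledge. A tiny Python sketch (function names are mine) shows that it matches the 2-out-of-3 example from the presentation:

```python
def majority_quorum(replicas: int) -> int:
    """Smallest number of acks that is more than half of the replicas."""
    return replicas // 2 + 1

def write_acknowledged(replicas: int, acks_received: int) -> bool:
    """True once enough replica acks are in to answer the application."""
    return acks_received >= majority_quorum(replicas)

for rf in (3, 6):
    print(rf, "replicas need", majority_quorum(rf), "acks")
```

For 3 replicas this gives 2 acks. By the same rule the maximum of 6 replicas would need 4, although we did not verify that number with Hedvig.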
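To show what a metadata-only snapshot means, here is a small Python sketch of a redirect-on-write disk. It assumes new writes always go to fresh blocks and a snapshot is just a copy of the block map; the VDisk class below is a toy of my own, not Hedvig's data structure.

```python
class VDisk:
    """Toy virtual disk: data blocks plus a map of logical block numbers."""

    def __init__(self):
        self.store = {}      # block id -> data (the only place data lives)
        self.block_map = {}  # logical block number -> block id (metadata)
        self.snapshots = {}  # snapshot name -> frozen copy of the block map
        self._next_id = 0

    def write(self, lbn, data):
        # Redirect-on-write: never overwrite, always store a new block.
        self.store[self._next_id] = data
        self.block_map[lbn] = self._next_id
        self._next_id += 1

    def snapshot(self, name):
        # Only the map is copied; no data block is touched or duplicated.
        self.snapshots[name] = dict(self.block_map)

    def read(self, lbn, snapshot=None):
        mapping = self.block_map if snapshot is None else self.snapshots[snapshot]
        return self.store[mapping[lbn]]

disk = VDisk()
disk.write(0, b"version 1")
disk.snapshot("before-upgrade")
disk.write(0, b"version 2")
print(disk.read(0), disk.read(0, snapshot="before-upgrade"))
```

The snapshot keeps pointing at the old block while the live disk moves on, so taking it costs no data copy at all. A thin clone can be thought of in the same way: a new, writable block map that starts out pointing at the parent's blocks.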
Hedvig is not bound to traditional RAID challenges like the need for spare disks and long rebuild times when using high-capacity disks. When a disk in the cluster fails, the rebuild is initiated automatically across the entire cluster. So a full wide-stripe rebuild is initiated on all nodes and disks in the cluster. An average 4TB disk rebuild should be done in about 20 minutes. We also overheard that the road map contains the implementation of erasure coding in the near future, which will further enhance disk/cluster resiliency. Cool! 😀
So… Hedvig truly is a very complete SDS product and is not bound to specific hardware or hypervisor. Hedvig will deliver, whether you need all-flash performance, a hyper-converged solution or even a big data environment! The versatility of their product on every level is a true differentiator!
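It is worth doing the math on that 20 minute claim, because it shows why the rebuild has to be spread over the whole cluster. The Python sketch below uses decimal units (1 TB = 1,000,000 MB) and assumes each surviving disk can spare about 50 MB/s for rebuild traffic. That 50 MB/s figure is purely my assumption, not a number from Hedvig.

```python
import math

def required_throughput_mb_s(capacity_tb: float, minutes: float) -> float:
    """Aggregate MB/s needed to re-create capacity_tb within the given time."""
    return capacity_tb * 1_000_000 / (minutes * 60)

def disks_needed(total_mb_s: float, per_disk_mb_s: float) -> int:
    """How many disks must share the work at a given per-disk rebuild rate."""
    return math.ceil(total_mb_s / per_disk_mb_s)

rate = required_throughput_mb_s(4, 20)
print(round(rate, 1), "MB/s in total")                   # 3333.3 MB/s in total
print(disks_needed(rate, 50), "disks at 50 MB/s each")   # 67 disks at 50 MB/s each
```

No single disk can write at more than 3 GB/s, so a 4TB rebuild in 20 minutes is only realistic when many disks and nodes each take a small slice of the work, and when the failed disk was completely full (a less full disk obviously needs less).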
The stuff we saw on the road map looks very promising including performance QoS and historical data analytics, so there is even more to come!
Because I am a VMware-minded guy, I was also glad to hear that support for storage APIs for vSphere (and Hyper-V) is being developed. The only missing feature in my opinion is support for VMware’s VVOL.
As with most SDS solutions, a fair amount of compute overhead is needed for the SDS solution to run properly. Be sure to keep that in mind when designing infrastructures using these technologies.
It is difficult to explain all the nifty things Hedvig is capable of in one blog post. So be sure to check out the videos below and the online brochure. And when you see Hedvig at exhibitions or conferences, I strongly recommend checking them out!