Have you ever heard of the Lindy effect? It basically postulates that the future life expectancy of technologies and ideas is proportional to their current age. Based on that, it is safe to say that it is not too late for anyone to learn vi, which is notorious for being a text editor with a vertical learning curve 🙂 Below is a link to my introductory blog post, published on the StarWind blog and intended for those who are just starting with vi. I recently had an opportunity to start using it, and I can confirm that it is just a matter of overcoming the initial frustration and practicing a little bit, and then you may even come to like it 🙂 Even if you don't, you'll definitely enjoy the power of being able to make config file changes on any Linux box right away instead of helplessly struggling without the friendlier text editor(s) you are missing.
This article provides a basic overview of StarWind Virtual SAN (VSAN), a software-defined storage (SDS) solution from StarWind.
First things first: it is important to understand what SDS is.
SDS is an umbrella term for software that enables policy-based provisioning of data storage independently of the underlying hardware. You can consider SDS a form of storage virtualization that separates the storage hardware from the software that manages it. On top of that, SDS may also provide a rich policy-managed feature set including such things as data deduplication, replication, thin provisioning, etc.
SDS allows you to design architectures where software (instead of hardware) determines storage performance, availability, and resiliency. Usually, SDS systems are designed to run on commodity hardware so that the software never becomes dependent on proprietary hardware. However, the software you use may still lock you in to a particular vendor.
There are different implementations of SDS from different vendors. They can be divided into solutions offered by OS vendors (or public cloud providers) and ones developed by vendors focused purely on SDS.
For example, Microsoft introduced its SDS solution, Storage Spaces Direct, as a Windows Server 2016 feature (this version reached RTM in September 2016). However, you can find flaws even in the latest versions of the Storage Spaces Direct technology (for instance, deduplication did not work on ReFS until the Windows Server 2019 release). Such issues may be a good reason why users opt for an alternative. Another drawback is that Storage Spaces Direct functionality is available only in the Datacenter edition of Windows Server, which means high licensing costs.
On the other hand, StarWind VSAN is an example of SDS software developed by an SDS-oriented company. First released in 2005, it was one of the first practical implementations of SDS built with simplicity in mind. Any experienced administrator of Microsoft Hyper-V, VMware vSphere, or Citrix XenServer can configure StarWind VSAN easily. StarWind VSAN also lets you leverage its full feature set with just two commodity servers as the foundation of highly available (HA) SDS. Although this software uses the services provided by Windows Server, you have a wider choice of versions and editions, i.e., you can run it on any edition of Windows Server 2012 or 2016 (there is even partial support for 2008 R2, which will probably end soon). As you can see, this specialized software found its way to the market earlier than Storage Spaces Direct, so its developers had more time to add improvements and refinements based on real-world usage and feedback from the client base.
You can have a full-fledged feature set including asynchronous replication, in-line and offline deduplication, log structuring, and multi-tiered caching even in a minimal configuration of a two-node VSAN cluster. These features are present in other software solutions, but those often do not allow the two-node implementation scenario.
StarWind Virtual SAN features
- Asynchronous replication copies mission-critical data to a remote disaster recovery (DR) site with minimal requirements for network bandwidth and hardware equipment, enabling you to replicate over long-distance, high-latency routes. Replication is performed asynchronously in the background using snapshots as a source. Deduplication, snapshots, and changed block tracking combine to minimize the amount of data transferred and reduce WAN link usage. Snapshots secure data integrity.
- In-line deduplication. Deduplication increases storage efficiency by saving space through the elimination of repeating data. StarWind in-line deduplication uses industry-standard 4k blocks. Combined with compression, it reduces the number of write operations, which extends flash life span.
- Log structuring write-back cache (LSWBC) optimizes the highly randomized data flow generated by VMs. Disk storage handles highly randomized writes poorly, and compensating with a RAM cache alone introduces a risk of data loss. Relying on SSDs alone is not always viable from a financial standpoint (overprovisioning can strain the budget). LSWBC uses RAM and flash caching in conjunction with log structuring. LSWBC writes data to a circular buffer in RAM, organizes its flow sequentially, and gradually flushes it to the log disk (a device that takes up a tiny fraction of your storage). The log disk, in turn, sends data to the underlying storage where it eventually resides. Hence, it is possible to get high performance even with highly randomized workloads.
- Server-side cache is a technology that turns an SSD into a level 2 cache. With server RAM used as a level 1 cache, it absorbs excessive writes and reduces the number of write cycles that wear down SSD drives. This means that you can use inexpensive commodity MLC flash instead of expensive SLC flash, which gives you more capacity to meet workload requirements.
- Multiprotocol: VSAN supports industry-standard uplink protocols. The following protocols are available: iSER, NVMe-oF, iSCSI, SMB3 (including RDMA-supporting SMB Direct and MPIO-utilizing SMB Multichannel), and NFS. Virtually unlimited use cases are possible: bare-metal, converged (“compute and storage separated”), hyperconverged, Cluster Shared Volumes for SOFS, VVols on top of iSCSI, SMB3 file servers, and many others.
In terms of supported fabrics, you can use 1, 10, and up to 200 GbE or InfiniBand.
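To make the in-line deduplication idea from the list above a bit more tangible, here is a minimal Python sketch of fixed-size block deduplication. It is an illustration only and not how StarWind VSAN implements it: the `deduplicate` and `rebuild` helpers are hypothetical names, and the use of SHA-256 as the block fingerprint is my own assumption. The only detail taken from the text is the 4k block size.

```python
import hashlib

BLOCK_SIZE = 4096  # 4 KiB blocks, matching the block size mentioned above


def deduplicate(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep each unique block once.

    Returns (store, recipe): store maps a SHA-256 digest to the block bytes,
    recipe is the ordered list of digests needed to rebuild the data.
    Hypothetical helper for illustration only.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only the first copy is stored
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


# Five logical blocks, but only two distinct block contents.
sample = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
store, recipe = deduplicate(sample)
print(len(recipe), "logical blocks,", len(store), "unique blocks stored")
```

Every repeated block costs only one more entry in the recipe instead of another 4 KiB write, which is where the space savings and the reduced number of writes come from.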
Such a feature set makes the StarWind VSAN proposition quite compelling and competitive. It is an interesting option, especially in terms of the design flexibility it provides and the variety of potential use cases.
I hope that this overview was useful and interesting. In case you want to know more about StarWind VSAN, you can get more information on the StarWind Virtual SAN product page.
Normally, I write the “technical how to” type of articles, but this one will be more of a product review/introduction (though I think even with this format we can go into technical details 😊). Relatively recently, StarWind released a free tool which allows you to measure the latency and bandwidth of RDMA connections (pay attention to the conjunction “and” here) and to do this in heterogeneous environments (meaning that you can measure Windows – Linux RDMA connection bandwidth and latency). This utility is called rPerf and can be downloaded for free from the StarWind website. To download it, you will need to fill in a little form with some of your data, but that’s not much to pay for a good tool, right?
I will allow myself to write a little bit about what RDMA is so that we are clear on what we are going to measure with this utility 😊 (though this technology is a huge topic in its own right which calls for a lot of reading to fully understand it). Next, we will touch a little bit on what rPerf can do for you and, even more briefly, on how to use it (just because it is straightforward and easy).
What is RDMA? RDMA, or Remote Direct Memory Access, is a technology which enables direct access from the memory of one computer to the memory of another, bypassing the OS data buffers of both computers (meaning it all happens at the hardware level through device drivers). That type of access gives you high-throughput, low-latency networking, which is something you really need for massively parallel computing clusters. RDMA-enabled data transfers do not put extra load on CPUs or caches and do not cause context switches, allowing your data transfers to continue in parallel with other system tasks. One example of a practical use case is Hyper-V live migration: there is a YouTube video from Mellanox comparing live migration performance with RDMA vs. TCP (and it shows an impressive result of 29 seconds vs. 2 hours).
RDMA read and write requests are delivered directly to the network, allowing for fast message transfer and reduced latency. However, this also introduces certain problems of one-sided communication, where the target node is not notified about the completion of the request (you may want to read up more on this to really understand the technology).
How can you get it? RDMA implementations require both hardware support (NIC) and software support (API and drivers). Currently, several varieties of RDMA implementations exist: Virtual Interface Architecture, RoCE (RDMA over Converged Ethernet), InfiniBand, Omni-Path, and iWARP.
All in all, you will most likely find RDMA capability in high-end servers (you need to make sure that you have a NIC that supports RDMA, something from Broadcom, Cavium, or Mellanox Technologies) and in HPC-type Microsoft Azure VMs (H16r, H16mr, A8 and A9, and some N-series sizes with “r” in their name too).
What can you do with rPerf? You can measure RDMA link performance between RDMA-enabled hosts. The rPerf tool is a CLI utility which has to be run on both machines: one of them running as a server and the other as a client. On the client machine, you specify the number of read iterations, the buffer size, and the queue depth to start testing. Once the test completes, you get throughput in MiB/s and kIOPS along with latency information in microseconds (minimum/maximum/average).
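To show how these reported numbers relate to the test parameters, here is a small Python sketch of the arithmetic. It is not rPerf's code or its output format: `summarize_run`, its parameters, and the sample values are all hypothetical and only illustrate how throughput in MiB/s, kIOPS, and minimum/maximum/average latency follow from the iteration count, the buffer size, and the measured times.

```python
def summarize_run(iterations, buffer_size_bytes, elapsed_seconds, latencies_us):
    """Derive summary metrics for one test run (hypothetical helper).

    iterations        -- number of completed read operations
    buffer_size_bytes -- size of each transfer in bytes
    elapsed_seconds   -- wall-clock duration of the whole run
    latencies_us      -- per-operation latencies in microseconds
    """
    total_bytes = iterations * buffer_size_bytes
    return {
        # 1 MiB = 1024 * 1024 bytes
        "throughput_mib_s": total_bytes / (1024 * 1024) / elapsed_seconds,
        # kIOPS = thousands of I/O operations per second
        "kiops": iterations / elapsed_seconds / 1000,
        "latency_min_us": min(latencies_us),
        "latency_max_us": max(latencies_us),
        "latency_avg_us": sum(latencies_us) / len(latencies_us),
    }


# Made-up numbers: 100,000 reads of 4 KiB each, completed in 2 seconds.
result = summarize_run(
    iterations=100_000,
    buffer_size_bytes=4096,
    elapsed_seconds=2.0,
    latencies_us=[10.0, 20.0, 30.0],
)
for name, value in result.items():
    print(f"{name}: {value}")
```

Note that throughput and kIOPS are tied together by the buffer size: with a fixed buffer, doubling one doubles the other, so it is worth testing several buffer sizes rather than a single one.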
I’ve already mentioned that one of the strong points of this tool is its ability to work cross-platform. OS-wise, it supports Windows 7/Server 2012 or newer, CentOS 7, and Ubuntu. A Windows-based OS must have Network Direct Provider v1 and lossless RDMA configured. Keep in mind that the latest drivers from the NIC manufacturer are recommended, as the standard Windows drivers don’t have ND API support. In the case of a Linux-based OS, you will need the latest network drivers with RDMA and RoCE support.
All the command switches you need are well documented in the technical paper dedicated to this tool on the StarWind site, so I won’t dwell on them. I would say that the best thing is to try this tool in your own RDMA-enabled environments.
Having real numbers comes in really handy in several scenarios: when you set up your cluster and need to find out which mix of technologies gives you the best latency; when you need to check whether your setup meets the workload requirements outlined by an application vendor; or (and this is the most frequently forgotten one) when you need to establish baseline performance numbers for your environment so that you can compare against them once your setup receives a higher load or service consumers report degraded performance. With rPerf, you can cover at least one part of writing your performance baseline documentation. Having firm numbers for RDMA connection performance also serves well for verifying/auditing that performance in any other scenario, and with rPerf you can do it with one simple cross-platform tool.
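As a rough illustration of the baseline idea, here is a short Python sketch that compares a fresh set of measurements against recorded baseline numbers and flags anything that got noticeably worse. Everything in it is an assumption of mine: the metric names, the 10% tolerance, and the sample values are made up, and you would enter the numbers from your own test runs by hand.

```python
# Which direction counts as "better" for each (hypothetical) metric name.
HIGHER_IS_BETTER = {
    "throughput_mib_s": True,
    "kiops": True,
    "latency_avg_us": False,
}


def find_regressions(baseline, current, tolerance=0.10):
    """Return {metric: percent change} for metrics that got worse by more
    than `tolerance` (a fraction, 0.10 = 10%) relative to the baseline."""
    regressions = {}
    for metric, higher_is_better in HIGHER_IS_BETTER.items():
        base, now = baseline[metric], current[metric]
        change = (now - base) / base
        # A drop is bad for throughput/kIOPS, a rise is bad for latency.
        worse_by = -change if higher_is_better else change
        if worse_by > tolerance:
            regressions[metric] = round(change * 100, 1)
    return regressions


# Made-up example values, not real measurements.
baseline = {"throughput_mib_s": 1000.0, "kiops": 250.0, "latency_avg_us": 20.0}
current = {"throughput_mib_s": 950.0, "kiops": 200.0, "latency_avg_us": 26.0}

for metric, percent in find_regressions(baseline, current).items():
    print(f"{metric} changed by {percent}% versus the baseline")
```

With these sample values, the 5% throughput drop stays within the tolerance, while the kIOPS drop and the latency increase are reported.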