
Normally I write “technical how-to” articles, but this one is more of a product review/introduction (though I think even with this format we can go into technical details 😊). Relatively recently, StarWind released a free tool that lets you measure the latency and bandwidth of RDMA connections (note the conjunction “and” here) and do so in heterogeneous environments (meaning that you can measure Windows – Linux RDMA connection bandwidth and latency). The utility is called rPerf and can be downloaded for free from the StarWind website. To download it, you will need to fill in a short form with some of your data, but that’s not much to pay for a good tool, right?
I will allow myself to write a little about what RDMA is, so that we are clear on what we are going to measure with this utility 😊 (though this technology is a huge topic in its own right and calls for a lot of reading to fully understand it). Next, we will touch on what rPerf can do for you and, even more briefly, on how to use it (simply because it is straightforward and easy).
What is RDMA? RDMA, or Remote Direct Memory Access, is a technology that enables direct access from the memory of one computer to the memory of another, bypassing the OS data buffers of both computers (meaning it all happens at the hardware level through device drivers). That type of access gives you high-throughput, low-latency networking, which is something you really need for massively parallel computing clusters. RDMA data transfers put no extra load on CPUs or caches and require no context switches, so they can continue in parallel with other system tasks. One practical use case is Hyper-V live migration: there is a YouTube video from Mellanox comparing live migration performance with RDMA vs. TCP (and it shows an impressive result of 29 seconds vs. 2 hours).
RDMA read and write requests are delivered directly to the network, which allows fast message transfer and reduced latency, but this also introduces certain problems of one-sided communication, where the target node is not notified about the completion of the request (you may want to read up more on this to really understand the technology).
How can you get it? RDMA requires both hardware support (NIC) and software support (API and drivers), and several varieties of RDMA implementation currently exist: Virtual Interface Architecture, RoCE (RDMA over Converged Ethernet), InfiniBand, Omni-Path and iWARP.
All in all, you will most likely find RDMA capability in high-end servers (make sure you have a NIC that supports RDMA, something from Broadcom, Cavium or Mellanox Technologies) and in HPC-type Microsoft Azure VMs (H16r, H16mr, A8 and A9, and some N-series sizes with “r” in their name too).
What can you do with rPerf? You can measure RDMA link performance between RDMA-enabled hosts. rPerf is a CLI utility that has to be run on both machines: one runs as a server and the other as a client. On the client machine you specify the number of read iterations, the buffer size and the queue depth to start testing, and once the test completes you get throughput in MiB/s and kIOPS along with latency information in microseconds (minimum/maximum/average).
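To make those output units a bit more tangible, here is a minimal Python sketch of how such figures relate to the test parameters. This is not rPerf's code or its output format, just the plain arithmetic; the names `LinkStats` and `summarize`, and the sample numbers, are my own assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

MIB = 1024 * 1024  # bytes in one mebibyte


@dataclass
class LinkStats:
    """Hypothetical container for the kind of figures a link test reports."""
    throughput_mib_s: float
    kiops: float
    latency_min_us: float
    latency_max_us: float
    latency_avg_us: float


def summarize(iterations: int, buffer_size_bytes: int, elapsed_s: float,
              latencies_us: Sequence[float]) -> LinkStats:
    """Derive throughput, kIOPS and latency stats from raw test numbers."""
    if iterations <= 0 or buffer_size_bytes <= 0 or elapsed_s <= 0:
        raise ValueError("iterations, buffer size and elapsed time must be positive")
    if not latencies_us:
        raise ValueError("at least one latency sample is required")
    total_bytes = iterations * buffer_size_bytes
    return LinkStats(
        throughput_mib_s=total_bytes / MIB / elapsed_s,  # MiB moved per second
        kiops=iterations / elapsed_s / 1000,             # thousands of operations per second
        latency_min_us=min(latencies_us),
        latency_max_us=max(latencies_us),
        latency_avg_us=sum(latencies_us) / len(latencies_us),
    )


if __name__ == "__main__":
    # Made-up sample: 100,000 reads of 4 KiB finished in half a second.
    print(summarize(100_000, 4096, 0.5, [4.0, 5.0, 9.0]))
```

With these made-up inputs, the same 100,000 reads show up as 781.25 MiB/s and 200 kIOPS: two views of one measurement, linked by the buffer size.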
I’ve already mentioned that one of the strong points of this tool is its ability to work cross-platform. OS-wise, it supports Windows 7/Server 2012 or newer, CentOS 7 and Ubuntu. A Windows-based OS must have Network Direct Provider v1 and lossless RDMA configured. Keep in mind that the latest drivers from the NIC manufacturer are recommended, as the standard Windows drivers don’t have ND API support. On a Linux-based OS, you will need the latest network drivers with RDMA and RoCE support.
All the command switches you need are well documented in the technical paper dedicated to this tool on the StarWind site, so I won’t dwell on them; I would say the best thing is to try the tool in your own RDMA-enabled environments.
Having real numbers comes in really handy in several scenarios: when you set up your cluster and need to find out which mix of technologies gives you the best latency; when you need to check whether your setup meets the requirements of your workload or the demands outlined by an application vendor; or (and this is the most frequently forgotten one) when you need to establish baseline performance numbers for your environment, so that you can compare against them once your setup receives higher load or service consumers report degraded performance. With rPerf, you can cover at least one part of your performance baseline documentation. Firm numbers for RDMA connection performance also serve well for verifying or auditing that performance in any other scenario, and with rPerf you can get them with one simple cross-platform tool.
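As a rough illustration of the baseline idea, here is a small Python sketch that compares a fresh set of measurements against stored baseline numbers and flags anything that got worse by more than a chosen tolerance. The metric names, the 10% tolerance and the sample values are all assumptions of mine, not something rPerf produces in this form; you would fill in the numbers by hand or with your own parsing.

```python
# Whether a larger value is an improvement for each (hypothetical) metric name.
HIGHER_IS_BETTER = {
    "throughput_mib_s": True,
    "kiops": True,
    "latency_avg_us": False,
}


def find_regressions(baseline: dict, current: dict, tolerance: float = 0.10) -> dict:
    """Return {metric: percent change} for metrics that worsened beyond tolerance."""
    regressions = {}
    for metric, higher_is_better in HIGHER_IS_BETTER.items():
        base = baseline[metric]
        now = current[metric]
        if base <= 0:
            raise ValueError(f"baseline for {metric} must be positive")
        change = (now - base) / base  # relative change vs. baseline
        # A drop is bad for throughput/kIOPS; a rise is bad for latency.
        worsening = -change if higher_is_better else change
        if worsening > tolerance:
            regressions[metric] = round(change * 100, 1)
    return regressions


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    baseline = {"throughput_mib_s": 1000.0, "kiops": 250.0, "latency_avg_us": 5.0}
    current = {"throughput_mib_s": 950.0, "kiops": 200.0, "latency_avg_us": 6.0}
    print(find_regressions(baseline, current))
```

Here the 5% throughput drop stays within tolerance, while the 20% drop in kIOPS and the 20% rise in average latency are reported. Keeping the "which direction is worse" knowledge in one table means you do not have to remember it every time someone reports degraded performance.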