InfiniBand networking is quite awesome. It's mainly used for two reasons:

- low latency
- high bandwidth

As a home user, I'm mainly interested in setting up a high bandwidth link between two servers.

I was using quad-port network cards with Linux Bonding, but this solution has some downsides:

- you can only go to 4 Gbit with Linux bonding (or you need more ports)
- you need a lot of cabling
- it is similar in price to InfiniBand

So I've decided to take a gamble on some InfiniBand gear. You only need InfiniBand PCIe network cards and a cable.

1 x SFF-8470 CX4 cable: $16
2 x MELLANOX DUAL-PORT INFINIBAND HOST CHANNEL ADAPTER MHGA28-XTC: $25 each
Total: $66

I find $66 quite cheap for 20 Gbit networking. Regular 10 Gbit Ethernet networking is often still more expensive than using older InfiniBand cards.

InfiniBand is similar to Ethernet: you can run your own protocol over it (for lower latency), but you can also use IP over InfiniBand. The InfiniBand card will just show up as a regular network device (one per port).

ib0       Link encap:UNSPEC  HWaddr 80-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:10.0.2.3  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::202:c902:29:8e01/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:7988691 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17853128 errors:0 dropped:10 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:590717840 (563.3 MiB)  TX bytes:1074521257501 (1000.7 GiB)

Configuration

I've followed these instructions to get IP over InfiniBand working.

Modules

First, you need to ensure that at least the following modules are loaded:

ib_mthca
ib_ipoib

I only had to add the ib_ipoib module to /etc/modules. As soon as this module is loaded, you will notice you have some ibX interfaces available, which can be configured like regular Ethernet cards.
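To check quickly whether both modules are present, something like the following could be used. It is only a sketch: the function name missing_ib_modules is made up for this example, and the optional file argument exists purely so the check can be tried against a saved copy of /proc/modules.

```shell
#!/bin/sh
# Print the name of each required InfiniBand module that is NOT loaded.
# missing_ib_modules is a hypothetical helper, not part of any package.
# $1 (optional): file to inspect, defaults to the live /proc/modules.
missing_ib_modules() {
    modules_file="${1:-/proc/modules}"
    for mod in ib_mthca ib_ipoib; do
        # Each line of /proc/modules starts with the module name and a space.
        if ! grep -q "^${mod} " "$modules_file"; then
            echo "$mod"
        fi
    done
}

# Example (as root, not run here): load whatever is missing.
# for mod in $(missing_ib_modules); do modprobe "$mod"; done
```

If the function prints nothing, both modules are already loaded.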

Subnet manager

In addition to loading the modules, you also need a subnet manager. You just need to install it like this:

apt-get install opensm

This service needs to run on just one of the endpoints.

Link status

If you want, you can check the link status of your InfiniBand connection like this:

# ibstat
CA 'mthca0'
    CA type: MT25208
    Number of ports: 2
    Firmware version: 5.3.0
    Hardware version: 20
    Node GUID: 0x0002c90200298e00
    System image GUID: 0x0002c90200298e03
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 20
        Base lid: 1
        LMC: 0
        SM lid: 2
        Capability mask: 0x02510a68
        Port GUID: 0x0002c90200298e01
        Link layer: InfiniBand
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 10
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x02510a68
        Port GUID: 0x0002c90200298e02
        Link layer: InfiniBand
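The ibstat output is fairly long, so a small filter can help when you only care about the port state. The following is a rough sketch that assumes the field names shown in the output above; ib_port_summary is a name invented for this example.

```shell
#!/bin/sh
# Condense ibstat-style text from stdin to one line per port:
#   <port number> <state> <physical state> <rate>
# ib_port_summary is a hypothetical helper that relies on the field
# order State, Physical state, Rate as seen in the ibstat output.
ib_port_summary() {
    awk '
        /^[ \t]*Port [0-9]+:/     { sub(/:/, "", $2); port = $2 }
        /^[ \t]*State:/           { state = $2 }
        /^[ \t]*Physical state:/  { phys = $3 }
        /^[ \t]*Rate:/            { print port, state, phys, $2 }
    '
}

# Usage (on a machine with the card installed):
#   ibstat | ib_port_summary
```

On the setup described here, port 1 should show up as Active with LinkUp once opensm is running on one of the endpoints, while the unused port 2 stays Down.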

Set mode and MTU

Since my systems run Debian Linux, I've configured /etc/network/interfaces like this:

auto ib0
iface ib0 inet static
    address 10.0.2.2
    netmask 255.255.255.0
    mtu 65520
    pre-up echo connected > /sys/class/net/ib0/mode

Please take note of the 'mode' setting. The 'datagram' mode gave abysmal network performance (less than 1 Gbit/s). The 'connected' mode made everything perform acceptably.

The MTU setting of 65520 improved performance by another 30 percent.
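Since both settings matter so much, it may be worth verifying that they were actually applied after a reboot. A minimal sketch, assuming the mode file mentioned above and the standard mtu file under /sys/class/net; check_ipoib and its optional second argument are invented for this example.

```shell
#!/bin/sh
# Check that an IPoIB interface is in connected mode with MTU 65520.
# check_ipoib is a hypothetical helper.
# $1: interface name, e.g. ib0
# $2 (optional): sysfs directory, defaults to /sys/class/net; it is a
#                parameter only so the check can run on a test directory.
check_ipoib() {
    iface="$1"
    sysfs_root="${2:-/sys/class/net}"
    if [ ! -r "$sysfs_root/$iface/mode" ]; then
        echo "no IPoIB interface named $iface" >&2
        return 2
    fi
    mode=$(cat "$sysfs_root/$iface/mode")
    mtu=$(cat "$sysfs_root/$iface/mtu")
    if [ "$mode" = "connected" ] && [ "$mtu" -eq 65520 ]; then
        echo "$iface ok: mode=$mode mtu=$mtu"
    else
        echo "$iface needs attention: mode=$mode mtu=$mtu"
        return 1
    fi
}

# Usage: check_ipoib ib0
```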

Performance

I've tested the card on two systems based on the Supermicro X9SCM-F motherboard. Using these systems, I was able to achieve file transfer speeds of up to 750 MB/s (megabytes per second), and about 6.5 Gbit/s as measured with iperf.

~# iperf -c 10.0.2.2
------------------------------------------------------------
Client connecting to 10.0.2.2, TCP port 5001
TCP window size: 2.50 MByte (default)
------------------------------------------------------------
[  3] local 10.0.2.3 port 40098 connected with 10.0.2.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  7.49 GBytes  6.43 Gbits/sec

Similar test with netcat and dd:

~# dd if=/dev/zero bs=1M count=100000 | nc 10.0.2.2 1234
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB) copied, 128.882 s, 814 MB/s
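dd reports megabytes per second while iperf reports gigabits per second, so the two results are easier to compare after a conversion. The helper below is just an illustration (bytes_to_gbit is not an existing tool); it uses decimal units, i.e. 1 Gbit = 1,000,000,000 bits.

```shell
#!/bin/sh
# Convert a byte count and a duration in seconds to decimal Gbit/s.
# bytes_to_gbit is a hypothetical helper for comparing dd and iperf figures.
bytes_to_gbit() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", bytes * 8 / secs / 1000000000 }'
}

# The dd | nc run above: 104857600000 bytes in 128.882 s
bytes_to_gbit 104857600000 128.882   # prints 6.51
```

So the 814 MB/s from dd corresponds to roughly 6.5 Gbit/s, which is in the same range as the 6.43 Gbit/s reported by iperf.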

Testing was done on Debian Jessie.

During earlier testing, I also used these cards in HP ProLiant MicroServer Gen8 servers. On those servers, I was running Ubuntu 16.04 LTS.

As tested on Ubuntu with the HP Microserver:

------------------------------------------------------------
Client connecting to 10.0.4.3, TCP port 5001
TCP window size: 4.00 MByte (default)
------------------------------------------------------------
[  5] local 10.0.4.1 port 52572 connected with 10.0.4.3 port 5001
[  4] local 10.0.4.1 port 5001 connected with 10.0.4.3 port 44124
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-60.0 sec  71.9 GBytes  10.3 Gbits/sec
[  4]  0.0-60.0 sec  72.2 GBytes  10.3 Gbits/sec

Using these systems, I was eventually able to achieve 15 Gbit/s as measured with iperf, although I have no 'console screenshot' of it.
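To put 15 Gbit/s in perspective, it helps to compare it with what the link can carry at best. The sketch below rests on an assumption that is not verified in this article: that these cards use 8b/10b line encoding, as InfiniBand links of this generation generally do, so a 20 Gbit/s link carries at most 16 Gbit/s of data. The function name link_efficiency is invented for the example.

```shell
#!/bin/sh
# Percentage of the usable data rate that a measurement reached.
# ASSUMPTION: 8b/10b encoding, i.e. 8 data bits per 10 signalled bits.
# link_efficiency is a hypothetical helper.
# $1: measured throughput in Gbit/s, $2: signalling rate in Gbit/s
link_efficiency() {
    awk -v measured="$1" -v signalling="$2" 'BEGIN {
        usable = signalling * 8 / 10
        printf "%.0f%% of %.0f Gbit/s usable\n", measured / usable * 100, usable
    }'
}

link_efficiency 15 20     # the best result on the HP servers
link_efficiency 6.43 20   # the Supermicro iperf run
```

Under that assumption, 15 Gbit/s is already close to the ceiling of the link, while the Supermicro result leaves a lot of headroom.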

Closing words

IP over InfiniBand seems to be a nice way to get high-performance networking on the cheap. The main downside is that when using IP over IB, CPU usage will be high.

Another thing I have not researched, but which could be of interest, is running NFS or other protocols directly over InfiniBand using RDMA, which would bypass the overhead of IP.