There’s a bunch of howtos out there on how to set up VFIO and pass a GPU through to a VM. I found this tutorial (and other articles in the series) especially helpful. I didn’t follow it to the letter, but it served as a good checklist of what needs to be set up.

In my case I decided to keep things simple and I ended up with these kernel parameters in /etc/default/grub:
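The details will depend on your distribution, but the relevant line ends up looking roughly like the sketch below; the PCIe IDs are placeholders and the exact spelling of some options (for example how nouveau gets blacklisted) may differ on your setup:

```
# /etc/default/grub (illustrative sketch, substitute your own PCIe IDs)
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 modprobe.blacklist=nouveau vfio-pci.ids=xxxx:xxxx,xxxx:xxxx hugepages=12400"
```

After editing the file, regenerate the GRUB config (update-grub on Debian/Ubuntu, grub2-mkconfig -o /boot/grub2/grub.cfg elsewhere) and reboot for the parameters to take effect.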

Note the PCIe IDs near the end of the lines. We will use those later.

These were added over a couple of reboots and the details will be specific to your hardware, but let me explain:

Enable IOMMU. This is the mechanism that will later allow us to pass a PCIe device (the GPU) to a VM. You need to do this first to see how the IOMMU groups work out on your specific hardware. This is Xeon-based, server-grade hardware, so the grouping should actually be quite good.

And it really was. This script will show you the group mapping on your server:
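A commonly used variant of such a script simply walks /sys/kernel/iommu_groups and prints each group with its devices; something along these lines:

```bash
#!/bin/bash
# Print every IOMMU group together with the PCI devices it contains.
shopt -s nullglob
for group in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${group##*/}:"
    for device in "${group}"/devices/*; do
        # lspci -nns prints the device name plus its [vendor:device] ID
        echo -e "\t$(lspci -nns "${device##*/}")"
    done
done
```

Ideally the GTX 1050 and its HDMI audio function end up in a group of their own; whatever else shares a group with them has to be handed over to the VM (or left unused) together.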

I had to use this (the allow_unsafe_interrupts option of the vfio_iommu_type1 module), because my hardware wouldn’t support VFIO-based device assignment without it. It is sort of unsafe and allows certain types of attacks using MSI-based interrupts, so do not use it unless the pass-through fails and the error message explicitly tells you to enable this feature. I’m only running VMs that I personally trust, so this is not a huge concern for me, but if you plan to use the server to host VMs whose owners do not trust each other, definitely avoid it.
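If you’re not sure whether your platform supports interrupt remapping, the kernel log will tell you; something along these lines works:

```bash
# Look for interrupt remapping messages in the kernel log. The exact wording
# varies, but you want to see remapping "enabled" rather than "disabled" or
# "not supported". If it is enabled, the unsafe option should not be needed.
dmesg | grep -iE 'remapping|dmar|amd-vi'
```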

Normally nouveau (the open-source Nvidia GPU driver) will grab the GPU during boot so that the host system can use it. In my case I planned to use the on-board Matrox card as the primary VGA card for the host, so to avoid nouveau taking over the device, I just blacklisted it. (This would prevent any other Nvidia GPU from working, but I only have the GTX 1050 in the server, so that’s fine.)
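Blacklisting can be done either from the kernel command line or with a modprobe config file; roughly like this:

```bash
# Option 1: add this to the kernel command line in /etc/default/grub:
#   modprobe.blacklist=nouveau
# Option 2: create a modprobe config file and rebuild the initramfs:
cat > /etc/modprobe.d/blacklist-nouveau.conf <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF
update-initramfs -u   # dracut -f on Fedora/RHEL-like systems
```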

Configure the vfio-pci module to take over the GPU and the audio card (both devices are physically on the GTX 1050 card). This will later let you hand the device over to a VM.

The IDs are from the lspci listing above and might differ on different hardware.
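If you need to look the IDs up, something like this will do; the [vendor:device] pairs in square brackets at the end of each line are what goes into vfio-pci.ids:

```bash
# List the Nvidia functions with their numeric IDs. Expect two entries,
# the VGA controller and its HDMI audio device, looking roughly like:
#   xx:00.0 VGA compatible controller [0300]: NVIDIA ... [10de:xxxx]
#   xx:00.1 Audio device [0403]: NVIDIA ... [10de:xxxx]
lspci -nn | grep -i nvidia
```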

hugepages=12400

This is totally optional, but I recommend it to squeeze a bit more performance out of your hardware. With this I dedicate a portion of RAM to hugepage allocation. KVM can then use it to allocate memory for the VM in a more efficient manner. Note that this memory will no longer be usable by the rest of the system. R710 RAM is generally quite cheap these days, so for me it just makes sense to dedicate a couple of gigabytes to a VM.
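For the VM to actually use the reserved pages, hugetlbfs has to be mounted and the guest has to be configured with hugepage backing; roughly:

```bash
# Make sure hugetlbfs is mounted (most distributions do this automatically):
mount | grep -q hugetlbfs || mount -t hugetlbfs hugetlbfs /dev/hugepages
# Then enable hugepage backing for the guest. With libvirt this means adding
#   <memoryBacking><hugepages/></memoryBacking>
# to the domain XML (virsh edit <domain>); with plain QEMU you would pass
#   -mem-path /dev/hugepages
```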

Note that I’m using 2MB page size and I’m telling the kernel to allocate 12400 pages, which amounts to about 24GB of RAM. I only need 12GB for my VM, so why allocate more?

The R710 has two CPUs (you can run it with one, but why would you buy a server with two CPU sockets?) and each CPU has its own "allocated" memory. Both CPUs can still access the entire RAM, but accessing RAM slots that are not local to the CPU (that are not part of the same NUMA node) comes with a performance hit. To avoid this hit, I’m doing two "tricks":

I’m pinning the virtual CPU cores to physical cores on only one of the CPUs (this way the entire VM runs in the same NUMA node).

I’m allocating the hugepages in the same NUMA node to make sure the VM’s memory access won’t suffer the cross-node performance degradation. (A sketch of how to apply both tricks with virsh follows below.)
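Both tricks can be set up either in the domain XML (cputune and numatune) or from the shell. A sketch with virsh, assuming a domain called win10, a 6-vCPU guest, and that cores 0 to 5 sit on the first socket (check lscpu -e for your numbering), might look like this:

```bash
# Keep the guest's memory on NUMA node 0, where the hugepages live...
virsh numatune win10 --nodeset 0 --mode strict --config
# ...and pin each virtual CPU to a physical core on the first socket.
for vcpu in 0 1 2 3 4 5; do
    virsh vcpupin win10 "$vcpu" "$vcpu" --config
done
```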

Now the problem with hugepage allocation via kernel parameters is that (as far as I could find) it won’t let you specify in which NUMA node the memory should be allocated. I could allocate the memory after boot just with echo 6200 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages, but because hugepages require contiguous unallocated chunks of memory, chances are that, even very early into the boot, there might not be enough contiguous free memory left to allocate them.

Luckily, the kernel will allocate the memory evenly across all NUMA nodes and there is enough spare memory during boot (my system has 48GB of RAM). So what I ended up doing is allocating double the amount and then dropping the hugepages from the NUMA node I won’t use, right after boot:

/etc/systemd/system/drop-hugepages.service:

```
[Unit]
Description=Drop numa node1 hugepages
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'echo 0 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages'

[Install]
WantedBy=multi-user.target
```
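Don’t forget to enable the unit so it actually runs on every boot:

```bash
systemctl daemon-reload
systemctl enable drop-hugepages.service
```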

This works rather well and I end up with 12GB of usable hugepages in NUMA node0:
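A quick way to check the per-node split; with the numbers above I’d expect roughly 6200 pages (about 12GB of 2MB pages) on node0 and 0 on node1:

```bash
# Hugepages reserved per NUMA node (2MB page size).
grep . /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
```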