Choosing the components

A sensible budget for me would be about two years' worth of my current compute spending. At $70/month for AWS, that put the total at around $1700 for the whole thing.

You can check out all the components used. PC Part Picker is also really helpful in detecting components that don't play well together.

GPU

The GPU is the most crucial component in the box: it's what trains these deep networks fast, shortening the feedback cycle.

The GPU is important because: a) most calculations in deep learning are matrix operations, like matrix multiplication, which can be slow on a CPU; and b) a typical neural network performs thousands of these operations, so the slowness really adds up (as we will see in the benchmarks later). GPUs, rather conveniently, run these operations in parallel. They have a large number of cores, which can run an even larger number of threads, and much higher memory bandwidth, which lets them apply these parallel operations to a lot of data at once.
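To make that concrete, here is a quick sketch (the function names are mine, not from any library) timing a naive pure-Python triple-loop matrix multiply against NumPy's vectorized one. A GPU takes the same idea of doing many multiply-adds in parallel much further:

```python
import time
import numpy as np

def matmul_naive(a, b):
    """Triple-loop matrix multiply in pure Python: one scalar op at a time."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += a[i][p] * b[p][j]
            out[i][j] = s
    return out

n = 128
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
slow = matmul_naive(a.tolist(), b.tolist())
t_naive = time.perf_counter() - t0

t0 = time.perf_counter()
fast = a @ b  # vectorized BLAS call; a GPU parallelizes this even harder
t_blas = time.perf_counter() - t0

print(f"naive: {t_naive:.3f}s  vectorized: {t_blas:.6f}s")
```

Even on a CPU, the vectorized version wins by orders of magnitude; stacking thousands of such operations per training step is why the GPU choice dominates everything else.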

Disclosure: The following are affiliate links, to help me pay for, well, more GPUs.

The choice is between a few of Nvidia’s cards: GTX 1070, GTX 1070 Ti, GTX 1080, GTX 1080 Ti and finally the Titan X. The prices might fluctuate, especially because some GPUs are great for cryptocurrency mining (wink, 1070, wink).

On the performance side: the GTX 1080 Ti and the Titan X are similar. Roughly speaking, the GTX 1080 is about 25% faster than the GTX 1070, and the GTX 1080 Ti is about 30% faster than the GTX 1080. The new GTX 1070 Ti is very close in performance to the GTX 1080.

Tim Dettmers has a great article on picking a GPU for Deep Learning, which he regularly updates as new cards come on the market.

Here are the things to consider when picking a GPU:

Maker: No contest on this one: get Nvidia. They have been focusing on Machine Learning for a number of years now, and it's paying off. Their CUDA toolkit is entrenched so deeply that it is effectively the only choice for the DL practitioner. (I wish AMD would pick up their game on this. AMD cards are cheaper and do half-precision compute at full speed.)

Budget: The Titan X gets a really bad mark here, as it offers the same performance as the 1080 Ti for about $500-$700 more. It used to be the case that you could do full-speed half-precision (fp16) on the old, Maxwell-based Titan X, effectively doubling your GPU memory, but not on the new one.

One or multiple: I considered picking a couple of 1070s (or, currently, 1070 Tis) instead of a 1080 or 1080 Ti. This would have allowed me to either train one model on two cards or train two models at once. Training a model on multiple cards is currently a bit of a hassle, though things are changing, with PyTorch and Caffe2 offering almost linear scaling with the number of GPUs. The other option, training two models at once, seemed to have more value, but I decided to get a single more powerful card now and add a second one later.

Memory: More memory is better. With more memory we can deploy bigger models and use a sufficiently large batch size during training (which helps the gradient flow).

Memory bandwidth: This lets the GPU move large amounts of data through its memory quickly, keeping all those parallel cores fed. Tim Dettmers points out that this is the most important characteristic of a GPU.

Considering all of this, I picked the GTX 1080 Ti, mainly for the training speed boost. I plan to add a second 1080 Ti soonish.

CPU

Even though the GPU is the MVP in deep learning, the CPU still matters. For example, data preparation is usually done on the CPU, and the number of cores and threads per core matters if we want to parallelize all that data prep.
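A minimal sketch of that kind of parallel data prep, with a toy `preprocess` stand-in (the real work would be decoding, resizing, and normalizing images; the names here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def preprocess(sample):
    # Stand-in for real data prep work: decode, resize, normalize, etc.
    return [x / 255.0 for x in sample]

# A toy "dataset" of 10 samples with 3 values each.
samples = [[i, i + 1, i + 2] for i in range(0, 30, 3)]

# Fan the work out across workers, roughly one per core. For CPU-bound
# pure-Python code you would use processes instead of threads, but common
# prep libraries (Pillow, NumPy) release the GIL, so threads often suffice.
with ThreadPoolExecutor(max_workers=4) as pool:
    batch = list(pool.map(preprocess, samples))

print(len(batch))
```

With few cores, these workers become the bottleneck and the GPU sits idle waiting for batches, which is why the core count is worth a line on the spec sheet.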

To stay on budget, I picked a mid-range CPU, the Intel i5 7500. It’s relatively cheap but good enough to not slow things down.

Edit: As a few people have pointed out: "probably the biggest gotcha that is unique to DL/multi-GPU is to pay attention to the PCIe lanes supported by the CPU/motherboard" (Andrej Karpathy). We want each GPU to have 16 PCIe lanes so it can eat data as fast as possible (16 GB/s for PCIe 3.0). This means that for two cards we need 32 PCIe lanes. However, the CPU I picked has only 16 lanes, so two GPUs would run in 2x8 mode instead of 2x16. This might be a bottleneck, leading to less than ideal utilization of the graphics cards. Thus a CPU with 40 lanes is recommended.

Edit 2: However, Tim Dettmers points out that having 8 lanes per card should only decrease performance by “0–10%” for two GPUs. So currently, my recommendation is: Go with 16 PCIe lanes per video card unless it gets too expensive for you. Otherwise, 8 lanes should do as well.
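The lane arithmetic works out like this (the helper function is mine for illustration, and the ~985 MB/s per-lane figure for PCIe 3.0 after encoding overhead is my assumption):

```python
MB_PER_LANE = 985  # approx. PCIe 3.0 throughput per lane, MB/s

def per_gpu_bandwidth_gb(total_cpu_lanes, num_gpus):
    """Rough per-card bandwidth in GB/s, capped at an x16 slot."""
    lanes_per_gpu = min(16, total_cpu_lanes // num_gpus)
    return lanes_per_gpu * MB_PER_LANE / 1000

# A 16-lane CPU (like the i5 7500) with two GPUs: 2x8 mode.
print(per_gpu_bandwidth_gb(16, 2))  # ~7.9 GB/s per card

# A 40-lane CPU (like a Xeon E5-1620 v4) with two GPUs: 2x16 mode.
print(per_gpu_bandwidth_gb(40, 2))  # ~15.8 GB/s per card
```

So 2x8 halves the per-card transfer bandwidth on paper, but per Tim Dettmers the end-to-end training hit is only around 0-10%, since the GPU spends most of its time computing, not transferring.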

A good choice for a dual-GPU machine would be an Intel Xeon processor like the E5-1620 v4 (40 PCIe lanes). Or, if you want to splurge, go for a higher-end desktop processor like the i7-6850K.

Memory (RAM)

It's nice to have a lot of memory when working with rather big datasets. I got 2 sticks of 16 GB, for a total of 32 GB of RAM, and plan to buy another 32 GB later.

Disk

Following Jeremy Howard’s advice, I got a fast SSD disk to keep my OS and current data on, and then a slow spinning HDD for those huge datasets (like ImageNet).

SSD: I remember how blown away I was by the SSD speed when I got my first Macbook Air years ago. To my delight, a new generation of SSDs, called NVMe, has made its way to market in the meantime. A 480 GB MyDigitalSSD NVMe drive was a great deal. This baby copies files at gigabytes per second.

HDD: a 2 TB Seagate. While SSDs have been getting fast, HDDs have been getting cheap. To somebody who has used Macbooks with a 128 GB disk for the last 7 years, having this much space feels almost obscene.

Motherboard

The one thing I kept in mind when picking a motherboard was the ability to support two GTX 1080 Tis, both in the number of PCI Express lanes (the minimum is 2x8) and in the physical space for two cards. Also, make sure it's compatible with the chosen CPU. An Asus TUF Z270 did it for me.

The MSI X99A SLI PLUS should work great if you go with an Intel Xeon CPU.

Power Supply

Rule of thumb: Power supply should provide enough juice for the CPU and the GPUs, plus 100 watts extra.

The Intel i5 7500 processor uses 65W, and the GPUs (1080 Ti) need 250W each, so I got a Deepcool 750W Gold PSU (currently unavailable; the EVGA 750 GQ is similar). The "Gold" here refers to the power efficiency, i.e. how much of the power consumed is delivered rather than wasted as heat.
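The rule of thumb above works out like this (the helper function is just for illustration; TDP figures are the ones quoted in the text):

```python
def psu_wattage(cpu_watts, gpu_watts, num_gpus, headroom=100):
    """Rule of thumb: CPU draw + GPU draw, plus ~100 W of extra headroom."""
    return cpu_watts + gpu_watts * num_gpus + headroom

# Single GTX 1080 Ti build with the i5 7500:
print(psu_wattage(65, 250, 1))  # 415 W

# Adding a second 1080 Ti later still fits under a 750 W PSU:
print(psu_wattage(65, 250, 2))  # 665 W
```

So the 750W unit leaves room for the planned second card without an upgrade.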

Case

The case should match the form factor of the motherboard. Also, having enough LEDs to embarrass a Burner is a bonus.

A friend recommended the Thermaltake N23 case, which I promptly got. No LEDs sadly.

Budgeting it all in

Here is how much I spent on all the components (your costs may vary):

$700 GTX 1080 Ti
+ $190 CPU
+ $230 RAM
+ $230 SSD
+ $66 HDD
+ $130 Motherboard
+ $75 PSU
+ $50 Case
============
$1671 Total

Adding tax and fees, this nicely matches my preset budget of $1700.