My current research areas are robotics, image processing, and embedded systems. I am addicted to small, portable, and mobile platforms for my work. The project I’m currently working on involves migrating robotic calculations to a GPU using CUDA, NVIDIA’s GPU computing platform. To make development and debugging easier, I want a GPU with a high CUDA core count, but I also need to stay mobile, since my work often requires me to move around. The solution? An external GPU.

Use case for external GPUs

One of the downsides of CUDA is debugging. On Linux this is less of a problem, since gdb is such a great tool, but on Windows you need to use NVIDIA Parallel Nsight, and its requirements, as you can see in the image on the right, dictate either two GPUs or two networked PCs. Dual-GPU laptops are 17″, which defeats the mobility requirement and makes them difficult to use for demonstration purposes.

So, my current configuration for mobile development is as follows:

A heavily modified Dell XPS M1530:

Intel Core 2 Duo T9300

NVIDIA GeForce 8600GT DDR3

8GB DDR2 RAM

Touch screen

A homemade tablet based on a Dell Inspiron 1520:

Intel Core 2 Duo T9300

NVIDIA GeForce 8600GT DDR2 with a custom-built PCB supporting 8GB DDR2

Touch screen

Four-axis accelerometer

Intel 160GB G1 SSD

As well as a variety of Core 2 Duo and Core 2 Quad desktops, and a variety of older NVIDIA GPUs—not the greatest cards but they do provide some minimal CUDA testing as well as a basic development environment.

So technically, I have two mobile PCs on some sort of communication network, which meets one of the Nsight requirements for remote debugging.

Complications with remote debugging

Remote debugging using Nsight requires Visual Studio and Nsight to be installed on both computers, which I don’t like. I don’t mind Nsight on both systems, but maintaining two full installations of Visual Studio 2010 is messy, and it significantly impacts performance and longevity on my first-gen Intel SSD, since the first-gen Intel SSDs do not support TRIM.

There weren’t any external GPUs that met every single one of my needs, so I opted to build my own.

Here are some of the parts I used:

Building a custom external GPU

The first part needed is an external PCI-Express 16X adapter. It basically splits the single 16x slot into four PCI-E 2X channels that can be shared across four different devices, although this configuration is unsupported without modification. I also added a Samsung S3C6410 SoC as a microcontroller to handle the sharing, quick removal, logging, debugging, and ejecting through the USB interface.

The Samsung S3C6410 provides one USB host controller and one USB client, which is why you also see another USB port on the board. You can attach many things to that USB port, such as a sound card, Wi-Fi adapter, modem, and so on.

Any PCI-E slot/independent lane also provides one USB signal—that is how I was able to piggyback the S3C6410 onto the PCI-E 16X slot. There are two ways to implement additional USB devices off of a PCI-E lane:

Option one: PCI-E -> S3C6410 -> free USB

Option two: PCI-E -> root hub -> any number of USB devices

I used option one because I wanted direct control over the USB through the SoC instead of relying on CPU interrupts. Handling USB through CPU interrupts also consumes more of the host’s resources, so I preferred to spend the SoC’s resources instead.

The adapter communicates through the laptop’s ExpressCard slot (2x) or a desktop PCI-E redirector (available as PCI-E 1X, 4X, or 8X cards; I purchased only the 1X card, as data communication is not that important). I also decided to use the NVIDIA GeForce GTX 560 Ti. Why not a ‘better’ card? Because the 560 Ti has the highest CUDA core count (384) among NVIDIA’s consumer-level cards that support the highest CUDA compute capability (2.1). The GeForce GTX 570 and GTX 580, like the Fermi-based Tesla cards, have more cores but only support compute capability 2.0.
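To put the link widths mentioned above in perspective, here is a rough sketch of the theoretical bandwidth for each one. The post does not say which PCI-E generation is in play, so the generation is a parameter here; the per-lane figures are the standard rates for PCI-E 1.x and 2.0 after 8b/10b encoding overhead, and the function names are made up for this illustration.

```python
# Approximate usable PCI Express bandwidth per lane, per direction, in MB/s.
# PCIe 1.x signals at 2.5 GT/s and PCIe 2.0 at 5 GT/s; both use 8b/10b
# encoding, so only 8 of every 10 transferred bits carry data.
PER_LANE_MB_S = {"1.x": 250, "2.0": 500}

# Link widths mentioned in the post (1x, 2x, 4x, 8x, 16x).
SUPPORTED_WIDTHS = (1, 2, 4, 8, 16)


def link_bandwidth_mb_s(lanes, generation="1.x"):
    """Return the theoretical one-way bandwidth of a PCIe link in MB/s."""
    if lanes not in SUPPORTED_WIDTHS:
        raise ValueError(f"unsupported link width: x{lanes}")
    return lanes * PER_LANE_MB_S[generation]


def transfer_time_ms(megabytes, lanes, generation="1.x"):
    """Return the best-case time in milliseconds to move a buffer one way."""
    return 1000.0 * megabytes / link_bandwidth_mb_s(lanes, generation)


if __name__ == "__main__":
    # Hypothetical 100 MB buffer copied from host to GPU.
    for lanes in SUPPORTED_WIDTHS:
        bandwidth = link_bandwidth_mb_s(lanes)
        millis = transfer_time_ms(100, lanes)
        print(f"x{lanes:<2} {bandwidth:>5} MB/s  100 MB in {millis:.0f} ms")
```

These are best-case numbers that ignore protocol overhead, so real transfers will be slower. A narrow link mostly hurts when a kernel has to move a lot of data between host and GPU; for debugging and compute-heavy kernels that keep their data on the card, it matters much less.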

I also went to eBay to find a used first-gen Xbox 360 PSU. Why? Because the first-generation PSUs can provide up to 203W, while all the newer ones provide only 170W or less. The GeForce GTX 560 Ti draws at most 175W under load, so 203W is more than enough, and the 203W version has nice active cooling built in (a fan in the PSU).
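The power budget works out as follows. This is only a back-of-the-envelope sketch using the wattages quoted above; the helper name is made up for illustration, and it assumes the GPU is the only significant load on the PSU (the adapter board and SoC are ignored).

```python
def power_headroom(psu_watts, load_watts):
    """Return (spare watts, spare power as a fraction of the PSU rating)."""
    if psu_watts <= 0:
        raise ValueError("PSU rating must be positive")
    spare = psu_watts - load_watts
    return spare, spare / psu_watts


# Figures quoted in the post.
FIRST_GEN_PSU_W = 203   # first-generation Xbox 360 PSU
NEWER_PSU_W = 170       # upper bound for the newer PSUs
GPU_MAX_LOAD_W = 175    # GeForce GTX 560 Ti under load

if __name__ == "__main__":
    for name, rating in (("first-gen", FIRST_GEN_PSU_W), ("newer", NEWER_PSU_W)):
        spare, fraction = power_headroom(rating, GPU_MAX_LOAD_W)
        verdict = "OK" if spare >= 0 else "insufficient"
        print(f"{name} PSU ({rating} W): {spare:+d} W spare ({fraction:+.1%}) -> {verdict}")
```

With these numbers the first-gen PSU leaves 28W to spare, while even the largest of the newer PSUs falls 5W short of the card's peak draw.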

Component List

GeForce GTX 560 Ti

External PCI-Express 16X adapter

ExpressCard bus extender

Two desktop PCI-Express redirectors

Xbox 360 first-gen 203W PSU

Part two of this article is now up: How I built an enclosure for my external GPU.

