Building Your First Cluster

Introduction

Getting started with what appears to be a very powerful, very complex idea (in computing, at least) is often a daunting proposition, and cluster computing is no exception. There is so much to learn! So many things can go wrong! It might require new, specialized, expensive hardware and software! Looking over some of the articles on this website can easily reinforce this view, as some of them describe very sophisticated tools and concepts.

Anybody who has ever tackled a major new computing project knows how costs can spiral, energy and time fritter away, and benefits seem unreachably distant in the middle of the mess. This classic Fear, Uncertainty and Doubt (FUD) has propelled many a corporate or government project manager to spend huge amounts of money on shrink-wrapped parallel supercomputers and full-service contracts, because at least that way they can put a firm upper bound on costs and get a fairly predictable benefit -- even when their task is actually trivially parallel and could be run at a fraction of the cost on commodity hardware.

In universities and many government labs, though, there are large numbers of researchers who simply don't have the deep pockets that this kind of throw-money-at-it response requires, but who do have an unbounded appetite for compute cycles. For well over a decade, this group of individuals (I'm one of them) has been working on ways of getting cycles predictably and at modest cost (using commodity components instead of expensive "big iron" supercomputers), without having to be an absolute computing genius to make it all work. After all, some of us want to foist all the real work of cluster programming and management off on graduate students eventually, right?

This column presents a distillation of all that experience and effort by all of those bright people, and will, I hope, convince you that cluster computing with commodity systems isn't really all that difficult or expensive, and that you can see a clear path to significant benefits after reading as little as a single article (for example, this one)!

Today we will review the simplest of designs for a cluster you may already have -- two or more (Linux/Unix) workstations on a completely generic local area network. This sort of "cluster" is easy to build and actually free in many environments (or rather, for the economically picky, is available at zero marginal cost, significantly improving the productivity of your existing resources).

To demonstrate its potential utility, we will learn to actually run an absolutely trivial "demo" parallel task -- one that does no real work but contains a very clear "placeholder" where real work can be inserted. This parallel task will be used in various incarnations in future columns to demonstrate things like parallel scaling laws and parallel libraries.

A final note before proceeding. This column is for beginners at cluster computing, but is not really intended for beginners at computing altogether. It does presume a passing acquaintance with Unix (preferably Linux), TCP/IP networking, and a programming language or three at the single-computer level.

This column will primarily use C, perl, and /bin/sh (as well as numerous other tools in a standard workstation installation of Red Hat Linux) in the discussion and examples, although for the most part they are really Unix-agnostic. If you are a real beginner at computers altogether, you will likely need to do a bit of work and learn to install, manage, and program at the single-computer level. Fortunately, bookstores, the internet, and other columns and articles in this and other Linux magazines (like Linux Magazine) make this quite easy these days. The software you need appears at the end of the article (for cut and paste), or you can download it as a tar file.

Sidebar One: NOW Cluster Recipe

The following are the (very generic) ingredients for a small "getting started" NOW-style compute cluster:

2-8 PCs (Intel or AMD will be easiest). Their (Linux-supported) hardware configuration can be quite flexible, but "workstation" nodes should include adequate memory (minimum 128 MB), room on their hard disk for a comfy Linux workstation installation (4+ GB), and a 100BT ethernet interface, preferably a "good" one.

A 100BT switch with at least as many ports as you have computers, and enough cabling and wiring to connect all the workstations to the switch.

Linux (or perhaps FreeBSD), installed on each node. I personally am currently using Red Hat Linux version 9 for my own NOW as it permits PXE/Kickstart/Yum installation and maintenance (to be described in future columns), but just about any reasonably modern Linux will suffice. All the workstation nodes should have OpenSSH (server and client), perl version >= 5.8.0 (with thread support), gcc, and make in order to run the examples, although a good programmer ought to be able to rewrite the perl "taskmaster" in C using POSIX threads quite easily. I include the perl example because (frankly) it is really cool that perl now has the ability to manage threads, and it is a bit easier to hack around with a script than with C sources.

To "configure" the cluster for the example, one system should probably be set up as an NFS server for /home, and all the systems should share account space. ssh should be configured (according to its man pages) so that one can transparently execute remote commands without entering a password, e.g.:

  rgb@lilith|T:101>ssh lucifer date
  Sun Sep  7 08:27:08 EDT 2003
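The passwordless ssh configuration the recipe calls for can be sketched roughly as follows (the host name lucifer is from my own cluster -- substitute your own worker node names; details vary by OpenSSH version, so consult the man pages):

```shell
# Generate a key pair once on the master node. An empty passphrase
# gives fully transparent logins; on shared machines consider a
# passphrase plus ssh-agent instead.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Install the public key in each worker node's authorized_keys
# ("lucifer" is a placeholder host name):
ssh-copy-id lucifer

# Test: this should print the remote date with no password prompt.
ssh lucifer date
```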

Is There a Supercomputer in the House?

A parallel supercomputer requires only two basic hardware components: a bunch of processors and a way for those processors to talk to one another and other private and shared resources. In the case of a NOW those components are realized as workstations and servers (each typically with one or at most two processors) and an ordinary TCP/IP network.

In almost any modern Unix/Linux-based business, such a network can be found -- workstations and servers serving staff, programmers, and nowadays even executives, as Open Office, KDE, Gnome, and their associated office tools mature. Universities almost invariably have workstation clusters of this sort to serve students and faculty alike. There is a network of Linux workstations in my house, serving my wife and kids (as well as my own professional needs); in fact, it was used to test and debug the parallel programs that accompany this article (as I write, sitting at my laptop in my downstairs den, connected via wireless to my house Local Area Network (LAN)).

If you have such a NOW, great! Check the sidebar entitled "NOW Cluster Recipe" and make sure that all the workstations have the setup described there. If not, don't despair! The same sidebar gives you a basic set of ingredients and an outline of installation instructions. These instructions are necessarily terse and are more of a configuration specification, as it simply isn't possible to tell you how to install Linux or FreeBSD or a commercial Unix itself for every possible hardware or distribution permutation. This is well documented elsewhere, however, and shouldn't be a terrible obstacle.

So go ahead, prepare your LAN, get to where you can enter "ssh target date" and see the date setting on the "target" workstation (without a password), and then return and start the next section.

Putting the NOW to Work: A Simple Parallel Task

At this point you should be sitting at a Linux/Unix workstation on a LAN set up to permit remote logins and command execution on at least one other system. These systems (as far as this article is concerned) can be Internet accessible or even on a Wide Area Network (WAN) as we're going to use ssh only as a remote connection protocol, although network performance is likely to be more consistent on a LAN.

Perform the following tasks:

Get the software at the end of the article (cut and paste) or download the tar file.

Put the code for task.c and its Makefile in a subdirectory on your system(s) and build it (just entering "make" should do it, if gcc and make are installed and correctly set up).

Put the resulting "task" binary on a consistent path on all the systems. This might be someplace like $HOME/bin/task or /tmp/task. I'm assuming the latter in the taskmaster script below. You can execute task without arguments to see its argument list, and can execute it by hand once or twice with arguments to see what it returns.

Put the taskmaster script in a convenient place -- probably your working subdirectory where you have the task.c source. If you execute the taskmaster script without arguments (be sure to set its executable bits) it will also tell you its argument list. If it complains about not knowing about threads, you will need to update your version of perl to one that includes thread support.

Create a file named "hostfile" containing the names or IP numbers of the host(s) you wish to use in your parallel computation. I'd recommend having at least four participating hosts (one of which can be the "master" host that you are working from).
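Creating the hostfile is a one-liner; the names here (node1 through node4) are placeholders for whatever your workstations are actually called:

```shell
# One host name or IP address per line; taskmaster reads them in order.
cat > hostfile <<EOF
node1
node2
node3
node4
EOF

# Quick sanity check: four participating hosts.
wc -l < hostfile
```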

Don't worry about load balancing or the network or whether the computers in question are all the same speed -- task only does simulated work with a fixed delay, so any computers used will result in a fairly predictable task scaling (more on that later). Also, task uses almost no memory or CPU, so you also don't have to worry about annoying the console users of any workstations you are borrowing -- they won't even know the task is there.

Here's what task and taskmaster do. task seeds a system random number generator with the seed you give it. It then loops nwork times, sleeping for delay seconds and then printing a uniform deviate in the range [0.0,1.0) on stdout. If delay is "big" it simulates a lot of work for a little communication; if delay is "small" it simulates relatively little work for a lot of communication.
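The loop task performs is easy to mimic in shell with awk's rand(), which also returns deviates in [0.0,1.0). This is a sketch of the behavior just described, not the actual task.c source:

```shell
# Sketch of task's loop: seed the generator, then print nwork uniform
# deviates, sleeping delay seconds before each one.
seed=42 nwork=5 delay=0

awk -v seed=$seed -v nwork=$nwork -v delay=$delay 'BEGIN {
    srand(seed)
    for (i = 0; i < nwork; i++) {
        if (delay > 0) system("sleep " delay)
        print rand()
    }
}'
```

With delay=0 the five deviates appear instantly; raise delay to simulate real work per deviate.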

taskmaster is responsible for distributing an instance of task to some or all of the hosts in a hostfile passed to it as an argument. You can control how many hosts it uses with another argument (we don't really check whether this is more than the available host pool, so be careful). Finally, you tell taskmaster how many uniform deviates it should try to make in parallel, with what delay for each. taskmaster then partitions the task, creates separate threads (one per host), and executes task on the remote host via ssh in each thread. It then collects the returns from those jobs (reading their stdouts and putting the uniform deviates in an array) and prints them all to the screen -- more so you can see that actual work was done on the remote hosts than for any practical purpose.
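The fork-and-collect pattern that taskmaster implements with perl threads looks like this in plain shell. To keep the sketch self-contained, a local awk job stands in for the remote task binary; in the real taskmaster each background job would instead run something like ssh $host /tmp/task $seed $per_host $delay:

```shell
# Partition nrands deviates across nhosts "workers", run the
# partitions concurrently, then collect and print the results.
nhosts=3 nrands=9 delay=0
per_host=$((nrands / nhosts))

i=0
while [ $i -lt $nhosts ]; do
    # Real taskmaster: ssh $host /tmp/task $seed $per_host $delay
    awk -v seed=$i -v n=$per_host 'BEGIN {
        srand(seed); for (j = 0; j < n; j++) print rand()
    }' > out.$i &
    i=$((i + 1))
done
wait                    # join: block until every worker job finishes

cat out.* > deviates    # collect the per-host returns
cat deviates            # print them all, as taskmaster does
```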

Finally, taskmaster returns the timing, which is very important and will be used in future columns to explore the mysteries of parallel scaling laws. By varying nhosts, nrands, and delay we can learn lots of things about parallel scaling even in this simple environment, and even experienced cluster programmers may find the threaded taskmaster script a useful base from which to write a more careful task distribution mechanism. One with actual error checking, for example...
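The timing effect is visible even without remote hosts: three one-second work units take about three seconds run serially but about one second run concurrently. Here sleep stands in for task's per-deviate delay:

```shell
# Serial: three 1-second work units, one after another.
t0=$(date +%s)
sleep 1; sleep 1; sleep 1
serial=$(( $(date +%s) - t0 ))

# Parallel: the same three units as concurrent background jobs.
t0=$(date +%s)
sleep 1 & sleep 1 & sleep 1 &
wait
parallel=$(( $(date +%s) - t0 ))

echo "serial: ${serial}s  parallel: ${parallel}s"
```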

You are now ready to go! Here are some examples of taskmaster usage on my own NOW cluster (named "eden", if anybody cares), configured with seven available hosts.