Contributed by jj on 2012-03-24 from the sudo-make-me-coffee-in-parallel dept.

Kristaps Dzonsons wrote in with an article about how OpenBSD helps him produce better research. Kristaps writes,

It's no secret that OpenBSD is an excellent research platform. From packages(7) for specialised software to out-of-the-box httpd(8), sshd(8), and so on, it's a no-brainer to pop OpenBSD onto a workstation and just get to work.

In this article, I explore how OpenBSD's clean code and sane defaults recently saved the day. For great science!

By way of background, I often [ab]use make(1) for generating visualisations from hundreds of datasets. It usually goes something like this:

1. generate a set of parameters, like moments of a distribution;
2. generate data files from these parameters;
3. generate plotter input by applying parameters to a template;
4. plot using the plotter input and data files.

make(1) ties it all together. Sometimes I manually maintain a Makefile; sometimes it's a tangle of loops and substitutions. For larger projects I generate the Makefile from a configure script. But in the end, it usually looks like this:

    parm1=0.01
    # Plot the image from the plotter input and the data file.
    p1.png: p1.plot p1.dat
    	gnuplot p1.plot
    # Fill in the template's placeholders to get the plotter input.
    p1.plot: template.plot
    	sed "s!@OUT@!p1.png!;s!@IN@!p1.dat!;s!@PARM@!$(parm1)!" $< >$@
    # Generate the data file (the slow part).
    p1.dat:
    	longrunning >$@

This trivial example uses gnuplot(1) to generate an image, p1.png, from a data file, p1.dat, and plot input, p1.plot. Assume that the longrunning utility consists of heavy number-crunching. Now imagine hundreds of data files and images... you get the point.

This is fabulously easy on OpenBSD. On a new box I pkg_add(1) the required tools, pull my templates and Makefile from an off-site cvs(1) repository, then invoke make(1). Time from CD boot? A handful of minutes.

Meanwhile, a recent project was pushing my patience: the parameters needed lots of tweaking, with partial builds taking over ten minutes. I can only drink so many coffees per day (note: conjecture).

Each plain-text data set took from a few seconds up to minutes to generate. It then occurred to me that I could use a nearby backup machine to build dataset targets with ssh(1), since the output format of longrunning was machine-independent.

    p1.dat:
    	ssh node1 longrunning ">$@"
    	scp node1:$@ .

(Note: I shell-escape ">$@" for generality. In this snippet, sending directly to $@ is valid.)

This worked, although builds completed unevenly due to longrunning's random execution time. This didn't bug me: speed-up with rough edges is still speed-up, right? But when I tripled my simulations and was stuck guessing remote host load and baby-sitting builds, the hack had outlived its usefulness.

Unlike the systems of choice for some researchers, where solutions to such problems involve despair and liquor, OpenBSD has a third option: /usr/src. Why not hack make(1) to distribute target builds on-the-fly? No need to hardcode remote hosts and endure load inequality. To the source!

make(1) had always seemed one of those utilities you just run and try not to think about. But it took only a few minutes to walk through usr.bin/make, starting with grep exec, and discover exactly where to play with build targets. Tricky bits, functions, even files themselves were well-documented in the source.

In no time at all, I added a new special source to whitelist distributable targets (.DIST) and some goop to send and receive dependencies and targets. The dispatcher was ready to go!

The bad news: I lost my coffee time.

The good news: distributed make(1) with a single patch to /usr/src. No need for NFS. No need for RPC---way too complicated for me. Just the clean make(1) code, a single patch on the local host, and password-less keys for sshd(8) on my build hosts. Throw in ControlMaster connections and my build times dropped arithmetically with the number of remote hosts.

In the end, my research capacity jumped from a small set of simulations to a set proportionate to my build cluster. Adding a new machine? Slap OpenBSD on the disk and add ssh(1) keys. Upgrading the dispatch machine? Re-apply the patch and get back to work.

The moral of the story is that the clean code and low barrier to entry, cd /usr/src/usr.bin/make---not to mention the excellent default system installation---are invaluable tools. Scientific computing puts a lot of stress on computation, but there's more to it than algorithms and tuning: it's a process. And part of that process is starting dhcpd(8) on a secondary Ethernet device, attaching some boxes, and running them into the ground.

Is this feature useful for the general make(1)? No. Is it useful for me? Absolutely. Can it be improved? Sure. All for another day, and another adventure in /usr/src!

References:

Unofficial Distributed Build Extension of OpenBSD's bsdmake:
http://kristaps.bsd.lv/bsddistmake/
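
A note on the configure step mentioned in the article: generating the Makefile for many parameter sets can be sketched in sh(1). This is only an illustration under stated assumptions, not the author's script. The function name gen_makefile, the p1, p2, ... naming scheme, and the added "all" target are hypothetical; the rule bodies simply repeat the single-parameter Makefile from the article, once per parameter value.

```shell
#!/bin/sh
# Hypothetical generator (not the author's configure script): print a
# Makefile with one image/plot/data rule set per parameter value.
# The gen_makefile name, the pN naming, and the "all" target are
# assumptions made for illustration.
gen_makefile() {
	# First pass: collect every image so a default target can build them all.
	n=0
	targets=""
	for parm in "$@"; do
		n=$((n + 1))
		targets="$targets p$n.png"
	done
	printf 'all:%s\n' "$targets"

	# Second pass: emit the three rules for each parameter value.
	n=0
	for parm in "$@"; do
		n=$((n + 1))
		printf 'p%s.png: p%s.plot p%s.dat\n' "$n" "$n" "$n"
		printf '\tgnuplot p%s.plot\n' "$n"
		printf 'p%s.plot: template.plot\n' "$n"
		printf '\tsed "s!@OUT@!p%s.png!;s!@IN@!p%s.dat!;s!@PARM@!%s!" template.plot >$@\n' "$n" "$n" "$parm"
		printf 'p%s.dat:\n' "$n"
		printf '\tlongrunning >$@\n'
	done
}
```

Running something like gen_makefile 0.01 0.02 0.05 >Makefile would then give make(1) one independent .dat target per parameter value, which is the kind of target the article farms out to remote hosts.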