[This post is partly recycled from my recent not-really-on-topic entry on the Index Data blog. Apologies to anyone who’s seen the earlier version. If that’s you, skip down to ScottKit — my first non-trivial Ruby program, just after the sushi: it’s all new material from there on.]

Out with the old, in with the new

A month or so ago, it suddenly struck me that I am writing most of my programs in the same language (Perl) that I was using a decade ago. Sure, I’ve learned and used some other languages since then, notably Java and JavaScript, but neither of those has wired itself into my hindbrain the way C did in 1986 and Perl in 2000, so that they spring unbidden to my fingertips when I open a fresh emacs buffer to start a new program. Ten years is a long time to ride the same horse, especially when programming environments are moving so fast, so I decided to make a conscious effort to learn something new. Partly just to keep myself sharp; partly because learning a new language is learning a new way of thinking, which is useful when programming in an old language, too; and partly in the hope that I might find something that is just plain Better than Perl.

Don’t get me wrong — I like Perl. Obviously I do, or I wouldn’t have been using it as my Language Of Choice for a decade. I like it, but I can’t love it: it’s a language that was not so much designed as congealed, and consequently it encompasses a huge variety of ideas from a huge variety of sources, going back as far as COBOL. The various ideas don’t always sit comfortably on the same conceptual sofa, and there is a certain amount of pushing and shoving. All in all, it’s not a language that you would want to introduce to your mother. Still, these are easy flaws to forgive, because Perl is just so darned useful. I’ve hardly written C at all in those ten years, because Perl is capable enough (and fast enough) to let me do almost everything I used C for in the 1990s. (The last exception was a year or two back when I needed a program that would call setrlimit() to establish a desired environment, then execute a named program. I couldn’t do that in Perl because it seems to have no API to the setrlimit() system call, so I had a nice excuse to write some C.)

Choosing a new language

So I was looking for a new language to learn. Because I wanted something that was fun to work with, bondage-and-discipline languages with masses of compile-time checking were off the agenda; and because I wanted something with a sufficiently well established ecosystem that I could use it for Real Work right off the bat, novelties like Google’s Go were also not the way to go. For well established, widely used dynamic languages, the choice basically came down to Python or Ruby, which now have selections of packaged libraries on a par with Perl’s CPAN.

Both of those were appealing, and I nearly went with Python; but in the end it felt just a bit too familiar — a bit too much like cleaned-up Perl — whereas Ruby felt more like an adventure. So that’s what I went with, and a few weeks on it’s a decision I am really happy about.

How to learn a language? The obvious way these days, I guess, is from online tutorials and videos. But I am a bit old-school in these matters, and I’ve always found that the most effective way for me to learn a totally new technology is from a book — something I can take away from the screen, snuggle under a duvet near an open fire, with a glass of cheap port and a slice of mature cheddar at my side, and read without being continually tempted to Just Check My Mail. Luckily I have an old university friend (Steve Sykes) who is a big Ruby fan, so from him I was able to get a recommendation for what book to buy. Turns out that, just as The Book for C is Kernighan and Ritchie, so there is a The Book for Ruby: Programming Ruby: The Pragmatic Programmer’s Guide (amazon.com, amazon.co.uk), by Dave Thomas. (I went with the version that describes the new 1.9 release of Ruby, which handles character sets correctly, rather than the older and more widespread 1.8).

While Ruby is an imperative object-oriented language, it also uses a lot of functional ideas — in fact Paul Graham has described it as “a dialect of Lisp with syntax”. One of the really neat things about it, which turns out to be trivial but has huge ramifications, is its syntax for closures (or blocks, as they are known in Ruby), which has finally won me over to how great closures are.
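To make the point about blocks concrete, here is a minimal sketch. The method and variable names are mine and purely illustrative; the idea is just that a block is a chunk of code handed to a method, and that it can see and change local variables from the scope where it was written.

```ruby
# A block closes over variables in the enclosing scope.
total = 0
[1, 2, 3].each { |n| total += n }   # the block updates `total`

# A method can accept a block and call it with yield.
# `repeat` is a hypothetical helper, not a built-in.
def repeat(times)
  results = []
  times.times { |i| results << yield(i) }
  results
end

squares = repeat(4) { |i| i * i }

# A block can also be captured as an object and called later.
counter = 0
increment = lambda { counter += 1 }
3.times { increment.call }

puts total            # 6
puts squares.inspect  # [0, 1, 4, 9]
puts counter          # 3
```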

ScottKit — my first non-trivial Ruby program

By the time I’d read through Programming Ruby, and done a few simple exercises, I had a decent feeling for the language; but of course you never really learn a language until you write a non-trivial program in it. So I set out to build something tractable but fun, and I came up with ScottKit. It’s a toolkit for messing with Scott Adams format Adventure games — simple two-word-parser interactive fiction of the kind that was popular in the 1980s and which, for me at least, has never lost its charm. (Part of the reason for that is that I made and sold such games myself as a young teenager — it was a major rite of passage for me.)

ScottKit was a neat exercise for several reasons:

It deals with an existing format, defined and documented, for representing games.

It could be built in stages: first, a decompiler for existing games; then a system for playing such games; then a compiler for building games from a source format of my own devising. Each stage was worthwhile in its own right, so I didn’t have to slog through lots of work before I saw any results.

There are quite a few existing games out there, free to download, which it can play and decompile: decompiling is great when you get stuck, because you can see what the rooms and items are and what the actions do. You can also decompile a game, tweak the source, and recompile the modified version.

Games are fun!

So how did the experiment go?

Well, the actual programming was a joy. Ruby is — I’ve thought about it, and this is the best word — fun. Of all the languages I’ve ever used, it’s the one that gets in your way least, that allows you to spend the greatest proportion of your time actually solving your problem rather than trying to remember whether you need an InstanceFactoryBuilder or an EntityBuilderFactory, or how many $s and @s to use. (“Never write $a[$i] when you mean ${$a[$i]} or @$a[$i] when you mean @{$a[$i]}. Those won’t work at all” — the Perl Data Structures Cookbook.) I don’t see myself writing a lot of Perl in the future, except where mandated by external forces.

Learning the culture

But of course the actual language is only the tip of the iceberg (and finally we come to the actual point of this blog post): where you really face a steep learning curve is, well, everywhere else. Learning a language is a great start, but to be productive in any meaningful sense you also have to learn the libraries, the testing frameworks, the packaging systems, the build tools, the inline documentation systems, the code-hosting services, the documentation-hosting services, and no doubt a bunch of other stuff that I’ve not got around to yet. Let’s look at those in turn, and see how they are panning out for Ruby.

Libraries

One of the nice things about Ruby is that the supplied standard libraries are pretty extensive and cover a lot of the kinds of things you want to do, so you don’t for example have to go and evaluate half a dozen candidate XML libraries before you can start messing with angle-brackets: the no-brainer answer is to use REXML, which comes with the interpreter. This contrasts nicely with Perl, where the 19,742 modules downloadable from CPAN include FIVE HUNDRED with XML somewhere in their title. It’s nice not to have to make those choices.
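As a rough illustration of how little ceremony REXML needs, here is a small sketch that parses a string and pulls out some attributes. The XML sample is invented for the example. One caveat: in newer Ruby releases REXML is packaged as a gem rather than as part of the core standard library, so depending on your installation it may need to be installed separately.

```ruby
require "rexml/document"

# Hypothetical sample data; any well-formed XML would do.
xml = <<~XML
  <rooms>
    <room id="1" name="Forest"/>
    <room id="2" name="Cave"/>
  </rooms>
XML

doc = REXML::Document.new(xml)

# Collect the name attribute of every <room> element.
names = []
doc.elements.each("rooms/room") { |room| names << room.attributes["name"] }

puts names.inspect
```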

That said, those supplied libraries and built-in classes give you plenty to learn. I can do basic string processing in Ruby, but the String class has 145 methods — 145! The Array class has 121. The Class class has 77. I doubt anyone ever fully learns these classes, but good Ruby programmers will have the main methods at their fingertips, and it’s a little intimidating to think how long it might be before I qualify as a “good Ruby programmer” by that metric.
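If you want to see the size of these classes for yourself, something like the following sketch will do it. I am assuming that counting instance methods is roughly how figures like the ones above are arrived at; the exact numbers differ from one Ruby version to the next, so none are shown here.

```ruby
# Count the public and protected instance methods of each class,
# including inherited ones. Results vary between Ruby versions.
[String, Array, Class].each do |klass|
  puts "#{klass}: #{klass.instance_methods.size} methods"
end

# Methods defined by String itself, excluding inherited ones.
own = String.instance_methods(false)
puts "String defines #{own.size} of its own"
```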

Ruby comes with a nice packaging system, rubygems, that makes installation of libraries rather easier than in Perl. It seems to combine features of CPAN, Debian’s dpkg system and apt-get. So far I am pretty happy with it. (More on rubygems below.)

Testing frameworks

In the old days, the way to do unit-testing in Ruby was with Test::Unit, which succeeded Lapidary. In recent years, though, a bunch of alternatives have sprung up, including but not limited to Minitest::Unit, Rspec and Shoulda. Test::Unit itself seems to be phasing out, but Minitest::Unit includes an emulation for it, and that’s what I went with just because it seems to be the baseline. It seems to be possible to mix Rspec and Shoulda tests in with (real or emulated) Test::Unit later, if I should need to.
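For flavour, here is what a small unit test looks like in this style. It is a sketch only: parse_command is a hypothetical function invented for the example (it is not ScottKit's actual parser), and the test class is written against the current Minitest API (Minitest::Test), which is newer than the setup described above.

```ruby
require "minitest/autorun"

# Hypothetical code under test: splits a two-word command into verb and noun.
# This is an illustration only, not ScottKit's actual parser.
def parse_command(line)
  verb, noun = line.strip.upcase.split(/\s+/, 2)
  [verb, noun]
end

class TestParseCommand < Minitest::Test
  def test_verb_and_noun
    assert_equal ["GET", "LAMP"], parse_command("get lamp")
  end

  def test_verb_only
    assert_equal ["LOOK", nil], parse_command("  look ")
  end
end
```

Running the file with the ruby interpreter is enough: minitest/autorun arranges for the tests to run when the script exits.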

Packaging systems

This, happily, is a decision made long ago: if you want to distribute Ruby code, you do it as a gem. You just do. No-one distributes tarballs of Ruby code. If you want to be part of the community, you have to distribute your code as a gem, too, otherwise people will just shrug and assume you’re some kind of crazy sicko.

As it turns out, that’s a good thing. The gem packaging system is pretty lightweight: it imposes only a simple canonical directory structure on your code — one that makes perfect sense — and leaves you free to add whatever else you might want. And it’s really nice that anyone, anywhere in the world, who has a working Ruby installation can get ScottKit just by running “gem install scottkit”.

In fact, the gem directory-layout conventions are not quite dictatorial enough for my tastes, which is not something you’ll hear me say often. I want to add a change-log for my project, and such files are given many different names in different cultures — Changes (Perl projects), NEWS (stuff built using GNU configure), changelog (Debian packaging), Changelog-with-a-capital-C, History, etc. I was hoping that the standard gem layout would include such a file and mandate a well-defined format, but I wasn’t able to find any such convention. In the absence of anything better, I picked the name Changes, which is what I use in Perl, and adopted the Perl change-log format.

Build tools

Here’s where things started to get hazy for me. Being a traditionalist, I’ve always been quite happy with Stu Feldman’s baby, make, and not seen the need for ant and suchlike. (It’s like a Makefile — but it’s in XML! So it’s Better!) I was a bit surprised to find that in the Ruby world it’s much more common to use rake. (It’s like make — but it’s written in Ruby! So it’s Better!) I strenuously dislike this sort of cultural imperialism in a language — it’s dangerously close to the classic Java-world attitude of flat-out rejecting anything that’s not written in Java.

But I’ve warmed to rake. Unlike ant, whose whole selling point seems to be that it’s written in Java (a horrible language) and configured in XML (a horrible metalanguage), the point of rake is that the Rakefile is itself a Ruby program — Ruby’s economical and flexible syntax makes this much less intrusive than you might expect. And this of course means that you have the full power of Ruby to hand in expressing your rules: no need to call out to the shell, quote your quotes and backslash-end your continued lines as we’ve all been doing with make for years. So far I’ve not really exploited rake beyond the simplest cookie-cutter uses, but it’ll be interesting to see how it bears up as I start to do so.
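Here is a small sketch of what “the Rakefile is itself a Ruby program” means in practice. The .txt and .count extensions are made up for the example; the point is that the body of the rule is plain Ruby, with no shell quoting in sight.

```ruby
# Rakefile (a sketch; the file extensions are hypothetical)
require "rake"

# Rule: any foo.count file is built from the matching foo.txt.
rule ".count" => ".txt" do |t|
  words = File.read(t.source).split.size
  File.write(t.name, "#{words}\n")
end

desc "Build word counts for every .txt file in the current directory"
task :default do
  FileList["*.txt"].ext(".count").each { |target| Rake::Task[target].invoke }
end
```

Running rake in a directory containing notes.txt would then produce notes.count, and, as with make, the rule only fires again when the source is newer than the target.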

In Ruby-world, the build tool interacts with how you do packaging, because rake is so darned configurable that people can’t resist extending it with special kinds of tasks that know how to build gems. And that means you have more choices to make.

The basic approach to building your gem is to write a projectName.gemspec file, which specifies stuff like the name and version of your package, your own name and email address, which files are to be included, and so on. (The gemspec is itself, naturally, a little Ruby program.) Then you can run “gem build projectName.gemspec” and if all is well, then projectName–version.gem will be generated and dumped in the working directory.
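A minimal gemspec might look something like the sketch below. Every value in it is a placeholder, not taken from any real project, and real gemspecs usually set a few more fields (homepage, license, dependencies and so on).

```ruby
# projectName.gemspec (all values below are placeholders)
spec = Gem::Specification.new do |s|
  s.name    = "projectname"
  s.version = "0.1.0"
  s.summary = "One-line description of the project"
  s.authors = ["Your Name"]
  s.email   = "you@example.com"

  # Because this is ordinary Ruby, the file list can be computed.
  s.files   = Dir["lib/**/*.rb"] + Dir["bin/*"] + ["README"]
end
```

The gem build command evaluates the file and uses the value of its last expression, which here is the specification object.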

But various people have thought that maintaining the gemspec file by hand is too hard, so they’ve made rake extensions that do it for you. One such is jeweler, which pretty much leaves you to put the gemspec in your Rakefile, embedded in a Jeweler::Tasks block, and fills in some of the fields for you, like the date of release. To be honest, I am not wholly sure that this gets you a great deal, but that’s what I’m using right now basically because it’s what a friend uses. Jeweler does other objectionable things, too, like assuming your project is at the top level of its own git module and blowing up if that’s not true (unless you take special precautions).

So I might switch to hoe, yet another rake-extension-that-makes-gemfiles: I’ve only just glanced at it, but it makes a much shorter Rakefile with a much simpler rule, which I think harvests the relevant metadata from the README.txt, Manifest.txt, etc., to make the gemfile. Looks neater than Jeweler, and also, I notice, expects to see a file History.txt, which looks like it’s my change-log at long last.

Jeweler and hoe are not the only games in town, by the way. There are plenty of other Grand Unified schemes that various people have come up with, and choosing between them seems pretty much roulette.

Inline documentation systems

Ruby classes and functions can be annotated by special comments which are extracted by rdoc to make automatic documentation, just like javadoc does for Java. Rdoc seems to have superseded an earlier attempt called rd, and — I thought — had complete control of the Ruby inline documentation ecosystem. Nuh-uh. Seems that the world is now moving towards using YARD instead, which is kind of similar in spirit, but more complex (and harder to find documentation for). Ugh.
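To show the difference in flavour, here is a sketch of one small class with both comment styles side by side. The Room class is hypothetical, invented only for this comparison: rdoc comments are mostly free text with light markup, while YARD adds structured tags such as @param and @return.

```ruby
# Hypothetical class, used only to show the two comment styles.
class Room
  # The room's short description.
  attr_reader :description

  # Creates a room. RDoc style: free text with light markup,
  # where +description+ is rendered as code.
  def initialize(description)
    @description = description
    @exits = {}
  end

  # Connects this room to another. YARD style: structured tags.
  #
  # @param direction [Symbol] for example :north
  # @param room [Room] the destination
  # @return [Room] the destination, to allow chaining
  def connect(direction, room)
    @exits[direction] = room
  end

  # Internal helper; the :nodoc: directive tells RDoc to leave it out.
  def exits # :nodoc:
    @exits
  end
end
```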

Code-hosting services

If you’re getting tired of my listing bunches of alternatives under every heading, you’re not alone. I will try to be quick, so we can get to the punchline.

When you build a gem, the gem software complains if you’ve not defined a rubyforge_project metadata element — in other words, it just assumes you’re hosting your project on one particular site, which seems extremely presumptuous (I am trying to be polite here). When you use jeweler to set up a project, it requires that you tell it your username on github. Which is also, let us say, not the kind of behaviour we like to see. The upshot is that, because I had my own hosting arrangements already in place, I got whined at by two separate pieces of software that I wasn’t doing it the way they wanted. Rude, rude, rude.

As it happens, I have — for now at least — moved my git repository onto github. I’m not sure how that’s working out, for reasons that will become clear in the next (and last!) subsection; I might try moving to rubyforge after all. There’s also Sourceforge, of course, and Google Code; but they don’t seem to have been taken to the hearts of Rubyites in the same way as github has been.

Documentation-hosting services

If you look at the home page for, say, Ruby-ZOOM, you’ll see that it has all its rdoc documentation right there on the rubyforge-hosted site. Very useful. I’d assumed, or at least hoped, that github would do the same, but it seems that it doesn’t. (This is the main reason I am thinking of moving to rubyforge.) I guess this is because github supports projects in any language whereas rubyforge is Ruby-specific. [Addendum: turns out I was mistaken — rubyforge doesn’t do this for you. The Ruby-ZOOM authors made their own arrangements to upload the formatted documentation. Rats.]

There’s a solution, sort of, at http://rdoc.info/ — a site that knows how to pull projects from github, assemble their inline documentation and publish it. Better still, github can be set up, pretty easily, to tell rdoc.info automatically whenever you push a new commit of your project, and it arranges to have the documentation updated. It happens quickly, too. Great, eh? Well, nearly. Sadly, rdoc.info has a couple of issues:

There doesn’t seem to be a permalink to the pages documenting any of the classes: frames are used, and the URLs of the frames, when you manage to get hold of them, turn out to have the git commit’s SHA1 hash in them, which means they are (A) ugly and (B) not up to date.

Worse, despite the domain name, rdoc.info does not use rdoc! It uses YARD, and that is causing me problems, not just because I’d already set my source up to use rdoc comments, but also because YARD seems to have religious convictions that I am not happy about, such as a commitment that you have to document all your private methods.

I’ve made some steps towards YARDifying the ScottKit source, but I’m currently thinking that’s the wrong thing — I need to find an rdoc.info-like site that really does use rdoc. Failing that, I’ll have to make my own documentation-build-and-install scripts that push it to a ScottKit area on http://miketaylor.org.uk/, but I really don’t want to go that route.

What it all means

None of this is the fault of Ruby; all the same issues exist for other languages. I’m gradually coming to the conclusion that they are sort of irreducible: they come with the territory. For Real Work (as opposed to solving interesting puzzles with Prolog or APL), you need a language that has developed a rich culture over time. That’s what enables the language to make all the connections it needs to make in order to do the kinds of things we need to do these days. What it needs, in short, is an ecosystem. And ecosystems are complicated things. They are hard to learn.

Could the Ruby culture be simplified? Well, maybe by fiat. Maybe Matz could decree that all projects must be generated by jeweler, hosted on github, documented in YARD, and must have unit tests using Shoulda. It would, in a way, be nice to have those decisions made for me. But even if the community accepted these diktats, which they wouldn’t, it’s not really what we want. Languages that grow and develop and succeed are those with rich, competitive ecosystems; constrain an ecosystem too much and it becomes sterile. I’m guessing that in three years, some of the issues I’ve had to make decisions about will be much easier: Darwin will take care of the weaker approaches, and the stronger will survive. That’s how we got to the point of Ruby being as good as it is, after all — and by “good” I don’t just mean “elegant” or “fun to use”, but “capable of doing large-scale stuff using many different libraries available from and documented on well-known community sites”.

In short: when you learn a programming language — a real, grown-up one rather than a novelty — you have to learn the culture that goes with it. You just do. And it’s time-consuming and frustrating and feels like a waste of time, and yields up very few of those satisfying Aha! moments. It’s only gradually, having waded through these fertile but sometimes smelly swamps for a few weeks’ evenings, that it’s all starting to swim into focus and make some kind of sense. I think I am getting there with Ruby culture; hopefully ScottKit will be properly published and available soon, and I’ll have found a combination of test frameworks, inline documentation conventions and so on that works for me.

Ask me again in a few weeks, and I’ll tell you how it went.

And ask me in twenty years and I’ll let you know whether I’ve made any progress in penetrating Java culture.

—

Addendum

By the way, ScottKit is not really ready to use yet, even though I released it as a gem already. (That was mostly so that I could easily install it on the boys’ computer upstairs and they could get going building their own games.) The big gap is the documentation for the source language — it doesn’t really exist. I’ll try to fix that in the next day or so; I’ll post about ScottKit when it’s ready for attention.