This time I didn’t have to define a single function! I re-used two completely decoupled methods, Array#select and Integer#even? (methods that didn’t even know about each other), to accomplish this task.

The takeaway: if you let your language deal with repetitive details like looping and even-ness, you get to focus on the unique bits of code that differentiate your application from everyone else’s. And having so many reusable methods available to us makes it not only useful, but also easy to do the right thing by default.

From Java to Ruby

Higher-order functions have transformed my code so much that I barely recognize what I was writing without them. In fact, in 2005, I stopped coding in Java and learned Ruby. Because I wasn’t working in an environment where stability or fine-grained data structure manipulation was necessary, I shifted to favor ease of use over perfect control. Yes, Java ran faster than Ruby. Yes, more people were ‘doing’ Java. No, Ruby would not get you a job (this was before Rails 2). But I didn’t care about any of that. I wanted an easy, elegant way to make my computer work for me, to write scripts, create libraries, analyze data, and build Web applications. I wanted to move quickly and get things done. I was tired of spending too much time on boilerplate code and not enough time solving real problems.

(Update: As of 2016, my needs have shifted in the other direction, and I’m much more in favor of statically typed systems like Haskell. However, I think it was important for the pendulum to swing in the Ruby direction during the early stages of my career. I also think that Haskell’s type system is far better than Java’s, which is pretty strongly tied to mutable objects.)



I think the industry still hasn’t grasped the elegance of the style of code I’m looking at here. I get an expressive, flexible syntax, almost like Lisp. But in a friendly language that tries, above all else, not to surprise me. I get the ability to interoperate with Unix with just a backtick, but the language also runs on Windows and other platforms. And over the last six years, this power and expressiveness have been invaluable to me. I’ve learned to build websites using Rack and Sinatra and Rails (in fact, the site you’re looking at is powered by Ruby that I wrote), and I feel like an expert at the language.

I’m in the process of building out a couple of open source libraries. And generally, I’m satisfied with the way things are in the Ruby community. Exciting things are happening.



What I’m missing (and it’s not the semicolon)

Okay, I’ve spent lots of time praising Ruby for being beautiful, expressive and pragmatic. I do love the language and think it makes programming painless in lots of ways. But the Ruby community is not giving me one very important thing, something that’s vital for me at work: solid tools for science and statistics. I’ve already leveraged Ruby at work to dramatically speed up the processing and summarizing of data from some of our high-throughput experiments, but I’ve always had to pipe my numbers into R for statistics and graphing. Because unfortunately, despite all the hubbub around Ruby, no one seems to be crunching numbers with it. I’m not completely comfortable with R, though, and I want a one-stop solution for my numbers needs. So I’m going to do that thing that no Ruby programmer wants to do: I’m going to learn Python.

Now, learning Python is something that is usually frowned upon in the Ruby world, probably just because it’s the ‘rival language’ and is pretty similar. The traditional argument is: “Don’t learn Python! It’s a dynamically typed, multi-paradigm, interpreted scripting language, just like Ruby! And it’s ugly.” And all of that is true. But I’ve found (over four years, and especially this year) that really Ruby is focused on one thing, and that’s web development.

I love Web dev, and I’ve done my fair share of it. But I also have a day job that depends a lot on statistics.

In theory, I could port functions from SciPy and NumPy to Ruby. It’s been tried before without success (see the failed SciRuby project) and I’m pretty confident I don’t want to go that route. It takes a community, and not just one person, to foster something like that. And I have other things to focus on now.

Instead, I’m going to leverage the huge data ecosystem that’s grown around Python and add the language to my résumé. SciPy is unrivaled in the modern programming world, and I plan to embrace it for projects at work. What’s more, Python supports higher-order functions like Ruby, so I’m not missing out on all the functional goodness I described earlier.

However, this big change is not without its problems.

Python package management

This past week, I started my foray into the Python world. This involved installing Python 2.7 and 3.1, bookmarking Dive into Python, and figuring out package management.

Er, I guess, by “figuring out”, I mean “being completely baffled by”.

At the moment, Python package management seems to be fragmented and complicated. I am used to typing something like gem install symbolic when I want to install something on my machine. It’s standard, simple, and rarely causes problems.

In Python, though, there seem to be competing managers (easy_install and pip) and separate ways to package libraries for uploading. I’m also hearing names like setuptools, distutils, distribute and virtualenv, and I have to say that the whole ecosystem isn’t too clear to me yet. And the documentation tends to assume I know what all the above mean already!

After asking around, I take it I should use pip for package management; apparently pip is the future. In fact, it looks like it’s meant to mimic RubyGems. So installing SciPy should be a simple pip install numpy scipy.

Awesome. Easy as Ruby.

But wait. What’s this? I see a lot of text moving down my screen. God. My computer is starting to heat up. Now I’m seeing errors all over the place. “You need a FORTRAN compiler. Found gfortran. Installation failed.”

Wha?! I mean, I expected C dependencies, sure. I would hate to do math without them. But FORTRAN? Is FORTRAN something we’re still installing in 2010?

I’m hoping to get this all sorted out soon and start doing some heavy stats with the new language. I’m excited to be joining a group of people who are focused on data and experiments, and not only HTTP requests, MVC, APIs, jQuery and event processing. That’s not a jab at the Rubyists I know. I’m just excited to put a full language (and not just a web framework) to good use.

What to like about Ruby

Now, I want to be clear: I like Ruby better than Python. Its “developer UX” makes more sense to me than Python’s, and the language itself seems more expressive. I love that objects know how to `map` or `filter` themselves, which leads to elegant chaining. I love that blocks unify Ruby’s closures, anonymous functions, and iteration. That Ruby has such versatile syntax that it can masquerade as C or Perl or Scheme (more on that some other time). That when I code in Ruby, it does everything the way I would expect it to, without reading documentation.
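
To make the chaining and block points concrete, here is a minimal sketch. The sample data and variable names are invented purely for illustration. I use `select` for filtering, since `filter` is only available as an alias in Ruby 2.6 and later.

```ruby
# Hypothetical sample data, chosen only for illustration.
numbers = [1, 2, 3, 4, 5, 6]

# Chaining: the array filters itself, then maps itself.
# `&:even?` turns the Integer#even? method into a block.
even_squares = numbers.select(&:even?).map { |n| n * n }

# A block as a closure: it captures the local variable `factor`.
factor = 10
scaled = numbers.map { |n| n * factor }

# A block as an iteration body: each_with_index yields to it
# once per element, passing the element and its position.
labels = []
numbers.each_with_index { |n, i| labels << "#{i}: #{n}" }
```

The same `{ ... }` syntax serves all three roles here, which is roughly what I mean by blocks unifying closures, anonymous functions, and iteration.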

But Python and Ruby aren’t so different, and I want a strong data community to work with. I don’t want to duplicate effort to bring solid libraries to Ruby, and Python seems like the way to go. I might even use this new language to dabble in machine learning using PyBrain and the Natural Language Toolkit, which both, well, rock my socks off. The potential for number crunching in Python seems endless.

Anyway, statistics knowledge is in demand and probably will be in the future, so I’m happy to become competent with these tools. Maybe someday, that will be viable on the Ruby platform—and I look forward to that day. For now, Python is going into my toolbelt. And I welcome the challenge that implies.