May. 15th, 2008 07:55 pm ActiveRecord Primitive Iterator: So You Don't Have to Select All Those Rows at Once Recently I needed to do a BFM (Big Freakin' Migration) that would normalize 1.5 million rows of data into about 5 millions rows in 3 tables. Reading and Writing ActiveRecord objects for them all was out of the question; I needed to read bare SQL for memory and emit optimized MySQL inserts for speed.



I worked out a clean aggregation/mapping method to generate large optimized insert statements, but the select was killing me. ActiveRecord wants to load all your objects in one go. Not good. Even the bare sql methods, select_all and select_rows, create an array, load the entire query, and then return the array. Seriously not good.



Well, to make a long story short, I monkeypatched ActiveRecord's MysqlAdapter. I was rather pleased with myself¹, and I promptly forgot about it because it was a grody hack that required great care from the user to keep it from leaking memory. In short, it was the poster child for Why Not To Monkeypatch: it was equal parts awesome and donteverdothatagain. I had abused Ruby's flexibility in order to take the safety guards off of Rails, make a few key cuts, and then put the safety guards back on again. But it wasn't code you'd ever want to see in production. I patted myself on the back, scolded myself for being a naughty boy, and then forgot all about it.



But then it started bugging me. There was a really good idea in that hack. They say information wants to be free. Well, this code wanted to be cleaned up and allowed out of the basement. And then Flying Machine Studios blogged about a similar problem. And in the comments to that, I saw that Jamis Buck had blogged about it a year ago. So clearly there was a need for this kind of solution. Both of their solutions involved chunking things up using SQL LIMIT statements, which can become slow with large queries--and large queries are the whole reason we went down this rabbit hole in the first place.



My hack is pretty simple: I just made select_all and select_rows take a block argument. If a block is given, ActiveRecord runs the query and yields each row to the block instead of accumulating an array of results to return. An ensure block guarantees that result.free gets called (fixing the potential memory leak caused if you break out of the block or raise an exception).



I have cleaned up the code a bit. I have fixed the memory leak. I have made an honest patch out of the monkeypatch using alias_method_chain. There's a lot left to do still, but this is worky enough for my needs and I figured I'd put it out there in case someone else wanted to take a stab at it.



Heres' the essential bit of the code (select_all is nearly identical)²:





Usage is simple and clean:





So, what's next? Well, there's still a lot to be desired here. It needs to be cleaned up, possibly turned into a plugin. It's currently MySQL only even though there's really nothing preventing it from being generalized. It's not really idiomatic; what I'd really like to see is select_all().each and select_all().map work with or without an internal fetch. If that can be made to work cleanly, I'd love to see this generalized into the find methods themselves in Rails.



And suddenly I have all these ambitions for this code, and I realize I'm getting tired just thinking about it. I wrote a hack, and it works for me. I think there's a good idea in it, and people might be able to run with it.



Want to? Take it. Run with it. Tell me what you did with it. (Even if it's print it out so you can enjoy burning such a heathen monkeypatch.)



mysql_select_with_block.rb



¹ I should clarify that I think most monkeypatches are poo. That's right, my tweet basically says I am a god of poo.



² Apologies for the screenshots instead of text pastes. I just really like syntax coloring. Anybody know of a good tool to colorize syntax for web pasting? It would rock if pastie.org would emit a css-free copy of a paste.... Current Mood: thoughtful

Current Music: Rock You Like A Hurricane - Scorpions

9 comments - Leave a comment