One Small, Arbitrary Change and It's a Whole New World

I want to take one item from "Things to Optimize Besides Speed and Memory" and run with it: optimizing the number of disk sector writes.

This isn't based on performance issues or the limited number of write cycles for solid-state drives. It's an arbitrary thought experiment. Imagine we've got a system where disk writes (but not reads!) are on par with early 1980s floppy disks, one where writing four kilobytes of data takes a full second. How does looking through the artificial lens of disk writes being horribly slow change the design perceptions of a modern computer (an OS X-based laptop in this case, because that's what I'm using)?

Poking around a bit, there's a lot more behind-the-scenes writing of preferences and so on than I expected. Even old-school tools like bash and vim save histories by default. Perhaps surprisingly, the innocuous less text-file viewer writes out a history file every time it's run.

There are system-wide log files for recording errors and exceptional occurrences, but they're used for more than that. The Safari web browser logs a message when the address/search bar is used and every time a web page is loaded. Some parts of the OS are downright chatty, recording copyright messages and every step of the initialization process for posterity. There's an entire megabyte of daemon shutdown details logged every time the system is turned off. Given the "4K = 1 second" rule, that's over four minutes right there.
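The arithmetic behind that figure can be sketched in a few lines of Python. The function name and constants here are made up for illustration; the only thing taken from the thought experiment is the "4K = 1 second" rule, plus the assumption that a partially filled sector costs as much as a full one.

```python
import math

SECTOR_BYTES = 4 * 1024      # assumed sector size from the "4K = 1 second" rule
SECONDS_PER_SECTOR = 1.0     # the imagined floppy-era write cost

def write_cost_seconds(num_bytes):
    """Estimate how long writing num_bytes takes under the thought experiment.

    Assumes a partial sector costs the same as a full one.
    """
    sectors = math.ceil(num_bytes / SECTOR_BYTES)
    return sectors * SECONDS_PER_SECTOR

# One megabyte of shutdown logging: 256 sectors, so 256 seconds.
print(write_cost_seconds(1024 * 1024))  # 256.0
```

That's 4 minutes and 16 seconds, which is where "over four minutes" comes from.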

The basic philosophy of writing data files needs a rethink. If a file is identical to what's already on disk, don't write it. Yes, that implies doing a read and compare, but those aren't on our performance radar. Here's an interesting case: what if you change the last line of a 100K text file? Most of the file is the same, so we can get by with writing a single 4K sector instead of paying the 25-second penalty for blindly re-saving the whole thing.
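Here's a minimal sketch of that read-compare-write idea in Python. The name write_changed_sectors and the SECTOR constant are hypothetical, and the code assumes that 4K chunks of a file line up with 4K sectors on disk, which a real filesystem doesn't promise. It's an illustration of the principle, not how any actual editor saves files.

```python
SECTOR = 4096  # assumed sector size; real filesystems may lay data out differently

def write_changed_sectors(path, new_data):
    """Update the file at path to hold new_data, writing only changed 4K chunks.

    Returns the number of chunks actually written. Reads are treated as free,
    as in the thought experiment.
    """
    try:
        with open(path, "rb") as f:
            old_data = f.read()
        exists = True
    except FileNotFoundError:
        old_data = b""
        exists = False

    # Identical to what's already on disk: don't touch the file at all,
    # which also leaves its modification date alone.
    if exists and old_data == new_data:
        return 0

    written = 0
    # "r+b" updates in place without truncating; "w+b" creates a new file.
    with open(path, "r+b" if exists else "w+b") as f:
        for offset in range(0, len(new_data), SECTOR):
            new_chunk = new_data[offset:offset + SECTOR]
            old_chunk = old_data[offset:offset + SECTOR]
            if new_chunk != old_chunk:
                f.seek(offset)
                f.write(new_chunk)
                written += 1
        # Drop any leftover tail if the new contents are shorter.
        f.truncate(len(new_data))
    return written
```

For a 100K file where an edit to the last line keeps the length the same, this should report one chunk written rather than 25. The catch is that inserting or deleting bytes near the start shifts everything after it, so every later chunk compares as different and gets rewritten anyway.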

All of this is minor compared to what goes on in a typical development environment. Compiling a single file results in a potentially bulky object file. Some compilers write out listings that get fed to an assembler. The final executable gets written to disk as well. Can we avoid all intermediate files completely? Can the executable go straight to memory instead of being saved to disk, run once for testing, then deleted in the next build?

Wait, hold on. We started with the simple idea of avoiding disk writes and now we're rearchitecting development environments?

Even though it was an off-the-cuff restriction, it helped uncover some inefficiencies and unnecessary complexity. Exactly why does less need to maintain a history file? It's not a performance issue, but it took code to implement, words to document, and it raises a privacy concern as well. Not writing over a file with identical data is a good thing. It leaves the modification date alone, which makes it easy to sort by date and see which files are actually different.

Even that brief wondering about development systems brings up some good questions about whether the design status quo for compilers of forty years ago is the best match for the unimaginable capabilities of the average portable computer in 2012.

All because we made one small, arbitrary change to our thinking.


August 18, 2012
