Why Structure and Interpretation of Computer Programs matters

Brian Harvey

University of California, Berkeley

In 2011, to celebrate the 150th anniversary of MIT, the Boston Globe made a list of the most important innovations developed there. They asked me to explain the importance of SICP, and this is what I sent them:

SICP was revolutionary in many different ways. Most importantly, it dramatically raised the bar for the intellectual content of introductory computer science. Before SICP, the first CS course was almost always entirely filled with learning the details of some programming language. SICP is about standing back from the details to learn big-picture ways to think about the programming process. It focused attention on the central idea of abstraction -- finding general patterns from specific problems and building software tools that embody each pattern. It made heavy use of the idea of functions as data, an idea that's hard to learn initially, but immensely powerful once learned. (This is the same idea, in a different form, that makes freshman calculus so notoriously hard even for students who've done well in earlier math classes.) It fit into the first CS course three different programming paradigms (functional, object oriented, and declarative), when most other courses didn't even really discuss even one paradigm.

Another revolution was the choice of Scheme as the programming language. To this day, most introductions to computer science use whatever is the "hot" language of the moment: from Pascal to C to C++ to Java to Python. Scheme has never been widely used in industry, but it's the perfect language for an introduction to CS. For one thing, it has a very simple, uniform notation for everything. Other languages have one notation for variable assignment, another notation for conditional execution, two or three more for looping, and yet another for function calls. Courses that teach those languages spend at least half their time just on learning the notation. In my SICP-based course at Berkeley, we spend the first hour on notation and that's all we need; for the rest of the semester we're learning ideas, not syntax. Also, despite (or because of) its simplicity, Scheme is a very versatile language, making it possible for us to examine those three programming paradigms and, in particular, letting us see how object oriented programming is implemented, so OOP languages don't seem like magic to our students. Scheme is a dialect of Lisp, so it's great at handling functions as data, but it's a stripped-down version compared to the ones more commonly used for professional programming, with a minimum of bells and whistles. It was very brave of Abelson and Sussman to teach their introductory course in the best possible language for teaching, paying no attention to complaints that all the jobs were in some other language. Once you learned the big ideas, they thought, and this is my experience also, learning another programming language isn't a big deal; it's a chore for a weekend. I tell my students, "the language in which you'll spend most of your working life hasn't been invented yet, so we can't teach it to you. Instead we have to give you the skills you need to learn new languages as they appear."

Finally, SICP was firmly optimistic about what a college freshman can be expected to accomplish. SICP students write interpreters for programming languages, ordinarily considered more appropriate for juniors or seniors. The text itself isn't easy reading; it has none of the sidebars and colored boxes and interesting pictures that typify the modern textbook aimed at students with low attention spans. There are no redundant exercises; each exercise teaches an important new idea. It uses big words. But it repays a close reading; every sentence matters.

Statistically, SICP-based courses have been a small minority. But the book has had an influence beyond that minority. It inspired a number of later textbooks whose authors consciously tried to live up to SICP's standard. The use of Scheme as a language for learners has been extended by others over a range from middle school to graduate school. Even the more mainstream courses have become sensitive to the idea of programming paradigms, although most of them concentrate on object oriented programming. The idea that computer science should be about ideas, not entirely about programming practice, has since widened to include non-technical ideas about the context and social implications of computing.

SICP itself has had a longevity that's very unusual for introductory CS textbooks. Usually, a book lasts only as long as the language fad to which it is attached. SICP has been going strong for over 25 years and shows no sign of going out of print. Computing has changed enormously over that time, from giant mainframe computers to personal computers to the Internet on cell phones. And yet the big ideas behind these changes remain the same, and they are well captured by SICP.

I've been teaching a SICP-based course since 1987. The course has changed incrementally over that time; we've added sections on parallelism, concurrency control, user interface design, and the client/server paradigm. But it's still essentially the same course. Every five years or so, someone on the faculty suggests that our first course should use language X instead; each time, I say "when someone writes the best computer science book in the world using language X, that'll be fine" and so far the faculty have always voted to stay with the SICP course. We'll find out pretty soon whether the course can survive my own retirement.

(Footnote: Nope. Berkeley's new first course for majors uses Python, with lecture notes that try to keep the ideas (and some of the text) of SICP.)

The discussion has been sharper recently because MIT underwent a major redesign of their lower division EECS curriculum. People outside MIT tend to summarize that redesign as "MIT decided to switch to Python," but that's not a perceptive description. What MIT decided was to move from a curriculum organized around topics (programming paradigms, then circuits, then signal processing, then architecture) to a curriculum organized around applications (let's build and program a robot; let's build and program a cell phone). Everything about their courses had to be reorganized; the choice of programming language was the least of those decisions. Their new approach is harder to teach; for one thing, each course requires a partnership of Electrical Engineering faculty and Computer Science faculty. Perhaps in time the applications-first approach will spark a revolution as profound as the one that followed SICP, but it hasn't happened yet.

In my experience, relatively few students appreciate how much they're learning in my course while they're in it. But in surveys of all our CS students, it turns out to be among the most popular courses in retrospect, and I regularly get visits and emails from long-gone students to tell me about how they're using in their work ideas that they thought were impractical ivory-tower notions as students. The invention of the MapReduce software for data parallelism at Google, based on functional programming ideas, has helped eliminate that ivory-tower reputation.