From 100 Students to 100,000

Typically I teach around 100 students per year in my introductory database course. This past fall my enrollment was a whopping 60,000. Admittedly, only 25,000 of them chose to submit assignments, and a mere 6500 achieved a strong final score. But even with 6500 students, I more than quadrupled the total number of students I’ve taught in my entire 18-year academic career.

The story begins a couple of years earlier, when Stanford computer science faculty started thinking about shaking up the way we teach. We were tired of delivering the same lectures year after year, often to a half-empty classroom because our classes were being videotaped. (The primary purpose of the videotaping is for Stanford’s Center for Professional Development, but the biggest effect is that many Stanford students skip lectures and watch them later online.) Why not “purpose-build” better videos: shorter, topic-specific segments, punctuated with in-video quizzes to let watchers check their understanding? Then class time could be made more enticing for students and instructor alike, with interactive activities, advanced or exotic topics, and guest speakers. This “flipped classroom” idea was evangelized in the Stanford C.S. department by Daphne Koller; I was one of the early adopters, creating my videos during the first few months of 2011. Recording was a low-tech affair, involving a computer, Cintiq tablet, cheap webcam and microphone, Camtasia software, and a teaching assistant to help with editing.

I put my videos online for the public, and soon realized that with a little extra work, I could make available what amounted to an entire course. With further help from the teaching assistant, I added slides (annotated as lecture notes, and unannotated for teaching use by others), demo scripts, pointers to textbook readings and other course materials, a comprehensive suite of written and programming exercises, and quick-guides for relevant software. The site got a reasonable amount of traffic, but the turning point came when my colleague Sebastian Thrun decided to open up his fall 2011 introductory artificial intelligence course to the world. After one email announcement promising a free online version of the Stanford AI course, including automatically-graded weekly assignments and a “statement of accomplishment” upon completion, Sebastian’s public course garnered tens of thousands of sign-ups within a week.

Having already prepared lots of materials, I jumped on the free-to-the-world bandwagon, as did my colleague Andrew Ng with his machine learning course. What transpired over the next ten weeks was one of the most rewarding things I’ve done in my life. The sign-ups poured in, and soon the “Q&A Forum” was buzzing with activity. The fact that I had a lot of materials ready before the course started turned out to be a bit deceptive—for ten weeks I worked nearly full-time on the course (never mind my other job as department chair, much less my research program), in part because there was a lot to do, but mostly because there was a lot I could do to make it even better, and I was having a grand time.

In addition to the video lectures, in-video quizzes, course materials, and self-guided exercises, I added two very popular components: quizzes that generate different combinations of correct and incorrect answers each time they’re launched (using technology pioneered a decade ago by my colleague Jeff Ullman in his Gradiance system), and interactive workbenches for topics ranging from XML DTD validation to view-update triggers. I offered midterm and final exams—multiple-choice, and crafted carefully so the problems weren’t solvable by running queries or checking Wikipedia. (Creating these exams, at just the right level, turned out to be one of the most challenging tasks of the entire endeavor.) To add a personal touch, and to amplify the strong sense of community that quickly welled up through the Q&A Forum, each week I posted a “screenside chat” video—modeled after Franklin D. Roosevelt’s fireside chats—covering topics ranging from logistical issues, to technical clarifications, to full-on cheerleading for those who were struggling.

Meanwhile back on the campus front, the Stanford students worked through exactly the same materials as the public students (except for the multiple-choice exams), but they did get something more for their money: hand-graded written problems with more depth than the automated exercises, a significant programming project, traditional written exams, and classroom activities ranging from interactive problem-solving to presentations by data architects at Facebook and Twitter. There’s no question that the Stanford students were satisfied: I’ve taught the course enough times to know that the uptick in my teaching ratings was statistically significant.

One interesting and surprisingly large effect of having 60,000 students is the need for absolute perfection: not one tiny flaw or ambiguity goes unnoticed. And when there’s a downright mistake, especially in, say, an exam question … well, I shudder to remember. The task of correcting small (and larger) errors and ambiguities in videos, quizzes, exercises, and other materials, was a continuing chore, but certainly instructive.

What kept me most engaged throughout the course was the attitude of the public students, conveyed primarily through emails and posts on the Q&A Forum. They were unabashedly, genuinely, deeply appreciative. Many said the course was a gift they could scarcely believe had come their way. As the course came to a close, several students admitted to shedding tears. One posted a heartfelt poem. A particularly noteworthy student named Amy became an absolute folk hero: Over the duration of the course Amy answered almost 900 posted questions. Regardless of whether the questions were silly or naive, complex or deep, her answers were patient, correct, of just the right length, included examples as appropriate, and were crafted in perfect English. Amy never revealed anything about herself (although she agreed to visit me after the course was over), despite hundreds of adoring public thank-you’s from her classmates, and one marriage proposal!

So who were these thousands and thousands of students? I ran a survey that revealed some interesting statistics. For example, although ages and occupations spanned the gamut, the largest contingent of students were software professionals wanting to sharpen their job skills. Many students commented that they’d been programming with databases for years without really knowing what they were doing. Males outnumbered females four to one, which is actually a little better than the ratio among U.S. college computer science majors. Students hailed from 130 countries; the U.S. had the highest number by a wide margin, followed by India and Russia. (China unfortunately blocked some of the content, although a few enterprising students helped each other out with workarounds.) On db-class.org you can find the full survey results via the FAQ page, as well as some participation and performance statistics.

Were there any negatives to the experience? Naturally there were a few complainers. For example, in my screenside chats I often referred to the “eager beavers” who were working well ahead of the schedule, and the “procrastinators” who were barely meeting deadlines. Most students enjoyed self-identifying into the categories (some eager-beavers even planned to make T-shirts), but a few procrastinators objected to the term, pointing out that they were squeezing the course between a full-time job or two and significant family obligations. A number of students were disappointed by the low-tech, non-Stanford-endorsed “statement of accomplishment” they received at the end; despite ample warnings from the start, apparently some students were still expecting official certification. I can’t help but wonder if some of those students were the same ones who cheated; I did appear to have quite a number of secondary accounts created expressly for achieving a perfect score. I made it clear from the start that I was assuming students were in it to learn, and cheating was not something I planned to prevent or even think about. Of course in the long run of online education, the interrelated topics of certification and cheating will need to be addressed.

So what happens next? Stanford is launching quite a few more courses in the same style, and I’ll offer mine again next fall. MIT has jumped on the bandwagon; other universities can’t be far behind. Independent enterprises such as the pioneering Khan Academy, and the recently-announced Udacity, are sure to play into the scene. There’s no doubt we’re at a major inflection point in higher education, both on campus and through internet distribution to the world. I’m thrilled to have been an early part of it.

Meanwhile here are a few more numbers: A few months after the initial launch we now have over 100,000 accounts, and we’ve accumulated millions of video views. Even with the course in a self-serve dormant state, each day there are a couple of thousand video views and around 100 assignments submitted for automated grading. All to learn about databases! Wow. Check it out at db-class.org.

Blogger’s Profile:

Jennifer Widom is the Fletcher Jones Professor and (currently) Chair of the Computer Science Department at Stanford University. She was a Research Staff Member at the IBM Almaden Research Center before joining the Stanford faculty in 1993. Her research interests span many aspects of nontraditional data management. She is an ACM Fellow and a member of the National Academy of Engineering and the American Academy of Arts & Sciences; she received the ACM SIGMOD Edgar F. Codd Innovations Award in 2007.