6.851: Advanced Data Structures (Fall'17)

Prof. Erik Demaine

TAs: Adam Hesterberg, Jayson Lynch

LAs: Andrew He, Stef Ren

Data structures play a central role in modern computer science. You interact with data structures even more often than with algorithms (think Google, your mail server, and even your network routers). In addition, data structures are essential building blocks in obtaining efficient algorithms. This course covers major results and current research directions in data structures:

TIME TRAVEL: We can remember the past efficiently (a technique called persistence), but in general it's difficult to change the past and see the outcomes on the present (retroactivity). So alas, Back To The Future isn't really possible.

GEOMETRY: When data has more than one dimension (e.g. maps, database tables).

DYNAMIC OPTIMALITY: Is there one binary search tree that's as good as all others? We still don't know, but we're close.

MEMORY HIERARCHY: Real computers have multiple levels of caches. We can optimize the number of cache misses, often without even knowing the size of the cache.

HASHING: Hashing is the most used data structure in computer science. And it's still an active area of research.

INTEGERS: Logarithmic time is too easy. By careful analysis of the information you're dealing with, you can often reduce the operation times substantially, sometimes even to constant. We will also cover lower bounds that illustrate when this is not possible.

DYNAMIC GRAPHS: A network link went down, or you just added or deleted a friend in a social network. We can still maintain essential information about the connectivity as it changes.

STRINGS: Searching for phrases in giant text (think Google or DNA).

SUCCINCT: Most “linear size” data structures you know are much larger than they need to be, often by an order of magnitude. Some data structures require almost no space beyond the raw data but are still fast (think heaps, but much cooler).

Inverted Lectures

This year, we're experimenting with inverted lectures: most material is covered in video lectures recorded in 2012 (already watched by over 100,000 people), which you can conveniently play at faster speed than real time. In-class time will be divided between answers to questions, new material presented by the professor and/or guest lecturers, and in-class problem solving, with a focus on problem solving. Particularly unusual is that the problems we'll solve in groups will include problem-set style problems with known solutions, coding problems, and open research problems that no one knows the answer to, with the goal of publishing papers about whatever we discover. (Past offerings of 6.851 have led to over a dozen published papers.) You can work on whatever type of problem most interests you. To facilitate collaboration, we'll be using a new open-source software platform called Coauthor, along with GitHub for (optional) coding.

Specifics

Lecture time: Wednesdays 7:00–9:30pm

First lecture: Wednesday, September 6, 2017

Lecture room: 32-082 (except Sept. 27 in 32-155)

Units: 3-0-9, G-level & TCS AAGS credit

Contact: Email 6851-staff#at#csail.mit.edu

Prerequisites: 6.046 (Design and Analysis of Algorithms), or an equivalently thorough undergraduate algorithms class from another school (e.g., covering much of CLRS).

Recommended Reading

Data Structures and Network Algorithms by Robert E. Tarjan (covers BSTs, splay trees, link-cut trees)

Open Data Structures by Pat Morin (covers BSTs, B-trees, hashing, and some integer data structures)

Participating

If you are interested in attending the class, for credit or as a listener, please do the following:

Join the 6851-students mailing list.

Sign up for an account on Coauthor.

Fill out this signup form.

Grading

Watching lecture videos. Be sure to fill out the linked feedback forms.

Attending in-person classes (except when you have a valid excuse, which you must report to the course staff). In particular, you must participate weekly by posting or being @mentioned in a post on Coauthor.

Lightweight (weekly, one-page) problem sets. Problem sets will be posted weekly, and follow a “one-page in, one-page out” rule.

Revise existing scribe notes for one lecture (or maybe two), according to your own inspiration for improvement and feedback from fellow students. The entire calendar for the course has been posted, so you can select a lecture that interests you. We will circulate a sign-up sheet during the second week. Scribe notes are generally due noon on Tuesday after the corresponding class (but extensions are possible).

Research-oriented final project (paper and presentation). We allow theoretical, experimental, survey, and Wikipedia final projects.

Past and Future

The class is offered once every two years or so. It was given in Spring 2003 and Spring 2005 as 6.897, and in Spring 2007, Spring 2010, Spring 2012, and Spring 2014 as 6.851.