Abstract:

The number of programming tutorial videos on the web increases daily. Video hosting sites such as YouTube host millions of video lectures, including many programming tutorials for various languages and platforms. Automatically understanding the content of such videos is desirable for many purposes, including search, ad targeting, and referrals to semantically related content. We present a novel approach for extracting code from videos. Our technique extracts and recognizes code directly from the video and is based on two ideas: (i) consolidating code across frames to improve the precision of extraction, and (ii) combining statistical language models to apply corrections at different levels. These models let us choose the most likely token, the combination of tokens that forms a likely line structure, and the combination of lines that leads to a likely code fragment in the language. We have implemented our approach in a tool called ACE and used it to extract code from 40 Android video tutorials on YouTube. Our evaluation shows that ACE extracts code with high precision, enabling deep indexing of video tutorials.
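
To make the two ideas concrete, the following is a minimal illustrative sketch and not the ACE implementation. It assumes that the same code line has already been read by OCR from several frames, and it substitutes simple stand-ins for the paper's models: a majority vote for cross-frame consolidation and a frequency-weighted string similarity for token-level correction. The function names `consolidate_line` and `correct_token`, the `min_similarity` threshold, and the token counts are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def consolidate_line(readings):
    """Pick the reading of a line that occurs most often across frames.

    Assumption: a plain majority vote stands in for cross-frame consolidation.
    """
    return Counter(readings).most_common(1)[0][0]


def correct_token(token, token_counts, min_similarity=0.6):
    """Replace an unknown token with the most likely known token.

    Assumption: likelihood is approximated as string similarity multiplied by
    the candidate's relative frequency (a unigram model), which is only a
    stand-in for a real statistical language model.
    """
    if token in token_counts:
        return token  # already a known token, nothing to correct

    total = sum(token_counts.values())
    best, best_score = token, 0.0
    for candidate, count in token_counts.items():
        similarity = SequenceMatcher(None, token, candidate).ratio()
        if similarity < min_similarity:
            continue  # too different to be a plausible OCR misreading
        score = similarity * (count / total)
        if score > best_score:
            best, best_score = candidate, score
    return best


# Hypothetical OCR readings of one line taken from three frames.
readings = [
    "pub1ic void onCreate()",
    "public void onCreate()",
    "public void onCreate()",
]
# Hypothetical token frequencies, as if gathered from a code corpus.
token_counts = {"public": 50, "void": 40, "onCreate": 10, "Bundle": 10}

consolidated = consolidate_line(readings)
corrected = correct_token("pub1ic", token_counts)
```

In this sketch, consolidation alone recovers the line when most frames are read correctly, while token correction handles a misreading such as `pub1ic` that persists in a single reading. Line-level and fragment-level corrections, which the abstract also mentions, are not covered here.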