Google Books Data Mining Reveals Mad Men's Big Historical Flaw: Business Lingo

from the keep-a-low-profile dept

The TV show Mad Men has quite a reputation for going to great lengths to be as authentic as possible. The clothes, the props, the scenarios are all supposedly thought out in great detail. While some who were actually in the business at the time quibble with certain aspects of the show, it cannot be denied that the show's producers certainly go way beyond other period pieces to try to keep everything accurate for the time period. However, it turns out that there's one area where it appears the writers have completely flopped: period-specific language. A podcast ran an absolutely fascinating clip about a researcher who has shown how frequently Mad Men uses words or phrases that were not in popular usage at the time, but only came into the lexicon at a later date.

This is actually a cross-broadcast of another podcast, Lexicon Valley, and it's covering the work of Ben Schmidt, who has produced a software algorithm that compares the show's scripts... to a searchable database of language from Google's book scanning project. Schmidt's algorithm compares the language from the show with scanned books from the same period. Schmidt has a website, Prochronism, which covers his findings. I can't quite explain why, but it's really quite fascinating.

Schmidt has found that the show is pretty good about getting language about technology right (with one exception). It knows that there aren't fax machines and computers and stuff. The one area where it gets things wrong is with phones. For example, using the phrase "on hold." He notes that phones had hold buttons, but there wasn't yet a concept of the state of being "on hold." That showed up in the 70s.

What Schmidt has also found is that the show is notably bad at getting "business" terms correct in a period-specific way.
That same post about "on hold" also chides the show for using "defining moment," another phrase that showed up in the 70s, but was basically stuck in academia until the late 80s or early 90s when it became a popular phrase.

Honestly, Ben's site is really fascinating. I could spend hours on it (and actually had to stop going through it post by post to finish this post). There are also discussions on phrases like "focus groups" and "leverage." Ben also has one more awesome chart, covering the use of both "moral high ground" and "consumerism," both of which were barely in use until much later.

On the podcast, they discuss how part of the reason that the show gets the language about technology right, but not business, is because we know that technology rapidly evolves and we're more attuned to it. But people don't pay nearly as much attention to how business changes and especially how the language of business changes over time. I guess that's true, though it doesn't surprise me that "consumerism" and "moral high ground" are both more recent phenomena. "Defining moment" and "on hold" are a bit more surprising to me.

Either way, I also wanted to highlight something else about all of this that I find fascinating. For all the talk by some about just how evil Google's book scanning project is, this kind of effort and research would not be possible without large scale scanning of books. While this particular example may appear (on its face) to be a frivolous (even if it's fascinating) area of research, it does highlight just how collecting certain data can open up vast arrays of data that can be mined in useful ways. When people freak out about new technologies and services, they almost always focus on how it impacts the old offerings. So most of the talk was about book scanning and its impact on book sales. But what almost no one talks about is how it enables new things that simply weren't possible before -- such as being able to build an algorithm like the one Ben built.
Those kinds of innovations -- the unexpected "externalities" of projects like the Google book scanning project -- shouldn't be ignored, because there's tremendous value that can come out of them.
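
For anyone curious what that kind of comparison might look like in practice, here is a rough sketch in Python. To be clear, this is not Schmidt's actual code, which isn't shown here. It just assumes you have two phrase-frequency tables, one built from books of the show's period and one from later books, and it flags any two-word phrase in a line of dialogue that is far more common in the later table. The function names, the ratio threshold, and the frequency numbers are all made up for illustration; none of them are real Google Books figures.

```python
import re


def ngrams(text, n=2):
    """Return the lowercase word n-grams found in a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def flag_anachronisms(script, period_freq, modern_freq, min_ratio=10.0, floor=1e-9):
    """Flag phrases that are far more common in the modern table.

    period_freq and modern_freq map phrases to relative frequencies.
    A phrase is flagged when modern / period is at least min_ratio.
    """
    flagged = {}
    for phrase in set(ngrams(script)):
        modern = modern_freq.get(phrase)
        if modern is None:
            continue  # phrase is not tracked in the toy table, so skip it
        period = period_freq.get(phrase, 0.0)
        ratio = modern / max(period, floor)  # floor avoids division by zero
        if ratio >= min_ratio:
            flagged[phrase] = ratio
    # Most suspicious phrases (largest ratio) first.
    return sorted(flagged, key=flagged.get, reverse=True)


# Invented frequencies for illustration only (NOT real Google Books data).
PERIOD_1960S = {"on hold": 0.01, "hold button": 0.2, "the phone": 30.0}
MODERN = {"on hold": 2.0, "hold button": 0.3, "the phone": 45.0}

line = "I have him on hold, so press the hold button and pick up the phone."
print(flag_anachronisms(line, PERIOD_1960S, MODERN))
```

With these toy numbers, "on hold" gets flagged while "hold button" and "the phone" do not, which mirrors the distinction Schmidt draws: the button existed, the phrase for the state of waiting did not. A real version would need actual period and modern frequency data and a more careful way of picking the threshold.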

Filed Under: ben schmidt, data mining, google books, history, language, mad men, television