This year, for the first time in over two decades, a slew of work entered the public domain: everything first published in the United States in 1923, to be precise. (And yes, next year we’ll get the goods from 1924.) “But there’s another source of public domain works,” Cory Doctorow writes at Boing Boing.

“Until the 1976 Copyright Act, US works were not copyrighted unless they were registered, and then they quickly became public domain unless that registration was renewed. The problem has been to figure out which of these works were in the public domain, because the US Copyright Office’s records were not organized in a way that made it possible to easily cross-check a work with its registration and renewal.”

“This is how Project Gutenberg is able to publish all these science fiction stories from the 50s and 60s,” writes Leonard Richardson, who cooked up the bot Secretly Public Domain in response to this news. “Those stories were published in issues of magazines that didn’t send in the renewal form. But up [un]til now this hasn’t been a big factor, because 1) the big publishers generally made sure to send in their renewals, and 2) it’s been impossible to check renewal status in bulk.”

Impossible, that is, until the New York Public Library (NYPL) started a project to encode all of the registration records in XML, which makes them readable by machines. And now that they’ve done so, what have the machines told us? It turns out that an estimated 80% of the books published in the United States between 1924 and 1963 are actually (secretly!) in the public domain, because no one filed the renewal form. And for the first time, you can actually figure out which ones. Ah, bureaucracy: every reader’s best friend.

[via Boing Boing]