Who complains loudest about Google linking to infringing content in its search results? The movie and music industries, of course, who absolutely delight in taking whacks at the search engine. But thanks to a huge new trove of data released today by Google, we know that the worldwide top takedown requestor—by far—is actually Microsoft.

Anatomizing takedowns

If content owners claim that a Google search result links to infringing material, Google will remove the link. But just how many times does this happen—and who is making all the requests? Google today rolled out an upgrade to its "Transparency Report" that shows private copyright takedown information in addition to the usual government requests for user information or for censorship.

In the last month, Google's search engine has received requests to remove links to 1.13 million URLs hosted on 23,000 distinct domains. (Takedown requests to YouTube and other Google properties are not covered under the current data release.) I spoke to Fred von Lohmann, senior copyright counsel at Google, who said that the company does in fact remove 97 percent of the requested links after running them through both algorithmic and human review to catch mistakes or bad-faith notices. The average turnaround time for a takedown is 11 hours, which von Lohmann called "the best in the industry."

The data, which is updated constantly, provides an intriguing look behind the scenes at the takedown process. Though Google has long sent takedown requests to the Chilling Effects website for storage, they were difficult to mine for aggregate data. The new interface makes wider analysis possible.

The data shows that, since July 2011, 2.5 million takedown requests have been filed on behalf of Microsoft. NBC Universal, the next highest, made only 985,000. Third are the RIAA member labels with 417,000, and the numbers trail off quickly from there.

Google's data also shows that many of these takedown requests don't come directly from rightsholders, either; copyright enforcement companies with names like DtecNet and Takedown Piracy LLC, which make a business of providing such services, account for most of the actual requests.

As for why Google released the new takedown data, von Lohmann told me that the company wants to be more transparent, but it also wants to "make sure that policymakers have the data they need to make good decisions." Reading between the lines, we might suspect this to be a play for lawmakers, who keep hearing Google described as a rogue entity on issues like copyrights and patents. (Google hasn't helped itself by having some real problems, like the notorious "pharmacy ads" investigation for which the company coughed up $500 million to the US government.)

Von Lohmann also confirmed to me that Google doesn't accept takedowns uncritically. While most are granted, Google rejects some for not including the proper information and a small number on the grounds that the content is not actually infringing. This can be a gutsy call to make, since under US law, Google loses its "safe harbor" for those links and could be sued directly by the rightsholder for not removing them.

In a blog post announcing the new data, von Lohmann noted, "We recently rejected two requests from an organization representing a major entertainment company, asking us to remove a search result that linked to a major newspaper’s review of a TV show. The requests mistakenly claimed copyright violations of the show, even though there was no infringing content."

So good for Google, both for standing up (occasionally) to abusive takedown requests and for releasing the new data. Thanks to the new release we know that, while The Pirate Bay is widely demonized, the site receiving the greatest number of requests to be pulled from Google's index is actually filestube.com. And we know that requests pour in from all over the globe, including from unexpected sources like the Yale International Relations Association.