Something else came out of that consortium. I talked to Illien two days after the massacre in Paris at the offices of Charlie Hebdo. “We are overwhelmed, and scared, and even taking the subway is terrifying, and we are scared for our children,” Illien said. “The library is a target.” When we spoke, the suspects were still at large; hostages had been taken. Illien and his colleagues had started a Web archive about the massacre and the world’s response. “Right now the media is full of it, but we know that most of that won’t last,” he said. “We wrote to our colleagues around the world and asked them to send us feeds to these URLs, to Web sites that were happening, right now, in Paris, so that we could collect them and historians will one day be able to see.” He was very quiet. He said, “When something like that happens, you wonder what you can do from where you sit. Our job is memory.”

The plan to found a global Internet archive proved unworkable, partly because national laws relating to legal deposit, copyright, and privacy are impossible to reconcile, but also because Europeans tend to be suspicious of American organizations based in Silicon Valley ingesting their cultural inheritance. Illien told me that, when faced with Kahle’s proposal, “national libraries decided they could not rely on a third party,” even a nonprofit, “for such a fundamental heritage and preservation mission.” In this same spirit, and in response to Google Books, European libraries and museums collaborated to launch Europeana, a digital library, in 2008. The Googleplex, Google’s headquarters, is thirty-eight miles away from the Internet Archive, but the two could hardly be more different. In 2009, after the Authors Guild and the Association of American Publishers sued Google Books for copyright infringement, Kahle opposed the proposed settlement, charging Google with effectively attempting to privatize the public-library system. In 2010, he was on the founding steering committee of the Digital Public Library of America, which is something of an American version of Europeana; its mission is to make what’s in libraries, archives, and museums “freely available to the world . . . in the face of increasingly restrictive digital options.”

Kahle is a digital utopian attempting to stave off a digital dystopia. He views the Web as a giant library, and doesn’t think it ought to belong to a corporation, or that anyone should have to go through a portal owned by a corporation in order to read it. “We are building a library that is us,” he says, “and it is ours.”

When the Internet Archive bought the church, Kahle recalls, “we had the idea that we’d convert it into a library, but what does a library look like anymore? So we’ve been settling in, and figuring that out.”

From the lobby, we headed up a flight of yellow-carpeted stairs to the chapel, an enormous dome-ceilinged room filled with rows of oak pews. There are arched stained-glass windows, and the dome is a stained-glass window, too, open to the sky, like an eye of God. The chapel seats seven hundred people. The floor is sloped. “At first, we thought we’d flatten the floor and pull up the pews,” Kahle said, as he gestured around the room. “But we couldn’t. They’re just too beautiful.”

On the wall on either side of the altar, wooden slates display what, when this was a church, had been the listing of the day’s hymn numbers. The archivists of the Internet have changed those numbers. One hymn number was 314. “Do you know what that is?” Kahle asked. It was a test, and something of a trick question, like when someone asks you what’s your favorite B track on the White Album. “Pi,” I said, dutifully, or its first three digits, anyway. Another number was 42. Kahle gave me an inquiring look. I rolled my eyes. Seriously? But it is serious, in a way. It’s hard not to worry that the Wayback Machine will end up like the computer in Douglas Adams’s “Hitchhiker’s Guide to the Galaxy,” which is asked what is the meaning of “life, the universe, and everything,” and, after thinking for millions of years, says, “Forty-two.” If the Internet can be archived, will it ever have anything to tell us? Honestly, isn’t most of the Web trash? And, if everything’s saved, won’t there be too much of it for anyone to make sense of any of it? Won’t it be useless?

The Wayback Machine is humongous, and getting humongouser. You can’t search it the way you can search the Web, because it’s too big and what’s in there isn’t sorted, or indexed, or catalogued in any of the many ways in which a paper archive is organized; it’s not ordered in any way at all, except by URL and by date. To use it, all you can do is type in a URL, and choose the date for it that you’d like to look at. It’s more like a phone book than like an archive. Also, it’s riddled with errors. One kind is created when the dead Web grabs content from the live Web, sometimes because Web archives often crawl different parts of the same page at different times: text in one year, photographs in another. In October, 2012, if you asked the Wayback Machine to show you what cnn.com looked like on September 3, 2008, it would have shown you a page featuring stories about the 2008 McCain-Obama Presidential race, but the advertisement alongside it would have been for the 2012 Romney-Obama debate. Another problem is that there is no equivalent to what, in a physical archive, is a perfect provenance. Last July, when the computer scientist Michael Nelson tweeted the archived screenshots of Strelkov’s page, a man in St. Petersburg tweeted back, “Yep. Perfect tool to produce ‘evidence’ of any kind.” Kahle is careful on this point. When asked to authenticate a screenshot, he says, “We can say, ‘This is what we know. This is what our records say. This is how we received this information, from which apparent Web site, at this IP address.’ But to actually say that this happened in the past is something that we can’t say, in an ontological way.” Nevertheless, screenshots from Web archives have held up in court, repeatedly. And, as Kahle points out, “They turn out to be much more trustworthy than most of what people try to base court decisions on.”

You can do something more like keyword searching in smaller subject collections, but nothing like Google searching (there is no relevance ranking, for instance), because the tools for doing anything meaningful with Web archives are years behind the tools for creating those archives. Doing research in a paper archive is to doing research in a Web archive as going to a fish market is to being thrown in the middle of an ocean; the only thing they have in common is that both involve fish.

The Web archivists at the British Library had the brilliant idea of bringing in a team of historians to see what they could do with the U.K. Web Archive; it wasn’t all that much, but it was helpful to see what they tried to do, and why it didn’t work. Gareth Millward, a young scholar interested in the history of disability, wanted to trace the history of the Royal National Institute for the Blind. It turned out that the institute had endorsed a talking watch, and its name appeared in every advertisement for the watch. “This one advert appears thousands of times in the database,” Millward told me. It cluttered and bogged down nearly everything he attempted. Last year, the Internet Archive made an archive of its .gov domain, tidied up and compressed the data, and made it available to a group of scholars, who tried very hard to make something of the material. It was so difficult to recruit scholars to use the data that the project was mostly a wash. Kahle says, “I give it a B.” Stanford’s Web archivist, Nicholas Taylor, thinks it’s a chicken-and-egg problem. “We don’t know what tools to build, because no research has been done, but the research hasn’t been done because we haven’t built any tools.”