What information is being stored?

Early last month, even while he was finalizing his discussions with Edward Snowden, The Guardian's Glenn Greenwald reported on a conversation between Tim Clemente, a former FBI agent, and CNN host Carol Costello. In the interview, about the Boston Marathon investigation, Clemente claims that "all digital communications are — there's a way to look at digital communications in the past." Costello refers to a previous appearance in which Clemente claimed the government could access phone calls, even "exactly what was said in that conversation."

This is an important claim for two reasons. The first is that Clemente, who also served on the FBI's Joint Terrorism Task Force, suggests a massive breadth of information collection. The second is that he doesn't say who is actually collecting the data, which we'll come back to.

Clemente indicates that entire phone calls are being recorded and stored, which is a far stronger claim than that Verizon is sharing metadata with the government, both from a privacy standpoint and for our calculations. Metadata on a call (the number from which it originated, the number it was placed to, duration, location information) is tiny. Perhaps a few hundred bytes could contain all of it. But the audio of a call is much larger, and it keeps growing for as long as the call goes on, running into multiple megabytes. Same thing with email: a text email message is small; embed a photo, and it gets much bigger; embed a video, and it gets much, much, much bigger.
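To make the gap between metadata and recorded audio concrete, here is a rough back-of-the-envelope sketch. The 300-byte metadata record stands in for "a few hundred bytes," and the 64 kbit/s bitrate (the rate of standard G.711 telephone audio) is an assumption for illustration; nothing here reflects how any carrier or agency actually stores calls.

```python
# Rough comparison of call metadata versus recorded call audio.
# Both constants are illustrative assumptions, not reported figures.
METADATA_BYTES = 300        # assumption: "a few hundred bytes" per call record
AUDIO_BITRATE_BPS = 64_000  # assumption: 64 kbit/s, standard G.711 telephone audio


def call_audio_bytes(duration_seconds: float, bitrate_bps: int = AUDIO_BITRATE_BPS) -> float:
    """Return the approximate size in bytes of a recorded call."""
    return duration_seconds * bitrate_bps / 8  # 8 bits per byte


for minutes in (1, 5, 30):
    audio = call_audio_bytes(minutes * 60)
    ratio = audio / METADATA_BYTES
    print(f"{minutes:>2}-minute call: {audio / 1e6:.1f} MB of audio, "
          f"about {ratio:,.0f} times the size of its metadata")
```

Under these assumptions a five-minute call comes to about 2.4 megabytes, thousands of times the size of the record describing it. A more aggressive audio codec would shrink that number, but not enough to close the gap.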

So if Clemente is right, and the government has access to "all digital communications" — videos, calls, audio recordings, emails, photos — that's taking up a lot of physical space somewhere. Which brings us to the second reason Clemente's claim is important, and to our second question.

How much of that information is the NSA housing?

According to Cisco, North Americans move 13.1 exabytes around the Internet each month. You're familiar with kilobytes, megabytes, gigabytes. You're maybe familiar with terabytes, the next unit up. Each unit is about 1,000 times larger than the last. Next come petabytes, 1,000 times a terabyte. Then we get to exabytes, 1,000 times a petabyte. Put another way, 13.1 exabytes is the equivalent of 4,367,000,000,000 song files, at about 3 megabytes per song. That's what we move monthly. It's not the same figure as what is stored, of course, but it gives some sense of scale. A healthy percentage of that 13.1 exabytes must necessarily exist on servers around the world.
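The unit ladder and the song comparison can be checked with a few lines of arithmetic. This sketch uses decimal units (each step is 1,000 times the last) and assumes a song file of 3 megabytes, a size inferred from the song count above rather than stated by Cisco.

```python
# Decimal storage units: each one is 1,000 times the previous.
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte", "exabyte"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one of the named decimal units."""
    return 1000 ** (UNITS.index(unit) + 1)


# 13.1 exabytes, kept as an exact integer by scaling 131 and dividing by 10.
MONTHLY_TRAFFIC_BYTES = 131 * bytes_in("exabyte") // 10

# Assumption: a typical song file is about 3 megabytes.
SONG_BYTES = 3 * bytes_in("megabyte")

songs = MONTHLY_TRAFFIC_BYTES // SONG_BYTES
print(f"13.1 exabytes is {MONTHLY_TRAFFIC_BYTES:,} bytes")
print(f"That is roughly {round(songs, -9):,} songs at 3 MB each")
```

Rounded to the nearest billion, the result matches the 4,367,000,000,000 figure. A different assumed song size would scale the count proportionally.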

At the time of Greenwald's report on Clemente, it seemed incredible, impossible even, that the government would be storing anything close to a repository of that size. Since The Guardian's Snowden reports came out over the last week, the world has realized that it might be entirely possible. The government's PRISM system, we have learned, apparently gives the NSA a way to see data on private company servers. In other words, the NSA may not need to store the petabytes of content that people create. It just needs, in Clemente's words, "a way to look" at it. Which makes sense. For the NSA to keep copies of everything Google and Facebook and Apple and Microsoft store for their customers in the cloud (on their servers), the NSA would need to at least match that storage capacity. For every gigabyte Facebook stores, the NSA would need to store it, too. Having Facebook store it for them makes much more sense, if Facebook keeps things around long enough.