Everything you need to know about data gathering from internet companies by the US National Security Agency

What is the scandal?

The US National Security Agency (NSA), the country's wiretapping agency, has been monitoring communications between the US and foreign nationals over the internet for a number of years, under a project called Prism. Some of the biggest internet companies, from Apple to Google to Yahoo, are involved. The US government confirmed the existence of the scheme and its application on Thursday night.

Which companies are in the scheme?

Microsoft was the first to be included, in September 2007. Yahoo followed in March 2008, Google in January 2009, Facebook in June 2009, Paltalk, a Windows- and mobile-based chat program, in December 2009, YouTube in September 2010, Skype in February 2011 (before its acquisition by Microsoft), AOL in March 2011 and finally Apple in October 2012.

How long has it been going on?

The NSA has allegedly had means of monitoring internet communications as far back as Microsoft's Windows 95, the first version of Windows with built-in internet connectivity, in 1995. This specific project appears to have begun with monitoring in September 2007 of user data going to and from Microsoft.

What data is being monitored?

Potentially, everything. The PowerPoint slide about Prism says it can collect "email, chat (video, voice), videos, photos, stored data, VoIP [internet phone calls], file transfers, video conferencing, notifications of target activity – logins etc, online social networking details" and another category called "special requests".

How much does it cost to monitor so much traffic?

The budget given in the presentation is just $20m per year, a figure so low that it has puzzled experts.

How effective has it been?

Nobody knows. The US government has said that the monitoring schemes it runs are necessary to defend against terrorist threats. But it hasn't cited any threats that were thwarted – unsurprising, given that the scheme has only just become public.

Isn't it illegal?

The NSA – and so the US government – has been careful to avoid any suggestion that the monitoring is being carried out indiscriminately on US citizens, because that would potentially breach the fourth amendment of the constitution against "unreasonable search".

But people overseas get no such protections. The question then is whether UK and EU governments knew of the scheme and were complicit in it, and whether they could stop it even if they wanted to.

What about "safe harbour" rules for EU data?

US companies that want to process private data from EU citizens have to promise a "safe harbour" – but crucially the documents do not mention tapping by US law enforcement. And if disputes arise, the rules say: "Claims brought by EU citizens against US organizations will be heard, subject to limited exceptions, in the US." That would probably mean the NSA's licence to spy would trump EU complaints.

How does it work?

The NSA isn't saying. Sources in the data-processing business point to a couple of methods. First, lots of data bound for those companies passes over what are called "content delivery networks" (CDNs), which are in effect the backbone of the internet. Companies such as Cisco provide "routers" which direct that traffic. And those can be tapped directly, explains Paolo Vecchi of Omnis Systems, based in Falmer, near Brighton.

"The Communications Assistance for Law Enforcement Act (Calea) passed in 1994 forces all US manufacturers to produce equipment compliant with that law," says Vecchi. "And guess what: Cisco is one of the companies that developed and maintains that architecture." Cisco's own documents explain its Calea compliance.

Second, it would be possible to tap into the routers at US national boundaries (to capture inbound international traffic) and just search for desired traffic there.

"The Prism budget – $20m – is too small for total surveillance," one data industry source told the Guardian. Twitter, which is not mentioned in the Prism slides, generates 5 terabytes of data per day, and is far smaller than any of the other services except Apple. That would mean skyrocketing costs if all the data were stored. "Topsy, which indexes the whole of Twitter, has burned through about $20m in three years, or about $6m a year," the source pointed out. "With Facebook much bigger than Twitter, and the need to run analysts etc, you probably couldn't do the whole lot on $20m."

Instead, the source suggests, "they might have search interfaces (at an administrator level) into things like Facebook, and then when they find something of interest can request a data dump. These localised data dumps are much smaller."
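The source's back-of-envelope sums can be checked with a few lines of Python. This is only a rough sketch: the figures are the ones quoted above, the variable and function names are made up for illustration, and a 365-day year is assumed. It says nothing about how the NSA actually budgets.

```python
# Figures as quoted in the article; everything else here is illustrative.
TWITTER_TB_PER_DAY = 5          # terabytes generated by Twitter per day
PRISM_BUDGET_USD = 20_000_000   # annual Prism budget from the slides
TOPSY_TOTAL_USD = 20_000_000    # what Topsy is said to have spent
TOPSY_YEARS = 3                 # over this many years


def annual_volume_tb(tb_per_day, days=365):
    """Terabytes accumulated in a year at a constant daily rate."""
    return tb_per_day * days


def annual_burn_usd(total_usd, years):
    """Average yearly spend, given a total and a number of years."""
    return total_usd / years


twitter_tb_per_year = annual_volume_tb(TWITTER_TB_PER_DAY)
topsy_usd_per_year = annual_burn_usd(TOPSY_TOTAL_USD, TOPSY_YEARS)
# Share of the Prism budget that indexing Twitter alone would consume,
# if it cost the NSA what it is said to cost Topsy (an assumption).
share_of_prism_budget = topsy_usd_per_year / PRISM_BUDGET_USD

print(f"Twitter per year: {twitter_tb_per_year} TB")
print(f"Topsy per year:   ${topsy_usd_per_year:,.0f}")
print(f"Share of budget:  {share_of_prism_budget:.0%}")
```

On these figures Twitter alone would add 1,825 terabytes a year, and Topsy's spending works out nearer $6.7m than $6m a year, or about a third of the quoted Prism budget.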

So the NSA would only need to tap the routers?

Not quite. Much of the traffic going to the target companies would be encrypted, so even when captured it would look like a stream of digital gibberish. Decrypting it would require the "master keys" held by the companies.
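To see why captured traffic is useless without the key, consider the toy example below. It is a deliberately simplified stand-in, not the encryption the companies actually use and not safe for real use: it scrambles a message by XOR-ing it with a keystream derived from a secret key. The function names and the sample message are hypothetical.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Hash the key together with a running counter to get the next block.
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


company_key = secrets.token_bytes(32)   # stands in for a "master key"
message = b"meet at noon"

captured = xor_cipher(company_key, message)      # what a router tap would see
recovered = xor_cipher(company_key, captured)    # readable with the right key
wrong_guess = xor_cipher(secrets.token_bytes(32), captured)  # any other key

print(captured.hex())
print(recovered)
```

With the right key the original message comes back; with any other key the result is just as unreadable as the captured bytes.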

Did the companies know?

They say not. Those that have been contacted have all denied knowledge of it. Google, for example, said: "Google does not have a 'back door' for the government to access private user data." An Apple spokesman said: "We have never heard of Prism. We do not provide any government agency with direct access to our servers and any agency requesting customer data must get a court order."

The Washington Post has retracted the part of its story about Prism in which it said that the companies "knowingly" participated. Instead, it now quotes a report which says that "collection managers [could send] content tasking instructions directly to equipment installed at company-controlled locations".

It is ambiguous whether "company" refers to the NSA or the internet companies. But the implication seems to be that the NSA has been running a system that can tap into the internet when it wants.

How could the companies not know if they had provided master decryption keys?

They might be required to provide them under US law, but would not be allowed to disclose the fact. That would give the NSA all it needed to monitor communications.

Is there anything I can do to stop it?

Lots of internet traffic from the west passes through the US because the destination servers are there, or connect there. Encrypting email using PGP is one possibility, though it is not easy to set up. Systems such as Tor, together with a virtual private network (VPN) connection, can cloak your location, though your identity might still be inferred from the sites you connect to.