Britain's surveillance agency GCHQ, with aid from the US National Security Agency, intercepted and stored the webcam images of millions of internet users not suspected of wrongdoing, secret documents reveal.

GCHQ files dating between 2008 and 2010 explicitly state that a surveillance program codenamed Optic Nerve collected still images of Yahoo webcam chats in bulk and saved them to agency databases, regardless of whether individual users were intelligence targets.

In one six-month period in 2008 alone, the agency collected webcam imagery – including substantial quantities of sexually explicit communications – from more than 1.8 million Yahoo user accounts globally.

Yahoo reacted furiously to the webcam interception when approached by the Guardian. The company denied any prior knowledge of the program, accusing the agencies of "a whole new level of violation of our users' privacy".

GCHQ does not have the technical means to make sure no images of UK or US citizens are collected and stored by the system, and there are no restrictions under UK law to prevent Americans' images being accessed by British analysts without an individual warrant.

The documents also chronicle GCHQ's sustained struggle to keep the large store of sexually explicit imagery collected by Optic Nerve away from the eyes of its staff, though there is little discussion about the privacy implications of storing this material in the first place.

Optic Nerve, the documents provided by NSA whistleblower Edward Snowden show, began as a prototype in 2008 and was still active in 2012, according to an internal GCHQ wiki page accessed that year.

The system, eerily reminiscent of the telescreens evoked in George Orwell's 1984, was used for experiments in automated facial recognition, to monitor GCHQ's existing targets, and to discover new targets of interest. Such searches could be used to try to find terror suspects or criminals making use of multiple, anonymous user IDs.

Rather than collecting webcam chats in their entirety, the program saved one image every five minutes from the users' feeds, partly to comply with human rights legislation, and also to avoid overloading GCHQ's servers. The documents describe these users as "unselected" – intelligence agency parlance for bulk rather than targeted collection.

One document even likened the program's "bulk access to Yahoo webcam images/events" to a massive digital police mugbook of previously arrested individuals.

"Face detection has the potential to aid selection of useful images for 'mugshots' or even for face recognition by assessing the angle of the face," it reads. "The best images are ones where the person is facing the camera with their face upright."

The agency did make efforts to limit analysts' ability to see webcam images, restricting bulk searches to metadata only.

However, analysts were shown the faces of people with similar usernames to surveillance targets, potentially dragging in large numbers of innocent people. One document tells agency staff they were allowed to display "webcam images associated with similar Yahoo identifiers to your known target".

Optic Nerve was based on collecting information from GCHQ's huge network of internet cable taps, which was then processed and fed into systems provided by the NSA. Webcam information was fed into NSA's XKeyscore search tool, and NSA research was used to build the tool which identified Yahoo's webcam traffic.

Bulk surveillance on Yahoo users was begun, the documents said, because "Yahoo webcam is known to be used by GCHQ targets".

Programs like Optic Nerve, which collect information in bulk from largely anonymous user IDs, are unable to filter out information from UK or US citizens. Unlike the NSA, GCHQ is not required by UK law to "minimize", or remove, domestic citizens' information from its databases. However, additional legal authorisations are required before analysts can search for the data of individuals likely to be in the British Isles at the time of the search.

There are no such legal safeguards for searches on people believed to be in the US or the other allied "Five Eyes" nations – Australia, New Zealand and Canada.

GCHQ insists all of its activities are necessary, proportionate, and in accordance with UK law.

The documents also show that GCHQ trialled automatic searches based on facial recognition technology, for people resembling existing GCHQ targets: "[I]f you search for similar IDs to your target, you will be able to request automatic comparison of the face in the similar IDs to those in your target's ID".

The undated document, from GCHQ's internal wiki information site, noted this capability was "now closed … but shortly to return!"

The privacy risks of mass collection from video sources have long been known to the NSA and GCHQ, as a research document from the mid-2000s noted: "One of the greatest hindrances to exploiting video data is the fact that the vast majority of videos received have no intelligence value whatsoever, such as pornography, commercials, movie clips and family home movies."

Sexually explicit webcam material proved to be a particular problem for GCHQ, as one document delicately put it: "Unfortunately … it would appear that a surprising number of people use webcam conversations to show intimate parts of their body to the other person. Also, the fact that the Yahoo software allows more than one person to view a webcam stream without necessarily sending a reciprocal stream means that it appears sometimes to be used for broadcasting pornography."

The document estimates that between 3% and 11% of the Yahoo webcam imagery harvested by GCHQ contains "undesirable nudity". Discussing efforts to make the interface "safer to use", it noted that current "naïve" pornography detectors assessed the amount of flesh in any given shot, and so attracted lots of false positives by incorrectly tagging shots of people's faces as pornography.

GCHQ did not make any specific attempts to prevent the collection or storage of explicit images, the documents suggest, but did eventually compromise by excluding from search results any images in which software had not detected a face – a bid to prevent many of the lewd shots being seen by analysts.

The system was not perfect at stopping those images reaching the eyes of GCHQ staff, though. An internal guide cautioned prospective Optic Nerve users that "there is no perfect ability to censor material which may be offensive. Users who may feel uncomfortable about such material are advised not to open them".

It further notes that "under GCHQ's offensive material policy, the dissemination of offensive material is a disciplinary offence".

Once collected, the metadata associated with the videos can be as valuable to the intelligence agencies as the images themselves.

It is not fully clear from the documents how much access the NSA has to the Yahoo webcam trove itself, though all of the policy documents were available to NSA analysts through their routine information-sharing. A previously revealed NSA metadata repository, codenamed Marina, has what the documents describe as a protocol class for webcam information.

In its statement to the Guardian, Yahoo strongly condemned the Optic Nerve program, and said it had no awareness of or involvement with the GCHQ collection.

"We were not aware of, nor would we condone, this reported activity," said a spokeswoman. "This report, if true, represents a whole new level of violation of our users' privacy that is completely unacceptable, and we strongly call on the world's governments to reform surveillance law consistent with the principles we outlined in December.

"We are committed to preserving our users' trust and security and continue our efforts to expand encryption across all of our services."

Yahoo has been one of the most outspoken technology companies objecting to the NSA's bulk surveillance. It filed a transparency lawsuit with the secret US surveillance court to disclose a 2007 case in which it was compelled to provide customer data to the surveillance agency, and it railed against the NSA's reported interception of information in transit between its data centers.

The documents do not refer to any specific court orders permitting collection of Yahoo's webcam imagery, but GCHQ mass collection is governed by the UK's Regulation of Investigatory Powers Act, and requires certification by the foreign secretary, currently William Hague.

The Optic Nerve documentation shows legalities were being considered as new capabilities were being developed. Discussing adding automated facial matching, for example, analysts agreed to test a system before firming up its legal status for everyday use.

"It was agreed that the legalities of such a capability would be considered once it had been developed, but that the general principle applied would be that if the accuracy of the algorithm was such that it was useful to the analyst (ie, the number of spurious results was low, then it was likely to be proportionate)," the 2008 document reads.

The document continues: "This is allowed for research purposes but at the point where the results are shown to analysts for operational use, the proportionality and legality questions must be more carefully considered."

Optic Nerve was just one of a series of GCHQ efforts at biometric detection, whether for target recognition or general security.

While the documents do not detail efforts as widescale as those against Yahoo users, one presentation discusses with interest the potential and capabilities of the Xbox 360's Kinect camera, saying it generated "fairly normal webcam traffic" and was being evaluated as part of a wider program.

Documents previously revealed in the Guardian showed the NSA were exploring the video capabilities of game consoles for surveillance purposes.

Microsoft, the maker of Xbox, faced a privacy backlash last year when details emerged that the camera bundled with its new console, the Xbox One, would be always-on by default.

Beyond webcams and consoles, GCHQ and the NSA looked at building more detailed and accurate facial recognition tools, such as iris recognition cameras – "think Tom Cruise in Minority Report", one presentation noted.

The same presentation talks about the strange means the agencies used to try to test such systems, including whether they could be tricked. One way of testing this was to use contact lenses on detailed mannequins.

To this end, GCHQ has a dummy nicknamed "the Head", one document noted.

In a statement, a GCHQ spokesman said: "It is a longstanding policy that we do not comment on intelligence matters.

"Furthermore, all of GCHQ's work is carried out in accordance with a strict legal and policy framework which ensures that our activities are authorised, necessary and proportionate, and that there is rigorous oversight, including from the secretary of state, the interception and intelligence services commissioners and the Parliamentary Intelligence and Security Committee.

"All our operational processes rigorously support this position."

The NSA declined to respond to specific queries about its access to the Optic Nerve system, the presence of US citizens' data in such systems, or whether the NSA has similar bulk-collection programs.

However, NSA spokeswoman Vanee Vines said the agency did not ask foreign partners such as GCHQ to collect intelligence the agency could not legally collect itself.

"As we've said before, the National Security Agency does not ask its foreign partners to undertake any intelligence activity that the US government would be legally prohibited from undertaking itself," she said.

"The NSA works with a number of partners in meeting its foreign intelligence mission goals, and those operations comply with US law and with the applicable laws under which those partners operate.

"A key part of the protections that apply to both US persons and citizens of other countries is the mandate that information be in support of a valid foreign intelligence requirement, and comply with US Attorney General-approved procedures to protect privacy rights. Those procedures govern the acquisition, use, and retention of information about US persons."