In November 2015, home secretary Theresa May revealed that the UK’s security and intelligence agencies had been gathering data on millions of people’s telephone calls for years. She admitted the use of this “bulk” power – one that gathers material in an untargeted fashion on a large number of people – when announcing the revised Investigatory Powers (IP) Bill, which she said would “put that power on a more explicit footing”.

But communications records are far from being the only large databases retained by the agencies. In October 2015, MI5 director general Andrew Parker said in a speech that alongside communications data the use of “travel data, passport information or other datasets” was “fundamental to our work”.

Under pressure from Labour, May agreed to a review of the IP Bill’s plans for bulk powers led by the independent reviewer of terrorism legislation David Anderson QC, who wrote a report on the draft bill. His new report is due to appear in time for the bill’s forthcoming committee stage in the House of Lords.

But documents released in April 2016 by Privacy International have already provided significant insights into the agencies’ use of population-wide databases – as well as clues to their identities. The civil liberties group published documents from its case against the agencies at the Investigatory Powers Tribunal, due to be heard at the end of July.

Policy for bulk data collection

A partly-redacted MI5 document from October 2010 – Policy for bulk data acquisition, sharing, retention and deletion – defines bulk data as “a dataset or database containing data about a large range of individuals, including individuals who are of no intelligence interest to the service, and which is too large to be susceptible to manual processing”, and which is not covered by existing legislation and is not commercially available. An older MI5 form for acquiring such data quotes Cabinet Office definitions from June 2008, one of which is “any source of information relating to 1,000 or more individuals that is not in the public domain”.

“There may be a level of public assumption that the service will not hold such data in bulk,” the 2010 document adds. This particularly applies to financial data: “The fact that the service holds bulk financial, albeit anonymised, data is assessed to be a high corporate risk since there is no public expectation that the service will hold or have access to this data in bulk.” A further section was “gisted” – rewritten for legal disclosure – as follows: “Were it to become widely known that the service held this data the media response would most likely be unfavourable and probably inaccurate.”

Financial data concerns

“It is concerning that they hold financial data,” said Privacy International legal officer Camilla Graham Wood. “They state that it is anonymised, but the documents also indicate that it is possible to de-anonymise financial data.”

The identities of the financial databases concerned are open to speculation, although records of transactions that run through bank clearing organisations or individual financial institutions are arguably anonymous if they exclude names.

The agencies have provided much more information on their access to passport and travel data, as well as having explicitly admitted their use. However, the Home Office refused to answer questions on this and other areas, citing national security.

The 2010 MI5 document refers to use of passport data as a low corporate risk “as the public has a reasonable expectation MI5 holds travel-related data and may hold it in bulk. Moreover, passport forms state that details may be passed to other departments and agencies when it is in the ‘public interest’ to do so”. The document makes similar comments about Olympics accreditation data which was gathered for the London 2012 Games, giving it as an example of a low level of actual intrusion.

While the exact nature of the passport data has been redacted, a passport application includes a photograph as well as extensive biographical information.

Travel data

The document also discusses apparently separate “travel data” held by MI5, which the service sees as representing a medium level of intrusion. “Results of a query would identify the movements of the individuals subject to the query,” it said, adding that “due to limited intelligence” it is common that these queries “return data on people of no intelligence interest”.

An obvious candidate is international passenger journey data gathered by the Home Office from travel providers through what used to be known as e-Borders and is now part of the Border Systems Programme.

Access to international journey data by the agencies appears to be confirmed by a message to users of MI6’s database in June 2014, warning against inappropriate “self search” of secure databases by staff to update travel records. “This is not a proportionate use of the system, as you could find this information by another means (i.e. check the stamps in your passport or keep a running record of your travel) that would avoid collateral intrusion into other people's data,” the document said.

“The documents certainly indicate they have ‘travel data’ but the government has not disclosed in any detail either in the BPD [bulk personal datasets] case or to parliament exactly what this means,” said Privacy International’s Graham Wood, adding that it could also include domestic journeys such as automatic number plate recognition (ANPR) data and public transport records such as Transport for London’s Oyster card. The security agencies could also combine it with financial data and openly sourced material such as social media.

MI5 also has access to “UK population data”, according to a brief and heavily redacted reference in a document from October 2012, titled Bulk data retention and deletion policy. It obtains this from an unidentified government department, which “provides details of ages and addresses”. It is a special case under MI5’s usual deletion rules, in that each update removes information that is no longer applicable and adds new applicable data.

Relatively few government databases fit this description. Health services are devolved to the UK’s four nations and MI5’s guidelines treat health data as sensitive. Electoral rolls do not include ages and are anyway publicly available.

HMRC databases

Probably the most reliable UK-wide databases which include address and age are those managed by HM Revenue and Customs (HMRC), given that a national insurance number is required for employment, benefit and pension payments. HMRC’s data policy says that if the law allows, it may pass on personal information to “other government departments and similar bodies [and] the police and law enforcement agencies”. An HMRC spokesperson said: “We don't comment on matters of national security.”

MI6 also appears to have access to the same, or a similar, database. In a newsletter for its database from September 2011, the agency warns about “individual users crossing the line” through “looking up addresses in order to send birthday cards [and] checking passport details to organise personal travel”; “checking details of family members” for personal reasons, the details of which were removed in the gisting process; and “as [a] ‘convenient’ way to check the personal details of colleagues when filling out service forms on their behalf”.

“We do not know for certain but it is likely that databases such as national insurance form a dataset they obtain from another government department as that could be used to link with other data,” said Graham Wood. “This is a key issue – that the government has failed to provide any detail to the public or parliamentarians on what constitutes bulk personal datasets.”