Exclusive The UK Home Office is secretly creating a centralised database on the good folk of Britain without presenting the capability increases to the public or subjecting them to Parliamentary scrutiny.

The Register can reveal that the project, which was described as simply a “replatforming” of the department's ageing IT infrastructure, has already begun to roll out, with the “first wave” of changes being delivered in what the Home Office is calling the Technology Platforms for Tomorrow (TPT) programme.

TPT will lay the foundations for this mega database by ushering in “core infrastructure, compute platforms and Live Service capability” changes, primarily using Hadoop, the open source software framework for centralising databases and allowing batch queries and analyses to be run across them in bulk.

While this data on the population is currently stored in “siloed” and disparate databases, connecting it could make it possible to automatically follow individuals' records across all of the Home Office's many directorates, from the two years' worth of car journeys logged in the ANPR data centre, to the passports database, the police databases, and many others.

After laying off over a third of its old IT staff, the Home Office has recently been attempting to recruit Hadoop specialists to help it build and maintain this new “single platform”, with a presentation and talk seemingly doing the rounds of the user group circuit until the Home Office got spooked by The Register.

According to one of these presentations, which your correspondent attended, the department will be using HDFS, the Hadoop Distributed File System, “for storing all the data” that its various directorates are imbibing, which “could be image, it could be video, it could be anything”.

Among the aims in using this data, according to consultant Stephen Deakin, who delivered a presentation at a Hadoop Users Group UK (HUGUK) meeting earlier this year, was “to create interactive applications for the border force at the border control points, also for police officers actually in their cars”.

The applications would “run on hand-held devices as well, as well as interactive applications potentially for other Home Office departments and also being able to produce transaction applications so we can run analytics we can run all sorts of various algorithms around there, including machine learning,” added Deakin.

Despite this increased capability to automate digital tracking of the population and the intention to run machine learning algorithms on the public's information, there has been no presentation of these details to Parliament and there will be no additional scrutiny or oversight mechanisms applied to it.

The plans were criticised by the leader of the Liberal Democrats, Tim Farron MP, who told The Register: “Trying to get away with a substantial change simply by labelling it as IT replatforming is simply unacceptable. With measures such as the request filter being pushed in the Investigatory Powers Bill, centralising databases will essentially allow Government to build up a full profile of every single person in the country.”

Farron added: “Trying to bypass Parliament is not an option and the Home Secretary must come clean about her real intentions.”

The number of databases that the Home Office directorates hold is unknown and has not been clarified by the department. The other speaker at the HUGUK meeting, the head of strategy and architecture, Simon Bond, recognised this and offered a slide suggesting the scale of those databases.

“Deliberately [the slide below is] a slide you can't read, and we won't be sending these out,” said Bond, “but what this [would] actually [be] saying if it was bigger, is how [do] we think about the Home Office in a way which isn't siloed, and start thinking about it in a way which we think about 'What are the capabilities that everyone needs?'”

[Image: the aforementioned slide from Bond's presentation. Photo: Alexander J Martin for The Register]

When asked about the intentions regarding new regulation of all of this data, a Home Office spokesperson told The Register that the storage and querying of data is currently “protected by the Data Protection Act, the Protection of Freedoms Act and the Official Secrets Act” and that the department's current approach to data would remain in place.

However, the spokesperson added: “As new modes of data integration and analytics emerge, we continue to review the adequacy of these policy and legislative frameworks and introduce additional controls as necessary.”

The TPT's “crucial work”, as a Home Office spokesperson described it, included “taking greater direct control over the design, delivery and operation of technology systems; standardising, integrating and reusing solutions across services and developing a broader supplier base, including niche expert suppliers.”

Such niche expert suppliers are likely to include San Jose-based Hortonworks. The Register understands that roughly two years ago the government department started looking at using Hadoop as a means of cutting the costs of its Oracle-dominated workloads.

At that time Hortonworks was the only accredited Hadoop support company listed on the government's procurement platform, G-Cloud, so the proof-of-concept contract was awarded to it without the tender going public.

The contract, which lasted around three months, included 30-odd days of consulting work, but how this was carried out is unclear: The Register has learned that it was deemed to affect information of such sensitivity that the Home Office refused to allow Hortonworks' employees to co-locate with its own staff, and as such the department rented the business separate offices to work from.

It is unclear what current involvement Hortonworks has with the project, as the Home Office has provided no public estimates regarding its complete delivery.

Acknowledging that the department had “made some big mistakes with technology over the last few decades,” Bond said in his presentation that moving away from those decades of outsourcing by building the open source “single platform” in the Home Office itself would help the department meet its budgeted 30 per cent funding cut before the next election.

El Reg has repeatedly contacted Hortonworks for comment but has received no answers on the subject. We will update if we hear more. ®