Asher Wolf is a journalist, information-activist and commentator. You can follow her on Twitter at @Asher_Wolf

Updated 07/02/2014: The original version of this article contained some inaccuracies. Please see the notes at the bottom of this updated article for details.

***



Can we have a little chat about your privacy? "Why?" you ask.

Well, because it appears the UK government is selling off the rights to your personal health data in the name of "transparency" and "open data".

What?


Okay, it goes like this:

You might have noticed the Guardian newspaper article by Randeep Ramesh on 19 January, about a new initiative from the Health and Social Care Information Centre (HSCIC).

Ramesh described how data harvested "from GP and hospital records, medical data covering the entire population will be uploaded to the repository controlled by the arms-length NHS information centre, starting in March".

The HSCIC responded within 24 hours to the Guardian article, with a statement claiming "patients and their carers should know that no data will be made available for the purposes of selling or administering any kind of insurance and that the NHS and the HSCIC never profit from providing data to outside organisations."


This appears to be a partially disingenuous statement because, despite HSCIC's privacy impact assessment, a publicly available PDF produced by the HSCIC sets out service charges for data linkage and extraction: approved organisations and individuals will eventually be allowed to buy a standard extract "containing personal confidential data" for the low, low price of a £1,594 set-up fee and a £2,782 fee for the data set. The NHS argues, however, that this is a charge for processing the data rather than a fee for the data itself: it's about cost recovery, not profit.

The Care.data database is a UK national database which, according to the HSCIC, will "build on existing data services and expand them to provide linked data, that will eventually cover all care settings, both in and outside of hospital".

The database's users include pharmacies, mental health services, opticians, dentists, education and training establishments, as well as the National Adult Social Care Intelligence Service (NASCIS).

The HSCIC gathers Care Episode Statistics (CES) which, according to HSCIC, "will provide commissioners with data from across primary, secondary and tertiary care, as well as community health services and social care."

The CES information includes demographics (such as postcodes, birth dates and gender) and events (such as hospital admissions and GP referrals), as well as prescription data and pathology results. The datasets also gather information about specific conditions or disease types.

Data is colour-coded according to how risky it is. Green data is the least contentious. It involves the publication of average values for large groups of patients or completely anonymous figures. Red data is personally identifiable data and is very tightly controlled. Amber data is a little bit more tricky. It's supposedly "pseudonymised", meaning that each patient's identifiers (date of birth, postcode etc) have been stripped out.

The aggregated "green" data will be published openly, but record-level CES data will only be made available pseudonymously to nominated users under the terms of a legally binding agreement.

In September 2013 the NHS proposed an addendum to NHS England's existing customer requirement for the Care.data general practice extract.

The addendum "seeks to increase the range of eligible recipients who may apply to the HSCIC for access to CES linked data in the form of disclosures of 'potentially identifiable data' in pseudonymous form (i.e., data that could be considered identifiable if published but are considered non-identifying when released into a controlled environment)."

Under the addendum, vetted "researchers and other recipients" will be allowed to access "potentially identifiable data". But considering that the NHS can barely manage the unsolicited spam and malware on its own webpages, should the public really trust them to advise on who should be given access to the public's personal health data?

It should be pointed out that back in March 2013, a report commissioned by the NHS found: "By providing such users with extracts of the pseudonymous CES file, there is a risk of malicious re-identification of patients from inference (a so-called 'jigsaw attack')."

Already pharmaceutical companies like AstraZeneca are rubbing their hands with glee at the implications of being approved access to the datasets.

As security researcher Ross Anderson points out, there are typically only a few dozen addresses in a post-code, so with access to a birth date (that may come from sources outside of HSCIC) it is fairly easy to make a correct personal identification for about 98 percent of people (the exceptions are twins, students, soldiers and prisoners). For this reason, "amber" data can only be released into controlled environments -- but the NHS wants to extend access rights to "a wider audience" including researchers and companies vetted by the Data Linkage and Extraction Service within the HSCIC.
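To see why a postcode combined with a birth date is so identifying, it helps to count how many records share each combination. The sketch below is purely illustrative: the records are invented, the field names `postcode` and `dob` are assumptions, and nothing here reflects the actual CES schema or HSCIC tooling.

```python
from collections import Counter


def uniqueness_rate(records, keys):
    """Return the fraction of records whose combination of `keys` is unique."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Invented records for illustration only (hypothetical field names).
people = [
    {"postcode": "AB1 2CD", "dob": "1980-03-14"},
    {"postcode": "AB1 2CD", "dob": "1975-11-02"},
    {"postcode": "AB1 2CD", "dob": "1990-07-30"},
    {"postcode": "AB1 2CD", "dob": "1990-07-30"},  # twins at the same address
    {"postcode": "EF3 4GH", "dob": "1980-03-14"},
]

# Postcode alone singles out few people; adding date of birth singles out most.
rate_postcode = uniqueness_rate(people, ["postcode"])
rate_both = uniqueness_rate(people, ["postcode", "dob"])
print(rate_postcode, rate_both)
```

In this toy data only one record in five is unique by postcode alone, but three in five become unique once the birth date is added. The two that stay ambiguous are the twins, which matches the kind of exception Anderson describes.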

HSCIC's own guide to confidentiality points out the potential for messy dilemmas. For instance, the guide mentions that "removing the individual's name, age, address and other personal identifiers may not be sufficient to effectively anonymise the information".

Therefore, some of the amber data could be cross-referenced with other datasets (not issued by HSCIC) in order to de-pseudonymise the data.
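A minimal sketch of such a cross-referencing ("jigsaw") attack follows. It assumes, hypothetically, that a pseudonymised extract still carries a coarse area code and a birth year, and that an attacker holds a separate named dataset with the same two attributes. All names, fields and values are invented for illustration and do not describe any real HSCIC release.

```python
def link(pseudonymised, external, keys):
    """Map each pseudonym to a name when exactly one external record matches."""
    index = {}
    for person in external:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])

    matches = {}
    for rec in pseudonymised:
        candidates = index.get(tuple(rec[k] for k in keys), [])
        if len(candidates) == 1:  # an unambiguous match re-identifies the record
            matches[rec["pseudonym"]] = candidates[0]
    return matches


# Hypothetical pseudonymised health extract: no names, but shared attributes remain.
extract = [
    {"pseudonym": "p01", "area": "AB1", "birth_year": 1980, "condition": "asthma"},
    {"pseudonym": "p02", "area": "AB1", "birth_year": 1990, "condition": "diabetes"},
]

# Hypothetical outside dataset (for example, a marketing list) with names attached.
outside = [
    {"name": "Alice Example", "area": "AB1", "birth_year": 1980},
    {"name": "Bob Example", "area": "AB1", "birth_year": 1990},
    {"name": "Carol Example", "area": "AB1", "birth_year": 1990},
]

linked = link(extract, outside, ["area", "birth_year"])
print(linked)
```

Here `p01` is tied to a single named person, while `p02` stays ambiguous because two people share the same attributes. The more attributes the two datasets have in common, the fewer such ambiguities remain.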

While HSCIC attempts to fix legal issues by stating the information should always be shared in accordance with the law and organisations must abide by legal provisions which may ban or limit attempts to re-identify confidential information, plans for HSCIC to publicly track client compliance are yet to be revealed.

Though privacy advocates have bemoaned the lack of public clarity and transparency over exactly which organisations will be able to pay to access particularly sensitive datasets (HSCIC says the charges are simply about covering costs, not profiteering), two companies may already have a head start on the aggregated (green) data treasure trove: MedRed and BT.

Perhaps it's time we took a closer look.

Late last year the White House hosted an interesting little shindig. Obama and friends from the White House Office of Science and Technology Policy got together in December 2013 to celebrate the launch of a cross-Atlantic cloud partnership providing commercial access to aggregated population data "of more than 50 million lives".

The partnership, known as "MBHC", linked MedRed, a Washington, DC-based healthcare software company, and BT, a UK-based telecommunications company. Essentially, MedRed has imported all of the UK's aggregated healthcare data (not identifiable patient-level data) to a cloud service based in the US. Considering the HSCIC's own publications suggest access to personal information in other datasets has a relatively cheap price, this could be concerning.

MedRed's CEO Will Smith even said that the UK's approach to data release was "gutsy". "People are using foreign data because it's available," he said. "The UK made some gutsy decisions about data liberation. There's political risk associated and they have a more tolerant climate over there."

Combining public datasets with deep analytic tools and big data, MedRed and BT plan to charge for access to MBHC in the future.

Already a beta version of MBHC is in use by pharmaceutical companies and universities. And what is to stop those corporations and organisations from requesting access to "amber" data from HSCIC when they find something particularly interesting in their combination of the public datasets and big data analysis?

But wait, there's more.

Perhaps we should do a little digging into the background of MedRed as well?

MedRed was first contracted by the US Army to develop medical decision support for chemical, biological weapons and blast injury in 2007 and provides software used to track traumatic brain injuries and PTSD in active duty military personnel and veterans.

MedRed's technology has been used by the Kenyan government to track disease outbreaks, and provides product support to the US Department of Defense and Veterans Affairs medical treatment facilities. MedRed has also created a mobile form of its technology called EFR MedCom "to support first responders, state and local activities, and functions involving the participation of private citizens" in emergency services.

As a 2009 article from mHealth News notes: "EFR MedCom is also designed to facilitate communication of casualty data to a command centre, facilitate information exchange through the Nationwide Health Information network and develop information databases through the Google Health platform."

Oh, and MedRed's board of directors and management? Well, there's a bunch of guys with an interesting history.

The founder, William Kennedy Smith, is the nephew of former US President John F. Kennedy. Other board members include a former staffer from Booz Allen Hamilton (the firm for which whistleblower Edward Snowden worked); a senior US public health policy adviser; a former US Department of Health and Human Services Secretary; and a former President and CEO of the Russian-American Enterprise Fund.

The management team includes a military technologist who worked on US Army, Air Force and Navy projects developing "computer-assisted medical diagnosis and treatment systems, digital television broadcast and reception systems, network operations for space-based telephone communication systems and satellite communication systems"; an "Air Force headquarters' expert on space-based laser systems", who "led the development of acquisition processes for major DoD efforts such as the Medical Community of Interest project for the Integrated Electronic Health Records (iEHR) programme"; and a data-mining expert who "architected, designed and built some of the largest and most successful national security threat assessment systems within the Department of Homeland Security."

It's spooky stuff. But scratch half of the Fortune 1000 companies that engage in military contracting and you'll find a similar range of characters.

I have no doubt that governments could find all kinds of amazing results using the data contained in NHS records. MedRed's work with traumatic brain injury is undoubtedly important, but by now you're probably asking, "Should I be worried about Care.data and MedRed?"

Hell yes, particularly considering the lack of transparency and the dubious history of governments and organisations when it comes to safeguarding medical data privacy. I don't know about you, but I wouldn't want to give a US military contractor access to my personal health data -- aggregated or not -- for medical research and software development purposes.

Whether it's through MedRed or Care.data, the crux of the issue is that ultimately we don't know who will get access to our health data and for what purposes the data will be used.

What happens when commercial corporations use jigsaw attacks to de-anonymise the public datasets for profit? And despite making their customers sign contracts, there is no realistic way HSCIC can effectively prevent such kinds of de-anonymisation. How will the Information Commissioner even know?

You can opt out of your GP data being placed into the HSCIC's database.

In fact there's a website, called MedConfidential.org, that will help you take back control over your personal health data.

They've even drafted a form letter you can give to your GP, to ensure your NHS Care Records aren't given to BT and MedRed.

Privacy is a basic human right that should be afforded to people everywhere. Transparency is meant to keep the corporations and governments accountable. While there is much public good that can come from open source data sets, ultimately you should be in charge of how your personal information is shared and used.


Just don't let your government tell you anything different -- particularly not when they're attempting to sell off your personal data.

The author thanks the following people for their invaluable consultation and advice: Professor Ross Anderson, Cambridge University's Head of Cryptography and Professor in Security Engineering at the University of Cambridge Computer Laboratory; Eleanor Saitta, Principal Security Engineer at the Open Internet Tools Project (OpenITP); and Trevor Timm, Co-founder and Executive Director of the Freedom of the Press Foundation.

Updated 07/02/2014: This article -- originally titled "Care.data and the murky US partnership that puts your health data at risk" -- contained a number of inaccuracies and has been updated to reflect this. It stated that Care.data was known as "the Spine" within the NHS. This is not the case. It also suggested that MedRed would have access to identifiable patient data, be it pseudonymised or otherwise; MedRed only has access to the aggregated "green" data which will be made public on data.gov.uk. It's unclear whether or not it will gain access to any of the "amber" data in the future. The article also originally stated that the GP data extract would include details about drug addiction, sexual health and abortion procedures. These "sensitive" data will remain with the GP, although that may be reconsidered at a later date, according to HSCIC.