Given the heightened sense of concern about our personal data following the Edward Snowden revelations, it probably isn’t the best time for the NHS to ask us for more. So it simply hasn’t bothered asking.

The information contained in your health records has the potential to help researchers gain insight into disease and to allow authorities to make better decisions about healthcare provision, so it’s easy to see why the NHS wants to make it more available to them.

Rather than being asked for explicit consent, we are merely being informed by an unaddressed leaflet through the front door. The leaflet tells us that if we do not explicitly opt out of a new scheme, our GPs will pass our data to the Health and Social Care Information Centre. This so-called “care.data” extraction programme has been made compulsory for GP practices as a result of the 2012 Health and Social Care Act.

The act ensures the forthcoming NHS data extraction is exempt from most of the 1998 Data Protection Act. For the last two years, the EU has been negotiating a revision of the Data Protection Directive that underlies this act. It has become a drawn-out and intensely fought battle that shows both how badly revision is needed, and the strength of the interests at stake.

From the start, the 1998 act gave special status to medical data, because of its highly personal and confidential nature. This is unlikely to change. The latest explanation from the Information Commissioner’s Office is that once the data has reached HSCIC, patients need to be told whenever their data is shared, with whom, and why. Asking rather than telling, though less practical, would clearly have been better.

HSCIC knows very well what it will need to do to protect the data. Its Privacy Impact Assessment and the supporting NHS anonymisation standard indicate as much. When data is to be shared with research institutes, insurance companies or think tanks, for example, certain rules apply. A risk assessment must first decide whether access should be granted and what level of anonymisation is needed to share the data safely. A contract then stipulates how the data can be used, and this can be audited by HSCIC. Geraint Lewis, the NHS Chief Data Officer, is putting up a valiant defence of the scheme on this basis.

HSCIC takes careful decisions on the level of anonymisation needed when data is handled by other parties. “Red” data contains highly personal information such as date of birth, postcode, NHS number and gender, although not your name, and is shared only in clearly delimited cases. “Green” data consists of summarised data about larger groups of patients and is published openly. But a third category, “orange” data, is the main area of concern, as it can be sold to any organisation deemed suitable.
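To make the idea of anonymisation levels concrete, here is a minimal sketch of field-level generalisation of the kind an “orange”-style release might involve: direct identifiers are dropped and quasi-identifiers are coarsened. The field names and rules below are invented for illustration; they are not HSCIC’s actual standard.

```python
# Hypothetical generalisation of a patient record before release.
# All field names, rules and data are illustrative assumptions.

def generalise_record(record: dict) -> dict:
    out = dict(record)
    out.pop("nhs_number", None)                        # drop direct identifier
    out["postcode"] = out["postcode"].split(" ")[0]    # keep outcode only
    out["birth_year"] = out.pop("date_of_birth")[:4]   # keep year of birth only
    return out

record = {
    "nhs_number": "943 476 5919",
    "postcode": "SW1A 1AA",
    "date_of_birth": "1975-03-14",
    "gender": "F",
    "diagnosis": "asthma",
}
print(generalise_record(record))
```

Note that even the generalised record still carries quasi-identifiers (outcode, birth year, gender) that can narrow a person down considerably, which is why the risk assessment also has to consider what other data the recipient holds.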

Anonymity matters

The decision on how much personal information is left in orange data is based on what other information the organisation using the data has access to. HSCIC is acutely aware of the risk of de-anonymisation, and particularly the so-called “jigsaw” variant. This describes the ability of organisations to use other information to piece together who is who in an anonymised database.

An example of this problem came to the fore recently when it was revealed that the NSA collects metadata on phone calls. Even though the agency claimed it was “not recording names and locations”, it still has access to information that would allow it to join the dots – much of it is readily available in the phone book. At the extreme end sits pseudonymisation: medical records retain only a meaningless tag to distinguish one patient from another. Yet even then, combining GP appointment times with mobile location data would de-anonymise most of the information.
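The jigsaw attack described above amounts to a simple join between the pseudonymised records and an auxiliary dataset. The sketch below shows the idea with invented data: appointment times in pseudonymised records are matched against hypothetical location sightings near the surgery, and a unique match re-identifies the patient.

```python
# Hypothetical "jigsaw" re-identification: pseudonymised appointment records
# are joined against auxiliary location data on place and time alone.
# All names, tags and datasets here are invented for illustration.

pseudonymised_appointments = [
    {"patient_tag": "p-301", "surgery": "Hillview", "time": "2014-02-03T09:20"},
    {"patient_tag": "p-114", "surgery": "Hillview", "time": "2014-02-03T11:40"},
]

# Auxiliary data an adversary might hold: who was near the surgery, and when.
location_sightings = [
    {"name": "A. Smith", "place": "Hillview", "time": "2014-02-03T09:20"},
    {"name": "B. Jones", "place": "Hillview", "time": "2014-02-03T11:40"},
]

def jigsaw(appointments, sightings):
    """Link a pseudonym to a name whenever exactly one sighting matches."""
    links = {}
    for appt in appointments:
        matches = [s["name"] for s in sightings
                   if s["place"] == appt["surgery"] and s["time"] == appt["time"]]
        if len(matches) == 1:          # a unique match re-identifies the patient
            links[appt["patient_tag"]] = matches[0]
    return links

print(jigsaw(pseudonymised_appointments, location_sightings))
# → {'p-301': 'A. Smith', 'p-114': 'B. Jones'}
```

The “meaningless tag” offers no protection here: the linkage runs entirely through fields the pseudonymisation left behind.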

Medical information may well be the last of our personal data that isn’t compromised, given reports that intelligence services have all our text messages, mobile location data and communications metadata. It has been claimed that these services keep all data for “insurance” purposes, and that they have already used porn browsing data for reputation attacks. Controlled release of some kinds of medical data would be just as effective.

From a computer security perspective, one needs to worry about even creating such a huge medical database in the first place. The NSA and GCHQ have been able to gather such a vast amount of data partially because technology companies such as Facebook and Google hold so much information on users. They have essentially become honeypots that are too much of a temptation for intelligence hungry agencies to resist. How can we believe a medical database to be safe, if even the tech giants (and in turn the intelligence services) have been shown to be so incapable of adequately protecting their own sensitive material?

In all this, it is clear that the Data Protection Act is outdated. No matter how well the use of our data is policed, the highest penalty available is half a million pounds, and no jail sentence applies. It’s “pocket money”, as EU commissioner Viviane Reding said in relation to a Google conviction in France.

Health data can be extremely valuable for research but it has also been suggested that insurance companies should be allowed access to HSCIC information in order to make more reliable actuarial estimates, presumably based on summarised medical information. The gains from that would indeed be many millions, but nothing compared to the gains these companies could make from using and even abusing full medical data to push up premiums for individual customers or reject customers based on their medical history.

HSCIC provides data at cost price, but at that price it could never provide enough policing and auditing to prevent abuse. Insurance companies and public service providers may well get in through the back door anyway, through increased and potentially irreversible NHS privatisation. Even if our data ends up inside NHS divisions now, can we prevent those divisions from turning into Bupa, G4S or Atos in the future?

It has been coming for some time and now we have reached the point at which big data has simply outgrown the Data Protection Act: in the volume of data, in the sheer possibilities and, most importantly, in the financial stakes. Hopefully the Data Protection Directive revision will improve the legal aspects, but as we learn how to store and use big data, we also need to come up with secure methods to provide consent and access for big confidential data.