By Mackenzie Graham

Facebook CEO Mark Zuckerberg recently appeared before members of the United States Congress to address his company’s involvement in the harvesting and improper distribution of approximately 87 million Facebook profiles (about 1 million of them British) to Cambridge Analytica. In brief, Cambridge Analytica is a British political consulting firm that uses online user data (like Facebook profiles) to construct profiles of subjects, which can then be used for what it calls ‘behavioural micro-targeting’: advertisements tailored to the recipient based on their internet activity. In 2016, Cambridge Analytica was contracted by Donald Trump’s presidential campaign, as well as the ‘Leave EU’ campaign prior to Britain’s referendum on leaving the European Union.

The controversy involving Facebook concerned how Cambridge Analytica was able to acquire 87 million profiles, seemingly without user consent. According to Facebook’s policy, user data may be shared with third parties, provided it is for academic use. Cambridge Analytica was able to acquire this data through Aleksandr Kogan, a psychologist at Cambridge University, whose app ‘thisisyourdigitallife’ gave him permission to access the profiles of about 320,000 Facebook users, and all of their friends’ profiles. Because this original use of the data was allegedly for academic purposes, Facebook allowed it to occur. And when it became clear that Cambridge Analytica had obtained this data illicitly, Facebook simply requested that Cambridge Analytica delete the data.

After news of the scandal broke, Zuckerberg issued a statement, in part saying that the policies which allowed for the misuse of data were “a breach of trust between Facebook and the people who share their data with us and expect us to protect it.” Facebook COO Sheryl Sandberg also called it “a major violation of people’s trust.”

The issue of trust is pervasive in the ethics of ‘big data.’ Take for example the UK Biobank, a 30-year study in which 500,000 volunteers in the UK provided a wide range of biological information to researchers, including material from which genetic information could later be derived. The purpose of UK Biobank is to provide a large database from which researchers can apply to be provided anonymized biological information for the conduct of health-related research in the public interest. However, because this database is intended to inform future research projects, the majority of which have yet to even be conceived, it is impossible to say exactly how the personal data of participants will be used. In a sense, participants are simply being asked to ‘trust’ researchers that their data will be secured, and only used for appropriate purposes (i.e., health-related research in the public interest). This makes some people nervous, and reluctant to allow their personal biological data to be used for research purposes.

I should be clear that I am not equating Facebook and UK Biobank. One is a privately owned company, with explicitly capitalist interests, in a domain in which there is minimal government regulation and oversight. The other is academically driven, intended to facilitate research that will improve the treatment of various illnesses, and heavily regulated. However, the controversy over Facebook’s breach of trust highlights an important problem: the inadequacy of traditional notions of informed consent in the context of ‘big data.’

In bioethics, consent is considered ‘informed’ when an individual with full decision-making capacity, to whom full disclosures have been made, and who fully understands these disclosures, voluntarily consents to treatment. While there is significant debate in the bioethics literature about how to justify the importance of informed consent, it is nevertheless ubiquitous in bioethics guidelines, and is an important part of maintaining public trust in the research enterprise. The basic idea is that if an individual is agreeing to allow themselves to be subjected to a research intervention for the public good, they have a right to be informed about what will be done to them and how the results will be used, and to use this information to decide whether they want to participate.

The challenge presented by research driven by ‘big data’ is that informed consent, at least in the traditional bioethics sense, is no longer possible. On the one hand, researchers cannot state with certainty how a participant’s data will be used, although they may be able to provide some account of the conditions under which it will be used. It would be impossible to re-acquire informed consent from participants every time their data is to be used in another study. On the other hand, gaining informed consent is difficult even in the best of circumstances; even if researchers could describe all the ways that participant data might be used, it is unrealistic to expect participants to understand them all. Informed consent has always been overstated as a panacea for protecting patient interests in a research context, and it is clearly not sufficient in this emerging one. Traditional conceptions of autonomy and informed consent, long-standing pillars of ethics in medical research and practice, must be re-examined in light of the new ways in which data can be collected, stored, and used.

Research informed by big data has the potential to provide a tremendous public good, but presents novel ethical challenges. Participants need to feel confident that their biological data is secure, and will not be used for illicit purposes. Part of building this trust will be developing regulations controlling the use of data obtained for research purposes. The scandal involving Facebook and Cambridge Analytica was precipitated by data being provided to a third party for ‘academic purposes.’ Distributing this data to Cambridge Analytica was clearly unethical (and allegedly a violation of Facebook’s policy). This has led to calls for greater regulation of the acquisition and transmission of online user data. While strong, well-reasoned regulations can help to prevent this kind of unscrupulous distribution of data, regulation alone likely won’t be sufficient, in a research context or otherwise. Policies and regulations, even when adhered to, may not anticipate advances in research, and so may fail to fully protect participants. It may not be enough for researchers, or committees granting ethics approval for research, to simply ‘follow the ethics guidelines.’

Ethicists have an important role to play in engaging with researchers and their work, to balance the public good of research and the interests of research participants. Many of the researchers I have been fortunate to work with are genuinely concerned with conducting ethical, responsible research; cultivating this attitude, through ethics education and through ethicists and scientists working together, is critical to ensuring a research enterprise worthy of public trust. It turned out to be a mistake to simply trust that Facebook would act in the interest of its users. A similar violation of the public trust in research could have even greater costs.