UpGuard can now disclose that a database containing personal information for over thirty-seven thousand individuals has been secured, preventing any future abuse. The database belonged to Neoclinical, an Australia-based company that matches individuals with active clinical trials. A review of the data set showed that the vast majority of affected individuals were in Australia and New Zealand, where Neoclinical operates clinical sites. In addition to contact information, the database included users’ responses to questions qualifying them for clinical trials, covering medical diagnoses, illicit drug use, and treatments.

The Discovery

On July 1, an UpGuard researcher detected a MongoDB database named “neoclinical.” The database included collections for different entity types involved in connecting users to clinical trials: the accounts for the medical organizations running the trials, qualifying questions to determine the fit of the users, the “users” themselves seeking entry to those trials, and more. That day the researcher sent an email notification to Neoclinical. The researcher also called both phone numbers listed on Neoclinical’s website; one was disconnected, and the other was configured to record a ten-second message to be transcribed and sent as text. On July 25 the researcher escalated the notification to AWS Security, which followed its standard procedure of responding that it would notify the owner of the database. On July 26, public access to the database was removed.

The Significance

Modern medicine is an advanced, specialized practice, and for that reason medical procedures are typically constrained to dedicated sites, such as a hospital or doctor’s office. Efforts to protect medical data likewise focus on threats to those sites, such as the ransomware attacks (WannaCry among them) that have struck many hospitals. The data generated from medical examinations, however, can enter other circuits of the digital world, sidestepping the regulation of the hospital. Neoclinical is one example of a company filling a particular role in the larger economy of healthcare, which extends far beyond the relationship between doctor and patient within the protections of the hospital.

Research and development of new therapies is one part of healthcare writ large, and clinical trials are a subset of that development process. Confirming the effectiveness of a course of treatment requires a scientific study of the therapy’s effect on a statistically significant group of patients. Vetting individuals for inclusion in those trials requires gathering information about their health. In the case of the Neoclinical dataset, that information includes personal details that individuals revealed about their conditions, ranging from cancer to incontinence.

The Neoclinical website claims 37,170 users, and that is exactly the number of records in the “users” collection of the database. Each of those users has a profile with a collection of information describing their fit for the various trials being coordinated with Neoclinical. Part of the profile is personal information such as name, email address, physical address, geo coordinates for that address, and date of birth. The profile also includes the user’s responses to the qualifying questions and any trials for which they qualified.

Example of a redacted profile with name, email, address, date of birth. The “answers” field contains structured responses to questions about personal health.

Database statistics for the “users” collection showing 37,170 records
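
The profile layout described above, and the kind of masking applied to the redacted example, can be sketched roughly as follows. This is only an illustration under stated assumptions: apart from the “answers” field, every key name here (`name`, `email`, `address`, `geo`, `date_of_birth`, `qualified_trials`) is hypothetical, the sample values are invented placeholders, and `redact_profile` is not taken from any Neoclinical or UpGuard tooling.

```python
from typing import Any, Dict

# Hypothetical key names for the direct identifiers described in the article.
# Only the "answers" field name is confirmed by the source.
PII_FIELDS = ("name", "email", "address", "geo", "date_of_birth")
MASK = "[REDACTED]"


def redact_profile(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a profile with identifiers and answer values masked."""
    redacted = dict(profile)  # shallow copy; the input dict is not modified
    for key in PII_FIELDS:
        if key in redacted:
            redacted[key] = MASK
    # Keep the question keys so the structure stays visible, but hide responses.
    answers = profile.get("answers", {})
    redacted["answers"] = {question: MASK for question in answers}
    return redacted


# Invented placeholder record, not real data.
sample = {
    "name": "Jane Example",
    "email": "jane@example.com",
    "address": "1 Example St",
    "geo": [0.0, 0.0],
    "date_of_birth": "1970-01-01",
    "answers": {"q1": "yes"},
    "qualified_trials": ["trial-a"],
}

if __name__ == "__main__":
    print(redact_profile(sample))
```

Masking the answer values while keeping the question keys mirrors the intent of a redacted screenshot: the shape of the record stays visible without revealing any individual’s responses.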

Each of those users has entered the Neoclinical system for the purpose of participating in a clinical trial. Whether they qualify for a trial depends on their responses to questions about their medical history. Some questions are about the frequency or severity of symptoms, while others are about past treatments.

Example of a series of questions about incontinence.

Questions about the use of pharmaceutical therapies

Questions about cancer treatments

Questions about past diagnoses for heart conditions

Conclusion

Without exposing documents produced by a physician (what one often thinks of as the model of “medical data”), these profiles reveal information about participants’ medical histories. That includes information generated by their interaction with the healthcare industry, like diagnoses and past treatments, as well as reports of their personal experiences related to their health. For individuals, this case provides a reminder that whenever they pass information to a third party, they should consider the impact of that data being exposed. And for companies, it should highlight the importance of having an incident response capability so that when data leaks occur, they can be mitigated within hours rather than weeks.