This weekend, Google and Apple announced a massive coronavirus partnership. In the coming months, they’ll be rolling out updates to their operating systems to enable “contact tracing” – the process of identifying carriers of coronavirus so they can be isolated from the healthy population. The system will track whom you come into contact with by recording when your phone exchanges Bluetooth signals with other devices nearby.

Contact tracing predates Google and Apple, of course. During the 2014-16 Ebola crisis in West Africa, the World Health Organization carried out extensive on-the-ground interviews, asking people where they went and with whom they came into contact. Those contacts were then told to watch for symptoms and to quarantine themselves as needed.

Every country affected by the coronavirus is now adopting its own version of contact tracing, and almost all are going digital, leveraging the smartphones in people’s pockets through Bluetooth or geolocation data. How they go about it reflects local laws and norms around the use of personal data and people’s rights to privacy. For example, contact tracing in the European Union must comply with the EU’s privacy law, the GDPR, which gives Europeans more control over their data than Americans currently enjoy.

The Pan-European Privacy-Preserving Proximity Tracing (PEPP-PT) project, a consortium of over 130 research organizations from eight countries, is putting together a variety of proposals for contact tracing, including the Decentralized Privacy-Preserving Proximity Tracing (DP-3T) initiative backed by 25 academic researchers. PEPP-PT could offer a model for protecting privacy while carrying out necessary disease surveillance, its backers argue.

“This system is very good because it doesn’t leak,” says Claudia Diaz, Associate Professor and researcher at Katholieke Universiteit Leuven and Chief Scientist at Nym, an open-source, decentralized, permissionless protocol. “It’s very difficult to extract any meaningful information from what is visible, because there’s just some random keys and nobody can make sense of those keys unless you interact with that person.”

Europe’s debate over how to carry out contact tracing delineates the questions that the US will need to answer with its own contact tracing system. They include how to keep users’ Bluetooth IDs truly anonymous, how to secure the upload of Bluetooth data to servers, and what a decentralized, open system approach might look like.

The DP-3T proposal

DP-3T is like Singapore’s national TraceTogether app, which monitors the exchange of Bluetooth signals with other users of the app. If individuals are diagnosed with coronavirus, they can choose to allow the government to access their app, see what other phones they were near or had crossed paths with, and alert those individuals. The system creates a random ID for each person’s phone number, and phones exchange that ID rather than the actual phone number.

DP-3T processes the contact tracing data locally on the user’s device. Then, when a person is officially diagnosed with coronavirus, a health agency would authorize the upload of a record of Bluetooth contacts, each assigned a random ID that regularly changes. The system then sends the infected person’s Bluetooth IDs to other devices. Each device checks for an overlap with its own record of Bluetooth contacts and alerts its user if there was contact.
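
The local matching step can be sketched roughly as follows. This is a minimal illustration and not the DP-3T specification: the `ContactLog` class, its method names and the 16-byte ID size are all assumptions made for the example.

```python
import secrets


class ContactLog:
    """Hypothetical on-device log of rotating random IDs (illustrative only)."""

    def __init__(self):
        self.sent = []      # IDs this phone has broadcast
        self.heard = set()  # IDs observed from nearby phones

    def rotate_id(self):
        # Generate a fresh random ID; a real app would do this on a timer.
        new_id = secrets.token_bytes(16)
        self.sent.append(new_id)
        return new_id

    def observe(self, other_id):
        # Record an ID received over Bluetooth; the log stays on the device.
        self.heard.add(other_id)

    def check_exposure(self, published_ids):
        # Compare IDs published for diagnosed users against the local log.
        return bool(self.heard & set(published_ids))


alice, bob, carol = ContactLog(), ContactLog(), ContactLog()
bob.observe(alice.rotate_id())  # Alice and Bob were near each other
carol.observe(bob.rotate_id())  # Carol only met Bob

# Alice is diagnosed; with authorization, her broadcast IDs are published.
published = list(alice.sent)
print(bob.check_exposure(published))    # True: Bob's phone finds a match
print(carol.check_exposure(published))  # False: Carol's phone does not
```

The comparison happens on each phone, so no server needs to learn who met whom.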

Through this design, the random IDs don’t need to be centralized in any way, which limits the privacy risks, as well as the potential re-appropriation of data for other purposes, like state surveillance, the researchers say. The design would encourage trust in the apps built on the protocol, making them more likely to be downloaded and therefore more effective, they argue.

“With the approach that our team is exploring, you would not upload all your observed codes in a central database, but the key to generate the codes would be put in a database that will be sent to all the phones,” said Bart Preneel, a cryptography professor at Katholieke Universiteit Leuven, who is working on the DP-3T project and is an advisor to Nym.

The random codes your phone collects don’t reveal your location, or anything other than which codes you were in close proximity to. “The keys of infected people would be sent to all phones, and with this key, every smartphone can have an algorithm to detect whether yes, a code they’ve come in contact with matches this key. And that, we believe, is maximally private,” Preneel says.
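
One way to picture the key-based matching Preneel describes is the sketch below. It assumes, purely for illustration, that codes are derived from a secret key with HMAC-SHA256 and that 96 codes are generated per key. The function names, the derivation scheme and the constants are hypothetical and are not taken from the DP-3T design.

```python
import hashlib
import hmac
import secrets

CODES_PER_KEY = 96  # assumed: one code every 15 minutes over a day


def derive_codes(key, count=CODES_PER_KEY):
    """Derive short broadcast codes from a secret key (illustrative scheme)."""
    return [
        hmac.new(key, i.to_bytes(2, "big"), hashlib.sha256).digest()[:16]
        for i in range(count)
    ]


def was_exposed(observed_codes, published_keys):
    """Re-derive codes from each published key and look for a local match."""
    return any(
        code in observed_codes
        for key in published_keys
        for code in derive_codes(key)
    )


infected_key = secrets.token_bytes(32)
other_key = secrets.token_bytes(32)

# This phone overheard one of the codes the infected person broadcast.
observed = {derive_codes(infected_key)[10]}

print(was_exposed(observed, [infected_key]))
print(was_exposed(observed, [other_key]))
```

Publishing one key instead of every code keeps the download small, and the codes look random to anyone who does not hold the key.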

According to Diaz, centralized networks inherently include privacy vulnerabilities. For example, when someone uploads data, such as a Bluetooth ID, to a backend server, that upload could signal that they are reporting an infection to health authorities. Anyone observing this metadata at the network level could potentially identify the person, though it wouldn’t be easy.

“The backend that is receiving this information would be able to see the IP address,” says Diaz. “So, the IP address of my home is the IP address from which I send these messages and these messages correspond to somebody who has tested positive. So, they would be able to infer that I am positive or the people living in my house are positive.”

A centralized approach raises the risk of abuse by a nefarious actor or state-level adversary. The privacy of such data is no trivial matter. There have been numerous racist attacks linked to the coronavirus, and many people fear being evicted over lost income or even for being diagnosed with coronavirus.

Preneel says the DP-3T proposal includes a partial workaround for this. Even if you aren’t infected, your phone would send a dummy string, rather than a key, to the server. That way, your phone is regularly sending messages to the server, so an observer wouldn’t be able to tell which communication actually means you’re infected. But that work is still in development.
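
The idea behind dummy uploads can be sketched as below. This is an assumption-laden illustration, not the DP-3T design: the `build_upload` function, the one-byte flag and the 32-byte key size are hypothetical, and the sketch assumes the connection to the server is encrypted, so a network observer sees only the size and timing of each message.

```python
import secrets

KEY_LEN = 32  # assumed key size in bytes


def build_upload(diagnosed_key=None):
    """Return a fixed-size payload: a real report or a dummy (hypothetical format)."""
    if diagnosed_key is not None:
        return b"\x01" + diagnosed_key              # flag 1: real report
    return b"\x00" + secrets.token_bytes(KEY_LEN)   # flag 0: dummy filler


real = build_upload(secrets.token_bytes(KEY_LEN))
dummy = build_upload()

# Both payloads have the same length, so size alone reveals nothing.
print(len(real) == len(dummy))
```

If every phone sends one such payload on the same schedule, a real report does not stand out by size or timing.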

Diaz said the dummy traffic obscures whether someone has tested positive, but the backend server can distinguish if the message being received is a positive report that should be published or just a dummy message to be discarded. So the backend server can associate the observed IP address with the positive report.

Harry Halpin, the CEO of Nym, a privacy startup, has an additional tool that can address this. He’s offering up Nym’s mixnet as one alternative to build contact tracing apps on.

A mix network (taking its name from the proxy servers it employs, called “mixes”) obscures the metadata left behind when data passes through a network. It does this by taking messages, or packets of data, from one place, holding them, and waiting for a few more to come in. Then it shuffles or mixes them, like you would a deck of cards. It hands those to the next proxy server, which waits for some more packets, shuffles them, and so on. If there aren’t enough packets, the mixes generate fake ones, known as dummy traffic. While this makes the network slower, it also makes it much more anonymous and resistant to the observation of metadata.
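
The hold, pad and shuffle behavior described above can be sketched as a toy mix node. This is only an illustration: the `MixNode` class, the batch size of four and the 32-byte packet length are assumptions, and the sketch is not Nym's implementation. Real mix networks also re-encrypt packets at each hop so they cannot be recognized by their contents, a step omitted here.

```python
import random
import secrets


class MixNode:
    """Toy batching mix: hold packets, pad with dummies, shuffle, forward."""

    def __init__(self, batch_size=4, packet_len=32):
        self.batch_size = batch_size
        self.packet_len = packet_len
        self.rng = random.SystemRandom()  # OS-backed randomness for shuffling
        self.pool = []

    def receive(self, packet):
        # Hold the packet instead of forwarding it immediately.
        self.pool.append(packet)

    def flush(self):
        # Pad with dummy packets if too few real ones have arrived.
        while len(self.pool) < self.batch_size:
            self.pool.append(secrets.token_bytes(self.packet_len))
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)  # output order no longer reveals arrival order
        return batch


first, second = MixNode(), MixNode()
for message in (b"a" * 32, b"b" * 32):
    first.receive(message)

# The first mix pads to four packets, shuffles, and hands them to the next mix.
for packet in first.flush():
    second.receive(packet)

out = second.flush()
print(len(out))  # 4
```

Holding packets until a batch is full is what adds latency, which is the speed cost the article mentions.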

“With Nym, you can communicate freely without your internet traffic revealing your metadata. It’s even more important now given increased surveillance due to the coronavirus. Nym’s mixnet resists a global passive adversary like the NSA that can record all the traffic coming in and out of a network,” said Halpin.

Nym has asked to join the PEPP-PT and is currently building a coalition with related initiatives, like those pursued by Henry de Valence from the Zcash Foundation and Carmela Troncoso, a professor at the Swiss Federal Institute of Technology, Lausanne, who is the lead contact for Europe’s private corona contact tracing.

After reviewing the US proposal involving Google and Apple, Halpin says something like DP-3T is not ideal, but could be the best of suboptimal options if speed is of the essence.