The specific type of data that the DARPA program seeks to protect is your transactional data or data that you knowingly stream to a site or party. But it comes with a crucial caveat—the Brandeis program addresses data “that is knowingly provided to a third party, as opposed to data collected as a byproduct of interacting with the network or a system,” according to the DARPA announcement.

That caveat is important for two reasons. First, the goal is not just to protect privacy but to protect it while users are actively sharing data. Second, by focusing on information that citizens knowingly give to third parties, rather than information they inadvertently provide merely by interacting with machines, the DARPA program rules out building any future information system that might somehow get in the way of the military or law enforcement collecting signals intelligence as part of investigations (to the continued objection of privacy advocates).

Why is protecting data sharing important? On this count, the DARPA announcement is so strident on the social value of yottabytes of user-generated data that the casual reader might think the notice came from the Google press office.

Data sharing will create “personal medicine (leveraging cross-linked genotype/phenotype data), effective smart cities (where buildings, energy use, and traffic controls are all optimized minute by minute), detailed global data (where every car is gathering data on the environment, weather, emergency situations etc.), and fine grained internet awareness (where every company and device shares network and cyber-attack data).”

Without strong privacy controls, none of those futuristic visions can be realized. But protecting the privacy of people who are voluntarily sharing data is no straightforward undertaking. The same correlational analysis that can reveal a relationship between a certain protein construction and cancer can reveal the individual who volunteered that genomic information. A person’s electricity-usage patterns are distinct, based on that individual’s schedule, devices, habits, etc. All of that speaks to identity. For that reason, much of the privacy community regards the idea of rendering data fully and permanently anonymous as dubious. The DARPA researchers acknowledge that obstacle by pointing to the work of Carnegie Mellon University’s Latanya Sweeney, who has shown that date of birth, gender, and ZIP code together are enough to uniquely identify 87 percent of the U.S. population.
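To make the re-identification point concrete, here is a minimal sketch of how combining a few ordinary attributes shrinks the crowd a person can hide in. The records, field names, and the `unique_fraction` helper are all made up for illustration; they are not drawn from Sweeney's study or from the DARPA announcement.

```python
from collections import Counter

# Entirely made-up records; fields and values are illustrative assumptions.
records = [
    {"zip": "02139", "gender": "F", "dob": "1980-04-12"},
    {"zip": "02139", "gender": "F", "dob": "1975-09-30"},
    {"zip": "02139", "gender": "M", "dob": "1980-04-12"},
    {"zip": "15213", "gender": "F", "dob": "1980-04-12"},
    {"zip": "15213", "gender": "F", "dob": "1991-01-05"},
    {"zip": "15213", "gender": "M", "dob": "1991-01-05"},
]


def unique_fraction(rows, fields):
    """Return the fraction of rows whose combination of `fields` appears exactly once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


# Each added attribute singles out a larger share of the (toy) population.
for fields in (["gender"], ["zip", "gender"], ["zip", "gender", "dob"]):
    print(fields, round(unique_fraction(records, fields), 2))
```

In this toy set, gender alone singles out no one, ZIP code plus gender singles out a third of the records, and adding date of birth makes every record unique, even though no field is a name.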

But just because data is revelatory doesn’t mean that all data shared with a third party has to be viewable everywhere or at low cost. Ideally, the user who is sharing the data should be able to control how it’s viewed, rather than that responsibility falling on the third party. The point of the project is to ensure that those telltale data bits don’t wind up in the wrong places.