I Got My File From Clearview AI, and It Freaked Me Out

Here’s how you might be able to get yours

Photo: Aitor Diago/Getty Images

Have you ever had a moment of paranoia just before posting a photo of yourself (or your kid) on social media?

Maybe you felt a vague sense of unease about making the photo public. Or maybe the nebulous thought occurred to you: “What if someone used this for something?” Perhaps you just had a nagging feeling that sharing an image of yourself made you vulnerable, and opened you up to some unknowable, future threat.

It turns out that your fears were likely justified. Someone really has been monitoring nearly everything you post to the public internet. And they genuinely are doing “something” with it.

The someone is Clearview AI. And the something is this: building a detailed profile about you from the photos you post online, making it searchable using only your face, and then selling it to government agencies and police departments who use it to help track you, identify your face in a crowd, and investigate you — even if you’ve been accused of no crime.

I realize that this sounds like a bunch of conspiracy theory baloney. But it’s not. Clearview AI’s tech is very real, and it’s already in use.

How do I know? Because Clearview has a profile on me. And today I got my hands on it.

Clearview AI was founded in 2017. It’s the brainchild of Australian entrepreneur Hoan Ton-That and former political aide Richard Schwartz. For several years, Clearview essentially operated in the shadows. That was until an early 2020 exposé by the New York Times laid bare its activities and business model.

The Times, not usually an institution prone to hyperbole, wrote that Clearview could “end privacy as we know it.” According to the exposé, the company scrapes public images from the internet. These can come from news articles, public Facebook posts, social media profiles, or multiple other sources. Clearview has apparently slurped up more than 3 billion of these images.

The company then runs its massive database of images through a facial recognition system, identifying all the people in each image based on their faces. The images are then clustered together, which allows the company to form a detailed, face-linked profile of nearly anyone who has published a picture of themselves online (or has had their face featured in a news story, a company website, a mugshot, or the like).
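
Clearview has not published how its clustering works, so the following is only a toy sketch of the general idea: each face is reduced to a numeric vector, and images whose vectors are similar enough are grouped into one profile. The function names, the 0.9 threshold, the three-number "face vectors," and the example URLs are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cluster_faces(embeddings, threshold=0.9):
    """Greedy grouping: an image joins the first cluster whose
    representative vector is similar enough, else starts a new one."""
    clusters = []  # each entry: (representative_vector, [source_urls])
    for url, vec in embeddings:
        for rep, urls in clusters:
            if cosine_similarity(rep, vec) >= threshold:
                urls.append(url)
                break
        else:
            clusters.append((vec, [url]))
    return [urls for _, urls in clusters]

# Hypothetical data: (where the photo was found, made-up face vector).
photos = [
    ("alumni-magazine.example/story", [0.90, 0.10, 0.20]),
    ("meetup.example/profile", [0.88, 0.12, 0.21]),
    ("news.example/other-person", [0.10, 0.90, 0.30]),
]
profiles = cluster_faces(photos)
```

In this sketch, the first two photos land in the same profile and the third starts its own. Set the threshold too loosely and a stranger's photo can end up in someone else's profile.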

Clearview packages this database into an easy-to-query service (originally called Smartcheckr) and sells it to government agencies, police departments, and a handful of private companies.

Clearview’s clients can upload a photo of an unknown person to the system. This can be from a surveillance camera, an anonymous video posted online, or any other source. In emails received by the Times, a detective even bragged about how the system worked on photos taken of unsuspecting subjects through a telephoto lens.

In a matter of seconds, Clearview locates the person in its database using only their face. It then provides their complete profile back to the client. As of early 2020, the company had more than 2,200 customers using its service.

What does a Clearview profile contain? Up until recently, it would have been almost impossible to find out. Companies like Clearview were not required to share their data, and could easily build massive databases of personal information in secret.

Thanks to two landmark pieces of legislation, though, that is changing. In 2018, the European Union began enforcing the General Data Protection Regulation (GDPR). And on January 1, 2020, an equivalent piece of legislation, the California Consumer Privacy Act (CCPA), went into effect in my home state.

Both GDPR and CCPA give consumers unprecedented access to the personal data that companies like Clearview gather about them. If a consumer submits a valid request, companies are required to provide their data to them. The penalties for noncompliance stretch into the tens of millions of dollars. Several other U.S. states are considering similar legislation, and a federal privacy law is expected in the next five years.

Within a week of the Times’ exposé, I submitted my own CCPA request to Clearview. For about a month, I got no reply. The company then asked me to fill out a web form, which I did. Another several weeks passed. I finally received a message from Clearview asking for a copy of my driver’s license and a clear photo of myself.

I provided these. In minutes, they sent back my profile.

For reference, here is the photo that I provided for my search.

It’s a candid cellphone photo of me making latkes. I deliberately sent a photo with a lot going on visually, and one where my face is not professionally lit or framed. I wanted to see how Clearview would perform on the kind of everyday photo that anyone might post to social media.

Here is the profile that I got back. Redactions in red are mine, as described below.

Based on the timing of emails and data from the Times’ story, I estimate that Clearview retrieved my profile in under one minute. It could have been as fast as a few seconds.

The depth and variety of data that Clearview has gathered on me is staggering. My profile contains, for example, a story published about me in my alma mater’s alumni magazine from 2012, and a follow-up article published a year later.

It also includes a profile page from a Python coders’ meetup group that I had forgotten I belonged to, as well as a wide variety of posts from a personal blog my wife and I started just after getting married.

The profile contains the URL of my Facebook page, as well as the names of several people with connections to me, including my faculty advisor and a family member (I have redacted their information and images in red prior to publishing my profile here).

From this data, an investigator could determine quite a lot about me. First and most obviously, they would know my name. They would also know where I went to school, what line of work I’m in, and the region where I live.

From my Facebook page, they could see anything I post publicly (I was so shocked by the data available there that I made my profile private after receiving Clearview’s report). And they would have data on several of my known associates — more than enough to access their Clearview profiles, too.

If someone was trying to track me down — especially someone with police powers at their disposal — Clearview’s profile would give them more than enough data to do so.

Perhaps most worrying is the fact that some of Clearview’s data is wrong. The last hit on my profile is a link to a Facebook page for an entirely different person. If an investigator searched my face and followed that lead (perhaps suspecting that the person was actually my alias), it’s possible I could be accused of a crime that the unknown, unrelated person whose profile turned up in my report actually did commit.

Remember, all of this data was retrieved using a single image of my face. I’ve never been arrested or convicted of a crime. Clearview gathered my data without my knowledge, and without any justification or probable cause.

They’ve likely done the same for you. If they have — and you’re a resident of California or a citizen of the EU — the company is legally obligated to give you your profile, too.

To access it, scan or photograph your driver’s license, and choose a clear photo of yourself where your face is fully visible (not obscured by glasses, a hat, or other objects). Send these to privacy@clearview.ai via email. Clearly state that your message is a CCPA or GDPR request.

Follow any instructions you receive. Expect your request to take up to two months to process. Be persistent in following up. And remember that once you receive your data, you have the option to demand that Clearview delete it or amend it if you’d like them to do so.
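
For readers who would rather script the request than type it, here is a minimal sketch using Python's standard email library. It only drafts the message and does not send anything. The wording of the letter, the subject line, the function names, and the placeholder name and address are my own assumptions, not language required by either law or by Clearview.

```python
from email.message import EmailMessage

def draft_request(full_name, reply_to, law="CCPA"):
    """Build (but do not send) a data access request email."""
    if law not in ("CCPA", "GDPR"):
        raise ValueError("law must be 'CCPA' or 'GDPR'")
    msg = EmailMessage()
    msg["To"] = "privacy@clearview.ai"
    msg["From"] = reply_to
    msg["Subject"] = f"{law} data access request"
    # Hypothetical wording; adjust to your own situation.
    msg.set_content(
        f"This message is a {law} request.\n\n"
        f"My name is {full_name}. Please send me a copy of all personal "
        "data you hold about me. A copy of my driver's license and a "
        "clear photo of my face are attached.\n"
    )
    return msg

def attach_jpeg(msg, filename, data):
    """Attach JPEG bytes (for example, a scanned license or a photo)."""
    msg.add_attachment(data, maintype="image", subtype="jpeg",
                       filename=filename)

# Placeholder values; real use would read the bytes from image files.
request = draft_request("Jane Doe", "jane@example.com", law="CCPA")
attach_jpeg(request, "license.jpg", b"placeholder-bytes")
attach_jpeg(request, "face.jpg", b"placeholder-bytes")
```

The finished message can be saved with `request.as_bytes()` and opened in a mail client, or passed to `smtplib` for sending.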

I know. A lot is going on right now. But as the novel coronavirus spreads worldwide, many facial recognition companies are using the pandemic as a reason to expand their services, including surveilling the public. Sometimes this leads to helpful safety measures. But we also need to be aware of — and actively manage — the privacy implications of this expansion.

Beyond the creepiness factor, Clearview’s intelligence gathering raises an age-old question. If you’ve done nothing wrong, should you care that the company is gathering data about you? If you’re a law-abiding citizen, it shouldn’t matter, right?

The issue with this is that doing “something wrong” is a very slippery concept. Clearview could be used to investigate serious crimes. But it could also be used to identify every person who attended a political rally or protest, using only surveillance photos or images posted on social media.

As the Times points out, it could also be used to blackmail nearly anyone. An unscrupulous user could record people having an embarrassing conversation in public, determine their identity using their faces, and threaten to publish the conversation unless they paid up.

It could also be used to look people up indiscriminately, for no reason at all. As the Times discovered, Clearview has laid the groundwork for accessing its system via AR glasses. This means it’s conceivable that a police officer could walk through a crowd wearing AR goggles, and see the name and background information of every person in their line of sight superimposed over the person’s head in real time.

We assume that we can enjoy a certain level of anonymity, even in public spaces. Clearview’s technology turns that assumption on its head.

There are other major issues with Clearview’s system. Its treatment of copyright, for example, should be enough to make any plaintiff’s lawyer’s mouth water.

Several of the photos that Clearview gathered of me — and integrated into its product — were taken by professional photographers. The picture on my Meetup profile page, for example, was taken by a photographer I hired. I own the copyright to it. And Clearview never obtained a license to use it.

On this front, the company attempts to hide behind the Digital Millennium Copyright Act. The DMCA provides a safe harbor for platforms like Facebook or Google if their users post copyrighted material. But Clearview is not a platform. And users aren’t posting photos to their system — the company is actively grabbing these on its own, without copyright owners’ consent. The DMCA is unlikely to apply to it.

The laws around fair use and A.I. products are complex and evolving. But if courts come down on the side of copyright owners, Clearview could potentially be sitting on 3 billion copyright infringements. Each could be worth up to $150,000 in statutory damages for willful infringement, provided copyright owners properly registered their rights. It’s a flaw in Clearview’s model that could easily bring down the company.

Even if Clearview disappears, though, another similar company would just start up in its place. Web scraping and facial recognition are essentially commodity products today. For a few million dollars, nearly anyone with a startup background could likely build their own version of Clearview in less than a year.

To truly ensure that companies like Clearview (or their clients) don’t abuse citizens’ privacy, society needs clear legislation dictating when facial recognition and other related technologies can and cannot be used. Already, some cities have begun to pass such legislation. But to be effective, this needs to happen at a much broader scale.

To the company’s credit, Clearview’s system is not just a privacy pariah. It’s also a breakthrough technology for investigating abhorrent crimes like child sexual abuse. As the Times reports, in one case Clearview helped to catch an alleged predator based on a reflected face in an unrelated photo posted at a gym. It’s also a powerful tool for solving long-abandoned murders, and all manner of other cold cases.

Any legislation governing technologies like Clearview’s should protect citizens from random searches. But at the same time, it should allow authorities to use services like Clearview when their use is justified.

Luckily, in the United States, we have a document that deals with achieving the proper balance between protecting society and respecting the rights of individuals. It’s called the Constitution.

If searches on Clearview followed the same rules as other searches (like the requirement that police agencies obtain a warrant to perform them), this would be a huge step toward protecting the privacy of innocent citizens. Crucially, it would still allow investigators to use the system to solve serious crimes just as they currently use court-authorized searches to investigate suspects.

Until such legislation is passed, the field of facial recognition is essentially the Wild West. Companies can gather nearly any data they want about you, and use it for nearly any purpose.

As with any issue involving privacy, until strong legislation is in place, it’s up to us as citizens to stay informed, and to protect our own rights. Those of us in California, and citizens of the myriad states and countries developing their own privacy laws, now have powerful tools in our arsenal to do just that.

Rather than waiting for legislation to arrive, use these tools today to find out what Clearview and other corporations know about you. Then decide what you want to remove, amend, or leave in place.

The power to control data has traditionally rested with big companies. But it’s increasingly shifting into our hands. Only through our own vigilance and action can we ensure that we understand and control the data gathered about us.