In the age of big data analytics, the proprietary algorithms web sites use to determine what data to display to visitors have the potential to illegally discriminate against users. This is particularly troublesome when it comes to employment and real estate sites, which could prevent users from having a fair crack at jobs and housing simply by failing to display certain listings to them based on their race or gender.

But four academic researchers who specialize in uncovering algorithmic discrimination say that a decades-old federal anti-hacking statute is preventing them from doing work to detect such discrimination. They say a provision of the Computer Fraud and Abuse Act could be used to criminally prosecute them for research that involves scraping publicly available data from these sites or creating anonymous user accounts on them, if the sites' terms of service prohibit this activity.

The researchers, along with First Look Media Works, which publishes The Intercept, filed a lawsuit today against the Justice Department, asserting that opening fake profiles to pose as job and housing seekers constitutes speech and expressive activity that is protected under the First Amendment. They further argue that because sites can change their terms of service at any time without informing visitors, this can suddenly turn any speech or activity on the site into a criminal act—a violation, they say, of the Fifth Amendment right to due process, which requires proper notice to the public of what constitutes criminal behavior.

They're asking the US District Court in the District of Columbia to enjoin the government from enforcing what they say is an unconstitutional provision that prevents them from doing meaningful research.

"Being able to run socially beneficial studies like ours is at the heart of academic freedom," Christian Sandvig, an associate professor of information and communication studies at the University of Michigan and one of the plaintiffs, said in a statement. "We shouldn’t have to fear prosecution just because we’re doing our jobs."

The case gets at the heart of what many consider to be a problematic provision in the anti-hacking law. Ordinarily, violations of a site's terms of service should only allow a site to bring civil action against users who breach those terms. But under the CFAA, federal prosecutors have interpreted terms-of-service violations as exceeding a site's authorized access, a criminal hacking violation that carries a maximum prison sentence of one year and a fine. Subsequent violations can result in a sentence up to ten years in prison and a fine.

The risk of prosecution for violating a site's terms of service isn't limited to academics, nor is it theoretical; the government has already brought such prosecutions at least twice. In 2008, federal prosecutors charged a Missouri woman named Lori Drew with three counts of violating the CFAA after she and two others created a fake MySpace profile to bully a classmate of Drew's daughter, who subsequently committed suicide. MySpace's user agreement required registrants to provide factual information about themselves; because Drew and the others created a fake profile for a nonexistent teenage boy in violation of those terms, federal prosecutors asserted that Drew had obtained "unauthorized access" to MySpace's servers.

The next year, the government prosecuted the owners of the ticket-scalping service Wiseguy Tickets for using a script and botnet to bypass Captcha protections on several ticket-selling sites—in violation of the sites' terms of service—and purchase concert and sporting event tickets in bulk. The defendants pleaded guilty.

That these prior cases involve bullying and scalping, rather than important academic research, matters little next to the precedent they established for how the government can invoke the CFAA.

Algorithmic Hijinks

The complaint (.pdf) was filed by the American Civil Liberties Union on behalf of First Look, Sandvig, and three other academics: Karrie Karahalios, an associate professor of computer science at the University of Illinois; and Alan Mislove and Christo Wilson, associate and assistant professors of computer science at Northeastern University.

All four academics have a track record in researching algorithms for discrimination. Sandvig and Karahalios were part of a 2014 study looking at how to audit for algorithmic discrimination (.pdf). Mislove and Wilson are part of the Algorithmic Auditing Research Group at Northeastern University and have co-authored several papers about measuring discrimination online. First Look's interest in the lawsuit stems from the media outlet's interest in doing similar discrimination research for stories.

Web sites often use algorithms to analyze user profile information, web surfing habits—determined through tracking cookies that sites place on the computers of visitors—and other information collected by data brokers from public records, social media sites, and store loyalty programs. The algorithms, which are proprietary and therefore opaque in how they work, can determine not only the ads a site serves to visitors but also things like the job and housing listings those visitors see. This can lead to discrimination that is illegal under the Fair Housing Act and Title VII of the Civil Rights Act.

"Big data enables behavioral targeting, meaning that websites can steer individuals toward different homes or credit offers or jobs—including based on their membership in a class protected by civil rights laws," the plaintiffs state in their complaint. Because of this, "[b]ehavioral targeting opens up vast potential for discrimination against marginalized communities, including people of color and other members of protected classes."

Sandvig and Karahalios are currently researching popular housing and real estate sites like Zillow.com, Trulia.com, Redfin.com, and Homes.com to determine if they offer different property listings to users based on race and other characteristics. Mislove and Wilson are conducting similar research of job sites like Monster.com and CareerBuilder.com to determine if their algorithms assign lower rankings to people based on gender or color. Job recruiting algorithms often rank job seekers for employers based on relevance, which can have an effect on who employers contact and who gets a job. If an algorithm consistently gives certain classes of people a low ranking, this could cause them to miss out on potential jobs.
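To make the ranking concern concrete, a hypothetical sketch of the kind of measurement this sort of research implies: given the rank positions a site assigns to otherwise comparable candidates, compare the average rank each group receives. Every number and name below is invented for illustration and does not come from any real site or from the researchers' actual studies.

```python
# Hypothetical illustration: compare the average search-rank position
# assigned to otherwise-comparable candidate profiles from two groups.
# A lower rank means the candidate is shown to employers earlier.
# All data here is invented.

def average_rank(ranks):
    """Mean rank position across a list of candidate profiles."""
    return sum(ranks) / len(ranks)

# Invented rank positions for matched test profiles.
group_a_ranks = [2, 4, 5, 7, 9]       # profiles signaling group A
group_b_ranks = [11, 13, 15, 18, 20]  # profiles signaling group B

gap = average_rank(group_b_ranks) - average_rank(group_a_ranks)
print(f"Group A mean rank: {average_rank(group_a_ranks):.1f}")
print(f"Group B mean rank: {average_rank(group_b_ranks):.1f}")
print(f"Mean rank gap: {gap:.1f}")
```

A persistent gap like this, measured across many matched profiles, is the kind of signal that could indicate an algorithm is systematically ranking one class of job seekers lower.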

Similar types of auditing in the offline world have long been considered a critical tool by courts and the government for uncovering racial discrimination in housing and employment practices. Past tests, for example, have consistently found that Caucasian job applicants receive about twice as many callbacks or job offers as African-American ones.
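The "twice as many callbacks" finding reduces to a simple rate ratio computed from a paired audit's tallies. The counts below are invented to illustrate the arithmetic, not drawn from any actual study.

```python
# Hypothetical paired-audit tally: matched resumes submitted under
# names signaling different races, counting employer callbacks.
# The counts are invented for illustration.

def callback_rate(callbacks, applications):
    """Fraction of applications that received a callback."""
    return callbacks / applications

rate_group_a = callback_rate(100, 1000)  # 10% callbacks (invented)
rate_group_b = callback_rate(50, 1000)   # 5% callbacks (invented)

ratio = rate_group_a / rate_group_b
print(f"Callback rate ratio: {ratio:.1f}")
```

A ratio around 2 mirrors the disparity past offline audits have reported; the online audits at issue in the lawsuit aim to measure the algorithmic analogue of the same gap.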

For the online equivalent, researchers audit algorithms for evidence of discrimination by using scripts to scrape publicly available data from the sites and by creating fake user profiles. Sandvig and Karahalios, for example, plan to generate multiple fake user accounts, known as "sock puppets," that exhibit behavioral characteristics associated with different racial groups to see if the housing sites discriminate against them.
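In outline, such an audit amounts to presenting the same query to a site under different personas and comparing what each is shown. A minimal sketch under stated assumptions: `fetch_listings` here is a stub standing in for the real (terms-of-service-restricted) scraping step, and the personas and listings are invented.

```python
# Sketch of a sock-puppet audit: query a site as two different
# personas and compare the listing sets each one is shown.
# fetch_listings is a stub standing in for a real scraper; the
# personas and addresses below are invented for illustration.

def fetch_listings(persona):
    """Stub: a real audit would load the site as `persona` and
    scrape the listings displayed. Here it returns canned data."""
    canned = {
        "persona_a": {"123 Elm St", "45 Oak Ave", "9 Pine Rd"},
        "persona_b": {"123 Elm St", "77 Birch Ln"},
    }
    return canned[persona]

listings_a = fetch_listings("persona_a")
listings_b = fetch_listings("persona_b")

# Set differences reveal listings shown to one persona but not the other.
only_a = listings_a - listings_b
only_b = listings_b - listings_a
print(f"Only persona A saw: {sorted(only_a)}")
print(f"Only persona B saw: {sorted(only_b)}")
```

Repeated across many persona pairs and queries, systematic asymmetries in the listings each persona sees would be the evidence of discrimination the researchers are looking for.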

But Zillow.com, Trulia.com, Realtor.com, Redfin.com, Homes.com, and Apartments.com all prohibit scraping in their terms of service, and many of these sites also prohibit users from providing false information. Job sites like LinkedIn, Monster.com, CareerBuilder.com, and TheLadders.com also prohibit this activity, raising the potential for the researchers to be criminally prosecuted.

Chilling Effects

The concern is that by threatening researchers who violate service terms with criminal prosecution, web sites could effectively chill research that helps determine if the web sites themselves are breaking laws. And because it's the web sites that draft the terms of service, "the recipe for avoiding Fair Housing Act and Title VII liability for algorithmic discrimination is straightforward," the plaintiffs write. "[M]erely employ terms of service that preclude subsequent speech about such discrimination, and it can continue unchecked."

Indeed, the plaintiffs say, some web site terms of service specifically require researchers to obtain advance permission to conduct research on their site, making it easy for gatekeepers to refuse access to researchers who might portray the site in a negative light. Other companies include blatant non-disparagement clauses in their terms that prohibit site visitors—including researchers—from speaking negatively about them.

"The work of our clients has a clear social benefit and is protected by the First Amendment," says Esha Bhandari, staff attorney with the ACLU's Speech, Privacy, and Technology Project. "This law perversely grants businesses that operate online the power to shut down investigations of their practices."

The plaintiffs say that by delegating power to companies to determine what constitutes criminal conduct, the government has essentially relinquished control of the lawmaking process to private companies, which they say is unconstitutional.

In 2008, that didn't matter to the jury in Lori Drew's case. Although jurors acquitted Drew of the three CFAA felonies with which the government charged her, they convicted her on lesser misdemeanor charges of unauthorized access, setting a dangerous precedent for others who violate a site's terms of service. US District Judge George Wu served as the voice of reason, however, when he overturned the conviction on grounds that the government's interpretation of the CFAA was unconstitutionally vague. The ACLU says it remains unclear whether that ruling will meaningfully influence future cases.

In giving federal authorities the power to criminally prosecute anyone who violated a site's terms of service, the conviction, if allowed to stand, essentially converted "a multitude of otherwise innocent internet users into misdemeanant criminals," Wu said.

That danger still looms today. The researchers' lawsuit aims to change that.