Data from the Lumen Database Highlights How Companies Use Fake Websites and Backdated Articles to Censor Google’s Search Results

Mostafa El Manzalawy - 2017 Lumen Summer Intern on August 24, 2017

Over the course of the summer, I have been researching notices pertaining to the little-known “stolen article” copyright scam that has been used to successfully remove an unknown number of unwanted URLs from Google’s search results. The scam is relatively easy to execute, and has grown in popularity since 2013. Below, I discuss the nature of this content removal tactic, and present my findings from a preliminary dataset of 42 DMCA notices targeting a total of 52 allegedly infringing URLs.

INTRODUCTION

Businesses have become increasingly creative in their attempts to misuse the DMCA to remove negative reviews from the Internet. They have gone to great lengths to falsely claim copyright infringement with the intent of taking down content from Google’s search results and review sites.

One such tactic is the “stolen article” scam, which uses fake websites and backdated articles to remove content online. As described in a previous blog post, the scam typically plays out as follows:

A company (or individual) will come across some undesirable content online, which they believe will cause them reputational harm. Desperate to censor the content at any cost, and lacking a valid case for defamation, they will often seek the assistance of a “reputation management” agency. These agencies will proceed to create a website masquerading as a legitimate news source, whose sole purpose is to host the very content their client is seeking to remove, usually disguised in the form of a news article. The article is then backdated to give it the appearance of being published prior to the allegedly infringing content. The reputation management agency then files a DMCA notice on behalf of the “journalist” who wrote the review, claiming it was stolen from their client’s website, all the while shielding the true client’s name with an alias designed to make it difficult to trace back to them.

METHOD

The goal of this research project was to gather a varied sample of notices from the Lumen Database that appeared to be using the stolen article scam to silence negative publicity online. Each notice was thoroughly investigated to ensure it was sufficiently “suspicious” to include in the dataset. The search for notices was not limited by date of submission, as this has been shown to be a fairly new phenomenon in the world of digital copyright fraud.

SOURCES

Lumen Database

In order to find notices that bear a close resemblance to those using the stolen article scam, I entered several combinations of the following words directly into the Lumen Database: “copied,” “review,” “stole,” “stolen,” “text,” “article,” “copyright,” “journalist,” in addition to the names of some well-known review sites like “Ripoff Report,” “Yelp,” and “TripAdvisor.” Results were then filtered by topic to show only DMCA notices.
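The keyword search described above can be sketched in a few lines of Python. This is only an illustration: the choice of two-word combinations, the pairing with quoted site names, and the helper name `build_queries` are assumptions, and the snippet only builds search strings without contacting the Lumen Database.

```python
from itertools import combinations

# Keywords and review-site names quoted in the text above.
KEYWORDS = ["copied", "review", "stole", "stolen", "text",
            "article", "copyright", "journalist"]
REVIEW_SITES = ["Ripoff Report", "Yelp", "TripAdvisor"]


def build_queries(keywords, sites, size=2):
    """Return search strings: each `size`-word keyword combination,
    on its own and paired with each quoted review-site name.
    (The combination size is an assumption for illustration.)"""
    queries = []
    for combo in combinations(keywords, size):
        base = " ".join(combo)
        queries.append(base)
        for site in sites:
            queries.append(f'{base} "{site}"')
    return queries


queries = build_queries(KEYWORDS, REVIEW_SITES)
print(len(queries))   # number of search strings to try by hand
print(queries[:4])    # a first few examples
```

Each string would still have to be pasted into the search box and the results filtered to DMCA notices by hand.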

Google Transparency Report

Once I had gathered the names of some of the “journalists” or “news sites” submitting DMCA takedowns to Google, I could then search Google’s Transparency Report for more of them. Under the section for “content removals due to copyright,” I was able to search for additional DMCA takedown notices submitted on behalf of the fraudulent websites.

News Articles

Although there is little written about this particular scam, I was able to find some examples in the press which served as a good starting point. Some notable mentions involve a UK home renovation company, a prominent Google executive, and an online gadget retailer, all allegedly attempting to remove negative reviews or articles by creating and backdating fake news articles online. WebActivism, a crowdfunded website dedicated to exposing online scams, was also tremendously helpful in my search for fake DMCAs.

INVESTIGATIVE TOOLS

DomainTools

DomainTools is an incredibly useful resource which allows you to look up historical WHOIS data of a particular domain.

For instance, let us examine a notice filed by “Fox18 News Network LLC” as a model for researching fake DMCAs. Fox18 News Network LLC filed a DMCA takedown notice against the New York Daily News, claiming that the newspaper’s article about a teen therapy program, Trails Carolina, was stolen from the Fox18 website. The New York Daily News article was published on November 26, 2014. Fox18 News Network LLC claims its article was published on November 25, 2014, one day prior.

By looking at the historical WHOIS data, we can see that the domain registration for fox18news.com in effect when the DMCA takedown notice was filed was last updated on August 24, 2015, to a new registrant by the name of “Registration Private” residing in Scottsdale, Arizona. This is most likely the date the domain was purchased by its new owner.

WHOIS registration for fox18news.com courtesy of DomainTools

The IP address history shows an IP change on September 1, 2015. The name server history also shows a change of server on August 27, 2015.

IP address history of fox18news.com courtesy of DomainTools

Name server history of fox18news.com courtesy of DomainTools

Listed below are the relevant dates in chronological order:

November 26, 2014: New York Daily News publishes article about Trails Carolina.

August 24, 2015: WHOIS registration data for fox18news.com is updated with a new owner.

August 27, 2015: fox18news.com hosting server is changed.

September 1, 2015: IP address shows a change.

April 12, 2016: DMCA takedown notice filed against New York Daily News.

Today: Fox 18 News no longer exists. The domain belongs to a new owner, and there is no trace of the news site or article in question.



As you can see, Fox 18 News’ domain history paints a very different picture than the one they are trying to portray in their DMCA notice. Based on this information, we can conclude that the domain name was likely acquired on August 24, 2015, nine months after the New York Daily News article was written, meaning their “news site” did not belong to them when the supposedly infringing article was published. Fox 18 News’ article was clearly backdated to make it appear as if it was written before the original one.
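The date comparison behind this conclusion is simple enough to write down. The following is a minimal sketch using only the dates from the timeline above; the function name `is_suspicious` and the rule it encodes are one illustrative way to state the check.

```python
from datetime import date

# Dates taken from the timeline above.
original_published = date(2014, 11, 26)  # New York Daily News article
claimed_published = date(2014, 11, 25)   # date shown on the fox18news.com copy
domain_acquired = date(2015, 8, 24)      # WHOIS record updated with new owner
notice_filed = date(2016, 4, 12)         # DMCA takedown notice filed


def is_suspicious(claimed, original, acquired):
    """Flag a claim when the 'earlier' article is dated before the original,
    yet the claimant only obtained the domain after the original ran."""
    return claimed < original and acquired > original


print(is_suspicious(claimed_published, original_published, domain_acquired))
print((domain_acquired - original_published).days)  # days from original to acquisition
print((notice_filed - domain_acquired).days)        # days from acquisition to notice
```

A gap of 271 days between the original article and the domain changing hands matches the "nine months" noted above.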

Internet Archives

In order to go one step further in this investigation, we can look for a snapshot or archived version of the website in its original form when the DMCA notice was sent. Some useful resources used in this report include the Internet Archive’s Wayback Machine, Archive.is, Screenshots.com, as well as WebActivism’s snapshots of various fraudulent websites.

The Wayback Machine reveals no archived webpages for fox18news.com from May 2006 until March 2016, when it first appeared as a "news site." There are several snapshots in 2014 and 2015, although they ultimately lead to dead links.
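For readers who prefer to script this lookup, here is a rough sketch built around the Internet Archive's public availability endpoint. The JSON layout it expects (`archived_snapshots` containing a `closest` entry) is an assumption based on that endpoint's documented behavior, the helper names are hypothetical, and the sample responses are hand-written for illustration rather than real results for any site.

```python
import json
from urllib.parse import urlencode

WAYBACK_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(site, timestamp):
    """Build a query for the snapshot closest to `timestamp` (YYYYMMDD)."""
    return f"{WAYBACK_ENDPOINT}?{urlencode({'url': site, 'timestamp': timestamp})}"


def closest_snapshot(response_text):
    """Return (timestamp, url) of the closest snapshot, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


# Hand-written sample bodies (NOT real archive data), for illustration only.
sample_hit = json.dumps({"archived_snapshots": {"closest": {
    "available": True, "status": "200", "timestamp": "20160301000000",
    "url": "http://web.archive.org/web/20160301000000/http://example.com/"}}})
sample_miss = json.dumps({"archived_snapshots": {}})

print(availability_url("fox18news.com", "20141125"))
print(closest_snapshot(sample_hit))
print(closest_snapshot(sample_miss))
```

Fetching the generated URL (for example with `urllib.request.urlopen`) and passing the body to `closest_snapshot` would show whether any capture exists near the claimed publication date.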

A look back at a snapshot of the article captured by WebActivism shows it was “published” on November 25, 2014, one day before New York Daily News allegedly stole it from them on November 26, 2014.

Fake article on Fox 18 News captured by WebActivism

URL: http://fox18news.com/2014/11/25/teen-missing-from-north-carolina-wilderness-therapy-camp-found-dead-after-breaking-hip-in-stream-autopsy

Real article on New York Daily News

URL: http://www.nydailynews.com/news/national/teen-missing-n-therapy-camp-found-dead-article-1.2025238

After taking the website's domain history and archived pages into account, we can conclude with some certainty that the Fox 18 News article was published over a year after the New York Daily News article, and that it was backdated to give the impression that it was published first.

FINDINGS

I was able to gather a sample of 42 individual notices fitting the profile of the stolen article scam in the Lumen Database, targeting a total of 52 URLs to be removed from Google’s search results. Each notice was thoroughly investigated so as to include only the ones that had a strong indication of being fraudulent.

Notice Description

The notice descriptions revealed some wording patterns frequently used in these kinds of scams. The word cloud below provides a visual representation of the most commonly used words included in the DMCA notices.

Word cloud of DMCA notice descriptions created on TagCrowd

References to specific business names or individuals, as well as quoted texts from the infringing articles, were removed from the word cloud analysis in order to have a clearer picture of the generic language used by the reputation companies in question.
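A word cloud is essentially a picture of word frequencies, so the same tally can be approximated in Python. This is a sketch only: it uses two of the notice descriptions quoted later in this post, and the small stop-word list is an assumption for illustration, not the list TagCrowd applies.

```python
import re
from collections import Counter

# Two notice descriptions quoted later in this post (Lumen 12051102 and 12051109).
descriptions = [
    "I am online journalist . Working for a reputed magazine . "
    "My article is copied as it is .Please look into this matter",
    "I am senior editor and my article is copied . Just to harm my reputation "
    "online . The article owner anonymously copied my content . "
    "Please look into this matter .",
]

# Illustrative stop words (an assumption, not TagCrowd's own list).
STOP_WORDS = {"i", "am", "a", "as", "is", "it", "my", "to", "for",
              "and", "the", "this", "into", "just"}


def word_counts(texts, stop_words):
    """Count lowercase words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


counts = word_counts(descriptions, STOP_WORDS)
print(counts.most_common(5))
```

Business names and quoted article text would need to be stripped from the descriptions before counting, as was done for the word cloud.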

Based on my observations, the DMCAs loosely stuck to the following format:

“I am a journalist from [fake website]. My article about [topic] was copied without my permission. The whole work was stolen and posted on [real website] without my permission. Please remove it from Google’s search results.”

There are, of course, many ways to phrase the same message, but that is generally the approach taken.

Another pattern of interest was the regularity with which notice descriptions had an unnecessary space before punctuation such as periods and commas. Below are some examples of these kinds of notices. I have replaced the quoted portions of the text with ellipses for simplicity’s sake in the following notice descriptions:

Lumen Notice: 12051102 Global Feminism Inc -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: Amena Capital Ltd (Australia)

“I am online journalist . Working for a reputed magazine . My article is copied as it is .Please look into this matter”

Lumen Notice: 12040318 Frankfort Herald News Corp -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: The Event (Pennsylvania, USA)

“Every single word is copied from my article . they used my source to publish their article with their unethical practices .”

Lumen Notice: 12097756 Frankfort News Corp -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: Brad Kuskin (New York, USA)

“Infringing the text excerpted on the site, beginning with the text "... " till the last word on this particular url . It's a totaly xerox of my article”

Lumen Notice: 12051109 Seiworld News Corp. -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: Ventana Capital (Colorado, USA)

“I am senior editor and my article is copied . Just to harm my reputation online . The article owner anonymously copied my content . Please look into this matter .”

Lumen Notice: 10908865 SeiWorld Broadcasting Networks Inc -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: Daniel J. Scavone – Attorney (New Jersey, USA)

“The article on the judgement given years ago had been covered by me . Please look into this matter and you see the whole content is copied .”

Lumen Notice: 11996205 Atha News Corp. -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: ATEL Development (Washington DC, USA)

“I am journalist . My whole article is copied from the beginning to the end along with the images .It is only done to harm my reputation online.”

Lumen Notice: 10997131 Seiworld Broadcasting Network -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: Indian Member of Parliament (India)

“I am journalist and work for breaking news section . My article on politician which was published a year back was part of breaking news has been copied . Please look into this matter”

Lumen Notice: 10909263 Seiworld Broadcasting Inc. -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: Indonesian Politicians (Indonesia)

“I am journalist and work for breaking news section . My article on politician which was published a year back was part of breaking news has been copied . Please look into this matter”

Lumen Notice: 12185604 Tom Middleton -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: APW Asset Management (UK)

“An investigation report on bankcrupty of a wine company is covered by me , wjich requires special reports with stats . My article is said and being published but , its being copied as it as onto other site without any legal documentation to republish . I request you to please look into this matter .”

Lumen Notice: 12224947 Amelia Hoghern -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: Trails Carolina (North Carolina, USA)

“the article about the missing boy through a therapy program is copied without going through any legal documentation work to redistribute my article . The article is copied , even the title as well . I would request you to please look into this matter”

Lumen Notice: 12082585 Fox18 News Network LLC -> Google Inc. Domain Location: Scottsdale, AZ Subject of Article: Trails Carolina (North Carolina, USA)

“the source of my article is being used here . Everything is copied and even the image . Please look into this matter .”

As you can see, 11 out of the 42 notices have this peculiar punctuation error, for completely unrelated articles. Takedowns for articles written about companies and individuals from the UK, Australia, India, Indonesia, as well as many different states within the US, filed on behalf of several seemingly random domains, all have the same style of writing, and the domains behind them were all registered to the same address in Scottsdale, Arizona. This is probably not a coincidence.
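The punctuation quirk is easy to test for mechanically. Below is a minimal sketch using a regular expression; the pattern, the helper name `has_stray_space`, and the "control" sentence (adapted from the generic template above) are illustrative assumptions, and a pattern this simple could also flag harmless text.

```python
import re

# Whitespace immediately before a period or comma, e.g. "journalist ."
SPACE_BEFORE_PUNCT = re.compile(r"\s+[.,]")


def has_stray_space(text):
    """Return True if the text contains a space before a period or comma."""
    return bool(SPACE_BEFORE_PUNCT.search(text))


# Excerpts keyed by Lumen notice number, plus a made-up control sentence.
samples = {
    "12051102": "I am online journalist . Working for a reputed magazine .",
    "12082585": "the source of my article is being used here . "
                "Everything is copied and even the image .",
    "control": "My article was copied without my permission. Please remove it.",
}

flagged = [notice_id for notice_id, text in samples.items()
           if has_stray_space(text)]
print(flagged)
```

Run over a larger export of notice descriptions, a filter like this could surface further candidates for manual review.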

The patterns in punctuation, language, and domain registration strongly suggest that there is a single reputation management company, registered in Scottsdale, using the stolen article scam to remove undesirable search results for a wide variety of businesses from all around the world.

Dates of Submission

The earliest instance of the stolen article scam in my dataset appeared on April 3, 2013. The takedown notice was using a fake website, wereviewwebsites.com, in an attempt to remove a Yelp review about a real estate brokerage firm from Google’s search results.

From the collection of notices examined, only two of this kind were submitted in 2013. Three were submitted in 2014, and eleven in 2015, with the remaining majority of notices submitted the following year. A total of eighteen notices were submitted in 2016 (excluding similar notices attempting to remove the same content).

Notices from the Lumen Database plotted by month

Domain Registry Location

Out of the 28 unique fake websites found in the dataset, 10 were registered in Scottsdale, Arizona. The breakdown is as follows:

Scottsdale, AZ (10 domains)
- lewisburgtribune.com
- frankfortherald.com
- globalgirlmagazine.com
- seiworld.com
- athanews.com
- tenpublications.com
- theconsumerguardian.com
- fox18news.com
- terifier.com
- saudidailynews.com

Faisalabad, Pakistan (3 domains)
- tech-cave.com
- mashablecity.com
- gotohomestay.com

Lexington, CA (1 domain)
- complaintscube.com

Kirkland, WA (1 domain)
- tentionfree.com

Delhi, India (1 domain)
- bravejournal.in

Protected or data otherwise unavailable (11 domains)
- familylegalexpert.squarespace.com
- rippoff.medianewsonline.com
- lifehealthmax.com
- wereviewwebsites.com
- newgenerationnews.esy.es
- wcn.besaba.com
- corporatemortgageservicesinc.wordpress.com
- newsbuzz.esy.es
- indianat.890m.com
- fashionmadefresh.com
- yourlifesolution.com

Once again, we see that most of the fake websites with an accessible domain history (10 of 16) were registered in Scottsdale, Arizona.
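The share of each location among the domains with an accessible history can be tallied as follows. This is a small sketch using the counts from the breakdown above; the variable and function names are illustrative.

```python
# Domains per registrant location, from the breakdown above
# (domains with protected or unavailable records are left out).
located_domains = {
    "Scottsdale, AZ": 10,
    "Faisalabad, Pakistan": 3,
    "Lexington, CA": 1,
    "Kirkland, WA": 1,
    "Delhi, India": 1,
}


def location_shares(counts):
    """Return each location's percentage of all located domains."""
    total = sum(counts.values())
    return {place: 100 * n / total for place, n in counts.items()}


shares = location_shares(located_domains)
for place, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{place}: {pct:.1f}%")
```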

Success Rate

According to Google’s Transparency Report, 16 out of the 52 URL takedown requests were approved (as of August 15, 2017). That means that approximately 30 percent of the likely fraudulent DMCA notices from the sample were successful in censoring content from Google’s search results by claiming copyright infringement with fake websites.
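The arithmetic behind this figure is shown below as a short sketch; the function name `success_rate` is illustrative.

```python
approved_urls = 16   # URL removals approved, per the Transparency Report
targeted_urls = 52   # URLs targeted by the 42 notices in the sample


def success_rate(approved, total):
    """Percentage of targeted URLs that were actually removed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * approved / total


rate = success_rate(approved_urls, targeted_urls)
print(f"{rate:.1f}% of targeted URLs were removed")
```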

This number does not include URLs that were initially removed, then re-indexed once the scam was publicized in the news. Some examples of Google reversing its decisions include the BuildTeam scandal, as well as the AdWeek article about Google executive Torrence Boone.

Although the sample size is admittedly small, the scam’s remarkably high success rate could indicate widespread abuse of the DMCA to unlawfully censor content on a much greater scale.

Additional Observations

Fake articles were backdated an average of 72 days (median = 8 days) before the original article was published. This is based on data collected from 28 URL removal requests, where both the fake and original article publish dates are available.

Scammers obtained their fake websites an average of 682 days (median = 341 days) after the “infringing” articles were posted online. This is based on data collected from 29 URL removal requests where both the domain histories and publish dates are available.

DMCA notices were sent to Google an average of 121 days (median = 100 days) after the fake websites were obtained. This is based on data collected from 35 URL removal requests with available domain histories.
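The gap between each average and its median suggests that a few extreme cases pull the averages upward. The sketch below shows how such lags can be computed from pairs of dates and why the two measures diverge. Only the first pair comes from this post (the Fox 18 News example); the other four are made-up values for illustration and are not taken from the dataset.

```python
from datetime import date
from statistics import mean, median

# (claimed publish date of the fake article, publish date of the original).
# First pair: Fox 18 News example from this post.
# Remaining pairs: HYPOTHETICAL values, for illustration only.
pairs = [
    (date(2014, 11, 25), date(2014, 11, 26)),
    (date(2015, 3, 1), date(2015, 3, 4)),
    (date(2015, 6, 2), date(2015, 6, 10)),
    (date(2016, 1, 1), date(2016, 1, 13)),
    (date(2015, 1, 1), date(2015, 7, 20)),   # one heavily backdated outlier
]

# Days by which each fake article was backdated ahead of the original.
lags = [(original - fake).days for fake, original in pairs]

print(lags)
print(mean(lags))     # pulled upward by the single outlier
print(median(lags))   # closer to the typical case
```

With one outlier in five records, the mean lands far above the median, which is the same shape as the figures reported above.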

GOING FORWARD

While there has been some discussion about the existence of the stolen article scam in legal circles, no comprehensive study has been conducted to investigate its prevalence and rate of success. Although this particular dataset only includes a total of 52 URLs (primarily due to time constraints), I would not be surprised to find several hundred or more additional DMCAs of this nature.

From the limited sample size, we can assume there is a high likelihood that one specific reputation management agency is filing the majority of these notices on behalf of clients all around the globe.

The people behind the online activist website, WebActivism, have claimed to have identified three major agencies responsible for this illegal behavior. Additionally, attorneys Marc J. Randazza and Alex J. Shepard recently filed a lawsuit on behalf of Pissed Consumer against a large number of these notoriously fake websites, and the individuals linked to their domain names. A settlement has reportedly been reached between Pissed Consumer and one of the accused reputation management agencies.

The low-risk, high-reward nature of this scam makes it extremely tempting to resort to in a desperate situation. The ease with which one could use it as a weapon against free speech, paired with the clear upward trend in the number of notices we have seen over the years, warrants further investigative research into which agencies are filing these fraudulent DMCAs, and what specifically can be done to stop them in the future.

Since these scammers are primarily relying on human error on the part of Google’s removals team, a crowdsourced effort to identify and report abuses of the DMCA is encouraged, as it would spread awareness to the OSPs being deceived by them. As I have demonstrated in this blog post, anybody willing to dig up additional notices of this nature can do so with freely available online tools, in conjunction with the Lumen Database.

Please note that this research project is a work in progress, and may be updated as I continue to source more fraudulent notices in the future. The dataset can be accessed here as a Microsoft Excel file for those interested in exploring some of the notices (last updated: August 15, 2017).

Mostafa El Manzalawy is a graduate of Sarah Lawrence College. He is currently pursuing his M.A. at NYU Gallatin, studying the tech industry and its effects on global wealth inequality. Mostafa is a tech enthusiast, spending his summer as an intern at the Berkman Klein Center for Internet and Society doing research relating to data protection and privacy law for the Lumen database. In his free time, he can be found spending far too much time on his computer, listening to podcasts, and indulging his interests in Eastern philosophy and human behavior.