Does Google like auto-generated websites wrapped in Google AdSense ads?

The short answer is no.

The long answer is a bit more convoluted. But so long as they are...

well branded

well funded

operating at scale

good at public relations

...the answer is yes, auto-generated websites full of scraped content are fine.*

*based on Mahalo.com

Mahalo SEO Spam Case Study

The Sales Pitch & Launch

Originally when launching Mahalo, Jason Calacanis claimed that it would be spam free and that SEOs would have hell to pay.

He ran a multi-month sales pitch leading up to the launch of his site, in which he kept stating that Squidoo is spam and kept calling SEOs scumbags so he could pull in attention and links. This was well received by SEO conference organizers because people would talk online about how outrageous Jason's speech was. So, seeking marketing for their conferences, the organizers acted like lap dogs standing in line, waiting for their turn to have Jason call their paying attendees scumbags.

The publicity strategy worked well: it helped land Jason some mainstream press coverage, and a lot of dittohead bloggers (who lacked either the experience or the mental faculty needed to see the bigger picture) got behind him.

The Wikipedia page about Mahalo reflects the misinformed, public-relations-driven pitch:

Search results quality: Mahalo's goal is to improve search results by eliminating search spam from low-quality websites, such as those that have excessive advertising, distribute malware, or engage in phishing scams. Webmasters have a vested interest in seeing their sites listed. Calacanis has said that algorithmic search engines, like Google and Yahoo, suffer from manipulation by search engine optimization practitioners. Mahalo's reliance on human editors is intended to avoid this problem, producing search results that are more relevant to the user.

When people steal/borrow/syndicate content without adding any editorial value or original content, and then wrap it in ads, that is generally considered spam. We will come back to that topic later, I promise! ;)

Early Media Success

A bunch of links flowed around the above conversation, which helped Mahalo get off to a fast start. At first Jason claimed he wanted to create "the best" content for the most popular search queries. Many members of the media were duped by Jason's misinformation, as reflected in the CNET article titled "Jason Calacanis' Mahalo: Screw the long tail":

Instead of a server farm that crawls through the entire known Web so it can automatically match Web pages to the queries you type, Mahalo's search results are created by humans, in anticipation of the queries its users will type in. How can this possibly work? Because, Calacanis says, the top 10,000 search terms account for 24 percent of all searches. If you can create great results for the top results, users will learn to appreciate the difference between machine search results--which are often thrown off by spam and poor-quality links--and human-powered search pages, lovingly created by caring search editors. For the obscure "long tail" queries that make up the 76 percent of search terms, Mahalo will serve up Google results.

Their first x articles were typically thin link lists, but hand generated. Since the pages were just link lists, they were not remarkable enough to be linkworthy, and the service was not sticky enough to keep people coming back. So Mahalo also decided to ramp up link building & awareness using 4 strategies:

Matthew Wayne Selznick, who says he worked for Mahalo, wrote:

Regarding the Mahalo Blog Network: I don't know how recent that screenshot is, but it's amusing to see the blogs of several people who have either left the company or were laid off last October, when half the in-house editorial staff (including myself) was purged.

...

When I was working for Mahalo, staff were strongly encouraged to get blogs if we didn't have them and blog about Mahalo whenever there was a high-traffic opportunity like an awards show, sports or political event.

...

I unsubscribe from the blogs of my former co-workers when the majority of their posts are Mahalo link parades, just as I unsubscribe from any blog when it becomes a mouthpiece.

Their content was not Pulitzer Prize level, but the strategy paid off and they started pulling in search traffic.

Strategy Shift

Although Jason claimed that he just wanted to dominate the short head of search volume, that is not how Mahalo started gaining search traffic. Even if they poured hundreds of dollars into a piece of content, generalist content with little to no topical expertise could not compete for the most competitive, highest-traffic search keywords.

You need to have something useful or original to add to the conversation if you want to compete for the most competitive keywords, and penny-pinching outsourced content doesn't get the job done there.

Instead, what happened was that they ranked almost instantly for keywords like "best computer speakers", even with low-quality scraped content.

Around the time I highlighted the emergence of that strategy, Google's Matt Cutts was interviewed about it and claimed that it was fine because Jason Calacanis was using MediaWiki to create his site. Jason also did a bit of damage control in a Sphinn comment, where he claimed the spam pages were "experimental pages" that "we are no indexing".

In his own words:

That was 671 days ago. What has happened since?

A Prediction

Around the time of the above incident John Andrews (who gets the SEO field as well as anyone does) stated:

Everyone just copy Jason Calcanis and Mahaloo, ok? That sounds like a GREAT idea. Jason dissed SEOs in public, at a keynote, on purpose, and then learned a bit so he wasn’t quite so ignorant of SEO any more, and is now working the SERPs as a black hat SEO. Jason dissed affiliates in public, at a keynore, on purpose, and then learned a bit so he’s not as ignorant of affiliate marketing as he was before, and now Mahaoloo has embedded (inline) affiliate links (take a look.. added since Affiliate Summit). I think every "Learn how to Make Money Fast on the Internets" web site should simply point to Mahaoulo and say "copy them.. they are riding the black edge of gray hat SEO" and be done with it. So simple... just copy them. As they add pages, add splogs on those same topics because those are money terms. Every time they link to some resource, link to it from that blog. Scan technorati for Jason’s comments, and add one of your own right into that thread.. every time. Let Jason pave the way to profits.... each time he justifies his spam, he’s justified YOUR spam as well. Every time he explains how he’s not a spammer, he’s explaining why YOUR not a spammer either. Best of all, he’s being your spokesperson for FREE!

Was John Andrews once again correct? Let's take a look behind the curtain :D

What Happened?

Well, the computer speakers page highlighted above still ranks in the top 5 search results in Google.

And the site has been growing quickly, with traffic increasing at least 3-fold over the past couple of years.



Jason used the economic downturn as a convenient excuse to fire most of Mahalo's editorial staff. But a big piece of that traffic growth is that they have become more sophisticated in their content scraping strategy.

To appreciate how reliant their model is on scraping content, I want you to see how a new page starts off.

Once you strip the ads and scraped content from that page there is nothing left but branding & navigation.

Two other noteworthy things about that page are that it was generated by a robot (see below) and that it is already indexed in Google. Once you have enough domain authority you can publish automated scraped garbage and rank well in Google. It is the Mahalo strategy.

That page (which was automatically generated in under a minute by a fake user robot named searchclick) is already ranking well in Google! How do you know searchclick is a fake user? Well, look through all the different pages it created in under a minute each over the course of the last year...likely tens of thousands of them.

Understanding the Insidious Nature of Mahalo's Scraping

Search engines like Google scrape content so that they may provide a service of value to end users *and* publishers. The snippets they build from your pages are used to help promote your website.

What Mahalo does is take those snippets and publish them as content on its own site. So your page titles and your content snippets are used to rank their site, without your permission.

If you optimize your page titles on a new blog post, you are helping to feed relevant optimized content into the Mahalo machine. They will scrape it, and if you are less authoritative than they are, they will likely outrank you!

To add insult to injury, they put nofollow on the links back to the sources they are scraping, so while they are "borrowing" your content you are not getting any link credit for it.
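If you want to check whether a page that republishes your snippets passes any link credit back, you can audit its links for rel="nofollow". The sketch below is only an illustration using Python's standard library: the sample markup and the names LinkAuditor and audit_links are made up for this example and are not taken from Mahalo's actual pages.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Collect each link's href and whether it carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel can hold several space-separated tokens, e.g. "nofollow noopener"
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def audit_links(html):
    """Return (href, is_nofollow) for every link found in the HTML."""
    parser = LinkAuditor()
    parser.feed(html)
    return parser.links


# Hypothetical markup: a nofollowed link to the scraped source
# next to a normal internal link.
sample = """
<a href="https://example.com/original-post" rel="nofollow">Source title</a>
<a href="/internal-topic-page">Related topic</a>
"""

for href, nofollow in audit_links(sample):
    print(href, "nofollow" if nofollow else "followed")
```

Run against the saved HTML of a page that republishes your content, this shows at a glance which outbound links are nofollowed.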

And It Gets Worse!!!

As abusive and as extreme as the above sounds, it is actually only the first step in the process.

What happens next is that if your content (published on Mahalo without permission) causes the Mahalo page to rank for new valuable keywords, they may feed those keywords into their page generation tool and keep making more auto-generated pages in that area, leveraging their domain authority and YOUR content to compete against you while building an automated spam empire.

Some of the top-earning pages might have freelancers thicken them out, but the only reason humans are involved at that stage is to legitimize the mass content scraping farm that is the base of the operation. If a company has 200,000+ automated pages with zero overhead that make 5 cents/day each, that is real cashflow: $10,000+ per day of profit!
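The arithmetic behind that figure is easy to check. Here is a quick sketch using the hypothetical numbers from the paragraph above (200,000 pages at 5 cents per page per day); the function name daily_revenue is made up for this example, and the inputs are illustrative rather than Mahalo's reported earnings.

```python
def daily_revenue(pages, cents_per_page_per_day):
    """Return total daily revenue in dollars across all pages."""
    return pages * cents_per_page_per_day / 100


# Hypothetical inputs taken from the example above, not measured data.
pages = 200_000
cents_per_page_per_day = 5

per_day = daily_revenue(pages, cents_per_page_per_day)
print(f"${per_day:,.0f} per day")          # $10,000 per day
print(f"${per_day * 365:,.0f} per year")   # $3,650,000 per year
```

With zero overhead assumed, revenue and profit are the same number, which is why even pennies per page add up at this scale.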

Still not convinced of the profit potential? Mahalo.com has ~ 300,000 pages indexed in Google. On auto-generated pages it is far easier to get people to click an AdSense ad than it is to get them to buy something from Amazon.com (and you profit on 100% of the ad clicks vs only 1% of the Amazon.com clicks that convert). While there are 4 AdSense blocks *above* the Amazon.com affiliate links, Jason did $250,000 on Amazon's affiliate program last year "without trying" (again, his own stats in his own words...see Flickr.com/photos/jasoncalacanis/4234615626/ ).

Putting it All Together

If you build link equity and are good at public relations you can get away with murder in Google. Scale it big enough and the guidelines simply do NOT apply to you.

Most people who try to "pull a Mahalo" and spam up Google will likely fail because they lack:

the public relations & affiliations needed to attempt to legitimize such a strategy

the willingness to lie just to get a bit of media ink

the public relations & media savvy to pull such a major bait and switch without getting caught

the domain authority to make it work algorithmically

Originally when launching Mahalo, Jason Calacanis claimed that it would be spam free and that SEOs would have hell to pay. Now that he is scraping your content (and adding nofollow to the links to your content) I think he is right. You are losing out on your search traffic because an authority site is "borrowing" your content and outranking you with your own content.

Jason got Squidoo penalized by calling it spam. Under the same level of scrutiny, how is Mahalo, which scrapes millions of 3rd party content listings *without any editorial filter*, not spam? Squidoo at least donates $10,000 a month to charity. Mahalo just takes your content without permission and keeps all the cash.

Are the search results going to start filling up with Twitter recycling startups? What happens when the media gets in on this "what the bloggers have to say" scraping game? Does it even matter who created the content so long as someone wraps it in ads & ranks it?

I don't think we can stop people from being greedy or stealing, but I am surprised Google has turned a blind eye to this process. Is this what they want the web to become?