Google News and AdSense's Role in Blog Spam
Ed Kohler
Technology Evangelist has been live on Google News for two weeks now, giving us time to analyze how this new syndication source has affected our traffic. While our traffic has increased significantly since being added to Google News, the most interesting thing we've learned is how Google News, together with Google AdSense, plays a role in blog spam.

Here is how Blog Spam using Google News is Done:

1. Spammer creates new blog.

2. Spammer chooses theme for the blog.

3. Spammer scrapes headlines off Google News related to blog's theme. Publishes headlines as blog posts.

4. Spammer syndicates the scraped content onto search engines, including Google Blogsearch and Technorati.

5. Spammer places Google AdSense ads on their site.

6. Spammer makes money off the AdSense clicks. Since the content is marginal, the ads look particularly good to visitors landing on the pages.

This is particularly annoying for news junkies subscribing to queries on blog search engines. It leads to hundreds of false-positive search results from republished news stories. Here's an example:

Splog Example  

The above splog (spam blog) has page after page of scraped headlines from Google News with Google AdSense ads running in the right column. A story from our site was one of the scraped stories.

Here's another example:

Another Splog Example  

And another example:

Yet Another Splog Example
Yes, that site really does post two huge Google AdSense ad blocks before showing any real content. Then it finally posts the content it scraped from Google News.

Is this hard to do? Unfortunately, no. In fact, the whole process can be automated to search, scrape, and publish new splog pages at regular intervals, such as once an hour.

Clearly, the only reason this type of spam exists is that spammers can make money off Google AdSense advertising. This wouldn't be a problem, except it wastes my time, and the time of anyone else who happens to end up on pages like this. Chances are pretty good that advertisers paying for clicks from sites like this are getting less qualified visitors for their money than they get from sites with great content. This theory is based on the assumption that visitors will click on something, anything, once they land on such marginal web pages. A click away from a quality web site is likely a more considered click, and so offers a more qualified visitor to advertisers.

Why does Google Allow Splogs to Use AdSense?

Money. Money. Money. There was a time when Google hand-approved publishers for inclusion in the AdSense program, but that no longer seems to be the case. Google Blogsearch is currently in Beta and will likely remain in Beta until Google stops allowing sploggers to publish AdSense ads on their sites. Until Google makes that move, Google Blogsearch will be overrun with too much splog noise to be a usable search tool. As soon as Google does, much of this splogging will disappear overnight.

Technorati Creates Effective Workaround

Technorati has recently added a new feature that helps blog searchers filter out blog spam. Its new Authority slider allows you to filter out search results from blogs with low or no authority. Authority is measured by the number of inbound links, and splogs rarely have any inbound links (who would link to them?), so sliding the authority filter to the right quickly cleans up the results. Hat tips to Robert Scoble and Josh Teeters for pointing this out.
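
To make the idea concrete, here is a minimal Python sketch of how an authority filter of this kind could work. It is not Technorati's actual implementation, which isn't public as far as I know. The `SearchResult` class, the `filter_by_authority` function, the example URLs, and the link counts are all made up for illustration, and the sketch assumes authority is nothing more than a raw count of inbound links.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    """One blog search hit. All fields here are hypothetical."""
    url: str
    title: str
    inbound_links: int  # assumed stand-in for "authority"


def filter_by_authority(results, min_inbound_links):
    """Keep only results from blogs with at least min_inbound_links inbound links."""
    return [r for r in results if r.inbound_links >= min_inbound_links]


# Made-up sample data: one original story, one scraped copy, one small blog.
results = [
    SearchResult("http://example.com/original-story", "Original story", 42),
    SearchResult("http://example.org/scraped-copy", "Original story", 0),
    SearchResult("http://example.net/commentary", "Commentary on the story", 3),
]

# "Sliding the filter to the right" corresponds to raising the threshold.
for threshold in (0, 1, 10):
    kept = filter_by_authority(results, threshold)
    print(threshold, [r.url for r in kept])
```

Even the lowest non-zero threshold drops the scraped copy, since nobody links to it. The trade-off is that a high threshold also hides small but legitimate blogs, which is presumably why the setting is a slider rather than a fixed cutoff.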

Update: This article has already been scraped and splogged to a site running Google AdSense ads:


1. Posted by: Tim on February 16, 2006 8:39 AM:

I'd be curious to hear your thoughts (in the future) about how being included in Google News might impact Google's duplicate content filters. Will there be any cases in the future where other sites that have scraped yours rank higher than you for specific content in that article?

2. Posted by: Ed Kohler on February 16, 2006 8:57 AM:

Tim, while it's certainly possible for a scraper site to outrank an original source, it rarely happens. Why? Most scraper sites have little to no link popularity. Since search engines use link popularity to determine authority, the site with the highest link popularity will likely rise to the top of the results.

More specifically, the site linked to from your comment has a PageRank of 5. That's an impossibly high bar for splogs to cross.

3. Posted by: open source reader on May 19, 2006 5:06 PM:

Google and Blogspot seem to partially handle this kind of misuse of their tools.
Here is a post I wrote some time ago: 5 things I do not like in blogger
Item #1 was the captcha everywhere ...
In fact, I realized afterwards that my account had been locked by the anti-spam robot, so for every publishing action I have to fill in a captcha ... which is very annoying ...
Now that I have understood that, I have made a request for human checking ....
I should get regular usage now ...

just my 2cents

4. Posted by: geld lenen on February 14, 2008 3:05 PM:

Well... I see this is an old post, but I keep coming across this again and again, now sometimes integrated with mixed-up search results on several topics.

The strange thing is, they DO score in Google :S

5. Posted by: Nu geld Lenen on May 24, 2008 8:28 AM:

I'm not sure Google is going in the right direction. As you mentioned, money is what it's all about. But Google's idealistic goal to deliver the ultimate search results to the visitor is getting further and further away. Since Google earns big time on AdSense, they let it all happen and it's a shame.

6. Posted by: Nu geld lenen on May 24, 2008 8:38 AM:

Since Google has gone public, and its main focus is thus on shareholder value, I think the search results are not what they once were.

7. Posted by: Ed Kohler on May 24, 2008 9:46 AM:

Nu, do you have any examples of Google's quality slipping to back up your assertions? By one measure - market share - Google is stronger than ever.

8. Posted by: kmadhav on May 11, 2010 12:48 AM:

There are many blogs on the web that are run for this purpose only. Google gives these blogs a minus-thirty penalty. Google is updating its algorithm to handle these kinds of spammy blogs.

Best Regards

