Referer Link Spamming

Referer link spamming is when someone from Site A sends requests to the web server at Site B with a forged Referer header, so that Site B thinks a visitor was referred from Site A. The hope is that Site B is running one of those dynamic visitor lists, so that Site A appears on Site B’s list when Google’s indexing spider visits. Google “sees” another link to Site A and ranks it as more important than other, comparable sites. The industry name for this practice is “search engine optimization.”

I’ve seen an uptick in the number of link spammers “visiting” my site. Initially, these were one-offs like the P.Hilton video hawkers who had set up on blogspot.com. Last month, there was a spike from a Hallowe’en costume vendor who did the same. I’ve also been seeing direct hits from makers of software to automate link spamming.

Here is a specific example of referrer spam, rendered in bitmap form because these people annoy me and I don’t want to encourage any business their way.

I don’t use the dynamic “visitor lists” for this very reason, but the link spamming still annoys me when I check my site stats and find that a new visitor is in fact a spammer. Luckily, my site runs on Apache, and this is moderately easy to fix by adding rewrite rules (mod_rewrite) to the Apache configuration.

For example, to block the above referers, I can create some general rules:

RewriteEngine On
# Match spam referers case-insensitively (NC); OR chains each condition to the next.
RewriteCond %{HTTP_REFERER} ^http://paris-hilton-video\.blogspot\.com.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*debt.*\.info.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*mortgage.*\.info.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*loan.*\.info.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*french-wine-cellar.*\.info.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*credit.*\.info.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://halloween-costumes-online\.blogspot\.com.*$ [NC,OR]
# The last condition must not carry OR.
RewriteCond %{HTTP_REFERER} ^http://www\.adminshop\.com.*$ [NC]
# Refuse any matching request with 403 Forbidden.
RewriteRule .* - [F]
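Referer patterns like these are easy to get subtly wrong, so it can help to try them against a few sample URLs before touching the server. Here is a minimal sketch in Python, assuming the dots in the host names are meant literally. The function name is_spam_referer and the sample URLs in the usage note are made up for illustration, and Python's re module is only a stand-in for Apache's regex engine.

```python
import re

# Assumed Python equivalents of a few of the RewriteCond patterns above.
# re.IGNORECASE plays the role of Apache's [NC] flag.
SPAM_REFERER_PATTERNS = [
    r"^http://paris-hilton-video\.blogspot\.com.*$",
    r"^http://(www\.)?.*debt.*\.info.*$",
    r"^http://(www\.)?.*mortgage.*\.info.*$",
    r"^http://halloween-costumes-online\.blogspot\.com.*$",
    r"^http://www\.adminshop\.com.*$",
]
SPAM_REFERER_RES = [re.compile(p, re.IGNORECASE) for p in SPAM_REFERER_PATTERNS]


def is_spam_referer(referer: str) -> bool:
    """Return True if any pattern matches, mirroring the [OR] chain."""
    return any(rx.search(referer) for rx in SPAM_REFERER_RES)


if __name__ == "__main__":
    # Hypothetical referers, invented purely to exercise the patterns.
    for sample in (
        "http://www.cheap-debt-help.info/",
        "http://www.google.com/search?q=apache",
    ):
        print(sample, is_spam_referer(sample))
```

If a legitimate referer comes back as spam here, the pattern is too broad and is worth tightening before it goes into the Apache config.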

I’d be interested in other experiences with getting rid of these bozos. One curious thing in my logs was that some of these sites request only the page and not the content within it. There should be a way to write an automated rule based on that.
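As a starting point for that kind of automation, here is a rough sketch in Python that scans an access log for clients that arrive with an outside referer but never fetch any of the page's embedded content. It assumes Apache's combined log format. The function name suspicious_referers, the own_host value, the list of asset extensions, and the sample log lines are all made up for illustration.

```python
import re
from collections import defaultdict

# Assumes the Apache "combined" log format:
# host ident user [time] "METHOD path PROTO" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"\S+ (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)"'
)
# Assumed set of extensions that count as "content within" a page.
ASSET_EXTENSIONS = (".css", ".js", ".png", ".gif", ".jpg", ".jpeg", ".ico")

# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '1.2.3.4 - - [10/Nov/2005:10:00:00 -0500] "GET /index.html HTTP/1.0" '
    '200 5120 "http://spam.example.info/" "Mozilla/4.0"',
    '5.6.7.8 - - [10/Nov/2005:10:01:00 -0500] "GET /index.html HTTP/1.1" '
    '200 5120 "http://www.google.com/search?q=x" "Mozilla/5.0"',
    '5.6.7.8 - - [10/Nov/2005:10:01:01 -0500] "GET /style.css HTTP/1.1" '
    '200 300 "http://example.org/index.html" "Mozilla/5.0"',
]


def suspicious_referers(lines, own_host="example.org"):
    """Return outside referers sent by clients that never requested an asset."""
    asset_clients = set()
    referers_by_client = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        ip = match.group("ip")
        path = match.group("path").split("?", 1)[0].lower()
        if path.endswith(ASSET_EXTENSIONS):
            asset_clients.add(ip)
        referer = match.group("referer")
        if referer not in ("", "-") and own_host not in referer:
            referers_by_client[ip].add(referer)
    flagged = set()
    for ip, referers in referers_by_client.items():
        if ip not in asset_clients:
            flagged.update(referers)
    return sorted(flagged)


if __name__ == "__main__":
    for referer in suspicious_referers(SAMPLE_LOG):
        print(referer)
```

This is only a heuristic: text-mode browsers, feed readers, and visitors with cached images also skip the embedded content, so the output is better treated as a list of candidates to review than as something to feed straight into a block rule.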