If you use Google Analytics, you probably want to know how much of your traffic comes from real human beings. Google Analytics does not always make this clear. In fact, about 56 percent of all traffic on a typical website is bot traffic. In the past, bots could not process JavaScript, and because Google Analytics relies on JavaScript, they rarely showed up in reports. With the proliferation of dynamic Ajax calls, single-page web applications and jQuery, however, smart bots are taking over. Today, a bot can execute the Google Analytics tracking code in much the same way as a human’s web browser.

Bots and spammy referral data in Google Analytics

The main advantage is that search engine crawlers can now process JavaScript, which makes much of the human-readable web viewable via search engines. However, malicious smart bots, those that crawl sites to scrape content or for other nefarious gain, are known to cause damage to site owners.

The good search engine bots are generally excluded from Google Analytics statistics. They abide by the directives set out in a website’s robots.txt file and meta tags, and they crawl only the pages they are supposed to. The good bots also avoid sending any requests to the Google Analytics servers.

Types of Referral Spam

In terms of Google Analytics, referral spam comes in two flavors:

  • Spammy web crawlers: These are web crawlers that do not identify themselves as bots and end up appearing in your analytics reports as sessions with a 100 percent bounce rate and a duration of zero seconds.
  • Ghost referral traffic: This is probably the greater of the two referral spam evils, and it never actually visits your website. Here, spammers exploit the ability of Google Analytics to receive data through HTTP requests sent directly to its servers, which means anyone can spoof a session. Ghost referral traffic is generated by a simple program that sends fake HTTP requests to various Google Analytics properties, so the traffic never actually hits your site. The most annoying fact about ghost referral traffic is that it can also be used to spoof organic search results and send false events.
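
The spammy-crawler signature described above (a 100 percent bounce rate and zero duration) can be sketched as a simple check over exported referral rows. This is only an illustration in Python: the ReferralRow fields and the sample sources are hypothetical, not the layout of any real Google Analytics export, and a real visitor source could occasionally match the same pattern.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    """One row of a referral report (hypothetical field names)."""
    source: str                  # referring site as reported
    sessions: int                # number of sessions from that source
    bounce_rate: float           # fraction between 0.0 and 1.0
    avg_duration_seconds: float  # average session duration


def looks_like_spammy_crawler(row: ReferralRow) -> bool:
    """Heuristic from the text: every session bounced and none lasted any time."""
    return row.bounce_rate >= 1.0 and row.avg_duration_seconds == 0


# Made-up sample rows, for illustration only.
rows = [
    ReferralRow("news.example", 120, 0.42, 95.0),
    ReferralRow("crawler.invalid", 37, 1.0, 0.0),
]

suspects = [row.source for row in rows if looks_like_spammy_crawler(row)]
print(suspects)
```

A check like this only surfaces candidates for review. Ghost referrals can report arbitrary metrics, so they may not match it at all.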


Why is Referral Spam so Bad?

For starters, it will mess up your web analytics data. The “sessions” entered via referral spam are likely to skew your data, clouding the accuracy of engagement metrics and inflating traffic volume metrics. For those unaware of the spam issue, this can lead to decisions based on inaccurate data, especially on websites with low traffic.
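
To see why low-traffic sites are hit hardest, here is a small worked calculation in Python. All of the numbers are hypothetical and chosen only to make the arithmetic easy to follow. It also assumes that every spam session is counted as a bounce.

```python
# Hypothetical numbers for a low-traffic site; none of these come from real data.
real_sessions = 200   # sessions from human visitors
real_bounces = 80     # human sessions that viewed a single page
spam_sessions = 50    # fake sessions, each assumed to count as a bounce


def bounce_rate(bounces: int, sessions: int) -> float:
    """Share of sessions that bounced, as a fraction between 0 and 1."""
    return bounces / sessions


actual = bounce_rate(real_bounces, real_sessions)
reported = bounce_rate(real_bounces + spam_sessions, real_sessions + spam_sessions)
inflation = spam_sessions / real_sessions

print(f"actual bounce rate:   {actual:.0%}")     # 40%
print(f"reported bounce rate: {reported:.0%}")   # 52%
print(f"traffic inflated by:  {inflation:.0%}")  # 25%
```

The same 50 spam sessions would barely move these figures on a site with tens of thousands of real sessions.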

Referral spam also makes SEO more difficult for everyone. The main aim of referral spam is to create links from sites that publish access logs. Some of these websites publish their web analytics data publicly, including hyperlinks back to the spammer’s designated URL. Such backlinks improve search engine rankings for that URL, since several websites publish referrer data and are presumed trustworthy.

There are also more disreputable opportunities for referral spammers. If the spammer wants to send a website unwanted, unqualified traffic, all he or she has to do is change the referral URL to the victim’s URL. Since referral spam cannot be truly “authenticated” and traced to a specific source, it can be used to harm reputations by framing an innocent website as a spam referrer.

Another potential threat is malware being introduced to anyone curious enough to follow the referral spam addresses. With electronic data theft on the rise, it is simple for referral spam networks to point URLs to malicious software with the intention of stealing valuable personal information.

Finally, no one wants to look at advertisements when going through his or her web analytics acquisition reports.

The Solutions

  • Exclude foreign hostnames and filter spammy crawlers

A common attribute of many ghost referrals is inaccurate hostname attribution. When you review your Google Analytics referral data, the hostname for these sessions is usually unrelated to your website. Armed with this knowledge, you can create a filter that only shows data with an accurate hostname. For Google Analytics users with a single domain or only a handful of domains, this is the simplest solution. In many cases, substituting your top domain name for example.com will prove sufficient.

This hostname filter helps remove ghost referral traffic, and it also helps with the recent increase in direct traffic from a hostname with a “not set” value. An extra filter is still required to remove spammy web crawlers that actually visit your site and report an accurate hostname.
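
As a rough illustration of the hostname check, the Python sketch below tests hostnames against a regular expression of the kind you might enter as the pattern of an include filter on the hostname field. The pattern is an assumption, not an official recommendation, and example.com stands in for your own top domain.

```python
import re

# Assumption: example.com is a placeholder for your own top domain.
# The pattern accepts the bare domain and any of its subdomains.
VALID_HOSTNAME = re.compile(r"(^|\.)example\.com$", re.IGNORECASE)


def has_valid_hostname(hostname: str) -> bool:
    """Return True if the hostname belongs to the site being measured."""
    return bool(VALID_HOSTNAME.search(hostname))


# Hostnames as they might appear in a report (made-up samples).
for hostname in ["www.example.com", "example.com", "(not set)", "notexample.com"]:
    print(hostname, has_valid_hostname(hostname))
```

Anchoring the pattern with (^|\.) and $ keeps look-alike hostnames such as notexample.com from slipping through. If you track several domains, each one would need to be added to the pattern.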


  • Filter all referral spam sources

For domains where the measured view can easily be changed, blocking referral spam might require a deeper, more thorough referral filter that includes the offending referral sites. While such a list may target many of the most frequent referral spam sources, it is by no means exhaustive.
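
One way to keep such a referral filter manageable is to generate its regular expression from a list of domains that you maintain. The Python sketch below assumes a hand-kept list. The domains in it are invented placeholders, not real spam sources, and the helper names are hypothetical.

```python
import re
from typing import Iterable, Pattern

# Hypothetical placeholder domains; replace with the sources seen in your reports.
SPAM_REFERRERS = ["spam-buttons.invalid", "free-seo-traffic.invalid"]


def build_exclusion_pattern(domains: Iterable[str]) -> Pattern[str]:
    """Combine the listed domains into one pattern, escaping dots and hyphens."""
    escaped = sorted(re.escape(domain) for domain in domains)
    if not escaped:
        raise ValueError("at least one domain is required")
    return re.compile(r"(^|\.)(" + "|".join(escaped) + r")$", re.IGNORECASE)


EXCLUSION_PATTERN = build_exclusion_pattern(SPAM_REFERRERS)


def is_spam_referrer(source: str) -> bool:
    """Return True if the source is a listed domain or one of its subdomains."""
    return bool(EXCLUSION_PATTERN.search(source))


print(is_spam_referrer("sub.free-seo-traffic.invalid"))
print(is_spam_referrer("google.com"))
```

Because new spam sources keep appearing, the list behind the pattern has to be reviewed and extended regularly.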


  • Advanced segments for historical data

Filters only process data going forward, but you can use advanced segments to review historical data from before the filters were implemented. As with the solutions above, decide which approach best suits your website and apply a regular expression to remove sessions that came from referral spam.


  • Bot filtering in view settings

Since July 2014, Google has made it possible to filter out bots and spiders for more accurate data. You can select this option in the view settings of the admin interface. It removes any sessions from bots and spiders named in the Interactive Advertising Bureau (IAB) known bots and spiders list.



Unfortunately, the above solutions are only temporary for now. Spammers are continuously innovating, and Google Analytics users like you are in danger of falling prey to fake referral traffic generated through more sophisticated means. While Google and the other web analytics providers work on new mechanisms for combating referral spam, it is critical that you protect your website data by working with an SEO company in Los Angeles to help avoid unpleasant surprises. Do note, however, that Google is continuously working on a solution to the referral spam issue.