How to Identify and Filter Bad Referrals in Your Google Analytics Data

If you are a site owner or marketer, you have most likely seen a few odd referrals in your acquisition reports recently from websites that don’t seem to add up. Some of the biggest offenders you may have noticed are semalt.semalt.com, buttons-for-website.com, and variations of makemoneyonline.com.

What Are These Referrals?

These sites are not actual referrals; they are simply bots using black hat strategies to get extra traffic to their sites. They want you to research where the traffic is coming from in order to get you to their actual site.

How Does This Affect My Data?

These bots are the product of “black hat” SEO techniques but do not appear to be malicious. Our main concern should be how these bots are skewing our Google Analytics data. Especially if your site does not receive a large amount of traffic, bots can wreak havoc on percentages of traffic and on metrics like bounce rate and average time on site. Since the bots come to your site and bounce almost immediately, you can imagine how a few hundred of these referrals could make a website’s data virtually useless.
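To put rough numbers on that, here is a minimal sketch with made-up figures. It assumes every bot session counts as a bounce with zero recorded time on site, and that average time on site is total time divided by total sessions. The function name and all of the numbers are hypothetical and only there for illustration.

```python
def blended_metrics(real_sessions, real_bounces, real_seconds, bot_sessions):
    """Return (bounce_rate, avg_seconds_per_session) once bot sessions are mixed in.

    Assumption: each bot session is a bounce and adds zero seconds of time on site.
    """
    total_sessions = real_sessions + bot_sessions
    bounce_rate = (real_bounces + bot_sessions) / total_sessions
    avg_seconds = real_seconds / total_sessions
    return bounce_rate, avg_seconds


# Hypothetical small site: 1,000 real sessions, 400 of them bounces,
# 120,000 total seconds on site (a 120 second average).
clean = blended_metrics(1000, 400, 120_000, bot_sessions=0)
skewed = blended_metrics(1000, 400, 120_000, bot_sessions=300)

print(f"clean:  bounce rate {clean[0]:.1%}, avg time {clean[1]:.0f}s")
print(f"skewed: bounce rate {skewed[0]:.1%}, avg time {skewed[1]:.0f}s")
```

With these invented inputs, 300 bot sessions push the bounce rate from 40% to roughly 54% and pull the average time on site from 120 seconds down to about 92.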

As an organic search marketer, you may not think this affects you or your reporting. On the surface you may be correct: the basic Organic Search data available in Google Analytics’ Medium Report is not going to be affected at all by these bad referrals. Where it could come into play is when you use the traffic data from other mediums as a baseline or benchmark to compare against the traffic being attributed to Organic.

If you were to remove the bad referral data, not only would your overall site bounce rate go down and average time on site go up, but you would also get a better idea of the share of overall traffic or revenue that your organic channel is contributing.
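The effect on organic's share is simple arithmetic. The sketch below uses invented session counts (not data from any real account) to compare the share with and without the bot referrals in the total.

```python
def channel_share(channel_sessions, total_sessions):
    """Fraction of all sessions that came from one channel."""
    return channel_sessions / total_sessions


# Hypothetical session counts, for illustration only.
organic_sessions = 500
other_real_sessions = 500
bot_referral_sessions = 300

share_with_bots = channel_share(
    organic_sessions,
    organic_sessions + other_real_sessions + bot_referral_sessions,
)
share_without_bots = channel_share(
    organic_sessions,
    organic_sessions + other_real_sessions,
)

print(f"organic share with bots:    {share_with_bots:.1%}")
print(f"organic share without bots: {share_without_bots:.1%}")
```

In this made-up case, organic is really half of the site's traffic, but the bot sessions make it look closer to 38%.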

How Can I Fix My Referral Data?

Fortunately, there is a way to apply a rather quick fix to all of the views in your Google Analytics account in one fell swoop. What you’ll want to do is create a segment that excludes any sessions that come from a source like those listed above. By creating this segment, you can look at overall data in any report within Google Analytics and not have to worry about data skewing issues.

This new segment will replace your default All Sessions segment and will always be used when pulling data from multiple mediums or referral traffic in general. If you are reporting on referral data and comparing data Year-over-Year, you will definitely want to apply this segment when pulling reports. These bots were not active this time last year, and are therefore artificially inflating your referral numbers in GA.

The Solution

We have shared the basic segment below and welcome you to add other examples of sites you come across as more sites try to use this black hat strategy to drive traffic to their sites.

If you are not familiar with Shared Segments in GA, follow these simple instructions to get started:

  1. First, click Add Segment on any basic report within GA.
  2. Select New Segment and choose the Conditions option under Advanced on the left sidebar in the following screen.
  3. Set the filter to Sessions and change Include to Exclude.
  4. Where you see Ad Content, select Acquisition and then Source.
  5. Change Contains to Matches Regex so you can apply all sites within a single filter.
  6. In the text field, input the following regex (regular expression) to match the sources you would like to exclude:
    • (makemoneyonline|buttons-for-website|semalt)
  7. Click Save and the segment will be available automatically across all views on any accounts you are associated with on your personal login. Other users who have access to the same views will not get this segment automatically; it will need to be shared with them or created manually as above.
Instructions for setting up a segment to exclude bad referrals.
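If you want to sanity-check the pattern before saving the segment, you can try it against a few source names outside of GA. The snippet below is a rough sketch in Python, assuming the segment does an unanchored, case-insensitive partial match on the Source value; the helper name and the sample sources other than the offenders named above are hypothetical.

```python
import re

# Same pattern as in step 6. IGNORECASE is an assumption about how GA matches.
BAD_REFERRAL_PATTERN = re.compile(
    r"(makemoneyonline|buttons-for-website|semalt)", re.IGNORECASE
)


def is_bad_referral(source):
    """True if the source contains any of the unwanted referrer names."""
    return BAD_REFERRAL_PATTERN.search(source) is not None


# Sample sources; "best.makemoneyonline.com" and "example.com" are made up.
sources = [
    "semalt.semalt.com",
    "buttons-for-website.com",
    "best.makemoneyonline.com",
    "google",
    "example.com",
]

kept = [s for s in sources if not is_bad_referral(s)]
print(kept)
```

Because the pattern matches anywhere in the source, variations such as subdomains are caught without listing each one. To cover a new offender, add another name to the pattern separated by a pipe.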

By using this filtering segment in Google Analytics, you will have a clear view of your organic search data and ultimately, a more accurate picture of your overall site health.

Check out Mike’s previous post in the Google Analytics Blog Series.