Exclude Bot Activity (Web/JavaScript) - Articles From the Archive

  • 30 July 2019


Each week, Mixpanel will release articles from our archives to help you get the most out of Mixpanel. Follow the Topic Tag #fromthearchive to subscribe to these posts and get alerted when they drop.

By default, the following bots are filtered out by the Mixpanel JavaScript library:

Any other bot hitting your site will affect your Mixpanel data. That being said, it’s possible to set up some code to filter out these users:

  1. Find the user agent information of the individual accessing the site.
  2. Look for the word “bot” anywhere in the user agent information.
  3. If you find “bot,” set the $ignore property to true.
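The three steps above can be sketched roughly as follows. This is a minimal illustration, not official Mixpanel code: the function names `isLikelyBot` and `flagBotTraffic` are hypothetical, and the only Mixpanel call assumed is `register()`, which the article itself uses.

```javascript
// Hypothetical helper: true when the user agent contains "bot" (case-insensitive).
function isLikelyBot(userAgent) {
  return /bot/i.test(String(userAgent || ""));
}

// Registers $ignore on a Mixpanel-like client when the user agent looks like a bot.
// `client` is assumed to expose register(), as the Mixpanel JavaScript library does.
function flagBotTraffic(client, userAgent) {
  if (isLikelyBot(userAgent)) {
    client.register({ "$ignore": true });
    return true;
  }
  return false;
}

// In the browser, after the Mixpanel library has loaded:
// flagBotTraffic(mixpanel, navigator.userAgent);
```

A plain substring match on "bot" is deliberately broad, so it may also catch a legitimate client whose user agent happens to contain those letters. Review the matched user agents before relying on it.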

If you can, identify a common pattern in the bots so you can block them all in one shot, filtering out any interaction with your site that comes from a web framework that is not a consumer-facing browser. For example, for GTM bots the code would look like this:

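var userAgentBotTest = navigator.userAgent;
// Record the user agent on every event so bot patterns can be reviewed later.
mixpanel.register({"User Agent": userAgentBotTest});
// Flag this client as ignored when its user agent contains "Mozilla/4.0".
if (/Mozilla\/4\.0/i.test(userAgentBotTest)) {
  mixpanel.register({"$ignore": true});
}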

This code blocks every client whose user agent contains "Mozilla/4.0". That includes some older browsers, but current versions of Chrome, Safari, and Firefox do not include this token in their user agent strings. See a common list of bot user agents and common bot browsers.

If this does not work, start tracking the user agent on every event going forward so you can find the common pattern among the bots crawling your site.
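Once user agents are being recorded, one rough way to look for a common pattern is to count them by their leading product token. The sketch below assumes you have already collected the "User Agent" values into an array by whatever means you prefer; `countByLeadingToken` is a hypothetical helper, not part of the Mixpanel library.

```javascript
// Hypothetical helper: counts user agent strings by their leading product token
// (the text before the first whitespace), for example "Mozilla/4.0".
function countByLeadingToken(userAgents) {
  // Object.create(null) avoids collisions with inherited keys such as "constructor".
  const counts = Object.create(null);
  for (const ua of userAgents) {
    const token = String(ua).trim().split(/\s+/)[0] || "(empty)";
    counts[token] = (counts[token] || 0) + 1;
  }
  return counts;
}
```

A token that shows up often in suspicious traffic but rarely in known-good traffic is a candidate for the `$ignore` check.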

Note

$ignore must be set to a non-empty string or to true; otherwise the event will still fire. For example, '$ignore': '' still fires the event because the value is an empty string, and '$ignore': false fires the event as well.
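The rule in this note can be written out as a small check. This is only an illustrative model of the behavior described above, not Mixpanel's actual implementation, and `wouldBeIgnored` is a hypothetical name.

```javascript
// Illustrative model of the note above (an assumption, not Mixpanel's own code):
// an event is suppressed only when $ignore is true or a non-empty string.
function wouldBeIgnored(properties) {
  const value = properties["$ignore"];
  return value === true || (typeof value === "string" && value.length > 0);
}
```

Under this model, `{"$ignore": ""}` and `{"$ignore": false}` both let the event through, matching the two examples in the note.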

 

How do I remove bot data from my project?

Mixpanel data is write once, read forever, which means once a datapoint is written to a project, there isn't a way to selectively remove it.

However, there are a few other options:

 


 


4 replies

Hi @stephanie - I am implementing Mixpanel in a SaaS web application. We would like to upgrade to the paid plan, but we are seeing undefined MTUs. We added the Mixpanel init inside our code, as well as Sign In and Sign Out, but all other tracking is managed through Google Tag Manager. I am seeing undefined MTUs coming from gtm-msr.appspot.com. Attached here is a screenshot:

 

 

Based on your earlier post, where exactly would we add the following?

mixpanel.register({"$ignore": true});

And is Mozilla 4.0 still the user agent for Google Tag Manager?

 

Thank you for your help!

 

-AC

Hi @stephanie! Just want to follow up on my earlier post. We have Mixpanel set up but are struggling with this last piece. Thank you!


Hi @acx7 ,

 

It looks like I missed this!

 

Before diving in I wanted to share how the $ignore super property works. When the '$ignore' property is correctly registered as a super property on a device or website, it will prevent data from being sent to Mixpanel from that device or website. Like all super properties, the property is inserted into the session cookie (for web) or stored in the local device storage (mobile). As long as that cookie is there, or the device storage is intact, the super property will remain until it is programmatically unregistered. When the Mixpanel library loads, if it finds the $ignore property, it simply won't send data for that session or from that device.

 

In summary, when $ignore is registered as a super property, all events include that property in the payload, and they are dropped when they hit our servers. This applies only to event data, not user data, and it relies on super properties stored on the client device.
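Because the flag persists until it is unregistered, it can help to set and clear it through one helper. The sketch below is an assumption about how you might wrap this: `setIgnoreFlag` is a hypothetical name, and it relies only on the `register()` and `unregister()` methods of the Mixpanel JavaScript library.

```javascript
// Hypothetical helper: sets or clears the $ignore super property on a
// Mixpanel-like client that exposes register() and unregister().
function setIgnoreFlag(client, shouldIgnore) {
  if (shouldIgnore) {
    client.register({ "$ignore": true });
  } else {
    // Removing the property means later events are no longer flagged.
    client.unregister("$ignore");
  }
}

// In the browser, after the Mixpanel library has loaded:
// setIgnoreFlag(mixpanel, false);
```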

 

With this said, to stop sending bot data with your events, you'll need to implement your own custom logic to check for bots. The article shares an example of what this logic might look like.

Example:

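var userAgentBotTest = navigator.userAgent;
// Record the user agent on every event so bot patterns can be reviewed later.
mixpanel.register({"User Agent": userAgentBotTest});
// Flag this client as ignored when its user agent contains "Mozilla/4.0".
if (/Mozilla\/4\.0/i.test(userAgentBotTest)) {
  mixpanel.register({"$ignore": true});
}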

 

Also, I believe the user agent information you might want to look for now is Mozilla 5.0. Keep in mind that current versions of Chrome, Safari, and Firefox also begin their user agent strings with Mozilla/5.0, so match on a more specific part of the string rather than on that prefix alone. When implementing, note that $ignore is a flag that prevents event data from being sent, but user profiles can still come through. We don't have a flag that stops user data (it goes to a separate API endpoint). However, user data is easier to work with because you can alter it after the fact (update, delete, and so on), whereas event data is immutable.
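Since $ignore does not stop profile updates, one option is to guard those calls with the same kind of bot check. This is a rough sketch under assumptions: `setProfileUnlessBot` is a hypothetical wrapper, the default pattern `/bot/i` is only a placeholder for whatever pattern fits your traffic, and the only Mixpanel call used is `people.set()`.

```javascript
// Hypothetical wrapper: skips the profile update when the user agent matches
// the bot pattern. `client` is assumed to expose people.set(), as the
// Mixpanel JavaScript library does.
function setProfileUnlessBot(client, userAgent, profileProps, botPattern) {
  const pattern = botPattern || /bot/i; // placeholder pattern, adjust as needed
  if (pattern.test(String(userAgent || ""))) {
    return false; // treated as a bot, no profile update sent
  }
  client.people.set(profileProps);
  return true;
}

// In the browser, after the Mixpanel library has loaded:
// setProfileUnlessBot(mixpanel, navigator.userAgent, { plan: "trial" });
```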

 

Hope this helps! 

Thank you, Stephanie. It is strange, but we are no longer seeing empty MTUs from gtm-msr.appspot.com. We did not change anything in our implementation, but your suggestion is very helpful in case the issue arises again. Thanks again for your help.
