How does Mixpanel handle bot or crawler traffic (e.g., from OpenAI or other indexing services)? We’re seeing a 60–80% spike in blog traffic—particularly to two articles—and ~45% of that traffic is bot-related. GA4 is showing a big spike in bounce rate as a result, but we don’t see this in Mixpanel.
Hi Robert T., please stand by as I search the Mixpanel knowledge base for an AI-powered answer. I normally take about 15 seconds or so. In case AI cannot answer your question, feel free to reach out to Mixpanel support at mixpanel.com/get-support. If you have follow-up questions for this thread, please tag me again with your question: @kapa.ai <follow-up question> 👋
Feel free to ask a follow-up question once I've answered by mentioning me with @kapa.ai
Mixpanel has built-in mechanisms to handle bot traffic, which explains why you're seeing different results compared to GA4.
By default, Mixpanel's JavaScript SDK automatically filters out common bots, including:
- Yahoo! Slurp
- bingbot/2.0
- Googlebot/2.1
- Baiduspider/2.0
- YandexBot/3.0
Mixpanel's documentation provides the exact list of filtered bots, which you can find on their GitHub repository.
For bots not automatically filtered, you can implement custom filtering by:
1. Identifying the user agent information
2. Looking for the word "bot" in the user agent string
3. Setting the $ignore property to true for those sessions
Here's an example code snippet for filtering GTM bots:
    // Record the user agent as a super property so it is visible on every event
    var userAgentBotTest = navigator.userAgent;
    mixpanel.register({"User Agent": userAgentBotTest});
    // Flag sessions whose user agent contains "Mozilla/4.0" (the dot is escaped so it matches literally)
    if (/Mozilla\/4\.0/i.test(userAgentBotTest)) {
      mixpanel.register({"$ignore": true});
    }
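The three steps above can also be sketched as small pure functions, which makes the pattern easy to check against sample user agents before it goes live. This is a minimal sketch: `BOT_PATTERN`, `looksLikeBot`, and `botFilterProps` are hypothetical names, the pattern is an illustrative assumption rather than Mixpanel's built-in list, and the sample strings are only examples.

```javascript
// Steps 1-2: classify a user agent string.
// Assumption: this pattern is illustrative, not Mixpanel's own blocklist.
const BOT_PATTERN = /bot|crawler|spider/i;

function looksLikeBot(userAgent) {
  return typeof userAgent === "string" && BOT_PATTERN.test(userAgent);
}

// Step 3: build the super properties to register for this visitor.
// "$ignore" is only added when the user agent looks like a bot.
function botFilterProps(userAgent) {
  const props = { "User Agent": userAgent };
  if (looksLikeBot(userAgent)) {
    props["$ignore"] = true;
  }
  return props;
}

// Illustrative user agent strings for a quick manual check.
const samples = {
  googlebot: "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  chrome: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
};

for (const [name, ua] of Object.entries(samples)) {
  console.log(name, looksLikeBot(ua));
}
```

In the browser this would be wired up as `mixpanel.register(botFilterProps(navigator.userAgent));`. Keeping the classification separate from the SDK call means the pattern can be adjusted and re-checked without touching production tracking.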
This difference in bot filtering approaches likely explains why GA4 is showing a spike in bounce rate while Mixpanel isn't: Mixpanel is automatically filtering out more bot traffic than GA4.
For historical data that may contain bot traffic, Mixpanel recommends creating a custom event with appropriate filters to exclude the bot traffic from your analysis.
kapa.ai is still learning and improving, please let me know how I did by reacting below.
ℹ️ Tip: Mention @kapa.ai in the thread for follow-ups or DM the app for 1-to-1 help.
kapa.ai, tell me specifically what tests I should run, with no code and with some code, to make sure we're removing bot traffic at least at parity with what GA4 does. Mixpanel claims they remove more bot traffic than GA4, but we can't verify that because GA4 doesn't publish its documentation as clearly as Mixpanel does.
Let me check the knowledge base for you again. One moment! 📚
Based on your question about testing Mixpanel's bot filtering capabilities compared to GA4, I'll outline both no-code and code-based approaches to verify bot traffic removal.
No-Code Tests
1. Compare traffic metrics side-by-side: Set up a dashboard in Mixpanel using the Web Analytics Template and compare page views, unique visitors, and bounce rates with GA4 for the same time period. Significant differences may indicate different bot filtering approaches.
2. Use Mixpanel's Events view: Examine raw events in the Events report to look for patterns that might indicate bot traffic. This can help you identify if certain user agents or traffic sources are being filtered differently between platforms.
3. Check the Mixpanel Monitoring Dashboard: Mixpanel provides a Monitoring Dashboard that you can copy to your project to track high-level statistics around your events, which can help identify unusual traffic patterns.
4. Validate historical data: If you've migrated from GA4, use Mixpanel's Insights report to validate historical events by filtering for specific properties like Warehouse Import ID.
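To make the side-by-side comparison concrete, the gap between the two tools can be computed from numbers copied out of each dashboard. This is a minimal sketch with made-up counts. `relativeGap`, `flagDivergentDays`, and the 20% threshold are assumptions for illustration, not values from either product.

```javascript
// Relative gap between two counts, as a fraction of the larger one.
// Positive means GA4 counted more than Mixpanel.
function relativeGap(ga4, mixpanel) {
  if (ga4 === 0 && mixpanel === 0) return 0;
  return (ga4 - mixpanel) / Math.max(ga4, mixpanel);
}

// Return only the rows where the two tools disagree by more than `threshold`.
function flagDivergentDays(rows, threshold = 0.2) {
  return rows
    .map((row) => ({ ...row, gap: relativeGap(row.ga4, row.mixpanel) }))
    .filter((row) => Math.abs(row.gap) > threshold);
}

// Hypothetical daily page-view counts, typed in by hand from each dashboard.
const dailyPageViews = [
  { label: "Mon", ga4: 1000, mixpanel: 950 },
  { label: "Tue", ga4: 1800, mixpanel: 1000 },
];

console.log(flagDivergentDays(dailyPageViews));
```

Days that exceed the threshold are the ones worth opening in the Events view, since a large positive gap is consistent with GA4 counting traffic that Mixpanel filtered out.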
Code-Based Tests
1. Implement custom User Agent tracking: To understand what traffic is being filtered, track the User Agent string as a property:
    var userAgentString = navigator.userAgent;
    mixpanel.register({"User Agent": userAgentString});
2. Create a bot detection test: Implement a more comprehensive bot detection system and compare with Mixpanel's default filtering:
    var userAgentString = navigator.userAgent;
    mixpanel.register({"User Agent": userAgentString});
    // Check for common bot patterns
    var isBotByCustomCheck = /bot|crawler|spider|crawling/i.test(userAgentString);
    mixpanel.register({"Custom Bot Check": isBotByCustomCheck});
    // Track flagged visits with a special event for analysis. If this event shows up in
    // Mixpanel, the visit matched your custom check but was not removed by the default filtering.
    if (isBotByCustomCheck) {
      mixpanel.track("Potential Bot Visit", { "User Agent": userAgentString });
    }
3. Test specific bot patterns: If you're concerned about specific bots like OpenAI crawlers, implement targeted detection. Track the audit event before setting $ignore, since events sent after $ignore is registered carry that property and are expected to be dropped:
    var userAgentString = navigator.userAgent;
    if (/OpenAI|GPTBot/i.test(userAgentString)) {
      mixpanel.track("OpenAI Bot Filtered", { "User Agent": userAgentString });
      mixpanel.register({"$ignore": true});
    }
4. Implement property blacklisting: If you want to prevent certain properties from being sent, you can list them at init time. Note that this setting applies to every event the SDK sends, not only to potential bot traffic:
    mixpanel.init("YOUR_TOKEN", {
      property_blacklist: ['$referrer', 'custom_property']
    });
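The targeted detection in test 3 can be sketched in a form that is checkable outside the browser. The assumptions here: the token list is illustrative and incomplete, `matchAiCrawler` and `filterAiCrawler` are hypothetical helpers, "AI Crawler Detected" is a made-up event name, and events carrying `$ignore` are assumed to be dropped (which is why the audit event is sent first). If the SDK's built-in filter already blocks a given user agent, neither call would reach Mixpanel.

```javascript
// Substrings that appear in some AI-crawler user agents.
// Assumption: illustrative and incomplete; check each vendor's docs for current values.
const AI_CRAWLER_TOKENS = [
  "GPTBot",
  "ChatGPT-User",
  "OAI-SearchBot",
  "ClaudeBot",
  "PerplexityBot",
  "CCBot",
];

// Return the first matching token (case-insensitive), or null if none match.
function matchAiCrawler(userAgent) {
  const ua = String(userAgent).toLowerCase();
  return AI_CRAWLER_TOKENS.find((token) => ua.includes(token.toLowerCase())) || null;
}

// `client` is anything exposing track() and register(), such as the mixpanel
// browser object. Passing it in keeps the function testable with a stub.
function filterAiCrawler(client, userAgent) {
  const token = matchAiCrawler(userAgent);
  if (!token) return false;
  // Send the audit event first, then mark everything that follows as ignorable.
  client.track("AI Crawler Detected", { "User Agent": userAgent, "Crawler": token });
  client.register({ "$ignore": true });
  return true;
}

// In the browser: filterAiCrawler(mixpanel, navigator.userAgent);
```

Counting the audit event over the spike window gives a rough measure of how much AI-crawler traffic reached the page and was not caught by the default filter.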
By implementing these tests, you can gather data to compare Mixpanel's bot filtering with GA4's approach and determine if additional custom filtering is needed for your specific situation.
Remember that for historical data containing bot traffic, you can create custom events with filters to exclude bot traffic from your analysis.
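One way to turn the collected data into a number comparable with the ~45% bot share seen in GA4 is to compute the flagged fraction of exported events. This is a rough sketch: it assumes events were exported as objects with an `event` name and a `properties` map, and that the "Custom Bot Check" property from the detection test above was registered. `botShare` and the sample rows are hypothetical.

```javascript
// Fraction of events whose boolean flag property is true.
// Returns 0 for an empty export to avoid dividing by zero.
function botShare(events, flag = "Custom Bot Check") {
  if (events.length === 0) return 0;
  const flagged = events.filter(
    (e) => e.properties && e.properties[flag] === true
  ).length;
  return flagged / events.length;
}

// Hypothetical exported events for illustration only.
const exportedEvents = [
  { event: "Page View", properties: { "Custom Bot Check": true } },
  { event: "Page View", properties: { "Custom Bot Check": false } },
  { event: "Page View", properties: { "Custom Bot Check": false } },
  { event: "Page View", properties: {} },
];

console.log(botShare(exportedEvents));
```

A share well below GA4's figure would be consistent with Mixpanel filtering bots before they are recorded. A similar share would suggest the spike is also reaching Mixpanel and needs custom filtering.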