Can anyone give insight into why the number of users recorded in Mixpanel would differ from the number recorded in BigQuery? Both have been set up through Tag Manager with duplicate tags for all events, meaning the captured data should be equal.
Hi Chris S., Please stand by as I search the Mixpanel knowledge base for an AI-powered answer. I normally take about 15 seconds or so. In case AI cannot answer your question, feel free to reach out to Mixpanel support at mixpanel.com/get-support. If you have follow-up questions for this thread, please tag me again with your question: @kapa.ai <follow-up question> 👋
There are several reasons why the number of users recorded in Mixpanel might differ from the number recorded in BigQuery, even if both are set up through tag manager and appear to capture the same data:
1. Identity Resolution Differences: Mixpanel uses an identity resolution process to merge multiple identifiers for a single user into one, especially if ID merge is enabled. In exported data, events recorded before an alias event carry the original user identifier, not the resolved one. To count unique users accurately in BigQuery, you must join against the identity mapping table and count on resolved_distinct_id; otherwise the counts may not match what you see in Mixpanel UI reporting. Without this mapping, your BigQuery user count will likely be inflated or misaligned compared to Mixpanel’s unique user count.
2. Data Sync and Timing Issues: Event data in Mixpanel and your data warehouse (BigQuery) can fall out of sync due to late-arriving data, import API updates, or deletions (e.g., GDPR requests). Mixpanel’s data sync only covers up to 10 days in the past and does not fully guarantee that GDPR deletions are synced. Late data can also take up to a day to appear in BigQuery.
3. Tag Manager and Duplicate Tags: With a separate tag firing to each destination for every event, a tag that fires twice (or not at all) on one side can lead to events or users being counted in one system but not the other, depending on how each system deduplicates or processes incoming data. This could inflate user counts in BigQuery if duplicates are not handled there.
4. Ad Blockers and Client-Side Tracking Loss: Client-side tracking (such as via Tag Manager) can be unreliable due to ad blockers or Do Not Track settings, potentially causing Mixpanel to miss events for a significant portion of users (30–50%). If BigQuery is ingesting data from a more reliable source (e.g., server-side), this could also cause discrepancies.
5. Timezones and Query Differences: Ensure both systems use the same timeframes, timezones, and filters when comparing user counts. Differences here can easily lead to mismatched numbers, because the same event can fall on different calendar days depending on the timezone used.
6. Different Definitions of "User": Mixpanel’s unique user count is based on distinct_id (or resolved_distinct_id if using ID merge), while your BigQuery setup may be using a different identifier or not accounting for merged identities.
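To make the identity resolution point concrete, here is a minimal Python sketch of how counting on raw identifiers can inflate a user count. The event records and the `id_map` dictionary are made-up illustrations standing in for the exported events and the identity mapping table; the real column names in your export should be checked against your own schema.

```python
# Hypothetical exported events; "distinct_id" holds the raw (unresolved) identifier.
events = [
    {"distinct_id": "anon-1", "event": "Page View"},
    {"distinct_id": "user-42", "event": "Sign In"},
    {"distinct_id": "anon-2", "event": "Page View"},
]

# Hypothetical stand-in for the identity mapping table:
# raw distinct_id -> resolved_distinct_id.
id_map = {"anon-1": "user-42"}


def count_unique_users(events, id_map):
    """Count unique users after mapping each raw ID to its resolved ID.

    IDs with no entry in the mapping are treated as already resolved.
    """
    resolved = {id_map.get(e["distinct_id"], e["distinct_id"]) for e in events}
    return len(resolved)


raw_count = len({e["distinct_id"] for e in events})
resolved_count = count_unique_users(events, id_map)
print(raw_count, resolved_count)
```

Here the raw count treats `anon-1` and `user-42` as two people, while the resolved count merges them, which is the kind of gap you would expect between an unmapped BigQuery query and the Mixpanel UI.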
Recommendation:
To reconcile the counts, ensure you are:
- Using the identity mapping table and resolved_distinct_id in BigQuery if ID merge is enabled.
- Comparing the same timeframes, timezones, and filters.
- Checking for duplicate event ingestion due to tag manager setup.
- Accounting for possible data loss on the client side in Mixpanel.
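As an illustration of the timeframe and timezone check, the sketch below buckets the same two events by calendar day under two different UTC offsets. The field names (`time` as epoch seconds, `distinct_id`) and the fixed UTC-8 offset are assumptions chosen for the example; a real comparison would use your Mixpanel project's timezone and your BigQuery query's timezone.

```python
from datetime import datetime, timedelta, timezone


def daily_unique_users(events, tz):
    """Return {ISO date: unique user count}, with dates computed in timezone tz."""
    per_day = {}
    for e in events:
        # Interpret the epoch timestamp as UTC, then shift to the target timezone.
        local = datetime.fromtimestamp(e["time"], tz=timezone.utc).astimezone(tz)
        per_day.setdefault(local.date().isoformat(), set()).add(e["distinct_id"])
    return {day: len(users) for day, users in per_day.items()}


# Hypothetical events: 2024-01-02 03:00 UTC and 2024-01-02 12:00 UTC.
events = [
    {"time": 1704164400, "distinct_id": "a"},
    {"time": 1704196800, "distinct_id": "b"},
]

utc_counts = daily_unique_users(events, timezone.utc)
pacific_counts = daily_unique_users(events, timezone(timedelta(hours=-8)))
print(utc_counts)
print(pacific_counts)
```

In UTC both users land on 2 January, while at UTC-8 the first event falls on 1 January, so a daily unique-user report for the "same" day can differ between the two systems even when the underlying events are identical.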
For more details, see the Mixpanel documentation on user identity resolution and debugging data discrepancies.
