how can i join the tables mixpanel.mp_master_event_raw and mixpanel.mp_identity_mappings_data_raw and have distinct values for user id
To join the mp_master_event_raw
and mp_identity_mappings_data_raw
tables in Snowflake while getting distinct values for user IDs, you'll need to use the resolved distinct_id from the identity mappings table. This is particularly important for projects with ID merge enabled, as it helps you accurately count unique users.Here's how you can do it: LBKSSELECT COALESCE(mappings.resolved_distinct_id, events.data:distinct_id::string) AS resolved_distinct_id, COUNT(*) AS count FROM mixpanel.mp_master_event_raw events FULL OUTER JOIN mixpanel.mp_identity_mappings_data_raw mappings ON events.data:distinct_id::string = mappings.distinct_id WHERE -- Add your filters here -- For example: events.data:mp_event_name::string = 'sign up' GROUP BY COALESCE(mappings.resolved_distinct_id, events.data:distinct_id::string) LIMIT 100;LBKS
This query:
1. Joins the raw event table with the identity mappings table
2. Uses COALESCE to prioritize the resolved_distinct_id when available
3. Falls back to the original distinct_id when no mapping exists
4. Groups by the resolved ID to get distinct user countsFor more advanced querying of the identity mapping table, you can refer to the examples in the Snowflake integration documentation.Note that when working with ID merge enabled projects, using the resolved distinct_id is crucial to replicate the same user counts you see in the Mixpanel UI.
kapa.ai is still learning and improving, please let me know how I did by reacting below.
ℹ️ Tip: Mention @kapa.ai in the thread for follow-ups or DM the app for 1-to-1 help.
Let me check the knowledge base for you again. One moment! 📚
I understand you're still seeing discrepancies between your SQL query results and what's shown in the Mixpanel UI. This is a common issue when working with exported Mixpanel data. Let me help you troubleshoot this.
Common Causes of Discrepancies
There are several reasons why your query results might not match the Mixpanel UI: 1. Different Calculation Methods: Mixpanel reports calculate data differently. For example, Insights might count total events while Funnels count unique users.2. Timezone Differences: Make sure you're converting UTC timestamps to your project timezone.3. Late-Arriving Data: Data can take up to 24 hours to sync from Mixpanel to your destination.4. ID Resolution Complexity: The identity resolution might not be implemented correctly in your query.5. Filtering Differences: The filters applied in the UI might differ from your query.
Debugging Steps
Here's what you can try: 1. Check Your Identity Resolution Logic:
Make sure you're properly using the identity mappings table. For Snowflake, try this improved query: LBKSSELECT COALESCE(m.resolved_distinct_id, e.distinct_id) AS resolved_distinct_id, COUNT(**) AS count FROM "<DB_NAME> "."<SCHEMA_NAME> "."MP_MASTER_EVENT" e FULL OUTER JOIN "<DB_NAME> "."<SCHEMA_NAME> "."MP_IDENTITY_MAPPINGS_DATA" m ON e.distinct_id = m.distinct_id WHERE -- Add the same filters you're using in the UI -- e.mp_event_name = "Your Event" GROUP BY resolved_distinct_idLBKS
2. Verify Timezone Conversion:
Ensure you're converting UTC to your project timezone: LBKSSELECT TO_DATE(CONVERT_TIMEZONE('UTC','<PROJECT_TIMEZONE> ', time)) as event_date, count(**) FROM "<DB_NAME> "."<SCHEMA_NAME> "."MP_MASTER_EVENT" WHERE event_date > = TO_DATE('2023-01-01') AND event_date <= TO_DATE('2023-01-31') GROUP BY event_dateLBKS
3. Compare Calculation Methods:
Check if you're comparing the same metrics (total events vs. unique users) <https://docs.mixpanel.com/docs/tracking-best-practices/debugging#debugging-discrepancies|Debugging Discrepancies>.4. Drill Down to Specific Segments:
Try narrowing down to a specific day or user segment to identify patterns in the discrepancies.5. Check for Hidden Events:
Mixpanel exports all events to your destination, even ones hidden in the UI via Lexicon.If you're still experiencing issues, I recommend drilling down into specific segments of data to identify patterns in the discrepancies, as suggested in the Debugging Discrepancies documentation.
kapa.ai is still learning and improving, please let me know how I did by reacting below.
ℹ️ Tip: Mention @kapa.ai in the thread for follow-ups or DM the app for 1-to-1 help.