Mixpanel Community Icon

Counting Unique Users in Snowflake with Mixpanel Export Data

·
·

how to count unique users in snowflake from data expoerted from mixpanel data pipeline feature? it need to be joined with the identity cluster

  • Avatar of kapa.ai
    kapa.ai
    APP
    ·
    ·

    Hi Nave L., Please stand by as I search the Mixpanel knowledge base for an AI-powered answer. I normally take about 15 seconds or so. In case AI cannot answer your question, feel free to reach out to Mixpanel support at mixpanel.com/get-support. If you have follow-up questions for this thread, please tag me again with your question: @kapa.ai <follow-up question> đź‘‹

    Feel free to ask a follow-up question once I've answered by mentioning me with @kapa.ai

  • Avatar of kapa.ai
    kapa.ai
    APP
    ·
    ·

    To count unique users in Snowflake from data exported via Mixpanel's data pipeline feature, you need to use the identity mappings table to properly account for Mixpanel's ID merge functionality. Here's how to do it: When using the ID mappings table, you should use the resolved distinct_id instead of the non-resolved distinct_id whenever it's available. If there's no resolved distinct_id, you can fall back to using the distinct_id from the events table.Here's an example SQL query that counts unique users in San Francisco within a specific date range: LBKSSELECT COALESCE(mappings.resolved_distinct_id, events.distinct_id) AS resolved_distinct_id, COUNT(*) AS count FROM <DB_NAME> .<SCHEMA_NAME> .MP_MASTER_EVENT events FULL OUTER JOIN <DB_NAME> .<SCHEMA_NAME> .MP_IDENTITY_MAPPINGS_DATA mappings ON events.distinct_id = mappings.distinct_id AND events.properties:"$city"::STRING = 'San Francisco' AND TO_DATE(CONVERT_TIMEZONE('UTC','<PROJECT_TIMEZONE> ', events.time)) > = TO_DATE('2020-04-01') AND TO_DATE(CONVERT_TIMEZONE('UTC','<PROJECT_TIMEZONE> ', events.time)) <= TO_DATE('2024-09-01') GROUP BY COALESCE(mappings.resolved_distinct_id, events.distinct_id) LIMIT 100;LBKSMake sure to replace: - <DB_NAME> with your Snowflake database name - <SCHEMA_NAME> with your Snowflake schema name - <PROJECT_TIMEZONE> with your Mixpanel project timezoneThis query: 1. Joins the events table with the identity mappings table 2. Uses COALESCE to select the resolved_distinct_id when available, falling back to the original distinct_id when not 3. Groups by the resolved identifier to count unique users correctly 4. Filters by city and date range (which you can modify as needed)This approach ensures you're counting unique users the same way Mixpanel does in its UI reporting.Snowflake Integration Documentation

    kapa.ai is still learning and improving, please let me know how I did by reacting below.

    ℹ️ Tip: Mention @kapa.ai in the thread for follow-ups or DM the app for 1-to-1 help.