What's the table Schema of Mixpanel events when connecting to BigQuery for automated export?
Hi Asís Y., Please stand by as I search the Mixpanel knowledge base for an AI-powered answer. I normally take about 15 seconds or so. In case AI cannot answer your question, feel free to reach out to Mixpanel support at mixpanel.com/get-support. If you have follow-up questions for this thread, please tag me again with your question: @kapa.ai <follow-up question> 👋
Feel free to ask a follow-up question once I've answered by mentioning me with @kapa.ai
When exporting Mixpanel events to BigQuery, there are two main approaches to the table schema, depending on which pipeline method you're using:
Modern Data Pipelines (Current)
For the current BigQuery integration, Mixpanel creates a single table called mp_master_event
and stores all external properties inside the properties
column in JSON format. This design allows you to extract specific properties using BigQuery's JSON functions when querying the data.The table includes:
- Standard event fields (time, distinct_id, etc.)
- A JSON properties
column containing all event propertiesThe data is partitioned based on the _PARTITIONTIME
pseudo column in your project's timezone for better query performance.Mixpanel BigQuery Integration
Legacy Schematized Export Pipeline
If you're using the older schematized export pipeline, there are two schema options:
1. One Table for All Events (Monoschema)
A single table called mp_master_event
with:
- mp_event_name
(string): The name of the event
- time
(int): Timestamp when the event occurred
- distinct_id
(string): Unique ID of the user
- One column per unique property across all events
2. One Table per Event (Multischema)
Separate tables named after each event type, each containing:
- time
(int): Timestamp when the event occurred
- distinct_id
(string): Unique ID of the user
- One column per unique property for that specific eventSchematized Export Pipeline SchemaImportant notes:
- There's a limit of 9500 unique properties per table
- If this limit is reached, additional properties are stored in a JSON column called mp_properties
- You should not modify the schema of BigQuery tables generated by Mixpanel as it can cause export failuresFor querying examples and more details on the BigQuery integration, you can refer to the Schematized BigQuery Pipeline documentation.
kapa.ai is still learning and improving, please let me know how I did by reacting below.
ℹ️ Tip: Mention @kapa.ai in the thread for follow-ups or DM the app for 1-to-1 help.
Let me check the knowledge base for you again. One moment! 📚
In the Monoschema table structure of Mixpanel's legacy Schematized Export Pipeline, the operating system information is stored in a column named $os
. This is one of Mixpanel's default properties that is automatically collected by client-side SDKs.The $os
column contains values that identify the operating system of the device that sent the event, such as "iOS", "Android", "Windows", "macOS", etc. This allows you to differentiate between events coming from different operating systems when querying your data.This property is automatically collected by Mixpanel's JavaScript, Android, iOS, and Unity SDKs as shown in the default properties documentation Default Properties.When using the Monoschema in BigQuery, you can filter or segment your data based on this column to analyze user behavior across different operating systems.Note that if you're using the newer Data Pipeline integration with BigQuery, this property would be found within the JSON properties
column as properties.$os
.I hope this helps clarify the column name for operating system in the Monoschema!
kapa.ai is still learning and improving, please let me know how I did by reacting below.
ℹ️ Tip: Mention @kapa.ai in the thread for follow-ups or DM the app for 1-to-1 help.
Let me check the knowledge base for you again. One moment! 📚
Based on the knowledge sources provided, there isn't a default column specifically named "session_id" in the Mixpanel BigQuery export schema. However, Mixpanel does support Property-Based Sessions where you can define your own session ID property.According to the documentation on Property-Based Sessions: > "A Property-Based Session in Mixpanel is defined by a property you choose as the 'Session Id' (such as 'session_id') and the session persists as long as the value of the 'Session Id' property remains constant. As soon as the value for 'Session ID' changes, another session is started."
SessionsIf you're tracking a custom session ID property in your events (for example, a property named "session_id"), then in the Monoschema export to BigQuery: 1. For the legacy Schematized Export Pipeline: This property would appear as its own column in the mp_master_event
table, following the transformation rules (likely named "session_id").2. For the current BigQuery integration: Your session ID would be stored within the properties
JSON column, and you would need to extract it using BigQuery's JSON functions when querying.If you're using Mixpanel's automatic session computation rather than tracking your own session IDs, the session information would be available through the Session Start and Session End virtual events, which include properties like $duration_s
for session duration.Would you like more information about how to implement or query session data specifically?
kapa.ai is still learning and improving, please let me know how I did by reacting below.
ℹ️ Tip: Mention @kapa.ai in the thread for follow-ups or DM the app for 1-to-1 help.