Default Distinct ID

  • 15 November 2021
  • 2 replies



Every anonymous user gets an automatically generated id from Mixpanel - example: 039852ed-7ceb-47ef-88bd-766a14f46aa9 - and then we set our own custom id on authentication.

So it’s common for a user to have several Mixpanel IDs like the example above, along with one custom ID set in the client (i.e. the ID that matches the user in our DB).

The issue:

I’d like to break down users by this custom ID, but sometimes users show up with their anonymous IDs and other times with their custom IDs. Is there any way to always default to custom IDs?

Example use case:

  • Get all people who did event X in the last 7 days
  • Break down by Distinct ID
  • Get a table with the IDs (custom, not the Mixpanel anonymous ID) of users who did event X

Thanks in advance!


Best answer by nataliak 26 May 2022, 00:18


2 replies


Hi there! To give a bit of context: with ID Merge, as a user travels across platforms, Mixpanel collects multiple distinct_ids for one user in order to connect their journey and associated data. This means that most tracked users won’t have a single distinct_id but a collection of them, called an “identity cluster”. Mixpanel applies logic to these clusters to determine which value to use as the cardinal identifier in event, user, and MTU calculations. This identifier is called the canonical distinct_id, which is shown as the top ID in the cluster. Since Mixpanel's backend selects the canonical distinct_id, you can't choose which of the distinct_ids you've sent is treated as canonical, and the canonical distinct_id can also change from one value to another in the future.
To summarize, you can certainly assign a unique id to your users, like your custom ID, but it may not necessarily be the canonical $distinct_id of that user. Here are some additional resources that may be helpful:

Since you're unable to choose the canonical distinct_id (i.e. the top distinct_id in a cluster), I would recommend setting your users' custom IDs as a dedicated user profile property, which you can then use to search and browse their events instead of distinct_id.
Hope that helps clarify things a bit! Please let me know if you have any further questions about this. I'm happy to help however I can!
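
For anyone landing here later, the suggestion above (storing the custom ID as its own user profile property) might look roughly like the sketch below. It only builds the profile update payload and sends nothing. The `$token` / `$distinct_id` / `$set` shape is how I understand Mixpanel's profile-update format, so please check it against the current docs. The property name `custom_user_id`, the helper `build_profile_update`, and the token and ID values are all made up for illustration.

```python
import json


def build_profile_update(token: str, distinct_id: str, custom_id: str) -> dict:
    """Build a profile `$set` payload that stores the custom ID as its own property."""
    return {
        "$token": token,
        "$distinct_id": distinct_id,
        # Hypothetical property name; pick whatever fits your project.
        "$set": {"custom_user_id": custom_id},
    }


# Placeholder values, not real credentials or IDs.
payload = build_profile_update("YOUR_PROJECT_TOKEN", "user-1234", "user-1234")
print(json.dumps([payload], indent=2))
```

Once the property is on the profile, a report can be broken down by `custom_user_id` instead of Distinct ID, so the result no longer depends on which ID Mixpanel picked as canonical.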


Thanks @nataliak, the answer is much appreciated. I imagine the canonical selection logic is complex, so what I say next might not easily apply: could it make sense to change the canonical selection logic to prioritize the user_id that has been set by the client?

Anyway, I’ve since dug a little deeper and discovered the field $distinct_id_before_identity, which seems to be solving my issue atm 😎
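
In case it helps someone with the same use case, here is a minimal sketch of post-processing exported events to always end up with the custom ID. It rests on a few assumptions: each event is a dict with a `properties` object holding `distinct_id` and, where present, `$distinct_id_before_identity`; anonymous IDs are UUID-shaped like the example in the question; and custom IDs are not UUIDs. The helper names and the sample events are invented for illustration.

```python
import uuid
from typing import Optional


def looks_anonymous(value: str) -> bool:
    """Return True if the value parses as a UUID (assumed shape of anonymous IDs)."""
    try:
        uuid.UUID(value)
        return True
    except (ValueError, AttributeError, TypeError):
        return False


def custom_id_for(event: dict) -> Optional[str]:
    """Pick the first ID on the event that does not look like an anonymous ID."""
    props = event.get("properties", {})
    candidates = [
        props.get("distinct_id"),
        props.get("$distinct_id_before_identity"),
    ]
    for candidate in candidates:
        if isinstance(candidate, str) and candidate and not looks_anonymous(candidate):
            return candidate
    return None  # only anonymous IDs were found on this event


# Invented sample data: three "X" events with different ID combinations.
events = [
    {"event": "X", "properties": {
        "distinct_id": "039852ed-7ceb-47ef-88bd-766a14f46aa9",
        "$distinct_id_before_identity": "user-1234",
    }},
    {"event": "X", "properties": {"distinct_id": "user-5678"}},
    {"event": "X", "properties": {
        "distinct_id": "039852ed-7ceb-47ef-88bd-766a14f46aa9",
    }},
]

found = {custom_id_for(e) for e in events}
found.discard(None)
custom_ids = sorted(found)
print(custom_ids)
```

Events that only carry an anonymous ID are skipped here; whether that is the right call depends on whether you want to count users who never authenticated.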