Has anyone tried/can you run an LLM parser over an event property? Context: I work in music and we have a search event with a search term event property. Users might type in a long query like: I am looking for a song like Queen's "Bohemian Rhapsody", but it has to be sung by a female artist from Australia. I want to use an LLM to extract the topics of the search, which in this case would be something like Specific Song, Specific Artist, Vocals, Location. This works well when I export the events to Sheets and run them through Gemini via the =AI() function, but, naturally, I don't want to export this; I want to analyze the data in Mixpanel. Thoughts?
I don't think Mixpanel is the tool for the job here; it just wasn't built for that. For starters, strings get cut off at like 256 characters or something, which is much shorter than you'd want. There's also very little support for string manipulation inside Mixpanel (you can write custom properties that use regex, but that's far from the ideal way to parse unstructured text). I personally wouldn't even send those long queries to Mixpanel, because it will be hard to make heads or tails of them. Instead, I would save them in a db somewhere, at which point you can run whatever analysis you want, however you want it, with as much or as little pre- and post-processing as you need (e.g. sending it to an LLM). If you want to be thorough, you could give the associated Mixpanel event a unique $insert_id and save that id on the database entry as well. Then, once you've summarized a batch of queries, you can go and update the event in question in Mixpanel with the list of topics, which you can then use for breakdowns and stuff.
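Here's a rough sketch of that flow in Python, just to make the idea concrete. Everything named here is an assumption for illustration: the `searches` table, the `search` event name, the `search_topics` and `query_length` properties, and especially `classify_topics`, which is a keyword placeholder standing in for whatever LLM call you'd actually make. It only builds the event payloads and doesn't send anything. Whether re-sending an event with the same `$insert_id`, `distinct_id`, `time` and event name really replaces the earlier copy is something to check against Mixpanel's import and deduplication docs before relying on it.

```python
import sqlite3
import time
import uuid


def make_db() -> sqlite3.Connection:
    # In-memory database for the sketch; a real setup would use a persistent store.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE searches ("
        "insert_id TEXT PRIMARY KEY, distinct_id TEXT, ts INTEGER, query TEXT)"
    )
    return db


def classify_topics(query: str) -> list[str]:
    # Hypothetical placeholder for the LLM call: swap in a real model request.
    keyword_topics = {
        "song like": "Specific Song",
        "sung by": "Vocals",
        "from ": "Location",
    }
    lowered = query.lower()
    return [topic for keyword, topic in keyword_topics.items() if keyword in lowered]


def log_search(db: sqlite3.Connection, distinct_id: str, query: str) -> dict:
    # Store the full query in the db, keyed by the id the event will also carry.
    insert_id = uuid.uuid4().hex
    ts = int(time.time())
    db.execute(
        "INSERT INTO searches VALUES (?, ?, ?, ?)",
        (insert_id, distinct_id, ts, query),
    )
    # The event itself only carries short properties, not the long query text.
    return {
        "event": "search",
        "properties": {
            "$insert_id": insert_id,
            "distinct_id": distinct_id,
            "time": ts,
            "query_length": len(query),
        },
    }


def enriched_events(db: sqlite3.Connection) -> list[dict]:
    # Rebuild each event with the same identifying fields plus the extracted topics.
    rows = db.execute("SELECT insert_id, distinct_id, ts, query FROM searches")
    return [
        {
            "event": "search",
            "properties": {
                "$insert_id": insert_id,
                "distinct_id": distinct_id,
                "time": ts,
                "query_length": len(query),
                "search_topics": classify_topics(query),
            },
        }
        for insert_id, distinct_id, ts, query in rows
    ]
```

You'd call `log_search` when the search happens, and run `enriched_events` later as a batch job whose output gets re-sent to Mixpanel. A list-valued property like `search_topics` is the part that would let you break down by topic afterwards.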
Thanks for your thoughts, appreciate it
I personally wouldn't even send those long queries to Mixpanel, because it will be hard to make heads or tails of them
Yes, somewhat true. What it does allow me to do, however, is easily filter the queries I care about, e.g. the ones from a certain segment of my user base, and then do a deeper analysis on those outside of Mixpanel. I hear you when you say it should be logged in a db. The downside is that for me, as a product person, getting that set up by our engineers is going to be painful, not to mention that I spend so much time in Mixpanel that I'd rather have all the data there 🙂
for starters, strings get cut off at like 256 characters or something
Good to know! Thanks for highlighting that. I see how many of our users' queries would exceed that.
