Where the heck do you collect data for analytics?

Attached is an Excalidraw screenshot to hopefully explain this better. I'm struggling to figure out where to put analytics collection.

In my application, data can come in from a lot of sources. Let's take a user changing their consent as an example. This event can come in from:

Discord Bot:
- Clicking a button on a solved message
- Agreeing to server post rules
- Agreeing to server rules
- Using a /consent command
- Clicking a button on the manage account menu
- Disabling indexing of their messages

All of these call a common function in the Discord bot package, which is just a wrapper for an API call to update the consent status.

Web:
- Clicking a consent button on the web

Both the Discord bot and the website then call the API, which is just a tRPC endpoint. The data they pass in is the source (what caused it, e.g. using the /consent command) and the new consent state, along with the target user and server ID. That API then calls my updateUserServerSettings function, which just prepares the settings for being written and then writes them with Prisma.

The trouble is that because this can come in from all these different sources, I'm not sure where to put the analytics collection event.

- If I put it on the updateUserServerSettings function, I lose the consent source, metadata about the server, the channel it was in, etc. I could pass all that data in, but that becomes a bit crazy to do.
- If I put it in the API, it's a similar issue of losing data, although I at least get the consent source now. However, if I ever call updateUserServerSettings directly, I miss this analytics event.
- If I put it in the Discord bot, I'm still in that environment, so I can collect a bunch of helpful data along with the analytics event, such as applied tags, channel name, number of messages, etc. But then I need to make sure to also collect this analytics event on the web client, or duplicate collection on the API and DB for redundancy.

What's the best approach for collecting analytics from a variety of sources like this?

Along with that, let's say I use the last approach, where there is an analytics capture event on each level. Would I want each one of those to have its own event name, or would I want to use the same event name, like "update_user_consent", for all of them?

If anyone has recommended resources on this, I'd love to check those out. I don't think this problem is so much a code architecture issue as it is just me being new to analytics collection and not knowing what is recommended. Thanks!

Just to add a bit more clarity, here is what the execution flow would look like:

Discord Bot:
- User uses /consent
- Bot handles this event and calls updateConsent()
- updateConsent() creates a tRPC call and calls setConsentStatus

tRPC (API):
- setConsentStatus() is called
- setConsentStatus() validates permissions, then calls updateUserServerSettings() in the db

DB:
- updateUserServerSettings() is called, parses the input, then returns the result of the Prisma update

I’m trying to figure out where in this flow of events I should put analytics, or whether each package should get its own analytics call with all the possible data at each stage.
25 Replies
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
It’s not really a technical solution I’m looking for here - sorry, my question may have been a bit confusing. I’m not sure where to put the actual data collection part, as no matter which step in the process I put it at, I lose metadata. If I put it at the source in the Discord bot, then I don’t get to reuse the same data collection on the website. If I put it in the API, then I don’t get to collect metadata like what channel the command was used in, etc.
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
That’s not really what I’m asking. The question is: I have an event that can come in from a lot of different sources. I don’t want to copy-paste a bunch of event collection code to the root of where those events come from, as it feels like bad design, but with any abstraction I do, I lose valuable metadata.
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
Which part? How can I improve my question to make it clearer? I don’t know whether to put my analytics collection closest to the source, where I can gather as much information as possible, or closest to where the actual database writes happen, so that I know all actions will be captured by analytics.
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
I’m dealing with big scale, which is why I asked the question originally, as I’m trying to learn how to do this.
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
That's still not really the question I'm asking - I have the following functions:
Discord Bot Package
handleConsentSlashCommand()
-> updateUserConsent()
-> callAPI()
-> setUserConsentState() // moves over to the API package

API Package
setUserConsentState()
-> setUserServerSettings({ consented: true })

DB Package
setUserServerSettings()
-> prisma.userServerSettings.updateById({...})
See how at each step of the process, I lose information about the event that's being tracked? If I put the code to emit an analytics event at the very beginning of the process, and consent then gets updated from a button click instead, I miss that event if I forgot to implement it there. If I put the code to emit an analytics event at the very end of all of this, then I have very little metadata about the event aside from the new state of the user, so I have no idea what the most effective consent method is, but I at least get to track all the events.
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
That's the question I'm asking: is that the way people recommend doing it? Along with that, do I then make one event called something like "UpdateSettings" with optional fields, or make individual events for each one, like "UpdateSettingsSlashCommand", "updateUserServerSettingsBot", "UpdateSettings"...
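To make the "one event with optional fields" option concrete, here is a minimal sketch. It assumes a single event name ("update_user_consent", taken from the original post) with the entry point recorded as a `source` property. The source labels, the `ConsentEvent` shape, and the in-memory `captured` array are hypothetical stand-ins, not any particular analytics library's API.

```typescript
// Hypothetical labels for the entry points listed in the original post.
type ConsentSource =
  | "slash-command"
  | "solved-message-button"
  | "server-rules"
  | "post-rules"
  | "manage-account-menu"
  | "disable-indexing"
  | "web-button";

// One event name for every entry point; the source is just a property.
interface ConsentEvent {
  name: "update_user_consent";
  userId: string;
  serverId: string;
  consented: boolean;
  source: ConsentSource;
  // Optional context that only some entry points can supply.
  channelName?: string;
  appliedTags?: string[];
}

// In-memory stand-in for a real analytics client (assumption).
const captured: ConsentEvent[] = [];

function captureConsentEvent(event: Omit<ConsentEvent, "name">): ConsentEvent {
  const full: ConsentEvent = { name: "update_user_consent", ...event };
  captured.push(full);
  return full;
}

// Because the name is shared, comparing consent methods is a group-by on `source`.
function countBySource(events: ConsentEvent[]): Map<ConsentSource, number> {
  const counts = new Map<ConsentSource, number>();
  for (const e of events) {
    counts.set(e.source, (counts.get(e.source) ?? 0) + 1);
  }
  return counts;
}

// Usage: the bot can attach channel context, the web client cannot.
captureConsentEvent({
  userId: "u1",
  serverId: "s1",
  consented: true,
  source: "slash-command",
  channelName: "help",
});
captureConsentEvent({ userId: "u2", serverId: "s1", consented: true, source: "web-button" });
console.log(countBySource(captured));
```

With this shape, answering "which consent method works best" is a filter on one property rather than a union over several event names. Separate event names may still be reasonable when the payloads differ a lot.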
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
Right, that makes sense. One thing with that, though: on my first time implementing analytics, I ended up missing a couple of key places that I wanted to track data from, since there were so many sources. That’s partly on me just implementing it poorly, but I was hoping to find a way to avoid accidentally missing events. I guess what might work best is two analytics events in this example: one at the very root of the call stack that just gets a bunch of data and is called something like “UserConsentSlashCommand”, and one at the very end, where it updates in the database, to be a catch-all to reference in the future, called something like “UserServerSettingsChange”.
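The two-event idea can be sketched roughly as follows. The event names and the function names `handleConsentSlashCommand` and `updateUserServerSettings` come from the thread; the `track` helper, the in-memory `events` list, and the `Map` standing in for the Prisma write are assumptions for illustration, and the API layer is skipped to keep the sketch short.

```typescript
type AnalyticsEvent = { name: string; properties: Record<string, unknown> };

// In-memory stand-in for an analytics client (assumption).
const events: AnalyticsEvent[] = [];
function track(name: string, properties: Record<string, unknown>): void {
  events.push({ name, properties });
}

// In-memory stand-in for the database table (assumption, replaces the Prisma call).
const settingsTable = new Map<string, { consented: boolean }>();

interface SettingsInput {
  userId: string;
  serverId: string;
  consented: boolean;
}

// DB layer: the catch-all event fires on every write, whoever the caller is.
function updateUserServerSettings(input: SettingsInput): { consented: boolean } {
  const next = { consented: input.consented };
  settingsTable.set(`${input.userId}:${input.serverId}`, next);
  track("UserServerSettingsChange", { ...input });
  return next;
}

// Bot layer: the rich event fires at the root, where channel context still exists.
function handleConsentSlashCommand(ctx: {
  userId: string;
  serverId: string;
  channelName: string;
}): { consented: boolean } {
  track("UserConsentSlashCommand", { ...ctx });
  return updateUserServerSettings({
    userId: ctx.userId,
    serverId: ctx.serverId,
    consented: true,
  });
}

// Usage: one slash command produces a rich event plus the catch-all.
handleConsentSlashCommand({ userId: "u1", serverId: "s1", channelName: "help" });
console.log(events.map((e) => e.name));
```

A new entry point that forgets its own root-level event still shows up in the catch-all, so the gap is visible as a settings change with no matching rich event.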
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
I will be interpreting my data 😭 It’s good to know there’s no silver bullet approach, though. I appreciate your input - I think I just need to, like you said, plan out everything I could see needing and hope that catches the most relevant parts.
Unknown User
Unknown User16mo ago
Message Not Public
Rhys
Rhys16mo ago
👍 Thanks for your input on it, appreciate it - I agree analytics are hard 😅
Unknown User
Unknown User16mo ago
Message Not Public
scot22
scot2216mo ago
What is “big scale”? How many events is that?
Rhys
Rhys16mo ago
When I said big scale, I was referring more to the number of places a similar event can come in from, as that is what I was asking in this thread.
scot22
scot2216mo ago
Ah I see, I think @donocode is putting you on the right track 🙂
Rhys
Rhys15mo ago
@Gino
cyremur
cyremur15mo ago
2 cents: client-side tracking can always be blocked or raise flags, especially when you allow users access from web browsers. I'd probably try to squash a metadata JSON into the API call and let the backend do the tracking.
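A rough sketch of that suggestion, assuming the API input gains an optional free-form `metadata` object that each client fills with whatever it knows. `setConsentStatus` and "update_user_consent" are names from the original post; the input shape, the `Metadata` type, and the in-memory `tracked` list are hypothetical, and the permission check and database write are left out.

```typescript
// Free-form context a client may attach; the allowed value types are an assumption.
type Metadata = Record<string, string | number | boolean | string[]>;

interface SetConsentInput {
  userId: string;
  serverId: string;
  consented: boolean;
  source: string;
  metadata?: Metadata;
}

// In-memory stand-in for a server-side analytics client (assumption).
const tracked: Array<{ name: string; properties: Record<string, unknown> }> = [];

// API layer: the only place that emits the event, so no client can skip or block it.
function setConsentStatus(input: SetConsentInput): { consented: boolean } {
  // Permission validation and the database write would happen here.
  const { metadata = {}, ...core } = input;
  // Core fields are spread last so client metadata cannot overwrite them.
  tracked.push({ name: "update_user_consent", properties: { ...metadata, ...core } });
  return { consented: input.consented };
}

// Usage: the bot sends rich context, the web client sends none.
setConsentStatus({
  userId: "u1",
  serverId: "s1",
  consented: true,
  source: "slash-command",
  metadata: { channelName: "help", appliedTags: ["solved"], messageCount: 12 },
});
setConsentStatus({ userId: "u2", serverId: "s1", consented: true, source: "web-button" });
console.log(tracked);
```

The trade-off is that the metadata is untyped at the API boundary, so a schema check on the server may be worth adding before the properties are forwarded.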
cyremur
cyremur15mo ago
Not sure if applicable, but I remembered this rant: https://youtu.be/bVRo68NByvE
Theo - t3․gg
YouTube
You're Losing HALF Of Your Data