Thanks for the in-depth answer! Yeah, it somehow doesn't add up. But good to know that the cost is minuscule. Your use case sounds pretty much like ours. Do you happen to know a pattern for DOs to handle a large number of users - think global chat or feed? I was thinking of storing a DO id in KV that gets incremented as soon as a worker receives an overloaded error, so every new request uses the next DO instance, plus another KV entry that stores all active DO ids for broadcasting to all of them.
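A minimal sketch of that KV-overflow idea, assuming hypothetical bindings SHARDS (KV) and CHAT (a DO namespace); the exact shape of the overload error is also an assumption, so check the runtime docs before relying on it:

```ts
interface Env {
  SHARDS: KVNamespace;
  CHAT: DurableObjectNamespace;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    // The current shard index and the list of active shards live in KV.
    const current = Number((await env.SHARDS.get("current")) ?? "0");
    const stub = env.CHAT.get(env.CHAT.idFromName(`shard-${current}`));

    try {
      return await stub.fetch(req);
    } catch (err: any) {
      // Assumption: an overloaded/retryable DO error is the signal to
      // roll over to the next shard id.
      if (err?.overloaded || err?.retryable) {
        const next = current + 1;
        await env.SHARDS.put("current", String(next));
        const active: number[] = JSON.parse((await env.SHARDS.get("active")) ?? "[]");
        await env.SHARDS.put("active", JSON.stringify([...active, next]));
        const nextStub = env.CHAT.get(env.CHAT.idFromName(`shard-${next}`));
        return nextStub.fetch(req);
      }
      throw err;
    }
  },
};
```

One caveat with this layout: KV is eventually consistent, so a freshly incremented "current" value can take a while to become visible in other locations, and concurrent workers may bump it more than once.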
Larry · 3w ago
A DO per chat is pretty common. That said, the first version of my DO-based system (3 years ago) was too fine-grained and caused me no end of complexity. As soon as we needed transactions across DO boundaries, it was untenable. So we re-architected to use pretty fat DOs. We were worried about it scaling, but so far it hasn't been a problem.

Think of each DO as a mini server. I have upwards of 10,000 users on DOs, but never more than, say, 100 connected and actively interacting with a single DO at a time. My daily active users max out at about 200 in a single DO instance. Each tenant has their own DO instance: all user data for that tenant, all interactions, all permission settings and checking are handled on that single DO instance. I will run out of storage before I hit a compute or other resource constraint, but I have a strategy for when I get close to that limit.
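For reference, the tenant-per-DO routing described here usually hinges on idFromName, which maps the same name to the same instance every time; TENANT_DO and the header name below are illustrative, not details from this setup:

```ts
interface Env {
  TENANT_DO: DurableObjectNamespace;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    // Assumption: the tenant id arrives as a header; any stable key works.
    const tenant = req.headers.get("x-tenant-id");
    if (!tenant) return new Response("missing tenant", { status: 400 });

    // idFromName is deterministic, so all of a tenant's traffic lands on
    // the same DO instance - its "mini server".
    const stub = env.TENANT_DO.get(env.TENANT_DO.idFromName(tenant));
    return stub.fetch(req);
  },
};
```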
Earl (OP) · 3w ago
I plan to use the DO as a relay of live data, probably not even persisting data at first. Users (potentially tens of thousands) subscribe, and a worker that gets called via a webhook broadcasts to all of the active users via the DO. I fear that this could become problematic with a single DO, so I was looking for a strategy just in case. I don't want to run into a dead end where the only way out is changing the underlying technology. But I guess there is no common pattern for this yet.
Larry · 3w ago
I'm unable to discern the whole picture from what you wrote. A sequence diagram (use Mermaid if you want to send it via text) would help me engage and help you figure this out. At first blush, though, broadcasting to all tens of thousands of users every time a new user signs up does not strike me as useful or scalable.
lambrospetrou · 3w ago
I echo what Larry said about providing more concrete details so that folks can help, but at first glance this seems like the common pubsub use case? You publish one thing somewhere and you want to broadcast it through DO WebSockets. You can certainly do this. The most common approach is to shard users across N DOs for the WebSockets; your worker then calls an RPC on each of those N DOs, and each DO broadcasts the message to its connected sockets. There are many details here, though, depending on your exact scenario.
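A sketch of that fan-out, assuming RPC-style DO methods and the WebSocket hibernation API; SHARD_COUNT, FEED_SHARD, and FeedShard are made-up names, and a plain stub.fetch() round-trip works if you'd rather not use RPC:

```ts
import { DurableObject } from "cloudflare:workers";

const SHARD_COUNT = 16; // assumption: fixed shard count, tune from load tests

interface Env {
  FEED_SHARD: DurableObjectNamespace<FeedShard>;
}

// Each shard DO holds a slice of the WebSocket connections and fans a
// published message out to them.
export class FeedShard extends DurableObject {
  async fetch(req: Request): Promise<Response> {
    // Accept with the hibernation API so idle sockets don't keep the DO
    // pinned in memory.
    const { 0: client, 1: server } = new WebSocketPair();
    this.ctx.acceptWebSocket(server);
    return new Response(null, { status: 101, webSocket: client });
  }

  // RPC method the worker calls to broadcast to this shard's sockets.
  async broadcast(message: string): Promise<void> {
    for (const ws of this.ctx.getWebSockets()) {
      ws.send(message);
    }
  }
}

export default {
  // Webhook entry point: publish one message to every shard in parallel.
  async fetch(req: Request, env: Env): Promise<Response> {
    const message = await req.text();
    await Promise.all(
      Array.from({ length: SHARD_COUNT }, (_, i) => {
        const stub = env.FEED_SHARD.get(env.FEED_SHARD.idFromName(`shard-${i}`));
        return stub.broadcast(message);
      })
    );
    return new Response("published");
  },
};
```

Connection requests would be routed the same way, e.g. idFromName(`shard-${hash(userId) % SHARD_COUNT}`), so each user consistently lands on one shard.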
Earl (OP) · 3w ago
Yeah, it's pubsub on a large scale. Imagine a trade ticker that every active user on a website sees, with updates coming in as quickly as possible. The question is how to scale the number of DOs up and down depending on workload. I don't think there is a way to tell if a DO is about to get overloaded, so maybe I'd have to react to the various overload errors it can throw, then drop some users and create an additional DO instance. Or I find out (via trial and error) how many users a single DO can reliably service and use that as the threshold for when to add a new DO.
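One way to act on that trial-and-error number, sketched with an assumed MAX_SOCKETS cap and an assumed 429 convention (neither comes from Cloudflare): the DO refuses new subscribers past the cap so the calling worker can retry against the next shard instead of waiting for overload errors.

```ts
// Connection-holding DO that enforces a per-instance cap tuned from
// load testing.
const MAX_SOCKETS = 5000; // assumption: found via trial and error

export class TickerShard {
  constructor(private ctx: DurableObjectState) {}

  async fetch(req: Request): Promise<Response> {
    // Refuse new subscribers once this shard is "full"; the worker can
    // then route the client to the next shard id.
    if (this.ctx.getWebSockets().length >= MAX_SOCKETS) {
      return new Response("shard full", { status: 429 });
    }
    const { 0: client, 1: server } = new WebSocketPair();
    this.ctx.acceptWebSocket(server);
    return new Response(null, { status: 101, webSocket: client });
  }
}
```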
Larry · 3w ago
I assume you don't want to send updates on every stock to every user. If so, then the channel a user subscribes to in a pub-sub model would be one channel per stock, and it would help to estimate the distribution of subscriptions per stock. I imagine many (all?) would subscribe to market indexes like the Dow and NASDAQ, but there would be a huge cliff after that, where each person only subscribes to a handful and even the most popular stocks have fewer than 100 or so subscribers... but you'd know better. This matters because you'd probably want to handle those market-index subscriptions differently than the others.

I can imagine an architecture where you have a DO per N users that holds the WebSocket connections. Use a simple hash algorithm on whatever id you use to record subscriptions. For non-index stocks, I imagine one DO per stock or one DO per N stocks (again, a hash of the stock id to determine routing). Whenever there is a stock update, route the webhook update for that stock to the right DO. That DO would forward the update to the relevant per-N-users DOs. Those might debounce for a few seconds (if you can live with that) so you send the user a single update with the n stocks that changed in the interim.

If it were me, I'd be drawing sequence diagrams, modeling distributions, and doing sensitivity analysis on those models. You'll have to run experiments to get the constants for those models.
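A sketch of the hash routing plus debounced forwarding, with assumed names (USER_SHARD, StockDO, USER_SHARD_COUNT) and a DO alarm standing in for the few-second debounce; forwarding to every user shard is a simplification of "the relevant per-N-users DOs":

```ts
interface Env {
  USER_SHARD: DurableObjectNamespace;
}

const USER_SHARD_COUNT = 64; // assumption; derive from your experiments

// Stable hash so the worker always routes a given user to the same
// connection-holding DO (one DO per N users).
export function shardForUser(userId: string): string {
  let h = 2166136261; // FNV-1a style; any stable hash works
  for (const c of userId) {
    h = Math.imul(h ^ c.charCodeAt(0), 16777619);
  }
  return `users-${(h >>> 0) % USER_SHARD_COUNT}`;
}

// Per-stock DO: collect webhook updates and flush a single batch to the
// per-N-users DOs every ~2 seconds via a DO alarm (the debounce).
export class StockDO {
  private pending: Record<string, number> = {};

  constructor(private ctx: DurableObjectState, private env: Env) {}

  async fetch(req: Request): Promise<Response> {
    const { symbol, price } = (await req.json()) as { symbol: string; price: number };
    this.pending[symbol] = price;
    // Schedule a flush if one isn't already pending.
    if ((await this.ctx.storage.getAlarm()) === null) {
      await this.ctx.storage.setAlarm(Date.now() + 2000);
    }
    return new Response("queued");
  }

  async alarm(): Promise<void> {
    const batch = JSON.stringify(this.pending);
    this.pending = {};
    // Forward one batched update to each user-shard DO.
    await Promise.all(
      Array.from({ length: USER_SHARD_COUNT }, (_, i) => {
        const stub = this.env.USER_SHARD.get(this.env.USER_SHARD.idFromName(`users-${i}`));
        return stub.fetch("https://internal/broadcast", { method: "POST", body: batch });
      })
    );
  }
}
```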
Earl (OP) · 3w ago
That's helpful, thanks for the insight!
