If you want to think of it as a bug, feel free. But keep in mind that there's zero expectation that close events of any form will run. They will run most of the time, but not always.
Is there any documentation around it?
Think about it. What if a meteor hit the location where the DO was running? Sure, it's designed so you don't lose any data that's acknowledged as written to storage, but any DO instances running in that data center will instantly disappear from memory. The next request will instantiate it somewhere else, but how could close code run on the instance that was running on hardware that is now molten metal? In reality, there are a large number of reasons why it could go down; a redeploy is perhaps the most common. Maybe Cloudflare could add a feature for that particular situation, but since you have to take intermittent network failures into account anyway, I doubt they will.
tl;dr; design so your system works if the DO disappears instantly.
Ye didn't think about it this way
makes my whole billing system 5x harder ehh 😄
but so on a DO update they just poof out of existence and every client gets disconnected?
My original thought was to have you count them upon alarm or if you design the dashboard to reach out during that call. I forget why that was rejected, but
state.getWebsockets.length
will get you the current active count.
I don't know if clients are disconnected on every redeploy. I'm told that hibernated WS are maintained on the machine of the Worker that originally proxied the WS upgrade request. That worker is usually (always?) updated in the same update deploy but in theory the machine and its hibernated WS connections might be preserved. You'd have to devise an experiment to confirm. However, I'd advise against relying upon whatever you learn. It's an implementation detail that Cloudflare doesn't want you to design to.
One thing you can be certain of though is that when you call state.getWebsockets.length
you'll get a reasonable count for that instance. I still don't recall why you don't want to use that?
actually it's state.getWebSockets().length
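For reference, a minimal sketch of that count. The `WebSocketHost` interface and the `activeConnectionCount` name are hypothetical, introduced only so the snippet stands alone; in a real Durable Object you would pass the state object whose `getWebSockets()` method is mentioned above.

```typescript
// Hypothetical structural type: only the one method this sketch needs.
// A real Durable Object state exposes getWebSockets() for accepted sockets.
interface WebSocketHost {
  getWebSockets(): unknown[];
}

// Returns how many WebSockets are currently attached to this instance.
function activeConnectionCount(state: WebSocketHost): number {
  return state.getWebSockets().length;
}
```

Because the count is read from the instance at call time, it does not depend on any open or close handler having run.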
I'm more worried now about my billing implementation 😄
I use alarm to bill users there
plus can bill last time on socket close
I guess missing last websocket close is not end of the world tho
Can you bill on socket opens?
I start timer on socket opens
and every X minutes I bill client
Like a setTimeout()?
I set alarm
so it would sleep
*hibernate
I have a feeling you might be overcomplicating this. How many clients per DO instance do you expect?
Maybe I am. I expect mostly 1 but could get to 10
Can you simply cycle through the list of connections at each alarm point and then write to storage some record of which were active for that tick of the alarm?
Then periodically (once a day, once a week, once an hour), you could fetch that data out of storage?
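A rough sketch of that idea, assuming it is called from the DO's `alarm()` handler. `recordTick`, `TickState`, `TickStorage`, `TICK_MS`, the `tick:` key prefix, and the `accountOf` callback are all hypothetical names for illustration; only `getWebSockets()`, `storage.put()`, and `storage.setAlarm()` mirror methods the Durable Object runtime provides.

```typescript
// Hypothetical structural types covering just the methods used below.
interface TickStorage {
  put(key: string, value: unknown): Promise<void>;
  setAlarm(scheduledTime: number): Promise<void>;
}
interface TickState<S> {
  storage: TickStorage;
  getWebSockets(): S[];
}

const TICK_MS = 60_000; // assumed alarm interval: one minute

// Records which accounts had an open socket at this tick, then reschedules.
// accountOf is an assumed way to map a socket to the account that owns it.
async function recordTick<S>(
  state: TickState<S>,
  accountOf: (socket: S) => string,
  now: number = Date.now(),
): Promise<string[]> {
  // De-duplicate: one account may hold several sockets.
  const active = Array.from(new Set(state.getWebSockets().map(accountOf)));
  await state.storage.put(`tick:${now}`, active);
  // Assumption: with nobody connected, skip the next alarm so the DO can idle.
  if (active.length > 0) {
    await state.storage.setAlarm(now + TICK_MS);
  }
  return active;
}
```

With this shape a missed close event costs at most one tick of accuracy, since the next alarm simply no longer sees that socket. The first alarm would still need to be set when a socket opens.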
That is what I do. I make a billing record of how much time was used, plus info and how much it cost
but I also subtract credits
Then why do you care about missing open and close events?
but if I bill every 10 minutes and the user closes the ticket after 8 minutes, I still want to bill for those 8 minutes
So, run your alarm once a minute
ye but then I would pay for DO more
Here it comes down to how often will I miss socket close events
Now I understand. But the alarm will run really fast. It's like a few ms per alarm.
but DO goes to sleep only after 10 seconds
plus I would spam my database with many records
so I think 15 minutes might be my sweetspot
True, but you only pay for the time where it's running your own code + if there are any open fetches. Running an alarm will only incur a few ms unless the alarm makes a fetch
Don't spam your database with a new row on every alarm run where it's open. Keep a record per account and update it.
Where would I keep that record?
In the DO storage
DO sql storage?
If you want to pull it out to an external database, do that in batches.
Yes, DO SQL storage or even DO KV storage.
But they add cost as well
The API for DO KV storage is super simple and avoids SQL complexity. Your only decision is how to name your keys.
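To make the "one record per account, updated in place" suggestion concrete, here is a small sketch. `UsageRecord`, `KvStorage`, `usageKey`, `addUsage`, and the `usage:` key prefix are hypothetical; the `get`/`put` pair is meant to mirror the key-value calls on DO storage, and the fields are just one possible layout.

```typescript
// Hypothetical record layout: one entry per account, overwritten on each tick.
interface UsageRecord {
  accountId: string;
  billedMs: number;   // total connected time recorded so far
  lastTickAt: number; // timestamp of the most recent update
}

// Hypothetical structural type for the key-value calls used below.
interface KvStorage {
  get<T>(key: string): Promise<T | undefined>;
  put<T>(key: string, value: T): Promise<void>;
}

// The only real design decision: how keys are named.
const usageKey = (accountId: string): string => `usage:${accountId}`;

// Adds elapsedMs to the account's running total instead of writing a new row.
async function addUsage(
  storage: KvStorage,
  accountId: string,
  elapsedMs: number,
  now: number,
): Promise<UsageRecord> {
  const key = usageKey(accountId);
  const existing = await storage.get<UsageRecord>(key);
  const updated: UsageRecord = {
    accountId,
    billedMs: (existing?.billedMs ?? 0) + elapsedMs,
    lastTickAt: now,
  };
  await storage.put(key, updated);
  return updated;
}
```

Each alarm then costs one read and one write per active account, and the number of stored records grows with accounts rather than with ticks. Exporting to an external database can read these totals in batches.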
My advice is to get your system up and running and worry about costs after you receive a surprising bill
You could model storage costs + reads + writes. You don't really need to model the clock time for alarm running though. It'll be negligible even with 1 minute granularity
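A back-of-the-envelope version of that model might look like the sketch below. Everything here is an assumption: `estimateMonthlyOpsCost` and `OpRates` are hypothetical names, the rates are inputs you would fill in from the current pricing page, and the estimate treats the DO as connected around the clock, so it is an upper bound.

```typescript
// Hypothetical rate inputs; look up real values on the pricing page.
interface OpRates {
  perMillionReads: number;
  perMillionWrites: number;
}

// Estimates monthly storage-operation cost for an alarm that fires at a
// fixed interval and performs a fixed number of reads and writes each time.
function estimateMonthlyOpsCost(
  alarmIntervalMinutes: number,
  readsPerAlarm: number,
  writesPerAlarm: number,
  rates: OpRates,
  days: number = 30,
): number {
  const alarms = ((24 * 60) / alarmIntervalMinutes) * days;
  const readCost = ((alarms * readsPerAlarm) / 1e6) * rates.perMillionReads;
  const writeCost = ((alarms * writesPerAlarm) / 1e6) * rates.perMillionWrites;
  return readCost + writeCost;
}
```

Since the cost scales linearly with the number of alarms, moving from a 15-minute to a 1-minute interval multiplies this part of the bill by 15, which makes it easy to compare granularities before committing to one.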
I guess I'll get it out and see how it performs. I can easily change the alarm frequency later
But ye thanks for explanation!
I think the GraphQL API exposes some analytics around WebSocket stuff for DOs, but why bill on a metric that you are not billed on?
My own billing. I provide other features