Hey Wes. We aren't using PartyServer here; however, it sounds like you are possibly awaiting the fetch result inside the realtime path. Is the `await fetch()` inside the `onMessage` handler and a sibling to the `broadcast()` call?
You should be able to resolve this by moving the fetch to another code path, or by ensuring the fetch isn't blocking the broadcast.
I'm not awaiting it, but I am calling the fetch from within the onMessage handler
Can you share a code snippet? You may want to create a separate code path for the fetch (I think it's called a Party?) that syncs the state so the main broadcast loop is always responsive.
ohh ok, let me look into that - 1 sec
It is expected that any slow events inside the realtime path will block operations. I'd be surprised if a fire-and-forget wasn't working intuitively inside the loop; however, moving the fetch to an outside code path that syncs state might be a better design. We aren't using PartyServer here, so someone else might know better.
PartyServer has lifecycle hooks: onMessage (WebSocket) and onRequest (fetch). I think Marak is on the right track; I'm just helping him remember the PartyServer terminology.
yeah, I was thinking about just firing up a second process that is a "client" of the websocket instead, but I'm puzzled why this "fire and forget" seems to be blocking the network
I'm actually not doing either - in onStart() I run a recursive function every second that fires a fetch
In general, I avoid outgoing fetches from Durable Objects. It's outside of the ideal concurrency model with inputGates and outputGates.
If you do NOT await a fetch, it will be held by the output gates until all other events are done processing. It doesn't sound like that's exactly what's happening here, but it's an example of why fetches are problematic.
So the better approach would be to listen for messages on another worker?
If it's truly fire-and-forget, I make an RPC call to a Worker that then forwards the fetch. You can even pass a Request object in to the Worker over the RPC.
The other way around is what I'm suggesting. Make the fetch use another Worker.
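Something like this shape (a sketch only, and it only runs under workerd, not plain Node; `FetchForwarder`, the binding name, and the URL are all made up, and this assumes Workers RPC via `WorkerEntrypoint` with a service binding configured in wrangler - check the current Workers docs before relying on it):

```typescript
// Sketch of forwarding the fetch out of the Durable Object via Workers RPC.
// Assumes a service binding (e.g. FORWARDER) pointing at this entrypoint
// in the wrangler configuration; all names here are hypothetical.
import { WorkerEntrypoint } from "cloudflare:workers";

export class FetchForwarder extends WorkerEntrypoint {
  // Fire-and-forget: the DO calls this over RPC and returns immediately;
  // waitUntil keeps the forwarded fetch alive in this Worker instead of
  // tying it to the DO's input/output gates.
  forward(url: string): void {
    this.ctx.waitUntil(fetch(url).catch(() => {}));
  }
}

// Inside the Durable Object's realtime path (sketch):
//   env.FORWARDER.forward("http://wled-grid.local/...");
//   broadcast(state); // no await on the fetch, so this stays responsive
```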
Moving the fetch to another worker might be the better move
LED-GRID/src/server.ts at main · wesbos/LED-GRID
this is the function, which doesn't seem like it should be blocking. But I like your approach of moving it to another worker much better
I haven't done this myself but if you make the other "Worker" (actually another DO) be an RpcTarget, I hear that it's super local and efficient.
I don't think it needs to be a durable object, does it? It's just a fetch call
A short-lived DO with no storage is essentially a Worker, but yeah.
ahh gotcha
Think of it more like another process. I believe you can even export it from the same Worker project as you export the DO.
yeah exactly - ill try that now
Hmm - so I moved all the logic to a second worker and it's still blocking the websocket
It's a bit challenging to read; I think the issue is around here:
Using then() is most likely blocking
I removed all of that and it still persisted
You may want to try something like:
Tried it - even removed async from all the functions
I'm going to try just making a separate script that acts as a client and sends the data over websockets
Yes, I would suggest removing the timer loop and keeping async awaits out of the broadcast code paths if possible
weird - if I replace the call with a slow fetch, it doesn't block. So there is something weird going on outside of the fetch
I'll dig in some more. Thanks for the help! I really appreciate it
and now the original code is working just fine.. time for a break haha
Happy to help. I would suggest decoupling the timer-driven state updates from the broadcasting code paths: have one loop update the state, and separate code broadcast that state to the clients.
setTimeout also breaks the concurrency model
What a rabbit hole. So I stripped everything out. It has nothing to do with the promises or call stack. It seems like a bug at the workerd network level. I tried running `wrangler dev --remote` and there are zero issues
holy smokes. So the endpoint that I was hitting was a .local domain: http://wled-grid.local - I swapped it to the straight IP address http://192.168.1.100 and it's still giving slow requests, but the blocked websockets are no longer happening 😐
I know Macs have trouble with .local DNS resolution, so maybe it's an even lower-level issue with my Mac
If it's possible, you could try switching to RPC invocations via service bindings instead of making HTTP fetch requests.
I tried it. The fetch needs to happen because it's a piece of hardware running a web server, but I moved the fetch to a separate worker and called it via RPC and the issue persisted
Thanks for cycling back to us. I wonder if it's a timing thing that only occurs locally? I had some tests that failed locally because the clock didn't advance between invocations, so they occurred in the same millisecond, whereas in production no two requests from the same user ever arrive in the same millisecond. Do you ever key off of milliseconds in your code?
nope.. One more thing I noticed is that `workerd` would spike the upload once the websockets could go through.
It really feels like a lower level DNS thing