Hey Wes. We aren't using Party Server here; however it sounds like you are possibly awaiting the fetch result inside the realtime path. Is the await fetch() inside the onMessage handler and sibling to the broadcast() call? You should be able to resolve this by moving the fetch to another codepath or ensuring the fetch isn't blocking the broadcast.
wesbos
wesbos2w ago
I'm not awaiting it, but I am calling the fetch from within the onMessage handler
Marak
MarakOP2w ago
Can you share a code snippet? You may want to create a separate code path for the fetch (I think it's called a Party?) that will sync the state, so that the main broadcast loop is always responsive.
wesbos
wesbos2w ago
ohh ok, let me look into that - 1 sec
Marak
MarakOP2w ago
It is expected that any slow events inside the realtime path will block operations. I'd be surprised if a fire and forget wasn't working intuitively inside the loop; however moving the fetch to an outside code path that syncs state might be a better design. We aren't using Party Server here, someone else might know better.
Larry
Larry2w ago
PartyServer has lifecycle hooks onMessage (WebSocket) and onRequest (fetch). I think Marak is on the right track. I'm just helping him remember the PartyServer terminology
wesbos
wesbos2w ago
Yeah, I was thinking about just firing up a second process that is a "client" of the web socket instead, but I'm puzzled why this "fire and forget" seems to be blocking the network.
I'm actually not doing either - in onStart() I run a recursive function every 1 second that fires a fetch.
Larry
Larry2w ago
In general, I avoid outgoing fetches from Durable Objects. It's outside of the ideal concurrency model with inputGates and outputGates. If you do NOT await a fetch, then it will be held until all other events are done processing by the output gates. It doesn't sound like that's what's happening here exactly, but that's an example of why fetches are problematic.
wesbos
wesbos2w ago
So the better approach would be to listen for messages on another worker?
Larry
Larry2w ago
If it's truly fire and forget, I make an RPC call to a Worker that then forwards the fetch. You can even pass a Request object in to the Worker over the RPC. The other way around is what I'm suggesting. Make the fetch use another Worker.
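A minimal sketch of the fire-and-forget forwarding Larry describes, seen from the Durable Object's side. The `FetchForwarder` interface and the `forward` and `forwardFireAndForget` names are hypothetical. In a real project the forwarder would typically be a class extending `WorkerEntrypoint` (from `cloudflare:workers`) reached through a service binding; here it is modeled as a plain interface so the snippet stands alone.

```typescript
// Hypothetical RPC surface of the forwarding Worker (not from the thread).
// In a real project this would be exposed through a service binding.
interface FetchForwarder {
  // Performs the outgoing fetch elsewhere and resolves to the HTTP status.
  forward(url: string, body: string): Promise<number>;
}

// Called from the Durable Object. It returns void and never awaits, so the
// caller's broadcast path does not wait on the device's response.
function forwardFireAndForget(
  forwarder: FetchForwarder,
  url: string,
  body: string,
  onError: (err: unknown) => void,
): void {
  // Attach a catch handler so a failed call is reported instead of becoming
  // an unhandled rejection.
  forwarder.forward(url, body).catch(onError);
}
```

The error callback is a required parameter here so that "forget" does not also mean "silently swallow failures".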
Marak
MarakOP2w ago
Moving the fetch to another worker might be the better move
wesbos
wesbos2w ago
LED-GRID/src/server.ts at main · wesbos/LED-GRID (GitHub)
wesbos
wesbos2w ago
This is the function which does not seem to be blocking. But I like your approach of moving it to another worker much better.
Larry
Larry2w ago
I haven't done this myself but if you make the other "Worker" (actually another DO) be an RpcTarget, I hear that it's super local and efficient.
wesbos
wesbos2w ago
I don't think it needs to be a durable object, does it? It's just a fetch call
Larry
Larry2w ago
A short-lived DO with no storage is essentially a Worker, but yeah.
wesbos
wesbos2w ago
ahh gotcha
Larry
Larry2w ago
Think of it more like another process. I believe you can even export it from the same Worker project as you export the DO.
wesbos
wesbos2w ago
Yeah exactly - I'll try that now.
Hmm - so I moved all the logic to a second worker and it's still blocking the websocket.
Marak
MarakOP2w ago
It's a bit challenging to read, I think the issue is around here:
this.wled.sendPixels(updates).then(() => {
  console.log('WLED sent', Date.now());
  setTimeout(this.updateLED.bind(this), 1000);
});
Using then() is most likely blocking
wesbos
wesbos2w ago
I removed all of that and it still persisted
Marak
MarakOP2w ago
You may want to try something like:
setTimeout(() => {
  this.wled?.sendPixels(updates).catch(console.error);
}, 0);
wesbos
wesbos2w ago
Tried it - even removed async from all the functions.
I'm going to try just making a separate script that acts as a client and sends the data over websockets.
Marak
MarakOP2w ago
Yes, I would suggest removing the timer loop and async awaits outside the broadcast code paths if possible
wesbos
wesbos2w ago
Weird, if I replace the call with a slow fetch it doesn't block. So there is something weird going on outside of the fetch.
I'll dig in some more - thanks for the help! I really appreciate it.
And now the original code is working just fine... time for a break haha
Marak
MarakOP2w ago
Happy to help. I would suggest decoupling the timer updates for state from the broadcasting code paths. It may be a better design to have the loop updating state separated from the code broadcasting the state to the clients.
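One way to picture the decoupling Marak suggests, as a rough sketch rather than the LED-GRID code: the message handler only updates in-memory state and broadcasts, while a separate function (driven by whatever timer or alarm you choose) drains pending changes to the device. `GridState`, `handleMessage`, `syncDevice`, and the `Pixel` shape are all hypothetical names for illustration.

```typescript
// Hypothetical pixel shape; the real project's update format may differ.
type Pixel = { index: number; color: string };

class GridState {
  private pixels: Record<number, string> = {};
  private dirty: Pixel[] = [];

  // Record the latest color and remember that this pixel needs syncing.
  apply(update: Pixel): void {
    this.pixels[update.index] = update.color;
    this.dirty = this.dirty.filter((p) => p.index !== update.index);
    this.dirty.push(update);
  }

  colorAt(index: number): string | undefined {
    return this.pixels[index];
  }

  // Hand over the pending updates and start a fresh batch.
  drainDirty(): Pixel[] {
    const out = this.dirty;
    this.dirty = [];
    return out;
  }
}

// Broadcast path: fully synchronous, never touches the device.
function handleMessage(
  state: GridState,
  update: Pixel,
  broadcast: (msg: string) => void,
): void {
  state.apply(update);
  broadcast(JSON.stringify(update));
}

// Device path: runs on its own schedule and may be slow without affecting
// handleMessage. Resolves to the number of updates it attempted to send.
async function syncDevice(
  state: GridState,
  send: (updates: Pixel[]) => Promise<void>,
): Promise<number> {
  const updates = state.drainDirty();
  if (updates.length > 0) await send(updates);
  return updates.length;
}
```

Note that in this sketch a failed `send` drops the drained batch; a real implementation would need to decide whether to retry or re-queue.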
Larry
Larry2w ago
setTimeout also breaks the concurrency model
wesbos
wesbos2w ago
What a rabbit hole. So I stripped everything out. It has nothing to do with the promises or call stack. It seems like a bug at the workerd network level. I tried running wrangler dev --remote and there are zero issues.
Holy smokes. So the endpoint that I was hitting was a .local domain: http://wled-grid.local - I swapped it to the straight IP address http://192.168.1.100 and it's still giving slow requests, but the blocked websockets are no longer happening 😐
I know Macs have trouble with .local DNS resolution, so maybe it's even a lower-level issue with my Mac.
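If the `.local` name really is the trigger (that is Wes's guess, not something confirmed in this thread), one low-effort workaround is to make the device address configurable and prefer a fixed IP when one is set. The `DeviceConfig` shape and both function names below are hypothetical.

```typescript
// Hypothetical config shape; neither field name comes from the LED-GRID repo.
interface DeviceConfig {
  hostname: string; // e.g. "wled-grid.local"
  ipOverride?: string; // e.g. "192.168.1.100"
}

// True for hostnames that end in ".local" and so depend on mDNS resolution.
function usesMdns(hostname: string): boolean {
  return /\.local\.?$/i.test(hostname);
}

// Prefer the explicit IP when present, falling back to the hostname.
function deviceBaseUrl(config: DeviceConfig): string {
  const host = config.ipOverride ?? config.hostname;
  return `http://${host}`;
}
```

A local dev setup could then log a warning when `usesMdns(config.hostname)` is true and no `ipOverride` is configured.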
Marak
MarakOP2w ago
If it's possible you could try switching to RPC invocations via service bindings instead of making HTTP fetch requests.
wesbos
wesbos2w ago
I tried it. The fetch needs to happen because it's a piece of hardware running a web server, but I moved the fetch to a separate worker and called it via RPC and the issue persisted
Larry
Larry2w ago
Thanks for cycling back to us. I wonder if it's a timing thing that only occurs locally? I had some tests that failed locally because the clock didn't advance between invocations and they occurred in the same millisecond. Whereas in production, no two requests from the same user ever arrive in the same millisecond. Do you key off of milliseconds ever in your code?
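To illustrate the pitfall Larry mentions: if keys are derived from `Date.now()` alone, two calls in the same millisecond collide. A small sketch of one way around it, appending a per-millisecond sequence number (the `makeKeyGenerator` name and key format are made up for this example):

```typescript
// Returns a function that produces keys of the form "<ms>-<seq>".
// The clock is injectable so the same-millisecond case can be exercised.
function makeKeyGenerator(now: () => number = Date.now): () => string {
  let lastMs = -1;
  let seq = 0;
  return () => {
    const ms = now();
    if (ms === lastMs) {
      seq += 1; // same millisecond as the previous key: bump the sequence
    } else {
      lastMs = ms;
      seq = 0; // clock advanced: restart the sequence
    }
    return `${ms}-${seq}`;
  };
}
```

With a frozen clock, successive calls still yield distinct keys, which is the situation that reportedly only showed up in local runs.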
wesbos
wesbos2w ago
Nope. One more thing I noticed is that workerd would spike the upload once the websockets could go through. It really feels like a lower-level DNS thing.