Yes, the requests are simply not connecting, either from outside via WebSockets or from inside via RPC.
Unknown User · 2mo ago
[Message not public]
Tobias (OP) · 2mo ago
It's only happening with this specific DO
Unknown User · 2mo ago
[Message not public]
Tobias (OP) · 2mo ago
I already sent you a DM, sorry.
josh · 2mo ago
@Tobias if I sent you an HTTP header, could you add that to a request so that we can get a trace? (from the DO team)
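(For context, a minimal sketch of what attaching such a debug header could look like on the caller side; the header name and URL below are placeholders, not values provided by the DO team.)

```ts
// Sketch only: attach a debug header to a request so the corresponding trace
// can be found on Cloudflare's side. Both the header name and the endpoint
// are hypothetical placeholders.
const TRACE_HEADER = "x-debug-trace"; // placeholder, not the real header name
const traceValue = crypto.randomUUID(); // share this value alongside the report

const resp = await fetch("https://worker.example.com/room", {
  headers: { [TRACE_HEADER]: traceValue },
});
console.log("status:", resp.status, "trace value:", traceValue);
```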
Tobias (OP) · 2mo ago
It's a bit difficult to reproduce. I'm now pretty sure it's caused by heavy load from some kind of "message recursion" in my code, but CF should be blocking at some point rather than hanging (it already did). Also, I can't really change anything in the client code that quickly since it's all native apps (deployed in a restaurant on 10+ embedded devices lol). I can send you a request ID from when it first happened.
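(Not Tobias's actual code; a minimal sketch, assuming a Durable Object that uses the WebSocket Hibernation API and broadcasts every incoming message. If clients re-send whatever they receive, broadcasting back to the sender creates a feedback loop, one plausible form of "message recursion"; skipping the sender is a simple guard.)

```ts
import { DurableObject } from "cloudflare:workers";

// Hypothetical broadcast-style Durable Object (WebSocket Hibernation API).
export class RoomDO extends DurableObject {
  async fetch(_request: Request): Promise<Response> {
    // Accept the WebSocket so the runtime can hibernate the DO between messages.
    const pair = new WebSocketPair();
    const [client, server] = Object.values(pair);
    this.ctx.acceptWebSocket(server);
    return new Response(null, { status: 101, webSocket: client });
  }

  async webSocketMessage(ws: WebSocket, message: string | ArrayBuffer) {
    // Broadcast to every connected socket EXCEPT the sender. Echoing messages
    // back to clients that re-send what they receive is what can turn into an
    // unbounded message loop that saturates the DO's event loop.
    for (const peer of this.ctx.getWebSockets()) {
      if (peer === ws) continue;
      peer.send(message);
    }
  }
}
```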
josh · 2mo ago
Let me share what we are seeing on our side...
josh · 2mo ago
Looks like a WebSocket connection was established at 17:43. Then we get a few more invocations (RPC or HTTP), all successful, until we see two requests cancelled at 17:46. Two WebSocket messages are sent before the requests are cancelled.
[Screenshot attached (no description)]
josh · 2mo ago
A trace would be helpful so we can see what is happening when you say it is unresponsive, because from our end it looks like nothing is actually happening...
Tobias (OP) · 2mo ago
This was the confusing part to me as well; I was looking through Observability and nothing was visible. The client called me a few minutes later saying "nothing works". I immediately checked Observability: nothing. Then I used a mounted Outerbase instance, which uses the RPC connection: it was loading infinitely with a pending request. I deployed a new version; this reset the DO and everything was back to normal. For example, "939149b06c18fba1" is a request ID for an RPC connection to the DO, and it has a wall time of 400k ms.
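(For illustration only, a hedged sketch of a Worker-side RPC call into a DO; the binding, instance, and method names are placeholders, not the real setup. The point is that when the DO instance is overloaded or blocked, the await never settles and the calling request keeps accruing wall time, which is what a ~400k ms wall time looks like from the caller's side.)

```ts
interface Env {
  ROOM_DO: DurableObjectNamespace; // placeholder binding name
}

export default {
  async fetch(_req: Request, env: Env): Promise<Response> {
    const id = env.ROOM_DO.idFromName("restaurant-1"); // placeholder instance name
    // Cast only because this sketch does not include the real DO class's types.
    const stub = env.ROOM_DO.get(id) as unknown as { listOrders(): Promise<unknown[]> };
    // If the DO's event loop is saturated, this await never resolves and the
    // calling request just sits there accumulating wall time.
    const rows = await stub.listOrders(); // hypothetical RPC method
    return Response.json(rows);
  },
} satisfies ExportedHandler<Env>;
```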
josh · 2mo ago
It's worth mentioning that your DO dropped the server side of the WebSocket connection at 17:45. When you say the DO was not responding, was that to WebSocket messages? Could you have been sending them on a closed pipe?
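(A hypothetical client-side guard, not the actual native-app code: checking readyState before sending is one way to notice that the server end has dropped the connection instead of writing into a closed pipe.)

```ts
// Sketch using the standard WebSocket API; the URL is a placeholder.
const ws = new WebSocket("wss://worker.example.com/room");

ws.addEventListener("close", (ev: CloseEvent) => {
  // The server (the DO) dropped its end; reconnect before sending again.
  console.log(`socket closed (code ${ev.code}, reason: ${ev.reason || "none"})`);
});

function safeSend(socket: WebSocket, data: string): boolean {
  if (socket.readyState !== WebSocket.OPEN) {
    // Sending here would be writing on a closed (or not-yet-open) pipe.
    return false;
  }
  socket.send(data);
  return true;
}
```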
Tobias (OP) · 2mo ago
At that point any new connections, whether WebSocket or RPC, were failing.
josh · 2mo ago
> I deployed a new version; this reset the DO and everything was back to normal.
When was this?
Tobias (OP) · 2mo ago
this was deploy 6f6bc01d-29f4-44c7-aa19-8a4846182203, around 18:15
josh · 2mo ago
Okay. I'll share this information with oncall. Thanks for the info.
Tobias (OP) · 2mo ago
Thank you
Tero Kivisaari
Was there ever any conclusion to this case? I have something that appears fairly similar, in this thread: https://discord.com/channels/595317990191398933/773219443911819284/1369400409386651670
Tobias (OP) · 4w ago
I have not received an answer to my ticket yet, but I also downgraded its priority to normal, since in my case it was most certainly caused by an overload from my own code.
Tero Kivisaari
Strangely enough, my problem was also resolved by deploying a new version of the Worker containing the DO. I only added a console.log statement, so it must have been something other than an implementation issue in our code.
