It seems like WebSockets just stop working?

Hi, I've been running WebSockets with two separate libraries. They work for a while, but then just stop receiving events. I originally thought this was a library issue, or that the provider I am using for data was not sending the events, but the common thread seems to be Railway. Is there some way to confirm this?
38 Replies
Percy
Percy•8mo ago
Project ID: bdcc0a4b-7e77-4c0a-9a5f-9099d793cead
tansan [chill/build mode]
bdcc0a4b-7e77-4c0a-9a5f-9099d793cead
tansan [chill/build mode]
Railway | Help Center
WebSocket Connections Disconnecting – Railway | Help Center
Steps to take to make sure that WebSockets are behaving as expected on the platform.
Brody
Brody•8mo ago
can you be more specific about how long you are able to keep a websocket connection open for? that article is very old and not completely accurate
tansan [chill/build mode]
I am testing two different WebSocket libraries, since I thought the library was the problem. Based on the logs, the connection seems to last 24 hours or less.
Brody
Brody•8mo ago
then yes, 24 hours would be the max time. that article was talking about a max time of around a few hours; the current limitation is 24 hours. you would want to have your websocket connections reconnect on disconnect. also, using a railway domain or a custom domain doesn't change anything anymore, contrary to what that article says
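For illustration, reconnect on disconnect can be sketched roughly as below. This is a minimal sketch and not code from anyone in this thread: `run_with_reconnect`, `backoff_delay`, and `listen_once` are hypothetical names, and `listen_once` stands in for whatever your WebSocket library does to connect, subscribe, and read until the connection ends. It assumes the library surfaces a dropped connection as an `OSError` (which includes `ConnectionError`) or a timeout.

```python
import asyncio
import random
from typing import Awaitable, Callable, Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: a random delay between 0 and
    min(cap, base * 2**attempt) seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


async def run_with_reconnect(
    listen_once: Callable[[], Awaitable[None]],
    max_sessions: Optional[int] = None,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> int:
    """Run listen_once over and over, waiting a backoff delay between sessions.

    listen_once (hypothetical) should open the connection, resubscribe, and
    read events until the connection closes or fails. Returns the number of
    sessions that were started. max_sessions=None means loop forever.
    """
    sessions = 0
    failures = 0
    while max_sessions is None or sessions < max_sessions:
        sessions += 1
        try:
            await listen_once()
            failures = 0  # clean close (e.g. a 24 hour limit): retry quickly
        except (OSError, asyncio.TimeoutError):
            failures += 1  # repeated failures: wait longer each time
        await sleep(backoff_delay(failures))
    return sessions
```

The jitter is there so that many clients dropped at the same moment do not all reconnect at once. For a low frequency stream, keeping `base` small keeps the window for missed events short.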
tansan [chill/build mode]
Ah okay! I'll remove it then
Brody
Brody•8mo ago
@char8 - another report of the proxy restarts affecting a user
tansan [chill/build mode]
Okay I'll update my code tomorrow and give it a try
Brody
Brody•8mo ago
question, the communication between the websocket server and client, are these two railway services? what do you have going on?
tansan [chill/build mode]
Server on Railway is listening for websocket events from Alchemy, so waiting for infrequent blockchain events
Brody
Brody•8mo ago
so you aren't running a websocket server yourself, just a client?
tansan [chill/build mode]
That's right. I'm not running it and just running a client
Brody
Brody•8mo ago
then I don't think you'd be touching railway's proxy. is this maybe a limitation of alchemy's websocket server?
tansan [chill/build mode]
That's what I originally thought, so I also tested it with Infura. Similar results.
Brody
Brody•8mo ago
I assume infura is a similar service to alchemy?
tansan [chill/build mode]
Yeah that's right
Brody
Brody•8mo ago
have you tested your code on a different platform so you can rule out your code as a factor?
tansan [chill/build mode]
Only tested it on Railway
Brody
Brody•8mo ago
would you be up to testing it on fly perhaps?
tansan [chill/build mode]
There's a non-zero probability that it's my code, but I see it working in the beginning. Fly has too much friction lol, I moved away from Fly. I can definitely try to reconnect tomorrow and see what happens.
Brody
Brody•8mo ago
haha you aren't wrong. does this websocket connection have ping pongs?
tansan [chill/build mode]
It might... I have to check. Not sure if the library exposes that to me. Actually, I don't think the websocket servers provide that.
Brody
Brody•8mo ago
I know fly doesn't have the best DX, but from what I can tell, they do have a more stable networking setup (sorry char 😦 ), so if you could please run your code for 24 hours on fly so we can rule out any issues with your code or the platform
tansan [chill/build mode]
I'll consider it, but I'm going to try the reconnect suggestion first
Brody
Brody•8mo ago
well of course that would work but that's just hiding a potential problem
tansan [chill/build mode]
Yeah, could be! I'm doubtful it's the code though, cuz it's a very simple copypasta. I'll let you know if I end up trying fly and report back what I find.
Brody
Brody•8mo ago
okay!
char8
char8•8mo ago
this sounds like something to ask infura/alchemy. Those would be normal outbound connections from our standpoint and we don't intercept traffic there. Websocket ping/pongs are a good first step (they prevent an intermediate idle timeout from killing the connection and help identify zombie connections), but you should always implement reconnect. Assuming those folks run cloud infra, you'll see connections drop out whenever they cycle their proxy boxes / scale their fleet / rebalance connections. If it's a low freq. channel, the risk of missing events is hopefully low provided the reconnect is fast. For a high freq. stream you'd usually expect the API to provide some form of resume key so you can resume the stream from where you left off.
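As a rough illustration of the ping/pong idea above, zombie detection can be reduced to a small piece of timing logic. This is only a sketch under assumptions: `HeartbeatMonitor` and its method names are hypothetical, the 20 and 10 second values are arbitrary, and how you actually send a ping or observe a pong depends on your WebSocket library.

```python
import time
from typing import Callable, Optional


class HeartbeatMonitor:
    """Tracks ping/pong timing for one connection (hypothetical helper).

    The caller sends a ping when should_ping() is true, reports it with
    ping_sent(), and calls traffic_received() for every incoming frame,
    pongs included. is_zombie() turns true when a ping goes unanswered
    for longer than pong_timeout seconds.
    """

    def __init__(
        self,
        ping_interval: float = 20.0,  # assumed value, tune per provider
        pong_timeout: float = 10.0,   # assumed value, tune per provider
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ping_interval = ping_interval
        self.pong_timeout = pong_timeout
        self._clock = clock
        self._last_seen = clock()
        self._ping_sent_at: Optional[float] = None

    def should_ping(self) -> bool:
        # Ping only when the line has been quiet and no ping is outstanding.
        quiet_for = self._clock() - self._last_seen
        return self._ping_sent_at is None and quiet_for >= self.ping_interval

    def ping_sent(self) -> None:
        self._ping_sent_at = self._clock()

    def traffic_received(self) -> None:
        # Any frame proves the connection is alive and clears the pending ping.
        self._last_seen = self._clock()
        self._ping_sent_at = None

    def is_zombie(self) -> bool:
        if self._ping_sent_at is None:
            return False
        return self._clock() - self._ping_sent_at > self.pong_timeout
```

When `is_zombie()` returns true, the client would close the socket and go through its normal reconnect path. Many WebSocket libraries have a built-in keepalive option that does the equivalent, which is worth checking before hand-rolling this.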
tansan [chill/build mode]
I actually asked Alchemy and they ran a test with my code and events and couldn't reproduce it. They suggested it might be my host which is why I opened a ticket here.
Brody
Brody•8mo ago
haha did they run a test for more than 24 hours though?
tansan [chill/build mode]
Yeah. That's what they said, at least. I asked for their logs and they couldn't share them.
Brody
Brody•8mo ago
well I'd still like you to try on fly. for lack of a better word, you need to prove this is railway's fault
tansan [chill/build mode]
Got it. Btw, I spent 2 hours trying to get it onto fly.io and gave up. fly.io doesn't like my Dockerfiles. I couldn't launch my other Elixir app there either, before I found Railway. Do you have any other suggestions for hosts I can just spin up easily via Dockerfile?
Brody
Brody•8mo ago
fly.io should automatically use a Dockerfile, same as railway. either way, do you have high frequency data coming across this websocket connection?
tansan [chill/build mode]
Naw, the data is very infrequent. Couldn't get it deployed, honestly. Ran into a lot of deployment issues with their command line tool.
Brody
Brody•8mo ago
okay then just do the reconnect on disconnect and call it a day
tansan [chill/build mode]
trying to do it for the brand man