Railway•10mo ago
fedev

Websocket disconnecting

Hey, I have an app that uses a websocket (hosted on railway) but for some reason every once in a while (2 - 2.5h) the websocket just disconnects (and then reconnects, since I set up a reconnecting websocket). Since there is high traffic, in the second it disconnects I can lose a lot of data. I saw another issue of this type and you guys suggested using pings (which I'm sending about every 10 seconds, both from the websocket server and from the app connected to it) and using the .up.railway.app domain instead of a custom one, as written here https://help.railway.app/troubleshooting/hKqw9Dr1moxfDySTy98E6G/websocket-connections-disconnecting/sb6bfjV6UnuMwkRyAvQ3Sb and I'm doing that too. I'll also attach some logs of the client reconnecting to the websocket every 2 or 2.5h
77 Replies
Percy
Percy•10mo ago
Project ID: b16be651-2cf1-4464-91ef-3890dc9c63aa
fedev
fedev•10mo ago
b16be651-2cf1-4464-91ef-3890dc9c63aa (websocket service id)
Brody
Brody•10mo ago
this is a websocket connection between a client and your railway service right?
fedev
fedev•10mo ago
between a nodejs client and a nodejs websocket server (both on railway)
Brody
Brody•10mo ago
are they in the same project?
fedev
fedev•10mo ago
you mean service?
Brody
Brody•10mo ago
I mean project
fedev
fedev•10mo ago
no they're not
Brody
Brody•10mo ago
should they be?
fedev
fedev•10mo ago
hmm no why?
Brody
Brody•10mo ago
because it seems like they should, they talk to each other after all, and if you put them in the same project, you can use the internal networking
fedev
fedev•10mo ago
but it's a websocket server, it doesn't need to be in the same project. it works fine, just that sometimes it disconnects
Brody
Brody•10mo ago
also you misread the help page, it says to use a custom domain instead of the railway domain, you have it the other way around
fedev
fedev•10mo ago
oh god...😂 I feel so stupid ahah
Brody
Brody•10mo ago
either way I'm kinda leaning towards this not being an issue with railway. your times between disconnects are not fixed, some 2 hours, some 4 hours, etc. so I think you should start logging the disconnect reason, and once you have those error logs, go from there
fedev
fedev•10mo ago
will try with custom domain and add some logs and see then 😅
Brody
Brody•10mo ago
also also, I'm pretty sure fp or char told me that the railway domain issue was fixed a long time ago
fedev
fedev•10mo ago
oh ok
Brody
Brody•10mo ago
so yeah hold off on the custom domain for now. log both error and disconnect events on both ends of the websocket
fedev
fedev•10mo ago
the only log i can get from websocket close event is the code
fedev
fedev•10mo ago
code 1006
It is designated for use in
applications expecting a status code to indicate that the
connection was closed abnormally, e.g., without sending or
receiving a Close control frame.
Brody
Brody•10mo ago
there are error and close event emitters for the websocket connection, you will want to print the reason for error and the reason for disconnect respectively
fedev
fedev•10mo ago
I'm listening to both of them, error is not firing, only close
Brody
Brody•10mo ago
you will want to print the reason for error and the reason for disconnect respectively
fedev
fedev•10mo ago
but how can i print the error if there is no error...
Brody
Brody•10mo ago
if there was no error then you wouldn't have a disconnect, there is an error, print it
fedev
fedev•10mo ago
this.ws.on("close", closeMessage => {
  console.log("WEBSOCKET CLOSE", closeMessage); // 1006 WAS LOGGED
  if (this.shouldReconnect) {
    this.scheduleReconnect();
    this.stopPingTimer();
  }
});

this.ws.on("error", (error: any) => {
  // if (error.code === "ECONNREFUSED") return;
  console.log(`${this.nameIdentifier} WebSocket error`, error); // NOTHING WAS LOGGED
});
Brody
Brody•10mo ago
i just can't see how this issue is railway's fault, given the fact that the time between disconnects is sporadic. if there were timeouts for websocket connections in place you would see a constant time between disconnects. like i have said, log the reason, you are only logging the code
fedev
fedev•10mo ago
ok then I will see if I can manage to figure it out, thank you anyways. maybe the problem is that I'm using ws.send("ping") instead of ws.ping(). how am I not logging the reason? there is console.log(error) and console.log(closeMessage)
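Side note on the ws.send("ping") vs ws.ping() question: ws.send("ping") sends an ordinary text message that only the application on the other end can interpret, while ws.ping() in the Node ws package sends a protocol-level ping frame, which the peer answers with a pong. A minimal heartbeat sketch along those lines follows. PingableSocket and createHeartbeat are hypothetical names used only for illustration, and the 10 second interval is taken from the first post. This is a sketch under those assumptions, not the code either side of this thread is running.

```typescript
// Minimal shape of the socket methods this sketch relies on.
// (Hypothetical interface; a `ws` WebSocket offers ping(), terminate() and a "pong" event.)
interface PingableSocket {
  ping(): void;
  terminate(): void;
  on(event: "pong", listener: () => void): void;
}

// Returns a tick function. Call it on a fixed interval:
// each tick terminates the socket if the previous ping was never answered,
// otherwise it marks the socket as unconfirmed and sends a new ping frame.
function createHeartbeat(socket: PingableSocket): () => void {
  let alive = true;
  socket.on("pong", () => {
    alive = true; // the peer answered the last ping
  });
  return function tick(): void {
    if (!alive) {
      socket.terminate(); // no pong since the last tick: treat the link as dead
      return;
    }
    alive = false;
    socket.ping();
  };
}

// Hypothetical wiring:
// const timer = setInterval(createHeartbeat(ws), 10_000);
// ws.on("close", () => clearInterval(timer));
```

A heartbeat like this detects a dead link sooner; it does not by itself stop a proxy in between from dropping the connection.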
Brody
Brody•10mo ago
you are only logging the code. please read the reference docs for the close event: https://developer.mozilla.org/en-US/docs/Web/API/WebSocket/close_event
fedev
fedev•10mo ago
you mean I'm doing console.log(closeMessage) instead of console.log(closeMessage.reason)?
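For context on what "logging the reason" could look like: the browser close event linked above exposes code, reason and wasClean, while the Node ws package is assumed here to pass the close code and a reason buffer as two separate listener arguments. The sketch below formats both. describeClose and CLOSE_CODE_NAMES are hypothetical helpers, not part of any library, and the code table is a small, non-exhaustive subset of RFC 6455.

```typescript
// A few well-known close codes from RFC 6455, section 7.4.1 (not exhaustive).
const CLOSE_CODE_NAMES: Record<number, string> = {
  1000: "normal closure",
  1001: "going away",
  1006: "abnormal closure (no Close frame received)",
  1011: "internal error",
};

// Formats a close code and reason into one log line.
// The reason may arrive as a string (browser API) or as raw bytes (assumed for `ws`).
function describeClose(code: number, reason: string | Uint8Array): string {
  const text = typeof reason === "string" ? reason : new TextDecoder().decode(reason);
  const name = CLOSE_CODE_NAMES[code] ?? "unknown code";
  const shownReason = text.length > 0 ? JSON.stringify(text) : "<empty>";
  return `code=${code} (${name}) reason=${shownReason}`;
}

// Hypothetical wiring with the `ws` package:
// ws.on("close", (code, reason) => console.log("WEBSOCKET CLOSE", describeClose(code, reason)));
```

Since 1006 means no Close frame was received at all, the reason is expected to be empty in that case, so the code may be the only information available on the client side.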
Brody
Brody•10mo ago
please read the reference docs for the close event https://developer.mozilla.org/en-US/docs/Web/API/WebSocket/close_event
fedev
fedev•10mo ago
I have read it and still not understanding what you are saying
Brody
Brody•10mo ago
I don't think my attempts to guide you in the right direction are bearing any fruit. I have other threads from other users to attend to and wish you the best as you debug your issue. Hopefully, it gets resolved! 🙂
fedev
fedev•10mo ago
ok sorry, thank you anyway railway
fedev
fedev•10mo ago
Hey brody, I'm pretty sure that the websocket disconnecting issue is a problem on railway's side. I have deployed the same websocket server on AWS and 11 hours later (for me it's 20:10) the node client still hasn't disconnected from the websocket
Brody
Brody•10mo ago
okay try a custom domain next
fedev
fedev•10mo ago
Yes I already tried it yesterday
Brody
Brody•10mo ago
try on fly?
fedev
fedev•10mo ago
Wym?
Brody
Brody•10mo ago
fly.io. you tried it on aws, now try it on fly.io
fedev
fedev•10mo ago
Oh its a hosting platform?
Brody
Brody•10mo ago
yeah
fedev
fedev•10mo ago
Ok, later I will try to host it on fly, and tomorrow I will see if it disconnected
fedev
fedev•10mo ago
I tried with fly and 10h and 30 min later it's still connected
fedev
fedev•10mo ago
I also retried with the railway websocket just to make sure, and less than 2h later it disconnected
Brody
Brody•10mo ago
okay, I'm gonna run a test myself too, if I can reproduce this, I will get the team involved for you
fedev
fedev•10mo ago
Ok thanks, for the moment I deployed the websocket server on fly.io
Brody
Brody•10mo ago
what would you say the max time you have been able to keep a websocket connection open for on railway is?
fedev
fedev•10mo ago
from this screenshot 9.5h, but generally it's 2h
Brody
Brody•10mo ago
well I hope I don't have to get back to you in 9.5 hours. I also kinda hope I can reproduce this, I'm not sure what I'd say to you if I can't
fedev
fedev•10mo ago
not urgent ahah. if you want I can provide you the code I'm using
Brody
Brody•10mo ago
all we will be waiting for is my websockets test to disconnect
fedev
fedev•10mo ago
yeah
Brody
Brody•10mo ago
can reproduce
Duchess
Duchess•10mo ago
Thread has been flagged to Railway team by @Brody.
Brody
Brody•10mo ago
@char8 - websocket disconnects
char8
char8•10mo ago
Actively looking into this - have an idea of what it could be, so testing that hypothesis. Will update here when I have more.
fedev
fedev•9mo ago
Hey any news?
Ray
Ray•9mo ago
Hey, not yet, sorry. It may be related to some other network issues we're going to investigate soon. The team's heads down on getting regions out atm
fedev
fedev•9mo ago
Still no updates on this? 😕
Brody
Brody•9mo ago
well as long as envoy has a memory leak lol. I've said this before, but are you sure you don't want to run this communication through the private network?
fedev
fedev•9mo ago
Could try it. Tho I've never used it, I tried reading the docs but don't really understand how to get it working
Brody
Brody•9mo ago
put everything in the same project, then just use the internal domains and the port your app runs on when opening the ws connections
fedev
fedev•9mo ago
cool, managed to connect, I had forgotten to change to ws instead of wss. but is it like a guaranteed thing that with private networking it won't disconnect?
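For anyone following the same steps, the change described here amounts to pointing the client at the service's internal domain and app port, with the ws:// scheme instead of wss://. A small sketch, where internalWsUrl is a hypothetical helper and the host name and port in the usage comment are made-up placeholders rather than values from this thread:

```typescript
// Builds a plain (non-TLS) websocket URL for a service reached over private networking.
// The internal host and port are whatever the target service actually uses.
function internalWsUrl(internalHost: string, port: number): string {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid port: ${port}`);
  }
  if (internalHost.trim().length === 0) {
    throw new Error("internal host must not be empty");
  }
  // ws:// rather than wss://, as noted in the message above.
  return `ws://${internalHost.trim()}:${port}`;
}

// Hypothetical usage (placeholder host and port):
// const url = internalWsUrl("ws-server.railway.internal", 8080);
// const socket = new WebSocket(url);
```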
Brody
Brody•9mo ago
well there's no proxy with the private network
fedev
fedev•9mo ago
ok then for the moment I will use private networking, thank you brody
Ray
Ray•9mo ago
This happens as a side effect of our edge proxy. No short-term fix for now 😦 Sorry for the late follow up, and the bad news on this. We're reworking a major layer on all of this soon
fedev
fedev•9mo ago
I understand, thank you anyway. Thanks
Dayblox
Dayblox•9mo ago
Might just be needing a keepalive ?
Brody
Brody•9mo ago
both ray and char8 have confirmed it's not a code/config issue
Dayblox
Dayblox•9mo ago
I meant a keepalive message to avoid inactivity. But 9 hours feels lengthy
Brody
Brody•9mo ago
both fedev's app and my app have constant activity, my test had a message every second, so it's railway
Dayblox
Dayblox•9mo ago
Alright, sorry I'm caught up now
Brody
Brody•9mo ago
no worries at all
char8
char8•9mo ago
A few weeks ago we reduced how often envoy restarts on the edge proxies to once a week. The routing proxies still restart once a day; we're looking into lengthening that interval as well (these eat up a fair bit more RAM). Hopefully some improvement though 🤞