Some services are not responding
Some of my services are not responding to requests. I'm getting timeouts.
Project ID:
f433fcca-164a-41f2-ba64-3fee6f309a1b
could you send some example service domains?
Will that be published publicly using AnswerOverflow?
yes
you can always delete it afterwards (and that would delete it from answer overflow as well)
Ok.
Use any credentials and it just times out
The signin page is cached
is this consistently timing out for you?
Yes for the last hour
The exact replica is hosted on a staging server and that is working perfectly
I tried redeploying, restarting, everything
Just this project is misbehaving
what is the tech stack of this app?
may I ask why you believe this to be an outage, and not your app misbehaving?
Because even simple endpoints such as /api/ping/, which does nothing but reply with a hardcoded hello JSON, aren't working consistently
It works once in 10 to 15 min
Rest of the time it's timing out
that's unfortunately not definitive proof, so let's keep an open mind going forward
Sure
Also I have the exact replica in another project that's working fine
the log in is just infinite loading for me, but I'm on mobile right now, will take a closer look when at the computer
There are also no logs showing up in this service for me to debug anything
how long ago was this service deployed?
Occasionally it just says workertimeout
And it restarts
2 hours ago last
First deployed more than a year ago
that's pointing more towards an issue with your app than an issue with railway
Working perfectly till today
It worked perfectly for a year
does your app contact any 3rd party apis?
Not the /api/ping/ route
do the metrics look normal?
Yes
what's your start command
I didn't change anything related to the settings either
It was working just fine yesterday too
have you tried re-deploying?
Yes
Redeployed all services in this project
no one else has reported anything wrong, so at this time i believe this is an issue with your app
i'm seeing this page during a login attempt, meaning railway is working, but your app isn't
Alright, I have a feeling it's an isolated incident. Something similar to this happened ages ago. Had an email thread with Angelo. He resolved it just for our service.
Rest is up to the railway guys. I'll move this service away shortly since a customer is very frustrated.
you said your service was logging a worker timeout, indicating your app is timing out
something in your code is blocking all requests from completing and timing out
Hmm let me check
i understand how this can look like an outage from your perspective, and i totally get all of your reasoning behind thinking that. but i've been doing community support for a long time, and i've seen far stranger issues that looked even more like an issue with railway but turned out to be code issues. so please spend some time combing over all of your code, and add some verbose debug logging to help you find out where your code is freezing up
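One way to add the kind of verbose debug logging suggested above is to wrap each suspect step in a small timing helper. This is only a sketch and assumes a Python app (the thread never states the tech stack); the logger name `app.debug` and the helper `log_step` are hypothetical names chosen for illustration.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; use whatever the app already configures.
logger = logging.getLogger("app.debug")


@contextmanager
def log_step(name):
    """Log when a step starts and how long it took, even if it raises."""
    start = time.monotonic()
    logger.debug("start %s", name)
    try:
        yield
    finally:
        # Runs on success and on exceptions, so every finished step is timed.
        logger.debug("end %s after %.3fs", name, time.monotonic() - start)
```

Wrapping a call as `with log_step("db query"): ...` means that a step which hangs leaves a "start" line with no matching "end" line in the logs, which narrows down where the code is freezing.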
Makes sense. I'll keep an open mind.
Solution
I got it to work. Turns out Railway changed the database private URL from
postgres-a1234.railway.internal:5432
to
postgres.railway.internal:5432
This caused any DB-related code to time out the worker, which also caused the /api/ping/ route to time out until the worker restarted. This was the only difference from the replica, which already had the non-suffixed URL for its own DB. Not sure what to make of this, or whether it's the app's fault. We can close this thread now.
definitely not the app's fault, but railway isn't just going to change names like that either, though that's for sure what caused it
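A stale database hostname like the one described above can be caught quickly with a bounded connectivity check, rather than waiting for a worker to time out. The following is a minimal sketch, assuming a Python app and a PostgreSQL-style connection URL; the function name `check_db_host` is hypothetical and not part of any Railway tooling.

```python
import socket
from urllib.parse import urlsplit


def check_db_host(database_url, timeout=3.0):
    """Return (ok, message) for a quick TCP reachability check of the DB host."""
    parts = urlsplit(database_url)
    host = parts.hostname
    port = parts.port or 5432  # default PostgreSQL port
    if not host:
        return False, "no hostname found in URL"
    try:
        # create_connection resolves the name and opens a TCP connection.
        # `timeout` bounds the connect attempt; the DNS lookup itself is
        # handled by the system resolver.
        with socket.create_connection((host, port), timeout=timeout):
            return True, f"{host}:{port} is reachable"
    except OSError as exc:
        # Covers DNS failures, refused connections, and connect timeouts.
        return False, f"{host}:{port} is not reachable: {exc}"
```

Calling this at startup with the configured connection string (for example, one read from an environment variable such as `DATABASE_URL`, which is an assumption here) and logging the result would make a hostname change show up as an explicit error message.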