Railway Deployment fails with `operation timed out`
Trying to run `wasp deploy railway deploy [project-name]`. It ends up failing because of a Railway failure, by the looks of it. I followed the steps to deploy manually and that worked.
output:
Anyone else seeing these issues? Is there a way the deploy command could be more resilient to network related failures?
Hi @alittlefluffy. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
----------------------
It appears that network-related timeouts and failures during Railway deployments are a known issue, both with manual and CLI-based approaches. Your experience of the deployment failing with an "operation timed out" error is consistent with other reports, especially when deploying via CLI commands like `railway up` or `wasp deploy railway deploy`.
Currently, the Wasp CLI's deployment process does not have built-in resilience or automatic retries for network-related failures. This has been identified as an area for improvement for other providers (like Fly.io), where suggestions include making the deployment process more robust by retrying failed commands and providing clearer instructions to users on how to recover from partial failures. However, there is no explicit mention in the knowledge sources of such improvements being implemented for Railway deployments yet. The general advice is to retry the deployment manually if you encounter such errors, as you did, or to follow the manual deployment steps as a fallback, which tend to be more reliable in these scenarios (source: "Improve how our Fly deployment via Wasp CLI handles errors").
If you continue to experience these issues, you may want to:
- Retry the deployment after a short wait, as the problem could be transient.
- Use the manual deployment steps, which you mentioned worked for you.
- Monitor the related GitHub issues and PRs for updates on improved error handling and resilience in the Wasp CLI for Railway deployments.
At this time, the knowledge sources do not mention any automated retry or resilience features for Railway deploys in Wasp CLI, but this is a recognized area for future improvement.
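Until something like that exists, the "retry after a short wait" advice can be scripted on the user's side. The following is a minimal sketch, not part of Wasp or Railway: `run_with_retry` is a hypothetical helper, and it assumes the failure text `operation timed out` shows up in the command's output. It also assumes that re-running the command is safe, which may not hold if an earlier attempt partially succeeded.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, base_delay=5.0, marker="operation timed out",
                   run=subprocess.run, sleep=time.sleep):
    """Run cmd and re-run it only when its output contains `marker`.

    `marker` is an assumption about what a transient failure looks like.
    `run` and `sleep` are injectable so the helper can be tried without a real CLI.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result
        output = (result.stdout or "") + (result.stderr or "")
        # Only timeouts are treated as transient; any other failure is
        # returned immediately, as is the failure of the last attempt.
        if marker not in output or attempt == attempts:
            return result
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))


# Example call ("my-project" is a placeholder project name):
# result = run_with_retry(["wasp", "deploy", "railway", "deploy", "my-project"])
# print(result.returncode)
```

Because the whole command is re-run, check the Railway dashboard for half-created resources before relying on this.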
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
Sometimes the Railway API is flaky, and that's the issue here. I like your suggestion to make our API calls more resilient, but the problem is that Wasp CLI calls Railway CLI to perform the actions. We don't have access to the raw HTTP calls, which we could retry; we can only retry the Railway CLI command, which might not be what you want. For example, we might end up creating multiple services because the first command actually succeeded.
I guess the best thing for us would be if Railway CLI retried the HTTP call. Maybe we should create an issue on their repo?
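To illustrate the duplicate-service problem, here is a small simulation. Everything in it is hypothetical: `FlakyBackend`, `TransientError`, and `retry_unless_applied` are made-up names for the sketch, not Railway or Wasp APIs. The fake backend creates the service but "loses" the first reply, which is roughly what a timeout after a successful request looks like from the caller's side.

```python
import time


class TransientError(Exception):
    """A failure that might be temporary, such as a timeout."""


class FlakyBackend:
    """In-memory stand-in for a remote API whose first reply gets lost."""

    def __init__(self):
        self.services = []
        self._timed_out_once = False

    def create_service(self, name):
        self.services.append(name)  # the service is created ...
        if not self._timed_out_once:
            self._timed_out_once = True
            # ... but the caller never hears about it the first time.
            raise TransientError("operation timed out")
        return name

    def has_service(self, name):
        return name in self.services


def retry_unless_applied(action, already_applied, attempts=3, delay=0.0, sleep=time.sleep):
    """Retry `action` on TransientError, but only if its effect is not visible yet."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except TransientError:
            # The request may have gone through before the timeout,
            # so look for its effect before trying again.
            if already_applied():
                return None
            if attempt == attempts:
                raise
            sleep(delay)


# Blind retry: the second call creates a duplicate.
naive = FlakyBackend()
try:
    naive.create_service("server")
except TransientError:
    naive.create_service("server")
print(naive.services)  # ['server', 'server']

# Guarded retry: the existence check stops the second call.
guarded = FlakyBackend()
retry_unless_applied(lambda: guarded.create_service("server"),
                     lambda: guarded.has_service("server"))
print(guarded.services)  # ['server']
```

The guarded version only works when there is a reliable way to ask "did it already happen?", which is the part that is hard to do from outside the Railway CLI.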
Yeah, making the calls more resilient is tough to build well for everyone. You can't necessarily capture someone's intention, and without access to the actual HTTP calls you could end up retrying on user errors, tbf. Sort of stuck between a rock and a hard place.
Guess I'll live with some deployment flakiness for now 🥹
Thank you anyway!