Railway13mo ago
Horserix

Deploy fails because it can't reach the /health endpoint

Deploys are flaky when triggered by new pushes to the project's master branch. Sometimes they pass the health check; other times I have to remove the previous active deploy and redeploy, and even then the behavior is unpredictable. Occasionally pushing a new change helped. I'm using Go (gin) with router.Run(ip + ":" + port), where ip is 0.0.0.0 and port comes from a service variable set to 8080. My deploy details are:
{
  "$schema": "https://railway.app/railway.schema.json",
  "build": {
    "builder": "NIXPACKS",
    "buildCommand": "./run.sh production-build"
  },
  "deploy": {
    "numReplicas": 1,
    "startCommand": "./main",
    "healthcheckPath": "/health",
    "sleepApplication": false,
    "restartPolicyType": "ON_FAILURE",
    "restartPolicyMaxRetries": 10
  }
}
buildCommand is basically: go get && go build -o main. Would love help on this, since it makes fixing an urgent issue a bit of a nightmare. Thanks!
22 Replies
Percy
Percy13mo ago
Project ID: ec8587ed-0fb6-417a-a80d-ab29eb6a37b6
Horserix
Horserix13mo ago
Project ID: ec8587ed-0fb6-417a-a80d-ab29eb6a37b6
Brody
Brody13mo ago
first, why have you set a custom build command? @Horserix ping for visibility
Horserix
Horserix13mo ago
I used that on another platform, and it worked when I used it right after creating the project. I have no problem using the recommended settings from the docs; I just didn't think this could affect the deploy, since it was working a while ago. I'll try removing it and using the default, and I'll ping you back if anything changes 👍
Brody
Brody13mo ago
what do the healthchecks fail with?
Horserix
Horserix13mo ago
These are the last lines from the build logs:
Publish time: 0.93 seconds

====================
Starting Healthcheck
====================

Path: /health
Retry window: 5m0s

Attempt #1 failed with service unavailable. Continuing to retry for 4m59s
Attempt #2 failed with service unavailable. Continuing to retry for 4m58s
Attempt #3 failed with service unavailable. Continuing to retry for 4m56s
Attempt #4 failed with service unavailable. Continuing to retry for 4m52s
Attempt #5 failed with service unavailable. Continuing to retry for 4m44s
Attempt #6 failed with service unavailable. Continuing to retry for 4m28s
Attempt #7 failed with service unavailable. Continuing to retry for 3m58s
Attempt #8 failed with service unavailable. Continuing to retry for 3m28s
Attempt #9 failed with service unavailable. Continuing to retry for 2m58s
Attempt #10 failed with service unavailable. Continuing to retry for 2m28s
Attempt #11 failed with service unavailable. Continuing to retry for 1m58s
Attempt #12 failed with service unavailable. Continuing to retry for 1m28s
Attempt #13 failed with service unavailable. Continuing to retry for 58s
Attempt #14 failed with service unavailable. Continuing to retry for 28s

1/1 replicas never became healthy!

Healthcheck failed!
I just used the default config for the build and deploy and it seems it's having the same issues. It's stuck at the health check attempts 🤔
Brody
Brody13mo ago
can you show me the code that starts the server?
Horserix
Horserix13mo ago
Sure!
main_ctx := context.Background()
var environment = os.Getenv("ENVIRONMENT")
var port = os.Getenv("PORT")
var ip string
router := gin.Default()

if environment == "production" {
    router.ForwardedByClientIP = false
    gin.SetMode(gin.ReleaseMode)
    ip = "0.0.0.0"
}

router.Use(auth_controller.AuthMiddleware)
controllers.RegisterRoutes(router)
services.RegisterServices()
google_cloud_storage_service.Init(main_ctx)
firebase_service.Init(main_ctx)

db.InitDb()

ticker := time.NewTicker(5 * time.Minute)
go event_scheduler.SendNotifications(ticker.C)

router.Run(ip + ":" + port)

defer ticker.Stop()
The service variable ENVIRONMENT is correctly set to production in the Variables tab, and the /health GET endpoint is allowed. There were multiple past deployments where the health checks worked, by the way, and I haven't changed it since.
Brody
Brody13mo ago
show me the health endpoint code?
Horserix
Horserix13mo ago
func RegisterRoutes(router *gin.Engine) {
    router.GET("/health", onHealthCheck)
}

func onHealthCheck(c *gin.Context) {
    logger.Println("Health check")
    c.JSON(200, gin.H{
        "message": "healthy",
    })
}
Brody
Brody13mo ago
what is your port variable set to
Horserix
Horserix13mo ago
8080
Brody
Brody13mo ago
anything bad in the deployment logs when the healthcheck fails?
Horserix
Horserix13mo ago
Nope, everything goes well; the build and publishing are both successful 🤔 In the deploy logs I can see the server running fine, but it's not being hit by any of the requests. I have a previous deployment that is active and working; it was triggered one commit ago. It's weird 🤔 The new commit just fails the health check.
Brody
Brody13mo ago
don't listen on 0.0.0.0, just do router.Run(":" + port)
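A minimal sketch of how the listen address in this exchange could be built, using only the standard library. The helper name listenAddr and the 8080 fallback are assumptions for illustration, not something from the thread. An empty host gives ":PORT", and the fallback guards against PORT being empty, in which case Go's net package would pick a port automatically.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// listenAddr (hypothetical helper) builds the address string handed to a
// server's Run or ListenAndServe call. An empty host produces ":PORT".
func listenAddr(host, port string) string {
	if port == "" {
		// Assumed fallback for local runs. Without it the address would end
		// in a bare ":" and the OS would choose an arbitrary port.
		port = "8080"
	}
	return net.JoinHostPort(host, port)
}

func main() {
	// Print the address this process would bind to, based on the PORT variable.
	fmt.Println("would listen on", listenAddr("", os.Getenv("PORT")))
}
```

Logging the final address at startup makes it easy to compare against the port the platform is probing.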
Horserix
Horserix13mo ago
I can try that again. I was doing that previously and ran into the same issue: when this first happened I had that setup (router.Run(":" + port)). I then added the ip after checking the docs, and the odd behavior crept in again. By the way, thanks for helping out 👍
Brody
Brody13mo ago
let me know how that goes
Horserix
Horserix13mo ago
Didn't work :/ Stuck at the health endpoint check again. (To add a bit more context, I tried disabling the check, but obviously when I make a request to the server after it deploys I get an "Application failed to respond" error.)
Brody
Brody13mo ago
gin is misbehaving then. I have full confidence this isn't a problem with Railway
Horserix
Horserix13mo ago
Interesting, I'll try setting up just the health endpoint using something other than gin to confirm whether this is the case. Thanks!
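One way the isolation test described here might look, as a sketch using only Go's standard library instead of gin. The names healthHandler, newMux and probe are hypothetical; the JSON body mirrors the gin handler posted earlier in the thread. The probe helper calls the handler in-process, so the route can be checked without any network or platform involved.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthHandler answers with 200 and the same JSON body as the gin handler.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	json.NewEncoder(w).Encode(map[string]string{"message": "healthy"})
}

// newMux registers only the health route, with no middleware in front of it.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", healthHandler)
	return mux
}

// probe sends an in-process GET to the handler and returns status and body.
func probe(h http.Handler, path string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe(newMux(), "/health")
	fmt.Println(code, body)
	// To serve for real, something like
	// http.ListenAndServe(":"+os.Getenv("PORT"), newMux()) would replace the probe.
}
```

If this bare version passes the platform health check while the gin version does not, the difference is more likely in the application setup (middleware, init order, port) than in the platform.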
Brody
Brody13mo ago
sounds good
dwaynemac
dwaynemac13mo ago
i'm having a similar issue, @Horserix did you manage to resolve it? in my case the problem was that my server required https requests, and the Healthcheck seems to be an http request. Allowing http on the healthcheck endpoint solved it for me.
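A rough sketch of the kind of exemption described in this reply, written with Go's standard library. It assumes the server sits behind a proxy that sets the X-Forwarded-Proto header, which is an assumption and not something confirmed in the thread; requireHTTPS, newApp and statusFor are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireHTTPS redirects plain-http requests to https, except for the exempt
// path, so a health probe sent over http still reaches the handler.
// Assumption: the proxy reports the original scheme in X-Forwarded-Proto.
func requireHTTPS(exempt string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != exempt && r.Header.Get("X-Forwarded-Proto") != "https" {
			target := "https://" + r.Host + r.URL.RequestURI()
			http.Redirect(w, r, target, http.StatusPermanentRedirect)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newApp wraps a trivial 200 handler with the https requirement.
func newApp() http.Handler {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return requireHTTPS("/health", ok)
}

// statusFor issues an in-process GET with an optional forwarded scheme.
func statusFor(h http.Handler, path, proto string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if proto != "" {
		req.Header.Set("X-Forwarded-Proto", proto)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	app := newApp()
	fmt.Println(statusFor(app, "/health", ""))   // health probe over http
	fmt.Println(statusFor(app, "/api", ""))      // other path over http
	fmt.Println(statusFor(app, "/api", "https")) // other path over https
}
```

Only the health path is exempt; every other route keeps the https requirement.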