400 error on Load balancing endpoint
Hello, first time here. I am using the llama.cpp server image to host a model via a load-balancing serverless endpoint.
The worker is running and I can see in the logs that my server is up, but when I try to hit the endpoint it returns a 400 error.
Here is how I am making the request.
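(A minimal sketch of the request, assuming llama.cpp's OpenAI-compatible `/v1/chat/completions` route; the endpoint URL and API key below are placeholders, not my real values.)

```python
# Sketch of the request against the load-balancing endpoint.
# ENDPOINT_URL and API_KEY are placeholders, not real values.
import requests

ENDPOINT_URL = "https://<your-load-balancer-endpoint>/v1/chat/completions"  # placeholder
API_KEY = "<RUNPOD_API_KEY>"  # placeholder

payload = {
    "model": "default",  # llama.cpp server accepts any model name here
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 64,
}

resp = requests.post(
    ENDPOINT_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=180,
)
print(resp.status_code)
print(resp.text)
```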
The request takes about two minutes and then returns a 400 error.
For more context, I am running the following image:
ghcr.io/ggerganov/llama.cpp:server
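(For reference, the llama.cpp server in this image also exposes a `/health` route, so a quick reachability check through the endpoint looks roughly like this; the URL and API key are placeholders.)

```python
# Minimal health-check sketch against llama.cpp server's /health route,
# routed through the load-balancing endpoint. URL and key are placeholders.
import requests

ENDPOINT_BASE = "https://<your-load-balancer-endpoint>"  # placeholder
API_KEY = "<RUNPOD_API_KEY>"  # placeholder

resp = requests.get(
    f"{ENDPOINT_BASE}/health",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
print(resp.status_code, resp.text)
```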
@esp.py
Escalated To Zendesk
The thread has been escalated to Zendesk!
Ticket ID: #26369