Can RunPod be trusted for production?
Hi all, I'm new to RunPod and I like the UI and the easy serverless deployment. However, I've had work running for multiple hours and my credits have been depleting. All my messages keep getting queued despite the running workers. I'm about to deploy my beta and people are waiting, so this got me concerned about whether or not I should trust RunPod. I'd like the community's support and opinions, since the support team didn't provide much insight.
are you using serverless queues with /run?
yes
@flash-singh yes, I am using it with /run
@flash-singh Am I not supposed to use /run?
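For reference, the /run queue flow being discussed is asynchronous: the POST returns a job ID immediately (status IN_QUEUE) and you poll /status for the result. A minimal sketch, assuming the standard v2 endpoints; the endpoint ID, API key, and payload are placeholders:

```python
# Minimal sketch of the async /run flow: submit a job, then poll /status.
# RUNPOD_ENDPOINT_ID / RUNPOD_API_KEY / the payload are placeholders.
import os
import time
import requests

BASE = f"https://api.runpod.ai/v2/{os.environ['RUNPOD_ENDPOINT_ID']}"
HEADERS = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

job = requests.post(f"{BASE}/run", json={"input": {"prompt": "hello"}}, headers=HEADERS).json()

while True:
    status = requests.get(f"{BASE}/status/{job['id']}", headers=HEADERS).json()
    if status["status"] in ("COMPLETED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)  # jobs sit IN_QUEUE until a worker picks them up

print(status)
```

If jobs stay IN_QUEUE even with running workers, that usually points at the worker side rather than the client.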
here's what I would do: create a Locust file (Python load testing), run 1-2 requests every second, and every hour increase the load to 50-100 requests per second for 10 minutes, then drop back to 1-2 requests per second. Run that for 48-168 hours; if you see less than 1% errors, it could be a good option.
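A minimal sketch of the Locust file described above, assuming a queue endpoint hit via /run; the endpoint ID, API key, and payload are placeholders:

```python
# locustfile.py — baseline load of ~1 req/s per simulated user against /run.
# Scale total RPS by changing the user count (-u) when you run locust.
import os
from locust import HttpUser, task, constant_throughput

ENDPOINT_ID = os.environ["RUNPOD_ENDPOINT_ID"]  # placeholder
API_KEY = os.environ["RUNPOD_API_KEY"]          # placeholder

class RunpodUser(HttpUser):
    host = "https://api.runpod.ai"
    wait_time = constant_throughput(1)  # each user sends ~1 request per second

    @task
    def submit_job(self):
        self.client.post(
            f"/v2/{ENDPOINT_ID}/run",
            json={"input": {"prompt": "load-test"}},  # placeholder payload
            headers={"Authorization": f"Bearer {API_KEY}"},
        )
```

Run it with something like `locust -f locustfile.py -u 2` for the baseline, bump `-u` to 50-100 for the ten-minute spikes, or script the whole stepped profile with Locust's LoadTestShape.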
I tried the load balancing endpoints, but I would not use those for production; the /run and /runsync endpoints are maybe more stable
Thanks this is very insightful😆
I've also noticed that CPU and 4090 workers often run out of supply, and availability seems to change on a weekly/bi-weekly basis. If you rely on 0 active workers or have peak workloads, there's a high chance you won't have the compute you need. If you have active users, especially paid users, it's worth having a back-up deployment on a second serverless provider that ideally takes over automatically once you exceed a threshold error rate.
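A rough sketch of that failover idea: track the recent error rate and route to a backup once it crosses a threshold. The URLs, window size, and the second provider here are all placeholders:

```python
# Route requests to a backup endpoint once the primary's recent error rate
# exceeds a threshold. PRIMARY/BACKUP URLs and the tuning values are placeholders.
import collections
import requests

PRIMARY = "https://api.runpod.ai/v2/<endpoint_id>/run"  # placeholder endpoint
BACKUP = "https://backup-provider.example.com/run"      # hypothetical second provider

WINDOW = 50             # judge health on the last 50 requests
ERROR_THRESHOLD = 0.05  # fail over above a 5% error rate

results = collections.deque(maxlen=WINDOW)

def submit(payload, headers):
    error_rate = results.count(False) / len(results) if results else 0.0
    url = BACKUP if error_rate > ERROR_THRESHOLD else PRIMARY
    try:
        resp = requests.post(url, json=payload, headers=headers, timeout=30)
        results.append(resp.ok)
        return resp
    except requests.RequestException:
        results.append(False)
        raise
```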
that said, RunPod offers the most compute per dollar and has one of the easiest UIs to deploy services with
@emilwallner thanks. Though I am still struggling to get any response right now. The status is "running", so I was wondering if you have any advice on that. This happened even after putting at least one active worker.
1) Have you created an API key with read/write access? 2) Are you using the API key in the request? Not sure if it applies to /run and /runsync, but for the load balancer endpoints you also need to make sure you have a /ping endpoint, and set PORT=8000 (for example) and PORT_HEALTH=8000 in the env section of the UI deployment.
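For the load-balancer case, that env setup implies the worker runs its own HTTP server with a health route. A minimal sketch, assuming FastAPI; the /generate route and its payload are placeholders:

```python
# Minimal worker for a load-balancer endpoint: expose /ping for health checks
# and serve on the port named by PORT / PORT_HEALTH (both 8000 here).
import os
import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get("/ping")
def ping():
    # the load balancer polls this route to decide if the worker is healthy
    return {"status": "ok"}

@app.post("/generate")
def generate(payload: dict):
    return {"output": payload}  # placeholder: real inference goes here

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", 8000)))
```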
When you check the logs from the active worker, is it printing 200 (once every 10 seconds or so)?
Also, printing the error message and using curl -v is helpful
@emilwallner thanks man
if you're using queue serverless and requests are pending while a worker is running but not processing them, then there is definitely a bug in how you've set up the handler. I would simplify the handler first to the point where it works, then build it back up to fit your use case. If you need me to dig a bit more, feel free to PM me your endpoint ID.
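A minimal queue handler in that spirit, using the runpod Python SDK's handler pattern; the echo body is a placeholder to confirm jobs complete before adding real logic back:

```python
# handler.py — simplest possible queue-serverless handler: echo the input.
# If this completes jobs, the queueing works and the bug is in your real logic.
import runpod

def handler(job):
    return {"echo": job["input"]}  # job["input"] is the body you POSTed to /run

runpod.serverless.start({"handler": handler})
```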