Runpod serverless overhead/slow
I have a handler that is apparently running very fast, but my requests are not. I'm hoping to process video frames. I know this is an unconventional use case, but it appears that it is working reasonably well with this one exception:
What we're actually seeing are not fast responses, but responses that take at least a second, and often longer. The runpod dashboard claims that this is execution time, but the worker logs disagree.
Request ID              Delay Time   Execution Time
sync-61d689e6-502e...   0.18s        1.28s
What's going on here? Is there anything we can do?
I'll post my handler code in the next message.
My handler code looks like this:
I've removed bits and pieces in order for the message to fit, but the important code remains.
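Since the original handler code was trimmed, here is a hypothetical minimal sketch of a handler in this shape: it forwards the job input to the local process_frame service and times its own work. The URL and payload shape are assumptions based on the thread; in the real worker the function is registered with the RunPod SDK.

```python
import json
import time
import urllib.request

# Assumed from the thread: a local FastAPI service does the actual work.
PROCESS_FRAME_URL = "http://localhost:8080/process_frame"

def handler(job, url: str = PROCESS_FRAME_URL):
    """Forward job['input'] to the process_frame service and time the round trip."""
    start_time = time.perf_counter()
    payload = json.dumps(job["input"]).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    end_time = time.perf_counter()
    print(f"Handler completed in {end_time - start_time:.3f} seconds")
    return result

# In the real worker this is registered with:
# import runpod
# runpod.serverless.start({"handler": handler})
```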
I was also wondering if there's some way to use streaming for this, but it doesn't seem like it since we can only stream responses, we cannot stream data to the server, unless I'm misunderstanding something! I'd really love to spin up a worker per-user and let them connect via websockets but I'm not sure if there's a way to do anything like that.
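For the streaming question: RunPod serverless does support streaming *out* of a handler by writing it as a generator, with each yielded item delivered as a chunk via the endpoint's /stream/{job_id} route; there is no equivalent for streaming data *in* to a running job, which matches the observation above. A minimal sketch (payload shape is assumed):

```python
# Sketch only: a generator handler streams output chunks to the client.
def streaming_handler(job):
    frame_count = job["input"].get("frames", 3)  # assumed input shape
    for i in range(frame_count):
        # Each yield is delivered to the client as one stream chunk.
        yield {"frame_index": i}

# Registered (in the real worker) with:
# import runpod
# runpod.serverless.start({
#     "handler": streaming_handler,
#     "return_aggregate_stream": True,  # also collect chunks into the final result
# })
```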
What is running http://localhost:8080/process_frame?
A process that actually processes the video frame and returns data to send to the user. We could probably fold it into the handler, but profiling shows that it takes a small fraction of a second to return.
Your handler looks fine to me... likely your delay comes from process_frame
It's running this:
https://neuralpanda.ai/elonify
Unless I'm misunderstanding my logging, it's very fast:
logger.info(f"Handler completed in {end_time - start_time:.3f} seconds")
This implies that it posted to the process_frame endpoint, received a response, and sent data back in 0.09 seconds, right?
Also, thank you so much for responding so quickly!
Are you running locally?
Yes
If you mean whether the process_frame endpoint is running locally.
are you doing a docker run each time?
I have tested this both locally and deployed. The stats I've shown you are from the deployed version.
And no, it's a simple FastAPI endpoint (process_frame is)
So from above the handler completes in 0.090 seconds. Is that not fast enough for you?
Oh it is, but I don't get data that fast.
The execution time from Runpod shows as 1.5 seconds+
That's what's confusing me.
How many active / max workers do you have set? Have you enabled FlashBoot?
This is with an active worker. Just one that I'm using for testing.
10 max, one active right now.
Are you doing a sync or async run call?
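For context, the distinction is between the two queue routes a RunPod serverless endpoint exposes. /runsync blocks until the job finishes, so its response time includes queue and transport overhead on top of the handler's own execution time; /run returns a job id immediately, which you then poll via /status/{job_id}. A sketch with placeholder ids:

```python
ENDPOINT_ID = "your-endpoint-id"  # placeholder

def runpod_url(route: str, endpoint_id: str = ENDPOINT_ID) -> str:
    """Build a RunPod serverless API URL for the given route."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/{route}"

sync_url = runpod_url("runsync")          # blocking: body contains handler output
async_url = runpod_url("run")             # queued: body contains a job id
status_url = runpod_url("status/JOB_ID")  # poll an async job (placeholder id)
```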