Runpod serverless overhead/slow

I have a handler that is apparently running very fast, but my requests are not. I'm hoping to process video frames. I know this is an unconventional use case, but it appears that it is working reasonably well with this one exception:
2024-08-11T23:25:55.468886839Z INFO: 127.0.0.1:40312 - "POST /process_frame HTTP/1.1" 200 OK
2024-08-11T23:25:55.469954721Z handler.py :61 2024-08-11 23:25:55,469 Local server request completed in 0.079 seconds
2024-08-11T23:25:55.470257231Z handler.py :70 2024-08-11 23:25:55,470 Handler completed in 0.080 seconds
2024-08-11T23:25:55.468886839Z INFO: 127.0.0.1:40312 - "POST /process_frame HTTP/1.1" 200 OK
2024-08-11T23:25:55.469954721Z handler.py :61 2024-08-11 23:25:55,469 Local server request completed in 0.079 seconds
2024-08-11T23:25:55.470257231Z handler.py :70 2024-08-11 23:25:55,470 Handler completed in 0.080 seconds
What we're actually seeing are not fast responses, but responses that take at least a second, and often longer. The runpod dashboard claims that this is execution time, but the worker logs disagree. Request ID Delay Time Execution Time sync-61d689e6-502e... 0.18s 1.28s What's going on here? Is there anything we can do? I'll post my handler code in the next message.
No description
No description
129 Replies
teddycatsdomino
teddycatsdomino•2mo ago
My handler code looks like this:
def is_server_running():
for process in psutil.process_iter(['name', 'cmdline']):
if process.info['name'] == 'python3' and 'server.py' in process.info['cmdline']:
return True
return False

def start_server():
if is_server_running():
print("Server is already running.")
return None

server_process = subprocess.Popen([sys.executable, "server.py"])
if not is_server_ready():
stop_server(server_process)
raise Exception("Server failed to start within the allocated time")
return server_process

def handler(event):
start_time = time.time()
logger.info("Handler started")
try:
job_input = event["input"]

if "frame" not in job_input:
logger.error("No frame data provided")
return {"error": "No frame data provided"}

logger.info("Sending frame to local server")
request_start = time.time()
# Send the frame to the local server for processing
response = requests.post("http://localhost:8080/process_frame", json={"frame": job_input["frame"]})
request_end = time.time()
logger.info(f"Local server request completed in {request_end - request_start:.3f} seconds")

result = response.json()

if "error" in result:
logger.error(f"Error from local server: {result['error']}")
return {"error": result["error"]}

end_time = time.time()
logger.info(f"Handler completed in {end_time - start_time:.3f} seconds")
return {"processed_frame": result["processed_frame"]}

if __name__ == "__main__":
start_server()
runpod.serverless.start({"handler": handler})
def is_server_running():
for process in psutil.process_iter(['name', 'cmdline']):
if process.info['name'] == 'python3' and 'server.py' in process.info['cmdline']:
return True
return False

def start_server():
if is_server_running():
print("Server is already running.")
return None

server_process = subprocess.Popen([sys.executable, "server.py"])
if not is_server_ready():
stop_server(server_process)
raise Exception("Server failed to start within the allocated time")
return server_process

def handler(event):
start_time = time.time()
logger.info("Handler started")
try:
job_input = event["input"]

if "frame" not in job_input:
logger.error("No frame data provided")
return {"error": "No frame data provided"}

logger.info("Sending frame to local server")
request_start = time.time()
# Send the frame to the local server for processing
response = requests.post("http://localhost:8080/process_frame", json={"frame": job_input["frame"]})
request_end = time.time()
logger.info(f"Local server request completed in {request_end - request_start:.3f} seconds")

result = response.json()

if "error" in result:
logger.error(f"Error from local server: {result['error']}")
return {"error": result["error"]}

end_time = time.time()
logger.info(f"Handler completed in {end_time - start_time:.3f} seconds")
return {"processed_frame": result["processed_frame"]}

if __name__ == "__main__":
start_server()
runpod.serverless.start({"handler": handler})
I've removed bits and pieces in order for the message to fit, but the important code remains. I was also wondering if there's some way to use streaming for this, but it doesn't seem like it since we can only stream responses, we cannot stream data to the server, unless I'm misunderstanding something! I'd really love to spin up a worker per-user and let them connect via websockets but I'm not sure if there's a way to do anything like that.
teddycatsdomino
teddycatsdomino•2mo ago
A process that actually processes the video frame and returns data to send to the user. We could probably fold it into the handler, but profiling shows that it takes a small fraction of a second to return.
Encyrption
Encyrption•2mo ago
You're handler looks fine to me... likely your dealy comes from process_frame
teddycatsdomino
teddycatsdomino•2mo ago
Neural Panda
Make magic with video.
teddycatsdomino
teddycatsdomino•2mo ago
Unless I'm misunderstanding my logging, it's very fast: logger.info(f"Handler completed in {end_time - start_time:.3f} seconds")
2024-08-11T23:43:21.230260058Z handler.py :48 2024-08-11 23:43:21,230 Handler started
2024-08-11T23:43:21.230512959Z handler.py :56 2024-08-11 23:43:21,230 Sending frame to local server
2024-08-11T23:43:21.318790389Z INFO: 127.0.0.1:50212 - "POST /process_frame HTTP/1.1" 200 OK
2024-08-11T23:43:21.319975282Z handler.py :61 2024-08-11 23:43:21,319 Local server request completed in 0.089 seconds
2024-08-11T23:43:21.320171923Z handler.py :70 2024-08-11 23:43:21,320 Handler completed in 0.090 seconds
2024-08-11T23:43:21.230260058Z handler.py :48 2024-08-11 23:43:21,230 Handler started
2024-08-11T23:43:21.230512959Z handler.py :56 2024-08-11 23:43:21,230 Sending frame to local server
2024-08-11T23:43:21.318790389Z INFO: 127.0.0.1:50212 - "POST /process_frame HTTP/1.1" 200 OK
2024-08-11T23:43:21.319975282Z handler.py :61 2024-08-11 23:43:21,319 Local server request completed in 0.089 seconds
2024-08-11T23:43:21.320171923Z handler.py :70 2024-08-11 23:43:21,320 Handler completed in 0.090 seconds
This implies that it posted to the process_frame endpoint and received a response and is sending data back in 0.09 seconds right? Also, thank you so much for responding so quickly!
Encyrption
Encyrption•2mo ago
Are you running locally?
teddycatsdomino
teddycatsdomino•2mo ago
Yes If you mean is the process_frame endpoint running locally.
Encyrption
Encyrption•2mo ago
are you doing a docker run each time?
teddycatsdomino
teddycatsdomino•2mo ago
I have tested this both locally and deployed. The stats I've shown you are from the deployed version. And no, it's a simple FastAPI endpoint (process_frame is)
Encyrption
Encyrption•2mo ago
So from above the handler completes in 0.090 seconds. Is that not fast enough for you?
teddycatsdomino
teddycatsdomino•2mo ago
Oh it is, but I don't get data that fast. The execution time from Runpod shows as 1.5 seconds+ That's what's confusing me.
Encyrption
Encyrption•2mo ago
How many active, max workers do you have set. Have you enabled Flashboot?
teddycatsdomino
teddycatsdomino•2mo ago
This is with an active worker. Just one that I'm using for testing. 10 max, one active right now.
Encyrption
Encyrption•2mo ago
Are you doing synch or async RUN call?
Want results from more Discord servers?
Add your server