Failed to return job results

My serverless worker logs these errors throughout the process:

2025-08-07T09:05:53.399265616Z {"requestId": "7d8f9b4a-9caf-48cb-a798-e4047fe62a9b-e1", "message": "Failed to return job results. | 404, message='Not Found', url='https://api.runpod.ai/v2/mwbt52if15qdt0/job-done/nvyv3441xhr52v?gpu=NVIDIA+H100+80GB+HBM3&isStream=false'", "level": "ERROR"}

There are no progress updates on /status and no completed status either, even though the process itself completes successfully. Instead, the worker does an immediate retry, which also fails. I'm not sure how to troubleshoot this.
4 Replies
echoSplice (OP), 3w ago
Also noteworthy: it's always the same URL in that error message, the one with "job-done" in it, even when the job is still in the middle of the process.
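For comparing these log lines side by side, a minimal sketch like the one below can split a logged URL into its parts. It uses only Python's standard library; the function name `split_job_done_url` is hypothetical, and what each path segment means on RunPod's side is an assumption, not something the log confirms.

```python
from urllib.parse import urlsplit, parse_qs


def split_job_done_url(url: str) -> dict:
    """Break a logged job-done URL into host, path segments, and query params."""
    parts = urlsplit(url)
    # Drop the empty string produced by the leading "/" in the path.
    segments = [s for s in parts.path.split("/") if s]
    # Shape seen in the log: /v2/<id>/job-done/<id>.
    # Which id is the endpoint, job, or worker is an assumption to verify.
    return {
        "host": parts.netloc,
        "segments": segments,
        # parse_qs returns lists; keep the first value of each parameter.
        "query": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }


url = (
    "https://api.runpod.ai/v2/mwbt52if15qdt0/job-done/nvyv3441xhr52v"
    "?gpu=NVIDIA+H100+80GB+HBM3&isStream=false"
)
info = split_job_done_url(url)
print(info["segments"])
print(info["query"])
```

Running it on the URL from each failed attempt makes it easy to see whether the last path segment changes between retries or stays the same.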
Poddy, 3w ago
@echoSplice
Escalated To Zendesk
The thread has been escalated to Zendesk!
Ticket ID: #21603
Unknown User, 3w ago
Message Not Public
echoSplice (OP), 3w ago
It's small. I have added some logging to verify:

    response = {
        "user_id": user_id,
        "tenant_id": tenant_id,
        "job_id": job_id,
        "training_type": training_type,
        "status": "success",
        "model_name": f"{training_config['model_name']}.safetensors",
        "model_size_mb": round(model_size_mb, 2),
        "training_duration_seconds": round(training_duration, 2),
        "download_url": f"https://s3-accelerate.amazonaws.com/{S3_USER_BUCKET}/{s3_key}",
        "uploaded_at": current_time,
    }
    # Ask RunPod to refresh the worker after this job and hand back the result.
    return_object = {
        "refresh_worker": True,
        "output": response,
    }
    log.info(f"return_object: {json.dumps(return_object, indent=2)}")
    return return_object

That results in:

ag0r5rmbsjnrne[info]Returning success response to RunPod...
ag0r5rmbsjnrne[info]return_object: {\n "refresh_worker": true,\n "output": {\n "user_id": "45edd360-4855-43f8-b941-eb04d2fa8611",\n "tenant_id": "3715a4a3-ef01-4e36-a809-e2a71cdcceb6",\n "job_id": "custom-styles_style_64e0e0af-9ef6-456a-8c94-09878047b968_1754572187033",\n "training_type": "custom_style",\n "status": "success",\n "model_name": "style_64e0e0af-9ef6-456a-8c94-09878047b968.safetensors",\n "model_size_mb": 146.17,\n "training_duration_seconds": 26.74,\n "s3_url": "https://s3.amazonaws.com/------------/------------/--------/------------.safetensors",\n "uploaded_at": "2025-08-07T13:10:33Z"\n }\n}
ag0r5rmbsjnrne[error]Failed to return job results. | 404, message='Not Found', url='https://api.runpod.ai/v2/mwbt52if15qdt0/job-done/ag0r5rmbsjnrne?gpu=NVIDIA+H200&isStream=false'
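If the question here was about how large the returned payload is (an assumption, since the earlier message is not visible), a rough sketch for measuring it is to serialize the object and count the UTF-8 bytes. The helper name `payload_size_bytes` and the sample values are hypothetical placeholders, not the real handler output.

```python
import json


def payload_size_bytes(obj) -> int:
    """Return the size in bytes of obj once serialized to compact-ish JSON."""
    return len(json.dumps(obj).encode("utf-8"))


# Placeholder payload with the same shape as the logged return_object.
return_object = {
    "refresh_worker": True,
    "output": {
        "status": "success",
        "model_size_mb": 146.17,
        "training_duration_seconds": 26.74,
    },
}

size = payload_size_bytes(return_object)
print(f"return_object is {size} bytes when serialized")
```

Logging this number next to the existing `return_object` log line would show the serialized size for each job without dumping the full payload.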
