Has anyone experienced issues with serverless /run callbacks since December?
We've noticed that response bodies are empty when using /run endpoints with callbacks in the RunPod serverless environment (occurring sometime after December 2nd).
Additional context:
- /runsync endpoints are working normally
- Response JSON format appears correct in the "Requests" tab of RunPod console under Status
- Our last deployment to this endpoint was two months ago
Could anyone confirm whether there have been any releases since December that might have introduced this issue? We haven't changed our deployment in two months, but are now seeing empty response bodies with callbacks.
Thanks in advance! 🙏
66 Replies
same here

getting a barrage of 520 and 415 since 4 am
same here
Same here with two clients of mine simultaneously. Status request immediately after the job's completion returns the correct result. But no webhook
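Since the status request still returns the correct result, one stopgap is to poll for the output instead of relying on the webhook. A minimal sketch, assuming you stored the job id returned by /run; the URL shape and bearer-token header reflect my understanding of RunPod's public serverless API and should be checked against the current docs. The function names are made up for illustration.

```python
import json
import urllib.request

# Assumed base URL of the RunPod serverless REST API.
API_BASE = "https://api.runpod.ai/v2"


def status_url(endpoint_id: str, job_id: str) -> str:
    """Build the status URL for one job on one endpoint."""
    return f"{API_BASE}/{endpoint_id}/status/{job_id}"


def fetch_job_status(endpoint_id: str, job_id: str, api_key: str,
                     timeout: float = 10.0) -> dict:
    """GET the job status and return the decoded JSON response."""
    request = urllib.request.Request(
        status_url(endpoint_id, job_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

This only helps if the job id is still available on your side, which it should be if you keep the response of the original /run call.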
getting JSON strings
sometimes empty responses
badly formatted string
i am not getting any responses
it started happening today
getting empty object in request.body and undefined in request.rawBody
Yes
everything is breaking 😢
yes
No one is responding too.
my entire flow is disrupted due to this.
Same here, we're only getting the input data, but not the request id anymore so we can't fetch the output of the job
from what I found, they're missing the "content-type" header
the webhook POST is missing the content-type header, if you can fix it, just set it to application/json (I am using a middleware in rails)
same here, getting the initial "IN_QUEUE" status but not receiving any response, runsync works ok
does anyone have a fix or know what's wrong?
I think we need to wait for them to wake up, it's currently ~2am in san francisco
for me a manual fix worked: I added a middleware that sets the Content-Type of all webhook requests to application/json
smth like this fixed it for me in python
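The snippet attached to that message is not in the thread. A minimal sketch of the same idea, assuming a WSGI app (Flask, Django, etc.) and a hypothetical webhook route of /webhooks/runpod; this is not the poster's code:

```python
class ForceJsonContentType:
    """WSGI middleware: treat webhook POSTs that lack a Content-Type as JSON."""

    def __init__(self, app, path_prefix="/webhooks/runpod"):
        self.app = app
        # Hypothetical route; change it to wherever your webhook is mounted.
        self.path_prefix = path_prefix

    def __call__(self, environ, start_response):
        is_webhook = (
            environ.get("REQUEST_METHOD") == "POST"
            and environ.get("PATH_INFO", "").startswith(self.path_prefix)
        )
        # WSGI exposes the Content-Type header as environ["CONTENT_TYPE"].
        # Only fill it in when it is missing or empty.
        if is_webhook and not environ.get("CONTENT_TYPE"):
            environ["CONTENT_TYPE"] = "application/json"
        return self.app(environ, start_response)
```

With Flask this would be wired up as `app.wsgi_app = ForceJsonContentType(app.wsgi_app)`. Requests that already carry a Content-Type are left untouched.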
Yes. If using Node.js, set the request's Content-Type header to application/json manually and use the Express JSON parser to get the parsed body
unfortunately my company uses bubble as a front end service, do i have to make a middleware on my end to solve this?
i think someone pushed an update before they went to sleep
happens to everyone to be honest
i only use europe regions


fails on every datacenter
it returns a valid output (can be seen in requests tab of an endpoint), but the webhook post request body doesn't seem to be a serialized json
even setting the content type manually didn't work for me
trying to serialize it manually from chunks sent
Yup
For those who work with Node.js and couldn't resolve it by manually setting the content-type, here's a custom serializer that assembles the chunks into JSON
(express)
same here 🙌🏻
Sometimes I get an empty body (with binary data). Is there any explanation from RunPod?
not yet, I've been seeing this since 03:00 UTC
Is there any way to escalate issues like these to runpod staff, especially if it happens in the middle of the night for them?
i guess discord is the only place
Is that actually escalating it? I'm guessing they're all sleeping right now
Ok, I guess they will notice either way once they wake up. But I guess a company like runpod should also have monitoring set up that screams at them when suddenly a huge amount of webhooks across all customers fail.
We've recently had an sdk version that didn't work properly for about 2 weeks straight so...
I'm using n8n for the webhook. I can provide screenshots, but there's no code involved, so I'm not sure they'll be useful or understandable to other developers
But I think RunPod should increase their focus on support or deployment management. Because 1 or 2 months ago, the RunPod SDK was broken and I only found out by checking Discord
but anyway, I'm sharing screenshots @nerdylive
The right one is the request from RunPod

And RunPod responds with binary, I guess

And that's the binary data

Yes :/ When do you think they will fix it?
Hey, sorry about this! We’re aware of the issue and will have a hotfix in the next hour. The webhook request is currently missing the Content-Type: application/json header. As a workaround, you can update your code to parse the body as JSON even if the header is missing.
Thanks for the quick response 💪
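A minimal sketch of that workaround in Python, assuming your framework gives you the raw request body as bytes. The function name is made up for illustration; returning None for empty or non-JSON bodies is one possible choice, not a RunPod recommendation.

```python
import json


def parse_webhook_body(raw: bytes):
    """Decode a webhook body as JSON without looking at the Content-Type header.

    Returns the parsed value, or None if the body is empty or not valid JSON.
    """
    if not raw:
        return None
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
```

In Flask, for example, the raw bytes are available via `request.get_data()`, so the handler never depends on the header being present.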
thanks meow meow 🚀
@yhlong00000 any updates? 👀
was in the middle of a huge refactor to make this work but if you are working on it I'll just wait
We’re running the final tests now, it should be ready soon.
We’re pushing the change now; it will take about 15 minutes. I’ll keep you posted.
The release is almost complete, and my testing shows the response looks good. Could you verify it on your end and let me know if you still encounter any issues?
seems to be working now
thanks
It works now. Are there any steps you are taking to prevent issues like this from happening in the future? The biggest issue for us was that there was no official reaction at all for more than 5 hours of our regular working day.
Yes, we will reflect on this incident internally and implement additional safeguards and necessary changes to prevent this from happening again in the future. We’re truly sorry for the inconvenience!
All systems operational now, back to normal
Even with priority support this was quite unnerving. It would be great if you could have a support team covering the midnight San Francisco hours
Thanks for the feedback! We’re currently short on support staff but will work towards providing 24/7 support in the future.
I'd like to confirm that our application has recovered from the above issue. Thank you.
I am having issues right now... Cannot create pods through the python package despite having GPUs available. Keep getting the "There are no longer any instances available with the requested specifications. Please refresh and try again" but if I try to create it with exactly the same settings from the Web UI it all works out...
It's been happening since last week. If I hit the button in the Web UI at the same time as triggering the script, it fails with the runpod Python package but works with the Web UI.
Are any of you still experiencing issues with serverless vllm? I can't manage to release a working endpoint. I keep getting 500s and even some 502 Bad Gateway responses from Cloudflare. I don't even know how to describe my issues further, I've been banging my head on this problem for days and I'm losing my sanity. I tried rolling back to runpod/worker-v1-vllm:v1.6.0stable-cuda12.1.0, without any luck. Luckily it seems that my old endpoints created in the past few months are not experiencing visible issues
502s are coming in strong now and my in-progress requests seem to be multiplying according to the in-progress counter (for no apparent reason)
The UI refreshes periodically to display the latest GPU availability. When you click the deploy button, the system checks the real-time availability of the GPU. If availability is low and many users are renting or releasing GPUs, it’s possible the UI shows a GPU as available, but by the time you deploy, it’s already taken due to the refresh delay.
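If the failure really is a race between the displayed and the real-time availability, retrying the creation call a few times with a growing delay is one way to ride it out from a script. A rough sketch, assuming `create` is any zero-argument callable that wraps your own SDK call and raises on failure; the helper name and defaults are hypothetical, and catching every exception is a simplification:

```python
import time


def create_with_retry(create, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call create() until it succeeds, doubling the wait after each failure.

    Re-raises the last exception if every attempt fails.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return create()
        except Exception as exc:  # simplification: narrow this to the SDK's error type
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

This does not help if the request fails for a reason other than momentary availability.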
maybe try to record a video, screenshot, logs, endpointIds, current settings, those will be useful to figure out the issue.
Didn't manage to collect all the material yet, however it seems related to constraining the generation with:
extra_body={"guided_json": json_schema}
https://docs.vllm.ai/en/latest/usage/structured_outputs.html
That also happens with instances that show "Medium" or "High" availability. It's a general issue with RunPod.