Container image loading stuck in a loop when launching

I have workers that never actually launch after pulling containers. I have no idea how to debug this. I deleted and recreated the endpoint and get the same behavior. Any thoughts on how to resolve? It is extra aggravating as I've had to spin this up because of the EU-SE-1 performance degradation, and now I'm getting hit with this issue.

endpoint id: d9b5s5qpbl0sfb

=== Snip from the logs. This is just looping repeatedly. I have the credentials set for Docker Hub, the image is published and available, etc. ===

2025-05-07T19:34:02Z loading container image from cache
2025-05-07T19:34:50Z Loaded image: docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:34:51Z 0.1.24-dev0 Pulling from docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:34:51Z Digest: sha256:b066e7235b92701dca45b26a3da6437e1fdc3ca96f751fd5bd614cdb40f532bb
2025-05-07T19:34:51Z Status: Image is up to date for docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:34:51Z worker is ready
2025-05-07T19:37:23Z create container docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:37:23Z loading container image from cache
2025-05-07T19:37:31Z create container: still fetching image docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:37:32Z create container docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:37:32Z create container: still fetching image docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:37:33Z Loaded image: docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:37:34Z 0.1.24-dev0 Pulling from docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:37:34Z Digest: sha256:b066e7235b92701dca45b26a3da6437e1fdc3ca96f751fd5bd614cdb40f532bb
2025-05-07T19:37:34Z Status: Image is up to date for docker.io/REDACTED_NAMESPACE/REDACTED_IMAGE:REDACTED_TAG
2025-05-07T19:37:34Z worker is ready
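In case it helps anyone else watch the same symptom, here's a rough, untested sketch of polling the endpoint's health route while the loop happens, so you can see whether workers ever leave the image-loading state. It assumes the serverless health URL is https://api.runpod.ai/v2/<ENDPOINT_ID>/health with a Bearer API key; RUNPOD_API_KEY is just a placeholder environment variable.

```python
import os
import time

import requests

# Untested sketch: the endpoint ID and RUNPOD_API_KEY env var are placeholders.
ENDPOINT_ID = "d9b5s5qpbl0sfb"
API_KEY = os.environ["RUNPOD_API_KEY"]

url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/health"
headers = {"Authorization": f"Bearer {API_KEY}"}

# Poll a few times; the health payload reports worker/job counts, so if workers
# never become ready while requests sit in the queue, they're stuck in the
# image-loading loop shown in the logs above.
for _ in range(10):
    resp = requests.get(url, headers=headers, timeout=30)
    resp.raise_for_status()
    print(resp.json())
    time.sleep(30)
```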
David · 4w ago
Yep same here
Mandulis - Max - 2/2
We're also experiencing problems. Workers are running and we're paying for them, but requests just stay in the queue and don't get processed...
zongheng1619 · 4w ago
Same here. I've stopped all of my workers, otherwise they keep charging me the whole time.
Mandulis - Flix
Same for us.
Anubhav · 4w ago
I am facing the same issue. Do you know how we can get this escalated?
noahpantsparty (OP) · 4w ago
I'll tag @Dj to see if they can provide any insight into what is going on. Sounds like it is impacting quite a few of us.
Dj · 4w ago
Great. Okay, one sec. I thought this was isolated to like 2-3 people. Working on this; let me make a lot more noise. Can I ask, are these public or private images?
noahpantsparty (OP) · 4w ago
in my case, this is/was happening with a private image
Anubhav · 4w ago
In my case public.
Dj · 4w ago
GHCR or Docker Hub?
Anubhav · 4w ago
Docker Hub for me.
noahpantsparty (OP) · 4w ago
DockerHub
georg · 4w ago
DockerHub for me with a private image.
Dj · 4w ago
I'm working on getting this escalated. I can definitely see the pattern; we're just isolating the problem.
Mandulis - Flix
Same
Dj · 4w ago
Small update: this is being looked into. Can you check that the container registry token you provided to RunPod is valid? You can check on Docker Hub directly to avoid messing with anything: https://app.docker.com/settings/personal-access-tokens
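If you'd rather check from a script than from the dashboard, here's a rough, untested sketch that exchanges a Docker Hub personal access token for a pull-scoped registry token and then requests the image manifest, which is roughly the same path a puller takes. The DOCKERHUB_USER / DOCKERHUB_TOKEN env vars and the REPO/TAG values are placeholders for your own private image.

```python
import os

import requests

# Untested sketch: all names below are placeholders; substitute the private
# image your endpoint pulls and your own Docker Hub credentials.
DOCKERHUB_USER = os.environ["DOCKERHUB_USER"]
DOCKERHUB_TOKEN = os.environ["DOCKERHUB_TOKEN"]  # personal access token
REPO = "REDACTED_NAMESPACE/REDACTED_IMAGE"
TAG = "REDACTED_TAG"

# 1) Exchange the PAT for a short-lived registry token scoped to pulling REPO.
token_resp = requests.get(
    "https://auth.docker.io/token",
    params={"service": "registry.docker.io", "scope": f"repository:{REPO}:pull"},
    auth=(DOCKERHUB_USER, DOCKERHUB_TOKEN),
    timeout=30,
)
token_resp.raise_for_status()
registry_token = token_resp.json()["token"]

# 2) Fetch the tag's manifest: 200 means the token can pull the image,
#    401/403 points at the credentials, 404 at the repo/tag name.
manifest_resp = requests.get(
    f"https://registry-1.docker.io/v2/{REPO}/manifests/{TAG}",
    headers={
        "Authorization": f"Bearer {registry_token}",
        "Accept": "application/vnd.docker.distribution.manifest.v2+json",
    },
    timeout=30,
)
print(manifest_resp.status_code, manifest_resp.headers.get("Docker-Content-Digest"))
```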
georg · 4w ago
We released a new endpoint with the same token. Nothing else changed, and it is working.
Mandulis - Flix
Yes, valid.
Anubhav · 4w ago
It works for me now.
David · 4w ago
Private. Been down for about 2 hours. Same image as the last 3 days; the container is trying to load from the RunPod cache.
noahpantsparty (OP) · 4w ago
Away from my keyboard right now but will check soon.
georg · 4w ago
For me, all the workers just went away and started over; 40 were assigned and they all started over again.
David · 4w ago
The issue has to do with the cache, I'd imagine.
David · 4w ago
https://uptime.runpod.io/ is reporting no issues for serverless. Can we get an update so we can relay it to our customers?
zongheng1619 · 4w ago
Yeah, it's trying to pull the image from the cache again and again; I can see it even says success.
Dj · 4w ago
We've identified the issue. I'm not sure why we aren't firing an incident for it, but it's being treated like one internally.
noahpantsparty (OP) · 4w ago
FWIW: I'm still having the same issues. I nuked the endpoint and recreated it, and still have the same problems.
David · 4w ago
This is pretty frustrating; there have been frequent issues with RunPod serverless. These 3 hours have cost us once again.
Dj · 4w ago
I totally understand, the people fixing this issue are actively using this thread to help move things forward. I'll join their call and listen for progress.
noahpantsparty (OP) · 4w ago
I hear you and share your frustration. Thanks. Please do keep the updates coming.
Dj · 4w ago
Can you DM me your newest pod ID, @noahpantsparty? Are you still pending a release? Found it; it looks like you turned it off.
noahpantsparty (OP) · 4w ago
I cancelled all the requests. I can start another if helpful. Should I create another release? I did a new release on endpoint trky2t3f5nehnx and am still seeing the same behavior.
Dj · 4w ago
We're deploying what should be a fix to production now. It's just a slow process.
noahpantsparty (OP) · 4w ago
TY
Dj · 4w ago
I'm confident this issue should be fixed for all affected users. Our solution is a band-aid and over the coming days we'll fix the issue permanently.
noahpantsparty (OP) · 4w ago
Looks like it is working for me now. Thanks @Dj for helping to resolve the issue.
