Pod SSH keeps disconnecting

I terminated my pod to avoid being charged. I have tried terminating and recreating the pod multiple times, thinking it might be an issue with a specific machine, but the same thing keeps happening: SSH disconnects so randomly that I am not able to get any work done. Here's the SSH log:

-- RUNPOD.IO -- Enjoy your Pod #puuuogf0fbma0f ^_^
Error response from daemon: Container 068a5870008ded73495e58f7295b7e240f686b96ec67f1020bc829c33b378fad is not running
Connection to 100.65.19.183 closed.
Connection to ssh.runpod.io closed.

If necessary, I can post the verbose output, but the error clearly shows that the container stopped running after a while.
Calvinn (OP) · 5d ago
This is terrible. My container keeps on stopping. The same happens for GPUs with both low and medium availability.
Calvinn (OP) · 5d ago
I tried waiting 10 minutes after the pod started, and it still fails.

-- RUNPOD.IO -- Enjoy your Pod #bt7lakn5m1c8xz ^_^
Error response from daemon: container 00777c6698202fc8dd16c538fcf57f06694996b86e1fd66f5a6f5b208fc5114a is not running
Connection to 100.65.27.24 closed.
Connection to ssh.runpod.io closed.
riverfog7 · 5d ago
If the logs show nothing, try checking the System Logs tab.
Calvinn (OP) · 5d ago
I see that it is trying to load the meta-llama/Meta-Llama-3.1-8B-Instruct model as soon as it launches. I am using the latest vLLM Docker image. I did not run anything myself; the errors appear as soon as the pod is created.
riverfog7 · 5d ago
Did you provide the Hugging Face auth token? It is failing on auth.
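[A quick way to test this outside the pod: a minimal sketch, assuming huggingface_hub >= 0.24 is installed and you have already requested access to the gated repo; the token string is a placeholder.]

```python
# Sketch: confirm a Hugging Face token can access the gated Llama repo.
# Assumes huggingface_hub >= 0.24; the token value is a placeholder.
from huggingface_hub import auth_check, login

login(token="hf_xxx")  # token from https://huggingface.co/settings/tokens
# auth_check raises GatedRepoError (or RepositoryNotFoundError) if the
# token cannot see the repo; it returns nothing on success.
auth_check("meta-llama/Meta-Llama-3.1-8B-Instruct")
print("Token can access meta-llama/Meta-Llama-3.1-8B-Instruct")
```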
Calvinn (OP) · 5d ago
But why is it trying to launch a server? I am just launching a Pod; it is not supposed to start a vLLM server right away, right? Got it. I tried the PyTorch image and it works. I just noticed the vllm:latest image is a community image. Not sure what is happening, but I'm pretty sure the behavior isn't what's normally expected. Marking this issue as resolved. Thank you for all your help!!
riverfog7 · 5d ago
vLLM should try to launch an OpenAI-compatible API server right away. That's its intended behavior. You can check the CMD of the image.
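[For reference, one way to inspect an image's Entrypoint/Cmd without starting it, sketched with the Docker SDK for Python; "vllm/vllm-openai:latest" is an assumed stand-in for the template's image, and a local Docker daemon plus `pip install docker` are required.]

```python
# Sketch: print the Entrypoint and Cmd baked into the vLLM image.
# Assumes a local Docker daemon and the `docker` Python package;
# "vllm/vllm-openai:latest" stands in for the template's image name.
import docker

client = docker.from_env()
image = client.images.pull("vllm/vllm-openai:latest")
config = image.attrs["Config"]
print("Entrypoint:", config.get("Entrypoint"))
print("Cmd:", config.get("Cmd"))
```

Whatever command this prints is what the container runs at boot, which is why the pod starts loading the model (and exits on the auth failure) the moment it is created.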
Calvinn (OP) · 5d ago
Oh I see. Thank you for pointing it out!
