CUDA NO WORKY?
I'm unable to get SSH working to pods started from a clean CUDA Docker image. Despite saying they're ready and giving me an SSH line (and charging me $$$), they all spit out the same error:
You can try one here. https://console.runpod.io/pods?id=mlbfg4iutwm19c
The only reason I'm using a clean CUDA image without PyTorch is because apparently the official PyTorch CUDA envs are misconfigured. By misconfigured I mean, no matter what I try, I can't get CUDA visible to Python, or get any CUDA_DEVICES_AVAILABLE.
No matter how many times or pods I try this on, I never get cuda defined!!
At this point I'd like to request a refund. I'm at my wits' end. Even the LLMs are telling me RunPod's CUDA envs must be misconfigured.
@chess Do you have an image to share? I tried to look at your link and it did not lead anywhere
I tried the official template, and was able to get it?

https://console.runpod.io/pods?id=mlbfg4iutwm19c this one shows up in my console, can you see it?

What template is that using?
You said a clean cuda thing?
I cannot see pods that are on your system
but if you are trying to get ssh setup
you can maybe try hold on
it was a custom template with that docker image
Got it
and ssh would just kick me out every time
let me take a look
I'm not familiar with this template, but:
1. I think runpod is working
2. If you want to try, I have an SSH script that tries its best to install password-based SSH and tells you how to SSH into the pod when done
3. Let me give it a try
GitHub - justinwlin/Runpod-SSH-Password: Help ppl do pod ssh through password.
This is the repo fyi
if curious
actually the hard thing with this, is it might be too minimal, i wonder if it even has some basic terminal access / openssh server installed
do you have like a link to your custom docker image? im guessing that maybe doesn't have openssh installed? or something like that
You'll need:
openssh-server
FYI, this template was:
that i was able to see the torch cuda thing
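For anyone checking whether a custom image ships an SSH server at all, here is a minimal Python sketch. It only looks for an `sshd` binary on PATH or at /usr/sbin/sshd. The helper name `find_ssh_server` and the fallback path are assumptions for illustration, not something taken from the RunPod docs or from the script linked above, and the install hint assumes a Debian/Ubuntu-based image.

```python
import os
import shutil


def find_ssh_server():
    """Return the path to an sshd binary if one can be found, else None.

    Hypothetical helper: checks PATH first, then /usr/sbin/sshd, which is
    where Debian/Ubuntu packages usually place it (an assumption here).
    """
    path = shutil.which("sshd")
    if path:
        return path
    fallback = "/usr/sbin/sshd"
    if os.path.isfile(fallback) and os.access(fallback, os.X_OK):
        return fallback
    return None


if __name__ == "__main__":
    sshd = find_ssh_server()
    if sshd:
        print(f"sshd found at {sshd}")
    else:
        # Assumes a Debian/Ubuntu-based image; other distros use other tools.
        print("sshd not found; try: apt-get update && apt-get install -y openssh-server")
```

Finding the binary does not mean the server is running or that keys are configured; it only rules out the "image is too minimal" case.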

You can run my script to do password ssh with a runpod official template through web terminal / jupyter labs, and should work 🙂 or you can set up ssh key properly
once you use a runpod official template, which has more than the bare minimum setup for you, you can just run my script in the web console or in the jupyter labs:
You should get SSH that way. I also tried again on a fresh pod and still got the stuff working. I did not reinstall torch / torchvision tho. I just go straight from our pod



Summary:
1. Try to use a runpod official template to start
2. You can run my ssh script, or set up ssh keys (in the docs) so you get automatic SSH for future pods spun up. Our templates are set up with openssh server
3. You can run your:
python -c "import torch; print(torch.cuda.is_available())"
which, as I show in my screenshot in two instances, it does pick up
got it. thanks for the assist. unfort id still like to process a refund because in the interim ive switched over to lambda labs
You can try to submit a ticket with runpod support if it was a hardware issue on runpod side
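If the one-liner from step 3 of the summary prints False, a slightly longer check can help narrow down why. This is only a sketch: the function name `cuda_report` and the report keys are made up for illustration. It relies on `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.device_count()`, and on the standard `CUDA_VISIBLE_DEVICES` environment variable, and it tolerates torch not being installed.

```python
import os
import shutil


def cuda_report():
    """Collect a few facts that commonly explain why CUDA is not visible.

    Hypothetical helper; the dictionary keys are illustrative only.
    """
    report = {
        # Is the NVIDIA driver tool visible inside the container?
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # An empty string here hides all GPUs from CUDA applications.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_installed": False,
        "torch_cuda_build": None,
        "torch_cuda_available": None,
        "device_count": None,
    }
    try:
        import torch
    except ImportError:
        return report

    report["torch_installed"] = True
    # None means a CPU-only torch build, which can never see a GPU.
    report["torch_cuda_build"] = torch.version.cuda
    report["torch_cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Run it in the pod's web terminal or a Jupyter cell. A `torch_cuda_build` of None points at a CPU-only torch install rather than at the pod itself.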