RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

NERDDISCO 1/16/2024

CUDA 12.3 support

I created a template with a custom image (based on runpod/containers) to run CUDA 12.3, but when I use PyTorch 2.1.2 with Python 3.10, the check below tells me that it's not working: `python3 -c "import torch; print(torch.cuda.is_available())"` ...
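To narrow down a mismatch like this, it can help to compare the CUDA version the installed PyTorch wheel was built against with what the container provides. The sketch below is a hypothetical helper (`cuda_report` is not from the original post) and assumes only that `torch` may or may not be importable. The prebuilt PyTorch 2.1.x wheels target CUDA 11.8 or 12.1 rather than 12.3, so the `built_for_cuda` value is worth a look.

```python
import importlib.util


def cuda_report():
    """Collect basic facts about the local PyTorch/CUDA setup (hypothetical helper)."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against (None for CPU-only builds).
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_for_cuda` is `None`, a CPU-only wheel is installed; if it is set but `cuda_available` is `False`, the driver or the container's GPU access is the more likely culprit.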
Nik 1/16/2024

Is there a way to get pod logs programmatically?

After creating an on-demand pod via GraphQL API I'd like to get access to the pod's logs without using the UI.
Alex 1/15/2024

`runpod.api.ctl_commands.get_gpu()` reports GPUs as available when they aren't

I'm currently trying to find which types of GPUs are available (in order to programmatically decide what GPU type I want). I saw that there is a `runpod.api.ctl_commands.get_gpu()` function which calls the GraphQL API, but the information it returns seems inconsistent with what's available. For example, right now I can run...
timoshishi 1/15/2024

Serverless endpoint long waits in "Initializing" state

Requests to a serverless endpoint at /run have an "Initializing" status in the dashboard for up to 15 minutes. Is this a normal queue time for an endpoint with no other requests?
Raios 1/15/2024

Fooocus too slow on generation

SD is very complex, so I decided to use the Fooocus UI instead, which I'm used to. The problem is that generation takes a long time, while on SD it takes no more than 5 seconds per picture. Is there a way to make Fooocus generate at the same speed?
Raios 1/15/2024

Image Generation problem

NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check. Why does this happen when I try to generate a picture?
Lili Lu 1/15/2024

Could not start a temporarily stopped pod

I had a pod that stopped yesterday, and now I cannot start it.
Raios 1/15/2024

Outdated ControlNet: how to update?

How do I update ControlNet to its latest version? I can't find it under updates, and when I use the link from the official page it says it's installed, but the UI still shows the old interface without the new options.
Solution:
```
cd /workspace/stable-diffusion-webui/extensions/sd-webui-controlnet
git pull
source /workspace/venv/bin/activate
pip3 install -r requirements.txt
```
...
Raios 1/15/2024

There are no available GPUs on this host machine

Why did this happen? I need to secure a GPU without running into such issues. I terminated and deleted the old pod, and when I started a new one the same thing occurred. I haven't even managed to use SD all this time because I keep getting such errors. How can I solve this?
nick.luvy 1/15/2024

Copy folders from one location to another inside JupyterLab?

Hi, is there a way to copy folders from one location to another inside JupyterLab? I want to install something, but it installs in the wrong venv folder.
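One way to do this from a notebook cell or a Python console is the standard library's `shutil.copytree`. This is a minimal sketch: `copy_folder` is a hypothetical helper name, and any paths passed to it are placeholders rather than paths taken from the post.

```python
import shutil
from pathlib import Path


def copy_folder(src, dst):
    """Recursively copy src into dst, merging into dst if it already exists."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")
    # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing folder.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

Copied virtual environments often break because their scripts contain absolute paths, so for a package that landed in the wrong venv, activating the intended venv and reinstalling is usually the safer route.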
Superintendent 1/15/2024

A6000 is apparently all gone but still shown as available

I'm trying to use an A6000 on Community Cloud, but apparently they are all in use, which I doubt because the main GPU select page still offers one. For now I grabbed 2x 3090 and am hoping that it will work for what I'm trying to do.
AlexGilSeg 1/15/2024

Empty trash?

I deleted files using the GUI and it doesn't register, so I am out of space. I really need to find a way to get some storage back, and I don't want to restart the server. Is there a way to "empty trash"?
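Jupyter's file browser usually moves deleted files to a trash folder instead of removing them, which would explain why the space is not freed. The sketch below assumes the XDG default location `~/.local/share/Trash`; on a mounted volume the trash may instead be a `.Trash-<uid>` folder at the volume root, so check which one exists before running it. The helper names are hypothetical, and the deletion is permanent.

```python
import shutil
from pathlib import Path

# Assumption: the XDG default trash location; the path on a given pod may differ.
DEFAULT_TRASH = Path.home() / ".local" / "share" / "Trash"


def trash_size_bytes(trash_dir=DEFAULT_TRASH):
    """Return the total size of regular files under trash_dir (0 if it is missing)."""
    trash_dir = Path(trash_dir)
    if not trash_dir.is_dir():
        return 0
    return sum(
        p.stat().st_size
        for p in trash_dir.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def empty_trash(trash_dir=DEFAULT_TRASH):
    """Permanently delete everything inside trash_dir and return the bytes freed."""
    trash_dir = Path(trash_dir)
    freed = trash_size_bytes(trash_dir)
    if trash_dir.is_dir():
        for entry in trash_dir.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return freed
```

Calling `trash_size_bytes()` first shows how much space is at stake before anything is deleted.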
timoshishi 1/14/2024

Versioning serverless endpoints

I have tagged images on Docker Hub that I am using for serverless endpoints. The main branch is tagged as v1. In order for the serverless endpoints to update with the new code, do I need to retag each code change in Docker Hub as well as specify a new release in the RunPod dashboard? Does the current system not pull down the image at the specified tag if the code has changed in the image?...
Solution:
There is an edit endpoint menu where you can create a new release: pick a different tag and it will redownload the image. Yes, new changes in Docker Hub need to be retagged with something new. No, it doesn't auto-download....
TRSML 1/14/2024

How can I find my pod's IP address?

How can I find the public IP address for my running pod? I've opened some TCP ports, but don't know what address to use to reach them. Thanks...
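One possible approach from inside the pod is to read the connection details from environment variables. The names used here, `RUNPOD_PUBLIC_IP` and `RUNPOD_TCP_PORT_<internal port>`, are assumptions that are not confirmed by the post; running `env | grep RUNPOD` in the pod's terminal shows what is actually set. `public_endpoint` is a hypothetical helper.

```python
import os


def public_endpoint(internal_port, env=None):
    """Return (ip, external_port) for an exposed TCP port, or None if not found.

    Assumption: the variable names RUNPOD_PUBLIC_IP and RUNPOD_TCP_PORT_<port>
    are not confirmed by the original post; check them inside your own pod.
    """
    env = os.environ if env is None else env
    ip = env.get("RUNPOD_PUBLIC_IP")
    port = env.get(f"RUNPOD_TCP_PORT_{internal_port}")
    if not ip or not port:
        return None
    return ip, int(port)
```

For example, `public_endpoint(22)` would return the address and external port mapped to the pod's SSH port, or `None` when the variables are absent.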
jbe 1/14/2024

"This server has recently suffered a network outage and may have spotty network connectivity." and

I'm getting a "This server has recently suffered a network outage and may have spotty network connectivity. We aim to restore connectivity soon, but you may have connection issues until it is resolved. You will not be charged during any network downtime." message that has persisted for a couple of hours now. The price still seems to be ticking. Is it dead? How long should I wait? ID: 1tncsp3jl5lsc5
_manuelcerezo 1/14/2024

Multinode training RunPod ports

I'm trying to train a distributed model across multiple nodes: 2 pods with 8x 4090 GPUs each. We can't train using torchrun because it needs the same TCP port on each machine, but RunPod assigned me a random external port. Command example: NODE A:...
justin 1/14/2024

Feature Request / Is it possible RunpodCTL

Just sharing a wish for the backlog: is it possible to add a CLI command to runpodctl that generates SSH keys, lets me send the public key to another pod, and automatically adds it to that pod's authorized keys? It would then open a connection and do a direct SCP file transfer. ...
TRSML 1/13/2024

How to mount persistent storage volume in pod?

I've created persistent storage and launched a pod from the storage UI. When I log in via SSH I can't see the storage volume. How do I find/mount it for use?
MikeCalibos 1/13/2024

RunPod SD InvokeAI v3.3.0 Errors

When I try to run a RunPod pod with InvokeAI, I just get a Server Error and a Runtime Error when I try to generate an image.
nientenickgrazie 1/13/2024

ENDPOINT IS

⛅|gpu-cloud Hi all, can somebody please tell me where to find the "endpoint" code? I would like to connect to my GPU Cloud pod using Python. It would be grand if somebody could post an example of working Python code to connect to and use the GPU! Thanks a lot to all those who would like to help 😆...