Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Error in starting New Pod

Whenever I try to create a new pod in the last 30 minutes, the system logs show this error. I've tried multiple templates and still get a similar daemon error each time. Any idea how to resolve this? 'error starting container: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.8, please update your driver to a newer version, or use an earlier cuda container: unknown'...
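The error text itself names the mismatch: the container image requires CUDA 12.8 or newer, and the host driver does not provide it. A minimal sketch of reading that requirement out of such a message and comparing it with a host's supported CUDA version follows. The helper names and the host value `(12, 4)` are hypothetical and only illustrate the comparison; they are not part of any Runpod or NVIDIA API.

```python
import re

def parse_cuda_requirement(message):
    """Return (major, minor) from a 'cuda>=X.Y' condition in the message, or None."""
    match = re.search(r"cuda>=(\d+)\.(\d+)", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def host_satisfies(required, host_cuda):
    """True when the host's supported CUDA version meets the image requirement."""
    # Tuples compare element by element, so (12, 4) < (12, 8).
    return host_cuda >= required

error_text = (
    "nvidia-container-cli: requirement error: unsatisfied condition: "
    "cuda>=12.8, please update your driver to a newer version, "
    "or use an earlier cuda container"
)

required = parse_cuda_requirement(error_text)
host_cuda = (12, 4)  # hypothetical value, for illustration only

print(required)                             # (12, 8)
print(host_satisfies(required, host_cuda))  # False
```

When the check fails, the two options are the ones the error message lists: a host with a newer driver, or a container image built for an earlier CUDA version.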

Used bank to get $5 credit but credit not showing up

Hi, it said that if I use my bank to set up payment and load $10, I get $5 in credits, but that's not happening and I just wanted some help with that.

Pod not downloading container correctly from docker.io

The pod shows that it has downloaded the container from docker.io. The digest is the same as when I pull it onto my own server. The pod shows 15MB/30GB even though the container is 16GB. It then just says starting but never starts. The SSH connection shows: -- RUNPOD.IO -- Enjoy your Pod #1ly..7o ^_^ Error response from daemon: container 3ae07de.... is not running Connection to 1.. closed....

Any good template for beginners?

I just tried the setup below, but I had a lot of problems installing extensions. Is there any good template for beginners? On-demand, A4000, Network Storage 70GB. Template: RunPod Stable Diffusion runpod/stable-diffusion:web-ui-10.2.1...

“We have detected critical error..” issue

Hi team, I am currently seeing this message on one of the pods I deployed: “we have detected a critical error on this machine which may affect some pods. We are looking into the root cause and apologize for any inconvenience. We would recommend backing up your data and creating a new pod in the meantime”. I cannot access the server, and I have some important data in this workspace. Could you help me with this?...

H100 VRAM usage limited by power

So I am paying for an H100 but only getting the "power" of an A40? This doesn't seem right. It never exceeds 30% of total VRAM available. Also, a 310 watt power limit seems very low?...

Can't use anything on ComfyUI

Hey, I just finished downloading and setting up ComfyUI on RunPod, but after I installed it, I noticed I couldn't move or use most of the buttons, most notably the "Run" button.

Whenever I restart my pod, all my data is lost even if I have a volume

Whenever I restart my pod, all my data (including models, outputs, and workflows) is lost, even though I use the ComfyUI Manager Permanent Disk with Torch 2.4 template, which installs the ComfyUI files into /workspace. (I have my own network volume.)

Pods are not connecting

The connection keeps buffering. I tried renting other pods as well and it's the same with all of them. I don't know what the issue is. If it's not going to connect, why am I paying for such a service? There's no response from your side, and it keeps saying "waiting for connection data". Please fix this issue.

API keys automatically created with each pod

I noticed in my audit logs that an API Key was created automatically every time I created a pod. Just wanted to check that that's normal?
Solution:
Each pod has its own API key.

How do I fix this theme loading error?

Super annoying, I feel like I've tried everything already, so any suggestions help a lot.

Pod Global Networking not working

Hi, I had no issues using global networking the day before yesterday, but it has not worked for me since then. I have 2 pods running, both in US-NC-1 with global networking enabled, and ping on the internal name from either pod does not work, and obviously neither do service comms. I also noticed DNS and sporadic disconnect errors on the network yesterday, so I'm wondering if this is a known issue that's being worked on?...

Pods don't get removed on issues when starting

When I start a pod but the image is not found, the pod just keeps burning money for all eternity. Please, for the love of god, fix that.
Solution:
We're working on a new Pod Deploy method which will fix this.

Why does all my data get stored/installed into container disk?

I am working with the Runpod code server template, and I have the mount volume path set to /workspace. Everything I download into /workspace shows up on the container disk (temp storage). What gives? https://discordapp.com/channels/912829806415085598/1407345635463794708...
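One way to narrow this down from inside the pod is to check whether /workspace is actually a separate mount or just a directory on the container's own filesystem. A minimal sketch using only the Python standard library follows. It assumes a Linux container; the function names are hypothetical, and the result only shows what the operating system reports, not how Runpod configured the pod.

```python
import os
import shutil

def describe_path(path):
    """Collect mount and capacity details for a path."""
    usage = shutil.disk_usage(path)
    return {
        "path": path,
        "is_mount_point": os.path.ismount(path),
        "device_id": os.stat(path).st_dev,
        "total_gib": round(usage.total / 2**30, 2),
        "free_gib": round(usage.free / 2**30, 2),
    }

def same_filesystem(path_a, path_b):
    """True when both paths live on the same device, i.e. the same filesystem."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

if __name__ == "__main__":
    for candidate in ("/", "/workspace"):
        if os.path.exists(candidate):
            print(describe_path(candidate))
    if os.path.exists("/workspace"):
        # If this prints True, /workspace shares the container's filesystem.
        print(same_filesystem("/", "/workspace"))
```

If `/workspace` reports the same device and capacity as `/`, writes to it are landing on the container disk rather than on a mounted volume.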

ports removed

Hello, I have this pod that I last used a few hours ago. Now all the ports have been deleted. I added some of them back, but even when I click on them I get a Cloudflare timeout error (even for the Jupyter notebook).

Pod 100% Mem usage freeze

Building wheels for collected packages: flash-attn Building wheel for flash-attn (setup.py) ... \ Gets stuck on the above when trying to run: ...

ComfyUI models not downloading on volume properly

I've been having an issue I can't resolve for the past three days. I've set up a network volume and am running a pod with ComfyUI, and everything is fine besides downloading the actual Wan 2.1 models I'd like to use. I've tried wget, huggingface-cli, downloading through the ComfyUI Manager, and manually dragging and dropping the model into JupyterLab, but EVERY time it says 14.96GB downloaded, yet when I hover over the model in JupyterLab it says 13.3GB or 13.8GB, leading to the error in ComfyUI when trying to run the workflow. How can I avoid this? Is it being truncated? ...
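Part of a gap like this can come from units: tools that report decimal gigabytes (10^9 bytes) and tools that report binary gibibytes (2^30 bytes) show different numbers for the same file. A minimal sketch of comparing exact byte counts instead follows. The expected size of 14,960,000,000 bytes is an assumption that reads "14.96GB" as decimal gigabytes, and the function names are hypothetical.

```python
import os

def bytes_to_gb(num_bytes):
    """Decimal gigabytes (10**9 bytes)."""
    return num_bytes / 10**9

def bytes_to_gib(num_bytes):
    """Binary gibibytes (2**30 bytes)."""
    return num_bytes / 2**30

def is_complete(path, expected_bytes):
    """True when the file on disk has exactly the expected number of bytes."""
    return os.path.getsize(path) == expected_bytes

# Assumption: the downloader's "14.96GB" means decimal gigabytes.
expected_bytes = 14_960_000_000
print(round(bytes_to_gib(expected_bytes), 2))  # 13.93
```

Under that assumption a complete file would display as about 13.93 in binary units, so a reading of 13.3 or 13.8 would still leave a shortfall that units alone do not explain. Comparing `os.path.getsize` against the byte count published by the model host avoids the ambiguity.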

2 Broken H100 GPUs

I'm 14, paid for 2x H100 GPUs for my AI research company Aedis, and got completely broken hardware. ERROR: "CUDA out of memory. GPU 0 has 79.18 GiB total, 14.50 MiB free. 79.16 GiB already in use" How is 79GB already consumed on a FRESH session?? My 2B model can't even load....

Transfer pod volume (/workspace) data to Network volume

I have created a pod and have data in the /workspace folder. After that, I created a network volume. I want to transfer the data to the network volume so that I can spin up another pod using it. How can I achieve that? Please help.

Can’t connect to pod (EU-IS-2)

Not sure if this is an issue with all regions, but trying to spin up a new pod atm (RTX 4090) and can’t connect to it via Terminal or SSH, tried refreshing page, waiting for 30 min, nothing helped
Solution:
Ah, looks like this is the problem lol, I'll try using an earlier PyTorch template.