Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Change GPU on stopped pod.

Hello, I have tried to find a way to change the GPU on a stopped pod, but I'm not sure if it's possible. I would like to adjust the GPU depending on the case without creating a new pod. Thank you....

8 x RTX 5090 PODs JupyterLab and Terminal not working

I'm having an almost constant issue when I try to use 8 RTX 5090s. The pod starts after 6 to 8 minutes, only for me to find that neither JupyterLab nor the terminal is working; each shows either a 404 error or an empty white page. I tried stopping and restarting the pod, but that didn't help. I have to terminate and try again until I get lucky and the terminal works, and about 1 time out of 10 JupyterLab may work. That is something I had to live with, which is annoying but doable. But my problem is I'm being charge...

OutOfMemoryError: CUDA out of memory

I keep getting this error when trying to run various models (e.g., gpt-oss-20b, llama-3.3-70b) on pods. Even when running on GPUs with far more than the required VRAM (e.g., a 141GB H200 for gpt-oss-20b), I still get this error. I have tried setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True, but that didn't fix it.
[info] Pipeline stopped due to error: CUDA out of memory. Tried to allocate 42.49 GiB. GPU 0 has a total capacity of 139.72 GiB of which 38.96 GiB is free. Process 584678 has 100.75 GiB memory in use. Of the allocated memory 99.91 GiB is allocated by PyTorch, and 181.47 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
>...
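
The numbers in the error message quoted in this post can be checked directly. A minimal sketch, assuming only the error text as posted (the names `summarize_oom` and `_number` and the dictionary keys are hypothetical, chosen for illustration): it extracts the sizes from the message and compares the requested allocation with the free memory reported.

```python
import re

# Error text copied from the post (only the sentences that carry numbers).
MESSAGE = (
    "CUDA out of memory. Tried to allocate 42.49 GiB. "
    "GPU 0 has a total capacity of 139.72 GiB of which 38.96 GiB is free. "
    "Process 584678 has 100.75 GiB memory in use. "
    "Of the allocated memory 99.91 GiB is allocated by PyTorch, "
    "and 181.47 MiB is reserved by PyTorch but unallocated."
)


def _number(pattern: str, text: str) -> float:
    """Return the first captured number for `pattern`, or raise if absent."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return float(match.group(1))


def summarize_oom(text: str) -> dict:
    """Pull the key sizes out of a PyTorch CUDA OOM message (all values in GiB)."""
    requested = _number(r"Tried to allocate ([\d.]+) GiB", text)
    free = _number(r"of which ([\d.]+) GiB is free", text)
    in_use = _number(r"has ([\d.]+) GiB memory in use", text)
    reserved_mib = _number(r"([\d.]+) MiB is reserved by PyTorch but unallocated", text)
    return {
        "requested_gib": requested,
        "free_gib": free,
        "in_use_gib": in_use,
        # How much larger the request is than the memory still free.
        "shortfall_gib": round(requested - free, 2),
        # Reserved-but-unallocated memory, the part fragmentation settings target.
        "fragmented_gib": round(reserved_mib / 1024, 2),
    }


summary = summarize_oom(MESSAGE)
print(summary)
```

With these figures the request is about 3.53 GiB larger than the free memory, and only about 0.18 GiB is reserved but unallocated. That suggests fragmentation, which `expandable_segments` targets, is probably not the main factor here: roughly 100 GiB was already in use when the 42.49 GiB allocation was attempted.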

Jupyter Console is Ready but cant connect

I have been trying out multiple secure cloud pod configurations in EU-RO-1, and sometimes, even though the pod is running and the HTTP service is healthy, I can't connect to it or use the web terminal.

ComfyUI WAN 2.2: haven't gotten the server running unless launched manually

ComfyUI Wan 2.2: the container has been giving me problems that burned through about $20 worth of credits, so I figured I'd reach out and see if I'm doing something wrong. It's a one-click container, and it seems that the script that starts the actual server is not working for me. I'm unsure if this is due to attaching the container to a storage pod, but I have not been able to get this going other than by starting it manually. I am also having issues with it even downloading from the model manager within...

Too many open files on GPU pod A6000

On an A6000 pod, I frequently get "Too many open files" and "cannot allocate memory" errors even though ComfyUI is not using all of the VRAM/RAM. It usually happens after sampler generation, during an interpolation step. OS: Ubuntu 22.04 LTS...
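
"Too many open files" is usually tied to the per-process open-file limit, so one way to investigate is to look at that limit and the current descriptor count. A minimal diagnostic sketch using Python's standard `resource` module (Unix only); the function names `fd_status` and `raise_soft_limit` are hypothetical, and whether this limit is the actual cause on this pod is an assumption, not something the post confirms.

```python
import os
import resource  # Unix-only standard-library module


def fd_status() -> dict:
    """Return the soft/hard open-file limits and the current descriptor count."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    fd_dir = "/proc/self/fd"  # present on Linux; lists this process's descriptors
    open_count = len(os.listdir(fd_dir)) if os.path.isdir(fd_dir) else None
    return {"soft": soft, "hard": hard, "open": open_count}


def raise_soft_limit() -> int:
    """Raise the soft limit to the hard limit for this process; return the new soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Skip when the hard limit is unlimited or the soft limit is already at the cap.
    if hard != resource.RLIM_INFINITY and soft != hard:
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


status = fd_status()
print(status)
print("soft limit now:", raise_soft_limit())
```

The change applies only to the Python process that calls it and to processes it starts afterwards, so it would have to run in (or before launching) the process that hits the error. In a shell, `ulimit -n` reports the same soft limit.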

EU-RO-1 data center: very slow internet connectivity

This morning we are trying to create pods with network volumes in RO-1 and run a simple "apt update", but this takes forever (around 5-10 minutes) on each pod! In the past I got a slow pod every now and then, but today ALL pods behave like this. What is going on? We use a 4090 with ~10 vCPU, the PyTorch 2.4 image, and a network volume...

Is there a way to check that a container is healthy?

Hi, I'm new to Runpod GPUs and I want to check that my GPU container is healthy. How can I check this?...
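
One possible starting point, as a sketch only: ask `nvidia-smi` for a few per-GPU fields and confirm that it answers. This checks that the GPU is visible and responding inside the container; it is not an official Runpod health check. The function names `parse_gpu_csv` and `check_gpus` are hypothetical, and the `SAMPLE` line uses made-up values purely to demonstrate the parser.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi's --query-gpu option.
QUERY = "index,name,memory.used,memory.total,temperature.gpu"


def parse_gpu_csv(output: str) -> list:
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    gpus = []
    for line in output.strip().splitlines():
        index, name, used, total, temp = [field.strip() for field in line.split(",")]
        gpus.append({
            "index": int(index),
            "name": name,
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
            "temperature_c": int(temp),  # raises if the driver reports a non-numeric value
        })
    return gpus


def check_gpus() -> list:
    """Run nvidia-smi and return the parsed GPU list; raises if it cannot run."""
    if shutil.which("nvidia-smi") is None:
        raise RuntimeError("nvidia-smi not found; the GPU driver may not be visible here")
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True, timeout=30,
    )
    return parse_gpu_csv(result.stdout)


# Illustrative sample line with made-up values, NOT real output from any pod.
SAMPLE = "0, NVIDIA RTX A6000, 1024, 49140, 41\n"
print(parse_gpu_csv(SAMPLE))
```

On a pod, calling `check_gpus()` would either return one entry per GPU or raise, which gives a simple pass/fail signal to build on.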

ComfyUI fails to start on nerdylive/stableswarm:dev-29b04cfff834fa5cd02b319bf9103251be95ca80

I'm trying to run a GPU server for the first time and, of course, problems arise. I've been trying to figure this out for the past 3 hours to no avail. I'm renting a 5090 server @Jason

Does it usually take this long to install ComfyUI on the SwarmUI setup?

This is my first time running a server, so I used the template nerdylive/stableswarm:v0.1.1-cuda12.6. It finished initializing, though oddly enough the SwarmUI setup is stuck on step 3 out of 6, installing ComfyUI @Jason
Solution:
Closing to make another thread that is more on-topic for my issue.

ComfyUI workflows take hours on Runpod

Hey, my ComfyUI workflow takes a very long time to finish loading on Runpod, and I wanted to ask if there are ways to optimize it.

Disk quota exceeded error

I was encountering a "disk quota exceeded" error when downloading a file. I still have 30 GB free on my network storage, and my disk storage has 30-50 GB free depending on the template. It happens even if I have just started the pod. When I download while running something from ComfyUI, I get the same error in ComfyUI as well. I was trying to save my current workflow before restarting so that I wouldn't have to redo it afterwards, but then my workflow became blank. After restarting, the workflow won't open in...
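
When free space looks fine but writes still fail, it can help to see which mount a download actually lands on. A small sketch using the standard-library `shutil.disk_usage`; the function name `free_space_report` is hypothetical, and `/workspace` is an assumed mount point for the volume, so adjust the paths to the pod in question.

```python
import shutil


def free_space_report(paths) -> dict:
    """Map each path to its total/free space in GiB, or None if it is unavailable."""
    report = {}
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            # Path does not exist or cannot be inspected.
            report[path] = None
            continue
        report[path] = {
            "total_gib": round(usage.total / 2**30, 2),
            "free_gib": round(usage.free / 2**30, 2),
        }
    return report


# "/workspace" is an assumption about where the volume is mounted.
print(free_space_report(["/", "/workspace", "/tmp"]))
```

A quota is enforced separately from the free space a filesystem reports, so this can show space available while writes are still rejected. Its use here is limited to identifying which mount is involved.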

Is it possible to run OpenWebUI on a pod?

I asked the AI bot and it said yes; however, Google, Claude, and the document the bot referenced don't specify that it can be run on a pod. I need more RAM for my RAG use case, and it would be much cheaper and more efficient to host it all on Runpod, but I don't know if I can. I'd need to use OpenWebUI, as I did a personal test and it worked, but now I need to upload 13k documents to it too and my PC isn't beefy enough for that.

RUNPOD_ALLOW_IP does not work

I just ran a series of tests: 1. with RUNPOD_ALLOW_IP set to my address; 2. with RUNPOD_ALLOW_IP set to 0.0.0.0/0; 3. without the RUNPOD_ALLOW_IP environment variable...

The US-IL-1 server is experiencing issues with connectivity and performance again

Really bad connections on consecutive pods deployed on US-IL-1 servers. I sometimes have performance issues with this server and it's really frustrating. Please check it.

My images that were generated in 20 seconds went to 50 seconds per image. Has there been any change

My images that were generated in 20 seconds went to 50 seconds per image. Has there been any change in ComfyUI? It has been like this for 2 hours now; it started out of nowhere, without my changing anything. I haven't updated anything, and I've tried different Runpod pods, and it's very slow....

Need help setting up VS Code SSH

Hi everyone, I have tried everything but cannot make SSH work for VS Code. I can SSH into the pod using a terminal, but the VS Code connection does not work. I also tried direct TCP and password-based SSH. Direct TCP SSH does not work. Do I need to contact Runpod? Please help....

Overcharging

How do I reduce the charges I'm getting from my pod? I'm not using it most days, but I'm getting charged $20 per day when it's $0.87/hr.
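
The two figures in this post can be compared with a quick calculation. A minimal sketch using only the rate quoted above (the function name `daily_cost` is hypothetical):

```python
HOURLY_RATE_USD = 0.87  # rate quoted in the post
HOURS_PER_DAY = 24


def daily_cost(hourly_rate: float, hours_running: float) -> float:
    """Cost in USD for a pod billed at `hourly_rate` for `hours_running` hours."""
    return round(hourly_rate * hours_running, 2)


full_day = daily_cost(HOURLY_RATE_USD, HOURS_PER_DAY)
print(f"24 h running: ${full_day:.2f}")
```

At $0.87/hr, 24 hours comes to $20.88, which is close to the $20 per day reported. That would be consistent with the pod being billed as running around the clock rather than only when it is used. This is an inference from the numbers alone, not a statement about how Runpod billing works.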

I'd like to share my recent experience using the RunPod service to set up a ComfyUI environment.

Hello everyone, I'd like to share my recent experience using the RunPod service to set up a ComfyUI environment. I loaded $10 intending to use the platform for my project, but unfortunately, I encountered severe instability issues and ended up losing $8. The main problem was constant environment instability and frequent disconnections, which made it impossible to work or use the service for its intended purpose....