Runpod


We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!



H100 in US-KS-2 has high CPU latency

Hey, I just reserved an H100 in US-KS-2 and the instance is super slow. It has taken over ten minutes just to install Python dependencies. It's not just download speed (which is slow) but also CPU, because after the downloads complete, processing the wheel installations is very, very slow. I was using an A100 last week and this process went really fast.
```
root@7f041151afc9:/workspace/stanford-cs336-lmfs-assignment1-basics# uv run wandb login
Using CPython 3.13.5...
```

3090 and 4090 servers don't work with ComfyUI

See the log; it looks like the ComfyUI Wan 2.1 template doesn't work on the 4090 server pod.

lora loader not showing loras

I'm running my pod with the runpod/stable-diffusion:comfy-ui-6.0.0 template. The issue I'm having is that the Lora Loader is not populating its list with the LoRAs I've saved. I have tried loading the LoRA using civicomfy as well as uploading it through the file browser, but neither method worked. The LoRA files are not corrupted. I verified this by running ComfyUI locally on my device, where I was able to see the LoRAs in the list....

Python requests downloads are slow within the EU region

Hello Runpod Team, We are experiencing high network latency when our Runpod VMs download small files (2MB to 10MB PDFs) from AWS S3. Issue Details...

Pod and remote connection on vscode slow.

Hello, The terminal has become extremely slow, both in the VS Code remote environment and directly in the Runpod browser terminal. It's taking a significant amount of time even to type letters or create folders and files. I have already increased the root storage to 20GB and tried switching between all three available routers in our office, but the issue still persists. I am using an A100 instance with a network volume of 1.5TB. Over the past two days, I've been experiencing this performance issue. I’ve also noticed that the network volume is currently at 91% usage, while the container volume is only at 5%. I have cleared all caches, but the slowness continues. I would appreciate your help in identifying and resolving the issue....

CA-MTL-3 Region is not working

I'm using CA-MTL-3 region storage, but it hasn't been working for a long time... help please..

Unsloth Llama Scout will not download

Hi, I'm on Runpod using the oobabooga one-click template and I'm trying to run Unsloth Llama Scout Q6_K on 2x A40 (just for conversation, not training/learning). I've followed the directions in Unsloth's official "Llama 4: How to Run and Fine-Tune", but I get stuck at step 2 of "how to run", where it says to select the model you would like to use. In the code given on Unsloth's page, I put allow_patterns = "Q6_K" and get told that there's no such command. I had originally tried downloading it using the regular oobabooga interface. It "downloads" very quickly, but there's nothing actually there. When I try to load the model using llama.cpp, I get told "list index out of range". The same thing happened with wget in the terminal. The file was 127 KB. I'm completely new at this and really have no idea what I'm doing. I had been getting Google Gemini to help me but that wasn't going anywhere. I'm grateful for any and all help or feedback. Thank you....
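
Not an official answer, but one quick way to tell whether a download like the 127 KB file above is a real model file: GGUF files begin with the four ASCII bytes `GGUF`, so a file that starts with anything else (an HTML page, for instance) is not a usable GGUF model. The sketch below only checks the size and those first four bytes; the helper name `check_download` is made up for illustration, and it does not validate the rest of the file.

```python
import os

# GGUF model files start with these four ASCII bytes.
GGUF_MAGIC = b"GGUF"


def check_download(path):
    """Report the file size and whether the file starts with the GGUF magic.

    Hypothetical helper for illustration; it inspects only the first
    four bytes and does not verify that the model is complete.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(4)
    return {"size_bytes": size, "is_gguf": head == GGUF_MAGIC}
```

Calling `check_download` on the downloaded file and seeing `is_gguf` come back `False`, or a size far smaller than the model is expected to be, would suggest the download never fetched the actual weights.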

Network Storage / Persistent Storage Guidance for Community Cloud

Hi friends. Can anyone suggest the right strategy for running Pods in RunPod on Community Cloud? As we know, once a Pod is terminated, the whole installation (any UI, scripts, and data) is gone. To avoid repeating that process, what solutions can I look into... How can I connect external storage, like Dropbox or any other... ...

tuple index out of range

I'm trying to train a Flux LoRA (Next Diffusion - FluxGym), and when I start .... I see this message and nothing happens: IndexError: tuple index out of range...

I need some help: can we point our custom domain to a deployed pod?

Hello everyone! I need some help: can we point our custom domain to a deployed pod? I tried doing it by opening the web terminal, finding the public IP address of the pod, and then pointing my domain to that IP,...

The pod doesn't start

Runpod Pytorch 2.8.0
runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
start container for runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04: begin
error starting container: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'...

Registry fetching extremely slow for the past 2 days

I usually only work in the serverless environment, but I also keep my rather large Docker images as Runpod templates in case I need to do some testing in the actual ComfyUI GUI. For the past 2 days, fetching the Docker image to deploy the pod has slowed down significantly. A 130GB Docker image used to take ~20 minutes to boot up in a pod. Now it's well over one hour, as the download gets slower and slower. This is with the official Docker Hub, not any private image repos. Has anyone else been experiencing this, or have I just gotten unlucky with low-bandwidth pods?...

I tried a "pod" a few days ago.. not working anymore?

Hello. I have to start off by saying that I have no clue how this works; I just followed a YouTube guide that showed how to rent a 4090, then go to Pods and search for the author's username to try ComfyUI. It worked well the first day, and when I was done using it for the day, I pressed Stop (I guess that's how to avoid being charged while I'm idle). The day after, I wanted to try ComfyUI again, so I clicked "Pods" and Start.. but it never worked again.. Is there something I di...

network volume speed test

Might be a stupid question. Is there a guide or quick recommendation on how I can check the transfer speed/latency between the machine and a network volume? I just want to check whether it would be a bottleneck in any way (I am assuming not). Also, is it safe to assume that multiple pods accessing, for example, one dataset on the same network volume wouldn't degrade training performance much?...
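
For a rough first look (not an official Runpod tool), one option is to time a sequential write and read of a temporary file on the volume's mount point. The sketch below does that with the Python standard library only. The function name `measure_throughput` and its default sizes are made up for illustration, and the read figure can be inflated because the operating system may serve the freshly written file from its page cache.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=64, block_mb=4):
    """Write, then read back, a temporary file in `directory`.

    Returns (write_MB_per_s, read_MB_per_s). Hypothetical helper:
    sequential I/O only, and the read may be served from the OS
    page cache, so treat the read number as an upper bound.
    """
    block = os.urandom(block_mb * 1024 * 1024)
    blocks = max(1, size_mb // block_mb)
    total_mb = blocks * block_mb

    fd, path = tempfile.mkstemp(dir=directory, suffix=".bench")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the volume
        write_seconds = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(len(block)):
                pass
        read_seconds = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the benchmark file

    return total_mb / write_seconds, total_mb / read_seconds
```

Running it once against the network volume's mount path and once against the container disk, for example `measure_throughput("/workspace")` versus `measure_throughput("/tmp")`, gives a side-by-side comparison. Both paths are assumptions here, so substitute wherever the volume is actually mounted.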

Jupyter can't use password

No matter what I do, I can't change my Jupyter password.
I tried editing the config file
I tried doing "jupyter server password"
I tried editing the environment variable...

Persist user with network volume

I'm trying to set up Runpod as a development environment where I can easily move my environment across different GPUs. Basically, I am creating a container that creates a user, and then I am mounting a network volume in the user's home directory. I want to be able to log in to Git, set up VS Code server, etc., all within this mounted user volume, so that every time I start a new pod it's all already there and I don't have to repeat this process. The problem I am running into is tha...

Can't see workspace

I'm trying to run the official Runpod ComfyUI workspace and have added a volume with a path, but I still can't save the model and LoRA data anywhere. I can't find my workspace or any files where I can save the models. Please help me.

How do I mount more than 1 network volume to a pod?

Hi guys, I have a requirement to be able to mount more than one network volume on a pod. How do I do that?

Docker error : error creating container: container: create: container create: exit status 1

I am getting the following error while trying to create a pod using the following specifications:
* Region : US-KS-2 (using network volume)
* GPU - RTX A6000
* Docker image : runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04...

Cannot connect with WinSCP to upload/move files

I am able to connect just fine with the ssh command, but I'm unable to use anything else, like WinSCP