RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

(Beginner Question) Hosting Quantized model

Hi, I'm new to RunPod. Can anyone point me towards how I can host a quantized model like this one? I want to try the 2.71-bit version first. https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF-UD
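
One common route for a GGUF quant like this is llama.cpp's built-in server. A rough sketch, assuming the pod has enough VRAM plus system RAM for the shards and that the 2.71-bit dynamic quant lives in a UD-Q2_K_XL/ folder in the repo (folder and file names are assumptions; check the repo's file list):

# download only the 2.71-bit quant shards onto the network volume
pip install -U "huggingface_hub[cli]"
huggingface-cli download unsloth/DeepSeek-V3-0324-GGUF-UD \
  --include "UD-Q2_K_XL/*" --local-dir /workspace/deepseek-v3

# build llama.cpp with CUDA support and expose an HTTP server from the pod
git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp
cmake -B build -DGGML_CUDA=ON && cmake --build build -j

# point llama-server at the first shard; the remaining shards are picked up automatically
./build/bin/llama-server \
  -m /workspace/deepseek-v3/UD-Q2_K_XL/*00001-of-*.gguf \
  --host 0.0.0.0 --port 8080 -ngl 99   # lower -ngl if the model does not fit in VRAM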

Using an RTX 4090 Pod, I want to create an API from Civitai diffusers models - is Cog the best way?

I'd love to avoid Docker and want to stick to my own server instead of Serverless. The input would be one of several safetensors models plus some parameters, and the generated image would be the output of the API.

trl vllm-serve not binding to port

I have a pod with two A6000s and I am trying to run vLLM on one of them via:
VLLM_LOGGING_LEVEL=DEBUG NCCL_DEBUG=TRACE trl vllm-serve --model meta-llama/Meta-Llama-3-8B-Instruct --gpu_memory_utilization=0.75 --max_model_len 2048 --host 0.0.0.0 --port 8000
...
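
A couple of generic checks that usually narrow this down (the GPU pin and the socket check are diagnostics, not a confirmed fix):

# pin the server to one of the two A6000s so it does not try to shard across both
CUDA_VISIBLE_DEVICES=0 trl vllm-serve --model meta-llama/Meta-Llama-3-8B-Instruct \
  --gpu_memory_utilization=0.75 --max_model_len 2048 --host 0.0.0.0 --port 8000

# in a second terminal on the pod: is anything actually listening on 8000?
ss -tlnp | grep 8000

# if the socket shows up bound to 0.0.0.0:8000, the server is fine and the issue is
# more likely the pod's exposed-port mapping than vLLM itself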

Any easy pipeline to migrate from GCP Cloud Compute VM Instance to Runpod Cluster?

What's the easiest route? I'm looking to migrate within the next 24 hours.

Do A6000 pods have NVLink support?

Do A6000 pods have NVLink support?
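
You can check from inside a running pod; a quick sketch, assuming nvidia-smi is available in the image (it is in the standard RunPod templates):

nvidia-smi topo -m           # NV# entries between GPU pairs mean active NVLink; PIX/PHB/SYS mean PCIe only
nvidia-smi nvlink --status   # per-link state and speed; no links listed means no NVLink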

How do I do this?

Hey, how do I do this? I'm trying to rent a GPU so I can run my Anaconda stuff, and I have no idea how to make this work. Can I get a hand setting this up, please?

Bug in Runpod ComfyUI Network Volume Setup

The /workspace/comfyui folder is not the actual one; the real install is in /ComfyUI, which is outside the network volume mount. This means that if you terminate the pod, your progress is lost. I think the folder should live inside the volume mount so the state persists between pod reinitialisations.
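
Until the template is fixed, one possible workaround is to move the install onto the volume and symlink the old path back (paths taken from the post; adjust to whatever your template actually uses):

# move the real install onto the network volume, then link the old path to it
mv /ComfyUI /workspace/ComfyUI
ln -s /workspace/ComfyUI /ComfyUI

# after the pod is re-created, restore the link before launching ComfyUI
[ -e /ComfyUI ] || ln -s /workspace/ComfyUI /ComfyUI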

Web UI was demanding I pay just to start a pod, but I have plenty of credits

I have $69 in credits, but the web UI was prompting me to add more money before starting a pod. This was probably because I had a very old tab open; I had to log out and log back in. The issue is fixed for me now, but I could see it happening to other people. Thanks.

GPU Suddenly Stopped Working

I cannot restart the pod because all the files are in the container storage now. Can you fix it, please?

Custom Docker image which uses Streamlit and PostgreSQL

How do I solve an issue I'm having with hosting using Streamlit and PostgreSQL?

Unable to start kernel

Hi, I'm Cathy. I'm trying to run a Flask + JupyterLab project (with Whisper, Gemini API, etc.) on a RunPod GPU pod. I set up a virtual environment and installed all my dependencies, but I keep running into issues: Jupyter kernels don't use my venv by default, and when I try to switch, I get port conflicts or 502 errors. Sometimes, even after installing packages like Flask or Whisper, my notebook still says “ModuleNotFoundError.” The RunPod dashboard often shows “Not Ready” for JupyterLab and my HTTP service, even when I think they're running....
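
For the kernel-vs-venv part specifically, registering the venv as its own Jupyter kernel usually clears up the ModuleNotFoundError confusion. A sketch, assuming the venv lives at /workspace/venv (the path is an assumption):

# run once in the pod's terminal
source /workspace/venv/bin/activate
pip install ipykernel
python -m ipykernel install --user --name flask-venv --display-name "Python (flask-venv)"

# then pick "Python (flask-venv)" in JupyterLab's kernel picker; if a notebook still
# reports ModuleNotFoundError, it is almost certainly running on the base interpreter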

Assistance with SSH Access to My Pod

I am trying to connect to my pod via SSH using the following configuration: Host: 69.30.85.33 Port: 22045 User: root...
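
For reference, the direct TCP connection with those values looks like this (the key path is an assumption; it must match the public key added to your RunPod account or the pod's authorized_keys):

ssh -i ~/.ssh/id_ed25519 -p 22045 root@69.30.85.33

# if it fails, -vvv shows where the handshake stops (key rejected, timeout, etc.)
ssh -vvv -i ~/.ssh/id_ed25519 -p 22045 root@69.30.85.33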

Global Networking

Hi, I have a volume on IL-1. I thought that was a Global Networking server when I created the volume. Whenever I go to make a pod on any other server, it won't let me choose my volume and a different server. Also, there are no options for Global Networking under any instance price. Any clarification would be helpful.

[GPU not assigned to Pod – Need help]

[GPU not assigned to Pod – Need help] Hi, I have an issue with my Pod where the GPU is not assigned even though I selected RTX 4090 (on-demand). - Pod ID: ul6dffsrkkbnvl...

Network Issue

Hello, I'm currently using a rented GPU, but I received the message: "This server has recently suffered a network outage and may have spotty network connectivity. We aim to restore connectivity soon, but you may have connection issues until it is resolved. You will not be charged during any network downtime." ...

Unable to back up volume data to Google Cloud storage bucket

Hi, I've been trying for a while now to sync my Google Cloud Storage bucket with RunPod so that I can back up my volume data. I followed the instructions in the documentation, but I just can't seem to initiate the transfer; the options tab where I select whether to upload to or download from Google Cloud Storage just keeps refreshing. I created my service account JSON key, and I provide the bucket name and directory path, but it doesn't seem to work. I ensured that the buck...
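
If the UI sync keeps stalling, a manual fallback from inside the pod can at least get the backup done; a sketch assuming gcloud and gsutil are installed on the pod and the service-account key was copied to /workspace/key.json (both are assumptions):

gcloud auth activate-service-account --key-file=/workspace/key.json
gsutil -m rsync -r /workspace gs://YOUR_BUCKET/runpod-backup   # consider excluding the key file itself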

Can't send using runpodctl and can't resend

I've installed runpodctl on the receiving PC, but I'm still unable to receive the .zip file. I got an "approve access" notification on the receiving PC, and although I approved it, nothing was received. On the sending Pod, I also can't generate a new one-time send code because it says the .zip file already exists....
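
If the stale archive is what blocks generating a new code, one way around it is to remove or rename it and send again; a sketch assuming the file is /workspace/output.zip (name and path are assumptions):

# on the sending pod: clear the stale archive and create a fresh one
rm -f /workspace/output.zip
zip -r /workspace/output.zip /workspace/results    # adjust the source folder

# generate a new one-time code
runpodctl send /workspace/output.zip

# on the receiving PC, run the command the pod prints, e.g.:
runpodctl receive <one-time-code>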

Not Working

Why does it stop at 18% when I try to download on port 3000 in the CivitAI + Detail Tweaker XL tab? This error pops up.

runpod-torch-v280 & RTX 4090 unsatisfied condition: cuda>=12.8

Hello, start container for runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04: begin error starting container: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.8, please update your driver to a newer version, or use an earlier cuda container: unknown ...
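
That error means the host machine's NVIDIA driver is older than what a CUDA 12.8 image requires; either filter for CUDA 12.8-capable hosts when deploying, or pick an image built against an older CUDA. You can confirm what a host supports from any pod running on it:

# the "CUDA Version" in the nvidia-smi header is the maximum the driver supports
nvidia-smi
nvidia-smi --query-gpu=driver_version --format=csv,noheader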

What will happen if I resume a pod on a fully occupied physical machine via the API?

Hello, the docs say: "Most of our machines have between 4 and 8 GPUs per physical machine. When you start a Pod, it is locked to a specific physical machine. If you keep it running (On-Demand), then that GPU cannot be taken from you. However, if you stop your Pod, it becomes available for a different user to rent. When you want to start your Pod again, your specific machine may be wholly occupied! In this case, we give you the option to spin up your Pod with zero GPUs so you can retain access to your data." What will happen if I try to resume a stopped pod via the API and all GPUs in the physical machine are already occupied? What will the status code be? Is there a detailed message in the response body? And what will the response be if there is no free GPU in the whole datacenter? ...
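
I can't confirm the exact behaviour, but it is easy to probe safely. A sketch using curl against the GraphQL API (the podResume mutation name, its fields, and the api_key query parameter are assumptions based on the public API docs; check the current schema):

curl -s "https://api.runpod.io/graphql?api_key=$RUNPOD_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"query":"mutation { podResume(input: {podId: \"YOUR_POD_ID\", gpuCount: 1}) { id desiredStatus } }"}'

# GraphQL endpoints usually answer HTTP 200 even on failure; look for an "errors"
# array in the JSON body rather than relying on the status code alone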