Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Running pod Terminal is not starting

I just launched a new pod with a specific container. When I click Start Web Terminal, the button reacts, but the Connect to Web Terminal option never becomes enabled.

Llama

Hello! For those who tried, how much GPU is needed for inference only, and for fine-tuning of Llama 70B? How about the inference of the 400B version (for knowledge distillation)? Is the quality difference worth it? Thanks!
Solution:
Only for inference
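
As a rough aid for the sizing question above, here is a back-of-the-envelope sketch in Python. It assumes that the memory needed to hold the weights is about the parameter count times the bytes per parameter. It does not account for activations, KV cache, or the gradients and optimizer state that full fine-tuning adds, so treat the numbers as a lower bound rather than an answer from this thread. The function name and the dtype table are illustrative.

```python
# Back-of-the-envelope estimate of the memory needed just to hold model weights.
# Assumption: memory ~= parameter_count * bytes_per_parameter. Activations,
# KV cache, gradients, and optimizer state are NOT included.

BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billion: float, dtype: str = "fp16") -> float:
    """Return the approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 1e9


for size in (70, 400):
    for dtype in ("fp16", "int4"):
        print(f"{size}B @ {dtype}: ~{weight_memory_gb(size, dtype):.0f} GB")
```

Dividing the result by the VRAM of a single card gives a first guess at the minimum number of GPUs for inference. Real deployments need headroom on top of that.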

XXXX.safetensors is not a safetensors file

Hi, I get an error when generating an image: it says the safetensors file is not a safetensors file. I have tried downloading it with wget and with gdown --fuzzy, but neither works. Any ideas? I would really appreciate your help, I have been having this problem for many days 😦 The file works on my local machine, and other people have tried it and it works for them too. I think something is missing....
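
One thing that may be worth ruling out is that the download saved something other than the model, for example an HTML page or a truncated file. A safetensors file starts with an 8-byte little-endian header length followed by a JSON header, so a minimal check could look like the sketch below. The function name is hypothetical, and it only inspects the header, not the tensor data.

```python
import json
import struct
from pathlib import Path


def check_safetensors(path: str) -> str:
    """Return "ok" or a short description of why the file does not look valid."""
    file_size = Path(path).stat().st_size
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return "file is shorter than 8 bytes"
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > file_size - 8:
            # Typical when the file is really an HTML page or was cut short.
            return "header length exceeds file size"
        try:
            header = json.loads(f.read(header_len).decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            return "header is not valid JSON"
    if not isinstance(header, dict):
        return "header is not a JSON object"
    return "ok"
```

Comparing the file size on the pod with the size on the local machine where the file works is a quick complementary check.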

I cannot use my SSH key to authenticate to my pod.

I have been accessing my pods with an SSH key. A few hours ago I purchased a Savings Plan, and now I cannot log in to the new pods with that key. The same SSH key still works for my other pods (the ones not on a Savings Plan). Please help me with this issue. Thanks in advance.

502 Error when attempting HTTP 8188 connection

I keep getting a 502 error when trying to connect to ComfyUI via HTTP 8188. The Jupyter Notebook is accessible. I have a GPU attached. I also do not see any errors in the logs. Any ideas?
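
A 502 usually means a proxy in front of the service did not get a valid response from it. A first diagnostic, run from inside the pod (for example in the Jupyter terminal that is reachable), is to check whether anything is accepting TCP connections on port 8188 at all. This is a minimal sketch: the helper name is hypothetical, and whether ComfyUI has finished starting or is bound to the expected interface is an assumption to verify, not a confirmed cause.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Example: check the ComfyUI port from inside the pod.
print("8188 open on localhost:", is_port_open("127.0.0.1", 8188))
```

If this prints False, the service is not listening yet (or has crashed), which would explain the 502 even when the logs look clean.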

runpodctl create pod

How can I use runpodctl to create a pod with a specific datacenterId?

connection refused for port forwarding (for colab)

1. I set up a pod and it's running...
2. Added the env variable for JUPYTER_PASSWORD
3. Set up ~/.ssh/id_ed25519
4. Pushed the Jupyter lab [Port 8888] button in Connect
5. Ran the port fwd cmd: ...

Volume with no files is registered as having 23GB

I started a new pod and /workspace showed an unusual amount of data in use. Deleting everything from the volume still leaves substantial usage reported. This seems like a bug in how storage is calculated.
root@5c5eadaefa32:/workspace# df -h
Filesystem Size Used Avail Use% Mounted on
overlay 10G 64M 10G 1% /...
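
To compare what the filesystem reports with what is actually stored in visible files, a small sketch like the following can help. It sums file sizes under a directory, including hidden files, and prints that next to the filesystem-level figure that df also reports. The helper names are hypothetical, and how Runpod accounts for volume storage is not something this sketch can tell you.

```python
import os
import shutil


def tree_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def report(root: str) -> str:
    """Compare filesystem-level usage (as df sees it) with the files under root."""
    usage = shutil.disk_usage(root)
    files = tree_size_bytes(root)
    return (f"filesystem used: {usage.used / 1e9:.2f} GB, "
            f"files under {root}: {files / 1e9:.2f} GB")
```

Calling `print(report("/workspace"))` on the pod shows both numbers side by side. A large gap suggests the usage comes from something other than files visible in that directory.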

How to Keep Installed Python Modules Persistent and How to Mount Multiple Volumes?

I'm running into a couple of issues on Runpod and would appreciate some help: Whenever I pause and restart my pod, all of my installed Python modules are lost. How can I make sure the Python modules I install remain persistent even after restarting? I know that this issue with persistence could probably be solved by mounting multiple volumes, but I can't find any method to mount multiple volumes in Runpod. Could you guide me on how to do this?...
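
One common way to keep installed packages across restarts, without mounting multiple volumes, is to put a virtual environment on the persistent volume and install into that. The sketch below uses Python's standard `venv` module. It assumes that /workspace is the persistent volume mount and that the container image (and therefore its Python interpreter) stays the same between restarts; both are assumptions to verify for your pod. The function name is hypothetical.

```python
import venv
from pathlib import Path


def create_persistent_venv(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir and return the path to its Python."""
    # clear=False keeps an existing environment instead of wiping it.
    venv.create(env_dir, with_pip=with_pip, clear=False)
    return Path(env_dir) / "bin" / "python"  # POSIX layout, as on Linux pods
```

For example, `create_persistent_venv("/workspace/venv")` followed by `/workspace/venv/bin/python -m pip install <package>` keeps the installed modules on the volume rather than in the container's disk.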

Deploy pod without scheduled downtime

I'm trying to set up a pod that I will use for a while. How can I get one that will not lose my data? I have tried with many pods...

Low GPU usage

I've currently set up a pod with 4 different GPUs and allocated each GPU to a different port with the command: CUDA_VISIBLE_DEVICES=0 python main.py --listen --port 8188. It worked. I have 4 different ComfyUI tabs operating by themselves. But when I generate, the generation speed is incredibly slow. It has taken 2+ minutes to generate a single SDXL image on all of them....
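
The command quoted above pins the process to GPU 0. If the same value were reused for all four instances, they would all share one GPU, so it is worth confirming that each instance gets its own index. The sketch below builds one launch specification per GPU, each with a distinct `CUDA_VISIBLE_DEVICES` value and port. `main.py`, `--listen`, and `--port` come from the post; the helper name and the port numbering scheme are assumptions.

```python
import os
import sys


def launch_specs(num_gpus: int, base_port: int = 8188):
    """Build one (env, argv) pair per GPU, each with its own port."""
    specs = []
    for gpu in range(num_gpus):
        # Each process sees exactly one GPU, exposed to it as device 0.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        argv = [sys.executable, "main.py", "--listen", "--port", str(base_port + gpu)]
        specs.append((env, argv))
    return specs


for env, argv in launch_specs(4):
    print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']}", " ".join(argv[1:]))
```

Each pair can then be passed to `subprocess.Popen(argv, env=env)` from the ComfyUI directory. Watching `nvidia-smi` while generating shows whether all four GPUs are actually in use.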

Special characters in pod IDs

Pod IDs should not contain special characters, since they are meant to be used in APIs.

Why is Runpod so slow?

I'm using the RTX 6000, which has 48 GB of VRAM, but my generation speed in ComfyUI is extremely slow. Is there a reason for this?

Why is pod speed VERY slow with multiple ongoing pods

I have created 4 separate network volumes and attached 1 pod to each of them. I've experienced VERY slow speeds when using ComfyUI. Why is this? Is Runpod limiting my VRAM because I have 4 pods going at once?

H100 NVLink

If I buy two 8xH100s, can I use NVLink between multiple GPUs?

Jupyter Notebook not cooperating after 1st Reboot

Hey support team! ChatGPT, Perplexity, and @ai-helper-runpod all said to reach out to support... I have a ComfyUI/Flux (1-click) template installed. Every time I restart the pod, I can still log in to the Jupyter notebook successfully, but I cannot make any changes. If I save or upload a file, it says not found......

Suggest a template for this text classification model (small model from huggingface)

I want to do some (zero-shot) text classification with this model [1] or with something similar (size of the model: 711 MB "model.safetensors" file, 1.42 GB "model.onnx" file). Now I see a LOT of pod templates...

Is it possible to save template overrides on the official templates?

I want to preserve my environment variables, container/volume size, and start commands.

Runpod vLLM - How to use GGUF with vLLM

I have this repo, mradermacher/Llama-3.1-8B-Stheno-v3.4-i1-GGUF, and I use this command: "--host 0.0.0.0 --port 8000 --max-model-len 37472 --model mradermacher/Llama-3.1-8B-Stheno-v3.4-i1-GGUF --dtype bfloat16 --gpu-memory-utilization 0.95 --quantization gguf", but it doesn't work... It says "2024-10-07T20:39:24.964316283Z ValueError: No supported config format found in mradermacher/Llama-3.1-8B-Stheno-v3.4-i1-GGUF" ...