Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Enable UFW

Hello, I'm trying to set up a firewall on a GPU cluster but I can't get UFW enabled. If anyone is around to help, that would be great.

Stable Diffusion Template WITH ControlNet models preloaded?

While the Stable Diffusion template I'm using does have ControlNet, it doesn't have any of the ControlNet models. I've found it difficult to install them. Can someone point me to 1) a template with the ControlNet models preloaded or 2) a quick way to install ControlNet models?...

Pod hangs on the git add command. Tried some memory loading and it hangs indefinitely.

Pod hangs on all memory-specific commands. I can't push code and move to another server. Can someone help? Pod ID: 7ukzipgyte46cg...

Terminal does not work in Jupyter notebook.

Hey guys, for some reason the terminal in Jupyter notebooks is not working anymore. When I open the terminal, I just get an empty window in which I can't type anything. I need to use the web terminal for any script executions.

Increase spending limit

I keep hitting my $40/hour limit and need this increased. How can I do this?

Hi,

I am trying to send a file from my local system to my pod volume using this command: `rsync -e "ssh -p 10234 -i /home/dell/ssh_keys/ssh_key_dell_Latitude_A4213.txt" -avP /home/dell/exp10/conda_env.zip root@213.173.100.0:/workspace/testing/` but when I run it I get this error: `ash: line 1: rsync: command not found rsync: connection unexpectedly closed (0 bytes received so far) [sender]`...
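
A note on the error in this post: the `rsync: command not found` line is printed by the shell on the receiving side, which suggests rsync is not installed on the pod (rsync has to be present on both ends of a transfer). A minimal check to run on the pod might look like the sketch below. The `have_cmd` helper name is hypothetical, and the commented-out install line assumes a Debian/Ubuntu-based image with `apt-get`, which the post does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Run this on the pod, not on the local machine: the missing binary is on
# the receiving side of the transfer.
if have_cmd rsync; then
    echo "rsync present"
else
    echo "rsync missing"
    # Assumption: Debian/Ubuntu-based image with apt-get available.
    # apt-get update && apt-get install -y rsync
fi
```

If the check reports rsync as missing, installing it on the pod and rerunning the original command unchanged is the usual next step.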

Jupyter notebook - does it keep on running?

I am using Jupyter notebook on my pod. If I close the tab, will it keep running?

Open-WebUI 404 Error

When using the Better Ollama CUDA 12 template, and following the instructions found here: blog.runpod.io/run-llama-3-1-405b-with-ollama-a-step-by-step-guide, getting an error when posting a query using open-webui: Ollama: 404, message='Not Found', url='https://<snip>-11434.proxy.runpod.net/api/chat' Interestingly enough, replacing the open-webui localhost URL with the above URL works well with cURL using network diagnostics. Wanted to replicate the issue on a less expensive server, but can no longer find the template....

Why is upload speed so slow?

A week back when I downloaded a 6GB checkpoint, it took 1-2 hours. Now it's telling me it'll take 12 hours. Is there a reason for this?

GPU errored, machine dead

Search 0 matches 2024-09-04T11:12:09Z stop container 2024-09-04T11:12:44Z remove container...

Slow Container Image download

Two EU datacenters, EU-SE-1 and EU-RO-1, are experiencing extreme slowdowns during docker container image download, to the point where our scaler can't keep up with load spikes because it takes > 30 minutes to start up a pod. This needs to be resolved as it's directly costing us money: we can't properly scale, which causes our queue to keep spiking and building, and we are forced to use on-demand instead of spot because of the slow download speed....

Can I specify CUDA version for a pod?

nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.4, please update your driver to a newer version, or use an earlier cuda container: unknown vLLM based container image fail to start...
Solution:
In Deploy, click Filters and you can specify the CUDA version there.

Pods won't start

Looks like auth to hugging face failed, cannot launch any pods - tried with multiple configs, same result. Clicking on start web terminal does nothing, sometimes connect to jupyter button appears but does not do anything. Pod ID: 5d15c6q1grfm6p ``` .254316737Z ...done....

create POD with full Intel Sapphire Rapids CPU chip for Parallel Algorithm scalability test.

Hi, I usually create PODs for GPU tasks, accessing them through ssh, so I am very familiar with that side. But now we need to rent a POD with just a modern Intel CPU fully available to us. In particular, we need one with the Intel Sapphire Rapids architecture, so that it supports AMX matrix instructions. This is for a parallel CPU algorithm for which we need to obtain performance and energy consumption results (plots). I went through the Runpod menus but I could not find options on the CPU side, nor exact info on the pod's CPU model. Am I missing something obvious? Thanks in advance...

My pod has been stuck during initialization

ogw47gdxzk3a26 - stuck during image pulling. Could you check what happened and handle the issue? Our infra is not ready to handle this kind of error on your side.

Creating instances with a bunch of open ports

I'm using several GPU pods. I ran into a lack of open ports. AFAIK, the number of ports is restricted when creating instances: at most 10 ports are supported. How can I get 20 or 30 ports while creating an instance?...

creating instance from an image file

I want to make an image from an image file (faster than using a registry). Any idea how to do it? I prefer to use the Runpod storage, because it is faster that way.

Creating pods with different GPU types.

Hello, can I create pods with different GPU types? Say I want to create a pod with 2 A40s and 1 RTX A5000. I ask because there is a gpuTypeIdList property in the Runpod GraphQL spec. Also, it would be amazing to have that feature. Thanks!

Slowish Downloads

I'm trying to set up a pod running ComfyUI for Flux at the moment, and it's going to take 30-40 mins just to download the models at the speed it's running at. ```Downloading 1 model(s) to /workspace//storage/stable_diffusion/models/unet... Downloading: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors 0K .......... .......... .......... .......... .......... 0% 10.9M 34m23s...

can't cloud sync with Backblaze B2

I need help: I can't do cloud sync with Backblaze B2. I entered the key ID, the application key, and the bucket root path, but it says "Something went wrong!"...