Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Cannot restart pod

My pod took about 30 minutes to boot up, and I was still charged for that time. It's ridiculous! Now it still won't boot up after rebooting. My pod ID is z1q1abvlmyhpbn. Please help!...

Ollama: error when I try to install Ollama with the install command

When I run curl -fsSL https://ollama.com/install.sh | sh, I get the error message: curl: (22) The requested URL returned error: 403...

Can I get the account balance from the API?

I want to monitor the account balance (e.g. get the balance in real time and be notified). Is there an API to get this data? 😮

GitHub pull time

I pushed to a repository on GitHub about 40 minutes ago, and it still hasn't been pulled or triggered a rebuild. Is there a dashboard or something similar that can tell me whether it's just lagging, etc....

Expand pod size

I want to expand the size of my pod, but I see this warning message: Editing a running pod will cause it to reset. You will lose all data that isn't stored in your volume mount path (/workspace) ...

My network volume is currently pinned to EU. Is it possible to move it to a different region?

GPU availability is limited in that region, and I would like to keep using the data in my network volume. Thanks.

How to run https://hub.docker.com/r/dockurr/windows

I wanted to run a Windows instance inside a Docker pod container, but when I run it as the startup command: docker run -it --rm -p 8006:8006 --device=/dev/kvm --device=/dev/net/tun --cap-add NET_ADMIN --stop-timeout 120 dockurr/windows, I get this error: ...

Limit Memory Usage

Multiprocessing requires a lot of memory, and the server crashes when the memory threshold is reached (needing a restart). Is there a way to prevent this so I don't have to keep restarting the server? Perhaps a way to set a server-wide memory usage limit that kicks in before the threshold is hit?
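
One possible approach to the memory question above, as a minimal sketch rather than a confirmed fix: cap each worker's address space with the standard-library `resource` module, so that an oversized allocation should fail with `MemoryError` inside that worker instead of taking the whole server down. This assumes a Linux pod and Python's `multiprocessing.Pool`; `limit_memory` and `make_limited_pool` are hypothetical helper names, and the cap is per process, not server-wide.

```python
import multiprocessing as mp
import resource


def limit_memory(max_bytes):
    """Cap the calling process's virtual address space (RLIMIT_AS, Unix only)."""
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        # The soft limit may not exceed an existing hard limit.
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))


def make_limited_pool(workers, max_bytes_per_worker):
    """Create a Pool whose workers each apply the cap right after they start."""
    return mp.Pool(
        processes=workers,
        initializer=limit_memory,
        initargs=(max_bytes_per_worker,),
    )
```

For example, `make_limited_pool(4, 4 * 1024**3)` would give four workers a 4 GiB cap each; the right number depends on how much RAM the pod has and how many workers run at once.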

How to Run a RunPod GPU Pod Behind a Reverse Proxy Without Exposing the URL?

Good evening, I hope everyone is doing well. I would like to know if it is possible to reach a Pod (RunPod GPU) through a reverse proxy. For example, I want to send POST requests to the running Pod, but I don't want its URL to be exposed. So, I thought about setting up a reverse proxy, pointing a subdomain of my domain, like xpto.domain.com, to this Pod. How can I do this? Is it possible?...
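
To picture the reverse-proxy idea in the question: a small service on your own domain receives the POST and forwards it to the pod, so clients only ever see your URL. Below is a minimal Python sketch, assuming the pod is reachable over HTTP at some upstream address. `make_proxy_handler`, `serve`, and the upstream URL you pass in are hypothetical names for illustration, not RunPod APIs, and a dedicated proxy such as nginx would normally take this role in production.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_proxy_handler(upstream_base):
    """Build a handler that forwards POST requests to upstream_base (assumed pod URL)."""

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read the client's request body.
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            req = urllib.request.Request(
                upstream_base.rstrip("/") + self.path,
                data=body,
                headers={"Content-Type": self.headers.get("Content-Type", "application/octet-stream")},
                method="POST",
            )
            try:
                with urllib.request.urlopen(req, timeout=60) as resp:
                    status, payload = resp.status, resp.read()
                    ctype = resp.headers.get("Content-Type", "application/octet-stream")
            except urllib.error.HTTPError as err:
                # Pass upstream error responses through unchanged.
                status, payload = err.code, err.read()
                ctype = err.headers.get("Content-Type", "text/plain")
            except urllib.error.URLError:
                # The upstream could not be reached at all.
                status, payload, ctype = 502, b"upstream unreachable", "text/plain"
            self.send_response(status)
            self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, *args):
            pass  # keep the sketch quiet; real deployments should log

    return ProxyHandler


def serve(upstream_base, port=8080):
    """Run the proxy until interrupted; the upstream URL never reaches clients."""
    ThreadingHTTPServer(("0.0.0.0", port), make_proxy_handler(upstream_base)).serve_forever()
```

Pointing the subdomain's DNS record at the machine running `serve(...)` would then hide the pod address from callers; TLS and authentication are left out of this sketch.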

Hello, I am pulling out all my hair with this 3000 HTTP service

Hi, I'm new to cloud computing and I'm trying to follow this video: https://www.youtube.com/watch?v=b9jNa9pYLJM&t=223s I have restarted the tutorial 4 times, but I'm stuck with the same problem every time.....

Container restart policy

Is it possible to run a container only once? After the job is done, the container restarts, and I have to catch it and terminate it myself, which is not the desired behavior. I need something like Docker's restart policy 'no'.

Docker image infinitely restarts

I am trying to use a 3-year-old Docker image (from Docker Hub) to run IsaacGym. When I try to start a pod with the image, the system logs indicate that the image was successfully downloaded, but after that they say it's starting the image every couple of seconds. It does this indefinitely, and I do not have access to any error logs (or do I?). Does anyone know what the issue might be? Keep in mind that this issue reappears when I try other images, including newer ones.

When will CPU network pods be fixed?

When will CPU network pods be fixed?

CUDA upgrade

How can I upgrade my pod to CUDA 12?

File migration, SSH between Pods

I am trying to transfer a lot of data between pods to expand storage. Since my source pod is very full, using runpodctl is not an option (it requires compressing the files before sending them through the wormhole). I also tried croc but ran into many issues. I have now resorted to SSH-based methods like rsync or scp, but I can't form a connection between pods (SSH from my local machine works), despite adding my source's public key to my destination's authorized_keys. Is it not possible to SSH from pod to pod? Or is extra setup needed to do so?...

Pod stuck forever on "This server has recently suffered a network outage"

My pod is stuck with this message: "This server has recently suffered a network outage and may have spotty network connectivity. We aim to restore connectivity soon, but you may have connection issues until it is resolved. You will not be charged during any network downtime." I cannot start it. When I try, nothing happens. There are not even any logs. I spent a lot of time configuring it and don't have the time to configure another, and it would cost me more money to do so. I can create a new pod just fine with the same base config. How can I use the pod that I created just a couple of weeks ago?...

vLLM problem on A100 instances

Hi everyone, I'm currently having a problem deploying to A100 instances on RunPod. I have tried many A100s in different regions, but they always show the error "OSError: [Errno 98] Address already in use". I'm using vLLM version 0.6.3post1. It worked fine last month but recently stopped working, and if I deploy to another instance type such as A40 or RTX6000, it works fine. Does anyone know what the problem is?
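
"Address already in use" (errno 98) means another socket already holds the port the server tried to bind. Why that would happen only on A100 hosts is not known from this post, so the following is only a diagnostic sketch: it checks whether a port can be bound and, if not, asks the OS for a free one. `port_is_free` and `find_free_port` are hypothetical helper names, and port 8000 in the usage note is an assumption; use whatever port your server is configured for.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically errno 98 (EADDRINUSE) on Linux when the port is taken.
            return False
        return True


def find_free_port(host="0.0.0.0"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]
```

Running `port_is_free(8000)` inside the pod before starting the server would show whether something else on that host already occupies the port; if it does, starting the server on the value from `find_free_port()` is one workaround to try.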

EU-RO-1 consistently slow network performance compared to other data centers

We consistently get significantly slower network speeds when using GPUs in this data center; GitHub pulls are dreadfully slow. It is a shame, since it seems to have high availability for the RTX 4090.