Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

H100 pod not connecting to network drive of the same region

I have a dual H100 pod that's supposed to be connected to a network drive (both in CA-MTL-1), but when I try to move data, run git status on a repo, or even start a Python script residing on the network drive, the terminal hangs. Seems like a network issue? I've tried spawning dual H100 pods multiple times, but I keep getting the same IP (probably the same hardware?), so nothing changes. Trying this from a machine with an RTX A5000 works fine! Is there something I can do?...

something wrong with pytorch2.4.0 image's jupyter

Most of the pods I created today using the pytorch2.4.0 template couldn't open JupyterLab, while 2.2.0 was fine. I wonder whether the Docker image was updated.

4 x A40 never ready in CA

I created a 4 x A40 pod today in CA, but the pod never reaches the ready state: no logs, no connection...

Unable to connect to pod after launching H100s

This seems to be happening consistently today. Every time we launch an H100 GPU

Pod image for network storage management

Hi, I use Runpod mainly for serverless ComfyUI, with network storage hosting media and models. To manage the network storage, I assumed there is no other way than to use a Pod as a file manager. Maybe there are other solutions? ...

storage full error, disk write error

I am trying to unzip 1200 files of 1 GB each; each zip contains around 100,000 images. Unzipping takes a long time, and after a while I get this error: 3/SynthImage/test/815/a55815_11_0_275.jpg: write error (disk full?). Continue? (y/n/^C) even though the disk is 2048 GB. If I continue, it runs for some time and then hits the same error....
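When a write fails with "disk full?" on a volume that should have room, one thing worth checking is which filesystem the target directory is actually on, and how much space and how many inodes it has left. The sketch below uses only the Python standard library and assumes a Linux pod. The idea that inode exhaustion or a smaller disk is involved is an assumption to test, not something the post confirms; the function name and the path are placeholders.

```python
import os
import shutil


def report_free_space(path: str) -> dict:
    """Return free space and free inode counts for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    stats = os.statvfs(path)         # Unix only: includes inode counters
    return {
        "total_gb": usage.total / 1e9,
        "free_gb": usage.free / 1e9,
        "total_inodes": stats.f_files,
        "free_inodes": stats.f_ffree,
    }


# Placeholder path: point this at the directory the archives are extracted into.
target = "."
for key, value in report_free_space(target).items():
    print(f"{key}: {value:,.1f}" if isinstance(value, float) else f"{key}: {value:,}")
```

With roughly 1200 archives of about 100,000 images each, the extraction creates on the order of 120 million files, so the inode numbers are worth a look alongside the byte counts.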

Question about service rate limits, etc.

Can the runpod.io service meet these needs? I would like to describe our usage scenario. Specifically, we are looking to provide a public network service, with initial users estimated at around 2,000 to 10,000 (about 2,000 to 10,000 teachers from 30,000 middle schools). If each user makes about 10 requests per day, that would result in approximately 20,000 to 100,000 requests per day. In this case, is there a possibility that runpod.io's rate limiting or circuit breaker would be triggered? Is it possible to co...
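For a rough sense of scale, the figures in the question can be turned into an average request rate. This is only a back-of-the-envelope sketch: it assumes requests are spread evenly over 24 hours, which real traffic will not be, and it says nothing about Runpod's actual limits. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_requests_per_second(users: int, uses_per_user_per_day: int) -> float:
    """Average request rate, assuming traffic is spread evenly over a day."""
    return users * uses_per_user_per_day / SECONDS_PER_DAY


# The two ends of the user estimate from the question, at 10 uses per user per day.
for users in (2_000, 10_000):
    daily = users * 10
    rate = average_requests_per_second(users, 10)
    print(f"{users} users -> {daily} requests/day, about {rate:.2f} requests/s on average")
```

Peak load will likely be well above this average if most use falls within school hours, so any capacity discussion should be based on the expected peak, not the daily mean.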

Why is there still a daily charge after purchasing pod A40-48G with a one-time payment?

I purchased a GPU A40 *1 48G pod in Secure Cloud mode on February 17, Volume Disk: 60G Container Disk: 30G ...
Solution:
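Later, with help from the staff and after verifying it, I learned that the bill simply shows the cost for each day; there is no duplicate charge.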

error on gpu causing damages

Could I request a refund for this GPU? CUDA is not working, and the experiment I ran is now broken and unusable.

Network problem

Requests for various resources are failing. The following commands return a 403 Forbidden error: - wget https://go.dev/dl/go1.23.6.linux-amd64.tar.gz - go get github.com/aws/aws-sdk-go ...
Solution:
Answer from support: The IS-1 region is currently having issues. If possible, I would recommend creating the pod in another region.

Official template vllm-latest is broken

Hi everyone, I'm trying to deploy vLLM Pod using the official vllm-latest template, but I get the following error: Traceback (most recent call last): File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap self.run()...

Downloading models causes the pod to freeze

Hey, not sure if I'm missing something obvious here. I'm noticing two problems (might have the same cause): 1. I'm trying to download phi-4 14b from HuggingFace. ...

Getting 403 forbidden error on multiple pods

When I try downloading an image from a URL via Python or curl, I get a 403 Forbidden error. The same happens when I try calling the Gemini API. Is this happening because RunPod servers are being blocked? Attaching a screenshot of the response I get in Python. Locally it works fine....

Share data between two runpods using network volume

Sorry if this is a dumb question, but I have two pods and a single network volume. I want to share data between them but they appear to mount separate filesystems on the network volume. Is there a way to have them mount the same filesystem?

runpodctl create pod error

I'm trying to create a pod via runpodctl but am getting the following error. Am I doing something wrong? I'm able to create this pod via the UI without issues. $ runpodctl create pod --templateId zfbl0v84bw --imageName jmparejaz/batchinf:latest --gpuType "NVIDIA H100 PCIe" Error: There are no longer any instances available with the requested specifications. Please refresh and try again....

Actual internet speed much lower than listed internet speed

Hi, I have an 8x3090 pod that is listed with 437 Mbps download and 518 Mbps upload. Running a speed test (sivel/speedtest-cli/) in the web terminal, I got download and upload speeds of 10.43 Mbit/s and 7.19 Mbit/s. When downloading a model from the Hugging Face Hub, I get speeds between 2 and 7 MB/s. Is there a problem on my end here? What can I do to get faster, more tolerable speeds?...
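Note that the figures above mix units: the listed speed and the speed test are in megabits per second, while the download speed is in megabytes per second. A small conversion sketch, assuming 1 byte = 8 bits and decimal prefixes (the helper name is hypothetical), puts them on the same scale:

```python
def megabytes_to_megabits_per_second(mb_per_s: float) -> float:
    """Convert a transfer rate from MB/s to Mbit/s (1 byte = 8 bits)."""
    return mb_per_s * 8


listed_down_mbit = 437  # download speed listed for the pod, from the post

# The observed download range from the post: 2 to 7 MB/s.
for mb_per_s in (2, 7):
    mbit = megabytes_to_megabits_per_second(mb_per_s)
    share = mbit / listed_down_mbit
    print(f"{mb_per_s} MB/s = {mbit:.0f} Mbit/s ({share:.0%} of the listed {listed_down_mbit} Mbit/s)")
```

So 2 to 7 MB/s corresponds to 16 to 56 Mbit/s, which is still well below the listed figure; the unit difference explains only part of the gap.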

Cannot restart pod

The pod took too long to boot up, about 30 minutes, and I was still charged for it. It's ridiculous! Now it still won't boot up after rebooting. This is my pod ID: z1q1abvlmyhpbn Please help!...

ollama: when I try to install ollama with the command

curl -fsSL https://ollama.com/install.sh | sh I get the error message: curl: (22) The requested URL returned error: 403...

Can I get the account balance from the API?

I want to monitor the account balance (e.g. get the balance in real time and be notified). Is there an API to get this data? 😮