Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

My pod is taking forever to download the image

1 x RTX 2000 Ada, 6 vCPU, 31 GB RAM. Image size is around 18 GB...
Solution:
@jojje The Runpod team suggested keeping images in a Docker registry instead of GitHub.

Pods stuck on “Waiting for logs”

Hi, not only one of my pods (cpu5c-2-4) but 2 new pods I’m spinning up are stuck on “waiting for logs…”. It’s been like this for many hours, I’ve tried restarting them and also creating new pods but to no avail. All of the pods are in different locations. Any help would be appreciated as this is extremely time sensitive

I am having issues with running jupyter lab on my pod, it was running before but just got disconnect

I am having issues running Jupyter Lab on my pod. It was running before but just got disconnected, and now I am seeing this message: "Not ready. Make sure your service is running."

Container Registry Auth not working for private docker images

Hi guys, I created a key from Docker Hub and added it to the Runpod setting "Container Registry Auth". I chose a random credential name, used my Docker Hub username as the username of the credential, and used the generated key as the password. ...
Solution:
Thanks a lot for the help, it works now! Probably because I accidentally included a blank space when pasting the credential into Runpod.
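A stray space or newline like the one described in this solution is easy to miss in a password field. The following is a minimal sketch, not part of the thread, of how a pasted credential could be checked locally before it is saved; the function names and the sample token are hypothetical.

```python
def has_stray_whitespace(value: str) -> bool:
    """Return True if the value starts or ends with whitespace (spaces, tabs, newlines)."""
    return value != value.strip()


def clean_credential(value: str) -> str:
    """Remove leading and trailing whitespace that copy-paste often adds."""
    return value.strip()


if __name__ == "__main__":
    pasted = "example-token-123 \n"  # hypothetical value with a trailing space and newline
    if has_stray_whitespace(pasted):
        print("Credential has leading/trailing whitespace; cleaning it.")
    print(repr(clean_credential(pasted)))
```

Printing with `repr` makes invisible characters visible, which is usually enough to spot the problem.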

Model Maximum Context Length Error

Hi there, I run an AI chat site (https://www.hammerai.com). I was previously using vLLM serverless, but switched over to using dedicated Pods with the vLLM template (Container Image: vllm/vllm-openai:latest). Here is my configuration:
--host 0.0.0.0 --port 8000 --model LoneStriker/Fimbulvetr-11B-v2-AWQ --enforce-eager --gpu-memory-utilization 0.95 --api-key foo --max-model-len 4096 --max-seq-len-to-capture 4096 --trust-remote-code --chat-template "{{ (messages|selectattr('role', 'equalto', 'system')|list|last).content|trim if (messages|selectattr('role', 'equalto', 'system')|list) else '' }} {% for message in messages %} {% if message['role'] == 'user' %} ### Instruction: {{ message['content']|trim -}} {% if not loop.last %} {% endif %} {% elif message['role'] == 'assistant' %} ### Response: {{ message['content']|trim -}} {% if not loop.last %} {% endif %} {% elif message['role'] == 'user_context' %} ### Input: {{ message['content']|trim -}} {% if not loop.last %} {% endif %} {% endif %} {% endfor %} {% if add_generation_prompt and messages[-1]['role'] != 'assistant' %} ### Response: {% endif %}"
I then call it with:...
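The request itself is cut off in this post, so the exact cause is not known. As a rough sketch under that caveat: with `--max-model-len 4096`, the prompt tokens plus the requested completion tokens generally have to fit inside 4096. The helper below is hypothetical and only illustrates that budget arithmetic; the prompt token count is assumed to come from the model's own tokenizer.

```python
MAX_MODEL_LEN = 4096  # matches --max-model-len in the configuration above


def max_completion_tokens(prompt_tokens: int, max_model_len: int = MAX_MODEL_LEN) -> int:
    """Return how many completion tokens still fit after the prompt (never negative)."""
    return max(0, max_model_len - prompt_tokens)


def fits_in_context(prompt_tokens: int, requested_tokens: int,
                    max_model_len: int = MAX_MODEL_LEN) -> bool:
    """Check whether prompt plus requested completion stays within the context window."""
    return prompt_tokens + requested_tokens <= max_model_len


if __name__ == "__main__":
    # Hypothetical request: a 3900-token chat history asking for 512 new tokens.
    print(fits_in_context(3900, 512))
    print(max_completion_tokens(3900))
```

If the check fails, the usual options are trimming older chat messages or lowering the requested completion length.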

Runpod storage configuration

Hello, I am working with a dataset of 4 million files. Normally it takes up about 23 GB on my PC, but on Runpod it is about 10-15x that size. The reason I have found is that Runpod uses 100 KB per file. My files are each around 6-10 KB, but the Ubuntu system the pod runs on is configured to allocate a minimum of 100 KB per file. I don't have permission within the pod to change that. What can I do about this?
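To make the overhead concrete, here is a small arithmetic sketch using the figures from the post. It assumes that every file is rounded up to a whole number of allocation units and that "100 KB" means 100,000 bytes; the function name is hypothetical and the real storage backend may behave differently.

```python
import math


def allocated_bytes(file_size: int, allocation_unit: int) -> int:
    """Round a file size up to a whole number of allocation units."""
    if file_size == 0:
        return 0
    return math.ceil(file_size / allocation_unit) * allocation_unit


if __name__ == "__main__":
    num_files = 4_000_000        # from the post
    typical_file_size = 8_000    # assumed midpoint of the 6-10 KB range
    allocation_unit = 100_000    # assumed meaning of "100 KB minimum"

    apparent = num_files * typical_file_size
    on_disk = num_files * allocated_bytes(typical_file_size, allocation_unit)
    print(f"apparent size: {apparent / 1e9:.0f} GB")
    print(f"allocated size: {on_disk / 1e9:.0f} GB")  # 400_000_000_000 bytes, i.e. 400 GB
```

Under these assumptions the allocated size is set entirely by the file count, not by the data volume, which lands in the same order of magnitude as the blow-up reported above.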

Fine Tuned Whisper V3 Large Turbo Configuration

Hi, I have a fine-tuned version of Whisper V3 Large Turbo on Hugging Face. I've successfully tested it on Google Colab, and everything works as expected. However, I've run into trouble with the deployment and am unsure of the easiest way to manage this process. I attempted to use the Docker image for the Faster Whisper implementation but ran into issues when loading my model in the ct2 format. At this point, I'm fine with not using the faster version. I just want a straightforward way to deploy the model, either as a Pod or a serverless endpoint. Does anyone have suggestions or know of a clear step-by-step tutorial to achieve this? I couldn't find any resources that explain the process clearly. 🤔...

AWS ECR Image run on Pod RTX 2000 Ada

Hi, I have a question about running Docker on a pod: how do I install Docker on an L40 pod and pull a Docker image from AWS ECR?...

Deploy custom private docker image

Hey guys. Is it possible for me to deploy my custom Docker image without making it public? (That would be a show stopper for my project.) I've seen the docs about Bazel, but it seems to create only public repositories in your Docker Hub. I've also seen the credentials option, but it asks for a login and password, and I don't really feel comfortable sharing those for my Docker account when API keys exist for that.

Reset a Pod with CLI or API

How can I reset my Pod using the CLI tool or API? I see the option in the UI but not in the APIs. My use case: recreate the Pod (without losing the GPU) with the latest Docker image....

Persistent files in Network Storage

I spun up a network volume. I've now added a GPU. I'm working on ComfyUI. When I download models, some of them are large and I'd like them to be persistent and would rather not have to download them each time. Also, obviously I'd like my images and videos to be saved persistently. Also, I'm going to make sure I'm using the right folders, etc. I noticed there's a ComfyUI folder in root (/), and also one in /workspace. Is /workspace temporary and wiped every time I attach a new GPU? Or its o...
Solution:
No, it's the opposite: /workspace is usually the persistent disk. Check your pod settings to see where the volume disk or network storage is mounted.
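One way to confirm this from inside the pod is to check whether a directory is a mount point. This is a minimal sketch using only the Python standard library; `/workspace` is the path mentioned in the thread, and the helper name is hypothetical. A mount point is a strong hint, not proof, that the path lives on the volume disk or network storage.

```python
import os


def describe_path(path: str) -> str:
    """Classify a path as missing, a mount point, or an ordinary directory/file."""
    if not os.path.exists(path):
        return "missing"
    if os.path.ismount(path):
        return "mount point"
    return "not a mount point"


if __name__ == "__main__":
    # /workspace is the directory discussed above; adjust if your pod mounts the volume elsewhere.
    for candidate in ("/workspace", "/ComfyUI"):
        print(candidate, "->", describe_path(candidate))
```

Files saved under a path that is not on the mounted volume should be assumed temporary until the pod settings say otherwise.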

Link broken in Runpod.io Github tutorial

I'm trying to use the instructions from the Runpod blog to deploy a repo to RunPod with GitHub integration. This is the blog post: https://blog.runpod.io/deploy-github-repos-straight-to-runpod-with-github-integration-2/ On the page to "Link Your GitHub Account" the button to Link GitHub is broken, leading to a 404 page not found: https://github.com/apps/runpod-inc/installations/new/permissions?target_id=157309625 ...

pods just keep stopping without any reason why when downloading?

They just keep pausing and then nothing happens. Why?

Pod execution stopping without errors

I've been having issues since I started using a Pod yesterday. The execution of the finetuning script inside the pod stops abruptly and randomly, with no errors or anything else in the logs. Every time this happens I am wasting money, and I can't afford to watch it 24/7 to make sure it's running. It happens every few hours. What could be happening?

H100 + 20 TB of storage

Hi guys, I'm looking to get a pod with the above-mentioned spec and high-bandwidth internet. I see no option to increase the storage. Let me know how I can make use of it @Madiator2011 @Finley

How to set Savings Plan on existing pod?

AI helper said: (1) Go to your Pod dashboard. (2) Look for the "More Actions" or similar option (possibly a hamburger icon in the lower left). (3) Select "Edit Pod". (4) Look for a section related to Savings Plans. But there's nothing like that :/...

Can't upload .env

I'm using pods to run ComfyUI. I need to upload a .env file to the root folder of one of the nodes, but it seems to be restricted. Is there any way I can upload the .env file?

Enable Global Networking through API

Is it currently possible to enable Global Networking on a pod through the API/SDK? I'm not seeing anything about it in the pod definition here, but I'm not sure if this is the right place to be looking: https://graphql-spec.runpod.io/#definition-Pod

Pod / GPU stopped working

I've had a pod running continuously for over a week. Today it just stopped working. Everything on the surface looks OK, but Ollama (the tool I'm using) won't run any model now. The error says there are not enough resources. It looks like the GPU is not working. I've tried reinstalling Ollama, restarting the pod and terminating all processes. Nothing works. ...

Slow performance on EU-SE-1 pods with network storage

Performance for the last two days on pods deployed in the EU has been quite bad. I mostly use ComfyUI for image generation, and I noticed it takes three times as long to execute my workflows compared to before. In some cases the generation just flat out fails. I tried using A40 and A5000 GPUs with the same results. Transferring files from my PC to pods also seems quite slow (using runpodctl). None of this was an issue 2 days ago...
Solution:
Seems performance is back to normal, marking as solved.