Runpod


We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!


⚡|serverless

⛅|pods

🔧|api-opensource

📡|instant-clusters

🗂|hub

Setting a time limit for use

Hi. I want to know if there's a way I can set a time limit on my pod so it shuts down after, say, two hours and doesn't start up again for another 12. One of the problems I've had is that I end up using Stable Diffusion for too long. So if there's a quick way to set a time limit, let me know.
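One workaround is to schedule the stop from inside the pod itself. A minimal sketch, assuming runpodctl is available in the container (it ships with the official templates) and the RUNPOD_POD_ID environment variable is set:

```
# Stop this pod automatically after two hours (run inside the pod).
nohup bash -c 'sleep 2h && runpodctl stop pod "$RUNPOD_POD_ID"' >/tmp/autostop.log 2>&1 &
```

A stopped pod doesn't start again on its own, so this only covers the shutdown half; the "don't start again for 12 hours" part would have to stay a manual rule.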

I can't figure out how to make fine-tuning work

Hello everyone, I'm seeking some expert guidance with a fine-tuning project and I'm hoping someone can help. I have credits on this platform that I loaded specifically for fine-tuning a large language model. Unfortunately, I've had a lot of trouble getting the process to work correctly myself, and these credits are currently going unused....

How to start with a network volume through the GraphQL API

```
curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${YOUR_API_KEY}" \
  --data '{"query": "mutation { podFindAndDeployOnDemand( input: { cloudType: ALL, gpuCount: 1, volumeInGb: 40, containerDiskInGb: 40, minVcpuCount: 2, minMemoryInGb: 15, gpuTypeId: \"NVIDIA RTX A6000\", name: \"Runpod Tensorflow\", imageName: \"runpod/tensorflow\", dockerArgs: \"\", ports: \"8888/http\", volumeMountPath: \"/workspace\", env: [{ key: \"JUPYTER_PASSWORD\", value: \"rn51hunbpgtltcpac3ol\" }] } ) { id imageName env machineId machine { podHostId } } }"}'
```
...
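The example above only creates a regular pod volume. To attach an existing network volume, the input should accept a networkVolumeId field instead; that field name, the Secure Cloud requirement, and YOUR_NETWORK_VOLUME_ID below are assumptions to verify against the current API reference. A hedged sketch of the same call:

```
# Hedged sketch: same mutation, but attaching an existing network volume instead of a
# pod volume. Deploy in the volume's data center; placeholders as noted above.
curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${YOUR_API_KEY}" \
  --data '{"query": "mutation { podFindAndDeployOnDemand( input: { cloudType: SECURE, gpuCount: 1, containerDiskInGb: 40, minVcpuCount: 2, minMemoryInGb: 15, gpuTypeId: \"NVIDIA RTX A6000\", name: \"Runpod Tensorflow\", imageName: \"runpod/tensorflow\", ports: \"8888/http\", volumeMountPath: \"/workspace\", networkVolumeId: \"YOUR_NETWORK_VOLUME_ID\" } ) { id machineId machine { podHostId } } }"}'
```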

api.runpod.io is down?

curl: (6) Could not resolve host: api.runpod.io
The GraphQL endpoint is not working. Please resolve...

Making a pod into a Docker image

Hi, I am new to Runpod. Can anyone tell me if I can package my pod, e.g. as a tarball file? Since it is a running container, can I make a Docker image of my pod?
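Pods are themselves containers, so there is normally no Docker daemon inside them to docker commit or docker export from. The usual route is to rebuild your setup as an image on a machine where you can run Docker, push it to a registry, and use that as a custom template, keeping data on a volume. A rough sketch, where the base image tag, package list, and image name are all placeholders for whatever your pod actually uses:

```
# Hedged sketch, run on your own machine (not inside the pod).
cat > Dockerfile <<'EOF'
FROM runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04
RUN pip install --no-cache-dir your-extra-packages
COPY your_scripts/ /workspace/your_scripts/
EOF
docker build -t yourdockerhubuser/my-runpod-image:latest .
docker push yourdockerhubuser/my-runpod-image:latest
```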

Can't update, can't add LoRAs to Power Lora Loader, and can't get text and CLIP files to show up

I was just getting set up on Runpod, and now everything looks different. The model library and node library menus on the left are gone. When I try to "update ComfyUI" I get a message that it "Failed to update ComfyUI". I also have the Power Lora Loader node, but it doesn't currently let me add or change the LoRAs in it like it did in ComfyUI on my local PC. I feel like nothing is working and would love some help if anyone has a few minutes.

Can't get alleninstituteforai/olmocr running on any recommended pod.

Hello! A bit new here - I'm trying to run alleninstituteforai/olmocr:latest. Requirements are:
"Recent NVIDIA GPU (tested on RTX 4090, L40S, A100, H100) with at least 15 GB of GPU RAM; 30 GB of free disk space"...

Jupyter does not load on H200 SXM

It's a joke. I always lose my money and time because of these mistakes; it's unacceptable.

Is Runpod slower in the European evening / American midday?

It's been 2 days now where I've run my pods during the whole day, and at the end of the day (I'm in Germany) it's always super slow. I run ComfyUI on it. KSampler and loading the diffusion models take forever, even on an RTX 4090 or 5090. Restarting ComfyUI is hit or miss; sometimes it works, sometimes it just never starts again and I need to restart the whole pod. My network volume is on US-IL-1. Is this normal? Is this because more people are online at this time?...

What's the easiest way to get a qwen-image pod?

I can bring up ComfyUI, but I haven't found a workflow I can just import that doesn't depend on other custom nodes and create a dependency mess I don't know how to clean up. I don't have a lot of experience with ComfyUI.

How to install ollama (and download models) into /workspace?

I've got the pod working as expected, but the ollama install.sh script installs to a default directory (/usr/local) that I cannot figure out how to change. This is of course not useful for Runpod, because this space gets reset every time the pod restarts. How do I install ollama and store models in /workspace so that they persist?
Solution:
If you just want to run LLMs, check out https://get.runpod.io/koboldcpp for a more optimized way to run LLMs on Runpod.
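If you do want ollama itself, the binary under /usr/local is small and quick to reinstall on each start; the part worth persisting is the model store, and ollama's documented OLLAMA_MODELS variable lets you point that at the volume. A minimal sketch (the directory layout and the llama3 model are just examples):

```
# Re-run on each pod start; the container disk (and /usr/local) is wiped on restart.
curl -fsSL https://ollama.com/install.sh | sh

# Keep the large files -- the models -- on the persistent volume.
mkdir -p /workspace/ollama/models
export OLLAMA_MODELS=/workspace/ollama/models
ollama serve &
sleep 5              # give the server a moment to come up
ollama pull llama3   # model files now land under /workspace and survive restarts
```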

Timeout occurs on pod

I need immediate help. I run a custom template pod on Runpod. It works well at the start, but when I test it in the morning when I come to the office, it shows a "timeout occurred" error. This has been happening for the last two days. I am using Secure Cloud pods, not Community Cloud, even though it happened with Community Cloud too.

Can't start a pod anymore with graphql mutation without a networkVolumeId

When I run this mutation
```
mutation podFindAndDeployOnDemand($input: PodFindAndDeployOnDemandInput) {
  podFindAndDeployOnDemand(input: $input) {
    id
...
```

Instant Cluster DDP config not working

I created an instant cluster with a couple of nodes, but torch DDP isn't working - it seems like the nodes can't talk to each other. The documentation says that instant cluster pods have the relevant env vars created by default, and ports open, which doesn't seem to be true. I checked via SSH sessions.
```
root@node-1:~# env
SHELL=/bin/bash
...
```
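For a quick test that is independent of whatever env vars the cluster does or doesn't set, a manual torchrun launch on every node is often the easiest check. A sketch, where MASTER_IP (node 0's private IP), the node count, and train.py are placeholders:

```
# Hedged sketch: run on every node, bumping --node_rank per node (0, 1, ...).
torchrun \
  --nnodes=2 \
  --nproc_per_node="$(nvidia-smi -L | wc -l)" \
  --node_rank=0 \
  --master_addr=MASTER_IP \
  --master_port=29500 \
  train.py
```

If this hangs at rendezvous, the nodes genuinely can't reach each other on that port, which points at the networking rather than the DDP config.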

How to transfer 10-15 images from local to a specific folder in a network volume

Hi, I'm a beginner at programming, so my question could be messy. I'm trying to transfer 10-15 images to a network volume when I deploy a pod with an API request (I'm using a data center that doesn't support the S3 API).
[What I made so far]
1. Python code that makes the API request with a variable (e.g. ORD_183)...
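Without the S3 API, the usual fallback is to copy through the pod itself once it's up, either with runpodctl send / runpodctl receive or with scp over the pod's SSH connection. A hedged scp sketch, where the port, key, IP, and target folder are placeholders taken from the pod's connect details:

```
# Copy local images into the mounted network volume over the pod's SSH.
# 12345, PUBLIC_IP, and the ORD_183 folder are placeholders.
scp -P 12345 -i ~/.ssh/id_ed25519 ./images/*.png \
    root@PUBLIC_IP:/workspace/ORD_183/
```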

Migrating Network Disk Region

Hi, I realized after setting everything up on my network volume that it was in a region without any H200 SXM availability. So now I would like to migrate it to another region. Pulling through S3 using a CPU pod seems like a good approach, but I can't find any network drive region that's compatible with both S3 and H200s. Am I missing something?...
Solution:
It doesn't matter that the target isn't S3-compatible for pulling.
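In other words, the S3 API only needs to exist on the source side: attach the new volume to a CPU pod in the target region and pull into it. A hedged sketch, assuming the source data center exposes an S3-compatible endpoint; the endpoint URL format, the bucket-equals-volume-ID convention, and the key variables below are assumptions to check against the Runpod S3 docs:

```
# Run on a CPU pod in the *target* region with the new network volume mounted at /workspace.
aws configure set aws_access_key_id "$RUNPOD_S3_ACCESS_KEY"
aws configure set aws_secret_access_key "$RUNPOD_S3_SECRET_KEY"
aws s3 sync s3://YOUR_SOURCE_VOLUME_ID/ /workspace/ \
  --endpoint-url https://s3api-eu-ro-1.runpod.io
```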

Dev permission access in a team setting.

In a team setting with dev permissions, I should be able to create serverless pods with Docker Hub credentials and also create network volumes and add files to them, right? I have dev permissions but am unable to do this. This doesn't really make sense to me, but if it is intended I'd just like to know....

GPUs are unavailable on pod.

Hi guys, I've set up a 4xH100 instance (default settings, for the most part). When the pod is instantiated, the GPUs are not available within it (I have a script to validate that). Here's the pod ID: c6ghnnsno6fkvu. I'll keep it for a day to let you check it exactly. Here's my script output:...
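For anyone hitting the same thing, a quick sanity check from inside the pod (the torch line assumes a PyTorch image):

```
nvidia-smi -L                                                 # should list four GPUs on a 4xH100 pod
python -c "import torch; print(torch.cuda.device_count())"    # should print 4
```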