RunPod



We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!


⚡|serverless

⛅|pods-clusters

How to set environment variable when launching pod with network volume

I am launching a pod with ashleykza's automatic1111 template using a network volume, but it starts redownloading everything even though it's already on my network volume. She provided an environment variable to skip syncing, which I thought I had set when editing the template overrides, as shown in the second pic. Despite this, it's still redownloading everything. What am I supposed to set 'key' to here to prevent it from redownloading everything?

'Background' options for Pod Initiated file transfer

I'm trying to scope out whether there's a way to have a RunPod pod send me back a small .db/.txt file on completion of a task, or a progress file before the pod is closed due to being outbid (Community pods). I've been looking at rsync, runpodctl, and SSH, and they all seem to require the transfer to be initiated from the recipient machine. I'm looking at the Google Drive API, which I think is going to be my best bet for an 'always ready to receive' solution. ...
Solution:
You might need something like this, detect the signal and do something: import signal import boto3 import os...
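
A minimal sketch of the idea in the reply above: catch a termination signal and push the small file out before the process exits. It assumes the job actually receives SIGTERM before the pod is stopped, which this thread does not confirm. `make_shutdown_handler`, `install_shutdown_hook` and `copy_to_backup` are hypothetical names; the `upload` callback could wrap a boto3 or Google Drive client, and the default here only copies the file locally so the example runs without credentials.

```python
import shutil
import signal
import sys
from pathlib import Path


def copy_to_backup(src: Path) -> Path:
    """Stand-in for a real upload: copy the file next to itself as <name>.bak."""
    dst = src.with_name(src.name + ".bak")
    shutil.copy2(src, dst)
    return dst


def make_shutdown_handler(path, upload=copy_to_backup):
    """Build a signal handler that uploads `path` (if it exists) and exits."""
    path = Path(path)

    def handler(signum, frame):
        if path.exists():
            upload(path)
        sys.exit(0)

    return handler


def install_shutdown_hook(path, upload=copy_to_backup, signals=(signal.SIGTERM,)):
    """Register the handler for each signal; call this from the main thread."""
    handler = make_shutdown_handler(path, upload)
    for sig in signals:
        signal.signal(sig, handler)
    return handler
```

Calling `install_shutdown_hook("progress.db")` near the start of the job would be enough to wire it up. The upload has to finish within whatever grace period the host allows, so this only suits small files.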

No such image

I just created an image, pushed it to docker.io and created a Pod template referencing this image. However, startup fails with: Error response from daemon: No such image: $IMAGENAME. I can pull the image locally from my machine without being logged in to docker.io. Why is my Pod not able to pull the image?...
Solution:
Yep, solved. Building the image with docker buildx build --platform linux/amd64 helped. Not a Runpod issue at all.

network volume usage on pod deploy

I created a community pod with 40 GB volume storage. By default it started with 59% usage. I tried deploying another pod and the same thing happened. This is in the US region.
Solution:
If it's really empty but shows as used, you can report it via the contact button on the website. @mathew

Is it possible to use Runpod to finetune a text to speech model

I am not super tech savvy, so I am unsure if this is possible. The TTS is https://github.com/erew123/alltalk_tts. I know how to connect to RunPod via SSH, but I don't know how connecting the two would work, if it's possible at all.

Predicting <username> in the SSH over TCP command - trying to automate pulling a repo at pod deploy

I want to pull a git repo into the workspace of a pod as it is deployed. I am trying to SSH into a pod without accessing the GUI. I know the command has the typical form ssh <username>@<runpodproxy> -i (path to ssh). I do not know how <username> is generated. I can tell that the <username> is <[podID]-[string]>. Does anyone know what the [string] is? Is it predictable or otherwise associated with the pod? I am also looking into the runpodctl exec python [file] [pod id] command. Any suggestions would be appreciated....

text gen webui template not downloading models

When I try downloading a model in text gen web UI, nothing happens.

Error response from daemon: driver failed external connectivity on endpoint.

Suddenly I am getting the error below when I try to docker compose up. Docker was working fine on the pod. I just made some code changes and rebuilt it, and now I am getting the errors below: Gracefully stopping... (press Ctrl+C again to force) Error response from daemon: driver failed programming external connectivity on endpoint mia-runpod-backend-engine-1 (f4a69cb1cbf0100d22af23c3d5dc5a09aeeac3425476d4bc8bfbf886e42a77f1): Unable to enable MASQUERADE rule: (iptables failed: iptables --wait -t nat -A POSTROUTING -p tcp -s 172.19.0.4 -d 172.19.0.4 --dport 8000 -j MASQUERADE: /usr/sbin/iptables: error while loading shared libraries: libip4tc.so.2: cannot close file descriptor: Error 24 (exit status 127))...

Updated Torch templates

Hi RunPod team. I am writing again because the templates on RunPod are still out of date. We are lacking a torch 2.3 template for ROCm and CUDA. Torch 2.4 is being released tomorrow as well.

Persistent home directory?

Hi, I wonder if there is a way to persist the home directory. It's really inconvenient to lose all configurations after every reboot...

Getting Available Values for Stable Diffusion API Parameters.

I am trying to figure out where to get the official keys and values for the API parameters of the FastAPI Stable Diffusion setup. For instance, the Sampling Method has a drop-down with values that you can clearly see and use in the Automatic1111 WebUI, but how do I know what the official parameter is, and the exact values available, so that a request sent via the API carries the right value? The Swagger example provides some of this, but clearly not all, and it certainly does not list every available value. Specifically, I am looking for the official parameter for selecting the Stable Diffusion checkpoint, and for how to get the specific values available for the Sampling Method and Schedule type. In the Automatic1111 WebUI screenshot, are these the official checkpoint model names to use, including the string in the brackets after the name? Selecting "/openapi.json" (screenshot; a small link at the very top of the API documentation page) gives a HUGE JSON file which you can search through for clues to what the parameter values are. Is it good practice to find this information this way?...
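
One way to search the /openapi.json file mentioned above is to walk it programmatically rather than by eye. The sketch below assumes the file follows the usual OpenAPI 3 layout, with request models under components.schemas. The schema name `ExampleRequest` and the sample values are made up for illustration and are not taken from the Automatic1111 API.

```python
import json

# Tiny hypothetical document with the same shape as an OpenAPI 3 file.
SAMPLE_SPEC = json.loads("""
{
  "components": {
    "schemas": {
      "ExampleRequest": {
        "properties": {
          "sampler_name": {"type": "string", "default": "Euler"},
          "steps": {"type": "integer", "default": 20},
          "mode": {"$ref": "#/components/schemas/Mode"}
        }
      },
      "Mode": {"type": "string", "enum": ["fast", "slow"]}
    }
  }
}
""")


def resolve_ref(spec, ref):
    """Follow a local reference such as '#/components/schemas/Mode'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are supported: {ref}")
    node = spec
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def list_parameters(spec, schema_name):
    """Return {parameter: {type, default, enum}} for one request schema."""
    schema = spec["components"]["schemas"][schema_name]
    params = {}
    for name, prop in schema.get("properties", {}).items():
        if "$ref" in prop:
            prop = resolve_ref(spec, prop["$ref"])
        params[name] = {
            "type": prop.get("type"),
            "default": prop.get("default"),
            "enum": prop.get("enum"),
        }
    return params
```

For a real file, `list_parameters(json.load(open("openapi.json")), "<schema name>")` would list the parameter names and defaults. Free-form string fields often carry no `enum`, in which case the schema gives the parameter name but not the allowed values.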

Which template to use?

Is there any advantage to using ashleykza's A1111 Stable Diffusion 1.9.4 over the official RunPod template (runpod/stable-diffusion:web-ui-10.2.1)?

A100 OneTrainer stuck in downloading loop

Hi there, I am trying out the OneTrainer template and it seems to be broken for A100 GPUs. After everything has been downloaded, it seems to go back to the "Extracting" part and starts downloading again, like an infinite loop. This does not happen with an RTX 4090, for example...

OneTrainer

Hi there, I am trying out the OneTrainer template, but how do I see the actual GUI, since there is no web app? Is there some VNC connection?

Locking down web accessible items

So I have a stupid question; I'm still trying to understand how all of this works. Whenever I deploy a pod for Text Generation Web UI and APIs, I get a list of services that are available via HTTP from the get-go. I already added a key so the web interface is locked, but I still need to lock down the file upload, Visual Studio Code, and Jupyter notebook. Does anyone have a link where I can read up on how to do this?...

How to fetch more than 8 gpus on RunPod (2 nodes)

Hi, RunPod usually provides a maximum of 8 GPUs. How can I get more than 8 GPUs?

Environment Variables are not set in "SSH over exposed TCP" for the root user

The template env variables are not available in the "SSH over exposed TCP" connection, which logs in as the root user. I am wondering whether this is because they are only set for the "Basic SSH" user?

Environment variables missing

Hello, I am creating a pod with environment variables, but it doesn't seem to work. When I connect via SSH and echo $ENV_VAR_NAME, it prints nothing. Am I missing something? Also, printenv shows neither the default environment variables from RunPod nor my added environment vars. I am using my own template, but the Docker image is built on top of an official RunPod image....
Solution:
It's Docker, yeah. Environment in Linux is per user, I guess, so when you log in using SSH, your envs won't be there because the Docker container starts your application as a different user.
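
A possible workaround, sketched under the assumption that the variables were set on the container's main process (PID 1) and that the SSH user is allowed to read /proc/1/environ (root normally is). This is Linux-only; `parse_environ` and `load_container_env` are hypothetical helper names, not part of any RunPod tooling.

```python
import os


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE format of /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skip the empty trailing entry or malformed items
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def load_container_env(pid: int = 1, overwrite: bool = False) -> dict:
    """Copy another process's environment variables into os.environ."""
    with open(f"/proc/{pid}/environ", "rb") as fh:
        env = parse_environ(fh.read())
    for key, value in env.items():
        if overwrite or key not in os.environ:
            os.environ[key] = value
    return env
```

Calling `load_container_env()` at the top of a script started from the SSH session would make the container's variables visible to that script without changing the shell's own environment.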

GPU Memory Used Issue

Can anyone please explain why I am using 93% of my memory while running nothing? I imported my ComfyUI workflow but am not actively running it. Is there any way I can reduce it?...

Avast Antivirus detects Runpod as a Trojan Virus

Hello. For some reason my antivirus detects api.runpod.io as a Trojan horse, which renders the service unusable. The alert pops up after selecting a GPU to use, then clicking on the "Change Template" button to select a Pod Template....