RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Flynn

8/5/2024

LoRAs aren't showing

So, I have LoRAs, both private and from Civitai. I've clicked "Copy Link" and pasted the links after the "wget" command in the respective LoRA folder. It claims they have been saved, and they show up on the left-hand side where the folders are, but they don't show up in the web UI.
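One thing worth ruling out (a guess, not something confirmed in this thread): files fetched from a copied link are sometimes saved under a name with no model file extension, and a UI that filters by extension would then skip them. The sketch below lists files in a folder whose suffix is not in an assumed set of model extensions. The function name, the extension set, and the folder path in the usage note are illustrative assumptions.

```python
from pathlib import Path

# Assumption: the web UI only lists files with one of these suffixes.
MODEL_SUFFIXES = {".safetensors", ".ckpt", ".pt"}


def files_without_model_suffix(folder):
    """Return sorted names of regular files whose suffix is not a known model suffix."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() not in MODEL_SUFFIXES
    )
```

Calling `files_without_model_suffix("models/Lora")` (a hypothetical path) would return any downloads that need renaming before the UI could pick them up, if this is indeed the cause.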

Flynn

8/5/2024

ADetailer for Runpod Stable Diffusion isn't working.

I recently downloaded ADetailer from Hugging Face to improve faces in my generations. I have all the settings right and the enable box ticked, but it doesn't work. It doesn't even zoom in and draw a box around the face as it would if ADetailer were in effect; it just does nothing.

lil_xiang

8/5/2024

How many ports can I expose?

Hi, what's the maximum number of TCP ports I can expose in one pod? Basically we are going to use a pod with 8 GPUs, and we want to expose lots of ports for different purposes. Thanks,...

Xtramiche

8/4/2024

ComfyUI : GPU and VRAM at 0%

Hi. I'm running an RTX 4090 pod with the ComfyUI template by ai-dock to run the flux[dev] model. However, the pod shows 0% GPU usage and also 0% VRAM usage. In contrast, about 30 GB of RAM is taken up. The model also runs slower than I expected (although I have no point of comparison). Is this likely to be a bug in RunPod's resource monitoring, or is there something wrong with my pod or pod template?...

MarioHachemer

8/4/2024

CUDA error: uncorrectable ECC error encountered

I just provisioned an 8xH100 NVL machine and made it load a very large model. The container then got stuck in a restart loop trying to load the model, failing each time with this error: 2024-08-04T16:43:13.809833249Z RuntimeError: CUDA error: uncorrectable ECC error encountered. This looks like a hardware defect. Is there a way to get my credits back for that run?...
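When reporting an error like this, it can help to pull every occurrence and its timestamp out of the container log. The following is a minimal sketch that assumes each log line starts with a timestamp followed by a space, as in the line quoted above. The function name is illustrative, and the second sample line is made up for the example.

```python
import re

# Matches a leading timestamp token, then the ECC error text anywhere later in the line.
ECC_PATTERN = re.compile(
    r"^(?P<ts>\S+)\s+.*CUDA error: uncorrectable ECC error encountered"
)


def find_ecc_errors(log_lines):
    """Return the timestamps of all lines that report an uncorrectable ECC error."""
    hits = []
    for line in log_lines:
        match = ECC_PATTERN.match(line)
        if match:
            hits.append(match.group("ts"))
    return hits


sample = [
    # Line quoted in the post above.
    "2024-08-04T16:43:13.809833249Z RuntimeError: CUDA error: uncorrectable ECC error encountered",
    # Hypothetical unrelated line, included only to show it is ignored.
    "2024-08-04T16:43:14.000000000Z loading model",
]
print(find_ecc_errors(sample))
```

A list of timestamps like this could be attached to a support request so the affected run is easy to identify.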

Xtramiche

8/3/2024

Save template overrides

Hi. I'm using the ComfyUI ai-dock template, but it's getting tiresome to manually change the environment variables every time. Is there any way that runpod.io could remember them for me? Or could I save a copy with those settings in My Templates? Thanks for your help....

Solution:

Yes, copying them into My Templates works. I think there is a suggestion to make it easy to save the env from a pod, but I don't know its progress.

BTSee

8/3/2024

Base image + code

Hi. I'm new to RunPod. I am building a sort of mega-template using a RunPod base template (2.2.1-py3.10-cuda12.1.1-devel-ubuntu22.04) and adding my own code, about another 500 MB. Where is a good place to host the image for the RunPod template to pull from? Thanks for any tips / links.

King

8/3/2024

Why can I only rent 7 H100 NVLs since yesterday and not 8?

aman1391

8/2/2024

Accelerate launch is getting stuck on pods

Accelerate launch on the pod is getting stuck on the command line.

PaleBlueDot

8/2/2024

network storage sooo slow

Hi, I'm new to runpod. I'm running a 5xH100 in US-KS2 with network storage in the same region. Loading the model (70B Llama) from storage is going to take 25 minutes. Is this normal for runpod? This normally takes seconds on other machines I've used.
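For scale, here is a back-of-the-envelope estimate of the read rate that a 25-minute load implies. It assumes the weights are stored at 16 bits (2 bytes) per parameter, which the post does not state; the function name is illustrative.

```python
def implied_throughput_mb_s(params_billion, bytes_per_param, minutes):
    """Average read rate in MB/s needed to load the whole model in the given time."""
    total_bytes = params_billion * 1e9 * bytes_per_param
    seconds = minutes * 60
    return total_bytes / seconds / 1e6


# 70B parameters, assumed 2 bytes each, loaded in 25 minutes.
rate = implied_throughput_mb_s(70, 2, 25)
print(f"{rate:.0f} MB/s")
```

Under that assumption the model is roughly 140 GB, so a 25-minute load works out to an average of a little over 90 MB/s from the network volume.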

disintegral

8/2/2024

Malformed database disk image

https://pastebin.ai/x85pnrblnu

lil_xiang

8/2/2024

UDP port and template from private docker registry

Hi, I have 2 questions: - I only see HTTP and TCP port options when creating a pod; is exposing UDP ports supported? - Can we create a template from a private Docker registry? ...

aptni27

8/1/2024

Problems SSH'ing multiple times, lost ssh keys?

Has anyone experienced issues SSH'ing into a RunPod machine multiple times? I have a terminal already SSH'd into the machine (which has a public IP), but now other terminals are asking for a password at login. I'm on macOS with zsh, and my public key has stopped working without any changes to the RunPod container's authorized keys. I can literally cat ~/.ssh/authorized_keys in one terminal on the remote machine and verify that the keys are present, but in other terminals I'm unable to log in....

Punit

8/1/2024

Venv not found

So I have a network volume which I use to run pods for ComfyUI, and I had created a venv in it. It was working fine for a few months, but now it suddenly shows this error: bash: venv/bin/activate: No such file or directory. Do I not have my venv anymore?...

disintegral

8/1/2024

A1111 Stable Diffusion 1.10.0 Pod filling up disk immediately

I added around 10GB of space to the pod after failing to boot once, and it immediately fills up to 100% with stuff like this showing up on container. The same exact Storage Volume worked to boot the pod OK yesterday. I would like to keep all my LORAs and settings, but this is annoying to deal with....

Laikh

7/31/2024

Unable to start pod with MI300x

Observing "hang" when starting pod with 8xMI300x, screenshot attached. Any ideas on how to fix this?

Sir Falk

7/31/2024

Exposing port not working

I'm trying to create embeddings using infinity. There is already a docker container for that: https://hub.docker.com/r/michaelf34/infinity Now I've tried to launch it and expose port 7797. However, I can't reach the container via the proxy:...

Kushagra

7/30/2024

Error after restarting the containers.

Command : docker compose up Error: WARN[2024-07-30T12:12:22.042930970Z] Controller.NewNetwork mia-runpod-backend_default: error="failed to create DOCKER-USER IPV6 chain: iptables [+] Running 3/4es --wait -t filter -N DOCKER-USER: ip6tables v1.8.4 (legacy): can't initialize ip6tables table `filter': Table does not exist (do...

utmostmick0

7/30/2024

ULTIMATE Stable Diffusion Kohya ComfyUI InvokeAI

Doesn't start properly; it looks like it's creating the Stable Diffusion container 4 times in a row.

ChiKim

7/30/2024

Anyone Getting Bad Pods with Internet Issues?

I'm in the US, and I get a lot more bad pods with internet issues than working pods, something like 7 out of 10. I'm trying to spot a community pod with an RTX 4090 and the default template pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04. When I get a bad pod, I get: error pulling image: Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers). If the pod runs and I connect via SSH and try to set it up, I often run into problems with apt on Ubuntu and with Python pip. Sometimes I get a certificate error, extremely slow speeds of less than 10 bytes per second, etc. I have to keep launching different pods until I get a working one. Does anyone have the same problem?...
