Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!



joe

11/8/2024

Network errors in Secure Cloud

Hello, I am using Secure Cloud to serve inference for an LLM. Can someone explain what these messages mean? Is this the infra's fault or mine? Is there a roadmap for improving network reliability?...

Ana

11/8/2024

Pod with Comfy (flux + stable diffusion)

Hello, right now I have a pod with stable-diffusion:web-ui-10.2.1, and I want a single pod where I can choose between the Flux dev version and stable-diffusion:web-ui-10.2.1. I heard that Comfy allows both, but I'm not clear on it. Can you recommend the best template for my requirements? I don't know whether I can add Comfy to my current Stable Diffusion pod; if I create another pod, I will have to move all my files to the new one, and that will take a long time 😦...

JanE

11/8/2024

Changed Log output on the Runpod website

We are using FastAPI in one of our applications on your pods. For the past couple of days, the FastAPI log output has not been displayed in the website's log window. To see the log output, I now have to start FastAPI via the terminal. Have there been recent changes to the way log files are displayed on the Runpod website?...

finley

11/8/2024

How do I find my network volume with runpodctl?

lou

11/7/2024

Network outage, please fix it

My pod is not working, please fix it.

akashgupta

11/7/2024

Cannot see logs on my pods

I can only see queue time but cannot see logs on my pods. Is anyone else facing this issue?

Shreyansh

11/7/2024

Storage Pricing

How is storage pricing calculated? Is it billed per month as a whole, per minute like pods, or perhaps per day?

riverfog7

11/6/2024

Any network issues in EU-RO-1?

My git clone is running at 32 KiB/s and I can't copy from S3 (it's very slow). apt-get is also slow (same speed as git). But downloading files seems to work as expected (I got 33 MiB/s)...
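To put the two reported rates side by side, here is a small arithmetic sketch. The 1 GiB payload is a hypothetical size chosen only for illustration; the rates are the ones quoted in the post.

```python
# Rates quoted in the post, converted to bytes per second.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

git_rate = 32 * KIB        # git clone / apt-get
download_rate = 33 * MIB   # plain file download

payload = 1 * GIB          # hypothetical transfer size, not from the post


def transfer_seconds(size_bytes, rate_bytes_per_s):
    """Time needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s


slowdown = download_rate / git_rate
print(f"git/apt path is {slowdown:.0f}x slower than the plain download")
print(f"1 GiB at 32 KiB/s: {transfer_seconds(payload, git_rate) / 3600:.1f} hours")
print(f"1 GiB at 33 MiB/s: {transfer_seconds(payload, download_rate):.0f} seconds")
```

A gap of roughly three orders of magnitude between traffic types on the same pod suggests the slowdown is specific to certain routes or hosts and is not a general bandwidth cap, though the post alone cannot confirm that.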

stevex

11/6/2024

I'm seeing 93% GPU Memory Used even in a freshly restarted pod.

Not sure what to do about this. nvidia-smi shows there are no processes running, but when I try to run a job it shows "Process 1726743 has 42.25 GiB memory in use". How do I find and kill that?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacity of 44.52 GiB of which 18.44 MiB is free. Process 1726743 has 42.25 GiB memory in use. Process 3814980 has 2.23 GiB memory in use. Of the allocated memory 1.77 GiB is allocated by PyTorch, and 53.97 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)


...
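As a reading aid for the error above, the following sketch pulls the per-process memory figures out of the message text. The function name and the regular expression are illustrative assumptions, not part of PyTorch; the message fragment is copied from the post.

```python
import re

# Fragment of the error message quoted in the post.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total "
    "capacity of 44.52 GiB of which 18.44 MiB is free. Process 1726743 has "
    "42.25 GiB memory in use. Process 3814980 has 2.23 GiB memory in use."
)

# Conversion factors from the units used in the message to GiB.
_UNIT_TO_GIB = {"KiB": 1 / 1024**2, "MiB": 1 / 1024, "GiB": 1.0}
_PROCESS_RE = re.compile(r"Process (\d+) has ([\d.]+) (KiB|MiB|GiB) memory in use")


def processes_in_oom_message(message):
    """Return {pid: GiB in use} for each 'Process N has X <unit>' clause."""
    return {
        int(pid): float(amount) * _UNIT_TO_GIB[unit]
        for pid, amount, unit in _PROCESS_RE.findall(message)
    }


usage = processes_in_oom_message(OOM_MESSAGE)
for pid, gib in sorted(usage.items(), key=lambda item: -item[1]):
    print(f"PID {pid}: {gib:.2f} GiB")
```

The two processes together account for about 44.48 GiB of the 44.52 GiB card. Since PyTorch reports only 1.77 GiB allocated by itself, the 42.25 GiB held by PID 1726743 presumably belongs to a different process, which may not be visible from inside the pod's own container.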

Fabian

11/5/2024

Persistence of pod logs from my training

I started my pod instance, attached to a volume where my dataset is located, and cloned my repository from GitHub using the VS Code integration. I left home and my laptop went into sleep mode. When I came back, my training had stopped and the session was disconnected.
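One common way to keep a job alive after the editor session drops is to start it in its own session with output redirected to a file. This is a minimal sketch under the assumption that the training is started from a command line on a POSIX system; `launch_detached`, `train.py`, and the log path are hypothetical names, and whether this covers every disconnect scenario on a pod has not been verified here.

```python
import subprocess


def launch_detached(command, log_path):
    """Start `command` in a new session, appending stdout/stderr to log_path.

    start_new_session=True detaches the child from the controlling terminal,
    so it should not receive the hangup signal when that terminal closes.
    """
    log_file = open(log_path, "ab")
    try:
        return subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,
        )
    finally:
        # The child keeps its own copy of the file descriptor.
        log_file.close()


# Hypothetical usage (script name and path are placeholders):
# launch_detached(["python", "train.py"], "/workspace/train.log")
```

Writing the log to a file on the attached volume also means the training output can be read back later, independent of the session that started it.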

Putu Ganteng

11/5/2024

Custom template

Hi there! I'm trying to make a custom CPU Docker-based template, but something is wrong. Locally the image starts fine and I don't have any problems, but the same image can't run as a pod. I'm wondering what I'm doing wrong, because it is a really simple app ...

Dockerfile

Sander

11/5/2024

Help Request: ODM Container Only Using CPU

Has anyone tried to deploy an ODM processing node using a pod before? https://github.com/OpenDroneMap/NodeODM How do I add --gpus all to the pod?...

nikolai

11/5/2024

GraphQL Schema

Hi there, is it possible to get RunPod's GraphQL Schema or enable introspection? I need it for an integration I'm currently working on. 🙂...

Solution:

nope

raphael

11/5/2024

How do savings plans work?

Could someone clarify how savings plans work? The documentation is quite limited. I understand that a plan helps reduce costs over a set period, but I'd like to know whether getting a savings plan for a pod guarantees access to the same GPU for the entire reservation duration. If I stop my pod for some reason, do I have to rebuild it, or can I simply restart it?...

DeepbrainAI

11/5/2024

502

Hello, we are having trouble with a 502 error. We are running ComfyUI with runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04. Our port 8188 is still running, and we can also send a GET request to port 8188...

11/5/2024

Decommissioning on November 7th

I received this email: "We are reaching out because you currently have serverless workers or pods running in the EUR-NO-1 data center, which is scheduled for decommissioning on November 7th. This change is part of our efforts to upgrade capacity, enhance the network, and improve other infrastructure." What actions should I take if I'm currently running a pod with a savings plan? How do I restore a pod with the same savings plan?...

ashkan-game

11/4/2024

Lost my GPU

Hello, I stopped my pod and when I came back, I had 0 GPUs available. Can I expect this machine to get the GPU back, or will it never get it back, in which case I should switch to a new pod?...

justk

11/4/2024

Where are default models mounted? I can't find them under /comfy-models

```
root@054f3147d5b1:/# ls -al /comfy-models/
total 4
drwxr-xr-x 2 root root   10 Oct 25 09:17 .
drwxr-xr-x 1 root root 4096 Nov  4 10:00 ..
root@054f3147d5b1:/workspace/ComfyUI/custom_nodes/comfyui_controlnet_aux# df -h
...
```

justk

11/4/2024

Is scp over an SSH connection to pods not supported? What could be an alternative way to download files from a pod?

Without using runpodctl

John lanser

11/3/2024

Port forwarding understanding

Greetings, I have been a user of Vast.ai, where a list of ports is already assigned and each one maps to exactly the same port on your machine. But in Runpod they map to a different one. I have to run a miner and I need to tell it two of my ports. Should I give it my external or internal ports, and how would they map to the internal ones? I am also attaching a picture of Vast's ports and yours as well...
