RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⚡|serverless

⛅|pods-clusters

Confidential Computing Support

Hi! I'm looking to run experiments with the H100 and Confidential Computing (CC). I saw in a talk from NVIDIA GTC that for the H100 to support CC, it needs to run alongside a CPU that supports a virtualization-based TEE ("Confidential VM"); supported CPUs include AMD Milan or later and Intel Sapphire Rapids (SPR) or later. Are your H100s paired with CPUs that support Confidential VMs? Are they in an environment suited for Confidential Computing?...

Ollama on RunPod

Hey all, I am attempting to set up Ollama on an Nvidia GeForce RTX 4090 pod. The commands for that are pretty straightforward (link to article: https://docs.runpod.io/tutorials/pods/run-ollama). All I do is run the following two commands in the pod's web terminal after it starts up, and I'm good to go: 1) (curl -fsSL https://ollama.com/install.sh | sh && ollama serve > ollama.log 2>&1) & 2) ollama run [model_name]...
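Once `ollama serve` is running, the pod can also be queried over Ollama's HTTP API on its default port 11434. A minimal Python sketch (the model name `llama3` is just an example, and `host` assumes you are on the pod itself or have the port proxied):

```python
import json
from urllib import request

def build_generate_payload(prompt, model="llama3"):
    """Request body for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}

def ollama_generate(prompt, model="llama3", host="http://localhost:11434"):
    """Send a one-shot prompt to a running Ollama server and return the reply text."""
    data = json.dumps(build_generate_payload(prompt, model)).encode()
    req = request.Request(f"{host}/api/generate", data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream` set to `False`, the server returns one JSON object whose `response` field holds the full completion, which keeps the client trivial.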

Runpod Python API problem while trying to list pods

Hi! Since this morning, I have had difficulty listing my pods via runpod.get_pods(). Apart from rare successes, an error usually pops up from graphql.py saying 'Something went wrong' and recommending I try again later. Are there any known issues? I am using version 1.6.2. Thanks in advance!
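For transient 'Something went wrong' API errors like this, a retry wrapper with exponential backoff is a common workaround. A sketch (the `runpod` usage in the comment assumes the Python SDK from the question; the API key is a placeholder):

```python
import time

def with_retries(fn, attempts=5, base_delay=1.0):
    """Call fn(); on failure, retry with exponential backoff before giving up."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts, surface the last error
            time.sleep(base_delay * (2 ** i))

# usage with the runpod SDK (API key is a placeholder):
# import runpod
# runpod.api_key = "YOUR_API_KEY"
# pods = with_retries(runpod.get_pods)
```

This doesn't fix a backend outage, but it smooths over intermittent failures like the "rare successes" pattern described above.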

Multiple SSH keys via Edit Pod option

I understand I need to separate public keys with newlines. However, pasting SSH keys separated by newlines via Edit Pod -> Environment Variables doesn't seem to allow two people to connect simultaneously. Sorry if this has been answered elsewhere; thanks in advance!...
Solution:
Either that, or simply add the SSH keys manually to the ~/.ssh/authorized_keys file, which is much simpler.
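A small Python sketch of that manual approach, run inside the pod (the key strings below are placeholders; it appends each public key to `~/.ssh/authorized_keys`, skipping lines already present, so both users' keys can coexist):

```python
from pathlib import Path

def add_authorized_keys(keys, path="~/.ssh/authorized_keys"):
    """Append public keys to authorized_keys, skipping duplicates; return count added."""
    p = Path(path).expanduser()
    p.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    existing = set(p.read_text().splitlines()) if p.exists() else set()
    new = [k.strip() for k in keys if k.strip() and k.strip() not in existing]
    if new:
        with p.open("a") as f:
            f.write("\n".join(new) + "\n")
    p.chmod(0o600)  # sshd refuses group/world-writable key files
    return len(new)

# add_authorized_keys(["ssh-ed25519 AAAA... alice", "ssh-ed25519 AAAA... bob"])
```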

L40S aren't available

Hello. On the Community Cloud, the website shows L40S GPUs available at $0.50/hr, but when I try to create a pod, it says they aren't available.

Is it possible to reserve GPUs for use at a later time?

To ensure that a GPU is available at a planned time of use, is it possible to reserve GPUs in advance?

Is there a way to scale pods?

I would like to scale up the number of pods in order to meet demand. Is there a way to do that?
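One way to sketch this is a small capacity calculation plus the runpod Python SDK to launch extra pods. A hedged example; the throughput numbers, pod name, image, and GPU type below are all placeholders, and the SDK calls are shown commented since they need a live API key:

```python
import math

def pods_needed(total_rps, rps_per_pod):
    """How many pods are required to serve total_rps requests/sec."""
    return max(1, math.ceil(total_rps / rps_per_pod))

# launching the shortfall with the runpod SDK might look like:
#
# import runpod
# runpod.api_key = "YOUR_API_KEY"
# current = len(runpod.get_pods())
# for _ in range(pods_needed(120, 15) - current):
#     runpod.create_pod(name="worker",
#                       image_name="your/image:latest",
#                       gpu_type_id="NVIDIA GeForce RTX 4090")
```

For request-driven workloads, RunPod's serverless product handles this kind of scaling automatically, which may be a better fit than managing pod counts yourself.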

Build with Dockerfile or mount image from tar file

Is there a way to build an image from a Dockerfile through RunPod, or to mount my tar file as an image?
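As I understand it, RunPod templates pull images from a container registry, so a common path for a tar file is to load it into a local Docker daemon and push it to a registry rather than mounting it directly. A sketch of the relevant commands as Python helpers (image and file names are examples; actually running them requires a Docker daemon):

```python
def save_cmd(image, tar_path):
    """docker command to export a local image to a tarball."""
    return ["docker", "save", "-o", tar_path, image]

def load_cmd(tar_path):
    """docker command to import a tarball into the local daemon."""
    return ["docker", "load", "-i", tar_path]

# to actually run them:
# import subprocess
# subprocess.run(load_cmd("myimage.tar"), check=True)
# subprocess.run(["docker", "push", "your-registry/myimage:latest"], check=True)
```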

Performance of Disk vs Network Volume

Is there a significant trade-off in performance between the pod's local volume and a network volume? How should I think about this?
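The simplest way to answer this for your own pod is to measure it. A minimal sequential-write benchmark you could run against both mount points (the paths are examples; network volumes typically mount under /workspace or /runpod-volume depending on setup):

```python
import os
import time

def write_throughput_mb_s(path, size_mb=64, chunk_mb=4):
    """Write size_mb of random data to path, fsync, and return MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # ensure the data actually hit the device
    elapsed = time.perf_counter() - start
    os.remove(path)  # clean up the benchmark file
    return size_mb / elapsed

# write_throughput_mb_s("/workspace/bench.bin")   # local disk (example path)
# write_throughput_mb_s("/runpod-volume/bench.bin")  # network volume (example path)
```

Sequential writes are only one axis; if your workload does many small random reads (e.g. loading thousands of dataset files), the latency gap between local and network storage is usually larger than the throughput gap.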

Runpod's GPU power

Are RunPod's GPUs shared? I need a GPU with 100% of its power for training.
Solution:
The GPU is dedicated to you; GPUs are not shared.

Error when trying to Load "ExLlamav2"

I haven't used RunPod in a while, but I'm pretty sure I used this one before; somehow it's not working now.

CPU clock speed

I'm creating a machine; how do I find out the CPU clock speed or its range?
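Once the pod is up, you can read the current per-core clocks straight from /proc/cpuinfo (Linux; the "cpu MHz" field is present on x86 kernels, and `lscpu` additionally reports the min/max range). A small sketch:

```python
def cpu_mhz():
    """Return current per-core clock speeds (MHz) from /proc/cpuinfo, if present."""
    speeds = []
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.lower().startswith("cpu mhz"):
                    speeds.append(float(line.split(":")[1]))
    except OSError:
        pass  # not Linux, or /proc unavailable
    return speeds

# for the advertised min/max range, run `lscpu` and look at
# the "CPU min MHz" / "CPU max MHz" rows.
```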

NVENC driver conflict

Trying to use hardware-accelerated ffmpeg (NVENC) on a pod; this has worked on pods before. Getting the attached error even though the driver is the correct version.
```
Mon Apr 22 03:52:28 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+...
```

Are all pods based on Docker?

I want to work on a plain Ubuntu system or KVM, and also get the machine's public IP. Is that possible? It seems the container images are all Docker-based.

ComfyUI pod doesn't save workflow

Hi, I'm having a problem: when I stop the pod running ComfyUI and start it again, my workflow disappears.

HTTP service [PORT 7860] Not ready

Like the title says, the HTTP service isn't ready for me to connect when I try to run TheBloke Local LLMs One-Click UI template. I'm using an A100 GPU with 100GB disk and 100GB pod volume. It usually lets me connect after a few minutes, and it's been longer than that.

I'd like to run a job that takes 8x GPUs.. any way I can increase the spend limit?

Suddenly cannot boot SD pod: having trouble with "Could not load settings"

Full error message:
```
2024-04-20T12:41:36.193868880Z *** Could not load settings
2024-04-20T12:41:36.194453815Z Traceback (most recent call last):
2024-04-20T12:41:36.194479595Z   File "/workspace/stable-diffusion-webui/modules/launch_utils.py", line 244, in list_extensions
2024-04-20T12:41:36.194485685Z     settings = json.load(file)...
```
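That traceback points at `json.load` failing on the webui's settings file, which usually means the JSON on disk got truncated or corrupted (e.g. the pod stopped mid-write). A hedged recovery sketch: back up the broken file and start from empty defaults, rather than deleting it outright (the config path is an assumption based on the traceback):

```python
import json
import shutil
from pathlib import Path

def load_or_reset_settings(path, default=None):
    """Load a JSON settings file; if it is corrupted, back it up and start fresh."""
    p = Path(path)
    try:
        return json.loads(p.read_text())
    except (OSError, json.JSONDecodeError):
        if p.exists():
            # keep the broken copy for inspection instead of destroying it
            shutil.copy(p, p.with_suffix(p.suffix + ".bad"))
        return {} if default is None else default

# load_or_reset_settings("/workspace/stable-diffusion-webui/config.json")  # assumed path
```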

4xH100 pod is stuck -- can't restart or stop

I am still connected via SSH, but the pod can't be used due to some network issues. The RunPod UI also can't reach it (it shows "waiting for logs"). Overnight the pod failed with: ```...