RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⚡|serverless

⛅|pods-clusters

How can I access a network volume from a JupyterLab notebook?

I am currently running tests on some LLM models and am trying to set up a network volume so that I can download and use some of the larger models while also working on some embedding models (I am not able to download both the LLM and the embedding model into the default volume at the same time). How can I move the model to the network volume so that I won't get an out-of-space error on the volume? Thanks!...
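
A minimal Python sketch of one way to do this from a notebook cell, assuming the network volume is already attached to the pod and visible as an ordinary directory. The helper names are hypothetical and the paths in the usage note are placeholders, since the mount point depends on how the pod was created. It moves a model folder onto the larger volume and leaves a symlink at the old location so existing code keeps working:

```python
import os
import shutil


def free_gib(path: str) -> float:
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def move_and_link(src: str, dest_dir: str) -> str:
    """Move `src` into `dest_dir` and leave a symlink at the old location."""
    src = os.path.normpath(src)
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.abspath(os.path.join(dest_dir, os.path.basename(src)))
    if os.path.exists(target):
        raise FileExistsError(f"{target} already exists")
    shutil.move(src, target)   # copies across filesystems if needed
    os.symlink(target, src)    # old path now points at the new location
    return target
```

For example, move_and_link("/path/to/models/my-llm", "/path/to/network-volume/models") would relocate the folder, and free_gib("/path/to/network-volume") can be checked first to confirm there is room.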

I need to reinstall the pip requirements for ComfyUI every time I start a pod.

I need to reinstall the pip requirements for ComfyUI every time I start a pod. I understand there is a difference between the container disk and the pod volume, and I assume I have to update the correct one. Which one do I need to update to permanently update ComfyUI, and how do I do that? I already tried solving this with the ask-ai bot, without success. Thank you.

RTX 6000 Ada pods breaking

I am using the ULTIMATE Stable Diffusion Kohya ComfyUI InvokeAI template on multiple RTX 6000 Ada pods. Lately I have run into many issues with the pods breaking while using ComfyUI. The latest incident occurred shortly after loading multiple IPAdapter models through the Model Manager. After loading the models, the ComfyUI page froze and then gave me an error saying "Error loading workflows: Unexpected token '<', "<!DOCTYPE "... is not valid JSON". The previous incident occurred directly after loading a workflow JSON into ComfyUI, with the same symptoms. I am unsure how the two are connected. ...

How can I shrink my volume size?

I increased my volume size for some work, and now there is an error saying the volume size cannot be decreased. I deleted all the files to free up space, and I do not want to keep paying for the whole 100 GB volume....

Can I use the filtering syntax when calling the myself query?

query Pods { myself { pods(filter: { name: { startsWith: "basic::" } }){ id name...

How to connect to SFTP via rclone/Fuse

I've got a question which prevents me from using Runpod for my use case at the moment: I'd like to connect Runpod to an SFTP server and mount the corresponding remote volume on the local filesystem of the pod container, so it can be used like any other directory by apps like Blender. The usual way to do this is with rclone or a Fuse mount. This would let me connect multiple pods to a folder for them to write to (I know I can do this with a Runpod network folder) but also to read from that SFTP drive, which is a local NAS in our office on which we regularly update a lot of massive files, so that we can avoid doing the sync manually. Crawling through the docs, for now I can see how to connect TO the pod via SSH, export data snapshots to S3 & co, and use a network volume. I've come across the issue below, which says that "Fuse is not supported by Runpod because it requires granting privileges to the container. Since Fuse is a kernel module, it needs to be supported by the host". Another option would be to use a cloud sync utility like Dropbox/Google Drive, but this would involve an additional data transfer from our office volumes to the cloud. Would love to know if there is a workaround from your team!...

I lost my miniconda env after starting my pod

I have miniconda installed in my pod, and I found that the miniconda folder installed under /workspace is gone. Is my data not safe on RunPod?

How can you move a network drive to another region?

Even if I have to buy a 2nd network drive (which I'd prefer not to do if it can be avoided), what is the most efficient way to move the files?

Create a pod with a network volume using runpodctl

Hi, I created a network volume and gave it a name. How do I create a pod with this volume attached?

How can I create a pod with a public IP using GraphQL?

When I create a pod using GraphQL, the supportPublicIp parameter doesn't take effect. The created pod's IP address is private, not public.

Does anyone know any good commands for downloading files in Jupyter Notebook?

I'm trying to download FLUX from Hugging Face. I've been using wget, aria2 and cURL, but none of them are working for me.
aria2c -x 16 --header="Authorization: Bearer <huggingface token>" -o flux.safetensors "https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors"
wget --header="Authorization: Bearer <huggingface token>" -o flux.safetensors "https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors"...
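
One possible cause, offered as an assumption rather than a confirmed diagnosis of this thread: on Hugging Face, a URL containing /blob/ points at the file's web page, while the /resolve/ form of the same path serves the file itself. Separately, in wget a lowercase -o names a log file, whereas uppercase -O names the output file. A minimal Python sketch that rewrites the URL and prints an adjusted wget command (to_resolve_url is a hypothetical helper name, and the token placeholder is left as written in the question):

```python
from urllib.parse import urlsplit, urlunsplit


def to_resolve_url(url: str) -> str:
    """Rewrite a Hugging Face '/blob/' page URL to its '/resolve/' file URL."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Assumed path shape for a model repo: /<owner>/<repo>/blob/<revision>/<file>
    if len(segments) > 3 and segments[3] == "blob":
        segments[3] = "resolve"
    return urlunsplit(parts._replace(path="/".join(segments)))


page_url = "https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors"
file_url = to_resolve_url(page_url)

# Uppercase -O sets the output file; lowercase -o would write wget's log there.
print(f'wget --header="Authorization: Bearer <huggingface token>" -O flux.safetensors "{file_url}"')
```

The printed command can be pasted into a notebook cell prefixed with "!" or into a terminal. Gated repositories may also require that the account behind the token has accepted the model's license.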

Is there a way to determine my SSH username for a pod through runpodctl?

It seems like the username to connect to ssh is always of the format {pod id}-{some hex number}. If I spin up a pod using runpodctl and call runpodctl get pod, I can get the pod ID. However, what is this hex number? Where do I get it from?

Need help with an error

Excuse me everyone, I need help: my pod can't start. It has been saying "error starting container" for almost 2 hours. Can someone help me?

Where's the network volume?

Hey all, the documentation about network volumes is frustratingly poor. I created a pod and SSH'd into it, but I have no idea where to put my files so that they're stored. Really annoying. I'm using the ComfyUI with Flux template, and my goal is to store Flux models in a folder so I do not need to re-download them all the time....

I want to get a public URL

I want to deploy my web app on RunPod and get a public URL which I can use to reach it. My web app should be reachable anytime and anywhere. Is this possible with on-demand pods?

How can I access more logs from a pod?

When I view pod logs, it seems like I can only access a few hundred lines. I'm dealing with an error that outputs a huge amount of logs, and concurrent requests cause a downstream error to print. Because of this, I'm unable to view the logs of the original error or the inputs that caused it. Is there any way to download a full log dump of a pod?...

Web terminal is not starting on a running pod

I just launched a new pod with a specific container. When I click on start web terminal, the button reacts, but the connect to web terminal option never becomes enabled.

Llama

Hello! For those who tried, how much GPU is needed for inference only, and for fine-tuning of Llama 70B? How about the inference of the 400B version (for knowledge distillation)? Is the quality difference worth it? Thanks!
Solution:
Only for inference

XXXX.safetensors is not a safetensors file

Hi, I have problems when generating an image: it tells me that the safetensors file is not a safetensors file. I have tried downloading it with wget and with gdown --fuzzy, but neither works. Any ideas? I would really appreciate your help, I have been having this problem for many days 😦 The file works on my local machine, and other people have tried it and it works for them too, so I think something is missing....
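
A quick check that may help narrow this down, under the assumption (not confirmed by the thread) that the download saved a web page, such as a login or confirmation page, instead of the model. A safetensors file begins with an 8-byte little-endian header length followed by a JSON header, so a file that starts with "<" is almost certainly HTML. The helper names below are hypothetical:

```python
import json
import struct
from pathlib import Path


def looks_like_safetensors(path: str) -> bool:
    """Return True if the file begins with a plausible safetensors header."""
    file_size = Path(path).stat().st_size
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False
        # First 8 bytes: header length as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len == 0 or header_len > file_size - 8:
            return False
        try:
            header = json.loads(f.read(header_len).decode("utf-8"))
        except ValueError:  # covers both bad UTF-8 and bad JSON
            return False
    return isinstance(header, dict)


def first_bytes(path: str, n: int = 64) -> bytes:
    """Return the start of the file for a quick visual inspection."""
    with open(path, "rb") as f:
        return f.read(n)
```

Calling looks_like_safetensors("XXXX.safetensors") on the pod and on the local copy, and comparing file sizes, would show whether the two downloads are really the same file. This only inspects the header; it does not validate the tensor data.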

I cannot use my SSH key to authenticate to my pod.

I have been accessing my pods with an SSH key, but a few hours ago I purchased a Savings Plan. Since then I cannot log in to the new pods using my SSH key. The SSH key has been working well for other pods (not on a Savings Plan) so far. Please help me with this issue. Thanks in advance.