Runpod


We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!


Creating serverless templates via GraphQL

I want to automatically create serverless templates from the command line / Python. I tried the example for creating a new pod here: https://docs.runpod.io/docs/create-pod-template, which worked. I want to create a serverless template, though, and when I add isServerless: true to the GraphQL request (as per the PodTemplate documentation) it errors out with: ```...
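For anyone hitting the same wall, here is a minimal sketch of driving that GraphQL API from Python. The `saveTemplate` mutation and field names are taken from the pod-template docs linked above and may have drifted; one commonly reported quirk (an assumption here, worth verifying against the current schema) is that serverless templates require `volumeInGb: 0`.

```python
import json
import urllib.request

API_URL = "https://api.runpod.io/graphql"

def build_template_mutation(name, image, serverless=True):
    """Build a saveTemplate mutation body; isServerless marks it serverless."""
    query = """
    mutation {
      saveTemplate(input: {
        name: "%s",
        imageName: "%s",
        isServerless: %s,
        containerDiskInGb: 5,
        volumeInGb: 0,
        dockerArgs: "",
        env: []
      }) { id name isServerless }
    }""" % (name, image, "true" if serverless else "false")
    return {"query": query}

def create_template(api_key, name, image):
    """POST the mutation; the API key is passed in the query string."""
    req = urllib.request.Request(
        "%s?api_key=%s" % (API_URL, api_key),
        data=json.dumps(build_template_mutation(name, image)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

If the mutation still errors with `isServerless: true`, diffing your input against a template created in the UI (queried back via `myself { podTemplates { ... } }`) is a quick way to spot which field the serverless path rejects.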

streaming

I'm used to using OpenAI to stream text in a single async request. Why does Runpod just spam requests to the /stream endpoint instead of making it one request and feeding data as it's generated?
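As I understand the serverless docs, /stream is a polling endpoint: each GET to https://api.runpod.ai/v2/<endpoint_id>/stream/<job_id> returns whatever chunks the handler has yielded since the last call, rather than holding one connection open. A client-side sketch (the response shape is an assumption based on that docs pattern) that wraps the polling into a generator so it feels like a single stream:

```python
import time

def stream_job(fetch_status, poll_interval=0.5):
    """Yield output chunks from a Runpod-style /stream endpoint.

    fetch_status is a callable returning a parsed JSON dict like
    {"status": "IN_PROGRESS", "stream": [{"output": "tok"}]} -- in real
    use it would GET the /stream/<job_id> URL with your API key.
    """
    while True:
        data = fetch_status()
        for item in data.get("stream", []):
            yield item["output"]  # hand each new chunk to the caller
        if data.get("status") in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
            return  # terminal status: stop polling
        time.sleep(poll_interval)
```

The trade-off versus OpenAI-style streaming is that polling fits Runpod's queue-based job model; the generator above at least hides the repeated requests from the rest of your code.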

Issue with Worker Initiation Error Leading to Persistent "IN_PROGRESS" Job Status

Hi All, While testing the endpoint, I observed that when initiating a job with an empty input in the request body (using {"input": {}} or an empty JSON {}), the worker fails to start successfully. Despite this, the job status continually displays as "IN_PROGRESS". This problem persists without reaching a resolution or a timeout state, necessitating manual intervention to cancel the job. The log output indicates an error message: "Job has missing field(s): input." However, the job status doesn't reflect this error and remains indefinitely in progress....
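Until the platform marks such jobs FAILED on its own, a workaround is to validate the input inside your handler and return an error payload, which (per the runpod-python SDK convention, as I understand it) fails the job instead of leaving it IN_PROGRESS forever. A minimal sketch:

```python
def handler(job):
    """Serverless handler that fails fast on a missing or empty input
    instead of letting the job sit IN_PROGRESS indefinitely."""
    job_input = job.get("input")
    if not job_input:
        # A returned dict with an "error" key marks the job FAILED
        # (runpod-python SDK convention, as I understand it).
        return {"error": "request body must contain a non-empty 'input' object"}
    return {"echo": job_input}

# In the real worker this would be wired up with the SDK:
# import runpod
# runpod.serverless.start({"handler": handler})
```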

Log retention and privacy

I'm weighing the cost/benefit of cloud GPU for AI inference tasks vs self-hosting, including the privacy implications. I want to submit personally private information, which I'm comfortable doing via an API call to initialize a worker as long as it stays ephemeral in RAM. However, are the API calls logged to any medium that persists longer than the specific request being processed?...
Solution:
Serverless logs persist for around 30 days; please do not log any private or privacy-related data.

Serverless doesn't work properly when docker image is committed

I built the image locally using the following command, and it works fine after submitting it to serverless.
```
sudo docker build -t xsjiang/rp-comfyui:t1 --platform linux/amd64 .
```
I then ran this image locally and built a second image using the following command.
```
sudo docker run --runtime=nvidia -it -v runpod-model:/runpod-volume -p 8188:8188 -p 8000:8000 --network host --name comfyui xsjiang/rp-comfyui:t1 /bin/bash
```
...

[Errno 122] Disk quota exceeded

My workers are occasionally getting this error, which I've never seen before:
```
{ "dt": "2024-01-12 07:05:43.179082", "endpointid": "nrbt6cd41ed5he",...
```
Solution:
Looks like you ran out of volume storage.

Error whilst using Official A1111 Runpod Worker - CUDA error: an illegal instruction was encountered

https://github.com/runpod-workers/worker-a1111 I am using the official A1111 Runpod Worker. It's not actively maintained and I ran into 2 issues whilst building the Docker image but those were easily resolved. After successfully building the docker image and loading it onto an API endpoint, I'm getting an error which I'm struggling to solve....

Use private image from Google Cloud Artifact Registry

I'm trying to set up authentication to the GCP Artifact Registry, but without much success. I've followed the instructions here: https://cloud.google.com/artifact-registry/docs/docker/authentication#json-key I try to configure the credentials in RunPod using the contents of the base64 key file, but I immediately get an error from RunPod as soon as I hit save: a red popup in the top right saying "❌ Failed to Update Registry Credential". I guess this is because the key is too long? (3185 chars)...

Outpainting

Hey guys. I want to add an outpainting feature to my web app, using an inpainting SD model + Automatic1111 + a Flask API that will do the outpainting. But I want a 100% serverless solution that isn't slow. How can I do that, please? 🙂...

Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

I've been utilizing RunPod endpoints for the past month or so with no issues; everything's been working wonderfully. This past week, a handful of my jobs have been failing, and I'm not entirely sure why. I have not made any code changes and have not changed the Docker image from my template. I do notice that it seems to be waiting for GPUs to be available, but I'm not sure why; when it finds them, this error is thrown: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select...
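This error usually means at least one input tensor (often an index tensor, given the index_select in the trace) stayed on CPU while the model sits on cuda:0. A defensive sketch, written duck-typed against anything with a PyTorch-style .to method, that moves a whole nested batch onto one device before the forward pass:

```python
def move_to_device(obj, device):
    """Recursively move every tensor-like value (anything with a .to
    method) in a nested batch onto one device; leave other values alone."""
    if hasattr(obj, "to"):
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: move_to_device(v, device) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(v, device) for v in obj)
    return obj
```

In a handler you might call `move_to_device(inputs, next(model.parameters()).device)` so inputs always follow wherever the model actually landed, rather than a hard-coded "cuda:0" that breaks when a worker's CUDA setup differs.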

SCP

```
scp -P 22 -i ~/.ssh/id_ed25519 root@213.173.102.159:/mmdetection/data/results/vis/results.tar.gz ./
root@213.173.102.159's password:
```
I am attempting to retrieve some files from my pod via SCP, but I'm being prompted for a password. Why is this happening?...
Solution:
Yep, absolutely correct. I ended up solving it with the following as my Docker command:
```
bash -c 'apt update; DEBIAN_FRONTEND=noninteractive apt-get install openssh-server -y; mkdir -p ~/.ssh; cd $; chmod 700 ~/.ssh; echo "$PUBLIC_KEY" >> authorized_keys; chmod 700 authorized_keys; service ssh start; sleep infinity'
```
I was going to update my post but couldn't find where it was in help-topics (turns out it was in serverless, lol).

Performance Difference between machine u3q0zswsna6v88 and cizgr1kbbfrp04

Hey all, what is the difference between these two machines? For the exact same code, u3q0zswsna6v88 takes 60s and cizgr1kbbfrp04 takes 8s. I repeated the same request multiple times and none of them hit a cold start. This happened around 5:15pm today. Thanks!...

Warming up [Billing]

Hello everyone, I'm curious about RunPod billing on serverless endpoints. Specifically, does RunPod charge for elapsed seconds when executing tasks outside the handler? In my use case, I'm concerned about potential charges during tasks such as warming up models, which are not triggered via the API. Can anyone provide insights or details on this? Thank you!...
Solution:
Yep, you are basically charged from the moment the container starts until the moment it stops, regardless of what it is actually doing or whether the handler has started or not.
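Since billing runs from container start to stop, the practical takeaway is to do warm-up work once at module scope, so each worker pays the cost exactly once rather than on every request. A sketch (the dict is a stand-in for a real model load):

```python
import time

# Module scope runs when the container starts. This warm-up time is
# billed even though no API request triggered it, but it happens only
# once per worker; every later request reuses the loaded model.
_start = time.monotonic()
MODEL = {"weights": "stand-in for an expensive model load"}
WARMUP_SECONDS = time.monotonic() - _start

def handler(job):
    # Requests reuse the already-warm MODEL instead of reloading it.
    return {"warmup_seconds": WARMUP_SECONDS, "loaded": MODEL is not None}
```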

Worker not consuming jobs

I am following this tutorial: https://blog.runpod.io/serverless-create-a-basic-api/ I have submitted a few requests to the API, and the UI shows a worker is running (and I am being billed for the worker) but it doesn't seem to be consuming any jobs. All the jobs remain in the "queued" state indefinitely. There are no logs in the worker logs tab. See attached screenshots. Requested 24gb worker (3090) Any idea what is going on / tips on how to debug further?...

RuntimeError: The NVIDIA driver on your system is too old (found version 11080). Please update your

I deployed a new version today but keep running into this error. Did something change on RunPod? Thanks!

Worker log says remove container, remove network?

Not even sure this is an issue, but one of my endpoints I'm testing has a throttled worker with odd output in its log. I'm not sure if it's crashed and been removed, or just deallocated, or something?
```
2024-01-10T14:00:00Z create pod network
2024-01-10T14:00:00Z create container ghcr.io/bartlettd/worker-vllm:main
...
```
Solution:
That's normal unless the worker is running.

Hi all. I created a pod, started it, but can't SSH, can't start its "web terminal", can't do anything

I've created a new pod, started it, added the RSA keys, etc… however, I can't ssh:
```
Error response from daemon: Container f3aeaa504300180e74107f909c00ece20c4e18925c55c45793c83c9d3dc52852 is not running
Connection to 100.65.13.88 closed.
Connection to ssh.runpod.io closed.
```
...

Should I be getting billed during initialization?

Trying to understand exactly how serverless billing works with respect to workers initializing. From the GUI, the behaviour is inconsistent and I can't find an explanation in the docs. I have an example where workers are pulling a Docker image: one of the workers says it's ready despite still pulling the image, while the other two are in the initializing state. The indicator in the bottom right shows the per-second pricing for one worker, which would make sense if it's active, but it clearly isn't ready to accept jobs. Also, pulling images from the GitHub container registry takes an absolute age; I'd be disappointed to be charged more because of network congestion....
Solution:
We have seen this happen if you update your container image while reusing the same tag.

[RUNPOD] Minimize Worker Load Time (Serverless)

Hey fellow developers, I'm currently facing a challenge with worker load time in my setup. I'm using a network volume for models, which is working well. However, I'm struggling with Dockerfile re-installing Python dependencies, taking around 70 seconds. API request handling is smooth, clocking in at 15 seconds, but if the worker goes inactive, the 70-second wait for the next request is a bottleneck. Any suggestions on optimizing this process? Can I use a network volume for Python dependencies like I do for models, or are there any creative solutions out there? Sadly, no budget for an active worker....
Solution:
Initializing models over a network volume is inherently slow because you're loading from a different hard drive. If you can, it's easier to bake them into the Docker image, as ashelyk said. Your other option is to increase the idle timeout after a worker is active; that way your first request loads the model into VRAM, and subsequent requests are easy for the worker to pick up...
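To make the bake-it-into-the-image route concrete, here is a sketch of a Dockerfile that installs the Python dependencies at build time, so a cold start skips the ~70-second pip install and only the model weights stay on the network volume. The base image name is an assumption; substitute whatever CUDA base your handler already uses.

```dockerfile
# Dependencies are installed at build time and shipped inside the image,
# so workers never re-run pip on cold start.
FROM runpod/base:0.4.0-cuda11.8.0
COPY requirements.txt /requirements.txt
RUN python3 -m pip install --no-cache-dir -r /requirements.txt

# Handler code changes often; keeping it in a late layer means rebuilds
# after code edits reuse the cached dependency layer above.
COPY handler.py /handler.py
CMD ["python3", "-u", "/handler.py"]
```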

Runpod VLLM Context Window

Hi, I've been using this template in my serverless endpoint: https://github.com/runpod-workers/worker-vllm I'm wondering what my context window is, and how is it handling chat history?...
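As far as I can tell (worth verifying against the worker-vllm README), the worker keeps no chat history of its own: every request must resend the full conversation, and the context window is simply the model's maximum sequence length (vLLM's max_model_len, which the worker typically exposes as a configurable setting). If your history can outgrow that, trim it client-side; a sketch using a crude ~4-characters-per-token estimate (swap in the real tokenizer for accuracy):

```python
def trim_history(messages, max_context_tokens, reserve_for_reply=512,
                 est_tokens=lambda text: max(1, len(text) // 4)):
    """Drop the oldest messages so the prompt fits the context window.

    Budget = context window minus tokens reserved for the generated
    reply; newest messages are kept preferentially.
    """
    budget = max_context_tokens - reserve_for_reply
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest-to-oldest
        cost = est_tokens(msg["content"])
        if used + cost > budget:
            break                            # oldest messages fall off
        kept.append(msg)
        used += cost
    return list(reversed(kept))              # restore chronological order
```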