Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, machine learning, and GPUs!

⚡|serverless

⛅|pods

🔧|api-opensource

📡|instant-clusters

🗂|hub

serverless deployment

I want to deploy my LLM on a serverless endpoint. How can I do that?
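For context, a RunPod serverless worker is a Docker image whose entrypoint registers a handler function with the RunPod SDK. A minimal sketch of that contract follows; the echo logic and the `prompt` field are hypothetical placeholders for your actual LLM call:

```python
def handler(job):
    # RunPod passes each request as a dict with the payload under "input".
    prompt = job["input"].get("prompt", "")
    # A real worker would run LLM inference here; this placeholder just echoes.
    return {"generated_text": f"echo: {prompt}"}

# Inside the worker image you would hand this function to the SDK:
# import runpod
# runpod.serverless.start({"handler": handler})
```

Whatever the handler returns is what the endpoint's /run and /runsync routes report back as the job output.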

How to know when a request has failed

Hello, everyone. I am using a webhook to be notified of job completion. I'm wondering if this webhook is also called when a request fails, or is there any other way to know whether a request has failed? ...
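Besides the webhook, each serverless endpoint exposes a status route (GET `https://api.runpod.ai/v2/{endpoint_id}/status/{job_id}`) that you can poll. A sketch of classifying its payload, assuming the status strings from the serverless job-state documentation:

```python
# Terminal states indicating the job will never complete successfully.
# (Status names assumed from the serverless job-state docs.)
TERMINAL_FAILURES = {"FAILED", "CANCELLED", "TIMED_OUT"}

def classify(status_payload: dict) -> str:
    """Map a /status response body to success / failure / pending."""
    status = status_payload.get("status", "")
    if status == "COMPLETED":
        return "success"
    if status in TERMINAL_FAILURES:
        return "failure"
    return "pending"  # IN_QUEUE, IN_PROGRESS, or unknown
```

Polling this route is a reasonable fallback in case a webhook delivery is missed.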

IN-QUEUE Indefinitely

I am attempting to deploy a model from HF Spaces in runpod serverless - using the ByteDance/SDXL-Lightning Docker image. I started by selecting 'Run with Docker' for the ByteDance/SDXL-Lightning space on HF and copied the Docker image tag: registry.hf.space/bytedance-sdxl-lightning:latest. Next, in RunPod, I set up a serverless template by entering the Docker image tag into the 'Container Image' field and inputting 'bash -c "python app.py"' as the container start command. I allocated 50 GB of disk space to the container and finalized the template. Subsequently, I used this template to create an API endpoint in the 'Serverless' section. However, whenever I try to run the model, my requests remain indefinitely in the 'in-queue' state. Could you help identify what I might be doing wrong?...

Costing for Serverless pods without GPU

I can't see any documentation about the cost of serverless pods without a GPU.
Solution:
There is no serverless without a GPU; serverless is only available for GPUs, not CPUs.

Migrating from Banana.dev

Hello. I'm trying to migrate a project over. My previous flow was: whenever I wanted to deploy a change, I would run banana deploy, and that would send the project off to be built into an image and deployed. I was unable to really utilize the local image build option, as I have limited resources. I've read the migration blog post, but the only option there is building the image and pushing it to a repo. Is there anything similar to banana deploy, where I only need to tweak the Dockerfile a bit and update the handler? Alternatively, I could go the dockerless route, but I'm not seeing any way in that post to specify non-Python requirements. I need access to ffmpeg and a project on GitHub; how would I set these up? And what would be the preferred way to provide API tokens/keys?...
Solution:
A few things: there is a tutorial with a little more detail than the blog here: https://docs.runpod.io/tutorials/migrations/banana/overview. Secondly, if you can't build your Docker image locally (limited resources), use the CLI tool for projects: https://docs.runpod.io/cli/projects/get-started. This will "build" your image on our GPU....
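On the non-Python requirements question: with the image-based route those go into the Dockerfile itself. A sketch along these lines, where the base-image tag and the repo URL are hypothetical placeholders:

```dockerfile
# Hypothetical worker Dockerfile sketch -- adjust base image and versions.
FROM runpod/base:0.4.0-cuda11.8.0

# System-level dependency (ffmpeg) installed at build time.
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg git \
    && rm -rf /var/lib/apt/lists/*

# Pull in the GitHub project the handler depends on.
RUN git clone https://github.com/example/some-project.git /opt/some-project

COPY handler.py /handler.py
# API tokens/keys are better supplied as endpoint environment variables
# than baked into the image.
CMD ["python3", "-u", "/handler.py"]
```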

How to deploy the Suno Bark TTS model using RunPod serverless endpoints

Hi, I want to deploy the open-source TTS model Suno Bark on RunPod and create an API endpoint for it. Can anybody help me do that?
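One Bark-specific detail worth planning for: a serverless handler has to return JSON-serializable output, so generated audio is typically packed into a WAV container and base64-encoded. A stdlib-only sketch of that step (Bark's 24 kHz sample rate is assumed; the model call itself is omitted):

```python
import array
import base64
import io
import wave

def wav_b64(samples, sample_rate=24000):
    """Pack 16-bit mono PCM samples into a WAV container, base64-encoded."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(array.array("h", samples).tobytes())
    return base64.b64encode(buf.getvalue()).decode("ascii")
```

The handler would run Bark, scale its float waveform to int16, and return something like `{"audio_base64": wav_b64(pcm)}` for the client to decode.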

Active worker doesn't get enabled

Hi! I've been trying to resolve this for a month now, but chat is very unresponsive; I would really appreciate it if we could resolve it here. It seems that if you update a serverless endpoint's active worker count from 0 to X, the change doesn't apply until you do a full drop back to 0; this is what support advised me to do. ...

Massive spike in executionTime causing my jobs to fail (AGAIN)

There was an issue around 3am (UTC+2) that caused my cold start time, delay time, and execution time all to spike, and also resulted in failed jobs due to executionTimeout being exceeded, which impacted my customers.

Failed to get job. | Error Type: ClientConnectorError

I have another worker stuck in this state... See the attached logs.

Serverless endpoint endlessly on "IN QUEUE" state

Serverless endpoint endlessly in the "IN QUEUE" state. It was working for a while, then I made a small change (like changing a print statement), redeployed my Docker image, and now it is always stuck in the "IN QUEUE" state...

Connection aborted for Faster-Whisper endpoint when using "large-v2" model (Python & NodeJS)

I tried to hit the Faster-Whisper endpoint using the large-v2 model. I got ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) when using Python, and ConnectionClosed: The socket connection was closed unexpectedly. For more information, pass verbose: true in the second argument to fetch() when using NodeJS. With Python, I managed to get a proper response from the endpoint when using the medium model. I've also tried "large-v1" but got the connection-aborted error too. With NodeJS, I got the same error with the medium, large-v1, and large-v2 models....
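A pattern that often sidesteps dropped connections on slow models: submit through the asynchronous /run route instead of /runsync, then poll /status, so no HTTP connection has to stay open for the whole transcription. A stdlib-only sketch; the endpoint ID and key are placeholders, and the api.runpod.ai/v2 URL shape follows the serverless docs:

```python
import json
import urllib.request

def build_async_submit(endpoint_id: str, api_key: str, payload: dict):
    """Build a POST to the async /run route; the caller urlopen()s it,
    reads the returned job id, and polls /status/{id} afterwards."""
    return urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/run",
        data=json.dumps({"input": payload}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With large-v2 the job can easily outlive a synchronous connection's idle timeout, which matches the symptoms above.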

error pulling image: Error response from daemon: Get "https://registry-1.docker.io/v2/"

Hey all, I have been getting this error on the worker: error pulling image: Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers). Worker: e70bl866azw54g. Can someone take a look, please?...

Can I use websocket in serverless?

I just need a TCP port open and a public IP address; I will manage the certs myself, that's doable. Is it possible, or is it just a task -> output type situation? Would love some advice. Thanks!...
Solution:
task -> output

Dockerless CLI can not sync local files to runpod server

When I try to use dockerless following the RunPod blog from Feb 2, 2024, I configure and create the project successfully on my local machine, but when I start a development session using "runpodctl project dev", I get: ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'builder/requirements.txt'. My working directory is "C:\Users\Administrator\test-project>". Please tell me why. Log info: Waiting for Pod to come online... Project test-project Pod (npia2rx81eq4wp) created. Checking remote project folder: /runpod-volume/68b7482a/dev/test-project on Pod npia2rx81eq4wp Syncing files to Pod npia2rx81eq4wp...
Solution:
Install guide for WSL, if you need to install it: https://canonical-ubuntu-wsl.readthedocs-hosted.com/en/latest/

Huge P98 execution time in EU-RO region endpoint

We are seeing a huge P98 execution time on one of our EU-RO region endpoints for the past few days. It used to be below 60s in general, but now it has soared above 40 minutes. We also see no correlation between input text length and inference time, so I just wanted to check whether there are any hardware or driver-related issues in this region....

Docker build can't finish

When I try to run docker build . in bash, it starts the process but fails with exit code: 1. It downloads around 4 gigabytes, and then at step #13 it reads the following: [3/8] RUN . /clone.sh taming-transformers https://github.com/CompVis/taming-transformers.git 24268930bf1dce879235a7fddd0b2355b84d7ea6 && rm -rf data assets */.ipynb: set: line 3: illegal option -o pipefail...
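That "set: line 3: illegal option -o pipefail" error usually means the script is being interpreted by POSIX sh (often dash) rather than bash, since `set -o pipefail` is a bash feature. A quick check, with the usual Dockerfile-side fixes sketched in the comments (the clone.sh path comes from the build output above):

```shell
# `set -o pipefail` is bash-specific; plain POSIX sh rejects it.
bash -c 'set -o pipefail && echo "bash accepts pipefail"'

# Typical Dockerfile-side fixes (pick one):
#   SHELL ["/bin/bash", "-o", "pipefail", "-c"]   # make every RUN use bash
#   RUN bash /clone.sh ...                        # invoke the script via bash
```

Either fix makes the RUN step execute under bash, so the `set -o pipefail` line inside clone.sh is accepted.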

Broken serverless worker - wqk2lrr3e9cekc

@flash-singh this worker is clearly broken, please can you take a look?

Worker is very frequently killed and replaced

I have an endpoint configured with 1 active worker and 2 max workers (24GB PRO). The requests are handled by an asynchronous handler. For some unknown reason -- I can't see any errors or other failures in the logs -- the worker restarts every 30 min to 2 h (sometimes less, sometimes more). It is the same worker (according to the ID), but the container is restarted. ...

What are the recommended system requirements for building the worker base image?

I was trying to build a custom runpod/worker-vllm:base-0.3.1-cuda${WORKER_CUDA_VERSION} image, but my 16 vCPU, 64 GB RAM server crashed. What is the recommended system spec for this purpose?

Is there documentation on how to architect around RunPod serverless?

Wondering if there are do's/don'ts for integrating RunPod serverless into a larger architecture. I assume it's not as snappy as Lambda, so I'd need to plan more aggressively around warm/cold starts? Also, is RunPod serverless ready for prod deployments, or is it more of a "use at your own risk" service?
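On the warm/cold start point, the usual client-side hedge is to treat any request as potentially slow and poll with exponential backoff rather than expecting Lambda-like latency. A generic sketch, where fetch_status is any callable returning the job's status string and the status names are assumptions:

```python
import time

def poll_with_backoff(fetch_status, max_wait=300.0):
    """Poll a cold-start-prone job until it reaches a terminal state."""
    delay, waited = 1.0, 0.0
    while waited < max_wait:
        status = fetch_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 30.0)  # cap the backoff interval
    return "TIMED_OUT"
```

Budgeting for cold starts this way keeps the caller responsive whether the worker was warm or had to spin up from zero.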