Problem when writing a multiprocessing handler

Hi there! I have an issue when writing a handler that processes two tasks in parallel (I use ThreadPoolExecutor). I use the Hugging Face transformers library to load the models and LangChain to run the inference. I tested my handler on Google Colab and it works well, so I built my Docker template and created an endpoint on RunPod, but at inference time I constantly get an error: CUDA error: device-side assert triggered, which I don't get when testing the handler on Colab. How can I handle that, and in particular, what can cause this error? I use a 48 GB GPU (more than enough for my models, which take around 18 GB in total), so it can't be a resource issue.
3 Replies
ashleyk — 4mo ago
If you're trying to process concurrent jobs, you need to follow this doc: https://docs.runpod.io/serverless/workers/handlers/handler-concurrency
Concurrent Handlers | RunPod Documentation
RunPod supports asynchronous functions for request handling, enabling a single worker to manage multiple tasks concurrently through non-blocking operations. This capability allows for efficient task switching and resource utilization.
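For reference, here is a minimal sketch of the pattern that doc describes, assuming the runpod Python SDK's async handler and its concurrency_modifier option; the input schema and the limit of 2 are placeholders:

```python
import runpod

# Load models once at module scope (not inside the handler) so that
# concurrent jobs share them instead of re-loading per request.

async def handler(job):
    prompt = job["input"]["prompt"]  # placeholder input schema
    # ... run your transformers/LangChain inference here ...
    return {"generated": prompt}

def concurrency_modifier(current_concurrency):
    # Cap this worker at 2 concurrent jobs (placeholder value).
    return 2

runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})
```

The key difference from a local handler is that the handler is an async function and the SDK is told how many jobs one worker may hold at once, so a single worker can interleave the two tasks without blocking.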
Blah Blah — 4mo ago
Thanks! I'll try that. I naively thought I wouldn't have to change anything from the local handler code. Hopefully that solves the problem.
justin — 4mo ago
GitHub
Runpod-OpenLLM-Pod-and-Serverless/handler.py at main · justinwlin/R...
A repo for OpenLLM to run on a Pod. Contribute to justinwlin/Runpod-OpenLLM-Pod-and-Serverless development by creating an account on GitHub.