IN-QUEUE Indefinitely

I am attempting to deploy a model from HF Spaces on RunPod Serverless, using the ByteDance/SDXL-Lightning Docker image. I started by selecting 'Run with Docker' for the ByteDance/SDXL-Lightning Space on HF and copied the Docker image tag: registry.hf.space/bytedance-sdxl-lightning:latest. Next, in RunPod, I set up a serverless template by entering the Docker image tag in the 'Container Image' field and 'bash -c "python app.py"' as the container start command. I allocated 50 GB of disk space to the container and finalized the template. I then used this template to create an API endpoint in the 'Serverless' section. However, whenever I try to run the model, my requests remain in the 'in-queue' state indefinitely. Could you help identify what I might be doing wrong?
17 Replies
PatrickR · 5mo ago
Do you have access to any of the Worker logs to see what's going on?
singhtanmay345 · 5mo ago
can't open worker logs
PatrickR · 5mo ago
What happens when you click this, then Logs?
[Screenshot]
singhtanmay345 · 5mo ago
When I click on it, it tries to open the logs for the worker but fails. It basically shows nothing.
haris · 5mo ago
@singhtanmay345 are you able to see any workers running? It should look like that green box in the top left of Patrick's screenshot above. You may be getting unlucky and sending requests when all of our GPUs are already in use.
justin · 5mo ago
No, that isn't the issue, is my guess, because his workers look idle. I think the bigger issue is the expectation that you can just run an HF model on RunPod directly. It doesn't sound like you have a proper handler.py file set up, and a Hugging Face Docker container probably doesn't have the runpod package installed. You can read this for a basic understanding of how to write a Python file for RunPod serverless:
justin · 5mo ago
RunPod Blog
Serverless | Create a Custom Basic API
RunPod's Serverless platform allows for the creation of API endpoints that automatically scale to meet demand. The tutorial guides you through creating a basic worker and turning it into an API endpoint on the RunPod serverless platform. For this tutorial, we will create an API endpoint that helps us accomplish
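For reference, here is a minimal sketch of the kind of handler.py the linked tutorial describes. It assumes the `runpod` Python SDK is installed in the image and that the container start command is `python handler.py`. The `prompt` field and the returned dictionary are hypothetical placeholders, and no SDXL-Lightning inference code is included.

```python
"""handler.py: minimal RunPod serverless worker sketch (no model code)."""


def handler(job):
    """Handle one queued job. The request's "input" object arrives as job["input"]."""
    job_input = job.get("input") or {}
    prompt = job_input.get("prompt")  # "prompt" is a hypothetical field name
    if not prompt:
        # Report bad input as an error result instead of raising.
        return {"error": "input must include a non-empty 'prompt'"}

    # Placeholder: a real worker would run SDXL-Lightning inference here
    # and return the generated image (for example, base64-encoded).
    return {"prompt": prompt, "status": "received"}


if __name__ == "__main__":
    try:
        import runpod  # pip install runpod
    except ImportError:
        print("runpod SDK not installed; worker loop not started")
    else:
        # Registers the handler and starts taking jobs from the endpoint queue.
        runpod.serverless.start({"handler": handler})
```

Without a process like this running inside the container, nothing picks jobs up from the endpoint's queue, which matches the symptom in the original post.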
justin · 5mo ago
@Merrell is it possible to update this blog to say the image must be built for the amd64 platform (linux/amd64)? It's actually a huge issue people run into.
Data_Warrior · 5mo ago
Facing the same issue
ashleyk · 5mo ago
What issue? Are all your workers throttled? What do the worker logs say etc? You need to provide more detail.
justin · 5mo ago
You say the same issue, but the issue I pointed out was that you can't just point to any random Docker image lol. You have to prepare it to work the way RunPod expects, which is why I linked the article. Not sure if you're saying you made the same mistake? If so, read the article I linked.
inventionsbyhamid
@singhtanmay345 were you able to run the model successfully? I want to do the same. Apparently I need to set up the worker on RunPod somehow, which I am not sure how to do.
PatrickR · 4w ago
I reread this thread, and yes, Justin is correct. You can't just grab any random Docker image; it needs to go through the RunPod handler so that requests reach your code in the expected format. @inventionsbyhamid We don't have SDXL Lightning as a dedicated worker, but we do have a tutorial on running SDXL Turbo here: https://docs.runpod.io/tutorials/serverless/gpu/generate-sdxl-turbo#deploy-a-serverless-endpoint You don't need to build your own Docker image for this, as we have prebuilt templates for SDXL Turbo.
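To make the request format concrete, here is a rough sketch of how a client might build a job submission. It assumes the endpoint exposes RunPod's asynchronous `/run` route under `https://api.runpod.ai/v2/<endpoint id>` with Bearer-token authentication. `ENDPOINT_ID`, `API_KEY`, and the `prompt` field are hypothetical placeholders, and the request is only built, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def build_run_request(endpoint_id, api_key, job_input):
    """Build (but do not send) an async job submission for a serverless endpoint."""
    # The worker's handler receives this object as job["input"].
    body = json.dumps({"input": job_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint_id}/run",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


# Hypothetical placeholders; substitute a real endpoint ID and API key.
req = build_run_request("ENDPOINT_ID", "API_KEY", {"prompt": "a red bicycle"})
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req).read()
```

The payload is wrapped in an `input` object because that is what the handler reads. An image that only runs its own web app (such as the Space's `app.py`) never consumes it.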
inventionsbyhamid
Managed to do it with help from a friend. I wanted to deploy SDXL Lightning (ByteDance); it's the most-run image generation model on Replicate. You guys should add it.
PatrickR · 4w ago
Do you have a repo I can check out? I would be interested in seeing what we can add to help support this.
inventionsbyhamid
Let me ask internally if we can open source it
nerdylive · 3w ago
Maybe this is because of improper handler code: no worker took the job, so the job stays in the queue... I think it's easy enough to write your own serverless code if it's only for SDXL Lightning. Try it out.
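One way to catch this symptom from the client side is to poll the job status and give up if it never leaves the queue. The sketch below assumes the status strings reported by RunPod's `/status/<job id>` route (`IN_QUEUE`, `IN_PROGRESS`, `COMPLETED`, `FAILED`, `CANCELLED`, `TIMED_OUT`). `wait_for_job` and its parameters are hypothetical names, and the HTTP call is left to the caller.

```python
import time

# Statuses after which the job will not change any more.
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def wait_for_job(fetch_status, max_queue_s=60.0, poll_s=2.0):
    """Poll until the job finishes; raise if it is still IN_QUEUE after max_queue_s.

    fetch_status is any zero-argument callable that returns the job's
    current status string (for example, parsed from the /status response).
    """
    started = time.monotonic()
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if status == "IN_QUEUE" and time.monotonic() - started > max_queue_s:
            raise TimeoutError(
                f"job still IN_QUEUE after {max_queue_s}s; "
                "check that a worker is running a RunPod handler"
            )
        time.sleep(poll_s)
```

A job that times out here while workers sit idle points at the image or handler rather than at GPU availability.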