Hi RunPod team / community,
I’m trying to deploy a Qwen model (Qwen/Qwen2.5-0.5B) using a serverless container on RunPod, but my job stays in the queue and never runs.
Setup:
GPU: 32 GB Pro, count 1
Disk space: 20 GB
Queue type: GPU-based
Model deployed via a custom Docker image
Python 3.10, torch with CUDA 12.1, runpod SDK
Observations:
Docker image builds and deploys successfully.
Worker shows as “Ready” in the Workers tab.
No logs appear in the Logs tab when the job is queued.
Clicking the Worker ID shows: “Image loaded successfully, worker is ready.”
Dockerfile / runpod_worker.py (summary; full files can be shared if needed):
Using nvidia/cuda:12.1.0-runtime-ubuntu22.04
runpod_worker.py calls runpod.serverless.start({"handler": handler})
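For context, the Dockerfile follows roughly this shape (a simplified sketch; package names and paths are condensed, the actual file can be shared on request):

```dockerfile
# Sketch of the Dockerfile, simplified
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04

RUN apt-get update && \
    apt-get install -y python3.10 python3-pip && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# requirements.txt pins torch (cu121 build), runpod, transformers, etc.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY runpod_worker.py .

# Unbuffered output so worker logs show up in the Logs tab immediately;
# the worker script itself calls runpod.serverless.start(...)
CMD ["python3", "-u", "runpod_worker.py"]
```

My understanding is that the CMD must launch the script that calls `runpod.serverless.start`, otherwise the container reports healthy but nothing actually polls the queue, so I'd appreciate a sanity check on this part too.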
Problem:
Jobs stay in the queue indefinitely.
Worker is ready but does not pick up jobs.
Has anyone faced this issue, or can anyone advise what might be missing from the serverless setup for the worker to pick up queued jobs?
Thanks in advance!