Recipe for Llama 4 Scout on vLLM
Multi-node Serverless Endpoint

Network connectivity issues on EUR-IS nodes (works fine on EUR-RO)
Serverless request is killed, logs are gone
New build stuck in endless "Pending" state
Skip build on GitHub commit
Model initialization failed: CUDA driver initialization failed, you might not have a CUDA gpu.
Getting occasional OOM errors in serverless
ComfyUI + custom models & nodes
Bug in creating endpoints

16 GB GPU availability almost always low
Endpoint-specific API key for RunPod serverless endpoints
--generation-config vllm
How do I add the --generation-config vllm parameter when using Quick Deploy? I want to be able to set custom top_k, top_p, and temperature in my requests instead of being stuck with the model defaults. Thanks!... New UI, new issue again lol
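
No answer appears in the thread, but as a hedged sketch: if the endpoint runs the standard runpod-workers/worker-vllm image behind Quick Deploy, sampling parameters can be passed per request via a sampling_params object, and explicitly passed values should override the engine's loaded defaults regardless of the --generation-config flag. (Setting the flag itself would have to go through the worker's engine-argument environment variables, if the image exposes one for it.) ENDPOINT_ID and RUNPOD_API_KEY below are placeholders, not values from the thread.

```python
# Minimal sketch (assumptions noted above): send per-request sampling
# parameters to a RunPod serverless vLLM endpoint over the /runsync API.
import os

import requests

ENDPOINT_ID = os.environ["ENDPOINT_ID"]    # placeholder: your endpoint ID
API_KEY = os.environ["RUNPOD_API_KEY"]     # placeholder: your RunPod API key

response = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "input": {
            "prompt": "Explain KV caching in one sentence.",
            # Explicit per-request values should take precedence over the
            # engine's defaults, so top_k/top_p/temperature stay tunable
            # even without --generation-config vllm.
            "sampling_params": {
                "temperature": 0.8,
                "top_p": 0.95,
                "top_k": 40,
                "max_tokens": 128,
            },
        }
    },
    timeout=120,
)
response.raise_for_status()
print(response.json())
```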

ComfyUI looks for checkpoint files in /workspace instead of /runpod-volume
Unhealthy worker state in serverless endpoint: remote error: tls: bad record MAC

Job Dispatching Issue - Jobs Not Sent to Running Workers
Stuck at initializing

So serverless death