Runpod · 3y ago
antoniog

Issues with building the new `worker-vllm` Docker Image

I've been using the previous version of worker-vllm with an AWQ model in production, and it recently turned out that it doesn't scale properly: all requests are sent to a single worker.

I've tried the newest version of worker-vllm. It works with the pre-built Docker image, but I need to build a custom image with a slightly modified vllm (one minor update in it negatively affects the quality of the outputs).

Unfortunately, I run into problems when building the Docker image, even without any modifications.

There are already three related issues on GitHub:
https://github.com/runpod-workers/worker-vllm/issues/21#issuecomment-1862188983
https://github.com/runpod-workers/worker-vllm/issues/25
https://github.com/runpod-workers/worker-vllm/issues/26

Could you please take a look at it, or suggest a way to scale the previous version of worker-vllm? Thanks in advance!

Errors when building the image · Issue #25 · runpod-workers/worker-...
I'm building the image with WORKER_CUDA_VERSION=12.1 on an M1 Mac using command docker buildx build -t antonioglass/worker-vllm-new:1.0.0 . --platform linux/amd64 and getting errors. See below....

Build not possilbe · Issue #26 · runpod-workers/worker-vllm
I tried to build the docker from scratch but also get an error (using CUDA 11.8, runpod/base:0.4.4) RUN python3.11 -m pip install -e git+https://github.com/runpod/vllm-fork-for-sls-worker.git@cuda-...

HF Model Download get stuck · Issue #21 · runpod-workers/worker-vllm
Around 1-3% the download of model while building docker image get stuck and don't move forward. This happens with different models too and wasn't happening earlier. Outside of this docker i...