RunPod•4mo ago
Casper.

worker-vllm build fails

I am getting the following error when building the new worker-vllm image with my model.
=> ERROR [vllm-base 6/7] RUN --mount=type=secret,id=HF_TOKEN,required=false if [ -f /run/secrets/HF_TOKEN ]; then export HF_TOKEN=$(cat /run/secrets/HF_TOKEN); fi && if [ -n "Pate 10.5s
------
> [vllm-base 6/7] RUN --mount=type=secret,id=HF_TOKEN,required=false if [ -f /run/secrets/HF_TOKEN ]; then export HF_TOKEN=$(cat /run/secrets/HF_TOKEN); fi && if [ -n "PatentPilotAI/mistral-7b-patent-instruct-v2" ]; then python3 /download_model.py; fi:
#10 9.713 Traceback (most recent call last):
#10 9.713 File "/download_model.py", line 4, in <module>
#10 9.715 from vllm.model_executor.weight_utils import prepare_hf_model_weights, Disabledtqdm
#10 9.715 File "/vllm-installation/vllm/model_executor/__init__.py", line 2, in <module>
#10 9.715 from vllm.model_executor.model_loader import get_model
#10 9.715 File "/vllm-installation/vllm/model_executor/model_loader.py", line 10, in <module>
#10 9.715 from vllm.model_executor.weight_utils import (get_quant_config,
#10 9.715 File "/vllm-installation/vllm/model_executor/weight_utils.py", line 18, in <module>
#10 9.715 from vllm.model_executor.layers.quantization import (get_quantization_config,
#10 9.715 File "/vllm-installation/vllm/model_executor/layers/quantization/__init__.py", line 4, in <module>
#10 9.716 from vllm.model_executor.layers.quantization.awq import AWQConfig
#10 9.716 File "/vllm-installation/vllm/model_executor/layers/quantization/awq.py", line 6, in <module>
#10 9.716 from vllm._C import ops
#10 9.716 ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
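
Context: libcuda.so.1 ships with the host's NVIDIA driver rather than with the CUDA toolkit inside the image, so importing vllm._C during a plain docker build (no GPU, no driver mounted) fails exactly like this. A build-time download step can sidestep the whole import chain by fetching weights with huggingface_hub instead of vllm's loader. A minimal sketch, not the actual /download_model.py; the env-var names mirror the RUN line above, and the target directory is illustrative:

# Hypothetical stand-in for /download_model.py that never imports vllm,
# so it cannot touch the CUDA extension (vllm._C) at build time.
import os

from huggingface_hub import snapshot_download

model_name = os.environ.get("MODEL_NAME")  # e.g. "PatentPilotAI/mistral-7b-patent-instruct-v2"
hf_token = os.environ.get("HF_TOKEN")      # exported from the BuildKit secret, if present

if model_name:
    snapshot_download(repo_id=model_name, token=hf_token, local_dir="/model")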
18 Replies
Casper.
Casper.•4mo ago
@Alpay Ariyak I'm not sure what the exact issue is here. Do you need a GPU to build the image? I checked out commit 2b5b8dfb61e32d221bc8ce49f98ec74698154a6e to get it working for now. It seems the latest release is somehow broken.
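
For anyone reproducing the same workaround, a sketch of pinning the build to that commit; the MODEL_NAME build arg and the HF_TOKEN secret id come from the failing RUN line above, while the tag and token file are illustrative:

git clone https://github.com/runpod-workers/worker-vllm.git
cd worker-vllm
git checkout 2b5b8dfb61e32d221bc8ce49f98ec74698154a6e
# BuildKit is required for the --mount=type=secret step in the Dockerfile
docker build \
  --build-arg MODEL_NAME="PatentPilotAI/mistral-7b-patent-instruct-v2" \
  --secret id=HF_TOKEN,src=./hf_token.txt \
  -t my-registry/worker-vllm:pinned .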
ashleyk
ashleyk•4mo ago
It's a regression; it was fixed previously.
Casper.
Casper.•4mo ago
Actually, I had to go even further back to get it working 😅 (commit bfeb60c54eaad2eeffa9741ce7600eb30e573698). Just pushing my new serverless image now. Would love for this to be fixed so that I can upgrade.
Alpay Ariyak
Alpay Ariyak•4mo ago
Checking this now. This has to do with vLLM's updates. It seems to only be affecting AWQ and potentially other quants.
Alpay Ariyak
Alpay Ariyak•4mo ago
So this is why: https://github.com/vllm-project/vllm/blob/929b4f2973ec6a53ea4f0f03d21147ef8b8278be/vllm/model_executor/weight_utils.py#L85-L122
Love the "# TODO(woosuk): Move this to other place." lol. The function isn't used anywhere in that file.
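
The failure pattern in miniature: a module-level import chain that ends at the compiled extension, so merely importing weight_utils dlopens libcuda.so.1. Deferring the quantization import into the one function that needs it keeps a pure download path importable on driverless build hosts. A hypothetical sketch of the fix pattern, not vllm's actual code:

# Eager, module-level import: importing this module at all pulls in the
# compiled CUDA extension and fails without libcuda.so.1.
#   from vllm.model_executor.layers.quantization import get_quantization_config

def get_quant_config(quantization: str):
    # Deferred import: the CUDA-backed module only loads when a quantized
    # model actually needs it, so weight downloads work on GPU-less hosts.
    from vllm.model_executor.layers.quantization import get_quantization_config
    return get_quantization_config(quantization)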
Alpay Ariyak
Alpay Ariyak•4mo ago
Hey @Casper., could you try using image 0.3.1 as the base instead of 0.3.0, in this line: https://github.com/runpod-workers/worker-vllm/blob/717343b0ad4d8a4ea76626c52b473619c646e30b/Dockerfile#L2
Pushed a new image, should work.
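
The suggested change amounts to bumping the base tag on that Dockerfile line, along these lines (the stage alias matches the [vllm-base ...] steps in the log above):

# Dockerfile, line 2: base image bumped from 0.3.0 to 0.3.1
FROM runpod/worker-vllm:base-0.3.1-cuda11.8.0 AS vllm-base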
Casper.
Casper.•4mo ago
My model is not quantized. I'll try it later.
Alpay Ariyak
Alpay Ariyak•4mo ago
Got you. It should fix it either way.
Casper.
Casper.•4mo ago
Getting the same error:
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
Alpay Ariyak
Alpay Ariyak•4mo ago
What's the full error?
Casper.
Casper.•4mo ago
Same as I pasted above:
> [vllm-base 6/7] RUN --mount=type=secret,id=HF_TOKEN,required=false if [ -f /run/secrets/HF_TOKEN ]; then export HF_TOKEN=$(cat /run/secrets/HF_TOKEN); fi && if [ -n "PatentPilotAI/mistral-7b-patent-instruct-v2" ]; then python3 /download_model.py; fi:
#9 10.13 Traceback (most recent call last):
#9 10.13 File "/download_model.py", line 4, in <module>
#9 10.13 from vllm.model_executor.weight_utils import prepare_hf_model_weights, Disabledtqdm
#9 10.13 File "/vllm-installation/vllm/model_executor/__init__.py", line 2, in <module>
#9 10.13 from vllm.model_executor.model_loader import get_model
#9 10.13 File "/vllm-installation/vllm/model_executor/model_loader.py", line 10, in <module>
#9 10.13 from vllm.model_executor.weight_utils import (get_quant_config,
#9 10.13 File "/vllm-installation/vllm/model_executor/weight_utils.py", line 18, in <module>
#9 10.13 from vllm.model_executor.layers.quantization import QuantizationConfig
#9 10.13 File "/vllm-installation/vllm/model_executor/layers/quantization/__init__.py", line 4, in <module>
#9 10.13 from vllm.model_executor.layers.quantization.awq import AWQConfig
#9 10.13 File "/vllm-installation/vllm/model_executor/layers/quantization/awq.py", line 6, in <module>
#9 10.13 from vllm._C import ops
#9 10.14 ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
------
executor failed running [/bin/sh -c if [ -f /run/secrets/HF_TOKEN ]; then export HF_TOKEN=$(cat /run/secrets/HF_TOKEN); fi && if [ -n "$MODEL_NAME" ]; then python3 /download_model.py; fi]: exit code: 1
Alpay Ariyak
Alpay Ariyak•4mo ago
Are you sure you’re using 0.3.1?
Casper.
Casper.•4mo ago
Yeah, it loads the new 0.3.1:
[vllm-base 1/7] FROM docker.io/runpod/worker-vllm:base-0.3.1-cuda11.8.0
I'm on the latest commit d91ccb866fc784b81a558f0da44041a020ba54e0
Alpay Ariyak
Alpay Ariyak•4mo ago
I see what's going on
Casper.
Casper.•4mo ago
I am building on a MacBook M2, btw, just for reference.
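
Worth noting for other Apple Silicon users: RunPod serverless runs linux/amd64 images, so a build on an M-series Mac generally needs an explicit target platform. A sketch, with illustrative tag and token file:

docker buildx build --platform linux/amd64 \
  --build-arg MODEL_NAME="PatentPilotAI/mistral-7b-patent-instruct-v2" \
  --secret id=HF_TOKEN,src=./hf_token.txt \
  -t my-registry/worker-vllm:latest \
  --push .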
Alpay Ariyak
Alpay Ariyak•4mo ago
Pushing a new base image.
Done, try now.
Casper.
Casper.•4mo ago
Rebuilding now, let's see.
Build worked, @Alpay Ariyak, thanks for fixing it!
Alpay Ariyak
Alpay Ariyak•4mo ago
No problem, thanks for pointing it out!