worker-vllm build fails
I'm getting the following error when building the new worker-vllm image with my model, PatentPilotAI/mistral-7b-patent-instruct-v2.
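For reference, this is roughly the build command I'm running (the image tag and token file path are placeholders, and the flags are my best reconstruction of what the worker-vllm README describes, matching the MODEL_NAME and HF_TOKEN secret visible in the log below):

DOCKER_BUILDKIT=1 docker build -t worker-vllm:mistral-7b-patent \
  --build-arg MODEL_NAME="PatentPilotAI/mistral-7b-patent-instruct-v2" \
  --secret id=HF_TOKEN,src=./hf_token.txt \
  .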
=> ERROR [vllm-base 6/7] RUN --mount=type=secret,id=HF_TOKEN,required=false if [ -f /run/secrets/HF_TOKEN ]; then export HF_TOKEN=$(cat /run/secrets/HF_TOKEN); fi && if [ -n "Pate 10.5s
------
> [vllm-base 6/7] RUN --mount=type=secret,id=HF_TOKEN,required=false if [ -f /run/secrets/HF_TOKEN ]; then export HF_TOKEN=$(cat /run/secrets/HF_TOKEN); fi && if [ -n "PatentPilotAI/mistral-7b-patent-instruct-v2" ]; then python3 /download_model.py; fi:
#10 9.713 Traceback (most recent call last):
#10 9.713 File "/download_model.py", line 4, in <module>
#10 9.715 from vllm.model_executor.weight_utils import prepare_hf_model_weights, Disabledtqdm
#10 9.715 File "/vllm-installation/vllm/model_executor/__init__.py", line 2, in <module>
#10 9.715 from vllm.model_executor.model_loader import get_model
#10 9.715 File "/vllm-installation/vllm/model_executor/model_loader.py", line 10, in <module>
#10 9.715 from vllm.model_executor.weight_utils import (get_quant_config,
#10 9.715 File "/vllm-installation/vllm/model_executor/weight_utils.py", line 18, in <module>
#10 9.715 from vllm.model_executor.layers.quantization import (get_quantization_config,
#10 9.715 File "/vllm-installation/vllm/model_executor/layers/quantization/__init__.py", line 4, in <module>
#10 9.716 from vllm.model_executor.layers.quantization.awq import AWQConfig
#10 9.716 File "/vllm-installation/vllm/model_executor/layers/quantization/awq.py", line 6, in <module>
#10 9.716 from vllm._C import ops
#10 9.716 ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

From the traceback, it looks like download_model.py imports vllm, which pulls in the compiled extension vllm._C, and importing that tries to load libcuda.so.1. Since libcuda.so.1 ships with the NVIDIA driver rather than the CUDA toolkit layers of the base image, it generally isn't available during docker build unless the host's default container runtime is set to nvidia. Is the model download step supposed to work on a machine without a GPU, or does this image have to be built on a host with the driver available?