Slow model loading times with vLLM

I deployed a vLLM worker (version 0.8.5) from the web UI and attached a network storage volume. The model is a fine-tuned Gemma 3.

INFO 05-17 20:09:56 [loader.py:458] Loading weights took 113.32 seconds
INFO 05-17 20:09:56 [model_runner.py:1140] Model loading took 23.3141 GiB and 160.792180 seconds

Is this normal? Total loading time is about 160 s. Could this be a disk I/O issue?
1 Reply
Jason, 2w ago
Maybe it's normal, but if you want faster loading, build the image with the model inside it rather than loading it from a network volume.
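
To judge whether disk I/O is the likely bottleneck, one rough check is to turn the two log lines from the question into an effective read rate. The sketch below is only an estimate: it assumes the 23.3141 GiB reported by vLLM is close to the size of the weight files read from the volume (the log reports memory used by the model, not file size), and the function name `weight_load_throughput` is made up for this example.

```python
import re

# The two log lines quoted in the question, copied verbatim.
LOG = (
    "INFO 05-17 20:09:56 [loader.py:458] Loading weights took 113.32 seconds\n"
    "INFO 05-17 20:09:56 [model_runner.py:1140] "
    "Model loading took 23.3141 GiB and 160.792180 seconds\n"
)


def weight_load_throughput(log: str) -> dict:
    """Estimate the effective read rate during weight loading.

    Assumption: the GiB figure in the log approximates the number of
    bytes read from storage. This is a hypothetical helper, not part
    of vLLM.
    """
    weights_s = float(
        re.search(r"Loading weights took ([\d.]+) seconds", log).group(1)
    )
    m = re.search(r"Model loading took ([\d.]+) GiB and ([\d.]+) seconds", log)
    size_gib = float(m.group(1))
    total_s = float(m.group(2))
    return {
        # Effective rate while weights were being loaded.
        "mib_per_s": size_gib * 1024 / weights_s,
        # Time spent in model loading outside of reading weights.
        "other_s": total_s - weights_s,
    }


result = weight_load_throughput(LOG)
print(f"effective read rate: {result['mib_per_s']:.1f} MiB/s")
print(f"time outside weight loading: {result['other_s']:.1f} s")
```

With the numbers from the question this comes out to roughly 210 MiB/s during weight loading, plus about 47 s spent elsewhere in model loading. If the network volume cannot sustain much more than that rate, storage is a plausible bottleneck, and baking the model into the image as suggested above should help.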
