slow model loading times with vllm - Runpod

Runpod•11mo ago•

3 replies

I deployed the vLLM worker (version 0.8.5) from the web UI and attached a network storage volume.
The model is a fine-tuned Gemma 3.

INFO 05-17 20:09:56 [loader.py:458] Loading weights took 113.32 seconds
INFO 05-17 20:09:56 [model_runner.py:1140] Model loading took 23.3141 GiB and 160.792180 seconds

Is this normal? Total loading time is about 160 s, of which about 113 s is spent loading weights.
Could this be a disk I/O issue?
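
One way to sanity-check the disk I/O theory is to turn the two log lines above into an effective read rate. The sketch below is only a rough estimate: it assumes the 23.3141 GiB reported by `model_runner.py` is close to the number of bytes actually read from the network volume, which the log does not confirm. The function names (`parse_load_stats`, `effective_mib_per_s`) are made up for illustration.

```python
import re

# The two log lines from the post, copied verbatim.
LOG_LINES = [
    "INFO 05-17 20:09:56 [loader.py:458] Loading weights took 113.32 seconds",
    "INFO 05-17 20:09:56 [model_runner.py:1140] Model loading took 23.3141 GiB and 160.792180 seconds",
]

WEIGHTS_RE = re.compile(r"Loading weights took ([\d.]+) seconds")
MODEL_RE = re.compile(r"Model loading took ([\d.]+) GiB and ([\d.]+) seconds")


def parse_load_stats(lines):
    """Pull the weight-loading time, model size, and total time out of the log."""
    stats = {}
    for line in lines:
        m = WEIGHTS_RE.search(line)
        if m:
            stats["weights_seconds"] = float(m.group(1))
        m = MODEL_RE.search(line)
        if m:
            stats["size_gib"] = float(m.group(1))
            stats["total_seconds"] = float(m.group(2))
    return stats


def effective_mib_per_s(size_gib, seconds):
    """Average rate in MiB/s, ASSUMING size_gib was all read from disk."""
    return size_gib * 1024 / seconds


stats = parse_load_stats(LOG_LINES)
throughput = effective_mib_per_s(stats["size_gib"], stats["weights_seconds"])
# Time inside "Model loading" that was not spent in "Loading weights".
other_seconds = stats["total_seconds"] - stats["weights_seconds"]

print(f"effective weight read rate: {throughput:.1f} MiB/s")
print(f"time outside weight loading: {other_seconds:.1f} s")
```

Under that assumption the weights came in at roughly 210 MiB/s, with about 47 s of the total spent outside the weight-loading step. Whether that rate is the storage limit or something else in the load path would need a direct read benchmark on the same volume to tell.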
