Very slow serverless vLLM cold starts
Considering moving away from RunPod; it's insane how slow this is on serverless.
RunPod serverless 4090 GPU, cold start of vLLM:
Model loading took 7.5552 GiB and 52.588290 seconds
My local 3090, cold start of vLLM:
Model loading took 7.5552 GiB and 1.300690 seconds
Any ideas?
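Doing the arithmetic on those logs: 7.5552 GiB in 52.59 s is roughly 147 MiB/s on the serverless worker, versus roughly 5.8 GiB/s for the same 7.5552 GiB in 1.30 s locally. That gap looks like storage bandwidth (e.g. weights pulled from a network volume vs. a warm local NVMe/page cache), not GPU speed. A minimal sketch of how one might measure what the worker's disk actually delivers; `read_throughput_mib_s` is a hypothetical helper, and in practice you would point it at a real `.safetensors` shard on the volume instead of the throwaway file used here:

```python
import os
import tempfile
import time

def read_throughput_mib_s(path: str, chunk_mib: int = 8) -> float:
    """Sequentially read `path` in chunks and return the observed MiB/s."""
    size = os.path.getsize(path)
    chunk = chunk_mib * 1024 * 1024
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(chunk):
            pass
    elapsed = time.perf_counter() - start
    return (size / (1024 * 1024)) / elapsed

# Demo on a throwaway 64 MiB file of random bytes; on a RunPod worker,
# replace tmp.name with the path to a model shard on the attached volume.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(64 * 1024 * 1024))
rate = read_throughput_mib_s(tmp.name)
print(f"{rate:.0f} MiB/s")
os.unlink(tmp.name)
```

If the measured rate on the worker is in the low hundreds of MiB/s, the 52 s load time is fully explained by I/O, and caching the weights on local container disk (or baking them into the image) would be the fix to try.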