I'm considering moving away from RunPod; it's just insane how slow this is on serverless.
RunPod serverless 4090 GPU, vLLM cold start:
Model loading took 7.5552 GiB and 52.588290 seconds
My local 3090, vLLM cold start:
Model loading took 7.5552 GiB and 1.300690 seconds
Any ideas?