Some queries take longer than usual
I've noticed that some queries take a very long time (stuck in delay). Why?
P.S. I noticed that the problem occurs when I leave the server idle for a while.

6 Replies
Maybe it's a cold start.
The model needs to be loaded into the worker first.
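
To illustrate why only some requests are slow, here is a minimal sketch of a worker that loads its model once and caches it. `load_model`, `get_model`, and `handler` are hypothetical names, and the `time.sleep` call only stands in for a real model load; this is not the actual vLLM worker code.

```python
import time

_MODEL = None  # cached for the lifetime of the worker process


def load_model():
    """Hypothetical stand-in for an expensive model load."""
    time.sleep(0.2)  # simulates reading weights from disk
    return {"name": "dummy-model"}


def get_model():
    """Load the model on first use, then reuse the cached copy."""
    global _MODEL
    if _MODEL is None:
        _MODEL = load_model()  # cold start: paid once per fresh worker
    return _MODEL


def handler(job):
    """Hypothetical request handler: warm calls skip the load step."""
    model = get_model()
    return {"model": model["name"], "echo": job["input"]}
```

The first call on a fresh worker pays the load cost and later calls do not. If the worker is shut down while idle, the next request starts from a fresh process and pays the cost again.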

Can I upload a checkpoint to RunPod storage to use with serverless?
Yes, sure. It's just like a pod; you just need code that downloads it from inside the serverless worker,
then put it into /runpod-volume, which is the path for the network volume/storage.
So I need to modify the code of the vLLM worker in RunPod's Git repo?
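
A rough sketch of that download step, using only the Python standard library. `ensure_checkpoint` is a hypothetical helper, not part of any RunPod package, and it assumes the checkpoint is a single file reachable by URL whose last path component is a usable filename.

```python
import os
import shutil
import urllib.request

VOLUME_DIR = "/runpod-volume"  # mount path mentioned in this thread


def ensure_checkpoint(url, dest_dir=VOLUME_DIR, filename=None):
    """Download url into dest_dir unless the file already exists; return its path."""
    filename = filename or os.path.basename(url)
    dest_path = os.path.join(dest_dir, filename)
    if os.path.exists(dest_path):
        return dest_path  # already on the volume, skip the download
    os.makedirs(dest_dir, exist_ok=True)
    tmp_path = dest_path + ".part"
    # Write to a temporary name first so a partial download is never
    # mistaken for a complete checkpoint.
    with urllib.request.urlopen(url) as resp, open(tmp_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    os.replace(tmp_path, dest_path)
    return dest_path
```

Calling `ensure_checkpoint(url)` at worker startup with your own checkpoint URL would download the file only the first time; later workers that mount the same volume would find it already there.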