Hi, I've been using the default RunPod vLLM template with the Mixtral model loaded on a network volume. I'm hitting CUDA out-of-memory errors on cold starts.
Here is the error log:
2024-01-15T20:32:13.726720287Z torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 0 has a total capacty of 47.54 GiB of which 16.75 MiB is free. Process 422202 has 47.51 GiB memory in use. Of the allocated memory 47.05 GiB is allocated by PyTorch, and 12.67 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
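As a workaround I've been experimenting with the allocator setting the log itself suggests, plus capping vLLM's GPU memory reservation. This is just a sketch of what I tried, assuming the template reads environment variables and passes extra flags through to the vLLM engine (the exact values here are guesses, not tuned):

```shell
# Suggested by the error message: limit allocator block splitting to reduce fragmentation.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512

# vLLM's --gpu-memory-utilization controls the fraction of VRAM it pre-allocates
# (default is 0.9); lowering it leaves headroom for other allocations on cold start.
python -m vllm.entrypoints.openai.api_server \
  --model /runpod-volume/mixtral \
  --gpu-memory-utilization 0.85
```

Neither change fixed it so far, so I'm not sure whether the problem is fragmentation or the model simply not fitting on the 48 GiB card.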