vLLM problem: CUDA out of memory (I'm using 2 GPUs with RunPod's worker-vllm image)

"dt":"2023-12-22 12:02:22.089336" "endpointid":"489pa1sglkvuhf" "level":"info" "message":"torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 0 has a total capacty of 44.35 GiB of which 77.38 MiB is free. Process 1721253 has 44.26 GiB memory in use. Of the allocated memory 43.80 GiB is allocated by PyTorch, and 14.25 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF" "workerId":"xhzthbakpe7j7z" }
3 Replies
nerdylive (6mo ago)
ah new error
nerdylive (6mo ago)
(attachment, no description)
nerdylive (6mo ago)
"message":"ValueError: Cannot find the config file for awq"