vLLM Inconsistently Hangs at NCCL Initialization
Hi, I am trying to run vLLM on 2x A40 GPUs and it sometimes hangs at NCCL initialization. This happens inconsistently: sometimes it starts fine, but on a pod where it hangs, repeated attempts always hang...
CUDA 12.4.1
python 3.10
vllm 0.7.3
command:
vllm serve unsloth/Meta-Llama-3.1-8B --tensor-parallel-size 2
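In case it's useful for narrowing this down, a minimal torch.distributed sanity check (not vLLM-specific, just a sketch) that exercises the same NCCL initialization path across both GPUs would look something like the script below; nccl_check.py is just a placeholder name, and running it under NCCL_DEBUG=INFO should show where the init stalls.

# nccl_check.py - minimal NCCL sanity check across 2 GPUs (placeholder filename)
# run with: NCCL_DEBUG=INFO torchrun --nproc-per-node=2 nccl_check.py
import torch
import torch.distributed as dist

# torchrun sets RANK / WORLD_SIZE / MASTER_ADDR / MASTER_PORT for us
dist.init_process_group(backend="nccl")

local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

# all-reduce a small tensor; if NCCL is broken this is where it hangs
data = torch.ones(128, device="cuda")
dist.all_reduce(data, op=dist.ReduceOp.SUM)
torch.cuda.synchronize()

assert data[0].item() == dist.get_world_size()
print(f"rank {dist.get_rank()}: NCCL all_reduce OK")
dist.destroy_process_group()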
2 Replies
Unknown User • 7mo ago
Message Not Public
We weren't, and I think forcing CUDA 12.4 fixed the issue. Thanks!
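For anyone who lands here with the same symptom, a quick way to confirm which CUDA and NCCL versions the installed PyTorch wheel was actually built against (which is what vLLM uses at runtime, independent of the system toolkit) is something like:

# version check: a mismatched torch/driver/toolkit CUDA stack is a common cause of NCCL hangs
import torch
print("torch:", torch.__version__)
print("built against CUDA:", torch.version.cuda)
print("NCCL:", ".".join(str(v) for v in torch.cuda.nccl.version()))
print("GPUs visible:", torch.cuda.device_count())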