P2P is disabled between NVLINK connected GPUs 1 and 0
Hey team! Could you fix the NVLink issue on H100 SXM Community pods? I run into this error frequently. Corrupted pod ID: 4a5acwxj2kene6
P2P is disabled between NVLINK connected GPUs 1 and 0. This should not be the case given their connectivity, and is probably due to a hardware issue. If you still want to proceed, you can set NCCL_IGNORE_DISABLED_P2P=1.
I can proceed with the NCCL_IGNORE_DISABLED_P2P flag, but this drops performance by roughly 10%.

Solution
@storuky2306 we got a response, and apparently gpu5 does not support P2P.
What we can advise for now is to pick a different machine.
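If it helps when picking a different machine, one way to check a pod before starting a long run is to query peer access for every GPU pair. This is only a minimal sketch, assuming PyTorch with CUDA is installed on the pod; the helper names `p2p_disabled_pairs` and `check_with_torch` are made up for illustration and are not part of any library.

```python
from typing import Callable, List, Tuple


def p2p_disabled_pairs(
    device_count: int, can_access: Callable[[int, int], bool]
) -> List[Tuple[int, int]]:
    """Return every ordered (src, dst) GPU pair whose peer access is reported off."""
    return [
        (src, dst)
        for src in range(device_count)
        for dst in range(device_count)
        if src != dst and not can_access(src, dst)
    ]


def check_with_torch() -> List[Tuple[int, int]]:
    """Run the check on real hardware (assumption: PyTorch built with CUDA)."""
    import torch  # imported lazily so the helper above works without PyTorch

    return p2p_disabled_pairs(
        torch.cuda.device_count(), torch.cuda.can_device_access_peer
    )
```

On a healthy pod, `check_with_torch()` should return an empty list. Any pair it does list is one where NCCL would likely raise the error quoted above, so it is probably worth switching machines before launching the job.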