CUDA error in community 4090x4 pod

Running the script from https://github.com/BAI-Yeqi/PyTorch-Verification gives this error. Pod ID: uy54q2udx0jbcw (deleted)
3 Replies
Jaunty (8mo ago)
I've noticed that on some instances with multiple RTX 4090s, CUDA does not work at all: even the first cudaGetDeviceCount call returns error 999 (unknown error). On other instances everything works fine. Maybe the drivers are out of date, or some system configuration is wrong. I haven't figured it out yet.
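For anyone who wants to reproduce the check without PyTorch, here is a rough sketch that calls cudaGetDeviceCount directly through ctypes. It assumes the CUDA runtime library (libcudart) is installed and discoverable on the pod, which may not be true for every image. The helper names `describe` and `probe_device_count` are made up for this example, and the error table covers only a few codes from the CUDA runtime's error enum.

```python
import ctypes
import ctypes.util

# A small subset of CUDA runtime error codes (cudaError enum values).
CUDA_ERRORS = {
    0: "cudaSuccess",
    35: "cudaErrorInsufficientDriver",
    100: "cudaErrorNoDevice",
    999: "cudaErrorUnknown",
}


def describe(code):
    """Map a numeric CUDA runtime error code to its enum name (hypothetical helper)."""
    return CUDA_ERRORS.get(code, f"unrecognized error {code}")


def probe_device_count(lib_name=None):
    """Call cudaGetDeviceCount and return (error_code, device_count).

    Returns None if the CUDA runtime library cannot be located or loaded.
    lib_name is an optional explicit path or name; by default we ask the
    system loader for "cudart" (an assumption about how it is installed).
    """
    name = lib_name or ctypes.util.find_library("cudart")
    if name is None:
        return None
    try:
        cudart = ctypes.CDLL(name)
    except OSError:
        return None
    cudart.cudaGetDeviceCount.argtypes = [ctypes.POINTER(ctypes.c_int)]
    cudart.cudaGetDeviceCount.restype = ctypes.c_int
    count = ctypes.c_int(0)
    code = cudart.cudaGetDeviceCount(ctypes.byref(count))
    return code, count.value


if __name__ == "__main__":
    result = probe_device_count()
    if result is None:
        print("CUDA runtime library not found; cannot probe.")
    else:
        code, count = result
        print(f"cudaGetDeviceCount -> {describe(code)} ({code}), devices: {count}")
```

If this reports cudaErrorUnknown (999) on a fresh pod while the same image works elsewhere, that points at the host (driver or system configuration) rather than at the script.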
