Alright, so I restarted the pod (with the env var you suggested) and CUDA reported zero GPUs.
Then I removed the env var, restarted again, and CUDA now reports four GPUs. No changes to the code or config in between.
Either:
- somehow the pip install commands messed up CUDA, and restarting fixed that
- RunPod is flaky about whether the GPUs actually get attached to the pod
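
For next time, here is a rough sketch of a check that might help tell those two apart. It assumes `nvidia-smi` is on the pod's PATH, and the function names are just made up for illustration. It prints any env vars with CUDA or NVIDIA in the name and counts the GPUs the driver lists. As far as I know `nvidia-smi` ignores CUDA-level visibility settings, so if it lists four GPUs while CUDA reports zero, that points at the environment or the install; if it lists none, the GPUs probably never got attached.

```python
import os
import shutil
import subprocess


def count_gpus_from_listing(listing: str) -> int:
    """Count lines of the form 'GPU 0: ...' in `nvidia-smi -L` output."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def driver_gpu_count():
    """Return how many GPUs the driver lists, or None if nvidia-smi can't be run."""
    # Assumption: nvidia-smi is installed in the container image.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return count_gpus_from_listing(result.stdout)


def gpu_related_env():
    """Collect env vars whose names mention CUDA or NVIDIA."""
    return {k: v for k, v in os.environ.items() if "CUDA" in k or "NVIDIA" in k}


if __name__ == "__main__":
    print("GPU-related env vars:", gpu_related_env())
    print("GPUs listed by nvidia-smi:", driver_gpu_count())
```

Running this right after each restart, before any pip installs, would show whether the GPU count is already wrong at pod start.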