Error when running 'nvidia-smi' on pod: 'Failed to initialize NVML: Unknown Error'
Hello. I am training an ASR model on a GPU pod, and as the title says, this error started occurring.
I know this is a well-known issue and I know it needs to be fixed on the host server.
The details of the pod where the issue occurred are as follows.
Region: US-GA-2
Pod ID: mvu7urvdtnjdi4
I have not experienced any inconvenience from this issue and am writing this post only to let you know, so I don't think it is necessary to open a ticket.
I hope that appropriate action will be taken someday.
Best regards, michigety.


