Multi-Instance GPU Support on RunPod
Hey,
Last night, I was experimenting with RunPod for the first time, and tried to enable MIG via nvidia-smi, but it didn’t work. Instead, it threw an ‘insufficient permissions’ error.
This appears to be due to how virtualisation is implemented, which leaves no hardware-level control within a pod.
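For reference, here is a rough sketch of the kind of check and attempt I mean, wrapped in Python so the failure is easy to capture. The helper names (parse_mig_modes, query_mig_modes, try_enable_mig) are my own and purely illustrative. It assumes nvidia-smi is on the PATH and that the driver exposes the mig.mode.current query field; the exact values it reports (something like Enabled, Disabled or [N/A]) may vary by driver version.

```python
import shutil
import subprocess
from typing import List, Optional, Tuple


def parse_mig_modes(csv_output: str) -> List[str]:
    """Split nvidia-smi's one-value-per-line output into a list, skipping blank lines."""
    return [line.strip() for line in csv_output.splitlines() if line.strip()]


def query_mig_modes() -> Optional[List[str]]:
    """Return the current MIG mode of each visible GPU, or None if it cannot be queried."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA tooling available in this environment
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=mig.mode.current", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return parse_mig_modes(result.stdout)


def try_enable_mig(gpu_index: int = 0) -> Tuple[bool, str]:
    """Attempt to switch one GPU into MIG mode; return (succeeded, tool output).

    Inside a shared container this is expected to fail with a permissions error.
    """
    if shutil.which("nvidia-smi") is None:
        return False, "nvidia-smi not found"
    result = subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index), "-mig", "1"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, (result.stdout + result.stderr).strip()
```

Calling query_mig_modes() first shows whether MIG is even reported for the GPU, and try_enable_mig(0) returns the tool's own message, which in my case was the permissions error.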
I tried the same thing on Vast.ai and got the same result; however, they informed me that they do not officially support this functionality.
With all that, is there a way to get this to work on RunPod? MIG is mentioned in the following blog post: https://www.runpod.io/articles/rent/h100-sxm
Thank you for your help and time.
5 Replies
You're inside a Docker container on a host shared with other customers, so you won't be able to get that going, but maybe you want an instant cluster instead.
Ah, okay. Thought that was why, but didn’t know there was an alternative. Thank you, I’ll give it a try!
Just got around to looking at this, and it appears that this option will be out of my budget. Thank you for the insight nonetheless, @Henky!! - is there definitely no other way to do this with RunPod?
You'd have to DIY something, but you're not gonna get true root; it's a container rental service.
Thought as much… thank you for your response again!
Anything that seems to work costs an arm and a leg 😢
So I haven’t been able to test them
You can do this on RunPod with bare metal, if you also request access to the permissions.
I think they mention it there just to show the capabilities, but yes, maybe it's a bit confusing because it's written on RunPod's blog.