Monitor GPU VRAM - Which GPU to check?
I am trying to monitor GPU VRAM usage in a serverless worker. To do this with pynvml I need to provide the index of the GPU. Is there a way to obtain the index of the GPU my worker is using? I did not see this info in the environment variables. I do see RUNPOD_GPU_COUNT, but I'm not sure if that helps.
It seems RunPod already monitors CPU and GPU stats, since they are shown in the web interface. Does the RunPod Python module expose those stats, so we don't have to code this ourselves?
Below is a code snippet that reports VRAM usage as a percentage.
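For readers following along, here is a minimal sketch of what a pynvml-based check could look like. It is illustrative only and is not the poster's original snippet. It assumes the `pynvml` package is installed and that the container only sees the GPUs assigned to it, so it loops over every visible device instead of guessing a single index. The helper names `vram_percent` and `report_visible_gpus` are made up for this example.

```python
def vram_percent(used_bytes, total_bytes):
    """Return used VRAM as a percentage of total (0.0 if total is not positive)."""
    if total_bytes <= 0:
        return 0.0
    return 100.0 * used_bytes / total_bytes


def report_visible_gpus():
    """Return VRAM usage for every GPU that NVML can see from this process."""
    # Imported lazily so vram_percent() stays usable on machines without NVML.
    import pynvml

    pynvml.nvmlInit()
    try:
        stats = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)  # fields: total, free, used (bytes)
            stats.append({
                "index": index,
                "used_pct": vram_percent(mem.used, mem.total),
            })
        return stats
    finally:
        # Always release NVML, even if a query above raised.
        pynvml.nvmlShutdown()
```

Calling `report_visible_gpus()` inside the worker returns one entry per visible GPU. If the assumption about device visibility holds, a single-GPU worker would only ever report index 0.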
Thanks!
16 Replies
Maybe I could use GraphQL with PodTelemetry? Where's my GraphQL experts at?
I could assume that my worker is using the GPU at index 0, but if there are multiple GPUs in the server that might not be accurate. I might be on GPU 3 while another worker is using GPU 0. I am pretty sure I can get that info with GraphQL. I should be able to query by pod ID, and the response includes PodTelemetry, which contains CPU and GPU stats. I'm just struggling with the documentation for it.
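A rough sketch of querying by pod ID from Python is below. Treat all of it as assumptions to check against the GraphQL schema: the endpoint URL, passing the API key as an `api_key` query parameter, and the field names (`runtime`, `gpus`, `gpuUtilPercent`, `memoryUtilPercent`, `container`, `cpuPercent`, `memoryPercent`) follow the shape of RunPod's published pod query examples and may differ from what PodTelemetry exposes. `build_pod_query` and `fetch_pod_stats` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed endpoint; verify against the RunPod API documentation.
GRAPHQL_URL = "https://api.runpod.io/graphql"


def build_pod_query(pod_id):
    """Build the JSON payload for a pod lookup by ID (field names are assumptions)."""
    # json.dumps() yields a quoted, escaped string literal for the pod ID.
    query = (
        "query Pod { pod(input: {podId: %s}) { id runtime { "
        "gpus { id gpuUtilPercent memoryUtilPercent } "
        "container { cpuPercent memoryPercent } } } }"
    ) % json.dumps(pod_id)
    return {"query": query}


def fetch_pod_stats(pod_id, api_key):
    """POST the query and return the decoded JSON response."""
    request = urllib.request.Request(
        GRAPHQL_URL + "?api_key=" + api_key,
        data=json.dumps(build_pod_query(pod_id)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Inside a worker, the pod ID would have to come from somewhere such as an environment variable (for example `RUNPOD_POD_ID`, if it is set in your environment).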
Yeah, I've seen that. I'm still looking for a good example of making a GraphQL request.
I would need to provide the pod ID.
So what do I do? Add podId: ${pod_id} to input?
That's great, thanks!
I was going to send that data over the WebSocket, but this is much better. I can just have the browser call this once a second and update the CPU/GPU graph.
Yeah, I think it is really coming along. Everything works; I just need to update the CPU/GPU graph and display the result media.

ToonCrafter is just one in the market... I will likely try to add a lot of models before going live. My code builds the interface dynamically, so I should be able to add them pretty fast.
