I'm seeing 93% GPU Memory Used even in a freshly restarted pod.
Not sure what to do about this. nvidia-smi shows there are no processes running, but when I try to run a job it shows "Process 1726743 has 42.25 GiB memory in use". How do I find and kill that?
3 Replies
Unknown User•10mo ago
[Message not public]
I tried most of that. The process ID it quoted doesn't show up in ps -ef (and the number is a bit unusual).
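For anyone wanting to repeat that check, here is a minimal sketch in Python. It asks nvidia-smi which compute processes hold GPU memory and tests whether each PID is visible under /proc inside the pod. The helper names (parse_compute_apps, pid_visible, gpu_compute_apps) are made up for illustration. One possible explanation for a PID that holds memory but is not visible, which is an assumption and not something confirmed in this thread, is that the process lives in another container or on the host, in a different PID namespace.

```python
import os
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, used_memory' CSV rows into (pid, MiB-or-None) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or malformed rows
        pid = int(parts[0])
        # used_memory can be reported as a non-numeric value such as "[N/A]"
        mem = int(parts[1]) if parts[1].isdigit() else None
        rows.append((pid, mem))
    return rows


def pid_visible(pid, proc_root="/proc"):
    """True if the PID has an entry under proc_root (i.e. ps would list it)."""
    return os.path.isdir(os.path.join(proc_root, str(pid)))


def gpu_compute_apps():
    """Ask nvidia-smi for the compute processes it can see on the GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)


def main():
    try:
        apps = gpu_compute_apps()
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not query nvidia-smi: {exc}")
        return
    if not apps:
        print("nvidia-smi reports no compute processes from inside this pod")
    for pid, mem in apps:
        where = "visible here" if pid_visible(pid) else "not visible in this pod"
        print(f"pid {pid}: {mem} MiB, {where}")


if __name__ == "__main__":
    main()
```

If the list comes back empty while memory is still in use, or a PID is reported as not visible, then kill from inside the pod cannot reach that process.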
If there was a process holding onto memory, restarting the pod would clear that.
Unknown User•10mo ago
[Message not public]