Runpod · 9mo ago
octimot

A100 SXM with 87% GPU Memory Used at boot

I'm trying to boot a 1 x A100 SXM pod on EU-RO-1, but it comes up with 87% of GPU memory already in use (71238MiB of 81920MiB).

I can't find the process that is using the memory (the process list in nvidia-smi is empty), so I assume it's a bug?

root@a91054b2008e:/# nvidia-smi
Wed Apr 16 08:45:48 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.08             Driver Version: 550.127.08     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  |   00000000:0A:00.0 Off |                    0 |
| N/A   27C    P0             62W /  400W |   71238MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+


Has anyone encountered this?
Solution
It looks like the problem was with that machine on Runpod's side, and they've removed it!
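
For anyone who wants to catch this on a fresh pod before starting a job, here is a minimal Python sketch that flags the same symptom: a lot of GPU memory in use while no compute process is listed. It assumes `nvidia-smi` is on the PATH and relies on its `--query-gpu` and `--query-compute-apps` CSV output. The function names and the 10% threshold are made up for illustration. Keep in mind that inside a container nvidia-smi may not show processes that belong to the host or to other containers, so an empty list does not prove that nothing else is running on the GPU.

```python
import subprocess


def parse_memory_line(line):
    """Parse one 'used, total' CSV line (values in MiB) into (used, total, percent)."""
    used_str, total_str = [field.strip() for field in line.split(",")]
    used, total = int(used_str), int(total_str)
    return used, total, 100.0 * used / total


def parse_process_lines(text):
    """Parse 'pid, process_name, used_memory' CSV lines into a list of tuples.

    Empty input (no visible compute processes) gives an empty list.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The PID is the first field and the memory is the last one,
        # so a comma inside the process name does not break parsing.
        pid_str, rest = line.split(",", 1)
        name, mem = rest.rsplit(",", 1)
        rows.append((int(pid_str.strip()), name.strip(), mem.strip()))
    return rows


def is_suspicious(percent_used, processes, threshold_percent=10.0):
    """True when memory use is above the (arbitrary) threshold with no visible process."""
    return percent_used > threshold_percent and not processes


def run_query(query_flag):
    """Run nvidia-smi with one query flag and return its CSV output as text."""
    result = subprocess.run(
        ["nvidia-smi", query_flag, "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def check_pod(threshold_percent=10.0):
    """Return one (gpu_index, percent_used, suspicious) tuple per GPU.

    The process list is not split per GPU here, which is enough for a
    single-GPU pod like the one in this thread.
    """
    memory_out = run_query("--query-gpu=memory.used,memory.total")
    process_out = run_query("--query-compute-apps=pid,process_name,used_memory")
    processes = parse_process_lines(process_out)
    report = []
    for index, line in enumerate(memory_out.strip().splitlines()):
        _, _, percent = parse_memory_line(line)
        report.append((index, percent, is_suspicious(percent, processes, threshold_percent)))
    return report
```

Calling `check_pod()` right after boot and bailing out (or picking another machine) when any entry is flagged would have caught the pod above, since 71238MiB of 81920MiB is well over any sensible idle threshold.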