Runpod, 2y ago
jon691

8x H100 SXM5, Error 802

I'm getting an "Error 802: system not yet initialized" on an 8x H100 SXM5 community pod.

Running nv-fabricmanager gives this error:
# /usr/bin/nv-fabricmanager -c ~/nvswitch/fabricmanager.cfg
request to query NVSwitch device information from NVSwitch driver failed with error:Failed to load the requested module [NV_ERR_MODULE_LOAD_FAILED]

From nvidia-smi:
Fabric
    State  : Completed
    Status : Success

My workload runs smoothly on the 8x H100 PCIe pod.
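
In case it helps anyone reproduce this, here is a rough sketch of how the fabric fields could be collected for all GPUs instead of reading them by eye. It assumes `nvidia-smi -q` prints an indented `Fabric` section with `State` and `Status` lines for each GPU, as in the output above. The function names `parse_fabric_fields` and `read_fabric_fields` are made up for this example and are not part of any NVIDIA tool.

```python
import re
import subprocess


def parse_fabric_fields(nvidia_smi_q_output: str) -> list[dict[str, str]]:
    """Collect the State/Status pair from each 'Fabric' section.

    Assumption: the text follows the `nvidia-smi -q` layout, where a line
    containing only 'Fabric' is followed by 'State : ...' and 'Status : ...'.
    """
    sections: list[dict[str, str]] = []
    current = None  # dict being filled while inside a Fabric section
    for line in nvidia_smi_q_output.splitlines():
        stripped = line.strip()
        if stripped == "Fabric":
            current = {}
            sections.append(current)
            continue
        if current is None:
            continue
        match = re.match(r"^(State|Status)\s*:\s*(.+)$", stripped)
        if match:
            current[match.group(1)] = match.group(2).strip()
            if len(current) == 2:
                current = None  # both fields found, section is done
        elif ":" not in stripped:
            current = None  # a new section header started
    return sections


def read_fabric_fields() -> list[dict[str, str]]:
    """Run `nvidia-smi -q` on this machine and parse its Fabric sections."""
    result = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True, check=True
    )
    return parse_fabric_fields(result.stdout)
```

On the pod, `read_fabric_fields()` should return one entry per GPU that reports a fabric section, so a GPU with a state other than `Completed` would stand out. This only reads what `nvidia-smi` reports and does not explain why nv-fabricmanager fails to load the NVSwitch module.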