Runpod · 6mo ago
Goran

CUDA error: CUDA-capable device(s) is/are busy or unavailable

I see quite a few jobs fail with this error message:
RuntimeError: CUDA error: CUDA-capable device(s) is/are busy or unavailable
18:38:08 CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.

Once it happens, it usually affects every job on that worker, so I have to terminate the worker.
A retry of the same job on another worker completes as expected.
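
Since the failure seems tied to the worker rather than the job, one possible workaround is to probe the GPU before accepting work and fail fast if it is unavailable. Below is a minimal sketch, assuming PyTorch is the framework raising the error. The names `probe_gpu`, `gpu_is_healthy`, and `is_device_unavailable` are hypothetical helpers, not part of any Runpod or PyTorch API, and whether a non-zero exit actually gets the worker replaced is an assumption to verify for your setup.

```python
import sys

# Substring taken from the error message quoted above.
BUSY_MARKER = "busy or unavailable"


def is_device_unavailable(exc: BaseException) -> bool:
    """Return True if the exception text looks like the 'busy or unavailable' error."""
    return BUSY_MARKER in str(exc)


def probe_gpu() -> None:
    """Touch the GPU with a tiny allocation and synchronize so errors surface here."""
    import torch  # imported lazily so the helpers can be used without PyTorch

    torch.zeros(1, device="cuda")
    torch.cuda.synchronize()


def gpu_is_healthy(probe=probe_gpu) -> bool:
    """Run the probe; return False only for the 'busy or unavailable' error."""
    try:
        probe()
        return True
    except RuntimeError as exc:
        if is_device_unavailable(exc):
            return False
        raise  # any other RuntimeError is not what this check is for


if __name__ == "__main__":
    # Assumption: exiting non-zero at startup is an acceptable way to mark
    # this worker as unusable in your deployment.
    if not gpu_is_healthy():
        print("GPU busy or unavailable, refusing to take jobs", file=sys.stderr)
        sys.exit(1)
```

Running the check once at worker startup, before any job is picked up, would at least stop a bad worker from failing a whole series of jobs. It does not address why the device becomes unavailable in the first place.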