I’m generating videos with InfiniteTalk on RTX 5090 / RTX 4090 GPUs, running the ComfyUI SageAttention (CUDA 12.8) template.
On 5090 / 4090 everything works perfectly. I sometimes use a Network Volume, and it works reliably there as well.
However, when I run the exact same setup on an H200 SXM, I hit a serious issue:
• the output video contains only ~1 second of valid footage (or even just 1 frame),
• the rest of the video is a black screen.
What I’ve already verified:
• I tried both the SageAttention template and a setup without it
• I clone the same GitHub repositories
• I use the exact same ComfyUI workflow
• Same models, same settings
• Same input image and the same MP3 audio file
Everything is literally 1:1 identical, except for the GPU.
On H200 there is no full crash and generation runs to completion, but the output video is broken.
Could this be related to any of the following?
• H200 / Hopper-specific behavior
• SageAttention or attention backend compatibility
• FP8 / precision differences
• CUDA 12.8 issues on Hopper GPUs
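To help tell a precision problem apart from an encoding problem, one check I could run is to look at the decoded frames before they are written to video and find the first frame that is either all black or contains NaN/Inf values. Below is a minimal sketch of that idea. It assumes each frame is available as a flat sequence of floats in the range [0, 1]; the function names and the `black_threshold` value are hypothetical and not part of ComfyUI or InfiniteTalk.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple


def classify_frame(pixels: Sequence[float], black_threshold: float = 1e-3) -> str:
    """Return 'nan', 'black', or 'ok' for one flattened frame.

    Assumes pixel values are floats, nominally in [0, 1].
    The threshold is an illustrative guess, not a tuned value.
    """
    # NaN/Inf values would point toward a numerical (precision) problem.
    if any(math.isnan(p) or math.isinf(p) for p in pixels):
        return "nan"
    # A frame whose brightest pixel is near zero is treated as black.
    if max(pixels, default=0.0) <= black_threshold:
        return "black"
    return "ok"


def first_bad_frame(frames: Iterable[Sequence[float]]) -> Optional[Tuple[int, str]]:
    """Return (index, status) of the first non-'ok' frame, or None if all are fine."""
    for index, pixels in enumerate(frames):
        status = classify_frame(pixels)
        if status != "ok":
            return index, status
    return None
```

If the first bad frame reports `nan`, that would suggest the values are already invalid before encoding, which fits the precision or attention-backend hypotheses better than a video-writer issue. If it reports `black` with finite values, the cause may be elsewhere.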
I’d really appreciate any ideas or similar experiences. Thanks in advance!