Hello everyone. I am Dr. Furkan Gözükara. PhD Computer Engineer. SECourses is a dedicated YouTube channel for the following topics : Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
I just modified TTplanet's one. It's a good starting point. Let me know how your Wan LoRA success is. I'm going to keep experimenting, and will let you know what I discover too. I think there is a reason character LoRAs aren't being released as fast as "motion/animation" LoRAs.
Also TorchCompileModelWan (for Wan2.1) breaks LoRAs on 50xx but not 30xx. Must be the 3.3 Triton or the torch nightlies. I just tested: the same workflow works fine on 30xx but not on 50xx, with TorchCompile running on both.
Not my research, direct from Kijai: benchmarks. Everything here is Wan2.1 720 @ 49 frames, sageattention, CUDA 12.6, 3080 Ti, listing thresh, starts, times, and quality observations.
0.300 (rel_l1_thresh) 0.00 (start) -> 13:08 (encoding time), a lot more subtle grainy noise than 0.250
0.2875 0.00 -> 13:39, more grainy noise
0.280 0.00 -> 13:18, 92%, starts noisy
0.275 0.00 -> 14:55, near indistinguishable from 0.25
0.270 0.00 -> 14:41, 96%, nearly perfect
0.265 0.00 -> 16:09, 96%, near perfect
0.250 0.10 -> 16:30, looks lossless
2x run: Best so far, but slower, 0.250 0.10 (start) -> 16:20
Near if not lossless at 0.250, at the cost of about 3 minutes when compared to 0.300.
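To put a number on that "about 3 minutes", here is a quick sketch that converts the mm:ss times quoted in the benchmark into seconds and prints the gap against the 0.300 run. The helper names (`to_seconds`, `fmt`) are made up for this example; the three times are copied from the messages above.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'mm:ss' string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def fmt(seconds: int) -> str:
    """Format a number of seconds back into 'm:ss'."""
    return f"{seconds // 60}:{seconds % 60:02d}"


# rel_l1_thresh -> encoding time, as quoted in the benchmark messages.
times = {"0.300": "13:08", "0.275": "14:55", "0.250": "16:30"}

baseline = to_seconds(times["0.300"])
deltas = {thresh: to_seconds(t) - baseline for thresh, t in times.items()}

for thresh, delta in deltas.items():
    print(f"{thresh}: +{fmt(delta)} vs 0.300")
```

So going from 0.300 to 0.250 costs a bit over three minutes on this setup, and 0.275 lands a little under two minutes slower than 0.300.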
In summary:
0.300 ~ 13:00, I found a bit too grainy for my liking, though the speed is nice
0.275 ~ 14:55, great balance between speed and quality
0.250 ~ 16:30, I found was basically lossless
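For anyone wondering why a lower rel_l1_thresh is slower but cleaner, here is a rough sketch of the general idea behind a relative-L1 caching threshold. This is not Kijai's code: the function `plan_steps`, the per-step change values, and the reading of "start" as a fraction of the steps are all assumptions for illustration, and a real implementation measures the change on model inputs and may rescale it.

```python
def plan_steps(rel_l1_changes, rel_l1_thresh, start=0.0):
    """Decide for each sampling step whether to run the model or reuse the cache.

    rel_l1_changes[i] is a hypothetical relative L1 change of the model input
    between step i-1 and step i. 'start' is assumed to be the fraction of the
    steps that always run in full before caching is allowed.
    """
    total = len(rel_l1_changes)
    first_cached_step = int(start * total)
    accumulated = 0.0
    plan = []
    for i, change in enumerate(rel_l1_changes):
        # The first step, and every step before 'start', always runs the model.
        if i == 0 or i < first_cached_step:
            plan.append("compute")
            accumulated = 0.0
            continue
        accumulated += change
        if accumulated < rel_l1_thresh:
            # Input barely moved since the last full step: reuse cached output.
            plan.append("reuse")
        else:
            # Too much drift has piled up: run the model and reset the counter.
            plan.append("compute")
            accumulated = 0.0
    return plan


# Made-up change values, only to show the effect of the threshold.
changes = [0.14] * 10
loose = plan_steps(changes, rel_l1_thresh=0.300)
strict = plan_steps(changes, rel_l1_thresh=0.250, start=0.10)

print("0.300:", loose.count("compute"), "full steps out of", len(loose))
print("0.250:", strict.count("compute"), "full steps out of", len(strict))
```

With the lower threshold more steps run the full model, which lines up with the longer times and cleaner output reported for 0.250.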
Introducing Stable Virtual Camera, currently in research preview. This multi-view diffusion model transforms 2D images into immersive 3D videos with realistic depth and perspective—without complex reconstruction or scene-specific optimization.
@Furkan Gözükara SECourses have you seen an xformers fix for Blackwell yet? I searched but haven't seen progress. I saw someone compiled wheels for Python 3.12 but not 3.10, so I was thinking of upgrading the venv for kohya to 3.12. Also, for Wan2.1 LoRA training you must use captions, or the LoRA can't do anything: no flexibility beyond similarity to the source images. I ran at least 4 tests using musubi. Also still struggling to get good likeness compared to Hunyuan.
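Since captions matter that much, a small pre-flight check can catch images that have none before a training run starts. This is only a sketch: it assumes the common convention of a caption file sitting next to each image with the same name and a `.txt` extension, which may not match your dataset config, and `find_missing_captions` is a made-up helper, not part of musubi.

```python
import tempfile
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def find_missing_captions(dataset_dir, caption_ext=".txt"):
    """Return names of images that have no non-empty caption file beside them."""
    missing = []
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue
        caption = image.with_suffix(caption_ext)
        if not caption.is_file() or not caption.read_text(encoding="utf-8").strip():
            missing.append(image.name)
    return missing


# Tiny throwaway dataset to show the check; point it at a real folder instead.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.png").write_bytes(b"")
    (root / "a.txt").write_text("a person standing in a park", encoding="utf-8")
    (root / "b.jpg").write_bytes(b"")           # no caption file at all
    (root / "c.png").write_bytes(b"")
    (root / "c.txt").write_text("", encoding="utf-8")  # empty caption
    demo_missing = find_missing_captions(root)

print("images without a usable caption:", demo_missing)
```

Empty caption files are flagged too, since an empty caption gives the trainer nothing to work with.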
I have compiled xFormers: xformers-0.0.30+c5841688.d20250306, torch==2.7.0.dev20250228+cu128, triton-3.2.0+git8f9b005b. The compile worked and I am able to install it. Python 3.10.11, Windows 11. However I...
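When comparing setups like this, it helps to dump the exact versions from inside the venv so everyone is looking at the same combination. A minimal sketch using only the standard library; the distribution names `torch`, `triton` and `xformers` are assumptions and may differ on your install (a Windows Triton build, for example, may be published under another name).

```python
import platform
from importlib import metadata


def installed_versions(packages=("torch", "triton", "xformers")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions()
print("python", platform.python_version())
for name, version in report.items():
    print(name, version or "not installed")
```

Run it with the venv's own Python, otherwise it reports whatever the system interpreter has installed.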
Hey @Furkan Gözükara SECourses, I am running the Wan2.1 14B model on an H100. Text to video works great, but I'm having issues with image to video. I'm following every step of your tutorials, but when I try to generate I get an error (no lens). What am I doing wrong?