Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a YouTube channel dedicated to the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
Sorry @Dr. Furkan Gözükara, the 5090 patch works sometimes and fails other times (not stable). However, I would like to ask: can you point me to where I can enable YoloFace in SwarmUI? I remember seeing it in one of your videos, but I forgot where to enable it after downloading the model.
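For reference, SwarmUI loads YOLOv8 detection models from a dedicated models subfolder; the folder name below is from memory and worth double-checking against the SwarmUI documentation before relying on it:

```
# Assumed layout (verify against SwarmUI docs): drop the face-detection
# model into the yolov8 models subfolder, then refresh/restart SwarmUI
(SwarmUI)/Models/yolov8/face_yolov8m.pt
```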
Do not skip any part of this tutorial to master how to use Stable Diffusion 3 (SD3) with the most advanced open source generative AI app, SwarmUI. Automatic1111 SD Web UI and Fooocus do not support #SD3 yet, so I am starting to make tutorials for SwarmUI as well. #StableSwarmUI is officially developed by StabilityAI and your m...
1:08:43 How to use edit image / inpainting
1:10:38 How to use the amazing segmentation feature to automatically inpaint any part of an image
1:15:55 How to use segmentation on existing images for inpainting and get perfect results with different seeds
@Dr. Furkan Gözükara I can't seem to understand the logic for segmenting face >= 2 with YOLO. What's the syntax again for selecting the second face, counting left to right?
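For context, SwarmUI's prompt-side segmentation uses a `<segment:...>` tag; with a YOLO model it takes the model's filename plus an optional match index. A minimal sketch, where the model filename is a placeholder and whether index 2 means the second face left-to-right or the second-highest-confidence match may depend on version and settings, so verify in the SwarmUI docs:

```
<segment:yolo-face_yolov8m.pt-2> detailed face, sharp eyes
```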
Hey all, I've been playing around trying to train LoRAs on Wan. I cannot get diffusion-pipe to work (it never finds the config file), and I've tried DiffSynth, which works but only accepts videos.
I wondered, is there a third alternative? And since it seems I'm currently stuck with DiffSynth, do we agree the dataset must be videos of at most 720x1280, 81 frames at 16 fps, right?
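On the "never finds the config file" issue: diffusion-pipe is normally launched through deepspeed with the config passed via --config, and a relative path is resolved from the directory you launch from, so an absolute path is the safer bet. A minimal sketch, assuming the repo's standard train.py entry point; both paths are placeholders:

```bash
# Launch from inside the diffusion-pipe repo; pass the config as an
# absolute path so it is found regardless of the working directory
cd ~/diffusion-pipe
deepspeed --num_gpus=1 train.py --deepspeed --config /home/user/configs/wan_lora.toml
```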
Yeah, it's not super clear, but I cannot get images to work in training; the data processing step returns this: "AttributeError: 'LegacyReader' object has no attribute 'count_frames'". I'm not a tech wizard, but to me it means no images and only videos, ahah.
So I've barely scratched the surface, but on diffusion-pipe I've tried simple videos at 16 fps. Following the num_frames from Kijai's empty embeds node, I've used 16n+1 frame counts (17, 33, 49, 65, 81, 129). I added all of these to the frame_buckets part of dataset.toml to be safe, so it looks like this: frame_buckets = [1, 17, 33, 49, 65, 81, 129] (1 is for images, I guess).
Basically everything got processed except the 129-frame clip; I suppose it's too long.
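For anyone reproducing this, a minimal dataset.toml along these lines should match the setup described above; the field names follow diffusion-pipe's example configs as I remember them, and the dataset path is a placeholder:

```toml
# Minimal diffusion-pipe dataset config (field names per the repo's
# example configs; verify against your diffusion-pipe version)
resolutions = [704]                            # training resolution used below
frame_buckets = [1, 17, 33, 49, 65, 81, 129]   # 16n+1 frame counts; 1 = images

[[directory]]
path = '/path/to/dataset'   # placeholder: folder of videos + caption files
num_repeats = 1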
Running on Linux with a 4090, I can train the LoRA at rank 128, resolution 704, and it uses around 19 GB of the GPU, all training on the Wan 1.3B model. I've tried training at 1024 but it OOMs instantly. (Knowing my system was only taking 400 MB, I don't think training at 1024 is possible at all on a 4090.)
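To make the rank and resolution settings above concrete, here's a rough sketch of the corresponding main config.toml; the structure and key names are from memory of diffusion-pipe's example configs, every path is a placeholder, and the learning rate and epoch count are illustrative only:

```toml
# Sketch of a diffusion-pipe training config for the Wan 1.3B run above
# (key names from the repo's examples; double-check before using)
output_dir = '/path/to/output'
dataset = '/path/to/dataset.toml'
epochs = 100
micro_batch_size_per_gpu = 1

[model]
type = 'wan'
ckpt_path = '/path/to/Wan2.1-T2V-1.3B'   # placeholder checkpoint path
dtype = 'bfloat16'

[adapter]
type = 'lora'
rank = 128          # the rank that fit in ~19 GB at resolution 704
dtype = 'bfloat16'

[optimizer]
type = 'adamw_optimi'
lr = 2e-5
```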