Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a YouTube channel dedicated to the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
and just very... weird non-compliance with prompts; it's hard to explain. It's almost like selective overtraining: it passes my "blue hair" sampling test, but then it won't put the character into certain poses no matter what I try, for example
That's super interesting. Thank you for sharing your findings as well. I think I won't spend time experimenting with Prodigy for now. Did you notice any advantage in AdamW8bit or Lion over Adafactor?
Between the three, I find Lion to be the best for me for illustration-type work in general, as (and this may be in my head) I feel I can increase the batch size without sacrificing as much detail
Oh, one thing for illustration vs. realism that I found online (I'm not sure it helped, but it did pop up consistently) is to set min SNR gamma to 1.5 instead of 5
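For context, min SNR gamma is the Min-SNR loss weighting from Hang et al. (2023). Here's a minimal sketch of what the setting controls, assuming an epsilon-prediction diffusion objective; the function name and variables are illustrative, not any particular trainer's internals:

```python
import torch

def min_snr_weights(alphas_cumprod: torch.Tensor,
                    timesteps: torch.Tensor,
                    gamma: float = 5.0) -> torch.Tensor:
    """Per-sample loss weights: min(SNR(t), gamma) / SNR(t)."""
    abar = alphas_cumprod[timesteps]          # cumulative alphas at the sampled steps
    snr = abar / (1.0 - abar)                 # SNR(t) = alpha_bar_t / (1 - alpha_bar_t)
    return torch.clamp(snr, max=gamma) / snr  # the clamp down-weights easy, high-SNR steps

# gamma=1.5 clamps much more aggressively than the common default of 5,
# which is the illustration-vs-realism tweak mentioned above.
```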
Otherwise, the general philosophy I tried to keep to was:
- Batch size 2-4, with repeats x epochs chosen to land at ~5000 total steps
- Many snapshot saves; just remember to clear them after training or you'll fill up your HD real quick
- I try to keep the learning rate kind of low. I think of it like a BBQ and keep learning low and slow, so usually LR ~8 x 10^-6 (8e-06)
- Text encoder = 1/2 the UNet rate, or just leave it at 3e-06
- LoRA tab rank and alpha I played with a lot, and besides making the file bigger, IDK. I think with illustration you get diminishing returns after maybe rank 16, so I use rank 16 and either alpha 8 or alpha 1 (random; I didn't really test the difference yet or check it)
- I have no idea about DoRA or if it helps at all. Dropout probability is supposed to help if you're overtraining, but I don't mess with it; my best Prodigy attempt had it at 0.01 though, so YMMV
- Warmup I do at 1/10 of overall epochs, if at all
- Base optimizer and settings are from the .json; if I changed anything, I went to the wiki and started with the params they recommended

A rough config sketch of all this is below.
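Pulling those numbers together, here's a minimal sketch of the recipe expressed as kohya sd-scripts train_network.py options, assuming that's the backend behind the GUI here. The flag names are real kohya options, but the values are just the recipe above, not a verified preset:

```python
# Hypothetical recipe dict; each key maps to a train_network.py CLI flag.
args = {
    "train_batch_size": 2,        # 2-4 per the notes above
    "max_train_epochs": 10,       # steps = images * repeats * epochs / batch; aim for ~5000
    "save_every_n_epochs": 1,     # many snapshots; prune them after training
    "learning_rate": 8e-6,        # "low and slow" UNet LR
    "text_encoder_lr": 3e-6,      # roughly half the UNet rate
    "network_dim": 16,            # LoRA rank; diminishing returns past 16 for illustration
    "network_alpha": 8,           # or 1; the difference is untested here
    "optimizer_type": "Lion",
    "min_snr_gamma": 1.5,         # the illustration tweak from the earlier message
    "lr_warmup_steps": 500,       # ~1/10 of the ~5000 total steps
}
```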
Can anyone recommend a TTS that supports the Polish language and allows me to clone my voice? I know there are options like F5-TTS, Zonos, Fish Speech, etc. I need one that can be installed locally on Windows, supports voice cloning, and, crucially, supports Polish.
Hello everyone. Do you know how to fix this?

File "C:\Python310\lib\site-packages\torch\nn\modules\activation.py", line 1241, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
File "C:\Python310\lib\site-packages\torch\nn\functional.py", line 5354, in multi_head_attention_forward
    raise RuntimeError(f"The shape of the 2D attn_mask is {attn_mask.shape}, but should be {correct_2d_size}.")
RuntimeError: The shape of the 2D attn_mask is torch.Size([77, 77]), but should be (1, 1).
@Furkan Gözükara SECourses is it possible to fine-tune a LoRA or full checkpoint of the FLUX Fill model that is used for inpainting, so it can inpaint a particular thing, like a logo?
Is it possible in Wan 2.1 to repeatedly generate the same video and change only small details using prompts? For example, say I have a scene and the only thing I don't like is some object in the background, which I want to prompt differently. Or the resolution. Everything else is perfect and should not change. Or is the output always random?