Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a YouTube channel dedicated to the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
and just very... weird non-compliance with prompts; it's hard to explain. It's almost like selective overtraining: it passes my "blue hair" sampling test, but then it won't put the character into certain poses no matter what I try, for example
That's super interesting. Thank you for sharing your findings as well. I think I won't spend time experimenting with Prodigy for now. Did you notice any advantage in AdamW8bit or Lion over Adafactor?
Between the three, I find Lion to be the best for me for illustration-type work in general, as (and this may be in my head) I feel I can increase the batch size without sacrificing as much detail
Oh, one thing for illustration vs. realism that I found online (I'm not sure it helped, but it did pop up consistently) is to set min SNR gamma to 1.5 instead of 5
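For context, min SNR gamma is the Min-SNR loss weighting from Hang et al. (2023). Here's a minimal sketch of what the setting controls, assuming an epsilon-prediction diffusion objective; the function name and variables are illustrative, not any particular trainer's internals:

```python
import torch

def min_snr_weights(alphas_cumprod: torch.Tensor,
                    timesteps: torch.Tensor,
                    gamma: float = 5.0) -> torch.Tensor:
    """Per-sample loss weights: min(SNR(t), gamma) / SNR(t)."""
    abar = alphas_cumprod[timesteps]          # cumulative alphas at the sampled steps
    snr = abar / (1.0 - abar)                 # SNR(t) = alpha_bar_t / (1 - alpha_bar_t)
    return torch.clamp(snr, max=gamma) / snr  # the clamp down-weights easy, high-SNR steps

# gamma=1.5 clamps much more aggressively than the common default of 5,
# which is the illustration-vs-realism tweak mentioned above.
```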
Otherwise, the general philosophy I tried to keep to was:
- Batch size 2-4, with repeats x epochs chosen to land at ~5000 total steps
- Many snapshot saves; just remember to clear them after training or you'll fill up your HD real quick
- I try to keep the learning rate kind of low. I think of it like a BBQ and keep learning low and slow, so usually LR ~8 x 10^-6 (8e-06)
- Text encoder = 1/2 the UNet rate, or just leave it at 3e-06
- LoRA tab rank and alpha I played with a lot, and besides making the file bigger, IDK. I think with illustration you get diminishing returns after maybe rank 16, so I use rank 16 and either alpha 8 or alpha 1 (random; I didn't really test the difference yet or check it)
- I have no idea about DoRA or if it helps at all. Dropout probability is supposed to help if you're overtraining, but I don't mess with it; my best Prodigy attempt had it at 0.01 though, so YMMV
- Warmup I do at 1/10 of overall epochs, if at all
- Base optimizer and settings are from the .json; if I changed anything, I went to the wiki and started with the params they recommended

A rough config sketch of all this is below.
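Pulling those numbers together, here's a minimal sketch of the recipe expressed as kohya sd-scripts train_network.py options, assuming that's the backend behind the GUI here. The flag names are real kohya options, but the values are just the recipe above, not a verified preset:

```python
# Hypothetical recipe dict; each key maps to a train_network.py CLI flag.
args = {
    "train_batch_size": 2,        # 2-4 per the notes above
    "max_train_epochs": 10,       # steps = images * repeats * epochs / batch; aim for ~5000
    "save_every_n_epochs": 1,     # many snapshots; prune them after training
    "learning_rate": 8e-6,        # "low and slow" UNet LR
    "text_encoder_lr": 3e-6,      # roughly half the UNet rate
    "network_dim": 16,            # LoRA rank; diminishing returns past 16 for illustration
    "network_alpha": 8,           # or 1; the difference is untested here
    "optimizer_type": "Lion",
    "min_snr_gamma": 1.5,         # the illustration tweak from the earlier message
    "lr_warmup_steps": 500,       # ~1/10 of the ~5000 total steps
}
```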
Can anyone recommend a TTS that supports the Polish language and allows me to clone my voice? I know there are options like F5-TTS, Zonos, Fish Speech, etc. I need one that can be installed locally on Windows, supports voice cloning, and, crucially, supports Polish.
Hello everyone. Do you know how to fix this?

File "C:\Python310\lib\site-packages\torch\nn\modules\activation.py", line 1241, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
File "C:\Python310\lib\site-packages\torch\nn\functional.py", line 5354, in multi_head_attention_forward
    raise RuntimeError(f"The shape of the 2D attn_mask is {attn_mask.shape}, but should be {correct_2d_size}.")
RuntimeError: The shape of the 2D attn_mask is torch.Size([77, 77]), but should be (1, 1).
@Furkan Gözükara SECourses is it possible to fine-tune a LoRA or full checkpoint of the FLUX Fill model that is used for inpainting, so it can inpaint a particular thing, like a logo?
Is it possible in Wan 2.1 to repeatedly generate the same video and change only small details using prompts? For example, say I have a scene and the only thing I don't like is some object in the background, which I want to prompt differently. Or the resolution. Everything else is perfect and should not change. Or is the output always random?