Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a dedicated YouTube channel for the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
Repeats are a Kohya thing; in the general ML sense, 1 epoch usually means going through your dataset once. And look here, 2.5 minutes. Even if you used multiple GPUs, that can't amount to much more than one pass through your dataset, given the s/it I have seen so far on Flux.
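To make the repeats/epochs relationship concrete, here is a minimal sketch in Python, assuming Kohya-style folder repeats and batch size 1; all numbers are illustrative, not taken from the discussion above:

```python
# Sketch: how Kohya-style "repeats" and epochs combine into optimizer steps.
# Assumes batch size divides the per-epoch sample count evenly.

def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int) -> int:
    """One Kohya epoch = one pass over (num_images * repeats) samples."""
    images_per_epoch = num_images * repeats
    steps_per_epoch = images_per_epoch // batch_size
    return steps_per_epoch * epochs

# Hypothetical example: 20 images, 1 repeat, 200 epochs, batch size 1 -> 4000 steps.
print(total_steps(num_images=20, repeats=1, epochs=200, batch_size=1))
```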
I have a question about training and captions, please: for a pose LoRA that includes 2 or more characters, how many images do you think should be in the folder, and how many repeats? Also, about my captioning: can I send one here with the related image? It is SFW but may contain violence, so I prefer to ask ^^
Sooo... I've tried some LoRA trainings recently. What can I say. Sometimes the results are pretty good at 70-80 epochs (once at 50 epochs), and sometimes 200 epochs are not good enough. I guess it all depends on the dataset.
And training on a realistic dataset will not give you any cartoonish (comics) style output, because it gets too overtrained on realistic images. I did a sample check every 40 epochs (5 checks during training), and after the 80th epoch the cartoon style becomes realistic.
Hi @Furkan Gözükara SECourses, can you create another training preset that uses 24 GB VRAM to train an fp16 LoRA? I trained using Rank_3_18950MB. That LoRA is fp8; compared with fp16 training in your Rank 1 or Rank 2 presets, the prompt understanding of the fp8 LoRA is reduced, and the fp8 results so far don't satisfy my needs.
I'm just saying that even during inference a low-precision base model doesn't make a difference, so how would it during training? The LoRA weights are still bf16+.
No, training a LoRA against an fp8 base model does affect the LoRA. For example, I trained 2 LoRAs on the same dataset, one with the fp16 base model and one with fp8. The fp16 LoRA has better prompt understanding than the fp8 one. Mine is a person LoRA; the person quality is still good with both LoRAs, but prompt understanding of everything outside the person is reduced with the fp8 LoRA. For example, my prompt has something like "little angels playing in the background": the fp16 LoRA shows angels with human bodies and wings, but the fp8 one shows some birds.
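As a rough illustration of why an fp8 base model can behave differently from fp16/bf16, here is a small sketch showing the precision lost when weights are round-tripped through an 8-bit float format. It assumes PyTorch 2.1+ (torch.float8_e4m3fn); the exact fp8 format a given trainer uses may differ, and this does not reproduce any specific preset above:

```python
import torch

# A stand-in for a slice of base-model weights.
w_bf16 = torch.randn(1000, dtype=torch.bfloat16)

# Round-trip through fp8 (e4m3), roughly what storing the base model in fp8 implies.
w_fp8 = w_bf16.to(torch.float8_e4m3fn)
w_back = w_fp8.to(torch.bfloat16)

# Measure how much precision the round-trip discarded.
err = (w_bf16.float() - w_back.float()).abs()
print(f"mean abs error: {err.mean().item():.5f}, max abs error: {err.max().item():.5f}")
```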
I am currently training a LoRA with Flux using 20 photos and 200 epochs. Is it normal that it takes 4 to 5 hours on an RTX 4090 locally? Or did I miss an optimization setting or script?
I used a SaaS solution where I trained a Flux LoRA on the same 20 photos, and it took only 30 minutes. Any idea what can cause the difference in speed?
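A back-of-the-envelope check of the 4-5 hour figure above, assuming 1 repeat, batch size 1, and an assumed ~4 s/it for a Flux LoRA on a 4090 (the s/it is a guess, not a measurement):

```python
# Estimate training time from step count and seconds per iteration.
num_images, epochs = 20, 200
steps = num_images * epochs          # 4000 optimizer steps
sec_per_it = 4.0                     # assumed s/it; varies with settings and GPU
hours = steps * sec_per_it / 3600
print(f"{steps} steps -> ~{hours:.1f} hours")   # ~4.4 hours
```

By the same arithmetic, a 30-minute run would be consistent with far fewer total steps (fewer epochs/repeats), a larger effective batch, or a much lower s/it on the hosted hardware.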