Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a YouTube channel dedicated to the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
Did I mention that I love Prodigy for SD 1.5 LoRA training? I just tried it with a very poor quality image set. I extracted frames from a phone video, and CompreFace could only select 13 of them, almost all of which were a bit blurry. I started training with 8 repeats at 64/64 (network dim/alpha) using the usual Prodigy script with d_coef 0.6. The end result could be called good, because the subject was similar, but it produced images as blurred as the sample images. It also captured the "style" of the sample photos well, which would normally be welcome, but in this case it's a drawback.

I tried alpha 1, but by about the third epoch it was overtrained. That would be fine if I wanted a LoRA in a few minutes, but that wasn't the goal. So I set the alpha back to 32 and took the dim down to 32 as well. To slow down the overtraining, I also halved the d_coef to 0.3. And the miracle happened: the result yielded images scoring above 0.9 (it generated a 0.97 on the first try!), and it wasn't blurry! Fantastic! For anyone interested in this script version, I've attached the file and a sample prompt for you to experiment with. It's not 100% "style"-free, but the quality is much better, and at 32/32! So even after nearly 100 LoRAs, Prodigy continues to surprise and challenge.
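For reference, a minimal sketch of those final optimizer settings (d_coef 0.3, 32/32), assuming the prodigyopt package; the parameter names follow prodigyopt's Prodigy class, and lora_parameters is a placeholder for whatever weights your trainer actually hands to the optimizer.

    # Minimal sketch: Prodigy with the settings described above (assumes `pip install prodigyopt`)
    from prodigyopt import Prodigy

    optimizer = Prodigy(
        lora_parameters,          # placeholder: the 32/32 LoRA weights being trained
        lr=1.0,                   # Prodigy adapts the step size itself, so lr stays at 1.0
        d_coef=0.3,               # halved from 0.6 to slow down the overtraining
        weight_decay=0.01,
        decouple=True,
        use_bias_correction=True,
        safeguard_warmup=True,
    )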
I have a question. For some people, SDXL DreamBooth results look OK in general, but the fine detail quality is very low: skin texture, eyes, eyelashes. What could be the reason?
I never touch the epoch count; it's always 10, and 99% of the time the last one is the best. That's why I like Prodigy with cosine scheduling: I don't have to search for the best version, because it's the last one. I always calculate my repeats like this: 100 divided by the number of images, rounded up.
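A tiny sketch of that repeat rule; the function name is just illustrative.

    # Repeats = ceil(100 / number of images), as described above
    import math

    def repeats_for(num_images: int) -> int:
        return math.ceil(100 / num_images)

    print(repeats_for(13))  # 8 -> matches the 8 repeats used for the 13-image set above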
These numbers are personal preference. For style (as you can see from the above) I need to train a stronger model, but since there is captioning, it's entirely up to me how strong a style I want.
But if I want to keep the style flexible, I don't go above 2 on the TensorBoard normalization. Then I can mix the style with the uniqueness of other models. But if I'm training a very strong style, then it almost doesn't matter which model I use.
The great thing about Prodigy is that you can control the learning speed with two knobs: d_coef is the fine adjustment, and alpha is the coarser one. The lower the alpha, the faster it learns. So as few as 5 epochs could be enough if the alpha is half the dim. This is just a theory, I haven't tested it, but it would make LoRA training even faster, down to a few minutes (currently about 10-15 minutes on an RTX 3090).
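For context, here is a minimal sketch of how alpha and the network dim (rank) interact in a kohya-style LoRA forward pass. It only illustrates the alpha/rank scaling, i.e. what "alpha is half the dim" means numerically; the speed-up claim above is the poster's untested theory.

    # Sketch of the common alpha/rank scaling used by kohya-style LoRA modules
    import torch

    def lora_delta(x, down, up, alpha, rank):
        # down: (rank, in_features), up: (out_features, rank)
        # The low-rank update is multiplied by alpha / rank, so 32/32 gives a
        # scale of 1.0, while alpha = half the dim (16/32) gives 0.5.
        return (x @ down.T @ up.T) * (alpha / rank)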
I like the 10 epochs and the 100 divisor because it's more transparent, and based on previous tests it's really the right amount, no more, no less. And with that fixed, you can already tune d_coef well.
I haven't switched to RealisticVision 6.x for training yet; I think 5.1 is good enough to get LoRAs that work well on other models, and I get nice results on RV 6.x.
So with these results I think it's unnecessary to train on SDXL for now; maybe I'll consider SDXL-Turbo if I can train it to get even better results at a small size.
Someone recently argued for SD 2.1, saying it can also be trained nicely, but many people quickly gave up on it. It's 768, so it might be worth a try with Prodigy.
I have a LoRA for lineart style, and an image in both forms (simple lineart + colored lineart). How can we change the style of just the lineart? Using ControlNet or IP-Adapter? Or change the style of the plain colored image with some anime-style LoRA?
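One way to try the ControlNet route mentioned in the question, as a minimal diffusers sketch: keep the lineart as the ControlNet condition and apply a separate style LoRA on top. The model IDs and the LoRA path are placeholders/assumptions, not part of the original question.

    # Sketch: restyle a lineart image via a lineart ControlNet + a style LoRA (diffusers)
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/control_v11p_sd15_lineart", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")
    # Placeholder directory/filename for the anime-style LoRA
    pipe.load_lora_weights("path/to/lora_dir", weight_name="anime_style_lora.safetensors")

    lineart = load_image("lineart.png")  # the simple lineart image used as the condition
    image = pipe("anime style, clean colors", image=lineart, num_inference_steps=25).images[0]
    image.save("restyled.png")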
Description: Some trainers (like hcp-diffusion, naifu) have implemented a method of training called "pivotal tuning", which basically trains an Embedding and a LoRA (DreamBooth) at the same time...
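A minimal conceptual sketch of that idea (the new token embedding and the LoRA weights updated in the same loop). All names here are illustrative placeholders; this is not the hcp-diffusion or naifu implementation.

    # Sketch: one optimizer with two parameter groups, embedding + LoRA trained together
    import torch

    embedding = torch.nn.Parameter(torch.randn(1, 768) * 0.01)  # new token vector (SD 1.5-sized)
    lora_params = list(lora_modules.parameters())                # placeholder: injected LoRA weights

    optimizer = torch.optim.AdamW(
        [
            {"params": [embedding], "lr": 5e-3},  # the embedding usually gets a higher LR
            {"params": lora_params, "lr": 1e-4},
        ]
    )

    for batch in dataloader:                                     # placeholder dataloader
        loss = diffusion_loss(batch, embedding, lora_modules)    # placeholder loss function
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()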