Software Engineering Courses (SECourses)•11mo ago

It's safe to say that gor low epoch count Muti gpu has a little impact ?

wardensc2•2/11/25, 6:29 AM

hi @Furkan Gözükara SECourses do you have plan to train Flux Schnell model ? I heard they have some great model. For example this model: https://civitai.com/models/943001/shuttle-3-diffusion

Dario•2/11/25, 3:51 PM

Hi Furkan, I've been experimenting with Flux dreambooth and LoRA training for several months now, using a high quality dataset of 3D renders en photographs of a specific car (which is not trained in the base model).

After extensive testing, I am actually getting better results with Flux LoRAs then dreambooth finetune.
The finetuned model does not follow the car's body well, the LoRA is replicating the car's shape a lot better.

For the LoRA, I found that AdamW with network dimension 128 and alpha 64 worked best for me.
For the dreambooth, I've used your configs and left them untouched.

I am surprised that this is the case as you would expect the oposite. Even the extracted LoRAs from the trained checkpoint performed worse than the LoRA.

I know this is very little information to go on, I can send you more grid tests if you want, but I am wondering if you have an idea where to look.
The only thing I can think of is that the LoRA uses AdamW and dreambooth uses Adafactor.
I could not use AdamW for dreambooth, not even on a 48GB GPU.

My goal and quality metrics are product (car) accuracy. So perhaps my configs should differ from your configs that are more leaned towards people?

PS. I've left all the common settings the same like image repeats, dataset/captions, noise offset, snr. And I have tested multiple epochs to find the best sample size

Renzo•2/11/25, 7:28 PM

I would be happy to test your LoRA configuration on a character/person if you're willing to share it. I have fine-tuned models and created LoRAs for a couple of months and my findings match Furkan's results, so it would be nice to see if there is a better configuration for LoRAs.

Dario•2/11/25, 7:57 PM

I'm happy to share my json or toml file in a message (maybe dropping it here is against the community guidelines?).
Keep in mind that I am absolute not a ML expert, I am a 3D product visualization artist dabbling with AI. However I have a beefy pc (4090 + 3090 + 3080) so I've been able to do a lot of testing, running 2 trainings at the same time.
Most of my "knowledge" comes from using custom Claude & Perplexity project with documentation about LoRA, koyha etc.
In any case, I am also very surprised by these results, as I am sure Furkan has found the best configs

PPixel.dust It's safe to say that gor low epoch count Muti gpu has a little impact ?

Furkan Gözükara SECourses•2/11/25, 8:09 PM

eloborate more

Wwardensc2 hi <@205854764540362752> do you have plan to train Flux Schnell model ? I heard ...

Furkan Gözükara SECourses•2/11/25, 8:09 PM

i tried and it didnt work well

Furkan Gözükara SECourses•2/11/25, 8:09 PM

i dont see any reason to use it for individuals

DDario Hi Furkan, I've been experimenting with Flux dreambooth and LoRA training for se...

Furkan Gözükara SECourses•2/11/25, 8:09 PM

interesting

Furkan Gözükara SECourses•2/11/25, 8:10 PM

i never had such case

Furkan Gözükara SECourses•2/11/25, 8:10 PM

fine tune always yields better and it is also expected

Dario•2/11/25, 8:10 PM

Could it have something to do with the Adafactor versus AdamW optimizer?

PPixel.dust the batc/epoch is smaller but the overal amount is slower

Pixel.dustOP•2/11/25, 8:12 PM

@Furkan Gözükara SECourses like the fist one is as single gpu and will take 150 hour the second one in dual gpu is expecte to take 250 hpurs

DDario Could it have something to do with the Adafactor versus AdamW optimizer?

Furkan Gözükara SECourses•2/11/25, 8:24 PM

adamw is better than adafactor at quality slightly. but it uses more vram

PPixel.dust <@205854764540362752> like the fist one is as single gpu and will take 150 hour ...

Furkan Gözükara SECourses•2/11/25, 8:24 PM

when you do 2x gpu

Furkan Gözükara SECourses•2/11/25, 8:24 PM

you need to reduce number of epochs 1/2

Furkan Gözükara SECourses•2/11/25, 8:24 PM

if 3 gpus

Furkan Gözükara SECourses•2/11/25, 8:24 PM

1/3

FFurkan Gözükara SECourses when you do 2x gpu

Pixel.dustOP•2/11/25, 8:31 PM

ok thanks, I think I understand it better now

PPixel.dust ok thanks, I think I understand it better now

Furkan Gözükara SECourses•2/11/25, 8:46 PM

yep because multi gpu increases batch size but keep same step count

Furkan Gözükara SECourses•2/11/25, 8:46 PM

so 1 step you actually do 2

FFurkan Gözükara SECourses adamw is better than adafactor at quality slightly. but it uses more vram

Dario•2/11/25, 9:13 PM

Maybe that was the difference then as I trained the LoRAs with AdamW but the dreambooth finetune with Adafactor (I could not fit AdamW in my gpu or a rented 48GB one)
Do you think this could be the difference?
I still believe finetuning should be better so I’m just confused about why LoRA yields better results for me.

DDario Maybe that was the difference then as I trained the LoRAs with AdamW but the dre...

Furkan Gözükara SECourses•2/11/25, 9:26 PM

hard to know possibly a mistake somewhere

Aleksei Naumov•2/11/25, 10:05 PM

Hey there!
Anyone out here was deploying flux lora training as an API?
Would be happy to chat or be grateful if you can share something useful on how to deploy to runpod/other compute providers.
P.S. If you can share some useful presets for training person lora - beer on me haha!

Hirmuolio•2/11/25, 11:31 PM

Does training lora for Noobai v-pred require different settings?
I'm using kohya_ss gui (dev branch that is up to date with the latest script).
Someone somewhere said that it needs

--zero_terminal_snr

--zero_terminal_snr

parameter. Anything else?

HHirmuolio Does training lora for Noobai v-pred require different settings? I'm using kohya...

Furkan Gözükara SECourses•2/12/25, 12:20 AM

what is Noobai v-pred

FFurkan Gözükara SECourses i dont see any reason to use it for individuals

wardensc2•2/12/25, 3:40 AM

recently when people train lora with kohya and they also train clip text encode together, they claim that lora train together with clip text encode give much better result. I wonder your latest lora training script json file have these clip text encode trained with lora or not ?

FFurkan Gözükara SECourses what is Noobai v-pred

wardensc2•2/12/25, 3:42 AM

It a finetuning of Illustrious model, another heavy modified training from SDXL model.

Hirmuolio•2/12/25, 5:09 AM

NoobAI-XL v-pred uses v-pred instead of eps like other SDXL models which makes it behave a bit different.

FFurkan Gözükara SECourses hard to know possibly a mistake somewhere 😄

Dario•2/12/25, 9:36 AM

I'll keep testing and let you know. In general the deambooth finetune does preserve small details betetr (logos, brake caliper, etc), but the LoRA follows the car's shape a lot better. I need to test it further with a contolnet and depth map (taken from the 3D model) to see if this car body issue can be resolved that way. And maybe as a final test I can send a 80GB+ GPU to do a AdamW dreambooth training run.

Hirmuolio•2/12/25, 9:49 AM

After a test I can say that normal SDXL lora training settings +

--zero_terminal_snr

--zero_terminal_snr

on NoobAI v-pred doesn't work. Lora outputs just grey images that sometimes have very blurry subject.

DDario Hi Furkan, I've been experimenting with Flux dreambooth and LoRA training for se...

Timson•2/12/25, 9:50 AM

are you sure fine tune checkpoint is not just undertrained? How many images do you have in the trainign dataset and how many training steps did you do for lora and fine-tuning?

Timson•2/12/25, 9:51 AM

The thing is - lora training is just much faster in terms of steps/resemblance compared to full checkpoint, but checkpoint should yeild better generalization

AAleksei Naumov Hey there! Anyone out here was deploying flux lora training as an API? Would be ...

Timson•2/12/25, 9:53 AM

this chat is full of friendly people sharing their knowledge of training flux, for starters just follow basic tutorials of Dr. Furkan on the subject you are interested, they cover 99% of your questions

TTimson are you sure fine tune checkpoint is not just undertrained? How many images do y...

Dario•2/12/25, 10:12 AM

Good point, thanks for your suggestion. The dataset has 45 images. LoRA produces good results at around 6000 steps. For dreambooth I am trying around 9 - 10K steps. Do you think I should train longer still?

DDario Good point, thanks for your suggestion. The dataset has 45 images. LoRA produces...

Timson•2/12/25, 10:18 AM

yes, continue training if you don't see model sanity degradation, you can get better results compared to lora at 18-20k steps or more, if your use case require perfect resemblance at the cost of lower model sanity

Timson•2/12/25, 10:19 AM

but if your use case do not benefit much from improved generalization this can be just too costly for similar results

TTimson yes, continue training if you don't see model sanity degradation, you can get be...

Dario•2/12/25, 10:25 AM

Great thanks I will give that a try. Costs and speed are not a priority, obtaining the best product accuracy / quality is, so I will give this a go by training a lot longer and compare the epochs in a grid. Cheers

Wwardensc2 recently when people train lora with kohya and they also train clip text encode ...

Furkan Gözükara SECourses•2/12/25, 11:44 AM

we already train clip

Furkan Gözükara SECourses•2/12/25, 11:44 AM

our all lora configs have it

Dario•2/12/25, 2:37 PM

Would it help to also train the text encoder (at 50% unet LR for example) and use T5 attention mask for dreambooth training just as we do with LoRA training? Has anyone seen any benefits?

FFurkan Gözükara SECourses we already train clip

wardensc2•2/12/25, 2:51 PM

Compare with train checkpoint follow your config, if I extract lora from trained checkpoint, is the extracted lora quality better ?

gio3000.•2/12/25, 3:10 PM

Annybody finetuned a style using image data set of a particular city district? I will start It over a previous finetuned person using dreambooth

DDario Would it help to also train the text encoder (at 50% unet LR for example) and us...

Furkan Gözükara SECourses•2/12/25, 3:28 PM

kohya not supporting

FFurkan Gözükara SECourses kohya not supporting

Dario•2/12/25, 3:47 PM

What do you mean? Both parameters can be applied in the koyha UI (dreambooth tab) and the training runs without issues. Do you mean it will cause problems with the final trained models?

DDario What do you mean? Both parameters can be applied in the koyha UI (dreambooth tab...

Furkan Gözükara SECourses•2/12/25, 9:46 PM

kohya doesnt train clip large or t5 when doing fine tuning

Furkan Gözükara SECourses•2/12/25, 9:46 PM

try you will see same speed and vram

FFurkan Gözükara SECourses try you will see same speed and vram

Dario•2/13/25, 9:04 AM

Ah I see, thanks for clarifying

DDario Ah I see, thanks for clarifying

Furkan Gözükara SECourses•2/13/25, 11:00 AM

yes i told to add kohya but he still didnt add