Hi Furkan, I want to confirm something about your captioning theory.
For a small and very consistent dataset (around 50 solid product photos on a white background), is it still best practice to use only the token + class as the caption (e.g. "CCCSNOOO bag") and let the images carry all the visual attributes?
Or, for such a small, homogeneous dataset, could adding more detailed captions (e.g. color, material, shape) actually help stabilize training with T5?
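To make the two options concrete, here is a rough Python sketch of what I mean, assuming a trainer that reads sidecar .txt caption files next to each image (the folder name, the CCCSNOOO token, and the attribute strings are just placeholders for my setup):

```python
from pathlib import Path

# Hypothetical folder with ~50 white-background product shots
DATASET_DIR = Path("dataset/cccsnooo_bag")
TOKEN_CLASS_CAPTION = "CCCSNOOO bag"

# Option A: token + class only, the same caption for every image,
# letting the images themselves carry all visual attributes.
def write_minimal_captions() -> None:
    for image in sorted(DATASET_DIR.glob("*.jpg")):
        image.with_suffix(".txt").write_text(TOKEN_CLASS_CAPTION)

# Option B: token + class plus a short hand-written attribute
# description per image (color, material, shape).
DETAILED_ATTRIBUTES = {
    "img_001.jpg": "brown leather, rounded handles, gold clasp",
    # ... one entry per photo
}

def write_detailed_captions() -> None:
    for image in sorted(DATASET_DIR.glob("*.jpg")):
        attributes = DETAILED_ATTRIBUTES.get(image.name, "")
        caption = f"{TOKEN_CLASS_CAPTION}, {attributes}".rstrip(", ")
        image.with_suffix(".txt").write_text(caption)

if __name__ == "__main__":
    write_minimal_captions()  # or write_detailed_captions()
```

So the question is really whether Option A or Option B is the better default for a dataset this small and uniform.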



