Software Engineering Courses (SECourses)•17mo ago

maybe torch 2.5 diff

Furkan Gözükara SECoursesOP•8/30/24, 3:32 PM

are you using torch 2.5? the VRAM numbers were tested on torch 2.4
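For anyone unsure which torch build ended up in their environment, a minimal sketch of a check follows. It only reads package metadata through the standard library, so it also runs where torch is missing. The helper names `installed_torch_version` and `major_minor` are made up for this example and are not part of torch or kohya.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return version("torch")
    except PackageNotFoundError:
        return None


def major_minor(version_string: str) -> Tuple[int, int]:
    """Reduce a version such as '2.4.0+cu121' to its (major, minor) pair."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


if __name__ == "__main__":
    found = installed_torch_version()
    if found is None:
        print("torch is not installed in this environment")
    else:
        print(f"torch {found} -> major/minor {major_minor(found)}")
```

Run it inside the same virtual environment the trainer uses, otherwise it reports a different install.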

Absentis•8/30/24, 3:47 PM

Thanks to your guides and configs I fine-tuned an SDXL model with pics of my dog and of me using OneTrainer (will post some results later). The thing I'm struggling with is extracting a LoRA from the model. I was able to do this with kohya_ss, but the LoRA comes out at almost 2 GB. I was wondering: am I using the wrong settings in kohya_ss, or should I extract the LoRA and then run the reduce LoRA option in kohya_ss?

Furkan Gözükara SECoursesOP•8/30/24, 3:52 PM

why do you need to reduce its size?

Furkan Gözükara SECoursesOP•8/30/24, 3:52 PM

you can reduce by

Furkan Gözükara SECoursesOP•8/30/24, 3:52 PM

set precision to bf16 or fp16 and compare the quality of each

Furkan Gözükara SECoursesOP•8/30/24, 3:52 PM

set rank to 128 or lower if you need a smaller size

Furkan Gözükara SECoursesOP•8/30/24, 3:52 PM

this may reduce quality

Furkan Gözükara SECoursesOP•8/30/24, 3:52 PM

I use the Kohya GUI extractor

Absentis•8/30/24, 4:25 PM

Yeah, using the Kohya GUI. The main reason I'd like to reduce the size is that I wanted to share it, plus my hard drive was getting full (I really should buy a bigger drive). I played a bit with the resize option in the GUI. Dropped it to a rank of 64 and the final size was about 380 MB without noticeable quality loss. I was mainly curious whether there was a recommended way to get smaller LoRAs.

Absentis•8/30/24, 4:29 PM

But you are right. I should just follow your example and test the quality of the LoRA at different ranks and precisions till I find a good mix.
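When planning which rank and precision combinations to test, a rough size estimate can help. The sketch below assumes that a LoRA file's size scales about linearly with rank and with bytes per stored weight, and it ignores file metadata and any tensors that do not depend on rank. The function name `scaled_lora_size_mb` is hypothetical and is not a kohya_ss option. The only measured figure used is the one reported above (about 380 MB at rank 64).

```python
def scaled_lora_size_mb(
    size_mb: float,
    rank: int,
    new_rank: int,
    bytes_per_weight: int = 2,
    new_bytes_per_weight: int = 2,
) -> float:
    """Estimate a LoRA file size after changing rank and/or storage precision.

    Assumption: size is proportional to rank * bytes per weight
    (fp16 and bf16 use 2 bytes per weight, fp32 uses 4).
    """
    rank_factor = new_rank / rank
    precision_factor = new_bytes_per_weight / bytes_per_weight
    return size_mb * rank_factor * precision_factor


if __name__ == "__main__":
    measured_mb, measured_rank = 380, 64  # figure reported in the thread
    for candidate_rank in (128, 64, 32, 16):
        estimate = scaled_lora_size_mb(measured_mb, measured_rank, candidate_rank)
        print(f"rank {candidate_rank:>3}: ~{estimate:.0f} MB")
```

The estimate says nothing about quality; that still has to be compared by generating images at each setting.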

Furkan Gözükara SECoursesOP•8/30/24, 4:45 PM

awesome

Replying to Furkan Gözükara SECourses: maybe torch 2.5 diff

daroth•8/30/24, 4:52 PM

not sure, I just installed from the files in the latest kohya pack

daroth•8/30/24, 4:52 PM

I did install the torch 2.5 one too

Replying to Absentis: Yeah, using the Kohya gui. Main reason I'd like to reduce the size is I wanted t...

nicearoni•8/30/24, 8:55 PM

Appreciate you sharing your results, may test extracting mine too

sergio_gonzo•8/31/24, 2:47 AM

Dr Furkan, I hope everything is OK. I'm trying to make my first LoRA model in Flux. I have a 4070 Ti Super with 16 GB of VRAM. Is this speed good? If I install torch 2.5, could it increase the speed? I'm using the rank 5 setting.

Furkan Gözükara SECoursesOP•8/31/24, 2:51 AM

speed is good. after this, try torch 2.5 and see if it improves. I think it may improve

sergio_gonzo•8/31/24, 2:54 AM

Thank you, Dr Furkan. I'm using 44 images; maybe I will try with fewer next time.

Phuong Bui•8/31/24, 6:29 AM

Can I ask if a 1080 8 GB card can train a LoRA? If not, is the 3060 capable?

Furkan Gözükara SECoursesOP•8/31/24, 10:28 AM

it can't, since the 1080 doesn't have bf16

Furkan Gözükara SECoursesOP•8/31/24, 10:28 AM

but the 3060 can handle it perfectly well
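The bf16 point comes down to GPU generation. As a minimal sketch: NVIDIA added native bf16 with the Ampere architecture (CUDA compute capability 8.0 and up). The GTX 1080 is a Pascal card at 6.1 and the RTX 3060 is an Ampere card at 8.6. The helper `has_native_bf16` and the small lookup table are illustrative names for this example only; on a machine with PyTorch installed, the capability tuple could instead be read with `torch.cuda.get_device_capability()`.

```python
from typing import Dict, Tuple

# Compute capabilities for the two cards discussed above.
KNOWN_CARDS: Dict[str, Tuple[int, int]] = {
    "GTX 1080": (6, 1),  # Pascal
    "RTX 3060": (8, 6),  # Ampere
}


def has_native_bf16(compute_capability: Tuple[int, int]) -> bool:
    """Return True if the capability is Ampere (8.0) or newer."""
    major, _minor = compute_capability
    return major >= 8


if __name__ == "__main__":
    for card, capability in KNOWN_CARDS.items():
        verdict = "native bf16" if has_native_bf16(capability) else "no native bf16"
        print(f"{card} (sm {capability[0]}.{capability[1]}): {verdict}")
```

This only checks bf16 support; whether a given config fits in a card's VRAM is a separate question.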

Phuong Bui•8/31/24, 11:10 AM

Thank you for letting me know. With a 3060, how long does it take to train on average?

Furkan Gözükara SECoursesOP•8/31/24, 12:18 PM

it would take around 10 hours to get perfect quality

Furkan Gözükara SECoursesOP•8/31/24, 12:18 PM

because we have to use heavy optimization

Furkan Gözükara SECoursesOP•8/31/24, 12:18 PM

however, a new technique may be coming

Furkan Gözükara SECoursesOP•8/31/24, 12:18 PM

training a single layer

Furkan Gözükara SECoursesOP•8/31/24, 12:18 PM

waiting on Kohya for that

daroth•8/31/24, 12:31 PM

Seems like the vast majority of character LoRAs work even without including their supposed trigger words. Interesting

Furkan Gözükara SECoursesOP•8/31/24, 12:39 PM

Flux has internal encoding

Furkan Gözükara SECoursesOP•8/31/24, 12:39 PM

so every image is fully captioned

dxqb•8/31/24, 1:06 PM

Not sure what you two mean. I have used a character LoRA with Flux recently. The trigger word (which was the name of the person) was necessary. Using the LoRA without it already did something, but not everything.

Furkan Gözükara SECoursesOP•8/31/24, 1:29 PM

if you don't use the trigger word and the LoRA is not overfit, you won't get exactly that person

Furkan Gözükara SECoursesOP•8/31/24, 1:30 PM

because you can't know the exact internal encoding / caption

sergio_gonzo•8/31/24, 3:23 PM

Dr Furkan, I hope everything is OK. What do you think is the best number of dataset images for training in terms of cost/benefit?

Furkan Gözükara SECoursesOP•8/31/24, 3:43 PM

I plan to test with 100 images

Furkan Gözükara SECoursesOP•8/31/24, 3:43 PM

currently it works very well with as few as 15

daroth•8/31/24, 4:29 PM

hm, does LoRA merging work for Flux?

Furkan Gözükara SECoursesOP•8/31/24, 4:30 PM

they added some tools, but I haven't tested them yet

daroth•8/31/24, 4:30 PM

I remember results weren't always good previously, especially if one wanted to merge LoRAs of different sizes

macba•8/31/24, 5:53 PM

When using Dreambooth training, what is the formula for total steps when using multiple GPUs? In the video about Flux training it seemed like you would multiply everything by the number of GPUs, so 100 images, 1 repeat, batch size 1, 10 epochs would be 1000 steps on 1 GPU and 4000 steps on a 4-GPU machine. However, when running a Dreambooth fine-tune, the system insists it's going to be 1000 steps unless I actually increase the number of epochs to 10 x the number of GPUs…
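The arithmetic in this question can be written out as a small sketch. It reproduces only the numbers quoted above: the single-GPU count, the "multiply by the number of GPUs" reading of the video, and the epoch count that makes the reported total reach the larger figure. It does not claim to reproduce how the trainer itself counts steps on multiple GPUs, and the function names are made up for this example.

```python
import math


def single_gpu_steps(images: int, repeats: int, batch_size: int, epochs: int) -> int:
    """Steps for one process: batches per epoch times number of epochs."""
    steps_per_epoch = math.ceil(images * repeats / batch_size)
    return steps_per_epoch * epochs


def steps_scaled_by_gpus(
    images: int, repeats: int, batch_size: int, epochs: int, num_gpus: int
) -> int:
    """The reading of the video described above: single-GPU steps times GPU count."""
    return single_gpu_steps(images, repeats, batch_size, epochs) * num_gpus


if __name__ == "__main__":
    base = single_gpu_steps(images=100, repeats=1, batch_size=1, epochs=10)
    scaled = steps_scaled_by_gpus(100, 1, 1, 10, num_gpus=4)
    # Epochs needed for the single-process formula to report the scaled total.
    epochs_needed = scaled // single_gpu_steps(100, 1, 1, epochs=1)
    print(f"1 GPU: {base} steps")
    print(f"x4 reading: {scaled} steps")
    print(f"epochs needed to report {scaled} steps: {epochs_needed}")
```

With these inputs the last line gives 40 epochs, which matches the "10 x the number of GPUs" behaviour described in the question.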