Software Engineering Courses (SECourses)•11mo ago

Hey, I just started training a fine-tune on Massed Compute following the tutorial using 2x A6000 (without NVLink), and it seems like only 1 GPU is utilized.

Is there a way to utilize both for faster training?

AAragon Hey, I just started training a fine-tune on Massed Compute following the tutoria...

Furkan Gözükara SECourses•2/8/25, 5:22 PM

yes

Furkan Gözükara SECourses•2/8/25, 5:23 PM

but only for lora

Furkan Gözükara SECourses•2/8/25, 5:23 PM

on fine tuning 48 gb is not enough

FFurkan Gözükara SECourses but only for lora

AragonOP•2/8/25, 5:24 PM

It's a fine-tune... so there's nothing I can really do to utilize both and I should just leave it as is? (I got the 2x A6000 plan for extra storage)

AAragon It's a fine-tune... so there's nothing I can really do to utilize both and I sho...

Furkan Gözükara SECourses•2/8/25, 5:25 PM

yes, you can't utilize both atm

Furkan Gözükara SECourses•2/8/25, 5:25 PM

only way is 80 gb gpus sadly

Furkan Gözükara SECourses•2/8/25, 5:25 PM

i reported it to kohya but he couldn't solve it

AragonOP•2/8/25, 5:27 PM

I see... all good, it's still quite inexpensive compared to other options, I just hoped I could perhaps cut the runtime in half.

Thanks!

AAragon I see... all good, it's still quite inexpensive compared to other options, I jus...

Furkan Gözükara SECourses•2/8/25, 5:27 PM

you are welcome

Furkan Gözükara SECourses•2/8/25, 5:27 PM

L40 is slightly faster

FFurkan Gözükara SECourses L40 is slightly faster

AragonOP•2/8/25, 5:29 PM

Hmm, it also has more storage, so maybe I should switch to a 1xL40 server then?

I have 53 hours left with only 1 hour passed. Think I'll reduce the time (and it's also slightly cheaper than 2xA6000) by switching?

AAragon Hmm, it also has more storage, so maybe I should switch to a 1xL40 server then? ...

Furkan Gözükara SECourses•2/8/25, 5:36 PM

yes i recommend

AAragon Hmm, it also has more storage, so maybe I should switch to a 1xL40 server then? ...

Furkan Gözükara SECourses•2/8/25, 5:36 PM

yep

AragonOP•2/8/25, 5:38 PM

Will do

AragonOP•2/8/25, 5:41 PM

Also, have you considered perhaps creating a Docker image with all the tools already included, as it seems that Massed Compute (and I assume other services like Runpod as well) support Docker images?

It'll both simplify the process (no need to download and install stuff, and users are less likely to encounter an error if, for example, a Python package is updated) and save time (which costs money when renting).

I don't know however if running through a Docker container affects the performance when using a GPU

AAragon Also, have you considered perhaps creating a Docker image with all the tools alr...

Furkan Gözükara SECourses•2/9/25, 1:09 AM

we already have massed compute image and it is already big

Furkan Gözükara SECourses•2/9/25, 1:09 AM

all scripts would be just huge

Furkan Gözükara SECourses•2/9/25, 1:09 AM

best to install them because installation is super fast on massed compute

Pixel.dust•2/9/25, 6:57 PM

I was able to make it work on a local 2x 3090 setup without NVLink; I just needed to read the documentation and the warnings more carefully

Pixel.dust•2/10/25, 1:55 AM

not sure, but when using multi gpu it seems to be slower than using just one; not sure if it is a bad config or communication overhead

PPixel.dust not sure, but when using multi gpu it seems to be slower than using just one, no...

Furkan Gözükara SECourses•2/10/25, 1:58 PM

remember you will do half epoch

Furkan Gözükara SECourses•2/10/25, 1:58 PM

when doing 2x gpu

Furkan Gözükara SECourses•2/10/25, 1:58 PM

it will be same

Furkan Gözükara SECourses•2/10/25, 1:58 PM

because your batch size becomes 2
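
A rough sketch of the arithmetic behind this, assuming the data loader is split evenly across GPUs so that each optimizer step consumes one batch per GPU. The function name and the dataset numbers are illustrative assumptions, not taken from the tutorial or from kohya's code.

```python
import math

def steps_per_epoch(num_images, repeats, batch_per_gpu, num_gpus):
    """Optimizer steps needed to see every training sample once."""
    samples = num_images * repeats
    # Each step consumes one batch on every GPU at the same time.
    effective_batch = batch_per_gpu * num_gpus
    return math.ceil(samples / effective_batch)

# Hypothetical dataset: 100 images, 1 repeat, per-GPU batch size 1.
single_gpu = steps_per_epoch(100, 1, 1, 1)
dual_gpu = steps_per_epoch(100, 1, 1, 2)
print(single_gpu, dual_gpu)
```

Under these assumptions the second GPU halves the steps per epoch rather than making each step faster, so a run is only shorter in wall-clock time if a 2-GPU step takes less than twice as long as a 1-GPU step.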

Pixel.dust•2/10/25, 2:11 PM

Does the global amount of steps change?

Pixel.dust•2/10/25, 2:19 PM

the batch count per epoch is smaller but the overall run is slower

FFurkan Gözükara SECourses on fine tuning 48 gb is not enough

Dario•2/10/25, 2:21 PM

Hi Furkan, are you referring to a flux finetune in Kohya, and not a dreambooth training?
If so, what is the main difference between dreambooth and fine tuning? Is fine tuning more for style/concept, versus dreambooth being better for specific subjects? Thanks

Bill Meeks•2/10/25, 2:33 PM

I made a new tutorial for my Everly Heights ClipFrame tool. Video editors, producers, and people building datasets from videos will find it useful.

https://youtu.be/qrv9vgd1WH0?feature=shared

PPixel.dust Does the global amount of steps change?

Furkan Gözükara SECourses•2/10/25, 11:24 PM

it doesn't keep previous training logs

Furkan Gözükara SECourses•2/10/25, 11:24 PM

therefore you have to reduce the new epoch count

Furkan Gözükara SECourses•2/10/25, 11:24 PM

so 200-170 if you are using a 170 epoch base
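
One way to read this, sketched as arithmetic: if a run continues from a checkpoint that already covers 170 epochs and the trainer does not remember that progress, the new run should be configured with only the remaining epochs. The function name is hypothetical and this reading of the advice is an assumption.

```python
def epochs_to_configure(target_total_epochs, epochs_already_trained):
    """Epoch count to set for a continued run that starts counting from zero."""
    remaining = target_total_epochs - epochs_already_trained
    if remaining < 0:
        raise ValueError("checkpoint already exceeds the target epoch count")
    return remaining

# Numbers from the message above: 200 total wanted, 170 already trained.
print(epochs_to_configure(200, 170))  # 30
```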

DDario Hi Furkan, are you referring to a flux finetune in Kohya, and not a dreambooth t...

Furkan Gözükara SECourses•2/10/25, 11:24 PM

no difference atm

Furkan Gözükara SECourses•2/10/25, 11:24 PM

the dreambooth difference is using regularization images

FFurkan Gözükara SECourses it doesnt keep previous training logs

Pixel.dust•2/11/25, 4:09 AM

Is it safe to say that for a low epoch count multi-GPU has little impact?

wardensc2•2/11/25, 6:29 AM

hi @Furkan Gözükara SECourses do you have plans to train the Flux Schnell model? I heard they have some great models. For example this model: https://civitai.com/models/943001/shuttle-3-diffusion

Dario•2/11/25, 3:51 PM

Hi Furkan, I've been experimenting with Flux dreambooth and LoRA training for several months now, using a high quality dataset of 3D renders and photographs of a specific car (which is not trained in the base model).

After extensive testing, I am actually getting better results with Flux LoRAs than with the dreambooth finetune.
The finetuned model does not follow the car's body well, the LoRA is replicating the car's shape a lot better.

For the LoRA, I found that AdamW with network dimension 128 and alpha 64 worked best for me.
For the dreambooth, I've used your configs and left them untouched.
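
For context on those two LoRA numbers: in the usual LoRA formulation the low-rank update is multiplied by alpha divided by the network dimension (rank) before it is added to the base weight. A minimal sketch of that ratio follows; the function name is illustrative and it assumes the trainer applies this standard scaling.

```python
def lora_scale(network_alpha, network_dim):
    """Multiplier applied to the low-rank update (up @ down) in standard LoRA."""
    return network_alpha / network_dim

# Values from the message above: dimension 128, alpha 64.
print(lora_scale(64, 128))  # 0.5
```

With dimension 128 and alpha 64 the learned update is applied at half strength, which interacts with the learning rate: the same learning rate with alpha equal to the dimension would produce an update twice as strong.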

I am surprised that this is the case, as you would expect the opposite. Even the extracted LoRAs from the trained checkpoint performed worse than the LoRA.

I know this is very little information to go on, I can send you more grid tests if you want, but I am wondering if you have an idea where to look.
The only thing I can think of is that the LoRA uses AdamW and dreambooth uses Adafactor.
I could not use AdamW for dreambooth, not even on a 48GB GPU.
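
A back-of-the-envelope sketch of why AdamW can run out of memory in a full fine-tune while Adafactor fits. AdamW keeps two extra values per trainable parameter (first and second moment), whereas Adafactor's factored second moment keeps only one value per row and one per column of each weight matrix. The 12 billion parameter count, fp32 optimizer states, and the 4096 x 4096 layer shape are assumptions for illustration only.

```python
def adamw_state_gib(num_params, bytes_per_value=4):
    """AdamW stores two moment tensors, each the same size as the parameters."""
    return num_params * 2 * bytes_per_value / 1024**3

def factored_second_moment_values(rows, cols):
    """Adafactor's factored statistics for one matrix: one row vector, one column vector."""
    return rows + cols

# Assumption: roughly 12e9 trainable parameters, states kept in fp32.
adamw_gib = adamw_state_gib(12e9)
print(f"AdamW optimizer state: about {adamw_gib:.1f} GiB")

# Assumption: a single hypothetical 4096 x 4096 weight matrix.
full = 4096 * 4096
factored = factored_second_moment_values(4096, 4096)
print(f"second-moment values per matrix: {full} full vs {factored} factored")
```

Under these assumptions the AdamW state alone exceeds 48 GB before weights, gradients, and activations are counted, which is consistent with it not fitting on a single 48GB GPU.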

My goal and quality metric is product (car) accuracy. So perhaps my configs should differ from your configs, which are geared more towards people?

PS. I've left all the common settings the same, like image repeats, dataset/captions, noise offset, and snr. And I have tested multiple epochs to find the best sample size.

Renzo•2/11/25, 7:28 PM

I would be happy to test your LoRA configuration on a character/person if you're willing to share it. I have fine-tuned models and created LoRAs for a couple of months and my findings match Furkan's results, so it would be nice to see if there is a better configuration for LoRAs.

Dario•2/11/25, 7:57 PM

I'm happy to share my json or toml file in a message (maybe dropping it here is against the community guidelines?).
Keep in mind that I am absolutely not an ML expert, I am a 3D product visualization artist dabbling with AI. However, I have a beefy PC (4090 + 3090 + 3080), so I've been able to do a lot of testing, running 2 trainings at the same time.
Most of my "knowledge" comes from using custom Claude & Perplexity projects with documentation about LoRA, kohya, etc.
In any case, I am also very surprised by these results, as I am sure Furkan has found the best configs.

PPixel.dust Is it safe to say that for a low epoch count multi-GPU has little impact?

Furkan Gözükara SECourses•2/11/25, 8:09 PM

elaborate more

Wwardensc2 hi @Furkan Gözükara SECourses do you have plans to train the Flux Schnell model? I heard ...

Furkan Gözükara SECourses•2/11/25, 8:09 PM

i tried and it didn't work well

Furkan Gözükara SECourses•2/11/25, 8:09 PM

i dont see any reason to use it for individuals

DDario Hi Furkan, I've been experimenting with Flux dreambooth and LoRA training for se...

Furkan Gözükara SECourses•2/11/25, 8:09 PM

interesting

Furkan Gözükara SECourses•2/11/25, 8:10 PM