Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a dedicated YouTube channel for the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
Also, have you considered perhaps creating a Docker image with all the tools already included, as it seems that Massed Compute (and I assume other services like Runpod as well) support Docker images?
It'll both simplify the process (no need to download and install everything, and users are less likely to hit an error if, for example, a Python package gets updated) and save time (which costs money when renting).
I don't know, however, whether running through a Docker container affects performance when using a GPU.
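For what it's worth, a minimal Dockerfile along these lines could bake the whole environment in. This is a hypothetical sketch, not a tested setup: the base image, package versions, and the choice of kohya-ss sd-scripts as the trainer are all assumptions.

```dockerfile
# Hypothetical sketch: CUDA base image with the training stack preinstalled.
FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04

# System dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        git python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# Pin package versions so a later upstream update can't break the image
RUN pip3 install --no-cache-dir torch==2.1.2 torchvision==0.16.2

# Clone the trainer (kohya-ss sd-scripts shown purely as an example)
RUN git clone https://github.com/kohya-ss/sd-scripts /workspace/sd-scripts && \
    pip3 install --no-cache-dir -r /workspace/sd-scripts/requirements.txt

WORKDIR /workspace/sd-scripts
```

On the GPU question: containers started with `docker run --gpus all` via the NVIDIA Container Toolkit pass the GPU through directly, so the overhead is typically near-native.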
Hi Furkan, are you referring to a Flux fine-tune in Kohya, and not a dreambooth training? If so, what is the main difference between dreambooth and fine-tuning? Is fine-tuning more for styles/concepts, versus dreambooth being better for specific subjects? Thanks
Hi Furkan, I've been experimenting with Flux dreambooth and LoRA training for several months now, using a high-quality dataset of 3D renders and photographs of a specific car (which is not in the base model).
After extensive testing, I am actually getting better results with Flux LoRAs than with a dreambooth fine-tune. The fine-tuned model does not follow the car's body well; the LoRA replicates the car's shape much better.
For the LoRA, I found that AdamW with network dimension 128 and alpha 64 worked best for me. For the dreambooth, I've used your configs and left them untouched.
I am surprised that this is the case, as you would expect the opposite. Even LoRAs extracted from the trained checkpoint performed worse than the directly trained LoRA.
I know this is very little information to go on, I can send you more grid tests if you want, but I am wondering if you have an idea where to look. The only thing I can think of is that the LoRA uses AdamW and dreambooth uses Adafactor. I could not use AdamW for dreambooth, not even on a 48GB GPU.
My goal and quality metric is product (car) accuracy. So perhaps my configs should differ from your configs, which are geared more towards people?
PS: I've kept all the common settings the same, like image repeats, dataset/captions, noise offset, and SNR. And I have tested multiple epochs to find the best checkpoint.
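For anyone curious, the LoRA settings described above (AdamW, network dimension 128, alpha 64) would look roughly like this in a kohya-ss sd-scripts TOML config. This is a sketch of the key lines only, not the poster's actual file; the learning rate is a placeholder.

```toml
# Hypothetical kohya-ss sd-scripts LoRA config fragment, not the poster's real file
network_dim = 128        # LoRA rank mentioned in the comment
network_alpha = 64       # alpha = dim / 2
optimizer_type = "AdamW" # AdamW instead of the Adafactor used for dreambooth
learning_rate = 1e-4     # placeholder value; tune per dataset
```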
I would be happy to test your LoRA configuration on a character/person if you're willing to share it. I have fine-tuned models and created LoRAs for a couple of months and my findings match Furkan's results, so it would be nice to see if there is a better configuration for LoRAs.
I'm happy to share my JSON or TOML file in a message (maybe dropping it here is against the community guidelines?). Keep in mind that I am absolutely not an ML expert; I am a 3D product visualization artist dabbling with AI. However, I have a beefy PC (4090 + 3090 + 3080), so I've been able to do a lot of testing, running two trainings at the same time. Most of my "knowledge" comes from using custom Claude & Perplexity projects with documentation about LoRA, Kohya, etc. In any case, I am also very surprised by these results, as I am sure Furkan has found the best configs.