Hello everyone. I am Dr. Furkan Gözükara. PhD Computer Engineer. SECourses is a dedicated YouTube channel for the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
they couldn't fix the issues it has, so they did stuff to mask them, and that makes it almost unusable for anything other than exactly what they released, with a very narrow range of what it can successfully do
I haven't spoken much about my ongoing project to de-distill schnell to make a permissive licensed version of flux, but I have been updating it periodically as it trains. I just noticed it is the #2 trending text-to-image model on Hugging Face. Working on aesthetic tuning now.
it's sort of like this - if you mix up flour, water, eggs, chocolate chips, sugar, baking soda, salt - you have a batter that can be turned into a lot of things. you could make a chocolate chip cake or cookies at this point. if you then bake cookies, you have cookies. and no one can come along and turn those cookies back into batter.
you can then shape the mush into some other shape, and say you turned it back into batter and now you've baked a cake or something - but you didn't actually do that. you just made mush and reshaped cookies into cake-shaped cookies that now don't taste all that good
eh, he'll keep working on it till he gets something that he'll then prance around and boast about. you'd think HE would know better - but he's backed himself into a corner
i've got close to 3500 hours in on studying stable diffusion in the last 2.5 years. and close to 300 hours in on studying flux since it released. if not more
i literally spent an entire month sitting here, feeding it one single prompt at a time to see what it would return by default - walking through its latent space
it's massively overfit, and it's been heavily doctored to deal with the 'woman laying in the grass' warping and shortening issue it shares with sd3-2b-medium.
sure, because it uses the t5xxl encoder and the clip_l encoder. so you're going to get coherent. but give it a word like Umber and you should get mostly the color. the dictionary definition for umber is "a natural pigment resembling but darker than ocher, normally dark yellowish-brown in color (raw umber) or dark brown when roasted (burnt umber). 2. a brownish-gray moth with coloring that resembles tree bark."