Hello everyone. I am Dr. Furkan Gözükara, a PhD Computer Engineer. SECourses is a YouTube channel dedicated to the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion.
Seems to be a big training effort, not just a remerge: "Hello everyone! An early version of RealVisXL V5.0 (560k steps) is now available on Hugging Face. The model is still in training, and the final version will be released on Hugging Face and CivitAI at the end of August. You can leave your feedback on this version in the comments below."
@Dr. Furkan Gözükara Bro, I am going to sleep now, it's 6 am haha, I've been training LoRAs all night and I finally fixed all my machines. Anyway, I want to train on 1000 images (an art style). Doing that on a single A6000 would take about a week, right? What system do you recommend, and how much would it cost me on Massed Compute, something like a 4-8x A6000 machine? If it's $3-5 an hour I am down to pay it. Please recommend the right VM to use there, and do you have a video on setting one up? I only train locally and have never rented hardware before. Let me know, thanks and good night brother! Totally worth the Patreon fee!
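A minimal multi-GPU launch sketch for that kind of job, assuming kohya sd-scripts and an SDXL-style LoRA run; every path and hyperparameter below is a placeholder, not a recommendation:

# Hedged sketch: 4-GPU LoRA training via accelerate; verify flags with --help on your version.
accelerate launch --multi_gpu --num_processes 4 sdxl_train_network.py \
  --pretrained_model_name_or_path /models/sd_xl_base_1.0.safetensors \
  --train_data_dir /data/style_train \
  --output_dir /output \
  --network_module networks.lora \
  --resolution 1024,1024 \
  --train_batch_size 4 \
  --learning_rate 1e-4 \
  --mixed_precision bf16

With N GPUs the effective batch size scales by N, so for the same number of seen images the wall-clock time should drop roughly by the GPU count, which is what makes a rented 4-8x machine attractive for a 1000-image set.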
@Dr. Furkan Gözükara Another question: if I want to use custom nodes in SwarmUI, I did not see a folder for them. Do I have to create one? All I see are these folders and files: Autocompletions, Users-log.ldb, clip_vision, tensorrt, Embeddings, Users.ldb, comfy-auto-model.yaml, unet, Lora, VAE, controlnet, upscale_models, Stable-Diffusion, clip, diffusion_models. Where is the custom nodes folder?
@Dr. Furkan Gözükara I've tested on Windows on my PC with an A4000 (8 GB) and it worked. However, as I suspected, it uses shared memory during the "move text encoder to GPU" step because there is not enough memory on the GPU; that's why it OOMs on Linux, where it can't fall back to shared memory. Afterwards the text encoder is passed back to the CPU, freeing memory, and the rest of what gets loaded requires less than 8 GB, so during training it stays under 8 GB. Is there a way to not move the text encoder to the GPU, like computing it directly on the CPU, having it use shared memory, or making it use less VRAM, so it will also work on Linux?
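One avenue worth checking (a sketch, assuming a recent kohya sd-scripts FLUX branch; confirm the flags against your checkout's --help): sd-scripts can pre-compute and cache the text encoder outputs so the encoders do not have to stay resident on the GPU during the training loop, which should avoid exactly the spike that OOMs on Linux:

# Hedged sketch: cache TE outputs (optionally to disk) before training starts.
accelerate launch flux_train_network.py \
  --cache_text_encoder_outputs \
  --cache_text_encoder_outputs_to_disk \
  ...  # the rest of your usual arguments, unchanged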
Replace SD3Tokenizer with the original CLIP-L/G/T5 tokenizers. Extend the max token length to 256 for T5XXL. Refactor caching for latents. Refactor caching for Text Encoder outputs. Extract arch...
Hi there, may I ask you something about kohya_ss? To generate the meta_lat.json for full fine-tuning of FLUX, do you know which model path I should pass as the argument? The path to flux1-dev.safetensors does not seem to work.
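For context, meta_lat.json is normally produced by the SD/SDXL-era finetune/prepare_buckets_latents.py script, which loads the checkpoint mainly for its VAE, so a FLUX checkpoint may simply not be accepted there. A hedged sketch of the classic invocation (paths are placeholders; whether any FLUX model works here is exactly the open question):

# Hedged sketch: positional args are train_data_dir, in_json, out_json, model path.
python finetune/prepare_buckets_latents.py \
  train_images meta_cap.json meta_lat.json \
  sd_xl_base_1.0.safetensors \
  --batch_size 4 --max_resolution 1024,1024 --mixed_precision bf16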
@Dr. Furkan Gözükara New test with a high TE learning rate, now comparing unique tokens with no class token versus person-class regularization; I can see that using the person class avoids class mixing.
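For readers following along, that setup maps onto the usual kohya DreamBooth-style folder convention, where the folder name encodes repeats, the (optional) unique token, and the class, and regularization images sit in a class-only folder. A sketch, with ohwx as a placeholder rare token:

train/
  20_ohwx person/    # instance images: repeats_uniqueToken class
reg/
  1_person/          # regularization images: repeats_class only

Passing the reg folder via --reg_data_dir is what supplies the person-class regularization being compared above.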