Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a YouTube channel dedicated to the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
Just wanted to ask if some people on here use 4090s for DreamBooth training in Kohya and what their experience has been. I noticed that on RunPod, a 4090 is a lot slower than a 3090 Ti, and I'm trying to figure out if this is an issue with RunPod or with the 4090 overall. I've seen some issues posted on the Kohya GitHub about 4090s being slow, but I'm not sure if this is still the case.
I install xformers using the following command: pip install xformers==0.0.23
I also edit webui-user.bat to use xformers: set COMMANDLINE_ARGS=--xformers
With pip install xformers==0.0.23, I get the following:
Installing collected packages: torch, xformers
  Attempting uninstall: torch
    Found existing installation: torch 2.1.0+cu118
    Uninstalling torch-2.1.0+cu118:
      Successfully uninstalled torch-2.1.0+cu118
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.16.0+cu118 requires torch==2.1.0+cu118, but you have torch 2.1.1 which is incompatible.
torchaudio 2.1.0+cu118 requires torch==2.1.0+cu118, but you have torch 2.1.1 which is incompatible.
Successfully installed torch-2.1.1 xformers-0.0.23
And when I execute webui-user.bat, I get the following: RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
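For anyone hitting the same thing: the dependency conflict above happens because xformers 0.0.23 pulls in the plain (CPU-only) torch 2.1.1 wheel from PyPI, which replaces the CUDA build and then triggers the "Torch is not able to use GPU" error. A minimal sketch of a fix, run inside the webui venv; the torchvision/torchaudio version numbers are my assumption based on the error messages above:
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
pip install xformers==0.0.23
With the matching +cu118 builds installed first, xformers' torch==2.1.1 requirement is already satisfied, so pip no longer swaps in the CPU wheel and --skip-torch-cuda-test should not be needed.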
are you using the runpod kohya template by any chance? i had never noticed until watching the good dr's video that if you don't kill auto1111, 25-30% of the vram is getting sucked up before you start
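For anyone checking this on their own pod, a rough sketch of how to confirm and free that VRAM before starting training (the exact process behind the bundled Auto1111 instance varies by template, so the PID below is just whatever nvidia-smi reports):
nvidia-smi                 # lists the processes currently holding VRAM
kill <PID_of_auto1111>     # hypothetical placeholder; use the PID shown by nvidia-smi
nvidia-smi                 # run again to confirm the memory was released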
i typically just reduce the batch size until it works, or throw on xformers, but i'm going to retry a few of them without cheating since apparently it's bad juju
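For reference, in Kohya's sd-scripts those VRAM savings usually come from flags like the ones below, added to whatever training command you already run (just a sketch; every other required argument is omitted):
accelerate launch train_network.py \
  --train_batch_size=1 \
  --xformers \
  --gradient_checkpointing \
  --mixed_precision="fp16"
Gradient checkpointing only costs speed, not quality, so it is arguably the least "cheaty" of the three.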
Oh, I see, yeah, let me do it as well. By "potential" I mean that when I run training, the GPU doesn't take the load it should. Usually the temperature gets higher when the GPU is using its full power, but when I run training with Kohya it only reaches about half the temperature it normally does.
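Temperature is only a rough proxy; a more direct check (nothing Kohya-specific, just a sketch) is to watch utilization and memory while the training steps run:
nvidia-smi --loop=1        # refreshes every second; watch the GPU-Util and memory columns
If GPU-Util stays low during training steps, the bottleneck is usually data loading or the settings rather than the card itself.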
tools/convert_diffusers20_original_sd.py is fixed to work again. Thanks to Disty0! PR #1016
The issues in multi-GPU training are fixed. Thanks to Isotr0py! PR #989 and #1000
The only thing that broke, it seems, was BLIP captioning, but as I recall, I was using some kind of rolled-back version of something, I think it was transformers. Honestly, most of these captioning tools are lousy, and if you are using a smaller dataset, manual captioning is best.
Can I do multi-GPU training in Kohya if the graphics cards are different models with different VRAM (for example, an RTX 3080 and an RTX 3090)? Sorry to ask before testing; it's just that I need a new power supply if I want to use both cards.