Hi guys! I've been training FLUX LoRAs with Kohya on Windows across 4 RTX GPUs. Until about a month ago everything worked fine, but now training keeps crashing in torch.distributed.get_world_size() when launched through Accelerate.
Has something changed recently in PyTorch or Accelerate that breaks multi-GPU training on Windows? Is there any recommended workaround for this?
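In case it helps narrow things down, here is a minimal sketch of the guard I'm considering patching into the training script, on the assumption (my guess, not confirmed) that the crash comes from get_world_size() being called before the default process group is initialized:

```python
import torch.distributed as dist

def safe_world_size() -> int:
    """Return the distributed world size, or 1 outside a process group.

    torch.distributed.get_world_size() raises a RuntimeError if the
    default process group has not been initialized, so guard the call.
    """
    if dist.is_available() and dist.is_initialized():
        return dist.get_world_size()
    return 1  # fall back to single-process behavior
```

Worth noting that NCCL is not available on Windows, so PyTorch distributed there is limited to the gloo backend; I'm wondering if a recent PyTorch or Accelerate update changed how that backend gets initialized on Windows and that's what broke multi-GPU launches.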