Hi guys! I've been training FLUX LoRAs with Kohya on Windows using 4 RTX GPUs. Everything worked fine about a month ago, but now I keep getting crashes in torch.distributed.get_world_size() when launching through Accelerate.
Has something changed recently in PyTorch or Accelerate that breaks multi-GPU training on Windows?
Is there any recommended workaround for this?
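In case it helps narrow this down, the only way I've found to get past the crash so far is wrapping the call in a guard like the sketch below. This is just my own patch (the `safe_world_size` helper is mine, not something from Kohya or Accelerate), and I'm assuming the root cause is that the default process group never gets initialized on Windows, where NCCL isn't available:

```python
import torch.distributed as dist

def safe_world_size() -> int:
    # dist.get_world_size() raises a RuntimeError if the default process
    # group was never initialized, which appears to be what happens with my
    # Accelerate launch on Windows (no NCCL there, so it would need gloo).
    # Falling back to 1 at least keeps single-process runs working.
    if dist.is_available() and dist.is_initialized():
        return dist.get_world_size()
    return 1

print(safe_world_size())  # prints 1 outside a proper distributed launch
```

This obviously just papers over the problem and effectively drops me back to one GPU, so I'd still like to know what actually changed and what the proper fix is.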