Hey, I just started training a fine-tune on Massed Compute following the tutorial, using 2x A6000 (without NVLink), and it seems like only 1 GPU is being utilized.
Is there a way to utilize both for faster training?



