I'm just saying that even during inference a low-precision base model doesn't make a difference, so how would it make a difference during training? The LoRA weights are still bf16+.
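
To make what I mean concrete, here's a minimal sketch of that setup, assuming a Hugging Face transformers + peft + bitsandbytes stack (the model id and LoRA hyperparameters are just illustrative, not from this thread): the frozen base model is quantized to 4-bit, while the trainable LoRA adapters stay in higher precision.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Frozen base model quantized to 4-bit; matmuls still computed in bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative model id
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)

# Illustrative LoRA config; only these adapter weights are trained.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)

# Trainable parameters are the LoRA adapters, kept in bf16/fp32,
# regardless of the 4-bit precision of the frozen base weights.
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name, param.dtype)
```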