If I train model A for 3200 steps and model B for 5000 steps, both from the same base model, would the generated images be the same for both if, for model B, I use a checkpoint saved near 3200 steps, like model A (saving every x epochs/steps)?

And if I train a model for 3200 steps and then continue training it for another 1800 steps, for a total of 5000 steps, will that model perform as well as one trained to 5000 steps directly?

Thank you 🙂