OK, with 2 GPUs I'm getting a stable 3.36 s/it. I can never remember how to read this, though. I'm seeing:
steps: 0%| | 50/17500 [02:47<16:16:46, 3.36s/it, avr_loss=0.443]
steps: 0%| | 50/17500 [02:47<16:16:58, 3.36s/it, avr_loss=0.447]
(step #50 being done twice)
What does this mean? Will the final step count be 17500 or 35000?