From what I understand (correct me if I am wrong, Doc), one epoch is one full pass over the input dataset, so the number of epochs equals how many times each image gets trained on.
So suppose you have an input dataset of 15 images and you set the epochs to 200: each image will be trained on 200 times, so the no. of steps = 15 x 200 = 3000, assuming a batch size of 1 and no repeats (3000 steps is a trade-off figure b/w time and quality that the Doc likes to use in his videos).
Now, the time taken for each step will depend on your GPU. Say a 12GB GPU trains Flux at 12 seconds/iteration. Then the total time =
3000 x 12 = 36,000 secs (10 hrs).
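
If you want to plug in your own numbers, here is a rough Python sketch of the same arithmetic. The function name `training_estimate` and the `batch_size` parameter are my own, not from any trainer, and it assumes no image repeats and a constant time per step, so treat the result as a ballpark only.

```python
import math


def training_estimate(num_images, epochs, seconds_per_step, batch_size=1):
    """Return (total_steps, total_hours) for a training run.

    Assumes every image is seen once per epoch (no repeats) and that
    each step takes the same time; both are simplifications.
    """
    # One step processes one batch, so steps per epoch = batches per epoch.
    steps_per_epoch = math.ceil(num_images / batch_size)
    total_steps = steps_per_epoch * epochs
    total_seconds = total_steps * seconds_per_step
    return total_steps, total_seconds / 3600


# The example from this post: 15 images, 200 epochs, 12 s per step.
steps, hours = training_estimate(15, 200, 12)
print(f"{steps} steps, about {hours:.1f} hours")
```

With a larger batch size the step count drops (e.g. `batch_size=2` gives 8 steps per epoch for 15 images), but the time per step usually goes up, so measure your own seconds/iteration before trusting the estimate.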