This is my TensorBoard so far. The black run was only trained for 1 epoch, the teal one for 2 epochs, and the pink one is in the middle of its 9th epoch right now. Until the 2000 mark, all 3 were following the exact same path on the loss/average chart. The first two had the default batch size/gradient/LR; the pink one has an LR of 0.001 (10x the default LR). I only used 4 repeats in my dataset because I put 138 images in, and 138 x 4 comes to 552, which is close to the 14 images x 40 repeats you were using in your tutorial (which comes out to 560). I used 552 regularization images as well. Am I just using too many images? Maybe I should just pick 14 high-quality, face-only images like you did.
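
For anyone checking my math, here's the images x repeats arithmetic as a quick Python sketch. The function name `effective_images` is just something I made up for illustration, and it only multiplies the two numbers from my post; it doesn't model how the trainer actually counts steps, batches, or regularization images.

```python
def effective_images(num_images: int, repeats: int) -> int:
    """Images seen per epoch if each image is repeated `repeats` times (hypothetical helper)."""
    return num_images * repeats


# My dataset: 138 images at 4 repeats
mine = effective_images(138, 4)

# The tutorial's dataset: 14 images at 40 repeats
tutorial = effective_images(14, 40)

# Difference between the two per-epoch totals
gap = tutorial - mine

print(f"mine={mine}, tutorial={tutorial}, gap={gap}")
```

So the per-epoch totals are nearly the same; what differs is that mine is spread across roughly 10x as many distinct images with 10x fewer repeats each.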




