Hmm, could this be related? With 6 processes, a batch size of 2, and multi-GPU enabled, I get 96 epochs at 100 batches per epoch; with 1 process, a batch size of 1, and multi-GPU disabled, I get 8 epochs at 1200 batches per epoch. Both runs report the same 9600 total optimization steps.

running training / 学習開始
  num examples / サンプル数: 600
  num batches per epoch / 1epochのバッチ数: 100
  num epochs / epoch数: 96
  batch size per device / バッチサイズ: 2
  gradient accumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 9600
running training / 学習開始
  num examples / サンプル数: 600
  num batches per epoch / 1epochのバッチ数: 1200
  num epochs / epoch数: 8
  batch size per device / バッチサイズ: 1
  gradient accumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 9600
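The two logs are consistent if the trainer fixes the total number of optimization steps (9600 here) and derives the epoch count from it: fewer batches per epoch means more epochs, and vice versa. A minimal sketch of that arithmetic (the helper name is hypothetical, not from the script):

```python
import math

def epochs_for(total_steps: int, batches_per_epoch: int) -> int:
    # Hypothetical helper: if the trainer targets a fixed number of
    # optimization steps, the epoch count falls out by division.
    return math.ceil(total_steps / batches_per_epoch)

# Multi-GPU run: 100 batches per epoch -> 96 epochs
print(epochs_for(9600, 100))   # 96
# Single-GPU run: 1200 batches per epoch -> 8 epochs
print(epochs_for(9600, 1200))  # 8
```

So the differing epoch counts are just the same step budget sliced differently; the per-epoch batch count shrinks as the work is spread over more processes and a larger per-device batch.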