Runpod•10mo ago

Training runs 2-5x slower on pods than on home system.

Home system: 4090, 7950X, 64GB RAM, M.2 SSD.

In comparison, Runpod pods vs. my home system:
1x 4090: 2.5-3x slower on ALL ops.
L40: 5x slower
H200: 1.5x slower

The 'slowness' refers to the time taken by each operation. In the attached examples, I show that an nn.Linear module is around 2x slower on the Runpod 4090 than on mine.
Why might this be?

For extra context, my dataset is MNIST, and it is loaded onto the GPU.
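
For reference, here is a minimal sketch of one way to time a single op such as nn.Linear so the numbers are comparable across machines. It is not necessarily how the attached measurements were taken. It assumes PyTorch with CUDA; the helper name `time_op`, the layer sizes, and the batch size are hypothetical. The helper warms up first and synchronizes the GPU before stopping the clock, since CUDA calls return before the kernel has finished.

```python
import statistics
import time


def time_op(fn, sync=None, warmup=10, iters=100):
    """Return the median wall-clock time of fn() in milliseconds.

    sync is an optional callable that blocks until queued work finishes
    (for CUDA, torch.cuda.synchronize). Without it, a GPU timing mostly
    measures how quickly the CPU can launch kernels.
    """
    sync = sync or (lambda: None)
    for _ in range(warmup):  # warm-up runs exclude one-time setup costs
        fn()
    sync()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        sync()  # wait for the op to finish before stopping the clock
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # Hypothetical MNIST-sized layer and batch, already on the GPU.
        layer = torch.nn.Linear(784, 256).cuda()
        x = torch.randn(64, 784, device="cuda")
        with torch.no_grad():
            ms = time_op(lambda: layer(x), sync=torch.cuda.synchronize)
        print(f"{torch.cuda.get_device_name(0)}: {ms:.4f} ms per forward")
    else:
        print("PyTorch with CUDA not available; skipping GPU timing")
```

Running the same script on the home machine and on the pod gives one number per machine to compare. With layers this small, the measured time may reflect CPU-side launch overhead as much as GPU compute, so the host CPU could be part of the difference.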
