Real Performance Comparison: H100 vs RTX 6000 Ada
Hi,
I’m confused, or perhaps misunderstanding something, about the relative performance of the H100 and RTX 6000 Ada GPUs during model training.
Lately, I’ve been working with both GPUs to train a model using 9 GB of training data and 8 GB of testing data. The model has 2.6M parameters.
On the RTX 6000 Ada, I’m observing an average speed of around 200 ms/step in my current tests:
Epoch 2/15
601/1427 ━━━━━━━━━━━━━━━━━━━━ 2:41 195ms/step - binary_accuracy: 0.8878 - loss: 0.2556
Yesterday, using the H100, I was getting more than 300 ms/step... sometimes 400 ms/step and rarely 200 ms/step.
With the same script, same data, same everything.
Are the H100 and RTX 6000 Ada the same thing?
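A quick sanity check, assuming a Keras/TensorFlow setup (the progress bars above look like Keras output), is to confirm which GPU each run actually sees; this only uses standard tf.config calls:
```python
import tensorflow as tf

# List the GPUs this process can see and print their reported names,
# so a run on the H100 vs. the RTX 6000 Ada can be told apart from the logs.
gpus = tf.config.list_physical_devices("GPU")
print("Visible GPUs:", gpus)

for gpu in gpus:
    # get_device_details returns a dict; 'device_name' and 'compute_capability'
    # are the usual keys on CUDA builds of TensorFlow.
    details = tf.config.experimental.get_device_details(gpu)
    print(details.get("device_name"), details.get("compute_capability"))
```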
Regards.
6 Replies
Unknown User•11mo ago
Message Not Public
Sign In & Join Server To View
OK, running again on the RTX 6000:
498/1427 ━━━━━━━━━━━━━━━━━━━━ 2:58 192ms/step - binary_accuracy: 0.8825 - loss: 0.2695
I'll try the H100 again today and post the result here.
Unknown User•11mo ago
Message Not Public
Sign In & Join Server To View
I guess the dataset and the number of parameters the model handles,
but in both runs everything is the same.
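One way to get a cleaner number than the progress-bar estimate is a small Keras callback that times each batch and prints the per-epoch average (a sketch, assuming tf.keras; StepTimer is just an illustrative name):
```python
import time
import tensorflow as tf

class StepTimer(tf.keras.callbacks.Callback):
    """Logs the average wall-clock time per training batch for each epoch."""

    def on_epoch_begin(self, epoch, logs=None):
        self.times = []

    def on_train_batch_begin(self, batch, logs=None):
        self.t0 = time.perf_counter()

    def on_train_batch_end(self, batch, logs=None):
        self.times.append(time.perf_counter() - self.t0)

    def on_epoch_end(self, epoch, logs=None):
        avg_ms = 1000 * sum(self.times) / max(len(self.times), 1)
        print(f"epoch {epoch}: {avg_ms:.0f} ms/step over {len(self.times)} steps")

# model.fit(train_ds, epochs=15, callbacks=[StepTimer()])
```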
Unknown User•11mo ago
Message Not Public
Sign In & Join Server To View
Yep, I don't understand those numbers yet...
That's on an H100; maybe my script isn't optimized (see the pipeline sketch after the log below):
2005/2005 ━━━━━━━━━━━━━━━━━━━━ 688s 341ms/step - binary_accuracy: 0.8848 - loss: 0.2727 - val_binary_accuracy: 0.8486 - val_loss: 0.4044 - learning_rate: 0.0010
Epoch 3/15
1391/2005 ━━━━━━━━━━━━━━━━━━━━ 2:57 290ms/step - binary_accuracy: 0.9031 - loss: 0.2306
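If the input pipeline is the bottleneck rather than the GPU (plausible for a 2.6M-parameter model reading ~9 GB of data), both cards spend most of each step waiting on data and the H100's extra compute never shows up. A minimal sketch of the usual tf.data mitigation, assuming the data already comes in as a tf.data.Dataset (build_pipeline and raw_train_ds are illustrative names):
```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

def build_pipeline(dataset: tf.data.Dataset, batch_size: int = 256) -> tf.data.Dataset:
    """Overlap CPU-side loading/preprocessing with GPU compute so the
    reported ms/step reflects the GPU rather than the input pipeline."""
    return (
        dataset
        .cache()              # in-memory cache; pass a filename to spill to disk for ~9 GB of data
        .shuffle(10_000)      # buffer size is illustrative
        .batch(batch_size)    # batch size is illustrative
        .prefetch(AUTOTUNE)   # prepare the next batches while the GPU is busy
    )

# train_ds = build_pipeline(raw_train_ds)   # raw_train_ds: however the 9 GB is loaded
# model.fit(train_ds, validation_data=val_ds, epochs=15)
```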