Inconsistent inference performance across pods (RTX 4090)
Hello. I rent and run 20 RTX 4090 GPUs around the clock.
However, their inference speeds vary significantly.
Each row of the table in the attached image represents a pair of RTX 4090 GPUs.
One pair processes 150 images in 3 minutes, while the rest manage only 50-80 in the same time. On my own 2-way RTX 4090 server, which I purchased outright, throughput is 180 images in 3 minutes. I haven't been able to figure out what is causing these speed differences.
Each inference task generates a single image.
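For anyone hitting the same issue: a common cause of this kind of pod-to-pod variance on rented GPUs is a reduced power cap or thermally throttled clocks on some hosts. Below is a minimal sketch (not specific to any provider) that parses the CSV output of `nvidia-smi --query-gpu=clocks.sm,clocks.max.sm,power.limit,power.max_limit --format=csv,noheader,nounits` and flags GPUs that look throttled. The 0.9 clock-ratio threshold is an assumption, not a vendor spec.

```python
import csv
import io

# Command whose output this parser expects (run it on each pod):
#   nvidia-smi --query-gpu=clocks.sm,clocks.max.sm,power.limit,power.max_limit \
#              --format=csv,noheader,nounits

def flag_throttled(csv_text, clock_ratio=0.9):
    """Return indices of GPUs whose sustained SM clock is well below the
    maximum, or whose power limit has been capped below the board maximum.
    `clock_ratio` is an assumed heuristic threshold, not an NVIDIA spec."""
    flagged = []
    for i, row in enumerate(csv.reader(io.StringIO(csv_text))):
        sm_clock, sm_max, power_limit, power_max = (float(x) for x in row)
        if sm_clock < clock_ratio * sm_max or power_limit < power_max:
            flagged.append(i)
    return flagged

# Canned example: two healthy GPUs and one power-capped GPU.
sample = (
    "2805, 3105, 450.00, 450.00\n"
    "2820, 3105, 450.00, 450.00\n"
    "1350, 3105, 300.00, 450.00\n"
)
print(flag_throttled(sample))  # -> [2]
```

Comparing this output between the fast pod and the slow ones would show quickly whether the difference is a host configuration issue (power cap, cooling) rather than your inference code.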
