Modular · 12mo ago
andrea onofri

Parallelize in the Mandelbrot example does not use all physical cores of an Intel Core i9 CPU

Parallelize doesn't use all 24 physical cores of a "13th Gen Intel® Core™ i9-13900KF × 32" processor.
Instead, only the 8 "performance" cores are actually used,
so the speedup is just below 8X on a processor with 24 physical cores.

It seems that, on this processor, the parallelize implementation sizes its thread pool by the number of performance cores rather than the number of physical cores.
I tried changing the num_workers runtime argument of parallelize, with essentially no practical effect: the speedup only gets slightly closer to 8X, with the best results for values around 100.

(Mojo examples) .../mojo/examples$ mojo mandelbrot.mojo

Number of logical cores: 32
Number of physical cores: 24
Number of performance cores: 8

Vectorized: 5.576460718604651 ms
Parallelized: 0.7627593649078195 ms
Parallel speedup: 7.310904297161366
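
As a quick sanity check on these numbers, here is a minimal Python sketch (not part of the Mojo example; the helper name parallel_efficiency is my own) that recomputes the speedup from the two timings and compares it against ideal linear scaling on 8 and on 24 cores:

```python
# Timings copied from the run above, in milliseconds.
vectorized_ms = 5.576460718604651
parallelized_ms = 0.7627593649078195


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved on `cores` cores."""
    return speedup / cores


# Speedup of the parallelized run over the vectorized single-thread run.
speedup = vectorized_ms / parallelized_ms

print(f"speedup: {speedup:.2f}x")
print(f"efficiency vs 8 performance cores: {parallel_efficiency(speedup, 8):.0%}")
print(f"efficiency vs 24 physical cores:   {parallel_efficiency(speedup, 24):.0%}")
```

The result is about 91% of ideal scaling on 8 cores but only about 30% on 24, which is consistent with work being spread over roughly 8 cores.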

I would expect a parallel speedup below 24X, but probably above 20X.
The Mandelbrot computation is highly parallelizable, but the 16 non-"performance" cores are usually slower.
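
To make that expectation concrete, here is a rough Python sketch of the arithmetic behind it. It assumes a perfectly load-balanced workload and ignores scheduling overhead, memory bandwidth and frequency scaling. The function names and the e_core_ratio values are hypothetical illustrations, not measurements of this CPU.

```python
def ideal_speedup(p_cores: int, e_cores: int, e_core_ratio: float) -> float:
    """Ideal speedup over one performance core.

    e_core_ratio is the assumed throughput of one non-performance core
    relative to one performance core (1.0 means equally fast).
    """
    return p_cores + e_cores * e_core_ratio


def min_e_core_ratio(target: float, p_cores: int, e_cores: int) -> float:
    """Smallest e_core_ratio for which ideal_speedup reaches `target`."""
    return (target - p_cores) / e_cores


# Core counts from the output above: 8 performance cores, 24 - 8 = 16 others.
P_CORES, E_CORES = 8, 16

for ratio in (0.0, 0.5, 0.75, 1.0):  # hypothetical relative throughputs
    print(f"e_core_ratio={ratio:.2f} -> {ideal_speedup(P_CORES, E_CORES, ratio):.1f}x")

print("ratio needed for 20X:", min_e_core_ratio(20, P_CORES, E_CORES))
```

Under this simple model, a speedup above 20X requires each of the 16 slower cores to deliver at least 75% of a performance core's throughput, while a ratio of 0 (slower cores unused) gives exactly 8X.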

I hope this can be the first of my two-cent contributions to a truly promising language & friends (Mojo -> MAX & Magic 😉)
SPOILER_Screenshot_from_2025-01-26_16-50-56.png
Screenshot_from_2025-01-26_16-27-56.png
Screenshot_from_2025-01-26_17-23-38.png