Optimizing a bubble sort implementation in C for an x86-64 architecture

Hello, I'm working on optimizing a bubble sort implementation in C for an x86-64 architecture, specifically targeting an Intel Core i7 processor, using GCC 11.2. I noticed that the -O3 flag resulted in slower performance than -O2 when sorting large arrays of integers (1 million elements in this case).

Here are the timings, average of multiple runs:
Solution
Compile your code with profiling enabled (the -pg flag), run it so that it writes gmon.out, then analyze the result with gprof:
gcc -pg -O3 -o sort sort.c
./sort 1000000
gprof sort gmon.out > analysis.txt

Alternatively, record a profile with call-graph information using perf:
perf record -g ./sort 1000000

Then generate a report to see in detail where your program spends its time:
perf report
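
The commands above assume a sort.c that takes the element count as its first argument. The original source file is not shown, so the following is only a minimal sketch of what its sorting and timing core might look like. The names fill_random, bubble_sort, is_sorted and time_sort are hypothetical, chosen for illustration. A fixed seed keeps the input identical across the -O2, -O3 and -pg builds, so timing differences come from the compiler rather than from the data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: fill a[0..n-1] with pseudo-random values from a
   fixed seed, so every build sorts exactly the same input. */
void fill_random(int *a, size_t n, uint32_t seed)
{
    uint32_t x = seed;
    for (size_t i = 0; i < n; i++) {
        x = x * 1664525u + 1013904223u;  /* one linear congruential step */
        a[i] = (int)(x >> 8);            /* at most 24 bits, fits in int */
    }
}

/* Plain bubble sort with an early exit when a pass performs no swaps.
   This is an assumed baseline, not necessarily the poster's code. */
void bubble_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t pass = 0; pass + 1 < n; pass++) {
        bool swapped = false;
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped)
            break;  /* already sorted */
    }
}

/* Return true if a[0..n-1] is in non-decreasing order. */
bool is_sorted(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (a[i - 1] > a[i])
            return false;
    }
    return true;
}

/* CPU seconds spent in a single bubble_sort call, measured with clock(). */
double time_sort(int *a, size_t n)
{
    clock_t start = clock();
    bubble_sort(a, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A main function would parse the count from argv[1], allocate the array, call fill_random and time_sort, check the result with is_sorted, and print the elapsed time. Since bubble sort is quadratic, a smaller element count makes it much quicker to iterate while comparing -O2 and -O3 builds.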