Debugging Persistent Segmentation Fault in Multi-threaded C++ Program on AMD Barcelona CPUs

I have been wrestling with a persistent segmentation fault in a multi-threaded C++ program running on a cluster of AMD Barcelona CPUs (Linux/x86_64). The crashing code is a heavily used function, and under load, running 1000 instances of the same optimized binary produces 1 to 2 crashes per hour.
Now here's the interesting part: the crashes happen on different machines within the cluster, although the machines themselves are almost identical, and every crash shares the same characteristics: the same crash address and the same call stack.
Solution
Tools like GDB can help you track memory access patterns across threads. Mutexes are the synchronization mechanism to add around the critical sections of `foo` so that access to shared data is thread-safe. How's it going, @Marvee Amasi?
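To make the mutex suggestion concrete, here is a minimal sketch. The original post does not show what `foo` actually touches, so `SharedState`, its `counter` field, and the `run_workers` helper are hypothetical stand-ins; only the locking pattern is the point.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state; the real data that foo accesses is not shown
// in the original post.
struct SharedState {
    std::mutex mtx;    // guards every access to `counter`
    long counter = 0;
};

// Stand-in for the heavily used function from the question.
void foo(SharedState& state) {
    // lock_guard releases the mutex when it goes out of scope,
    // including on early return or exception.
    std::lock_guard<std::mutex> lock(state.mtx);
    ++state.counter;   // critical section: one thread at a time
}

// Hypothetical helper: calls foo from several threads and returns the
// final counter value once all threads have finished.
long run_workers(std::size_t num_threads, std::size_t calls_per_thread) {
    SharedState state;
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&state, calls_per_thread] {
            for (std::size_t i = 0; i < calls_per_thread; ++i) {
                foo(state);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    // Safe to read without the lock: every worker has been joined.
    return state.counter;
}
```

Calling `run_workers(8, 10000)` should return 80000 with the lock in place. If the `lock_guard` line is removed, the increment becomes a data race and the total can come up short, which is the kind of intermittent misbehavior that only shows up under load. Whether a missing lock is really the cause of the crash described above is an assumption that still needs to be confirmed from the core dump.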