1) Ray tracing: looping over all the pixels in an image and tracing rays to determine the color of each pixel. The algorithm and data structures are complex, but they don't change during rendering, so N cores are about N times as fast.
2) In Solvespace we had a small loop that calls a tessellation function on a bunch of NURBS surfaces. The function was appending triangles to a list, so I made a thread-local list for each call and combined them afterward to avoid writes to a shared data structure. Again, about N times faster with very little effort.
The code also builds fine single-threaded, without changes, if you don't have OpenMP. Your compiler will just ignore the #pragmas.
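A minimal sketch of the pattern in 2), to make the "private list per call, combine afterward" idea concrete. This is not Solvespace's actual code: `Surface`, `Triangle`, `Tessellate`, and `TessellateAll` are hypothetical stand-ins, and the dummy tessellator just emits `id` triangles per surface.

```cpp
#include <vector>

// Hypothetical stand-ins; not Solvespace's real types.
struct Triangle { double a, b, c; };
struct Surface { int id; };

// Hypothetical tessellator: appends `s.id` dummy triangles to `out`.
void Tessellate(const Surface &s, std::vector<Triangle> &out) {
    for (int i = 0; i < s.id; i++) {
        out.push_back({double(s.id), double(i), 0.0});
    }
}

std::vector<Triangle> TessellateAll(const std::vector<Surface> &surfaces) {
    // One private list per call, so no thread writes to shared data.
    std::vector<std::vector<Triangle>> perSurface(surfaces.size());

    // Without OpenMP this pragma is ignored and the loop runs serially.
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < (long)surfaces.size(); i++) {
        Tessellate(surfaces[i], perSurface[i]);
    }

    // Combine serially after the loop; output order stays deterministic.
    std::vector<Triangle> all;
    for (const auto &list : perSurface) {
        all.insert(all.end(), list.begin(), list.end());
    }
    return all;
}
```

With GCC or Clang, building with -fopenmp enables the parallel loop. Without that flag the same source compiles to the serial version and should produce the same triangle list.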
'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?'
This was never literally practiced.
But excessive hours were the norm. And I loved it. It helped me launch into a successful career.
But it hurt my relationship with my partner (now wife), and it burned me out.
I miss those days, but I don’t miss what they did to my health.