Threads, it’s easy to overlook the dependencies
At CriticalBlue, we often hear that thread based programming is hard and that it requires us all to ‘think parallel’. Conceptually though, I don’t think the fundamentals of thread programming are all that hard to grasp. In pthreads, we create threads so functions can execute in parallel; we use mutexes to coordinate mutual exclusion through critical regions, and we use condition variables to wait until conditions are right before proceeding. Seems simple enough.
Rather, what really is hard is to constrain a parallel program such that all possible interleavings of program execution are correct. While it’s easy to create threads and run them in parallel, it’s hard to catch all the data dependencies between the threads which must be properly ordered to ensure always race free execution.
I recently proved once again just how easy it is to overlook a critical dependency. I was preparing some examples for a talk I was giving at the ARM Developers’ Conference this October. I was extending a simple edge detection example to show different ways the image pipeline could be parallelized. I’d previously parallelized just the Sobel function using a row-based data decomposition strategy. No problem here I thought, all I need to do is replace the Sobel slice function with one processing a slice through the correction, noise filter and sobel functions. I ran the program, and I ended up with a pretty good looking result.

I say pretty good, because there was one line of pixels which looked a little off, but that was probably just an artifact, right? Wrong. That artifact saved me a lot of embarrassment, because, I’d blindly introduced data races into my code.
When you think it through, it’s easy to see that a filter slice can’t start processing until the correcting slice has completed and the edges of it’s neighbors have completed processing. Within a slice, the processing is sequential, but at the edges, there’s no guarantee that the edges of one slice will be corrected before another slice needs them. A simple but slightly overconstrained solution is to ensure that all correction processing completes before any filtering processing occurs, and, likewise, that all filtering completes before the Sobel slices begin.

The lesson learned is that it is very easy to overlook a critical dependency. I was lucky that my visual result gave me a tiny hint that something was wrong. It could have very easily passed all my tests, and I could have had an observant conference attendee put me in my place.
Simple though it was, this incident gave me a real appreciation for the importance of good tools, in this case, not so much for doing my parallel thinking for me, but for checking that what I did wasn’t doing anything wrong, and when there are mistakes, helping me understand what they are and ensuring that I’ve corrected them. I’ve added data race detection tools to my list of must haves for parallel programming.
Alas, there are still plenty of other ways I routinely embarrass myself…
- Skip Hovsmith