Prism Evaluation

Identifying Performance Bottlenecks

In this tutorial we take a look at how Prism can help you understand various performance issues which commonly occur in multithreaded software.

Setup

Each of the following sections assumes that you are familiar with the basics of using Prism and have successfully built the contents of the Tutorial_Examples project. If not, see the previous tutorials starting here.

Excessive threads

An excessive number of threads is generally a problem when the amount of computational work done by the program is small compared to the overhead of managing all of the threads in the system. In addition, each thread requires a sizable amount of memory for its state and local stack, which can fill up memory unexpectedly quickly.
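To get a feel for the memory cost mentioned above, the sketch below asks the pthread library for its default per-thread stack size and multiplies it by a thread count. The function names default_stack_size and estimate_stack_bytes are made up for this illustration and are not part of Prism or the Tutorial_Examples project. The result is only a rough figure: it describes stack space reserved per thread, not memory that is necessarily in use.

```c
#include <pthread.h>
#include <stddef.h>

/* Returns the default per-thread stack size in bytes as reported by the
 * pthread library, or 0 if it cannot be queried.
 * (Hypothetical helper, written for this illustration only.) */
size_t default_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Rough estimate of the stack space reserved for thread_count threads
 * that are all created with default attributes. */
size_t estimate_stack_bytes(size_t thread_count)
{
    return thread_count * default_stack_size();
}
```

Calling estimate_stack_bytes with the number of threads your program creates gives an idea of how quickly the per-thread stacks add up.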

Program with an excessive number of threads.

There are two ways to establish whether there are too many threads in the system. The first is to look at the Function View.

In this example, edgeDetect is the root function for each thread. Clearly, very little runtime is actually being spent in this function, especially considering that it is called so many times.

Switching off the filter shows that __pthread_manager() is consuming a large percentage of the cycles, which is not a good sign.

__pthread_manager() consuming a large percentage of the cycles.

Next, we can look at the Schedule View. In the Thread Schedule, the huge number of threads is very obvious. Zooming in more closely, you can see that each thread is doing very little, spending about a third of its existence waiting to be scheduled.
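The tutorial does not show the example's source, so the following is only a sketch of one common alternative to starting a thread per small work item: a fixed number of worker threads that each claim rows from a shared counter. The names process_row, pool_job, pool_worker and process_rows_pooled are hypothetical, and process_row is a trivial stand-in for the real per-row work.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-row work; the real edgeDetect
 * source is not shown in this tutorial. */
static void process_row(int row, int *out)
{
    out[row] = row * 2;
}

struct pool_job {
    atomic_int next_row;   /* next row index that has not been claimed */
    int row_count;         /* total number of rows to process */
    int *out;              /* one result slot per row */
};

/* Each worker keeps claiming rows until none are left. */
static void *pool_worker(void *arg)
{
    struct pool_job *job = arg;

    for (;;) {
        int row = atomic_fetch_add(&job->next_row, 1);
        if (row >= job->row_count)
            break;
        process_row(row, job->out);
    }
    return NULL;
}

/* Processes row_count rows using at most worker_count threads.
 * Returns the number of threads created, or -1 on failure. */
int process_rows_pooled(int row_count, int worker_count, int *out)
{
    struct pool_job job;
    pthread_t *threads;
    int created = 0;

    if (row_count < 0 || worker_count <= 0)
        return -1;
    threads = malloc((size_t)worker_count * sizeof *threads);
    if (threads == NULL)
        return -1;

    atomic_init(&job.next_row, 0);
    job.row_count = row_count;
    job.out = out;

    for (int i = 0; i < worker_count; i++) {
        if (pthread_create(&threads[i], NULL, pool_worker, &job) != 0)
            break;
        created++;
    }
    /* Any workers that did start will still finish all of the rows. */
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);

    free(threads);
    return created > 0 ? created : -1;
}
```

With this shape, the number of threads is set by worker_count rather than by the number of rows, so the thread management overhead stays fixed as the input grows.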

Lock contention

Lock contention occurs when a number of threads are competing for control of a single mutex variable. This can badly impact performance by effectively serializing several threads as they each take turns at entering the code sequence guarded by the mutex. To make matters worse, the frequent locking and unlocking of the mutex will add substantial runtime overhead to the program.
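As a minimal sketch of the pattern described above (not the tutorial's actual example code), the block below has several threads add to one shared total. In add_contended every iteration takes the same mutex, so the threads take turns. In add_batched each thread accumulates privately and takes the mutex once. All names here (shared_total, add_contended, add_batched, run_total, THREADS) are hypothetical.

```c
#include <pthread.h>

#define THREADS 4

struct shared_total {
    pthread_mutex_t lock;  /* the single mutex all threads compete for */
    long total;            /* shared result guarded by lock */
    long iterations;       /* work items per thread, set before threads start */
};

/* Contended version: the mutex is locked and unlocked on every iteration. */
static void *add_contended(void *arg)
{
    struct shared_total *s = arg;

    for (long i = 0; i < s->iterations; i++) {
        pthread_mutex_lock(&s->lock);
        s->total += 1;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Batched version: work on a private value, then lock once to publish it. */
static void *add_batched(void *arg)
{
    struct shared_total *s = arg;
    long local = 0;

    for (long i = 0; i < s->iterations; i++)
        local += 1;

    pthread_mutex_lock(&s->lock);
    s->total += local;
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Runs THREADS threads with the chosen worker and returns the final
 * total, or -1 if the mutex or any thread could not be created. */
long run_total(long iterations, int batched)
{
    struct shared_total s;
    pthread_t threads[THREADS];
    int created = 0;

    if (pthread_mutex_init(&s.lock, NULL) != 0)
        return -1;
    s.total = 0;
    s.iterations = iterations;

    for (int i = 0; i < THREADS; i++) {
        if (pthread_create(&threads[i], NULL,
                           batched ? add_batched : add_contended, &s) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&s.lock);
    return created == THREADS ? s.total : -1;
}
```

Both versions produce the same total. The difference is how often the mutex is taken: once per iteration in the contended version, once per thread in the batched one.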

The Core Schedule that appears shows that we clearly have a problem: large amounts of white space are visible between the executing thread segments.

Threads serializing due to a heavily contended lock.

To get a clearer picture, we can switch to the Thread Schedule.

At this zoom level we can see the arrows corresponding to the dependencies between the threads. Only the heads and tails of the arrows are displayed, to prevent the view from becoming too cluttered. Hovering over either end of an arrow causes the whole arrow to be drawn, provided both ends are visible.

Thread waiting detail.

The exact cause of the contention can be found by zooming in further on the orange synchronize arrow that represents the lock. It can then be traced back to the source code by right-clicking the arrow and selecting Annotate Synchronize, as shown below.

Finding the source of the contention.


End of tutorials