Intel® Advisor User Guide

ID 766448
Date 3/22/2024
Public

Common Issues When Adding Parallelism

The types of problems encountered by parallel programs include shared memory data conflicts and incorrect locking.

Shared Memory Problems

Introducing parallelism can result in unexpected problems when parallel tasks access the same memory location and at least one of them writes to it. Such problems are known as data races. For example, in the Primes sample, the following line calls the function Tick():

  if (IsPrime(p)) Tick();

The called function Tick() increments the global variable primes:

  void Tick() { primes++; }

Consider the following scenario, where the value of primes is incremented only once instead of twice:

  Time   Thread 0                  Thread 1
  T1     Enters function Tick()
  T2                               Enters function Tick()
  T3     Load value of primes
  T4                               Load value of primes
  T5     Increment loaded value
  T6     Store value of primes
  T7                               Increment loaded value
  T8                               Store value of primes
  T9     Return
  T10                              Return

If you run this as a serial program, this problem does not occur. However, when you run it with multiple threads, the tasks may run in parallel, updates to primes may be lost, and its final value may be lower than the number of primes actually found.

Such problems are non-deterministic, difficult to detect, and at first glance might seem to occur at random. The results can vary based on multiple factors, including the workload on the system, the data being processed, the number of cores, and the number of threads.
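Because a real data race is non-deterministic, it can help to replay the schedule from the table above in a single thread. The following is a minimal sketch, not code from the Primes sample: the function name ReplayLostUpdate() and the variables reg0 and reg1 are hypothetical, and the assignment of steps to Thread 0 and Thread 1 is assumed. Each local variable stands in for the register that one thread uses to hold its loaded copy of primes.

```cpp
#include <cassert>

// Hypothetical stand-in for the sample's global counter.
static int primes = 0;

// Replays the table's schedule serially. reg0 and reg1 model the private
// registers of Thread 0 and Thread 1 (an assumption made for illustration).
int ReplayLostUpdate() {
    primes = 0;
    int reg0 = primes;   // T3: Thread 0 loads primes (0)
    int reg1 = primes;   // T4: Thread 1 loads primes (0)
    reg0 = reg0 + 1;     // T5: Thread 0 increments its loaded value
    primes = reg0;       // T6: Thread 0 stores 1
    reg1 = reg1 + 1;     // T7: Thread 1 increments its stale value
    primes = reg1;       // T8: Thread 1 stores 1, overwriting Thread 0's update
    assert(reg0 == 1 && reg1 == 1);  // both threads computed the same value
    return primes;       // 1, although Tick() was "called" twice
}
```

Calling ReplayLostUpdate() returns 1 rather than 2, because both threads read the counter before either one wrote it back.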

You can use locks to restrict access to a shared memory location to one task at a time. However, every lock implementation adds overhead. When the tasks reuse the same memory locations but do not actually communicate data values through them, it is more efficient to avoid the sharing altogether by giving each task its own copy of the storage.
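The two approaches described above can be sketched with standard C++ threads. This is an illustration under stated assumptions, not the Primes sample's actual code: IsPrime() here is a simple trial-division stand-in, and CountWithLock(), CountWithReplication(), and their parameters are hypothetical names. The first function keeps one shared counter and guards it with a std::mutex. The second gives each thread a private counter and combines the results after all threads have finished.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Trial-division primality test; a stand-in for the sample's IsPrime().
bool IsPrime(int p) {
    if (p < 2) return false;
    for (int d = 2; d * d <= p; ++d)
        if (p % d == 0) return false;
    return true;
}

// Approach 1: one shared counter, protected by a lock.
int CountWithLock(int limit, int num_threads) {
    assert(num_threads > 0);
    int primes = 0;
    std::mutex primes_mutex;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            // Thread t tests the candidates t, t + num_threads, ...
            for (int p = t; p < limit; p += num_threads) {
                if (IsPrime(p)) {
                    std::lock_guard<std::mutex> guard(primes_mutex);
                    ++primes;  // only one thread at a time runs this line
                }
            }
        });
    }
    for (auto& w : workers) w.join();
    return primes;
}

// Approach 2: replicated storage, so no lock is needed.
int CountWithReplication(int limit, int num_threads) {
    assert(num_threads > 0);
    std::vector<int> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            int local = 0;  // private to this thread
            for (int p = t; p < limit; p += num_threads)
                if (IsPrime(p)) ++local;
            partial[t] = local;  // each thread writes only its own slot
        });
    }
    for (auto& w : workers) w.join();
    int primes = 0;
    for (int c : partial) primes += c;  // combined serially after the joins
    return primes;
}
```

Both functions return the same count for a given limit. The replicated version takes no lock inside the loop, which is where the overhead of the first version comes from.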

Lock Problems

One thread (thread A) may have to wait for another thread (thread B) to release a lock before it can proceed. While thread A waits, the core executing it is not performing useful work. This is a case of lock contention. If, in addition, thread B is waiting for thread A to release a different lock, neither thread can ever proceed. Such a condition is called a deadlock.

Like a data race, a deadlock can occur in a non-deterministic manner. It might occur only when certain factors exist, such as the workload on the system, the data being processed, or the number of threads.
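One common way to rule out the two-lock deadlock described above is to acquire both locks in a single step. The following is a minimal sketch assuming standard C++11 facilities; the Account type and the functions Transfer() and RunOpposingTransfers() are hypothetical and exist only for illustration. std::lock() acquires both mutexes using a deadlock-avoidance algorithm, so it does not matter that the two threads name the locks in opposite order.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical shared resource guarded by its own lock.
struct Account {
    std::mutex m;
    long balance = 0;
};

// Deadlock-prone pattern, shown only as a comment:
//   thread A: lock(a.m), then lock(b.m)
//   thread B: lock(b.m), then lock(a.m)

void Transfer(Account& from, Account& to, long amount) {
    assert(&from != &to);     // locking the same mutex twice is not allowed
    std::lock(from.m, to.m);  // acquires both locks without deadlocking
    // adopt_lock: the guards take ownership of already-held locks and
    // release them when the function returns.
    std::lock_guard<std::mutex> g1(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> g2(to.m, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}

// Two threads transfer in opposite directions, so they request the same
// two locks in opposite order. Returns the combined balance afterwards.
long RunOpposingTransfers(int iterations) {
    Account a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < iterations; ++i) Transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < iterations; ++i) Transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

The combined balance stays at 2000 regardless of how the two threads interleave. Threads can still contend for the locks, so this removes the deadlock but not the waiting.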

Ensuring the Parallel Portions of a Program are Thread Safe

Intel® Advisor can detect many problems related to parallelism. Because it analyzes only the serial execution of your program, Intel Advisor cannot detect all possible errors. When you have finished using Intel Advisor to introduce parallelism into your program, you should use the Intel® Inspector and other Intel software suite products. These tools, together with a debugger, can detect parallelism problems that normal testing does not detect, and can also identify times when the cores are idle.