Intel® Advisor User Guide

ID 766448
Date 3/22/2024
Public

A newer version of this document is available. Customers should click here to go to the newest version.

Document Table of Contents

Independent Updates

Independent updates can occur when multiple tasks contribute to determining the final value of a memory location.

The Basic Pattern

Suppose that multiple tasks write to a memory location, that the value written by each task is computed using the previous value in that location, and that the order in which the tasks update the memory location does not matter.

For example, consider a loop that sums all the values in an array:

extern int x;
// ...
ANNOTATE_SITE_BEGIN(site1);
for (i = 0; i != n; ++i) { 
    ANNOTATE_ITERATION_TASK(task1);
    x = x + a[i];
}
ANNOTATE_SITE_END(site1);
printf("%d\n", x);

The sharing problem looks like this:

  1. Task 0 reads x.

  2. Task 1 reads x.

  3. Task 0 adds a[0] to the value it read and stores the result back in x.

  4. Task 1 adds a[1] to the value it read and stores the result back in x, overwriting the value stored by task 0.

The important fact is that the updates of x can be performed in any order. All you need to do is to make sure that no task can write to x between the read from x and the write to x in any other task; the uses of x in the tasks are otherwise independent.

Reductions

Reductions are a special case of the independent update pattern. The reduction pattern occurs when a loop combines a collection of values using a commutative, associative function.

In Adding Parallelism to Your Program, you will see that the Intel® oneAPI Threading Building Blocks and OpenMP* parallel frameworks have special features for writing parallel reductions.

Transactions

In a more general form of this pattern, there may be multiple memory locations which must be updated together.

void insert_node_in_list(T *new_node, T *insert_after)
{
    new_node->next = insert_after->next;
    new_node->prev = insert_after->next->prev;
    insert_after->next->prev = new_node;
    insert_after->next = new_node;
}

Two insertions must not occur simultaneously, but the insertions may occur in any order, as long as the final list order does not matter.

A collection of updates that must all occur together is referred to as a transaction.

Guard Variables

A special case is the use of a shared memory location to control some additional code. The update and the code that depends on it may be treated as a transaction.

bool initialized = false;
void do_something()
{
    if (!initialized) {
        do_the_initialization();
        initialized = true;
    }
    do_the_real_work();
}

If do_something()is called from multiple tasks, then the sharing problem is:

  1. Task 0 reads initialized, which is false, and enters the body of the if statement.

  2. Task 1 reads initialized, which is false, and enters the body of the if statement, so do_the_initialization() is called twice.

It does not matter which task the initialization occurs in, so your only problem is to make sure that other tasks wait until this initialization has happened.

Independent Writes

The simplest case occurs when the value that the tasks write to the memory location does not depend on its previous value:

bool found = false;
ANNOTATE_SITE_BEGIN(site1);
for (i = 0; i != n; ++i) {
    ANNOTATE_ITERATION_TASK(task1);
    if (a[i] == b) found = true;
}
if (found) printf("found\n");
ANNOTATE_SITE_BEGIN(site1);

There is no read to keep together with the write, and it does not matter what order the writes to found occur in, so the tasks are totally independent, and can execute concurrently without restrictions. If a task writes to found at all, it will write the value true.

Note that if the task body were the following, then this example would fit the reduction pattern:

found = found || (a[i] == b);

This is also called a benign race because the program will always compute the same value, regardless of which thread does the last write.