APIs for Custom Synchronization
The Intel Inspector supports a significant portion of the Windows* OS and POSIX* APIs, but it does not normally track synchronization constructs that you build yourself. To gather semantic information about your custom synchronization constructs, use the synchronization APIs that the Intel Inspector provides.
Synchronization constructs may generally be modeled as a series of signals. One thread, or many threads, may wait for a signal from another group of threads before proceeding with some action. Synchronization APIs track when a thread begins waiting for a signal and when the signal occurs.
Using User-Defined Synchronization APIs in Your Code
| Use This in C/C++ Code | Use This in Fortran Code | To Do This |
|---|---|---|
| `void __itt_sync_acquired(void *addr)` | `subroutine itt_sync_acquired(addr)`<br>`integer(kind=itt_ptr), intent(in), value :: addr`<br>`end subroutine itt_sync_acquired` | Tell the Intel Inspector that the code received a signal on the specified synchronization object. |
| `void __itt_sync_releasing(void *addr)` | `subroutine itt_sync_releasing(addr)`<br>`integer(kind=itt_ptr), intent(in), value :: addr`<br>`end subroutine itt_sync_releasing` | Tell the Intel Inspector that the code is about to send a signal on the specified synchronization object. |
| `void __itt_sync_destroy(void *addr)` | `subroutine itt_sync_destroy(addr)`<br>`integer(kind=itt_ptr), intent(in), value :: addr`<br>`end subroutine itt_sync_destroy` | Tell the Intel Inspector that the synchronization object will not be used again, so the Intel Inspector can dispose of bookkeeping information associated with this object. |
The addr parameter is simply a value that uniquely identifies the synchronization object to be modeled. Unique values allow the Intel Inspector to track distinct custom synchronization objects. To use the same custom object to protect access in different parts of your code, use the same addr parameter around each.
A custom synchronization construct may involve any number of synchronization objects, so each object must be tied to a unique memory address, which the synchronization APIs use to track it. You can track any number of synchronization objects at one time, as long as each object uses a unique memory pointer. This is similar to modeling the objects passed to the WaitForMultipleObjects function in the Windows* OS API. You can build more complex synchronization constructs from a group of synchronization objects.
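As a rough illustration of how addr identifies an object, the following C sketch uses the address of each lock instance as its handle and reports the end of the object's life with the destroy API. The MyLock type, the my_lock_* functions, the annotate helper with its event log, and the USE_ITTNOTIFY build flag are all hypothetical names introduced for this example; only the three __itt_sync_* calls come from the table above. When the flag is not defined, the sketch only records the calls so that it can be built and checked without the tool.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#ifdef USE_ITTNOTIFY              /* hypothetical build flag: define it when ittnotify.h is available */
#include <ittnotify.h>
#endif

typedef struct { atomic_flag held; } MyLock;      /* hypothetical custom spin lock */
typedef struct { char kind; void *addr; } Event;  /* 'A', 'R', or 'D' plus the object address */

#define MAX_EVENTS 16
static Event  events[MAX_EVENTS];   /* local log, kept only so the sketch can be checked */
static size_t event_count;          /* not thread-safe: illustration only */

/* Forward one annotation to the tool (when built with it) and record it locally. */
static void annotate(char kind, void *addr)
{
#ifdef USE_ITTNOTIFY
    if (kind == 'A')      { __itt_sync_acquired(addr); }
    else if (kind == 'R') { __itt_sync_releasing(addr); }
    else                  { __itt_sync_destroy(addr); }
#endif
    assert(event_count < MAX_EVENTS);
    events[event_count].kind = kind;
    events[event_count].addr = addr;
    event_count++;
}

void my_lock_init(MyLock *l) { atomic_flag_clear(&l->held); }

void my_lock_acquire(MyLock *l)
{
    while (atomic_flag_test_and_set(&l->held)) { /* spin until the flag was clear */ }
    annotate('A', (void *)l);     /* the lock's own address is its unique handle */
}

void my_lock_release(MyLock *l)
{
    annotate('R', (void *)l);     /* same address as in my_lock_acquire */
    atomic_flag_clear(&l->held);
}

void my_lock_destroy(MyLock *l)
{
    annotate('D', (void *)l);     /* the tool may now drop its bookkeeping for this lock */
}
```

Because two MyLock instances live at different addresses, annotations on one are never attributed to the other, while every use of the same instance reports the same handle.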
API Usage Tips
Follow these guidelines to properly insert synchronization APIs within your code:
- Insert an acquired API immediately after your code stops waiting for a synchronization object.
- Insert a releasing API immediately before the code signals that it no longer holds a synchronization object.
If you place the synchronization APIs improperly, the Intel Inspector may report threading problems where there are none or fail to detect real threading problems.
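The placement rules can be sketched with the simplest possible signal, a one-shot flag that one thread sets and another waits on. This is only an illustration under stated assumptions: the ready flag, the post_ready and wait_ready functions, the annotate helper with its trace buffer, and the USE_ITTNOTIFY build flag are hypothetical names, not part of the Intel Inspector API. Without the flag defined, the helper only records which annotation was made.

```c
#include <assert.h>
#include <stdatomic.h>
#ifdef USE_ITTNOTIFY              /* hypothetical build flag: define it when ittnotify.h is available */
#include <ittnotify.h>
#endif

#define TRACE_MAX 8
static char       trace[TRACE_MAX];  /* 'R' = releasing, 'A' = acquired; for checking only */
static atomic_int trace_len;

/* Forward one annotation to the tool (when built with it) and record it locally. */
static void annotate(char kind, void *addr)
{
#ifdef USE_ITTNOTIFY
    if (kind == 'A') { __itt_sync_acquired(addr); }
    else             { __itt_sync_releasing(addr); }
#else
    (void)addr;
#endif
    int slot = atomic_fetch_add(&trace_len, 1);
    assert(slot < TRACE_MAX);
    trace[slot] = kind;
}

static atomic_int ready;          /* the custom synchronization object: a one-shot flag */

/* Signaling side: announce the signal immediately BEFORE sending it. */
void post_ready(void)
{
    annotate('R', (void *)&ready);
    atomic_store(&ready, 1);
}

/* Waiting side: report the signal immediately AFTER the wait ends. */
void wait_ready(void)
{
    while (atomic_load(&ready) == 0) { /* spin until the flag is set */ }
    annotate('A', (void *)&ready);
}
```

Annotating the release before the store means the waiting thread cannot report the acquire ahead of the matching release, which is the ordering the guidelines above are meant to guarantee.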
Usage Example: User-Defined Synchronized Critical Section
The following code snippets show how to create a critical section construct that can be tracked with synchronization APIs:
C/C++ Example:

    #include <ittnotify.h>

    /* MyCriticalSection and its LockIs* fields stand in for your own lock state. */
    void CSEnter(MyCriticalSection *cs)
    {
        while (cs->LockIsUsed)
        {
            if (cs->LockIsFree)
            {
                // Code to acquire the lock goes here
                __itt_sync_acquired((void *) cs);
                break;  // stop spinning once the lock is held
            }
        }
    }

    void CSLeave(MyCriticalSection *cs)
    {
        if (cs->LockIsMine)
        {
            __itt_sync_releasing((void *) cs);
            // Code to release the lock goes here
        }
    }

Fortran Example:

    ! LockIsUsed, LockIsFree, and LockIsMine stand in for your own lock state.
    subroutine CSEnter(cs)
      use ittnotify
      integer cs
      do while (LockIsUsed(cs) .ne. 1)
        if (LockIsFree(cs) .eq. 1) then
          ! Code to acquire the lock goes here
          call itt_sync_acquired(LOC(cs))
          exit  ! stop spinning once the lock is held
        end if
      end do
    end subroutine

    subroutine CSLeave(cs)
      use ittnotify
      integer cs
      if (LockIsMine(cs) .eq. 1) then
        call itt_sync_releasing(LOC(cs))
        ! Code to release the lock goes here
      end if
    end subroutine
Note the following when looking at this simple critical section example:
- The acquired API is placed immediately after the code obtains the user lock.
- The releasing API is placed before the code releases the user lock. This ensures another thread does not call the acquired API before the Intel Inspector realizes this thread has released the lock.
Usage Example: User-Level Synchronized Barrier
Higher-level constructs, such as barriers, are also easy to model using synchronization APIs. The following code snippets show how to create a barrier construct that can be tracked using synchronization APIs:
C/C++ Example:

    #include <ittnotify.h>

    /* teamflag, counter, and thread_count are shared variables defined elsewhere. */
    void Barrier(void)
    {
        teamflag = false;
        __itt_sync_releasing((void *) &counter);
        // Use the atomic increment primitive appropriate to your OS and compiler
        InterlockedIncrement(&counter);
        if (counter == thread_count)
        {
            __itt_sync_acquired((void *) &counter);
            __itt_sync_releasing((void *) &teamflag);
            counter = 0;
            teamflag = true;
        }
        else
        {
            // Wait for team flag
            __itt_sync_acquired((void *) &teamflag);
        }
    }

Fortran Example:

    subroutine barrier()
      use ittnotify
      common /x/ teamflag, counter, thread_count
      integer teamflag
      integer thread_count
      integer counter

      teamflag = 0
      call itt_sync_releasing(LOC(counter))
      ! Atomically update counter here;
      ! use the atomic increment primitive
      ! appropriate to your OS and compiler
      if (counter .eq. thread_count) then
        call itt_sync_acquired(LOC(counter))
        call itt_sync_releasing(LOC(teamflag))
        counter = 0
        teamflag = 1
      else
        ! Wait for team flag
        call itt_sync_acquired(LOC(teamflag))
      end if
    end subroutine
Note the following when looking at this example:
- There are two synchronization objects in this barrier code. The counter object is used to do a gather-like signaling from all the threads to the final thread indicating that each thread has entered the barrier. Once the last thread hits the barrier, it uses the teamflag object to signal all the other threads that they may proceed.
- As each thread enters the barrier, it calls the releasing API to tell the Intel Inspector it is about to signal the last thread by incrementing counter.
- The last thread to enter the barrier calls the acquired API to tell the Intel Inspector it was successfully signaled by all the other threads.
- The last thread to enter the barrier then calls the releasing API to tell the Intel Inspector it is going to signal the barrier completion to all the other threads by setting teamflag.
- Finally, before leaving the barrier, each thread calls the acquired API to tell the Intel Inspector it successfully received the end-of-barrier signal.
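The sequence described in these notes can be written as a small self-contained C11 sketch. It is one possible rendering, not the tool's reference code: it swaps the OS-specific increment for atomic_fetch_add from <stdatomic.h>, tests the value returned by that increment, and works as a single-use barrier (it does not reset counter or teamflag for reuse). The OneShotBarrier type, the barrier_init and barrier_wait functions, the annotate helper with its trace buffer, and the USE_ITTNOTIFY build flag are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#ifdef USE_ITTNOTIFY              /* hypothetical build flag: define it when ittnotify.h is available */
#include <ittnotify.h>
#endif

#define TRACE_MAX 8
static char       trace[TRACE_MAX];  /* 'R' = releasing, 'A' = acquired; for checking only */
static atomic_int trace_len;

/* Forward one annotation to the tool (when built with it) and record it locally. */
static void annotate(char kind, void *addr)
{
#ifdef USE_ITTNOTIFY
    if (kind == 'A') { __itt_sync_acquired(addr); }
    else             { __itt_sync_releasing(addr); }
#else
    (void)addr;
#endif
    int slot = atomic_fetch_add(&trace_len, 1);
    assert(slot < TRACE_MAX);
    trace[slot] = kind;
}

typedef struct {
    atomic_int counter;       /* number of threads that have arrived */
    atomic_int teamflag;      /* set to 1 by the last thread to arrive */
    int        thread_count;  /* number of threads in the team */
} OneShotBarrier;

void barrier_init(OneShotBarrier *b, int thread_count)
{
    atomic_init(&b->counter, 0);
    atomic_init(&b->teamflag, 0);
    b->thread_count = thread_count;
}

void barrier_wait(OneShotBarrier *b)
{
    /* About to signal the last thread by incrementing counter. */
    annotate('R', (void *)&b->counter);
    int arrived = atomic_fetch_add(&b->counter, 1) + 1;  /* this thread's arrival number */

    if (arrived == b->thread_count) {
        annotate('A', (void *)&b->counter);    /* signaled by all the other threads */
        annotate('R', (void *)&b->teamflag);   /* about to signal barrier completion */
        atomic_store(&b->teamflag, 1);
    } else {
        while (atomic_load(&b->teamflag) == 0) { /* wait for team flag */ }
        annotate('A', (void *)&b->teamflag);   /* received the end-of-barrier signal */
    }
}
```

Comparing the value returned by the increment, rather than re-reading counter afterwards, means exactly one thread takes the last-thread branch even when several threads arrive at nearly the same time.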