qopt-streaming-stores, Qopt-streaming-stores

Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

ID 767253

Date 6/24/2024

Enables generation of streaming stores for optimization.

Syntax

Linux:

-qopt-streaming-stores=keyword

-qno-opt-streaming-stores

Windows:

/Qopt-streaming-stores:keyword

/Qopt-streaming-stores-

Arguments

keyword

Specifies whether streaming stores are generated. Possible values are:

always

Enables generation of streaming stores for optimization. The compiler optimizes under the assumption that the application is memory bound.

When you specify this setting, it is your responsibility to insert any memory barriers (fences) required to ensure correct memory ordering within a thread or across threads. See the Example section for one way to do this.

never

Disables generation of streaming stores for optimization. Normal stores are performed.

This setting has the same effect as specifying -qno-opt-streaming-stores or /Qopt-streaming-stores-.

auto

Lets the compiler decide which instructions to use.

Default

-qopt-streaming-stores=auto or /Qopt-streaming-stores:auto

The compiler decides whether to use streaming stores or normal stores.

Description

This option enables generation of streaming stores for optimization. This method stores data with instructions that use a non-temporal buffer, which minimizes memory hierarchy pollution.

This option may be useful for memory-bound applications that write large amounts of data that are not read again soon, since such stores typically gain little from being cached.
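To make the idea of a streaming (non-temporal) store concrete, the following is a minimal hand-written sketch using SSE2 intrinsics. It is an illustration only, not the code the compiler generates for this option. It assumes an x86 target with SSE2, and the function name fill_streaming is hypothetical.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_set1_pd, _mm_stream_pd; also provides _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of any Intel API): fills dst[0..n) with
   value, using non-temporal stores for the 16-byte-aligned middle part. */
static void fill_streaming(double *dst, size_t n, double value)
{
    size_t i = 0;

    assert(dst != NULL || n == 0);

    /* _mm_stream_pd requires a 16-byte-aligned address, so use normal
       stores until the current element is aligned. */
    while (i < n && ((uintptr_t)(dst + i) & 15u) != 0) {
        dst[i++] = value;
    }

    /* Streaming stores: two doubles per instruction, written with a
       non-temporal hint so the data tends to bypass the caches. */
    const __m128d v = _mm_set1_pd(value);
    for (; i + 2 <= n; i += 2) {
        _mm_stream_pd(dst + i, v);
    }

    /* Normal store for a possible leftover element. */
    for (; i < n; i++) {
        dst[i] = value;
    }

    /* Streaming stores are weakly ordered; the fence orders them before
       any later stores issued by this thread. */
    _mm_sfence();
}
```

A call such as fill_streaming(a, n, 1.0) has the same visible result as a plain loop that assigns 1.0 to each element. The difference lies in how the stores interact with the cache hierarchy, which is why the benefit depends on whether the data is reused soon.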

IDE Equivalent

None

Alternate Options

None

Example

The following example shows one way to insert fences when specifying -qopt-streaming-stores=always or /Qopt-streaming-stores:always. It inserts an _mm_sfence() intrinsic call just after each loop (such as the initialization loop) where the compiler may insert streaming store instructions.

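#include <immintrin.h>  /* declares _mm_sfence() and _mm_mfence() */

void simple1(double * restrict a, double * restrict b, double * restrict c, double *d, int n)
{
    int i, j;

    /* Initialization loop: the compiler may use streaming stores here. */
#pragma omp parallel for
    for (j=0; j<n; j++) {
        a[j] = 1.0;
        b[j] = 2.0;
        c[j] = 0.0;
    }

    _mm_sfence(); // OR _mm_mfence();

    /* This loop reads a, b, and c after the fence. */
#pragma omp parallel for
    for (i=0; i<n; i++)
        a[i] = a[i] + c[i]*b[i];
}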
