Load Intrinsics
The prototypes for Intel® Streaming SIMD Extensions (Intel® SSE) intrinsics for load operations are in the xmmintrin.h header file.
To use these intrinsics, include the immintrin.h file, which in turn includes xmmintrin.h, as follows:
#include <immintrin.h>
The result of each intrinsic operation is placed in a register. This register is illustrated for each intrinsic with R0-R3, where R0, R1, R2, and R3 each represent one of the four 32-bit pieces of the result register and R0 is the lowest-order piece.
Intrinsic Name | Operation | Corresponding Instruction
---|---|---
_mm_loadh_pi | Load high | MOVHPS reg, mem
_mm_loadl_pi | Load low | MOVLPS reg, mem
_mm_load_ss | Load the low value and clear the three high values | MOVSS
_mm_load1_ps | Load one value into all four words | MOVSS + Shuffling
_mm_load_ps | Load four values, address aligned | MOVAPS
_mm_loadu_ps | Load four values, address unaligned | MOVUPS
_mm_loadr_ps | Load four values in reverse | MOVAPS + Shuffling
_mm_loadh_pi
__m128 _mm_loadh_pi(__m128 a, __m64 const *p);
Sets the upper two single-precision floating-point (SP FP) values to the 64 bits of data loaded from the address p; the lower two values are passed through from a.
R0 | R1 | R2 | R3
---|---|---|---
a0 | a1 | *p0 | *p1
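The following is a minimal sketch of how _mm_loadh_pi might be used, not a definitive pattern. The helper name load_high_pair and its array parameters are hypothetical and chosen only for illustration; the sketch also assumes the common practice of casting a float pointer to __m64 const * for this intrinsic.

```c
#include <immintrin.h>

/* Hypothetical helper (not part of the intrinsics API): keeps a[0] and a[1]
   and replaces the upper two values with hi[0] and hi[1]. */
void load_high_pair(const float a[4], const float hi[2], float out[4])
{
    __m128 v = _mm_loadu_ps(a);               /* R0-R3 = a0 a1 a2 a3   */
    v = _mm_loadh_pi(v, (const __m64 *)hi);   /* R0-R3 = a0 a1 hi0 hi1 */
    _mm_storeu_ps(out, v);
}
```

Calling load_high_pair with a = {1, 2, 3, 4} and hi = {10, 20} would leave {1, 2, 10, 20} in out.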
_mm_loadl_pi
__m128 _mm_loadl_pi(__m128 a, __m64 const *p);
Sets the lower two SP FP values to the 64 bits of data loaded from the address p; the upper two values are passed through from a.
R0 | R1 | R2 | R3
---|---|---|---
*p0 | *p1 | a2 | a3
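As a counterpart, here is a minimal sketch of _mm_loadl_pi. The helper name load_low_pair is hypothetical, and the cast from a float pointer to __m64 const * is an assumption about typical usage rather than something this page prescribes.

```c
#include <immintrin.h>

/* Hypothetical helper (not part of the intrinsics API): replaces the lower
   two values with lo[0] and lo[1] and keeps a[2] and a[3]. */
void load_low_pair(const float a[4], const float lo[2], float out[4])
{
    __m128 v = _mm_loadu_ps(a);               /* R0-R3 = a0 a1 a2 a3   */
    v = _mm_loadl_pi(v, (const __m64 *)lo);   /* R0-R3 = lo0 lo1 a2 a3 */
    _mm_storeu_ps(out, v);
}
```

Calling load_low_pair with a = {1, 2, 3, 4} and lo = {10, 20} would leave {10, 20, 3, 4} in out.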
_mm_load_ss
__m128 _mm_load_ss(float * p);
Loads an SP FP value into the low word and clears the upper three words.
R0 | R1 | R2 | R3
---|---|---|---
*p | 0.0 | 0.0 | 0.0
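A short sketch of the behavior described above, under the assumption that the result is copied back to memory for inspection. The helper name load_scalar is hypothetical.

```c
#include <immintrin.h>

/* Hypothetical helper: loads *p into the low word, zeroes the other three,
   and writes all four words to out so they can be examined. */
void load_scalar(const float *p, float out[4])
{
    __m128 v = _mm_load_ss(p);   /* R0-R3 = *p 0.0 0.0 0.0 */
    _mm_storeu_ps(out, v);
}
```

With *p equal to 5.0f, out would hold {5, 0, 0, 0} regardless of its previous contents.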
_mm_load1_ps
__m128 _mm_load1_ps(float * p);
Loads an SP FP value, copying it into all four words.
R0 | R1 | R2 | R3
---|---|---|---
*p | *p | *p | *p
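One plausible use of a broadcast load is scaling a vector by a single factor. The sketch below assumes that use case; the helper name scale4 is hypothetical, and _mm_mul_ps is an SSE arithmetic intrinsic that is not described in this section.

```c
#include <immintrin.h>

/* Hypothetical helper: multiplies four floats by one scalar factor. */
void scale4(const float in[4], float factor, float out[4])
{
    __m128 f = _mm_load1_ps(&factor);      /* R0-R3 = factor in every word */
    __m128 v = _mm_loadu_ps(in);           /* R0-R3 = in[0] .. in[3]       */
    _mm_storeu_ps(out, _mm_mul_ps(v, f));  /* element-wise product         */
}
```

With in = {1, 2, 3, 4} and factor = 2, out would hold {2, 4, 6, 8}.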
_mm_load_ps
__m128 _mm_load_ps(float * p);
Loads four SP FP values. The address must be 16-byte-aligned.
R0 | R1 | R2 | R3
---|---|---|---
p[0] | p[1] | p[2] | p[3]
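Because the aligned load requires a 16-byte-aligned address, the caller has to guarantee that alignment. The following sketch assumes a C11 compiler, where a buffer can be declared with _Alignas(16); the helper names is_aligned16 and sum4_aligned are hypothetical.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if p is 16-byte-aligned. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: sums four floats. p must be 16-byte-aligned,
   otherwise the aligned load may fault. */
float sum4_aligned(const float *p)
{
    __m128 v = _mm_load_ps(p);   /* R0-R3 = p[0] p[1] p[2] p[3] */
    float tmp[4];
    _mm_storeu_ps(tmp, v);
    return tmp[0] + tmp[1] + tmp[2] + tmp[3];
}
```

A buffer declared as `_Alignas(16) float data[4] = {1, 2, 3, 4};` satisfies the requirement, and sum4_aligned(data) would return 10.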
_mm_loadu_ps
__m128 _mm_loadu_ps(float * p);
Loads four SP FP values. The address need not be 16-byte-aligned.
R0 | R1 | R2 | R3
---|---|---|---
p[0] | p[1] | p[2] | p[3]
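The unaligned load is useful when the starting address is arbitrary, for example when reading a four-element window at some offset into a larger array. The sketch below assumes that scenario; the helper name load_window is hypothetical.

```c
#include <immintrin.h>
#include <stddef.h>

/* Hypothetical helper: copies base[offset] .. base[offset + 3] into out.
   base + offset does not need to be 16-byte-aligned. */
void load_window(const float *base, size_t offset, float out[4])
{
    __m128 v = _mm_loadu_ps(base + offset);
    _mm_storeu_ps(out, v);
}
```

With base = {0, 1, 2, 3, 4, 5, 6, 7} and offset = 1, out would hold {1, 2, 3, 4}.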
_mm_loadr_ps
__m128 _mm_loadr_ps(float * p);
Loads four SP FP values in reverse order. The address must be 16-byte-aligned.
R0 | R1 | R2 | R3
---|---|---|---
p[3] | p[2] | p[1] | p[0]
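A minimal sketch of the reversed load, assuming a C11 compiler so that the source buffer can be declared with _Alignas(16). The helper name load_reversed is hypothetical.

```c
#include <immintrin.h>

/* Hypothetical helper: writes p[3], p[2], p[1], p[0] to out.
   p must be 16-byte-aligned; out may have any alignment. */
void load_reversed(const float *p, float out[4])
{
    __m128 v = _mm_loadr_ps(p);   /* R0-R3 = p[3] p[2] p[1] p[0] */
    _mm_storeu_ps(out, v);
}
```

With `_Alignas(16) float data[4] = {1, 2, 3, 4};`, calling load_reversed(data, out) would leave {4, 3, 2, 1} in out.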