Load Intrinsics
Intel® Streaming SIMD Extensions 2 (Intel® SSE2) intrinsics for floating-point load operations are listed in this topic. The prototypes for Intel® SSE2 intrinsics are in the emmintrin.h header file.
To use these intrinsics, include the immintrin.h file, which in turn includes emmintrin.h, as follows:
#include <immintrin.h>
The load and set operations are similar in that both initialize __m128d data. However, the set operations take a double argument and are intended for initialization with constants, while the load operations take a double pointer argument and are intended to mimic the instructions for loading data from memory.
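The difference between the two families can be sketched as follows. This is a minimal illustration, not a recommended pattern; the helper names `init_with_set`, `init_with_load`, and `same_elements` are hypothetical and exist only for this example.

```c
#include <immintrin.h>

/* Initialize from constants: _mm_set_pd takes the upper element first,
   so this yields lower = 1.0, upper = 2.0. */
static inline __m128d init_with_set(void) {
    return _mm_set_pd(2.0, 1.0);
}

/* Initialize from memory: the load takes a pointer to two doubles,
   so this yields lower = src[0], upper = src[1]. */
static inline __m128d init_with_load(const double *src) {
    return _mm_loadu_pd(src);
}

/* Hypothetical helper: returns 1 when both elements of a and b compare equal. */
static inline int same_elements(__m128d a, __m128d b) {
    return _mm_movemask_pd(_mm_cmpeq_pd(a, b)) == 3;
}
```

With `double src[2] = {1.0, 2.0};`, both helpers should produce the same register contents.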
The result of each intrinsic operation is placed in a register. The tables in the detailed explanation for each intrinsic show what the register holds. The result register is represented by R0 and R1, where R0 is the lower double-precision floating-point (DP FP) value of the register and R1 is the upper one.
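To inspect the two pieces of a result register, something like the sketch below may help. It assumes R0 is the lower element and R1 the upper element, which is consistent with the tables in this topic; the helper names `element_r0` and `element_r1` are hypothetical.

```c
#include <immintrin.h>

/* R0: the lower double of the register. */
static inline double element_r0(__m128d v) {
    return _mm_cvtsd_f64(v);
}

/* R1: the upper double. _mm_unpackhi_pd(v, v) copies the upper element
   into the lower position so that _mm_cvtsd_f64 can extract it. */
static inline double element_r1(__m128d v) {
    return _mm_cvtsd_f64(_mm_unpackhi_pd(v, v));
}
```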
Intrinsic Name | Operation | Corresponding Intel® SSE2 Instruction |
---|---|---|
_mm_load_pd | Loads two DP FP values | MOVAPD |
_mm_load1_pd | Loads a single DP FP value, copying to both elements | MOVSD + shuffling |
_mm_loadr_pd | Loads two DP FP values in reverse order | MOVAPD + shuffling |
_mm_loadu_pd | Loads two DP FP values | MOVUPD |
_mm_load_sd | Loads a DP FP value, sets upper DP FP to zero | MOVSD |
_mm_loadh_pd | Loads a DP FP value as the upper DP FP value of the result | MOVHPD |
_mm_loadl_pd | Loads a DP FP value as the lower DP FP value of the result | MOVLPD |
_mm_load_pd
__m128d _mm_load_pd(double const *p);
Loads two DP FP values. The address p must be 16-byte aligned.
R0 | R1 |
---|---|
p[0] | p[1] |
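A minimal sketch of an aligned load, assuming the caller guarantees 16-byte alignment for every pointer. The helpers `is_aligned16` and `add_pairs_aligned` are hypothetical names introduced for this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when p sits on a 16-byte boundary. */
static inline int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Adds the pair at a to the pair at b and writes the sums to out.
   All three pointers are assumed to be 16-byte aligned. */
static inline void add_pairs_aligned(const double *a, const double *b,
                                     double *out) {
    __m128d va = _mm_load_pd(a);   /* R0 = a[0], R1 = a[1] */
    __m128d vb = _mm_load_pd(b);   /* R0 = b[0], R1 = b[1] */
    _mm_store_pd(out, _mm_add_pd(va, vb));
}
```

One way to satisfy the alignment requirement in C11 is to declare the arrays with `_Alignas(16)`; checking with `is_aligned16` before the call can catch mistakes in debug builds.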
_mm_load1_pd
__m128d _mm_load1_pd(double const *p);
Loads a single DP FP value, copying to both elements. The address p need not be 16-byte aligned.
R0 | R1 |
---|---|
*p | *p |
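A typical use of a broadcast load is applying one scalar from memory to both elements of a vector. The sketch below assumes a hypothetical helper named `scale_pair` and uses unaligned loads and stores for the data so that no alignment is required.

```c
#include <immintrin.h>

/* Multiplies xs[0] and xs[1] by *factor and writes the products to out. */
static inline void scale_pair(const double *xs, const double *factor,
                              double *out) {
    __m128d f = _mm_load1_pd(factor);   /* R0 = *factor, R1 = *factor */
    __m128d v = _mm_loadu_pd(xs);       /* R0 = xs[0],   R1 = xs[1]   */
    _mm_storeu_pd(out, _mm_mul_pd(v, f));
}
```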
_mm_loadr_pd
__m128d _mm_loadr_pd(double const *p);
Loads two DP FP values in reverse order. The address p must be 16-byte aligned.
R0 | R1 |
---|---|
p[1] | p[0] |
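The reversed ordering can be observed by storing the result back to memory. In this sketch, `load_reversed` is a hypothetical helper; the source pointer is assumed to be 16-byte aligned, while the destination is written with an unaligned store.

```c
#include <immintrin.h>

/* Loads the pair at p in reverse order and writes it to out,
   so that out[0] = p[1] and out[1] = p[0].
   p is assumed to be 16-byte aligned. */
static inline void load_reversed(const double *p, double *out) {
    __m128d r = _mm_loadr_pd(p);   /* R0 = p[1], R1 = p[0] */
    _mm_storeu_pd(out, r);
}
```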
_mm_loadu_pd
__m128d _mm_loadu_pd(double const *p);
Loads two DP FP values. The address p need not be 16-byte aligned.
R0 | R1 |
---|---|
p[0] | p[1] |
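Because no alignment is required, this load suits data whose starting address is not under the caller's control, such as a slice that begins in the middle of an array. The following is a sketch only; `sum_doubles` is a hypothetical helper and makes no claim about performance.

```c
#include <immintrin.h>
#include <stddef.h>

/* Sums n doubles starting at p; p may have any alignment valid for double. */
static inline double sum_doubles(const double *p, size_t n) {
    __m128d acc = _mm_setzero_pd();
    size_t i = 0;

    /* Process two elements per iteration with unaligned loads. */
    for (; i + 2 <= n; i += 2) {
        acc = _mm_add_pd(acc, _mm_loadu_pd(p + i));
    }

    /* Combine the lower and upper partial sums. */
    double total = _mm_cvtsd_f64(_mm_add_sd(acc, _mm_unpackhi_pd(acc, acc)));

    /* Handle a trailing element when n is odd. */
    if (i < n) {
        total += p[i];
    }
    return total;
}
```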
_mm_load_sd
__m128d _mm_load_sd(double const *p);
Loads a DP FP value. The upper DP FP is set to zero. The address p need not be 16-byte aligned.
R0 | R1 |
---|---|
*p | 0.0 |
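The zeroing of the upper element can be checked by writing the register back to memory. `load_scalar` below is a hypothetical helper used only for illustration.

```c
#include <immintrin.h>

/* Loads *p into the lower element, then writes both elements to out,
   so that out[0] = *p and out[1] = 0.0. */
static inline void load_scalar(const double *p, double *out) {
    __m128d v = _mm_load_sd(p);   /* R0 = *p, R1 = 0.0 */
    _mm_storeu_pd(out, v);
}
```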
_mm_loadh_pd
__m128d _mm_loadh_pd(__m128d a, double const *p);
Loads a DP FP value as the upper DP FP value of the result. The lower DP FP value is passed through from a. The address p need not be 16-byte aligned.
R0 | R1 |
---|---|
a0 | *p |
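One possible use is assembling a vector from two doubles that are not adjacent in memory. The sketch below pairs this intrinsic with `_mm_load_sd`; the helper name `gather_pair` is hypothetical.

```c
#include <immintrin.h>

/* Builds a vector whose lower element is *lo and whose upper element is *hi. */
static inline __m128d gather_pair(const double *lo, const double *hi) {
    __m128d v = _mm_load_sd(lo);    /* R0 = *lo, R1 = 0.0 */
    return _mm_loadh_pd(v, hi);     /* R0 = *lo, R1 = *hi */
}
```

For example, `gather_pair(&m[0], &m[3])` would pick up the first and fourth entries of an array `m`.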
_mm_loadl_pd
__m128d _mm_loadl_pd(__m128d a, double const *p);
Loads a DP FP value as the lower DP FP value of the result. The upper DP FP value is passed through from a. The address p need not be 16-byte aligned.
R0 | R1 |
---|---|
*p | a1 |
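As a brief illustration, the lower element of an existing vector can be replaced from memory while the upper element is kept. `replace_lower` is a hypothetical helper name.

```c
#include <immintrin.h>

/* Returns a copy of a whose lower element is replaced by *p;
   the upper element of a passes through unchanged. */
static inline __m128d replace_lower(__m128d a, const double *p) {
    return _mm_loadl_pd(a, p);   /* R0 = *p, R1 = a1 */
}
```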