Arithmetic Intrinsics
The prototypes for Intel® Streaming SIMD Extensions (Intel® SSE) intrinsics for arithmetic operations are in the xmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
The results of each intrinsic operation are placed in a register. This register is illustrated for each intrinsic with R0-R3. R0, R1, R2, and R3 each represent one of the four 32-bit pieces of the result register.
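As a minimal sketch of how R0-R3 relate to memory, the following assumes that R0 is the lowest 32 bits of the register, which is the lane that `_mm_storeu_ps` writes to the first array element. The helper names `make_1234` and `lanes_to_array` are hypothetical and exist only for this illustration.

```c
#include <immintrin.h>

/* Hypothetical helper: builds a register with R0 = 1, R1 = 2, R2 = 3, R3 = 4.
   _mm_set_ps takes its arguments from the highest lane (R3) down to R0. */
__m128 make_1234(void)
{
    return _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
}

/* Hypothetical helper: copies the four lanes of v to out, so that
   out[0] holds R0 and out[3] holds R3. The store is unaligned-safe. */
void lanes_to_array(__m128 v, float out[4])
{
    _mm_storeu_ps(out, v);
}
```

Calling `lanes_to_array(make_1234(), r)` should leave `r[0]` through `r[3]` holding 1, 2, 3, and 4 in that order.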
Intrinsic Name | Operation | Corresponding Instruction |
---|---|---|
_mm_add_ss | Addition | ADDSS |
_mm_add_ps | Addition | ADDPS |
_mm_sub_ss | Subtraction | SUBSS |
_mm_sub_ps | Subtraction | SUBPS |
_mm_mul_ss | Multiplication | MULSS |
_mm_mul_ps | Multiplication | MULPS |
_mm_div_ss | Division | DIVSS |
_mm_div_ps | Division | DIVPS |
_mm_sqrt_ss | Square Root | SQRTSS |
_mm_sqrt_ps | Square Root | SQRTPS |
_mm_rcp_ss | Reciprocal | RCPSS |
_mm_rcp_ps | Reciprocal | RCPPS |
_mm_rsqrt_ss | Reciprocal Square Root | RSQRTSS |
_mm_rsqrt_ps | Reciprocal Square Root | RSQRTPS |
_mm_min_ss | Computes Minimum | MINSS |
_mm_min_ps | Computes Minimum | MINPS |
_mm_max_ss | Computes Maximum | MAXSS |
_mm_max_ps | Computes Maximum | MAXPS |
_mm_add_ss
__m128 _mm_add_ss(__m128 a, __m128 b);
Adds the lower single-precision, floating-point (FP) values of a and b; the upper three single-precision FP values are passed through from a.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 + b0 | a1 | a2 | a3 |
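The pass-through behavior can be checked with a short sketch. The helper name `add_low_lane` is hypothetical, and the arrays are assumed to hold lane R0 at index 0.

```c
#include <immintrin.h>

/* Hypothetical helper: adds only the lowest lanes of a and b.
   The upper three lanes of the result are copied from a. */
void add_low_lane(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ss(va, vb));  /* {a0 + b0, a1, a2, a3} */
}
```

With `a = {1, 2, 3, 4}` and `b = {10, 20, 30, 40}`, only the first element changes: the result is `{11, 2, 3, 4}`.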
_mm_add_ps
__m128 _mm_add_ps(__m128 a, __m128 b);
Adds the four single-precision FP values of a and b.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 + b0 | a1 + b1 | a2 + b2 | a3 + b3 |
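One common use of the packed form is adding two arrays four elements at a time. The sketch below is one possible pattern, not a prescribed one: the function name `add_arrays` is hypothetical, unaligned loads and stores are assumed so that no alignment requirement is placed on the caller, and a scalar loop handles any leftover elements.

```c
#include <immintrin.h>
#include <stddef.h>

/* Hypothetical helper: out[i] = x[i] + y[i] for i in [0, n). */
void add_arrays(const float *x, const float *y, float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: four additions per _mm_add_ps call. */
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        _mm_storeu_ps(out + i, _mm_add_ps(vx, vy));
    }

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; ++i) {
        out[i] = x[i] + y[i];
    }
}
```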
_mm_sub_ss
__m128 _mm_sub_ss(__m128 a, __m128 b);
Subtracts the lower single-precision FP value of b from the lower single-precision FP value of a; the upper three single-precision FP values are passed through from a.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 - b0 | a1 | a2 | a3 |
_mm_sub_ps
__m128 _mm_sub_ps(__m128 a, __m128 b);
Subtracts the four single-precision FP values of b from the corresponding values of a.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 - b0 | a1 - b1 | a2 - b2 | a3 - b3 |
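Because subtraction is not commutative, the argument order matters: the first argument is the minuend. A minimal sketch, using the hypothetical helper name `sub4`:

```c
#include <immintrin.h>

/* Hypothetical helper: out[i] = a[i] - b[i] for the four lanes. */
void sub4(const float a[4], const float b[4], float out[4])
{
    _mm_storeu_ps(out, _mm_sub_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

Swapping the two arguments negates every lane of the result.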
_mm_mul_ss
__m128 _mm_mul_ss(__m128 a, __m128 b);
Multiplies the lower single-precision FP values of a and b; the upper three single-precision FP values are passed through from a.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 * b0 | a1 | a2 | a3 |
_mm_mul_ps
__m128 _mm_mul_ps(__m128 a, __m128 b);
Multiplies the four single-precision FP values of a and b.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 * b0 | a1 * b1 | a2 * b2 | a3 * b3 |
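A typical use of the packed multiply is scaling four values by one scalar. The sketch below assumes the scalar is broadcast to all four lanes with `_mm_set1_ps`; the helper name `scale4` is hypothetical.

```c
#include <immintrin.h>

/* Hypothetical helper: out[i] = v[i] * s for the four lanes. */
void scale4(const float v[4], float s, float out[4])
{
    __m128 factor = _mm_set1_ps(s);  /* {s, s, s, s} */
    _mm_storeu_ps(out, _mm_mul_ps(_mm_loadu_ps(v), factor));
}
```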
_mm_div_ss
__m128 _mm_div_ss(__m128 a, __m128 b);
Divides the lower single-precision FP value of a by the lower single-precision FP value of b; the upper three single-precision FP values are passed through from a.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 / b0 | a1 | a2 | a3 |
_mm_div_ps
__m128 _mm_div_ps(__m128 a, __m128 b);
Divides the four single-precision FP values of a by the corresponding values of b.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 / b0 | a1 / b1 | a2 / b2 | a3 / b3 |
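Each lane is divided independently, so a zero divisor in one lane does not affect the others. The sketch below assumes the default floating-point environment, in which the divide-by-zero exception is masked and a nonzero finite value divided by zero yields a signed infinity. The helper name `div4` is hypothetical.

```c
#include <immintrin.h>

/* Hypothetical helper: out[i] = a[i] / b[i] for the four lanes.
   Assumes floating-point exceptions are masked (the usual default). */
void div4(const float a[4], const float b[4], float out[4])
{
    _mm_storeu_ps(out, _mm_div_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

Under that assumption, dividing `{1, -1, 4, 9}` by `{0, 0, 2, 3}` gives positive infinity, negative infinity, 2, and 3.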
_mm_sqrt_ss
__m128 _mm_sqrt_ss(__m128 a);
Computes the square root of the lower single-precision FP value of a; the upper three single-precision FP values are passed through from a.
R0 | R1 | R2 | R3 |
---|---|---|---|
sqrt(a0) | a1 | a2 | a3 |
_mm_sqrt_ps
__m128 _mm_sqrt_ps(__m128 a);
Computes the square roots of the four single-precision FP values of a.
R0 | R1 | R2 | R3 |
---|---|---|---|
sqrt(a0) | sqrt(a1) | sqrt(a2) | sqrt(a3) |
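As an illustration of combining the packed square root with the multiply and add intrinsics, the sketch below computes the lengths of four 2D vectors at once. The helper name `lengths4` and the layout (x components in one array, y components in another) are assumptions made for this example.

```c
#include <immintrin.h>

/* Hypothetical helper: len[i] = sqrt(x[i]*x[i] + y[i]*y[i]). */
void lengths4(const float x[4], const float y[4], float len[4])
{
    __m128 vx = _mm_loadu_ps(x);
    __m128 vy = _mm_loadu_ps(y);

    /* Squared length of each vector, one per lane. */
    __m128 sum = _mm_add_ps(_mm_mul_ps(vx, vx), _mm_mul_ps(vy, vy));

    _mm_storeu_ps(len, _mm_sqrt_ps(sum));
}
```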
_mm_rcp_ss
__m128 _mm_rcp_ss(__m128 a);
Computes the approximation of the reciprocal of the lower single-precision FP value of a; the upper three single-precision FP values are passed through from a.
R0 | R1 | R2 | R3 |
---|---|---|---|
recip(a0) | a1 | a2 | a3 |
_mm_rcp_ps
__m128 _mm_rcp_ps(__m128 a);
Computes the approximations of reciprocals of the four single-precision FP values of a.
R0 | R1 | R2 | R3 |
---|---|---|---|
recip(a0) | recip(a1) | recip(a2) | recip(a3) |
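Because the result is only an approximation (Intel's instruction reference gives a maximum relative error of 1.5 × 2^-12 for RCPPS), code that needs more accuracy often follows it with one Newton-Raphson step, x1 = x0 * (2 - a * x0). The sketch below shows that refinement as one possible approach; the helper name `reciprocal4` is hypothetical, and inputs of zero or infinity are assumed not to occur, since the refinement step would turn them into NaN.

```c
#include <immintrin.h>

/* Hypothetical helper: writes the raw approximation of 1/a[i] to approx
   and a Newton-Raphson refined value to refined.
   Assumes every a[i] is finite and nonzero. */
void reciprocal4(const float a[4], float approx[4], float refined[4])
{
    __m128 va  = _mm_loadu_ps(a);
    __m128 x0  = _mm_rcp_ps(va);             /* roughly 12 bits of precision */
    __m128 two = _mm_set1_ps(2.0f);

    /* One Newton-Raphson iteration: x1 = x0 * (2 - a * x0). */
    __m128 x1 = _mm_mul_ps(x0, _mm_sub_ps(two, _mm_mul_ps(va, x0)));

    _mm_storeu_ps(approx, x0);
    _mm_storeu_ps(refined, x1);
}
```

When full single-precision accuracy is required regardless of input, `_mm_div_ps` with a numerator of 1.0 remains the straightforward choice.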
_mm_rsqrt_ss
__m128 _mm_rsqrt_ss(__m128 a);
Computes the approximation of the reciprocal of the square root of the lower single-precision FP value of a; the upper three single-precision FP values are passed through.
R0 | R1 | R2 | R3 |
---|---|---|---|
recip(sqrt(a0)) | a1 | a2 | a3 |
_mm_rsqrt_ps
__m128 _mm_rsqrt_ps(__m128 a);
Computes the approximations of the reciprocals of the square roots of the four single-precision FP values of a.
R0 | R1 | R2 | R3 |
---|---|---|---|
recip(sqrt(a0)) | recip(sqrt(a1)) | recip(sqrt(a2)) | recip(sqrt(a3)) |
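The approximate reciprocal square root is often used for fast vector normalization, where a relative error on the order of 1.5 × 2^-12 (the bound Intel's instruction reference gives for RSQRTPS) is acceptable. The sketch below normalizes four 2D vectors in place; the helper name `normalize4_approx` and the split x/y layout are assumptions, and zero-length vectors are assumed not to occur because they would produce NaN.

```c
#include <immintrin.h>

/* Hypothetical helper: scales each vector (x[i], y[i]) to approximately
   unit length. Assumes no vector has zero length. */
void normalize4_approx(float x[4], float y[4])
{
    __m128 vx = _mm_loadu_ps(x);
    __m128 vy = _mm_loadu_ps(y);

    /* Squared length per lane, then its approximate inverse square root. */
    __m128 len2 = _mm_add_ps(_mm_mul_ps(vx, vx), _mm_mul_ps(vy, vy));
    __m128 inv  = _mm_rsqrt_ps(len2);

    _mm_storeu_ps(x, _mm_mul_ps(vx, inv));
    _mm_storeu_ps(y, _mm_mul_ps(vy, inv));
}
```

The resulting lengths are close to 1 but generally not exactly 1; code that needs exact normalization can use `_mm_sqrt_ps` and `_mm_div_ps` instead.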
_mm_min_ss
__m128 _mm_min_ss(__m128 a, __m128 b);
Computes the minimum of the lower single-precision FP values of a and b; the upper three single-precision FP values are passed through from a.
R0 | R1 | R2 | R3 |
---|---|---|---|
min(a0, b0) | a1 | a2 | a3 |
_mm_min_ps
__m128 _mm_min_ps(__m128 a, __m128 b);
Computes the minimum of the four single-precision FP values of a and b.
R0 | R1 | R2 | R3 |
---|---|---|---|
min(a0, b0) | min(a1, b1) | min(a2, b2) | min(a3, b3) |
_mm_max_ss
__m128 _mm_max_ss(__m128 a, __m128 b);
Computes the maximum of the lower single-precision FP values of a and b; the upper three single-precision FP values are passed through from a.
R0 | R1 | R2 | R3 |
---|---|---|---|
max(a0, b0) | a1 | a2 | a3 |
_mm_max_ps
__m128 _mm_max_ps(__m128 a, __m128 b);
Computes the maximum of the four single-precision FP values of a and b.
R0 | R1 | R2 | R3 |
---|---|---|---|
max(a0, b0) | max(a1, b1) | max(a2, b2) | max(a3, b3) |
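The packed minimum and maximum intrinsics combine naturally into a branch-free clamp. The sketch below is one way to write it; the helper name `clamp4` is hypothetical, and the inputs are assumed to be ordinary (non-NaN) values with `lo <= hi`.

```c
#include <immintrin.h>

/* Hypothetical helper: out[i] = min(max(v[i], lo), hi).
   Assumes lo <= hi and that no input is NaN. */
void clamp4(const float v[4], float lo, float hi, float out[4])
{
    __m128 x = _mm_loadu_ps(v);
    x = _mm_max_ps(x, _mm_set1_ps(lo));  /* raise values below lo */
    x = _mm_min_ps(x, _mm_set1_ps(hi));  /* lower values above hi */
    _mm_storeu_ps(out, x);
}
```

Clamping `{-1, 0.5, 2, 1}` to the range [0, 1] gives `{0, 0.5, 1, 1}`.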