_mm256_dp_ps
Calculates the conditional dot product of float32 vectors, separately for the lower four and the upper four elements. The corresponding Intel® AVX instruction is VDPPS.
Syntax
extern __m256 _mm256_dp_ps(__m256 m1, __m256 m2, const int mask);
Arguments
m1 | float32 vector used as the first source operand
m2 | float32 vector used as the second source operand
mask | integer constant; the high four bits (bits 7:4) select which of the four products are included in the sum, and the low four bits (bits 3:0) select which destination elements receive the sum (all other destination elements are set to zero)
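As an illustration of the mask layout described above, the following sketch builds the 8-bit constant from its two 4-bit halves. The `DP_MASK` macro and the enumerator names are hypothetical helpers introduced here for clarity; they are not part of the Intel API. A macro is used because the mask argument must be a compile-time constant.

```c
/* Hypothetical helper (not part of the Intel API):
 * sum_bits -> bits 7:4, select which products enter the sum
 * dst_bits -> bits 3:0, select which destination elements receive the sum */
#define DP_MASK(sum_bits, dst_bits) \
    ((((sum_bits) & 0xF) << 4) | ((dst_bits) & 0xF))

enum {
    /* Sum the products of elements 0..2, write the sum to all four elements. */
    DP_SUM3_TO_ALL = DP_MASK(0x7, 0xF),
    /* Sum all four products, write the sum to element 0 only. */
    DP_SUM4_TO_LOW = DP_MASK(0xF, 0x1)
};
```

A constant built this way could then be passed as the third argument, for example `_mm256_dp_ps(m1, m2, DP_SUM4_TO_LOW)`.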
Description
First performs a SIMD multiplication of the lower four packed single-precision floating-point elements (float32 elements) of the first source vector m1 with the corresponding elements of the second source vector m2.
Each of the four resulting products is included in the sum only if the corresponding bit among the high four bits of the mask parameter is set; a product whose bit is clear contributes zero.
The summed value is then written to each of the lower four positions in the destination vector whose corresponding bit among the low four bits of the mask is "1". If that bit is zero, the corresponding lower element in the destination vector is set to zero.
The same process, using the same mask, is then applied to the upper four elements of the source vectors to produce the upper four elements of the destination vector.
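The steps above can be sketched as a scalar model in plain C. This is only an illustrative reading of the description, not Intel's implementation: the function name `dp_ps_model` and the array-based signature are assumptions made for the example, and the pairwise summation order is an assumption about how the partial sums are combined, so rounding on real hardware may differ in edge cases.

```c
#include <stddef.h>

/* Hypothetical scalar model of the described behavior (not the Intel API).
 * m1, m2 and dst each hold 8 floats: elements 0..3 are the lower half,
 * elements 4..7 are the upper half. */
static void dp_ps_model(const float m1[8], const float m2[8],
                        int mask, float dst[8])
{
    for (size_t half = 0; half < 2; ++half) {
        const size_t base = half * 4;
        float t[4];

        /* High four mask bits: keep or zero each product. */
        for (size_t i = 0; i < 4; ++i) {
            const int selected = (mask >> (4 + (int)i)) & 1;
            t[i] = selected ? m1[base + i] * m2[base + i] : 0.0f;
        }

        /* Assumed pairwise summation of the four terms. */
        const float sum = (t[0] + t[1]) + (t[2] + t[3]);

        /* Low four mask bits: write the sum or zero to each element. */
        for (size_t i = 0; i < 4; ++i) {
            const int write = (mask >> (int)i) & 1;
            dst[base + i] = write ? sum : 0.0f;
        }
    }
}
```

For example, with a mask of 0x71 the model sums the products of elements 0 to 2 in each half and stores the result only in the lowest element of that half; on AVX hardware, `_mm256_dp_ps(m1, m2, 0x71)` is expected to behave the same way.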
Returns
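A __m256 vector holding the result of the operation: in each half, the selected elements contain the conditional dot product and the remaining elements are zero.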