Intrinsics for FP Reduction Operations
The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction |
---|---|---|
_mm512_reduce_add_pd, _mm512_mask_reduce_add_pd | Reduce float64 elements by addition. | None. |
_mm512_reduce_add_ps, _mm512_mask_reduce_add_ps | Reduce float32 elements by addition. | None. |
_mm512_reduce_max_pd, _mm512_mask_reduce_max_pd | Reduce float64 elements by maximum. | None. |
_mm512_reduce_max_ps, _mm512_mask_reduce_max_ps | Reduce float32 elements by maximum. | None. |
_mm512_reduce_min_pd, _mm512_mask_reduce_min_pd | Reduce float64 elements by minimum. | None. |
_mm512_reduce_min_ps, _mm512_mask_reduce_min_ps | Reduce float32 elements by minimum. | None. |
_mm512_reduce_mul_pd, _mm512_mask_reduce_mul_pd | Reduce float64 elements by multiplication. | None. |
_mm512_reduce_mul_ps, _mm512_mask_reduce_mul_ps | Reduce float32 elements by multiplication. | None. |
variable | definition |
---|---|
k | writemask; only elements whose mask bit is set (active elements) take part in the reduction |
a | source vector |
_mm512_reduce_add_pd
extern double __cdecl _mm512_reduce_add_pd(__m512d a);
Reduces packed float64 elements in a by addition.
Returns the sum of all elements in a.
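As an illustration, the sketch below sums an array of doubles by accumulating eight lanes at a time and collapsing the accumulator with _mm512_reduce_add_pd at the end. The helper name sum_f64 is hypothetical and not part of any Intel header. The code assumes the compiler defines __AVX512F__ when Intel® AVX-512 code generation is enabled, and falls back to a plain scalar loop otherwise. Because the vector path adds elements in a different order than a sequential loop, results for general inputs may differ in the last bits.

```c
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not an Intel API): returns the sum of x[0..n-1]. */
double sum_f64(const double *x, size_t n)
{
    double total = 0.0;
    size_t i = 0;
#if defined(__AVX512F__)
    /* Accumulate full blocks of 8 doubles in a vector register. */
    __m512d acc = _mm512_setzero_pd();
    for (; i + 8 <= n; i += 8)
        acc = _mm512_add_pd(acc, _mm512_loadu_pd(x + i));
    /* Collapse the 8 partial sums into a single scalar. */
    total = _mm512_reduce_add_pd(acc);
#endif
    /* Scalar loop: handles the tail, or the whole array without AVX-512. */
    for (; i < n; ++i)
        total += x[i];
    return total;
}
```

Calling the reduction once after the loop, rather than once per block, keeps the comparatively expensive horizontal step out of the hot path.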
_mm512_mask_reduce_add_pd
extern double __cdecl _mm512_mask_reduce_add_pd(__mmask8 k, __m512d a);
Reduces packed float64 elements in a by addition, using writemask k.
Returns the sum of all active elements in a.
_mm512_reduce_add_ps
extern float __cdecl _mm512_reduce_add_ps(__m512 a);
Reduces packed float32 elements in a by addition.
Returns the sum of all elements in a.
_mm512_mask_reduce_add_ps
extern float __cdecl _mm512_mask_reduce_add_ps(__mmask16 k, __m512 a);
Reduces packed float32 elements in a by addition, using writemask k.
Returns the sum of all active elements in a.
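One plausible use of the masked form is summing a partial vector, such as the tail of an array. The sketch below builds a mask with the low n bits set and passes it both to a masked load and to _mm512_mask_reduce_add_ps. The helper name sum_first_n_f32 is hypothetical, and the __AVX512F__ guard with a scalar fallback is an assumption made so that the example also builds without Intel® AVX-512 support.

```c
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not an Intel API): sums the first n floats of x.
 * Values of n above 16 are clamped to 16. */
float sum_first_n_f32(const float *x, unsigned n)
{
    if (n > 16u)
        n = 16u;
#if defined(__AVX512F__)
    /* Low n bits set: lanes 0..n-1 are active. */
    __mmask16 k = (__mmask16)((n == 16u) ? 0xFFFFu : ((1u << n) - 1u));
    /* Masked load: inactive lanes are zeroed and not read from memory. */
    __m512 a = _mm512_maskz_loadu_ps(k, x);
    /* Only the active lanes contribute to the sum. */
    return _mm512_mask_reduce_add_ps(k, a);
#else
    float total = 0.0f;
    for (unsigned i = 0; i < n; ++i)
        total += x[i];
    return total;
#endif
}
```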
_mm512_reduce_max_pd
extern double __cdecl _mm512_reduce_max_pd(__m512d a);
Reduces packed float64 elements in a by maximum.
Returns the maximum of all elements in a.
_mm512_mask_reduce_max_pd
extern double __cdecl _mm512_mask_reduce_max_pd(__mmask8 k, __m512d a);
Reduces packed float64 elements in a by maximum, using writemask k.
Returns the maximum of all active elements in a.
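The mask does not have to be a fixed constant; it can come from a comparison. The following sketch finds the largest negative value among eight doubles by comparing against zero and reducing only the lanes that passed. The helper name largest_negative_f64 and its return convention are hypothetical. The case of an empty mask is handled explicitly, since this section does not state what the intrinsic returns when no element is active. The __AVX512F__ guard and the scalar fallback are assumptions for portability.

```c
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not an Intel API): stores the largest negative value
 * of x[0..7] in *out and returns 1, or returns 0 if no element is negative. */
int largest_negative_f64(const double x[8], double *out)
{
#if defined(__AVX512F__)
    __m512d a = _mm512_loadu_pd(x);
    /* Bit i of k is set when x[i] < 0.0. */
    __mmask8 k = _mm512_cmp_pd_mask(a, _mm512_setzero_pd(), _CMP_LT_OQ);
    if (k == 0)
        return 0; /* no active lanes: skip the reduction entirely */
    *out = _mm512_mask_reduce_max_pd(k, a);
    return 1;
#else
    int found = 0;
    double best = 0.0;
    for (int i = 0; i < 8; ++i) {
        if (x[i] < 0.0 && (!found || x[i] > best)) {
            best = x[i];
            found = 1;
        }
    }
    if (found)
        *out = best;
    return found;
#endif
}
```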
_mm512_reduce_max_ps
extern float __cdecl _mm512_reduce_max_ps(__m512 a);
Reduces packed float32 elements in a by maximum.
Returns the maximum of all elements in a.
_mm512_mask_reduce_max_ps
extern float __cdecl _mm512_mask_reduce_max_ps(__mmask16 k, __m512 a);
Reduces packed float32 elements in a by maximum, using writemask k.
Returns the maximum of all active elements in a.
_mm512_reduce_min_pd
extern double __cdecl _mm512_reduce_min_pd(__m512d a);
Reduces packed float64 elements in a by minimum.
Returns the minimum of all elements in a.
_mm512_mask_reduce_min_pd
extern double __cdecl _mm512_mask_reduce_min_pd(__mmask8 k, __m512d a);
Reduces packed float64 elements in a by minimum, using writemask k.
Returns the minimum of all active elements in a.
_mm512_reduce_min_ps
extern float __cdecl _mm512_reduce_min_ps(__m512 a);
Reduces packed float32 elements in a by minimum.
Returns the minimum of all elements in a.
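A minimal sketch of how this reduction might finish a blocked search for the smallest element of an array: lane-wise minima are kept with _mm512_min_ps, and _mm512_reduce_min_ps collapses them once at the end. The helper name min_f32 is hypothetical. The sketch assumes n is at least 1 and that the input contains no NaN values, and it uses an __AVX512F__ guard with a scalar fallback so that it also builds without Intel® AVX-512 support.

```c
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not an Intel API): returns the smallest of x[0..n-1].
 * Assumes n >= 1 and NaN-free input. */
float min_f32(const float *x, size_t n)
{
    float best = x[0];
    size_t i = 0;
#if defined(__AVX512F__)
    if (n >= 16) {
        /* Keep 16 running minima, one per lane. */
        __m512 acc = _mm512_loadu_ps(x);
        for (i = 16; i + 16 <= n; i += 16)
            acc = _mm512_min_ps(acc, _mm512_loadu_ps(x + i));
        /* Collapse the 16 lane minima into one scalar. */
        best = _mm512_reduce_min_ps(acc);
    }
#endif
    /* Scalar loop: handles the tail, or the whole array without AVX-512. */
    for (; i < n; ++i)
        if (x[i] < best)
            best = x[i];
    return best;
}
```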
_mm512_mask_reduce_min_ps
extern float __cdecl _mm512_mask_reduce_min_ps(__mmask16 k, __m512 a);
Reduces packed float32 elements in a by minimum, using writemask k.
Returns the minimum of all active elements in a.
_mm512_reduce_mul_pd
extern double __cdecl _mm512_reduce_mul_pd(__m512d a);
Reduces packed float64 elements in a by multiplication.
Returns the product of all elements in a.
_mm512_mask_reduce_mul_pd
extern double __cdecl _mm512_mask_reduce_mul_pd(__mmask8 k, __m512d a);
Reduces packed float64 elements in a by multiplication, using writemask k.
Returns the product of all active elements in a.
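To illustrate lane selection, the sketch below multiplies only the elements whose bit is set in an 8-bit mask. The helper name product_selected_f64 is hypothetical. The scalar fallback starts from 1.0, so an empty mask yields 1.0 there; that the intrinsic behaves the same way for an empty mask is an assumption, not something this section states. The __AVX512F__ guard is likewise an assumption for portability.

```c
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not an Intel API): multiplies x[i] for every i
 * whose bit is set in lanes (bit 0 selects x[0], bit 7 selects x[7]). */
double product_selected_f64(const double x[8], unsigned char lanes)
{
#if defined(__AVX512F__)
    return _mm512_mask_reduce_mul_pd((__mmask8)lanes, _mm512_loadu_pd(x));
#else
    double p = 1.0;
    for (int i = 0; i < 8; ++i)
        if ((lanes >> i) & 1u)
            p *= x[i];
    return p;
#endif
}
```

For example, with x holding 1.0 through 8.0, a mask of 0x55 selects lanes 0, 2, 4, and 6, so the product is 1 * 3 * 5 * 7.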
_mm512_reduce_mul_ps
extern float __cdecl _mm512_reduce_mul_ps(__m512 a);
Reduces packed float32 elements in a by multiplication.
Returns the product of all elements in a.
_mm512_mask_reduce_mul_ps
extern float __cdecl _mm512_mask_reduce_mul_ps(__mmask16 k, __m512 a);
Reduces packed float32 elements in a by multiplication, using writemask k.
Returns the product of all active elements in a.