Intrinsics for Integer Multiplication Operations
The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
Intrinsic Name | Operation | Corresponding Instruction |
---|---|---|
_mm512_mul_epi32, _mm512_mask_mul_epi32, _mm512_maskz_mul_epi32 | Multiplies alternating (low) signed int32 elements of two vectors together to produce signed int64 results. | VPMULDQ |
_mm512_mul_epu32, _mm512_mask_mul_epu32, _mm512_maskz_mul_epu32 | Multiplies alternating (low) unsigned int32 elements of two vectors together to produce unsigned int64 results. | VPMULUDQ |
_mm512_mullo_epi32, _mm512_mask_mullo_epi32 | Multiplies int32 vectors together and keeps the low 32 bits of each product, producing int32 results. | VPMULLD |
_mm512_mullox_epi64, _mm512_mask_mullox_epi64 | Multiplies int64 vectors together and keeps the low 64 bits of each product, producing int64 results. | None. |
variable | definition |
---|---|
k | writemask used as a selector |
a | first source vector element |
b | second source vector element |
src | source element to use based on writemask result |
_mm512_mul_epi32
extern __m512i __cdecl _mm512_mul_epi32(__m512i a, __m512i b);
Multiplies the low int32 elements from each packed 64-bit element in a and b, and stores the signed 64-bit result.
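The per-element behavior described above can be sketched as a portable scalar model. This is only an illustration: `model_mul_epi32` is a hypothetical helper name, the vectors are represented as plain arrays of eight 64-bit elements, and the narrowing cast assumes the usual two's-complement conversion.

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm512_mul_epi32 (not part of any header).
   Each vector is treated as eight 64-bit elements; only the low 32 bits
   of each element take part in the multiplication. */
static void model_mul_epi32(const int64_t a[8], const int64_t b[8],
                            int64_t dst[8])
{
    for (int i = 0; i < 8; ++i) {
        /* Keep the low 32 bits and reinterpret them as a signed value
           (assumes two's-complement narrowing). */
        int32_t lo_a = (int32_t)(uint32_t)a[i];
        int32_t lo_b = (int32_t)(uint32_t)b[i];
        /* Widen before multiplying so the full 64-bit product is kept. */
        dst[i] = (int64_t)lo_a * (int64_t)lo_b;
    }
}
```

Note that the upper 32 bits of each source element are ignored, so an element holding 0x100000005 contributes the value 5.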
_mm512_mask_mul_epi32
extern __m512i __cdecl _mm512_mask_mul_epi32(__m512i src, __mmask8 k, __m512i a, __m512i b);
Multiplies the low int32 elements from each packed 64-bit element in a and b, and stores the signed 64-bit result using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_maskz_mul_epi32
extern __m512i __cdecl _mm512_maskz_mul_epi32(__mmask8 k, __m512i a, __m512i b);
Multiplies the low int32 elements from each packed 64-bit element in a and b, and stores the signed 64-bit result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
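The difference between the writemask and zeromask forms can be illustrated with a scalar sketch. The function names below are hypothetical, the mask is modeled as a plain `uint8_t` whose bit i controls element i, and the narrowing cast assumes two's-complement conversion.

```c
#include <stdint.h>

/* Hypothetical helper: signed product of the low 32 bits of two
   64-bit elements (assumes two's-complement narrowing). */
static int64_t mul_low32_signed(int64_t x, int64_t y)
{
    return (int64_t)(int32_t)(uint32_t)x * (int64_t)(int32_t)(uint32_t)y;
}

/* Model of the writemask form: elements whose mask bit is clear
   are copied from src. */
static void model_mask_mul_epi32(const int64_t src[8], uint8_t k,
                                 const int64_t a[8], const int64_t b[8],
                                 int64_t dst[8])
{
    for (int i = 0; i < 8; ++i)
        dst[i] = ((k >> i) & 1) ? mul_low32_signed(a[i], b[i]) : src[i];
}

/* Model of the zeromask form: elements whose mask bit is clear
   are set to zero. */
static void model_maskz_mul_epi32(uint8_t k, const int64_t a[8],
                                  const int64_t b[8], int64_t dst[8])
{
    for (int i = 0; i < 8; ++i)
        dst[i] = ((k >> i) & 1) ? mul_low32_signed(a[i], b[i]) : 0;
}
```

With k = 0x05, only elements 0 and 2 receive products; the remaining elements come from src in the first model and are zero in the second.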
_mm512_mullo_epi32
extern __m512i __cdecl _mm512_mullo_epi32(__m512i a, __m512i b);
Multiplies the packed int32 elements in a and b, producing intermediate int64 elements, and stores the low 32 bits of the intermediate integers.
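As a rough scalar illustration of the "low 32 bits" behavior, the sketch below models a vector as sixteen int32 elements. `model_mullo_epi32` is a hypothetical name, and the final narrowing cast assumes two's-complement conversion.

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm512_mullo_epi32: sixteen int32
   elements per vector, each product truncated to its low 32 bits. */
static void model_mullo_epi32(const int32_t a[16], const int32_t b[16],
                              int32_t dst[16])
{
    for (int i = 0; i < 16; ++i) {
        /* Intermediate 64-bit product, as in the description. */
        int64_t wide = (int64_t)a[i] * (int64_t)b[i];
        /* Keep only the low 32 bits (assumes two's-complement narrowing). */
        dst[i] = (int32_t)(uint32_t)wide;
    }
}
```

Products that do not fit in 32 bits wrap around: for example, 65536 * 65536 equals 2^32, whose low 32 bits are all zero.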
_mm512_mask_mullo_epi32
extern __m512i __cdecl _mm512_mask_mullo_epi32(__m512i src, __mmask16 k, __m512i a, __m512i b);
Multiplies the packed int32 elements in a and b, producing intermediate int64 elements, and stores the low 32 bits of the intermediate integers in destination using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mul_epu32
extern __m512i __cdecl _mm512_mul_epu32(__m512i a, __m512i b);
Multiplies the low unsigned int32 elements from each packed 64-bit element in a and b, and stores the unsigned 64-bit result.
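The unsigned variant differs from the signed one only in how the low 32 bits are interpreted. A minimal scalar sketch, with the hypothetical name `model_mul_epu32` and vectors modeled as arrays of eight 64-bit elements:

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm512_mul_epu32: the low 32 bits of
   each 64-bit element are treated as unsigned and multiplied into a
   full unsigned 64-bit product. */
static void model_mul_epu32(const uint64_t a[8], const uint64_t b[8],
                            uint64_t dst[8])
{
    for (int i = 0; i < 8; ++i) {
        /* The uint32_t cast discards the upper 32 bits of each element. */
        dst[i] = (uint64_t)(uint32_t)a[i] * (uint64_t)(uint32_t)b[i];
    }
}
```

For a low half of 0xFFFFFFFF multiplied by 2, this model yields 0x1FFFFFFFE, whereas a signed interpretation of the same bits (-1) would yield -2.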
_mm512_mask_mul_epu32
extern __m512i __cdecl _mm512_mask_mul_epu32(__m512i src, __mmask8 k, __m512i a, __m512i b);
Multiplies the low unsigned int32 elements from each packed 64-bit element in a and b, and stores the unsigned 64-bit result using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_maskz_mul_epu32
extern __m512i __cdecl _mm512_maskz_mul_epu32(__mmask8 k, __m512i a, __m512i b);
Multiplies the low unsigned int32 elements from each packed 64-bit element in a and b, and stores the unsigned 64-bit result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
_mm512_mullox_epi64
extern __m512i __cdecl _mm512_mullox_epi64(__m512i a, __m512i b);
Multiplies each packed int64 element in a and b, and stores the low 64 bits of each product.
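Keeping the low bits of a 64-bit product is equivalent to multiplication modulo 2^64. The scalar sketch below illustrates this under the assumption that "low bits" means the low 64 bits; `model_mullox_epi64` is a hypothetical name, and the final cast assumes two's-complement conversion.

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm512_mullox_epi64: eight int64
   elements per vector, each product reduced to its low 64 bits. */
static void model_mullox_epi64(const int64_t a[8], const int64_t b[8],
                               int64_t dst[8])
{
    for (int i = 0; i < 8; ++i) {
        /* Unsigned multiplication wraps modulo 2^64, which matches the
           low 64 bits of the full product and avoids signed overflow. */
        uint64_t low = (uint64_t)a[i] * (uint64_t)b[i];
        dst[i] = (int64_t)low;
    }
}
```

The multiplication is done on unsigned values because signed overflow is undefined behavior in C, while unsigned overflow is defined to wrap.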
_mm512_mask_mullox_epi64
extern __m512i __cdecl _mm512_mask_mullox_epi64(__m512i src, __mmask8 k, __m512i a, __m512i b);
Multiplies each packed int64 element in a and b, and stores the low 64 bits of each product using writemask k (elements are copied from src when the corresponding mask bit is not set).