Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512)...

Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249

Date 12/16/2022

Public

Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) 4VNNIW Instructions

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) 4VNNIW instruction intrinsics are located in the zmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>

_mm512_4dpwssd_epi32

__m512i _mm512_4dpwssd_epi32 (__m512i c, __m512i a0, __m512i a1, __m512i a2, __m512i a3, __m128i * b)

variable	definition
a0, a1, a2, a3	first source block of four vectors
b	pointer to the second source block
c	third source; accumulator

Instructions: vp4dpwssd zmm1, zmm2+3, m128

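Computes four sequential dot-product steps on signed word operands, accumulating doubleword results in c. Each step multiplies the pairs of adjacent signed words in one source vector by the pair of signed words in one doubleword of the 128-bit memory operand at b, and adds both products to the corresponding doubleword of the accumulator. The doublewords of the memory operand are selected sequentially: step 0 uses a0 with the first doubleword, step 1 uses a1 with the second, and so on.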
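The four-step accumulation can be modeled in portable scalar C. The following is a minimal sketch, not the compiler's implementation: the types acc512_t, src512_t, and mem128_t and the function ref_4dpwssd are hypothetical names introduced here for illustration, and the model assumes word 2*m of the memory operand is the low half of doubleword m.

```c
#include <stdint.h>

/* Hypothetical scalar stand-ins for the vector types (illustration only). */
typedef struct { int32_t d[16]; } acc512_t;  /* models c and the result: 16 doublewords */
typedef struct { int16_t w[32]; } src512_t;  /* models one of a0..a3: 32 signed words   */
typedef struct { int16_t w[8];  } mem128_t;  /* models the 128 bits at b: 8 signed words */

/* Scalar model of the unmasked, non-saturating form. */
static acc512_t ref_4dpwssd(acc512_t c, const src512_t a[4], const mem128_t *b)
{
    for (int m = 0; m < 4; ++m) {            /* step m pairs a[m] with doubleword m of b */
        int32_t t0 = b->w[2 * m];            /* low word of doubleword m  */
        int32_t t1 = b->w[2 * m + 1];        /* high word of doubleword m */
        for (int i = 0; i < 16; ++i) {
            int32_t p1 = (int32_t)a[m].w[2 * i]     * t0;
            int32_t p2 = (int32_t)a[m].w[2 * i + 1] * t1;
            /* Wrap-around accumulation; unsigned arithmetic avoids signed
               overflow. The cast back to int32_t assumes the usual
               two's-complement conversion. */
            c.d[i] = (int32_t)((uint32_t)c.d[i] + (uint32_t)p1 + (uint32_t)p2);
        }
    }
    return c;
}
```

A model like this is mainly useful for checking results on machines that do not implement the 4VNNIW instructions.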

_mm512_mask_4dpwssd_epi32

__m512i _mm512_mask_4dpwssd_epi32 (__m512i c, __mmask16 k, __m512i a0, __m512i a1, __m512i a2, __m512i a3, __m128i * b)

variable	definition
a0, a1, a2, a3	first source block of four vectors
b	pointer to the second source block
c	third source; accumulator
k	mask used as a selector

Instructions: vp4dpwssd zmm1 {k}, zmm2+3, m128

Computes the same four-step signed word dot-product as _mm512_4dpwssd_epi32, with doubleword accumulation in c, under writemask k. The memory operand at b is selected sequentially, one doubleword per step. Result elements are copied from c when the corresponding mask bit is not set.
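Merge-masking can be sketched per accumulator lane in scalar C. This is an illustrative model under stated assumptions, not the compiler's implementation: lane_4dpwssd and ref_mask_4dpwssd are hypothetical names, plain arrays stand in for the vector types, and bit i of k is assumed to control doubleword i.

```c
#include <stdint.h>

/* One accumulator lane of the four-step dot product (wrap-around arithmetic). */
static int32_t lane_4dpwssd(int32_t acc, int16_t a[4][32], const int16_t b[8], int i)
{
    uint32_t sum = (uint32_t)acc;
    for (int m = 0; m < 4; ++m) {
        /* Step m: words 2i and 2i+1 of a[m] times the two words of doubleword m of b. */
        sum += (uint32_t)((int32_t)a[m][2 * i]     * b[2 * m]);
        sum += (uint32_t)((int32_t)a[m][2 * i + 1] * b[2 * m + 1]);
    }
    return (int32_t)sum;  /* assumes two's-complement conversion */
}

/* Merge-masking: lanes whose mask bit is clear keep the value from c. */
static void ref_mask_4dpwssd(int32_t dst[16], const int32_t c[16], uint16_t k,
                             int16_t a[4][32], const int16_t b[8])
{
    for (int i = 0; i < 16; ++i)
        dst[i] = ((k >> i) & 1u) ? lane_4dpwssd(c[i], a, b, i) : c[i];
}
```

Because each doubleword lane is independent, selecting per lane gives the same result as computing every lane and blending afterwards.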

_mm512_maskz_4dpwssd_epi32

__m512i _mm512_maskz_4dpwssd_epi32 (__mmask16 k, __m512i c, __m512i a0, __m512i a1, __m512i a2, __m512i a3, __m128i * b)

variable	definition
a0, a1, a2, a3	first source block of four vectors
b	pointer to the second source block
c	third source; accumulator
k	mask used as a selector

Instructions: vp4dpwssd zmm1 {k}{z}, zmm2+3, m128

Computes the same four-step signed word dot-product as _mm512_4dpwssd_epi32, with doubleword accumulation in c, under zeromask k. The memory operand at b is selected sequentially, one doubleword per step. Result elements are zeroed out when the corresponding mask bit is not set.
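Zero-masking differs from merge-masking only in what the unselected lanes receive. A minimal scalar sketch, assuming the unmasked result has already been computed into a hypothetical array named full, and using the hypothetical helper name apply_zero_mask16:

```c
#include <stdint.h>

/* Zero-masking: lane i takes the computed value when bit i of k is set,
   and is cleared to zero otherwise. The accumulator input is not consulted
   for unselected lanes. */
static void apply_zero_mask16(int32_t dst[16], uint16_t k, const int32_t full[16])
{
    for (int i = 0; i < 16; ++i)
        dst[i] = ((k >> i) & 1u) ? full[i] : 0;
}
```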

_mm512_4dpwssds_epi32

__m512i _mm512_4dpwssds_epi32 (__m512i c, __m512i a0, __m512i a1, __m512i a2, __m512i a3, __m128i * b)

variable	definition
a0, a1, a2, a3	first source block of four vectors
b	pointer to the second source block
c	third source; accumulator

Instructions: vp4dpwssds zmm1, zmm2+3, m128

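Computes four sequential dot-product steps on signed word operands, accumulating doubleword results in c with signed saturation. Each step multiplies the pairs of adjacent signed words in one source vector by the pair of signed words in one doubleword of the 128-bit memory operand at b, and adds both products to the corresponding doubleword of the accumulator, saturating the sum to the signed doubleword range. The doublewords of the memory operand are selected sequentially: step 0 uses a0 with the first doubleword, step 1 uses a1 with the second, and so on.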
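The effect of signed saturation can be seen in a scalar model of one accumulator lane. This is a sketch under a stated assumption: saturation is applied after each of the four steps rather than once at the end, which is how the instruction's pseudocode is understood here. The names saturate32 and lane_4dpwssds are hypothetical.

```c
#include <stdint.h>

/* Clamp a 64-bit intermediate to the signed 32-bit range. */
static int32_t saturate32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* One accumulator lane of the saturating four-step dot product.
   Assumption: the running sum is saturated after every step. */
static int32_t lane_4dpwssds(int32_t acc, int16_t a[4][32], const int16_t b[8], int i)
{
    for (int m = 0; m < 4; ++m) {
        int64_t p1 = (int64_t)a[m][2 * i]     * b[2 * m];
        int64_t p2 = (int64_t)a[m][2 * i + 1] * b[2 * m + 1];
        acc = saturate32((int64_t)acc + p1 + p2);
    }
    return acc;
}
```

Under this assumption the order of the steps matters: an accumulator that saturates in an early step and then receives a negative contribution ends below the maximum, whereas a single final saturation would leave it clamped.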

_mm512_mask_4dpwssds_epi32

__m512i _mm512_mask_4dpwssds_epi32 (__m512i c, __mmask16 k, __m512i a0, __m512i a1, __m512i a2, __m512i a3, __m128i * b)

variable	definition
a0, a1, a2, a3	first source block of four vectors
b	pointer to the second source block
c	third source; accumulator
k	mask used as a selector

Instructions: vp4dpwssds zmm1 {k}, zmm2+3, m128

Computes the same four-step signed word dot-product as _mm512_4dpwssds_epi32, with doubleword accumulation and signed saturation in c, under writemask k. The memory operand at b is selected sequentially, one doubleword per step. Result elements are copied from c when the corresponding mask bit is not set.

_mm512_maskz_4dpwssds_epi32

__m512i _mm512_maskz_4dpwssds_epi32 (__mmask16 k, __m512i c, __m512i a0, __m512i a1, __m512i a2, __m512i a3, __m128i * b)

variable	definition
a0, a1, a2, a3	first source block of four vectors
b	pointer to the second source block
c	third source; accumulator
k	mask used as a selector

Instructions: vp4dpwssds zmm1 {k}{z}, zmm2+3, m128

Computes the same four-step signed word dot-product as _mm512_4dpwssds_epi32, with doubleword accumulation and signed saturation in c, under zeromask k. The memory operand at b is selected sequentially, one doubleword per step. Result elements are zeroed out when the corresponding mask bit is not set.
