Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 3/31/2023
Public


Intrinsics for Integer Bit Manipulation and Conflict Detection Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>


| Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction |
| --- | --- | --- |
| _mm512_lzcnt_epi32, _mm512_mask_lzcnt_epi32, _mm512_maskz_lzcnt_epi32 | Counts the leading zero bits in source int32 elements. | VPLZCNTD |
| _mm512_lzcnt_epi64, _mm512_mask_lzcnt_epi64, _mm512_maskz_lzcnt_epi64 | Counts the leading zero bits in source int64 elements. | VPLZCNTQ |
| _mm512_ternarylogic_epi32, _mm512_mask_ternarylogic_epi32, _mm512_maskz_ternarylogic_epi32 | Implements three-operand binary function specified by immediate value. | VPTERNLOGD |
| _mm512_ternarylogic_epi64, _mm512_mask_ternarylogic_epi64, _mm512_maskz_ternarylogic_epi64 | Implements three-operand binary function specified by immediate value. | VPTERNLOGQ |


| variable | definition |
| --- | --- |
| k | writemask used as a selector |
| a | first source vector element |
| b | second source vector element |
| c | third source vector element |
| imm8 | binary function specifier |
| src | source element to use based on writemask result |


_mm512_lzcnt_epi32

extern __m512i __cdecl _mm512_lzcnt_epi32(__m512i a);

Counts the number of leading zero bits in each packed 32-bit integer in a, and stores the results in the destination.
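
The following is a minimal sketch of how the unmasked form might be called on 16 packed 32-bit values. The helper names lzcnt32_ref and lzcnt_16x32 and the HAVE_AVX512CD macro are hypothetical and not part of the compiler. The intrinsic path is compiled only when the compiler defines __AVX512F__ and __AVX512CD__; otherwise a scalar reference loop stands in for it. The sketch assumes that a zero element yields a count of 32.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__AVX512F__) && defined(__AVX512CD__)
#include <immintrin.h>
#define HAVE_AVX512CD 1   /* hypothetical macro used only in this sketch */
#else
#define HAVE_AVX512CD 0
#endif

/* Scalar reference: leading zero bits of one 32-bit value (32 when x == 0). */
static unsigned lzcnt32_ref(uint32_t x)
{
    unsigned n = 0;
    for (uint32_t probe = UINT32_C(1) << 31; probe != 0 && (x & probe) == 0; probe >>= 1)
        ++n;
    return n;
}

/* Hypothetical helper: leading-zero counts for 16 packed 32-bit values. */
static void lzcnt_16x32(const uint32_t in[16], uint32_t out[16])
{
#if HAVE_AVX512CD
    __m512i a = _mm512_loadu_si512(in);               /* unaligned 512-bit load */
    _mm512_storeu_si512(out, _mm512_lzcnt_epi32(a));  /* one count per element  */
#else
    for (int i = 0; i < 16; ++i)                      /* scalar stand-in        */
        out[i] = lzcnt32_ref(in[i]);
#endif
    assert(out[0] == lzcnt32_ref(in[0]));             /* sanity check, lane 0   */
}
```

With in[0] = 0 and in[1] = 1, the helper is expected to write 32 and 31 to out[0] and out[1].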



_mm512_mask_lzcnt_epi32

extern __m512i __cdecl _mm512_mask_lzcnt_epi32(__m512i src, __mmask16 k, __m512i a);

Counts the number of leading zero bits in each packed 32-bit integer in a, and stores the results in the destination using writemask k (elements are copied from src when the corresponding mask bit is not set).



_mm512_maskz_lzcnt_epi32

extern __m512i __cdecl _mm512_maskz_lzcnt_epi32(__mmask16 k, __m512i a);

Counts the number of leading zero bits in each packed 32-bit integer in a, and stores the results in the destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
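
The difference between the writemask and zeromask forms can be sketched with a scalar model. This is an illustration only, not the compiler's implementation: the function names are hypothetical, plain arrays stand in for __m512i, and uint16_t stands in for __mmask16, with bit i of k controlling element i.

```c
#include <assert.h>
#include <stdint.h>

/* Leading zero bits of one 32-bit element (32 when x == 0). */
static uint32_t lz32(uint32_t x)
{
    uint32_t n = 0;
    while (n < 32 && ((x << n) & UINT32_C(0x80000000)) == 0)
        ++n;
    return n;
}

/* Models _mm512_mask_lzcnt_epi32: element i comes from src when bit i of k is clear. */
static void mask_lzcnt32_model(uint32_t dst[16], const uint32_t src[16],
                               uint16_t k, const uint32_t a[16])
{
    assert(dst && src && a);
    for (int i = 0; i < 16; ++i)
        dst[i] = ((k >> i) & 1u) ? lz32(a[i]) : src[i];
}

/* Models _mm512_maskz_lzcnt_epi32: element i is zeroed when bit i of k is clear. */
static void maskz_lzcnt32_model(uint32_t dst[16], uint16_t k, const uint32_t a[16])
{
    assert(dst && a);
    for (int i = 0; i < 16; ++i)
        dst[i] = ((k >> i) & 1u) ? lz32(a[i]) : 0u;
}
```

With k = 0x0005, only elements 0 and 2 receive a count; the remaining elements keep the src value in the first model and become 0 in the second.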



_mm512_lzcnt_epi64

extern __m512i __cdecl _mm512_lzcnt_epi64(__m512i a);

Counts the number of leading zero bits in each packed 64-bit integer in a, and stores the results in the destination.



_mm512_mask_lzcnt_epi64

extern __m512i __cdecl _mm512_mask_lzcnt_epi64(__m512i src, __mmask8 k, __m512i a);

Counts the number of leading zero bits in each packed 64-bit integer in a, and stores the results in the destination using writemask k.

Elements are copied from src when the corresponding mask bit is not set.



_mm512_maskz_lzcnt_epi64

extern __m512i __cdecl _mm512_maskz_lzcnt_epi64(__mmask8 k, __m512i a);

Counts the number of leading zero bits in each packed 64-bit integer in a, and stores the results in the destination using zeromask k.

Elements are zeroed out when the corresponding mask bit is not set.
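
For the 64-bit forms the vector holds eight elements, so the mask has eight bits. The scalar sketch below models the zeromask form and then uses the count to derive the number of significant bits in each element (64 minus the leading-zero count). The function names are hypothetical, uint8_t stands in for __mmask8, and the bit-width use is one possible application chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Leading zero bits of one 64-bit element (64 when x == 0). */
static uint64_t lz64(uint64_t x)
{
    uint64_t n = 0;
    while (n < 64 && ((x << n) >> 63) == 0)
        ++n;
    return n;
}

/* Models _mm512_maskz_lzcnt_epi64: one mask bit per 64-bit element. */
static void maskz_lzcnt64_model(uint64_t dst[8], uint8_t k, const uint64_t a[8])
{
    assert(dst && a);
    for (int i = 0; i < 8; ++i)
        dst[i] = ((k >> i) & 1u) ? lz64(a[i]) : 0u;
}

/* Significant bits per selected element: 64 - lzcnt; unselected elements give 0. */
static void bit_width_8x64_model(uint64_t dst[8], uint8_t k, const uint64_t a[8])
{
    uint64_t lz[8];
    maskz_lzcnt64_model(lz, k, a);
    for (int i = 0; i < 8; ++i)
        dst[i] = ((k >> i) & 1u) ? 64u - lz[i] : 0u;
}
```

In this model a selected element of 0 has a count of 64 and therefore a bit width of 0, so zero inputs need no special case.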



_mm512_ternarylogic_epi32

extern __m512i __cdecl _mm512_ternarylogic_epi32(__m512i a, __m512i b, __m512i c, int imm8);

Performs bitwise ternary logic, which can implement any three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 32-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding destination bit.
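
The index formation described above can be sketched as a scalar model of a single 32-bit element. The function name is hypothetical. The model assumes that the bit from a is the most significant index bit and the bit from c the least significant, which matches the operand order of the underlying instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Models one 32-bit element of _mm512_ternarylogic_epi32. */
static uint32_t ternlog32_model(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8)
{
    uint32_t dst = 0;
    for (int bit = 0; bit < 32; ++bit) {
        /* 3-bit index: a is bit 2, b is bit 1, c is bit 0 */
        unsigned idx = (((a >> bit) & 1u) << 2)
                     | (((b >> bit) & 1u) << 1)
                     |  ((c >> bit) & 1u);
        assert(idx < 8);
        /* imm8 acts as an 8-entry truth table */
        dst |= (uint32_t)((imm8 >> idx) & 1u) << bit;
    }
    return dst;
}
```

Under this model, imm8 = 0x96 yields a ^ b ^ c, imm8 = 0xCA yields the bitwise select (a & b) | (~a & c), and imm8 = 0xE8 yields the bitwise majority of the three inputs.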



_mm512_mask_ternarylogic_epi32

extern __m512i __cdecl _mm512_mask_ternarylogic_epi32(__m512i src, __mmask16 k, __m512i a, __m512i b, int imm8);

Performs bitwise ternary logic, which can implement any three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 32-bit integer, the corresponding bits from src, a, and b are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding destination bit using writemask k at 32-bit granularity (32-bit elements are copied from src when the corresponding mask bit is not set).
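
In the writemask form, src plays two roles: it is the first logic operand, and it supplies the whole element when the mask bit is clear. The scalar sketch below illustrates this. The function names are hypothetical, arrays stand in for __m512i, and uint16_t stands in for __mmask16. The per-element logic is evaluated by summing the minterms that imm8 enables, which is one of several equivalent ways to compute the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates the truth table imm8 on whole 32-bit words (x is index bit 2, z is bit 0). */
static uint32_t ternlog_lane(uint32_t x, uint32_t y, uint32_t z, uint8_t imm8)
{
    uint32_t r = 0;
    for (unsigned idx = 0; idx < 8; ++idx) {
        if ((imm8 >> idx) & 1u) {
            uint32_t mx = (idx & 4u) ? x : ~x;   /* x must be 1 or 0 for this minterm */
            uint32_t my = (idx & 2u) ? y : ~y;
            uint32_t mz = (idx & 1u) ? z : ~z;
            r |= mx & my & mz;
        }
    }
    return r;
}

/* Models _mm512_mask_ternarylogic_epi32(src, k, a, b, imm8). */
static void mask_ternlog32_model(uint32_t dst[16], const uint32_t src[16], uint16_t k,
                                 const uint32_t a[16], const uint32_t b[16], uint8_t imm8)
{
    assert(dst && src && a && b);
    for (int i = 0; i < 16; ++i)
        dst[i] = ((k >> i) & 1u) ? ternlog_lane(src[i], a[i], b[i], imm8)
                                 : src[i];       /* mask bit clear: keep src element */
}
```

With k = 0x0001 and imm8 = 0x96, only element 0 becomes src ^ a ^ b; every other element is copied unchanged from src.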



_mm512_maskz_ternarylogic_epi32

extern __m512i __cdecl _mm512_maskz_ternarylogic_epi32(__mmask16 k, __m512i a, __m512i b, __m512i c, int imm8);

Performs bitwise ternary logic, which can implement any three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 32-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding destination bit using zeromask k at 32-bit granularity (32-bit elements are zeroed out when the corresponding mask bit is not set).



_mm512_ternarylogic_epi64

extern __m512i __cdecl _mm512_ternarylogic_epi64(__m512i a, __m512i b, __m512i c, int imm8);

Performs bitwise ternary logic, which can implement any three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 64-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding destination bit.
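
Because imm8 is an 8-entry truth table, a value for a given function can be derived rather than memorized. One common technique, sketched below, writes the function over the three truth-table columns 0xF0, 0xCC, and 0xAA and keeps the low 8 bits. The names TL_A, TL_B, TL_C, the IMM8_* macros, and ternlog64_model are hypothetical, and the sketch assumes the bit from a is the most significant index bit.

```c
#include <assert.h>
#include <stdint.h>

/* Truth-table columns for the three operands (hypothetical names). */
enum { TL_A = 0xF0, TL_B = 0xCC, TL_C = 0xAA };

/* Write the desired function over the columns; the low 8 bits are the imm8 value. */
#define IMM8_XOR3     ((uint8_t)(TL_A ^ TL_B ^ TL_C))                           /* 0x96 */
#define IMM8_SELECT   ((uint8_t)((TL_A & TL_B) | (~TL_A & TL_C)))               /* 0xCA */
#define IMM8_MAJORITY ((uint8_t)((TL_A & TL_B) | (TL_A & TL_C) | (TL_B & TL_C))) /* 0xE8 */

/* Models one 64-bit element of _mm512_ternarylogic_epi64. */
static uint64_t ternlog64_model(uint64_t a, uint64_t b, uint64_t c, uint8_t imm8)
{
    uint64_t dst = 0;
    for (int bit = 0; bit < 64; ++bit) {
        unsigned idx = (unsigned)((((a >> bit) & 1u) << 2)
                                | (((b >> bit) & 1u) << 1)
                                |  ((c >> bit) & 1u));
        assert(idx < 8);
        dst |= (uint64_t)((imm8 >> idx) & 1u) << bit;
    }
    return dst;
}
```

The intrinsic itself takes imm8 as an immediate, so in real code the derived value would typically be passed as a compile-time constant.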



_mm512_mask_ternarylogic_epi64

extern __m512i __cdecl _mm512_mask_ternarylogic_epi64(__m512i src, __mmask8 k, __m512i a, __m512i b, int imm8);

Performs bitwise ternary logic, which can implement any three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 64-bit integer, the corresponding bits from src, a, and b are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding destination bit using writemask k at 64-bit granularity (64-bit elements are copied from src when the corresponding mask bit is not set).



_mm512_maskz_ternarylogic_epi64

extern __m512i __cdecl _mm512_maskz_ternarylogic_epi64(__mmask8 k, __m512i a, __m512i b, __m512i c, int imm8);

Performs bitwise ternary logic, which can implement any three-operand binary function; the specific function is selected by the value in imm8.

For each bit in each packed 64-bit integer, the corresponding bits from a, b, and c are used to form a 3-bit index into imm8, and the value at that bit in imm8 is written to the corresponding destination bit using zeromask k at 64-bit granularity (64-bit elements are zeroed out when the corresponding mask bit is not set).