Intrinsics for Bit Manipulation Operations
The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
variable | definition |
---|---|
src | source element to use based on writemask result |
k | writemask used as a selector |
a | first source vector element |
b | second source vector element |
_mm_lzcnt_epi32
__m128i _mm_lzcnt_epi32(__m128i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntd
Counts the number of leading zero bits in each packed 32-bit integer in a and returns the results. An element equal to zero yields a count of 32.
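As a rough illustration of the per-element behavior, the sketch below pairs a scalar reference with a call to the intrinsic. The helper names lzcnt32_ref and lzcnt_epi32_x4 are hypothetical and not part of any Intel header. The intrinsic path is compiled only when the compiler defines __AVX512CD__ and __AVX512VL__ (an assumption about how the build is configured); otherwise the scalar loop stands in for it.

```c
#include <stdint.h>
#if defined(__AVX512CD__) && defined(__AVX512VL__)
#include <immintrin.h>
#endif

/* Scalar reference: number of leading zero bits in x; returns 32 for x == 0. */
static uint32_t lzcnt32_ref(uint32_t x)
{
    uint32_t n = 0;
    for (uint32_t bit = UINT32_C(1) << 31; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Hypothetical helper: leading-zero counts of four packed 32-bit values. */
static void lzcnt_epi32_x4(const uint32_t in[4], uint32_t out[4])
{
#if defined(__AVX512CD__) && defined(__AVX512VL__)
    /* Unaligned load, one vplzcntd, unaligned store. */
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_lzcnt_epi32(v));
#else
    /* Portable fallback with the same per-element result. */
    for (int i = 0; i < 4; i++)
        out[i] = lzcnt32_ref(in[i]);
#endif
}
```

Both paths are intended to produce identical results, so the scalar loop can also serve as a test oracle on hardware that supports the instruction.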
_mm_mask_lzcnt_epi32
__m128i _mm_mask_lzcnt_epi32(__m128i src, __mmask8 k, __m128i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntd
Counts the number of leading zero bits in each packed 32-bit integer in a and returns the results using writemask k (elements are copied from src when the corresponding mask bit is not set).
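The writemask behavior described above can be modeled in plain C. This is a minimal scalar sketch, not Intel's implementation: the function name mask_lzcnt_epi32_model is hypothetical, and it assumes that bit i of k controls element i, with only the low four mask bits used for a 128-bit vector of 32-bit elements.

```c
#include <stdint.h>

/* Leading zero bits in x; 32 when x == 0. */
static uint32_t lzcnt32(uint32_t x)
{
    uint32_t n = 0;
    for (uint32_t bit = UINT32_C(1) << 31; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Scalar model of merge masking over four 32-bit elements. */
static void mask_lzcnt_epi32_model(const uint32_t src[4], uint8_t k,
                                   const uint32_t a[4], uint32_t dst[4])
{
    for (int i = 0; i < 4; i++) {
        if ((k >> i) & 1u)
            dst[i] = lzcnt32(a[i]);   /* mask bit set: take the computed count */
        else
            dst[i] = src[i];          /* mask bit clear: keep the value from src */
    }
}
```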
_mm_maskz_lzcnt_epi32
__m128i _mm_maskz_lzcnt_epi32(__mmask8 k, __m128i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntd
Counts the number of leading zero bits in each packed 32-bit integer in a and returns the results using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
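Zero masking can be sketched the same way. The model below is illustrative only; maskz_lzcnt_epi32_model is a hypothetical name, and the mapping of mask bit i to element i is assumed.

```c
#include <stdint.h>

/* Leading zero bits in x; 32 when x == 0. */
static uint32_t lzcnt32(uint32_t x)
{
    uint32_t n = 0;
    for (uint32_t bit = UINT32_C(1) << 31; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Scalar model of zero masking over four 32-bit elements. */
static void maskz_lzcnt_epi32_model(uint8_t k, const uint32_t a[4], uint32_t dst[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = ((k >> i) & 1u) ? lzcnt32(a[i]) : 0u;  /* clear bit: element is 0 */
}
```

Note that a zeroed element is indistinguishable from a genuine count of 0 (an input with its top bit set), so callers that need to tell the two apart have to consult the mask.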
_mm256_lzcnt_epi32
__m256i _mm256_lzcnt_epi32(__m256i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntd
Counts the number of leading zero bits in each packed 32-bit integer in a and returns the results. An element equal to zero yields a count of 32.
_mm256_mask_lzcnt_epi32
__m256i _mm256_mask_lzcnt_epi32(__m256i src, __mmask8 k, __m256i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntd
Counts the number of leading zero bits in each packed 32-bit integer in a and returns the results using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm256_maskz_lzcnt_epi32
__m256i _mm256_maskz_lzcnt_epi32(__mmask8 k, __m256i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntd
Counts the number of leading zero bits in each packed 32-bit integer in a and returns the results using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
_mm_lzcnt_epi64
__m128i _mm_lzcnt_epi64(__m128i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntq
Counts the number of leading zero bits in each packed 64-bit integer in a and returns the results. An element equal to zero yields a count of 64.
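For the 64-bit form, a scalar sketch of the per-element result follows, together with one common use: for a nonzero value, 63 minus the leading-zero count is the index of the highest set bit. All function names here are hypothetical, and the code models the semantics rather than calling the intrinsic.

```c
#include <stdint.h>

/* Leading zero bits in a 64-bit value; 64 when x == 0. */
static uint64_t lzcnt64(uint64_t x)
{
    uint64_t n = 0;
    for (uint64_t bit = UINT64_C(1) << 63; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Scalar model of the 128-bit form: two independent 64-bit elements. */
static void lzcnt_epi64_model(const uint64_t a[2], uint64_t dst[2])
{
    for (int i = 0; i < 2; i++)
        dst[i] = lzcnt64(a[i]);
}

/* Index of the highest set bit; only meaningful for x != 0. */
static unsigned highest_set_bit(uint64_t x)
{
    return 63u - (unsigned)lzcnt64(x);
}
```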
_mm_mask_lzcnt_epi64
__m128i _mm_mask_lzcnt_epi64(__m128i src, __mmask8 k, __m128i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntq
Counts the number of leading zero bits in each packed 64-bit integer in a and returns the results using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm_maskz_lzcnt_epi64
__m128i _mm_maskz_lzcnt_epi64(__mmask8 k, __m128i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntq
Counts the number of leading zero bits in each packed 64-bit integer in a and returns the results using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
_mm256_lzcnt_epi64
__m256i _mm256_lzcnt_epi64(__m256i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntq
Counts the number of leading zero bits in each packed 64-bit integer in a and returns the results. An element equal to zero yields a count of 64.
_mm256_mask_lzcnt_epi64
__m256i _mm256_mask_lzcnt_epi64(__m256i src, __mmask8 k, __m256i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntq
Counts the number of leading zero bits in each packed 64-bit integer in a and returns the results using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm256_maskz_lzcnt_epi64
__m256i _mm256_maskz_lzcnt_epi64(__mmask8 k, __m256i a)
CPUID Flags: AVX512CD, AVX512VL
Instruction(s): vplzcntq
Counts the number of leading zero bits in each packed 64-bit integer in a and returns the results using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
_mm_multishift_epi64_epi8
__m128i _mm_multishift_epi64_epi8(__m128i a, __m128i b)
CPUID Flags: AVX512VBMI, AVX512VL
Instruction(s): vpmultishiftqb
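For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value. The low 6 bits of each control byte in a give the starting bit offset of one selected byte within the element of b; a selection that runs past bit 63 wraps around to bit 0 of the same element.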
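The selection rule is easier to follow in scalar form. The sketch below models a single 64-bit element; multishift_lane is a hypothetical name, and the code assumes that only the low 6 bits of each control byte are used and that the selected bits wrap around within the element, which matches a rotate right by the offset.

```c
#include <stdint.h>

/* Scalar model of one 64-bit element:
 *   ctrl = the 64-bit element of a (eight control bytes)
 *   data = the corresponding 64-bit element of b
 * Result byte j holds the 8 bits of data starting at bit (control byte j & 63). */
static uint64_t multishift_lane(uint64_t ctrl, uint64_t data)
{
    uint64_t result = 0;
    for (int j = 0; j < 8; j++) {
        unsigned off = (unsigned)(ctrl >> (8 * j)) & 63u;
        /* Rotate right by off so the selected 8 bits land in the low byte. */
        uint64_t rot = off ? ((data >> off) | (data << (64 - off))) : data;
        result |= (rot & UINT64_C(0xFF)) << (8 * j);
    }
    return result;
}
```

With control bytes 0, 8, 16, ..., 56 (from least to most significant) the element is copied unchanged; reversing that sequence swaps the byte order, and offsets that are not multiples of 8 extract fields that straddle byte boundaries.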
_mm_mask_multishift_epi64_epi8
__m128i _mm_mask_multishift_epi64_epi8(__m128i src, __mmask16 k, __m128i a, __m128i b)
CPUID Flags: AVX512VBMI, AVX512VL
Instruction(s): vpmultishiftqb
For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value using writemask k (8-bit elements are copied from src when the corresponding mask bit is not set).
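Because the mask type has one bit per byte (16 bits for a 128-bit vector), masking here applies to individual result bytes rather than to whole 64-bit elements. A scalar sketch for one 64-bit element follows; the function names are hypothetical, and k stands for the eight mask bits that belong to this element, with bit j assumed to control byte j.

```c
#include <stdint.h>

/* Unmasked model: result byte j = 8 bits of data starting at bit
 * (control byte j & 63), wrapping around within the 64-bit element. */
static uint64_t multishift_lane(uint64_t ctrl, uint64_t data)
{
    uint64_t result = 0;
    for (int j = 0; j < 8; j++) {
        unsigned off = (unsigned)(ctrl >> (8 * j)) & 63u;
        uint64_t rot = off ? ((data >> off) | (data << (64 - off))) : data;
        result |= (rot & UINT64_C(0xFF)) << (8 * j);
    }
    return result;
}

/* Merge-masked model for one element: each mask bit selects between the
 * computed byte and the corresponding byte of src. */
static uint64_t mask_multishift_lane(uint64_t src, uint8_t k,
                                     uint64_t ctrl, uint64_t data)
{
    uint64_t computed = multishift_lane(ctrl, data);
    uint64_t result = 0;
    for (int j = 0; j < 8; j++) {
        uint64_t byte_mask = UINT64_C(0xFF) << (8 * j);
        result |= (((k >> j) & 1u) ? computed : src) & byte_mask;
    }
    return result;
}
```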
_mm_maskz_multishift_epi64_epi8
__m128i _mm_maskz_multishift_epi64_epi8(__mmask16 k, __m128i a, __m128i b)
CPUID Flags: AVX512VBMI, AVX512VL
Instruction(s): vpmultishiftqb
For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value using zeromask k (8-bit elements are zeroed out when the corresponding mask bit is not set).
_mm256_multishift_epi64_epi8
__m256i _mm256_multishift_epi64_epi8(__m256i a, __m256i b)
CPUID Flags: AVX512VBMI, AVX512VL
Instruction(s): vpmultishiftqb
For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value.
_mm256_mask_multishift_epi64_epi8
__m256i _mm256_mask_multishift_epi64_epi8(__m256i src, __mmask32 k, __m256i a, __m256i b)
CPUID Flags: AVX512VBMI, AVX512VL
Instruction(s): vpmultishiftqb
For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value using writemask k (8-bit elements are copied from src when the corresponding mask bit is not set).
_mm256_maskz_multishift_epi64_epi8
__m256i _mm256_maskz_multishift_epi64_epi8(__mmask32 k, __m256i a, __m256i b)
CPUID Flags: AVX512VBMI, AVX512VL
Instruction(s): vpmultishiftqb
For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value using zeromask k (8-bit elements are zeroed out when the corresponding mask bit is not set).
_mm512_multishift_epi64_epi8
__m512i _mm512_multishift_epi64_epi8(__m512i a, __m512i b)
CPUID Flags: AVX512VBMI
Instruction(s): vpmultishiftqb
For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value.
_mm512_mask_multishift_epi64_epi8
__m512i _mm512_mask_multishift_epi64_epi8(__m512i src, __mmask64 k, __m512i a, __m512i b)
CPUID Flags: AVX512VBMI
Instruction(s): vpmultishiftqb
For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value using writemask k (8-bit elements are copied from src when the corresponding mask bit is not set).
_mm512_maskz_multishift_epi64_epi8
__m512i _mm512_maskz_multishift_epi64_epi8(__mmask64 k, __m512i a, __m512i b)
CPUID Flags: AVX512VBMI
Instruction(s): vpmultishiftqb
For each 64-bit element in b, selects 8 unaligned bytes using the byte-granular shift controls in the corresponding 64-bit element of a, and stores the 8 assembled bytes in the corresponding 64-bit element of the return value using zeromask k (8-bit elements are zeroed out when the corresponding mask bit is not set).