Intrinsics for Integer Load and Store Operations
The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
Intrinsic Name | Operation | Corresponding Instruction |
---|---|---|
_mm512_load_epi32, _mm512_mask_load_epi32, _mm512_maskz_load_epi32 | Load packed int32 elements from memory | VMOVDQA32 |
_mm512_load_epi64, _mm512_mask_load_epi64, _mm512_maskz_load_epi64 | Load packed int64 elements from memory | VMOVDQA64 |
_mm512_loadu_si512 | Unaligned load of 512 bits of integer data | VMOVDQU32 |
_mm512_mask_loadu_epi32, _mm512_maskz_loadu_epi32 | Unaligned load of packed int32 elements | VMOVDQU32 |
_mm512_mask_loadu_epi64, _mm512_maskz_loadu_epi64 | Unaligned load of packed int64 elements | VMOVDQU64 |
_mm512_stream_load_si512 | Aligned load of 512 bits of integer data using a non-temporal hint | VMOVNTDQA |
_mm512_mask_storeu_epi64 | Store unaligned packed int64 elements | VMOVDQU64 |
_mm512_stream_si512 | Store packed integer values using a non-temporal hint | VMOVNTDQ |
variable | definition |
---|---|
k | writemask used as a selector |
a | first source vector element |
mem_addr | pointer to base address in memory |
src | source element to use based on writemask result |
_mm512_load_si512
extern __m512i __cdecl _mm512_load_si512(void const* mem_addr);
Load 512 bits of integer data from memory into destination.
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_loadu_si512
extern __m512i __cdecl _mm512_loadu_si512(void const* mem_addr);
Load 512 bits of integer data from memory into destination.
mem_addr does not need to be aligned on any particular boundary.
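The difference between the aligned and unaligned forms can be illustrated with a minimal sketch. The helper names `copy64` and `copy64_avx512` are hypothetical, and the `target` attribute and `__builtin_cpu_supports` are GCC/Clang extensions assumed here so the example can fall back to `memcpy` on processors without Intel® AVX-512 support.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension (assumption): build only this function with AVX-512F. */
__attribute__((target("avx512f")))
static void copy64_avx512(const void *aligned_src, void *dst)
{
    __m512i v = _mm512_load_si512(aligned_src); /* requires 64-byte alignment */
    _mm512_storeu_si512(dst, v);                /* any address is acceptable  */
}

/* Hypothetical helper: copy 64 bytes from an aligned source to any destination. */
void copy64(const void *aligned_src, void *dst)
{
    /* Check the alignment precondition of the aligned load up front. */
    assert(((uintptr_t)aligned_src & 63u) == 0);
    if (__builtin_cpu_supports("avx512f"))
        copy64_avx512(aligned_src, dst);
    else
        memcpy(dst, aligned_src, 64); /* portable fallback */
}
```

A caller might declare the source as `_Alignas(64) uint8_t src[64];` and pass any byte address as the destination.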
_mm512_load_epi32
extern __m512i __cdecl _mm512_load_epi32(void const* mem_addr);
Load 512 bits (composed of sixteen packed 32-bit integers) from memory into destination.
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_mask_load_epi32
extern __m512i __cdecl _mm512_mask_load_epi32(__m512i src, __mmask16 k, void const* mem_addr);
Load packed int32 elements from memory into destination using writemask k (elements are copied from src when the corresponding mask bit is not set).
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_maskz_load_epi32
extern __m512i __cdecl _mm512_maskz_load_epi32(__mmask16 k, void const* mem_addr);
Load packed int32 elements from memory into destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
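To contrast writemask and zeromask behavior, the following sketch loads the same aligned block both ways and stores the results for inspection. The function names and the `fill` parameter are hypothetical; the scalar fallback path restates the per-element semantics described above and is used when the processor lacks Intel® AVX-512 support (detected with the GCC/Clang builtin `__builtin_cpu_supports`, an assumption of this example).

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* GCC/Clang extension (assumption): build only this function with AVX-512F. */
__attribute__((target("avx512f")))
static void masked_loads_avx512(const int32_t *aligned_mem, __mmask16 k,
                                int32_t fill, int32_t *merged, int32_t *zeroed)
{
    __m512i src = _mm512_set1_epi32(fill);
    /* Writemask: lanes with a clear mask bit keep the value from src. */
    __m512i m = _mm512_mask_load_epi32(src, k, aligned_mem);
    /* Zeromask: lanes with a clear mask bit become zero. */
    __m512i z = _mm512_maskz_load_epi32(k, aligned_mem);
    _mm512_storeu_si512(merged, m);
    _mm512_storeu_si512(zeroed, z);
}

/* Hypothetical helper: merged[] and zeroed[] each receive 16 elements. */
void masked_loads(const int32_t *aligned_mem, uint16_t k, int32_t fill,
                  int32_t *merged, int32_t *zeroed)
{
    assert(((uintptr_t)aligned_mem & 63u) == 0);
    if (__builtin_cpu_supports("avx512f")) {
        masked_loads_avx512(aligned_mem, (__mmask16)k, fill, merged, zeroed);
        return;
    }
    /* Scalar fallback with the same per-element behavior. */
    for (int i = 0; i < 16; i++) {
        int set = (k >> i) & 1;
        merged[i] = set ? aligned_mem[i] : fill;
        zeroed[i] = set ? aligned_mem[i] : 0;
    }
}
```

With `k = 0x00FF`, the low eight elements of both outputs come from memory, while the high eight are `fill` in `merged` and zero in `zeroed`.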
_mm512_load_epi64
extern __m512i __cdecl _mm512_load_epi64(void const* mem_addr);
Load 512 bits (composed of eight packed int64 elements) from memory into destination.
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_mask_load_epi64
extern __m512i __cdecl _mm512_mask_load_epi64(__m512i src, __mmask8 k, void const* mem_addr);
Load packed int64 elements from memory into destination using writemask k (elements are copied from src when the corresponding mask bit is not set).
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_maskz_load_epi64
extern __m512i __cdecl _mm512_maskz_load_epi64(__mmask8 k, void const* mem_addr);
Load packed int64 elements from memory into destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_mask_loadu_epi32
extern __m512i __cdecl _mm512_mask_loadu_epi32(__m512i src, __mmask16 k, void const* mem_addr);
Load packed int32 elements from memory into destination using writemask k (elements are copied from src when the corresponding mask bit is not set).
mem_addr does not need to be aligned on any particular boundary.
_mm512_maskz_loadu_epi32
extern __m512i __cdecl _mm512_maskz_loadu_epi32(__mmask16 k, void const* mem_addr);
Load packed int32 elements from memory into destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
mem_addr does not need to be aligned on any particular boundary.
_mm512_mask_loadu_epi64
extern __m512i __cdecl _mm512_mask_loadu_epi64(__m512i src, __mmask8 k, void const* mem_addr);
Load packed int64 elements from memory into destination using writemask k (elements are copied from src when the corresponding mask bit is not set).
mem_addr does not need to be aligned on any particular boundary.
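One common use of the masked unaligned load is handling the last, partial vector of an array whose length is not a multiple of eight. The sketch below computes the minimum of an int64 array; lanes beyond the end of the array are taken from `src`, which is set to `INT64_MAX` so they cannot affect the result. The function names are hypothetical, and the runtime dispatch relies on GCC/Clang extensions.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension (assumption): build only this function with AVX-512F. */
__attribute__((target("avx512f")))
static int64_t min_i64_avx512(const int64_t *p, size_t n)
{
    __m512i identity = _mm512_set1_epi64(INT64_MAX);
    __m512i acc = identity;
    size_t i = 0;
    for (; i + 8 <= n; i += 8)                      /* full vectors */
        acc = _mm512_min_epi64(acc, _mm512_loadu_si512(p + i));
    if (i < n) {                                    /* 1 to 7 leftover elements */
        __mmask8 k = (__mmask8)((1u << (n - i)) - 1u);
        /* Lanes with a clear mask bit are copied from identity. */
        acc = _mm512_min_epi64(acc, _mm512_mask_loadu_epi64(identity, k, p + i));
    }
    int64_t lanes[8];
    _mm512_storeu_si512(lanes, acc);
    int64_t m = lanes[0];
    for (int j = 1; j < 8; j++)
        if (lanes[j] < m) m = lanes[j];
    return m;
}

/* Hypothetical helper: returns INT64_MAX for an empty array. */
int64_t min_i64(const int64_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    if (__builtin_cpu_supports("avx512f"))
        return min_i64_avx512(p, n);
    int64_t m = INT64_MAX;                          /* scalar fallback */
    for (size_t i = 0; i < n; i++)
        if (p[i] < m) m = p[i];
    return m;
}
```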
_mm512_stream_load_si512
extern __m512i __cdecl _mm512_stream_load_si512(void * mem_addr);
Load 512 bits of integer data from memory into destination using a non-temporal memory hint.
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_store_epi32
extern void __cdecl _mm512_store_epi32(void* mem_addr, __m512i a);
Store 512 bits (composed of sixteen packed 32-bit integers) from a into memory.
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_mask_store_epi32
extern void __cdecl _mm512_mask_store_epi32(void* mem_addr, __mmask16 k, __m512i a);
Store packed int32 elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set).
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
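Because a masked store writes only the selected elements, it can update part of a block in place. The following sketch, with hypothetical function names, replaces the negative values in a 64-byte-aligned block of sixteen int32 elements with zero and leaves the other elements untouched. The mask is built with `_mm512_cmpgt_epi32_mask`, and the runtime check and `target` attribute are GCC/Clang extensions assumed for this example.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* GCC/Clang extension (assumption): build only this function with AVX-512F. */
__attribute__((target("avx512f")))
static void clamp_negatives_avx512(int32_t *aligned_block)
{
    __m512i v = _mm512_load_epi32(aligned_block);
    __m512i zero = _mm512_setzero_si512();
    /* Mask bit i is set when 0 > v[i], that is, when element i is negative. */
    __mmask16 neg = _mm512_cmpgt_epi32_mask(zero, v);
    /* Only the negative elements are overwritten; the rest stay in memory as is. */
    _mm512_mask_store_epi32(aligned_block, neg, zero);
}

/* Hypothetical helper operating on exactly 16 int32 elements. */
void clamp_negatives(int32_t *aligned_block)
{
    assert(((uintptr_t)aligned_block & 63u) == 0);
    if (__builtin_cpu_supports("avx512f")) {
        clamp_negatives_avx512(aligned_block);
        return;
    }
    for (int i = 0; i < 16; i++)                    /* scalar fallback */
        if (aligned_block[i] < 0) aligned_block[i] = 0;
}
```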
_mm512_store_si512
extern void __cdecl _mm512_store_si512(void* mem_addr, __m512i a);
Store 512 bits of integer data from a into memory.
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_store_epi64
extern void __cdecl _mm512_store_epi64(void* mem_addr, __m512i a);
Store 512 bits (composed of eight packed int64 elements) from a into memory.
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_mask_store_epi64
extern void __cdecl _mm512_mask_store_epi64(void* mem_addr, __mmask8 k, __m512i a);
Store packed int64 elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set).
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_mask_storeu_epi32
extern void __cdecl _mm512_mask_storeu_epi32(void* mem_addr, __mmask16 k, __m512i a);
Store packed int32 elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set).
mem_addr does not need to be aligned on any particular boundary.
_mm512_mask_storeu_epi64
extern void __cdecl _mm512_mask_storeu_epi64(void* mem_addr, __mmask8 k, __m512i a);
Store packed int64 elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set).
mem_addr does not need to be aligned on any particular boundary.
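Pairing a masked unaligned load with a masked unaligned store lets a loop process an array of any length without writing past its end. The sketch below adds a constant to every element of an int64 array; the function names are hypothetical and the runtime dispatch uses GCC/Clang extensions. It uses `_mm512_maskz_loadu_epi64`, which is listed in the table above.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension (assumption): build only this function with AVX-512F. */
__attribute__((target("avx512f")))
static void add_const_i64_avx512(int64_t *p, size_t n, int64_t c)
{
    __m512i vc = _mm512_set1_epi64(c);
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {                    /* full vectors */
        __m512i v = _mm512_loadu_si512(p + i);
        _mm512_storeu_si512(p + i, _mm512_add_epi64(v, vc));
    }
    if (i < n) {                                    /* 1 to 7 leftover elements */
        __mmask8 k = (__mmask8)((1u << (n - i)) - 1u);
        __m512i v = _mm512_maskz_loadu_epi64(k, p + i);
        /* Elements at index n and beyond are not written. */
        _mm512_mask_storeu_epi64(p + i, k, _mm512_add_epi64(v, vc));
    }
}

/* Hypothetical helper: p[i] += c for every i < n. */
void add_const_i64(int64_t *p, size_t n, int64_t c)
{
    assert(p != NULL || n == 0);
    if (__builtin_cpu_supports("avx512f")) {
        add_const_i64_avx512(p, n, c);
        return;
    }
    for (size_t i = 0; i < n; i++)                  /* scalar fallback */
        p[i] += c;
}
```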
_mm512_storeu_si512
extern void __cdecl _mm512_storeu_si512(void* mem_addr, __m512i a);
Store 512 bits of integer data from a into memory.
mem_addr does not need to be aligned on any particular boundary.
_mm512_stream_si512
extern void __cdecl _mm512_stream_si512(void* mem_addr, __m512i a);
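Store 512 bits of integer data from a into memory using a non-temporal memory hint.
mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.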
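A non-temporal store is typically used to fill a large buffer that will not be read again soon. The sketch below, with hypothetical function names, fills a 64-byte-aligned buffer in 64-byte blocks and then issues `_mm_sfence()`, since non-temporal stores are weakly ordered relative to other stores. Whether streaming stores actually help depends on buffer size and access pattern, so treat this as an illustration rather than a tuning recommendation. The runtime dispatch uses GCC/Clang extensions.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension (assumption): build only this function with AVX-512F. */
__attribute__((target("avx512f")))
static void fill_stream_avx512(int32_t *aligned_dst, size_t blocks, int32_t value)
{
    __m512i v = _mm512_set1_epi32(value);
    for (size_t b = 0; b < blocks; b++)
        _mm512_stream_si512((__m512i *)(aligned_dst + 16 * b), v);
    _mm_sfence(); /* order the streaming stores before later stores */
}

/* Hypothetical helper: writes blocks * 16 int32 elements starting at aligned_dst. */
void fill_stream(int32_t *aligned_dst, size_t blocks, int32_t value)
{
    assert(((uintptr_t)aligned_dst & 63u) == 0);
    if (__builtin_cpu_supports("avx512f")) {
        fill_stream_avx512(aligned_dst, blocks, value);
        return;
    }
    for (size_t i = 0; i < blocks * 16; i++)        /* scalar fallback */
        aligned_dst[i] = value;
}
```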