Intrinsics for FP Shuffle Operations
The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction |
---|---|---|
_mm512_shuffle_pd, _mm512_mask_shuffle_pd, _mm512_maskz_shuffle_pd | Shuffle float64 values within 128-bit lanes. | VSHUFPD |
_mm512_shuffle_ps, _mm512_mask_shuffle_ps, _mm512_maskz_shuffle_ps | Shuffle float32 values within 128-bit lanes. | VSHUFPS |
_mm512_shuffle_f64x2, _mm512_mask_shuffle_f64x2, _mm512_maskz_shuffle_f64x2 | Shuffle 128-bit blocks of two float64 values. | VSHUFF64X2 |
_mm512_shuffle_f32x4, _mm512_mask_shuffle_f32x4, _mm512_maskz_shuffle_f32x4 | Shuffle 128-bit blocks of four float32 values. | VSHUFF32X4 |
variable | definition |
---|---|
k | writemask used as a selector |
a | first source vector element |
b | second source vector element |
src | source element to use based on writemask result |
imm | vector element selector |
_mm512_shuffle_f32x4
extern __m512 __cdecl _mm512_shuffle_f32x4(__m512 a, __m512 b, const int imm);
Shuffles 128-bit blocks (each composed of four float32 elements) selected by imm from a and b, and stores the result.
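The block selection can be illustrated with a scalar model in plain C. This is a sketch, not the intrinsic itself: the function name model_shuffle_f32x4 and the array representation (16 floats, element 0 in the lowest 32 bits) are assumptions made for illustration. It follows the VSHUFF32X4 behavior in which the two low result blocks are taken from a and the two high result blocks from b, each chosen by a 2-bit field of imm.

```c
#include <assert.h>
#include <string.h>

/* Scalar model (hypothetical helper, not part of immintrin.h).
   A 512-bit vector is modeled as 16 floats; block n is elements 4n..4n+3. */
void model_shuffle_f32x4(float dst[16], const float a[16],
                         const float b[16], int imm)
{
    assert(imm >= 0 && imm <= 255);   /* only the low 8 bits are meaningful */

    float tmp[16];                    /* scratch copy so dst may alias a or b */
    for (int blk = 0; blk < 4; ++blk) {
        /* Result blocks 0 and 1 read from a, blocks 2 and 3 read from b. */
        const float *src = (blk < 2) ? a : b;
        /* Bits [2*blk+1 : 2*blk] of imm pick one of the four source blocks. */
        int sel = (imm >> (2 * blk)) & 3;
        memcpy(&tmp[4 * blk], &src[4 * sel], 4 * sizeof(float));
    }
    memcpy(dst, tmp, sizeof tmp);
}
```

With imm = 0x4E, for example, the model places blocks 2 and 3 of a in the low half of the result and blocks 0 and 1 of b in the high half.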
_mm512_mask_shuffle_f32x4
extern __m512 __cdecl _mm512_mask_shuffle_f32x4(__m512 src, __mmask16 k, __m512 a, __m512 b, const int imm);
Shuffles 128-bit blocks (each composed of four float32 elements) selected by imm from a and b, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_maskz_shuffle_f32x4
extern __m512 __cdecl _mm512_maskz_shuffle_f32x4(__mmask16 k, __m512 a, __m512 b, const int imm);
Shuffles 128-bit blocks (each composed of four float32 elements) selected by imm from a and b, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
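The difference between the writemask and zeromask forms can be sketched with two scalar helpers. Both names (model_apply_writemask16, model_apply_zeromask16) are hypothetical, and the sketch assumes the shuffle result has already been computed into a 16-float array, with bit i of k controlling element i.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writemask model: keep the shuffled element where the mask bit is set,
   otherwise pass through the element from src. */
void model_apply_writemask16(float dst[16], const float src[16],
                             uint16_t k, const float shuffled[16])
{
    assert(dst != NULL && src != NULL && shuffled != NULL);
    for (int i = 0; i < 16; ++i)
        dst[i] = ((k >> i) & 1) ? shuffled[i] : src[i];
}

/* Zeromask model: keep the shuffled element where the mask bit is set,
   otherwise write 0.0f. */
void model_apply_zeromask16(float dst[16], uint16_t k,
                            const float shuffled[16])
{
    assert(dst != NULL && shuffled != NULL);
    for (int i = 0; i < 16; ++i)
        dst[i] = ((k >> i) & 1) ? shuffled[i] : 0.0f;
}
```

The mask is applied per float32 element, not per 128-bit block, so a single block of the shuffled result can be partially kept.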
_mm512_shuffle_f64x2
extern __m512d __cdecl _mm512_shuffle_f64x2(__m512d a, __m512d b, const int imm);
Shuffles 128-bit blocks (each composed of two float64 elements) selected by imm from a and b, and stores the result.
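A scalar model may help in reading the imm encoding for the float64 variant. The sketch below is an illustration under assumptions, not the intrinsic: model_shuffle_f64x2 is a hypothetical name, and a 512-bit vector is modeled as 8 doubles with element 0 in the lowest 64 bits. As with VSHUFF64X2, the two low result blocks come from a and the two high result blocks from b.

```c
#include <assert.h>
#include <string.h>

/* Scalar model (hypothetical helper, not part of immintrin.h).
   Block n of a vector is elements 2n and 2n+1. */
void model_shuffle_f64x2(double dst[8], const double a[8],
                         const double b[8], int imm)
{
    assert(imm >= 0 && imm <= 255);   /* only the low 8 bits are meaningful */

    double tmp[8];                    /* scratch copy so dst may alias a or b */
    for (int blk = 0; blk < 4; ++blk) {
        /* Result blocks 0 and 1 read from a, blocks 2 and 3 read from b. */
        const double *src = (blk < 2) ? a : b;
        /* Each 2-bit field of imm selects one of the four source blocks. */
        int sel = (imm >> (2 * blk)) & 3;
        memcpy(&tmp[2 * blk], &src[2 * sel], 2 * sizeof(double));
    }
    memcpy(dst, tmp, sizeof tmp);
}
```

For instance, imm = 0x1B selects blocks 3 and 2 of a for the low half and blocks 1 and 0 of b for the high half, which reverses the block order within each source.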
_mm512_mask_shuffle_f64x2
extern __m512d __cdecl _mm512_mask_shuffle_f64x2(__m512d src, __mmask8 k, __m512d a, __m512d b, const int imm);
Shuffles 128-bit blocks (each composed of two float64 elements) selected by imm from a and b, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_maskz_shuffle_f64x2
extern __m512d __cdecl _mm512_maskz_shuffle_f64x2(__mmask8 k, __m512d a, __m512d b, const int imm);
Shuffles 128-bit blocks (each composed of two float64 elements) selected by imm from a and b, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
_mm512_shuffle_pd
extern __m512d __cdecl _mm512_shuffle_pd(__m512d a, __m512d b, const int imm);
Shuffles float64 elements from vectors a and b within 128-bit lanes using the control in imm, and stores the result.
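The per-lane control can be sketched with a scalar model. The name model_shuffle_pd and the 8-double array representation are assumptions for illustration. The sketch follows the VSHUFPD behavior in which each result element has its own control bit in imm: even result elements are taken from a, odd result elements from b, and the bit chooses the lower or upper float64 of the same 128-bit lane.

```c
#include <assert.h>
#include <string.h>

/* Scalar model (hypothetical helper, not part of immintrin.h).
   A 512-bit vector is modeled as 8 doubles; lane n is elements 2n and 2n+1. */
void model_shuffle_pd(double dst[8], const double a[8],
                      const double b[8], int imm)
{
    assert(imm >= 0 && imm <= 255);   /* one control bit per result element */

    double tmp[8];                    /* scratch copy so dst may alias a or b */
    for (int i = 0; i < 8; ++i) {
        const double *src = (i & 1) ? b : a;  /* even from a, odd from b */
        int lane_base = i & ~1;               /* first element of this lane */
        int pick = (imm >> i) & 1;            /* 0 = lower, 1 = upper element */
        tmp[i] = src[lane_base + pick];
    }
    memcpy(dst, tmp, sizeof tmp);
}
```

Elements never cross a 128-bit lane boundary in this model: result element i only reads from the lane that contains i.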
_mm512_mask_shuffle_pd
extern __m512d __cdecl _mm512_mask_shuffle_pd(__m512d src, __mmask8 k, __m512d a, __m512d b, const int imm);
Shuffles float64 elements from vectors a and b within 128-bit lanes using the control in imm, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_maskz_shuffle_pd
extern __m512d __cdecl _mm512_maskz_shuffle_pd(__mmask8 k, __m512d a, __m512d b, const int imm);
Shuffles float64 elements from vectors a and b within 128-bit lanes using the control in imm, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
_mm512_shuffle_ps
extern __m512 __cdecl _mm512_shuffle_ps(__m512 a, __m512 b, const int imm);
Shuffles float32 elements from vectors a and b within 128-bit lanes using the control in imm, and stores the result.
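A scalar model shows how one imm value is reused in every lane. The name model_shuffle_ps and the 16-float array representation are assumptions for illustration. The sketch follows the VSHUFPS behavior in which, within each 128-bit lane, the two low result elements are selected from a and the two high result elements from b, each by a 2-bit field of imm.

```c
#include <assert.h>
#include <string.h>

/* Scalar model (hypothetical helper, not part of immintrin.h).
   A 512-bit vector is modeled as 16 floats; lane n is elements 4n..4n+3. */
void model_shuffle_ps(float dst[16], const float a[16],
                      const float b[16], int imm)
{
    assert(imm >= 0 && imm <= 255);   /* four 2-bit selectors */

    float tmp[16];                    /* scratch copy so dst may alias a or b */
    for (int lane = 0; lane < 4; ++lane) {
        int base = 4 * lane;          /* the same imm applies to every lane */
        tmp[base + 0] = a[base + ((imm >> 0) & 3)];
        tmp[base + 1] = a[base + ((imm >> 2) & 3)];
        tmp[base + 2] = b[base + ((imm >> 4) & 3)];
        tmp[base + 3] = b[base + ((imm >> 6) & 3)];
    }
    memcpy(dst, tmp, sizeof tmp);
}
```

With imm = 0x1B, each lane of the result holds elements 3 and 2 of the matching lane of a, followed by elements 1 and 0 of the matching lane of b.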
_mm512_mask_shuffle_ps
extern __m512 __cdecl _mm512_mask_shuffle_ps(__m512 src, __mmask16 k, __m512 a, __m512 b, const int imm);
Shuffles float32 elements from vectors a and b within 128-bit lanes using the control in imm, and stores the result using writemask k.
Elements are copied from src when the corresponding mask bit is not set.
_mm512_maskz_shuffle_ps
extern __m512 __cdecl _mm512_maskz_shuffle_ps(__mmask16 k, __m512 a, __m512 b, const int imm);
Shuffles float32 elements from vectors a and b within 128-bit lanes using the control in imm, and stores the result using zeromask k.
Elements are zeroed out when the corresponding mask bit is not set.