Visible to Intel only — GUID: GUID-9C4EC8FF-251F-4741-87FD-AF0DF8E9D3A5
Intrinsics for FP Insert and Extract Operations
The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
Intrinsic Name | Operation | Corresponding Instruction |
---|---|---|
_mm512_extractf32x4_ps, _mm512_mask_extractf32x4_ps, _mm512_maskz_extractf32x4_ps | Extract float32 values. | VEXTRACTF32X4 |
_mm512_extractf64x4_pd, _mm512_mask_extractf64x4_pd, _mm512_maskz_extractf64x4_pd | Extract float64 values. | VEXTRACTF64X4 |
_mm_extract_ps | Extract packed float32 values. | EXTRACTPS |
_mm512_getmant_pd, _mm512_mask_getmant_pd, _mm512_maskz_getmant_pd, _mm512_getmant_round_pd, _mm512_mask_getmant_round_pd, _mm512_maskz_getmant_round_pd | Extract float64 vector of normalized mantissas from float64 vector. | VGETMANTPD |
_mm512_getmant_ps, _mm512_mask_getmant_ps, _mm512_maskz_getmant_ps, _mm512_getmant_round_ps, _mm512_mask_getmant_round_ps, _mm512_maskz_getmant_round_ps | Extract float32 vector of normalized mantissas from float32 vector. | VGETMANTPS |
_mm_getmant_ss, _mm_mask_getmant_ss, _mm_maskz_getmant_ss, _mm_getmant_round_ss, _mm_mask_getmant_round_ss, _mm_maskz_getmant_round_ss | Extract float32 normalized mantissa from float32 scalar. | VGETMANTSS |
_mm_getmant_sd, _mm_mask_getmant_sd, _mm_maskz_getmant_sd, _mm_getmant_round_sd, _mm_mask_getmant_round_sd, _mm_maskz_getmant_round_sd | Extract float64 normalized mantissa from float64 scalar. | VGETMANTSD |
_mm512_insertf32x4, _mm512_mask_insertf32x4, _mm512_maskz_insertf32x4 | Insert float32 values. | VINSERTF32X4 |
_mm512_insertf64x4, _mm512_mask_insertf64x4, _mm512_maskz_insertf64x4 | Insert float64 values. | VINSERTF64X4 |
_mm_insert_ps | Insert scalar float32 values. | VINSERTPS/INSERTPS |
variable | definition |
---|---|
k | writemask used as a selector |
a | first source vector element |
b | second source vector element |
src | source element to use based on writemask result |
imm | 8-bit immediate integer that selects the 128-bit or 256-bit lane to extract from the source or to overwrite in the destination |
tmp | temporary storage location used during operation |
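interval | Normalization interval, of type _MM_MANTISSA_NORM_ENUM, which can be one of the following: _MM_MANT_NORM_1_2 (interval [1, 2)), _MM_MANT_NORM_p5_2 (interval [0.5, 2)), _MM_MANT_NORM_p5_1 (interval [0.5, 1)), _MM_MANT_NORM_p75_1p5 (interval [0.75, 1.5)) |
sign | Sign control, of type _MM_MANTISSA_SIGN_ENUM, which can be one of the following: _MM_MANT_SIGN_src (sign = sign(source)), _MM_MANT_SIGN_zero (sign = 0), _MM_MANT_SIGN_nan (result is NaN if the source sign bit is set) |
round | Rounding control values; these can be one of the following (along with the sae suppress all exceptions flag, _MM_FROUND_NO_EXC): _MM_FROUND_TO_NEAREST_INT, _MM_FROUND_TO_NEG_INF, _MM_FROUND_TO_POS_INF, _MM_FROUND_TO_ZERO, _MM_FROUND_CUR_DIRECTION |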
_mm512_extractf32x4_ps
extern __m128 __cdecl _mm512_extractf32x4_ps(__m512 a, int imm);
Extracts 128 bits (composed of four packed float32 elements) from a, selected with imm, and stores the result.
_mm512_mask_extractf32x4_ps
extern __m128 __cdecl _mm512_mask_extractf32x4_ps(__m128 src, __mmask8 k, __m512 a, int imm);
Extracts 128 bits (composed of four packed float32 elements) from a, selected with imm, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_maskz_extractf32x4_ps
extern __m128 __cdecl _mm512_maskz_extractf32x4_ps(__mmask8 k, __m512 a, int imm);
Extracts 128 bits (composed of four packed float32 elements) from a, selected with imm, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
_mm512_extractf64x4_pd
extern __m256d __cdecl _mm512_extractf64x4_pd(__m512d a, int imm);
Extracts 256 bits (composed of four packed float64 elements) from a, selected with imm, and stores the result.
_mm512_mask_extractf64x4_pd
extern __m256d __cdecl _mm512_mask_extractf64x4_pd(__m256d src, __mmask8 k, __m512d a, int imm);
Extracts 256 bits (composed of four packed float64 elements) from a, selected with imm, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_maskz_extractf64x4_pd
extern __m256d __cdecl _mm512_maskz_extractf64x4_pd(__mmask8 k, __m512d a, int imm);
Extracts 256 bits (composed of four packed float64 elements) from a, selected with imm, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
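The writemask behavior of the extract intrinsics can be illustrated with a minimal sketch. The helper names (extract_lane2, extract_lane2_avx512, extract_lane2_scalar) are hypothetical, and the `target` attribute and `__builtin_cpu_supports` are GCC/Clang extensions assumed here so that the file builds without extra flags and falls back to plain C on processors without Intel® AVX-512 support. With imm set to 2, the selected 128 bits correspond to float32 elements 8 through 11 of a.

```c
#include <assert.h>
#include <immintrin.h>

/* AVX-512 path: extracts lane 2 (elements 8..11) under writemask k.
   The imm argument must be a compile-time constant. */
__attribute__((target("avx512f")))
static void extract_lane2_avx512(const float in[16], const float fallback[4],
                                 unsigned char k, float out[4])
{
    __m512 a   = _mm512_loadu_ps(in);
    __m128 src = _mm_loadu_ps(fallback);
    __m128 r   = _mm512_mask_extractf32x4_ps(src, (__mmask8)k, a, 2);
    _mm_storeu_ps(out, r);
}

/* Plain C reference with the same semantics, used when AVX-512F is absent. */
static void extract_lane2_scalar(const float in[16], const float fallback[4],
                                 unsigned char k, float out[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = ((k >> i) & 1) ? in[8 + i] : fallback[i];
}

/* Hypothetical entry point: picks the AVX-512 path at run time if available. */
void extract_lane2(const float in[16], const float fallback[4],
                   unsigned char k, float out[4])
{
    assert(in && fallback && out);
    if (__builtin_cpu_supports("avx512f"))
        extract_lane2_avx512(in, fallback, k, out);
    else
        extract_lane2_scalar(in, fallback, k, out);
}
```

Only the low four bits of k matter here, because the 128-bit result holds four float32 elements.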
_mm512_insertf32x4
extern __m512 __cdecl _mm512_insertf32x4(__m512 a, __m128 b, int imm);
Copies a to destination, then inserts 128 bits (composed of four packed float32 elements) from b into destination at the location specified by imm.
_mm512_mask_insertf32x4
extern __m512 __cdecl _mm512_mask_insertf32x4(__m512 src, __mmask16 k, __m512 a, __m128 b, int imm);
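Copies a to tmp, then inserts 128 bits (composed of four packed float32 elements) from b into tmp at the location specified by imm. Stores tmp to destination using writemask k (elements are copied from src when the corresponding mask bit is not set).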
_mm512_maskz_insertf32x4
extern __m512 __cdecl _mm512_maskz_insertf32x4(__mmask16 k, __m512 a, __m128 b, int imm);
Copies a to tmp, then inserts 128 bits (composed of four packed float32 elements) from b into tmp at the location specified by imm. Stores tmp to destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
_mm512_insertf64x4
extern __m512d __cdecl _mm512_insertf64x4(__m512d a, __m256d b, int imm);
Copies a to destination, then inserts 256 bits (composed of four packed float64 elements) from b into destination at the location specified by imm.
_mm512_mask_insertf64x4
extern __m512d __cdecl _mm512_mask_insertf64x4(__m512d src, __mmask8 k, __m512d a, __m256d b, int imm);
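Copies a to tmp, then inserts 256 bits (composed of four packed float64 elements) from b into tmp at the location specified by imm. Stores tmp to destination using writemask k (elements are copied from src when the corresponding mask bit is not set).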
_mm512_maskz_insertf64x4
extern __m512d __cdecl _mm512_maskz_insertf64x4(__mmask8 k, __m512d a, __m256d b, int imm);
Copies a to tmp, then inserts 256 bits (composed of four packed float64 elements) from b into tmp at the location specified by imm. Stores tmp to destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
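The copy, insert, then mask sequence used by the masked insert intrinsics can be sketched as follows. The helper names (insert_hi, insert_hi_avx512, insert_hi_scalar) are hypothetical, and the `target` attribute and `__builtin_cpu_supports` are GCC/Clang extensions assumed here for run-time dispatch. With imm set to 1, the 256 bits from b replace float64 elements 4 through 7 of the temporary copy of a.

```c
#include <assert.h>
#include <immintrin.h>

/* AVX-512 path: tmp = a with its upper half replaced by b, then each
   element of the result comes from tmp (mask bit set) or src (bit clear). */
__attribute__((target("avx512f")))
static void insert_hi_avx512(const double src[8], unsigned char k,
                             const double a[8], const double b[4],
                             double out[8])
{
    __m512d vsrc = _mm512_loadu_pd(src);
    __m512d va   = _mm512_loadu_pd(a);
    __m256d vb   = _mm256_loadu_pd(b);
    __m512d r    = _mm512_mask_insertf64x4(vsrc, (__mmask8)k, va, vb, 1);
    _mm512_storeu_pd(out, r);
}

/* Plain C reference with the same semantics, used when AVX-512F is absent. */
static void insert_hi_scalar(const double src[8], unsigned char k,
                             const double a[8], const double b[4],
                             double out[8])
{
    for (int i = 0; i < 8; ++i) {
        double tmp = (i >= 4) ? b[i - 4] : a[i];
        out[i] = ((k >> i) & 1) ? tmp : src[i];
    }
}

/* Hypothetical entry point: picks the AVX-512 path at run time if available. */
void insert_hi(const double src[8], unsigned char k,
               const double a[8], const double b[4], double out[8])
{
    assert(src && a && b && out);
    if (__builtin_cpu_supports("avx512f"))
        insert_hi_avx512(src, k, a, b, out);
    else
        insert_hi_scalar(src, k, a, b, out);
}
```

Note that the mask applies to all eight elements of the result, not only to the inserted half.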
_mm512_getmant_pd
extern __m512d __cdecl _mm512_getmant_pd(__m512d a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissas of packed float64 elements in a, and stores the result. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_mask_getmant_pd
extern __m512d __cdecl _mm512_mask_getmant_pd(__m512d src, __mmask8 k, __m512d a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissas of packed float64 elements in a, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_maskz_getmant_pd
extern __m512d __cdecl _mm512_maskz_getmant_pd(__mmask8 k, __m512d a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissas of packed float64 elements in a, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_getmant_round_pd
extern __m512d __cdecl _mm512_getmant_round_pd(__m512d a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissas of packed float64 elements in a, and stores the result. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_mask_getmant_round_pd
extern __m512d __cdecl _mm512_mask_getmant_round_pd(__m512d src, __mmask8 k, __m512d a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissas of packed float64 elements in a, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_maskz_getmant_round_pd
extern __m512d __cdecl _mm512_maskz_getmant_round_pd(__mmask8 k, __m512d a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissas of packed float64 elements in a, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
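As a rough illustration of the ±(2^k)*|x.significand| formula, the sketch below normalizes eight float64 values into the interval [1, 2) while keeping the source sign. The helper names (getmant_1_2, getmant_1_2_avx512, norm_1_2) are hypothetical, the `target` attribute and `__builtin_cpu_supports` are GCC/Clang extensions assumed here for run-time dispatch, and the scalar reference only handles finite, nonzero, normal inputs. For example, 10.0 = 1.25 * 2^3, so its normalized mantissa is 1.25.

```c
#include <assert.h>
#include <immintrin.h>

/* AVX-512 path: mantissas normalized to [1, 2), sign taken from the source. */
__attribute__((target("avx512f")))
static void getmant_1_2_avx512(const double in[8], double out[8])
{
    __m512d a = _mm512_loadu_pd(in);
    __m512d r = _mm512_getmant_pd(a, _MM_MANT_NORM_1_2, _MM_MANT_SIGN_src);
    _mm512_storeu_pd(out, r);
}

/* Plain C reference; valid only for finite, nonzero, normal values. */
static double norm_1_2(double x)
{
    double m = (x < 0.0) ? -x : x;
    while (m >= 2.0) m *= 0.5;   /* scale down by powers of two */
    while (m < 1.0)  m *= 2.0;   /* scale up by powers of two */
    return (x < 0.0) ? -m : m;
}

/* Hypothetical entry point: picks the AVX-512 path at run time if available. */
void getmant_1_2(const double in[8], double out[8])
{
    assert(in && out);
    if (__builtin_cpu_supports("avx512f")) {
        getmant_1_2_avx512(in, out);
    } else {
        for (int i = 0; i < 8; ++i)
            out[i] = norm_1_2(in[i]);
    }
}
```

Choosing a different interval value, such as _MM_MANT_NORM_p5_1, changes only the power of two applied; the significand bits are the same.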
_mm512_getmant_ps
extern __m512 __cdecl _mm512_getmant_ps(__m512 a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissas of packed float32 elements in a, and stores the result. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_mask_getmant_ps
extern __m512 __cdecl _mm512_mask_getmant_ps(__m512 src, __mmask16 k, __m512 a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissas of packed float32 elements in a, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_maskz_getmant_ps
extern __m512 __cdecl _mm512_maskz_getmant_ps(__mmask16 k, __m512 a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissas of packed float32 elements in a, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_getmant_round_ps
extern __m512 __cdecl _mm512_getmant_round_ps(__m512 a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissas of packed float32 elements in a, and stores the result. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_mask_getmant_round_ps
extern __m512 __cdecl _mm512_mask_getmant_round_ps(__m512 src, __mmask16 k, __m512 a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissas of packed float32 elements in a, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm512_maskz_getmant_round_ps
extern __m512 __cdecl _mm512_maskz_getmant_round_ps(__mmask16 k, __m512 a, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissas of packed float32 elements in a, and stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
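The zeromask and round variants can be combined as in the following sketch, which normalizes sixteen float32 values into [0.5, 1), forces a positive sign, and suppresses exceptions with _MM_FROUND_NO_EXC. The helper names (getmant_half, getmant_half_avx512, norm_half_abs) are hypothetical, the `target` attribute and `__builtin_cpu_supports` are GCC/Clang extensions assumed here for run-time dispatch, and the scalar reference only handles finite, nonzero, normal inputs.

```c
#include <assert.h>
#include <immintrin.h>

/* AVX-512 path: elements whose mask bit is clear are zeroed in the result. */
__attribute__((target("avx512f")))
static void getmant_half_avx512(const float in[16], unsigned short k,
                                float out[16])
{
    __m512 a = _mm512_loadu_ps(in);
    __m512 r = _mm512_maskz_getmant_round_ps((__mmask16)k, a,
                                             _MM_MANT_NORM_p5_1,
                                             _MM_MANT_SIGN_zero,
                                             _MM_FROUND_NO_EXC);
    _mm512_storeu_ps(out, r);
}

/* Plain C reference: |x| scaled into [0.5, 1); finite, nonzero, normal only. */
static float norm_half_abs(float x)
{
    float m = (x < 0.0f) ? -x : x;
    while (m >= 1.0f) m *= 0.5f;
    while (m < 0.5f)  m *= 2.0f;
    return m;
}

/* Hypothetical entry point: picks the AVX-512 path at run time if available. */
void getmant_half(const float in[16], unsigned short k, float out[16])
{
    assert(in && out);
    if (__builtin_cpu_supports("avx512f")) {
        getmant_half_avx512(in, k, out);
    } else {
        for (int i = 0; i < 16; ++i)
            out[i] = ((k >> i) & 1) ? norm_half_abs(in[i]) : 0.0f;
    }
}
```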
_mm_getmant_round_sd
extern __m128d __cdecl _mm_getmant_round_sd(__m128d a, __m128d b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissa of the lower float64 element in b, stores the result in the lower destination element, and copies the upper element from a to the upper destination element. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_mask_getmant_round_sd
extern __m128d __cdecl _mm_mask_getmant_round_sd(__m128d src, __mmask8 k, __m128d a, __m128d b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissa of the lower float64 element in b, stores the result in the lower destination element using writemask k (the element is copied from src when mask bit 0 is not set), and copies the upper element from a to the upper destination element. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_maskz_getmant_round_sd
extern __m128d __cdecl _mm_maskz_getmant_round_sd(__mmask8 k, __m128d a, __m128d b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissa of the lower float64 element in b, stores the result in the lower destination element using zeromask k (the element is zeroed out when mask bit 0 is not set), and copies the upper element from a to the upper destination element. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_getmant_sd
extern __m128d __cdecl _mm_getmant_sd(__m128d a, __m128d b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissa of the lower float64 element in b, stores the result in the lower destination element, and copies the upper element from a to the upper destination element. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_mask_getmant_sd
extern __m128d __cdecl _mm_mask_getmant_sd(__m128d src, __mmask8 k, __m128d a, __m128d b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissa of the lower float64 element in b, stores the result in the lower destination element using writemask k (the element is copied from src when mask bit 0 is not set), and copies the upper element from a to the upper destination element. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_maskz_getmant_sd
extern __m128d __cdecl _mm_maskz_getmant_sd(__mmask8 k, __m128d a, __m128d b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissa of the lower float64 element in b, stores the result in the lower destination element using zeromask k (the element is zeroed out when mask bit 0 is not set), and copies the upper element from a to the upper destination element. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_getmant_round_ss
extern __m128 __cdecl _mm_getmant_round_ss(__m128 a, __m128 b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissa of the lower float32 element in b, stores the result in the lower destination element, and copies the upper three packed elements from a to the upper destination elements. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_mask_getmant_round_ss
extern __m128 __cdecl _mm_mask_getmant_round_ss(__m128 src, __mmask8 k, __m128 a, __m128 b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissa of the lower float32 element in b, stores the result in the lower destination element using writemask k (the element is copied from src when mask bit 0 is not set), and copies the upper three packed elements from a to the upper destination elements. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_maskz_getmant_round_ss
extern __m128 __cdecl _mm_maskz_getmant_round_ss(__mmask8 k, __m128 a, __m128 b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign, int round);
Normalizes the mantissa of the lower float32 element in b, stores the result in the lower destination element using zeromask k (the element is zeroed out when mask bit 0 is not set), and copies the upper three packed elements from a to the upper destination elements. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_getmant_ss
extern __m128 __cdecl _mm_getmant_ss(__m128 a, __m128 b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissa of the lower float32 element in b, stores the result in the lower destination element, and copies the upper three packed elements from a to the upper destination elements. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_mask_getmant_ss
extern __m128 __cdecl _mm_mask_getmant_ss(__m128 src, __mmask8 k, __m128 a, __m128 b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissa of the lower float32 element in b, stores the result in the lower destination element using writemask k (the element is copied from src when mask bit 0 is not set), and copies the upper three packed elements from a to the upper destination elements. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
_mm_maskz_getmant_ss
extern __m128 __cdecl _mm_maskz_getmant_ss(__mmask8 k, __m128 a, __m128 b, _MM_MANTISSA_NORM_ENUM interval, _MM_MANTISSA_SIGN_ENUM sign);
Normalizes the mantissa of the lower float32 element in b, stores the result in the lower destination element using zeromask k (the element is zeroed out when mask bit 0 is not set), and copies the upper three packed elements from a to the upper destination elements. This intrinsic essentially calculates ±(2^k)*|x.significand|, where k depends on the interval range defined by interval and the sign depends on sign and the source sign.
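The roles of the two vector operands in the scalar forms can be checked with a small sketch. It assumes the VGETMANTSS operand order, in which the lowest element of the second vector argument is normalized and the upper three elements are passed through from the first. The helper names (getmant_low, getmant_low_avx512, norm_1_2f) are hypothetical, the `target` attribute and `__builtin_cpu_supports` are GCC/Clang extensions assumed here for run-time dispatch, and the scalar reference only handles finite, nonzero, normal inputs.

```c
#include <assert.h>
#include <immintrin.h>

/* AVX-512 path: out[0] is the [1, 2) mantissa of b[0] with b[0]'s sign;
   out[1..3] are copied from a[1..3]. */
__attribute__((target("avx512f")))
static void getmant_low_avx512(const float a[4], const float b[4],
                               float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    __m128 r  = _mm_getmant_ss(va, vb, _MM_MANT_NORM_1_2, _MM_MANT_SIGN_src);
    _mm_storeu_ps(out, r);
}

/* Plain C reference; valid only for finite, nonzero, normal values. */
static float norm_1_2f(float x)
{
    float m = (x < 0.0f) ? -x : x;
    while (m >= 2.0f) m *= 0.5f;
    while (m < 1.0f)  m *= 2.0f;
    return (x < 0.0f) ? -m : m;
}

/* Hypothetical entry point: picks the AVX-512 path at run time if available. */
void getmant_low(const float a[4], const float b[4], float out[4])
{
    assert(a && b && out);
    if (__builtin_cpu_supports("avx512f")) {
        getmant_low_avx512(a, b, out);
    } else {
        out[0] = norm_1_2f(b[0]);
        for (int i = 1; i < 4; ++i)
            out[i] = a[i];
    }
}
```

For instance, with b[0] = -12.0f the lowest result element is -1.5f, since -12 = -1.5 * 2^3.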