Intrinsics for Rounding Operations (512-bit)

Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249

Date 12/16/2022

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>

Intrinsic Name	Operation	Corresponding Intel® AVX-512 Instruction
`_mm512_ceil_pd`, `_mm512_mask_ceil_pd`	Rounds float64 vector elements to nearest upper integer.	None.
`_mm512_ceil_ps`, `_mm512_mask_ceil_ps`	Rounds float32 vector elements to nearest upper integer.	None.
`_mm512_floor_pd`, `_mm512_mask_floor_pd`	Rounds float64 vector elements to nearest lower integer.	None.
`_mm512_floor_ps`, `_mm512_mask_floor_ps`	Rounds float32 vector elements to nearest lower integer.	None.
`_mm512_nearbyint_pd`, `_mm512_mask_nearbyint_pd`	Rounds float64 vector elements to nearest integer in floating point format.	None.
`_mm512_nearbyint_ps`, `_mm512_mask_nearbyint_ps`	Rounds float32 vector elements to nearest integer in floating point format.	None.
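`_mm512_rint_pd`, `_mm512_mask_rint_pd`	Rounds float64 vector elements to nearest integer, with halfway cases rounded to the nearest even integer.	None.
`_mm512_rint_ps`, `_mm512_mask_rint_ps`	Rounds float32 vector elements to nearest integer, with halfway cases rounded to the nearest even integer.	None.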
`_mm512_svml_round_pd`, `_mm512_mask_svml_round_pd`	Rounds float64 vector elements to nearest integer.	None.
`_mm512_trunc_pd`, `_mm512_mask_trunc_pd`	Rounds float64 vector elements to nearest integer not larger in absolute value.	None.
`_mm512_trunc_ps`, `_mm512_mask_trunc_ps`	Rounds float32 vector elements to nearest integer not larger in absolute value.	None.

variable	definition
`k`	writemask used as a selector
`a`	first source vector element
`src`	source vector from which elements are copied when the corresponding writemask bit is not set
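
The merge step shared by every masked intrinsic on this page can be sketched in scalar C. This is an illustrative model only, not Intel's implementation: the helper name `merge_pd_model` and the `rounded` array (standing in for the already rounded elements of `a`) are assumptions made for the example. Bit i of `k` controls element i.

```c
#include <stdint.h>

enum { LANES_PD = 8 };  /* a 512-bit vector holds eight float64 elements */

/* Hypothetical scalar model of the writemask merge:
   element i takes the rounded value when bit i of k is set,
   otherwise it is copied unchanged from src. */
static void merge_pd_model(double dst[LANES_PD], const double src[LANES_PD],
                           uint8_t k, const double rounded[LANES_PD])
{
    for (int i = 0; i < LANES_PD; ++i) {
        int bit_set = (k >> i) & 1;
        dst[i] = bit_set ? rounded[i] : src[i];
    }
}
```

With `k = 0x05`, for instance, only elements 0 and 2 receive rounded values and the other six are taken from `src`.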

_mm512_ceil_pd

extern __m512d __cdecl _mm512_ceil_pd(__m512d a);

Rounds off the elements of float64 vector a to the nearest upper integer value.

_mm512_mask_ceil_pd

extern __m512d __cdecl _mm512_mask_ceil_pd(__m512d src, __mmask8 k, __m512d a);

Rounds off the elements of float64 vector a to the nearest upper integer value, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
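
A minimal usage sketch of the masked form follows. It assumes the build enables AVX-512 so that the compiler defines `__AVX512F__`, and that the toolchain provides `_mm512_mask_ceil_pd`. The helper name `ceil_upper_half` is hypothetical, `_mm512_loadu_pd` and `_mm512_storeu_pd` are the usual unaligned AVX-512 load and store intrinsics (not described on this page), and the scalar fallback is only a stand-in for builds without AVX-512.

```c
#include <stdint.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

#if !defined(__AVX512F__)
/* Scalar stand-in used only by the fallback path; an assumption of this
   sketch, valid only for finite x whose magnitude fits in a long long. */
static double ceil_scalar(double x)
{
    double t = (double)(long long)x;   /* the cast truncates toward zero */
    return (t < x) ? t + 1.0 : t;
}
#endif

/* Hypothetical helper: round elements 4..7 of a up to an integer value
   and keep elements 0..3 from src. */
static void ceil_upper_half(double dst[8], const double src[8],
                            const double a[8])
{
    const uint8_t k = 0xF0;            /* bits 4..7 set, bits 0..3 clear */
#if defined(__AVX512F__)
    __m512d vsrc = _mm512_loadu_pd(src);
    __m512d va   = _mm512_loadu_pd(a);
    __m512d r    = _mm512_mask_ceil_pd(vsrc, (__mmask8)k, va);
    _mm512_storeu_pd(dst, r);
#else
    for (int i = 0; i < 8; ++i)
        dst[i] = ((k >> i) & 1) ? ceil_scalar(a[i]) : src[i];
#endif
}
```

Passing the same array as both `src` and `a` gives an update of the selected elements that leaves the rest of the data unchanged.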

_mm512_ceil_ps

extern __m512 __cdecl _mm512_ceil_ps(__m512 a);

Rounds off the elements of float32 vector a to the nearest upper integer value.

_mm512_mask_ceil_ps

extern __m512 __cdecl _mm512_mask_ceil_ps(__m512 src, __mmask16 k, __m512 a);

Rounds off the elements of float32 vector a to the nearest upper integer value, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).

_mm512_floor_pd

extern __m512d __cdecl _mm512_floor_pd(__m512d a);

Rounds off the elements of float64 vector a to the nearest lower integer value.

_mm512_mask_floor_pd

extern __m512d __cdecl _mm512_mask_floor_pd(__m512d src, __mmask8 k, __m512d a);

Rounds off the elements of float64 vector a to the nearest lower integer value, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).

_mm512_floor_ps

extern __m512 __cdecl _mm512_floor_ps(__m512 a);

Rounds off the elements of float32 vector a to the nearest lower integer value.

_mm512_mask_floor_ps

extern __m512 __cdecl _mm512_mask_floor_ps(__m512 src, __mmask16 k, __m512 a);

Rounds off the elements of float32 vector a to the nearest lower integer value, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
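
The float32 forms operate on sixteen elements, which is why they take an `__mmask16` rather than an `__mmask8`. The scalar sketch below models `_mm512_mask_floor_ps` under that reading. The names `floor_scalar` and `mask_floor_ps_model` are hypothetical, and the cast-based floor is an assumption that holds only for finite inputs whose magnitude fits in a `long long`.

```c
#include <stdint.h>

enum { LANES_PS = 16 };  /* a 512-bit vector holds sixteen float32 elements */

/* Scalar stand-in for rounding down; not the intrinsic's implementation. */
static float floor_scalar(float x)
{
    float t = (float)(long long)x;     /* the cast truncates toward zero */
    return (t > x) ? t - 1.0f : t;     /* step down for negative fractions */
}

/* Hypothetical model: element i is floor(a[i]) when bit i of the 16-bit
   mask k is set, otherwise it is copied from src. */
static void mask_floor_ps_model(float dst[LANES_PS], const float src[LANES_PS],
                                uint16_t k, const float a[LANES_PS])
{
    for (int i = 0; i < LANES_PS; ++i)
        dst[i] = ((k >> i) & 1) ? floor_scalar(a[i]) : src[i];
}
```

Note that rounding down moves negative values away from zero: -7.5 becomes -8, not -7.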

_mm512_nearbyint_pd

extern __m512d __cdecl _mm512_nearbyint_pd(__m512d a);

Rounds off the elements of float64 vector a to the nearest integer value in floating point format without raising the inexact exception.

_mm512_mask_nearbyint_pd

extern __m512d __cdecl _mm512_mask_nearbyint_pd(__m512d src, __mmask8 k, __m512d a);

Rounds off the elements of float64 vector a to the nearest integer value in floating point format without raising the inexact exception, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).

_mm512_nearbyint_ps

extern __m512 __cdecl _mm512_nearbyint_ps(__m512 a);

Rounds off the elements of float32 vector a to the nearest integer value in floating point format without raising the inexact exception.

_mm512_mask_nearbyint_ps

extern __m512 __cdecl _mm512_mask_nearbyint_ps(__m512 src, __mmask16 k, __m512 a);

Rounds off the elements of float32 vector a to the nearest integer value in floating point format without raising the inexact exception, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).

_mm512_rint_pd

extern __m512d __cdecl _mm512_rint_pd(__m512d a);

Rounds off the elements of float64 vector a to the nearest integer value, with halfway cases rounded to the nearest even integer.

_mm512_mask_rint_pd

extern __m512d __cdecl _mm512_mask_rint_pd(__m512d src, __mmask8 k, __m512d a);

Rounds off the elements of float64 vector a to the nearest integer value, with halfway cases rounded to the nearest even integer, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).

_mm512_rint_ps

extern __m512 __cdecl _mm512_rint_ps(__m512 a);

Rounds off the elements of float32 vector a to the nearest integer value, with halfway cases rounded to the nearest even integer.

_mm512_mask_rint_ps

extern __m512 __cdecl _mm512_mask_rint_ps(__m512 src, __mmask16 k, __m512 a);

Rounds off the elements of float32 vector a to the nearest integer value, with halfway cases rounded to the nearest even integer, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).

_mm512_svml_round_pd

extern __m512d __cdecl _mm512_svml_round_pd(__m512d a);

Rounds off the elements of float64 vector a to the nearest integer value. Halfway cases are rounded away from zero regardless of the current rounding direction, whereas the _mm512_rint_pd intrinsic rounds them to the nearest even integer.

_mm512_mask_svml_round_pd

extern __m512d __cdecl _mm512_mask_svml_round_pd(__m512d src, __mmask8 k, __m512d a);

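Rounds off the elements of float64 vector a to the nearest integer value, with halfway cases rounded away from zero, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).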
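
The two treatments of halfway cases differ only when the fractional part is exactly 0.5. The scalar sketch below contrasts them for a single element. Both function names are hypothetical, the code is a model rather than the library's implementation, and it assumes finite inputs whose magnitude fits in a `long long`.

```c
/* Halfway cases go away from zero, as described for _mm512_svml_round_pd. */
static double round_half_away_model(double x)
{
    double t = (double)(long long)x;   /* integer part, toward zero */
    double frac = x - t;               /* signed fractional part */
    if (frac >= 0.5)  return t + 1.0;
    if (frac <= -0.5) return t - 1.0;
    return t;
}

/* Halfway cases go to the nearest even integer, as described for
   _mm512_rint_pd. */
static double round_half_even_model(double x)
{
    long long n = (long long)x;        /* integer part, toward zero */
    double t = (double)n;
    double frac = x - t;
    int odd = (int)(n & 1);            /* 1 when the integer part is odd */
    if (frac > 0.5  || (frac == 0.5  && odd)) return t + 1.0;
    if (frac < -0.5 || (frac == -0.5 && odd)) return t - 1.0;
    return t;
}
```

For 2.5 the first model returns 3.0 and the second 2.0; for 3.5 both return 4.0, and for any input that is not a halfway case the two agree.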

_mm512_trunc_pd

extern __m512d __cdecl _mm512_trunc_pd(__m512d a);

Rounds off the elements of float64 vector a to the nearest integer value which is not larger in absolute value.

_mm512_mask_trunc_pd

extern __m512d __cdecl _mm512_mask_trunc_pd(__m512d src, __mmask8 k, __m512d a);

Rounds off the elements of float64 vector a to the nearest integer value which is not larger in absolute value, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).

_mm512_trunc_ps

extern __m512 __cdecl _mm512_trunc_ps(__m512 a);

Rounds off the elements of float32 vector a to the nearest integer value which is not larger in absolute value.

_mm512_mask_trunc_ps

extern __m512 __cdecl _mm512_mask_trunc_ps(__m512 src, __mmask16 k, __m512 a);

Rounds off the elements of float32 vector a to the nearest integer value which is not larger in absolute value, and stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
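
"Not larger in absolute value" means rounding toward zero, which differs from rounding down only for negative non-integer inputs. The scalar sketch below shows the contrast for one element. The function names are hypothetical, and the cast-based approach is an assumption that holds only for finite inputs whose magnitude fits in a `long long`.

```c
/* Toward zero: the behavior described for the _mm512_trunc_* intrinsics.
   In C, converting a double to an integer type discards the fraction. */
static double trunc_model(double x)
{
    return (double)(long long)x;
}

/* Toward negative infinity: the behavior described for the
   _mm512_floor_* intrinsics, shown here for comparison. */
static double floor_model(double x)
{
    double t = trunc_model(x);
    return (t > x) ? t - 1.0 : t;
}
```

For -2.7 the truncating model returns -2.0 while the floor model returns -3.0; for 2.7 both return 2.0.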

Parent topic: Intrinsics for Short Vector Math Library Operations (SVML)
