Intrinsics for Converting Half Floats
The half-float, or 16-bit float, is a popular type in some application domains. It is regarded as a storage type: although data is often stored as half-floats, computation is not done on values of this type. Instead, values are usually converted to regular 32-bit floats before any computation.
Support for the half-float type is restricted to conversions to and from 32-bit floats. The main benefits of using the half-float type are:
- reduced storage requirements
- less consumption of memory bandwidth and cache
- accuracy and precision adequate for many applications
Half Float Intrinsics
The half-float intrinsics are provided to convert half-float values to 32-bit floats for computation purposes and, conversely, 32-bit float values to half-float values for data storage purposes.
The intrinsics are translated into library calls that do the actual conversions.
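To make the widening direction concrete, the following is a minimal portable sketch of a half-float to 32-bit float conversion. It assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits) and is not the compiler's library routine; the function name half_to_float_sketch is hypothetical and used only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the intrinsics library): widen one
 * IEEE 754 binary16 value, passed as its raw 16-bit pattern, to a float. */
static float half_to_float_sketch(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16; /* sign moves to bit 31 */
    uint32_t exp  = (h >> 10) & 0x1Fu;             /* 5-bit biased exponent */
    uint32_t man  = h & 0x3FFu;                    /* 10-bit fraction */
    uint32_t bits;
    float f;

    if (exp == 0x1Fu) {
        /* Infinity or NaN: exponent becomes all ones, payload is kept. */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127 (add 112). */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man != 0) {
        /* Subnormal half: normalize it, since a float can represent it
         * as a normal number. */
        uint32_t e = 113u;
        while ((man & 0x400u) == 0) {
            man <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
    } else {
        /* Signed zero. */
        bits = sign;
    }

    memcpy(&f, &bits, sizeof f); /* reinterpret the bit pattern as a float */
    return f;
}
```

For example, half_to_float_sketch(0x3C00) yields 1.0f and half_to_float_sketch(0xC000) yields -2.0f. Widening is exact because every half-float value is representable as a 32-bit float; the narrowing direction additionally has to round, which is why a rounding mode matters for float-to-half conversion.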
The half-float intrinsics are available on IA-32 and Intel® 64 architectures running supported operating systems. The minimum processor requirement is an Intel® Pentium 4 processor and an operating system supporting Intel® Streaming SIMD Extensions 2 (Intel® SSE2) instructions.
Role of Immediate Byte in Half Float Intrinsic Operations
For all half-float intrinsics, an immediate byte (imm8) controls the rounding mode, flush-to-zero behavior, and other non-volatile settings. The format of the imm8 byte is shown in the diagram below.
The imm8 value is used for special MXCSR overrides.
In the diagram:
- MBZ = Most significant Bit is Zero; used for error checking
- MS1 = 1: use the MXCSR rounding control (RC); otherwise use imm8.RC
- SAE = 1: all exceptions are suppressed
- MS2 = 1: use the MXCSR flush-to-zero/denormals-are-zero (FTZ/DAZ) control; otherwise use imm8.FTZ/DAZ
The compiler passes the bits to the library function, with error checking: the most significant bit must be zero.
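The most-significant-bit rule can be sketched as a simple mask test. This is only an illustration of the check described above, not the compiler's actual validation code; the function name imm8_mbz_ok is hypothetical, and the only layout fact assumed is that the most significant bit of a byte is bit 7 (mask 0x80).

```c
#include <stdint.h>

/* Hypothetical helper: returns nonzero when the most significant bit of
 * the immediate byte is zero, as the error check requires. */
static int imm8_mbz_ok(uint8_t imm8)
{
    return (imm8 & 0x80u) == 0;
}
```

Under this check, any immediate from 0x00 through 0x7F passes, and any value with bit 7 set is rejected.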