Visible to Intel only — GUID: GUID-F92472CC-40AA-467A-BCB1-F011FB821DA1
_mm_i32gather_ps, _mm256_i32gather_ps
Gathers 4/8 packed single-precision floating-point values from memory addressed by the given base address, dword indices, and scale. The corresponding Intel® AVX2 instruction is VGATHERDPS.
Syntax
extern __m128 _mm_i32gather_ps(float const * base, __m128i vindex, const int scale);
extern __m256 _mm256_i32gather_ps(float const * base, __m256i vindex, const int scale);
Arguments
base
The base address used to reference the loaded FP elements.
vindex
The vector of dword indices used to reference the loaded FP elements.
scale
A compile-time literal constant used to scale the vector indices when addressing the loaded elements. It must be one of 1, 2, 4, or 8.
Description
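The intrinsics load 4/8 packed single-precision floating-point values from memory using dword indices. Each element is read from the address formed by adding the corresponding index, multiplied by scale, to base.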
Below is the pseudo-code for the intrinsics:
_mm_i32gather_ps():
result[31:0] = mem[base+vindex[31:0]*scale];
result[63:32] = mem[base+vindex[63:32]*scale];
result[95:64] = mem[base+vindex[95:64]*scale];
result[127:96] = mem[base+vindex[127:96]*scale];
_mm256_i32gather_ps():
result[31:0] = mem[base+vindex[31:0]*scale];
result[63:32] = mem[base+vindex[63:32]*scale];
result[95:64] = mem[base+vindex[95:64]*scale];
result[127:96] = mem[base+vindex[127:96]*scale];
result[159:128] = mem[base+vindex[159:128]*scale];
result[191:160] = mem[base+vindex[191:160]*scale];
result[223:192] = mem[base+vindex[223:192]*scale];
result[255:224] = mem[base+vindex[255:224]*scale];
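The pseudo-code can be restated as a portable scalar loop, which may help when checking what a particular index and scale combination addresses. The following is a minimal sketch, not the intrinsic itself: the function name gather_ps_model and the lane count parameter n are hypothetical, and the sketch assumes that base+vindex*scale is a byte offset and that the indices are treated as signed dwords.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the gather pseudo-code: for each of the
   n lanes, read one float from the byte address base + vindex[i] * scale. */
void gather_ps_model(const float *base, const int32_t *vindex, int scale,
                     float *result, int n)
{
    /* The intrinsics accept only these four scale values. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    const unsigned char *bytes = (const unsigned char *)base;
    for (int i = 0; i < n; ++i) {
        /* Assumption: indices are signed, so the offset may be negative. */
        ptrdiff_t offset = (ptrdiff_t)vindex[i] * scale;
        /* memcpy avoids alignment assumptions for scales smaller than 4. */
        memcpy(&result[i], bytes + offset, sizeof(float));
    }
}
```

With n set to 4 the loop models _mm_i32gather_ps(), and with n set to 8 it models _mm256_i32gather_ps(). With scale equal to 4 (the size of a float), each index selects an array element directly.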
Returns
A 128/256-bit vector with unconditionally gathered single-precision FP values.
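Example
The following sketch shows one possible way to call the 128-bit form. The helper names cpu_has_avx2 and gather4_floats are hypothetical, and the code assumes GCC or Clang on an x86 target: the target attribute and __builtin_cpu_supports are extensions of those compilers, used here so that the file builds without a dedicated AVX2 compiler option. Other compilers need their own equivalents.

```c
#include <assert.h>
#include <immintrin.h>

/* Hypothetical helper: nonzero when the running CPU reports AVX2 support
   (GCC/Clang builtin, x86 only). */
int cpu_has_avx2(void)
{
    return __builtin_cpu_supports("avx2");
}

/* Hypothetical helper: gathers table[idx[0]] .. table[idx[3]] into out.
   Call it only when cpu_has_avx2() returns nonzero. */
__attribute__((target("avx2")))
void gather4_floats(const float *table, const int idx[4], float out[4])
{
    assert(table && idx && out);

    /* Load the four dword indices without requiring alignment. */
    __m128i vindex = _mm_loadu_si128((const __m128i *)idx);

    /* scale must be a literal; 4 is the size of a float in bytes,
       so each index acts as an element index into table. */
    __m128 gathered = _mm_i32gather_ps(table, vindex, 4);

    _mm_storeu_ps(out, gathered);
}
```

Under these assumptions, calling gather4_floats with idx = {3, 0, 7, 1} fills out with table[3], table[0], table[7], and table[1], in that order. The caller is responsible for making sure every index addresses valid memory.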