DynamicDequantize
General
The Dynamic Dequantize operation converts a quantized (s4, u4, s8, or u8) tensor to a bf16, f16, or f32 tensor. It supports per-tensor, per-channel, and per-group asymmetric linear de-quantization. The rounding mode is defined by the library implementation. Unlike Dequantize, Dynamic Dequantize takes scales and zero-points as operator src tensors rather than as attributes.
For per-tensor de-quantization:

$$dst = (src - zps) \cdot scales$$
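For per-channel de-quantization, taking channel axis = 1 as an example:

$$dst_{\cdots,i,\cdots,\cdots} = (src_{\cdots,i,\cdots,\cdots} - zps_i) \cdot scales_i, \quad i \in [0, channelNum-1]$$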
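The per-tensor and per-channel cases can be sketched in plain Python as follows. This is a minimal illustration, assuming the usual asymmetric linear form dst = (src - zps) * scales; the function names are hypothetical and not part of the library, and treating an omitted zps as zero is an assumption made here for the example.

```python
def dequantize_per_tensor(src, scale, zp=0):
    """Apply dst = (src - zp) * scale to a flat list of quantized integers."""
    return [(q - zp) * scale for q in src]


def dequantize_per_channel(src, scales, zps=None):
    """Per-channel de-quantization of a 2D list with channel axis = 1.

    Column i of src uses scales[i] and zps[i].
    """
    if zps is None:
        zps = [0] * len(scales)  # assumption: omitted zps behaves like zero
    return [
        [(q - zps[i]) * scales[i] for i, q in enumerate(row)]
        for row in src
    ]


src = [[10, 20, 30], [40, 50, 60]]
print(dequantize_per_tensor(src[0], scale=0.5, zp=10))
# [0.0, 5.0, 10.0]
print(dequantize_per_channel(src, scales=[0.5, 0.25, 2.0], zps=[0, 4, 30]))
# [[5.0, 4.0, 0.0], [20.0, 11.5, 60.0]]
```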
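For per-group de-quantization, let's take group shape = Gx1 as an example. It indicates that one scaling factor will be adopted for G elements in the src tensor. On the dimensions where group quantization is adopted, make channelNum equal to the dimension of src and groupNum equal to channelNum/groupSize:

$$dst_{i,\cdots} = (src_{i,\cdots} - zps_j) \cdot scales_j, \quad i \in [0, channelNum-1], \; j \in [0, groupNum-1]$$

Where:

$$i = j \cdot groupSize + k, \quad k \in [0, groupSize-1]$$

On other dimensions:

$$dst_{i,\cdots} = (src_{i,\cdots} - zps_i) \cdot scales_i, \quad i \in [0, channelNum-1]$$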
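The per-group case can be sketched the same way. The example below is illustrative only and rests on several assumptions: a 2D src of shape channelNum x N, a group shape of Gx1 (grouping along dimension 0), scales and zps of shape groupNum x N, and channelNum divisible by the group size. The function name is hypothetical.

```python
def dequantize_per_group(src, scales, zps, group_size):
    """Per-group de-quantization with group shape = group_size x 1.

    src is channel_num x n; scales and zps are group_num x n, where
    group_num = channel_num // group_size.
    """
    channel_num = len(src)
    if channel_num % group_size != 0:
        # assumption: groups must tile the grouped dimension exactly
        raise ValueError("channel_num must be divisible by group_size")
    dst = []
    for i, row in enumerate(src):
        j = i // group_size  # i = j * group_size + k, with 0 <= k < group_size
        dst.append([(q - zps[j][c]) * scales[j][c] for c, q in enumerate(row)])
    return dst


src = [[1, 2], [3, 4], [5, 6], [7, 8]]
scales = [[0.5, 1.0], [2.0, 0.25]]
zps = [[1, 0], [5, 4]]
print(dequantize_per_group(src, scales, zps, group_size=2))
# [[0.0, 2.0], [1.0, 4.0], [0.0, 0.5], [4.0, 1.0]]
```

Rows 0 and 1 share the first row of scales and zps, and rows 2 and 3 share the second, which is what a group size of 2 along dimension 0 means here.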
Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional |
---|---|---|---|---|
qtype | Specifies which de-quantization type is used. | string | per_tensor (default), per_channel | Optional |
axis | Specifies dimension on which per-channel de-quantization is applied. | s64 | An s64 value in the range of [-r, r-1] where r = rank(src), 1 by default. Negative values mean counting the dimension backwards from the end. | Optional |
group_shape | Specifies the group shape of an operation. | s64 | An s64 list indicating the group size on the dimensions where grouped quantization is adopted. | Optional |
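The axis range rule can be illustrated with a small helper. This is a sketch of the stated convention only (values in [-r, r-1], negative values counted from the end); the function name is hypothetical and the library's own validation may differ.

```python
def normalize_axis(axis, rank):
    """Map an axis in [-rank, rank - 1] to its non-negative equivalent."""
    if not -rank <= axis <= rank - 1:
        raise ValueError(f"axis {axis} is outside [{-rank}, {rank - 1}]")
    return axis + rank if axis < 0 else axis


print(normalize_axis(1, 4))   # 1 (the default axis on a rank-4 src)
print(normalize_axis(-1, 4))  # 3 (last dimension)
```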
Execution arguments
The inputs and outputs must be provided in the index order listed below when constructing an operation.
Inputs
Index | Argument Name | Required or Optional |
---|---|---|
0 | src | Required |
1 | scales | Required |
2 | zps | Optional |
Outputs
Index | Argument Name | Required or Optional |
---|---|---|
0 | dst | Required |
Supported data types
The DynamicDequantize operation supports the following data type combinations.
Src | Dst | Scales | Zps |
---|---|---|---|
s8 | f16, bf16, f32 | f16, bf16, f32 | s8, u8, s32 |
u8 | f16, bf16, f32 | f16, bf16, f32 | s8, u8, s32 |
s4 | f16, bf16, f32 | f16, bf16, f32 | s4, u4, s32 |
u4 | f16, bf16, f32 | f16, bf16, f32 | s4, u4, s32 |
The data types of scales and dst are expected to be the same.
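As a rough illustration, the combinations in the table and the scales/dst rule can be encoded as a lookup. This is a hypothetical helper written for this page, not a library function; it represents data types as plain strings and treats zps as optional, as in the inputs table.

```python
FLOAT_TYPES = {"f16", "bf16", "f32"}

# Allowed zps data types for each src data type, taken from the table above.
ZPS_TYPES = {
    "s8": {"s8", "u8", "s32"},
    "u8": {"s8", "u8", "s32"},
    "s4": {"s4", "u4", "s32"},
    "u4": {"s4", "u4", "s32"},
}


def is_supported(src, dst, scales, zps=None):
    """Return True if the data type combination matches the table."""
    if src not in ZPS_TYPES:
        return False
    if dst not in FLOAT_TYPES or scales not in FLOAT_TYPES:
        return False
    if scales != dst:  # scales and dst are expected to share a data type
        return False
    return zps is None or zps in ZPS_TYPES[src]


print(is_supported("u8", "f32", "f32", "s32"))  # True
print(is_supported("s4", "f16", "f16", "u8"))   # False: u8 zps with s4 src
print(is_supported("s8", "f32", "f16"))         # False: scales differs from dst
```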