LayerNormBackward
General
LayerNormBackward performs the backward pass of the LayerNorm operation.
The backward propagation computes diff_src, diff_gamma*, and diff_beta* based on diff_dst, src, mean, variance, gamma*, and beta*.
The tensors marked with an asterisk are used only when the operation is configured to use gamma and beta.
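To make the data flow concrete, here is a minimal reference sketch of the computation in plain Python. It assumes normalization over the last axis only and a 2D layout (a list of rows); the function name `layer_norm_backward` and its signature are illustrative and are not part of the library API. It uses the standard layer normalization gradient formulas, which may differ in detail from the library's implementation. Because beta is an additive constant, it does not enter the gradient formulas, so the sketch omits it.

```python
import math


def layer_norm_backward(src, diff_dst, mean, variance, gamma=None, epsilon=1e-5):
    """Reference backward pass for rows normalized over their last axis.

    src, diff_dst: list of rows, each a list of C floats.
    mean, variance: one float per row, as saved by the forward pass.
    gamma: list of C floats, or None when affine parameters are not used.
    Returns (diff_src, diff_gamma, diff_beta); the last two are None
    when gamma is None.
    """
    channels = len(src[0])
    use_affine = gamma is not None
    diff_src = []
    diff_gamma = [0.0] * channels if use_affine else None
    diff_beta = [0.0] * channels if use_affine else None

    for x, dy, mu, var in zip(src, diff_dst, mean, variance):
        inv_std = 1.0 / math.sqrt(var + epsilon)
        x_hat = [(v - mu) * inv_std for v in x]
        # Gradient with respect to the normalized values.
        g = [d * gamma[c] for c, d in enumerate(dy)] if use_affine else list(dy)
        g_mean = sum(g) / channels
        gx_mean = sum(a * b for a, b in zip(g, x_hat)) / channels
        diff_src.append(
            [inv_std * (g[c] - g_mean - x_hat[c] * gx_mean) for c in range(channels)]
        )
        if use_affine:
            # Affine gradients accumulate over all rows.
            for c in range(channels):
                diff_gamma[c] += dy[c] * x_hat[c]
                diff_beta[c] += dy[c]
    return diff_src, diff_gamma, diff_beta
```

Calling the function without `gamma` mirrors the case where the optional tensors are absent: only diff_src is produced.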
Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional |
---|---|---|---|---|
begin_norm_axis | The axis at which layer normalization starts. Normalization is applied from begin_norm_axis through the last dimension. Negative values index from right to left. By default, the operation normalizes over the last dimension only, e.g. C in TNC for 3D and LDNC for 4D. | s64 | [-r, r-1], where r = rank(src). -1 is default | Optional |
use_affine | When set to True, this module has learnable per-element affine parameters. | bool | false, true (default) | Optional |
epsilon | The constant to improve numerical stability. | f32 | Arbitrary positive f32 value, 1e-5 (default) | Optional |
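As an illustration of the begin_norm_axis rule, the following sketch resolves a possibly negative axis against the rank of src and lists the axes that are normalized. The helper name `normalized_axes` is hypothetical and exists only for this example.

```python
def normalized_axes(rank, begin_norm_axis=-1):
    """Return the axes normalized for a tensor of the given rank.

    begin_norm_axis must lie in [-rank, rank - 1]; negative values
    count from the right, so -1 refers to the last dimension.
    """
    if not -rank <= begin_norm_axis <= rank - 1:
        raise ValueError(
            f"begin_norm_axis {begin_norm_axis} is outside [{-rank}, {rank - 1}]"
        )
    start = begin_norm_axis + rank if begin_norm_axis < 0 else begin_norm_axis
    # Normalization runs from the resolved axis through the last dimension.
    return list(range(start, rank))
```

For a 3D TNC tensor the default yields only the C axis, while `normalized_axes(4, 1)` covers every axis after the first.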
Execution arguments
The inputs and outputs must be provided in the index order below when constructing an operation.
Inputs
Index | Argument Name | Required or Optional |
---|---|---|
0 | src | Required |
1 | diff_dst | Required |
2 | mean | Required |
3 | variance | Required |
4 | gamma | Optional |
5 | beta | Optional |
Outputs
Index | Argument Name | Required or Optional |
---|---|---|
0 | diff_src | Required |
1 | diff_gamma | Optional |
2 | diff_beta | Optional |
Supported data types
LayerNormBackward operation supports the following data type combinations.
Src / Diff_dst / Diff_src | Gamma / Beta / Mean / Variance / Diff_gamma / Diff_beta |
---|---|
f32 | f32 |
bf16 | f32, bf16 |
f16 | f32 |
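The combinations above can be encoded as a small lookup, for example to validate a configuration before building a graph. This is only a sketch: the names `SUPPORTED_DTYPES` and `is_supported` are hypothetical, and data types are represented as plain strings rather than library enums.

```python
# Maps the data type of src / diff_dst / diff_src to the data types
# allowed for gamma, beta, mean, variance, diff_gamma, and diff_beta.
SUPPORTED_DTYPES = {
    "f32": {"f32"},
    "bf16": {"f32", "bf16"},
    "f16": {"f32"},
}


def is_supported(data_dtype, param_dtype):
    """Return True if the pair appears in the supported combinations."""
    return param_dtype in SUPPORTED_DTYPES.get(data_dtype, set())
```

For instance, `is_supported("bf16", "bf16")` holds, whereas `is_supported("f16", "bf16")` does not.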