Intel® oneAPI Deep Neural Network Developer Guide and Reference
struct dnnl::primitive_attr
Overview
Primitive attributes.
#include <dnnl.hpp>

struct primitive_attr: public dnnl::handle {
    // construction
    primitive_attr();
    primitive_attr(dnnl_primitive_attr_t attr);

    // methods
    fpmath_mode get_fpmath_mode() const;
    void get_fpmath_mode(fpmath_mode& mode, bool& apply_to_int) const;
    void set_fpmath_mode(fpmath_mode mode, bool apply_to_int = false);
    accumulation_mode get_accumulation_mode() const;
    void set_accumulation_mode(accumulation_mode mode);
    bool get_deterministic() const;
    void set_deterministic(bool value);
    scratchpad_mode get_scratchpad_mode() const;
    void set_scratchpad_mode(scratchpad_mode mode);
    void set_scales_mask(int arg, int mask);
    void set_scales(
        int arg,
        int mask,
        const memory::dims& groups,
        memory::data_type data_type = memory::data_type::f32
    );
    void set_zero_points_mask(int arg, int mask);
    void set_zero_points(
        int arg,
        int mask,
        const memory::dims& groups,
        memory::data_type data_type = memory::data_type::s32
    );
    const post_ops get_post_ops() const;
    void set_post_ops(const post_ops ops);
    void set_rnn_data_qparams(float scale, float shift);
    void get_rnn_data_qparams(float& scale, float& shift);
    void set_rnn_weights_qparams(int mask, const std::vector<float>& scales);
    void get_rnn_weights_qparams(int& mask, std::vector<float>& scales);
    void set_rnn_weights_projection_qparams(
        int mask,
        const std::vector<float>& scales
    );
    void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales);
};
Inherited Members
public:
    // methods
    handle<T, traits>& operator = (const handle<T, traits>&);
    handle<T, traits>& operator = (handle<T, traits>&&);
    void reset(T t, bool weak = false);
    T get(bool allow_empty = false) const;
    operator T () const;
    operator bool () const;
    bool operator == (const handle<T, traits>& other) const;
    bool operator != (const handle& other) const;
Detailed Documentation
Primitive attributes.
Construction
primitive_attr()
Constructs default (empty) primitive attributes.
primitive_attr(dnnl_primitive_attr_t attr)
Creates primitive attributes from a C API dnnl_primitive_attr_t handle.
The resulting handle is not weak and the C handle will be destroyed during the destruction of the C++ object.
Parameters:
attr |
The C API primitive attributes. |
Methods
fpmath_mode get_fpmath_mode() const
Returns the fpmath mode.
void get_fpmath_mode(fpmath_mode& mode, bool& apply_to_int) const
Returns the fpmath mode and the apply_to_int setting through the output parameters.
Parameters:
mode |
Specified fpmath mode. |
apply_to_int |
Whether floating-point arithmetic is used for integer primitives. |
void set_fpmath_mode(fpmath_mode mode, bool apply_to_int = false)
Sets fpmath mode.
Parameters:
mode |
Specified fpmath mode. |
apply_to_int |
Boolean. If true, floating-point arithmetic is used for integer primitives. |
accumulation_mode get_accumulation_mode() const
Returns the accumulation mode.
void set_accumulation_mode(accumulation_mode mode)
Sets accumulation mode.
Parameters:
mode |
Specified accumulation mode. |
bool get_deterministic() const
Returns the deterministic attribute value.
void set_deterministic(bool value)
Sets deterministic attribute value.
Parameters:
value |
Specified deterministic mode. |
scratchpad_mode get_scratchpad_mode() const
Returns the scratchpad mode.
void set_scratchpad_mode(scratchpad_mode mode)
Sets scratchpad mode.
Parameters:
mode |
Specified scratchpad mode. |
void set_scales_mask(int arg, int mask)
Sets scaling factors for primitive operations for a given memory argument.
The scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_SCALES | arg.
Parameters:
arg |
Parameter argument index as passed to the primitive::execute() call. |
mask |
Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor. |
See also:
dnnl_primitive_attr_set_scales_mask
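The mask description above implies how many scaling factors a given mask selects. The following is a minimal, library-independent sketch of that rule. The helper name scales_count is hypothetical and is not part of oneDNN; it only illustrates how the set bits of the mask map to tensor dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneDNN function): returns the number of scaling
// factors implied by a correspondence mask for a tensor with the given
// dimensions. Bit i set in `mask` means one scale per index along dimension i.
inline std::int64_t scales_count(
        const std::vector<std::int64_t>& dims, int mask) {
    assert(dims.size() < 31); // keeps the shift below well-defined for int
    std::int64_t count = 1;   // mask == 0: a single common scaling factor
    for (std::size_t i = 0; i < dims.size(); ++i) {
        if (mask & (1 << i)) count *= dims[i];
    }
    return count;
}
```

For a tensor with dimensions {64, 32, 3, 3}, a mask of 0 selects one common factor, a mask of 1 << 0 selects 64 factors (one per index along dimension 0), and a mask of (1 << 0) | (1 << 1) selects 64 * 32 = 2048 factors.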
void set_scales( int arg, int mask, const memory::dims& groups, memory::data_type data_type = memory::data_type::f32 )
Sets scaling factors for primitive operations for a given memory argument.
The scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_SCALES | arg.
Parameters:
arg |
Parameter argument index as passed to the primitive::execute() call. |
mask |
Scales correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scale is used for each index along that dimension. Set the mask to 0 to use a common scale for the whole output tensor. |
groups |
Scaling factors correspondence groups that define the correspondence between the tensor dimensions and the scales array. The set i-th dimension indicates a number of groups of scaling factors used for that logical dimension in a memory indicated by arg. |
See also:
dnnl_primitive_attr_set_scales
void set_zero_points_mask(int arg, int mask)
Sets zero points for primitive operations for a given memory argument.
The zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS | arg.
Parameters:
arg |
Parameter argument index as passed to the primitive::execute() call. |
mask |
Zero point correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. The set i-th bit indicates that a dedicated zero point is used for each index along that dimension. Set the mask to 0 to use a common zero point for the whole output tensor. |
See also:
dnnl_primitive_attr_set_zero_points_mask
void set_zero_points( int arg, int mask, const memory::dims& groups, memory::data_type data_type = memory::data_type::s32 )
Sets zero points for primitive operations for a given memory argument.
The zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS | arg.
Parameters:
arg |
Parameter argument index as passed to the primitive::execute() call. |
mask |
Zero point correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. The set i-th bit indicates that a dedicated zero point is used for each index along that dimension. Set the mask to 0 to use a common zero point for the whole output tensor. |
groups |
Zero point factors correspondence groups that define the correspondence between the tensor dimensions and the zero_points array. The set i-th dimension indicates a number of groups of zero point factors used for that logical dimension in a memory indicated by arg. |
See also:
dnnl_primitive_attr_set_zero_points
const post_ops get_post_ops() const
Returns post-ops previously set via set_post_ops().
Returns:
Post-ops.
void set_post_ops(const post_ops ops)
Sets post-ops.
Parameters:
ops |
Post-ops object to copy post-ops from. |
void set_rnn_data_qparams(float scale, float shift)
Sets quantization scale and shift parameters for RNN data tensors.
For performance reasons, the low-precision configuration of the RNN primitives expects input activations to have the unsigned 8-bit integer data type. The scale and shift parameters are used to quantize floating-point data to unsigned integer and must be passed to the RNN primitive using attributes.
The quantization formula is scale * data + shift.
Example usage:
// RNN parameters
int l = 2, t = 2, mb = 32, sic = 32, slc = 32, dic = 32, dlc = 32;
// Activations quantization parameters
float scale = 63.f, shift = 64.f;

primitive_attr attr;
// Set scale and shift for int8 quantization of activation
attr.set_rnn_data_qparams(scale, shift);

// Create an RNN primitive descriptor.
vanilla_rnn_forward::primitive_desc rnn_d(
        engine, /* arguments */, attr);
Parameters:
scale |
The value to scale the data by. |
shift |
The value to shift the data by. |
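To make the quantization formula scale * data + shift concrete, here is a small standalone sketch that does not call oneDNN. The function name quantize_u8 is hypothetical, and the rounding to nearest and saturation to [0, 255] are assumptions of this sketch; the text above only specifies the scale-and-shift formula and the unsigned 8-bit target type.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a oneDNN function): applies scale * data + shift,
// then rounds to the nearest integer and saturates to the u8 range.
// Rounding and saturation are assumptions made for illustration.
inline std::uint8_t quantize_u8(float data, float scale, float shift) {
    const float q = std::round(scale * data + shift);
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, q)));
}
```

With the values from the usage example (scale = 63, shift = 64), activations in [-1, 1] map to [1, 127]: -1 becomes 1, 0 becomes 64, and 1 becomes 127.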
void get_rnn_data_qparams(float& scale, float& shift)
Returns the quantization scale and shift parameters for RNN data tensors.
Parameters:
scale |
The value to scale the data by. |
shift |
The value to shift the data by. |
void set_rnn_weights_qparams(int mask, const std::vector<float>& scales)
Sets quantization scaling factors for RNN weights tensors.
The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor. |
scales |
Constant vector of output scaling factors. The following equality must hold: scales.size() equals the product of weights.dims[d] over every dimension d whose bit is set in mask. |
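As a rough illustration of how such scaling factors relate to signed 8-bit weights, the sketch below quantizes a row-major matrix with either one common factor or one factor per channel. It does not call oneDNN. The function name quantize_weights_s8, the two-dimensional layout, the formula round(weight * scale), and the saturation to [-128, 127] are all assumptions of this sketch, chosen by analogy with the data quantization formula.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneDNN function): quantizes a row-major
// [rows x channels] float matrix to signed 8-bit values.
// `scales` holds either one common factor (the mask == 0 case) or one
// factor per channel (a mask that selects only the channel dimension).
inline std::vector<std::int8_t> quantize_weights_s8(
        const std::vector<float>& weights, std::size_t channels,
        const std::vector<float>& scales) {
    assert(channels > 0 && weights.size() % channels == 0);
    assert(scales.size() == 1 || scales.size() == channels);
    std::vector<std::int8_t> out(weights.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        // Pick the common scale or the scale of the channel this element is in.
        const float s = scales.size() == 1 ? scales[0] : scales[i % channels];
        // Assumed formula: round to nearest, then saturate to the s8 range.
        const float q = std::round(weights[i] * s);
        out[i] = static_cast<std::int8_t>(
                std::min(127.0f, std::max(-128.0f, q)));
    }
    return out;
}
```

In this sketch the size check mirrors the mask rule: a mask of 0 corresponds to a single factor, and a mask selecting the channel dimension corresponds to one factor per channel.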
void get_rnn_weights_qparams(int& mask, std::vector<float>& scales)
Returns the quantization scaling factors for RNN weights tensors.
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask of 0 indicates a common scaling factor for the whole output tensor. |
scales |
Vector that receives the output scaling factors. |
void set_rnn_weights_projection_qparams( int mask, const std::vector<float>& scales )
Sets quantization scaling factors for RNN projection weights tensors.
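The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.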
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor. |
scales |
Constant vector of output scaling factors. The following equality must hold: scales.size() equals the product of weights_projection.dims[d] over every dimension d whose bit is set in mask. |
void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales)
Returns the quantization scaling factors for RNN projection weights tensors.
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask of 0 indicates a common scaling factor for the whole output tensor. |
scales |
Vector that receives the output scaling factors. |