struct dnnl::primitive_attr
Overview
Primitive attributes.
#include <dnnl.hpp>

struct primitive_attr: public dnnl::handle
{
    // construction

    primitive_attr();
    primitive_attr(dnnl_primitive_attr_t attr);

    // methods

    fpmath_mode get_fpmath_mode() const;
    void set_fpmath_mode(fpmath_mode mode);
    scratchpad_mode get_scratchpad_mode() const;
    void set_scratchpad_mode(scratchpad_mode mode);
    void set_scales_mask(int arg, int mask);
    void set_zero_points_mask(int arg, int mask);
    const post_ops get_post_ops() const;
    void set_post_ops(const post_ops ops);
    void set_rnn_data_qparams(float scale, float shift);
    void get_rnn_data_qparams(float& scale, float& shift);
    void set_rnn_weights_qparams(int mask, const std::vector<float>& scales);
    void get_rnn_weights_qparams(int& mask, std::vector<float>& scales);

    void set_rnn_weights_projection_qparams(
        int mask,
        const std::vector<float>& scales
        );

    void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales);
};
Inherited Members
public:
    // methods

    handle<T, traits>& operator = (const handle<T, traits>&);
    handle<T, traits>& operator = (handle<T, traits>&&);
    void reset(T t, bool weak = false);
    T get(bool allow_empty = false) const;
    operator T () const;
    operator bool () const;
    bool operator == (const handle<T, traits>& other) const;
    bool operator != (const handle& other) const;
Detailed Documentation
Primitive attributes.
Construction
primitive_attr()
Constructs default (empty) primitive attributes.
primitive_attr(dnnl_primitive_attr_t attr)
Creates primitive attributes from a C API dnnl_primitive_attr_t handle.
The resulting handle is not weak and the C handle will be destroyed during the destruction of the C++ object.
Parameters:
attr | The C API primitive attributes.
Methods
fpmath_mode get_fpmath_mode() const
Returns the fpmath mode.
void set_fpmath_mode(fpmath_mode mode)
Sets fpmath mode.
Parameters:
mode | Specified fpmath mode.
scratchpad_mode get_scratchpad_mode() const
Returns the scratchpad mode.
void set_scratchpad_mode(scratchpad_mode mode)
Sets scratchpad mode.
Parameters:
mode | Specified scratchpad mode.
void set_scales_mask(int arg, int mask)
Sets scaling factors for primitive operations for a given memory argument.
The scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_SCALES | arg.
Parameters:
arg | Parameter argument index as passed to the primitive::execute() call.
mask | Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. If the i-th bit is set, a dedicated scaling factor is used for each index along the i-th dimension. Set the mask to 0 to use a common scaling factor for the whole tensor.
See also:
dnnl_primitive_attr_set_scales_mask
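The mask semantics described above can be sketched in standalone C++. The helper `expected_scale_count` is hypothetical (it is not part of oneDNN) and only illustrates how many scaling factors a given mask implies for a tensor of given dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not a oneDNN API: number of scaling factors implied
// by a correspondence mask for a tensor with dimensions `dims`.
// Bit i of `mask` set => one factor per index along dimension i.
std::size_t expected_scale_count(const std::vector<std::int64_t>& dims, int mask) {
    assert(dims.size() < 32); // the mask is an int, so keep shifts in range
    std::size_t count = 1;
    for (std::size_t d = 0; d < dims.size(); ++d) {
        if ((static_cast<unsigned>(mask) >> d) & 1u)
            count *= static_cast<std::size_t>(dims[d]);
    }
    return count; // mask == 0 yields 1: a single common factor
}
```

For a tensor with dimensions {32, 64, 3, 3}, a mask of 0 implies one common factor, while a mask of `1 << 0` implies 32 factors, one per index along dimension 0.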
void set_zero_points_mask(int arg, int mask)
Sets zero points for primitive operations for a given memory argument.
The zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS | arg.
Parameters:
arg | Parameter argument index as passed to the primitive::execute() call.
mask | Zero point correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. If the i-th bit is set, a dedicated zero point is used for each index along the i-th dimension. Set the mask to 0 to use a common zero point for the whole tensor.
See also:
dnnl_primitive_attr_set_zero_points_mask
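As a rough illustration of how a mask ties tensor elements to zero points, the sketch below maps an element position to an index in the zero_points vector and then dequantizes a value. Both helpers are hypothetical and not part of oneDNN. Two assumptions are made here: zero points are laid out row-major over the masked dimensions, and dequantization follows the convention real = scale * (quantized - zero_point).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: flat index into the zero_points vector for the
// element at position `pos`, assuming a row-major layout over the
// dimensions selected by `mask`.
std::size_t zero_point_index(const std::vector<std::int64_t>& dims,
        const std::vector<std::int64_t>& pos, int mask) {
    assert(dims.size() == pos.size() && dims.size() < 32);
    std::size_t idx = 0;
    for (std::size_t d = 0; d < dims.size(); ++d) {
        if ((static_cast<unsigned>(mask) >> d) & 1u)
            idx = idx * static_cast<std::size_t>(dims[d])
                    + static_cast<std::size_t>(pos[d]);
    }
    return idx; // mask == 0 always yields 0: one common zero point
}

// Assumed convention: real = scale * (quantized - zero_point).
float dequantize(std::int32_t q, float scale, std::int32_t zero_point) {
    return scale * static_cast<float>(q - zero_point);
}
```

With dimensions {4, 8} and position {2, 5}, a mask of `1 << 1` selects zero point 5 (one per column), and a mask of 0 always selects zero point 0.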
const post_ops get_post_ops() const
Returns post-ops previously set via set_post_ops().
Returns:
Post-ops.
void set_post_ops(const post_ops ops)
Sets post-ops.
Parameters:
ops | Post-ops object to copy post-ops from.
void set_rnn_data_qparams(float scale, float shift)
Sets quantization scale and shift parameters for RNN data tensors.
For performance reasons, the low-precision configuration of the RNN primitives expects input activations to have the unsigned 8-bit integer data type. The scale and shift parameters are used to quantize floating-point data to unsigned integer and must be passed to the RNN primitive using attributes.
The quantization formula is scale * data + shift.
Example usage:
// RNN parameters
int l = 2, t = 2, mb = 32, sic = 32, slc = 32, dic = 32, dlc = 32;
// Activations quantization parameters
float scale = 63.f, shift = 64.f;

primitive_attr attr;
// Set scale and shift for int8 quantization of activation
attr.set_rnn_data_qparams(scale, shift);

// Create an RNN primitive descriptor.
vanilla_rnn_forward::primitive_desc rnn_d(
        engine, /* arguments */, attr);
Parameters:
scale | The value to scale the data by.
shift | The value to shift the data by.
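The quantization formula scale * data + shift can be worked through with a small standalone sketch. The function `quantize_u8` is hypothetical and not part of oneDNN. Rounding to nearest and saturation to the range [0, 255] are assumptions added for illustration, since the text above only states the affine formula.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: applies scale * data + shift, then rounds to the
// nearest integer and saturates to the unsigned 8-bit range (the rounding
// and saturation steps are assumptions of this sketch).
std::uint8_t quantize_u8(float data, float scale, float shift) {
    assert(scale > 0.0f); // this sketch assumes a positive scale
    float q = std::nearbyint(scale * data + shift);
    q = std::min(255.0f, std::max(0.0f, q));
    return static_cast<std::uint8_t>(q);
}
```

With scale = 63 and shift = 64, as in the example usage, activations in [-1, 1] map to [1, 127]: -1 maps to 1, 0 maps to 64, and 1 maps to 127.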
void get_rnn_data_qparams(float& scale, float& shift)
Returns the quantization scale and shift parameters for RNN data tensors.
Parameters:
scale | Output parameter that receives the value the data is scaled by.
shift | Output parameter that receives the value the data is shifted by.
void set_rnn_weights_qparams(int mask, const std::vector<float>& scales)
Sets quantization scaling factors for RNN weights tensors.
The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.
Parameters:
mask | Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. If the i-th bit is set, a dedicated scaling factor is used for each index along the i-th dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales | Constant vector of output scaling factors. The following equality must hold: $scales.size() = \prod_{d \in mask} weights.dims[d]$. Violations can only be detected when the attributes are used to create a primitive descriptor.
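A minimal sketch of per-dimension weights quantization follows. Both functions are hypothetical and not part of oneDNN. The sketch assumes the quantized value is round(scale * weight) saturated to the signed 8-bit range, and it treats the weights as a flat row-major 2D view with one scale per column, which corresponds to a mask of `1 << 1` for that view.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: quantizes one weight as round(scale * w), saturated
// to [-128, 127] (an assumed convention for this sketch).
std::int8_t quantize_s8(float w, float scale) {
    float q = std::nearbyint(scale * w);
    q = std::min(127.0f, std::max(-128.0f, q));
    return static_cast<std::int8_t>(q);
}

// Hypothetical helper: quantizes a row-major (rows x cols) weights view
// with one scale per column. The size check mirrors the requirement that
// the number of scales matches the masked dimension.
std::vector<std::int8_t> quantize_per_column(const std::vector<float>& weights,
        std::size_t cols, const std::vector<float>& scales) {
    assert(cols > 0 && scales.size() == cols && weights.size() % cols == 0);
    std::vector<std::int8_t> out(weights.size());
    for (std::size_t i = 0; i < weights.size(); ++i)
        out[i] = quantize_s8(weights[i], scales[i % cols]);
    return out;
}
```

Using one scale per column lets each output channel use the full signed 8-bit range even when the channels have different weight magnitudes.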
void get_rnn_weights_qparams(int& mask, std::vector<float>& scales)
Returns the quantization scaling factors for RNN weights tensors.
Parameters:
mask | Output parameter that receives the scaling factors correspondence mask, which defines the correspondence between the output tensor dimensions and the scales vector. If the i-th bit is set, a dedicated scaling factor is used for each index along the i-th dimension. A mask of 0 means a common scaling factor is used for the whole output tensor.
scales | Output parameter that receives the vector of output scaling factors. For attributes that were set validly, the following equality holds: $scales.size() = \prod_{d \in mask} weights.dims[d]$.
void set_rnn_weights_projection_qparams( int mask, const std::vector<float>& scales )
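Sets quantization scaling factors for RNN projection weights tensors.
The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.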
Parameters:
mask | Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. If the i-th bit is set, a dedicated scaling factor is used for each index along the i-th dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales | Constant vector of output scaling factors. The following equality must hold: $scales.size() = \prod_{d \in mask} weights\_projection.dims[d]$. Violations can only be detected when the attributes are used to create a primitive descriptor.
void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales)
Returns the quantization scaling factors for RNN projection weights tensors.
Parameters:
mask | Output parameter that receives the scaling factors correspondence mask, which defines the correspondence between the output tensor dimensions and the scales vector. If the i-th bit is set, a dedicated scaling factor is used for each index along the i-th dimension. A mask of 0 means a common scaling factor is used for the whole output tensor.
scales | Output parameter that receives the vector of output scaling factors. For attributes that were set validly, the following equality holds: $scales.size() = \prod_{d \in mask} weights\_projection.dims[d]$.