struct dnnl::primitive

Intel® oneAPI Deep Neural Network Developer Guide and Reference

Download PDF

ID 768875

Date 2/28/2024

Version

Public

A newer version of this document is available. Customers should click here to go to the newest version.

Visible to Intel only — GUID: GUID-B1B549E8-B101-4E64-989A-FFB75B25E48C

View Details

struct dnnl::primitive_attr

Overview

Primitive attributes. More…


#include <dnnl.hpp>

struct primitive_attr: public dnnl::handle
{
    // construction

    primitive_attr();
    primitive_attr(dnnl_primitive_attr_t attr);

    // methods

    fpmath_mode get_fpmath_mode() const;
    void set_fpmath_mode(fpmath_mode mode);
    scratchpad_mode get_scratchpad_mode() const;
    void set_scratchpad_mode(scratchpad_mode mode);
    void set_scales_mask(int arg, int mask);
    void set_zero_points_mask(int arg, int mask);
    const post_ops get_post_ops() const;
    void set_post_ops(const post_ops ops);
    void set_rnn_data_qparams(float scale, float shift);
    void get_rnn_data_qparams(float& scale, float& shift);
    void set_rnn_weights_qparams(int mask, const std::vector<float>& scales);
    void get_rnn_weights_qparams(int& mask, std::vector<float>& scales);

    void set_rnn_weights_projection_qparams(
        int mask,
        const std::vector<float>& scales
        );

    void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales);
};

Inherited Members


public:
    // methods

    handle<T, traits>& operator = (const handle<T, traits>&);
    handle<T, traits>& operator = (handle<T, traits>&&);
    void reset(T t, bool weak = false);
    T get(bool allow_empty = false) const;
    operator T () const;
    operator bool () const;
    bool operator == (const handle<T, traits>& other) const;
    bool operator != (const handle& other) const;

Detailed Documentation

Primitive attributes.

See also:

Primitive Attributes

Construction


primitive_attr()

Constructs default (empty) primitive attributes.


primitive_attr(dnnl_primitive_attr_t attr)

Creates primitive attributes from a C API dnnl_primitive_attr_t handle.

The resulting handle is not weak and the C handle will be destroyed during the destruction of the C++ object.

Parameters:

attr	The C API primitive attributes.

Methods


fpmath_mode get_fpmath_mode() const

Returns the fpmath mode.


void set_fpmath_mode(fpmath_mode mode)

Sets fpmath mode.

Parameters:

mode	Specified fpmath mode.


scratchpad_mode get_scratchpad_mode() const

Returns the scratchpad mode.


void set_scratchpad_mode(scratchpad_mode mode)

Sets scratchpad mode.

Parameters:

mode	Specified scratchpad mode.


void set_scales_mask(int arg, int mask)

Sets scaling factors for primitive operations for a given memory argument.

The scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_SCALES | arg.

Parameters:

arg	Parameter argument index as passed to the primitive::execute() call.
mask	Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.

See also:

dnnl_primitive_attr_set_scales_mask


void set_zero_points_mask(int arg, int mask)

Sets zero points for primitive operations for a given memory argument.

The zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS | arg.

Parameters:

arg	Parameter argument index as passed to the primitive::execute() call.
mask	Zero point correspondence mask that defines the correspondence between the tensor dimensions and the `zero_points` vector. The set i-th bit indicates that a dedicated zero point is used for each index along that dimension. Set the mask to 0 to use a common zero point for the whole output tensor.


const post_ops get_post_ops() const

Returns post-ops previously set via set_post_ops().

Returns:

Post-ops.


void set_post_ops(const post_ops ops)

Sets post-ops.

NOTE:

There is no way to check whether the post-ops would be supported by the target primitive. Any error will be reported by the respective primitive descriptor constructor.

Parameters:

ops	Post-ops object to copy post-ops from.


void set_rnn_data_qparams(float scale, float shift)

Sets quantization scale and shift parameters for RNN data tensors.

For performance reasons, the low-precision configuration of the RNN primitives expect input activations to have the unsigned 8-bit integer data type. The scale and shift parameters are used to quantize floating-point data to unsigned integer and must be passed to the RNN primitive using attributes.

The quantization formula is scale * data + shift.

Example usage:


// RNN parameters
int l = 2, t = 2, mb = 32, sic = 32, slc = 32, dic = 32, dlc = 32;
// Activations quantization parameters
float scale = 63.f, shift = 64.f;

primitive_attr attr;

// Set scale and shift for int8 quantization of activation
attr.set_rnn_data_qparams(scale, shift);

// Create an RNN primitive descriptor.
vanilla_rnn_forward::primitive_desc rnn_d(
        engine, /* arguments */, attr);

NOTE:

Quantization scale and shift are common for src_layer, src_iter, dst_iter, and dst_layer.

Parameters:

scale	The value to scale the data by.
shift	The value to shift the data by.


void get_rnn_data_qparams(float& scale, float& shift)

Returns the quantization scale and shift parameters for RNN data tensors.

NOTE:

Quantization scale and shift are common for src_layer, src_iter, dst_iter, and dst_layer.

Parameters:

scale	The value to scale the data by.
shift	The value to shift the data by.


void set_rnn_weights_qparams(int mask, const std::vector<float>& scales)

Sets quantization scaling factors for RNN weights tensors.

The low-precision configuration of the RNN primitives expect input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.

NOTE:

The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.

NOTE:

Quantization scales are common for weights_layer and weights_iteration

Parameters:

mask	Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor should be used each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales	Constant vector of output scaling factors. The following equality must hold: Violations can only be detected when the attributes are used to create a primitive descriptor.


void get_rnn_weights_qparams(int& mask, std::vector<float>& scales)

Returns the quantization scaling factors for RNN projection weights tensors.

NOTE:

The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.

Parameters:

mask	Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor should be used each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales	Constant vector of output scaling factors. The following equality must hold: Violations can only be detected when the attributes are used to create a primitive descriptor.


void set_rnn_weights_projection_qparams(
    int mask,
    const std::vector<float>& scales
    )

Sets quantization scaling factors for RNN projection weights tensors.

passed to RNN primitives using attributes.

NOTE:

The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.

NOTE:

Quantization scales are common for weights_layer and weights_iteration

Parameters:

mask	Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor should be used each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales	Constant vector of output scaling factors. The following equality must hold: Violations can only be detected when the attributes are used to create a primitive descriptor.


void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales)

Returns the quantization scaling factors for RNN projection weights tensors.

NOTE:

The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.

Parameters:

mask	Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor should be used each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales	Constant vector of output scaling factors. The following equality must hold: Violations can only be detected when the attributes are used to create a primitive descriptor.

Select Your Language

Using Intel.com Search

Quick Links

Recent Searches

Advanced Search

Only search in

Intel® oneAPI Deep Neural Network Developer Guide and Reference

struct dnnl::primitive_attr

Overview

Detailed Documentation