Quality Metrics for Multi-class Classification Algorithms

Intel® oneAPI Data Analytics Library Developer Guide and Reference

ID 772611

Date 3/22/2024

For $l$ classes $C_1, \ldots, C_l$, given a vector of class labels computed at the prediction stage of the classification algorithm and a vector of expected class labels, the problem is to evaluate the classifier by computing the confusion matrix and the associated quality metrics: precision, error rate, and so on.

The `QualityMetricsId` for multi-class classification is `confusionMatrix`.

Details

Further definitions use the following notations:

Notations for Quality Metrics for Multi-class Classification Algorithms
$tp_i$	true positive	the number of correctly recognized observations for class $C_i$
$tn_i$	true negative	the number of correctly recognized observations that do not belong to class $C_i$
$fp_i$	false positive	the number of observations that were incorrectly assigned to class $C_i$
$fn_i$	false negative	the number of observations that were not recognized as belonging to class $C_i$

The library uses the following quality metrics for multi-class classifiers:

Definitions of Quality Metrics for Multi-class Classification Algorithms
Quality Metric	Definition
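Average accuracy	$\frac{\sum_{i=1}^{l} \frac{tp_i + tn_i}{tp_i + fn_i + fp_i + tn_i}}{l}$
Error rate	$\frac{\sum_{i=1}^{l} \frac{fp_i + fn_i}{tp_i + fn_i + fp_i + tn_i}}{l}$
Micro precision ($\text{Precision}_\mu$)	$\frac{\sum_{i=1}^{l} tp_i}{\sum_{i=1}^{l} (tp_i + fp_i)}$
Micro recall ($\text{Recall}_\mu$)	$\frac{\sum_{i=1}^{l} tp_i}{\sum_{i=1}^{l} (tp_i + fn_i)}$
Micro F-score ($\text{F-score}_\mu$)	$\frac{(\beta^2 + 1)\,\text{Precision}_\mu \times \text{Recall}_\mu}{\beta^2 \times \text{Precision}_\mu + \text{Recall}_\mu}$
Macro precision ($\text{Precision}_M$)	$\frac{\sum_{i=1}^{l} \frac{tp_i}{tp_i + fp_i}}{l}$
Macro recall ($\text{Recall}_M$)	$\frac{\sum_{i=1}^{l} \frac{tp_i}{tp_i + fn_i}}{l}$
Macro F-score ($\text{F-score}_M$)	$\frac{(\beta^2 + 1)\,\text{Precision}_M \times \text{Recall}_M}{\beta^2 \times \text{Precision}_M + \text{Recall}_M}$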

For more details of these metrics, including the evaluation focus, refer to [Sokolova09].
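
The eight metrics can be illustrated with a minimal Python sketch. It is not the library's implementation: it assumes the standard micro- and macro-averaged definitions from [Sokolova09], takes the per-class counts of true positives, false positives, false negatives, and true negatives as plain lists, and assumes every class is both predicted and present at least once so that no denominator is zero. The function name `multiclass_metrics` is hypothetical; the dictionary keys reuse the Multi-class Metrics IDs listed in this section.

```python
from typing import Dict, Sequence


def multiclass_metrics(tp: Sequence[int], fp: Sequence[int],
                       fn: Sequence[int], tn: Sequence[int],
                       beta: float = 1.0) -> Dict[str, float]:
    """Compute the metrics from per-class counts (one entry per class).

    Hypothetical helper: assumes the [Sokolova09] definitions and
    non-zero denominators for every class.
    """
    l = len(tp)
    if l == 0 or not (len(fp) == len(fn) == len(tn) == l):
        raise ValueError("all count lists must be non-empty and equally long")

    # Each class sees every observation exactly once, as tp, fn, fp, or tn.
    totals = [tp[i] + fn[i] + fp[i] + tn[i] for i in range(l)]

    average_accuracy = sum((tp[i] + tn[i]) / totals[i] for i in range(l)) / l
    error_rate = sum((fp[i] + fn[i]) / totals[i] for i in range(l)) / l

    # Micro averaging pools the counts over classes before dividing.
    micro_precision = sum(tp) / (sum(tp) + sum(fp))
    micro_recall = sum(tp) / (sum(tp) + sum(fn))

    # Macro averaging computes the ratio per class, then averages.
    macro_precision = sum(tp[i] / (tp[i] + fp[i]) for i in range(l)) / l
    macro_recall = sum(tp[i] / (tp[i] + fn[i]) for i in range(l)) / l

    def f_score(precision: float, recall: float) -> float:
        b2 = beta * beta
        return (b2 + 1) * precision * recall / (b2 * precision + recall)

    return {
        "averageAccuracy": average_accuracy,
        "errorRate": error_rate,
        "microPrecision": micro_precision,
        "microRecall": micro_recall,
        "microFscore": f_score(micro_precision, micro_recall),
        "macroPrecision": macro_precision,
        "macroRecall": macro_recall,
        "macroFscore": f_score(macro_precision, macro_recall),
    }


# Illustrative counts for three classes and 20 observations (made-up data).
example = multiclass_metrics(tp=[5, 3, 7], fp=[2, 2, 1],
                             fn=[1, 3, 1], tn=[12, 12, 11])
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, micro precision and micro recall both equal 15/20 = 0.75, and average accuracy and error rate sum to 1 because each class's four counts partition the observations.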

The following is the confusion matrix:

Confusion Matrix for Multi-class Classification Algorithms
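	Classified as Class $C_1$	$\ldots$	Classified as Class $C_i$	$\ldots$	Classified as Class $C_l$
Actual Class $C_1$	$c_{11}$	$\ldots$	$c_{1i}$	$\ldots$	$c_{1l}$
$\ldots$	$\ldots$	$\ldots$	$\ldots$	$\ldots$	$\ldots$
Actual Class $C_i$	$c_{i1}$	$\ldots$	$c_{ii}$	$\ldots$	$c_{il}$
$\ldots$	$\ldots$	$\ldots$	$\ldots$	$\ldots$	$\ldots$
Actual Class $C_l$	$c_{l1}$	$\ldots$	$c_{li}$	$\ldots$	$c_{ll}$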

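The positives and negatives are defined through elements of the confusion matrix as follows, where $c_{nk}$ is the number of observations of actual class $C_n$ that were classified as class $C_k$:

$tp_i = c_{ii}$
$fp_i = \sum_{n=1}^{l} c_{ni} - tp_i$
$fn_i = \sum_{n=1}^{l} c_{in} - tp_i$
$tn_i = \sum_{n=1}^{l} \sum_{k=1}^{l} c_{nk} - tp_i - fp_i - fn_i$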
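
As a rough illustration of how the confusion matrix and the per-class positives and negatives relate, the following Python sketch builds the matrix from two label vectors and then derives the counts. It is an assumption-laden sketch rather than the library's code: labels are assumed to be integers in the range 0 to l-1, rows index the actual class and columns the classified class, and the function names `confusion_matrix` and `per_class_counts` are hypothetical.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(predicted: Sequence[int], ground_truth: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Return c where c[a][p] counts observations of actual class a
    that were classified as class p (labels assumed to be 0..n_classes-1)."""
    if len(predicted) != len(ground_truth):
        raise ValueError("label vectors must have the same length")
    c = [[0] * n_classes for _ in range(n_classes)]
    for pred, actual in zip(predicted, ground_truth):
        c[actual][pred] += 1
    return c


def per_class_counts(c: List[List[int]]) -> Tuple[List[int], List[int],
                                                   List[int], List[int]]:
    """Derive (tp, fp, fn, tn), one entry per class, from a square matrix."""
    l = len(c)
    total = sum(sum(row) for row in c)
    tp = [c[i][i] for i in range(l)]
    # Column sum minus the diagonal: classified as class i, but actually not i.
    fp = [sum(c[n][i] for n in range(l)) - tp[i] for i in range(l)]
    # Row sum minus the diagonal: actually class i, but classified otherwise.
    fn = [sum(c[i]) - tp[i] for i in range(l)]
    # Everything that involves class i neither as actual nor as classified.
    tn = [total - tp[i] - fp[i] - fn[i] for i in range(l)]
    return tp, fp, fn, tn


# Made-up labels for six observations and three classes.
ground_truth = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]

matrix = confusion_matrix(predicted, ground_truth, n_classes=3)
tp, fp, fn, tn = per_class_counts(matrix)
print(matrix)
print(tp, fp, fn, tn)
```

For every class, the four counts add up to the total number of observations, which gives a quick sanity check on any confusion matrix.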

Batch Processing

Algorithm Input

The quality metric algorithm for multi-class classifiers accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

Algorithm Input for Quality Metrics for Multi-class Classification Algorithms (Batch Processing)
Input ID	Input
`predictedLabels`	Pointer to the numeric table that contains labels computed at the prediction stage of the classification algorithm. This input can be an object of any class derived from `NumericTable` except `PackedSymmetricMatrix`, `PackedTriangularMatrix`, and `CSRNumericTable`.
`groundTruthLabels`	Pointer to the numeric table that contains expected labels. This input can be an object of any class derived from `NumericTable` except `PackedSymmetricMatrix`, `PackedTriangularMatrix`, and `CSRNumericTable`.

Algorithm Parameters

The quality metric algorithm has the following parameters:

Algorithm Parameters for Quality Metrics for Multi-class Classification Algorithms (Batch Processing)
Parameter	Default Value	Description
`algorithmFPType`	`float`	The floating-point type that the algorithm uses for intermediate computations. Can be `float` or `double`.
`method`	`defaultDense`	Performance-oriented computation method, the only method supported by the algorithm.
`nClasses`	0	The number of classes (l).
`useDefaultMetrics`	`true`	A flag that specifies whether to compute the default metrics provided by the library.
`beta`	1	The parameter of the F-score quality metric provided by the library.
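
To make the role of `beta` concrete, the sketch below evaluates the general F-score for a fixed precision and recall at several values of beta. It assumes the standard F-beta formula from [Sokolova09]; the function name `f_beta` and the convention of returning 0.0 when the denominator is zero are assumptions of this sketch, not documented library behavior.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """General F-score: beta > 1 weights recall more, beta < 1 weights precision more."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Assumed convention for this sketch only.
        return 0.0
    return (b2 + 1) * precision * recall / denominator


# Made-up precision and recall values for illustration.
precision, recall = 0.5, 1.0
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: F-score={f_beta(precision, recall, beta):.4f}")
```

With precision 0.5 and recall 1.0, the score rises from about 0.5556 at beta = 0.5 to about 0.6667 at the default beta = 1 (the harmonic mean) and about 0.8333 at beta = 2, because larger values of beta give more weight to recall.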

Algorithm Output

The quality metric algorithm calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

Algorithm Output for Quality Metrics for Multi-class Classification Algorithms (Batch Processing)
Result ID	Result
`confusionMatrix`	Pointer to the numeric table with the confusion matrix. NOTE: By default, this result is an object of the `HomogenNumericTable` class, but you can define the result as an object of any class derived from `NumericTable` except `PackedTriangularMatrix`, `PackedSymmetricMatrix`, and `CSRNumericTable`.
`multiClassMetrics`	Pointer to the numeric table that contains quality metrics, which you can access by the appropriate Multi-class Metrics ID: `averageAccuracy` (average accuracy), `errorRate` (error rate), `microPrecision` (micro precision), `microRecall` (micro recall), `microFscore` (micro F-score), `macroPrecision` (macro precision), `macroRecall` (macro recall), `macroFscore` (macro F-score). NOTE: By default, this result is an object of the `HomogenNumericTable` class, but you can define the result as an object of any class derived from `NumericTable` except `PackedTriangularMatrix`, `PackedSymmetricMatrix`, and `CSRNumericTable`.