Cross-entropy Loss

Intel® oneAPI Data Analytics Library Developer Guide and Reference

ID 772611

Date 12/16/2022

Cross-entropy loss is an objective function minimized in the process of logistic regression training when a dependent variable takes more than two values.

Details

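Given $n$ feature vectors $X = \{x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\}$ of dimension $p$ and a vector of class labels $y = (y_1, \ldots, y_n)$, where $y_i \in \{0, \ldots, T-1\}$ is the class to which the feature vector $x_i$ belongs and $T$ is the number of classes, the optimization solver minimizes the cross-entropy loss objective function with respect to the argument $\theta$, which is a matrix of size $T \times (p+1)$. The cross-entropy loss objective function $K(\theta, X, y)$ has the following format:

$$K(\theta, X, y) = F(\theta) + M(\theta)$$

where

$$F(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \log p_{y_i}(x_i, \theta) + \lambda_2 \sum_{t=0}^{T-1} \sum_{j=1}^{p} \theta_{tj}^2$$

$$M(\theta) = \lambda_1 \sum_{t=0}^{T-1} \sum_{j=1}^{p} |\theta_{tj}|$$

$$p_t(z, \theta) = \frac{e^{f_t(z, \theta)}}{\sum_{s=0}^{T-1} e^{f_s(z, \theta)}}$$

with $f_t(z, \theta) = \theta_{t0} + \sum_{j=1}^{p} \theta_{tj} z_j$ and $\lambda_1 \geq 0$, $\lambda_2 \geq 0$.

For a given set of indices $I = \{i_1, i_2, \ldots, i_m\}$, $1 \leq i_r \leq n$, $r \in \{1, \ldots, m\}$, the value and the gradient of the sum of functions in the argument $X$ respectively have the format:

$$F_I(\theta, X, y) = -\frac{1}{m} \sum_{i \in I} \log p_{y_i}(x_i, \theta) + \lambda_2 \sum_{t=0}^{T-1} \sum_{j=1}^{p} \theta_{tj}^2$$

$$\nabla F_I(\theta, X, y) = \left\{ \frac{\partial F_I}{\partial \theta_{00}}, \ldots, \frac{\partial F_I}{\partial \theta_{T-1,p}} \right\}^{\top}$$

where

$$\frac{\partial F_I}{\partial \theta_{tj}} = \frac{1}{m} \sum_{i \in I} g_t(\theta, x_i, y_i)\, x_{ij} + L_{tj}(\theta), \quad x_{i0} = 1$$

$$g_t(\theta, x, y) = \begin{cases} p_t(x, \theta) - 1, & y = t \\ p_t(x, \theta), & y \neq t \end{cases} \qquad L_{tj}(\theta) = \begin{cases} 2 \lambda_2 \theta_{tj}, & j \neq 0 \\ 0, & j = 0 \end{cases}$$

The Hessian matrix is a symmetric matrix of size $S \times S$, where $S = T \times (p+1)$ and

$$\frac{\partial^2 F_I}{\partial \theta_{tj}\, \partial \theta_{uq}} = \frac{1}{m} \sum_{i \in I} x_{ij}\, x_{iq}\, g_{tu}(\theta, x_i) + \begin{cases} 2 \lambda_2, & t = u,\ j = q,\ j \neq 0 \\ 0, & \text{otherwise} \end{cases}$$

$$g_{tu}(\theta, x) = \begin{cases} p_t(x, \theta)\,(1 - p_t(x, \theta)), & u = t \\ -p_t(x, \theta)\, p_u(x, \theta), & u \neq t \end{cases}$$

The proximal projection for the non-smooth term is

$$\mathrm{prox}_{\eta}^{M}(\theta_{tj}) = \begin{cases} \theta_{tj} - \lambda_1 \eta, & \theta_{tj} > \lambda_1 \eta \\ 0, & |\theta_{tj}| \leq \lambda_1 \eta \\ \theta_{tj} + \lambda_1 \eta, & \theta_{tj} < -\lambda_1 \eta \end{cases}$$

where $\eta$ is the learning rate.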
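
The proximal projection that pairs with the L1 term (requested through `proximalProjection`) is commonly implemented as soft-thresholding. The following is a minimal Python sketch under that assumption, not the library's code; the function name `soft_threshold` and its parameter names are hypothetical and chosen for illustration only.

```python
def soft_threshold(theta_j, penalty_l1, learning_rate):
    """Soft-threshold one coefficient (illustrative helper, not a oneDAL API).

    The cutoff is penalty_l1 * learning_rate: coefficients inside
    [-cutoff, cutoff] collapse to zero, others shrink toward zero by cutoff.
    """
    cutoff = penalty_l1 * learning_rate
    if theta_j > cutoff:
        return theta_j - cutoff
    if theta_j < -cutoff:
        return theta_j + cutoff
    return 0.0
```

With `penalty_l1 = 0.5` and `learning_rate = 0.5` the cutoff is 0.25, so a coefficient of 1.0 shrinks to 0.75 and a coefficient of 0.2 becomes 0.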

For more details, see [Hastie2009].
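
To make the value and gradient of the smooth term concrete, here is a small pure-Python sketch of a softmax cross-entropy objective with an L2 penalty that skips the intercept. It is an illustration under stated assumptions rather than the library's implementation: the names `softmax_probs` and `cross_entropy` are hypothetical, `theta` is stored as `T` rows of `[intercept, weight_1, ..., weight_p]`, batch indices are 0-based, and the maximum score is subtracted before exponentiation for numerical stability.

```python
import math


def softmax_probs(theta, x):
    """Class probabilities for one feature vector x.

    theta holds one row per class: [intercept, weight_1, ..., weight_p].
    """
    scores = [row[0] + sum(w * v for w, v in zip(row[1:], x)) for row in theta]
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(theta, data, labels, batch=None, penalty_l2=0.0):
    """Return (value, gradient) of the smooth term over the chosen batch.

    batch is a list of 0-based row indices; None means all rows are used.
    """
    idx = list(range(len(data))) if batch is None else list(batch)
    m = len(idx)
    n_classes, width = len(theta), len(theta[0])
    value = 0.0
    grad = [[0.0] * width for _ in range(n_classes)]
    for i in idx:
        probs = softmax_probs(theta, data[i])
        value -= math.log(probs[labels[i]]) / m
        x_ext = [1.0] + list(data[i])  # leading 1 multiplies the intercept
        for t in range(n_classes):
            g = probs[t] - (1.0 if labels[i] == t else 0.0)
            for j in range(width):
                grad[t][j] += g * x_ext[j] / m
    # The L2 penalty covers the weights only (columns 1..p), not the intercept.
    for t in range(n_classes):
        for j in range(1, width):
            value += penalty_l2 * theta[t][j] ** 2
            grad[t][j] += 2.0 * penalty_l2 * theta[t][j]
    return value, grad
```

For example, with three classes and an all-zero `theta`, every class has probability 1/3, so the value for a single observation is `log(3)` regardless of its label.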

Computation

Algorithm Input

The cross entropy loss algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

Algorithm Input for Cross-entropy Loss Computation
Input ID	Input
`argument`	A numeric table with $(p+1) \times \text{nClasses}$ elements that holds the input argument $\theta$ of the objective function. NOTE: The sizes of the argument, gradient, and hessian numeric tables do not depend on `interceptFlag`. When `interceptFlag` is set to `false`, the computation of the intercept terms $\theta_{t0}$ is skipped, but the sizes of the tables should remain the same.
`data`	A numeric table of size $n \times p$ with the data $x_{ij}$. NOTE: This parameter can be an object of any class derived from `NumericTable`.
`dependentVariables`	A numeric table of size $n \times 1$ with dependent variables $y_i$. NOTE: This parameter can be an object of any class derived from `NumericTable`, except for `PackedTriangularMatrix`, `PackedSymmetricMatrix`, and `CSRNumericTable`.

Algorithm Parameters

The cross entropy loss algorithm has the following parameters. Some of them are required only for specific values of the computation method’s parameter method:

Algorithm Parameters for Cross-entropy Loss Computation
Parameter	Default value	Description
`algorithmFPType`	`float`	The floating-point type that the algorithm uses for intermediate computations. Can be `float` or `double`.
`method`	`defaultDense`	Performance-oriented computation method.
`numberOfTerms`	Not applicable	The number of terms in the objective function.
`batchIndices`	Not applicable	The numeric table of size $1 \times m$, where $m$ is the batch size, with a batch of indices to be used to compute the function results. If no indices are provided, the implementation uses all the terms in the computation. NOTE: This parameter can be an object of any class derived from `NumericTable` except `PackedTriangularMatrix` and `PackedSymmetricMatrix`.
`resultsToCompute`	`gradient`	The 64-bit integer flag that specifies which characteristics of the objective function to compute. Provide one of the following values to request a single characteristic, or use bitwise OR to request a combination of the characteristics: `value`: value of the objective function; `nonSmoothTermValue`: value of the non-smooth term of the objective function; `gradient`: gradient of the smooth term of the objective function; `hessian`: Hessian of the smooth term of the objective function; `proximalProjection`: projection of the proximal operator for the non-smooth term of the objective function; `lipschitzConstant`: Lipschitz constant of the smooth term of the objective function; `gradientOverCertainFeature`: certain component of the gradient vector; `hessianOverCertainFeature`: certain component of the Hessian diagonal; `proximalProjectionOfCertainFeature`: certain component of the proximal projection.
`interceptFlag`	`true`	A flag that indicates whether to compute the intercept terms $\theta_{t0}$.
`penaltyL1`	0	L1 regularization coefficient
`penaltyL2`	0	L2 regularization coefficient
`nClasses`	Not applicable	The number of classes (different values of dependent variable)
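
The `resultsToCompute` parameter is described as an integer flag whose values can be combined with bitwise OR. The sketch below illustrates that pattern with Python's standard `enum.IntFlag`. The class `ResultToCompute`, its upper-case member names, and the bit positions are hypothetical stand-ins and do not reproduce the library's actual constants.

```python
from enum import IntFlag, auto


class ResultToCompute(IntFlag):
    """Hypothetical flag set; bit values are illustrative, not the library's."""
    VALUE = auto()
    NON_SMOOTH_TERM_VALUE = auto()
    GRADIENT = auto()
    HESSIAN = auto()
    PROXIMAL_PROJECTION = auto()
    LIPSCHITZ_CONSTANT = auto()


# Request the objective value and the gradient in a single call.
requested = ResultToCompute.VALUE | ResultToCompute.GRADIENT

# A bitwise AND tests whether a given characteristic was requested.
wants_gradient = bool(requested & ResultToCompute.GRADIENT)
wants_hessian = bool(requested & ResultToCompute.HESSIAN)
```

Because each member occupies its own bit, any combination of characteristics maps to a distinct integer, which is why a single integer parameter can carry the whole request.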