Application Notes for Intel® oneAPI Math Kernel Library Summary Statistics

ID 772991
Date 12/04/2020
Public

Performing Robust Estimation of a Variance-Covariance Matrix

Use the Translated Biweight S-estimator (TBS) method to perform robust estimation of a variance-covariance matrix and mean vector [Rocke96]. The starting point of the algorithm is computed using a single iteration of the Maronna algorithm with the reweighting step [Marrona2002]. The parameters of the TBS algorithm are packed into the params array. A pointer to this array, along with the other required parameters, is passed to the task descriptor using the EditRobustCovariance editor. The structure of the params array is described in the table "Structure of the Array of TBS Parameters" in the Summary Statistics section of [MKLMan].

The algorithm outputs a robust variance-covariance matrix and the mean vector. The following example illustrates how to compute a robust estimate of the variance-covariance matrix with the TBS estimator:

#include "mkl_vsl.h"
 
#define DIM   10   /* dimension of the task */
#define N   1000   /* number of observations */
 
int main()
{
    VSLSSTaskPtr task;
    double x[DIM][N];  /* matrix of observations; fill with data before computing */
    double params[VSL_SS_TBS_PARAMS_N];
    double rcov[DIM*(DIM+1)/2], rmean[DIM];
    MKL_INT nparams, xstorage, rcovstorage;
    MKL_INT p, n;
    int status;
  
    double breakdown, alpha, sigma, max_iter;
 
    /* Parameters of the task and initialization */
    p = DIM;
    n = N;
    xstorage    = VSL_SS_MATRIX_STORAGE_ROWS;
    rcovstorage = VSL_SS_MATRIX_STORAGE_U_PACKED;
    nparams     = VSL_SS_TBS_PARAMS_N; /* number of TBS parameters */
 
    /* Parameters of the TBS estimator */
    breakdown = 0.3;
    alpha = 0.01;
    sigma = 0.01;
    max_iter = 30;
 
    params[0] = breakdown;
    params[1] = alpha;
    params[2] = sigma;
    params[3] = max_iter;
 
    /* Create a task */
    status = vsldSSNewTask( &task, &p, &n, &xstorage, (double*)x, 0, 0 );
 
    /* Initialize the task parameters */
    status = vsldSSEditRobustCovariance( task, &rcovstorage, 
                                         &nparams, params, rmean, rcov );
 
    /* Compute the robust variance-covariance matrix */
    status = vsldSSCompute( task, VSL_SS_ROBUST_COV, VSL_SS_METHOD_TBS );
                                     
    /* Deallocate the task resources */
    status = vslSSDeleteTask( &task );
 
    return 0;
}

To calculate a robust variance-covariance matrix, the algorithm needs the inverse of the variance-covariance matrix to compute the Mahalanobis distance. In some cases the inverse matrix cannot be calculated, for example, if the components of the random vector are linearly dependent. The Summary Statistics TBS algorithm checks the invertibility of the matrix by calculating its eigenvalues. If the minimum eigenvalue is non-positive, the algorithm searches for the smallest positive eigenvalue E of the matrix that exceeds 1000*P, where P is the minimal positive floating-point number. If the routine fails to find such an eigenvalue, it terminates the computations and returns an error code. Otherwise, the variance-covariance matrix is corrected by adding 0.01*E to the elements of the main diagonal, and the calculations continue. In that case, upon successful completion, the function returns the VSL_SS_NOT_FULL_RANK_MATRIX warning, indicating that the algorithm has detected a variance-covariance matrix of incomplete rank.
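The diagonal correction described above can be sketched in plain C. This is only an illustration of the rule as stated in the text, not the library's internal implementation. The function name regularize_cov and its return codes are hypothetical, the eigenvalues are assumed to be computed elsewhere and passed in by the caller, the matrix is assumed to be in full row-major storage, and P is assumed to correspond to DBL_MIN for double precision.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper, not part of oneMKL.
 * cov: p-by-p variance-covariance matrix, full row-major storage,
 *      corrected in place when needed.
 * eig: the p eigenvalues of cov in any order, supplied by the caller.
 * Returns  0 if the minimum eigenvalue is positive (no correction),
 *          1 if 0.01*E was added to the main diagonal,
 *         -1 if no eigenvalue exceeds the threshold. */
int regularize_cov(double *cov, const double *eig, size_t p)
{
    /* Assumption: P is the smallest positive normalized double. */
    const double threshold = 1000.0 * DBL_MIN;
    double min_eig, e = 0.0;
    int found = 0;
    size_t i;

    assert(cov != NULL && eig != NULL && p > 0);

    /* Find the minimum eigenvalue. */
    min_eig = eig[0];
    for (i = 1; i < p; i++) {
        if (eig[i] < min_eig)
            min_eig = eig[i];
    }
    if (min_eig > 0.0)
        return 0;

    /* Find the smallest eigenvalue E that exceeds 1000*P. */
    for (i = 0; i < p; i++) {
        if (eig[i] > threshold && (!found || eig[i] < e)) {
            e = eig[i];
            found = 1;
        }
    }
    if (!found)
        return -1;

    /* Add 0.01*E to every element of the main diagonal. */
    for (i = 0; i < p; i++)
        cov[i * p + i] += 0.01 * e;

    return 1;
}
```

For a 2-by-2 matrix with all elements equal to 1, the eigenvalues are 0 and 2, so the sketch adds 0.02 to each diagonal element and leaves the off-diagonal elements unchanged.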

The max_iter parameter, passed as the fourth element of the array of TBS parameters (params[3] in the example above), defines the maximum number of iterations the TBS algorithm can perform before terminating the calculations. If this parameter is set to zero, the function returns a robust estimate of the variance-covariance matrix computed by the Maronna method only.
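As a small illustration of the array layout used in the example above, the following sketch packs the four TBS parameters through named indices, which makes it harder to place max_iter in the wrong slot. The enum names and the pack_tbs_params function are hypothetical and are not part of oneMKL. The layout (breakdown, alpha, sigma, max_iter at indices 0 through 3) is taken from the example and should be checked against the parameter table in [MKLMan].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical index names mirroring the layout in the example above. */
enum {
    TBS_BREAKDOWN = 0,  /* breakdown point */
    TBS_ALPHA     = 1,  /* alpha, as in the example */
    TBS_SIGMA     = 2,  /* sigma, as in the example */
    TBS_MAX_ITER  = 3,  /* maximum number of TBS iterations */
    TBS_NPARAMS   = 4   /* number of values packed by this sketch */
};

/* Hypothetical helper: fills a caller-provided array of at least
 * TBS_NPARAMS doubles. Passing max_iter == 0 requests the estimate
 * obtained from the Maronna starting point only. */
void pack_tbs_params(double *params, double breakdown, double alpha,
                     double sigma, int max_iter)
{
    assert(params != NULL);
    assert(max_iter >= 0);

    params[TBS_BREAKDOWN] = breakdown;
    params[TBS_ALPHA]     = alpha;
    params[TBS_SIGMA]     = sigma;
    params[TBS_MAX_ITER]  = (double)max_iter;
}
```

Calling pack_tbs_params(params, 0.3, 0.01, 0.01, 0) fills the array with the values from the example, with the iteration count set to zero so that only the Maronna starting estimate is returned.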

The Summary Statistics algorithms for computing a robust variance-covariance matrix and an array of means do not support progressive processing of datasets that are available in blocks.