Visible to Intel only — GUID: GUID-8D20EFAB-D93C-443D-9438-25E8BCC12228
Covariance
In statistics, covariance and correlation are two of the most fundamental measures of linear dependence between two random variables. Both describe the joint variability of any two features. The correlation is dimensionless, while the covariance is measured in units obtained by multiplying the units of the two features. Another important distinction is that the covariance can be dominated by a feature with high variance, while the correlation removes the effect of the variances by dividing the covariance of two features by the square root of the product of their variances. Which measure to use depends on the application. The covariance algorithm computes the following:
Means
Covariance
Correlation
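The difference between the two measures can be illustrated with a small sketch in plain Python. It does not use the library's API. The data values and the helper names `mean`, `covariance`, and `correlation` are hypothetical and exist only for this example. The same feature is expressed in two different units to show that the covariance changes with the unit while the correlation does not.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance with the 1/(n-1) normalization."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance divided by the square root of the product of the variances."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical measurements: height in metres and weight in kilograms.
height_m = [1.60, 1.70, 1.80, 1.90]
weight_kg = [55.0, 65.0, 72.0, 85.0]

# The same heights expressed in centimetres (the variance grows by 100**2).
height_cm = [h * 100 for h in height_m]

cov_m = covariance(height_m, weight_kg)
cov_cm = covariance(height_cm, weight_kg)
cor_m = correlation(height_m, weight_kg)
cor_cm = correlation(height_cm, weight_kg)

print(cov_m, cov_cm)  # the covariance is about 100 times larger in cm
print(cor_m, cor_cm)  # the correlation is the same up to rounding
```

Changing the unit of one feature rescales the covariance by the same factor, so covariances of differently scaled features are hard to compare directly; the correlation stays within the range from -1 to 1 regardless of units.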
Mathematical formulation
Computing
Given a dataset \(X = \{ x_1, \ldots, x_n \}\) with \(n\) feature vectors of dimension \(p\), the means form a \(1 \times p\) matrix, and the covariance and correlation matrices are \(p \times p\) square matrices. The means, the covariance, and the correlation are computed with the following formulas:
Statistic | Definition |
---|---|
Means | \(M = (m(1), \ldots , m(p))\), where \(m(j)=\frac{1}{n}\sum_{i=1}^{n}x_{ij}\) |
Covariance matrix | \(Cov = (v_{ij})\), where \(v_{ij}=\frac{1}{n-1}\sum_{k=1}^{n}(x_{ki}-m(i))(x_{kj}-m(j))\), \(i=\overline{1,p}\), \(j=\overline{1,p}\) |
Correlation matrix | \(Cor = (c_{ij})\), where \(c_{ij}=\frac{v_{ij}}{\sqrt{v_{ii}\cdot v_{jj}}}\), \(i=\overline{1,p}\), \(j=\overline{1,p}\) |
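The three definitions above can be written out directly in plain Python. This is a minimal reference sketch for checking results on small inputs, not the library's implementation. The function name `compute_statistics` and the sample data are assumptions made for this example. It assumes at least two observations and nonzero variance for every feature.

```python
import math

def compute_statistics(X):
    """Return (means, cov, cor) for a dataset given as n rows of p features."""
    n, p = len(X), len(X[0])

    # m(j) = (1/n) * sum_i x_ij
    means = [sum(row[j] for row in X) / n for j in range(p)]

    # v_ij = (1/(n-1)) * sum_k (x_ki - m(i)) * (x_kj - m(j))
    cov = [
        [sum((row[i] - means[i]) * (row[j] - means[j]) for row in X) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

    # c_ij = v_ij / sqrt(v_ii * v_jj)
    cor = [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
        for i in range(p)
    ]
    return means, cov, cor

# Hypothetical dataset with n = 4 observations and p = 2 features.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 7.0], [4.0, 7.0]]
means, cov, cor = compute_statistics(X)

print(means)  # 1 x p
print(cov)    # p x p, symmetric
print(cor)    # p x p, symmetric, ones on the diagonal
```

The diagonal of the covariance matrix holds the per-feature variances, which is why the correlation matrix always has ones on its diagonal.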
Computation method: dense
The method computes the means, the variance-covariance matrix, or the correlation matrix for dense data. This is the default method and the only one supported.
Programming Interface
Refer to API Reference: Covariance.
Distributed mode
The algorithm supports distributed execution in SPMD (single program, multiple data) mode, on GPU only.
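The library's distributed implementation is not described on this page. The sketch below only illustrates one standard way a covariance matrix can be assembled when the rows of a dataset are split across several processes: each process computes partial sums over its own rows, the partial results are added together, and the final matrix is derived from the totals. The helper names `partial_sums`, `merge`, and `finalize` are hypothetical. The processes are simulated with a plain list of row blocks, so no communication library is involved.

```python
def partial_sums(block):
    """Per-partition statistics: row count, column sums, raw cross-products."""
    p = len(block[0])
    sums = [sum(row[j] for row in block) for j in range(p)]
    cross = [
        [sum(row[i] * row[j] for row in block) for j in range(p)]
        for i in range(p)
    ]
    return len(block), sums, cross

def merge(a, b):
    """Combine two partial results by element-wise addition."""
    n_a, s_a, c_a = a
    n_b, s_b, c_b = b
    sums = [x + y for x, y in zip(s_a, s_b)]
    cross = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(c_a, c_b)]
    return n_a + n_b, sums, cross

def finalize(n, sums, cross):
    """Turn the global totals into the means and the covariance matrix."""
    p = len(sums)
    means = [s / n for s in sums]
    # v_ij = (sum_k x_ki * x_kj - s_i * s_j / n) / (n - 1)
    cov = [
        [(cross[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    return means, cov

# Two simulated processes, each holding two rows of a 4 x 2 dataset.
blocks = [
    [[1.0, 2.0], [2.0, 4.0]],
    [[3.0, 7.0], [4.0, 7.0]],
]

total = partial_sums(blocks[0])
for block in blocks[1:]:
    total = merge(total, partial_sums(block))

means, cov = finalize(*total)
print(means)
print(cov)
```

The raw cross-product form used here is algebraically equal to the centered definition of the covariance, but it can lose precision when the means are large relative to the spread of the data, so a production implementation may prefer a more numerically careful merge.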