Basic Statistics
The basic statistics algorithm computes the following quantitative characteristics of a dataset:
minimums/maximums
sums
means
sums of squares
sums of squared differences from the means
second order raw moments
variances
standard deviations
variations (variation coefficients)
Operation | Computational methods | Programming Interface
---|---|---
Computing | dense | API Reference: Basic statistics
Mathematical formulation
Computing
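Given a set $X$ of $n$ $p$-dimensional feature vectors $x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})$, the problem is to compute the following sample characteristics for each feature $j = 1, \ldots, p$ in the dataset:

Statistic | Definition
---|---
Minimum | $\min(j) = \min_{i} x_{ij}$
Maximum | $\max(j) = \max_{i} x_{ij}$
Sum | $s(j) = \sum_{i} x_{ij}$
Sum of squares | $s_2(j) = \sum_{i} x_{ij}^2$
Means | $m(j) = \frac{s(j)}{n}$
Second order raw moment | $a_2(j) = \frac{s_2(j)}{n}$
Sum of squared difference from the means | $\mathrm{SDM}(j) = \sum_{i} (x_{ij} - m(j))^2$
Variance | $k_2(j) = \frac{\mathrm{SDM}(j)}{n - 1}$
Standard deviation | $\mathrm{stdev}(j) = \sqrt{k_2(j)}$
Variation coefficient | $V(j) = \frac{\mathrm{stdev}(j)}{m(j)}$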
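The statistics listed above can be sketched for a single feature with the C++ standard library alone. This is a minimal illustration, not the oneDAL implementation: the names `FeatureStats` and `compute_feature_stats` are hypothetical, and the sketch assumes the unbiased sample variance with an n - 1 denominator.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical container for the statistics of one feature (not a oneDAL type).
struct FeatureStats {
    double min, max, sum, sum_squares, mean;
    double second_order_raw_moment, sum_squares_centered;
    double variance, standard_deviation, variation;
};

// Computes the statistics of one feature (one column of the dataset).
// Assumes at least two observations so that the n - 1 denominator is nonzero.
FeatureStats compute_feature_stats(const std::vector<double>& column) {
    assert(column.size() >= 2);
    const double n = static_cast<double>(column.size());

    FeatureStats r{};
    r.min = *std::min_element(column.begin(), column.end());
    r.max = *std::max_element(column.begin(), column.end());
    for (double x : column) {
        r.sum += x;              // sum of the values
        r.sum_squares += x * x;  // sum of the squared values
    }
    r.mean = r.sum / n;
    r.second_order_raw_moment = r.sum_squares / n;
    for (double x : column) {
        const double d = x - r.mean;
        r.sum_squares_centered += d * d;  // squared difference from the mean
    }
    r.variance = r.sum_squares_centered / (n - 1.0);
    r.standard_deviation = std::sqrt(r.variance);
    r.variation = r.standard_deviation / r.mean;  // undefined when the mean is zero
    return r;
}
```

For the column {1, 2, 3, 4} the sketch gives a sum of 10, a sum of squares of 30, a mean of 2.5, and a sum of squared differences from the mean of 5.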
Partial Computing
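Given a block of the dataset with $n$ feature vectors of dimension $p$, each partial result is a $1 \times p$ matrix that holds one value per feature. The partial results are computed for each feature $j = 1, \ldots, p$ with the following formulas, where $i$ runs over the rows of the block:

Statistic | Definition
---|---
Partial Minimum | $\min(j) = \min_{i} x_{ij}$
Partial Maximum | $\max(j) = \max_{i} x_{ij}$
Partial Sum | $s(j) = \sum_{i} x_{ij}$
Partial Sum of squares | $s_2(j) = \sum_{i} x_{ij}^2$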
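The partial quantities can be accumulated block by block. The sketch below shows one possible way to do this for a single feature using only the standard library; `PartialStats` and `update` are hypothetical names and do not belong to the oneDAL API. The row count is tracked alongside the four partial statistics because later steps need it.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical accumulator for one feature (not a oneDAL type).
struct PartialStats {
    std::size_t n = 0;  // number of observations seen so far
    double min = std::numeric_limits<double>::infinity();
    double max = -std::numeric_limits<double>::infinity();
    double sum = 0.0;
    double sum_squares = 0.0;
};

// Folds one block of observations of the feature into the accumulator.
void update(PartialStats& p, const std::vector<double>& block) {
    for (double x : block) {
        p.min = std::min(p.min, x);
        p.max = std::max(p.max, x);
        p.sum += x;
        p.sum_squares += x * x;
    }
    p.n += block.size();
}
```

Calling `update` once per block leaves the accumulator in the same state as a single pass over the concatenated blocks, up to floating-point rounding of the sums.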
Finalize Computing
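Given a partial result that accumulates the partial minimum, maximum, sum, and sum of squares over all processed blocks, each final statistic is a $1 \times p$ matrix that holds one value per feature. With $n$ the total number of processed feature vectors and $i$ running over all of them, the final statistics are computed for each feature $j = 1, \ldots, p$ with the following formulas:

Statistic | Definition
---|---
Finalize Minimum | $\min(j) = \min_{i} x_{ij}$
Finalize Maximum | $\max(j) = \max_{i} x_{ij}$
Finalize Sum | $s(j) = \sum_{i} x_{ij}$
Finalize Sum of squares | $s_2(j) = \sum_{i} x_{ij}^2$
Finalize Means | $m(j) = \frac{s(j)}{n}$
Finalize Second order raw moment | $a_2(j) = \frac{s_2(j)}{n}$
Finalize Sum of squared difference from the means | $\mathrm{SDM}(j) = \sum_{i} (x_{ij} - m(j))^2 = s_2(j) - \frac{s(j)^2}{n}$
Finalize Variance | $k_2(j) = \frac{\mathrm{SDM}(j)}{n - 1}$
Finalize Standard deviation | $\mathrm{stdev}(j) = \sqrt{k_2(j)}$
Finalize Variation coefficient | $V(j) = \frac{\mathrm{stdev}(j)}{m(j)}$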
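The finalization step needs only the accumulated minimum, maximum, sum, sum of squares, and row count. The sketch below illustrates this for a single feature; `Accumulated`, `FinalStats`, and `finalize_stats` are hypothetical names, and the code is not how oneDAL implements the step. It relies on the identity that the sum of squared differences from the mean equals the sum of squares minus the squared sum divided by n, and it assumes an n - 1 denominator for the variance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical accumulated partial result for one feature (not a oneDAL type).
struct Accumulated {
    std::size_t n;  // total number of observations
    double min, max, sum, sum_squares;
};

// Hypothetical container for the final statistics of one feature.
struct FinalStats {
    double min, max, sum, sum_squares, mean;
    double second_order_raw_moment, sum_squares_centered;
    double variance, standard_deviation, variation;
};

// Derives the final statistics from the accumulated quantities only.
// Assumes at least two observations so that n - 1 is nonzero.
FinalStats finalize_stats(const Accumulated& p) {
    assert(p.n >= 2);
    const double n = static_cast<double>(p.n);

    FinalStats r{};
    r.min = p.min;
    r.max = p.max;
    r.sum = p.sum;
    r.sum_squares = p.sum_squares;
    r.mean = p.sum / n;
    r.second_order_raw_moment = p.sum_squares / n;
    // sum_i (x_i - mean)^2 == sum_squares - sum^2 / n
    r.sum_squares_centered = p.sum_squares - p.sum * p.sum / n;
    r.variance = r.sum_squares_centered / (n - 1.0);
    r.standard_deviation = std::sqrt(r.variance);
    r.variation = r.standard_deviation / r.mean;  // undefined when the mean is zero
    return r;
}
```

The identity used for the centered sum of squares is exact in real arithmetic but can lose precision in floating point when the mean is large relative to the spread of the data, so a production implementation may prefer a more stable update.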
Computation method: dense
The method computes the basic statistics for each feature in the data set.
Programming Interface
Refer to API Reference: Basic statistics.
Online mode
The algorithm supports online mode.
Distributed mode
The algorithm supports distributed execution in SPMD mode (only on GPU).
Usage Example
Computing
void run_computing(const table& data) {
    // Create the algorithm descriptor and run the computation on the input table.
    const auto bs_desc = dal::basic_statistics::descriptor{};
    const auto result = dal::compute(bs_desc, data);

    // Print each statistic from the result object.
    std::cout << "Minimum:\n" << result.get_min() << std::endl;
    std::cout << "Maximum:\n" << result.get_max() << std::endl;
    std::cout << "Sum:\n" << result.get_sum() << std::endl;
    std::cout << "Sum of squares:\n" << result.get_sum_squares() << std::endl;
    std::cout << "Sum of squared difference from the means:\n"
              << result.get_sum_squares_centered() << std::endl;
    std::cout << "Mean:\n" << result.get_mean() << std::endl;
    std::cout << "Second order raw moment:\n" << result.get_second_order_raw_moment() << std::endl;
    std::cout << "Variance:\n" << result.get_variance() << std::endl;
    std::cout << "Standard deviation:\n" << result.get_standard_deviation() << std::endl;
    std::cout << "Variation:\n" << result.get_variation() << std::endl;
}
Examples
oneAPI DPC++
Batch Processing:
Online Processing:
oneAPI C++
Batch Processing:
Online Processing: