Get Started with the Intel® oneAPI Data Analytics Library
Intel® oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation.
For general information about oneDAL, visit the official oneDAL page.
Before You Begin
oneDAL is located in the <install_dir>/dal directory, where <install_dir> is the directory in which the Intel® oneAPI Base Toolkit was installed.
The current version of oneDAL with SYCL support is available for Linux* and Windows* 64-bit operating systems. The prebuilt oneDAL libraries can be found in the <install_dir>/dal/<version>/redist directory.
To learn about the system requirements and the dependencies needed to build examples, refer to the System Requirements page.
End-to-end Example
Below is a typical workflow for running a oneDAL algorithm on a GPU. The example uses the Principal Component Analysis (PCA) algorithm.
The following steps show how to:
Read the data from a CSV file
Run the training and inference operations for PCA
Access intermediate results obtained at the training stage
Include the following header that makes all oneDAL declarations available.
#include "oneapi/dal.hpp"

/* Standard library headers required by this example */
#include <cassert>
#include <iostream>
Create a SYCL* queue with the desired device selector. In this case, the GPU selector is used:
const auto queue = sycl::queue{ sycl::gpu_selector{} };
Since all oneDAL declarations are in the oneapi::dal namespace, import all declarations from the oneapi namespace to use dal instead of oneapi::dal for brevity:
using namespace oneapi;
Use the CSV data source to read the data from the CSV file into a table:
const auto data = dal::read<dal::table>(queue, dal::csv::data_source{"data.csv"});
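To make the loading step more concrete, here is a minimal plain C++ sketch of what reading numeric CSV text into a dense table involves. It is not how dal::csv::data_source is implemented. The dense_table type and the parse_csv function are hypothetical names introduced for illustration, and the sketch assumes headerless input with comma-separated numeric fields and no quoting.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper type (not part of oneDAL): a dense row-major table of floats.
struct dense_table {
    std::size_t row_count = 0;
    std::size_t column_count = 0;
    std::vector<float> values; // row_count * column_count elements, row by row
};

// Parses comma-separated numeric text into a dense_table.
// Assumptions: no header row, no quoted fields, every row has the same width.
dense_table parse_csv(const std::string& text) {
    dense_table table;
    std::istringstream lines{text};
    std::string line;
    while (std::getline(lines, line)) {
        if (line.empty()) {
            continue; // skip blank lines
        }
        std::istringstream fields{line};
        std::string field;
        std::size_t columns = 0;
        while (std::getline(fields, field, ',')) {
            table.values.push_back(std::stof(field));
            ++columns;
        }
        if (table.row_count == 0) {
            table.column_count = columns; // the first row fixes the table width
        } else if (columns != table.column_count) {
            throw std::runtime_error("CSV rows have different lengths");
        }
        ++table.row_count;
    }
    assert(table.values.size() == table.row_count * table.column_count);
    return table;
}
```

In this sketch, the element at row r and column c is values[r * column_count + c].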
Create a PCA descriptor, configure its parameters, and run the training algorithm on the data loaded from CSV.
const auto pca_desc = dal::pca::descriptor<float>{}
    .set_component_count(3)
    .set_deterministic(true);

const dal::pca::train_result train_res = dal::train(queue, pca_desc, data);
Print the learned eigenvectors:
const dal::table eigenvectors = train_res.get_eigenvectors();

const auto acc = dal::row_accessor<const float>{eigenvectors};
for (std::int64_t i = 0; i < eigenvectors.get_row_count(); i++) {
    /* Get the i-th row from the table; the eigenvector array stores a pointer to USM */
    const dal::array<float> eigenvector = acc.pull(queue, {i, i + 1});
    assert(eigenvector.get_count() == eigenvectors.get_column_count());

    std::cout << i << "-th eigenvector: ";
    for (std::int64_t j = 0; j < eigenvector.get_count(); j++) {
        std::cout << eigenvector[j] << " ";
    }
    std::cout << std::endl;
}
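The row range {i, i + 1} passed to pull in the loop above selects a half-open block of rows. As a rough illustration of that idea only, the following plain C++ sketch copies rows [first, last) out of a row-major buffer. The pull_rows function is a hypothetical helper, not a oneDAL API, and it ignores device memory entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a oneDAL API): copies rows [first, last) from a
// row-major buffer that has `column_count` values per row.
std::vector<float> pull_rows(const std::vector<float>& values,
                             std::size_t column_count,
                             std::size_t first,
                             std::size_t last) {
    assert(first <= last);
    assert(last * column_count <= values.size());
    const auto begin = values.begin() + static_cast<std::ptrdiff_t>(first * column_count);
    const auto end = values.begin() + static_cast<std::ptrdiff_t>(last * column_count);
    return std::vector<float>(begin, end);
}
```

With three columns, asking for rows [1, 2) returns exactly the three values of the second row, which mirrors the check in the loop above that the pulled element count equals the column count.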
Use the trained model for inference to reduce dimensionality of the data:
const dal::pca::model model = train_res.get_model();

const dal::table data_transformed =
    dal::infer(queue, pca_desc, model, data).get_transformed_data();

assert(data_transformed.get_column_count() == 3);
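For readers new to PCA, the training and inference steps above can be pictured with a small self-contained sketch in plain C++. It is not oneDAL's implementation: it handles only 2-D points, finds a single component by power iteration on the sample covariance matrix, and runs on the host. The names vec2, pca_model_1d, train_pca_1d, and transform_1d are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using vec2 = std::array<double, 2>;

// Hypothetical stand-in for a trained model: the data mean and one unit-length direction.
struct pca_model_1d {
    vec2 mean;
    vec2 component;
};

// "Training": estimate the mean and the dominant eigenvector of the 2x2
// sample covariance matrix using power iteration.
pca_model_1d train_pca_1d(const std::vector<vec2>& points, int iterations = 100) {
    assert(points.size() >= 2);
    const double n = static_cast<double>(points.size());

    vec2 mean{0.0, 0.0};
    for (const vec2& p : points) {
        mean[0] += p[0] / n;
        mean[1] += p[1] / n;
    }

    // Entries of the symmetric sample covariance matrix.
    double cxx = 0.0, cxy = 0.0, cyy = 0.0;
    for (const vec2& p : points) {
        const double dx = p[0] - mean[0];
        const double dy = p[1] - mean[1];
        cxx += dx * dx / (n - 1.0);
        cxy += dx * dy / (n - 1.0);
        cyy += dy * dy / (n - 1.0);
    }

    // Power iteration: multiply by the covariance matrix and renormalize.
    // The fixed start vector is an assumption of this sketch; the iteration
    // fails only if the start is exactly orthogonal to the dominant eigenvector.
    vec2 v{1.0, 1.0};
    for (int i = 0; i < iterations; ++i) {
        const vec2 w{cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1]};
        const double norm = std::hypot(w[0], w[1]);
        if (norm == 0.0) {
            break; // all points coincide, so there is no direction to learn
        }
        v = {w[0] / norm, w[1] / norm};
    }
    return {mean, v};
}

// "Inference": project a centered point onto the learned direction (2-D to 1-D).
double transform_1d(const pca_model_1d& model, const vec2& p) {
    return (p[0] - model.mean[0]) * model.component[0] +
           (p[1] - model.mean[1]) * model.component[1];
}
```

For points lying on the line y = 2x, the learned component is parallel to (1, 2), and each point is reduced to its signed distance from the mean along that direction. The sign of a component is arbitrary in general, which is one reason the oneDAL example above requests deterministic output.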
Build and Run Examples
Perform the following steps to build and run examples demonstrating the basic usage scenarios of oneDAL with SYCL support. Go to <install_dir>/dal/<version> and then set up an environment as shown in the example below:
Set up the required environment for oneDAL (variables such as CPATH, LIBRARY_PATH, and LD_LIBRARY_PATH):
On Linux, there are two possible ways to set up the required environment: via vars.sh script or via modulefiles.
Setting up oneDAL environment via vars.sh script
Run the following command:
source ./env/vars.sh
Setting up oneDAL environment via modulefiles
Initialize modules:
source $MODULESHOME/init/bash
NOTE: Refer to the Environment Modules documentation for details.
Provide modules with a path to the modulefiles directory:
module use ./modulefiles
Run the module:
module load dal
On Windows, run the following command:
/env/vars.bat
Copy ./examples/oneapi/dpc to a writable directory if necessary (since it creates temporary files):
cp -r ./examples/oneapi/dpc ${WRITABLE_DIR}
Set up the compiler environment for Intel® oneAPI DPC++/C++ Compiler. See Get Started with Intel® oneAPI DPC++/C++ Compiler for details.
Build and run the examples that show how to use oneDAL with SYCL support:
NOTE: You need write permissions for the examples folder to build the examples, and execute permissions to run them. Otherwise, copy the examples/oneapi/dpc and examples/oneapi/data folders to a directory with the right permissions. These two folders must stay at the same directory level relative to each other.
On Linux:
# Navigate to the directory containing examples and then build them:
cd /examples/oneapi/dpc
make so example=svm_two_class_thunder_dense_batch # This compiles and runs the svm_two_class_thunder_dense_batch example using Intel(R) oneAPI DPC++/C++ Compiler
make so mode=build # This compiles all examples in the current directory
On Windows:
# Navigate to the directory containing examples and then build them:
cd /examples/oneapi/dpc
nmake dll example=svm_two_class_thunder_dense_batch # This compiles and runs the svm_two_class_thunder_dense_batch example using Intel(R) oneAPI DPC++/C++ Compiler
nmake dll mode=build # This compiles all examples in the current directory
To see all available parameters of the build procedure, type make on Linux* or nmake on Windows*.
The resulting example binaries and log files are written into the _results directory.
NOTE: Run the examples from the examples/oneapi/dpc folder, not from the _results folder. Most examples expect their data in the examples/oneapi/data folder and refer to it by a path relative to the examples/oneapi/dpc folder.
You can build the traditional C++ examples located in the examples/oneapi/cpp folder in a similar way.
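The reason the working directory matters can be sketched with std::filesystem (C++17). The snippet assumes, hypothetically, that an example refers to its input as ../data/<file name> relative to the current directory; the exact relative path used by the shipped examples is not stated here, and resolve_data_file and pca.csv are illustrative names only.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Builds the path an example would open for `file_name`, assuming (hypothetically)
// that it refers to its data as "../data/<file_name>" relative to the working directory.
// The path is only normalized lexically; the file system is not accessed.
fs::path resolve_data_file(const fs::path& working_dir, const std::string& file_name) {
    assert(!file_name.empty());
    return (working_dir / ".." / "data" / file_name).lexically_normal();
}
```

Under that assumption, starting from examples/oneapi/dpc resolves pca.csv to examples/oneapi/data/pca.csv, whereas starting from examples/oneapi/dpc/_results resolves it to examples/oneapi/dpc/data/pca.csv, a location where no data is expected to exist.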
Compile and build applications with pkg-config
pkg-config is a widely used tool for building software with dependencies. Intel® oneAPI Data Analytics Library provides pkg-config metadata files for compiling and linking an application against the library.
Set up the environment
To use pkg-config, build the library and then set up the environment using vars.sh or vars.bat scripts:
On Linux: source ./env/vars.sh or source ./setvars.sh
On Windows: /env/vars.bat or setvars.bat
Choose a metadata file
The metadata files provided by oneDAL cover only the host device configuration for C++ on 64-bit Linux, macOS, or Windows operating systems.
Choose the metadata file based on the oneDAL threading mode and the linking method you use:
 | Single-threaded (non-threaded) | Multi-threaded (internally threaded) |
---|---|---|
Static linking | dal-static-sequential-host | dal-static-threading-host |
Dynamic linking | dal-dynamic-sequential-host | dal-dynamic-threading-host |
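The four names in the table follow a regular pattern: dal-, the linking method, the threading mode, and -host. The following C++ sketch encodes that pattern as a lookup. The enum and function names are hypothetical and are not part of oneDAL or pkg-config; the sketch only reproduces the names listed in the table.

```cpp
#include <cassert>
#include <string>

// Hypothetical enums describing the two choices from the table.
enum class linking_method { static_linking, dynamic_linking };
enum class threading_mode { sequential, threading };

// Returns the pkg-config metadata file name for the given combination.
std::string metadata_file_name(linking_method linking, threading_mode threading) {
    const char* link_part = nullptr;
    switch (linking) {
        case linking_method::static_linking: link_part = "static"; break;
        case linking_method::dynamic_linking: link_part = "dynamic"; break;
    }
    const char* thread_part = nullptr;
    switch (threading) {
        case threading_mode::sequential: thread_part = "sequential"; break;
        case threading_mode::threading: thread_part = "threading"; break;
    }
    assert(link_part != nullptr && thread_part != nullptr); // both enums are fully covered above
    return std::string{"dal-"} + link_part + "-" + thread_part + "-host";
}
```

For example, dynamic linking with the multi-threaded mode yields dal-dynamic-threading-host, the name used in the compile commands in this section.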
Compile a program using pkg-config
To compile a test.cpp program with oneDAL and pkg-config, provide the name of the oneDAL pkg-config metadata file as an input parameter. For example:
On Linux or macOS:
icc test.cpp $(pkg-config --cflags --libs dal-dynamic-threading-host)
On Windows:
for /F "delims=," %i in ('pkg-config --cflags --libs dal-dynamic-threading-host') do icl test.cpp %i
The following commands build the svm_two_class_thunder_dense_batch example. Run them from the examples/oneapi/cpp directory:
On Linux or macOS:
icc -I source/ source/svm/svm_two_class_thunder_dense_batch.cpp $(pkg-config --cflags --libs dal-dynamic-threading-host)
On Windows:
for /F "delims=," %i in ('pkg-config --cflags --libs dal-dynamic-threading-host') do icl -I source/ source/svm/svm_two_class_thunder_dense_batch.cpp %i
Find More
Document | Description |
---|---|
 | Refer to oneDAL Developer Guide and Reference for detailed information about implemented algorithms. |
 | Check system requirements before you install Intel® oneAPI Data Analytics Library. |
 | Refer to release notes for Intel® oneAPI Data Analytics Library to learn about new updates in the latest release. |
 | Learn how to use oneDAL with daal4py, a Python* API. |
 | Learn about requirements for implementations of oneAPI Data Analytics Library. |
Notices and Disclaimers
Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.