Implicit Alternating Least Squares
The library provides the Implicit Alternating Least Squares (implicit ALS) algorithm [Fleischer2008], based on collaborative filtering.
Details
Given the input dataset $R = \{r_{ui}\}$ of size $m \times n$, where m is the number of users and n is the number of items, the problem is to train the Alternating Least Squares (ALS) model represented as two matrices: X of size $m \times f$, and Y of size $f \times n$, where f is the number of factors. The matrices X and Y are the factors of a low-rank factorization of the matrix R: $R \approx X \cdot Y$
Initialization Stage
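Initialization of the matrix Y can be done using the following method: for each $i = 1, \ldots, n$, $y_{1i} = \frac{1}{m} \sum_{u=1}^{m} r_{ui}$, and $y_{ki}$ are independent random numbers uniformly distributed on the interval $(0, 1)$, $k = 2, \ldots, f$.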
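The initialization of Y can be sketched in plain Python. This is an illustration only, not the library's implementation: it assumes that the first factor of each item is the mean of that item's ratings over all m users and that the remaining factors are uniform random draws. The function name `init_item_factors` and the `seed` parameter are hypothetical, and `random.random()` samples from [0, 1) rather than the open interval.

```python
import random


def init_item_factors(R, f, seed=0):
    """Return Y as a list of f rows with n entries each (hypothetical helper).

    Assumption: row 0 holds the per-item mean rating; rows 1..f-1 hold
    independent uniform draws from [0, 1).
    """
    rng = random.Random(seed)
    m = len(R)      # number of users
    n = len(R[0])   # number of items
    # First factor of each item: average of its column over all m users.
    first_row = [sum(R[u][i] for u in range(m)) / m for i in range(n)]
    # Remaining f - 1 factors: independent uniform random numbers.
    other_rows = [[rng.random() for _ in range(n)] for _ in range(f - 1)]
    return [first_row] + other_rows


# Toy ratings matrix: 2 users, 3 items.
R_example = [[1.0, 0.0, 3.0],
             [0.0, 2.0, 1.0]]
Y_example = init_item_factors(R_example, f=3)
```

Fixing the seed makes the random part reproducible, which is convenient when comparing training runs.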
Training Stage
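The ALS model is trained using the implicit ALS algorithm [Hu2008] by minimizing the following cost function:
$$\min_{x_*, y_*} \sum_{u, i} c_{ui} \left( p_{ui} - x_u^T y_i \right)^2 + \lambda \left( \sum_u n_{x_u} \|x_u\|^2 + \sum_i m_{y_i} \|y_i\|^2 \right)$$
where:
$x_u$ is the factor vector of user u (row u of X) and $y_i$ is the factor vector of item i (column i of Y)
$p_{ui}$ indicates the preference of user u for item i: $p_{ui} = 1$ if $r_{ui} > \epsilon$, and $p_{ui} = 0$ otherwise
$\epsilon$ is the threshold used to define the preference values. $\epsilon = 0$ is the only threshold value supported so far.
$c_{ui} = 1 + \alpha r_{ui}$, where $c_{ui}$ measures the confidence in observing $p_{ui}$
$\alpha$ is the rate of confidence
$r_{ui}$ is the element of the matrix R
$\lambda$ is the regularization parameter
$n_{x_u}$, $m_{y_i}$ denote the number of ratings of user u and item i respectively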
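To make the terms of the cost function concrete, the following plain-Python sketch evaluates it for given factors. It is a reading aid, not the library's code, and it rests on stated assumptions: the cost is the confidence-weighted squared error of [Hu2008] plus a regularization term weighted by the number of ratings per user and per item, confidence is taken as 1 + alpha * r, and a "rating" is counted as an entry of R above the threshold. The name `implicit_als_cost` and its parameters are hypothetical.

```python
def implicit_als_cost(R, X, Y, alpha, lam, eps=0.0):
    """Evaluate the assumed implicit ALS cost for dense list-of-lists inputs.

    R is m x n, X is m x f, Y is f x n.
    """
    m, n, f = len(R), len(R[0]), len(Y)
    fit = 0.0
    for u in range(m):
        for i in range(n):
            p = 1.0 if R[u][i] > eps else 0.0        # binary preference
            c = 1.0 + alpha * R[u][i]                # confidence (assumed form)
            pred = sum(X[u][k] * Y[k][i] for k in range(f))
            fit += c * (p - pred) ** 2
    # Assumption: the number of ratings is the count of entries above eps.
    n_x = [sum(1 for i in range(n) if R[u][i] > eps) for u in range(m)]
    m_y = [sum(1 for u in range(m) if R[u][i] > eps) for i in range(n)]
    reg_x = sum(n_x[u] * sum(v * v for v in X[u]) for u in range(m))
    reg_y = sum(m_y[i] * sum(Y[k][i] ** 2 for k in range(f)) for i in range(n))
    return fit + lam * (reg_x + reg_y)


# Toy problem: 2 users, 2 items, 1 factor.
cost_example = implicit_als_cost(
    R=[[1.0, 0.0], [0.0, 2.0]],
    X=[[1.0], [1.0]],
    Y=[[1.0, 1.0]],
    alpha=1.0,
    lam=0.5,
)
```

Note that unobserved entries (p = 0) still contribute to the fit term with confidence 1, which is what distinguishes the implicit formulation from explicit-rating ALS.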
Prediction Stage
Prediction of Ratings
Given the trained ALS model and the matrix D that describes for which pairs of factors X and Y the rating should be computed, the system calculates the matrix of recommended ratings Res: $res_{ui} = \sum_{j=1}^{f} x_{uj} y_{ji}$, if $d_{ui} \neq 0$, $u = 1, \ldots, m$; $i = 1, \ldots, n$.
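The prediction step amounts to a masked matrix product, sketched here in plain Python. The sketch assumes that a rating is the dot product of a user's factors with an item's factors and that it is computed only where D is nonzero. Leaving the other entries at 0.0 is a choice made for this illustration, and `predict_ratings` is a hypothetical name, not a library function.

```python
def predict_ratings(X, Y, D):
    """Return Res (m x n) with X[u] . Y[:, i] wherever D[u][i] is nonzero.

    X is m x f, Y is f x n, D is m x n. Entries not requested by D stay 0.0
    (an assumption of this sketch).
    """
    m, f, n = len(X), len(Y), len(Y[0])
    res = [[0.0] * n for _ in range(m)]
    for u in range(m):
        for i in range(n):
            if D[u][i] != 0:
                res[u][i] = sum(X[u][j] * Y[j][i] for j in range(f))
    return res


# Toy model: 2 users, 3 items, 2 factors.
res_example = predict_ratings(
    X=[[1.0, 2.0], [3.0, 4.0]],
    Y=[[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]],
    D=[[1, 0, 1], [0, 1, 0]],
)
```

Restricting the computation to the requested pairs avoids forming the full m x n product when only a few recommendations are needed.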
Initialization
For initialization, the following computation modes are available:
Computation
The following computation modes are available:
Distributed processing for training and prediction of ratings
Examples
C++ (CPU)
Batch Processing:
Distributed Processing:
Java*
Batch Processing:
Distributed Processing:
Python*
Batch Processing:
Performance Considerations
To get the best overall performance of the implicit ALS recommender:
If input data is homogeneous, provide the input data and store results in homogeneous numeric tables of the same type as specified in the algorithmFPType class template parameter.
If input data is sparse, use CSR numeric tables.
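For readers unfamiliar with the CSR (compressed sparse row) layout, the sketch below converts a small dense ratings matrix into the three CSR arrays. It illustrates the storage format only and does not use the library's numeric table classes. The function `dense_to_csr` and its `index_base` parameter are hypothetical; which index base the library's CSR tables expect is not stated in this section, so check the numeric table documentation before passing such arrays to it.

```python
def dense_to_csr(R, index_base=0):
    """Convert a dense list-of-lists matrix to CSR arrays.

    Returns (values, col_indices, row_offsets). row_offsets has one entry per
    row plus a final entry marking the end of the last row.
    """
    values, col_indices, row_offsets = [], [], [index_base]
    for row in R:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_indices.append(j + index_base)
        # Offset at which the next row starts in `values`.
        row_offsets.append(len(values) + index_base)
    return values, col_indices, row_offsets


# Toy ratings matrix with four nonzero entries.
csr_example = dense_to_csr([[5.0, 0.0, 0.0],
                            [0.0, 0.0, 3.0],
                            [0.0, 2.0, 4.0]])
```

Only the nonzero ratings are stored, so memory use grows with the number of observed interactions rather than with m times n.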
Product and Performance Information |
---|
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex. Notice revision #20201201 |