Intel® Extension for Scikit-learn*
Scale your scikit-learn* (sklearn) workflows by changing a couple of lines of code.
Accelerate scikit-learn for Data Analytics & Machine Learning
Scikit-learn* (often referred to as sklearn) is a Python* module for machine learning. Intel® Extension for Scikit-learn* seamlessly speeds up your scikit-learn applications on Intel CPUs and GPUs across single- and multi-node configurations. The extension package dynamically patches scikit-learn estimators to improve the performance of your machine learning algorithms.
The extension is part of the AI Tools that provide flexibility to use machine learning tools with your existing AI packages.
Using scikit-learn with this extension, you can:
- Speed up training and inference by up to 100x with equivalent mathematical accuracy.
- Continue to use the open source scikit-learn API.
- Enable and disable the extension with a couple of lines of code or at the command line.
Both scikit-learn and Intel Extension for Scikit-learn are part of the end-to-end suite of Intel® AI and machine learning development tools and resources.
Download the AI Tools
Intel Extension for Scikit-learn is available in the AI Tools Selector, which provides accelerated machine learning and data analytics pipelines with optimized deep learning frameworks and high-performing Python* libraries.
Download the Stand-Alone Version
A stand-alone version of Intel Extension for Scikit-learn is available. You can install it with the PIP* package manager for Python or with Anaconda*, or download and build it from source.
Features
Drop-in Acceleration
- Speed up scikit-learn (sklearn) algorithms by replacing existing estimators with mathematically equivalent accelerated versions (see Supported Algorithms).
- Run on your choice of an x86-compatible CPU or Intel GPU because the accelerations are powered by Intel® oneAPI Data Analytics Library (oneDAL).
- Choose how to apply the accelerations:
  - Patch all compatible algorithms from the command line with no code changes.
  - Add two lines of code to patch all compatible algorithms in your Python script.
  - Specify in your script to patch only selected algorithms.
  - Globally patch and unpatch your environment for all uses of scikit-learn.
Example Patch Using Two Lines of Code
import numpy as np
# Turn on scikit-learn optimizations with these 2 simple lines:
from sklearnex import patch_sklearn
patch_sklearn()
# Import scikit-learn algorithms after the patch is enabled
from sklearn.cluster import KMeans
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
print(f"kmeans.labels_ = {kmeans.labels_}")
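The other options listed under Drop-in Acceleration can be sketched in the same style. The following is a minimal sketch, not a definitive recipe: it patches only selected algorithms and then restores stock scikit-learn with `unpatch_sklearn()`. The `try`/`except` fallback and the `HAVE_SKLEARNEX` flag are assumptions added for illustration so the script still runs where the extension is not installed, and `n_init=10` is set explicitly only to keep behavior the same across scikit-learn versions.

```python
import numpy as np

# Assumption for this sketch: fall back to stock scikit-learn
# when the extension is not installed in the environment.
try:
    from sklearnex import patch_sklearn, unpatch_sklearn
    HAVE_SKLEARNEX = True
except ImportError:
    HAVE_SKLEARNEX = False

if HAVE_SKLEARNEX:
    # Patch only the named algorithms instead of every compatible one
    patch_sklearn(["KMeans", "DBSCAN"])

# Import scikit-learn algorithms after the patch is enabled
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)
labels = kmeans.labels_
print(f"labels = {labels}")

if HAVE_SKLEARNEX:
    # Undo the patch; re-import estimators to get the stock classes back
    unpatch_sklearn()
    from sklearn.cluster import KMeans
```

To patch without code changes, the extension's documentation describes running an unmodified script as `python -m sklearnex my_application.py`, where `my_application.py` is a placeholder name. Check the documentation for your installed version before relying on either form.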
Benchmarks
Documentation & Code Samples
Documentation
Code Samples
Get Started
Get Started with Intel Extension for Scikit-learn
Learn how to use the machine learning algorithms available with scikit-learn. This example uses a support vector machine classifier (SVC) for the digit recognition problem.
Speed Up Machine Learning
SVC for the Adult Dataset to Predict Income
Run an SVC algorithm with Intel Extension for Scikit-learn and compare its performance against the original stock version of scikit-learn.
Kaggle* Kernels
See how to use scikit-learn and Intel Extension for Scikit-learn to speed up machine learning tasks: classification, regression, and AutoML workflows.
Intel Extension for Scikit-learn Notebooks
With these Python* notebooks as examples, learn to use this extension for popular datasets.
Implement a Machine Learning Workload
End-to-End Census Workload
Build and run an end-to-end machine learning workload that uses Intel® Distribution of Modin* for extract, transform, and load (ETL) operations, and the Intel Extension for Scikit-learn ridge regression algorithm.
AI Workflows: Use Cases & Toolkits
Document Management: Apply Intelligent Indexing
This example shows how to build a natural language processing (NLP) pipeline to classify documents by their respective topics.
Assets Management: Predict Defects
This example shows how to create an asset maintenance solution to predict defects and anomalies before they happen.
Online Retail: Customer Segmentation
This example shows how to use machine learning to segment customers into clusters for further personalized and targeted campaigns.
Training
Machine Learning Workloads Acceleration
Demos
Benchmarking Intel Extension for Scikit-learn: How Much Faster Is It?
Walk through an example that compares the performance of the K-means fitting algorithm in open source scikit-learn to the accelerated version in Intel Extension for Scikit-learn.
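A comparison of this kind can be approximated with a small timing script. The sketch below is not the demo's own code: the synthetic data, its size, the cluster count, and the `time_fit` helper are all assumptions chosen for illustration. It times K-means fitting in stock scikit-learn and, if the extension is installed, in the accelerated estimator imported directly from `sklearnex.cluster`.

```python
import time

import numpy as np
from sklearn.cluster import KMeans as StockKMeans

# Assumption: the extension may be absent, in which case only the
# stock timing is reported.
try:
    from sklearnex.cluster import KMeans as AcceleratedKMeans
except ImportError:
    AcceleratedKMeans = None


def time_fit(estimator_cls, data, n_clusters=8):
    """Return the wall-clock seconds taken by a single K-means fit."""
    start = time.perf_counter()
    estimator_cls(n_clusters=n_clusters, random_state=0, n_init=1).fit(data)
    return time.perf_counter() - start


# Synthetic data for illustration only; real results depend on the dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(20_000, 16))

stock_seconds = time_fit(StockKMeans, X)
print(f"stock scikit-learn fit: {stock_seconds:.3f} s")

if AcceleratedKMeans is not None:
    accelerated_seconds = time_fit(AcceleratedKMeans, X)
    print(f"accelerated fit:        {accelerated_seconds:.3f} s")
    print(f"speedup:                {stock_seconds / accelerated_seconds:.1f}x")
```

A single run on a small array is noisy, so repeat the measurement and use data that resembles your workload before drawing conclusions about the speedup.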
Optimize Utility Maintenance Prediction for Better Service
Get started using a reference kit to predict the health of utility assets and the probability of failure using scikit-learn SVC. Compare its effectiveness with an XGBoost algorithm.
Drive 2x Performance into Your scikit-learn Machine Learning Tasks
This video demonstrates how to use Intel Extension for Scikit-learn, explains how it accelerates scikit-learn algorithms, and shows end-to-end census and distributed linear regression examples.
Save Time and Money with Intel Extension for Scikit-learn
Accelerating scikit-learn algorithms can deliver performance improvements of up to, and sometimes more than, 100x. If you are running in a commercial cloud environment, this can translate into significant compute cost savings.
Case Studies
Greener Machine Learning Computing with Intel® AI Acceleration
A set of experiments run by Anaconda showed that Intel Extension for Scikit-learn reduced CPU energy by up to 8.5x and DRAM energy by up to 7x while accelerating CPU compute time.
Supply-Chain Optimization at an Enterprise Scale
Architects from Red Hat* and Intel walk through the process of building an application to predict late deliveries using Red Hat OpenShift* Data Science and AI Tools.
HippoScreen* Improves AI Performance by 2.4x
The Taiwan-based neurotechnology startup combined its proprietary algorithms with accelerated scikit-learn algorithms to improve machine learning efficiency and training times for its Brain Waves AI system.
Specifications
Processors:
- All CPUs with x86 architecture
- All integrated and discrete GPUs from Intel
Operating systems:
- Linux*
- Windows* and Windows Server*
Language:
- Python
Get Help
Your success is our success. Access these support resources when you need assistance.
For additional help, see our general Support.