Part 2: Advanced scikit-learn* Essentials for Machine Learning
Overview
This is the final part of a two-part workshop series that demonstrates how to:
- Speed up key machine learning algorithms that rely on scikit-learn*
- Get faster results without specialized hardware
This workshop builds on Part 1: Introduction to scikit-learn* Essentials for Machine Learning. It extends that material with the compute follows data method, which lets Intel® Extension for Scikit-learn* take advantage of current and upcoming Intel GPUs.
In this more advanced workshop, you have the opportunity to:
- Practice what you learned in Part 1 to speed up two codes: an image-clustering example and a galaxy-collision example
- Revisit patching strategies to ensure that your scikit-learn code runs faster than stock scikit-learn
- Apply patching strategies at levels of precision that range from coarse-grained to surgical
- Use a special technique to perform scikit-learn computation on Intel GPUs
Speed up and scale your scikit-learn* workflows for CPUs and GPUs across single- and multi-node configurations with Intel® Extension for Scikit-learn*, a Python* module for machine learning.
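The difference between coarse-grained and surgical patching mentioned above can be sketched as follows. This is a minimal illustration, not the workshop's own notebook code: it assumes the extension is installed as the `sklearnex` package, and the helper name `fit_kmeans` and the synthetic data are hypothetical. If `sklearnex` is not installed, the sketch falls back to stock scikit-learn so that it still runs.

```python
import numpy as np

# Assumption: Intel Extension for Scikit-learn is installed as `sklearnex`.
# If it is missing, this sketch silently uses stock scikit-learn instead.
try:
    from sklearnex import patch_sklearn, unpatch_sklearn
    HAVE_SKLEARNEX = True
except ImportError:
    HAVE_SKLEARNEX = False


def fit_kmeans(X, patch_names=None):
    """Fit KMeans after optionally patching scikit-learn (hypothetical helper).

    patch_names=None      -> coarse-grained: patch every supported estimator
    patch_names=["KMeans"] -> surgical: patch only the named estimators
    """
    if HAVE_SKLEARNEX:
        if patch_names is None:
            patch_sklearn()
        else:
            patch_sklearn(patch_names)

    # Order matters: import the estimator AFTER patching so that the name
    # resolves to the accelerated class rather than the stock one.
    from sklearn.cluster import KMeans

    model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

    if HAVE_SKLEARNEX:
        unpatch_sklearn()  # restore stock scikit-learn for later code
    return model


# Three well-separated synthetic blobs in 2-D.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

model = fit_kmeans(X, patch_names=["KMeans"])
print(model.cluster_centers_.shape)
```

Calling `fit_kmeans(X)` with no names applies the coarse-grained form instead. Which form is faster for a given workload is something to measure rather than assume.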
Highlights
0:00 Introductions
1:32 Learning objectives
3:05 Programming challenges
4:04 oneAPI initiative
5:26 Intel® oneAPI toolkits
5:50 Intel oneAPI ecosystem
8:28 Motivation for today
10:35 Compute follows data
12:25 Procedure for ignition
14:04 Gallery of algorithms optimized for Intel CPUs and GPUs
19:16 Patching and imports: The order
20:16 Train and test split: The syntax
21:00 Prepare for compute follows data: dpctl
21:53 Device queue
23:11 Apply a patch to scikit-learn
23:48 Prepare data
26:54 Casting
29:03 When to cast data returned from the device
29:37 Intel® Developer Cloud access
30:22 Intel Developer Cloud: Enroll and sign in
31:36 Intel Developer Cloud: Jupyter* lab access
50:05 Exercise: Module 05_01, introduction to Intel Extension for Scikit-learn
1:12:00 Queue function
1:16:45 scikit-learn K-means
1:36:00 Polling questions
1:37:17 Apply the patch
1:41:40 Exercise: Module 05_02, K-nearest neighbor (KNN): targeting a GPU and patching
1:47:37 Casting a NumPy array and predicting
1:52:40 Exercise: Module 05_03, a gallery of functions on GPUs
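Several of the steps named in the highlights (patching before imports, the train and test split, the dpctl device queue, casting data to the device, and casting results back to NumPy) can be strung together roughly as below. This is a hedged sketch rather than the workshop's notebook: the helpers `to_device` and `to_host` and the synthetic data are hypothetical, and it assumes that a patched estimator accepts dpctl tensors and may return its result on the device. When `sklearnex`, `dpctl`, or a GPU is unavailable, the sketch keeps everything in NumPy on the host.

```python
import numpy as np

# Patch first, import estimators second (assumes `sklearnex` is installed).
try:
    from sklearnex import patch_sklearn
    patch_sklearn()
    PATCHED = True
except ImportError:
    PATCHED = False

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def to_device(*arrays):
    """Try to copy NumPy arrays into GPU memory with dpctl (hypothetical helper).

    Returns (arrays, on_device). Falls back to the original host arrays when
    scikit-learn is unpatched, dpctl is missing, or no GPU queue can be made.
    """
    if not PATCHED:
        return arrays, False
    try:
        import dpctl
        import dpctl.tensor as dpt
        queue = dpctl.SyclQueue("gpu")  # raises if no GPU device is visible
        moved = tuple(
            dpt.asarray(a, usm_type="device", sycl_queue=queue) for a in arrays
        )
        return moved, True
    except Exception:
        # Broad on purpose: any failure here means "stay on the host".
        return arrays, False


def to_host(a):
    """Cast a result back to NumPy; plain NumPy input passes through."""
    try:
        import dpctl.tensor as dpt
        if isinstance(a, dpt.usm_ndarray):
            return dpt.to_numpy(a)
    except ImportError:
        pass
    return np.asarray(a)


# Two well-separated synthetic classes, stored as float32.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(100, 2)),
]).astype(np.float32)
y = np.repeat([0, 1], 100).astype(np.float32)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Compute follows data: where the arrays live decides where the work runs.
(Xd_train, yd_train, Xd_test), on_device = to_device(X_train, y_train, X_test)

knn = KNeighborsClassifier(n_neighbors=3).fit(Xd_train, yd_train)
pred = to_host(knn.predict(Xd_test))  # cast back before using NumPy tools

accuracy = float((pred == y_test).mean())
print("ran on device:", on_device, "accuracy:", accuracy)
```

The cast back to NumPy happens only once, at the point where host-side code needs the predictions, which keeps intermediate data on the device for as long as possible.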