Achieve SYCL* Performance & Portability
Overview
Developers of high-performance computing (HPC) applications face an increasingly diverse set of computing platforms that feature multiple generations of CPUs, GPUs, FPGAs, and other accelerators.
Developing code that is both performant and portable across such a range of platforms can be expensive and time-consuming.
This workshop explains why performance, portability, and productivity are important for HPC development. Get hands-on practice with methods for writing performance-portable code that runs across the different CPUs and GPUs available on the Intel® Developer Cloud. You also learn how to:
- Identify an algorithm, implement it using Intel® oneAPI Math Kernel Library (oneMKL), and then check its performance on CPUs and GPUs
- Implement the same algorithm using basic SYCL* programming
- Use ND-Range kernels and evaluate the impact of work-group size on performance
- Use private memory and shared local memory to improve performance
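The idea behind the last two items can be tried out without a GPU. The following is a minimal plain C++ sketch, not SYCL and not the workshop's code, assuming square row-major matrices. The `tile` parameter stands in for the work-group size, the `a_tile` and `b_tile` buffers stand in for shared local memory, and the per-element accumulators stand in for private memory. All function and variable names here are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<float>;  // row-major, n x n (assumed layout)

// Baseline: each output element is computed independently, reading A and B
// directly, like a basic parallel kernel with one work-item per element.
Matrix matmul_naive(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;  // analogous to a private accumulator
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Tiled: the output is split into tile x tile blocks (the "work-groups").
// Each block stages matching tiles of A and B in small buffers and reuses
// them for every element in the block.
Matrix matmul_tiled(const Matrix& a, const Matrix& b, std::size_t n,
                    std::size_t tile) {
    // Like an ND-range launch, this sketch requires the global size to be
    // a multiple of the group size.
    assert(tile > 0 && n % tile == 0);
    Matrix c(n * n, 0.0f);
    std::vector<float> a_tile(tile * tile), b_tile(tile * tile);
    std::vector<float> acc(tile * tile);  // one accumulator per "work-item"

    for (std::size_t gi = 0; gi < n; gi += tile) {
        for (std::size_t gj = 0; gj < n; gj += tile) {
            std::fill(acc.begin(), acc.end(), 0.0f);
            for (std::size_t kt = 0; kt < n; kt += tile) {
                // Stage one tile of A and one tile of B.
                for (std::size_t li = 0; li < tile; ++li) {
                    for (std::size_t lj = 0; lj < tile; ++lj) {
                        a_tile[li * tile + lj] = a[(gi + li) * n + (kt + lj)];
                        b_tile[li * tile + lj] = b[(kt + li) * n + (gj + lj)];
                    }
                }
                // Accumulate partial products from the staged tiles only.
                for (std::size_t li = 0; li < tile; ++li) {
                    for (std::size_t lj = 0; lj < tile; ++lj) {
                        for (std::size_t k = 0; k < tile; ++k) {
                            acc[li * tile + lj] +=
                                a_tile[li * tile + k] * b_tile[k * tile + lj];
                        }
                    }
                }
            }
            // Write the finished block back to C.
            for (std::size_t li = 0; li < tile; ++li) {
                for (std::size_t lj = 0; lj < tile; ++lj) {
                    c[(gi + li) * n + (gj + lj)] = acc[li * tile + lj];
                }
            }
        }
    }
    return c;
}

// Largest element-wise difference between two equally sized matrices.
float max_abs_diff(const Matrix& x, const Matrix& y) {
    assert(x.size() == y.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        worst = std::max(worst, std::fabs(x[i] - y[i]));
    }
    return worst;
}
```

Calling `matmul_tiled` with several `tile` values and comparing each result against `matmul_naive` using `max_abs_diff` gives a quick correctness check before timing anything. On a real device the best tile size depends on the hardware, which is why the same source may need different work-group sizes on different CPUs and GPUs.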
Highlights
0:00 Speaker introductions
1:46 Who this course is for
2:17 What you will learn
3:23 Why it's important
4:10 Define performance, portability, and productivity
5:08 Test application for performance portability
5:58 Test application implementations
6:35 Platform configurations for the test application
7:30 Test application analysis tools
8:01 Performance analysis of those tools
8:20 How to sign up for Intel® Developer Cloud
10:00 Get started
12:30 Introduction module
13:30 What is oneAPI?
16:25 Kernel code walk-through
23:25 Module 1: Implementation with Intel® oneAPI Math Kernel Library and a basic SYCL* parallel kernel
44:55 Module 2: ND-Range implementation for matrix multiplication
57:40 Module 3: Local memory implementation for matrix multiplication
1:02:01 Q&A while the program loads
1:06:53 Continuation of Module 3
1:18:00 Q&A
1:23:00 Performance results
1:39:27 Q&A