Explore SYCL with Samples from Intel
SYCL* applications are C++ programs that express parallelism through data-parallel programming and heterogeneous computing. SYCL provides a uniform programming experience across computing devices such as CPUs, GPUs, FPGAs, and AI accelerators by offering a consistent C++ language and application programming interfaces (APIs). You can program each architecture individually or use several in combination, so you learn the programming model once and apply it to many accelerators. Achieving the best performance on each accelerator class still requires tailoring and tuning your algorithms, but the core language and programming model stay the same across target devices. For in-depth information on SYCL, visit the SYCL Specification.
This guide helps you navigate the oneAPI programming model, with a focus on selecting the most suitable architecture for your application and tuning for peak performance on it.
To explore FPGA-specific samples, visit the Explore SYCL Through Intel® FPGA Code Samples page.
Build and Run a Sample Project
The links below take you to the Get Started with the Intel® oneAPI Base Toolkit content for the Command Line and IDE:
- Build and Run a Sample Project Using the Command Line:
- Build and Run a Sample Project Using an IDE:
- GitHub (each sample section below links to the GitHub repository for that sample)
Sample 1: Simple Device Offload Structure
Sample 1 introduces Vector Add, the equivalent of a Hello, World! sample for data-parallel programs. It outlines the basic structure of a SYCL application by demonstrating how to target an offload device. The sample includes two source files that illustrate memory management using either buffers or Unified Shared Memory (USM).
Vector Add supports both GPU and FPGA device selectors.
In this sample, you will learn to use SYCL's basic features to offload a straightforward computation on 1D arrays to an accelerator. The basic features include:
- A one-dimensional array of data.
- A device selector, queue, buffer, accessor, and kernel.
- Memory management using buffers and accessors or USM.
Visit Code Sample: Vector Add for a detailed code walkthrough.
Get the sample:
- CLI or IDE sample name: vector-add
- Git Repo for Vector Add Sample
Sample 2: Basic SYCL Features Defined
Sample 2 walks you through the basic features of SYCL using a two-dimensional stencil that simulates a wave propagating in a 2D isotropic medium. It covers:
- SYCL queues (including device selectors and exception handlers).
- SYCL buffers and accessors.
- The ability to call a function inside a kernel definition and pass accessor arguments as pointers. A function called inside the kernel performs a computation (it updates a grid point specified by the global ID variable) for a single time step.
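The last point, a per-point update function that receives raw pointers and is called once per grid point, can be sketched in plain C++ on the host. This is a minimal sketch and not the ISO2DFD sample's code: it assumes a generic second-order finite-difference scheme, and the names `update_point`, `time_step`, and `vel2dt2` are hypothetical. The nested loops stand in for the SYCL `parallel_for`, where the row and column would come from the global ID.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: updates one interior grid point for a single time step.
// On entry `next` holds the field at the previous step; it is overwritten with
// the field at the following step. `vel2dt2` is an assumed precomputed
// coefficient (velocity^2 * dt^2 / dx^2) per grid point.
void update_point(float* next, const float* cur, const float* vel2dt2,
                  std::size_t n_cols, std::size_t row, std::size_t col) {
  const std::size_t i = row * n_cols + col;
  // 5-point Laplacian built from the four direct neighbors.
  const float lap = cur[i - 1] + cur[i + 1] + cur[i - n_cols] +
                    cur[i + n_cols] - 4.0f * cur[i];
  next[i] = 2.0f * cur[i] - next[i] + vel2dt2[i] * lap;
}

// Host loop standing in for the kernel launch: one call per interior point.
// Boundary points are left untouched.
void time_step(std::vector<float>& next, const std::vector<float>& cur,
               const std::vector<float>& vel2dt2, std::size_t n_rows,
               std::size_t n_cols) {
  for (std::size_t row = 1; row + 1 < n_rows; ++row) {
    for (std::size_t col = 1; col + 1 < n_cols; ++col) {
      update_point(next.data(), cur.data(), vel2dt2.data(), n_cols, row, col);
    }
  }
}
```

Each call writes only its own element of `next` and reads neighbors only from `cur`, which is why the per-point calls are independent and suitable for a data-parallel kernel.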
Visit Code Sample: Two-Dimensional Finite-Difference Wave Propagation in Isotropic Media (ISO2DFD) for a detailed code walkthrough. Visit Explore Data Parallel C++ with Samples from Intel: ISO2DFD for a detailed video walkthrough.
Get the sample:
- CLI or IDE sample name: iso2dfd_dpcpp
- Git Repo for ISO2DFD Sample
Sample 3: Optimizing for More Complex Applications
Sample 3 builds on the SYCL concepts reviewed in the previous sample and explains how to apply them to complex stencil computations in 3D. Moving from 2D to 3D grids can expose common issues in general-purpose GPU (device) programming, such as inefficient data access patterns, low flops-to-byte ratios, and low occupancy. The sample demonstrates how to use SYCL features to address these issues and optimize performance. It uses five versions of the same code, each improving on the performance of the previous one.
The sample provides step-by-step instructions that walk you through the process of adapting CPU-based code for GPU offloading with SYCL and improving performance across several iterations with the help of Intel® Advisor. It shows the use of several important SYCL features:
- Local buffers and accessors (declare local memory buffers and accessors to be accessed and managed by each SYCL work-group).
- Shared local memory (SLM) optimizations.
- Kernels (including the parallel_for function and nd_range<3> objects).
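The idea behind the local-memory features listed above can be illustrated with a host-only emulation. This is a simplified sketch under stated assumptions, not the ISO3DFD sample's code: it uses a 1D three-point sum instead of a 3D stencil, a small `std::vector` stands in for shared local memory, and `process_group`, `stencil_tiled`, and `kHalo` are hypothetical names. In SYCL 2020 the corresponding pieces would be a `sycl::local_accessor` for the tile and a work-group barrier between the load and compute phases.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kHalo = 1;  // assumed stencil radius

// Emulates one work-group: stage the group's tile plus halo into a scratch
// array (standing in for SLM), then let each "work-item" read its neighbors
// from the scratch copy instead of from global memory.
void process_group(const std::vector<float>& in, std::vector<float>& out,
                   std::size_t group, std::size_t local_size) {
  const std::size_t n = in.size();
  const std::size_t begin = group * local_size;
  std::vector<float> tile(local_size + 2 * kHalo, 0.0f);

  // Load phase: tile[t] mirrors in[begin + t - kHalo]; out-of-range stays 0.
  for (std::size_t t = 0; t < tile.size(); ++t) {
    if (begin + t >= kHalo && begin + t - kHalo < n) {
      tile[t] = in[begin + t - kHalo];
    }
  }
  // Device code would place a work-group barrier here.

  // Compute phase: local id `lid` maps to global id `begin + lid`.
  for (std::size_t lid = 0; lid < local_size; ++lid) {
    const std::size_t gid = begin + lid;
    if (gid >= n) break;
    out[gid] = tile[lid] + tile[lid + 1] + tile[lid + 2];
  }
}

// Splits the global range into work-groups of `local_size` items each.
std::vector<float> stencil_tiled(const std::vector<float>& in,
                                 std::size_t local_size) {
  std::vector<float> out(in.size(), 0.0f);
  const std::size_t groups = (in.size() + local_size - 1) / local_size;
  for (std::size_t g = 0; g < groups; ++g) {
    process_group(in, out, g, local_size);
  }
  return out;
}
```

Each input value is read from global memory once per group and then reused by up to three work-items from the tile, which is the data-reuse effect that SLM optimizations aim for.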
Get the sample:
- CLI or IDE sample name: iso3dfd_dpcpp
- Git Repo for Guided ISO3DFD Sample
Sample 4: Introducing Synchronization
Sample 4 adds complexity by simulating a large number of moving particles that interact with a stationary grid of cells. It demonstrates additional SYCL features, including synchronization through atomic operations.
This code sample demonstrates how to offload computation to an accelerator using the following SYCL tools:
- SYCL queues (including device selectors and exception handlers).
- SYCL buffers and accessors (communicate data between the host and the device).
- SYCL kernels (including parallel_for function and range<1> objects).
- SYCL atomic operations for synchronization.
- API-based programming: Use oneMKL to generate random numbers.
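The reason atomic operations appear in the list above is that many particles can land in the same grid cell, so the per-cell counter is shared between work-items. The following host-only sketch shows that pattern with `std::atomic`; it is an analogue under stated assumptions, not the Particle Diffusion sample's code. The `Particle` layout and the names `record_particle` and `count_particles` are hypothetical, and the serial loop stands in for a `parallel_for` over `range<1>`. In SYCL 2020 the device-side counterpart of the increment would be a `sycl::atomic_ref` on the counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle {  // hypothetical layout
  float x;
  float y;
};

// Body that one work-item would run for one particle: locate the cell the
// particle is in and atomically increment that cell's counter.
void record_particle(const Particle& p,
                     std::vector<std::atomic<unsigned>>& cell_counts,
                     std::size_t grid_size) {
  const float limit = static_cast<float>(grid_size);
  // Ignore particles outside the grid (also keeps the casts well defined).
  if (p.x < 0.0f || p.y < 0.0f || p.x >= limit || p.y >= limit) return;
  const std::size_t cx = static_cast<std::size_t>(p.x);
  const std::size_t cy = static_cast<std::size_t>(p.y);
  // Several particles may hit the same cell, so the update must be atomic.
  cell_counts[cy * grid_size + cx].fetch_add(1u, std::memory_order_relaxed);
}

// Counts particles per cell on a grid_size x grid_size grid of unit cells.
std::vector<unsigned> count_particles(const std::vector<Particle>& particles,
                                      std::size_t grid_size) {
  std::vector<std::atomic<unsigned>> counts(grid_size * grid_size);
  for (auto& c : counts) c.store(0u);
  for (const Particle& p : particles) {
    record_particle(p, counts, grid_size);
  }
  std::vector<unsigned> result;
  result.reserve(counts.size());
  for (const auto& c : counts) result.push_back(c.load());
  return result;
}
```

Relaxed ordering is enough here because only the final totals matter, not the order in which individual increments become visible.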
Visit Code Sample: Particle Diffusion for a detailed code walkthrough.
Get the sample:
- CLI or IDE sample name: particle-diffusion
- Git Repo for Particle Diffusion Sample
Next Steps
Code Walkthroughs
Next, try a detailed code walkthrough on the following topics:
- The foundations of SYCL through a Vector Add sample
- USM, a core feature of SYCL programming, through a Mandelbrot sample
Determine Which Code to Offload
You can use Intel® Advisor to determine which parts of your code would benefit from offloading to an accelerator. The Offload Advisor feature collects performance prediction data on top of the standard profiling capabilities and identifies code that you can offload to a target device to improve the performance of your CPU-based applications. The Get Started with Intel® Advisor guide helps you:
- Optimize CPU or GPU code for memory and compute with Roofline Analysis.
- Enhance vector parallelism and its efficiency.
- Model, tune, and test multiple threading designs.
- Develop and examine data flow and dependency computation using heterogeneous algorithms.
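For readers new to the roofline model mentioned in the list above, its core relation is that attainable throughput is the smaller of the peak compute rate and the memory bandwidth multiplied by the arithmetic intensity (flops per byte). The sketch below encodes only that general relation. The function names are hypothetical, and nothing here reflects the Intel® Advisor API or its output.

```cpp
#include <algorithm>
#include <cassert>

// Roofline relation: throughput is capped either by peak compute or by
// memory bandwidth times arithmetic intensity, whichever is lower.
double attainable_gflops(double peak_gflops, double bandwidth_gb_per_s,
                         double flops_per_byte) {
  return std::min(peak_gflops, bandwidth_gb_per_s * flops_per_byte);
}

// A kernel is memory bound when the bandwidth ceiling lies below peak compute.
bool is_memory_bound(double peak_gflops, double bandwidth_gb_per_s,
                     double flops_per_byte) {
  return bandwidth_gb_per_s * flops_per_byte < peak_gflops;
}
```

With assumed figures of 1000 GFLOP/s peak and 100 GB/s bandwidth, a kernel at 2 flops per byte is limited to 200 GFLOP/s by memory, while one at 20 flops per byte reaches the compute ceiling.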
Transform CUDA Code into SYCL Code
The Intel® DPC++ Compatibility Tool is a migration engine that converts CUDA code into standards-based SYCL code. The Get Started Guide and User Guide outline the general workflow for migrating your existing CUDA applications. The tool can migrate programs with multiple source and header files and includes:
- One-time migration support for kernels and API calls.
- Inline comments that guide you in finishing the migrated output so that it compiles with the Intel® oneAPI DPC++/C++ Compiler.
- Command-line tools and IDE plug-ins that streamline operations.
Additional Resources
You can access tutorials, videos, and webinar replays to learn more about SYCL and the supporting tools on the Intel® oneAPI Toolkits site.
| Document | Description |
|---|---|
| | Learn about oneAPI and SYCL, programming models and interfaces, SYCL runtimes, APIs, and software development processes. |
| | Look through our content to search for specific documents. |
| | Look through the FPGA code samples for more in-depth information. |
Notices and Disclaimers
Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.