Code Samples
Explore the latest, ready-to-use code samples on GitHub* to develop, offload, and optimize multiarchitecture applications targeting CPUs, GPUs, and FPGAs.
Languages & Performance Optimization
Access coding techniques and tools for performant multiarchitecture development including SYCL*, C++, and Fortran.
Vector Add
This application is the equivalent of a Hello, World! code sample, and demonstrates how to use C++ with SYCL to offload computations to a GPU. It includes the basics of using buffers and Unified Shared Memory (USM).
Fortran Coarray
Illustrates a guided approach to building and running a serial Fortran application, and then converting it to run in parallel using coarrays.
2D Finite-Difference Wave Propagation
Demonstrates using SYCL queues, buffers, and accessors to solve complex 2D acoustic isotropic wave differential equations.
Fortran-Based Edge Detection with GPU Offload
Use the Sobel Edge Detection algorithm to find object boundaries in a Portable Pixel Map (PPM) format image. The algorithm is implemented in three data-parallel steps: image smoothing, edge detection, and edge highlighting. Use Fortran to offload the workloads to your system's GPU.
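The three data-parallel steps can be sketched in plain Python. This is only an illustration of the algorithm, not the sample's Fortran code: the 3x3 box blur used for smoothing, the clamped border handling, the threshold-based highlighting, and all function names are assumptions made for this sketch. The image is a grayscale list of rows rather than a parsed PPM file.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
OFFSETS = (-1, 0, 1)


def pixel(img, y, x):
    """Read a pixel, clamping coordinates to the image border."""
    h, w = len(img), len(img[0])
    return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def smooth(img):
    """Step 1: image smoothing with a 3x3 box blur (assumed filter)."""
    h, w = len(img), len(img[0])
    return [[sum(pixel(img, y + dy, x + dx) for dy in OFFSETS for dx in OFFSETS) / 9.0
             for x in range(w)] for y in range(h)]


def sobel_magnitude(img):
    """Step 2: edge detection via the Sobel gradient magnitude."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            gx = sum(SOBEL_X[dy + 1][dx + 1] * pixel(img, y + dy, x + dx)
                     for dy in OFFSETS for dx in OFFSETS)
            gy = sum(SOBEL_Y[dy + 1][dx + 1] * pixel(img, y + dy, x + dx)
                     for dy in OFFSETS for dx in OFFSETS)
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


def highlight(magnitude, threshold):
    """Step 3: edge highlighting by thresholding to black or white."""
    return [[255 if m > threshold else 0 for m in row] for row in magnitude]


def detect_edges(img, threshold):
    """Run the three steps in order on a grayscale image."""
    return highlight(sobel_magnitude(smooth(img)), threshold)
```

Every output pixel in each step depends only on a small neighborhood of the input to that step, which is what makes the steps data-parallel and suitable for offload.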
Find the Shortest Paths Between Pairs of Vertices
Demonstrates how to compute the shortest paths between all pairs of vertices with the Floyd-Warshall algorithm, efficiently offloading the compute-intensive work to the GPU.
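For reference, a minimal serial sketch of the Floyd-Warshall algorithm in plain Python follows. It is not the sample's SYCL code; the function name and the adjacency-matrix input format are assumptions for illustration. The outer loop over k must run in order, while the updates for all (i, j) pairs within one k step are independent of one another, which is the part a GPU implementation can parallelize.

```python
import math


def floyd_warshall(weights):
    """Return the all-pairs shortest-path distance matrix.

    weights[i][j] is the weight of the edge from i to j, or math.inf
    when there is no such edge. Assumes there are no negative cycles.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for i in range(n):
        dist[i][i] = min(dist[i][i], 0)
    for k in range(n):  # allow vertex k as an intermediate stop
        dist_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == math.inf:
                continue  # no path from i to k, so nothing can improve
            row = dist[i]
            for j in range(n):
                candidate = d_ik + dist_k[j]
                if candidate < row[j]:
                    row[j] = candidate
    return dist
```

The triple loop performs on the order of n^3 updates for n vertices, which is why the work is compute-intensive for large graphs.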
Ocean FFT
Simulates an ocean heightfield using the Intel® oneAPI Math Kernel Library (oneMKL) fast Fourier transform (FFT) functionality, offloading to a GPU or CPU. The code originated as a CUDA sample and shows its migration to SYCL using the open source SYCLomatic tool.
AI & Analytics
Find samples to architect, train, and deploy models, as well as end-to-end workloads, common optimizations using popular frameworks, ways to get started with Python* libraries, and more.
PyTorch* Training Optimizations with Intel® Advanced Matrix Extensions (Intel® AMX)
Illustrates how training a PyTorch* model using Intel® AMX changes performance.
Interactive Chat Based on a DialoGPT Model Using Intel® Extension for PyTorch* Quantization
Demonstrates how to create an interactive chat based on the pretrained DialoGPT model, and then add Intel® Extension for PyTorch* quantization to it. Quantization speeds up operations on processors by using the int8 data format and specialized instructions.
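To make the int8 idea concrete, here is a generic sketch of symmetric per-tensor quantization in plain Python. It does not use or reproduce the Intel® Extension for PyTorch* API; the function names and the choice of a symmetric scheme with a [-127, 127] range are assumptions for illustration only.

```python
def quantize_int8(values):
    """Map floats to int8 codes with one shared scale (symmetric scheme)."""
    largest = max((abs(v) for v in values), default=0.0)
    scale = largest / 127.0 if largest > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from int8 codes."""
    return [c * scale for c in codes]
```

Each value is stored in 8 bits instead of 32, at the cost of a rounding error of at most half the scale per value. Arithmetic on the small integer codes is what specialized int8 instructions accelerate.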
Fine-tuning a Text Classification Model with Intel® Neural Compressor
Demonstrates fine-tuning a text model for emotion classification tasks using quantization-aware training from Intel® Neural Compressor.
TensorFlow* Fine-Tuning of LLMs with AMX and Bfloat16
Demonstrates how to fine-tune a GPT-J large language model (LLM) on the GLUE CoLA dataset with the Intel® Optimization for TensorFlow*. Boosts performance by using Intel® hardware features such as AVX-512 Vector Neural Network Instructions (AVX512 VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX).
Enable Automixed Precision for Transfer Learning
Demonstrates the end-to-end pipeline tasks typically performed in a deep learning use case and describes the benefits.
Intel® AI Reference Models for Intel® Architecture Inference with FP32 and int8
Demonstrates how combining TensorFlow* ResNet* 50 inference and oneMKL can increase inference performance.
Performance Libraries
Improve application performance and development for heterogeneous computing with these oneAPI-optimized libraries.
Optimize Applications Based on Available Resources with Dynamic Device Selection
Demonstrates how to use the Intel® oneAPI DPC++ Library (oneDPL) to apply dynamic device selection policies that can help determine on which device to run the application. It uses a basic sepia filter image conversion application to show different workloads performing differently based on policies such as auto-tune and load balancing.
oneTBB Tasks to Run Computational Kernels
Demonstrates how two Intel® oneAPI Threading Building Blocks (oneTBB) tasks can run computational kernels on CPUs and GPUs, one task using SYCL code and the other using oneTBB code, and shows the difference between them.
Black-Scholes for Randomly Generated Portfolios
Demonstrates using vector math and random number generators in oneMKL to calculate the prices of options.
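As a sketch of the underlying calculation, the closed-form Black-Scholes prices for European call and put options can be written in plain Python. Here the standard library's math and random modules stand in for the oneMKL vector math and random number generators; the function names, parameter ranges, and default rate and volatility are assumptions for illustration, not values taken from the sample.

```python
import math
import random


def norm_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, rate, vol, years):
    """Return (call, put) prices for a European option."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * years)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


def random_portfolio(count, seed=0):
    """Generate (spot, strike, years) tuples; the ranges are hypothetical."""
    rng = random.Random(seed)
    return [(rng.uniform(10.0, 50.0), rng.uniform(10.0, 50.0), rng.uniform(0.25, 5.0))
            for _ in range(count)]


def price_portfolio(portfolio, rate=0.05, vol=0.2):
    """Price every option in the portfolio independently."""
    return [black_scholes(spot, strike, rate, vol, years)
            for spot, strike, years in portfolio]
```

Each option is priced independently of the others, so the same formula can be applied across a whole portfolio as a vector operation.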
Fourier Correlation
Demonstrates how to implement 1D and 2D Fourier correlations using SYCL and oneMKL.
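The 1D case can be sketched in plain Python as follows. This is an illustration of the technique rather than the sample's code: it uses a naive O(n^2) discrete Fourier transform from the standard library in place of the oneMKL FFT, and the function names are hypothetical. The circular correlation of two signals is the inverse transform of one spectrum multiplied by the complex conjugate of the other.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (a stand-in for a real FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        total = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * f * m / n)
                    for m in range(n))
        out.append(total / n if inverse else total)
    return out


def fourier_correlation(a, b):
    """Circular cross-correlation: corr[k] = sum over n of a[n + k] * b[n]."""
    fa, fb = dft(a), dft(b)
    product = [p * q.conjugate() for p, q in zip(fa, fb)]
    return [value.real for value in dft(product, inverse=True)]


def best_shift(a, b):
    """Return the circular shift of b that best aligns it with a."""
    corr = fourier_correlation(a, b)
    return max(range(len(corr)), key=corr.__getitem__)
```

The index of the largest correlation value gives the displacement between the two real-valued signals. With an FFT, the transforms take O(n log n) time instead of the O(n^2) of the direct sum.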
cuBLAS Migration Sample
This collection of code samples demonstrates the oneMKL equivalents of cuBLAS routines. Each cuBLAS sample source file shows the use of the corresponding oneMKL routines.
Analyzers & Debuggers
Design code early in the development cycle for optimal performance and accelerator offload, including threading, vectorization, memory, and power and thermal behavior.
Profile an Application Using Intel® VTune™ Profiler
Demonstrates multiple implementations of matrix multiplication using SYCL for CPUs and GPUs, and then analyzes them using Intel® VTune™ Profiler.
Profile an Application Using Intel® Advisor
Demonstrates multiple implementations of matrix multiplication using SYCL for CPUs and GPUs, and then analyzes them using Intel® Advisor.
Guided Matrix Multiply for Bad Buffers
Demonstrates how to identify different but related bugs in Unified Shared Memory (USM) and buffer matrix multiplier code. The sample includes the corrected code.
Optimized 2D Ray Tracer Rendering Program with Intel VTune Profiler
Use Intel VTune Profiler to identify performance opportunities by comparing different versions of the Tachyon sample, a 2D ray tracer rendering program. Improve the performance of serial programs by using parallel processing with OpenMP* or Intel oneTBB.
FPGAs
Accelerate innovation with Intel® FPGAs coupled with Intel toolkits, all optimized for a wide range of applications.
GNU Gzip or Snappy Decompression Engine
This reference design demonstrates implementations of Gzip or Snappy decompression engines on an Intel FPGA.
Loop Unroll
This tutorial demonstrates unrolling loops to improve throughput for a SYCL-compliant FPGA program.
Rendering & Ray Tracing
Create complex, photorealistic renderings that scale end-to-end on laptops, workstations, HPC, and cloud with fidelity-first, open source libraries.
Introduction to Ray Tracing with Intel® Embree
Demonstrates how to build a basic geometric ray tracing application with this performant ray tracing library.
Path Tracing with Intel Embree
Demonstrates using components of the Intel® Rendering Toolkit to implement basic path tracing and shows how to use some new features.