Explore Code Samples

Languages & Performance Optimization

Access coding techniques and tools for performant multiarchitecture development including SYCL*, C++, and Fortran.

Vector Add

This sample is the SYCL equivalent of a Hello, World! program. It demonstrates how to use C++ with SYCL to offload computations to a GPU, and covers the basics of using buffers and Unified Shared Memory (USM).

Fortran Coarray

Illustrates a guided approach to building and running a serial Fortran application, and then converting it to run in parallel using coarrays.

2D Finite-Difference Wave Propagation

Demonstrates using SYCL queues, buffers, and accessors to solve complex 2D acoustic isotropic wave differential equations.

Fortran-Based Edge Detection with GPU Offload

Use the Sobel Edge Detection algorithm to find object boundaries in a Portable Pixel Map (PPM) format image. The algorithm is implemented in three data-parallel steps: image smoothing, edge detection, and edge highlighting. Use Fortran to offload the workloads to your system's GPU.
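The three data-parallel steps named above can be sketched in plain Python as follows. This is only an illustration, not the sample's Fortran code: the 3x3 box blur used for smoothing, the `|gx| + |gy|` magnitude, the threshold value, and all function names are assumptions made for this sketch. Each output pixel depends only on a small neighborhood of the input, which is what makes every step suitable for GPU offload.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
OFFSETS = (-1, 0, 1)


def smooth(img):
    """Step 1: 3x3 box blur (assumed filter); border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            total = sum(img[y + dy][x + dx] for dy in OFFSETS for dx in OFFSETS)
            out[y][x] = total // 9
    return out


def sobel_magnitude(img):
    """Step 2: gradient magnitude |gx| + |gy|; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[dy + 1][dx + 1] * img[y + dy][x + dx]
                     for dy in OFFSETS for dx in OFFSETS)
            gy = sum(SOBEL_Y[dy + 1][dx + 1] * img[y + dy][x + dx]
                     for dy in OFFSETS for dx in OFFSETS)
            out[y][x] = abs(gx) + abs(gy)
    return out


def highlight(magnitude, threshold):
    """Step 3: mark pixels whose gradient exceeds the threshold as white."""
    return [[255 if v > threshold else 0 for v in row] for row in magnitude]


def detect_edges(img, threshold=500):
    """Run the three steps in order on a grayscale image (list of rows)."""
    return highlight(sobel_magnitude(smooth(img)), threshold)
```

For example, calling `detect_edges` on a small grayscale image whose left half is black and right half is white marks the columns around the vertical boundary.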

Find the Shortest Paths Between Pairs of Vertices

Demonstrates how to efficiently offload the compute-intensive Floyd-Warshall all-pairs shortest-path algorithm to the GPU.
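For readers unfamiliar with the algorithm, here is a minimal serial sketch of Floyd-Warshall in Python. It is not the sample's SYCL code; the function name and the adjacency-matrix input format are assumptions for illustration. For a fixed intermediate vertex `k`, the updates for all `(i, j)` pairs are independent of one another, which is the property a GPU version can exploit.

```python
import math


def floyd_warshall(weights):
    """Return the all-pairs shortest-path distance matrix.

    `weights[i][j]` is the edge weight from vertex i to vertex j,
    `math.inf` when there is no edge, and 0 on the diagonal.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # work on a copy
    for k in range(n):          # allow vertex k as an intermediate stop
        for i in range(n):      # the i and j loops are independent for fixed k
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist
```

The triple loop performs on the order of n³ updates for n vertices, which is why the work is a natural candidate for offload on larger graphs.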

Ocean FFT

Simulates an ocean heightfield using the Intel® oneAPI Math Kernel Library (oneMKL) fast Fourier transform (FFT) functionality and offloading to a GPU or CPU. The code originates from CUDA but shows migration to SYCL using the open source SYCLomatic tool.

Performance Libraries

Improve application performance and development for heterogeneous computing with these oneAPI-optimized libraries.

Optimize Applications Based on Available Resources with Dynamic Device Selection

Demonstrates how to use the Intel® oneAPI DPC++ Library (oneDPL) to apply dynamic device selection policies that can help determine on which device to run the application. It uses a basic sepia filter image conversion application to show different workloads performing differently based on policies such as auto-tune and load balancing.

oneTBB Tasks to Run Computational Kernels

Demonstrates how two Intel® oneAPI Threading Building Blocks (oneTBB) tasks can run similar computational kernels on CPUs and GPUs, with one task using SYCL code and the other using oneTBB code.

Black-Scholes for Randomly Generated Portfolios

Demonstrates using vector math and random number generators in oneMKL to calculate the prices of options.
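As a rough illustration of the underlying computation, the sketch below prices European call and put options with the Black-Scholes closed-form formula over a randomly generated portfolio. It uses only the Python standard library rather than oneMKL, so it shows the arithmetic and not the sample's vectorized implementation. The parameter ranges, the fixed rate and volatility, and all function names are assumptions made for this sketch.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, years, rate, vol):
    """Return (call, put) prices of a European option."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * years)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


def random_portfolio(count, seed=0):
    """Generate (spot, strike, years) triples from assumed uniform ranges."""
    rng = random.Random(seed)
    return [(rng.uniform(10.0, 50.0), rng.uniform(10.0, 50.0), rng.uniform(0.2, 2.0))
            for _ in range(count)]


def price_portfolio(portfolio, rate=0.05, vol=0.2):
    """Price every option in the portfolio; each option is independent."""
    return [black_scholes(s, k, t, rate, vol) for s, k, t in portfolio]
```

Because each option is priced independently, the loop in `price_portfolio` maps directly onto vector math routines applied to whole arrays of inputs.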

Fourier Correlation

Demonstrates how to implement 1D and 2D Fourier correlations using SYCL and oneMKL.
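The idea behind a 1D Fourier correlation can be sketched as follows: transform both signals, multiply one spectrum by the complex conjugate of the other, and transform back. The result is the circular cross-correlation, whose peak position gives the shift between the signals. This sketch substitutes a naive O(n²) discrete Fourier transform written with the standard library for the oneMKL FFT used by the sample; the function names are hypothetical.

```python
import cmath


def dft(signal, inverse=False):
    """Naive discrete Fourier transform (stand-in for a real FFT routine)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        total = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def fourier_correlation(a, b):
    """Circular cross-correlation: result[k] = sum over n of a[n + k] * b[n]."""
    spectrum_a = dft(a)
    spectrum_b = dft(b)
    product = [x * y.conjugate() for x, y in zip(spectrum_a, spectrum_b)]
    return [value.real for value in dft(product, inverse=True)]


def best_shift(a, b):
    """Index of the correlation peak, i.e. how far `a` is shifted relative to `b`."""
    corr = fourier_correlation(a, b)
    return max(range(len(corr)), key=corr.__getitem__)
```

If `a` is a copy of `b` circularly shifted right by three positions, `best_shift(a, b)` returns 3. A 2D correlation follows the same pattern with 2D transforms.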

cuBLAS Migration Sample

This collection of code samples demonstrates the oneMKL equivalents of cuBLAS routines. Each cuBLAS sample source file shows the use of the corresponding oneMKL routine.

AI & Analytics

Find samples to architect, train, and deploy models, as well as end-to-end workloads, common optimizations using popular frameworks, ways to get started with Python* libraries, and more.

PyTorch* Training Optimizations with Intel® Advanced Matrix Extensions (Intel® AMX)

Illustrates how training a PyTorch* model using Intel® AMX changes performance.

Interactive Chat Based on a DialoGPT Model Using Intel® Extension for PyTorch* Quantization

Demonstrates how to create an interactive chat based on the pretrained DialoGPT model and then add the Intel® Extension for PyTorch* quantization to it. Speed up operations on processors with an int8 data format and specialized computer instructions.

Fine-tuning a Text Classification Model with Intel® Neural Compressor

Demonstrates fine-tuning a text model for emotion classification tasks using quantization-aware training from Intel® Neural Compressor.

TensorFlow* fine-tuning of LLMs with AMX and Bfloat16 Sample

Demonstrates how to fine-tune a GPT-J large language model (LLM) on the GLUE CoLA dataset with the Intel® Optimization for TensorFlow*. The sample boosts performance by taking advantage of Intel® hardware features such as AVX-512 Vector Neural Network Instructions (AVX512 VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX).

Enable Automixed Precision for Transfer Learning

Demonstrates the end-to-end pipeline tasks typically performed in a deep learning use case and describes the benefits.

Intel® AI Reference Models for Intel® Architecture Inference with FP32 and int8

Demonstrates how combining TensorFlow* ResNet* 50 inference and oneMKL can increase inference performance.

Analyzers & Debuggers

Design code early in the development cycle for optimal performance and accelerator offload, including threading, vectorization, memory, and power and thermal behavior.

Profile an Application Using Intel® VTune™ Profiler

Demonstrates multiple implementations of matrix multiplication using SYCL for CPUs and GPUs, and then analyzes them using Intel® VTune™ Profiler.

Profile an Application Using Intel® Advisor

Demonstrates multiple implementations of matrix multiplication using SYCL for CPUs and GPUs, and then analyzes them using Intel® Advisor.

Guided Matrix Multiply for Bad Buffers

Demonstrates how to identify different but related bugs in matrix multiplication code that uses Unified Shared Memory (USM) and buffers. The sample includes the corrected code.

Optimized 2D Ray Tracer Rendering Program with Intel VTune Profiler

Use Intel VTune Profiler to identify performance opportunities by comparing different versions of the Tachyon sample, a 2D ray tracer rendering program. Improve the performance of serial programs by using parallel processing with OpenMP* or Intel oneTBB.

Rendering & Ray Tracing

Create complex, photorealistic renderings that scale end-to-end on laptops, workstations, HPC, and cloud with fidelity-first, open source libraries.

Introduction to Ray Tracing with Intel® Embree

Demonstrates how to build a basic ray tracing application around simple geometry with this performant ray tracing library.

Path Tracing with Intel Embree

Demonstrates using components of the Intel® Rendering Toolkit to implement basic path tracing and shows how to use some new features.

Code Samples

Explore the latest, ready-to-use code samples in GitHub* to develop, offload, and optimize
multiarchitecture applications targeting CPUs and GPUs.
