Intel® oneAPI Threading Building Blocks
Scalable Parallel Programming at Your Fingertips
Simplified Development for Parallel Applications
Intel® oneAPI Threading Building Blocks (oneTBB)† is a flexible performance library that simplifies the work of adding parallelism to complex applications across accelerated architectures, even if you’re not a threading expert.
oneTBB is well suited to a wide range of compute-intensive domains, such as:
- Numeric weather prediction
- Oceanography
- Astrophysics
- Genetic engineering
- Seismic exploration
- AI and automation
- Energy resource exploration
- Socioeconomics
†Intel® Threading Building Blocks (Intel® TBB) is now called Intel oneAPI Threading Building Blocks (oneTBB) to highlight that the tool is part of the oneAPI ecosystem.
Download as Part of the Toolkit
oneTBB is included in the Intel® oneAPI Base Toolkit, which is a core set of tools and libraries for developing high-performance, data-centric applications across diverse architectures.
Download the Stand-Alone Version
A stand-alone download of oneTBB is available. You can download binaries from Intel or choose your preferred repository.
Help oneTBB Evolve
oneTBB is part of the oneAPI industry standards initiative. We welcome you to participate.
Features
oneTBB differs from typical threading packages in the following ways:
Specify Logical Parallelism, Not Threads
A runtime library automatically maps logical parallelism onto threads, making the most efficient use of processor resources.
Targets Threading for Performance
oneTBB focuses on the specific goal of parallelizing computationally intensive work, which lets it deliver higher-level, simpler solutions.
Coexists with Other Threading Packages
oneTBB is compatible with other threading packages, so you can keep your legacy code as-is and use oneTBB for new implementations.
Emphasizes Scalable, Data-Parallel Programming
Rather than breaking up a program into functional blocks and assigning a separate thread to each, oneTBB emphasizes data-parallel programming, in which multiple threads work on different parts of a collection. This approach scales well to larger numbers of processors because the collection can be divided into smaller pieces, so program performance can increase as you add processors and cores.
Benchmarks
This benchmark illustrates the performance scalability of oneTBB.
Documentation & Code Samples
Documentation
Code Samples
Get Started
Learn how to use the parallel_for algorithm by locating a substring in a string.
Use oneTBB and SYCL*
See how to split the computational kernel for running between a CPU and GPU.
Learn how to use parallel_for and resumable tasks to split the computational kernel between a CPU and GPU.
Observe how two oneTBB tasks run similar computational kernels, one written with TBB code and the other with SYCL*-compliant code.
Advanced Scenarios
Examples in this repository show how to use the parallel_for algorithm for a 2D ray tracer and renderer, seismic wave simulation, and more.
Examples in this repository show how to use a flow graph for a self-organizing map, the Cholesky Factorization algorithm, and more.
Learn how to use the parallel_reduce algorithm for the Sieve of Eratosthenes method, and more.
View All Code Samples (GitHub)
Training
Optimization for Modern Architectures
Concurrent Containers
Scalable Memory Allocation
Specifications
Languages:
- Data Parallel C++ (DPC++)
  Note: Requires the Intel® oneAPI Base Toolkit.
- C++
Operating systems:
- Windows
- Linux
- macOS*
- Android* (additionally supported in the open source version)
For more information, see the system requirements.
Get Help
Your success is our success. Access these forums when you need assistance.