Choose a High-Performance FFT in oneMKL or Intel® IPP

Note: This content applies to Intel® oneAPI Math Kernel Library (oneMKL) 2018.0 or later and Intel® Integrated Performance Primitives (Intel® IPP) 2018.0 or later.

Objective

Get information to help you decide whether a fast Fourier transform (FFT) algorithm in oneMKL or Intel IPP is best suited for your application.

Overview

Fourier transforms are used in signal processing, image processing, physics, statistics, finance, cryptography, and many other areas. The discrete Fourier transform (DFT) mathematical operation converts a signal from the time domain to the frequency domain and back.

DFT processing time can dominate a software application. Using an FFT, a fast algorithm for computing the DFT, reduces the number of arithmetic operations from O(N²) to O(N log₂ N). FFTs in oneMKL and Intel IPP are highly optimized for Intel® architecture-based multicore processors using the latest instruction sets, parallelism, and algorithms.
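To make the operation-count difference concrete, the following is a minimal, unoptimized sketch in plain Python. It is not how oneMKL or Intel IPP implement their transforms. The function names (`naive_dft`, `fft_radix2`, `operation_estimates`) are hypothetical and chosen only for illustration, and the radix-2 routine assumes a power-of-two length.

```python
import cmath
import math


def naive_dft(x):
    """Evaluate the DFT definition directly: N output bins, N terms each."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT (illustrative, power-of-two sizes only)."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # transform of the even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def operation_estimates(n):
    """Rough growth estimates: N^2 for the direct DFT, N*log2(N) for the FFT."""
    return n * n, n * math.log2(n)
```

Both routines return the same spectrum up to rounding error. For N = 1024 the estimates are about one million operations for the direct form versus about ten thousand for the FFT, which is why an FFT is preferred for all but the smallest sizes.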

This article provides guidance for selecting the best FFT for your application. For a summary of the oneMKL and Intel IPP libraries, see Table 1. For details, see the oneMKL website and the Intel IPP website.

Table 1. Comparison of oneMKL and Intel IPP Functionality

	oneMKL	Intel IPP
Target Applications	Engineering, scientific, and financial applications	Imaging, vision, signal, security, and storage applications
Library Structure	BLAS; Sparse BLAS; LAPACK; ScaLAPACK; FFT; Vector math; Vector statistics; Random-number generators; Partial differential equations; Optimization solvers; Sparse solvers; Deep neural network¹	Signal processing; Image processing; Cryptography; Data compression
Linkage Models	Static; Dynamic; Custom dynamic	Static; Dynamic; Custom dynamic
Operating Systems	Windows*; Linux*; macOS*	Windows*; Linux*; Android²; macOS*
Processor Support	IA-32 and Intel® 64 architecture-based platforms and compatible platforms	IA-32 and Intel® 64 architecture-based platforms and compatible platforms

Both libraries contain generic code that is optimized for processors with Intel® Streaming SIMD Extensions (Intel® SSE) and code optimized for processors with Intel SSE2, Intel SSE3, Intel SSE4.1, Intel SSE4.2, Intel® Advanced Vector Extensions, Intel® Advanced Vector Extensions 2, and Intel® Advanced Vector Extensions 512 instruction sets.

¹ Deep Neural Network has not been a part of the package since 2020.
² Android has not been supported since 2020.

FFT Features in oneMKL and Intel IPP

These FFTs are targeted for:

oneMKL: engineering and scientific applications
Intel IPP: media and communication applications

To help you decide which FFT is best for your application, see Table 2.

Table 2. Comparison of oneMKL and Intel IPP DFT Features

Feature	oneMKL	Intel IPP
API	DFT; Cluster FFT; FFTW 2.x and 3.x	FFT; DFT
Interfaces	C, Fortran, and DPC++ APIs³; LP64 (64-bit long and pointer); ILP64 (64-bit int, long, and pointer)	C; LP64 only
Dimensions	1-D up to 7-D	1-D (signal processing); 2-D (image processing)
Transform Sizes	32-bit platforms: 2³¹ - 1 maximum size; 64-bit platforms: 2⁶⁴ maximum size	FFT: powers of 2 only⁴; DFT: 2³² maximum size⁴
Mixed Radix Support	2, 3, 5, 7, 11, 13, and several larger kernels⁵	DFT: 2, 3, 5, 7, 11, 13 kernels⁵
Data Types (see Table 3 for details)	Real and complex; single and double precision	Real and complex; single and double precision
Scaling	Transforms can be scaled by an arbitrary floating-point number (with the same precision as the input data)	Integer ("fixed") scaling; forward 1/N; inverse 1/N; forward and inverse SQRT(1/N)
Threading	Platform dependent. IA-32: all (except a single 1D transform whose size is not a power of two). Intel 64: all (except in-place power-of-two transforms)	1D and 2D⁶

³ DPC++ APIs have been available since version 2021.
⁴ The maximum size limits are:

For double-precision complex DFT (64fc), the length upper bound is 67108863 (2²⁶ - 1).
For single-precision complex DFT (32fc), the length upper bound is 134217727 (2²⁷ - 1).
For double-precision complex FFT (64fc), the length upper bound is 2²⁷.
For single-precision complex FFT (32fc), the length upper bound is 2²⁸.

⁵ Both libraries handle arbitrary radices in O(N log N) time, but the radices listed in the table are better optimized than others.
⁶ Since Intel IPP 2021, only the nonthreaded version is available.
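Because the radices listed in the Mixed Radix Support row of Table 2 are better optimized than others, it can help to check how a candidate transform length factors before settling on it. The sketch below is a plain-Python helper and does not call either library. The names `OPTIMIZED_RADICES`, `residual_after_small_radices`, `uses_only_optimized_radices`, and `next_friendly_size` are hypothetical. The radix list is taken from Table 2 and does not include the "several larger kernels" that oneMKL also provides.

```python
# Radices named in Table 2 (assumption: larger oneMKL kernels are not modeled).
OPTIMIZED_RADICES = (2, 3, 5, 7, 11, 13)


def residual_after_small_radices(n, radices=OPTIMIZED_RADICES):
    """Divide out the listed radices; return (factor counts, leftover cofactor)."""
    if n < 1:
        raise ValueError("transform length must be positive")
    counts = {}
    for r in radices:
        while n % r == 0:
            n //= r
            counts[r] = counts.get(r, 0) + 1
    return counts, n


def uses_only_optimized_radices(n):
    """True when n is a product of the listed radices only."""
    return residual_after_small_radices(n)[1] == 1


def next_friendly_size(n):
    """Smallest length >= n that factors entirely into the listed radices."""
    m = n
    while not uses_only_optimized_radices(m):
        m += 1
    return m
```

A leftover cofactor greater than 1 means the length contains a prime factor outside the list. Rounding up to a friendlier length is only an option when zero-padding the signal is acceptable for the application, because padding changes the frequency grid of the result.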

Data Types and Formats

The FFTs in oneMKL and Intel IPP support a variety of data types and formats for storing signal values. Mixed-type interfaces are also supported. For details, see the product documentation.

Table 3. Comparison of oneMKL and Intel IPP Data Types and Formats

Feature	oneMKL	Intel IPP
Real FFTs
Precision	Single, double	Single, double
1D Data Types	Real for all dimensions	Real for all dimensions
2D Data Types	Real for all dimensions	Real for all dimensions
1D Packed Formats	CCS, Pack, Perm, CCE	CCS, Pack, Perm
2D Packed Formats	CCS, Pack, Perm, CCE	RCPack2D
3D Packed Formats	CCE	n/a
Format Conversion Functions	n/a	n/a
Complex FFTs
Precision	Single, double	Single, double
1D Data Types	Complex for all dimensions	Complex for all dimensions
2D Data Types	Complex for all dimensions	Complex for all dimensions

Formats Legend

CCE: Stores the values of the first half of the output complex conjugate-even signal.
CCS: Same as the CCE format for 1D transforms. It differs slightly for 2D real transforms. CCS, Pack, and Perm are not supported for transforms of rank 3 and higher.
Pack: Compact representation of a complex conjugate-symmetric sequence.
Perm: Same as the Pack format for odd lengths, and a permutation of the Pack format for even lengths.
RCPack2D: Exploits the complex conjugate symmetry of the transformed data to store only half of the resulting Fourier coefficients.
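The 1D packed formats can be illustrated with a short plain-Python sketch. It assumes the element orderings commonly documented for these formats: CCS interleaves real and imaginary parts of the first N/2 + 1 coefficients, Pack drops the imaginary parts that are always zero, and Perm moves the last real value of an even-length Pack sequence into the second slot. Treat these orderings as an assumption and confirm them in the product documentation before relying on them. The function names are hypothetical, and the code computes the spectrum with a direct DFT rather than calling either library.

```python
import cmath


def real_dft_half(x):
    """First N//2 + 1 DFT coefficients of a real sequence (the rest are conjugates)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def to_ccs(half):
    """CCS (assumed layout): R0, I0, R1, I1, ... for every stored coefficient."""
    out = []
    for c in half:
        out += [c.real, c.imag]
    return out


def to_pack(half, n):
    """Pack (assumed layout): CCS without the imaginary parts that are always zero."""
    ccs = to_ccs(half)
    if n % 2 == 0:
        return [ccs[0]] + ccs[2:-1]  # drop Im(X[0]) and Im(X[N/2])
    return [ccs[0]] + ccs[2:]        # drop Im(X[0]) only


def to_perm(half, n):
    """Perm (assumed layout): Pack with Re(X[N/2]) moved to index 1 for even N."""
    pack = to_pack(half, n)
    if n % 2 == 0:
        return [pack[0], pack[-1]] + pack[1:-1]
    return pack
```

For a real input of length N, the CCS layout in this sketch holds N + 2 values when N is even and N + 1 when N is odd, while Pack and Perm hold exactly N values. That is what allows a real transform to be stored in the same amount of memory as its input.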

Performance

oneMKL and Intel IPP are optimized for current and future Intel processors, and each is tuned for a different area:

oneMKL is suited to the large problem sizes typical of Fortran and C/C++ high-performance computing software, such as engineering, scientific, and financial applications.
Intel IPP is designed for smaller problem sizes, including those used in multimedia, data processing, communications, and embedded C/C++ applications.

Choosing the Best FFT for Your Application

Before making a decision, you must understand the specific requirements and constraints of the application. Consider these questions:

What are the performance requirements for the application? How is performance measured? What are the measurement criteria? Is a specific benchmark used? What are the known performance bottlenecks?
What type of application is being developed? What are the main operations being performed and on what kind of data?
What API is currently being used in the application for transforms? What programming languages is the application code written in?
Does the FFT output data need to be scaled (normalized)? What type of scaling is required?
What kind of input and output data does the transform process? What are the valid and invalid values? What type of precision is required?
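The scaling question matters because the libraries expose different conventions (see the Scaling row of Table 2), and a forward transform followed by an inverse transform only returns the original signal when the two scale factors multiply to 1/N. The following plain-Python sketch demonstrates this with a direct DFT. It does not use either library's API, and the names `dft`, `round_trip`, and `scaling_conventions` are hypothetical.

```python
import cmath
import math


def dft(x, sign, scale):
    """Direct DFT with an explicit exponent sign (-1 forward, +1 inverse) and scale."""
    n = len(x)
    return [
        scale * sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def round_trip(x, forward_scale, inverse_scale):
    """Apply a scaled forward transform, then a scaled inverse transform."""
    spectrum = dft(x, -1, forward_scale)
    return dft(spectrum, +1, inverse_scale)


def scaling_conventions(n):
    """Three (forward, inverse) scale pairs whose product is 1/N."""
    root = 1.0 / math.sqrt(n)
    return {
        "forward 1/N": (1.0 / n, 1.0),
        "inverse 1/N": (1.0, 1.0 / n),
        "forward and inverse 1/sqrt(N)": (root, root),
    }
```

With no scaling on either side, the round trip returns the input multiplied by N. When porting code between libraries, it is worth checking which convention the existing code assumes before comparing outputs.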

Summary

oneMKL and Intel IPP both provide optimized FFT functions. For more detailed information on the FFT APIs, parameters, and formats, see the following documents:

Developer Reference for oneMKL (see the Fourier Transform Function chapter)
Developer Guide and Reference for Intel IPP
