Intel® oneAPI Math Kernel Library (oneMKL) Release Notes

Where to Find the Release

Intel® oneAPI Math Kernel Library

2024.2.2

System Requirements

This is a bugfix release.

Fixed Issues

Significant performance regressions were observed in oneMKL 2024.2.0 and oneMKL 2024.2.1 for key mkl_sparse_?_mv and mkl_sparse_?_mm workloads in the C/Fortran Inspector-Executor Sparse BLAS routines on CPU. This oneMKL 2024.2.2 patch release resolves many of these regressions and, in some cases (for example, the BSR matrix format in mkl_sparse_?_mv/mkl_sparse_?_mm), improves CPU performance beyond what was previously available.
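For readers unfamiliar with the BSR (block sparse row) format named above, the following is a minimal reference sketch of a BSR matrix-vector product in plain C++. It does not call oneMKL and says nothing about how mkl_sparse_?_mv is implemented. The names BsrMatrix and bsr_mv are hypothetical, and the sketch assumes square blocks stored row-major with 0-based indexing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for a BSR matrix (0-based indices, row-major blocks).
struct BsrMatrix {
    std::size_t block_rows;            // number of block rows
    std::size_t block_cols;            // number of block columns
    std::size_t block_size;            // each block is block_size x block_size
    std::vector<std::size_t> row_ptr;  // block_rows + 1 offsets into col_ind
    std::vector<std::size_t> col_ind;  // block column index of each stored block
    std::vector<double> values;        // block_size * block_size entries per block
};

// Reference y = A * x for a BSR matrix. Illustrative only, not tuned.
std::vector<double> bsr_mv(const BsrMatrix& a, const std::vector<double>& x) {
    const std::size_t bs = a.block_size;
    assert(a.row_ptr.size() == a.block_rows + 1);
    assert(x.size() == a.block_cols * bs);
    assert(a.values.size() == a.col_ind.size() * bs * bs);

    std::vector<double> y(a.block_rows * bs, 0.0);
    for (std::size_t bi = 0; bi < a.block_rows; ++bi) {
        // Visit every stored block in block row bi.
        for (std::size_t k = a.row_ptr[bi]; k < a.row_ptr[bi + 1]; ++k) {
            const std::size_t bj = a.col_ind[k];
            const double* blk = &a.values[k * bs * bs];
            // Dense block times the matching slice of x.
            for (std::size_t r = 0; r < bs; ++r) {
                double acc = 0.0;
                for (std::size_t c = 0; c < bs; ++c) {
                    acc += blk[r * bs + c] * x[bj * bs + c];
                }
                y[bi * bs + r] += acc;
            }
        }
    }
    return y;
}
```

Because each stored block is a small dense matrix, the inner loops operate on contiguous data, which is the usual motivation for choosing BSR over CSR when nonzeros cluster in blocks.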

2024.2.1

System Requirements

This is a bugfix release.

Fixed Issues

Fixed crash in half-precision GEMM on the processor graphics of Intel® Core™ Ultra processors (Series 1).
Fixed sporadic accuracy issues in some double precision complex BLAS routines on the Intel® Data Center GPU Max Series.
Fixed potential out-of-bounds accesses which may have resulted in page faults in several Sparse BLAS SYCL routines.
Fixed an issue where some VM routines did not raise floating-point exceptions.

Optimization

Improved large power of two real and complex FFT performance on Intel® Arc™ A-Series Graphics.

Known Issues and Limitations

oneMKL RNG Device API Gaussian distribution may provide incorrect results on the processor graphics of Intel® Core™ Ultra Processors (code-named Lunar Lake).
oneMKL RNG Device API combination of Philox4x32x10 engine and SkipAhead routine may provide incorrect results on the processor graphics of Intel® Core™ Ultra Processors (code-named Lunar Lake).

2024.2

System Requirements Bug Fix Log

New Features and Optimizations

BLAS
- Features
  - Enabled out-of-place blas::trmm and blas::trsm SYCL* APIs.
  - Enabled support for the processor graphics of Intel® Core™ Ultra Processors (code-named Lunar Lake).
- Optimizations
  - Improved performance of level-1 BLAS functions for complex types.
Sparse BLAS
- Features
  - Introduced sparse::matmatd to compute sparse matrix times sparse matrix with dense matrix output.
LAPACK
- Features
  - Introduced SYCL* USM APIs to compute least squares solutions of general matrices (lapack::gels).
  - Extended SYCL* USM lapack::gels_batch group APIs to support transposed underdetermined case.
  - Unified exception reporting for SYCL* interfaces.
- Optimizations
  - Improved performance of double precision complex divide-and-conquer eigensolver (lapack::heevd) and generalized eigensolver (lapack::hegvd) USM APIs on Intel® Data Center GPU Max Series as well as for C and Fortran OpenMP* offloading (zheevd, zhegvd).
  - Improved performance of QR factorization (lapack::geqrf) on Intel® GPU.
  - Improved performance of Cholesky factorization (?potrf) and tridiagonal divide-and-conquer eigensolver (?stedc) for small and medium sizes on CPU with OpenMP* threading.
DFT
- Features
  - Enabled using oneMKL DFT SYCL* API within SYCL* Graph with in-order queue.
  - Enabled real out-of-place and complex in-place and out-of-place FFT with very large prime factor (> 2^22) and non-unit stride on CPU.
  - Enabled FWD_STRIDES and BWD_STRIDES parameters in the oneapi::mkl::dft::descriptor::get_value member function.
- Optimizations
  - Improved performance on Intel® Data Center GPU Max Series of real and complex 2D and 3D FFT whose length in each dimension is between 64 and 1024 and with prime factorization involving only 2, 3 and 5 prime numbers.
  - Improved performance on Intel® Data Center GPU Max Series of 3D real FFT of size NxNx2M with N odd, N and M less than 2000 and with prime factorization involving only prime numbers no larger than 13.
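The size conditions in the DFT optimization notes above can be checked programmatically. The sketch below is a plain C++ helper, independent of oneMKL, that tests whether a length has only small prime factors. The function names are hypothetical, and reading "between 64 and 1024" as an inclusive range is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true if n (n >= 1) has no prime factor larger than max_prime.
bool is_smooth(std::uint64_t n, std::uint64_t max_prime) {
    assert(n >= 1);
    // Dividing by composite p is harmless: its prime factors were removed earlier.
    for (std::uint64_t p = 2; p <= max_prime && n > 1; ++p) {
        while (n % p == 0) {
            n /= p;
        }
    }
    return n == 1;
}

// Hypothetical check for the first note: a 2D or 3D FFT whose length in each
// dimension lies in [64, 1024] (assumed inclusive) and factors into 2, 3 and 5 only.
bool matches_small_radix_note(const std::vector<std::uint64_t>& dims) {
    if (dims.size() != 2 && dims.size() != 3) {
        return false;
    }
    for (std::uint64_t d : dims) {
        if (d < 64 || d > 1024 || !is_smooth(d, 5)) {
            return false;
        }
    }
    return true;
}
```

The same is_smooth helper can be called with max_prime set to 13 when reasoning about the second note.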
Vector Math
- Features
  - Added complete support for a subset of Bessel functions: I0(), I1(), J0(), J1(), Jn(), Y0(), Y1(), Yn() on CPU and Intel® GPU via all C, Fortran, and SYCL* APIs.
- Optimizations
  - Improved performance of double precision lgamma function on Intel® GPU.
Vector Statistics
- Optimizations
  - Improved performance of sub-stream-based parallelization mode for mrg32k3a SYCL* Host API.

Known Issues and Limitations

Offloading complex double precision TPMV or TRMV to the integrated GPU on Intel® Core™ Ultra Processors under Windows* may cause issues.
oneMKL DFT SYCL* APIs using SYCL* buffer for data input do not support SYCL* sub-buffer inputs for a range of large power of two sizes [2²¹,2²⁶] 1D complex FFT.
oneMKL FFT with a large prime factor (larger than 1024) may fail on Intel® Data Center GPU Max Series.
Single precision backward out-of-place complex batched FFT of size 4096x4096 may hang on Intel® Iris® Xe MAX Graphics when using the SYCL* buffer API and the Level Zero runtime. As a workaround, use the SYCL* USM API or the OpenCL* runtime.
FFT with Fortran OpenMP* offload on Windows* and Intel® Arc™ A-Series Graphics or Intel® Data Center GPU Flex Series may crash when using the OpenMP* 5.1 dispatch construct.
Some Sparse BLAS SYCL* examples (sparse_gemm_col_major/sparse_gemm_row_major) are known to fail with oneMKL on Windows* when run in Debug mode. Please use Release mode for this functionality on Windows*.
Some Sparse BLAS C OpenMP* Offload 5.1 APIs (mkl_sparse_sp2m and mkl_sparse_x_trsm) with OpenCL* backend and with asynchronous behaviour may result in hangs or crashes on Intel® GPUs. The workaround is to use synchronous calls or the Level Zero backend.
Using a lower triangular matrix for sparse matrix-vector multiplication with 1-based indexing and OpenMP* Offload in C with mkl_sparse_optimize and mkl_sparse_?_mv can sporadically provide incorrect output with the Level Zero backend and OpenMP* 5.1 version on Intel® Data Center GPU Max Series. As a workaround, use OpenCL* backend or Level Zero with OpenMP* version <= 5.0.
Summary statistics routines for 3rd central sum/moment calculation may sporadically provide incorrect results in the case of Intel® Data Center GPU Max Series.

Deprecation/Removal

The "target variant dispatch" construct in the Intel Extensions to OpenMP* has been deprecated as of the 2024.2 release and is scheduled to be removed in the 2025.0 release. Users should use the OpenMP* specification syntax "dispatch" instead.

2024.1

System Requirements Bug Fix Log

New Features and Optimizations

Intel® Optimized High Performance Conjugate Gradient Benchmark
- Features
  - Introduced the HPCG benchmark for Intel® GPUs, optimized for clusters of nodes each with one or more Intel® Data Center GPU Max Series GPUs attached.
BLAS
- Features
  - Introduced Conditional Numerical Reproducibility support for level-3 routines on Intel® Data Center GPU Max Series.
  - Introduced 32-bit SYCL* APIs for all BLAS group batch routines with integer SYCL* USM pointer or SYCL* buffer inputs.
- Optimizations
  - Improved performance for numerous level-2 APIs on Intel® Data Center GPU Max Series.
  - Improved performance for complex double precision and TF32 level 3 routines on Intel® Data Center GPU Max Series.
Sparse BLAS
- Features
  - Introduced sparse::trsm and sparse::optimize_trsm SYCL* APIs with support for CSR format sparse triangular solves with multiple dense right-hand sides in row-major or column-major layout.
  - Introduced new sparse::trsv SYCL* API with support for fused alpha scaling of the right-hand side in the sparse triangular solve.
  - Introduced the sparse::set_coo_data SYCL* API, which loads sparse coordinate (COO) format matrix data into a sparse::matrix_handle_t object on CPU and GPU devices.
  - Extended support of SYCL* APIs using a sparse::matrix_handle_t object with COO format data:
    - CPU: sparse::omatcopy, sparse::gemv, sparse::trmv, sparse::gemvdot, sparse::trsv, sparse::trsm and sparse::gemm APIs
    - GPU: sparse::omatcopy, sparse::gemv APIs
  - Introduced a new C example demonstrating sparse format conversions using Inspector-Executor Sparse BLAS APIs. The example is located at $MKLROOT/share/doc/mkl/examples.
- Optimizations
  - Improved support and performance for sparse::gemv using complex data with the CSR format with all non-transpose/transpose/conjugate-transpose operations.
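As background for the COO (coordinate) format support listed above, the following plain C++ sketch shows the three arrays that make up COO data and a reference matrix-vector product over them. It does not use the oneMKL SYCL* API. CooMatrix and coo_gemv are hypothetical names, and 0-based indexing is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical COO container: one (row, column, value) triple per nonzero.
struct CooMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ind;  // row index of each nonzero
    std::vector<std::size_t> col_ind;  // column index of each nonzero
    std::vector<double> values;        // value of each nonzero
};

// Reference y = alpha * A * x + beta * y. Illustrative only, not tuned.
void coo_gemv(double alpha, const CooMatrix& a, const std::vector<double>& x,
              double beta, std::vector<double>& y) {
    assert(a.row_ind.size() == a.values.size());
    assert(a.col_ind.size() == a.values.size());
    assert(x.size() == a.cols && y.size() == a.rows);

    // Scale the existing output first, then scatter each nonzero's contribution.
    for (double& yi : y) {
        yi *= beta;
    }
    for (std::size_t k = 0; k < a.values.size(); ++k) {
        assert(a.row_ind[k] < a.rows && a.col_ind[k] < a.cols);
        y[a.row_ind[k]] += alpha * a.values[k] * x[a.col_ind[k]];
    }
}
```

COO places no ordering requirement on the triples in this sketch, which makes it convenient as an assembly format before converting to CSR or BSR.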
LAPACK
- Features
  - Introduced new routines and integrated bug fixes from Netlib LAPACK 3.11.0. New functionality includes level-3 BLAS solvers for triangular systems (?latrs3) and triangular Sylvester equations (?trsyl3) and a new algorithm for solving least squares problems (?gelst). oneMKL LAPACK functionality is now aligned with Netlib LAPACK 3.11.0.
  - Introduced SYCL* USM APIs to compute batched group least squares solutions for general matrices (lapack::gels_batch).
  - Introduced SYCL* and C/Fortran APIs to compute approximate singular value decompositions of a batch of matrices (lapack::gesvda_batch), and enabled C/Fortran OpenMP* offload support.
- Optimizations
  - Improved performance for batched group LU inverse (lapack::getri_batch) on Intel® GPUs for SYCL* APIs, especially for a smaller number of larger matrices.
  - Improved performance for double precision divide-and-conquer eigensolver (lapack::syevd) and generalized eigensolver (lapack::sygvd) on Intel® Data Center GPU Max Series as well as for C and Fortran OpenMP* offload (dsyevd, dsygvd).
  - Improved performance of QR factorization (lapack::geqrf) on Intel® Data Center GPU Max Series for SYCL* USM APIs as well as for C and Fortran OpenMP* offload (?geqrf).
DFT
- Features
  - Introduced new configuration parameters FWD_STRIDES and BWD_STRIDES for the DFT SYCL* API.
- Optimizations
  - Improved FFT performance on Intel® Data Center GPU Max Series for 1D complex FFT of large power of two size and batched 2D complex FFT of medium to large power of two size.
  - Improved FFT performance on Intel® Data Center GPU Max Series for 3D real FFT of medium power of two size or small odd size.
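The FWD_STRIDES and BWD_STRIDES parameters describe the data layout of the forward and backward domains separately. As a hedged illustration of the addressing they imply, the sketch below computes element offsets for a 2D real FFT stored out-of-place in row-major order. It does not call oneMKL; the helper names are hypothetical, and the packed layouts shown (N2 reals per row in the forward domain, N2/2 + 1 complex values per row in the backward domain) are an assumed common case rather than the only valid configuration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// {offset of first element, stride of dimension 1, stride of dimension 2}
using Strides2D = std::array<std::int64_t, 3>;

// Offset of element (n1, n2), counted in elements of the domain's data type.
std::int64_t element_offset(const Strides2D& s, std::int64_t n1, std::int64_t n2) {
    return s[0] + n1 * s[1] + n2 * s[2];
}

// Forward (real) domain of an N1 x N2 transform: packed rows of N2 reals.
Strides2D fwd_strides_real_2d(std::int64_t n2) {
    assert(n2 > 0);
    return {0, n2, 1};
}

// Backward (complex) domain: packed rows of N2/2 + 1 complex values,
// since the last dimension of a real transform is conjugate-even.
Strides2D bwd_strides_real_2d(std::int64_t n2) {
    assert(n2 > 0);
    return {0, n2 / 2 + 1, 1};
}
```

Keeping the two stride sets tied to the domain rather than to "input" and "output" means the same descriptor configuration describes both the forward and the backward transform.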
Vector Math
- Features
  - Introduced oneMKL VM Support for a subset of Bessel functions for C API and SYCL* API: I0(), I1(), J0(), J1(), Jn(), Y0(), Y1(), Yn().
- Optimizations
  - Improved performance for logb() and nextafter() on Intel® GPUs.
Vector Statistics
- Features
  - Introduced VERBOSE mode support for RNG C/Fortran API.
  - Introduced sub-stream based parallelization mode for SYCL API of mrg32k3a engine.
Library Engineering
- Advance Notice
  - Starting oneMKL 2025.0, a user-supplied "mkl_progress" function will not redefine the default "mkl_progress" function automatically and the "mkl_set_progress" function must be used to specify any overrides.

Known Issues and Limitations

OpenMP* offload of Fortran group batch routines to Intel® GPU on Windows* may produce incorrect results with the OpenMP* 5.1 “dispatch” construct. Use the “target variant dispatch” construct instead.
Certain sizes/configurations of int8 GEMMs may return incorrect results on Intel® Data Center GPU Max Series when B is transposed (column-major) or A is transposed (row-major).
oneMKL DFT SYCL* APIs using SYCL* buffer for data input do not support SYCL* sub-buffer inputs for a range of large power of two sizes [2²¹,2²⁶] 1D complex FFT.
Double precision FFT of sizes that are multiples of very large primes may produce incorrect results on CPU.
oneMKL FFT with a large prime factor (larger than 1024) may fail on Intel® Data Center GPU Max Series.
On Intel® Iris® Xe MAX Graphics, {c,s}getrfnp_batch functions may hang or have a segmentation fault. As a workaround, use the {c,s}getrfnp_batch_strided functions instead.
Some Sparse BLAS SYCL* examples (sparse_gemm_col_major/sparse_gemm_row_major) are known to fail with oneMKL on Windows* when run in Debug mode. Please use Release mode for this functionality on Windows*.
Using a lower triangular matrix for sparse matrix-vector multiplication with 1-based indexing and OpenMP* Offload in C with mkl_sparse_optimize and mkl_sparse_?_mv can sporadically provide incorrect output with the Level Zero backend and OpenMP* 5.1 version on Intel® Data Center GPU Max Series. As a workaround, use OpenCL* backend or Level Zero with OpenMP* version <= 5.0.
The deprecated sparse::release_matrix_handle API without a sycl::queue input may fail to wait for previously enqueued commands to be completed on an in-order queue if the sycl::event corresponding to the queue's last command is not provided as a dependency to the release API, or a queue synchronization point is not commanded before the release API call.
C and Fortran offload examples may crash after completing computations. This is known to affect a subset of Intel® GPUs, including Intel® Data Center GPU Flex Series but not Intel® Data Center GPU Max Series. To work around this issue, switch the offload plugin to OpenCL* by setting ONEAPI_DEVICE_SELECTOR=opencl:gpu. This behavior does not affect the accuracy or performance of oneMKL functions and will be fixed in 2024.2.

Deprecation/Removal

The INPUT_STRIDES and OUTPUT_STRIDES configuration parameters are deprecated for the oneMKL SYCL DFT APIs and will be removed in the oneMKL 2026.0 release. Please use the FWD_STRIDES and BWD_STRIDES configuration parameters instead.
The random number generation save_state/load_state APIs with std::string as the second parameter have been deprecated and will be removed in the oneMKL 2026.0 release. Please use the save_state/load_state APIs with const std::uint8_t* as the second parameter instead.
The sparse triangular solve sparse::trsv SYCL API without an “alpha” scaling parameter has been deprecated. Please use the new sparse triangular solve with “alpha” as 1 or other value if desired.
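To make the role of the "alpha" parameter concrete, the following plain C++ sketch solves L * x = alpha * b by forward substitution for a lower triangular matrix in CSR format. It is a reference illustration only and does not call sparse::trsv. The function name csr_lower_trsv is hypothetical, and the sketch assumes 0-based indexing with a nonzero diagonal entry stored in every row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference solve of L * x = alpha * b, with L lower triangular in CSR format.
// Entries above the diagonal, if present, are ignored.
std::vector<double> csr_lower_trsv(double alpha,
                                   const std::vector<std::size_t>& row_ptr,
                                   const std::vector<std::size_t>& col_ind,
                                   const std::vector<double>& values,
                                   const std::vector<double>& b) {
    const std::size_t n = b.size();
    assert(row_ptr.size() == n + 1);
    assert(col_ind.size() == values.size());

    std::vector<double> x(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        double sum = alpha * b[i];  // the scaling is fused into the solve
        double diag = 0.0;
        for (std::size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            const std::size_t j = col_ind[k];
            if (j < i) {
                sum -= values[k] * x[j];  // subtract already-solved unknowns
            } else if (j == i) {
                diag = values[k];
            }
        }
        assert(diag != 0.0);
        x[i] = sum / diag;
    }
    return x;
}
```

Passing alpha as 1 reproduces the behavior of an unscaled solve, which is the migration path the deprecation note describes.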

Notes

For the 2024.1 release, the Third Party Programs file has been included as a section in this product’s release notes rather than as a separate text file.

Third Party Programs File

Intel(R) oneAPI Math Kernel Library (oneMKL) Third Party Programs File

This file is the "third-party-programs.txt" file specified in the associated Intel end user license agreement for the Intel software you are licensing.

Third party programs and their corresponding required notices and/or license terms are listed below.

-------------------------------------------------------------

1. Netlib BLACS - Basic Linear Algebra Communication Subprograms:
Copyright (c) 1992-2013 The University of Tennessee and The University of Tennessee Research Foundation. All rights reserved.
Copyright (c) 2000-2013 The University of California Berkeley. All rights reserved.
Copyright (c) 2006-2013 The University of Colorado Denver. All rights reserved.

Netlib LAPACK:
Copyright (c) 1992-2013 The University of Tennessee and The University of Tennessee Research Foundation. All rights reserved.
Copyright (c) 2000-2013 The University of California Berkeley. All rights reserved.
Copyright (c) 2006-2013 The University of Colorado Denver. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer listed in this license in the documentation and/or other materials provided with the distribution.

- Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

The copyright holders provide no reassurances that the source code provided does not infringe any patent, copyright, or any other intellectual property rights of third parties. The copyright holders disclaim any liability to any recipient for claims brought against recipient by any third party for infringement of that parties intellectual property rights.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-------------------------------------------------------------
2. HPCG: High Performance Conjugate Gradient Benchmark:
Copyright (c) 2013-2019, hpcg-benchmark
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of hpcg nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

----------------------------------------------------------------
3. Mersenne Twister with improved initialization:
Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura, All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. The names of its contributors may not be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

----------------------------------------------------------------
4. SFMT:
Copyright (c) 2006,2007-2014 Mutsuo Saito, Makoto Matsumoto and Hiroshima University. All rights reserved.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

----------------------------------------------------------------
5. Sobol sequence generator:
Copyright (c) 2008, Frances Y. Kuo and Stephen Joe, All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

* Neither the names of the copyright holders nor the names of the University of New South Wales and the University of Waikato and its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

---------------------------------------------------------------
6. The FEAST Eigenvalue Solver:
Copyright (c) 2009-2012, The Regents of the University of Massachusetts, Amherst. Developed by E. Polizzi. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the University nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

----------------------------------------------------------------
7. Intel(R) Instrumentation and Tracing Technology API:
Copyright(c) 2019, Intel Corporation, All rights reserved.

LAPACK95
Copyright (c) 2000, Netlib
All rights reserved.

XBLAS
Copyright (c) 2008-2009 The University of California Berkeley. All rights reserved.

xbyak:
Copyright (c) 2007 MITSUNARI Shigeo
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

----------------------------------------------------------------
8. Intel Open Source Technology Center Safe String Library:
strcat_s.c
October 2008, Bo Berry
Copyright (c) 2008-2011 by Cisco Systems, Inc. All rights reserved.

level-zero:
Copyright (c) 2019 Intel Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

-------------------------------------------------------------
9. KhronosGroup/OpenCL-Headers

oneAPI Math Kernel Library (oneMKL) Interfaces
Copyright Intel Corporation

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

-------------------------------------------------------------
10. HPL - High-Performance Linpack Benchmark:
HPL - 2.3 - December 2, 2018
Antoine P. Petitet
University of Tennessee, Knoxville
Innovative Computing Laboratory
(C) Copyright 2000-2008 All Rights Reserved

HPL License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions, and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. All advertising materials mentioning features or use of this software must display the following acknowledgement:
This product includes software developed at the University of Tennessee, Knoxville, Innovative Computing Laboratory.

4. The name of the University, the name of the Laboratory, or the names of its contributors may not be used to endorse or promote products derived from this software without specific written permission.

-- Disclaimer:

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-------------------------------------------------------------
11. Intel(R) oneAPI Threading Building Blocks:
Copyright 2005-2019 Intel Corporation. All Rights Reserved.

Intel(R) Integrated Performance Primitives Library
Copyright 2020 Intel Corporation. All Rights Reserved.

Intel(R) OpenMP* Runtime:
Copyright 1985-2019 Intel Corporation. All Rights Reserved.

Intel Simplified Software License (Version October 2022)

Use and Redistribution. You may use and redistribute the software, which is provided in binary form only, (the "Software"), without modification, provided the following conditions are met:

*    Redistributions must reproduce the above copyright notice and these terms of use in the Software and in the documentation and/or other materials provided with the distribution.
*    Neither the name of Intel nor the names of its suppliers may be used to endorse or promote products derived from this Software without specific prior written permission.
*    No reverse engineering, decompilation, or disassembly of the Software is permitted, nor any modification or alteration of the Software or its operation at any time, including during execution.

No other licenses. Except as provided in the preceding section, Intel grants no licenses or other rights by implication, estoppel or otherwise to, patent, copyright, trademark, trade name, service mark or other intellectual property licenses or rights of Intel.

Third party software. "Third Party Software" means the files (if any) listed in the "third-party-software.txt" or other similarly-named text file that may be included with the Software. Third Party Software, even if included with the distribution of the Software, may be governed by separate license terms, including without limitation, third party license terms, open source software notices and terms, and/or other Intel software license terms. These separate license terms solely govern Your use of the Third Party Software.

DISCLAIMER. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT ARE DISCLAIMED. THIS SOFTWARE IS NOT INTENDED FOR USE IN SYSTEMS OR APPLICATIONS WHERE FAILURE OF THE SOFTWARE MAY CAUSE PERSONAL INJURY OR DEATH AND YOU AGREE THAT YOU ARE FULLY RESPONSIBLE FOR ANY CLAIMS, COSTS, DAMAGES, EXPENSES, AND ATTORNEYS FEES ARISING OUT OF ANY SUCH USE, EVEN IF ANY CLAIM ALLEGES THAT INTEL WAS NEGLIGENT REGARDING THE DESIGN OR MANUFACTURE OF THE SOFTWARE.

LIMITATION OF LIABILITY. IN NO EVENT WILL INTEL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

No support. Intel may make changes to the Software, at any time without notice, and is not obligated to support, update or provide training for the Software.

Termination. Your right to use the Software is terminated in the event of your breach of this license.

Feedback. Should you provide Intel with comments, modifications, corrections, enhancements or other input ("Feedback") related to the Software, Intel will be free to use, disclose, reproduce, license or otherwise distribute or exploit the Feedback in its sole discretion without any obligations or restrictions of any kind, including without limitation, intellectual property rights or licensing obligations.

Compliance with laws. You agree to comply with all relevant laws and regulations governing your use, transfer, import or export (or prohibition thereof) of the Software.

Governing law. All disputes will be governed by the laws of the United States of America and the State of Delaware without reference to conflict of law principles and subject to the exclusive jurisdiction of the state or federal courts sitting in the State of Delaware, and each party agrees that it submits to the personal jurisdiction and venue of those courts and waives any objections. THE UNITED NATIONS CONVENTION ON CONTRACTS FOR THE INTERNATIONAL SALE OF GOODS (1980) IS SPECIFICALLY EXCLUDED AND WILL NOT APPLY TO THE SOFTWARE.

----------------------------------------------------------------

12. Microsoft HPC Pack Software Development Kit (SDK)
Copyright Microsoft Corp.

Terms for Microsoft "Distributable Code."
1.   License. This software package from Intel (the "Software Package") contains code from Microsoft (the "Distributable Code"). You are provided a non-transferable, non-exclusive, non-sublicensable, limited right and license only to use and redistribute the Distributable Code as part of this Software Package. You are not allowed to copy, modify, remove the Distributable Code from the Software Package or redistribute the Distributable Code on a stand-alone basis.
2.   Restrictions. The Distributable Code is licensed, not sold. You are only provided the above rights to use the Distributable Code. Intel and Microsoft reserve all other rights. Unless applicable law gives you more rights, you may use the Distributable Code only as expressly permitted in these terms. In using the Distributable Code, you must comply with any technical limitations in the Distributable Code that only allow you to use it in certain ways. You may not:
*   work around any technical limitations in the Distributable Code;
*   reverse engineer, decompile or disassemble the software, or otherwise attempt to derive the source code for the Distributable Code, except and to the extent required by third party licensing terms governing use of certain open source components that may be included in the Distributable Code;
*   remove, minimize, block or modify any notices of Intel, Microsoft or its suppliers in the Distributable Code;
*   use the Distributable Code in any way that is against the law; or
*   share, publish, rent or lease the software, or provide the Distributable Code as a stand-alone offering for others to use.
3.   NO WARRANTY. THE DISTRIBUTABLE CODE IS PROVIDED "AS IS" WITHOUT ANY EXPRESS OR IMPLIED WARRANTY OF ANY KIND INCLUDING WARRANTIES OF MERCHANTABILITY, NONINFRINGEMENT, OR FITNESS FOR A PARTICULAR PURPOSE.
4.   LIMITATION ON AND EXCLUSION OF DAMAGES. YOU CAN RECOVER FROM INTEL, MICROSOFT OR THEIR SUPPLIERS ONLY DIRECT DAMAGES UP TO $5.00. YOU CANNOT RECOVER ANY OTHER DAMAGES, INCLUDING CONSEQUENTIAL, LOST PROFITS, SPECIAL, INDIRECT OR INCIDENTAL DAMAGES.
This limitation applies to (a) anything related to the Distributable Code; and (b) claims for breach of contract, breach of warranty, guarantee or condition, strict liability, negligence, or other tort to the extent permitted by applicable law. It also applies even if Intel or Microsoft knew or should have known about the possibility of the damages. The above limitation or exclusion may not apply to you because your state or country may not allow the exclusion or limitation of incidental, consequential or other damages.
5.   Export Restrictions. You must comply with all domestic and international export laws and regulations that apply to the software, which include restrictions on destinations, end users, and end use. For further information on export restrictions, visit www.microsoft.com/exporting.

----------------------------------------------------------------
13. memkind
Copyright (C) 2014-2020 Intel Corporation

Unless otherwise specified, files in the memkind source distribution are
subject to the following license:

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of Intel Corporation nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

----------------------------------------------------------------
14. spiralgen

1. Copyright Notice

2. License

2.1. This is your license from Intel Corp. under its intellectual property rights. You may have additional license terms from the party that provided you this software, covering your right to use that party's intellectual property rights.

2.2. Intel grants, free of charge, to any person ("Licensee") obtaining a copy of the source code appearing in this file ("Covered Code") an irrevocable, perpetual, worldwide license under Intel's copyrights in the base code distributed originally by Intel ("Original Intel Code") to copy, make derivatives, distribute, use and display any portion of the Covered Code in any form, with the right to sublicense such rights; and

2.3. Intel grants Licensee a non-exclusive and non-transferable patent license (with the right to sublicense), under only those claims of Intel patents that are infringed by the Original Intel Code, to make, use, sell, offer to sell, and import the Covered Code and derivative works thereof solely to the minimum extent necessary to exercise the above copyright license, and in no event shall the patent license extend to any additions to or modifications of the Original Intel Code. No other license or right is granted directly or by implication, estoppel or otherwise;

The above copyright and patent license is granted only if the following conditions are met:

3. Conditions

3.1. Redistribution of Source with Rights to Further Distribute Source. Redistribution of source code of any substantial portion of the Covered Code or modification with rights to further distribute source must include the above Copyright Notice, the above License, this list of Conditions, and the following Disclaimer and Export Compliance provision. In addition, Licensee must cause all Covered Code to which Licensee contributes to contain a file documenting the changes Licensee made to create that Covered Code and the date of any change. Licensee must include in that file the documentation of any changes made by any predecessor Licensee. Licensee must include a prominent statement that the modification is derived, directly or indirectly, from Original Intel Code.

3.2. Redistribution of Source with no Rights to Further Distribute Source. Redistribution of source code of any substantial portion of the Covered Code or modification without rights to further distribute source must include the following Disclaimer and Export Compliance provision in the documentation and/or other materials provided with distribution. In addition, Licensee may not authorize further sublicense of source of any portion of the Covered Code, and must include terms to the effect that the license from Licensee to its licensee is limited to the intellectual property embodied in the software Licensee provides to its licensee, and not to intellectual property embodied in modifications its licensee may make.

3.3. Redistribution of Executable. Redistribution in executable form of any substantial portion of the Covered Code or modification must reproduce the above Copyright Notice, and the following Disclaimer and Export Compliance provision in the documentation and/or other materials provided with the distribution.

3.4. Intel retains all right, title, and interest in and to the Original Intel Code.

3.5. Neither the name Intel nor any other trademark owned or controlled by Intel shall be used in advertising or otherwise to promote the sale, use or other dealings in products derived from or relating to the Covered Code without prior written authorization from Intel.

4. Disclaimer and Export Compliance

4.1. INTEL MAKES NO WARRANTY OF ANY KIND REGARDING ANY SOFTWARE PROVIDED HERE. ANY SOFTWARE ORIGINATING FROM INTEL OR DERIVED FROM INTEL SOFTWARE IS PROVIDED "AS IS," AND INTEL WILL NOT PROVIDE ANY SUPPORT, ASSISTANCE, INSTALLATION, TRAINING OR OTHER SERVICES. INTEL WILL NOT PROVIDE ANY UPDATES, ENHANCEMENTS OR EXTENSIONS. INTEL SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY, NONINFRINGEMENT AND FITNESS FOR A PARTICULAR PURPOSE.

4.2. IN NO EVENT SHALL INTEL HAVE ANY LIABILITY TO LICENSEE, ITS LICENSEES OR ANY OTHER THIRD PARTY, FOR ANY LOST PROFITS, LOST DATA, LOSS OF USE OR COSTS OF PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, OR FOR ANY INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THIS AGREEMENT, UNDER ANY CAUSE OF ACTION OR THEORY OF LIABILITY, AND IRRESPECTIVE OF WHETHER INTEL HAS ADVANCE NOTICE OF THE POSSIBILITY OF SUCH DAMAGES. THESE LIMITATIONS SHALL APPLY NOTWITHSTANDING THE FAILURE OF THE ESSENTIAL PURPOSE OF ANY LIMITED REMEDY.

4.3. Licensee shall not export, either directly or indirectly, any of this software or system incorporating such software without first obtaining any required license or other approval from the U. S. Department of Commerce or any other agency or department of the United States Government. In the event Licensee exports any such software from the United States or re-exports any such software from a foreign destination, Licensee shall ensure that the distribution and export/re-export of the software is in compliance with all laws, regulations, orders, or other restrictions of the U.S. Export Administration Regulations. Licensee agrees that neither it nor any of its subsidiaries will export/re-export any technical data, process, software, or service, directly or indirectly, to any country for which the United States government or any agency thereof requires an export license, other governmental approval, or letter of assurance, without first obtaining such license, approval or letter.

----------------------------------------------------------------
15. METIS
Copyright 1997, Regents of the University of Minnesota

----------------------------------------------------------------
16. MC64 Algorithm, part of HSL (formerly the Harwell Subroutine Library)
Copyright (c) 1999 Council for the Central Laboratory
of the Research Councils
All rights reserved.

----------------------------------------------------------------

The following third party programs have their own third party program files. These additional third party program files are as follows:

1. List of third party for Intel Open Source Technology Center Safe String Library is available in the third-party-programs-safestring.txt.

2. List of third party for Intel(R) oneAPI Threading Building Blocks for Linux is available in <install-dir>/tbb/<version>/share/doc/tbb/licensing/third-party-programs.txt.

3. List of third party for Intel(R) oneAPI Threading Building Blocks for Windows is available in <install-dir>\tbb\<version>\share\doc\tbb\licensing\third-party-programs.txt.

4. List of third party for Intel(R) OpenMP* Runtime is available in the third-party-programs-openmp.txt file.

5. List of third party for Intel(R) Integrated Performance Primitives Library is available in the third-party-programs-ipp.txt file.

6. List of third party for oneAPI Math Kernel Library (oneMKL) Interfaces is available in the third-party-programs-oneMKL-Interfaces.txt file.

----------------------------------------------------------------

2024.0

System Requirements Bug Fix Log

What’s new?

Integrates Vector Math optimizations into Random Number Generators for high performance computer simulations, statistical sampling, and other areas on x86 CPUs and Intel® GPUs.
Supports Vector Math for the FP16 datatype on Intel® GPUs.
Delivers high-performance benchmarks HPL and HPL-AI optimized for Intel® Xeon® CPU Max Series and Intel® Data Center GPU Max Series.

Directory Layout

Directory layout is improved across all products to streamline installation and setup.

The Unified Directory Layout is implemented in 2024.0. If you have multiple toolkit versions installed, the Unified layout ensures that your development environment contains the correct component versions for each installed version of the toolkit.

The directory layout used before 2024.0, the Component Directory Layout, is still supported on new and existing installations.

For detailed information about the Unified layout, including how to initialize the environment and the advantages of the Unified layout, refer to "Use the setvars and oneapi-vars Scripts with Linux" and "Use the setvars and oneapi-vars Scripts with Windows".

New Features and Optimizations

BLAS
- Features
  - Scalar parameters (alpha, beta) to BLAS USM APIs may now be passed by pointer or by value.
  - Added complex_3m acceleration for GEMM (including batched variants) on Intel® Data Center GPU Max Series.
  - Added strided versions of gemm3m_batch C and Fortran APIs, including OpenMP* offload support.
  - Added {cblas_}gemm_f16f16f32 C APIs. These are the half-precision (MKL_F16) analogues of the previously introduced gemm_bf16bf16f32 APIs for bfloat16 (MKL_BF16).
- Optimizations
  - Enhanced HGEMM performance for small matrices on CPUs.
  - Improved general performance of GEMV and several BLAS level-1 routines on Intel® Data Center GPU Max Series.
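
For context on the complex_3m item above: 3M-style methods form each complex product from three real multiplications instead of four, at the cost of extra additions and slightly different rounding. The sketch below shows only the scalar identity in plain C++. mul_3m is a hypothetical helper, not a oneMKL API, and these notes do not describe how the library applies the technique inside GEMM.

```cpp
#include <cassert>   // assert, used by the self-check below
#include <complex>

// Hypothetical helper (not a oneMKL API): multiply two complex numbers
// using three real multiplications instead of the usual four.
std::complex<double> mul_3m(std::complex<double> x, std::complex<double> y) {
    const double a = x.real(), b = x.imag();
    const double c = y.real(), d = y.imag();
    const double t1 = a * c;
    const double t2 = b * d;
    const double t3 = (a + b) * (c + d);  // equals ac + ad + bc + bd
    // real part: ac - bd, imaginary part: ad + bc
    return {t1 - t2, t3 - t1 - t2};
}

// Small self-check on values that are exact in floating point.
void mul_3m_self_check() {
    const std::complex<double> p = mul_3m({1.0, 2.0}, {3.0, 4.0});
    assert(p.real() == -5.0);
    assert(p.imag() == 10.0);
}
```

For example, mul_3m({1, 2}, {3, 4}) yields -5 + 10i, the same value as the conventional four-multiplication product.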

Sparse BLAS
- Features
  - Inspector-Executor Sparse BLAS C APIs now include mkl_sparse_<xyz>_64() APIs that use MKL_INT64 for all integers in both lp64 and ilp64 modes.
  - Added std::complex<float> and std::complex<double> support for all existing sparse BLAS SYCL* APIs.
  - Added support for oneapi::mkl::transpose::conjtrans operation to sparse::gemv and sparse::omatcopy SYCL* APIs.
  - Added support for oneapi::mkl::transpose::{trans, conjtrans} operation on the sparse matrix in sparse::gemm SYCL* API.
- Optimizations
  - Improved performance for sparse::gemv/trmv with matrices with high variability in the number of non-zeros per row.
  - Improved sparse::matmat performance for key workloads.
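
As an illustration of what 64-bit integer indexing means for a sparse matrix, the following plain C++ sketch stores a CSR matrix with std::int64_t row pointers and column indices and computes y = A*x. Csr64 and spmv are hypothetical names used only here. This is not the oneMKL Inspector-Executor interface, whose handles and signatures are documented in the oneMKL developer reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CSR container that uses 64-bit integers for all index data.
struct Csr64 {
    std::int64_t rows;
    std::int64_t cols;
    std::vector<std::int64_t> row_ptr;  // size rows + 1, zero-based offsets
    std::vector<std::int64_t> col_idx;  // size nnz, zero-based column indices
    std::vector<double> values;         // size nnz
};

// Reference sparse matrix-vector product y = A * x (no optimization stage).
std::vector<double> spmv(const Csr64& a, const std::vector<double>& x) {
    assert(static_cast<std::int64_t>(x.size()) == a.cols);
    assert(static_cast<std::int64_t>(a.row_ptr.size()) == a.rows + 1);
    std::vector<double> y(static_cast<std::size_t>(a.rows), 0.0);
    for (std::int64_t i = 0; i < a.rows; ++i) {
        const std::size_t row = static_cast<std::size_t>(i);
        double sum = 0.0;
        for (std::int64_t k = a.row_ptr[row]; k < a.row_ptr[row + 1]; ++k) {
            const std::size_t pos = static_cast<std::size_t>(k);
            sum += a.values[pos] * x[static_cast<std::size_t>(a.col_idx[pos])];
        }
        y[row] = sum;
    }
    return y;
}
```

A 32-bit signed integer cannot represent offsets above 2,147,483,647, so 64-bit row pointers are what allow a single matrix to hold more non-zeros than that.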
LAPACK
- Features
  - Introduced SYCL* APIs to compute LU factorization without pivoting (lapack::getrfnp); added support for OpenMP* offloading in C and Fortran (mkl_?getrfnp).
  - Introduced SYCL* APIs to compute batched matrix inverse of a group of general matrices (lapack::geinv_batch).
  - Added argument checking for lapack::gerqf, lapack::hetrf, lapack::orgbr, lapack::orgtr, lapack::ormrq, lapack::ormtr, lapack::sytrf, lapack::ungbr, lapack::ungtr, lapack::unmrq, lapack::unmtr, and their scratchpad size functions.
- Optimizations
  - Improved performance of QR factorization (lapack::geqrf) on Intel® Data Center GPU Max Series for SYCL* USM APIs as well as for C and Fortran OpenMP* offloading.
  - Improved performance of orthogonal/unitary matrix multiplication (lapack::ormqr/lapack::unmqr) on Intel® GPUs for SYCL* APIs and C and Fortran OpenMP* offloading.
  - Improved performance of batched strided LU inverse (lapack::getri_batch) on Intel® GPUs for SYCL* APIs, especially for a smaller number of larger matrices.
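
To make "LU factorization without pivoting" concrete, here is a minimal in-place sketch in plain C++ for a square column-major matrix. The function name lu_nopivot, its return convention, and its early exit on a zero pivot are assumptions made for illustration; they are not the lapack::getrfnp signature or its documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the oneMKL getrfnp interface).
// Factors the n-by-n column-major matrix `a` in place as A = L * U without
// row interchanges. On return, the strict lower triangle holds the
// multipliers of L (unit diagonal implied) and the upper triangle holds U.
// Returns 0 on success, or k + 1 if the pivot at zero-based step k is zero.
int lu_nopivot(std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);
    for (std::size_t k = 0; k < n; ++k) {
        const double pivot = a[k + k * n];
        if (pivot == 0.0) {
            return static_cast<int>(k) + 1;  // cannot divide by this pivot
        }
        // Scale the column below the pivot to obtain the multipliers.
        for (std::size_t i = k + 1; i < n; ++i) {
            a[i + k * n] /= pivot;
        }
        // Rank-1 update of the trailing submatrix.
        for (std::size_t j = k + 1; j < n; ++j) {
            for (std::size_t i = k + 1; i < n; ++i) {
                a[i + j * n] -= a[i + k * n] * a[k + j * n];
            }
        }
    }
    return 0;
}
```

Because no rows are exchanged, a zero or very small pivot makes the factorization fail or lose accuracy, so this variant is generally appropriate only for matrices known to be safe without pivoting, such as diagonally dominant ones.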
DFT 
- Features
  - Enabled FFTs larger than 4 GiB (up to 64 GiB of data) on Intel® Data Center GPU Max Series.
- Optimizations
  - Improved double precision FFT performance on Intel® Data Center GPU Max Series for any FFT with at least one dimension divisible by a prime number in the range [11,61].
  - Improved 1D complex FFT performance on Intel® Data Center GPU Max Series for power of two sizes in the range [2²¹, 2²⁵].
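
The size conditions in the two optimization items above can be checked with simple integer arithmetic. The helpers below are hypothetical (they are not part of oneMKL) and only test whether a given dimension length matches the stated ranges; they say nothing about the performance actually obtained.

```cpp
#include <cassert>   // assert, for spot-checking the helpers
#include <cstdint>

// Hypothetical helper: true if n is divisible by a prime in [11, 61].
bool has_prime_factor_11_to_61(std::uint64_t n) {
    constexpr std::uint64_t primes[] = {11, 13, 17, 19, 23, 29, 31,
                                        37, 41, 43, 47, 53, 59, 61};
    if (n == 0) {
        return false;  // treat a zero length as invalid
    }
    for (std::uint64_t p : primes) {
        if (n % p == 0) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: true if n is a power of two in [2^21, 2^25].
bool is_pow2_between_2_21_and_2_25(std::uint64_t n) {
    const bool is_pow2 = n != 0 && (n & (n - 1)) == 0;
    return is_pow2 && n >= (std::uint64_t{1} << 21) &&
           n <= (std::uint64_t{1} << 25);
}
```

For instance, a length of 704 (64 × 11) satisfies the first condition, while 1024 satisfies neither.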
Vector Math
- Features
  - Added support for OpenMP* 5.1 offloading in C.
  - Added SYCL*–OpenMP* interoperability support for OpenMP* offloading.
  - Status and Mode were aligned in the Classic and Offloading versions of VM.
  - Added J0/J1 Bessel functions of the first kind (orders 0 and 1) for real arguments on GPUs.
  - Added Y0/Y1 Bessel functions of the second kind (orders 0 and 1) for real arguments on GPUs.
  - Added I0/I1 modified Bessel functions of the first kind (orders 0 and 1) for real arguments on GPUs.
- Optimizations
  - HA versions of cexp, cln, csqrt were added in native precision for GPUs.
  - Native FP16 cos/exp/exp10/ln/log10/log2/sin were added for GPUs.
  - The FP16 host API performance on GPU was improved by up to 30%.

Vector Statistics
- Features
  - Enabled Verbose mode support for RNG SYCL* Host API.
- Optimizations
  - Optimized mrg32k3a and philox4x32x10 RNG SYCL* Device API performance on Intel® Data Center GPU Max Series.
Sparse Solvers
- Features
  - Improved accuracy of generalized eigenvalues calculated using mkl_sparse_?_gv for symmetric matrix types.

Library Engineering

The following domain-specific SYCL* libraries are now available in addition to the combined mkl_sycl library:
- libmkl_sycl_blas.so
- libmkl_sycl_lapack.so (depends on libmkl_sycl_blas.so)
- libmkl_sycl_sparse.so (depends on libmkl_sycl_blas.so)
- libmkl_sycl_vm.so
- libmkl_sycl_rng.so
- libmkl_sycl_stats.so
- libmkl_sycl_data_fitting.so
  
MKLConfig.cmake also provides corresponding targets for linking the domain-specific SYCL* libraries via MKL::MKL_SYCL::<domain>.
Dropped all SSSE3 and AVX optimizations.
With the removal of classic compiler support, all references to this compiler have been replaced with icx.
MKLConfig.cmake now rejects operation when the oneMKL version found in the environment variable MKLROOT differs from the version found by CMake.
Removed find_package_handle_standard_args() in MKLConfig.cmake, as it incorrectly set MKL_FOUND.
MKLConfig.cmake: Removed the oneMKL path from the implicit include directories so that the oneMKL include directory path is always explicitly defined, regardless of whether it is present in the user’s CPATH environment variable. This resolves an issue when CMake is called from different environments. Note that this change applies to C and C++ only, not to Fortran: according to the CMake 3.14+ documentation, the implicit include directories variable is not used for Fortran.
Removed __cdecl, its related macros, and *_win.h files.

Fixed Issues

Fixed an issue where oneMKL DFT SYCL* APIs could fail to compute correct results for 2D and 3D real FFTs when using a user-allocated SYCL* buffer workspace and the OpenCL* runtime.
Improved BLAS support for host USM pointers.
Fixed SYMM/TRSM accuracy issues.
Fixed SGEMM/DGEMM/SYRK failures and memory leaks.
Fixed Fortran OpenMP* issues when complex-precision division is used on Windows on Intel® Iris® Xe Max and Intel® Arc™ A-Series GPUs with static linking.

Known Issues and Limitations

The getri_batch_usm and getri_oop_batch_usm LAPACK examples that are located at ${MKLROOT}/examples/dpcpp/lapack may fail on Intel® Iris® Xe MAX Graphics on Windows* in debug_mode.
On Intel® Iris® Xe MAX Graphics, {c,s}getrfnp_batch functions may hang or have a segmentation fault. As a workaround, use the {c,s}getrfnp_batch_strided functions instead.
OpenMP* offload of Fortran LAPACK functions cpotrf, cpotri, cpotrs, ctrtri, spotrf, spotri, spotrs, strtri to GPU under Windows* in static linking mode may crash. As a workaround, use dynamic linking mode.
oneMKL DFT SYCL* APIs that take a SYCL* buffer as data input do not support SYCL* sub-buffer inputs for 1D complex FFTs with large power-of-two sizes in the range [2²¹, 2²⁶].
Double-precision FFTs whose sizes are multiples of very large primes may produce incorrect results on CPU.
2D and 3D FFTs might hang on Intel® Data Center GPU Max Series when GPU debugging is enabled. As a workaround, set the environment variables NEOReadDebugKeys=1 and EnableRecoverablePageFaults=0, or disable GPU debugging by writing 0 to the files /sys/class/drm/card*/prelim_enable_eu_debug.
The Mrg32k3a random number engine may fail on Intel® Arc™ A-Series Graphics on Windows* when the /Od option is enabled.
Random number generator Device APIs that use Vector Math Device APIs underneath do not work on Intel® GPUs without native double precision support due to Vector Math restrictions.
Some Sparse BLAS SYCL* examples (sparse_gemm_col_major/sparse_gemm_row_major) are known to fail with oneMKL 2024.0 on Windows* when run in Debug mode. Please use Release mode linking to use this particular functionality.
Use the prebuilt oneMKL 2024.0 HPCG binaries with the oneAPI 2024.0 compiler runtime for the best performance. Compiling HPCG from sources with the current icpx compiler may result in slightly lower performance than when compiling it with compilers from earlier oneAPI releases.
oneapi::mkl::sparse::trsv() sycl::buffer APIs may crash with a segmentation fault when any of the CSR matrix data or the x or y vectors are sub-buffers of a sycl::buffer.
Asynchronous execution of mkl_sparse_optimize() for mkl_sparse_x_mv() using OpenMP* offloading in C can sporadically hang on Intel® Data Center GPU Max Series. As a workaround, use synchronous offloading for mkl_sparse_optimize().
Strided and group batched non-pivoting LU (getrfnp_batch) for complex precisions provides incorrect values on Intel® Data Center GPU Max Series with certain drivers.
The oneMKL SYCL DLL may leak memory after it is unloaded on Windows*. To avoid the problem, call mkl_free_buffers before unloading the DLL.
The Intel® oneMKL NuGet packages intelmkl.static.cluster.win-x64 and intelmkl.devel.cluster.win-x64 cannot be added to a .NET Standard 2.0 or higher project because a dependent package (intelmpi.devel.win-x64) is not compatible with the 2.0 standard. An updated intelmpi.devel.win-x64 package will be published to address compatibility with the 2.0 standard.
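
For the GPU debugging hang in 2D and 3D FFT listed above, the environment-variable workaround could be applied from a launcher script. The following is a minimal Python sketch, not an official Intel tool: the helper name with_debug_workaround is hypothetical, and only the two variable names and values come from the note itself.

```python
import os


def with_debug_workaround(base_env=None):
    """Return a copy of the environment with the two workaround variables set.

    base_env is an optional mapping; when omitted, the current process
    environment is copied. The original mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Variable names and values are taken from the known issue above.
    env["NEOReadDebugKeys"] = "1"
    env["EnableRecoverablePageFaults"] = "0"
    return env
```

The returned mapping can be passed as the env argument of subprocess.run when starting the application, for example subprocess.run(["./my_fft_app"], env=with_debug_workaround()), where ./my_fft_app is a placeholder for your own binary.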
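
For the DFT sub-buffer limitation listed above, an application might check up front whether a 1D complex FFT length falls in the affected range before deciding to pass a sub-buffer. The sketch below is illustrative only: the function names are hypothetical, and it assumes the bounds 2²¹ and 2²⁶ are inclusive and that only exact powers of two are affected, as the note states.

```python
# Bounds taken from the limitation above; assumed inclusive.
AFFECTED_MIN = 2**21
AFFECTED_MAX = 2**26


def is_power_of_two(n: int) -> bool:
    """True when n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def sub_buffer_unsupported(n: int) -> bool:
    """True when a 1D complex FFT of length n is in the affected size range."""
    return is_power_of_two(n) and AFFECTED_MIN <= n <= AFFECTED_MAX
```

When the check returns True, one option is to copy the data into a standalone buffer before calling the DFT API; whether that is acceptable depends on the memory budget of the application.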

Known Issues and Limitations for Intel® GPU Driver Version 20231219

The limitations in this section do not apply to the execution of Intel® oneMKL on CPUs.

The LAPACK batch strided least squares solver (oneapi::mkl::lapack::gels_batch, ?gels_batch_strided with OpenMP* offload) may return incorrect results on all Intel® GPUs. As a workaround, the previous GPU driver version 20231031 can be used. A list of supported GPUs of that version can be found in the driver 20231031 release notes.
oneMKL double precision FFT may fail or crash on the integrated GPUs of Intel® Core™ Ultra processors with driver version 20231219. The issue will be fixed in a future release of the driver.
oneMKL RNG Sobol Host API and Stats routines may throw an exception in case of execution on any Intel® GPU device. As a workaround, the previous GPU driver version 20231031 can be used. A list of supported GPUs of that version can be found in the driver 20231031 release notes.

Deprecation/Removal

Graph domain APIs have been removed in the oneMKL 2024.0 release.
Intel® oneAPI Math Kernel Library (oneMKL) for macOS was deprecated in release 2023.0 and is discontinued as of Intel® oneMKL release 2024.0 and later releases.

Previous oneAPI Releases

Notices and Disclaimers

Intel technologies may require enabled hardware, software or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
