OpenMP Offload Best Practices
In this chapter we present best practices for improving the performance of applications that offload work to the GPU. We group the practices into the following categories, each described in one of the sections that follow:
- Using More GPU Resources
- Minimizing Data Transfers and Memory Allocations
- Making Better Use of OpenMP Constructs
- Memory Allocation
- Clauses: is_device_ptr, use_device_ptr, has_device_addr, use_device_addr
Note:
We used the following setup when collecting the OpenMP performance numbers:
- 2-tile Intel® GPU.
- One GPU tile only (no implicit or explicit scaling).
- Internal versions of the Intel® compilers, runtimes, and GPU driver.
- Level-Zero plugin.
- A dummy target construct at the beginning of each program, so that startup time is not included in the measurements.
- Just-In-Time (JIT) compilation mode.
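The dummy target construct mentioned in the note can be sketched as follows. This is a minimal illustration, not the exact code used to collect the numbers: the function names `warm_up_device`, `timed_saxpy`, and `wall_seconds`, and the SAXPY kernel itself, are hypothetical and chosen only to show the pattern. The idea is that an empty target region executed once, before any timing starts, absorbs the one-time device and runtime initialization cost, so that later timed regions measure only the offloaded work and its data transfers.

```c
#include <assert.h>
#include <time.h>

/* Wall-clock time in seconds, using C11 timespec_get. */
static double wall_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Dummy target region (hypothetical helper): does no work, but triggers
   the one-time device and runtime startup before any timing begins. */
static void warm_up_device(void) {
    #pragma omp target
    { ; }
}

/* Hypothetical kernel: y = a*x + y, offloaded to the device.
   Returns the elapsed wall-clock time of the offloaded region,
   including the data transfers implied by the map clauses. */
static double timed_saxpy(int n, float a, const float *x, float *y) {
    assert(n >= 0);
    double t0 = wall_seconds();
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
    return wall_seconds() - t0;
}
```

Call `warm_up_device()` once at the start of the program, before the first call to `timed_saxpy`. If the file is compiled without OpenMP offload enabled, the pragmas are ignored and the loop runs on the host, which is convenient for checking correctness but says nothing about GPU performance.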