Visible to Intel only — GUID: GUID-CE195C65-C5AF-4FFE-94E2-E626B21FB6BE
OpenMP Offload Best Practices
In this chapter we present best practices for improving the performance of applications that offload onto the GPU. We organize the best practices into the following categories, which are described in the sections that follow:
- Using More GPU Resources
- Minimizing Data Transfers and Memory Allocations
- Making Better Use of OpenMP Constructs
- Memory Allocation
- Fortran Example
- Clauses: is_device_ptr, use_device_ptr, has_device_addr, use_device_addr
- Prefetching
Note:
We used the following setup when collecting the OpenMP performance numbers:
- 2-stack Intel® GPU
- One GPU stack only (no implicit or explicit scaling)
- Intel® compilers, runtimes, and GPU drivers
- Level-Zero plugin
- A dummy target construct at the beginning of each program, so that startup time is not measured
- Just-In-Time (JIT) compilation mode