Intel® High Level Synthesis Compiler Pro Edition: Best Practices Guide

ID 683152
Date 3/28/2022
Public

5.6. Convert Nested Loops into a Single Loop

To maximize performance, combine nested loops into a single loop whenever possible. The control flow for each loop adds overhead, both in the logic it requires and in FPGA hardware footprint. Combining nested loops into a single loop reduces this overhead, which improves the performance of your component.

The following code examples illustrate the conversion of a nested loop into a single loop:

Nested Loop:

    for (i = 0; i < N; i++) {
      // statements
      for (j = 0; j < M; j++) {
        // statements
      }
      // statements
    }

Converted Single Loop:

    for (i = 0; i < N*M; i++) {
      // statements
    }
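As a rough illustration of the conversion above, the following sketch computes per-row sums in both forms. The names `Grid`, `make_grid`, `row_sums_nested`, and `row_sums_single`, and the sizes `N = 4` and `M = 3`, are hypothetical and chosen only for this example. It assumes `M > 0`, and it tracks `i` and `j` with counters rather than computing `k / M` and `k % M`, on the assumption that division and modulo are comparatively costly to build in hardware. The statements that sat before and after the inner loop become guarded statements inside the single loop.

```cpp
#include <array>
#include <cassert>

constexpr int N = 4;  // hypothetical outer trip count
constexpr int M = 3;  // hypothetical inner trip count (assumed > 0)

using Grid = std::array<std::array<int, M>, N>;

// Hypothetical test data: element (i, j) holds i*M + j.
Grid make_grid() {
  Grid g{};
  for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
      g[i][j] = i * M + j;
  return g;
}

// Nested form: statements before, inside, and after the inner loop.
std::array<int, N> row_sums_nested(const Grid& g) {
  std::array<int, N> out{};
  for (int i = 0; i < N; i++) {
    int acc = 0;                  // statement before the inner loop
    for (int j = 0; j < M; j++) {
      acc += g[i][j];             // inner-loop statement
    }
    out[i] = acc;                 // statement after the inner loop
  }
  return out;
}

// Single-loop form: one counter k covers all N*M iterations.
std::array<int, N> row_sums_single(const Grid& g) {
  std::array<int, N> out{};
  int i = 0, j = 0, acc = 0;
  for (int k = 0; k < N * M; k++) {
    if (j == 0) acc = 0;          // former "before" statement
    acc += g[i][j];               // former inner-loop statement
    if (j == M - 1) {
      out[i] = acc;               // former "after" statement
      j = 0;
      i++;
    } else {
      j++;
    }
  }
  return out;
}

// Sanity check that both forms produce the same result.
void check_equivalence() {
  assert(row_sums_nested(make_grid()) == row_sums_single(make_grid()));
}
```

Whether the hand-written single loop actually yields a smaller or faster component depends on the loop body, so it is worth comparing both versions in the compiler reports.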

You can also specify the loop_coalesce pragma to coalesce nested loops into a single loop without affecting the loop functionality. The following simple example shows how the compiler coalesces two loops into a single loop when you specify the loop_coalesce pragma.

Consider a simple nested loop written as follows:

    #pragma loop_coalesce
    for (int i = 0; i < N; i++)
      for (int j = 0; j < M; j++)
        sum[i][j] += i+j;

The compiler coalesces the two loops together so that they run as if they were a single loop written as follows:

    int i = 0;
    int j = 0;
    while (i < N) {
      sum[i][j] += i+j;
      j++;
      if (j == M) {
        j = 0;
        i++;
      }
    }
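The equivalence of the two forms can be checked in ordinary C++ with a sketch like the one below. The function names `accumulate_nested`, `accumulate_coalesced`, and `check_forms_agree`, and the use of `std::vector` for `sum`, are assumptions made for this illustration and are not part of the compiler or its libraries. The pragma appears only as a comment so that the sketch builds with a standard host compiler.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Nested form, as you would write it in the component source.
Matrix accumulate_nested(int N, int M) {
  Matrix sum(N, std::vector<int>(M, 0));
  // #pragma loop_coalesce  (HLS pragma; commented out for host builds)
  for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
      sum[i][j] += i + j;
  return sum;
}

// Hand-written equivalent of the coalesced loop.
// Assumes M > 0: with M == 0 the body would still run once per
// outer iteration and index an empty row.
Matrix accumulate_coalesced(int N, int M) {
  Matrix sum(N, std::vector<int>(M, 0));
  int i = 0;
  int j = 0;
  while (i < N) {
    sum[i][j] += i + j;
    j++;
    if (j == M) {  // end of a row: wrap j and advance i
      j = 0;
      i++;
    }
  }
  return sum;
}

// Sanity check for one hypothetical problem size.
void check_forms_agree() {
  assert(accumulate_nested(3, 4) == accumulate_coalesced(3, 4));
}
```

Note that the `while` form matches the nested loops only when `M` is greater than zero. When you use the pragma, you do not write this form yourself.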

For more information about the loop_coalesce pragma, see "Loop Coalescing (loop_coalesce Pragma)" in the Intel® High Level Synthesis Compiler Pro Edition Reference Manual.

You can also review the following tutorial: <quartus_installdir>/hls/examples/tutorials/best_practices/loop_coalesce