DSP Builder for Intel® FPGAs (Advanced Blockset): Handbook

ID 683337
Date 5/27/2022
Public

8.2.1. About Loops

Your design can contain many loops that can interact with or be nested inside each other. DSP Builder uses standard mathematical linear programming techniques to solve a set of simultaneous timing constraints.

Consider the following two main cases:

  • The simpler case is feed-forward. When no loops exist, feed-forward datapaths are balanced to ensure that all the input data reaches each functional unit in the same cycle. After analysis, DSP Builder inserts delays on all the non-critical paths to balance out the delays on the critical path.
  • The case with loops is more complex. Loops cannot be combinational: every loop in the Simulink design must include delay memory, otherwise Simulink displays an 'algebraic loop' error. In hardware, the signal must take a specified number of clock cycles to travel around the feedback loop. Typically, one or more lumped delays exist, with SampleDelay blocks specifying the latency around some or all of the loop. DSP Builder preserves the latency around the loop to maintain correct functional operation. To achieve the target clock frequency, the total delay of the SampleDelay blocks around the loop must be greater than or equal to the required pipelining.
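
As a rough illustration of the feed-forward case, the following Python sketch computes how many cycles of delay would be needed on each connection so that all inputs of a block arrive in the same cycle. It is not DSP Builder's implementation (which solves the constraints by linear programming); the function name, block names, and latency figures are hypothetical.

```python
def balance_feed_forward(latency, edges):
    """Return the extra delay (in cycles) to insert on each connection.

    latency: dict mapping block name -> pipeline latency in cycles
             (hypothetical values, not taken from any device).
    edges:   list of (src, dst) connections; the graph must be acyclic.
    """
    # Collect the predecessors of every block that has inputs.
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, []).append(src)

    ready = {}  # cycle in which each block's output becomes valid

    def output_time(block):
        if block not in ready:
            arrivals = [output_time(p) for p in preds.get(block, ())]
            ready[block] = max(arrivals, default=0) + latency[block]
        return ready[block]

    # Pad every non-critical input up to the latest-arriving input.
    padding = {}
    for dst, sources in preds.items():
        latest = max(output_time(s) for s in sources)
        for src in sources:
            padding[(src, dst)] = latest - output_time(src)
    return padding


# Hypothetical datapath: (in_a * in_b) + in_b, with a 3-cycle multiplier.
latency = {"in_a": 0, "in_b": 0, "mult": 3, "add": 1}
edges = [("in_a", "mult"), ("in_b", "mult"), ("mult", "add"), ("in_b", "add")]
padding = balance_feed_forward(latency, edges)
print(padding[("in_b", "add")])  # 3: in_b is delayed to match the multiplier
```

The critical path (through the multiplier) receives no padding; only the direct connection from in_b to the adder is delayed.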

If the pipelining requirements of the functional units around the loop are greater than the delay specified by the SampleDelay blocks on the loop path, DSP Builder generates an error message. The message states that distribution of memory failed because there was insufficient delay to satisfy the fMAX requirement. In this situation, DSP Builder cannot simultaneously satisfy the pipelining needed to achieve the given fMAX and the loop requirement to recirculate the data in the number of clock cycles specified by the SampleDelay blocks.

DSP Builder automatically adjusts the pipeline requirements of every Primitive block according to these factors:

  • The type of block
  • The target fMAX
  • The device family and speedgrade
  • The number of inputs
  • The bit width of the data inputs

Note: DSP Builder implements multipliers on DSP blocks. The DSP block latency affects these multipliers and varies with the target device family and clock frequency. Very wide fixed-point multipliers incur higher latency when DSP Builder splits them into smaller multipliers and adders. You cannot count the multiplier and adder latencies separately because DSP Builder may combine them into a single DSP block. The latency of some blocks depends on the pipelining applied to surrounding blocks. DSP Builder does not pipeline every block; in a long sequence of logical components it inserts pipeline stages only after every few blocks, provided fMAX is low enough that timing closure is still achievable.
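
To show why a very wide multiplier costs more than a narrow one, the sketch below performs a generic schoolbook decomposition of an unsigned multiplication into narrower partial products that are then shifted and summed. This is only an illustration of the general technique: the 18-bit limb width and the function name are assumptions, and the sketch does not describe how DSP Builder actually maps multipliers onto DSP blocks.

```python
def split_multiply(a, b, limb_bits=18):
    """Multiply two unsigned integers using limb_bits-wide partial products.

    Returns (product, number_of_partial_products). limb_bits=18 is an
    illustrative assumption, not a statement about any particular device.
    """
    mask = (1 << limb_bits) - 1

    def limbs(x):
        # Split x into limb_bits-wide pieces, least significant first.
        out = []
        while True:
            out.append(x & mask)
            x >>= limb_bits
            if x == 0:
                return out

    a_limbs, b_limbs = limbs(a), limbs(b)
    product = 0
    for i, ai in enumerate(a_limbs):
        for j, bj in enumerate(b_limbs):
            # Each ai * bj fits a limb_bits x limb_bits multiplier; the
            # shift and running sum stand in for the adders that combine them.
            product += (ai * bj) << (limb_bits * (i + j))
    return product, len(a_limbs) * len(b_limbs)


a = (1 << 35) + 12345  # a 36-bit operand
b = (1 << 34) + 678    # a 35-bit operand
product, partials = split_multiply(a, b)
print(product == a * b, partials)
```

Under these assumptions a 36-bit by 35-bit product needs four narrow multiplications plus the additions that combine them, whereas operands that fit in one limb need a single multiplication.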

In the SynthesisInfo block, you can optionally specify a latency constraint limit that can be a workspace variable or expression, but must evaluate to a positive integer. However, only use this feature to add further latency. Never use the feature to reduce latency to less than the latency required to pipeline the design to achieve the target fMAX.

After you run a simulation in Simulink, the help page for the SynthesisInfo block shows the latency, port interface, and estimated resource utilization for the current Primitive subsystem.

In designs with loops, DSP Builder advanced blockset must synthesize at least one cycle of delay in every feedback loop to avoid combinational loops, which Simulink cannot simulate. Typically, one or more lumped delays exist. To preserve the delay around the loop for correct operation, the functional units that need more pipelining stages borrow them from the lumped delay.
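
The borrowing described above can be sketched as simple bookkeeping: the pipeline stages that the functional units need are taken out of the lumped delay, so the total latency around the loop stays unchanged. The function name, unit names, stage counts, and error text below are hypothetical. Summing per-unit requirements is also a simplification, because DSP Builder may combine blocks (for example a multiplier and an adder) into a single DSP block.

```python
def borrow_from_lumped_delay(sample_delay, required_pipelining):
    """Return the cycles left in the lumped delay after all units borrow.

    sample_delay:        total cycles specified by the SampleDelay blocks
                         on the loop.
    required_pipelining: dict mapping unit name -> pipeline stages it needs
                         (hypothetical figures).
    """
    needed = sum(required_pipelining.values())
    if needed > sample_delay:
        # Illustrative wording only, not DSP Builder's actual message.
        raise ValueError(
            f"insufficient delay: loop needs {needed} cycles of pipelining "
            f"but SampleDelay blocks provide only {sample_delay}"
        )
    return sample_delay - needed


# A loop with 5 cycles of SampleDelay, a 3-stage multiplier, a 1-stage adder.
remaining = borrow_from_lumped_delay(5, {"mult": 3, "add": 1})
print(remaining)  # 1 cycle stays in the lumped delay; loop latency is still 5
```

With only 3 cycles of SampleDelay on the same loop, the function raises an error, mirroring the case where the fMAX requirement and the loop latency cannot both be met.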