DSP Builder for Intel® FPGAs (Advanced Blockset): Handbook

ID 683337
Date 7/15/2024
Public

7.2.1. About Loops

Your design can contain many loops that can interact with or be nested inside each other. DSP Builder uses standard mathematical linear programming techniques to solve a set of simultaneous timing constraints.

Consider the following two main cases, feed forward and loops:

  • Feed forward is the simpler case. When no loops exist, feed-forward datapaths are balanced to ensure that all the input data reaches each functional unit in the same cycle. After analysis, DSP Builder inserts delays on all the noncritical paths to balance out the delays on the critical path.
  • Loops are the more complex case. Loops cannot be combinational: every loop in the Simulink design must include at least one unit of delay. Otherwise, Simulink displays an algebraic loop error. In hardware, the signal must take a specified number of clock cycles to travel around the feedback loop. Typically, one or more lumped delays exist, with SampleDelay blocks specifying the latency around some or all of the loop. DSP Builder preserves the latency around the loop to maintain correct functional operation. To achieve the target clock frequency, the total delay of the SampleDelay blocks around the loop must be greater than or equal to the required pipelining.
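
The feed-forward balancing described above can be illustrated with a minimal Python sketch. This is not DSP Builder's actual algorithm (which, as stated, solves the constraints with linear programming); it only shows the idea of padding noncritical paths so that all inputs of a block arrive in the same cycle. The function name, block names, and latency values are hypothetical.

```python
from collections import defaultdict


def balance_feed_forward(latency, edges):
    """Return the extra delay to insert on each edge of a loop-free datapath.

    latency: dict of block name -> pipeline latency in cycles (hypothetical values)
    edges:   list of (source, destination) pairs; the graph must have no loops
    """
    preds = defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)

    ready = {}  # cycle in which each block's output becomes valid

    def output_cycle(block):
        if block not in ready:
            # A block can start only when its latest input has arrived.
            arrival = max((output_cycle(p) for p in preds[block]), default=0)
            ready[block] = arrival + latency[block]
        return ready[block]

    for block in latency:
        output_cycle(block)

    inserted = {}
    for src, dst in edges:
        # The critical input of dst gets no delay; every earlier input is
        # padded until it lines up with the critical one.
        latest = max(ready[p] for p in preds[dst])
        inserted[(src, dst)] = latest - ready[src]
    return inserted


# Hypothetical datapath: (in_a * in_b) + in_b, with a 3-cycle multiplier.
latency = {"in_a": 0, "in_b": 0, "mult": 3, "add": 1}
edges = [("in_a", "mult"), ("in_b", "mult"), ("mult", "add"), ("in_b", "add")]
delays = balance_feed_forward(latency, edges)
```

In this example the path from in_b directly to the adder is noncritical, so it receives three cycles of delay to match the multiplier; the other edges receive none.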

If the pipelining requirements of the functional units around the loop are greater than the delay specified by the SampleDelay blocks on the loop path, DSP Builder generates an error message. The message states that distribution of delay failed because the design has insufficient delay to satisfy the fMAX requirement. In this case, DSP Builder cannot simultaneously satisfy the pipelining needed to achieve the given fMAX and the loop criterion of recirculating the data in the number of clock cycles specified by the SampleDelay blocks.
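
The failure condition above amounts to a simple comparison, sketched here in Python. The function name, the error text, and the cycle counts are assumptions made for illustration; they do not reproduce DSP Builder's real message or internal data.

```python
def check_loop_delay(sample_delays, required_pipelining):
    """Return the spare cycles on a loop, or raise if the loop is infeasible.

    sample_delays:       delays (in cycles) of the SampleDelay blocks on the loop
    required_pipelining: pipeline stages each functional unit on the loop needs
                         to reach the target fMAX (hypothetical figures)
    """
    available = sum(sample_delays)
    required = sum(required_pipelining)
    if available < required:
        raise ValueError(
            f"insufficient delay around loop: {available} cycle(s) available, "
            f"{required} required"
        )
    return available - required


# Feasible: one 4-cycle SampleDelay covers a 3-cycle and a 1-cycle unit.
slack = check_loop_delay([4], [3, 1])

# Infeasible: a 2-cycle SampleDelay cannot cover the same units.
try:
    check_loop_delay([2], [3, 1])
    failed = False
except ValueError:
    failed = True
```

When the check fails, the usual remedies are to increase the SampleDelay values (if the algorithm tolerates more loop latency) or to lower the target fMAX so that the functional units need less pipelining.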

DSP Builder automatically adjusts the pipeline requirements of every Primitive block according to these factors:

  • The type of block
  • The target fMAX
  • The device family and speed grade
  • The number of inputs
  • The bit width of the data inputs

Note: DSP Builder implements multipliers on DSP blocks. The latency of these multipliers depends on the DSP block latency, which varies with the target device family and clock frequency. Very wide fixed-point multipliers incur higher latency when DSP Builder splits them into smaller multipliers and adders. You cannot count the multiplier and adder latencies separately because DSP Builder may combine them into a single DSP block. The latency of some blocks depends on what pipelining DSP Builder applies to surrounding blocks. In a long sequence of logical components, if the target fMAX is low enough that timing closure is still achievable, DSP Builder inserts pipeline stages only after every few blocks.

In the SynthesisInfo block, you can optionally specify a latency constraint limit. The limit can be a workspace variable or an expression, but it must evaluate to a positive integer. Use this feature only to add further latency. Never use it to reduce the latency below what is required to pipeline the design to achieve the target fMAX.

After you run a simulation in Simulink, the help page for the SynthesisInfo block shows the latency, port interface, and estimated resource utilization for the current Primitive subsystem.

In designs with loops, DSP Builder must synthesize at least one cycle of delay in every feedback loop to avoid combinational loops that Simulink cannot simulate. Typically, one or more lumped delays exist. To preserve the delay around the loop for correct operation, the functional units that need more pipelining stages borrow from the lumped delay.
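
The borrowing described above can be pictured with a small Python sketch. It assumes a single loop and per-unit pipelining figures chosen for illustration; the function and key names are hypothetical and do not reflect how DSP Builder stores or distributes delay internally.

```python
def borrow_from_lumped_delay(lumped_delay, unit_pipelining):
    """Split a loop's lumped delay between functional units and a residual delay.

    lumped_delay:    total cycles specified by the SampleDelay blocks on the loop
    unit_pipelining: dict of unit name -> pipeline stages it needs (hypothetical)
    """
    borrowed = sum(unit_pipelining.values())
    if borrowed > lumped_delay:
        raise ValueError("lumped delay is too small to borrow from")
    # Each unit takes the stages it needs; whatever is left stays as plain delay.
    allocation = dict(unit_pipelining)
    allocation["residual_delay"] = lumped_delay - borrowed
    return allocation


# A 5-cycle lumped delay feeding a 3-cycle multiplier and a 1-cycle adder.
allocation = borrow_from_lumped_delay(5, {"mult": 3, "add": 1})
loop_latency = sum(allocation.values())
```

The total latency around the loop is unchanged (five cycles in this example); only its placement moves from the lumped delay into the functional units.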