Intel® Quartus® Prime Standard Edition User Guide: Design Recommendations

ID 683323
Date 9/24/2018
Public
Document Table of Contents

2.6.3.2. Architectures with 6-Input LUTs in Adaptive Logic Modules

In Intel FPGA device families with 6-input LUT in their basic logic structure, ALMs can simultaneously add three bits. Take advantage of this feature by restructuring your code for better performance.
Although code targeting 4-input LUT architectures compiles successfully for 6-input LUT devices, the implementation can be inefficient. For example, to take advantage of the 6-input adaptive ALUT, you must rewrite large pipelined binary adder trees designed for 4-input LUT architectures. By restructuring the tree as a ternary tree, the design becomes much more efficient, significantly improving density utilization.
Note: You cannot pack a LAB full when using this type of coding style because of the number of LAB inputs. However, in a typical design, the Intel® Quartus® Prime Fitter can pack other logic into each LAB to take advantage of the unused ALMs.

Verilog HDL Pipelined Ternary Tree

The example shows a pipelined adder, but partitioning your addition operations can help you achieve better results in non-pipelined adders as well. If your design is not pipelined, a ternary tree provides much better performance than a binary tree. For example, depending on your synthesis tool, the HDL code sum = (A + B + C) + (D + E) is more likely to create the optimal implementation of a 3-input adder for A + B + C followed by a 3-input adder for sum1 + D + E than the code without the parentheses. If you do not add the parentheses, the synthesis tool may partition the addition in a way that is not optimal for the architecture.

module ternary_adder_tree (a, b, c, d, e, clk, out);
    parameter width = 16;
	input [width-1:0] a, b, c, d, e;
	input	clk;
	output [width-1:0] out;

	wire [width-1:0] sum1, sum2;
	reg [width-1:0] sumreg1, sumreg2;
	// registers

	always @ (posedge clk)
		begin
			sumreg1 <= sum1;
			sumreg2 <= sum2;
		end

	// 3-bit additions
	assign sum1 = a + b + c;
	assign sum2 = sumreg1 + d + e;
	assign out = sumreg2;
endmodule