FPGA AI Suite: IP Reference Manual

ID 768974
Date 9/06/2024
Public
2.5.2.8. Parameter Group: xbar

For each layer of the graph, data passes through the convolution engine (referred to as the processing element [PE] array), followed by zero or more auxiliary modules. The auxiliary modules perform operations such as activation or pooling.

After the output data for a layer has been computed, it can be sent to one of the following places:

  • An internal buffer while waiting for the start of the next convolution layer.
  • The external memory for reading by the host program.

Internally, a crossbar (xbar) connects the modules together. The xbar parameters specify the connections between the PE array, the auxiliary modules, the input feeder (which holds data waiting for the next convolution layer), and the output writer (which writes the data to the external memory).

Consider the following example:

xbar {
  xbar_k_vector : 16
  max_input_interfaces : 4
  max_output_interfaces : 4
  xbar_ports {
    xbar_aux_port {
      name : 'activation'
      input_connection : 'xbar_in_port'
    }
    xbar_aux_port {
      name : 'pool'
      input_connection : 'xbar_in_port'
      input_connection : 'activation'
    }
  }
  xbar_in_port {
    external_connection : 'pe_array'
  }
  xbar_out_port {
    external_connection : 'input_feeder'
    external_connection : 'output_writer'
    input_connection    : 'xbar_in_port'
    input_connection    : 'pool'
  }
}

The crossbar always has the following elements:

  • An xbar_in_port element that accepts the incoming connection from the PE array.
  • An xbar_out_port element that connects externally to the input feeder and output writer.

This example architecture also has two auxiliary modules defined with xbar_aux_port elements: a pool module and an activation module. The xbar_aux_port elements are specified in the xbar_ports section.

In this configuration, the activation module can accept data from the xbar_in_port (the PE array), while the pool module can accept data from the xbar_in_port or from the activation module.

Finally, the output port can accept data either directly from the xbar_in_port or from the pool module. These connections limit how data flows through the system.

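In this example, the output port has no connection from the activation module, so the activation module cannot write its result out of the layer directly. An activation must therefore be followed by a pooling operation. Also, because the activation module has no connection from the pool module, an activation cannot follow a pooling operation. However, a convolution can be followed by pooling without an activation in between.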
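The restrictions above follow mechanically from the input_connection entries. The following Python sketch is not part of the FPGA AI Suite; it copies the connections from the example into a dictionary and enumerates every route from xbar_in_port to xbar_out_port. The function name legal_paths and the assumption that each auxiliary module is used at most once per layer are illustrative only.

```python
# Connections copied from the example: each key lists the sources it accepts.
INPUT_CONNECTIONS = {
    "activation": ["xbar_in_port"],
    "pool": ["xbar_in_port", "activation"],
    "xbar_out_port": ["xbar_in_port", "pool"],
}


def legal_paths(connections, source="xbar_in_port", sink="xbar_out_port"):
    """Return every sequence of auxiliary modules that links source to sink.

    Hypothetical helper: assumes a module appears at most once per path.
    """
    paths = []

    def walk(node, visited):
        # Step backwards from `node` to each source it accepts data from.
        for upstream in connections.get(node, []):
            if upstream == source:
                paths.append(list(reversed(visited)))
            elif upstream not in visited:
                walk(upstream, visited + [upstream])

    walk(sink, [])
    return sorted(paths, key=len)


for path in legal_paths(INPUT_CONNECTIONS):
    print(" -> ".join(["pe_array", *path, "output"]))
```

Under these assumptions the sketch lists three routes: directly to the output, through pool only, and through activation followed by pool. No route ends at activation or places pool before activation, which matches the limits described above.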

Connections cost area and can reduce fMAX. You can reduce area by including only the connections that are required for a given graph (sometimes called "depopulating" the crossbar). If you use the depopulated architecture as the starting point for the Architecture Optimizer, the result can improve throughput.
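As a rough illustration of depopulating, the following Python sketch derives the smallest set of input connections that still supports the module sequences a graph actually uses. It is not an FPGA AI Suite tool; the function name required_connections and the used_paths data are hypothetical, and the resulting connections would still have to be written into the architecture file by hand.

```python
def required_connections(paths, source="xbar_in_port", sink="xbar_out_port"):
    """Return, per destination, the input connections the given paths use.

    Hypothetical helper: `paths` lists the auxiliary-module sequences that
    the layers of a graph need, for example [["activation", "pool"]].
    """
    needed = {}
    for path in paths:
        hops = [source, *path, sink]
        # Each adjacent pair of hops is one connection the crossbar must keep.
        for upstream, downstream in zip(hops, hops[1:]):
            sources = needed.setdefault(downstream, [])
            if upstream not in sources:
                sources.append(upstream)
    return needed


# Hypothetical graph in which every convolution is followed by activation
# and then pooling, so no other route through the crossbar is exercised.
used_paths = [["activation", "pool"]]
print(required_connections(used_paths))
```

For this hypothetical graph, the pool module no longer needs its connection from xbar_in_port, and the output port no longer needs its direct connection from xbar_in_port, so both could be dropped from the example configuration.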

To see examples of other crossbar configurations, review the example_architectures/ directory.

Parameter: xbar/xbar_k_vector

This parameter defines the width of the interface into the crossbar. Typically, this parameter is set to be equal to the width of the widest interface into any of the auxiliary modules.

In most cases, the Architecture Optimizer sets this parameter.

Legal values:
[2,4,8,16,32,64]
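The rule of thumb for xbar_k_vector can be sketched in Python as follows. This is only an illustration of the selection rule, not how the Architecture Optimizer works; the function name pick_xbar_k_vector and the interface widths passed to it are assumptions.

```python
LEGAL_XBAR_K_VECTOR = (2, 4, 8, 16, 32, 64)


def pick_xbar_k_vector(aux_widths):
    """Return the widest auxiliary interface width, if it is a legal value."""
    widest = max(aux_widths.values())
    if widest not in LEGAL_XBAR_K_VECTOR:
        raise ValueError(f"{widest} is not a legal xbar_k_vector value")
    return widest


# Hypothetical interface widths; real values come from the architecture file.
print(pick_xbar_k_vector({"activation": 16, "pool": 8}))
```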