Multi Channel DMA for PCI Express* Intel® FPGA IP Design Example User Guide

ID 683517
Date 2/06/2022
Public

2.8. External Descriptor Controller

The external descriptor controller example design supports descriptor fetching and queue management alongside the MCDMA IP core operating in Data Mover mode. The example design supports the following features:

  • Total 16 DMA channels (16 H2D queues and 16 D2H queues)
  • Supports separate descriptor command queues for H2D and D2H Data Movers
  • Supports Writeback (WB) as descriptor completion mechanism
  • No Interrupt Support

The figure below shows the internal blocks of the external descriptor controller.

Figure 35. External Descriptor Controller Example Design

Global/Queue CSR: Implements the global CSR and queue CSR (QCSR) registers required for controlling the DMA operation. Read/write access to these registers happens through the BAM interface of the MCDMA IP. For details about the registers, refer to the Registers section.

H2D/D2H Descriptor Fetch: This block generates descriptor fetch commands on the h2ddm_desc or d2hdm_desc interface, based on the context in the QCSR registers.

Descriptor Completion Processing: This block processes the descriptor completion packets received on the h2ddm_desc_cmpl interface in response to the descriptor fetch requests sent to MCDMA, and stores the received descriptors in the corresponding descriptor queue buffers (H2D/D2H Descriptor Queue). The fetched descriptors are queued, and the corresponding data mover commands are sent to the MCDMA IP through the h2ddm_desc or d2hdm_desc interface.

Descriptor Status Processing: This block processes the status information received on the h2ddm_desc_status and d2hdm_desc_status interfaces and generates the appropriate writeback commands to the MCDMA IP on the d2hdm_desc interface.

Descriptor Fetch operation

The descriptor fetch process is described below:

  1. Software updates the queue context registers in the QCSR and then updates the Tail pointer register (offset 0x10 for D2H; offset 0x90 for H2D).
  2. The H2D/D2H Descriptor Fetch block detects the Tail pointer register update and generates descriptor fetch commands on the h2ddm_desc interface of MCDMA to get descriptors from host memory.
  3. The number of descriptors to be fetched is determined by the difference between the Head pointer (offset 0x18 for D2H; offset 0x98 for H2D) and the Tail pointer (offset 0x10 for D2H; offset 0x90 for H2D).
  4. The Head pointer register (offset 0x18 for D2H; offset 0x98 for H2D) is updated based on the number of descriptors fetched.
  5. MCDMA returns the descriptor completions on the h2ddm_desc_cmpl interface.
  6. The received descriptors are stored in the respective queue buffers (H2D/D2H Descriptor Queue).
  7. Descriptor fetching is performed for multiple queues under a round-robin arbitration scheme.
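
As a rough illustration of steps 2 to 4 and step 7 above, the following Python sketch models a round-robin arbiter that scans the 16 queues for one whose Tail pointer differs from its Head pointer. It is a behavioral model only, not the RTL or the MCDMA driver: the names QueueContext, next_queue, and serve are hypothetical, and pointer wrap-around is deliberately not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_QUEUES = 16  # the example design has a fixed 16 channels


@dataclass
class QueueContext:
    """Hypothetical per-queue state; only the two pointers are modeled."""
    head: int = 0  # advanced by hardware as descriptors are fetched
    tail: int = 0  # advanced by software when new descriptors are ready


def next_queue(queues: List[QueueContext], last_served: int) -> Optional[int]:
    """Return the first queue after last_served whose tail != head, else None."""
    n = len(queues)
    for step in range(1, n + 1):
        qid = (last_served + step) % n
        if queues[qid].tail != queues[qid].head:
            return qid
    return None


def serve(queues: List[QueueContext], qid: int) -> int:
    """Fetch everything pending on one queue and move head up to tail."""
    pending = queues[qid].tail - queues[qid].head  # assumes tail >= head
    queues[qid].head = queues[qid].tail
    return pending


if __name__ == "__main__":
    queues = [QueueContext() for _ in range(NUM_QUEUES)]
    queues[3].tail = 8   # software posted 8 descriptors on queue 3
    queues[5].tail = 2   # and 2 descriptors on queue 5
    last = NUM_QUEUES - 1
    while True:
        qid = next_queue(queues, last)
        if qid is None:
            break
        print(f"queue {qid}: fetch {serve(queues, qid)} descriptors")
        last = qid
```

Starting the scan from the queue after the one last served is what gives each queue a turn before any queue is served twice.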

H2D Data movement operation

  1. Descriptors are pulled from the H2D Descriptor Queue, translated to the corresponding data mover commands, and sent to MCDMA through the h2ddm_desc interface.
  2. Data is transferred from host memory to DMA_MEM through the H2D Data Mover AVMM Write master.
  3. Once the data transfer is completed, MCDMA sends status on the h2ddm_desc_status interface.
  4. The status information is processed by the Descriptor Status Processing block to generate the appropriate writeback commands to MCDMA on the d2hdm_desc interface.
  5. The Completed pointer register (offset 0xA8 for H2D) is updated based on the number of descriptors processed.
  6. This flow continues for all descriptors in the H2D Descriptor Queue.

D2H Data movement operation

  1. Descriptors are pulled from the D2H Descriptor Queue, translated to the corresponding data mover commands, and sent to MCDMA through the d2hdm_desc interface.
  2. Data is transferred from DMA_MEM to host memory through the D2H Data Mover AVMM Read master.
  3. Once the data transfer is completed, MCDMA sends status on the d2hdm_desc_status interface.
  4. The status information is processed by the Descriptor Status Processing block to generate the appropriate writeback commands to MCDMA on the d2hdm_desc interface.
  5. The Completed pointer register (offset 0x28 for D2H) is updated based on the number of descriptors processed.
  6. This flow continues for all descriptors in the D2H Descriptor Queue.
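
For quick reference, the pointer register offsets quoted in the descriptor fetch and data movement steps can be collected in one lookup table. The Python sketch below does only that. The dictionary QCSR_OFFSETS and the helper qcsr_offset are hypothetical names, and the sketch makes no assumption about what base address the offsets are relative to or how the per-queue register blocks are laid out; the Registers section remains the reference for that.

```python
# Pointer register offsets as quoted in the text for each direction.
QCSR_OFFSETS = {
    "D2H": {"tail": 0x10, "head": 0x18, "completed": 0x28},
    "H2D": {"tail": 0x90, "head": 0x98, "completed": 0xA8},
}


def qcsr_offset(direction: str, register: str) -> int:
    """Look up a pointer register offset, e.g. qcsr_offset("H2D", "tail")."""
    return QCSR_OFFSETS[direction.upper()][register.lower()]


if __name__ == "__main__":
    for direction, regs in QCSR_OFFSETS.items():
        for name, offset in regs.items():
            print(f"{direction} {name:<9} 0x{offset:02X}")
```

With the values quoted here, each H2D offset equals the matching D2H offset plus 0x80.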

Hardware Test Results

The external descriptor controller in the example design is derived from the main MCDMA. The same MCDMA driver is used to demonstrate the external descriptor controller.

Figure 36. Data Validation Test

The differences between main MCDMA and example design are shown in the table below.

Table 13. Differences between main MCDMA and Example Design

| Feature | Main MCDMA | External Descriptor Controller Example Design |
|---|---|---|
| DMA Channels | Up to 2K | Fixed 16 |
| SRIOV | Yes | No |
| MSI-X | Yes | No (MSI-X may be supported in future release) |
| Writeback | Yes | Yes |
| Queue CSR | Yes | Yes (For details, see register tables) |
| MSI | No | No |
| Descriptor Link bit | Yes | No (The descriptors are formed continuously and in consecutive locations and provided with starting address in the QCSR) |

External Descriptor Controller setup: In this mode, the external DMA controller registers are accessed by PIO transactions on a dedicated BAR. The BAM bridge is instantiated to route transactions on that BAR (BAR0 by default) to the external DMA controller, giving access to a register set similar to the main MCDMA QCSR. For details on the external descriptor registers, see the register tables.

Descriptor Fetch from System Memory

The descriptor fetch engine reads the QCSR registers to check for any difference between the queue tail pointer and the queue head pointer, and processes queues on a first-come, first-served basis. As the descriptor controller processes a queue using its tail pointer and head pointer, it updates the queue head pointer register to the tail pointer value. The descriptor fetch process is described below:

  1. Software updates the tail pointer in the external descriptor controller QCSR.
  2. Hardware reads the QCSR registers in a round-robin fashion across multiple queues. If the tail pointer is not equal to the head pointer (tail pointer != head pointer), the difference is the number of descriptors the descriptor controller needs to fetch for data movement.
  3. If the number of descriptors to fetch is greater than the size of the descriptor ring buffer located in host memory, that is, (tail pointer - head pointer) > descriptor ring buffer size, the descriptor fetch is split into two (an upper descriptor block and a lower descriptor block) based on the ring buffer size. Otherwise, if the number of descriptors to be fetched is less than the descriptor ring buffer size, there is only one descriptor fetch.
  4. The queue ID (QID), tail pointer, head pointer, and start address of the descriptor table are sent from the queue arbiter to the descriptor fetch process block for descriptor batching.
  5. The descriptor fetch process block splits the total descriptors to be fetched into batches of at most 16 (MRRS / descriptor size = 512 bytes / 32 bytes). Thus, multiple descriptor fetch commands are queued in the command queue.
  6. Once all the descriptor fetch commands for the queue (QID) have been sent to the descriptor command queue, the descriptor fetch engine updates the QCSR head pointer register.
  7. If tail pointer == head pointer, the queue arbiter moves to the next queue and starts again from step 2.
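
The batching in step 5 can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the hardware implementation: the function names fetch_batches and batch_address are hypothetical, the descriptors are assumed to sit at consecutive 32-byte locations starting at the ring start address (as Table 13 notes for this design), and the upper/lower split from step 3 is not modeled, so the sketch rejects a tail pointer below the head pointer.

```python
MRRS_BYTES = 512       # maximum read request size quoted in step 5
DESC_SIZE_BYTES = 32   # one 256-bit descriptor
MAX_BATCH = MRRS_BYTES // DESC_SIZE_BYTES  # 16 descriptors per fetch command


def fetch_batches(head, tail, max_batch=MAX_BATCH):
    """Split descriptors head..tail-1 into (first_index, count) fetch commands."""
    if tail < head:
        raise ValueError("ring wrap-around is not modeled in this sketch")
    batches = []
    index = head
    while index < tail:
        count = min(max_batch, tail - index)
        batches.append((index, count))
        index += count
    return batches


def batch_address(ring_start, first_index):
    """Host address of a batch, assuming consecutive 32-byte descriptors."""
    return ring_start + first_index * DESC_SIZE_BYTES


if __name__ == "__main__":
    # 40 pending descriptors become two full batches and one partial batch.
    for first, count in fetch_batches(head=0, tail=40):
        addr = batch_address(0x1000_0000, first)
        print(f"fetch {count:2d} descriptors at 0x{addr:08X}")
```

Each command then reads at most 512 bytes, so no single descriptor fetch exceeds the MRRS quoted in the text.
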

The following table describes the simplified host descriptor.

Note: The descriptor format is the same for D2H and H2D, but the descriptors for each direction are kept in a separate ring.

Table 14. Host descriptor format

| Name | Width | Description |
|---|---|---|
| SRC_ADDR[63:0] | 64 | Source address of allocated buffer read by DMA. |
| DEST_ADDR[127:64] | 64 | Destination address of allocated buffer written by DMA. |
| PYLD_CNT[147:128] | 20 | DMA payload size in bytes. Max 1MB, with 20’h0 indicating 1MB. |
| RSRV[159:148] | 12 | Reserved |
| DESC_IDX[175:160] | 16 | Unique identifier for each descriptor; the same number will be applied to Q_COMPLETED_POINTER. Same as the descriptor count, which is used as the tail pointer. For example, the descriptor count runs from 1 to 128 for 128 descriptors in a 4K page. |
| MSIX_EN[176] | 1 | Enable MSIX |
| WB_EN[177] | 1 | Enable Writeback |
| RSRVD[255:178] | 78 | Reserved |
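
To make the bit positions in Table 14 concrete, the following Python sketch packs one descriptor into a 256-bit value and serializes it to 32 bytes. It is a sketch under assumptions rather than driver code: the function name pack_descriptor is hypothetical, and the little-endian byte order used for serialization is an assumption that should be checked against the driver source. Reserved fields are left at zero, and MSIX_EN defaults to off because the example design has no interrupt support.

```python
MAX_PAYLOAD = 1 << 20  # 1 MB, encoded as 20'h0 in PYLD_CNT


def pack_descriptor(src_addr, dest_addr, payload_bytes, desc_idx,
                    msix_en=False, wb_en=True):
    """Pack the Table 14 fields into 32 bytes (byte order is an assumption)."""
    if not 0 <= src_addr < 1 << 64 or not 0 <= dest_addr < 1 << 64:
        raise ValueError("addresses must fit in 64 bits")
    if not 1 <= payload_bytes <= MAX_PAYLOAD:
        raise ValueError("payload must be between 1 byte and 1 MB")
    if not 0 <= desc_idx < 1 << 16:
        raise ValueError("DESC_IDX must fit in 16 bits")

    pyld_cnt = payload_bytes & 0xFFFFF  # a 1 MB payload wraps to 0

    word = src_addr                 # SRC_ADDR  [63:0]
    word |= dest_addr << 64         # DEST_ADDR [127:64]
    word |= pyld_cnt << 128         # PYLD_CNT  [147:128]
    word |= desc_idx << 160         # DESC_IDX  [175:160]
    word |= int(msix_en) << 176     # MSIX_EN   [176]
    word |= int(wb_en) << 177       # WB_EN     [177]
    return word.to_bytes(32, "little")


if __name__ == "__main__":
    desc = pack_descriptor(src_addr=0x1000, dest_addr=0x2000,
                           payload_bytes=4096, desc_idx=1)
    print(len(desc), "bytes:", desc.hex())
```

At 32 bytes per descriptor, 128 descriptors fill exactly one 4K page, which matches the DESC_IDX example in the table.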