AN 456: PCI Express High Performance Reference Design

ID 683541
Date 12/12/2018
Public

1.11. Performance Benchmarking Results

The following tables list the performance of x1, x4, and x8 operations using this reference design with the Stratix V GX FPGA development board and the Intel i7-3930K 3.8 GHz Sandy Bridge-E processor. The tables show the average throughput with the following parameters:
  • 100 KByte transfer
  • 20 iterations
  • A 256-byte payload
  • Maximum 512-byte read request
  • 256-byte read completion
Note: Refer to the following web page for other available reference designs and application notes for PCI Express.
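As a rough illustration of how an average throughput figure could be derived from the parameters listed above, the following Python sketch averages the per-iteration throughput over 20 transfers of 100 KBytes each. The timing values, the function name, and the unit conventions (1 KByte = 1024 bytes, 1 MB = 10^6 bytes) are assumptions made for this example; this section does not state how the reference design's software takes its measurements.

```python
TRANSFER_BYTES = 100 * 1024  # assumption: 1 KByte = 1024 bytes
ITERATIONS = 20


def average_throughput_mb_s(elapsed_seconds, transfer_bytes=TRANSFER_BYTES):
    """Return the mean per-iteration throughput in MB/s.

    Assumption: 1 MB = 10**6 bytes. Each entry in elapsed_seconds is the
    time taken by one complete transfer of transfer_bytes.
    """
    if not elapsed_seconds:
        raise ValueError("need at least one timed iteration")
    rates = [transfer_bytes / t / 1e6 for t in elapsed_seconds]
    return sum(rates) / len(rates)


# Hypothetical timings: 20 iterations that each take 30 microseconds.
timings = [30e-6] * ITERATIONS
print(f"{average_throughput_mb_s(timings):.0f} MB/s")
```

Averaging the per-iteration rates is one choice; dividing the total bytes by the total elapsed time is another, and the two differ when iteration times vary.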
Table 7.  Arria 10 Hard IP for PCI Express Performance
Configuration       DMA Read    DMA Write   Simultaneous DMA      Theoretical Maximum Throughput (MB/sec)
                    (MB/sec)    (MB/sec)    Read/Write (MB/sec)   DMA Read    DMA Write
Gen3, x4            3362        3478        3270/2809             3710        3710
Gen2, x8            3337        3505        3227/2808             3710        3710
Gen1, x1            208         214         199/192               231         231
Table 8.  Stratix V Hard IP for PCI Express Performance - Intel i7-3930K Processor
Configuration       DMA Reads   DMA Writes  Simultaneous DMA      Theoretical Maximum Throughput (MB/s)
                    (MB/s)      (MB/s)      Read/Writes (MB/s)    Read        Write
Gen3, x4            3324        3473        3212/2991             3710        3710
Gen2, x8            3326        3507        3267/2910             3710        3710
Gen2, x4            1704        1767        1653/1514             1855        1855
Gen2, x1            475         438         401/358               463         463
Gen1, x8            1676        1763        1647/1491             1855        1855
Gen1, x4            839         881         832/800               927         927
Gen1, x1            222         222         214/200               231         231
Table 9.  Cyclone V Hard IP for PCI Express Performance
Configuration       DMA Read    DMA Write   Simultaneous DMA      Theoretical Maximum Throughput (MB/sec)
                    (MB/sec)    (MB/sec)    Read/Write (MB/sec)   DMA Read    DMA Write
Gen2, x4            1700        1762        1683/1485             1855        1855
Gen1, x4            832         882         849/801               927         927
Gen1, x1            222         225         220/209               231         231
Table 10.  Arria V GT Hard IP for PCI Express Performance
Configuration       DMA Read    DMA Write   Simultaneous DMA      Theoretical Maximum Throughput (MB/sec)
                    (MB/sec)    (MB/sec)    Read/Write (MB/sec)   DMA Read    DMA Write
Gen2, x4            1719        1784        1673/1450             1855        1855
Gen2, x1            446         451         439/421               463         463
Gen1, x8            1699        1782        1669/1461             1855        1855
Gen1, x4            865         892         806/802               927         927
Gen1, x1            222         225         220/209               231         231
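
The theoretical maximum columns in the tables above are consistent with a simple payload-efficiency calculation: nominal link bandwidth multiplied by payload / (payload + per-packet overhead), truncated to a whole number. The Python sketch below reproduces the tabulated values under assumptions that this section does not state: a 20-byte overhead per packet (framing, sequence number, 3-DWORD header, and LCRC), nominal per-lane rates of 250 MB/s for Gen1 and 500 MB/s for Gen2, and a round 1000 MB/s per lane for Gen3, chosen because it matches the 3710 MB/s listed for Gen3 x4. The function and constant names are hypothetical.

```python
# Nominal per-lane data rate in MB/s, per direction.
# Assumption: Gen1 and Gen2 follow from 2.5 and 5.0 GT/s with 8b/10b encoding.
# The Gen3 value is the round figure that reproduces the tables; the exact
# 128b/130b rate is slightly lower.
LANE_MB_S = {1: 250, 2: 500, 3: 1000}

# Assumption: start (1) + sequence number (2) + 3-DWORD header (12)
# + LCRC (4) + end (1) = 20 bytes of overhead per packet.
TLP_OVERHEAD_BYTES = 20


def theoretical_max_mb_s(gen, lanes, payload_bytes=256,
                         overhead_bytes=TLP_OVERHEAD_BYTES):
    """Nominal link bandwidth scaled by payload efficiency, truncated."""
    raw_mb_s = LANE_MB_S[gen] * lanes
    return raw_mb_s * payload_bytes // (payload_bytes + overhead_bytes)


for gen, lanes in [(3, 4), (2, 8), (2, 4), (2, 1), (1, 8), (1, 4), (1, 1)]:
    print(f"Gen{gen} x{lanes}: {theoretical_max_mb_s(gen, lanes)} MB/s")
# Prints 3710, 3710, 1855, 463, 1855, 927, and 231 MB/s respectively.
```

With a 256-byte payload the efficiency factor is 256/276, or about 92.8 percent, which is why no configuration in the tables can reach the nominal link rate.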

The following tables list the performance of x8, x4, and x1 operations using this reference design for development boards with the Intel X58 chipset. The tables show the average throughput with the following parameters:

  • 100 KByte transfer
  • 20 iterations
  • A 256-byte payload
  • Maximum 512-byte read request
  • 256-byte read completion
Table 11.  Stratix IV GX Performance - Intel X58 Chipset
Configuration       DMA Reads   DMA Writes  Simultaneous DMA      Theoretical Maximum Throughput (MB/s)
                    (MB/s)      (MB/s)      Read/Writes (MB/s)    Read        Write
Hard IP Implementation - Stratix IV GX
Gen2 x8, 128-bit    3304        3434        2956/2955             3710        3710
Gen2 x4, 128-bit    1708        1783        1684/1484             1855        1855
Gen2 x4, 64-bit     1727        1775        1691/1631             1855        1855
Gen2 x1             448         450         438/425               463         463
Gen1 x8, 128-bit    1694        1778        1678/1480             1855        1855
Gen1 x8, 64-bit     1706        1778        1680/1628             1855        1855
Gen1 x4             875         890         855/815               927         927
Gen1 x1             224         225         219/211               231         231
Soft IP Implementation - Stratix IV GX
Gen1 x4             873         890         854/811               927         927
Gen1 x1             222         225         219/209               231         231
Table 12.  Arria II GX Performance - Intel X58 Chipset
Configuration       DMA Reads   DMA Writes  Simultaneous DMA      Theoretical Maximum Throughput (MB/s)
                    (MB/s)      (MB/s)      Read/Writes (MB/s)    Read        Write
Hard IP Implementation - Arria II GX
Gen1 x8, 128-bit    1497        1775        1210/1331             1855        1855
Gen1 x4, 64-bit     859         889         725/767               927         927
Gen1 x1, 64-bit     220         225         217/204               231         231
Soft IP Implementation - Arria II GX
Gen1 x4, 64-bit     860         887         854/780               927         927
Gen1 x1, 64-bit     220         225         203/203               231         231
Table 13.  Cyclone IV GX Performance - Intel X58 Chipset
Configuration       DMA Reads   DMA Writes  Simultaneous DMA      Theoretical Maximum Throughput (MB/s)
                    (MB/s)      (MB/s)      Read/Writes (MB/s)    Read        Write
Soft IP Implementation - Cyclone IV GX
Gen1 x1, 64-bit     220         225         217/203               231         231
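
To compare configurations across devices, it can help to express each measured throughput as a percentage of its theoretical maximum. The following Python sketch does this for one Hard IP or Soft IP row copied from each of Tables 11, 12, and 13; the function name is hypothetical and the calculation is plain arithmetic on the tabulated values, not part of the reference design.

```python
def efficiency_percent(measured_mb_s, theoretical_mb_s):
    """Measured throughput as a percentage of the theoretical maximum."""
    if theoretical_mb_s <= 0:
        raise ValueError("theoretical maximum must be positive")
    return 100.0 * measured_mb_s / theoretical_mb_s


# (device and configuration, DMA read, DMA write, theoretical maximum), in MB/s.
rows = [
    ("Stratix IV GX, Gen2 x8, 128-bit", 3304, 3434, 3710),  # Table 11
    ("Arria II GX, Gen1 x8, 128-bit", 1497, 1775, 1855),    # Table 12
    ("Cyclone IV GX, Gen1 x1, 64-bit", 220, 225, 231),      # Table 13
]

for name, read, write, limit in rows:
    print(f"{name}: read {efficiency_percent(read, limit):.1f}%, "
          f"write {efficiency_percent(write, limit):.1f}%")
```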