Introduction
This tutorial shows how to install the FD.io Vector Packet Processing (VPP) package and build a packet forwarding engine on a bare metal Intel® Xeon® processor server. Two additional Intel Xeon processor platform systems are connected to the VPP host to pass traffic using iperf3* and Cisco's TRex* Realistic Traffic Generator. Intel® 40 Gigabit Ethernet (GbE) network interface cards (NICs) connect the hosts.
Vector Packet Processing (VPP) Overview
VPP is open source high-performance packet processing software. It leverages the Data Plane Development Kit (DPDK), which provides fast packet processing libraries and user space drivers, to take advantage of fast I/O. DPDK receives and sends packets with a minimum number of CPU cycles by bypassing the kernel and using poll mode drivers in user space. Details on how to configure DPDK can be found in the DPDK Documentation.
VPP can be used as a standalone product or as an extended data plane product. It is highly efficient because it scales well on modern Intel® processors and handles packet processing in batches, called vectors, of up to 256 packets at a time. This approach maximizes instruction cache hits.
The VPP platform consists of a set of nodes in a directed graph called a packet processing graph. Each node provides a specific network function to packets, and each directed edge indicates the next network function that will handle packets. Instead of processing one packet at a time as the kernel does, the first node in the packet processing graph polls for a burst of incoming packets from a network interface; it collects similar packets into a frame (or vector) and passes the frame to the next node indicated by the directed edge. The next node takes the frame of packets, processes them based on the functionality it provides, and passes the frame to the next node, and so on. This process repeats until the last node gets the frame, processes all the packets in it, and outputs them on a network interface. When a frame of packets is handled by a node, only the first packet needs to load the CPU's instructions into the cache; the rest of the packets benefit from the instructions already in the cache. The VPP architecture is flexible, allowing users to create new nodes, enter them into the packet processing graph, and rearrange the graph.
Like DPDK, VPP operates in user space. VPP can be used on bare metal, virtual machines (VMs), or containers.
Build and Install VPP
In this tutorial, three systems named csp2s22c03, csp2s22c04, and net2s22c05 are used. The system csp2s22c03, with VPP installed, is used to forward packets, and the systems csp2s22c04 and net2s22c05 are used to pass traffic. All three systems are equipped with Intel® Xeon® processor E5-2699 v4 @ 2.20 GHz, two sockets with 22 cores per socket, and run 64-bit Ubuntu* 16.04 LTS. The Intel® Ethernet Converged Network Adapter XL710 10/40 GbE connects these systems. Refer to Figure 1 and Figure 2 for configuration diagrams.
Build the FD.io VPP Binary
The instructions in this section describe how to build the VPP package from FD.io. Skip to the next section if you’d like to use the Debian* VPP packages instead.
Using an account with administrator privileges on csp2s22c03, we download a stable version of VPP (version 17.04 is used in this tutorial) and navigate to the build-root directory to build the image:
To build the image with debug symbols:
After you've configured VPP, you can run the VPP binary from the fdio.1704 directory using the src/vpp/conf/startup.conf configuration file:
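The original code blocks for these steps are not shown above; the following is a sketch based on the FD.io 17.04-era build flow. The branch name, make targets, and the fdio.1704 directory name are assumptions, so check them against the FD.io documentation for your version:

```shell
# Clone a stable branch of VPP (17.04) into a working directory
git clone https://gerrit.fd.io/r/vpp fdio.1704
cd fdio.1704
git checkout stable/1704

# Install build dependencies, then build the image from build-root
make install-dep
cd build-root
make distclean
./bootstrap.sh
make V=0 PLATFORM=vpp TAG=vpp install-packages

# To build the image with debug symbols instead:
make V=0 PLATFORM=vpp TAG=vpp_debug install-packages

# Run the VPP binary from the fdio.1704 directory with the sample config
cd ..
sudo build-root/install-vpp-native/vpp/bin/vpp -c src/vpp/conf/startup.conf
```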
Build the Debian* VPP Packages
If you prefer to use the Debian VPP packages, follow these instructions to build them:
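A sketch of the package build, assuming the pkg-deb target of the FD.io top-level Makefile from that era; the target name and output directory are assumptions:

```shell
cd fdio.1704
make install-dep          # install build dependencies
make pkg-deb              # build the Debian packages
ls build-root/*.deb       # list the resulting .deb files
```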
In this output:
- vpp is the packet engine
- vpp-api-java is the Java* binding module
- vpp-api-lua is the Lua* binding module
- vpp-api-python is the Python* binding module
- vpp-dbg is the debug symbol version of VPP
- vpp-dev is the development support (headers and libraries)
- vpp-lib is the VPP runtime library
- vpp-plugins is the plugin module
Next, install the Debian VPP packages. At a minimum, you should install the vpp, vpp-lib, and vpp-plugins packages. We install them on the machine csp2s22c03:
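A minimal sketch of the install step, assuming the .deb files sit in the build-root directory:

```shell
cd build-root
sudo dpkg -i vpp_*.deb vpp-lib_*.deb vpp-plugins_*.deb
```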
Verify that the VPP packages are installed successfully:
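For example, by querying the package database:

```shell
# Confirm the packages registered with dpkg
dpkg -l | grep vpp
```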
Configure VPP
During installation, two configuration files are created: /etc/sysctl.d/80-vpp.conf and /etc/vpp/startup.conf. The /etc/sysctl.d/80-vpp.conf configuration file is used to set up huge pages. The /etc/vpp/startup.conf configuration file is used to start VPP.
Configure huge pages
In the /etc/sysctl.d/80-vpp.conf configuration file, set the parameters as follows: the number of 2 MB huge pages, vm.nr_hugepages, is chosen to be 4096; vm.max_map_count is 9216 (at least twice the number of huge pages); and the shared memory maximum, kernel.shmmax, is 8,589,934,592 bytes (4096 * 2 * 1024 * 1024).
Apply these memory settings to the system and verify the huge pages:
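A sketch of the resulting settings and the commands to apply and verify them:

```shell
# Relevant lines in /etc/sysctl.d/80-vpp.conf:
#   vm.nr_hugepages=4096
#   vm.max_map_count=9216
#   kernel.shmmax=8589934592

# Apply the settings and verify the huge pages
sudo sysctl -p /etc/sysctl.d/80-vpp.conf
grep -i huge /proc/meminfo
```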
Configure startup.conf
In the /etc/vpp/startup.conf configuration file, the keyword interactive is added to enable the VPP command-line interface (CLI). Also, four worker threads are selected and run on cores 2, 3, 22, and 23. Note that you can choose the NICs to use in this configuration, or you can specify them later, as this exercise shows. The modified /etc/vpp/startup.conf configuration file is shown below.
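A sketch of the relevant stanzas, assuming main core 1 and the worker cores named in the text; other stanzas in the installed file are left at their defaults:

```
unix {
  nodaemon
  interactive
  log /tmp/vpp.log
  full-coredump
}

cpu {
  main-core 1
  corelist-workers 2-3,22-23
}

# A dpdk { dev 0000:xx:yy.z ... } stanza may list NICs here,
# or they can be bound later, as this exercise shows.
```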
Run VPP as a Packet Processing Engine
In this section, four examples of running VPP are shown. In the first two examples, the iperf3 tool is used to generate traffic, and in the last two examples the TRex Realistic Traffic Generator is used. For comparison purposes, the first example shows packet forwarding using ordinary kernel IP forwarding, and the second example shows packet forwarding using VPP.
Example 1: Using Kernel Packet Forwarding with iperf3*
In this test, 40 GbE Intel Ethernet Network Adapters are used to connect the three systems. Figure 1 illustrates this configuration.
Figure 1 – VPP runs on a host that connects to two other systems via 40 GbE NICs.
For comparison purposes, in the first test, we configure kernel forwarding in csp2s22c03 and use the iperf3 tool to measure network bandwidth between csp2s22c03 and net2s22c05. In the second test, we start the VPP engine in csp2s22c03 instead of using kernel forwarding.
On csp2s22c03, we configure the system to have the addresses 10.10.1.1/24 and 10.10.2.1/24 on the two 40 GbE NICs. To find all network interfaces available on the system, use the lshw Linux* command to list all network interfaces and the corresponding slots [0000:xx:yy.z]. For example, the 40 GbE interfaces are ens802f0 and ens802f1.
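For example (the device names and slots below are illustrative; they vary per system):

```shell
sudo lshw -class network -businfo
# Bus info          Device     Class     Description
# pci@0000:82:00.0  ens802f0   network   Ethernet Controller XL710 for 40GbE QSFP+
# pci@0000:82:00.1  ens802f1   network   Ethernet Controller XL710 for 40GbE QSFP+
```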
Configure the system to have 10.10.1.1 and 10.10.2.1 on the two 40 GbE NICs ens802f0 and ens802f1, respectively.
List the route table:
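A sketch of the address assignment and route listing, using the ip utility:

```shell
# Assign the addresses to the two 40 GbE interfaces and bring them up
sudo ip addr add 10.10.1.1/24 dev ens802f0
sudo ip link set dev ens802f0 up
sudo ip addr add 10.10.2.1/24 dev ens802f1
sudo ip link set dev ens802f1 up

# List the route table
ip route show
```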
On csp2s22c04, we configure the system to have the address 10.10.1.2 and use the interface ens802 to route IP packets for 10.10.2.0/24. Use the lshw Linux command to list all network interfaces and the corresponding slots [0000:xx:yy.z]. For example, the interface ens802d1 (ens802) is connected to slot [82:00.0]:
For kernel forwarding, assign 10.10.1.2 to the interface ens802, and add a static route for packets destined to 10.10.2.0/24:
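A sketch of these two steps with the ip utility; the next hop 10.10.1.1 is the csp2s22c03 side of this link:

```shell
sudo ip addr add 10.10.1.2/24 dev ens802
sudo ip link set dev ens802 up

# Route packets for 10.10.2.0/24 via csp2s22c03
sudo ip route add 10.10.2.0/24 via 10.10.1.1
```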
After setting the route, we can ping from csp2s22c03 to csp2s22c04, and vice versa:
Similarly, on net2s22c05, we configure the system to have the address 10.10.2.2 and use the interface ens803f0 to route IP packets for 10.10.1.0/24. Use the lshw Linux command to list all network interfaces and the corresponding slots [0000:xx:yy.z]. For example, the interface ens803f0 is connected to slot [87:00.0]:
For kernel forwarding, assign 10.10.2.2 to the interface ens803f0, and add a static route for packets destined to 10.10.1.0/24:
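Mirroring the previous host, a sketch of these steps; the next hop 10.10.2.1 is the csp2s22c03 side of this link:

```shell
sudo ip addr add 10.10.2.2/24 dev ens803f0
sudo ip link set dev ens803f0 up

# Route packets for 10.10.1.0/24 via csp2s22c03
sudo ip route add 10.10.1.0/24 via 10.10.2.1
```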
After setting the route, you can ping from csp2s22c03 to net2s22c05, and vice versa. However, in order to ping between net2s22c05 and csp2s22c04, kernel IP forwarding in csp2s22c03 has to be enabled:
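This is a one-line sysctl change on the forwarding host:

```shell
# On csp2s22c03: enable IPv4 forwarding in the kernel
sudo sysctl -w net.ipv4.ip_forward=1
```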
If successful, verify that you can now ping between net2s22c05 and csp2s22c04:
We use the iperf3 utility to measure network bandwidth between hosts. In this test, we download the iperf3 utility on both net2s22c05 and csp2s22c04. On csp2s22c04, we start the iperf3 server with iperf3 -s, and then on net2s22c05, we start the iperf3 client to connect to the server:
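A sketch of the two commands; 10.10.1.2 is csp2s22c04's address from the earlier setup:

```shell
# On csp2s22c04: start the iperf3 server
iperf3 -s

# On net2s22c05: connect to the server across the forwarding host
iperf3 -c 10.10.1.2
```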
Using kernel IP forwarding, iperf3 shows the network bandwidth is about 8.12 Gbits per second.
Example 2: Using VPP with iperf3
First, disable kernel IP forwarding in csp2s22c03 to ensure the host cannot use kernel forwarding (all the settings in net2s22c05 and csp2s22c04 remain unchanged):
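This reverses the earlier sysctl setting:

```shell
# On csp2s22c03: disable kernel IP forwarding
sudo sysctl -w net.ipv4.ip_forward=0
```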
You can use DPDK's device binding utility (./install-vpp-native/dpdk/sbin/dpdk-devbind) to list network devices and bind/unbind them from specific drivers. The flag -s/--status shows the status of devices; the flag -b/--bind selects the driver to bind. The status of devices in our system indicates that the two 40 GbE XL710 devices are located at 82:00.0 and 82:00.1. Use the devices' slots to bind them to the driver uio_pci_generic:
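A sketch of the list-and-bind sequence; loading the uio_pci_generic kernel module first is an assumption about the system state:

```shell
# Show the status of all network devices
sudo ./install-vpp-native/dpdk/sbin/dpdk-devbind --status

# Bind the two XL710 ports to the uio_pci_generic driver
sudo modprobe uio_pci_generic
sudo ./install-vpp-native/dpdk/sbin/dpdk-devbind --bind uio_pci_generic 82:00.0 82:00.1
```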
Start the VPP service, and verify that VPP is running:
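For example, using the service wrapper installed with the packages:

```shell
sudo service vpp start
ps aux | grep vpp      # verify the vpp process is running
```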
To access the VPP CLI, issue the command sudo vppctl. From the VPP CLI, list all interfaces that are bound to DPDK using the command show interface:
VPP shows that the two 40 Gbps ports located at 82:0:0 and 82:0:1 are bound. Next, you need to assign IP addresses to those interfaces, bring them up, and verify:
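A sketch of the CLI session; the interface names FortyGigabitEthernet82/0/0 and FortyGigabitEthernet82/0/1 are the DPDK-derived names assumed for these slots:

```
vpp# set interface ip address FortyGigabitEthernet82/0/0 10.10.1.1/24
vpp# set interface ip address FortyGigabitEthernet82/0/1 10.10.2.1/24
vpp# set interface state FortyGigabitEthernet82/0/0 up
vpp# set interface state FortyGigabitEthernet82/0/1 up
vpp# show interface address
```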
At this point VPP is operational. You can ping these interfaces from either net2s22c05 or csp2s22c04. Moreover, VPP can forward packets whose IP addresses are in 10.10.1.0/24 and 10.10.2.0/24, so you can ping between net2s22c05 and csp2s22c04. Also, you can run iperf3 as illustrated in the previous example, and the result from running iperf3 between net2s22c05 and csp2s22c04 increases to 20.3 Gbits per second.
The VPP CLI command show run displays the graph runtime statistics. Observe that the average vector per node is 6.76, which means that, on average, a vector of 6.76 packets is handled in a graph node.
Example 3: Using VPP with the TRex* Realistic Traffic Generator
In this example we use only two systems, csp2s22c03 and net2s22c05, to run the TRex Realistic Traffic Generator. VPP is installed on csp2s22c03 and runs as a packet forwarding engine. On net2s22c05, TRex is used to generate both client- and server-side traffic. TRex is a high-performance traffic generator that leverages DPDK and runs in user space. Figure 2 illustrates this configuration.
VPP is set up on csp2s22c03 exactly as it was in Example 2. Only the setup on net2s22c05 is modified slightly to run the TRex preconfigured traffic files.
Figure 2 – The TRex traffic generator sends packets to the host that has VPP running.
To install TRex, on net2s22c05, download and extract the TRex package:
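For example, following the download steps from the TRex manual (the /opt/trex install directory is an assumption):

```shell
mkdir -p /opt/trex && cd /opt/trex
wget --no-cache http://trex-tgn.cisco.com/trex/release/latest
tar -xzvf latest
```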
Create the /etc/trex_cfg.yaml configuration file. In this configuration file, the ports should match the interfaces available on the target system, which is net2s22c05 in our example. The IP addresses correspond to Figure 2. For more information on the configuration file, please refer to the TRex Manual.
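A sketch of /etc/trex_cfg.yaml for this topology; the PCI addresses, port IPs, and gateways below are assumptions inferred from Figure 2 and the earlier lshw output, so adjust them to your system:

```yaml
- port_limit    : 2
  version       : 2
  interfaces    : ["87:00.0", "87:00.1"]
  port_info     :
    - ip         : 10.10.2.2     # client-side port
      default_gw : 10.10.2.1     # VPP interface on csp2s22c03
    - ip         : 10.10.1.2     # server-side port
      default_gw : 10.10.1.1     # VPP interface on csp2s22c03
```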
Stop the previous VPP session and start it again in order to add routes for the new IP addresses 16.0.0.0/8 and 48.0.0.0/8, according to Figure 2. Those IP addresses are needed because TRex generates packets that use them. Refer to the TRex Manual for details on these traffic templates.
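A sketch of the restart and the added routes; the next-hop addresses assume TRex's client side sits behind 10.10.2.2 and its server side behind 10.10.1.2, per Figure 2:

```
csp2s22c03$ sudo service vpp stop
csp2s22c03$ sudo service vpp start
csp2s22c03$ sudo vppctl
vpp# ip route add 16.0.0.0/8 via 10.10.2.2
vpp# ip route add 48.0.0.0/8 via 10.10.1.2
```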
Now you can generate a simple traffic flow from net2s22c05 using the traffic configuration file cap2/dns.yaml:
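For example, from the extracted TRex directory (the version directory name, duration, and rate multiplier below are illustrative):

```shell
cd /opt/trex/v2.XX     # your extracted TRex release directory
sudo ./t-rex-64 -f cap2/dns.yaml -d 60 -m 10
# -f: traffic configuration file, -d: duration in seconds, -m: rate multiplier
```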
On csp2s22c03, the VPP CLI command show run displays the graph runtime statistics:
Example 4: Using VPP with TRex Mixed Traffic Templates
In this example, more complicated traffic with a delay profile is generated on net2s22c05 using the traffic configuration file avl/sfr_delay_10_1g.yaml:
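For example (the core count, rate multiplier, and duration below are illustrative values, not the ones used to produce the statistics quoted next):

```shell
sudo ./t-rex-64 -f avl/sfr_delay_10_1g.yaml -c 2 -m 20 -d 100
# -f: traffic configuration file, -c: number of cores,
# -m: rate multiplier, -d: duration in seconds
```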
On csp2s22c03, use the VPP CLI command show run to display the graph runtime statistics. Observe that the average vector per node ranges from 10.69 to 14.47:
Summary
This tutorial showed how to download, compile, and install the VPP binary on an Intel® architecture platform. Examples of the /etc/sysctl.d/80-vpp.conf and /etc/vpp/startup.conf configuration files were provided to get the user up and running with VPP. The tutorial also illustrated how to detect and bind the network interfaces to a DPDK-compatible driver. You can use the VPP CLI to assign IP addresses to these interfaces and bring them up. Finally, four examples using iperf3 and TRex were included to show how VPP processes packets in batches.
About the Author
Loc Q Nguyen received an MBA from University of Dallas, a master’s degree in Electrical Engineering from McGill University, and a bachelor's degree in Electrical Engineering from École Polytechnique de Montréal. He is currently a software engineer with Intel Corporation's Software and Services Group. His areas of interest include computer networking, parallel computing, and computer graphics.