Building and Installing:
------------------------
DPDK 1.7 is required.
DPDK:
Set dir, e.g.: export DPDK_DIR=/usr/src/dpdk-1.7.0
cd $DPDK_DIR
Update config/common_linuxapp so that DPDK generates a single lib file
(this modification is also required for the IVSHMEM build):
CONFIG_RTE_BUILD_COMBINE_LIBS=y
For a default install without IVSHMEM:
make install T=x86_64-native-linuxapp-gcc
To include IVSHMEM (shared memory):
make install T=x86_64-ivshmem-linuxapp-gcc
For details refer to http://dpdk.org/
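The build steps above, collected into one sequence for reference (a sketch only; the version path and the sed edit are taken from this guide, and the sed command assumes the option is set to 'n' by default):

```shell
# Sketch of the DPDK build steps described above.
export DPDK_DIR=/usr/src/dpdk-1.7.0
cd $DPDK_DIR

# Enable the single combined library (also needed for IVSHMEM builds).
sed -i 's/CONFIG_RTE_BUILD_COMBINE_LIBS=n/CONFIG_RTE_BUILD_COMBINE_LIBS=y/' \
    config/common_linuxapp

# Pick one target: default, or IVSHMEM-enabled.
make install T=x86_64-native-linuxapp-gcc
# make install T=x86_64-ivshmem-linuxapp-gcc
```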
Linux kernel:
Refer to the DPDK getting started guide for kernel requirements.
OVS:
Non-IVSHMEM:
export DPDK_BUILD=$DPDK_DIR/x86_64-native-linuxapp-gcc/
IVSHMEM:
export DPDK_BUILD=$DPDK_DIR/x86_64-ivshmem-linuxapp-gcc/

cd $OVS_DIR
./boot.sh
./configure --with-dpdk=$DPDK_BUILD
make
- insert uio.ko
e.g. modprobe uio
- insert igb_uio.ko
  e.g. insmod $DPDK_BUILD/kmod/igb_uio.ko
- Bind network device to igb_uio.
  e.g. $DPDK_DIR/tools/dpdk_nic_bind.py --bind=igb_uio eth1
Alternate binding method:
Find target Ethernet devices
  lspci -nn|grep Ethernet
Check that the device has been bound by listing the igb_uio driver directory:
  ls /sys/bus/pci/drivers/igb_uio
  0000:02:00.0  0000:02:00.1  bind  module  new_id  remove_id  uevent  unbind
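The alternate method binds the device through sysfs directly. A hedged sketch; the PCI address matches the listing above, but the vendor/device id "8086 10fb" and the ixgbe driver name are assumed example values for an Intel NIC, not taken from this guide:

```shell
# Example only: bind PCI device 0000:02:00.0 to igb_uio via sysfs.
# "8086 10fb" (vendor/device id) and "ixgbe" (current driver) are assumptions;
# substitute the values reported by lspci -nn for your NIC.
echo "8086 10fb" > /sys/bus/pci/drivers/igb_uio/new_id
echo "0000:02:00.0" > /sys/bus/pci/drivers/ixgbe/unbind
echo "0000:02:00.0" > /sys/bus/pci/drivers/igb_uio/bind
```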
Prepare system:
- mount hugetlbfs
  e.g. mount -t hugetlbfs -o pagesize=1G none /dev/hugepages
Refer to http://www.dpdk.org/doc/quick-start for verifying DPDK setup.
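Before starting OVS, the kernel's hugepage accounting can be checked (a generic Linux check, not specific to DPDK):

```shell
# HugePages_Total is nonzero once hugepages have been reserved.
grep '^HugePages' /proc/meminfo
# List any hugetlbfs mounts (empty if not mounted yet).
mount -t hugetlbfs
```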
Start ovsdb-server:
cd $OVS_DIR
./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
    --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
--private-key=db:Open_vSwitch,SSL,private_key \
    --certificate=db:Open_vSwitch,SSL,certificate \
--bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach
First time after db creation, initialize:
cd $OVS_DIR
./utilities/ovs-vsctl --no-wait init
Start vswitchd:
DPDK configuration arguments can be passed to vswitchd via the `--dpdk`
argument. This needs to be the first argument passed to the vswitchd process.
The dpdk arg -c is ignored by ovs-dpdk, but it is a required parameter
for dpdk initialization.
e.g.
export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach
If more than one GB hugepage is allocated (as for IVSHMEM), set the amount and
use NUMA node 0 memory:
./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 \
  -- unix:$DB_SOCK --pidfile --detach
To use ovs-vswitchd with DPDK, create a bridge with datapath_type
"netdev" in the configuration database. For example:
############################# Script:
#! /bin/sh
# Move to command directory
cd /usr/src/ovs/utilities/
# Clear current flows
######################################
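The truncated script above would typically finish by clearing and installing flows. A sketch assuming two DPDK ports attached as OpenFlow ports 1 and 2 (the port numbers are assumptions, not taken from this guide):

```shell
# Clear current flows on br0, then cross-connect OpenFlow ports 1 and 2
# (run from the utilities directory, as in the script above).
./ovs-ofctl del-flows br0
./ovs-ofctl add-flow br0 in_port=1,action=output:2
./ovs-ofctl add-flow br0 in_port=2,action=output:1
```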
With pmd multi-threading support, OVS creates one pmd thread for each
numa node by default. The pmd thread handles the I/O of all DPDK
interfaces on the same numa node. The following two commands can be used
to configure the multi-threading behavior.

  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=<hex string>

The command above asks for a CPU mask for setting the affinity of pmd threads.
A set bit in the mask means a pmd thread is created and pinned to the
corresponding CPU core. For more information, please refer to
`man ovs-vswitchd.conf.db`.

  ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=<integer>

The command above sets the number of rx queues for each DPDK interface. The
rx queues are assigned to pmd threads on the same numa node in round-robin
fashion. For more information, please refer to `man ovs-vswitchd.conf.db`.

Ideally, for maximum throughput, the pmd thread should not be scheduled out,
which temporarily halts its execution. The following affinitization methods
can help.

Let's pick cores 4,6,8,10 for the pmd threads to run on. Also assume a dual
8-core Sandy Bridge system with hyperthreading enabled, where CPU1 has cores
0,...,7 and 16,...,23 and CPU2 has cores 8,...,15 and 24,...,31. (A different
cpu configuration will have different core mask requirements.)

Add a core isolation list for these cores and their associated hyperthread
siblings to the kernel bootline (e.g. isolcpus=4,20,6,22,8,24,10,26). Reboot
the system for the isolation to take effect, then restart everything.
Configure pmd threads on cores 4,6,8,10 using 'pmd-cpu-mask':

  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=00000550

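The mask value can be derived from the core list; a small sh sketch for the cores chosen above:

```shell
# Build the pmd-cpu-mask for cores 4, 6, 8 and 10: set bit <core id>
# for each core and print the result as a hex string.
mask=0
for c in 4 6 8 10; do
    mask=$(( mask | (1 << c) ))
done
printf 'pmd-cpu-mask=%08x\n' "$mask"   # prints pmd-cpu-mask=00000550
```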
You should be able to check that the pmd threads are pinned to the correct
cores via:

  top -p `pidof ovs-vswitchd` -H -d1

Note: the pmd threads on a numa node are only created if there is at least
one DPDK interface from that numa node added to OVS.

Note: core 0 is always reserved for non-pmd threads and should never be set
in the cpu mask.
+
DPDK Rings:
-----------

Following the steps above to create a bridge, you can now add dpdk rings
as a port to the vswitch. OVS will expect the DPDK ring device name to
start with dpdkr and end with a portid.

  ovs-vsctl add-port br0 dpdkr0 -- set Interface dpdkr0 type=dpdkr

DPDK rings client test application:

Included in the test directory is a sample DPDK application for testing
the rings. It comes from the base dpdk directory and has been modified to
work with the ring naming used within ovs.

Location: tests/ovs_client

To run the client:
  cd /usr/src/ovs/tests/
  ovsclient -c 1 -n 4 --proc-type=secondary -- -n "port id you gave dpdkr"

In the case of the dpdkr example above, the "port id you gave dpdkr" is 0.

It is essential to have --proc-type=secondary.

The application simply receives an mbuf on the receive queue of the
ethernet ring and then places that same mbuf on the transmit ring of
the ethernet ring. It is a trivial loopback application.

DPDK rings in VM (IVSHMEM shared memory communications)
-------------------------------------------------------

In addition to executing the client in the host, you can execute it within
a guest VM. To do so you will need a patched qemu. You can download the
patch and a getting started guide at:

https://01.org/packet-processing/downloads

A general rule of thumb for better performance is that the client
application should not be assigned the same dpdk core mask "-c" as
the vswitchd.
Restrictions:
-------------
  - This support is for physical NICs; it has been tested with Intel NICs only.
  - Works with 1500 MTU; a few changes in the DPDK lib are needed to fix
    this issue.
  - Currently the DPDK port does not make use of any offload functionality.
  ivshmem:
  - The shared memory is currently restricted to the use of 1 GB
    huge pages.
  - All huge pages are shared amongst the host, clients, virtual
    machines etc.
Bug Reporting:
--------------