cascardo/ovs.git
Alin Serdean [Fri, 29 May 2015 21:22:54 +0000 (21:22 +0000)]
datapath-windows: Fix build.

Remove a variable that breaks the Windows forwarding extension build.

The error:
warning C4189: 'bufContext' : local variable is initialized but not referenced

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Fri, 29 May 2015 17:41:05 +0000 (10:41 -0700)]
odp-util: Fix alignment when scanning Geneve attributes.

Clang complains about the fact that we use a byte array to scan
Geneve attributes into since there are different alignment requirements:

lib/odp-util.c:2936:30: error: cast from 'uint8_t *' (aka 'unsigned char *') to
      'struct geneve_opt *' increases required alignment from 1 to 2
      [-Werror,-Wcast-align]
    struct geneve_opt *opt = (struct geneve_opt *)key->d;
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~

We can instead treat this as an array of Geneve option headers to
ensure we get the right alignment, and then there is no need for
casts.
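
A minimal sketch of the idea, not the code from this commit. It assumes a
hypothetical 4-byte header 'struct opt_hdr' standing in for the real
'struct geneve_opt', whose layout is not shown in this log. Declaring the
scratch buffer as an array of headers gives it the header's alignment, so
elements can be used directly:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Geneve option header.  The 16-bit member
 * gives the struct an alignment requirement of 2. */
struct opt_hdr {
    uint16_t opt_class;
    uint8_t type;
    uint8_t length;      /* Length of the option data, in 4-byte units. */
};

#define MAX_OPT_BYTES 124

/* The buffer is an array of headers rather than uint8_t[], so its
 * elements can be read as 'struct opt_hdr' without a cast. */
struct opt_buf {
    struct opt_hdr d[MAX_OPT_BYTES / sizeof(struct opt_hdr)];
    size_t len;          /* Bytes in use; a multiple of 4. */
};

/* Counts the options stored in 'buf', walking header by header. */
static size_t
count_options(const struct opt_buf *buf)
{
    size_t n_hdrs = buf->len / sizeof buf->d[0];
    size_t i = 0, count = 0;

    assert(buf->len % sizeof buf->d[0] == 0);
    while (i < n_hdrs) {
        const struct opt_hdr *opt = &buf->d[i];   /* No cast required. */

        i += 1 + opt->length;                     /* Header plus data. */
        count++;
    }
    return count;
}
```

With a 'uint8_t d[124]' member instead, the same walk would need the cast
that Clang rejects under -Wcast-align.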

Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
Jesse Gross [Mon, 18 May 2015 23:03:01 +0000 (16:03 -0700)]
odp-util: Geneve netlink decoding.

Even though userspace does not yet support Geneve options,
the kernel does and there is some basic support for decoding
those attributes. This adds the ability to print Geneve
attributes that might potentially come from the kernel.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Jesse Gross [Thu, 21 May 2015 01:47:21 +0000 (18:47 -0700)]
util: Library routines for printing and scanning large hex integers.

Geneve options are variable length and up to 124 bytes long, which means
that they can't be easily manipulated by the integer string functions
like we do for other fields. This adds a few helper routines to make
these operations easier.
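
As a rough illustration of what such helpers do, the sketch below formats
and scans a byte string of arbitrary length as one large hex integer. The
names format_hex_bytes() and scan_hex_bytes() are hypothetical and are not
the routines added by this commit:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Writes 'n' bytes from 'data' into 'out' as "0x" followed by two hex
 * digits per byte.  'out' must have room for 2 + 2 * n + 1 characters. */
static void
format_hex_bytes(const uint8_t *data, size_t n, char *out)
{
    size_t i;

    assert(out != NULL);
    strcpy(out, "0x");
    for (i = 0; i < n; i++) {
        sprintf(out + 2 + 2 * i, "%02x", data[i]);
    }
}

/* Returns the value of hex digit 'c', or -1 if 'c' is not a hex digit. */
static int
hex_digit(char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    } else if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    } else if (c >= 'A' && c <= 'F') {
        return c - 'A' + 10;
    }
    return -1;
}

/* Parses "0x..." from 's' into 'data', which holds up to 'max' bytes.
 * Returns the number of bytes parsed, or -1 if the input is malformed,
 * has an odd number of digits, or does not fit. */
static int
scan_hex_bytes(const char *s, uint8_t *data, size_t max)
{
    size_t n = 0;

    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X')) {
        return -1;
    }
    for (s += 2; *s; s += 2) {
        int hi = hex_digit(s[0]);
        int lo = hi < 0 ? -1 : hex_digit(s[1]);  /* NUL yields -1. */

        if (lo < 0 || n >= max) {
            return -1;
        }
        data[n++] = (uint8_t) (hi << 4 | lo);
    }
    return (int) n;
}
```

A 124-byte Geneve option formatted this way needs a 251-character buffer
(2 + 2 * 124 + 1).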

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Jesse Gross [Sun, 17 May 2015 05:08:20 +0000 (22:08 -0700)]
odp-util: Format tunnel attributes directly from netlink.

When we format most netlink attributes we do so from the netlink
itself, iterating through each one and printing the contents out.
However, for tunnels we don't do this - we first convert to the
OVS userspace representation and then format that. While convenient,
this isn't really ideal as the primary use of printing netlink
attributes is debugging and this conversion is lossy, particularly
when the attributes aren't as expected. The result is that unexpected
keys are silently ignored and the level of detail on errors is
minimal.

This situation becomes worse when we introduce support for Geneve.
The conversion to userspace format requires additional information
which we might not have (ovs-dpctl) and is more complicated than
other attributes so it is likely to be confusing in the event of a
bug. The information from the kernel is self-describing so it's
much more reliable to display it directly from the netlink.

This converts tunnel attribute formatting to be more similar to
other types of attributes. As a nice bonus the output becomes
more compact because it doesn't print zeroed out attributes in
cases where they aren't relevant and therefore not present.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Jesse Gross [Wed, 20 May 2015 18:57:35 +0000 (11:57 -0700)]
odp-util: Correctly generate wildcards when formatting nested attributes.

When formatting netlink attributes, if no mask is present, a wildcarded
attribute is synthesized for the purposes of later processing. In
the case of nested attributes this must be done recursively, filling
in the correct attributes at each level rather than just generating
a set of zeros of the correct size. This is done already but it
always uses the attribute type for the top level keys - this corresponds
to nested ENCAP attributes. However, we have several levels of potentially
nested attributes for tunnels that each have their own types.

This uses an approach similar to the kernel where we have sets of
tables for the type of each attribute linked together by pointers.
This allows the mask generation function to automatically traverse
the nested attributes and always get the right types.
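
The linked-table approach can be sketched roughly as follows. The struct,
the attribute numbering, and lookup_len() are all hypothetical and chosen
for illustration; they are not the definitions used in odp-util.c:

```c
#include <assert.h>
#include <stddef.h>

/* One entry per attribute type.  'next' is non-NULL for nested
 * attributes and points at the table describing the inner types. */
struct attr_len_tbl {
    int len;                           /* Payload length, -1 if nested. */
    const struct attr_len_tbl *next;   /* Table for the nested level. */
    size_t next_max;                   /* Number of entries in 'next'. */
};

/* Hypothetical attribute numbering, for illustration only. */
enum { OPT_CLASS, OPT_MAX };
enum { TUN_ID, TUN_IPV4_SRC, TUN_OPTS, TUN_MAX };
enum { KEY_IN_PORT, KEY_TUNNEL, KEY_MAX };

static const struct attr_len_tbl opts_tbl[OPT_MAX] = {
    [OPT_CLASS] = { .len = 2 },
};
static const struct attr_len_tbl tunnel_tbl[TUN_MAX] = {
    [TUN_ID] = { .len = 8 },
    [TUN_IPV4_SRC] = { .len = 4 },
    [TUN_OPTS] = { .len = -1, .next = opts_tbl, .next_max = OPT_MAX },
};
static const struct attr_len_tbl key_tbl[KEY_MAX] = {
    [KEY_IN_PORT] = { .len = 4 },
    [KEY_TUNNEL] = { .len = -1, .next = tunnel_tbl, .next_max = TUN_MAX },
};

/* Follows 'path' (one attribute type per nesting level) through the
 * linked tables.  Returns the entry's length, or -2 if the path names an
 * unknown type or descends into a non-nested attribute. */
static int
lookup_len(const struct attr_len_tbl *tbl, size_t max,
           const int *path, size_t depth)
{
    size_t i;

    assert(tbl != NULL);
    for (i = 0; i < depth; i++) {
        const struct attr_len_tbl *e;

        if (path[i] < 0 || (size_t) path[i] >= max) {
            return -2;
        }
        e = &tbl[path[i]];
        if (i + 1 == depth) {
            return e->len;
        }
        if (!e->next) {
            return -2;
        }
        tbl = e->next;
        max = e->next_max;
    }
    return -2;
}
```

Because each level carries its own table, a mask generator that walks the
attributes this way uses the tunnel types inside a tunnel attribute instead
of reusing the top-level key types.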

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Sorin Vinturis [Thu, 28 May 2015 21:00:39 +0000 (21:00 +0000)]
datapath-windows: Multiple NBLs support for ingress data path

Added support for creating and handling multiple NBLs with only one NB
for ingress data path.

Signed-off-by: Sorin Vinturis <svinturis at cloudbasesolutions.com>
Reported-by: Alessandro Pilotti <apilotti at cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/2
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Mon, 18 May 2015 17:26:14 +0000 (10:26 -0700)]
ofp-actions: Improve conjunction error message.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Sorin Vinturis [Wed, 27 May 2015 17:08:00 +0000 (17:08 +0000)]
datapath-windows: Removed memory barrier and master lock

There is no need to enforce Netlink serialization on transactions
sent from userspace. The access to the driver's shared resources
is synchronized anyway. Thus I have removed the master lock.

I also removed the memory barrier from the filter dispatch routine. A
memory barrier is already in place in the OvsReleaseSwitchContext
function, due to its use of the InterlockedCompareExchange function.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Sorin Vinturis [Wed, 27 May 2015 16:58:26 +0000 (16:58 +0000)]
datapath-windows: Document OVS tunnel filter callout

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Sorin Vinturis [Wed, 27 May 2015 16:58:25 +0000 (16:58 +0000)]
datapath-windows: Support for multiple VXLAN tunnels

At the moment the OVS extension supports only one VXLAN tunnel that
is cached in the extension switch context. Replaced the latter
cached pointer with an array list that contains all VXLAN tunnel
vports.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Reported-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/64
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Sorin Vinturis [Wed, 27 May 2015 16:58:25 +0000 (16:58 +0000)]
datapath-windows: Support for custom VXLAN tunnel port

The kernel datapath supports only port 4789 for VXLAN tunnel creation.
Added support for configuring the VXLAN tunnel port to any port number
set by userspace.

The patch also checks to see if an existing WFP filter, for the
necessary UDP tunnel port, is already created before adding a new one.
This is a double check, because currently the userspace also verifies
this, but it is necessary to avoid future issues.

Custom VXLAN tunnel port requires the addition of a new WFP filter
with the new UDP tunnel port. The creation of a new WFP filter is
triggered in OvsInitVxlanTunnel function and the removal of the WFP
filter in OvsCleanupVxlanTunnel function.
But these functions run at IRQL = DISPATCH_LEVEL, due
to the NDIS RW lock acquisition, and all WFP calls must run at
IRQL = PASSIVE_LEVEL. This is why I have created a system thread: all
filter addition/removal requests are recorded in a list for later
processing by that thread. The ThreadStart routine processes all
received requests at IRQL = PASSIVE_LEVEL, which is the required IRQL
for the WFP calls that add and remove the WFP filters.
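
The record-now, process-later pattern can be sketched in portable C as
follows. This is a single-threaded illustration under stated assumptions:
IRQLs, locking, thread wakeup, and the WFP calls themselves are not
modeled, and every name here (filter_req, queue_request,
process_requests, filter_set) is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum req_op { REQ_ADD_FILTER, REQ_REMOVE_FILTER };

struct filter_req {
    enum req_op op;
    uint16_t udp_port;
    struct filter_req *next;
};

struct req_queue {
    struct filter_req *head, *tail;
};

/* Stand-in for the set of installed filters, keyed by UDP port. */
#define MAX_FILTERS 8
struct filter_set {
    uint16_t ports[MAX_FILTERS];
    size_t n;
};

/* Restricted context: only records the request, does no filter work.
 * Returns 0 on success, -1 if allocation fails. */
static int
queue_request(struct req_queue *q, enum req_op op, uint16_t udp_port)
{
    struct filter_req *r = malloc(sizeof *r);

    if (!r) {
        return -1;
    }
    r->op = op;
    r->udp_port = udp_port;
    r->next = NULL;
    if (q->tail) {
        q->tail->next = r;
    } else {
        q->head = r;
    }
    q->tail = r;
    return 0;
}

/* Applies one request, skipping an add if the port already has a filter. */
static void
apply_request(struct filter_set *set, const struct filter_req *r)
{
    size_t i;

    for (i = 0; i < set->n; i++) {
        if (set->ports[i] == r->udp_port) {
            break;
        }
    }
    if (r->op == REQ_ADD_FILTER) {
        if (i == set->n && set->n < MAX_FILTERS) {
            set->ports[set->n++] = r->udp_port;
        }
    } else if (i < set->n) {
        set->ports[i] = set->ports[--set->n];
    }
}

/* Worker context: drains the queue in FIFO order.  Returns the number of
 * requests processed. */
static size_t
process_requests(struct req_queue *q, struct filter_set *set)
{
    struct filter_req *r = q->head;
    size_t n = 0;

    q->head = q->tail = NULL;
    while (r) {
        struct filter_req *next = r->next;

        apply_request(set, r);
        free(r);
        r = next;
        n++;
    }
    return n;
}
```

The duplicate check in apply_request() mirrors the "double check" for an
existing filter described earlier in this message.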

The WFP filter for the default VXLAN port 4789 is not added anymore at
filter attach. All WFP filters for the tunnel ports are added when the
tunnel ports are initialized and are removed at cleanup. WFP operation
status is then reported to userspace.

It is necessary that OvsTunnelFilterUninitialize function is called
after OvsClearAllSwitchVports in order to allow for the added WFP
filters to be removed. OvsTunnelFilterUninitialize function closes the
global engine handle used by most of the WFP calls, including filter
removal.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Reported-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/66
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Wed, 20 May 2015 20:30:55 +0000 (13:30 -0700)]
tests: Check that ofproto/trace accepts dpctl output.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Wed, 20 May 2015 17:35:15 +0000 (10:35 -0700)]
tests: Fix in_port(name) test for ofproto/trace.

Commit c2a77f33ade (tests/ofproto-dpif: Use vlog to test dpif
behaviour.) mistakenly changed the test which checked that ovs-dpctl
accepts named ports as input. Restore the name to the test.

Reported-by: Gurucharan Shetty <gshetty@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Thu, 21 May 2015 00:04:33 +0000 (17:04 -0700)]
odp-util: Skip UFID when parsing datapath key.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Wed, 20 May 2015 23:48:57 +0000 (16:48 -0700)]
ofproto-dpif: Make odp/ofp parse errors more clear.

It's useful to distinguish which type of flow the parser thinks it
is parsing when we output error messages.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Joe Stringer [Tue, 19 May 2015 21:13:12 +0000 (14:13 -0700)]
extract-ofp-fields: Detect duplicate fields.

Figure out if a developer accidentally defines new NXM fields using an
existing number, and warn them. Useful particularly if new fields are
introduced upstream while rebasing an in-progress patchset.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Gurucharan Shetty [Wed, 8 Apr 2015 00:34:27 +0000 (17:34 -0700)]
ovs_threads: Avoid running pthread destructors from main thread exit.

Windows uses the pthreads-win32 library to provide the Linux pthread
functionality. It is observed that when the main thread calls
a pthread destructor after it exits, undefined behavior is seen
(e.g., junk values in data, causing pthread deadlocks).
Other people have seen similar behavior, as described in the
following email thread:
https://sourceware.org/ml/pthreads-win32/2003/msg00001.html

To avoid this, this commit de-registers the thread destructor
when the main thread exits (via the atexit handler).

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Flavio Leitner [Tue, 28 Apr 2015 02:00:14 +0000 (23:00 -0300)]
rhel: Add buildrequires for procps-ng.

The testsuite is enabled by default and uses some of
the tools provided by procps-ng.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Flavio Leitner [Tue, 28 Apr 2015 02:01:09 +0000 (23:01 -0300)]
rhel: Fix rundir ownership.

Although ovs-ctl/ovs-lib takes care of creating the rundir,
it is correct to let systemd manage the directory and let
rpm know about the ownership too.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Gurucharan Shetty [Mon, 25 May 2015 08:04:01 +0000 (01:04 -0700)]
ovs-docker: Ability to set the MTU of the container interface.

When containers are connected to an OVS bridge and tunnels
are created, it makes sense to reduce the MTU of the interface.

Reported-by: Aurélien Poulai <aurepoulain@viacesi.fr>
Signed-off-by: Gurucharan Shetty <shettyg@nicira.com>
Gurucharan Shetty [Mon, 25 May 2015 07:50:01 +0000 (00:50 -0700)]
ovs-docker: Add the ability to set the mac address.

For testing OVN, it is useful to set the mac address
of the container. Since ovs-docker hasn't been part
of any released versions of OVS, it is probably OK
to change the options style.

Signed-off-by: Gurucharan Shetty <shettyg@nicira.com>
Alin Serdean [Tue, 19 May 2015 17:21:25 +0000 (17:21 +0000)]
netdev-windows: Add ARP lookup and next hop functionality.

This patch implements two functionalities needed for an active manager:
1. ARP lookup
2. Next hop

The first uses the Windows GetIpNetTable() function:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365956%28v=vs.85%29.aspx

The second one uses GetAdaptersAddresses() function:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365915%28v=vs.85%29.aspx

Both APIs are found in the Iphlpapi library. We need to add this library when compiling.

Documentation and appveyor config have been updated to match the use of the new library.

Tested using opendaylight.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/63
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Ansis Atteka [Tue, 26 May 2015 23:49:49 +0000 (16:49 -0700)]
debian: install openvswitch kernel module under "updates" directory

This patch fixes a bug where "modprobe openvswitch" command on Ubuntu
distribution would have sometimes tried to load OVS kernel module that
shipped together with Linux Kernel, even though one had also installed
OVS datapath debian package created with module-assistant.  Because of
this issue force-reload-kmod command occasionally malfunctioned and
failed to load the right kernel module.

This bug happened *occasionally* because the default Ubuntu depmod
configuration in /etc/depmod.d/ubuntu.conf is set to look for kernel
modules first in "updates" directory, then in "ubuntu" directory and
then in other directories.  If there were two openvswitch.ko modules
in "other directories", then modprobe would have loaded whichever
kernel module the file system happened to list first.

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Tue, 26 May 2015 06:20:23 +0000 (23:20 -0700)]
datapath-windows: Make PacketIO.c compilable with WDK8.

There's some code in PacketIO.c that is supported in WDK 8.1 only.
The variable declarations for that code must also be WDK 8.1 only.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Wed, 20 May 2015 20:46:01 +0000 (13:46 -0700)]
odp-util: always output recirc_id in hex

The match is in hex, this makes it more consistent.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Andy Zhou [Wed, 20 May 2015 20:14:29 +0000 (13:14 -0700)]
Revert "ovs-ofctl: Always prints recirc_id in decimal"

As there is the potential for this field to be maskable in future, and
the dpctl "-m" output prints a mask for it, return it to hexadecimal.
The next patch will make this consistent to the recirc action by making
the action print the recirc_id in hex as well.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Joe Stringer [Fri, 22 May 2015 17:24:34 +0000 (10:24 -0700)]
dpctl: Don't print UFID if not present.

With verbose dpctl, if userspace runs against an older kernel, every
entry will have "ufid:<empty>" at the beginning. This is unnecessary and
introduces an additional format for scripts to parse. Drop it.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Tue, 19 May 2015 21:20:31 +0000 (14:20 -0700)]
extract-ofp-fields: Port to python3.

Mostly "print foo" -> "print(foo)" and "iteritems() -> items()". The
latter may be less efficient in python2, but we're not dealing with
massive numbers of items here so it shouldn't noticeably slow the build.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Joe Stringer [Tue, 19 May 2015 21:18:58 +0000 (14:18 -0700)]
extract-ofp-fields: Fix most pep8 style issues.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Ethan Jackson [Sun, 17 May 2015 13:06:01 +0000 (06:06 -0700)]
sparse: Fix sparse when compiling DPDK.

Sparse doesn't like several of the DPDK header files.  This patch
works around it so we can get analysis when compiling DPDK.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Daniele Di Proietto [Fri, 22 May 2015 16:14:22 +0000 (17:14 +0100)]
netdev-dpdk: Adapt the requested number of tx and rx queues.

This commit changes the semantics of 'netdev_set_multiq()' to allow OVS
DPDK to run on devices with limited multi-queue support.

* If a netdev doesn't have the requested number of rxqs it can simply
  inform the datapath without failing.
* If a netdev doesn't have the requested number of txqs it should try
  to create as many as possible and use locking.
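
The two rules above can be sketched as a small clamping function. This is
a hypothetical helper for illustration (adapt_multiq and struct
multiq_result are not OVS names), not the actual netdev_set_multiq()
implementation:

```c
#include <assert.h>
#include <stdbool.h>

struct multiq_result {
    unsigned int n_rxq;     /* rx queues actually configured. */
    unsigned int n_txq;     /* tx queues actually configured. */
    bool txq_needs_lock;    /* True if threads must share tx queues. */
};

static unsigned int
min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Adapts the requested queue counts to what the device offers instead of
 * failing.  'max_rxq' and 'max_txq' are the device limits. */
static struct multiq_result
adapt_multiq(unsigned int want_rxq, unsigned int want_txq,
             unsigned int max_rxq, unsigned int max_txq)
{
    struct multiq_result r;

    assert(max_rxq > 0 && max_txq > 0);
    r.n_rxq = min_uint(want_rxq, max_rxq);
    r.n_txq = min_uint(want_txq, max_txq);
    /* Fewer tx queues than requested means senders share a queue, so
     * transmission has to be serialized with a lock. */
    r.txq_needs_lock = r.n_txq < want_txq;
    return r;
}
```

The caller would report 'n_rxq' back to the datapath and take a per-queue
lock on transmit only when 'txq_needs_lock' is set.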

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Daniele Di Proietto [Fri, 22 May 2015 16:14:21 +0000 (17:14 +0100)]
netdev-dpdk: Use specific spinlock for stats.

Right now ethernet and ring devices use a mutex, while vhost devices use
a mutex or a spinlock to protect statistics.  This commit introduces a
single spinlock that's always used for stats updates.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Daniele Di Proietto [Fri, 22 May 2015 16:14:20 +0000 (17:14 +0100)]
netdev-dpdk: Properly support non pmd threads.

We used to reserve DPDK lcore 0 for non pmd operations, making it
difficult to use core 0 for packet processing.
DPDK 2.0 properly supports non EAL threads with lcore LCORE_ID_ANY.

Using non EAL threads for non pmd threads, we do not need to reserve
any core for non pmd operations.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Daniele Di Proietto [Fri, 22 May 2015 16:14:19 +0000 (17:14 +0100)]
ovs-numa: Change 'core_id' to unsigned.

DPDK lcore_id is unsigned.  We need to support big values like
LCORE_ID_ANY (=UINT32_MAX).  Therefore I am changing the type everywhere
in OVS.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Jarno Rajahalme [Fri, 22 May 2015 18:22:40 +0000 (11:22 -0700)]
datapath: Support masked set actions.

OVS kernel module support for masked set actions is already upstream
in Linux (commit 83d2b9ba1abca241df44a502b6da950a25856b5b).  This
patch adds the same for the OVS tree kernel module.

The existing set action sets many fields at once.  When only a subset
of the IP header fields, for example, should be modified, all the IP
fields need to be exact matched so that the other field values can be
copied to the set action.  A masked set action allows modification of
an arbitrary subset of the supported header bits without requiring the
rest to be matched.

Masked set action is now supported for all writeable key types, except
for the tunnel key.  The set tunnel action is an exception as any
input tunnel info is cleared before action processing starts, so there
is no tunnel info to mask.

The kernel module converts all (non-tunnel) set actions to masked set
actions.  This makes action processing more uniform, and results in
less branching and less duplication of the action processing code.  When
returning actions to userspace, the conversion is inverted.  We use a
kernel internal action code to be able to tell the userspace provided
and converted masked set actions apart.
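
The core of a masked set is a per-byte merge of the old and new values.
The sketch below shows that operation on a raw byte range; masked_set()
is a hypothetical helper written for this illustration, not the kernel
module's code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Applies a masked set to 'n' bytes of header data: bits that are 1 in
 * 'mask' are taken from 'value', all other bits keep their old value. */
static void
masked_set(uint8_t *field, const uint8_t *value, const uint8_t *mask,
           size_t n)
{
    size_t i;

    assert(field && value && mask);
    for (i = 0; i < n; i++) {
        field[i] = (uint8_t) ((field[i] & ~mask[i]) | (value[i] & mask[i]));
    }
}
```

A plain (non-masked) set is the special case where every mask byte is
0xff, which is what makes it possible to convert all non-tunnel set
actions to the masked form.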

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Daniele Di Proietto [Wed, 22 Apr 2015 18:22:52 +0000 (19:22 +0100)]
dpif-netdev: Reset RSS hash when recirculating.

Having the same RSS hash after recirculation can cause unnecessary
collisions in the exact match cache.  A simple solution is to rehash it
with the recirculation depth if it is non-zero.
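
A minimal sketch of the idea. The mixing step here (XOR with the depth
times an odd constant) is a simple stand-in chosen for illustration, and
rehash_for_recirc() is a hypothetical name; it is not the hash function
that OVS uses:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 'rss_hash' unchanged for the first pass (depth 0) and a
 * depth-dependent variant of it for recirculated packets. */
static uint32_t
rehash_for_recirc(uint32_t rss_hash, uint32_t depth)
{
    if (depth == 0) {
        return rss_hash;
    }
    /* Multiplication by an odd constant is invertible modulo 2^32, so
     * each nonzero depth XORs in a distinct nonzero value. */
    return rss_hash ^ (depth * UINT32_C(0x9e3779b1));
}
```

The same packet therefore lands in a different exact match cache slot on
each pass, instead of competing with its own earlier lookup.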

Suggested-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Ethan Jackson [Wed, 20 May 2015 23:55:17 +0000 (16:55 -0700)]
dpif-netdev: Clear flow batches before execute.

When executing actions, it's possible a recirculation will occur
causing dp_netdev_input() to be called multiple times.  If the batch
pointers embedded in dp_netdev_flow aren't cleared, it's possible
packets after the recirculation will be reinserted into a batch
associated with the original lookup.  This could be very bad.

This patch fixes the problem by zeroing out flow batch pointers before
calling packet_batch_execute().  This probably has a slightly negative
performance impact, though I haven't tried it.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Kevin Traynor [Thu, 21 May 2015 16:26:48 +0000 (17:26 +0100)]
netdev-dpdk: Use default NIC configuration.

This patch simplifies Rx/Tx NIC configuration by removing
custom values and using the defaults provided by the DPDK
PMDs. This also enables Rx vectorisation which improves
performance.

Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Ciara Loftus [Wed, 13 May 2015 13:54:56 +0000 (14:54 +0100)]
dpif-netdev: Increase the number of EMC entries

Prior to this commit, the number of possible entries in the Exact
Match Cache stood at 1024 per thread, amounting to 0.18 MB. A typical
server system will have 2.5 MB of cache per core, meaning a larger EMC will
comfortably fit in. This patch increases the number of entries to 8192
per thread (1.4 MB), which in turn yields improved throughput when
processing multiple flows of traffic.
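
The figures above can be checked with a little arithmetic. The entry size
used below (184 bytes) is an assumption back-computed from the numbers in
this message (0.18 MB / 1024 entries); it is not taken from the source
code, and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed size of one cache entry, derived from this commit message
 * rather than from the struct definition. */
#define ASSUMED_EMC_ENTRY_BYTES 184

/* Memory footprint of a cache with 'n_entries' entries, in bytes. */
static size_t
emc_footprint_bytes(size_t n_entries)
{
    return n_entries * ASSUMED_EMC_ENTRY_BYTES;
}

/* True if a cache with 'n_entries' entries fits in 'cache_bytes'. */
static bool
emc_fits(size_t n_entries, size_t cache_bytes)
{
    assert(cache_bytes > 0);
    return emc_footprint_bytes(n_entries) <= cache_bytes;
}
```

Under that assumption 1024 entries take 188416 bytes (about 0.18 MB) and
8192 entries take 1507328 bytes (about 1.4 MB), which is still below a
2.5 MB per-core cache.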

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Thu, 21 May 2015 01:46:00 +0000 (18:46 -0700)]
AUTHORS: Add Dan McGregor.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Dan McGregor [Tue, 19 May 2015 19:24:26 +0000 (12:24 -0700)]
netdev-bsd: Include net/bpf.h.

The documentation says it is required to use bpf ioctls on both
NetBSD and FreeBSD. Omitting it causes a compile-time failure on
FreeBSD 10.

Signed-off-by: Dan McGregor <dan.mcgregor@usask.ca>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Sat, 16 May 2015 15:18:20 +0000 (08:18 -0700)]
dpdk: Ditch MAX_PKT_BURST macro.

The MAX_PKT_BURST and NETDEV_MAX_RX_BATCH macros had a confusing
relationship.  They basically purport to do the same thing, making it
unclear which is the source of truth.

Furthermore, while NETDEV_MAX_RX_BATCH was 256, MAX_PKT_BURST was 32,
meaning we never process a batch larger than 32 packets, further adding
to the confusion.

This patch resolves the issue by removing MAX_PKT_BURST completely,
and shrinking the new NETDEV_MAX_BURST macro to only 32.  This should
have no change in the execution path except shrinking a couple of
structs and memory allocations (can't hurt).

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Ethan Jackson [Mon, 18 May 2015 15:49:24 +0000 (08:49 -0700)]
netdev-dpdk: Fix sparse warnings.

These are all minor style issues.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Andy Zhou [Tue, 19 May 2015 01:10:29 +0000 (18:10 -0700)]
ovs-ofctl: Always prints recirc_id in decimal

The output of the 'ovs-ofctl dump-flows' command prints recirc_id in
decimal in the action parts of the output, but prints it in hex in the
matching parts of the same output.

This patch fixes the inconsistency by always printing recirc_id
values in decimal.

Reported-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Mon, 4 May 2015 16:44:49 +0000 (16:44 +0000)]
datapath-windows: Fix warning from the powershell module

This patch fixes the warning when datapath-windows/misc/OVS.psm1 is
imported.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-by: Hemanth Kumar Mantri <mantri@nutanix.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/69
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Daniele Di Proietto [Mon, 18 May 2015 17:47:51 +0000 (10:47 -0700)]
dpif-netdev: Share emc and fast path output batches.

Until now the exact match cache processing was able to handle only four
megaflows.  The rest of the packets were passed to the megaflow
classifier.

The limit was arbitrarily set to four also because the algorithm used to
group packets in output batches didn't perform well with a lot of
megaflows.

After changing the algorithm and after some performance testing it seems
much better just to share the same output batches between the exact
match cache and the megaflow classifier.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Daniele Di Proietto [Mon, 18 May 2015 17:47:50 +0000 (10:47 -0700)]
dpif-netdev: Store batch pointer in dp_netdev_flow.

The userspace datapath

1. receives a batch of packets.
2. finds a 'netdev_flow' (megaflow) for each packet.
3. groups the packets in output batches based on the 'netdev_flow'.

Until now the grouping (3) was done using a simple algorithm with a
O(N^2) runtime, where N is the number of distinct megaflows of the packets
in the incoming batch.  This could quickly become a bottleneck, even with
a small number of megaflows.

With this commit the datapath simply stores in the 'netdev_flow' (the
megaflow) a pointer to the output batch, if one has been created for the
current input batch.  The pointer will be cleared when the output batch
is sent.

In a simple phy2phy test with 128 megaflows the throughput is more than
doubled.
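
The pointer-in-flow grouping can be sketched as follows. The structs and
function names are simplified, hypothetical stand-ins for
'dp_netdev_flow' and its batches, and packets are represented by their
index only:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BURST 32

struct flow;

struct out_batch {
    struct flow *flow;       /* Flow that all packets here matched. */
    int pkts[MAX_BURST];     /* Indexes of the packets in this batch. */
    size_t n_pkts;
};

struct flow {
    struct out_batch *batch; /* Non-NULL while a batch is open. */
};

/* Groups 'n' packets by flow in a single pass.  'flows[i]' is the flow
 * matched by packet i.  Returns the number of batches filled in. */
static size_t
group_packets(struct flow *flows[], size_t n, struct out_batch batches[])
{
    size_t n_batches = 0;
    size_t i;

    assert(n <= MAX_BURST);
    for (i = 0; i < n; i++) {
        struct flow *f = flows[i];

        if (!f->batch) {
            /* First packet for this flow: open a new batch. */
            struct out_batch *b = &batches[n_batches++];

            b->flow = f;
            b->n_pkts = 0;
            f->batch = b;
        }
        f->batch->pkts[f->batch->n_pkts++] = (int) i;
    }
    return n_batches;
}

/* Clears the per-flow pointers once the batches have been sent, so the
 * next input batch starts from a clean state. */
static void
finish_batches(struct out_batch batches[], size_t n_batches)
{
    size_t i;

    for (i = 0; i < n_batches; i++) {
        batches[i].flow->batch = NULL;
    }
}
```

Each packet costs one pointer check instead of a search through the
batches opened so far, which is where the quadratic cost came from.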

The reason that stopped us from doing this change was that the
'netdev_flow' memory was shared between multiple threads: this is no
longer the case with the per-thread classifier.

Also, this commit reorders struct dp_netdev_flow to group together the
members used in the fastpath.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Daniele Di Proietto [Mon, 18 May 2015 17:47:49 +0000 (10:47 -0700)]
dpif-netdev: Store pkt_metadata structure in dp_netdev_port.

Initializing a struct pkt_metadata for every packet can be surprisingly
expensive.  It's much faster to keep a copy for each port and copy it
for each packet.

Suggested-by: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Daniele Di Proietto [Mon, 18 May 2015 17:47:48 +0000 (10:47 -0700)]
dp-packet: Style fixes.

Also removes an unused function.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Daniele Di Proietto [Mon, 18 May 2015 17:47:47 +0000 (10:47 -0700)]
dp-packet: Merge 'allocated' member with DPDK mbuf 'buf_len'.

DPDK buf_len is only 16-bit wide ('allocated' was 32-bit), but it should
be enough to store the number of allocated bytes.

This will reduce 'struct dp_packet' size.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Daniele Di Proietto [Mon, 18 May 2015 17:47:46 +0000 (10:47 -0700)]
dp-packet: Remove 'frame' member.

In 'struct ofpbuf' the 'frame' pointer was used to parse different kinds of
data (Ethernet, OpenFlow, Netlink attributes).  For Ethernet packets the
'frame' pointer was supposed to have the same value as the 'data'
pointer.

Since 'struct dp_packet' is only used for Ethernet packets, there's no
need for a separate 'frame' pointer: we can use the 'data' pointer
instead.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodp-packet: Remove 'list' member.
Daniele Di Proietto [Mon, 18 May 2015 17:47:45 +0000 (10:47 -0700)]
dp-packet: Remove 'list' member.

The 'list' member is only used (two users) in the slow path.
This commit removes it to reduce the struct size.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agoofproto: Fix memory leak in flow deletion.
Jarno Rajahalme [Mon, 18 May 2015 17:24:02 +0000 (10:24 -0700)]
ofproto: Fix memory leak in flow deletion.

Fix a memory leak that was introduced in commit 834fe5cb997b (ofproto:
Additional simplifications.).  We used to unref the flow
asynchronously, but forgot to do it when the support for asynchronous
operations was removed.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
8 years agodatapath: Fix Sparse warning.
Pravin B Shelar [Fri, 15 May 2015 13:32:32 +0000 (06:32 -0700)]
datapath: Fix Sparse warning.

CHECK   /home/pravin/ovs/w8/datapath/linux/flow_table.c
/home/pravin/ovs/w8/datapath/linux/flow_table.c:536:6: warning: symbol
'ovs_flow_cmp_unmasked_key' was not declared. Should it be static?

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
8 years agodatapath: backport kfree_skb_list()
Pravin B Shelar [Fri, 15 May 2015 13:27:35 +0000 (06:27 -0700)]
datapath: backport kfree_skb_list()

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
8 years agoINSTALL.DPDK: Notes on running ovs-vswitchd/dpdk inside a VM
Oleg Strikov [Fri, 8 May 2015 19:05:13 +0000 (12:05 -0700)]
INSTALL.DPDK: Notes on running ovs-vswitchd/dpdk inside a VM

Additional configuration is required if you want to run ovs-vswitchd
with the DPDK backend inside a QEMU virtual machine. This happens because,
by default, the virtio NIC provided to the guest doesn't support multiple
TX queues, which are required by ovs-vswitchd/dpdk. This commit updates
INSTALL.DPDK.md to provide guidelines on how to enable support for
multiple TX queues using QEMU command line and Libvirt config file.

Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetdev-dpdk: Add vhost enqueue retries.
Kevin Traynor [Tue, 12 May 2015 04:58:14 +0000 (21:58 -0700)]
netdev-dpdk: Add vhost enqueue retries.

The max allowed burst size for a single vhost enqueue is 32.
This code facilitates sending more packets than the burst
size to the vhost interface by adding a retry loop
that calls vhost enqueue multiple times. As this could
potentially block, a timeout is added.
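
The retry loop can be sketched as follows. This is only an illustration:
`struct fake_vhost`, `fake_enqueue()` and `send_with_retries()` are
hypothetical names, and a budget of consecutive idle retries stands in for
the wall-clock timeout mentioned above.

```c
#include <stddef.h>

#define MAX_BURST 32            /* Max packets per single enqueue call. */

/* Hypothetical receiver with a limited number of free ring slots. */
struct fake_vhost {
    size_t free_slots;
};

/* Hypothetical enqueue: accepts at most 'n' packets, returns how many
 * were actually taken (0 if the ring is full). */
static size_t
fake_enqueue(struct fake_vhost *dev, size_t n)
{
    size_t taken = n < dev->free_slots ? n : dev->free_slots;

    dev->free_slots -= taken;
    return taken;
}

/* Sends 'cnt' packets in bursts of at most MAX_BURST.  Gives up after more
 * than 'max_idle' consecutive attempts without progress and returns the
 * number of packets sent; the caller would drop the remainder. */
static size_t
send_with_retries(struct fake_vhost *dev, size_t cnt, unsigned int max_idle)
{
    size_t sent = 0;
    unsigned int idle = 0;

    while (sent < cnt && idle <= max_idle) {
        size_t left = cnt - sent;
        size_t burst = left < MAX_BURST ? left : MAX_BURST;
        size_t n = fake_enqueue(dev, burst);

        if (n) {
            sent += n;
            idle = 0;           /* Progress: reset the retry budget. */
        } else {
            idle++;
        }
    }
    return sent;
}
```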

Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetdev-dpdk: Change phy rx burst size.
Kevin Traynor [Tue, 12 May 2015 04:58:12 +0000 (21:58 -0700)]
netdev-dpdk: Change phy rx burst size.

Change phy rx burst size from 192 to 32. This aligns the
burst size with the other dpdk interfaces and significantly
improves performance when forwarding to dpdk vhost ports.

Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoutilities: Add new pipeline generator script.
Ethan Jackson [Thu, 26 Mar 2015 19:52:42 +0000 (12:52 -0700)]
utilities: Add new pipeline generator script.

When doing OVS performance testing, it's important to have both
realistic traffic traces and OpenFlow pipelines on which to evaluate
prospective changes.  As a first step in this direction, this patch
adds a python script which generates an OpenFlow pipeline intended to
simulate typical network virtualization workloads.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
9 years agoofp-util: Use OFPGMFC_OUT_OF_BUCKETS for indirect groups with !=1 buckets.
Ben Pfaff [Fri, 8 May 2015 16:15:43 +0000 (09:15 -0700)]
ofp-util: Use OFPGMFC_OUT_OF_BUCKETS for indirect groups with !=1 buckets.

OpenFlow 1.3 says:

    If a switch cannot add the incoming group entry due to restrictions
    (hardware or otherwise) limiting the number of group buckets, it must
    refuse to add the group entry and must send an ofp_error_msg with
    OFPET_GROUP_MOD_FAILED type and OFPGMFC_OUT_OF_BUCKETS code.

This indicates that OFPGMFC_OUT_OF_BUCKETS is appropriate for an indirect
group with the wrong number of buckets, but OVS was using a different
error.  This fixes the problem.
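
A minimal sketch of the check, using hypothetical enum names in place of the
OpenFlow group types and error codes:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the OpenFlow group types. */
enum group_type { GT_ALL, GT_SELECT, GT_INDIRECT, GT_FF };

/* Hypothetical stand-ins for the group-mod result codes. */
enum group_error { GROUP_OK, GROUP_OUT_OF_BUCKETS };

/* An indirect group must have exactly one bucket; any other count is
 * reported as "out of buckets". */
static enum group_error
check_group_buckets(enum group_type type, size_t n_buckets)
{
    if (type == GT_INDIRECT && n_buckets != 1) {
        return GROUP_OUT_OF_BUCKETS;
    }
    return GROUP_OK;
}
```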

ONF-JIRA: EXT-546
Reported-by: Mrinmoy Das <mrdas@ixiacom.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agodatapath: Add support for 4.0 kernel.
Joe Stringer [Fri, 10 Apr 2015 01:40:51 +0000 (18:40 -0700)]
datapath: Add support for 4.0 kernel.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agotravis: Fix clang build for DPDK-2.0.
Joe Stringer [Thu, 7 May 2015 20:36:43 +0000 (13:36 -0700)]
travis: Fix clang build for DPDK-2.0.

-Wno-cast-align is a CFLAG, not a configure option.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
9 years agodatapath-windows: Correctly link newly allocated NBL
Sorin Vinturis [Fri, 8 May 2015 06:17:43 +0000 (06:17 +0000)]
datapath-windows: Correctly link newly allocated NBL

The OvsPartialCopyToMultipleNBLs function failed to correctly link the newly
created NBL with a single NB to the multiple NBLs list.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: Added new function for native forwarded traffic
Sorin Vinturis [Fri, 8 May 2015 06:16:51 +0000 (06:16 +0000)]
datapath-windows: Added new function for native forwarded traffic

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: define compat __skb_gso_segment()
Pravin B Shelar [Thu, 7 May 2015 17:17:26 +0000 (10:17 -0700)]
datapath: define compat __skb_gso_segment()

OVS defines skb_gso_segment() to handle MPLS and VLAN
segmentation correctly. But OVS also uses __skb_gso_segment() in
some cases. This patch defines a compat __skb_gso_segment()
to handle all segmentation cases.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agodpctl: Ignore enumeration errors if there is at least one datapath.
Daniele Di Proietto [Wed, 6 May 2015 18:00:26 +0000 (19:00 +0100)]
dpctl: Ignore enumeration errors if there is at least one datapath.

When dpctl commands are used to inspect a userspace datapath, but OVS
also has built-in support for the kernel datapath, an error message is
reported if the kernel module is not loaded.  This commit suppresses the
message.

Suggested-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodpctl: Factor out common code to iterate through all dpifs.
Daniele Di Proietto [Wed, 6 May 2015 18:00:25 +0000 (19:00 +0100)]
dpctl: Factor out common code to iterate through all dpifs.

This commit introduces dps_for_each() which calls a callback for each
datapath of each registered type.
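
The shape of such an iterator can be sketched as below. The registry layout,
the callback signature and the names `for_each_dp()` and `count_dp()` are
assumptions made for illustration, not the actual dps_for_each() interface;
in particular, continuing after a failed callback is a choice of this sketch.

```c
#include <stddef.h>

/* Hypothetical registry entry: a datapath type and its datapaths. */
struct dp_type {
    const char *type;
    const char *const *names;
    size_t n_names;
};

/* Hypothetical callback: returns 0 on success, an errno value otherwise. */
typedef int dp_callback(const char *type, const char *name, void *aux);

/* Calls 'cb' for each datapath of each type.  Keeps iterating after a
 * failure and returns the last nonzero error, or 0 if every call worked. */
static int
for_each_dp(const struct dp_type *types, size_t n_types,
            dp_callback *cb, void *aux)
{
    int error = 0;

    for (size_t i = 0; i < n_types; i++) {
        for (size_t j = 0; j < types[i].n_names; j++) {
            int e = cb(types[i].type, types[i].names[j], aux);

            if (e) {
                error = e;
            }
        }
    }
    return error;
}

/* Example callback: counts the datapaths it is shown. */
static int
count_dp(const char *type, const char *name, void *aux)
{
    (void) type;
    (void) name;
    ++*(size_t *) aux;
    return 0;
}
```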

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agolldp: Fix clang warning.
Joe Stringer [Wed, 6 May 2015 21:31:55 +0000 (14:31 -0700)]
lldp: Fix clang warning.

Clang-3.7 generates warnings such as the following:
../lib/ovs-lldp.c:394:19: error: address of array 'hardware->h_ifname'
will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]

This value is fetched from a netdev, which as far as I can tell must
always have a non-NULL name. Simplify this code.
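
For illustration, the pattern that triggers the warning looks like the
comment below; `struct hw` is a hypothetical, cut-down structure. If an
"is the name set?" test is actually wanted, checking the first character is
one option, although this commit simply drops the check.

```c
/* Hypothetical, cut-down structure with an embedded name array. */
struct hw {
    char h_ifname[16];
};

/* The warning fires on code such as:
 *
 *     if (hw->h_ifname) { ... }
 *
 * because 'h_ifname' is an array member, so its address can never be NULL
 * and the condition is always true. */

/* Tests whether the embedded name is non-empty. */
static int
hw_has_ifname(const struct hw *hw)
{
    return hw->h_ifname[0] != '\0';
}
```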

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Dennis Flynn <drflynn@avaya.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodpctl: Add OVS_PRINTF_FORMAT annotation to dpctl_* functions.
Daniele Di Proietto [Wed, 6 May 2015 18:00:24 +0000 (19:00 +0100)]
dpctl: Add OVS_PRINTF_FORMAT annotation to dpctl_* functions.

Fixes passing variable data as a printf() format string.
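
A sketch of what such an annotation buys, assuming GCC or Clang.
`PRINTF_FORMAT` and `report()` are hypothetical stand-ins for
OVS_PRINTF_FORMAT and the dpctl_* functions:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for OVS_PRINTF_FORMAT: lets the compiler check the
 * arguments against the format string, as it does for printf(). */
#if defined(__GNUC__)
#define PRINTF_FORMAT(FMT, ARG1) __attribute__((__format__(printf, FMT, ARG1)))
#else
#define PRINTF_FORMAT(FMT, ARG1)
#endif

static int report(char *buf, size_t size, const char *format, ...)
    PRINTF_FORMAT(3, 4);

/* Formats a message into 'buf'; returns what vsnprintf() returns. */
static int
report(char *buf, size_t size, const char *format, ...)
{
    va_list args;
    int n;

    va_start(args, format);
    n = vsnprintf(buf, size, format, args);
    va_end(args);
    return n;
}

/* Unsafe: report(buf, size, user_string);        '%' gets interpreted.
 * Safe:   report(buf, size, "%s", user_string);  data stays data. */
```

With the annotation in place, compilers that support it can flag the unsafe
call under warnings such as -Wformat-security.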

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoAUTHORS: Add Billy O'Mahony.
Ben Pfaff [Thu, 7 May 2015 17:54:23 +0000 (10:54 -0700)]
AUTHORS: Add Billy O'Mahony.

Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodocs: Clarify creation & bonding of DPDK enabled interfaces.
Billy O'Mahony [Tue, 5 May 2015 16:37:31 +0000 (17:37 +0100)]
docs: Clarify creation & bonding of DPDK enabled interfaces.

Unlike system interfaces, DPDK enabled interfaces must have their interface
type explicitly set when used to create ports.  Mention this in relevant parts
of the documentation and add references to INSTALL.DPDK.md, where there are many
examples.

Signed-off-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoFAQ: Explain how "tap" devices work and why you should not use them.
Ben Pfaff [Wed, 6 May 2015 00:24:50 +0000 (17:24 -0700)]
FAQ: Explain how "tap" devices work and why you should not use them.

CC: 张伟 <zhangwqh@126.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
9 years agodatapath: stt compatibility for RHEL7
Pravin B Shelar [Sun, 3 May 2015 18:56:54 +0000 (11:56 -0700)]
datapath: stt compatibility for RHEL7

RHEL7 backported nf_hookfn from newer kernel. Handle compatibility
by checking nf_hookfn declaration.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agoodp-util: Fix a bug in parse_flag().
Alex Wang [Sat, 2 May 2015 03:54:00 +0000 (20:54 -0700)]
odp-util: Fix a bug in parse_flag().

This commit fixes a bug in the parse_flag() function which causes
failure of parsing tunnel flags like:

tunnel(tun_id=0x0,src=1.2.3.4,dst=1.2.3.5,tos=0,ttl=64,flags(-df+csum+key))

Reported-by: Jacob Cherkas <jcherkas@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoxenserver: Use kernel uname version for XenServer 6.5
Edwin Chiu [Tue, 28 Apr 2015 22:15:54 +0000 (15:15 -0700)]
xenserver: Use kernel uname version for XenServer 6.5

In XenServer 6.5, multiple kernel packages with different
rpm versions can have the same uname.  So, it is not
necessary for openvswitch kernel module to require the
exact rpm version.  Instead, the kernel module package
should check the uname version.

This commit will add a new variable %{kernel_uname} to
specify whether to use kernel uname version or kernel
rpm version as requirement.

When %{kernel_uname} is used, openvswitch-module will have
"Requires: kernel-uname-r = <uname version>" set instead of
"Requires: kernel = <version>".

Reported-by: Gosen Chien <astgosen@ccu.edu.tw>
Signed-off-by: Edwin Chiu <echiu@vmware.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
9 years agodatapath: gre: Reset fix_segment pointer.
Pravin B Shelar [Sat, 2 May 2015 00:30:44 +0000 (17:30 -0700)]
datapath: gre: Reset fix_segment pointer.

For kernel versions 3.12 to 3.18, GRE uses compat code to
transmit packets, which uses fix_segment to segment packets,
but ovs_gso_cb->fix_segment is not initialized for GRE tunnels.
This patch fixes it by resetting fix_segment.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agodpctl: cleaner dpctl output for tunnel ports.
Pravin B Shelar [Fri, 1 May 2015 18:02:02 +0000 (11:02 -0700)]
dpctl: cleaner dpctl output for tunnel ports.

Currently dont-fragment and TTL are initialized to zero, but
those are not default config for tunnel ports.  dpctl
does not show default config of a port.  So by setting these
values to default we can get cleaner `dpctl show` output.

% ovs-dpctl show
system@ovs-system:
port 0: ovs-system (internal)
port 1: br0 (internal)
port 4: gre_sys (gre: df_default=false, ttl=0)

% ovs-dpctl show # After initializing default values.
system@ovs-system:
port 0: ovs-system (internal)
port 1: br0 (internal)
port 4: gre_sys (gre)

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agoDPDK: add support for v2.0.0
Mark Kavanagh [Mon, 20 Apr 2015 19:37:14 +0000 (12:37 -0700)]
DPDK: add support for v2.0.0

Update relevant artifacts to add support for DPDK v2.0.0
 - INSTALL.DPDK.md
 - travis build script
 - acinclude.m4: add 'mssse3' flag to OVS_CFLAGS
 - netdev-dpdk: fix build with unified offload types in DPDK v2.0.0

Note that this breaks compatibility with DPDK v1.8.0

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agogitignore: Add file to .gitignore
Alin Serdean [Tue, 28 Apr 2015 22:36:29 +0000 (22:36 +0000)]
gitignore: Add file to .gitignore

Add testsuite.tmp.orig to .gitignore

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agodatapath: Tidy up duplicate symbol detection.
Joe Stringer [Wed, 29 Apr 2015 20:33:25 +0000 (13:33 -0700)]
datapath: Tidy up duplicate symbol detection.

Don't print each symbol that is iterated.
Make the error message more clear by prefixing "error: ".

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agotest-ovsdb: Fix conditional statement.
Alex Wang [Wed, 29 Apr 2015 17:41:39 +0000 (10:41 -0700)]
test-ovsdb: Fix conditional statement.

Old versions of Python do not support the following conditional
statement syntax in one assignment:

   var = value1 if cond else value2

This commit fixes it by converting it back to two assignments.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Russell Bryant <rbryant@redhat.com>
9 years agodatapath: Add Stateless TCP Tunneling protocol.
Pravin B Shelar [Fri, 10 Apr 2015 03:12:32 +0000 (20:12 -0700)]
datapath: Add Stateless TCP Tunneling protocol.

The Stateless TCP Tunnel (STT) protocol encapsulates traffic in
IPv4/TCP packets.
STT uses TCP segmentation offload available in most NICs. On
packet xmit, the STT driver appends an STT header along with a TCP header
to the packet. For GSO packets, GSO parameters are set according
to the tunnel configuration and the packet is handed over to the networking
stack. This allows use of the segmentation offload available in NICs.

The protocol is documented at
http://www.ietf.org/archive/id/draft-davie-stt-06.txt

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agoovs-hyperv: make kernel return values netlink socket like
Nithin Raju [Tue, 28 Apr 2015 21:35:37 +0000 (14:35 -0700)]
ovs-hyperv: make kernel return values netlink socket like

In this patch, we make changes to userspace as well as
the kernel datapath on hyperv to make it more netlink socket
like. Previously, the kernel datapath did not distinguish
between "transport errors" and other errors. Netlink
semantics dictate that netlink functions should
return an error only in the case of a "transport error",
which is generally something fatal, e.g. failure to
communicate with the OVS module, or an invalid command
altogether. Other errors, such as an unsupported action
or an invalid flow key, are not considered "transport
errors", and in such cases netlink functions are to return
success with a 'struct nlmsgerr' populated in the output
buffer.

This patch implements these semantics.
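
The distinction can be sketched as follows. Everything here is a simplified
assumption: `struct reply` stands in for a netlink header followed by a
'struct nlmsgerr', and `transact()` and its parameters are hypothetical.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical, simplified reply.  As in 'struct nlmsgerr', 'error' is 0 on
 * success or a negative errno value. */
struct reply {
    int error;
};

enum cmd { CMD_FLOW_PUT, CMD_UNKNOWN };

/* Returns nonzero only for "transport errors".  Command-level failures are
 * reported through 'reply' while the call itself succeeds. */
static int
transact(int module_reachable, enum cmd cmd, int flow_key_valid,
         struct reply *reply)
{
    memset(reply, 0, sizeof *reply);

    if (!module_reachable) {
        return ENODEV;              /* Cannot talk to the OVS module. */
    }
    if (cmd != CMD_FLOW_PUT) {
        return EINVAL;              /* Invalid command altogether. */
    }
    if (!flow_key_valid) {
        reply->error = -EINVAL;     /* Not a transport error. */
    }
    return 0;
}
```

A caller checks the return value first and only then inspects
`reply.error`.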

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/72
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: Enable extension after restart
Sorin Vinturis [Wed, 29 Apr 2015 12:58:16 +0000 (12:58 +0000)]
datapath-windows: Enable extension after restart

The extension failed to be activated during booting due to the
failure to initialize the tunnel filter. This happened because the Base
Filtering Engine (BFE) was not started and no session to the engine
could be acquired.

The solution for this was to register a BFE notification callback
that is called whenever the BFE's state changes. The tunnel filter is
initialized only if the BFE's state is running.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Reported-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/77
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agodatapath-windows: Removed duplicate instance pid removal
Sorin Vinturis [Thu, 23 Apr 2015 20:37:02 +0000 (20:37 +0000)]
datapath-windows: Removed duplicate instance pid removal

Instance PID is already deleted in the OvsCleanupPacketQueue function.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agodatapath: Fix check-export-symbol for non-bash shells
YAMAMOTO Takashi [Mon, 27 Apr 2015 05:48:46 +0000 (14:48 +0900)]
datapath: Fix check-export-symbol for non-bash shells

Avoid using a bash construct (=~) in the target.

An alternative would be to make the configure script require
bash explicitly.  (Currently it doesn't and on NetBSD /bin/ksh
is likely used.)

The code in question was introduced by
commit b296b82a87326e68773b970284b8e012def0e3ba .
("datapath: Check the export of public functions in linux/compat/linux/.")

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agodatapath: Stop using __DATE__ and __TIME__ in startup string.
Jesse Gross [Mon, 27 Apr 2015 19:28:55 +0000 (12:28 -0700)]
datapath: Stop using __DATE__ and __TIME__ in startup string.

An increasing number of distributions ship with GCC 4.9 (including
Fedora and Ubuntu) that has -Werror=date-time. This causes kernel
compilation to fail because the builds are not exactly reproducible.

This simply removes the use of those constants, which was already
done for the upstream Linux version of the module. It retains the
version string, however, which should provide the same information
in most cases.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoAllow subclasses of Idl to define a notification hook
Terry Wilson [Sat, 25 Apr 2015 19:57:44 +0000 (14:57 -0500)]
Allow subclasses of Idl to define a notification hook

It is useful to make the notification events that Idl processes
accessible to users of the library. This will make it possible to
keep external systems in sync, but does not impose any particular
notification pattern.

The Row.from_json() call is added to be able to convert the 'old'
JSON response on an update to a Row object to make it easy for
users of notify() to see what changed, though this usage of Row
is quite different than Idl's typical use.

Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofp-parse: Correctly report error parsing selection method parameter.
Ben Pfaff [Sun, 26 Apr 2015 17:17:27 +0000 (10:17 -0700)]
ofp-parse: Correctly report error parsing selection method parameter.

Found by LLVM scan-build.

Reported-by: Kevin Lo <kevlo@FreeBSD.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Kevin Lo <kevlo@FreeBSD.org>
9 years agodatapath: Use kernel Geneve implementation on 4.0 and above.
Jesse Gross [Sun, 26 Apr 2015 00:00:11 +0000 (17:00 -0700)]
datapath: Use kernel Geneve implementation on 4.0 and above.

When Geneve was originally backported, it wasn't available as part
of a released kernel version but it is now, so we can take advantage
of the native implementation.

Note that Geneve was actually first available as part of the 3.18
kernel release but some drivers erroneously try to offload it as
if it were VXLAN, which was fixed in the 4.0 release. Since our
UDP tunnel compat layer already takes care of this, we continue
using the OVS Geneve implementation until 4.0.

Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
9 years agoudptunnel: Kernel 3.20 doesn't exist.
Jesse Gross [Sat, 25 Apr 2015 23:28:23 +0000 (16:28 -0700)]
udptunnel: Kernel 3.20 doesn't exist.

When the UDP tunnel compat code was written, it backported some
functions that were slated to be in the next kernel release, then
called 3.20. However, this was ultimately released as 4.0 instead.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
9 years agodatapath: Check the export of public functions in linux/compat/linux/.
Alex Wang [Mon, 20 Apr 2015 03:54:50 +0000 (20:54 -0700)]
datapath: Check the export of public functions in linux/compat/linux/.

This commit adds a check in datapath/Makefile to make sure that all public
functions and exported symbols in linux/compat/ are either rpl_ or ovs_
prefixed, except those defined in compat/build-aux/export-check-whitelist.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agodatapath: Prevent linker error of unknown symbol.
Alex Wang [Tue, 21 Apr 2015 01:19:53 +0000 (18:19 -0700)]
datapath: Prevent linker error of unknown symbol.

With the latest change of separating vports into their own modules,
it is necessary to export all public functions in linux/compat/
directory.  Also, we should prefix functions which replace the
upstream ones with 'rpl_' and others with 'ovs_'.  This will prevent
the linker error when vport modules use those functions in the future.
For example, the to-be-merged vport-stt module will use the flex_array_*
functions which are not currently exported.

Co-authored-by: Tuan Nguyen <tuan.nguyen@veriksystems.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agodatapath: Remove linux/compat/include/linux/log2.h.
Alex Wang [Tue, 21 Apr 2015 21:03:31 +0000 (14:03 -0700)]
datapath: Remove linux/compat/include/linux/log2.h.

This compat file is no longer needed; we can use the upstream version
of the function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agodatapath-windows: Removed gOvsCtrlLock global spinlock
Sorin Vinturis [Thu, 23 Apr 2015 20:27:53 +0000 (20:27 +0000)]
datapath-windows: Removed gOvsCtrlLock global spinlock

There is no need to use gOvsCtrlLock spinlock to guard the switch
context, as there is now the switch context's reference count used
for this purpose.

Now the gOvsCtrlLock spinlock guards only one shared resource, the
OVS_OPEN_INSTANCE global instance array.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agoRemove compiler warning
Alin Serdean [Thu, 23 Apr 2015 18:46:39 +0000 (18:46 +0000)]
Remove compiler warning

When linking executables on Windows, the argument -Qunused-arguments is
passed to the linker.
This results in the following warning:
Command line warning D9002 : ignoring unknown option '-Qunused-arguments'

This patch removes that warning.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agodatapath-windows: don't free switch cxt until ref == 0
Nithin Raju [Thu, 23 Apr 2015 00:10:10 +0000 (17:10 -0700)]
datapath-windows: don't free switch cxt until ref == 0

This is a hard to hit corner case, because currently we recommend that
all handles to the kernel datapath be closed before trying to unload the
OVS extension.
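
A generic sketch of the "free only when the count drops to zero" rule, using
C11 atomics. `struct switch_ctx` and the helper names are hypothetical and
do not mirror the Windows datapath code:

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted switch context. */
struct switch_ctx {
    atomic_uint ref;
};

/* Allocates a context holding one reference for the creator. */
static struct switch_ctx *
ctx_create(void)
{
    struct switch_ctx *ctx = malloc(sizeof *ctx);

    if (ctx) {
        atomic_init(&ctx->ref, 1);
    }
    return ctx;
}

/* Takes an additional reference, e.g. for an open handle. */
static void
ctx_ref(struct switch_ctx *ctx)
{
    atomic_fetch_add(&ctx->ref, 1);
}

/* Drops one reference.  Frees the context and returns 1 only when the last
 * reference goes away; otherwise returns 0 and leaves it allocated. */
static int
ctx_unref(struct switch_ctx *ctx)
{
    if (atomic_fetch_sub(&ctx->ref, 1) == 1) {
        free(ctx);
        return 1;
    }
    return 0;
}
```

With this rule, an unload that races with an open handle only drops its own
reference; the memory is released by whoever calls `ctx_unref()` last.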

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agotestsuite: Don't apply the testsuite.patch on non-Windows platforms.
Gurucharan Shetty [Thu, 23 Apr 2015 14:13:04 +0000 (07:13 -0700)]
testsuite: Don't apply the testsuite.patch on non-Windows platforms.

On CentOS machines which use autoconf version 2.63, the patch
application would fail.

Reported-by: Ian Stokes <ian.stokes@intel.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agonetdev-dpdk: Reset RSS hash on transmit
Mark D. Gray [Mon, 13 Apr 2015 13:36:56 +0000 (06:36 -0700)]
netdev-dpdk: Reset RSS hash on transmit

When using DPDK rings (dpdkr port type), packet buffers get shared
to consumers of the rings (e.g. Virtual Machines). The packet buffers
also include the RSS hash. This is a hash of a number of fields
in the packet and is used in order to do a fast lookup in the EMC.

However, if a consumer of the packet modifies the packet without
regenerating the RSS hash, the EMC will use the same hash for lookup
even though the packet may belong to a different flow. This would
cause unnecessary collisions in the EMC reducing performance in the
presence of multiple flows.

To avoid receiving an incorrect RSS hash on reception from a DPDK
ring, the RSS hash needs to be reset on transmission. This will reduce
performance of the forwarding path, as the RSS hash will need to be
calculated for every packet received from a dpdkr port, but will behave
correctly in the presence of a large number of flows that get
modified by the consumer of a DPDK ring.
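
The effect can be sketched with a hypothetical packet structure. Here an
explicit validity flag marks the cached hash as unusable, and a simple
FNV-1a hash over the packet bytes stands in for the real RSS hash over
selected header fields:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet with a cached hash. */
struct pkt {
    const uint8_t *data;
    size_t len;
    uint32_t rss_hash;
    int rss_valid;              /* Nonzero if 'rss_hash' can be trusted. */
};

/* FNV-1a, used here only as a stand-in for the real RSS hash. */
static uint32_t
hash_bytes(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Returns the cached hash, computing it first if it was invalidated. */
static uint32_t
pkt_get_hash(struct pkt *p)
{
    if (!p->rss_valid) {
        p->rss_hash = hash_bytes(p->data, p->len);
        p->rss_valid = 1;
    }
    return p->rss_hash;
}

/* Transmit to a ring: invalidate the hash, since the consumer may rewrite
 * the packet before handing it back. */
static void
pkt_send_to_ring(struct pkt *p)
{
    p->rss_valid = 0;
}
```

Without the reset, a packet modified by the consumer would keep its old
cached hash; with it, the hash is recomputed on the next lookup at the cost
of one hash calculation per received packet.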

Signed-off-by: Mark D. Gray <mark.d.gray@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>