cascardo/ovs.git
7 years ago ovsdb: Fix ovsdb-server replication blocking bug.
Mario Cabrera [Tue, 28 Jun 2016 22:08:05 +0000 (15:08 -0700)]
ovsdb: Fix ovsdb-server replication blocking bug.

With this patch ovsdb-server no longer blocks waiting for the remote server
connection when doing replication.

Signed-off-by: Mario Cabrera <mario.cabrera@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago dpctl.at: Ignore vlog rate limit warning.
Paul Boca [Tue, 28 Jun 2016 20:16:09 +0000 (20:16 +0000)]
dpctl.at: Ignore vlog rate limit warning.

The message "Dropped 1 log messages in the last ..." makes this test fail.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
7 years ago ovs-ofctl.at: Prevent msys from getting confused with ipv6 address.
Paul Boca [Sun, 26 Jun 2016 12:12:29 +0000 (12:12 +0000)]
ovs-ofctl.at: Prevent msys from getting confused with ipv6 address.

msys converts ::0/0 into ;c:\MinGW\msys\1.0\1.

To prevent this, use the full-form IPv6 address 0:0:0:0:0:0:0:0
instead.
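
As a rough illustration of what "full form" means here, the following Python
sketch (not part of the patch; the helper name full_form is hypothetical)
expands a compressed IPv6 address into eight colon-separated groups, so that
no "::" remains for msys to mistake for a path list.

```python
import ipaddress
import struct


def full_form(addr: str) -> str:
    """Expand an IPv6 address into eight groups with no '::' shorthand."""
    packed = ipaddress.IPv6Address(addr).packed   # 16 raw bytes
    groups = struct.unpack("!8H", packed)         # eight 16-bit groups
    # Hex without leading zeros, matching the 0:0:0:0:0:0:0:0 style.
    return ":".join(format(group, "x") for group in groups)
```

Calling full_form("::0") gives the 0:0:0:0:0:0:0:0 spelling used by the test.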

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago tests: Fixed PMD tests on Windows
Paul Boca [Sun, 26 Jun 2016 12:12:23 +0000 (12:12 +0000)]
tests: Fixed PMD tests on Windows

CHECK_CPU_DISCOVERED now checks the log file instead of stderr.
On Windows, ovs-vswitchd output is written only to the log file, not to stderr.
Tested on both Windows and Linux.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago bridge: allow OVS to interact with controller through sockets outside run dir
Ansis Atteka [Mon, 20 Jun 2016 21:19:40 +0000 (14:19 -0700)]
bridge: allow OVS to interact with controller through sockets outside run dir

Currently Open vSwitch is unable to create or connect to Unix domain
sockets outside the designated 'run' directory, for fear of potential
remote exploits in which a hacked remote OVSDB manager would tell Open vSwitch
to connect to a Unix domain socket owned by another daemon on the same
hypervisor.

This patch allows this behavior to be disabled by changing the
/etc/default/openvswitch (Ubuntu) or /etc/sysconfig/openvswitch (RHEL)
file to:

...
OVS_CTL_OPTS=--no-self-confinement
...

Note that it is better to stick with the default behavior, unless:
1. You have Open vSwitch running under SELinux or AppArmor, which
   would prevent OVS from messing with sockets owned by other
   daemons; OR
2. You are sure that relying on the OpenFlow handshake is enough to
   prevent OVS from adversely interacting with those other daemons
   running on the same hypervisor; OR
3. You are not much worried about remote exploits in the first
   place, perhaps because the OVSDB manager is running on the same host
   as OVS.

The initial use case for this patch is to allow connecting to an OpenFlow
controller that has its socket outside the OVS run directory.  However,
in the future it could be generalized to allow disabling self-confinement
for other things, such as DPDK vhost-user sockets or anything else
that can be specified in OVSDB with a full path.

Signed-off-by: Ansis Atteka <aatteka@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
VMware-BZ: #1525857

7 years ago datapath-windows: Conntrack - Fix variable initialization
Sairam Venugopal [Sat, 25 Jun 2016 01:16:09 +0000 (18:16 -0700)]
datapath-windows: Conntrack - Fix variable initialization

Initialize the variable pktMdLabel.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago datapath-windows: Handle possible NULL pointer dereference in STT.
Paul Boca [Mon, 27 Jun 2016 20:44:01 +0000 (20:44 +0000)]
datapath-windows: Handle possible NULL pointer dereference in STT.

Check if OvsAllocateMemoryWithTag succeeded or not.
In case of failure propagate cleanup and return.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago tests: Fixed ovsdb-monitor tests.
Paul Boca [Fri, 24 Jun 2016 16:51:49 +0000 (16:51 +0000)]
tests: Fixed ovsdb-monitor tests.

Redirect ovsdb-client stderr to /dev/null.
This fixes the series of tests that use OVSDB_CHECK_MONITOR macro.

The theory behind the fix was explained by Ben Pfaff as follows:

"I suspect I understand what's happening here.

To execute the following command, Autotest internally redirects stdout
and stderr to files named "stdout" and "stderr":
> ./ovsdb-monitor.at:47: ovsdb-client -vjsonrpc \
--pidfile="`pwd`"/client-pid -d json monitor --format=csv \
unix:socket ordinals ordinals  > output &
> stderr:
> stdout:

Ordinarily, after the command exits it would close the file, but & means
that it holds the file open.  While the next few ovsdb-client commands
run, it queues up some output in stdio buffers but doesn't bother to
actually flush it[*].

    [*] There's either a hole in my theory here or Windows is not fully
        ANSI C conformant since ANSI C says that "As initially opened,
        the standard error stream is not fully buffered; ..." which
        means that it'd probably be line buffered, so that each line of
        the log is flushed separately.

On Unix-like OSes, the following Autotest commands don't really care
about this open file, since the OS will happily delete and replace the
"stderr" file and allow the previous file with that name to remain open.
On Windows, the OS won't permit that, so I guess the shell is actually
just opening the existing file.

Later, "ovs-appctl --target=`pwd`/unixctl exit" causes ovsdb-server to
exit.  It flushes its accumulated stderr buffer to the OS, and therefore
it shows up in the "stderr" output as part of ovs-appctl's output since
ovs-appctl and ovsdb-server both had their output sent to the same file.

Probably, adding 2>/dev/null to the ovsdb-server command would solve the
problem.  To get better output for debugging failures, also add
--log-file and AT_CAPTURE_FILE([ovsdb-server.log])."

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago system-traffic: Remove basic connectivity tests.
Joe Stringer [Thu, 23 Jun 2016 01:00:44 +0000 (18:00 -0700)]
system-traffic: Remove basic connectivity tests.

For many of the tests, we would first execute a "basic connectivity
check" to validate the sanity of the setup before running the test
traffic which probes the actual OVS behaviour. However, by running
traffic through the rules prior to running the test, it is more likely
that the traffic hits datapath flows and doesn't test the "execute" path
(from userspace to kernel). This can hide some classes of bugs.

The first few tests in system-traffic already check the basic sanity of
the environment, so these redundant pieces are unnecessary. Remove them.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years ago compat: Backport ip_do_fragment().
Joe Stringer [Thu, 23 Jun 2016 01:00:43 +0000 (18:00 -0700)]
compat: Backport ip_do_fragment().

Prior to upstream Linux commit d6b915e29f4a ("ip_fragment: don't forward
defragmented DF packet"), the fragmentation behaviour was incorrect when
dealing with linear skbs, as it would not respect the "max_frag_size"
that ip_defrag() provides, but instead attempt to use the output
device's MTU.

If OVS reassembles an IP message and passes it up to userspace, it
also provides a PACKET_ATTR_MRU to indicate the maximum received unit
size for this message. When userspace executes actions to output this
packet, it passes the MRU back down and this is the desired refragment
size. When the packet data is placed back into the skb in the execute
path, a frags list is not created so fragmentation code will treat it
as one big linear skb. Due to the above bug it would use the device's
MTU to refragment instead of the provided MRU. In the case of regular
ports, this is not too dangerous as the MTU would be a reasonable value.
However, in the case of a tunnel port the typical MTU is a very large
value. As such, rather than refragmenting the message on output, it
would simply output the (too-large) frame to the tunnel.

Depending on the tunnel type and other factors, this large frame could
be dropped along the path, or it could end up at the remote tunnel
endpoint and end up being delivered towards a remote host stack or VM.
If OVS is also controlling that endpoint, it will likely drop the packet
when sending to the final destination, because the packet exceeds the
port MTU.

Different OpenFlow rule configurations could end up preventing IP
messages from being refragmented correctly for as many as the first four
attempts in each connection.

Fix this issue by backporting ip_do_fragment() so that it will respect
the MRU value that is provided in the execute path.
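
The effect of refragmenting against the MRU rather than the device MTU can be
sketched as follows.  This is an illustrative Python model, not the kernel
code: the function name, the fixed 20-byte header and the example sizes are
assumptions, and the limit is taken to include the IPv4 header.

```python
def fragment_payload_sizes(payload_len: int, limit: int, header_len: int = 20) -> list:
    """Split an IPv4 payload into fragment payload sizes that fit in `limit`.

    Every fragment except the last carries a multiple of 8 payload bytes,
    because the IPv4 fragment offset field counts 8-byte units.
    """
    per_fragment = (limit - header_len) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("limit too small to carry any payload")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes
```

With a 4000-byte payload, a limit of 1500 (standing in for the MRU) yields
three fragments, while a very large limit such as 65535 (standing in for a
tunnel port MTU) yields a single oversized frame, which is the behaviour
described above.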

VMWare-BZ: #1651589
Fixes: 213e1f54b4b3 ("compat: Wrap IPv4 fragmentation.")
Reported-by: Salman Malik <salmanm@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years ago compat: ipv4: Pass struct net through ip_fragment.
Eric W. Biederman [Thu, 23 Jun 2016 01:00:42 +0000 (18:00 -0700)]
compat: ipv4: Pass struct net through ip_fragment.

Upstream commit:
    ipv4: Pass struct net through ip_fragment

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Upstream: 694869b3c544 ("ipv4: Pass struct net through ip_fragment")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years ago datapath: Pass net into ovs_fragment.
Eric W. Biederman [Thu, 23 Jun 2016 01:00:41 +0000 (18:00 -0700)]
datapath: Pass net into ovs_fragment.

Upstream commit:
    openvswitch: Pass net into ovs_fragment

    In preparation for the ipv4 and ipv6 fragmentation code taking a net
    parameter pass a struct net into ovs_fragment where the v4 and v6
    fragmentation code is called.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Upstream: c559cd3ad32b ("openvswitch: Pass net into ovs_fragment")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years ago util: New function nullable_xstrdup().
Ben Pfaff [Sat, 25 Jun 2016 04:23:16 +0000 (21:23 -0700)]
util: New function nullable_xstrdup().

It's a pretty common pattern so create a function for it.

Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ofp-util: Zero out padding bytes in ofputil_ipfix_stats_to_reply().
Ben Pfaff [Sun, 26 Jun 2016 21:54:04 +0000 (14:54 -0700)]
ofp-util: Zero out padding bytes in ofputil_ipfix_stats_to_reply().

Otherwise IPFIX statistics replies leak uninitialized memory from ovs-vswitchd.

Reported-by: William Tu <u9012063@gmail.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-June/073769.html
Acked-by: William Tu <u9012063@gmail.com>
Tested-by: Daniel Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago FAQ: Describe how to use "learn" as a primitive.
Ben Pfaff [Sun, 19 Jun 2016 17:24:11 +0000 (10:24 -0700)]
FAQ: Describe how to use "learn" as a primitive.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years ago ipfix: Export user specified virtual observation ID
Wenyu Zhang [Sat, 25 Jun 2016 00:10:07 +0000 (17:10 -0700)]
ipfix: Export user specified virtual observation ID

In a virtual network, users want more information about the virtual point at
which traffic is observed.  It should be a string that provides clear
information, not a simple integer ID.

Introduce "other-config: virtual_obs_id" in IPFIX, which is a string
configured by the user.  Introduce an enterprise IPFIX entity
"virtualObsID"(898) to export the value. The entity is a variable-length
string.

Signed-off-by: Wenyu Zhang <wenyuz@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovsdb: Add table exclusion functionality to OVSDB replication
Mario Cabrera [Tue, 29 Mar 2016 17:01:00 +0000 (11:01 -0600)]
ovsdb: Add table exclusion functionality to OVSDB replication

A blacklist of tables that will be excluded from replication can be
specified by the following option:

--sync-exclude-tables=db:table[,db:table]…

Where 'table' corresponds to a table name, and 'db' corresponds to the
database name where the table resides.
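
The shape of the option value can be illustrated with a small Python sketch.
It is not the ovsdb-server parser (which is written in C); the function name
is hypothetical and the database and table names are only examples.

```python
def parse_exclude_tables(value: str) -> dict:
    """Parse 'db:table[,db:table]...' into {db: {table, ...}}."""
    excluded = {}
    for item in value.split(","):
        db, sep, table = item.partition(":")
        # Each entry must name both the database and the table.
        if not sep or not db or not table:
            raise ValueError(f"expected db:table, got {item!r}")
        excluded.setdefault(db, set()).add(table)
    return excluded
```

For example, "Open_vSwitch:Mirror,Open_vSwitch:sFlow" would map the
Open_vSwitch database to the set of tables Mirror and sFlow.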

Signed-off-by: Mario Cabrera <mario.cabrera@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovsdb: Introduce OVSDB replication feature
Mario Cabrera [Sat, 25 Jun 2016 00:13:06 +0000 (17:13 -0700)]
ovsdb: Introduce OVSDB replication feature

Replication is enabled by using the following option when starting the
database server:

--sync-from=server

Where 'server' can take any form described in the ovsdb-client(1)
manpage as an active connection. If this option is specified, the
replication process is immediately started.

Signed-off-by: Mario Cabrera <mario.cabrera@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago docs: OVSDB replication design document
Mario Cabrera [Tue, 29 Mar 2016 15:28:05 +0000 (09:28 -0600)]
docs: OVSDB replication design document

The database replication functionality is designed to provide "fail
over" characteristics. There are two participating databases, one of
which is the "active" database and the other is the "stand by" database.
Replication happens exclusively from the active to the stand by
database.

This document explains how the replication functionality is implemented.

Signed-off-by: Mario Cabrera <mario.cabrera@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago datapath:backport: openvswitch: Add packet len info to upcall.
William Tu [Fri, 24 Jun 2016 22:50:58 +0000 (15:50 -0700)]
datapath:backport: openvswitch: Add packet len info to upcall.

Upstream commit:
    commit b95e5928fcc76d156352570858abdea7b2628efd
    Author: William Tu <u9012063@gmail.com>
    Date:   Mon Jun 20 07:26:17 2016 -0700

    The commit f2a4d086ed4c ("openvswitch: Add packet truncation support.")
    introduces packet truncation before sending to userspace upcall receiver.
    This patch passes up the skb->len before truncation so that the upcall
    receiver knows the original packet size. Potentially this will be used
    by sFlow, where OVS translates sFlow config header=N to a sample action,
    truncating the packet to N bytes in the kernel datapath. Thus, only N bytes
    instead of the full packet are copied from kernel to userspace, saving
    kernel-to-userspace bandwidth.

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140135299
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
7 years ago FAQ: Update support for NAT and Geneve.
Jesse Gross [Fri, 24 Jun 2016 21:55:37 +0000 (14:55 -0700)]
FAQ: Update support for NAT and Geneve.

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Nithin Raju <nithin@vmware.com>
7 years ago datapath-windows: Address minor alignment issues in Stt code.
Yin Lin [Fri, 24 Jun 2016 21:44:31 +0000 (14:44 -0700)]
datapath-windows: Address minor alignment issues in Stt code.

Signed-off-by: Yin Lin <linyi@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago datapath-windows: Add Geneve support
Yin Lin [Fri, 24 Jun 2016 21:44:30 +0000 (14:44 -0700)]
datapath-windows: Add Geneve support

Signed-off-by: Yin Lin <linyi@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago netdev-dpdk: Fix using uninitialized link_status.
Ilya Maximets [Fri, 24 Jun 2016 13:28:32 +0000 (16:28 +0300)]
netdev-dpdk: Fix using uninitialized link_status.

'rte_eth_link_get_nowait()' works only with physical ports.
For a vhost-user port, 'link' stays uninitialized, so random link status
messages appear in the log.

Ex.:
|dpdk(dpdk_watchdog2)|DBG|Port -1 Link Up - speed 10000 Mbps - full-duplex

Fix that by calling 'check_link_status()' only for physical ports.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years ago datapath-windows: Remove double semi-colon(; ) in Tunnel.c
Sairam Venugopal [Fri, 24 Jun 2016 21:19:08 +0000 (14:19 -0700)]
datapath-windows: Remove double semi-colon(; ) in Tunnel.c

Found by inspection.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago datapath-windows: Handle memory allocation failure for event creation
Sairam Venugopal [Tue, 21 Jun 2016 23:54:02 +0000 (16:54 -0700)]
datapath-windows: Handle memory allocation failure for event creation

Release the lock and return if an event entry fails to get allocated.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago datapath-windows: Add support for UDP and ICMP to Conntrack Module
Sairam Venugopal [Tue, 21 Jun 2016 01:15:22 +0000 (18:15 -0700)]
datapath-windows: Add support for UDP and ICMP to Conntrack Module

Enable support for UDP and ICMP in the connection tracking module on
Hyper-V. Define 1s as variable and reuse it.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago ovsdb-idl: Fix issues detected in Partial Map Update feature
Lutz, Arnoldo [Mon, 13 Jun 2016 16:06:48 +0000 (16:06 +0000)]
ovsdb-idl: Fix issues detected in Partial Map Update feature

We found some issues affecting the Partial Map Update feature included in the
master branch.  This patch fixes a memory leak caused by not freeing the datum
allocated while requesting a change to a map.  It also fixes an
error, produced when the NDEBUG flag is not set, that triggers an assertion
failure when preparing the map to be changed.

Fix a memory leak caused by not freeing datums.
Change the use of the ovsdb_idl_read function when preparing changes to maps.

Signed-off-by: arnoldo.lutz.guevara@hpe.com <arnoldo.lutz.guevara@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago test: Add more pmd tests.
Daniele Di Proietto [Tue, 7 Jun 2016 00:05:49 +0000 (17:05 -0700)]
test: Add more pmd tests.

These tests stress the pmd thread and multiqueue handling in
dpif-netdev.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago netdev-dummy: Allow configuring the numa_id for testing purposes.
Daniele Di Proietto [Tue, 7 Jun 2016 00:05:49 +0000 (17:05 -0700)]
netdev-dummy: Allow configuring the numa_id for testing purposes.

This commit introduces an (undocumented) option for dummy Interfaces to
specify a dummy numa_id, to which the device belongs.  It will be used
to test the pmd threads in dpif-netdev.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-controller: Fix port binding update on OVS port delete events.
Ryan Moats [Fri, 24 Jun 2016 20:39:28 +0000 (15:39 -0500)]
ovn-controller: Fix port binding update on OVS port delete events.

Patch "Convert binding_run to incremental processing." introduced
a bug where the port binding table is not correctly updated when
an OVS port is deleted.  Fix this by:
- persisting the lport shash used to record OVS ports
- changing get_local_iface_ids to return a bool indicating whether
  the persisted lport shash has changed
- changing port binding table processing from incremental to full
  if the persisted lport shash has changed
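
The control flow of the fix can be sketched roughly as below.  This is a
Python stand-in for the C code, with a plain set in place of the shash; both
function names are hypothetical and only model the "changed" flag and the
resulting choice between full and incremental processing.

```python
def update_local_ifaces(persisted: set, current_iface_ids) -> bool:
    """Sync the persisted set to the current iface-ids; return True if it changed."""
    current = set(current_iface_ids)
    changed = current != persisted
    if changed:
        # Keep the same set object so the state persists across runs.
        persisted.clear()
        persisted.update(current)
    return changed


def binding_mode(persisted: set, current_iface_ids) -> str:
    """Fall back to full processing when the set of local ports changed."""
    return "full" if update_local_ifaces(persisted, current_iface_ids) else "incremental"
```

Deleting a port changes the persisted set, so that run is processed in full;
later runs with an unchanged set stay incremental.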

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago acinclude: check for numa library
Bhanuprakash Bodireddy [Sat, 18 Jun 2016 22:13:44 +0000 (23:13 +0100)]
acinclude: check for numa library

The numa library is needed for NUMA-aware vHost User functionality.
If the numa package is missing, the OVS DPDK configuration fails with
"error: Could not find DPDK libraries in <DPDK_LOC>/TARGET/lib" even though
the DPDK library is installed.

This patch fixes this misleading error by checking for the presence of the
numa library and printing an appropriate error message, "error: unable to
find libnuma, install the dependency package", if the package is missing.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years ago Revert "ipfix: Export user specified virtual observation ID".
Ben Pfaff [Fri, 24 Jun 2016 20:35:23 +0000 (13:35 -0700)]
Revert "ipfix: Export user specified virtual observation ID".

This reverts commit 337bebe91c94d9d201e28811c469869d32e978ff, which caused a
crash in test 1048 "ofproto-dpif - Flow IPFIX sanity check" (now test 1051)
with the following backtrace:

 #0 hmap_first_with_hash (hmap=<optimized out>, hmap=<optimized out>,
    hash=<optimized out>) at ../lib/hmap.h:328
 #1 smap_find__ (smap=0x94, key=key@entry=0x817f7ab "virtual_obs_id",
    key_len=14, hash=2537071222) at ../lib/smap.c:366
 #2 0x0812b9d7 in smap_get_node (smap=0x9738a276,
    key=0x817f7ab "virtual_obs_id") at ../lib/smap.c:198
 #3 0x0812ba30 in smap_get (smap=0x94, key=0x817f7ab "virtual_obs_id")
    at ../lib/smap.c:189
 #4 0x08055a60 in bridge_configure_ipfix (br=<optimized out>)
    at ../vswitchd/bridge.c:1237
 #5 bridge_reconfigure (ovs_cfg=0x94) at ../vswitchd/bridge.c:666
 #6 0x080568d3 in bridge_run () at ../vswitchd/bridge.c:2972
 #7 0x0804c9dd in main (argc=10, argv=0xffd8b934)
    at ../vswitchd/ovs-vswitchd.c:112

Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago rhel: Fix RHEL package build breakage
Ansis Atteka [Fri, 24 Jun 2016 02:04:02 +0000 (19:04 -0700)]
rhel: Fix RHEL package build breakage

This patch fixes following error:

error: Installed (but unpackaged) file(s) found:
   /usr/bin/ovs-tcpdump
   /usr/share/man/man8/ovs-tcpdump.8.gz

Signed-off-by: Ansis Atteka <aatteka@ovn.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Russell Bryant <russell@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
7 years ago datapath-windows: Add GRE checksum
Alin Serdean [Fri, 17 Jun 2016 20:00:54 +0000 (20:00 +0000)]
datapath-windows: Add GRE checksum

This patch introduces GRE checksum computation if the userspace requires
it on Tx. On Rx we verify the GRE checksum if the checksum bit was
specified and also inform the userspace about it.

Also derive the GRE header length from the GRE flags rather than the
tunnel flags.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years ago ofp-actions: Add truncate action.
William Tu [Fri, 24 Jun 2016 14:42:30 +0000 (07:42 -0700)]
ofp-actions: Add truncate action.

The patch adds a new action to support packet truncation.  The new action
is formatted as 'output(port=n,max_len=m)', meaning output to port n with
the packet size limited to MIN(original_size, m).

One use case is to enable port mirroring to send smaller packets to the
destination port so that only useful packet information is mirrored/copied,
saving some of the performance overhead of copying the entire packet payload.
An example use case is shown below as well as in the test cases:

    - Output to port 1 with max_len 100 bytes.
    - The output packet size on port 1 will be MIN(original_packet_size, 100).
    # ovs-ofctl add-flow br0 'actions=output(port=1,max_len=100)'

    - The scope of max_len is limited to the output action itself.  The
      packets sent by the subsequent output:1 and output:2 keep their
      original size.
    # ovs-ofctl add-flow br0 \
            'actions=output(port=1,max_len=100),output:1,output:2'
    - The datapath actions show:
    # Datapath actions: trunc(100),1,1,2
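
The semantics in the examples above can be modelled with a short Python
sketch.  It is only an illustration of MIN(original_size, max_len) and of the
per-action scope; the function name and the tuple encoding of actions are
assumptions, not OVS data structures.

```python
def apply_actions(packet_len: int, actions: list) -> list:
    """Return (port, emitted_len) for each output in `actions`.

    Each action is ("output", port) or ("output", port, max_len).
    max_len applies only to the action that carries it.
    """
    emitted = []
    for action in actions:
        port = action[1]
        max_len = action[2] if len(action) > 2 else None
        size = packet_len if max_len is None else min(packet_len, max_len)
        emitted.append((port, size))
    return emitted
```

For a 1500-byte packet, the action list from the second example emits 100
bytes on port 1 for the truncating output, then the full 1500 bytes on ports
1 and 2.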

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140037134
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
7 years ago datapath:backport: openvswitch: Add packet truncation support.
William Tu [Fri, 24 Jun 2016 14:42:29 +0000 (07:42 -0700)]
datapath:backport: openvswitch: Add packet truncation support.

Upstream commit:
    commit f2a4d086ed4c588d32fe9b7aa67fead7280e7bf1
    Author: William Tu <u9012063@gmail.com>
    Date:   Fri Jun 10 11:49:33 2016 -0700

    openvswitch: Add packet truncation support.

    The patch adds a new OVS action, OVS_ACTION_ATTR_TRUNC, in order to
    truncate packets. A 'max_len' is added for setting up the maximum
    packet size, and a 'cutlen' field is to record the number of bytes
    to trim the packet when the packet is outputting to a port, or when
    the packet is sent to userspace.

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
7 years ago ipfix: Export user specified virtual observation ID
Wenyu Zhang [Fri, 24 Jun 2016 12:25:57 +0000 (05:25 -0700)]
ipfix: Export user specified virtual observation ID

In a virtual network, users want more information about the virtual point at
which traffic is observed.  It should be a string that provides clear
information, not a simple integer ID.

Introduce "other-config: virtual_obs_id" in IPFIX, which is a string
configured by the user.  Introduce an enterprise IPFIX entity
"virtualObsID"(898) to export the value. The entity is a variable-length
string.

Signed-off-by: Wenyu Zhang <wenyuz@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-controller: Use new ovsdb-idl helpers to make logic more readable.
Ben Pfaff [Fri, 24 Jun 2016 00:00:51 +0000 (17:00 -0700)]
ovn-controller: Use new ovsdb-idl helpers to make logic more readable.

Also there were lots of 'continue's sprinkled around that didn't seem to
be needed given some simple code rearrangement.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years ago netdev-linux: Add new QoS type linux-noop.
bschanmu@redhat.com [Mon, 13 Jun 2016 08:30:19 +0000 (14:00 +0530)]
netdev-linux: Add new QoS type linux-noop.

The Linux ``No operation'' QoS type is used to inform the vswitch that
traffic control for the port is managed externally. Any configuration values
set for this type will have no effect.

This patch provides a solution suggested in this mail -
http://openvswitch.org/pipermail/discuss/2015-May/017687.html

Signed-off-by: Babu Shanmugam <bschanmu@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovsdb: Strong references cascade performance fix.
Rodriguez Betancourt, Esteban [Fri, 10 Jun 2016 16:35:09 +0000 (16:35 +0000)]
ovsdb: Strong references cascade performance fix.

Improves the performance of OVSDB by avoiding the chain
reaction produced when modifying rows with a strong
reference where the referenced rows have further strong
references.

The approach taken was using the change bitmap to avoid
triggering a change count when the column hasn't changed.

One way to trigger the issue is emulating a simple linked list
with strong references within a table, where each new row
points to the previous.

Without the fix, OVSDB creates an ovsdb_txn_row (and a copy
of the row) for each row in the table.
With the fix, it creates only two ovsdb_txn_row structures: the new row and
the row it directly points to.

Signed-off-by: Esteban Rodriguez Betancourt <estebarb@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago tests: Fix issue in use of OVS_APP_EXIT_AND_WAIT.
Lance Richardson [Fri, 10 Jun 2016 16:17:57 +0000 (12:17 -0400)]
tests: Fix issue in use of OVS_APP_EXIT_AND_WAIT.

Commit f9b11f2a09b4 introduced a loop to wait for process exit
in OVS_APP_EXIT_AND_WAIT after the "exit" command has been sent.
Unfortunately, this does not work for cases where a unixctl socket
has to be used to send the "exit" command because the process
ID cannot be determined from the socket path.

OVS_APP_EXIT_AND_WAIT_BY_TARGET has since been introduced to enable
graceful termination of daemons via unixctl sockets.

This set of changes addresses the problem described above by
making OVS_APP_EXIT_AND_WAIT_BY_TARGET take the unixctl socket
path and corresponding process ID as separate parameters. In order
to better detect issues in this logic in the future, checks have
been added to verify that the pidfile exists before using its
contents.

Tested on a Linux system.

Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-controller: Add 'put_dhcp_opts' action in ovn-controller
Numan Siddique [Wed, 15 Jun 2016 09:17:35 +0000 (14:47 +0530)]
ovn-controller: Add 'put_dhcp_opts' action in ovn-controller

This patch adds a new OVN action 'put_dhcp_opts' to support native
DHCP in OVN.

ovn-controller parses this action and adds a NXT_PACKET_IN2
OF flow with 'pause' flag set and the DHCP options stored in
'userdata' field.

When a valid DHCP packet is received by ovn-controller, it builds a
new DHCP reply packet with the DHCP options present in the
'userdata' field, resumes the packet, and stores 1 in the 1-bit subfield.
If the packet is invalid, it resumes the packet without modifying it and
stores 0 in the 1-bit subfield.

Eg. reg0[0] = put_dhcp_opts(offerip = 10.0.0.4, router = 10.0.0.1,
                  netmask = 255.255.255.0, lease_time = 3600,....)

A new 'DHCP_Options' table is added in SB DB which stores
the supported DHCP options with DHCP code and type. ovn-northd is
expected to populate this table.

The next patch will add logical flows with this action.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago expr: Refactor parsing of assignments and exchanges.
Ben Pfaff [Thu, 9 Jun 2016 05:37:59 +0000 (22:37 -0700)]
expr: Refactor parsing of assignments and exchanges.

As written, it was difficult for the OVN logical action code to add support
for new actions of the form "dst = ...", because the code to parse the left
side of the assignment was a monolithic part of the expr library.  This
commit refactors the code division so that an upcoming patch can support a
new "dst = func(args);" kind of action.

Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago expr: Shorten declarations of expr_context.
Ben Pfaff [Thu, 9 Jun 2016 05:37:58 +0000 (22:37 -0700)]
expr: Shorten declarations of expr_context.

Seems to me that this makes the code slightly easier to follow.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years ago debian: Add the tcpdump utility to the debian package
Aaron Conole [Wed, 8 Jun 2016 21:49:57 +0000 (17:49 -0400)]
debian: Add the tcpdump utility to the debian package

Add ovs-tcpdump to the debian build.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
7 years ago fedora: Add pcap, tcpdump and tcpundump utilities to test
Aaron Conole [Wed, 8 Jun 2016 21:49:56 +0000 (17:49 -0400)]
fedora: Add pcap, tcpdump and tcpundump utilities to test

The openvswitch-test package is set up for enabling / performing tests
for openvswitch setups.  Adding these utilities would enable a richer
set of debugging utilities for performing diagnostics.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
7 years ago ovs-tcpdump: Add a tcpdump wrapper utility
Aaron Conole [Wed, 8 Jun 2016 21:49:55 +0000 (17:49 -0400)]
ovs-tcpdump: Add a tcpdump wrapper utility

Currently, there is some documentation which describes setting up and
using port mirrors for bridges. This documentation is helpful for setting up
a packet capture for specific ports.

However, a utility to do such packet capture would be valuable, both
as an exercise in documenting the steps an additional time, and as a way
of providing an out-of-the-box experience for running a capture.

This commit adds a tcpdump-wrapper utility for such purpose. It uses the
Open vSwitch python library to add/remove ports and mirrors to/from the
Open vSwitch database. It will create a tcpdump instance listening on
the mirror port (allowing the user to specify additional arguments), and
dump data to the screen (or otherwise).

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
7 years agonetlink-notifier: change message to a less scary one
Thadeu Lima de Souza Cascardo [Fri, 17 Jun 2016 19:33:23 +0000 (16:33 -0300)]
netlink-notifier: change message to a less scary one

"received bad netlink message" may be interpreted as a corrupt netlink message.
However, the parse functions may return failure when the message contains
unexpected attributes or is missing non-optional attributes. Indicating the message
contained "unexpected contents" will avoid some interpretation that there may be
some netlink message corruption.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Cc: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agolport: Persist lport_index and mcgroup_index structures.
RYAN D. MOATS [Thu, 9 Jun 2016 01:01:39 +0000 (20:01 -0500)]
lport: Persist lport_index and mcgroup_index structures.

This is preparatory to making physical_run and lflow_run process
incrementally as changes to the data in these structures control
that processing.

Signed-off-by: RYAN D. MOATS <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoofproto-dpif.at: Fix typo.
Flavio Fernandes [Wed, 22 Jun 2016 21:49:38 +0000 (17:49 -0400)]
ofproto-dpif.at: Fix typo.

Correct spelling of the word 'dropped'.

The typo appears to have been introduced in this changeset:
http://openvswitch.org/pipermail/dev/2014-March/037433.html

Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
7 years agoConvert binding_run to incremental processing.
RYAN D. MOATS [Tue, 7 Jun 2016 18:52:51 +0000 (13:52 -0500)]
Convert binding_run to incremental processing.

Ensure that the entire port binding table is processed
when chassis are added/removed or when get_local_iface_ids
finds new ports on the local vswitch.

Side effects:
  - Persist local_datapaths and patch_datapaths across runs so
    that changes to either can be used as a trigger to reset
    incremental flow processing.
  - Persist all_lports structure
  - Revert commit 9baaabfff3c7df014e9acbd4c68189b568552ca9
    (ovn: Fix localnet ports deletion and recreation sometimes
    after restart.) as these changes are not desirable once
    local_datapaths is persisted.

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoovn-controller: Change encaps_run to work incrementally.
Ryan Moats [Tue, 7 Jun 2016 18:52:50 +0000 (13:52 -0500)]
ovn-controller: Change encaps_run to work incrementally.

As a side effect, tunnel context is persisted.

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agodatapath-windows: Rename local variable in Vport.c
Sairam Venugopal [Wed, 22 Jun 2016 20:50:34 +0000 (13:50 -0700)]
datapath-windows: Rename local variable in Vport.c

Declaration of 'event' hides previous local declaration. Rename this to
evt. The other variable wasn't being used.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years agoovn-northd: no logical router icmp response for directed broadcasts
Flavio Fernandes [Mon, 20 Jun 2016 20:57:22 +0000 (16:57 -0400)]
ovn-northd: no logical router icmp response for directed broadcasts

Responding to icmp queries where the L3 destination is a directed broadcast
was not being properly handled, causing the reply to be sent to all logical
ports except for the one port that should receive it.

This is a proposal for using choice B in the mail discussion; where icmp
queries to broadcast are simply not responded to by the logical router.

Reported-at: http://openvswitch.org/pipermail/discuss/2016-June/021610.html
Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
7 years agodoc: Fix an error in FAQ.
Han Zhou [Tue, 7 Jun 2016 05:56:50 +0000 (22:56 -0700)]
doc: Fix an error in FAQ.

Signed-off-by: Han Zhou <zhouhan@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agodatapath-windows: Remove unused headers in Event.c
Sairam Venugopal [Tue, 21 Jun 2016 22:23:47 +0000 (15:23 -0700)]
datapath-windows: Remove unused headers in Event.c

Cleanup unused headers. Found by inspection.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years agoovn: Allow IP packets destined to router ip for SNAT
Chandra S Vejendla [Wed, 22 Jun 2016 01:36:43 +0000 (18:36 -0700)]
ovn: Allow IP packets destined to router ip for SNAT

By default all the ip traffic destined to router ip is dropped in
lr_in_ip_input stage. When the router ip is used as snat ip, allow
reverse snat traffic destined to the router ip.

Signed-off-by: Chandra Sekhar Vejendla <csvejend@us.ibm.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years agodatapath-windows: Remove unused headers from Datapath.c
Sairam Venugopal [Tue, 21 Jun 2016 18:09:52 +0000 (11:09 -0700)]
datapath-windows: Remove unused headers from Datapath.c

Clean up unused headers in Datapath.c. Found by inspection.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Yin Lin <linyi@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years agoovn: DNAT and SNAT on a gateway router.
Gurucharan Shetty [Wed, 11 May 2016 01:59:01 +0000 (18:59 -0700)]
ovn: DNAT and SNAT on a gateway router.

For traffic from physical space to virtual space we need DNAT.
The DNAT happens in the gateway router and reaches the logical
port. The return traffic should be unDNATed.

Traffic originating in virtual space heading to physical space
should be SNATed. The return traffic is unSNATed.

East-west traffic with the public destination IP address needs
a DNAT. This traffic is punted to the l3 gateway where DNAT
takes place. This traffic is also SNATed and eventually loops back to
its destination. The SNAT is needed because we need the reverse traffic
to go back to the l3 gateway and not short-circuit directly to the source.

This commit introduces 4 new logical actions.
1. ct_snat: To send the packet through SNAT zone to unSNAT packets.
2. ct_snat(IP): To SNAT to the provided IP address.
3. ct_dnat: To send the packet through DNAT zone to unDNAT packets.
4. ct_dnat(IP): To DNAT to the provided IP.

This commit only provides the ability to do IP based NAT. This will
eventually be enhanced to do PORT based NAT too.

Command hints:

Consider a distributed router "R1" that has switch foo (192.168.1.0/24)
with a lport foo1 (192.168.1.2) and bar (192.168.2.0/24) with lport bar1
(192.168.2.2) connected to it. You connect "R1" to
a gateway router "R2" via a switch "join" in (20.0.0.0/24) network.

R2 has a switch "alice" (172.16.1.0/24) connected to it (to simulate
external network).

case1 : Add pure DNAT (north-south)

Add a DNAT rule in R2:
ovn-nbctl -- --id=@nat create nat type="dnat" logical_ip=192.168.1.2 \
external_ip=30.0.0.2 -- add logical_router R2 nat @nat

Now alice1 should be able to ping 192.168.1.2 via 30.0.0.2.

case2 : Add pure SNAT (south-north)

Add a SNAT rule in R2:

ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \
external_ip=30.0.0.1 -- add logical_router R2 nat @nat

(You need a static route in R1 to send packets destined to outside
world to go through R2. The logical_ip can be a subnet.)

When bar1 pings alice1, alice1 receives traffic from 30.0.0.1

case3 : SNAT and DNAT (east-west traffic)

When bar1 pings 30.0.0.2, the traffic jumps to the gateway router
and loops back to foo1 with a source ip address of 30.0.0.1

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Flavio Fernandes <flavio@flaviof.com>
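The three cases in the "DNAT and SNAT on a gateway router" message above
can be modelled with a small sketch. This is only an illustration in
Python, not OVN code: the Packet type, the gateway_nat function, and the
address 172.16.1.2 for alice1 are assumptions made for the example. The
NAT rules themselves are the ones from the command hints.

```python
from collections import namedtuple

# Hypothetical packet model: only the source and destination IP matter here.
Packet = namedtuple("Packet", ["src", "dst"])

# NAT rules taken from the command hints in the commit message.
DNAT = {"30.0.0.2": "192.168.1.2"}   # external_ip -> logical_ip
SNAT = {"192.168.2.2": "30.0.0.1"}   # logical_ip -> external_ip


def gateway_nat(pkt):
    """Apply DNAT, then SNAT, as a packet crosses the gateway router R2."""
    dst = DNAT.get(pkt.dst, pkt.dst)
    src = SNAT.get(pkt.src, pkt.src)
    return Packet(src, dst)


ALICE1 = "172.16.1.2"  # assumed address; the message only gives the subnet

# case1: north-south DNAT, alice1 reaches foo1 through 30.0.0.2.
case1 = gateway_nat(Packet(ALICE1, "30.0.0.2"))
# case2: south-north SNAT, alice1 sees bar1's traffic as coming from 30.0.0.1.
case2 = gateway_nat(Packet("192.168.2.2", ALICE1))
# case3: east-west, both translations apply and the packet loops back to foo1.
case3 = gateway_nat(Packet("192.168.2.2", "30.0.0.2"))

print(case1, case2, case3)
```

In case3 the source is rewritten as well as the destination, which is why
foo1's replies go back to the gateway router rather than straight to bar1.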
7 years agotests: make ovn logical router test case more reliable
Lance Richardson [Mon, 6 Jun 2016 18:03:00 +0000 (14:03 -0400)]
tests: make ovn logical router test case more reliable

The "ovn -- 1 HVs, 2 LSs, 1 lport/LS, 1 LR" test case creates a
configuration including a logical router, then:
    1) Sends a packet that is expected to be forwarded by the
       logical router.
    2) Disables the logical router.
    3) Sends another packet, identical to the one sent in (1), that
       should not be forwarded.

This test case fails intermittently, apparently because the disabling
of the logical router in (2) has not yet been propagated to the
forwarding plane at the time the second packet is sent. (When the
failure occurs, two packets are captured whereas only one is expected.)

Address this issue by adding a one second sleep between steps (2) and
(3). Adding a sleep does not actually fix anything, but it
does make this test case more likely to work correctly.

In one series of tests, this test case failed 11 times out of 20
without this fix and succeeded 20 times out of 20 attempts with
this fix.

Fixes: 5412db307420 ("ovn: Add column enabled to table Logical_Router")
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
7 years agotun-metadata: Use correct offset when accessing fragmented metadata.
Jesse Gross [Sun, 29 May 2016 02:17:27 +0000 (19:17 -0700)]
tun-metadata: Use correct offset when accessing fragmented metadata.

Since tunnel metadata is stored in a fixed area in the flow match
field, we must allocate space for options as they are registered with
the switch. In order to avoid exposing implementation complexity to
the controller, we support fragmentation when we run out of contiguous
blocks that are large enough to handle new requests.

When reading or writing to these fragmented blocks, there is a bug
that would cause us to keep on using the area after the allocated
space rather than moving to the next offset. This corrects that to
use the offset for each block.

Unfortunately, while we did have a test for this exact use case, since
the same bug was present in both reading and writing code, everything
appeared to work as normal from the outside.

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
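A rough sketch of why the existing test did not catch the tun-metadata
bug described above. It is a simplified Python model, not the OVS code:
the flat byte buffer, the (offset, length) fragment lists, and all
function names are assumptions made for illustration.

```python
# Hypothetical model: the metadata area is a flat byte buffer and an option's
# allocation is a list of (offset, length) fragments. None of these names
# come from the OVS source.
AREA_SIZE = 16


def write_buggy(area, frags, data):
    """Keep writing past the first fragment instead of moving to the next."""
    start = frags[0][0]
    area[start:start + len(data)] = data


def read_buggy(area, frags):
    """Read one contiguous run starting at the first fragment."""
    start = frags[0][0]
    total = sum(length for _, length in frags)
    return bytes(area[start:start + total])


def write_fixed(area, frags, data):
    """Write each piece at its own fragment's offset."""
    pos = 0
    for offset, length in frags:
        area[offset:offset + length] = data[pos:pos + length]
        pos += length


def read_fixed(area, frags):
    """Read each piece from its own fragment's offset."""
    return b"".join(bytes(area[o:o + n]) for o, n in frags)


# Option A owns bytes 4..7; option B is fragmented around it.
frags_a = [(4, 4)]
frags_b = [(0, 4), (8, 4)]

buggy = bytearray(AREA_SIZE)
write_fixed(buggy, frags_a, b"AAAA")
write_buggy(buggy, frags_b, b"12345678")
# The round trip looks fine because read and write share the same mistake...
roundtrip_ok = read_buggy(buggy, frags_b) == b"12345678"
# ...but option A's bytes were overwritten along the way.
a_after_bug = read_fixed(buggy, frags_a)

fixed = bytearray(AREA_SIZE)
write_fixed(fixed, frags_a, b"AAAA")
write_fixed(fixed, frags_b, b"12345678")
a_after_fix = read_fixed(fixed, frags_a)
b_after_fix = read_fixed(fixed, frags_b)
```

A test that only writes an option and reads it back exercises both sides
of the same mistake, so it passes; checking a neighbouring option exposes
the difference.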
7 years agoofproto: Set to revalidate when a new version is available.
Jarno Rajahalme [Tue, 21 Jun 2016 16:41:02 +0000 (09:41 -0700)]
ofproto: Set to revalidate when a new version is available.

There is no need to set the revalidate flag after each flow mod
separately, as we can do it once after the whole transaction is
finished.  It is not done at all if the transaction fails.

In the successful case this change makes no functional difference,
since the revalidation thread is triggered by the main thread only
after a bundle transaction has been fully processed.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoxlate: Fix typo in comment.
Jarno Rajahalme [Tue, 21 Jun 2016 16:41:01 +0000 (09:41 -0700)]
xlate: Fix typo in comment.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agodatapath: Fix cached ct with helper.
Joe Stringer [Tue, 21 Jun 2016 01:51:09 +0000 (18:51 -0700)]
datapath: Fix cached ct with helper.

Upstream commit:
    commit 16ec3d4fbb967bd0e1c8d9dce9ef70e915a86615
    Author: Joe Stringer <joe@ovn.org>
    Date:   Wed May 11 10:29:26 2016 -0700

    openvswitch: Fix cached ct with helper.

    When using conntrack helpers from OVS, a common configuration is to
    perform a lookup without specifying a helper, then go through a
    firewalling policy, only to decide to attach a helper afterwards.

    In this case, the initial lookup will cause a ct entry to be attached to
    the skb, then the later commit with helper should attach the helper and
    confirm the connection. However, the helper attachment has been missing.
    If the user has enabled automatic helper attachment, then this issue
    will be masked as it will be applied in init_conntrack(). It is also
    masked if the action is executed from ovs_packet_cmd_execute() as that
    will construct a fresh skb.

    This patch fixes the issue by making an explicit call to try to assign
    the helper if there is a discrepancy between the action's helper and the
    current skb->nfct.

    Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Fixes: 11251c170d92 ("datapath: Allow attaching helpers to ct action")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: __nf_ct_l{3,4}proto_find() always return a valid pointer
Pablo Neira Ayuso [Tue, 21 Jun 2016 01:51:09 +0000 (18:51 -0700)]
datapath: __nf_ct_l{3,4}proto_find() always return a valid pointer

Upstream commit:
    commit 3b78155b1b3688dbe910fecdc3e003f431b46630
    Author: Pablo Neira Ayuso <pablo@netfilter.org>
    Date:   Tue May 3 11:13:29 2016 +0200

    openvswitch: __nf_ct_l{3,4}proto_find() always return a valid pointer

    If the protocol is not natively supported, this assigns generic protocol
    tracker so we can always assume a valid pointer after these calls.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: change nf_connlabels_get bit arg to 'highest used'
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:09 +0000 (18:51 -0700)]
datapath: change nf_connlabels_get bit arg to 'highest used'

Upstream commit:
    commit adff6c65600000ec2bb71840c943ee12668080f5
    Author: Florian Westphal <fw@strlen.de>
    Date:   Tue Apr 12 18:14:25 2016 +0200

    netfilter: connlabels: change nf_connlabels_get bit arg to 'highest used'

    nf_connlabel_set() takes the bit number that we would like to set.
    nf_connlabels_get() however took the number of bits that we want to
    support.

    So e.g. nf_connlabels_get(32) supports bits 0 to 31, but not 32.
    This changes nf_connlabels_get() to take the highest bit that we want
    to set.

    Callers then don't have to cope with a potential integer wrap
    when using nf_connlabels_get(bit + 1) anymore.

    Current callers are fine, this change is only to make followup
    nft ct label set support simpler.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
OVS compat code defined nf_connlabels_get() if it was missing.  Now we
redefine it if it is missing, or if it has the old signature.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
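The change of argument meaning for nf_connlabels_get() described above
can be sketched as follows. This Python model only mimics the argument
semantics, assuming a 32-bit unsigned argument; the helper names are
made up for the example and are not kernel functions.

```python
# Sketch only: these functions mimic the argument semantics described in the
# commit message, not the kernel implementation.

def supported_bits_old(nbits):
    """Old meaning: the argument is the number of bits to support."""
    return range(0, nbits)


def supported_bits_new(highest_bit):
    """New meaning: the argument is the highest bit that will be set."""
    return range(0, highest_bit + 1)


def u32(value):
    """Truncate to an unsigned 32-bit value, as C arithmetic would."""
    return value & 0xFFFFFFFF


old_has_32 = 32 in supported_bits_old(32)   # covers bits 0..31 only
new_has_32 = 32 in supported_bits_new(32)   # covers bits 0..32

# With the old meaning a caller had to pass bit + 1, which wraps around for
# the largest unsigned value.
wrapped = u32(0xFFFFFFFF + 1)
```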
7 years agodatapath: call only into reachable nf-nat code
Arnd Bergmann [Tue, 21 Jun 2016 01:51:09 +0000 (18:51 -0700)]
datapath: call only into reachable nf-nat code

Upstream commit:
    commit 99b7248e2ad57ca93ada10c6598affb267ffc99a
    Author: Arnd Bergmann <arnd@arndb.de>
    Date:   Fri Mar 18 14:33:45 2016 +0100

    openvswitch: call only into reachable nf-nat code

    The openvswitch code has gained support for calling into the
    nf-nat-ipv4/ipv6 modules, however those can be loadable modules
    in a configuration in which openvswitch is built-in, leading
    to link errors:

    net/built-in.o: In function `__ovs_ct_lookup':
    :(.text+0x2cc2c8): undefined reference to `nf_nat_icmp_reply_translation'
    :(.text+0x2cc66c): undefined reference to `nf_nat_icmpv6_reply_translation'

    The dependency on (!NF_NAT || NF_NAT) prevents similar issues,
    but NF_NAT is set to 'y' if any of the symbols selecting
    it are built-in, but the link error happens when any of them
    are modular.

    A second issue is that even if CONFIG_NF_NAT_IPV6 is built-in,
    CONFIG_NF_NAT_IPV4 might be completely disabled. This is unlikely
    to be useful in practice, but the driver currently only handles
    IPv6 being optional.

    This patch improves the Kconfig dependency so that openvswitch
    cannot be built-in if either of the two other symbols are set
    to 'm', and it replaces the incorrect #ifdef in ovs_ct_nat_execute()
    with two "if (IS_ENABLED())" checks that should catch all corner
    cases and also make the code more readable.

    The same #ifdef exists in ovs_ct_nat_to_attr(), where it does not
    cause a link error, but for consistency I'm changing it the same
    way.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Fixes: c5f6c06b58d6 ("datapath: Interface with NAT.")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: Fix checking for new expected connections.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: Fix checking for new expected connections.

Upstream commit:
    commit 5745b0be05a0f8ccbc92a36b69f3a6bc58e91954
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Mon Mar 21 11:15:19 2016 -0700

    openvswitch: Fix checking for new expected connections.

    OVS should call into CT NAT for packets of new expected connections only
    when the conntrack state is persisted with the 'commit' option to the
    OVS CT action.  The test for this condition is doubly wrong, as the CT
    status field is ANDed with the bit number (IPS_EXPECTED_BIT) rather
    than the mask (IPS_EXPECTED), and due to the wrong assumption that the
    expected bit would apply only for the first (i.e., 'new') packet of a
    connection, while in fact the expected bit remains on for the lifetime of
    an expected connection.  The 'ctinfo' value IP_CT_RELATED derived from
    the ct status can be used instead, as it is only ever applicable to
    the 'new' packets of the expected connection.

    Fixes: 05752523e565 ('openvswitch: Interface with NAT.')
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: c5f6c06b58d6 ("datapath: Interface with NAT.")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
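The first half of the bug described above (ANDing with the bit number
instead of the mask) is easy to see in a small sketch. The constants
follow the kernel's nf_conntrack_common.h convention of a bit number plus
a mask derived from it; the two helper functions are hypothetical and
exist only for the illustration.

```python
# Each conntrack status flag has a bit number and a mask derived from it.
IPS_EXPECTED_BIT = 0
IPS_EXPECTED = 1 << IPS_EXPECTED_BIT
IPS_CONFIRMED_BIT = 3
IPS_CONFIRMED = 1 << IPS_CONFIRMED_BIT


def expected_wrong(status):
    """Buggy test: ANDs with the bit number, which is 0 here."""
    return bool(status & IPS_EXPECTED_BIT)


def expected_right(status):
    """Mask-based test: true whenever the expected flag is set."""
    return bool(status & IPS_EXPECTED)


status = IPS_EXPECTED | IPS_CONFIRMED
wrong = expected_wrong(status)   # can never be true, since x & 0 == 0
right = expected_right(status)
```

Note that even the mask-based test would not identify "new" packets,
because the flag stays set for the lifetime of the connection, which is
why the message says IP_CT_RELATED is used instead.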
7 years agodatapath: Use proper buffer size in nla_memcpy
Haishuang Yan [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: Use proper buffer size in nla_memcpy

Upstream commit:
    commit ac71b46efd2838c02ec193987c8f61c3ba33b495
    Author: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
    Date:   Mon Mar 28 18:08:59 2016 +0800

    openvswitch: Use proper buffer size in nla_memcpy

    For the input parameter count, it's better to use the size
    of the destination buffer, as nla_memcpy would take into
    account the length of the source netlink attribute when
    data is copied from an attribute.

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Fixes: c5f6c06b58d6 ("datapath: Interface with NAT.")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
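The reasoning above relies on the copy being bounded by both the count
argument and the attribute's own length. A minimal Python model of that
behaviour, with a made-up function name and byte strings standing in for
the netlink attribute payload:

```python
def nla_memcpy_model(dest, src_payload, count):
    """Copy at most `count` bytes, and never more than the attribute holds."""
    n = min(count, len(src_payload))
    dest[:n] = src_payload[:n]
    return n


# Passing the destination size as `count` is safe in both directions.
dest = bytearray(4)
copied_long = nla_memcpy_model(dest, b"ABCDEFGH", len(dest))   # source longer
after_long = bytes(dest)

dest = bytearray(4)
copied_short = nla_memcpy_model(dest, b"xy", len(dest))        # source shorter
after_short = bytes(dest)
```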
7 years agodatapath: conntrack NF_NAT_RANGE_PROTO_RANDOM_FULLY compat code.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: conntrack NF_NAT_RANGE_PROTO_RANDOM_FULLY compat code.

Linux kernel 3.13 and older do not have
NF_NAT_RANGE_PROTO_RANDOM_FULLY (unless backported by the
distribution).  Silently fall back to NF_NAT_RANGE_PROTO_RANDOM to
maintain OVS API compatibility.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: conntrack NAT helper compat code for Linux 4.5 and earlier.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: conntrack NAT helper compat code for Linux 4.5 and earlier.

Upstream commit:
    commit 264619055bd52bc2278af848472176642d759874
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:17 2016 -0800

    netfilter: Allow calling into nat helper without skb_dst.

    NAT checksum recalculation code assumes existence of skb_dst, which
    becomes a problem for a later patch in the series ("openvswitch:
    Interface with NAT.").  Simplify this by removing the check on
    skb_dst, as the checksum will be dealt with later in the stack.

Suggested-by: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds a corresponding backport for Linux 4.5 and older into
datapath/conntrack.c, changing a TCP or UDP packet to CHECKSUM_PARTIAL
to avoid triggering the skb_dst dependency that otherwise crashes the
kernel when checksums are recalculated after NAT helper has mangled
TCP or UDP packet contents.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: Interface with NAT.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: Interface with NAT.

Upstream commit:
    commit 05752523e56502cd9975aec0a2ded465d51a71f3
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:23 2016 -0800

    openvswitch: Interface with NAT.

    Extend OVS conntrack interface to cover NAT.  New nested
    OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action.
    A bare OVS_CT_ATTR_NAT only mangles existing and expected connections.
    If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested
    attributes, new (non-committed/non-confirmed) connections are mangled
    according to the rest of the nested attributes.

    The corresponding OVS userspace patch series includes test cases (in
    tests/system-traffic.at) that also serve as example uses.

    This work extends on a branch by Thomas Graf at
    https://github.com/tgraf/ovs/tree/nat.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: Delay conntrack helper call for new connections.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Delay conntrack helper call for new connections.

Upstream commit:
    commit 28b6e0c1ace45779c60e7cefe6d469b7ecb520b8
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:22 2016 -0800

    openvswitch: Delay conntrack helper call for new connections.

    There is no need to help connections that are not confirmed, so we can
    delay helping new connections to the time when they are confirmed.
    This change is needed for NAT support, and having this as a separate
    patch will make the following NAT patch a bit easier to review.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: Handle NF_REPEAT in conntrack action.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Handle NF_REPEAT in conntrack action.

Upstream commit:
    commit 5b6b929376a621e2bd3367f5de563d7123506597
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:21 2016 -0800

    openvswitch: Handle NF_REPEAT in conntrack action.

    Repeat the nf_conntrack_in() call when it returns NF_REPEAT.  This
    avoids dropping a SYN packet re-opening an existing TCP connection.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: Find existing conntrack entry after upcall.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Find existing conntrack entry after upcall.

Upstream commit:
    commit 289f225349cb2a97448fd14599ab34b741f706f3
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:20 2016 -0800

    openvswitch: Find existing conntrack entry after upcall.

    Add a new function ovs_ct_find_existing() to find an existing
    conntrack entry to which this packet was already applied.  This is
    only to be called when there is evidence that the packet was already
    tracked and committed, but we lost the ct reference due to a
    userspace upcall.

    ovs_ct_find_existing() is called from skb_nfct_cached(), which can now
    hide the fact that the ct reference may have been lost due to an
    upcall.  This allows ovs_ct_commit() to be simplified.

    This patch is needed by later "openvswitch: Interface with NAT" patch,
    as we need to be able to pass the packet through NAT using the
    original ct reference also after the reference is lost after an
    upcall.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: Update the CT state key only after nf_conntrack_in().
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Update the CT state key only after nf_conntrack_in().

Upstream commit:
    commit 394e910e909b174270b8231fd51942eb2f541fb9
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:19 2016 -0800

    openvswitch: Update the CT state key only after nf_conntrack_in().

    Only a successful nf_conntrack_in() call can effect a connection state
    change, so it suffices to update the key only after the
    nf_conntrack_in() returns.

    This change is needed for the later NAT patches.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: Add commentary to conntrack.c
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Add commentary to conntrack.c

Upstream commit:
    commit 9f13ded8d3c715147c4759f937cfb712c185ca13
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:18 2016 -0800

    openvswitch: Add commentary to conntrack.c

    This makes the code easier to understand and the following patches
    more focused.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: Remove NF_CT_NEW_REPLY
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
datapath: Remove NF_CT_NEW_REPLY

Upstream commit:
    commit bfa3f9d7f3b349acea8982d2248e33a0ed84c687
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:16 2016 -0800

    netfilter: Remove IP_CT_NEW_REPLY definition.

    Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
    not make sense.  This allows the definition of IP_CT_NUMBER to be
    simplified as well.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath: compat for NAT.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
datapath: compat for NAT.

Compat code required to make the NAT code in the following patch
compile with Linux 3.10 - 4.6.

Some compat code applies to the conntrack.c itself; these are added
after the main NAT backport for conntrack.c later in the series.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agoacinclude: Add OVS_FIND_PARAM_IFELSE.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
acinclude: Add OVS_FIND_PARAM_IFELSE.

OVS_FIND_PARAM_IFELSE is a more robust macro for checking function
parameters, as it does not require the parameter to be on the same
line as the function name, as OVS_GREP_IFELSE does.

Use this to fix the check for struct conntrack_zone parameter, which
is on a different line on Linux 4.3 and higher.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
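The difference between a line-oriented check and a check over the whole
parameter list can be sketched as below. This is Python for illustration
only, not the m4 macro itself; the header fragment, the function name
frob_widget, and both helper functions are hypothetical.

```python
import re

# Hypothetical header fragment: the parameter of interest sits on the line
# after the function name.
HEADER = """\
struct foo *frob_widget(struct net *net,
                        const struct widget_zone *zone,
                        int flags);
"""


def grep_same_line(text, func, param):
    """Line-oriented check: function name and parameter on one line."""
    return any(func in line and param in line for line in text.splitlines())


def find_param(text, func, param):
    """Search the whole parameter list, which may span several lines."""
    match = re.search(re.escape(func) + r"\s*\(([^)]*)\)", text)
    return bool(match and param in match.group(1))


same_line = grep_same_line(HEADER, "frob_widget", "widget_zone")
whole_list = find_param(HEADER, "frob_widget", "widget_zone")
```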
7 years agotests: Clear TCP state from conntrack dumps.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
tests: Clear TCP state from conntrack dumps.

When the TCP state is not important it is better to ignore it.  This
makes test cases more robust w.r.t. kernel versions and timing.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath-windows: comment cleanup and indentation
Nithin Raju [Thu, 16 Jun 2016 17:17:09 +0000 (10:17 -0700)]
datapath-windows: comment cleanup and indentation

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolution.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years agonetdev-dpdk: NUMA Aware vHost User
Ciara Loftus [Mon, 13 Jun 2016 10:10:09 +0000 (11:10 +0100)]
netdev-dpdk: NUMA Aware vHost User

This commit allows for vHost User memory from QEMU, DPDK and OVS, as
well as the servicing PMD, to all come from the same socket.

The socket id of a vhost-user port used to be set to that of the master
lcore. Now it is possible to update the socket id if it is detected
(during VM boot) that the vhost device memory is not on this node. If
this is the case, a new mempool is created from the new node, and the
PMD thread currently servicing the port will no longer do so, in favour of a
thread from the new node (if enabled in the pmd-cpu-mask).

To avail of this functionality, one must enable the
CONFIG_RTE_LIBRTE_VHOST_NUMA DPDK configuration option.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agodatapath-windows: use ip proto for tunnel port lookup
Nithin Raju [Fri, 17 Jun 2016 17:51:52 +0000 (10:51 -0700)]
datapath-windows: use ip proto for tunnel port lookup

In Actions.c, based on the IP Protocol type and L4 port of
the outer packet, we lookup the tunnel port. The function
that made this happen took the tunnel type as an argument.
Semantically, it is better to pass the IP protocol type and
let the lookup code map IP protocol type to tunnel type.

In the vport add code, we make sure that we block tunnel
port addition if there's already a tunnel port that uses
the same IP protocol type and L4 port number.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Yin Lin <linyi@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
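The lookup and the duplicate check described above amount to keying
tunnel ports on the pair (IP protocol, L4 port). A minimal Python sketch
of that idea follows; the TunnelPortTable class, the port name vxlan0,
and the use of UDP port 4789 are assumptions for the example and do not
reflect the actual Windows datapath structures.

```python
IPPROTO_TCP, IPPROTO_UDP, IPPROTO_GRE = 6, 17, 47


class TunnelPortTable:
    """Hypothetical table of tunnel ports keyed by (IP protocol, L4 port)."""

    def __init__(self):
        self._ports = {}

    def add(self, name, ip_proto, l4_port):
        # Block the addition if the same protocol/port pair is already taken.
        key = (ip_proto, l4_port)
        if key in self._ports:
            raise ValueError("tunnel port already exists for %r" % (key,))
        self._ports[key] = name

    def lookup(self, ip_proto, l4_port):
        # The caller passes what it saw in the outer packet; no tunnel type.
        return self._ports.get((ip_proto, l4_port))


table = TunnelPortTable()
table.add("vxlan0", IPPROTO_UDP, 4789)
found = table.lookup(IPPROTO_UDP, 4789)
missing = table.lookup(IPPROTO_TCP, 4789)

try:
    table.add("vxlan1", IPPROTO_UDP, 4789)
    rejected = False
except ValueError:
    rejected = True
```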
7 years agoipfix: Support tunnel information for Flow IPFIX.
Benli Ye [Tue, 14 Jun 2016 08:53:34 +0000 (16:53 +0800)]
ipfix: Support tunnel information for Flow IPFIX.

Add support to export tunnel information for flow-based IPFIX.
The original steps to configure flow level IPFIX:
    1) Create a new record in Flow_Sample_Collector_Set table:
       'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
    2) Add the IPFIX configuration that is referenced by the corresponding
       row in Flow_Sample_Collector_Set table:
       'ovs-vsctl -- set Flow_Sample_Collector_Set
       "Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX
       targets=\"IP:4739\" obs_domain_id=123 obs_point_id=456
       cache_active_timeout=60 cache_max_flows=13'
    3) Add sample action to the flows:
       'ovs-ofctl add-flow mybridge in_port=1,
       actions=sample'('probability=65535,collector_set_id=1,
       obs_domain_id=123,obs_point_id=456')',output:3'
The NXAST_SAMPLE action was used in step 3. To support exporting tunnel
information, this patch adds the NXAST_SAMPLE2 action; with it, step 3
should be configured as follows:
       'ovs-ofctl add-flow mybridge in_port=1,
       actions=sample'('probability=65535,collector_set_id=1,obs_domain_id=123,
       obs_point_id=456,sampling_port=3')',output:3'
'sampling_port' can be equal to the ingress port or one of the egress ports. If
the sampling port is equal to the output port and the output port is a tunnel port,
OVS_USERSPACE_ATTR_EGRESS_TUN_PORT will be set in the datapath flow sample action.
When the flow sample action upcall happens, tunnel information will be retrieved from
the datapath and then IPFIX can export egress tunnel port information. If
sampling_port=65535 (OFPP_NONE), flow-based IPFIX will keep the same behavior
as before.

This patch mainly does four things:
    1) Add a new flow sample action NXAST_SAMPLE2 to support exporting
       tunnel information. The NXAST_SAMPLE2 action has a newly added field,
       'sampling_port'.
    2) Use 'other_config:enable-tunnel-sampling' to enable or disable
       exporting tunnel information.
    3) If 'sampling_port' is equal to the output port and the output port is a
       tunnel port, the translation of the OpenFlow "sample" action should first
       emit set(tunnel(...)), then the sample action itself. This makes sure the
       egress tunnel information can be sampled.
    4) Add a test of flow-based IPFIX for tunnel set.
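
Item 3 can be sketched roughly as follows (Python, illustrative only). The
function name and the action strings are placeholders rather than real
datapath syntax, and gating the behavior on the enable-tunnel-sampling setting
is an assumption based on item 2.

```python
OFPP_NONE = 65535  # sampling_port value meaning "no sampling port"


def translate_sample(sampling_port, output_port, tunnel_ports,
                     tunnel_sampling_enabled=True):
    """Return placeholder datapath actions for 'sample(...),output:N'.

    tunnel_ports is the set of OpenFlow port numbers that are tunnel ports.
    """
    actions = []
    egress_tunnel = (tunnel_sampling_enabled
                     and sampling_port != OFPP_NONE
                     and sampling_port == output_port
                     and output_port in tunnel_ports)
    if egress_tunnel:
        # Tunnel metadata has to be set before the sample action so that
        # the upcall can see the egress tunnel information.
        actions.append("set(tunnel(...))")
        actions.append("sample(egress_tun_port=%d)" % output_port)
    else:
        # Same behavior as the original NXAST_SAMPLE action.
        actions.append("sample")
    actions.append("output:%d" % output_port)
    return actions
```

With sampling_port=OFPP_NONE, or with an output port that is not a tunnel
port, the sketch falls back to the plain sample action.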

How to test flow-based IPFIX:
    1) Set up a test environment with two Linux hosts with Docker support
    2) Create a Docker container and a GRE tunnel port on each host
    3) Use ovs-docker to add the container to the bridge
    4) Listen on port 4739 on the collector machine and use wireshark to filter
       'cflow' packets.
    5) Configure flow-based IPFIX:
       - 'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
       - 'ovs-vsctl -- set Flow_Sample_Collector_Set
          "Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX \
          targets=\"IP:4739\" cache_active_timeout=60 cache_max_flows=13 \
          other_config:enable-tunnel-sampling=true'
       - 'ovs-ofctl add-flow mybridge in_port=1,
          actions=sample'('probability=65535,collector_set_id=1,obs_domain_id=123,
          obs_point_id=456,sampling_port=3')',output:3'
       Note: The in_port is the container port. The output port and sampling_port
             are both OpenFlow ports, and the output port is a GRE tunnel port.
    6) Ping from the container whose host has flow-based IPFIX enabled.
    7) Get the IPFIX template packets and IPFIX information packets.

Signed-off-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agonetdev-dpdk: Remove vhost send retries when no packets have been sent.
Kevin Traynor [Fri, 10 Jun 2016 16:49:38 +0000 (17:49 +0100)]
netdev-dpdk: Remove vhost send retries when no packets have been sent.

If the guest is connected but not servicing the virt queue, this leads
to vhost send retries until timeout. This is fine in isolation but if
there are other high rate queues also being serviced by the same PMD
it can lead to a performance hit on those queues. Change to only retry
when at least some packets have been successfully sent on the previous
attempt.

Also, limit retries to avoid similar delays if packets are being sent
at a very low rate due to few available descriptors.
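
The retry policy described above can be sketched as follows (Python,
illustrative only). The try_send callback and the MAX_RETRIES value are
hypothetical; the commit message does not state the real limit.

```python
MAX_RETRIES = 8  # assumed cap; the real value is not given in this message


def send_burst(packets, try_send, max_retries=MAX_RETRIES):
    """Send packets, retrying only while the previous attempt made progress.

    try_send(pkts) is a hypothetical callback that returns how many packets
    from the front of pkts the guest accepted. Returns the number of
    packets that were dropped.
    """
    remaining = list(packets)
    retries = 0
    while remaining:
        sent = try_send(remaining)
        if sent == 0:
            break  # no progress: do not spin on a stalled virt queue
        remaining = remaining[sent:]
        if retries == max_retries:
            break  # progress is too slow: stop to protect other queues
        retries += 1
    return len(remaining)
```

A guest that accepts nothing costs exactly one send attempt, and a guest that
drains one packet at a time costs at most max_retries + 1 attempts.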

Reported-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agoofp-util: Fix parsing of parenthesized values within key-value pairs.
Ben Pfaff [Mon, 13 Jun 2016 21:53:01 +0000 (14:53 -0700)]
ofp-util: Fix parsing of parenthesized values within key-value pairs.
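
The commit message does not describe the bug itself. As a generic
illustration of the problem class only (not the ofp-util code), the Python
sketch below splits a key-value string on top-level commas while leaving
commas nested inside parentheses in place. The function name is hypothetical.

```python
def split_top_level(s):
    """Split s on commas that are not nested inside parentheses."""
    tokens, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced parentheses")
        elif ch == "," and depth == 0:
            tokens.append(s[start:i])
            start = i + 1
    if depth != 0:
        raise ValueError("unbalanced parentheses")
    tokens.append(s[start:])
    return tokens
```

For example, a sample(...) action keeps its comma-separated arguments
together instead of being cut at the first inner comma.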

Reported-by: james hopper <jameshopper@email.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-June/021662.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoat test vlog: Switch from stderr to log
Alin Serdean [Wed, 8 Jun 2016 14:02:20 +0000 (14:02 +0000)]
at test vlog: Switch from stderr to log

When the --detach parameter is used, the child does not propagate the first
message to the parent.

The proposed change uses the log file instead of stderr.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Tested-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoovs-ofctl: Fixed PID file naming on windows
Paul Boca [Wed, 8 Jun 2016 08:40:34 +0000 (08:40 +0000)]
ovs-ofctl: Fixed PID file naming on windows

On Windows, if a relative file name (not containing ':') is given to the
--pidfile parameter, the application name is used for the PID file,
ignoring the given name.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agodatapath-windows: Fix misc on vport
Alin Serdean [Tue, 10 May 2016 00:46:01 +0000 (00:46 +0000)]
datapath-windows: Fix misc on vport

Remove unused variables, found by inspection.

On failure, reset the extInfo name.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agodatapath-windows: Sample action support.
Sorin Vinturis [Wed, 1 Jun 2016 15:50:27 +0000 (15:50 +0000)]
datapath-windows: Sample action support.

This patch adds support for sampling to the OVS extension.

The following flow was used for generating sample actions:
  ovs-ofctl add-flow tcp:127.0.0.1:9999 "actions=sample(
    probability=12345,collector_set_id=23456,obs_domain_id=34567,
    obs_point_id=45678)"

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoipfix: Bug fix for not sending template packets on 32-bit OS
Benli Ye [Tue, 14 Jun 2016 03:09:45 +0000 (11:09 +0800)]
ipfix: Bug fix for not sending template packets on 32-bit OS

'last_template_set_time' in struct dpif_ipfix_exporter is declared
as time_t, and time_t is a long int type. If we initialize
'last_template_set_time' to TIME_MIN, its value is -2147483648
on a 32-bit OS and -2^63 on a 64-bit OS. This causes a problem on a
32-bit OS when comparing 'last_template_set_time' with an unsigned int
variable, because the negative value is converted to unsigned and
becomes a large positive number. Fix this problem by simply initializing
'last_template_set_time' to 0.
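
The effect can be reproduced with a small Python emulation of C's 32-bit
signed-to-unsigned conversion. The shape of the comparison and the
TEMPLATE_INTERVAL value are assumptions made for illustration; they are not
quoted from the source file.

```python
INT32_MIN = -2**31        # TIME_MIN when time_t is a 32-bit long
TEMPLATE_INTERVAL = 600   # seconds; assumed value for illustration


def to_uint32(value):
    """Emulate C's conversion of a 32-bit signed value to unsigned."""
    return value & 0xFFFFFFFF


def template_due_32bit(last_template_set_time, export_time_sec):
    """Emulate 'last + interval <= now' with 'now' as an unsigned 32-bit int.

    On a 32-bit OS the signed left-hand side is converted to unsigned
    before the comparison takes place.
    """
    return to_uint32(last_template_set_time + TEMPLATE_INTERVAL) <= export_time_sec
```

With TIME_MIN as the initial value the left-hand side becomes 2147484248,
which is larger than any 2016 timestamp, so the template is never considered
due; with 0 it is due immediately.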

Signed-off-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
7 years agoipfix: Add support for exporting ipfix statistics.
Benli Ye [Mon, 13 Jun 2016 21:44:09 +0000 (14:44 -0700)]
ipfix: Add support for exporting ipfix statistics.

It is useful for users to check the IPFIX stats.
Using IPFIX stats, users can learn how many flows the system
can support. The stats can also be used to check the performance
of IPFIX.

IPFIX stats are added per IPFIX exporter. If bridge IPFIX is
enabled on the bridge, the whole bridge will have one exporter.
For flow IPFIX, the system keeps one exporter per id (column in
Flow_Sample_Collector_Set).

1) Add 'ovs-ofctl dump-ipfix-bridge SWITCH' to export the IPFIX stats of
   a bridge that has bridge IPFIX enabled. The output format:
   NXST_IPFIX_BRIDGE reply (xid=0x2):
     bridge ipfix: flows=0, current flows=0, sampled pkts=0, \
                   ipv4 ok=0, ipv6 ok=0, tx pkts=0
                   pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
2) Add 'ovs-ofctl dump-ipfix-flow SWITCH' to export the IPFIX stats of
   a bridge that has flow IPFIX enabled. The output format:
   NXST_IPFIX_FLOW reply (xid=0x2): 2 ids
     id   1: flows=4, current flows=4, sampled pkts=14, ipv4 ok=13, \
             ipv6 ok=0, tx pkts=0
             pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
     id   2: flows=0, current flows=0, sampled pkts=0, ipv4 ok=0, \
             ipv6 ok=0, tx pkts=0
             pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0

flows: the total number of flow records, including those exported.
current flows: the number of flow records currently cached.
sampled pkts: the count of successfully sampled packets.
ipv4 ok: the count of successfully sampled IPv4 flow packets.
ipv6 ok: the count of successfully sampled IPv6 flow packets.
tx pkts: the count of IPFIX exported packets sent to the collector(s).
pkts errs: the count of packets that failed sampling, for example because they are unsupported or hit another error.
ipv4 errs: the count of IPv4 flow packets among the error packets.
ipv6 errs: the count of IPv6 flow packets among the error packets.
tx errs: the count of IPFIX exported packets that failed to be sent to the collector(s).
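
For scripting against this output, the counters can be pulled out of one
stats entry with a small helper like the Python sketch below. The helper name
is hypothetical and the parsing relies only on the "name=value" layout shown
above; it is not part of ovs-ofctl.

```python
import re

# A counter is a lowercase name (which may contain spaces and digits)
# followed by '=' and a decimal value.
_COUNTER = re.compile(r"([a-z][a-z0-9 ]*?)=(\d+)\b")


def parse_ipfix_counters(text):
    """Return {counter name: value} for one bridge or flow IPFIX entry."""
    return {name.strip(): int(value) for name, value in _COUNTER.findall(text)}
```

The reply header is ignored because its xid value is hexadecimal and so does
not match the decimal counter pattern.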

Signed-off-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-vsctl: Support identifying Flow_Sample_Collector_Set records by id.
Ben Pfaff [Fri, 10 Jun 2016 22:19:03 +0000 (15:19 -0700)]
ovs-vsctl: Support identifying Flow_Sample_Collector_Set records by id.

This allows commands like
    ovs-vsctl list Flow_Sample_Collector_Set 123
if there's a record with id 123.  It's not perfect, since there can be
more than one record with the same id, but it's helpful.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
8 years agonetlink-notifier: Support multiple groups.
Jarno Rajahalme [Mon, 13 Jun 2016 21:22:32 +0000 (14:22 -0700)]
netlink-notifier: Support multiple groups.

A netlink notifier ('nln') already supports multiple notifiers.  This
patch allows each of these notifiers to subscribe to a different
multicast group.  Sharing a single socket for multiple event types
(each on their own multicast group) provides serialization of events
when reordering of different event types could be problematic.  For
example, if a 'create' event and a 'delete' event are on different
netlink multicast groups, we may want to process those events in the
order in which the kernel issued them, rather than in the order we happen
to check for them.

Moving the multicast group argument from nln_create() to
nln_notifier_create() allows each notifier to specify a different
multicast group.  The parse callback needs to identify the group the
message belonged to by returning the corresponding group number, or 0
when a parse error occurs.
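
A toy model of the resulting design, in Python for brevity: one shared
message stream, a parse callback that returns the group number, and
per-notifier group subscriptions. The class and method names only loosely
mirror nln_create() and nln_notifier_create(), and silently skipping messages
that fail to parse is a simplifying assumption.

```python
class Nln:
    """Toy notifier hub: one shared socket, many multicast groups."""

    def __init__(self, parse):
        # parse(msg) returns the group number (> 0), or 0 on parse error.
        self._parse = parse
        self._notifiers = []  # (group, callback) in registration order

    def notifier_create(self, group, callback):
        """Subscribe callback to messages from one multicast group."""
        self._notifiers.append((group, callback))

    def run(self, messages):
        """Dispatch messages strictly in arrival order."""
        for msg in messages:
            group = self._parse(msg)
            if group == 0:
                continue  # parse error: no notifier is called in this sketch
            for subscribed, callback in self._notifiers:
                if subscribed == group:
                    callback(msg)
```

Because every group is read from the same stream, a 'create' followed by a
'delete' reaches the callbacks in that order even though the two events
belong to different groups.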

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
8 years agodpif-netdev: Print installed flows in dpif format.
Jesse Gross [Sat, 28 May 2016 16:56:07 +0000 (09:56 -0700)]
dpif-netdev: Print installed flows in dpif format.

When debug logging is enabled, dpif-netdev can print each flow as it is
installed, which it currently does using OpenFlow match formatting. Compared
to ODP formatting, there generally isn't too much difference since the
fields are largely the same but it is inconsistent with other logging in
dpif-netdev as well as the analogous functions that deal with the kernel.

However, in some cases there is a difference between the two formats, such
as in the cases of input port or tunnel metadata. For input port, datapath
format helped detect that the generated masks were incorrect. As for tunnels,
at the moment, it's possible to convert between the two formats on demand as
we have a global metadata table. In the future, though, this won't be possible
as the metadata table becomes per-bridge which the datapath won't have access
to.

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoodp-util: Remove odp_in_port from struct odp_flow_key_parms.
Jesse Gross [Thu, 9 Jun 2016 20:32:50 +0000 (13:32 -0700)]
odp-util: Remove odp_in_port from struct odp_flow_key_parms.

When calling odp_flow_key_from_flow (or _mask), the in_port included
as part of the flow is ignored and must be explicitly passed as a
separate parameter. This is because the assumption was that the flow's
version would often be in OFP format, rather than ODP.

However, at this point all flows that are ready for serialization in
netlink format already have their in_port properly set to ODP format.
As a result, every caller needs to explicitly initialize the extra
parameter to the value that is in the flow. This switches to just use
the value in the flow to simplify things and avoid the possibility of
forgetting to initialize the extra parameter.

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoofproto-dpif-upcall: Translate input port as part of upcall translation.
Jesse Gross [Thu, 9 Jun 2016 20:18:45 +0000 (13:18 -0700)]
ofproto-dpif-upcall: Translate input port as part of upcall translation.

When we generate wildcards for upcalled flows, the flows, and therefore
the wildcards, are in OpenFlow format. The OpenFlow and datapath formats are
mostly the same, but one exception is the input port. We work around this problem by simply
performing an exact match on the input port when generating netlink
formatted keys. (This does not lose any information in practice because
action translation also always exact matches on input port.)

While this works fine for kernel based flows, it misses the userspace
datapath, which directly consumes the OFP format mask for the input
port. The effect of this is that the in_port mask is sometimes only
the lower 16 bits of the field. (This is because OFP format is a 16-bit
value stored in a 32-bit field. The full width of the field is initialized
with an exact match mask but certain operations result in cleaving this
down to 16 bits.) In practice this does not cause a problem because datapath
port numbers are almost always in the lower 16 bits of the range anyways.

This moves the masking of the datapath format field to translation so that
all datapaths see the same result. This also makes more sense conceptually
as the input port in the flow is also in ODP format at this stage.
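
The effect of a 16-bit in_port mask on a 32-bit field can be shown with a
small Python sketch. The helper and constant names are hypothetical; only the
mask widths come from the description above.

```python
FULL_MASK = 0xFFFFFFFF  # exact match on the 32-bit datapath port number
OFP_MASK = 0x0000FFFF   # mask cut down to the 16-bit OpenFlow width


def masked_match(packet_port, flow_port, mask):
    """Return True if the packet's in_port matches the flow under mask."""
    return (packet_port & mask) == (flow_port & mask)
```

With the 16-bit mask, a packet arriving on datapath port 0x10003 would match
a flow installed for port 3; with the full mask it does not. As noted above,
this rarely matters in practice because datapath port numbers are almost
always below 65536.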

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoovn-architecture.7.xml: Fix ovn-controller behavior in VIF life cycle
Hui Kang [Mon, 13 Jun 2016 16:43:26 +0000 (12:43 -0400)]
ovn-architecture.7.xml: Fix ovn-controller behavior in VIF life cycle

Signed-off-by: Hui Kang <kangh@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>