cascardo/ovs.git
8 years ago  compat: Rename OVS frag caches.
Joe Stringer [Tue, 2 Feb 2016 23:19:00 +0000 (15:19 -0800)]
compat: Rename OVS frag caches.

These should not have the same name as the upstream ones, to reduce
confusion when they are created. Rename them.

Suggested-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  datapath: Fix kernel-4.3 build.
Joe Stringer [Tue, 2 Feb 2016 23:18:59 +0000 (15:18 -0800)]
datapath: Fix kernel-4.3 build.

Commit 792e5ed750ce ("datapath: inet: frag: Always orphan skbs inside
ip_defrag().") broke the build for OVS backport against kernel-4.3. Fix
the build.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  bridge: Do not add bridges with '/' in name.
Daniele Di Proietto [Tue, 2 Feb 2016 21:28:11 +0000 (13:28 -0800)]
bridge: Do not add bridges with '/' in name.

This effectively stops vswitchd from creating bridges with '/' in the
name. OVS used to print a warning but the bridge was created anyway.

This restriction is implemented because the bridge name is part of a
filesystem path.

This check is no substitute for Mandatory Access Control, but it
certainly helps to catch the error early.
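
As an illustration only (this is a Python sketch, not the vswitchd code), the
check described above amounts to refusing the name up front instead of warning
and continuing. The helper names below are hypothetical; only the '/' rule
comes from this commit message.

```python
def bridge_name_is_valid(name: str) -> bool:
    """Return True if 'name' is safe to embed in a filesystem path.

    Hypothetical helper; only the '/' rule comes from the commit message.
    """
    return "/" not in name


def add_bridge(bridges: set, name: str) -> bool:
    """Add 'name' to 'bridges' unless it is rejected; report success."""
    if not bridge_name_is_valid(name):
        return False  # refuse, rather than warn and create the bridge anyway
    bridges.add(name)
    return True
```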

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
[blp@ovn.org added a test]
Acked-by: Ben Pfaff <blp@ovn.org>
8 years ago  ofproto: Detect and handle errors in ofproto_port_add().
Ben Pfaff [Wed, 3 Feb 2016 01:57:46 +0000 (17:57 -0800)]
ofproto: Detect and handle errors in ofproto_port_add().

The update_port() function called in ofproto_port_add() can encounter
errors that prevent a port from being added, but nothing was checking for
the error and in fact update_port() didn't even pass the error along to
its caller.  This commit fixes the problem.

The scenario that led me to examine this code can be triggered as follows
from the sandbox, as long as you change --enable-dummy=override to
--enable-dummy=system in ovs-sandbox:

ovs-vsctl add-br br0
ovs-vsctl add-port br0 tun0 \
    -- set interface tun0 type=stt options:remote_ip=1.2.3.4
ovs-vsctl add-port br0 tun1 \
    -- set interface tun1 type=stt options:remote_ip=1.2.3.4

The second add-port will fail due to the duplicate tunnel options, but
ofproto_port_add() will not return the error.  Instead, it will report to
the caller that it succeeded and that the port has ofp_port OFPP_NONE
(65535), which is invalid and which the port obviously does not have.  The
result is that
you get bizarre log messages like this:

    tunnel|WARN|tun1: attempting to add tunnel port with same config as port 'tun0' (::->1.2.3.4, key=0, dp port=7471, pkt mark=0)
    ofproto|WARN|br0: could not add port tun1 (File exists)
    bridge|INFO|bridge br0: added interface tun1 on port 65535
    ofproto|WARN|br0: cannot configure bfd on nonexistent port 65535
    ofproto|WARN|br0: cannot configure LLDP on nonexistent port 65535
    ofproto|WARN|br0: cannot get STP status on nonexistent port 65535
    ofproto|WARN|br0: cannot get RSTP status on nonexistent port 65535
    ofproto|WARN|br0: cannot get STP stats on nonexistent port 65535
    ofproto|WARN|br0: cannot get STP stats on nonexistent port 65535
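
The shape of the fix can be sketched as follows. This is a toy Python model
under stated assumptions, not the ofproto code: the function names echo the
ones above, but the signatures, the dict of tunnel configs and the
(error, ofp_port) return convention are invented for the example.

```python
import errno

OFPP_NONE = 65535  # "no port" value quoted in the commit message


def update_port(existing_configs, name, config):
    """Return 0 on success or a positive errno value on failure."""
    if config in existing_configs.values():
        return errno.EEXIST  # duplicate tunnel configuration
    existing_configs[name] = config
    return 0


def port_add(existing_configs, name, config, next_port_no):
    """Return (error, ofp_port); ofp_port is OFPP_NONE whenever error != 0."""
    error = update_port(existing_configs, name, config)
    if error:
        # Pass the error along instead of reporting success with OFPP_NONE.
        return error, OFPP_NONE
    return 0, next_port_no
```

With this convention a caller can check the error first and never log
"added interface ... on port 65535".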

VMware-BZ: #1598643
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years ago  dpif-netdev: Fix improper use of CMAP_FOR_EACH.
Daniele Di Proietto [Wed, 27 Jan 2016 02:53:52 +0000 (18:53 -0800)]
dpif-netdev: Fix improper use of CMAP_FOR_EACH.

It is ok to iterate a cmap with CMAP_FOR_EACH and remove elements with
cmap_remove(), but having quiescent states inside the loop might create
problems, since some of the postponed cleanup done inside the cmap might
be executed, freeing the memory that the iterator is using.

We had several of these errors in dpif-netdev, because when we rearrange
ports or threads we often need to wait on a condition variable (which
implies a quiescent state).

This problem caused iterations to skip elements or to list them twice,
resulting in the main thread waiting on a condition without anyone else
to signal.

Fix these cases by moving the possible quiescent states outside
CMAP_FOR_EACH loops.
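
A toy Python model of the hazard, for illustration only. ToyMap is not the OVS
cmap API; it merely imitates "removal postpones the free, and a quiescent
state lets the postponed free run". The drain() helper counts how often an
iterator would have read an entry that was already freed.

```python
class ToyMap:
    """Toy stand-in for an RCU-protected map (not the OVS cmap API)."""

    def __init__(self, keys):
        self.live = list(keys)
        self.postponed = []  # removed entries awaiting cleanup
        self.freed = set()   # entries whose memory has been "freed"

    def remove(self, key):
        self.live.remove(key)
        self.postponed.append(key)

    def quiesce(self):
        """A quiescent state: postponed cleanup may run now."""
        self.freed.update(self.postponed)
        self.postponed.clear()


def drain(m, quiesce_inside_loop):
    """Remove every entry; return the number of reads of freed entries."""
    bad_reads = 0
    for key in list(m.live):      # the snapshot stands in for the iterator
        m.remove(key)
        if quiesce_inside_loop:
            m.quiesce()           # e.g. waiting on a condition variable
        if key in m.freed:        # advancing reads the current entry
            bad_reads += 1
    if not quiesce_inside_loop:
        m.quiesce()               # safe: no iteration is in progress
    return bad_reads
```

Moving the quiescent state after the loop, as in the second mode, is the
pattern this commit applies.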

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years ago  dpif-netdev: Delay packets' metadata initialization.
Daniele Di Proietto [Fri, 29 Jan 2016 01:47:51 +0000 (17:47 -0800)]
dpif-netdev: Delay packets' metadata initialization.

When a group of packets arrives from a port, we loop through them to
initialize metadata and then we loop through them again to extract the
flow and perform the exact match classification.

This commit combines the two loops into one, and initializes packet->md
in emc_processing() to improve performance.

Since emc_processing() might also be called after recirculation (in
which case the metadata is already valid), an extra parameter is added
to support both cases.

This commit also implements simple prefetching of packet metadata,
to further improve performance.
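
A rough Python sketch of the combined loop, assuming packets are plain dicts.
The field names and the flow key layout are made up for the example and do not
mirror struct dp_packet; prefetching is not modelled.

```python
def emc_processing(packets, md_is_valid, port_no):
    """Single pass: initialize metadata if needed, then build a flow key."""
    keys = []
    for pkt in packets:
        if not md_is_valid:
            # Fresh from a port: metadata has not been set up yet.
            pkt["md"] = {"in_port": port_no, "recirc_id": 0}
        # After recirculation the existing metadata is used as-is.
        md = pkt["md"]
        keys.append((md["in_port"], md["recirc_id"], pkt["payload"]))
    return keys
```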

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Andy Zhou <azhou@ovn.org>
Acked-by: Chandran, Sugesh <sugesh.chandran@intel.com>
8 years ago  compat: Detect and use nf_ct_frag6_gather().
Joe Stringer [Fri, 8 Jan 2016 01:47:23 +0000 (17:47 -0800)]
compat: Detect and use nf_ct_frag6_gather().

This function is a likely candidate for backporting, and currently
relies on version checks to include the source or not. Grep for the
appropriate functions instead, and include the backport based on that.
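
The idea of detecting by symbol rather than by version can be sketched in
Python as below. This is not the build system code; the helper names and the
HAVE_ naming are assumptions for the example, and the header text used in a
call would be whatever the kernel headers under test contain.

```python
import re


def header_declares(header_text: str, symbol: str) -> bool:
    """True if 'symbol' appears as a whole word in the header text."""
    pattern = r"\b" + re.escape(symbol) + r"\b"
    return re.search(pattern, header_text) is not None


def compat_defines(header_text: str, symbols) -> list:
    """Return a HAVE_<SYMBOL> name for every symbol found in the header."""
    return ["HAVE_" + s.upper() for s in symbols
            if header_declares(header_text, s)]
```

A kernel that carries a backport of the function then gets the right code
path regardless of its version number.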

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  compat: Detect and use inet_getpeer_v4().
Joe Stringer [Fri, 8 Jan 2016 01:58:59 +0000 (17:58 -0800)]
compat: Detect and use inet_getpeer_v4().

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  compat: Detect and use __skb_dst_copy().
Joe Stringer [Thu, 24 Dec 2015 19:41:40 +0000 (11:41 -0800)]
compat: Detect and use __skb_dst_copy().

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  compat: Detect and use nf_connlabels_get().
Joe Stringer [Thu, 24 Dec 2015 19:34:35 +0000 (11:34 -0800)]
compat: Detect and use nf_connlabels_get().

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  compat: Detect and use nf_ipv6_ops->fragment.
Joe Stringer [Thu, 24 Dec 2015 19:32:38 +0000 (11:32 -0800)]
compat: Detect and use nf_ipv6_ops->fragment.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  compat: Detect and use struct nf_conntrack_zone.
Joe Stringer [Thu, 24 Dec 2015 19:29:34 +0000 (11:29 -0800)]
compat: Detect and use struct nf_conntrack_zone.

Rather than relying on version checks, detect the presence of this
structure and use it if available.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  compat: Detect and use inet_frags->lock.
Joe Stringer [Thu, 24 Dec 2015 19:06:18 +0000 (11:06 -0800)]
compat: Detect and use inet_frags->lock.

Prior to ab1c724f6330 ("inet: frag: use seqlock for hash rebuild")
upstream, a rwlock was used when rebuilding inet_frags. Rather than
using a version check to detect this, search for it in the header and
enable the code based on whether it exists.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  compat: Detect and use inet_frags->frags_work.
Joe Stringer [Thu, 24 Dec 2015 18:54:37 +0000 (10:54 -0800)]
compat: Detect and use inet_frags->frags_work.

Kernels 3.17 and newer have a work queue to evict old fragments, while
older kernel versions use an LRU in the fast path; see upstream commit
b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue").
This commit fixes the version checking so that rather than enabling the
code for either of these approaches using version checks, it is
triggered based on the presence of the work queue in "struct inet_frags".

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  compat: Detect and use inet_frag_queue->last_in.
Joe Stringer [Thu, 24 Dec 2015 18:40:02 +0000 (10:40 -0800)]
compat: Detect and use inet_frag_queue->last_in.

Kernels 3.17 and older have this field, while newer kernels use the
'flags' field. Detect this in the build in case anyone backports this
change to an older kernel.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  netdev-dpdk: Fix leak on netdev_dpdk_vhost_user_construct failure.
Ilya Maximets [Tue, 2 Feb 2016 11:02:16 +0000 (14:02 +0300)]
netdev-dpdk: Fix leak on netdev_dpdk_vhost_user_construct failure.

The memory pool for vhost-user ports was always created, even if
construction failed, and the message about successful socket creation
was printed as well.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years ago  netdev-dpdk: Unlink vhost-user sockets on fatal signals.
Ilya Maximets [Tue, 2 Feb 2016 11:02:15 +0000 (14:02 +0300)]
netdev-dpdk: Unlink vhost-user sockets on fatal signals.

When OVS is killed, it may not call rte_vhost_driver_unregister()
for vhost-user ports. As a result, the corresponding socket will
remain in the system and opening that port after restart
will fail.

(Even after this patch this remains a problem for signals
that OVS does not or cannot catch, such as SIGSEGV and
SIGKILL.)

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years ago  odp-util: Fix formatting and parsing of 'frag' in tnl_push ipv4 argument.
Ben Pfaff [Mon, 1 Feb 2016 19:31:54 +0000 (11:31 -0800)]
odp-util: Fix formatting and parsing of 'frag' in tnl_push ipv4 argument.

ip_frag_off is an ovs_be16 so it must be converted between host and
network byte order for parsing and formatting.
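
For illustration, the conversion can be shown in Python with the standard
struct module ("!H" is a 16-bit value in network byte order). The helper names
are made up for the example; IP_DF is the standard IPv4 "don't fragment" flag.

```python
import struct

IP_DF = 0x4000  # "don't fragment" bit in the IPv4 flags/fragment-offset field


def be16_to_host(raw: bytes) -> int:
    """Decode a 2-byte big-endian (network order) field into an int."""
    return struct.unpack("!H", raw)[0]


def host_to_be16(value: int) -> bytes:
    """Encode an int as a 2-byte big-endian (network order) field."""
    return struct.pack("!H", value)
```

Reading the same two bytes without conversion on a little-endian host would
give 0x0040 instead of 0x4000, which is the kind of mismatch fixed here.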

Reported-by: Dimitri John Ledkov <xnox@ubuntu.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-January/020072.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Dimitri John Ledkov <xnox@ubuntu.com>
8 years ago  ovn-northd: Don't set custom log level defaults.
Russell Bryant [Mon, 1 Feb 2016 14:58:22 +0000 (09:58 -0500)]
ovn-northd: Don't set custom log level defaults.

ovn-northd set some custom log level defaults, which I believe were
copied from ovs-vsctl.  Other daemons don't set this.  The difference in
behavior in ovn-northd vs other daemons has caused some confusion during
OpenStack+OVN development and testing, so make it consistent.

Reported-by: Ryan Moats <rmoats@us.ibm.com>
Reported-at: https://bugs.launchpad.net/bugs/1539994
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Kyle Mestery <mestery@mestery.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years ago  acinclude.m4: Fix dpdk build if -mssse3 not supported.
Ilya Maximets [Tue, 12 Jan 2016 11:15:39 +0000 (14:15 +0300)]
acinclude.m4: Fix dpdk build if -mssse3 not supported.

On arm/arm64:
gcc: error: unrecognized command line option '-mssse3'

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years ago  datapath: inet: frag: Always orphan skbs inside ip_defrag().
Joe Stringer [Fri, 29 Jan 2016 19:01:56 +0000 (11:01 -0800)]
datapath: inet: frag: Always orphan skbs inside ip_defrag().

When the Linux stack is an endpoint connected to OVS which is performing
IP fragmentation via conntrack actions, it's possible to hit a kernel
BUG. The following upstream commit fixes the issue inside ip_defrag().
For the backport, we apply the fix inside ip_defrag() on kernels for which
we currently backport that function, and provide just the bugfix for
newer kernels, so we can continue to use upstream functionality as much
as possible.

Upstream commit:
    Later parts of the stack (including fragmentation) expect that there is
    never a socket attached to frag in a frag_list, however this invariant
    was not enforced on all defrag paths. This could lead to the
    BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
    end of this commit message.

    While the call could be added to openvswitch to fix this particular
    error, the head and tail of the frags list are already orphaned
    indirectly inside ip_defrag(), so it seems like the remaining fragments
    should all be orphaned in all circumstances.

    kernel BUG at net/ipv4/ip_output.c:586!
    [...]
    Call Trace:
     <IRQ>
     [<ffffffffa0205270>] ? do_output.isra.29+0x1b0/0x1b0 [openvswitch]
     [<ffffffffa02167a7>] ovs_fragment+0xcc/0x214 [openvswitch]
     [<ffffffff81667830>] ? dst_discard_out+0x20/0x20
     [<ffffffff81667810>] ? dst_ifdown+0x80/0x80
     [<ffffffffa0212072>] ? find_bucket.isra.2+0x62/0x70 [openvswitch]
     [<ffffffff810e0ba5>] ? mod_timer_pending+0x65/0x210
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffffa03205a2>] ? nf_conntrack_in+0x252/0x500 [nf_conntrack]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa02051a3>] do_output.isra.29+0xe3/0x1b0 [openvswitch]
     [<ffffffffa0206411>] do_execute_actions+0xe11/0x11f0 [openvswitch]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa0206822>] ovs_execute_actions+0x32/0xd0 [openvswitch]
     [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa02068a2>] ovs_execute_actions+0xb2/0xd0 [openvswitch]
     [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
     [<ffffffffa0215019>] ? ovs_ct_get_labels+0x49/0x80 [openvswitch]
     [<ffffffffa0213a1d>] ovs_vport_receive+0x5d/0xa0 [openvswitch]
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
     [<ffffffffa02148fc>] internal_dev_xmit+0x6c/0x140 [openvswitch]
     [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
     [<ffffffff81660299>] dev_hard_start_xmit+0x2b9/0x5e0
     [<ffffffff8165fc21>] ? netif_skb_features+0xd1/0x1f0
     [<ffffffff81660f20>] __dev_queue_xmit+0x800/0x930
     [<ffffffff81660770>] ? __dev_queue_xmit+0x50/0x930
     [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
     [<ffffffff81669876>] ? neigh_resolve_output+0x106/0x220
     [<ffffffff81661060>] dev_queue_xmit+0x10/0x20
     [<ffffffff816698e8>] neigh_resolve_output+0x178/0x220
     [<ffffffff816a8e6f>] ? ip_finish_output2+0x1ff/0x590
     [<ffffffff816a8e6f>] ip_finish_output2+0x1ff/0x590
     [<ffffffff816a8cee>] ? ip_finish_output2+0x7e/0x590
     [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
     [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
     [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
     [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
     [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
     [<ffffffff816ab4c0>] ip_output+0x70/0x110
     [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
     [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
     [<ffffffff816abf89>] ip_send_skb+0x19/0x40
     [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
     [<ffffffff816df21a>] icmp_push_reply+0xea/0x120
     [<ffffffff816df93d>] icmp_reply.constprop.23+0x1ed/0x230
     [<ffffffff816df9ce>] icmp_echo.part.21+0x4e/0x50
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffff810d5f9e>] ? rcu_read_lock_held+0x5e/0x70
     [<ffffffff816dfa06>] icmp_echo+0x36/0x70
     [<ffffffff816e0d11>] icmp_rcv+0x271/0x450
     [<ffffffff816a4ca7>] ip_local_deliver_finish+0x127/0x3a0
     [<ffffffff816a4bc1>] ? ip_local_deliver_finish+0x41/0x3a0
     [<ffffffff816a5160>] ip_local_deliver+0x60/0xd0
     [<ffffffff816a4b80>] ? ip_rcv_finish+0x560/0x560
     [<ffffffff816a46fd>] ip_rcv_finish+0xdd/0x560
     [<ffffffff816a5453>] ip_rcv+0x283/0x3e0
     [<ffffffff810b6302>] ? match_held_lock+0x192/0x200
     [<ffffffff816a4620>] ? inet_del_offload+0x40/0x40
     [<ffffffff8165d062>] __netif_receive_skb_core+0x392/0xae0
     [<ffffffff8165e68e>] ? process_backlog+0x8e/0x230
     [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
     [<ffffffff8165d7c8>] __netif_receive_skb+0x18/0x60
     [<ffffffff8165e678>] process_backlog+0x78/0x230
     [<ffffffff8165e6dd>] ? process_backlog+0xdd/0x230
     [<ffffffff8165e355>] net_rx_action+0x155/0x400
     [<ffffffff8106b48c>] __do_softirq+0xcc/0x420
     [<ffffffff816a8e87>] ? ip_finish_output2+0x217/0x590
     [<ffffffff8178e78c>] do_softirq_own_stack+0x1c/0x30
     <EOI>
     [<ffffffff8106b88e>] do_softirq+0x4e/0x60
     [<ffffffff8106b948>] __local_bh_enable_ip+0xa8/0xb0
     [<ffffffff816a8eb0>] ip_finish_output2+0x240/0x590
     [<ffffffff816a9a31>] ? ip_do_fragment+0x831/0x8a0
     [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
     [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
     [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
     [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
     [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
     [<ffffffff816ab4c0>] ip_output+0x70/0x110
     [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
     [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
     [<ffffffff816abf89>] ip_send_skb+0x19/0x40
     [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
     [<ffffffff816d55d3>] raw_sendmsg+0x7d3/0xc30
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff816e7557>] ? inet_sendmsg+0xc7/0x1d0
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffff816e759a>] inet_sendmsg+0x10a/0x1d0
     [<ffffffff816e7495>] ? inet_sendmsg+0x5/0x1d0
     [<ffffffff8163e398>] sock_sendmsg+0x38/0x50
     [<ffffffff8163ec5f>] ___sys_sendmsg+0x25f/0x270
     [<ffffffff811aadad>] ? handle_mm_fault+0x8dd/0x1320
     [<ffffffff8178c147>] ? _raw_spin_unlock+0x27/0x40
     [<ffffffff810529b2>] ? __do_page_fault+0x1e2/0x460
     [<ffffffff81204886>] ? __fget_light+0x66/0x90
     [<ffffffff8163f8e2>] __sys_sendmsg+0x42/0x80
     [<ffffffff8163f932>] SyS_sendmsg+0x12/0x20
     [<ffffffff8178cb17>] entry_SYSCALL_64_fastpath+0x12/0x6f
    Code: 00 00 44 89 e0 e9 7c fb ff ff 4c 89 ff e8 e7 e7 ff ff 41 8b 9d 80 00 00 00 2b 5d d4 89 d8 c1 f8 03 0f b7 c0 e9 33 ff ff f
     66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48
    RIP  [<ffffffff816a9a92>] ip_do_fragment+0x892/0x8a0
     RSP <ffff88006d603170>

    Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  datapath: Fix IPv6 fragment expiry crash.
Joe Stringer [Wed, 27 Jan 2016 00:49:36 +0000 (00:49 +0000)]
datapath: Fix IPv6 fragment expiry crash.

Prior to a series of commits in 3.17 like the following, the model
used to manage and expire fragments was different. We already backport
several of these functions (See datapath/compat/inet_fragment.c) to do
things like allocate/evict/destroy frags and frag queues. In the IPv4
code, we use these. In most of the IPv6 cases, we already reuse these
also. However, for timed frag expiration we instead call the upstream
version of the function, which proceeds to use the upstream versions
of the functions we backport in inet_fragment.c. There can be some
discrepancy between the offsets used in these upstream versions vs. the
backport versions, so if you mix/match them then it leads to invalid
dereferences.

b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue")
ab1c724f6330 ("inet: frag: use seqlock for hash rebuild")

Fixes the following kernel oops on kernels < 3.17 when IPv6 fragments
are expired without reassembling the frame.

BUG: unable to handle kernel paging request at 00000006845d69a8
IP: [<ffffffff8172c09e>] _raw_spin_lock+0xe/0x50
...
Call Trace:
 <IRQ>
 [<ffffffff816a32d3>] inet_frag_kill+0x63/0x100
 [<ffffffff816ead93>] ip6_expire_frag_queue+0x63/0x110
 [<ffffffffa01130e6>] nf_ct_frag6_expire+0x26/0x30 [openvswitch]
 [<ffffffff810744f6>] call_timer_fn+0x36/0x100
 [<ffffffffa01130c0>] ? nf_ct_net_init+0x20/0x20 [openvswitch]
 [<ffffffff8107548f>] run_timer_softirq+0x1ef/0x2f0
 [<ffffffff8106cccc>] __do_softirq+0xec/0x2c0
 [<ffffffff8106d215>] irq_exit+0x105/0x110
 [<ffffffff81737095>] smp_apic_timer_interrupt+0x45/0x60
 [<ffffffff81735a1d>] apic_timer_interrupt+0x6d/0x80
 <EOI>
 [<ffffffff8104f596>] ? native_safe_halt+0x6/0x10
 [<ffffffff8101cb2f>] default_idle+0x1f/0xc0
 [<ffffffff8101d406>] arch_cpu_idle+0x26/0x30
 [<ffffffff810bf3a5>] cpu_startup_entry+0xc5/0x290
 [<ffffffff817122e7>] rest_init+0x77/0x80
 [<ffffffff81d34f70>] start_kernel+0x438/0x443

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  ovn: Remove top ovn directory from PATHs.
Ilya Maximets [Fri, 29 Jan 2016 09:20:13 +0000 (12:20 +0300)]
ovn: Remove top ovn directory from PATHs.

Since 5b5c922b0ca6 ("ovn-nbctl: Move ovn-nbctl to utilities directory.")
there are no more executables in the top ovn directory.

Removing this directory from PATHs helps to avoid problems when the
old executable ./ovn/ovn-nbctl is used instead of ./ovn/utilities/ovn-nbctl.

This may happen if the source directory was updated to commit 5b5c922b0ca6
without calling 'make clean'.

Fixes: 5b5c922b0ca6 ("ovn-nbctl: Move ovn-nbctl to utilities directory.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
8 years ago  datapath: test for netlink_set_err returning void
Simon Horman [Fri, 27 Nov 2015 06:07:23 +0000 (22:07 -0800)]
datapath: test for netlink_set_err returning void

In v2.6.33 netlink_set_err returns void. However, 1a50307ba182 ("netlink:
fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()") was backported and
included in v2.6.33.2 and in that and subsequent v2.6.33 stable releases
netlink_set_err returns an int.

It seems plausible that there are other backports floating around. So check
for netlink_set_err returning void rather than including compatibility code
based on the version of the kernel.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years ago  netdev-dpdk: Add vhost-user multiqueue support
Flavio Leitner [Tue, 26 Jan 2016 18:58:14 +0000 (16:58 -0200)]
netdev-dpdk: Add vhost-user multiqueue support

Most network cards today support multiple receive
and transmit queues (MQ).  The core idea is that on packet
reception, a NIC can send different packets to different
queues to distribute processing among CPUs running in parallel.
The packet distribution is based on the result of a filter applied
to each packet's headers. The filter should keep all packets from
the same flow on the same queue to avoid re-ordering while
distributing different flows among all available queues.
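
A minimal Python sketch of such a filter, assuming a flow is identified by a
tuple of header fields. CRC32 is only a stand-in chosen for the example, not
the hash any particular NIC uses, and pick_queue() is a hypothetical name.

```python
import zlib


def pick_queue(flow, n_queues: int) -> int:
    """Map a flow tuple to a queue index in [0, n_queues)."""
    digest = zlib.crc32(repr(flow).encode())
    return digest % n_queues
```

Because the result depends only on the flow tuple, every packet of one flow
lands on the same queue, while different flows can spread over all queues.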

This is how the packet moves in a typical vhost-user use-case:

NIC             OVS
DPDK port ==== bridge --- vhost-user ==== qemu ==== virtio eth0

The DPDK ports, OVS bridges, the virtio network driver and,
recently, QEMU (vhost-user) support MQ.  This patch adds MQ
support to OVS, which leverages the DPDK vhost library to implement
vhost-user interfaces.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <kevin.traynor@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years ago  ofproto-dpif-xlate: Do not execute resubmit again after recirculation.
Ben Pfaff [Wed, 27 Jan 2016 17:14:18 +0000 (09:14 -0800)]
ofproto-dpif-xlate: Do not execute resubmit again after recirculation.

Consider the following flow table:

    table=0 actions=resubmit(,1),2
    table=1 actions=debug_recirc

When debug_recirc triggers recirculation and we later resume processing,
only the output to port 2 should be executed, because the effects of
"resubmit" have already taken place.  However, until now, the "resubmit"
was added to the actions to execute post-recirculation, resulting in an
infinite loop.

Now consider this flow table (as seen in the "MPLS handling" test in
ofproto-dpif.at):

    table=0 actions=pop_mpls(0x0806),resubmit(,1)
    table=1 ip,nw_dst=1.2.3.4 actions=controller

Here, we do want to add the "resubmit" to the actions to execute
post-recirculation, since the "resubmit" cannot be processed until after
recirculation makes the nw_dst field available.

This commit fixes the problem in both cases.

Found when testing a feature based on recirculation.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years ago  README.ovs-vtep.md: Fix incorrect spacing.
Kyle Mestery [Wed, 27 Jan 2016 23:55:28 +0000 (17:55 -0600)]
README.ovs-vtep.md: Fix incorrect spacing.

This fixes a simple formatting issue with this file I noticed while reviewing
the example of experimenting with the OVS HW-VTEP simulator.

Signed-off-by: Kyle Mestery <mestery@mestery.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years ago  NEWS: DPDK 2.2 is now required.
Flavio Leitner [Wed, 27 Jan 2016 16:18:09 +0000 (14:18 -0200)]
NEWS: DPDK 2.2 is now required.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years ago  dpif-netdev: Unique and sequential tx_qids.
Ilya Maximets [Tue, 26 Jan 2016 06:12:34 +0000 (09:12 +0300)]
dpif-netdev: Unique and sequential tx_qids.

Currently tx_qid is equal to pmd->core_id. This leads to unexpected
behavior if pmd-cpu-mask does not match '/(0*)(1|3|7)?(f*)/',
e.g. if core_ids are not sequential, or do not start from 0, or both.

Example:
starting 2 pmd threads with 1 port, 2 rxqs per port,
pmd-cpu-mask = 00000014 and let dev->real_n_txq = 2

In this case pmd_1->tx_qid = 2, pmd_2->tx_qid = 4 and
txq_needs_locking = true (if the device does not have
ovs_numa_get_n_cores()+1 queues).

In that case, after truncating in netdev_dpdk_send__():
'qid = qid % dev->real_n_txq;'
pmd_1: qid = 2 % 2 = 0
pmd_2: qid = 4 % 2 = 0

So, both threads will call dpdk_queue_pkts() with the same qid = 0.
This is unexpected behavior if there are 2 tx queues in the device:
queue #1 will not be used and both threads will lock queue #0
on each send.

Fix that by using sequential tx_qids.
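
The arithmetic in the example can be replayed with a short Python sketch. The
function names are made up; only the mask 00000014, the two tx queues and the
'% real_n_txq' truncation come from the text above.

```python
def cores_from_mask(mask: int) -> list:
    """Return the core ids whose bits are set in a pmd-cpu-mask value."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def qids_from_core_ids(mask: int, n_txq: int) -> list:
    """Old scheme: tx_qid = core_id, then truncated with '% n_txq'."""
    return [core % n_txq for core in cores_from_mask(mask)]


def qids_sequential(mask: int, n_txq: int) -> list:
    """New scheme: tx_qids are 0, 1, 2, ... in core order."""
    return [i % n_txq for i, _ in enumerate(cores_from_mask(mask))]
```

For mask 0x14 and 2 tx queues, the old scheme maps both threads to queue 0,
while sequential ids give each thread its own queue.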

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years ago  dpif-netdev: Rework of rx queue management.
Ilya Maximets [Tue, 26 Jan 2016 06:12:33 +0000 (09:12 +0300)]
dpif-netdev: Rework of rx queue management.

The current rx queue management model is buggy and will not work properly
without additional barriers and other synchronization between PMD
threads and the main thread.

Known bugs of the current model:
* While reloading, two PMD threads, one already reloaded and
  one not yet reloaded, can poll the same queue of the same port.
  This may lead to DPDK driver failures, because the drivers
  are not thread-safe.
* The same bug as fixed in commit e4e74c3a2b
  ("dpif-netdev: Purge all ukeys when reconfigure pmd.") can be
  reproduced by only reconfiguring pmd threads without
  restarting, because an addition may change the sequence of
  other ports, which is important at the time of reconfiguration.

Introduce a new model in which the main thread distributes the queues,
with minimal synchronization and without data races between
pmd threads. This model should also be faster, because only the
threads that need it are interrupted for reconfiguration and the total
computational complexity of reconfiguration is lower.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years ago  ovs-lib: Try to call exit before killing.
Ilya Maximets [Wed, 16 Dec 2015 12:32:21 +0000 (15:32 +0300)]
ovs-lib: Try to call exit before killing.

When OVS is killed, it may not free all allocated resources.

Example:
The socket for a vhost-user port will remain in the system
after 'systemctl stop openvswitch', and opening
that port after restart will fail.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years ago  Update relevant artifacts to add support for DPDK v2.2.0.
mweglicx [Wed, 23 Dec 2015 10:20:22 +0000 (10:20 +0000)]
Update relevant artifacts to add support for DPDK v2.2.0.

The following changes have been applied:
 - INSTALL.DPDK.md: change DPDK version number,
 - build.sh: change DPDK version number.

Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years ago  ofproto-dpif-xlate: Fix recirculation for resubmit to current table.
Ben Pfaff [Fri, 22 Jan 2016 23:58:55 +0000 (15:58 -0800)]
ofproto-dpif-xlate: Fix recirculation for resubmit to current table.

When recirculation defers actions for processing later, it decides
based on the actions being saved whether it needs to record the table
and cookie from which they originated.  Until now, it was thought that
this was only important for actions that send packets to the controller
(because those actions send the table ID and cookie).  This overlooked
a special case of the "resubmit" action which also depends on the
current table ID, which meant that this special case malfunctioned if
it came after recirculation.  This commit fixes the problem, and adds
a test.

Found while testing another feature under development.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years ago  datapath: compat: Add NULL check for tun-dst.
Pravin B Shelar [Thu, 21 Jan 2016 05:17:45 +0000 (21:17 -0800)]
datapath: compat: Add NULL check for tun-dst.

tun-dst can be NULL in the case of an incorrect action list
where the set tunnel action is missing but the packet is sent
to a tunnel vport.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
8 years ago  ovn-controller: Update check for parent port.
Russell Bryant [Wed, 20 Jan 2016 16:17:58 +0000 (11:17 -0500)]
ovn-controller: Update check for parent port.

There were a couple of checks that checked for a parent port as the
field being non-NULL.  We should treat an empty string the same as NULL
for this field.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years ago  ovn-nbctl: Update show format for addresses.
Russell Bryant [Thu, 14 Jan 2016 16:00:52 +0000 (11:00 -0500)]
ovn-nbctl: Update show format for addresses.

This patch updates the formatting for the Logical_Port addresses column
in the show command output.  Previously, output would look like:

  addresses: 00:00:00:00:00:01 192.168.1.1 00:00:00:00:00:01 192.168.1.2

Now it looks like:

  addresses: ["00:00:00:00:00:01 192.168.1.1", "00:00:00:00:00:01 192.168.1.2"]

The grouping of addresses is important, so it should be reflected in the
output.
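
The difference between the two formats can be reproduced with a small Python
sketch. This only imitates the output shown above (json.dumps happens to
produce the same bracketed, quoted form); it is not how ovn-nbctl formats the
column, and the function names are made up.

```python
import json


def format_addresses_old(addresses) -> str:
    """Old style: groups run together, so their boundaries are lost."""
    return "addresses: " + " ".join(addresses)


def format_addresses_new(addresses) -> str:
    """New style: one quoted string per address group."""
    return "addresses: " + json.dumps(list(addresses))
```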

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years ago  ovn-nbctl: Help catch lport-set-addresses mistakes.
Russell Bryant [Thu, 14 Jan 2016 15:47:18 +0000 (10:47 -0500)]
ovn-nbctl: Help catch lport-set-addresses mistakes.

While debugging a broken OVN environment yesterday, the problem turned
out to be invalid entries in the logical port addresses column.  In
particular, the following command had been used:

  $ ovn-nbctl lport-set-addresses lp0 MAC IP

instead of:

  $ ovn-nbctl lport-set-addresses lp0 "MAC IP"

This is really easy to mess up, so add some simple validation to the
lport-set-addresses command.  If the beginning of an argument is ever
an IP address, it's wrong.
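
The rule can be sketched in Python with the standard ipaddress module. This is
an illustration of the check, not the ovn-nbctl implementation, and the helper
names are hypothetical.

```python
import ipaddress


def starts_with_ip(arg: str) -> bool:
    """True if the first whitespace-separated token of 'arg' is an IP."""
    tokens = arg.split()
    if not tokens:
        return False
    try:
        ipaddress.ip_address(tokens[0])
    except ValueError:
        return False
    return True


def check_addresses(args) -> list:
    """Return the arguments that look wrong (they begin with an IP)."""
    return [a for a in args if starts_with_ip(a)]
```

Both the unquoted form (MAC and IP as separate arguments) and the reversed
"IP MAC" order leave an argument that begins with an IP address, so both are
caught.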

In passing, also add a note to the ovn-nb db documentation to note that
the order of "MAC IP" is required, as "IP MAC" is not valid.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years ago  datapath: Fix panic sending IP frags over tunnels.
Joe Stringer [Wed, 20 Jan 2016 23:26:49 +0000 (15:26 -0800)]
datapath: Fix panic sending IP frags over tunnels.

The entire OVS_GSO_CB was not preserved when handling IP fragments,
leading to the following NULL pointer dereference in ovs_stt_xmit(). Fix
this in the fragmentation handling code by preserving the whole CB.

BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
IP: [<ffffffffa0cfc5b1>] ovs_stt_xmit+0x61/0x260 [openvswitch]
Call Trace:
 [<ffffffff815f682e>] ? __alloc_skb+0x7e/0x2b0
 [<ffffffffa0cf1134>] ovs_vport_send+0x44/0xb0 [openvswitch]
 [<ffffffffa0ce241f>] ovs_vport_output+0x10f/0x190 [openvswitch]
 [<ffffffff8163fe98>] ip_fragment+0x238/0x870
 [<ffffffffa0ce2310>] ? do_output.isra.35+0x120/0x120 [openvswitch]
 [<ffffffffa0d02093>] ovs_fragment+0x283/0x292 [openvswitch]
 [<ffffffff81073ff7>] ? mod_timer_pending+0x67/0x1b0
 [<ffffffff8160e2d0>] ? dst_ifdown+0x90/0x90
 [<ffffffff8160e2d0>] ? dst_ifdown+0x90/0x90
 [<ffffffffa0b30165>] ? nfnetlink_has_listeners+0x15/0x20 [nfnetlink]
 [<ffffffffa0cdb164>] ? ctnetlink_conntrack_event+0x74/0x7ee [nf_conntrack_netlink]
 [<ffffffffa0b873cd>] ? nf_ct_deliver_cached_events+0xad/0xf0 [nf_conntrack]
 [<ffffffff81360331>] ? csum_partial+0x11/0x20
 [<ffffffffa0ce2747>] ? execute_masked_set_action+0x2a7/0xa60 [openvswitch]
 [<ffffffffa0ce22a8>] do_output.isra.35+0xb8/0x120 [openvswitch]
 [<ffffffffa0ce2ff4>] do_execute_actions+0xf4/0x7f0 [openvswitch]
 [<ffffffffa0ce3730>] ovs_execute_actions+0x40/0x130 [openvswitch]
 [<ffffffffa0ce7c69>] ovs_packet_cmd_execute+0x2b9/0x2e0 [openvswitch]
 [<ffffffff81634fad>] genl_family_rcv_msg+0x18d/0x370
 [<ffffffff81635190>] ? genl_family_rcv_msg+0x370/0x370
 [<ffffffff81635221>] genl_rcv_msg+0x91/0xd0
 [<ffffffff816332c9>] netlink_rcv_skb+0xa9/0xc0
 [<ffffffff816337c8>] genl_rcv+0x28/0x40
 [<ffffffff816329b5>] netlink_unicast+0xd5/0x1b0
 [<ffffffff81632d9e>] netlink_sendmsg+0x30e/0x680
 [<ffffffff8162fc84>] ? netlink_rcv_wake+0x44/0x60
 [<ffffffff81630d12>] ? netlink_recvmsg+0x1a2/0x3a0
 [<ffffffff815ed7fb>] sock_sendmsg+0x8b/0xc0
 [<ffffffff8114d06d>] ? __alloc_pages_nodemask+0x16d/0xac0
 [<ffffffff8101c4b9>] ? sched_clock+0x9/0x10
 [<ffffffff815edbc9>] ___sys_sendmsg+0x349/0x360
 [<ffffffff811f8a39>] ? ep_scan_ready_list.isra.7+0x199/0x1c0
 [<ffffffff8110705c>] ? acct_account_cputime+0x1c/0x20
 [<ffffffff811cd90f>] ? fget_light+0x8f/0xf0
 [<ffffffff815ee922>] __sys_sendmsg+0x42/0x80
 [<ffffffff815ee972>] SyS_sendmsg+0x12/0x20
 [<ffffffff8170f22f>] tracesys+0xe1/0xe6

VMware-BZ: #1587324
Fixes: a94ebc39996b ("datapath: Add conntrack action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agotests: Set enable-dummy=system for ovn-controller-vtep tests.
Russell Bryant [Thu, 14 Jan 2016 20:07:59 +0000 (15:07 -0500)]
tests: Set enable-dummy=system for ovn-controller-vtep tests.

All of the ovn-controller-vtep tests were failing on my laptop due to an
unexpected message in the ovs-vswitchd log related to my VPN.  This
setting resolves it and makes all tests pass.

Fixes: 0c1e8a7d637e ("ovn-controller-vtep: Add gateway module.")
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto: Fix memory leak and memory exhaustion bugs in group_mod.
Ben Pfaff [Thu, 14 Jan 2016 06:15:09 +0000 (22:15 -0800)]
ofproto: Fix memory leak and memory exhaustion bugs in group_mod.

In handle_group_mod() cases where adding a group failed, nothing freed the
list of buckets, causing a leak.  The same was true in every case of
modifying a group.  This commit fixes the problem by changing add_group()
to never steal or free the buckets (modify_group() already acted this way)
and then making handle_group_mod() always free the buckets when it's done.

This approach might at first raise objections, because it makes add_group()
copy the buckets instead of just taking the existing ones.  But it actually
fixes a worse problem too: when OF1.4+ REQUESTFORWARD is enabled, the
group_mod is reused for the request forwarding.  Until now, for a group_mod
that adds a new group and that has some buckets, the previous stealing of
buckets in add_group() meant that the group_mod's buckets were no longer
valid; in practice, the list of buckets became linked in a way that
iteration never terminated, which caused memory to be exhausted while
composing the requestforward message.  By making add_group() no longer
modify the group_mod, we also fix this problem.
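
The ownership rule can be illustrated with a small Python sketch.  The
data layout (a dict with "group_id" and "buckets") and the function
bodies are assumptions made for illustration, not the ofproto code:

```python
def add_group(groups, group_id, buckets):
    """Install a group that owns its own copy of the bucket list."""
    # Copy instead of taking ownership, so the caller's list stays valid.
    groups[group_id] = list(buckets)


def handle_group_mod(groups, group_mod):
    """Add the group, then return the buckets still usable for forwarding."""
    add_group(groups, group_mod["group_id"], group_mod["buckets"])
    # The group_mod is untouched here, so it can be reused afterwards,
    # for example to compose a requestforward message.
    return list(group_mod["buckets"])
```

Because add_group() copies, the caller remains the only owner of the
original list and is free to release it once it is done.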

The requestforward test in the testsuite did not find the latter problem
because it only added a group without any buckets.  This commit also
updates the testsuite to include a bucket in its group_mod, which would
have found the problem.

Found by pain and suffering.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agoovn-tutorial: fix a typo
William Tu [Sun, 17 Jan 2016 01:23:15 +0000 (17:23 -0800)]
ovn-tutorial: fix a typo

switch_in_pre_acl -> switch_out_pre_acl

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
8 years agoovn: Use assigned Geneve class.
Jesse Gross [Thu, 14 Jan 2016 22:25:17 +0000 (14:25 -0800)]
ovn: Use assigned Geneve class.

The most recent version of the Geneve draft included an option
class assignment for OVN:
https://tools.ietf.org/html/draft-ietf-nvo3-geneve-01

As a result, we can stop using the experimental class and switch to
the allocated one (0x0102).
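
To show where the class value ends up on the wire, the following Python
sketch packs a Geneve option header as laid out in the draft: a 16-bit
option class, an 8-bit type, and a final byte whose low 5 bits give the
option data length in 4-byte words.  The helper name is hypothetical;
only the class value 0x0102 comes from this commit.

```python
import struct

OVN_GENEVE_CLASS = 0x0102  # class value named in this commit message


def geneve_option(opt_class: int, opt_type: int, data: bytes) -> bytes:
    """Pack one Geneve option: class, type, length in 4-byte words, data."""
    if len(data) % 4 != 0:
        raise ValueError("option data must be a multiple of 4 bytes")
    length_words = len(data) // 4
    if length_words > 0x1F:
        raise ValueError("option data too long for the 5-bit length field")
    # Network byte order: 16-bit class, 8-bit type, 8-bit reserved+length.
    return struct.pack("!HBB", opt_class, opt_type, length_words) + data
```

Any option type passed to this helper is the caller's choice; the
sketch does not claim which type values OVN uses.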

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-bugtool: Add conntrack output.
William Tu [Wed, 13 Jan 2016 23:51:44 +0000 (15:51 -0800)]
ovs-bugtool: Add conntrack output.

Add a script to show all the connection entries in the tracker.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Gurucharan Shetty <guru@ovn.org>
8 years agodatapath: STT: Fix nf-hook softlockup.
Pravin B Shelar [Thu, 14 Jan 2016 00:42:10 +0000 (16:42 -0800)]
datapath: STT: Fix nf-hook softlockup.

The nf-hook is not unregistered when an STT device is deleted, but it
is registered again when an STT device is created a second time,
which causes the following soft lockup.
This patch fixes it by registering the nf-hook only for the very
first STT device.

---8<---

BUG: soft lockup - CPU#1 stuck for 22s! [ovs-vswitchd:11293]
RIP: 0010:[<ffffffffa0e48308>]  [<ffffffffa0e48308>] nf_ip_hook+0xf8/0x180 [openvswitch]
Stack:
 <IRQ>
 [<ffffffff8163bf60>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff8163572a>] nf_iterate+0x9a/0xb0
 [<ffffffff8163bf60>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff816357bc>] nf_hook_slow+0x7c/0x120
 [<ffffffff8163bf60>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff8163c343>] ip_local_deliver+0x73/0x80
 [<ffffffff8163bc8d>] ip_rcv_finish+0x7d/0x350
 [<ffffffff8163c5e8>] ip_rcv+0x298/0x3d0
 [<ffffffff81605f26>] __netif_receive_skb_core+0x696/0x880
 [<ffffffff81606128>] __netif_receive_skb+0x18/0x60
 [<ffffffff81606cce>] process_backlog+0xae/0x180
 [<ffffffff81606512>] net_rx_action+0x152/0x270
 [<ffffffff8106accc>] __do_softirq+0xec/0x300
 [<ffffffff81710a1c>] do_softirq_own_stack+0x1c/0x30

Fixes: fee43fa2 ("datapath: Fix deadlock on STT device destroy.")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Joe Stringer <joe@ovn.org>
8 years ago{lib, utilities}: Fix ct_state constants in docs.
Joe Stringer [Wed, 13 Jan 2016 18:59:03 +0000 (10:59 -0800)]
{lib, utilities}: Fix ct_state constants in docs.

These pieces of documentation were not updated when the CS_* flags were
reordered on the OpenFlow interface.

Fixes: 63bc9fb1c69f ("packets: Reorder CS_* flags to remove gap.")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agonetdev-dpdk: Fix thread_is_pmd() symbol conflict.
Joe Stringer [Tue, 12 Jan 2016 19:32:41 +0000 (11:32 -0800)]
netdev-dpdk: Fix thread_is_pmd() symbol conflict.

DPDK build was broken after commit 2f8932e8403a ("poll: Suppress logging
for pmd threads.") due to the following error:

lib/netdev-dpdk.c:245:13: error: static declaration of ‘thread_is_pmd’
follows non-static declaration
lib/ovs-thread.h:526:6: note: previous declaration of ‘thread_is_pmd’
was here

The version used in this file operates in the fastpath, so it cannot
switch to using the newly introduced version; the new version lives
outside of the dpdk portions of OVS so its implementation cannot be
shared with this function. Rename it to resolve the conflict.

Fixes: 2f8932e8403a ("poll: Suppress logging for pmd threads.")
Suggested-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
8 years agodatapath: Fix deadlock on STT device destroy.
Pravin B Shelar [Tue, 12 Jan 2016 04:13:40 +0000 (20:13 -0800)]
datapath: Fix deadlock on STT device destroy.

STT unregisters nf-hook when there are no other STT devices
left in the namespace. On some kernel versions the nf-unreg API
takes the RTNL lock, but that lock is already held in the tunnel
device destroy code path, which results in a deadlock. To fix the issue
I moved the unreg call into net-exit.

VMware-BZ: #1582410
Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoovs-ofctl.8.in: Fix indentation.
Joe Stringer [Tue, 12 Jan 2016 00:43:52 +0000 (16:43 -0800)]
ovs-ofctl.8.in: Fix indentation.

This extraneous .RE caused the indentation for the subsequent actions to
drop back an extra step.  Remove it.

Fixes: 8e53fe8cf7a1 ("Add connection tracking mark support.")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoofp-parse: Use xstrdup() instead of strdup().
Ben Pfaff [Mon, 11 Jan 2016 17:21:58 +0000 (09:21 -0800)]
ofp-parse: Use xstrdup() instead of strdup().

This avoids a null pointer dereference in the case of memory allocation
failure.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agopoll: Suppress logging for pmd threads.
Ilya Maximets [Tue, 22 Dec 2015 14:26:47 +0000 (17:26 +0300)]
poll: Suppress logging for pmd threads.

'Unreasonably long poll interval's are reasonable for PMD threads.
Also reporting of high CPU usage is not necessary.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath-windows: Add LSOv2 support for VXLAN
Alin Serdean [Fri, 11 Dec 2015 22:29:38 +0000 (22:29 +0000)]
datapath-windows: Add LSOv2 support for VXLAN

This patch adds LSO version 2 support for the windows datapath.
(https://msdn.microsoft.com/en-us/library/windows/hardware/ff568840%28v=vs.85%29.aspx)

Tested using psping and iperf3.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath-windows: Fix small bug in GRE.
Alin Serdean [Fri, 11 Dec 2015 22:24:49 +0000 (22:24 +0000)]
datapath-windows: Fix small bug in GRE.

Allow GRE encapsulation to take place in the case we have a TCP payload
without LSO.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto: Fix memory leak reported by valgrind.
William Tu [Tue, 5 Jan 2016 21:38:43 +0000 (13:38 -0800)]
ofproto: Fix memory leak reported by valgrind.

Test case 757: ofproto - table description (OpenFlow 1.4)
Call stacks:
    parse_ofp_table_vacancy (ofp-parse.c:896)
    parse_ofp_table_mod (ofp-parse.c:978)
    ofctl_mod_table (ovs-ofctl.c:2011)
    ovs_cmdl_run_command (command-line.c:121)
    main (ovs-ofctl.c:135)
Reason: return without freeing memory

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agorstp: Fix memory leak reported by valgrind.
William Tu [Tue, 5 Jan 2016 21:38:42 +0000 (13:38 -0800)]
rstp: Fix memory leak reported by valgrind.

Test case 1650: RSTP Single bridge.  Call stack:
    hmap_insert_at (hmap.h:235)
    rstp_port_set_port_number__ (rstp.c:744)
    rstp_add_port (rstp.c:1164)
    new_bridge (test-rstp.c:123)
    test_rstp_main (test-rstp.c:514)
    ovstest_wrapper_test_rstp_main__ (test-rstp.c:714)
    ovs_cmdl_run_command (command-line.c:121)
    main (ovstest.c:132)
Fix it by adding hmap_destroy() in rstp_unref().

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Daniele Venturino <daniele.venturino@m3s.it>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-ofctl: Fix memory leak reported by valgrind.
William Tu [Tue, 5 Jan 2016 21:38:41 +0000 (13:38 -0800)]
ovs-ofctl: Fix memory leak reported by valgrind.

Reported by 348: ovs-ofctl parse-flows (skb_priority)
Reason: return without freeing memory

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agostream-ssl: Fix memory leak reported by valgrind.
William Tu [Thu, 7 Jan 2016 23:59:34 +0000 (15:59 -0800)]
stream-ssl: Fix memory leak reported by valgrind.

test case 1628: peer ca cert
    ASN1_item_dup
    do_ca_cert_bootstrap (stream-ssl.c:413)
    ssl_connect (stream-ssl.c:468)
    scs_connecting (stream.c:297)
    stream_connect (stream.c:320)
Fix by removing the X509_dup().

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agopython: Fix the TypeError exception seen when idl.Idl parses lock reply
Numan Siddique [Fri, 8 Jan 2016 06:29:47 +0000 (11:59 +0530)]
python: Fix the TypeError exception seen when idl.Idl parses lock reply

File "/usr/lib/python2.7/site-packages/ovs/db/idl.py", line 334,
in __parse_lock_notify
  self.__update_has_lock(self, new_has_lock)
TypeError: __update_has_lock() takes exactly 2 arguments (3 given)
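
The failure mode can be reproduced with a small stand-alone class.
This is a sketch, not the ovs.db.idl code: the class name and the two
parse methods are hypothetical.  Calling a bound method already passes
self implicitly, so passing it again supplies one argument too many.

```python
class LockClient:
    """Toy stand-in for an object that tracks lock ownership."""

    def __init__(self):
        self.has_lock = False

    def __update_has_lock(self, new_has_lock):
        self.has_lock = new_has_lock

    def parse_lock_notify_buggy(self, new_has_lock):
        # Bug: self is passed explicitly on top of the implicit binding,
        # so the method receives three arguments and raises TypeError.
        self.__update_has_lock(self, new_has_lock)

    def parse_lock_notify_fixed(self, new_has_lock):
        # Fix: let the bound method supply self on its own.
        self.__update_has_lock(new_has_lock)
```

The exact wording of the TypeError message differs between Python 2
and Python 3, but both reject the extra argument.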

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
8 years agoofproto-dpif-upcall: Don't delete modified ukeys.
Joe Stringer [Thu, 7 Jan 2016 19:47:46 +0000 (11:47 -0800)]
ofproto-dpif-upcall: Don't delete modified ukeys.

If revalidation returns the result UKEY_DELETE, then both the ukey and
its corresponding flow should be deleted. However, if revalidation
returns UKEY_MODIFY, the ukey itself should be modified in-place and
should not be deleted.

Fix this by only applying the ukey deletion to ukeys whose datapath
operations delete a flow.

This may fix statistics accounting issues in rare cases involving
OpenFlow rule modification where actions are updated but flows remain
the same.
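
The intended handling of the three revalidation results can be sketched
in Python as below.  The enum, the dict-based umap, and the function
name are illustrative assumptions and do not mirror the C data
structures in ofproto-dpif-upcall:

```python
from enum import Enum, auto


class UkeyResult(Enum):
    KEEP = auto()
    MODIFY = auto()
    DELETE = auto()


def apply_revalidation(umap, decisions):
    """Apply per-flow decisions to umap (flow key -> actions) in place."""
    for key, (result, new_actions) in decisions.items():
        if result is UkeyResult.DELETE:
            # Only a delete removes the ukey along with its flow.
            umap.pop(key, None)
        elif result is UkeyResult.MODIFY:
            # A modify updates the ukey in place; the ukey itself survives.
            umap[key] = new_actions
        # KEEP leaves the entry untouched.
    return umap
```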

Found by inspection.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agoofproto-dpif-upcall: Avoid double-delete of ukeys.
Ben Pfaff [Wed, 6 Jan 2016 23:44:39 +0000 (15:44 -0800)]
ofproto-dpif-upcall: Avoid double-delete of ukeys.

revalidate_sweep__() has two cases where it calls ukey_delete() to
remove a ukey from the umap via cmap_remove().  The first case is a direct
call to ukey_delete(), when !flow_exists.  The second case is an indirect
call via push_ukey_ops(), when result != UKEY_KEEP.  If both of these
conditions are simultaneously true, however, the code would call
ukey_delete() twice, causing an assertion failure in the second call.  This
commit fixes the problem by eliminating one of the calls.

The version tested by Ben Warren differs from this version, see:
    http://openvswitch.org/pipermail/dev/2016-January/064117.html

Reported-by: Keith Holleman <keith.holleman@gmail.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2015-December/019772.html
CC: Joe Stringer <joe@ovn.org>
VMware-BZ: #1579057
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Ben Warren <ben@skyportsystems.com>
8 years agoofproto-dpif-rid: Fix memory leak in recirc_state.
Ben Pfaff [Wed, 6 Jan 2016 00:51:54 +0000 (16:51 -0800)]
ofproto-dpif-rid: Fix memory leak in recirc_state.

recirc_state_clone() copies the stack and actions and nothing ever freed
them.

CC: Jarno Rajahalme <jarno@ovn.org>
CC: Andy Zhou <azhou@ovn.org>
Reported-by: William Tu <u9012063@gmail.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-January/064040.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofp-util: Avoid use-after-free error in ofputil_append_meter_config().
Ben Pfaff [Wed, 16 Dec 2015 06:51:29 +0000 (22:51 -0800)]
ofp-util: Avoid use-after-free error in ofputil_append_meter_config().

Reported-by: weizj <334965317@qq.com>
Reported-at: https://github.com/openvswitch/ovs/pull/97
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoodp-util: Fix memory leak reported by valgrind.
William Tu [Tue, 5 Jan 2016 00:18:41 +0000 (16:18 -0800)]
odp-util: Fix memory leak reported by valgrind.

Test case: OVS datapath key parsing and formatting (377)
Return without freeing buf:
    xmalloc(util.c:112)
    ofpbuf_init(ofpbuf.c:124)
    parse_odp_userspace_action(odp-util.c:987)
    parse_odp_action(odp-util.c:1552)
    odp_actions_from_string(odp-util.c:1721)
    parse_actions(test-odp.c:132)

Test case: OVS datapath actions parsing and formatting (380)
Exit without uninit in test-odp.c:
    xrealloc(util.c:123)
    ofpbuf_resize__(ofpbuf.c:243)
    ofpbuf_put_uninit(ofpbuf.c:364)
    nl_msg_put_uninit(netlink.c:178)
    nl_msg_put_unspec_uninit(netlink.c:216)
    nl_msg_put_unspec(netlink.c:243)
    parse_odp_key_mask_attr(odp-util.c:3974)
    odp_flow_from_string(odp-util.c:4151)
    parse_keys(test-odp.c:49)
    test_odp_main(test-odp.c:237)
    ovstest_wrapper_test_odp_main__(test-odp.c:251)
    ovs_cmdl_run_command(command-line.c:121)
    main(ovstest.c:132)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath-windows: Fix subscribe/unsubscribe packets
Alin Serdean [Mon, 4 Jan 2016 23:04:11 +0000 (23:04 +0000)]
datapath-windows: Fix subscribe/unsubscribe packets

The policy of the subscribe packets is defined by the following:
    const NL_POLICY policy[] =  {
        [OVS_NL_ATTR_PACKET_PID] = {.type = NL_A_U32 },
        [OVS_NL_ATTR_PACKET_SUBSCRIBE] = {.type = NL_A_U8 }
        };
Switch the value of the join operation with the one from the policy.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agonetlink-socket: Fix log message for subscribe/unsubscribe on Windows.
Alin Serdean [Mon, 4 Jan 2016 23:04:10 +0000 (23:04 +0000)]
netlink-socket: Fix log message for subscribe/unsubscribe on Windows.

The warning message reported the opposite of the operation performed.

Also use the error returned by nl_sock_subscribe_packet__.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovn-northd: Can't use ct() for router ports.
l0310 [Wed, 2 Dec 2015 11:20:07 +0000 (19:20 +0800)]
ovn-northd: Can't use ct() for router ports.

This patch ensures that we do not attempt to use connection tracking for
logical ports with type=router.  This does not work as the traffic
through a logical router port is not symmetric since logical routers are
distributed.  The result was that traffic between logical ports on
different hypervisors that went through a logical router would fail if
ACLs were in use.

GitHub-PR: #92
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1522022
Signed-off-by: l0310 <liw@dtdream.com>
[russell@ovn.org updated commit message, style tweaks]
Signed-off-by: Russell Bryant <russell@ovn.org>
8 years agonetdev-bsd: Destroy mutex on netdev_bsd_construct_system() error path.
xushengping [Thu, 24 Dec 2015 07:50:47 +0000 (15:50 +0800)]
netdev-bsd: Destroy mutex on netdev_bsd_construct_system() error path.

Signed-off-by: xushengping <shengping.xu@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofp-print: Fix memory leak at ofp_print_bundle_add().
William Tu [Thu, 24 Dec 2015 18:28:40 +0000 (10:28 -0800)]
ofp-print: Fix memory leak at ofp_print_bundle_add().

Call ds_put_and_free_cstr instead of ds_put_cstr to free msg.
Reported by test cases: 325, 326
    ofp_print_bundle_add (ofp-print.c:3027)
    ofp_to_string__ (ofp-print.c:3410)
    ofp_to_string (ofp-print.c:3465)
    ofp_print (ofp-print.c:3497)
    ofctl_ofp_print (ovs-ofctl.c:3818)
    ovs_cmdl_run_command (command-line.c:121)
    main (ovs-ofctl.c:135)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
[blp@ovn.org simplified the code slightly]
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agotest-sflow: Fix memory leak in main function.
Ilya Maximets [Thu, 24 Dec 2015 10:22:53 +0000 (13:22 +0300)]
test-sflow: Fix memory leak in main function.

Reported by valgrind on test case 886.

 912 (24 direct, 888 indirect) bytes in 1 blocks are definitely lost
    at malloc
    by xmalloc (util.c:112)
    by unixctl_server_create (unixctl.c:250)
    by test_sflow_main (test-sflow.c:688)
    by ovstest_wrapper_test_sflow_main__ (test-sflow.c:786)
    by ovs_cmdl_run_command (command-line.c:121)
    by main (ovstest.c:132)

 1,500 bytes in 1 blocks are definitely lost
    at malloc
    by xmalloc (util.c:112)
    by ofpbuf_init (ofpbuf.c:124)
    by test_sflow_main (test-sflow.c:696)
    by ovstest_wrapper_test_sflow_main__ (test-sflow.c:786)
    by ovs_cmdl_run_command (command-line.c:121)
    by main (ovstest.c:132)

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto: Fix using uninitialized delete_reason.
Ilya Maximets [Thu, 24 Dec 2015 07:57:44 +0000 (10:57 +0300)]
ofproto: Fix using uninitialized delete_reason.

replace_rule_finish() makes a decision using fm->delete_reason,
which is uninitialized for internal flows.
Reported by valgrind for test cases 886, 942 and 943.

 Conditional jump or move depends on uninitialised value(s)
    at rule_insert (ofproto-dpif.c:4134)
    by replace_rule_finish (ofproto.c:4831)
    by add_flow_finish (ofproto.c:4661)
    by modify_flows_finish (ofproto.c:4994)
    by ofproto_flow_mod_finish (ofproto.c:6821)
    by handle_flow_mod__ (ofproto.c:5323)
    by ofproto_dpif_add_internal_flow (ofproto-dpif.c:5680)
    by add_internal_miss_flow (ofproto-dpif.c:1385)
    by add_internal_flows (ofproto-dpif.c:1412)
    by construct (ofproto-dpif.c:1367)
    by ofproto_create (ofproto.c:577)
    by bridge_reconfigure (bridge.c:633)
    by bridge_run (bridge.c:2975)
    by main (ovs-vswitchd.c:120)
  Uninitialised value was created by a stack allocation
    at ofproto_dpif_add_internal_flow (ofproto-dpif.c:5658)

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-dev.py: Fix libcap-ng-dev dependency.
Joe Stringer [Wed, 23 Dec 2015 22:16:09 +0000 (14:16 -0800)]
ovs-dev.py: Fix libcap-ng-dev dependency.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-ofctl: Document arp_op match field.
Ben Pfaff [Wed, 23 Dec 2015 21:20:02 +0000 (13:20 -0800)]
ovs-ofctl: Document arp_op match field.

Reported-by: ZHANG Zhiming <zhangzhiming@yunshan.net.cn>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agodatapath: ip4_dst_hoplimit compat code is needed prior to v2.6.38
Simon Horman [Fri, 18 Dec 2015 04:50:01 +0000 (20:50 -0800)]
datapath: ip4_dst_hoplimit compat code is needed prior to v2.6.38

ip4_dst_hoplimit was introduced in v2.6.38 rather than v2.6.39.

Fixes: e23775f20e1a ("datapath: Add support for lwtunnel")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agoMakefile: Mark non-file targets as .PHONY.
Yin Lin [Wed, 23 Dec 2015 21:18:29 +0000 (13:18 -0800)]
Makefile: Mark non-file targets as .PHONY.

Some recently added targets (ovsext_make and thread-safety-check) are not
files but were not marked as .PHONY. This causes them to be rebuilt
unnecessarily during "make check" and "make install".

Signed-off-by: Yin Lin <linyi@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodebian: Remove old PKI directory migration code
Ansis Atteka [Wed, 23 Dec 2015 02:23:42 +0000 (18:23 -0800)]
debian: Remove old PKI directory migration code

Open vSwitch 1.3 and older created certificates and private keys
in /usr/share/openvswitch/pki.  Because the PKI directory is
mutable, this was considered a bug, and the PKI directory was moved
under /var in Open vSwitch 1.4 by commit 14bd2d51 ("debian: Move
PKI directory to FHS-compliant location.").

Note that Ubuntu 12.04 already shipped with Open vSwitch 1.4, which
should have created this directory in the right location (on a
fresh install) or moved it there (on an upgrade from Open vSwitch
1.3).

So I am inclined to remove this code, because the only reason for
it to exist would be an upgrade from Open vSwitch 1.3 or older
directly to 2.5 without any intermediate release.

Signed-Off-By: Ansis Atteka <aatteka@nicira.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovsdb-server: Fix memory leak using perf counter without initialization.
William Tu [Wed, 23 Dec 2015 18:58:15 +0000 (10:58 -0800)]
ovsdb-server: Fix memory leak using perf counter without initialization.

perf_counter_accumulate() is invoked without perf_counters_init() being
called first, which leads to a memory leak reported by Valgrind (test
cases 104, 106, and 107). A call trace is below:
    xmalloc (util.c:112)
    shash_add_nocopy__ (shash.c:109)
    shash_add_nocopy (shash.c:121)
    shash_add (shash.c:129)
    shash_add_once (shash.c:136)
    shash_add_assert (shash.c:146)
    perf_counter_init (perf-counter.c:86)
    perf_counter_accumulate (perf-counter.c:95)
    ovsdb_txn_commit (transaction.c:850)
    ovsdb_file_open__ (file.c:217)
    open_db (ovsdb-server.c:418)
    main (ovsdb-server.c:263)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agosystem-traffic: Skip all vxlan tests if unsupported.
Joe Stringer [Wed, 23 Dec 2015 00:47:26 +0000 (16:47 -0800)]
system-traffic: Skip all vxlan tests if unsupported.

The vxlan tests require a new enough 'ip' tool to configure native VXLAN
tunnels on the host kernel (as well as a new enough kernel). If this
isn't available, simply skip the test. This commit makes the cases where
this is checked consistent.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agodatapath-windows: Reduce padding size in _OVS_PACKET_HDR_INFO.
Nithin Raju [Mon, 7 Dec 2015 23:13:03 +0000 (15:13 -0800)]
datapath-windows: Reduce padding size in _OVS_PACKET_HDR_INFO.

Fixes: efee3309 ("datapath-windows: Support for OVS_KEY_ATTR_SCTP attribute")
Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
8 years agoofp-actions: Add padding in ofpacts_pull_openflow_instructions()
William Tu [Fri, 11 Dec 2015 01:58:15 +0000 (17:58 -0800)]
ofp-actions: Add padding in ofpacts_pull_openflow_instructions()

ofpacts_pull_openflow_instructions() should fill 'ofpacts' with a list
of OpenFlow actions and each action (including the last one) should be
padded to OFP_ACTION_ALIGN(8) bytes.

In most of the cases this is taken care of (e.g. by ofpacts_decode), but
for the Goto-Table instruction (and Clear-Actions, based on a quick code
inspection), this wasn't the case.

This caused the copy operation in recirc_unroll_actions() to read two
extra bytes after an allocated area (not a big deal, but enough to
displease the AddressSanitizer).
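
The padding rule amounts to rounding each action's length up to the
next multiple of 8.  A minimal Python sketch, with hypothetical helper
names and a zero-filled tail as an assumption:

```python
OFP_ACTION_ALIGN = 8  # alignment named in this commit message


def padded_len(length: int, align: int = OFP_ACTION_ALIGN) -> int:
    """Round length up to the next multiple of align."""
    return (length + align - 1) // align * align


def pad_action(raw: bytes) -> bytes:
    """Append zero bytes so the action occupies a whole number of slots."""
    return raw + b"\x00" * (padded_len(len(raw)) - len(raw))
```

An action that is already a multiple of 8 bytes long is returned
unchanged; anything shorter gains between 1 and 7 bytes of padding.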

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto: Remove flows from all tables upon group deletion.
Zoltán Balogh [Wed, 23 Dec 2015 01:10:40 +0000 (17:10 -0800)]
ofproto: Remove flows from all tables upon group deletion.

When a group is deleted, all flows which include a Group action with the ID
of the deleted group should be removed.  Until now, only flows in table 0
were removed.  This fixes the problem.
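
The corrected behavior can be sketched in Python as a loop over every
table rather than only the first one.  The flow representation (a dict
with an "actions" list of (kind, argument) tuples) is an assumption
made for this illustration:

```python
def delete_group_flows(tables, group_id):
    """Remove flows that reference group_id from every table.

    Returns the number of flows removed.
    """
    removed = 0
    for table in tables:  # all tables, not just tables[0]
        kept = [flow for flow in table
                if ("group", group_id) not in flow["actions"]]
        removed += len(table) - len(kept)
        table[:] = kept  # update the table in place
    return removed
```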

Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
[blp@ovn.org added a test]
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodpif-netdev: Avoid using uninitialized memory with tunnel options.
Jesse Gross [Wed, 9 Dec 2015 20:55:17 +0000 (12:55 -0800)]
dpif-netdev: Avoid using uninitialized memory with tunnel options.

When handling an upcall with the userspace datapath, it's currently
possible for a flow from a packet with no tunnel options to come back
with matches on the options. If that happens, dpif-netdev will
attempt to translate the wildcards provided by ofproto into the format
used by dpif. The translation requires use of the original wildcards
from the flow, which, since they didn't exist, are uninitialized memory.

Matching on fields which don't actually exist is itself a bug. However,
this can occur when we attempt to set a tunnel option on the packet -
ofproto generates a match on the field in the original packet. This is
being fixed separately.

In other situations where we have a match on an unexpected field, we
simply ignore it. This happens with tunnel options with the kernel
datapath, non-tunnel fields that don't exist in the packet, and even
with Geneve where we do have some options but not the particular one
that was matched on. This brings the same behavior for this case and
avoids the possibility of accessing uninitialized memory.

Reported-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agorhel: Add support for DPDK port creation via network scripts
Panu Matilainen [Tue, 1 Dec 2015 14:48:04 +0000 (16:48 +0200)]
rhel: Add support for DPDK port creation via network scripts

Add support for creating a userspace bridge and the four DPDK port
types via network scripts + basic documentation.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agonetdev_dpdk: pci_dev pointer check.
mweglicx [Thu, 3 Dec 2015 07:30:16 +0000 (23:30 -0800)]
netdev_dpdk: pci_dev pointer check.

This change prevents netdev_dpdk from accessing a pointer
that is not valid.

Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agotun-metadata: Fix memory leak in table_free()
William Tu [Tue, 22 Dec 2015 17:44:14 +0000 (09:44 -0800)]
tun-metadata: Fix memory leak in table_free()

Found by valgrind, test case 643.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
8 years agotypes: Define OVS_*128_MAX statically.
Joe Stringer [Mon, 21 Dec 2015 23:56:40 +0000 (15:56 -0800)]
types: Define OVS_*128_MAX statically.

The previous definitions of these variables using designated
initializers caused a variety of issues when attempting to compile with
MSVC, particularly if including these headers from C++ code. By defining
them like this, we can appease MSVC and keep the definitions the same on
all platforms.

VMware-BZ: #1517163
Suggested-by: Yin Lin <linyi@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agosystem-kmod-macros: Do not require the 'conntrack' tool.
Daniele Di Proietto [Mon, 2 Nov 2015 22:44:30 +0000 (14:44 -0800)]
system-kmod-macros: Do not require the 'conntrack' tool.

We can use 'ovstest test-netlink-conntrack' instead.  Now that it is
not required anymore, we can remove the HAVE_CONNTRACK macro in the
build system.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agosystem-traffic: use `dpctl/*conntrack` instead of `conntrack` tool.
Daniele Di Proietto [Mon, 2 Nov 2015 22:24:54 +0000 (14:24 -0800)]
system-traffic: use `dpctl/*conntrack` instead of `conntrack` tool.

Often in the tests we inspect the conntrack tables with the 'conntrack'
command line utility.  Since this may not always be available, and since
these tests are supposed to run with the upcoming userspace connection
tracker, it is better to use the newly implemented dpctl command.

Due to the tcp state mapping done in tcp_state_coalesce(), SYN_RECV is
replaced by ESTABLISHED in four places in the testsuite.  The rest of
the changes are just done to match the formatting style.

Also, check the conntrack entries for the IPv6 HTTP test.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoovstest: Add test-netlink-conntrack command.
Daniele Di Proietto [Thu, 29 Oct 2015 18:00:38 +0000 (11:00 -0700)]
ovstest: Add test-netlink-conntrack command.

Add a new test module to help debug Linux kernel conntrack development
using the netlink-conntrack module.

The tool has three uses:

* `ovstest test-netlink-conntrack dump [zone=zone]`

  lists the entries in the connection table

* `ovstest test-netlink-conntrack monitor`

  displays the updates on the connection table, until killed with Ctrl-C

* `ovstest test-netlink-conntrack flush [zone=zone]`

  empties the connection table (and therefore the expectations table).

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodpctl: Add new 'flush-conntrack' command.
Daniele Di Proietto [Wed, 28 Oct 2015 17:34:52 +0000 (10:34 -0700)]
dpctl: Add new 'flush-conntrack' command.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodpif-netlink: Implement ct_flush.
Daniele Di Proietto [Wed, 28 Oct 2015 17:34:26 +0000 (10:34 -0700)]
dpif-netlink: Implement ct_flush.

This member function is used by the ct-dpif module to provide its
services.  It's implemented using the netlink-conntrack module.

N.B. The Linux kernel datapaths share the connection tracker among them
and with the rest of the system.  Therefore the operations are not
really dpif specific.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodpctl: Add 'conntrack-dump' command.
Daniele Di Proietto [Wed, 28 Oct 2015 18:38:00 +0000 (11:38 -0700)]
dpctl: Add 'conntrack-dump' command.

It can be used to inspect the connection tracking entries in the
datapath.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodpif-netlink: Implement ct_dump_{start,next,done}.
Daniele Di Proietto [Wed, 28 Oct 2015 18:26:18 +0000 (11:26 -0700)]
dpif-netlink: Implement ct_dump_{start,next,done}.

These member functions are used by the ct-dpif module to provide its
services.  They're implemented using the netlink-conntrack module.

N.B. The Linux kernel datapaths share the connection tracker among them
and with the rest of the system.  Therefore the operations are not
really dpif specific.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoct-dpif: Add ct_dpif_flush().
Daniele Di Proietto [Wed, 28 Oct 2015 17:32:32 +0000 (10:32 -0700)]
ct-dpif: Add ct_dpif_flush().

This function will flush the connection tracking tables of a specific
datapath.

It simply calls a function pointer in the dpif_class. No dpif
currently implements the required interface.

The next commits will provide an implementation in dpif-netlink.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoct-dpif: Add ct_dpif_dump_{start,next,done}().
Daniele Di Proietto [Wed, 28 Oct 2015 18:24:25 +0000 (11:24 -0700)]
ct-dpif: Add ct_dpif_dump_{start,next,done}().

These functions can be used to dump conntrack entries from a datapath.

They simply call a function pointer in the dpif_class. No dpif currently
implements the interface.

The next commits will provide an implementation in dpif-netlink.
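
The dispatch described above can be sketched roughly as follows. This is
an illustration in Python, not the C code from this commit: DpifClass,
dump_all() and make_list_provider() are hypothetical names, and returning
EOPNOTSUPP for a provider without the callbacks is an assumption.

```python
import errno
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class DpifClass:
    """Hypothetical provider table; None means "not implemented"."""
    ct_dump_start: Optional[Callable[[], Any]] = None
    ct_dump_next: Optional[Callable[[Any], Optional[str]]] = None
    ct_dump_done: Optional[Callable[[Any], None]] = None


def dump_all(dpif_class: DpifClass) -> Tuple[int, List[str]]:
    """Run start/next/done against a provider, or report EOPNOTSUPP."""
    if dpif_class.ct_dump_start is None:
        # Assumed error code for a provider that lacks the interface.
        return errno.EOPNOTSUPP, []
    state = dpif_class.ct_dump_start()
    entries = []
    try:
        while True:
            entry = dpif_class.ct_dump_next(state)
            if entry is None:  # provider signals the end of the dump
                break
            entries.append(entry)
    finally:
        dpif_class.ct_dump_done(state)  # always release the dump state
    return 0, entries


def make_list_provider(connections: List[str]) -> DpifClass:
    """Toy provider that dumps from an in-memory list."""
    def start():
        return iter(connections)

    def next_(state):
        return next(state, None)

    def done(state):
        pass

    return DpifClass(start, next_, done)
```

A provider with no callbacks yields an error and no entries, while the
toy list-backed provider yields every entry in order.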

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agonetlink-conntrack: New module.
Daniele Di Proietto [Tue, 3 Nov 2015 21:52:44 +0000 (13:52 -0800)]
netlink-conntrack: New module.

This module uses the netlink interface provided by the Linux kernel
connection tracker to provide some visibility into the conntrack tables.

The module provides functions to:

* Convert a netlink representation of a connection into a
  struct 'ct_dpif_entry'.

* Dump all the connections.

* Flush all the connections.

* Listen for updates by registering a netlink notifier.

It will be used by dpif-netlink to implement the interface required by
the ct-dpif module.

Based on original work by Jarno Rajahalme

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoct-dpif: New module.
Daniele Di Proietto [Tue, 3 Nov 2015 23:00:03 +0000 (15:00 -0800)]
ct-dpif: New module.

This defines some structures (and their related formatting functions) to
manipulate entries in connection tracking tables.

It will be used by the next commits.

Based on original work by Jarno Rajahalme

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodatapath: Backport: openvswitch: Fix serialization of non-masked set actions.
Pravin B Shelar [Mon, 21 Dec 2015 22:57:36 +0000 (14:57 -0800)]
datapath: Backport: openvswitch: Fix serialization of non-masked set actions.

I found this missing commit while checking diff against upstream OVS.

Upstream Commit msg:
    Set actions consist of a regular OVS_KEY_ATTR_* attribute nested inside
    of an OVS_ACTION_ATTR_SET action attribute. When converting masked actions
    back to regular set actions, the inner attribute length was not changed,
    i.e., double the length was being serialized. This patch fixes the bug.
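
    A rough Python sketch of the length issue, assuming the inner
    attribute of a masked set action carries the key followed by a mask
    of the same size. put_attr() and masked_to_plain() are hypothetical
    helpers and attribute type 7 is arbitrary; this is not the kernel
    code.

```python
import struct

NLA_HDRLEN = 4  # struct nlattr: 16-bit length followed by 16-bit type


def nla_align(length: int) -> int:
    """Round up to the 4-byte netlink attribute alignment."""
    return (length + 3) & ~3


def put_attr(attr_type: int, payload: bytes) -> bytes:
    """Serialize one netlink attribute with a correct nla_len."""
    length = NLA_HDRLEN + len(payload)
    padding = b"\0" * (nla_align(length) - length)
    return struct.pack("=HH", length, attr_type) + payload + padding


def masked_to_plain(inner: bytes) -> bytes:
    """Convert a masked inner attribute (key + mask) to key only.

    The header is rebuilt so that nla_len covers just the key. Reusing
    the original header would serialize double the intended length.
    """
    length, attr_type = struct.unpack_from("=HH", inner)
    payload = inner[NLA_HDRLEN:length]
    key_len = len(payload) // 2  # assumed layout: key, then mask
    return put_attr(attr_type, payload[:key_len])
```

    For a 4-byte key and 4-byte mask, the masked attribute has nla_len
    12 and the converted one has nla_len 8.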

    Fixes: 83d2b9b ("net: openvswitch: Support masked set actions.")
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: f4f8e738505 ("openvswitch: Fix serialization of non-masked set
actions")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodatapath: stt: Fix device list management.
Pravin B Shelar [Mon, 21 Dec 2015 01:05:24 +0000 (17:05 -0800)]
datapath: stt: Fix device list management.

STT receive can accept a packet on a device that is not in the UP
state. This patch fixes the issue by introducing a second list that
contains only devices in the UP state. This list can be used to search
for STT devices on packet receive.
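
The two-list idea can be sketched in Python as below. SttDev, SttNet and
their methods are hypothetical names, and looking devices up by
destination port is an assumption made for the example; the real
datapath code is C.

```python
from typing import List, Optional


class SttDev:
    """Toy STT device; sock stays None until the device is brought up."""

    def __init__(self, port: int):
        self.port = port
        self.sock: Optional[object] = None


class SttNet:
    """Tracks all devices plus the subset that is currently UP."""

    def __init__(self) -> None:
        self.all_devs: List[SttDev] = []
        self.up_devs: List[SttDev] = []

    def create(self, port: int) -> SttDev:
        dev = SttDev(port)
        self.all_devs.append(dev)  # known, but not yet receiving
        return dev

    def open(self, dev: SttDev) -> None:
        dev.sock = object()  # stand-in for creating the socket
        self.up_devs.append(dev)

    def stop(self, dev: SttDev) -> None:
        self.up_devs.remove(dev)
        dev.sock = None

    def find_up(self, port: int) -> Optional[SttDev]:
        """Receive-path lookup: only devices in the UP list can match."""
        for dev in self.up_devs:
            if dev.port == port:
                return dev
        return None
```

With this layout a freshly created device is never returned by the
receive-path lookup until open() has run, and it drops out again after
stop().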

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agostream-ssl: Fix misleading bound address format.
Ben Pfaff [Sat, 19 Dec 2015 06:09:57 +0000 (22:09 -0800)]
stream-ssl: Fix misleading bound address format.

When the SSL code presents the name of the address to which it is bound,
it should include an "ssl:" or "pssl:" prefix instead of "tcp:" or "ptcp:".

Reported-by: meishengxin <meishengxin@huawei.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2015-December/019694.html
Fixes: e731d71bf47b ("Add IPv6 support for OpenFlow, OVSDB, NetFlow, and sFlow.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agodatapath: stt: Fix error handling in stt_start().
Pravin B Shelar [Sun, 20 Dec 2015 06:21:56 +0000 (22:21 -0800)]
datapath: stt: Fix error handling in stt_start().

The bug was reported by Joe Stringer.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: stt: Do not access stt_dev socket in lookup.
Pravin B Shelar [Sun, 20 Dec 2015 03:19:22 +0000 (19:19 -0800)]
datapath: stt: Do not access stt_dev socket in lookup.

An STT device is added to the device list at device creation time, but
the device socket is initialized only when the device is UP. So avoid
accessing the STT socket while searching for a device.

---8<---
IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
Oops: 0000 [#1] PREEMPT SMP
Hardware name: VMware, Inc. VMware Virtual Platform/440BX
RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
Stack:
 ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
 ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
 ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
Call Trace:
 <IRQ>
 [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
 [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
 [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
 [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
 [<ffffffff90647398>] ip_rcv+0x298/0x3d0
 [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
 [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
 [<ffffffff90610b3e>] process_backlog+0xae/0x180
 [<ffffffff906102c2>] net_rx_action+0x152/0x270
 [<ffffffff9006d625>] __do_softirq+0xf5/0x320
 [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30

Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Joe Stringer <joe@ovn.org>