cascardo/ovs.git
8 years agonetdev-bsd: Destroy mutex on netdev_bsd_construct_system() error path.
xushengping [Thu, 24 Dec 2015 07:50:47 +0000 (15:50 +0800)]
netdev-bsd: Destroy mutex on netdev_bsd_construct_system() error path.

Signed-off-by: xushengping <shengping.xu@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
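
A minimal sketch of the kind of error-path cleanup the netdev-bsd commit
above describes. This is not the actual netdev_bsd_construct_system() code:
`demo_dev`, `demo_query_flags()` and `demo_dev_construct()` are hypothetical
names, and plain pthread mutexes stand in for the OVS mutex wrappers.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device object; not the real struct netdev_bsd. */
struct demo_dev {
    pthread_mutex_t mutex;
    bool mutex_live;        /* Tracks whether the mutex currently exists. */
};

/* Hypothetical setup step that can fail after the mutex was created. */
int
demo_query_flags(bool should_fail)
{
    return should_fail ? ENODEV : 0;
}

/* Constructor: every failure after pthread_mutex_init() must undo it. */
int
demo_dev_construct(struct demo_dev *dev, bool should_fail)
{
    int error = pthread_mutex_init(&dev->mutex, NULL);
    if (error) {
        return error;
    }
    dev->mutex_live = true;

    error = demo_query_flags(should_fail);
    if (error) {
        /* The step that is easy to forget on an error path. */
        pthread_mutex_destroy(&dev->mutex);
        dev->mutex_live = false;
        return error;
    }
    return 0;
}
```

On success the caller owns the mutex and destroys it in the matching
destructor; on failure nothing is left behind.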
8 years agoofp-print: Fix memory leak at ofp_print_bundle_add().
William Tu [Thu, 24 Dec 2015 18:28:40 +0000 (10:28 -0800)]
ofp-print: Fix memory leak at ofp_print_bundle_add().

Call ds_put_and_free_cstr instead of ds_put_cstr to free msg.
Reported by test cases: 325, 326
    ofp_print_bundle_add (ofp-print.c:3027)
    ofp_to_string__ (ofp-print.c:3410)
    ofp_to_string (ofp-print.c:3465)
    ofp_print (ofp-print.c:3497)
    ofctl_ofp_print (ovs-ofctl.c:3818)
    ovs_cmdl_run_command (command-line.c:121)
    main (ovs-ofctl.c:135)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
[blp@ovn.org simplified the code slightly]
Signed-off-by: Ben Pfaff <blp@ovn.org>
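
The ownership difference behind the ofp-print fix above can be sketched as
follows. `demo_str` is a hypothetical stand-in for the OVS dynamic string,
and the `demo_*` functions only imitate the roles of ds_put_cstr and
ds_put_and_free_cstr as described in the commit message; they are not the
library's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; a stand-in for OVS's dynamic string. */
struct demo_str {
    char *data;
    size_t len;
};

/* Appends a copy of 's'.  The caller keeps ownership of 's'. */
void
demo_str_put(struct demo_str *ds, const char *s)
{
    size_t n = strlen(s);
    char *grown = realloc(ds->data, ds->len + n + 1);
    if (!grown) {
        abort();
    }
    memcpy(grown + ds->len, s, n + 1);
    ds->data = grown;
    ds->len += n;
}

/* Appends 's' and then frees it: ownership of 's' moves to this function. */
void
demo_str_put_and_free(struct demo_str *ds, char *s)
{
    demo_str_put(ds, s);
    free(s);
}

/* Hypothetical helper that returns a heap-allocated message. */
char *
demo_make_msg(void)
{
    char *msg = malloc(8);
    if (!msg) {
        abort();
    }
    strcpy(msg, "bundle");
    return msg;
}
```

Passing the result of demo_make_msg() to demo_str_put() would leak the
message, because nothing frees it afterwards; the put-and-free variant
closes that gap.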
8 years agotest-sflow: Fix memory leak in main function.
Ilya Maximets [Thu, 24 Dec 2015 10:22:53 +0000 (13:22 +0300)]
test-sflow: Fix memory leak in main function.

Reported by valgrind on test case 886.

 912 (24 direct, 888 indirect) bytes in 1 blocks are definitely lost
    at malloc
    by xmalloc (util.c:112)
    by unixctl_server_create (unixctl.c:250)
    by test_sflow_main (test-sflow.c:688)
    by ovstest_wrapper_test_sflow_main__ (test-sflow.c:786)
    by ovs_cmdl_run_command (command-line.c:121)
    by main (ovstest.c:132)

 1,500 bytes in 1 blocks are definitely lost
    at malloc
    by xmalloc (util.c:112)
    by ofpbuf_init (ofpbuf.c:124)
    by test_sflow_main (test-sflow.c:696)
    by ovstest_wrapper_test_sflow_main__ (test-sflow.c:786)
    by ovs_cmdl_run_command (command-line.c:121)
    by main (ovstest.c:132)

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto: Fix using uninitialized delete_reason.
Ilya Maximets [Thu, 24 Dec 2015 07:57:44 +0000 (10:57 +0300)]
ofproto: Fix using uninitialized delete_reason.

replace_rule_finish() makes a decision using fm->delete_reason, which
is uninitialized for internal flows.
Reported by valgrind for test cases 886, 942 and 943.

 Conditional jump or move depends on uninitialised value(s)
    at rule_insert (ofproto-dpif.c:4134)
    by replace_rule_finish (ofproto.c:4831)
    by add_flow_finish (ofproto.c:4661)
    by modify_flows_finish (ofproto.c:4994)
    by ofproto_flow_mod_finish (ofproto.c:6821)
    by handle_flow_mod__ (ofproto.c:5323)
    by ofproto_dpif_add_internal_flow (ofproto-dpif.c:5680)
    by add_internal_miss_flow (ofproto-dpif.c:1385)
    by add_internal_flows (ofproto-dpif.c:1412)
    by construct (ofproto-dpif.c:1367)
    by ofproto_create (ofproto.c:577)
    by bridge_reconfigure (bridge.c:633)
    by bridge_run (bridge.c:2975)
    by main (ovs-vswitchd.c:120)
  Uninitialised value was created by a stack allocation
    at ofproto_dpif_add_internal_flow (ofproto-dpif.c:5658)

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
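
One common way to avoid this class of bug is to zero-fill a stack-allocated
request before use. The sketch below illustrates the idea only and does not
claim to be the change made in the ofproto commit above;
`demo_flow_mod` is a hypothetical, heavily trimmed stand-in for the real
flow-mod structure.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed stand-in for a flow-mod request. */
struct demo_flow_mod {
    uint16_t priority;
    uint8_t table_id;
    uint8_t delete_reason;   /* Read later by the code that applies it. */
};

/* Builds a request for an internal flow.  The designated initializer
 * zero-fills every member that is not named, including delete_reason,
 * so no later reader sees stack garbage. */
struct demo_flow_mod
demo_internal_flow_mod(uint16_t priority)
{
    struct demo_flow_mod fm = { .priority = priority };
    return fm;
}
```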
8 years agoovs-dev.py: Fix libcap-ng-dev dependency.
Joe Stringer [Wed, 23 Dec 2015 22:16:09 +0000 (14:16 -0800)]
ovs-dev.py: Fix libcap-ng-dev dependency.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-ofctl: Document arp_op match field.
Ben Pfaff [Wed, 23 Dec 2015 21:20:02 +0000 (13:20 -0800)]
ovs-ofctl: Document arp_op match field.

Reported-by: ZHANG Zhiming <zhangzhiming@yunshan.net.cn>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agodatapath: ip4_dst_hoplimit compat code is needed prior to v2.6.38
Simon Horman [Fri, 18 Dec 2015 04:50:01 +0000 (20:50 -0800)]
datapath: ip4_dst_hoplimit compat code is needed prior to v2.6.38

ip4_dst_hoplimit was introduced in v2.6.38 rather than v2.6.39.

Fixes: e23775f20e1a ("datapath: Add support for lwtunnel")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agoMakefile: Mark non-file targets as .PHONY.
Yin Lin [Wed, 23 Dec 2015 21:18:29 +0000 (13:18 -0800)]
Makefile: Mark non-file targets as .PHONY.

Some recently added targets (ovsext_make and thread-safety-check) are not
files but were not marked as .PHONY. This causes them to be rebuilt
unnecessarily during the "make check" and "make install" processes.

Signed-off-by: Yin Lin <linyi@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodebian: Remove old PKI directory migration code
Ansis Atteka [Wed, 23 Dec 2015 02:23:42 +0000 (18:23 -0800)]
debian: Remove old PKI directory migration code

Open vSwitch 1.3 and older created certificates and private keys
in /usr/share/openvswitch/pki.  However, since the PKI directory
is mutable, this was considered a bug, and the PKI directory was
moved to /var in Open vSwitch 1.4 by commit 14bd2d51 (debian:
Move PKI directory to FHS-compliant location.)

Note that Ubuntu 12.04 already shipped with Open vSwitch 1.4
and should have created (in case of a fresh install) or moved (in
case of an upgrade from Open vSwitch 1.3) this directory to the right
location.

So I am inclined to remove this code, because the only reason for it
to exist would be if someone were upgrading from Open vSwitch
1.3 or older directly to 2.5 without going through any intermediate
release.

Signed-Off-By: Ansis Atteka <aatteka@nicira.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovsdb-server: Fix memory leak using perf counter without initialization.
William Tu [Wed, 23 Dec 2015 18:58:15 +0000 (10:58 -0800)]
ovsdb-server: Fix memory leak using perf counter without initialization.

perf_counter_accumulate() is invoked without perf_counters_init() being
called first, which leads to a memory leak reported by Valgrind (test
cases 104, 106, and 107). A call trace is below:
    xmalloc (util.c:112)
    shash_add_nocopy__ (shash.c:109)
    shash_add_nocopy (shash.c:121)
    shash_add (shash.c:129)
    shash_add_once (shash.c:136)
    shash_add_assert (shash.c:146)
    perf_counter_init (perf-counter.c:86)
    perf_counter_accumulate (perf-counter.c:95)
    ovsdb_txn_commit (transaction.c:850)
    ovsdb_file_open__ (file.c:217)
    open_db (ovsdb-server.c:418)
    main (ovsdb-server.c:263)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agosystem-traffic: Skip all vxlan tests if unsupported.
Joe Stringer [Wed, 23 Dec 2015 00:47:26 +0000 (16:47 -0800)]
system-traffic: Skip all vxlan tests if unsupported.

The vxlan tests require a new enough 'ip' tool to configure native VXLAN
tunnels on the host kernel (as well as a new enough kernel). If this
isn't available, simply skip the test. This commit makes the cases where
this is checked consistent.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agodatapath-windows: Reduce padding size in _OVS_PACKET_HDR_INFO.
Nithin Raju [Mon, 7 Dec 2015 23:13:03 +0000 (15:13 -0800)]
datapath-windows: Reduce padding size in _OVS_PACKET_HDR_INFO.

Fixes: efee3309 ("datapath-windows: Support for OVS_KEY_ATTR_SCTP attribute")
Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
8 years agoofp-actions: Add padding in ofpacts_pull_openflow_instructions()
William Tu [Fri, 11 Dec 2015 01:58:15 +0000 (17:58 -0800)]
ofp-actions: Add padding in ofpacts_pull_openflow_instructions()

ofpacts_pull_openflow_instructions() should fill 'ofpacts' with a list
of OpenFlow actions and each action (including the last one) should be
padded to OFP_ACTION_ALIGN(8) bytes.

In most of the cases this is taken care of (e.g. by ofpacts_decode), but
for the Goto-Table instruction (and Clear-Actions, based on a quick code
inspection), this wasn't the case.

This caused the copy operation in recirc_unroll_actions() to read two
extra bytes after an allocated area (not a big deal, but enough to
displease the AddressSanitizer).

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
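
The padding rule described in the ofp-actions commit above amounts to
rounding each action's length up to a multiple of 8. A minimal sketch of
that arithmetic, with the hypothetical names `DEMO_ACTION_ALIGN` and
`demo_padded_len()` (not the macros OVS itself uses):

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ACTION_ALIGN 8   /* Alignment named in the commit message. */

/* Rounds 'len' up to the next multiple of DEMO_ACTION_ALIGN. */
size_t
demo_padded_len(size_t len)
{
    return (len + DEMO_ACTION_ALIGN - 1)
           / DEMO_ACTION_ALIGN * DEMO_ACTION_ALIGN;
}
```

A consumer that copies demo_padded_len(len) bytes reads past the end of any
buffer that was only allocated with the unpadded length, which is the kind
of overread the AddressSanitizer flags.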
8 years agoofproto: Remove flows from all tables upon group deletion.
Zoltán Balogh [Wed, 23 Dec 2015 01:10:40 +0000 (17:10 -0800)]
ofproto: Remove flows from all tables upon group deletion.

When a group is deleted, all flows which include a Group action with the ID
of the deleted group should be removed.  Until now, only flows in table 0
were removed.  This fixes the problem.

Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
[blp@ovn.org added a test]
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodpif-netdev: Avoid using uninitialized memory with tunnel options.
Jesse Gross [Wed, 9 Dec 2015 20:55:17 +0000 (12:55 -0800)]
dpif-netdev: Avoid using uninitialized memory with tunnel options.

When handling an upcall with the userspace datapath, it's currently
possible for a flow from a packet with no tunnel options to come back
with matches on the options. If that happens, dpif-netdev will
attempt to translate the wildcards provided by ofproto into the format
used by dpif. The translation requires use of the original wildcards
from the flow, which since they didn't exist, is uninitialized memory.

Matching on fields which don't actually exist is itself a bug. However,
this can occur when we attempt to set a tunnel option on the packet -
ofproto generates a match on the field in the original packet. This is
being fixed separately.

In other situations where we have a match on an unexpected field, we
simply ignore it. This happens with tunnel options with the kernel
datapath, non-tunnel fields that don't exist in the packet, and even
with Geneve where we do have some options but not the particular one
that was matched on. This brings the same behavior for this case and
avoids the possibility of accessing uninitialized memory.

Reported-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agorhel: Add support DPDK port creation via network scripts
Panu Matilainen [Tue, 1 Dec 2015 14:48:04 +0000 (16:48 +0200)]
rhel: Add support DPDK port creation via network scripts

Add support for creating a userspace bridge and the four DPDK port
types via network scripts + basic documentation.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agonetdev_dpdk: pci_dev pointer check.
mweglicx [Thu, 3 Dec 2015 07:30:16 +0000 (23:30 -0800)]
netdev_dpdk: pci_dev pointer check.

This change prevents netdev_dpdk from accessing a pointer
that is not valid.

Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agotun-metadata: Fix memory leak in table_free()
William Tu [Tue, 22 Dec 2015 17:44:14 +0000 (09:44 -0800)]
tun-metadata: Fix memory leak in table_free()

Found by valgrind, test case 643.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
8 years agotypes: Define OVS_*128_MAX statically.
Joe Stringer [Mon, 21 Dec 2015 23:56:40 +0000 (15:56 -0800)]
types: Define OVS_*128_MAX statically.

The previous definitions of these variables using designated
initializers caused a variety of issues when attempting to compile with
MSVC, particularly if including these headers from C++ code. By defining
them like this, we can appease MSVC and keep the definitions the same on
all platforms.

VMware-BZ: #1517163
Suggested-by: Yin Lin <linyi@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
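
To illustrate the portability point in the types commit above: a constant
defined with a positional initializer is accepted by both C and C++
compilers, whereas designated initializers only became part of C++ with
C++20. The sketch assumes a hypothetical `demo_u128` type with two 64-bit
halves; the real OVS 128-bit types and their exact definitions are not
shown in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves. */
struct demo_u128 {
    uint64_t lo;
    uint64_t hi;
};

/* Positional initializer, in member order: valid in C and in C++, so the
 * same header can be included from either language. */
static const struct demo_u128 DEMO_U128_MAX = { UINT64_MAX, UINT64_MAX };

/* Returns nonzero if 'v' equals the maximum value. */
int
demo_u128_is_max(struct demo_u128 v)
{
    return v.lo == DEMO_U128_MAX.lo && v.hi == DEMO_U128_MAX.hi;
}
```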
8 years agosystem-kmod-macros: Do not require the 'conntrack' tool.
Daniele Di Proietto [Mon, 2 Nov 2015 22:44:30 +0000 (14:44 -0800)]
system-kmod-macros: Do not require the 'conntrack' tool.

We can use 'ovstest test-netlink-conntrack' instead.  Now that it is
not required anymore, we can remove the HAVE_CONNTRACK macro in the
build system.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agosystem-traffic: use `dpctl/*conntrack` instead of `conntrack` tool.
Daniele Di Proietto [Mon, 2 Nov 2015 22:24:54 +0000 (14:24 -0800)]
system-traffic: use `dpctl/*conntrack` instead of `conntrack` tool.

Often in the tests we inspect the conntrack tables with the 'conntrack'
command line utility.  Since this may not always be available, and since
these tests are supposed to run with the upcoming userspace connection
tracker, it is better to use the newly implemented dpctl command.

Due to the tcp state mapping done in tcp_state_coalesce(), SYN_RECV is
replaced by ESTABLISHED in four places in the testsuite.  The rest of
the changes are just done to match the formatting style.

Also, check the conntrack entries for the IPv6 HTTP test.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoovstest: Add test-netlink-conntrack command.
Daniele Di Proietto [Thu, 29 Oct 2015 18:00:38 +0000 (11:00 -0700)]
ovstest: Add test-netlink-conntrack command.

Add a new test module to help debug Linux kernel conntrack development
using the netlink-conntrack module.

The tool has three uses:

* `ovstest test-netlink-conntrack dump [zone=zone]`

  lists the entries in the connection table

* `ovstest test-netlink-conntrack monitor`

  displays the updates on the connection table, until killed with Ctrl-C

* `ovstest test-netlink-conntrack flush [zone=zone]`

  empties the connection table (and therefore the expectations table).

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodpctl: Add new 'flush-conntrack' command.
Daniele Di Proietto [Wed, 28 Oct 2015 17:34:52 +0000 (10:34 -0700)]
dpctl: Add new 'flush-conntrack' command.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodpif-netlink: Implement ct_flush.
Daniele Di Proietto [Wed, 28 Oct 2015 17:34:26 +0000 (10:34 -0700)]
dpif-netlink: Implement ct_flush.

This member function is used by the ct-dpif module to provide its
services.  It's implemented using the netlink-conntrack module.

N.B. The Linux kernel datapaths share the connection tracker among them
and with the rest of the system.  Therefore the operations are not
really dpif specific.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodpctl: Add 'conntrack-dump' command.
Daniele Di Proietto [Wed, 28 Oct 2015 18:38:00 +0000 (11:38 -0700)]
dpctl: Add 'conntrack-dump' command.

It can be used to inspect the connection tracking entries in the
datapath.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodpif-netlink: Implement ct_dump_{start,next,done}.
Daniele Di Proietto [Wed, 28 Oct 2015 18:26:18 +0000 (11:26 -0700)]
dpif-netlink: Implement ct_dump_{start,next,done}.

These member functions are used by the ct-dpif module to provide its
services.  They're implemented using the netlink-conntrack module.

N.B. The Linux kernel datapaths share the connection tracker among them
and with the rest of the system.  Therefore the operations are not
really dpif specific.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoct-dpif: Add ct_dpif_flush().
Daniele Di Proietto [Wed, 28 Oct 2015 17:32:32 +0000 (10:32 -0700)]
ct-dpif: Add ct_dpif_flush().

This function will flush the connection tracking tables of a specific
datapath.

It simply calls a function pointer in the dpif_class. No dpif
currently implements the required interface.

The next commits will provide an implementation in dpif-netlink.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
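
The dispatch pattern described in the ct-dpif commit above, where a wrapper
calls an optional function pointer in the provider class, can be sketched
like this. All `demo_*` names are hypothetical, and returning EOPNOTSUPP
when a provider lacks the hook is an assumption made for illustration, not
something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct demo_dpif;

/* Hypothetical provider table; a real dpif class has many more members. */
struct demo_dpif_class {
    int (*ct_flush)(struct demo_dpif *, const uint16_t *zone);
};

struct demo_dpif {
    const struct demo_dpif_class *dpif_class;
    int flushes;            /* Counts calls, for demonstration only. */
};

/* Generic entry point: forwards to the provider if it implements the
 * hook, otherwise reports that the operation is unsupported. */
int
demo_ct_flush(struct demo_dpif *dpif, const uint16_t *zone)
{
    return dpif->dpif_class->ct_flush
           ? dpif->dpif_class->ct_flush(dpif, zone)
           : EOPNOTSUPP;
}

/* Hypothetical provider implementation. */
int
demo_netlink_ct_flush(struct demo_dpif *dpif, const uint16_t *zone)
{
    (void) zone;            /* A NULL zone would mean "all zones" here. */
    dpif->flushes++;
    return 0;
}

const struct demo_dpif_class demo_class_without_ct = { NULL };
const struct demo_dpif_class demo_class_with_ct = { demo_netlink_ct_flush };
```

This shape lets the generic layer land first, with providers filling in the
hook in later commits, as the series above does.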
8 years agoct-dpif: Add ct_dpif_dump_{start,next,done}().
Daniele Di Proietto [Wed, 28 Oct 2015 18:24:25 +0000 (11:24 -0700)]
ct-dpif: Add ct_dpif_dump_{start,next,done}().

These functions can be used to dump conntrack entries from a datapath.

They simply call a function pointer in the dpif_class. No dpif currently
implements the interface.

The next commits will provide an implementation in dpif-netlink.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agonetlink-conntrack: New module.
Daniele Di Proietto [Tue, 3 Nov 2015 21:52:44 +0000 (13:52 -0800)]
netlink-conntrack: New module.

This module uses the netlink interface provided by the Linux kernel
connection tracker to provide some visibility into the conntrack tables.

The module provides functions to:

* Convert a netlink representation of a connection into a
  struct 'ct_dpif_entry'.

* Dump all the connections.

* Flush all the connections.

* Listen for updates by registering a netlink notifier.

It will be used by dpif-netlink to implement the interface required by
the ct-dpif module.

Based on original work by Jarno Rajahalme

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoct-dpif: New module.
Daniele Di Proietto [Tue, 3 Nov 2015 23:00:03 +0000 (15:00 -0800)]
ct-dpif: New module.

This defines some structures (and their related formatting functions) to
manipulate entries in connection tracking tables.

It will be used by the next commits.

Based on original work by Jarno Rajahalme

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agodatapath: Backport: openvswitch: Fix serialization of non-masked set actions.
Pravin B Shelar [Mon, 21 Dec 2015 22:57:36 +0000 (14:57 -0800)]
datapath: Backport: openvswitch: Fix serialization of non-masked set actions.

I found this missing commit while checking diff against upstream OVS.

Upstream Commit msg:
    Set actions consist of a regular OVS_KEY_ATTR_* attribute nested inside
    of a OVS_ACTION_ATTR_SET action attribute. When converting masked actions
    back to regular set actions, the inner attribute length was not changed,
    ie, double the length being serialized. This patch fixes the bug.

    Fixes: 83d2b9b ("net: openvswitch: Support masked set actions.")
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: f4f8e738505 ("openvswitch: Fix serialization of non-masked set
actions")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
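
A rough sketch of the length handling described in the backported commit
above, assuming a masked set payload is laid out as a key followed by a
mask of the same size. `demo_unmask_set_payload()` is a hypothetical helper
operating on raw bytes; it is not the kernel's netlink serialization code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies only the key half of a masked set payload (key followed by mask)
 * into 'out' and returns the length of the non-masked attribute. */
size_t
demo_unmask_set_payload(const uint8_t *masked, size_t masked_len,
                        uint8_t *out)
{
    size_t key_len = masked_len / 2;   /* Key and mask have equal size. */

    memcpy(out, masked, key_len);
    return key_len;    /* Reporting masked_len here doubles the length. */
}
```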
8 years agodatapath: stt: Fix device list management.
Pravin B Shelar [Mon, 21 Dec 2015 01:05:24 +0000 (17:05 -0800)]
datapath: stt: Fix device list management.

STT receive can accept a packet on a device that is not in the UP state.
This patch fixes the issue by introducing another list
of devices that contains only devices in the UP state. This list can
be used for searching STT devices on packet receive.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agostream-ssl: Fix misleading bound address format.
Ben Pfaff [Sat, 19 Dec 2015 06:09:57 +0000 (22:09 -0800)]
stream-ssl: Fix misleading bound address format.

When the SSL code presents the name of the address to which it is bound,
it should include an "ssl:" or "pssl:" prefix instead of "tcp:" or "ptcp:".

Reported-by: meishengxin <meishengxin@huawei.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2015-December/019694.html
Fixes: e731d71bf47b ("Add IPv6 support for OpenFlow, OVSDB, NetFlow, and sFlow.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agodatapath: stt: Fix error handling in stt_start().
Pravin B Shelar [Sun, 20 Dec 2015 06:21:56 +0000 (22:21 -0800)]
datapath: stt: Fix error handling in stt_start().

The bug was reported by Joe Stringer.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: stt: Do not access stt_dev socket in lookup.
Pravin B Shelar [Sun, 20 Dec 2015 03:19:22 +0000 (19:19 -0800)]
datapath: stt: Do not access stt_dev socket in lookup.

An STT device is added to the device list at device creation time, but
the dev socket is initialized only when the dev is UP. So avoid accessing
the STT socket while searching for a device.

---8<---
IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
Oops: 0000 [#1] PREEMPT SMP
Hardware name: VMware, Inc. VMware Virtual Platform/440BX
RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
Stack:
 ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
 ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
 ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
Call Trace:
 <IRQ>
 [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
 [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
 [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
 [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
 [<ffffffff90647398>] ip_rcv+0x298/0x3d0
 [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
 [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
 [<ffffffff90610b3e>] process_backlog+0xae/0x180
 [<ffffffff906102c2>] net_rx_action+0x152/0x270
 [<ffffffff9006d625>] __do_softirq+0xf5/0x320
 [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30

Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Joe Stringer <joe@ovn.org>
8 years agotun-metadata: Fix memory leak in tun_metadata_add_entry() corner case.
Ben Pfaff [Thu, 17 Dec 2015 07:32:54 +0000 (23:32 -0800)]
tun-metadata: Fix memory leak in tun_metadata_add_entry() corner case.

Found by valgrind.

Reported-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agocompat: Backport conntrack strictly to v3.10+.
Joe Stringer [Tue, 15 Dec 2015 19:24:34 +0000 (11:24 -0800)]
compat: Backport conntrack strictly to v3.10+.

The conntrack/ipfrag backport was previously not entirely consistent in
its include for versions 3.9 and 3.10. The intention was to build it for
all kernels 3.10 and newer, so fix the version checks.

Reported-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Simon Horman <simon.horman@netronome.com>
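
The "3.10 and newer" condition from the compat commit above can be
expressed as a single version comparison. The sketch uses the classic
encoding of the kernel's KERNEL_VERSION() macro under the hypothetical name
`DEMO_KERNEL_VERSION`; the function name is also an illustration, and any
upper bound the real backport applies is not covered.

```c
#include <assert.h>

/* Same encoding as the kernel's classic KERNEL_VERSION() macro. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Returns nonzero if the backport should be built for 'version',
 * i.e. for 3.10.0 and anything newer. */
int
demo_build_conntrack_backport(int version)
{
    return version >= DEMO_KERNEL_VERSION(3, 10, 0);
}
```

Writing the bound once and reusing it avoids the kind of off-by-one
mismatch between 3.9 and 3.10 checks that the commit fixes.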
8 years agocompat: Always use own __ipv6_select_ident().
Joe Stringer [Tue, 15 Dec 2015 19:24:33 +0000 (11:24 -0800)]
compat: Always use own __ipv6_select_ident().

If the ip fragmentation backport is enabled, we should always use our
own {,__}ipv6_select_ident(). This fixes the following issue on some
v3.19 kernels:

datapath/linux/ip6_output.c:93:12: error: conflicting types for
‘__ipv6_select_ident’
 static u32 __ipv6_select_ident(struct net *net, u32 hashrnd,

Reported-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Simon Horman <simon.horman@netronome.com>
8 years agodatapath: stt: Use RCU API to update stt-dev list.
Pravin B Shelar [Thu, 17 Dec 2015 21:56:39 +0000 (13:56 -0800)]
datapath: stt: Use RCU API to update stt-dev list.

The following crash was reported for an STT tunnel. I am not able to
reproduce it, but the use of the wrong list manipulation API is the likely
culprit.

---8<---
IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
Oops: 0000 [#1] PREEMPT SMP
Hardware name: VMware, Inc. VMware Virtual Platform/440BX
RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
Stack:
 ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
 ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
 ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
Call Trace:
 <IRQ>
 [<ffffffff90646d10>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
 [<ffffffff90646d10>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
 [<ffffffff90646d10>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
 [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
 [<ffffffff90647398>] ip_rcv+0x298/0x3d0
 [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
 [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
 [<ffffffff90610b3e>] process_backlog+0xae/0x180
 [<ffffffff906102c2>] net_rx_action+0x152/0x270
 [<ffffffff9006d625>] __do_softirq+0xf5/0x320
 [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30

Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agogeneve-map-rename: rename geneve-map to tlv-map.
Mengke Liu [Tue, 15 Dec 2015 18:47:50 +0000 (02:47 +0800)]
geneve-map-rename: rename geneve-map to tlv-map.

This patch renames the commands related to geneve-map to more
generic names, as follows:
add-geneve-map -> add-tlv-map
del-geneve-map -> del-tlv-map
dump-geneve-map -> dump-tlv-map

It also renames the Geneve_table to tlv_table.

By doing this renaming, the NSH variable context header (the same TLV
format as Geneve) or other protocol can reuse the field tun_metadata<N>
in the future.

Signed-off-by: Mengke Liu <mengke.liu@intel.com>
Signed-off-by: Ricky Li <ricky.li@intel.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
8 years agolib: Use proper type cast to poison lists.
Joe Stringer [Tue, 15 Dec 2015 06:30:11 +0000 (22:30 -0800)]
lib: Use proper type cast to poison lists.

'struct ovs_list' comprises two pointers to 'struct ovs_list'.
Use these in the cast rather than void*.

VMware-BZ: #1571356
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
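
A small sketch of list poisoning with casts to the member's own pointer
type, as the lib commit above suggests. `demo_list`, the poison pattern
0xcccccccc and both helper functions are assumptions for illustration; the
real macro and value in the OVS list header may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical doubly linked list node with the same shape as the one
 * described in the commit message: two pointers to its own type. */
struct demo_list {
    struct demo_list *prev;
    struct demo_list *next;
};

/* Recognizable, deliberately invalid pointer value (an assumption). */
#define DEMO_POISON ((struct demo_list *) (uintptr_t) 0xccccccccUL)

/* Marks 'list' as dead so that later use is easy to spot in a debugger.
 * The casts target 'struct demo_list *' rather than 'void *', matching
 * the declared member type (C++ has no implicit conversion from void *). */
void
demo_list_poison(struct demo_list *list)
{
    list->prev = DEMO_POISON;
    list->next = DEMO_POISON;
}

/* Returns nonzero if 'list' carries the poison pattern. */
int
demo_list_is_poisoned(const struct demo_list *list)
{
    return list->prev == DEMO_POISON && list->next == DEMO_POISON;
}
```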
8 years agodatapath-windows: remove ASSERT in OvsDoFlowLookupOutput()
Nithin Raju [Thu, 10 Dec 2015 19:16:51 +0000 (11:16 -0800)]
datapath-windows: remove ASSERT in OvsDoFlowLookupOutput()

We needed this ASSERT earlier to catch unexpected cases. This code seems
to be fairly stable, and we can remove the ASSERT.

It is annoying to be hitting this ASSERT while changing the internal
adapter properties.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoconfifugre: Fix broken sed calls in shell code.
Alin Serdean [Thu, 10 Dec 2015 22:18:51 +0000 (22:18 +0000)]
confifugre: Fix broken sed calls in shell code.

Commit 43000bc (openvswitch.m4: Portability improvement), which introduced
a portability improvement, also introduced two bugs.  This commit fixes
both bugs by adding the 's' command to the $SED call and by changing to
x86 for 32 bit instead of x64.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath-windows: Cleanup Stt.c
Alin Serdean [Fri, 11 Dec 2015 14:59:07 +0000 (14:59 +0000)]
datapath-windows: Cleanup Stt.c

Remove double include for Flow.h and sort the includes alphabetically.
Also remove tabs.

Found by inspection.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
8 years agodatapath: compat: Block upstream ip_tunnels functions.
Pravin B Shelar [Fri, 11 Dec 2015 04:03:01 +0000 (20:03 -0800)]
datapath: compat: Block upstream ip_tunnels functions.

Since the upstream and compat ip_tunnel structures are not the same, we
cannot use the exported upstream functions.
This patch blocks the definitions that use the internal ip_tunnel
structure. Functions that do not depend on these structures are
allowed by explicitly defining them in the header files, e.g.
iptunnel_handle_offloads() and iptunnel_pull_header().

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: define compat ip_tunnel_get_link_net()
Pravin B Shelar [Fri, 11 Dec 2015 04:03:00 +0000 (20:03 -0800)]
datapath: define compat ip_tunnel_get_link_net()

Like ip_tunnel_get_iflink(), the function ip_tunnel_get_link_net()
also depends on the ip_tunnel structure. So this patch defines a
compat implementation for it.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: define compat ip_tunnel_get_iflink()
Pravin B Shelar [Fri, 11 Dec 2015 04:02:59 +0000 (20:02 -0800)]
datapath: define compat ip_tunnel_get_iflink()

ip_tunnel_get_iflink() depends on the ip_tunnel structure. But the OVS
compat layer defines its own ip_tunnel structure, which is not
compatible with all upstream kernel versions. Therefore we
cannot use this function.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath-windows: Fix small bug in STT
Alin Serdean [Fri, 11 Dec 2015 20:54:05 +0000 (20:54 +0000)]
datapath-windows: Fix small bug in STT

Allow STT encapsulation to take place in the case we have a TCP payload
without LSO.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
8 years agodatapath-windows: Add GRE TEB support for windows datapath
Alin Serdean [Fri, 11 Dec 2015 19:18:25 +0000 (19:18 +0000)]
datapath-windows: Add GRE TEB support for windows datapath

This patch introduces support for GRE TEB (transparent ethernet bridging)
for the windows datapath.

The GRE support is based on http://tools.ietf.org/html/rfc2890, without
taking into account the GRE sequence, and it supports only the GRE protocol
type 0x6558 (transparent ethernet bridging) like its linux counterpart.

Util.h: define the GRE pool tag
Vport.c/h: sort the includes alphabetically
           add the function OvsFindTunnelVportByPortType which searches the
           tunnelVportsArray for a given port type
Actions.c : sort the includes alphabetically
            call the GRE encapsulation / decapsulation functions when needed
Gre.c/h : add GRE type defines
          add initialization/cleanup functions
          add encapsulation / decapsulation functions with software offloads
          support (hardware offloads will be added in a separate patch)

Tested using: PSPING
              (https://technet.microsoft.com/en-us/sysinternals/psping.aspx)
              (ICMP, TCP, UDP) with various packet lengths
              IPERF3
              (https://iperf.fr/iperf-download.php)
              (TCP, UDP) with various options

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
8 years agoovn-controller: Add clarifying comment about main loop in binding_run().
Justin Pettit [Fri, 11 Dec 2015 01:56:22 +0000 (17:56 -0800)]
ovn-controller: Add clarifying comment about main loop in binding_run().

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agoovn: Fix ACLs for child logical ports.
Russell Bryant [Tue, 17 Nov 2015 22:00:06 +0000 (14:00 -0800)]
ovn: Fix ACLs for child logical ports.

The physical input flows for child logical ports (for the
container-in-a-VM use case, for example) did not set a conntrack zone
ID.  The previous code only allocated a zone ID for local VIFs and
missed doing it for child ports.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agodatapath: Backport: skbuff: Fix skb checksum partial check.
Pravin B Shelar [Thu, 10 Dec 2015 22:42:43 +0000 (14:42 -0800)]
datapath: Backport: skbuff: Fix skb checksum partial check.

This bug fix is not required for OVS use cases, but it is
nice to keep the function consistent with the upstream implementation.

Upstream commit:

    Earlier patch 6ae459bda tried to detect void checksum partial
    skb by comparing pull length to checksum offset. But it does
    not work for all cases since checksum-offset depends on
    updates to skb->data.

    Following patch fixes it by validating checksum start offset
    after skb-data pointer is updated. Negative value of checksum
    offset start means there is no need to checksum.

    Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
Reported-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 31b33dfb0a1 ("skbuff: Fix skb checksum partial check");
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: Fix STT packet receive handling.
Pravin B Shelar [Thu, 10 Dec 2015 22:19:56 +0000 (14:19 -0800)]
datapath: Fix STT packet receive handling.

STT reassembly can generate a list of packets, but it was
handled as a single skb. The following patch fixes it.

Fixes: e23775f20 ("datapath: Add support for lwtunnel").
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoFAQ: Add entry about different datapaths features.
Daniele Di Proietto [Fri, 11 Dec 2015 00:15:11 +0000 (16:15 -0800)]
FAQ: Add entry about different datapaths features.

This is an easy way to keep track of the features supported by the
different datapaths.

Nithin helped fill in the list for the Hyper-V port.

CC: Nithin Raju <nithin@vmware.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agoodp-util: Correctly [de]serialize mask for ND attributes.
Daniele Di Proietto [Wed, 9 Dec 2015 02:39:18 +0000 (18:39 -0800)]
odp-util: Correctly [de]serialize mask for ND attributes.

When converting between ODP attributes and struct flow_wildcards, we
check that all the prerequisites are exact matched on the mask.

For ND(ICMPv6) attributes, an exact match on tp_src and tp_dst
(which in this context are the icmp type and code) should look like
htons(0xff), not htons(0xffff).  Fix this in two places.

The consequences were that the ODP mask wouldn't include the ND
attributes and the flow would be deleted by the revalidation.
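
The mask check described above can be sketched as follows.  This is a
minimal illustration, not the odp-util code: nd_fields_exact_match() is
a hypothetical helper, and it only assumes that the ICMPv6 type and code
are 8-bit values carried in 16-bit network-order fields.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper.  ICMPv6 type and code are 8-bit values stored in
 * the 16-bit tp_src/tp_dst fields, so an exact match covers only the low
 * byte: htons(0xff), not htons(0xffff). */
static bool
nd_fields_exact_match(uint16_t tp_src_mask, uint16_t tp_dst_mask)
{
    return tp_src_mask == htons(0xff) && tp_dst_mask == htons(0xff);
}
```

With this check, a mask of htons(0xffff) is not treated as an exact
match on the ICMPv6 type or code.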

8 years agoodp-util: Return exact mask if netlink mask attribute is missing.
Daniele Di Proietto [Tue, 8 Dec 2015 01:30:25 +0000 (17:30 -0800)]
odp-util: Return exact mask if netlink mask attribute is missing.

In the ODP context an empty mask netlink attribute usually means that
the flow should be an exact match.

odp_flow_key_to_mask{,_udpif}() instead return a struct flow_wildcards
with matches only on recirc_id and vlan_tci.

A more appropriate behavior is to handle a missing (zero length) netlink
mask specially (like we do in userspace and Linux datapath) and create
an exact match flow_wildcards from the original flow.

This fixes a bug in revalidate_ukey(): every flow created with
megaflows disabled would be revalidated away, because the mask would
seem too generic. (Another possible fix would be to handle the special
case of a missing mask in revalidate_ukey(), but this seems a more
generic solution).
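
A simplified sketch of the special case, under stated assumptions:
struct sketch_flow and sketch_key_to_mask() are hypothetical stand-ins,
and the exact-match mask is modelled as all-ones rather than being
derived from the original flow as the real code does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much reduced stand-in for a flow/mask structure. */
struct sketch_flow {
    uint32_t recirc_id;
    uint16_t vlan_tci;
    uint16_t tp_src;
    uint16_t tp_dst;
};

/* Builds 'mask' from a serialized mask attribute.  A missing (zero
 * length) attribute is treated as "exact match": every bit is set.
 * Otherwise the provided bytes are copied and the rest is wildcarded. */
static void
sketch_key_to_mask(const void *mask_attr, size_t mask_len,
                   struct sketch_flow *mask)
{
    assert(mask != NULL);
    if (mask_len == 0) {
        memset(mask, 0xff, sizeof *mask);
        return;
    }
    memset(mask, 0, sizeof *mask);
    memcpy(mask, mask_attr, mask_len < sizeof *mask ? mask_len : sizeof *mask);
}
```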

8 years agoodp-util: Commit ICMP set only for ICMP packets.
Daniele Di Proietto [Tue, 8 Dec 2015 23:44:51 +0000 (15:44 -0800)]
odp-util: Commit ICMP set only for ICMP packets.

commit_set_icmp_action() should do its job only if the packet is ICMP,
otherwise there will be two problems:

* A set ICMP action will be inserted in the ODP actions and the flow
  will be slow pathed.
* The tp_src and tp_dst field will be unwildcarded.

Normal TCP or UDP packets won't be impacted, because
commit_set_icmp_action() is called after commit_set_port_action() and it
will see the fields as already committed (TCP/UDP transport ports and ICMP
code/type are stored in the same members in struct flow).

MPLS packets though will hit the bug, causing a nonsensical set action
(which will end up zeroing the transport source port) and an invalid
mask to be generated.

The commit also alters an MPLS testcase to trigger the bug.
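
The guard can be pictured with the sketch below.  sketch_is_icmp() and
the SKETCH_ETH_TYPE_* constants are hypothetical names used only for
illustration; the real check lives in commit_set_icmp_action() and may
be written differently.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ETH_TYPE_IP   0x0800
#define SKETCH_ETH_TYPE_IPV6 0x86dd
#define SKETCH_ETH_TYPE_MPLS 0x8847

/* Returns true only for ICMPv4 over IPv4 or ICMPv6 over IPv6.  Any other
 * packet, MPLS included, must not get a set ICMP action committed. */
static bool
sketch_is_icmp(uint16_t dl_type, uint8_t nw_proto)
{
    return (dl_type == SKETCH_ETH_TYPE_IP && nw_proto == IPPROTO_ICMP)
           || (dl_type == SKETCH_ETH_TYPE_IPV6 && nw_proto == IPPROTO_ICMPV6);
}
```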

8 years agotnl-ports: Generate mask with correct prerequisites.
Daniele Di Proietto [Mon, 23 Nov 2015 23:37:46 +0000 (15:37 -0800)]
tnl-ports: Generate mask with correct prerequisites.

We should match on the transport ports only if the tunnel has a UDP
header.  It doesn't make sense to match on transport port for GRE
tunnels.

Also, to match on fragment bits we should use FLOW_NW_FRAG_MASK instead
of 0xFF.  FLOW_NW_FRAG_MASK is what we get if we convert to the ODP
netlink format and back.
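
As a rough model of why 0xFF does not survive the round trip: the
sketch below assumes the fragment mask is made of two flag bits and
models the ODP conversion as keeping only those bits.  The SKETCH_*
names and values are assumptions for illustration, not copied from the
tree.

```c
#include <stdint.h>

/* Assumed layout: one bit for "any fragment", one for "later fragment". */
#define SKETCH_NW_FRAG_ANY   (1 << 0)
#define SKETCH_NW_FRAG_LATER (1 << 1)
#define SKETCH_NW_FRAG_MASK  (SKETCH_NW_FRAG_ANY | SKETCH_NW_FRAG_LATER)

/* Models converting a fragment mask to the ODP netlink format and back:
 * only the defined fragment bits are preserved. */
static uint8_t
sketch_frag_mask_roundtrip(uint8_t nw_frag_mask)
{
    return nw_frag_mask & SKETCH_NW_FRAG_MASK;
}
```

Under this model a mask of 0xFF comes back narrower than it went in,
while SKETCH_NW_FRAG_MASK comes back unchanged.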

Adding the correct masks in the tunnel router classifier helps in making
sure that the translation generates masks that respect prerequisites.

If the mask has some fields that do not respect prerequisites, the flow
will get deleted by revalidation, because translating to ODP format and
back will generate a more generic mask, which will be perceived as too
generic (compared with the one generated by the translation).

8 years agoofproto-dpif-xlate: Fix revalidation in execute_controller_action().
Daniele Di Proietto [Fri, 4 Dec 2015 22:04:26 +0000 (14:04 -0800)]
ofproto-dpif-xlate: Fix revalidation in execute_controller_action().

If there's no actual packet (e.g. during revalidation),
execute_controller_action() exits right away, without calling
xlate_commit_actions().

xlate_commit_actions() might have an influence on slow_path reason
(which is included in the generated ODP actions), meaning that the
revalidation will not generate the same actions as the original
translation.

Fix the problem by making execute_controller_action() call
xlate_commit_actions() even without a packet.

8 years agodpif-netdev: Initialize match.tun_md in various places.
Daniele Di Proietto [Sat, 21 Nov 2015 00:15:36 +0000 (16:15 -0800)]
dpif-netdev: Initialize match.tun_md in various places.

This solves a crash in dp_netdev_flow_add(), when log level is debug.

8 years agodatapath: Define nf_connlabels_{put,get}.
Joe Stringer [Wed, 9 Dec 2015 00:14:07 +0000 (16:14 -0800)]
datapath: Define nf_connlabels_{put,get}.

Previously this was only done when connlabels were enabled in the kernel
config, even if the functions didn't exist. Fix the compile error.

Fixes: b8cce81fa9a1 ("compat: Backport nf_connlabels_{get, put}().")
Reported-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Respect conntrack zone even if invalid.
Joe Stringer [Wed, 9 Dec 2015 00:14:06 +0000 (16:14 -0800)]
datapath: Respect conntrack zone even if invalid.

If userspace executes ct(zone=1), and the connection tracker determines
that the packet is invalid, then the ct_zone flow key field is populated
with the default zone rather than the zone that was specified. Even
though connection tracking failed, this field should be updated with the
value that userspace specified. Fix the issue.

Fixes: a94ebc39996b ("datapath: Add conntrack action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agoovn: Fix ct_state bit mappings in OVN symtab.
Russell Bryant [Tue, 8 Dec 2015 22:32:47 +0000 (17:32 -0500)]
ovn: Fix ct_state bit mappings in OVN symtab.

The OVN symbol table contained outdated mappings between connection
states and the corresponding bit in the ct_state field.  This patch
updates the symbol table with the proper values as defined in
lib/packets.h.

Signed-off-by: Russell Bryant <russell@ovn.org>
Fixes: 63bc9fb1c69f ("packets: Reorder CS_* flags to remove gap.")
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoseq: Add a coverage counter for seq_change.
Jarno Rajahalme [Tue, 8 Dec 2015 19:35:49 +0000 (11:35 -0800)]
seq: Add a coverage counter for seq_change.

Having a coverage counter tracking the value of the internal seq_next
should help in debugging.

Suggested-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath: Backport: vxlan: interpret IP headers for ECN correctly
Pravin B Shelar [Tue, 8 Dec 2015 02:23:21 +0000 (18:23 -0800)]
datapath: Backport: vxlan: interpret IP headers for ECN correctly

Upstream commit:
    When looking for outer IP header, use the actual socket address family, not
    the address family of the default destination which is not set for metadata
    based interfaces (and doesn't have to match the address family of the
    received packet even if it was set).

    Fix also the misleading comment.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: ce212d0f6f5 ("vxlan: interpret IP headers for ECN correctly")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: Backport: vxlan: fix incorrect RCO bit in VXLAN header
Pravin B Shelar [Tue, 8 Dec 2015 02:23:20 +0000 (18:23 -0800)]
datapath: Backport: vxlan: fix incorrect RCO bit in VXLAN header

Upstream commit:
    Commit 3511494ce2f3d ("vxlan: Group Policy extension") changed definition of
    VXLAN_HF_RCO from 0x00200000 to BIT(24). This is obviously incorrect. It's
    also in violation with the RFC draft.
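
The difference between the two definitions is plain arithmetic and can
be checked with the sketch below.  SKETCH_BIT() and sketch_bit_index()
are hypothetical helpers; only the two constant values come from the
text above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BIT(n) (UINT32_C(1) << (n))

#define SKETCH_RCO_ORIGINAL  UINT32_C(0x00200000)  /* Value before the change. */
#define SKETCH_RCO_REGRESSED SKETCH_BIT(24)        /* Value after the change. */

/* Returns the index of the highest set bit in 'flag'. */
static int
sketch_bit_index(uint32_t flag)
{
    int i = 0;

    assert(flag != 0);
    while (flag > 1) {
        flag >>= 1;
        i++;
    }
    return i;
}
```

0x00200000 is bit 21, whereas BIT(24) is 0x01000000, so the two
definitions select different bits of the VXLAN header flags.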

    Fixes: 3511494ce2f3d ("vxlan: Group Policy extension")
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: c5fb8caaf91 ("vxlan: fix incorrect RCO bit in VXLAN header")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: Backport: openvswitch: properly refcount vport-vxlan module
Pravin B Shelar [Tue, 8 Dec 2015 02:23:19 +0000 (18:23 -0800)]
datapath: Backport: openvswitch: properly refcount vport-vxlan module

Upstream commit:
    After 614732eaa12d, no refcount is maintained for the vport-vxlan module.
    This allows the userspace to remove such module while vport-vxlan
    devices still exist, which leads to later oops.

    v1 -> v2:
     - move vport 'owner' initialization in ovs_vport_ops_register()
       and make such function a macro

    Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 83e4bf7a74 ("openvswitch: properly refcount vport-vxlan
module").
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: Backport: openvswitch: fix hangup on vxlan/gre/geneve device deletion
Pravin B Shelar [Tue, 8 Dec 2015 02:23:18 +0000 (18:23 -0800)]
datapath: Backport: openvswitch: fix hangup on vxlan/gre/geneve device deletion

Upstream commit:

    Each openvswitch tunnel vport (vxlan,gre,geneve) holds a reference
    to the underlying tunnel device, but never released it when such
    device is deleted.
    Deleting the underlying device via the ip tool causes the kernel to
    hang in the netdev_wait_allrefs() loop.
    This commit ensures that on device unregistration dp_detach_port_notify()
    is called for all vports that hold the device reference, properly
    releasing it.

    Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device")
    Fixes: b2acd1dc3949 ("openvswitch: Use regular GRE net_device instead of vport")
    Fixes: 6b001e682e90 ("openvswitch: Use Geneve device.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 131753030("openvswitch: fix hangup on vxlan/gre/geneve device
deletion").
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: Avoid warning for unused static data on Linux <=3.9.0.
Ben Pfaff [Mon, 7 Dec 2015 20:34:08 +0000 (12:34 -0800)]
datapath: Avoid warning for unused static data on Linux <=3.9.0.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agoNEWS: Improve organization.
Ben Pfaff [Tue, 8 Dec 2015 00:49:20 +0000 (16:49 -0800)]
NEWS: Improve organization.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agoofproto-dpif: add reply on error in ofproto/tnl-push-pop
Ilya Maximets [Mon, 7 Dec 2015 10:02:41 +0000 (13:02 +0300)]
ofproto-dpif: add reply on error in ofproto/tnl-push-pop

Fixes hang of 'ovs-appctl ofproto/tnl-push-pop' when an invalid
argument is passed.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agobond: Use correct type for slave's change_seq.
Jarno Rajahalme [Fri, 4 Dec 2015 18:19:07 +0000 (10:19 -0800)]
bond: Use correct type for slave's change_seq.

seq values are 64-bit, and storing them to a 32-bit variable causes
the stored value never to match actual seq value after the seq value
gets big enough.

This is a likely cause of OVS main thread using 100% CPU in a system
using bonds after some runtime.

VMware-BZ: #1564993
Reported-by: Hiram Bayless <hbayless@vmware.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agotests: Add tunnel-push-pop-ipv6 tests
Thadeu Lima de Souza Cascardo [Fri, 4 Dec 2015 14:36:51 +0000 (12:36 -0200)]
tests: Add tunnel-push-pop-ipv6 tests

Based on IPv4 tests, test tunnels over IPv6. In order to do that, add
netdev-dummy/ip6addr command for dummy bridges, and get_in6 support for
netdev-dummy as well.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoAllow flow-based IPv6 tunnels to be configured with OpenFlow
Thadeu Lima de Souza Cascardo [Fri, 4 Dec 2015 14:36:50 +0000 (12:36 -0200)]
Allow flow-based IPv6 tunnels to be configured with OpenFlow

With this patch, it is possible to set the IPv6 source and destination address
in flow-based tunnels.

$ ovs-ofctl add-flow br0 "in_port=LOCAL actions=set_field:2001:cafe::92->tun_ipv6_dst"

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Co-authored-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agotnl-neigh-cache: Remove tnl_arp_lookup().
Thadeu Lima de Souza Cascardo [Fri, 4 Dec 2015 14:36:49 +0000 (12:36 -0200)]
tnl-neigh-cache: Remove tnl_arp_lookup().

tnl_arp_lookup is not used anymore. All users have been converted to
IPv4-mapped addresses. New users need to use IPv4-mapped addresses and use
tnl_neigh_lookup.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto-dpif-xlate: Support IPv6 when sending to tunnel
Thadeu Lima de Souza Cascardo [Fri, 4 Dec 2015 14:36:48 +0000 (12:36 -0200)]
ofproto-dpif-xlate: Support IPv6 when sending to tunnel

When doing push/pop and building tunnel header, do IPv6 route lookups and send
Neighbor Solicitations if needed.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Cc: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agonetdev-vport: Add IPv6 support for build/push/pop tunnel header
Thadeu Lima de Souza Cascardo [Fri, 4 Dec 2015 14:36:47 +0000 (12:36 -0200)]
netdev-vport: Add IPv6 support for build/push/pop tunnel header

This includes VXLAN, GRE and Geneve.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agopackets: Introduce in6_addr_mapped_ipv4() and use where appropriate.
Ben Pfaff [Thu, 3 Dec 2015 21:00:38 +0000 (13:00 -0800)]
packets: Introduce in6_addr_mapped_ipv4() and use where appropriate.

This allows code to be written more naturally in some cases.
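
For readers unfamiliar with IPv4-mapped addresses, the idea is sketched
below.  sketch_mapped_ipv4() is a hypothetical standalone helper written
for this illustration; it is not the function added by this commit and
its signature is an assumption.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Builds the IPv4-mapped IPv6 address ::ffff:a.b.c.d from an IPv4
 * address given in network byte order. */
static struct in6_addr
sketch_mapped_ipv4(uint32_t ip4_be)
{
    struct in6_addr addr;

    memset(&addr, 0, sizeof addr);       /* First 80 bits are zero. */
    addr.s6_addr[10] = 0xff;             /* Next 16 bits are all ones. */
    addr.s6_addr[11] = 0xff;
    memcpy(&addr.s6_addr[12], &ip4_be, sizeof ip4_be);  /* Last 32 bits. */
    return addr;
}
```

For example, sketch_mapped_ipv4(htonl(0xc0000201)) yields the mapped
form of 192.0.2.1.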

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
8 years agoovs-router: fix compile error on FreeBSD
Kevin Lo [Fri, 4 Dec 2015 15:31:40 +0000 (23:31 +0800)]
ovs-router: fix compile error on FreeBSD

FreeBSD needs to include netinet/in.h to define struct in6_addr.

Signed-off-by: Kevin Lo <kevlo@FreeBSD.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoRevert "conntrack: Add support for NAT."
Justin Pettit [Fri, 4 Dec 2015 07:51:59 +0000 (23:51 -0800)]
Revert "conntrack: Add support for NAT."

This reverts commit 9ac0aadab9f99c5f9cbe8b30cc095ce9be4be4e9.

NAT functionality is still evolving, so revert the feature in 2.5 as the
user-visible interface will likely change before the next release.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
8 years agoPrepare for 2.5.0.
Justin Pettit [Fri, 4 Dec 2015 07:18:19 +0000 (23:18 -0800)]
Prepare for 2.5.0.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoovn-northd: Only run idl loop if something changed.
Joe Stringer [Fri, 4 Dec 2015 01:11:49 +0000 (17:11 -0800)]
ovn-northd: Only run idl loop if something changed.

Before refactoring the main loop to reuse ovsdb_idl_loop_* functions, we
would use a sequence to see if anything changed in NB database to
compute and notify the SB database, and vice versa. This logic got
dropped with the refactor, causing a testsuite failure in the ovn-sbctl
test. Reintroduce the IDL sequence number checking.

Fixes: 331e7aefe1c6 ("ovn-northd: Refactor main loop to use ovsdb_idl_loop_*
functions")
Suggested-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Tested-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoFAQ: Document kernel feature support.
Joe Stringer [Thu, 3 Dec 2015 07:53:56 +0000 (23:53 -0800)]
FAQ: Document kernel feature support.

Some recent features have more stringent requirements for kernel
versions than the FAQ describes. Add an entry to be more explicit on
which features work with which versions of the upstream kernel.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Scrub skb between namespaces
Joe Stringer [Thu, 3 Dec 2015 07:53:55 +0000 (23:53 -0800)]
datapath: Scrub skb between namespaces

If OVS receives a packet from another namespace, then the packet should
be scrubbed. However, people have already begun to rely on the behaviour
that skb->mark is preserved across namespaces, so retain this one field.

This is mainly to address information leakage between namespaces when
using OVS internal ports, but by placing it in ovs_vport_receive() it is
more generally applicable, meaning it should not be overlooked if other
port types are allowed to be moved into namespaces in future.

Upstream: 740dbc289155 ("openvswitch: Scrub skb between namespaces")
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Backport conntrack fixes.
Joe Stringer [Thu, 3 Dec 2015 07:53:54 +0000 (23:53 -0800)]
datapath: Backport conntrack fixes.

Backport the following fixes for conntrack from upstream.

9723e6abc70a openswitch: fix typo CONFIG_NF_CONNTRACK_LABEL
0d5cdef8d5dd openvswitch: Fix conntrack compilation without mark.
982b52700482 openvswitch: Fix mask generation for nested attributes.
cc5706056baa openvswitch: Fix IPv6 exthdr handling with ct helpers.
33db4125ec74 openvswitch: Rename LABEL->LABELS
b8f2257069f1 openvswitch: Fix skb leak in ovs_fragment()
ec0d043d05e6 openvswitch: Ensure flow is valid before executing ct
6f225952461b openvswitch: Reject ct_state unsupported bits
fbccce5965a5 openvswitch: Extend ct_state match field to 32 bits
ab38a7b5a449 openvswitch: Change CT_ATTR_FLAGS to CT_ATTR_COMMIT
9e384715e9e7 openvswitch: Reject ct_state masks for unknown bits
4f0909ee3d8e openvswitch: Mark connections new when not confirmed.
e754ec69ab69 openvswitch: Serialize nested ct actions if provided
74c16618137f openvswitch: Fix double-free on ip_defrag() errors
6f5cadee44d8 openvswitch: Fix skb leak using IPv6 defrag

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Allow attaching helpers to ct action
Joe Stringer [Thu, 3 Dec 2015 07:53:53 +0000 (23:53 -0800)]
datapath: Allow attaching helpers to ct action

Add support for using conntrack helpers to assist protocol detection.
The new OVS_CT_ATTR_HELPER attribute of the CT action specifies a helper
to be used for this connection. If no helper is specified, then helpers
will be automatically applied as per the sysctl configuration of
net.netfilter.nf_conntrack_helper.

The helper may be specified as part of the conntrack action, eg:
ct(helper=ftp). Initial packets for related connections should be
committed to allow later packets for the flow to be considered
established.

Example ovs-ofctl flows allowing FTP connections from ports 1->2:
in_port=1,tcp,action=ct(helper=ftp,commit),2
in_port=2,tcp,ct_state=-trk,action=ct(recirc)
in_port=2,tcp,ct_state=+trk-new+est,action=1
in_port=2,tcp,ct_state=+trk+rel,action=1

Upstream: cae3a26 "openvswitch: Allow attaching helpers to ct action"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Allow matching on conntrack label
Joe Stringer [Thu, 3 Dec 2015 07:53:52 +0000 (23:53 -0800)]
datapath: Allow matching on conntrack label

Allow matching and setting the ct_label field. As with ct_mark, this is
populated by executing the CT action. The label field may be modified by
specifying a label and mask nested under the CT action. It is stored as
metadata attached to the connection. Label modification occurs after
lookup, and will only persist when the conntrack entry is committed by
providing the COMMIT flag to the CT action. Labels are currently fixed
to 128 bits in size.
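
A masked update of a 128-bit label can be sketched as follows.  The
struct layout (four 32-bit words) and the function name are assumptions
made for the example, not the datapath's representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit label stored as four 32-bit words. */
struct sketch_ct_label {
    uint32_t w[4];
};

/* Replaces only the bits of 'cur' selected by 'mask' with the
 * corresponding bits of 'value'; all other bits are left untouched. */
static void
sketch_label_set_masked(struct sketch_ct_label *cur,
                        const struct sketch_ct_label *value,
                        const struct sketch_ct_label *mask)
{
    assert(cur && value && mask);
    for (size_t i = 0; i < 4; i++) {
        cur->w[i] = (cur->w[i] & ~mask->w[i]) | (value->w[i] & mask->w[i]);
    }
}
```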

Upstream: c2ac667 "openvswitch: Allow matching on conntrack label"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Allow matching on conntrack mark
Joe Stringer [Thu, 3 Dec 2015 07:53:51 +0000 (23:53 -0800)]
datapath: Allow matching on conntrack mark

Allow matching and setting the ct_mark field. As with ct_state and
ct_zone, these fields are populated when the CT action is executed. To
write to this field, a value and mask can be specified as a nested
attribute under the CT action. This data is stored with the conntrack
entry, and is executed after the lookup occurs for the CT action. The
conntrack entry itself must be committed using the COMMIT flag in the CT
action flags for this change to persist.

Upstream: 182e304 "openvswitch: Allow matching on conntrack mark"
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Add conntrack action
Joe Stringer [Thu, 3 Dec 2015 07:53:50 +0000 (23:53 -0800)]
datapath: Add conntrack action

Expose the kernel connection tracker via OVS. Userspace components can
make use of the CT action to populate the connection state (ct_state)
field for a flow. This state can be subsequently matched.

Exposed connection states are OVS_CS_F_*:
- NEW (0x01) - Beginning of a new connection.
- ESTABLISHED (0x02) - Part of an existing connection.
- RELATED (0x04) - Related to an established connection.
- INVALID (0x20) - Could not track the connection for this packet.
- REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
- TRACKED (0x80) - This packet has been sent through conntrack.

When the CT action is executed by itself, it will send the packet
through the connection tracker and populate the ct_state field with one
or more of the connection state flags above. The CT action will always
set the TRACKED bit.
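
A minimal sketch of matching on these bits, using the values listed
above.  The SKETCH_CS_* names and sketch_ct_state_match() are
hypothetical; they only illustrate a value/mask comparison on ct_state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as listed in this commit message. */
enum {
    SKETCH_CS_NEW         = 0x01,
    SKETCH_CS_ESTABLISHED = 0x02,
    SKETCH_CS_RELATED     = 0x04,
    SKETCH_CS_INVALID     = 0x20,
    SKETCH_CS_REPLY_DIR   = 0x40,
    SKETCH_CS_TRACKED     = 0x80
};

/* Returns true if the bits of 'ct_state' selected by 'mask' equal
 * 'value'.  Bits outside 'mask' are ignored. */
static bool
sketch_ct_state_match(uint8_t ct_state, uint8_t value, uint8_t mask)
{
    return (ct_state & mask) == (value & mask);
}
```

For instance, a "tracked, established, not new" match uses
value = TRACKED | ESTABLISHED and mask = TRACKED | ESTABLISHED | NEW.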

When the COMMIT flag is passed to the conntrack action, this specifies
that information about the connection should be stored. This allows
subsequent packets for the same (or related) connections to be
correlated with this connection. Sending subsequent packets for the
connection through conntrack allows the connection tracker to consider
the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.

The CT action may optionally take a zone to track the flow within. This
allows connections with the same 5-tuple to be kept logically separate
from connections in other zones. If the zone is specified, then the
"ct_zone" match field will be subsequently populated with the zone id.

IP fragments are handled by transparently assembling them as part of the
CT action. The maximum received unit (MRU) size is tracked so that
refragmentation can occur during output.

IP frag handling contributed by Andy Zhou.

Based on original design by Justin Pettit.

Upstream: 7f8a436 "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Serialize acts with original netlink len
Joe Stringer [Thu, 3 Dec 2015 07:53:49 +0000 (23:53 -0800)]
datapath: Serialize acts with original netlink len

Previously, we used the kernel-internal netlink actions length to
calculate the size of messages to serialize back to userspace.
However, the sw_flow_actions may not be formatted exactly the same as the
actions on the wire, so store the original actions length when
de-serializing and re-use the original length when serializing.

Upstream: 8e2fed1 "openvswitch: Serialize acts with original netlink len"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agodatapath: Move MASKED* macros to datapath.h
Joe Stringer [Thu, 3 Dec 2015 07:53:48 +0000 (23:53 -0800)]
datapath: Move MASKED* macros to datapath.h

This will allow the ovs-conntrack code to reuse these macros.

Upstream: be26b9a "openvswitch: Move MASKED* macros to datapath.h"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Backport IPv6 reassembly.
Joe Stringer [Thu, 3 Dec 2015 07:53:47 +0000 (23:53 -0800)]
compat: Backport IPv6 reassembly.

Backport IPv6 fragment reassembly from upstream commits in the Linux 4.3
development tree.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Backport IPv6 fragmentation.
Joe Stringer [Thu, 3 Dec 2015 07:53:46 +0000 (23:53 -0800)]
compat: Backport IPv6 fragmentation.

IPv6 fragmentation functionality is not exported by most kernels, so
backport this code from the upstream 4.3 development tree.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Backport IPv4 reassembly.
Joe Stringer [Thu, 3 Dec 2015 07:53:45 +0000 (23:53 -0800)]
compat: Backport IPv4 reassembly.

Backport IPv4 reassembly from the upstream commit caaecdd3d3f8 ("inet:
frags: remove INET_FRAG_EVICTED and use list_evictor for the test").

This is necessary because kernels prior to upstream commit d6b915e29f4a
("ip_fragment: don't forward defragmented DF packet") would not always
track the maximum received unit size during ip_defrag(). Without the
MRU, refragmentation cannot occur so reassembled packets are dropped.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Wrap IPv4 fragmentation.
Joe Stringer [Thu, 3 Dec 2015 07:53:44 +0000 (23:53 -0800)]
compat: Wrap IPv4 fragmentation.

Most kernels provide some form of ip fragmentation. However, until
recently many of them would always send ICMP responses for over-MTU
packets, even when operating in bridge mode. Backport the check to
ensure this doesn't occur.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Backport ip_skb_dst_mtu().
Joe Stringer [Thu, 3 Dec 2015 07:53:43 +0000 (23:53 -0800)]
compat: Backport ip_skb_dst_mtu().

From upstream f87c10a8aa1e ("ipv4: introduce ip_dst_mtu_maybe_forward
and protect forwarding path against pmtu spoofing")

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Backport dev_recursion_level().
Joe Stringer [Thu, 3 Dec 2015 07:53:42 +0000 (23:53 -0800)]
compat: Backport dev_recursion_level().

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Backport prandom_u32_max().
Joe Stringer [Thu, 3 Dec 2015 07:53:41 +0000 (23:53 -0800)]
compat: Backport prandom_u32_max().

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Backport 'dst' functions.
Joe Stringer [Thu, 3 Dec 2015 07:53:40 +0000 (23:53 -0800)]
compat: Backport 'dst' functions.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
8 years agocompat: Backport nf_connlabels_{get, put}().
Joe Stringer [Thu, 3 Dec 2015 07:53:39 +0000 (23:53 -0800)]
compat: Backport nf_connlabels_{get, put}().

This is a partial backport of Linux commit 86ca02e77408
"netfilter: connlabels: Export setting connlabel length".

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>