cascardo/ovs.git
9 years ago datapath: Set packet egress_tun_info.
Pravin B Shelar [Sun, 7 Sep 2014 22:18:07 +0000 (15:18 -0700)]
datapath: Set packet egress_tun_info.

Packet execute sets egress_tun_info in skb->cb rather than in
packet->cb. Here skb is the netlink message skb, so this corrupts the
netlink state stored in skb->cb (NETLINK_CB), which results in the
following deadlock in the netlink code.

=============================================
[ INFO: possible recursive locking detected ]
3.2.62 #2
---------------------------------------------
handler55/22851 is trying to acquire lock:
 (genl_mutex){+.+.+.}, at: [<ffffffff81471ad7>] genl_lock+0x17/0x20

but task is already holding lock:
 (genl_mutex){+.+.+.}, at: [<ffffffff81471ad7>] genl_lock+0x17/0x20

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(genl_mutex);
  lock(genl_mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

1 lock held by handler55/22851:
 #0:  (genl_mutex){+.+.+.}, at: [<ffffffff81471ad7>] genl_lock+0x17/0x20

stack backtrace:
Pid: 22851, comm: handler55 Tainted: G           O 3.2.62 #2
Call Trace:
 [<ffffffff81097bb2>] print_deadlock_bug+0xf2/0x100
 [<ffffffff81099b99>] validate_chain+0x579/0x860
 [<ffffffff8109a17c>] __lock_acquire+0x2fc/0x4f0
 [<ffffffff8109aab0>] lock_acquire+0xa0/0x180
 [<ffffffff81519070>] __mutex_lock_common+0x60/0x420
 [<ffffffff8151959a>] mutex_lock_nested+0x4a/0x60
 [<ffffffff81471ad7>] genl_lock+0x17/0x20
 [<ffffffff81471af6>] genl_rcv+0x16/0x40
 [<ffffffff8146ff72>] netlink_unicast+0x2f2/0x310
 [<ffffffff81470159>] netlink_ack+0x109/0x1f0
 [<ffffffff8147030b>] netlink_rcv_skb+0xcb/0xd0
 [<ffffffff81471b05>] genl_rcv+0x25/0x40
 [<ffffffff8146ff72>] netlink_unicast+0x2f2/0x310
 [<ffffffff8147134c>] netlink_sendmsg+0x28c/0x3d0
 [<ffffffff8143375f>] sock_sendmsg+0xef/0x120
 [<ffffffff81435766>] ___sys_sendmsg+0x416/0x430
 [<ffffffff81435949>] __sys_sendmsg+0x49/0x90
 [<ffffffff814359a9>] sys_sendmsg+0x19/0x20
 [<ffffffff8152432b>] system_call_fastpath+0x16/0x1b

Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years ago datapath: distinguish between the dropped and consumed skb
Li RongQing [Sun, 7 Sep 2014 21:49:02 +0000 (14:49 -0700)]
datapath: distinguish between the dropped and consumed skb

Distinguish between dropped and consumed skbs instead of assuming that
the skb is always consumed.

Cc: Thomas Graf <tgraf@noironetworks.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago datapath: fix panic with multiple vlan headers
Jiri Benc [Sun, 7 Sep 2014 21:36:01 +0000 (14:36 -0700)]
datapath: fix panic with multiple vlan headers

When there are multiple vlan headers present in a received frame, the
first one is put into vlan_tci and protocol is set to ETH_P_8021Q.
Anything in the skb beyond the VLAN TPID may be still non-linear,
including the inner TCI and ethertype. While ovs_flow_extract takes
care of IP and IPv6 headers, it does nothing with ETH_P_8021Q. Later,
if OVS_ACTION_ATTR_POP_VLAN is executed, __pop_vlan_tci pulls the
next vlan header into vlan_tci.

This leads to two things:

1. Part of the resulting ethernet header is in the non-linear part of
   the skb. When eth_type_trans is called later as the result of
   OVS_ACTION_ATTR_OUTPUT, kernel BUGs in __skb_pull. Also,
   __pop_vlan_tci is in fact accessing random data when it reads
   past the TPID.

2. network_header points into the ethernet header instead of behind it.
   mac_len is set to a wrong value (10), too.

Reported-by: Yulong Pei <ypei@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
I have dropped the second change, since it assumes that the inner MAC
header is ETH_HLEN long, which is not always true.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago datapath: Implement recirc action without recursion
Andy Zhou [Mon, 11 Aug 2014 07:14:05 +0000 (00:14 -0700)]
datapath: Implement recirc action without recursion

Since the kernel stack is limited in size, it is not wise to use
recursive functions with large stack frames.

This patch provides an alternative implementation of recirc action
without using recursion.

A per-CPU, fixed-size 'deferred action FIFO' is used to store either
recirc or sample actions encountered during execution of an action
list. Executing recirc or sample actions later as 'deferred actions',
rather than in place, avoids recursion.

Deferred actions are only executed after all other actions have been
executed, including the ones triggered by loopback from the kernel
network stack.

The size of the private FIFO, currently set to 20, limits the number
of total 'deferred actions' any one packet can accumulate.
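
The deferral scheme can be sketched in plain C. This is a minimal
userspace illustration under stated assumptions, not the datapath code:
all struct and function names are hypothetical, and a recirc is modeled
as a counter of remaining recirculations. Only the FIFO size of 20 is
taken from the description.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEFERRED_FIFO_SIZE 20   /* Limit quoted in the description. */

/* Hypothetical payload: how many recirculations the packet still needs. */
struct deferred_action {
    int remaining_recircs;
};

/* Fixed-size FIFO.  Slots are not recycled while a packet is being
 * processed, so 'head' bounds the total number of deferred actions. */
struct action_fifo {
    size_t head;                /* Next free slot. */
    size_t tail;                /* Next slot to execute. */
    struct deferred_action slots[DEFERRED_FIFO_SIZE];
};

static bool fifo_put(struct action_fifo *f, struct deferred_action da)
{
    if (f->head >= DEFERRED_FIFO_SIZE) {
        return false;           /* FIFO exhausted: refuse to defer more. */
    }
    f->slots[f->head++] = da;
    return true;
}

static bool fifo_get(struct action_fifo *f, struct deferred_action *da)
{
    if (f->tail == f->head) {
        return false;           /* Nothing left to run. */
    }
    *da = f->slots[f->tail++];
    return true;
}

/* Runs one action list.  A recirc is queued instead of being run in
 * place, so this function never calls itself. */
static bool execute_list(struct action_fifo *f, int remaining_recircs)
{
    if (remaining_recircs > 0) {
        struct deferred_action da = { remaining_recircs - 1 };
        return fifo_put(f, da);
    }
    return true;
}

/* Returns the number of action lists executed, or -1 if the packet
 * needed more deferred actions than the FIFO can hold. */
int process_packet(int recircs)
{
    struct action_fifo f;
    struct deferred_action da;
    int executed = 1;
    bool ok;

    f.head = 0;
    f.tail = 0;
    ok = execute_list(&f, recircs);
    while (fifo_get(&f, &da)) {
        ok = execute_list(&f, da.remaining_recircs) && ok;
        executed++;
    }
    return ok ? executed : -1;
}
```

In this sketch the stack depth stays constant no matter how many times
a packet recirculates; the only limit is the number of FIFO slots.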

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago datapath: Remove recirc stack depth limit check
Andy Zhou [Fri, 15 Aug 2014 08:53:30 +0000 (01:53 -0700)]
datapath: Remove recirc stack depth limit check

Future patches will change the recirc action implementation to not
use recursion, so the stack depth detection is no longer necessary.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago ovs-numa: Add module description.
Alex Wang [Fri, 5 Sep 2014 06:17:34 +0000 (06:17 +0000)]
ovs-numa: Add module description.

Add a short description of the module and its assumption.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago ovs-numa: Add function for getting numa node id from core id.
Alex Wang [Fri, 5 Sep 2014 06:17:33 +0000 (06:17 +0000)]
ovs-numa: Add function for getting numa node id from core id.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago ovs-numa: Relax the ovs_numa_*() input argument check.
Alex Wang [Fri, 5 Sep 2014 06:17:32 +0000 (06:17 +0000)]
ovs-numa: Relax the ovs_numa_*() input argument check.

Many of the ovs_numa_*() functions abort the program when the
input cpu socket or core id is invalid.  This commit relaxes
the input check and makes these functions return OVS_*_UNSPEC
when the check fails.
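
A minimal sketch of the relaxed check, assuming a small static
core-to-node table. The names NUMA_UNSPEC, N_CORES and
numa_id_from_core() are hypothetical stand-ins, not the ovs-numa API.

```c
#include <limits.h>

#define NUMA_UNSPEC INT_MAX     /* Hypothetical "unspecified" sentinel. */
#define N_CORES 4               /* Hypothetical machine with four cores. */

/* Hypothetical topology: cores 0-1 on node 0, cores 2-3 on node 1. */
static const int core_to_numa[N_CORES] = { 0, 0, 1, 1 };

/* Returns the numa node of 'core_id', or NUMA_UNSPEC for an invalid id.
 * An invalid id is reported to the caller instead of aborting. */
int numa_id_from_core(int core_id)
{
    if (core_id < 0 || core_id >= N_CORES) {
        return NUMA_UNSPEC;
    }
    return core_to_numa[core_id];
}
```

Callers then compare the result against the sentinel and decide for
themselves how to handle an unknown core.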

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago ovs-numa: Replace name 'cpu_socket' with 'numa_node'.
Alex Wang [Fri, 5 Sep 2014 06:17:31 +0000 (06:17 +0000)]
ovs-numa: Replace name 'cpu_socket' with 'numa_node'.

'numa' and 'socket' are currently used interchangeably in ovs-numa,
but they are not always equivalent, as some platforms can have multiple
sockets on a numa node.  To avoid confusion, this commit renames all
occurrences of 'cpu_socket' to 'numa_node'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago cccl: Ability to enable compiler optimization.
Gurucharan Shetty [Thu, 28 Aug 2014 16:25:56 +0000 (09:25 -0700)]
cccl: Ability to enable compiler optimization.

MSVC has a '-O2' compiler optimization flag which makes code run
faster and is the recommended option for released code. For example,
running "./tests/ovstest.exe test-cmap benchmark 1000000 3 1"
shows a 3x improvement for some cmap micro-benchmarks.

In the Visual Studio world, there is a concept of "release" build
(fast code, harder to debug) and a "debug" build (easier to debug).
The IDE provides this option and the IDE users expect something similar
for command line build.

So this commit introduces a "--with-debug" configure option for Windows
and does not use '-O2' as a compiler option when it is specified. This
can be extended further if there are more compiler options that
distinguish a "release" build from a "debug" build.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
9 years ago cccl: Enable ability to parallel build.
Gurucharan Shetty [Thu, 28 Aug 2014 16:20:21 +0000 (09:20 -0700)]
cccl: Enable ability to parallel build.

The /FS option serializes access to PDB file creation, letting
parallel builds succeed with mingw32-make (with some tricks). The
'make' that comes with MSYS has a bug that causes hangs with
parallel builds, which has supposedly been fixed in the upcoming
1.0.19 release.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
9 years ago ovs-atomics: Add atomic support Windows.
Gurucharan Shetty [Thu, 21 Aug 2014 20:57:37 +0000 (13:57 -0700)]
ovs-atomics: Add atomic support Windows.

Before this change (i.e., with pthread locks for atomics on Windows),
the benchmark for cmap and hmap was as follows:

$ ./tests/ovstest.exe test-cmap benchmark 10000000 3 1
Benchmarking with n=10000000, 3 threads, 1.00% mutations:
cmap insert:  61070 ms
cmap iterate:  2750 ms
cmap search:  14238 ms
cmap destroy:  8354 ms

hmap insert:   1701 ms
hmap iterate:   985 ms
hmap search:   3755 ms
hmap destroy:  1052 ms

After this change, the benchmark is as follows:
$ ./tests/ovstest.exe test-cmap benchmark 10000000 3 1
Benchmarking with n=10000000, 3 threads, 1.00% mutations:
cmap insert:   3666 ms
cmap iterate:   365 ms
cmap search:   2016 ms
cmap destroy:  1331 ms

hmap insert:   1495 ms
hmap iterate:  1026 ms
hmap search:   4167 ms
hmap destroy:  1046 ms

So there is clearly a big improvement for cmap.

But the corresponding test on Linux (with gcc 4.6) yields the following:

./tests/ovstest test-cmap benchmark 10000000 3 1
Benchmarking with n=10000000, 3 threads, 1.00% mutations:
cmap insert:   3917 ms
cmap iterate:   355 ms
cmap search:    871 ms
cmap destroy:  1158 ms

hmap insert:   1988 ms
hmap iterate:  1005 ms
hmap search:   5428 ms
hmap destroy:   980 ms

So for this particular test, except for "cmap search", Windows and
Linux have similar performance. Windows is around 2.3x slower in
"cmap search" (2016 ms vs. 871 ms) compared to Linux. This has to be
investigated.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
[With a lot of inputs and help from Jarno]
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years ago AUTHORS: Add Ariel Tubaltsev to AUTHORS.
Gurucharan Shetty [Thu, 4 Sep 2014 22:55:56 +0000 (15:55 -0700)]
AUTHORS: Add Ariel Tubaltsev to AUTHORS.

I missed it while adding commit 6ee1400bbff (vtep: additions to BFD
configuration and status reporting).

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years ago datapath-windows: add support for GET_DP command to dump datapaths
Nithin Raju [Fri, 29 Aug 2014 22:48:10 +0000 (15:48 -0700)]
datapath-windows: add support for GET_DP command to dump datapaths

In this patch, we add support for the GET_DP netlink command to dump
the datapaths. The userspace workflow to get this to work is the same
as on Linux. dpif-linux.c initiates a dump start by writing a netlink
message, and after that continues to read data from the kernel while
the kernel has data. The state is maintained in the kernel, and not in
userspace. This approach was taken since there was no great benefit
in maintaining state in userspace, and also to avoid userspace changes
specific to Windows.

This hopefully serves as a template to base the other dump commands on.

validation:
- With a hacked up dpif-linux.c to work on Windows,
  dpif_linux_enumerate() successfully enumerated the datapaths in the
  kernel.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath-windows: add a context structure for user parameters
Nithin Raju [Fri, 29 Aug 2014 22:47:49 +0000 (15:47 -0700)]
datapath-windows: add a context structure for user parameters

In this patch we add a context structure for collecting all the parameters
passed from userspace in one place. The idea is to reduce the number of
parameters being passed to the netlink command handler functions.

It can be argued that not all functions require all the arguments, but this
approach keeps the code clean, IMO.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath-windows: make NL version a UINT8 and add a validateDp arg
Nithin Raju [Fri, 29 Aug 2014 22:47:37 +0000 (15:47 -0700)]
datapath-windows: make NL version a UINT8 and add a validateDp arg

I didn't realize earlier that version in a netlink message was a
UINT8. So, fixing that here.

Also, some of the commands don't pass a valid DP value. Hence adding
a field to identify such commands.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath-windows: Data structures and functions for dump state
Nithin Raju [Fri, 29 Aug 2014 22:47:21 +0000 (15:47 -0700)]
datapath-windows: Data structures and functions for dump state

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago ofp-errors: Migrate EXT-444 errors to ONF experimenter ID.
Jean Tourrilhes [Thu, 21 Aug 2014 17:40:51 +0000 (10:40 -0700)]
ofp-errors: Migrate EXT-444 errors to ONF experimenter ID.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
[blp@nicira.com removed the definitions of these errors in OF1.1 and OF1.2]
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years ago ofp-errors: Fix bugs in treatment of OpenFlow experimenter errors.
Ben Pfaff [Thu, 21 Aug 2014 17:38:15 +0000 (10:38 -0700)]
ofp-errors: Fix bugs in treatment of OpenFlow experimenter errors.

OpenFlow 1.2 and later have "experimenter errors".  The OVS implementation
was buggy in a few ways.  First, a bug in extract-ofp-errors prevented
OF1.2+ experimenter errors from being properly decoded.  Second,
OF1.2+ experimenter errors have only a type, not a code, whereas all other
types of errors (standard errors, OF1.0/1.1 Nicira extension errors) have
both, but extract-ofp-errors didn't properly enforce that.

This commit fixes both problems and improves existing tests to verify that
encoding and decoding of experimenter errors now works properly.

This commit also fixes the definition of OFPBIC_DUP_INST.  It claimed to
have an OF1.1 experimenter error value although OF1.1 didn't have
experimenter errors.  This commit changes it to use a Nicira extension
error in OF1.1 instead.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years ago nx-match: Serialize standard xregs instead of Nicira registers, in OF1.5.
Ben Pfaff [Thu, 21 Aug 2014 03:59:43 +0000 (20:59 -0700)]
nx-match: Serialize standard xregs instead of Nicira registers, in OF1.5.

Commit 79fe0f4611b60 (meta-flow: Add 64-bit registers.) added support for
the OpenFlow 1.5 (draft) standardized registers, but neglected to cause
them to be serialized when Open vSwitch composes flow matches.  This meant
that they were always sent to a controller as pairs of Nicira extension
registers.  This commit fixes the problem.

Found by inspection.

ONF-JIRA: EXT-244
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years ago datapath-windows: NetlinkBuf.c: Minor fix for lines exceeding 79 chars
Ankur Sharma [Wed, 3 Sep 2014 23:33:40 +0000 (16:33 -0700)]
datapath-windows: NetlinkBuf.c: Minor fix for lines exceeding 79 chars

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Tested-by: Ankur Sharma <ankursharma@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/37
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath-windows: Netlink.c: Add netlink put APIs.
Ankur Sharma [Wed, 3 Sep 2014 23:33:32 +0000 (16:33 -0700)]
datapath-windows: Netlink.c: Add netlink put APIs.

In this change we have added the APIs for putting
netlink headers and attributes in a buffer.

The buffer is managed through NetlinkBuf.[c|h].

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Tested-by: Ankur Sharma <ankursharma@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/37
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath-windows: OvsTypes.h: Add support for BE16.
Ankur Sharma [Wed, 3 Sep 2014 23:33:24 +0000 (16:33 -0700)]
datapath-windows: OvsTypes.h: Add support for BE16.

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Tested-by: Ankur Sharma <ankursharma@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/37
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath-windows: NetlinkProto: Fix typo and add padding macro.
Ankur Sharma [Wed, 3 Sep 2014 23:33:15 +0000 (16:33 -0700)]
datapath-windows: NetlinkProto: Fix typo and add padding macro.

Added a new macro for calculating the number of bytes required
for padding. Fixed a minor typo.
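
A minimal sketch of such a padding calculation, assuming the usual
4-byte netlink alignment. The names NL_ALIGNTO, nl_align() and
nl_pad_bytes() are hypothetical and are not the macro added by this
commit.

```c
#include <stddef.h>

/* Netlink messages and attributes are aligned to 4-byte boundaries. */
#define NL_ALIGNTO ((size_t) 4)

/* Rounds 'len' up to the next multiple of NL_ALIGNTO. */
size_t nl_align(size_t len)
{
    return (len + NL_ALIGNTO - 1) & ~(NL_ALIGNTO - 1);
}

/* Returns the number of padding bytes needed after 'len' bytes of
 * payload so that the next item starts on an aligned boundary. */
size_t nl_pad_bytes(size_t len)
{
    return nl_align(len) - len;
}
```

For example, a 5-byte attribute payload needs 3 bytes of padding and a
4-byte payload needs none.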

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Tested-by: Ankur Sharma <ankursharma@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/37
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath-windows: Add Netlink buffer management APIs.
Ankur Sharma [Wed, 3 Sep 2014 23:33:05 +0000 (16:33 -0700)]
datapath-windows: Add Netlink buffer management APIs.

In this change we have introduced buffer management APIs which will be
used while creating netlink messages. The basic functionality provided
by the APIs is along similar lines to ofpbuf in userspace, with the
exception that they will not do run-time buffer reallocation.
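
A minimal sketch of a buffer that appends data but never reallocates.
The struct and function names are hypothetical and are not the
NetlinkBuf API; the caller supplies the storage and its capacity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer over caller-provided storage. */
struct nl_buf {
    unsigned char *data;        /* Start of the storage. */
    size_t size;                /* Bytes currently in use. */
    size_t capacity;            /* Total bytes available; never grows. */
};

void nl_buf_init(struct nl_buf *b, void *storage, size_t capacity)
{
    b->data = storage;
    b->size = 0;
    b->capacity = capacity;
}

/* Appends 'len' bytes from 'src'.  Returns false and leaves the buffer
 * unchanged if the data does not fit, instead of reallocating. */
bool nl_buf_put(struct nl_buf *b, const void *src, size_t len)
{
    if (len > b->capacity - b->size) {
        return false;
    }
    memcpy(b->data + b->size, src, len);
    b->size += len;
    return true;
}
```

Because a failed put leaves the buffer untouched, the caller can report
the error or retry with a larger buffer of its own.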

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Tested-by: Ankur Sharma <ankursharma@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/37
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath-windows: Move netlink files to a new directory.
Ankur Sharma [Wed, 3 Sep 2014 23:32:55 +0000 (16:32 -0700)]
datapath-windows: Move netlink files to a new directory.

In this change we have created a new directory named Netlink
inside datapath-windows/ovsext/. This directory will be used to
keep all the netlink related files.

The reason we have created a new directory is that for 'put' related
APIs we will be adding netlink buffer management files as well. These
files will take the count of netlink related files to 5. Hence
we decided to group the netlink files in a single directory.

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Tested-by: Ankur Sharma <ankursharma@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/37
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago dpif-netdev: Introduce port_try_ref() to prevent a race.
Alex Wang [Thu, 21 Aug 2014 22:54:07 +0000 (15:54 -0700)]
dpif-netdev: Introduce port_try_ref() to prevent a race.

When the pmd thread iterates through all ports for queue loading,
the main thread may unreference and 'rcu-free' a port before the
pmd thread takes a new reference to it.  This could cause the pmd
thread's reference to fail and freed memory to be accessed later.

This commit fixes this race by introducing port_try_ref(),
which uses ovs_refcount_try_ref_rcu().  The pmd thread
will only load the port's queue if port_try_ref() returns
true.
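
The idea of a "try" reference can be sketched with C11 atomics. This is
an illustration only: the struct layout and the body below are
assumptions, not the dpif-netdev or ovs_refcount implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical port with an embedded reference count. */
struct port {
    atomic_uint ref_cnt;
};

/* Takes a reference only if the count is still nonzero.  A count of
 * zero means the last owner has already dropped the port, so the
 * caller must not touch it. */
bool port_try_ref(struct port *port)
{
    unsigned int count = atomic_load_explicit(&port->ref_cnt,
                                              memory_order_relaxed);

    do {
        if (count == 0) {
            return false;
        }
        /* On failure 'count' is reloaded with the current value. */
    } while (!atomic_compare_exchange_weak_explicit(&port->ref_cnt, &count,
                                                    count + 1,
                                                    memory_order_acq_rel,
                                                    memory_order_relaxed));
    return true;
}
```

A plain increment cannot be used here, because it would resurrect a
count that has already reached zero.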

Found by inspection.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago vtep: additions to BFD configuration and status reporting
Ariel Tubaltsev [Tue, 2 Sep 2014 18:27:55 +0000 (11:27 -0700)]
vtep: additions to BFD configuration and status reporting

This commit adds default values for some BFD configuration keys
(bfd_config_local:bfd_dst_mac and bfd_params:enable). It also adds new
BFD status keys (bfd_enabled and bfd_info).

Signed-off-by: Ariel Tubaltsev <atubaltsev@vmware.com>
Signed-off-by: Bruce Davie <bdavie@vmware.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years ago netdev-dpdk: Show interface status for dpdk0.
Alex Wang [Thu, 21 Aug 2014 22:53:15 +0000 (15:53 -0700)]
netdev-dpdk: Show interface status for dpdk0.

This commit fixes a bug which prevents the display of interface
status for dpdk0.

Found by inspection.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago packet: Fix sparse warnings ICMPv6.
Jesse Gross [Wed, 3 Sep 2014 00:57:21 +0000 (17:57 -0700)]
packet: Fix sparse warnings ICMPv6.

The system defined ICMPv6 header doesn't have sparse annotation,
so this adds a definition so that endianness can be checked.

Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
9 years ago netdev-dpdk: Make memory pool name contain the socket id.
Alex Wang [Tue, 17 Jun 2014 00:19:11 +0000 (17:19 -0700)]
netdev-dpdk: Make memory pool name contain the socket id.

This commit makes the memory pool name contain the socket id.
Since the DPDK library does not allow creation of memory pools with
the same name, this commit serves as a simple way of making each
name unique.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago datapath: fix a memory leak
Li RongQing [Tue, 2 Sep 2014 20:31:12 +0000 (13:31 -0700)]
datapath: fix a memory leak

The user_skb may be leaked if an operation on it fails and the code
jumps to the label "out:" without calling genlmsg_unicast.
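
The general shape of such a fix can be sketched in userspace C. This is
an assumption-laden illustration, not the datapath code: malloc/free
stand in for skb allocation, deliver() stands in for a send function
that consumes the buffer, and every name is hypothetical.

```c
#include <stdlib.h>

static int live_buffers;        /* Outstanding allocations, for checking. */

static void *buf_alloc(size_t n)
{
    void *p = malloc(n);

    if (p) {
        live_buffers++;
    }
    return p;
}

static void buf_free(void *p)
{
    if (p) {
        free(p);
        live_buffers--;
    }
}

/* Stand-in for a send function that takes ownership of 'buf'. */
static int deliver(void *buf)
{
    buf_free(buf);
    return 0;
}

int outstanding_buffers(void)
{
    return live_buffers;
}

/* Builds and sends a message.  'fail_midway' simulates an error that
 * occurs after allocation but before the buffer is handed off. */
int send_upcall(int fail_midway)
{
    void *user_buf = buf_alloc(64);
    int err;

    if (!user_buf) {
        return -1;
    }
    if (fail_midway) {
        err = -2;
        goto out;               /* Buffer is still owned here. */
    }
    err = deliver(user_buf);
    user_buf = NULL;            /* Ownership passed to deliver(). */
out:
    buf_free(user_buf);         /* No-op when ownership was passed. */
    return err;
}
```

Clearing the pointer after the hand-off lets a single cleanup label
serve both the success path and every error path.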

Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago getopt_long: Fix broken sequence of casts in __UNCONST macro.
Eitan Eliahu [Wed, 3 Sep 2014 02:11:13 +0000 (19:11 -0700)]
getopt_long: Fix broken sequence of casts in __UNCONST macro.

Unlike the compilation mode used for OVS on x64 Linux, on Windows a
long is 4 bytes for both 32-bit and 64-bit builds.
Replaced the __UNCONST macro with CONST_CAST to avoid the intermediate
cast to an integer.

Testing: 32 and 64 Windows builds.
Signed-off-by: Eitan Eliahu <eliahue@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago ovs-vsctl: Correctly exit on errors for non-map types in "remove" command.
Ben Pfaff [Tue, 2 Sep 2014 15:35:02 +0000 (08:35 -0700)]
ovs-vsctl: Correctly exit on errors for non-map types in "remove" command.

Reported-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
9 years ago ofp-actions: Support "resubmit" actions in action sets.
Srini Seetharaman [Sun, 31 Aug 2014 07:24:46 +0000 (00:24 -0700)]
ofp-actions: Support "resubmit" actions in action sets.

Fix an issue where a "resubmit" action in a group action set was not
considered sufficient to retain the full action set. This patch allows
a group action set (considered terminal with OF1.4 and earlier spec)
to have the "output" action come from a different table.

Signed-off-by: Srini Seetharaman <srini.seetharaman@gmail.com>
[blp@nicira.com added documentation]
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago test-bitmap: Fix multiple minor memory leaks
Thomas Graf [Mon, 1 Sep 2014 16:10:26 +0000 (18:10 +0200)]
test-bitmap: Fix multiple minor memory leaks

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago test-stp: Fix leak of open file descriptor for input_file
Thomas Graf [Mon, 1 Sep 2014 16:09:57 +0000 (18:09 +0200)]
test-stp: Fix leak of open file descriptor for input_file

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago travis: Announce travis CI and new build list in NEWS and CONTRIBUTING
Thomas Graf [Mon, 1 Sep 2014 12:52:13 +0000 (14:52 +0200)]
travis: Announce travis CI and new build list in NEWS and CONTRIBUTING

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago dpif-netdev: Avoid variable length array on MSVC.
Alin Serdean [Mon, 1 Sep 2014 20:11:54 +0000 (20:11 +0000)]
dpif-netdev: Avoid variable length array on MSVC.

MSVC does not like variable-length arrays either.

This patch fixes the following errors:

lib/dpif-netdev.c(2272) : error C2057: expected constant expression
lib/dpif-netdev.c(2272) : error C2466: cannot allocate an array of constant size 0
lib/dpif-netdev.c(2272) : error C2133: 'batches' : unknown size
lib/dpif-netdev.c(2273) : error C2057: expected constant expression
lib/dpif-netdev.c(2273) : error C2466: cannot allocate an array of constant size 0
lib/dpif-netdev.c(2273) : error C2133: 'mfs' : unknown size
lib/dpif-netdev.c(2274) : error C2057: expected constant expression
lib/dpif-netdev.c(2274) : error C2466: cannot allocate an array of constant size 0
lib/dpif-netdev.c(2274) : error C2133: 'rules' : unknown size
lib/dpif-netdev.c(2363) : warning C4034: sizeof returns 0
lib/dpif-netdev.c(2381) : error C2057: expected constant expression
lib/dpif-netdev.c(2381) : error C2466: cannot allocate an array of constant size 0
lib/dpif-netdev.c(2381) : error C2133: 'keys' : unknown size
make[2]: *** [lib/dpif-netdev.lo] Error 1
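
One common way to avoid a variable-length array is to size the array
with a compile-time constant and bound the element count at run time. A
minimal sketch, assuming a hypothetical MAX_BATCH limit and a
hypothetical sum_doubled() helper; this is not the dpif-netdev change
itself.

```c
#include <stddef.h>

/* An enum constant is a constant expression, so arrays sized with it
 * are ordinary fixed-size arrays even for compilers without VLAs. */
enum { MAX_BATCH = 32 };        /* Hypothetical upper bound. */

/* Doubles up to MAX_BATCH values into a scratch array and sums them.
 * Declaring 'int doubled[cnt]' instead would be a VLA. */
int sum_doubled(const int *values, size_t cnt)
{
    int doubled[MAX_BATCH];
    int sum = 0;
    size_t i;

    if (cnt > MAX_BATCH) {
        cnt = MAX_BATCH;        /* Never index past the fixed array. */
    }
    for (i = 0; i < cnt; i++) {
        doubled[i] = values[i] * 2;
    }
    for (i = 0; i < cnt; i++) {
        sum += doubled[i];
    }
    return sum;
}
```

The trade-off is that the array always occupies the maximum size on the
stack, even for small batches.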

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago travis: Run 'make distcheck' instead of 'make check'
Thomas Graf [Fri, 29 Aug 2014 23:43:03 +0000 (01:43 +0200)]
travis: Run 'make distcheck' instead of 'make check'

make distcheck runs a superset of make check and will additionally
catch failures in adding new files to the Makefile. It will also test
installation and uninstallation of the package.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago dpif-netdev: Exact match cache
Daniele Di Proietto [Fri, 29 Aug 2014 23:06:43 +0000 (16:06 -0700)]
dpif-netdev: Exact match cache

Since lookups in the classifier can be pretty expensive,
we introduce this (thread-local) cache, which simply
compares the miniflows of the packets.
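
A minimal sketch of an exact match cache, assuming a direct-mapped
table indexed by the packet hash. The key layout, table size and all
names are hypothetical; 'struct flow_key' merely stands in for a
miniflow, and each thread would own its own cache instance.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EMC_ENTRIES 256         /* Hypothetical size; a power of two. */

/* Stand-in for a miniflow: a fixed array with no padding, so memcmp()
 * is a valid equality test. */
struct flow_key {
    uint32_t words[4];
};

struct emc_entry {
    bool valid;
    struct flow_key key;
    int flow_id;                /* Hypothetical handle to the flow. */
};

struct emc {
    struct emc_entry entries[EMC_ENTRIES];
};

void emc_init(struct emc *cache)
{
    memset(cache, 0, sizeof *cache);
}

/* Stores 'key' in the slot chosen by 'hash', evicting any occupant. */
void emc_insert(struct emc *cache, uint32_t hash,
                const struct flow_key *key, int flow_id)
{
    struct emc_entry *e = &cache->entries[hash & (EMC_ENTRIES - 1)];

    e->valid = true;
    e->key = *key;
    e->flow_id = flow_id;
}

/* Returns the cached flow id on an exact key match, or -1 on a miss,
 * in which case the caller falls back to the full classifier. */
int emc_lookup(const struct emc *cache, uint32_t hash,
               const struct flow_key *key)
{
    const struct emc_entry *e = &cache->entries[hash & (EMC_ENTRIES - 1)];

    if (e->valid && !memcmp(&e->key, key, sizeof *key)) {
        return e->flow_id;
    }
    return -1;
}
```

A hit costs one index computation and one memory comparison, which is
why such a cache can pay off in front of a wildcard classifier.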

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago packet-dpif: Add dpif_packet_{get, set}_hash()
Daniele Di Proietto [Fri, 29 Aug 2014 23:06:42 +0000 (16:06 -0700)]
packet-dpif: Add dpif_packet_{get, set}_hash()

These functions are used to store the packet hash. 'netdev-dpdk'
automatically sets this value to the RSS hash returned by the
NIC. Other 'netdev's set it to 0 (which is an invalid hash
value), so that callers can compute the hash on their own.

If DPDK support is enabled, struct dpif_packet's member
'dp_hash' is removed and 'pkt.hash.rss' from the DPDK mbuf is used.

This commit also configures DPDK devices to compute the RSS hash
for UDP and IPv6 packets.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago lib/ovs-thread: Avoid atomic read in ovsthread_once_start().
Jarno Rajahalme [Fri, 29 Aug 2014 23:15:44 +0000 (16:15 -0700)]
lib/ovs-thread: Avoid atomic read in ovsthread_once_start().

We can use a normal bool and rely on the mutex_lock/unlock and an
atomic_thread_fence for synchronization.

Also flip the return value of ovsthread_once_start__() to match the
one of ovsthread_once_start().

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years ago lib/ovs-thread: Use atomic_count.
Jarno Rajahalme [Fri, 29 Aug 2014 23:15:44 +0000 (16:15 -0700)]
lib/ovs-thread: Use atomic_count.

barrier->count is used as a simple counter and is not expected to
synchronize the state of any other variable, so we can use atomic_count,
which uses relaxed atomics.
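
A counter of this kind can be sketched with C11 atomics. The struct and
function names below are hypothetical and are not the OVS atomic_count
API; the point is only that a pure counter needs atomicity, not
ordering.

```c
#include <stdatomic.h>

/* Hypothetical counter that publishes no other data. */
struct counter {
    atomic_uint value;
};

void counter_init(struct counter *c, unsigned int start)
{
    atomic_init(&c->value, start);
}

/* Atomically increments the counter and returns the previous value.
 * Relaxed ordering is enough because no other memory is synchronized
 * through this variable. */
unsigned int counter_inc(struct counter *c)
{
    return atomic_fetch_add_explicit(&c->value, 1, memory_order_relaxed);
}

unsigned int counter_get(struct counter *c)
{
    return atomic_load_explicit(&c->value, memory_order_relaxed);
}
```

If a reader had to observe other writes made before the increment, a
release/acquire pairing would be needed instead.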

Ditto for the 'next_id' within ovsthread_wrapper().

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years ago lib/seq: Document acquire-release semantics.
Jarno Rajahalme [Fri, 29 Aug 2014 23:15:44 +0000 (16:15 -0700)]
lib/seq: Document acquire-release semantics.

Seq objects would be really hard to use if they did not provide
acquire-release semantics.  Currently they do that via
ovs_mutex_lock()/ovs_mutex_unlock(), respectively.  Document the
behavior so that it is safer to rely on that elsewhere.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years ago lib/flow: Use BUILD_MESSAGE() to warn if BUILD_SEQ is not updated
Daniele Di Proietto [Fri, 29 Aug 2014 23:08:11 +0000 (16:08 -0700)]
lib/flow: Use BUILD_MESSAGE() to warn if BUILD_SEQ is not updated

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years ago Add BUILD_MESSAGE() macro
Daniele Di Proietto [Fri, 29 Aug 2014 23:08:11 +0000 (16:08 -0700)]
Add BUILD_MESSAGE() macro

This commit introduces the BUILD_MESSAGE() macro. It uses _Pragma("message"),
with compilers that support that, to output a warning-like compile-time message
without blocking the compilation.

Used by next commit.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years ago Documentation: DPDK IVSHMEM VM Communications
Pravin B Shelar [Fri, 29 Aug 2014 22:18:54 +0000 (15:18 -0700)]
Documentation: DPDK IVSHMEM VM Communications

Adds documentation on how to run IVSHMEM communication through a VM.

Signed-off-by: Mike A. Polehn <mike.a.polehn@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago netdev-dpdk: Use different constant for ring size
Daniele Di Proietto [Wed, 30 Jul 2014 15:51:34 +0000 (08:51 -0700)]
netdev-dpdk: Use different constant for ring size

DPDK rings must have a power-of-two size.
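
A minimal sketch of how a power-of-two size can be checked or derived.
The helper names are hypothetical and are not part of DPDK or
netdev-dpdk.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if 'x' has exactly one bit set. */
bool is_pow2(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Rounds 'x' up to the next power of two.  Returns 1 for 0 and 1, and
 * wraps to 0 if 'x' is above 2^31. */
uint32_t round_up_pow2(uint32_t x)
{
    if (x <= 1) {
        return 1;
    }
    x--;
    x |= x >> 1;                /* Smear the highest set bit downwards. */
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x + 1;
}
```

For example, a requested size of 2000 would be rounded up to 2048.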

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago datapath: simplify sample action implementation
Andy Zhou [Fri, 29 Aug 2014 20:20:23 +0000 (13:20 -0700)]
datapath: simplify sample action implementation

The current sample() function implementation is more complicated
than necessary in handling single user space action optimization
and skb reference counting. There are no functional changes.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years ago travis: Add build@openvswitch.org email list for build notifications.
Thomas Graf [Fri, 29 Aug 2014 17:56:26 +0000 (19:56 +0200)]
travis: Add build@openvswitch.org email list for build notifications.

Enable build notifications to build@openvswitch.org

Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years ago datapath: Fix checksum calculation when modifying ICMPv6 packets.
Jesse Gross [Fri, 15 Aug 2014 18:01:54 +0000 (11:01 -0700)]
datapath: Fix checksum calculation when modifying ICMPv6 packets.

The checksum of ICMPv6 packets uses the IP pseudoheader as part of
the calculation, unlike ICMP in IPv4. This was not implemented,
which means that modifying the IP addresses of an ICMPv6 packet
would cause the checksum to no longer be correct as the pseudoheader
did not match.
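
The dependency on the addresses can be sketched with a from-scratch
checksum in userspace C. This is an illustration only, with hypothetical
function names; the datapath updates the checksum differently, and the
sketch assumes messages short enough that the 32-bit sum cannot
overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds 'len' bytes to a running sum of big-endian 16-bit words.  An
 * odd trailing byte is treated as if followed by a zero byte. */
static uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += (uint32_t) data[i] << 8 | data[i + 1];
    }
    if (len & 1) {
        sum += (uint32_t) data[len - 1] << 8;
    }
    return sum;
}

/* Folds the carries back in and returns the ones' complement. */
static uint16_t csum_finish(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}

/* Computes the ICMPv6 checksum over the IPv6 pseudoheader (source,
 * destination, upper-layer length, next header 58) and the message.
 * The checksum field inside 'icmp6' must be zero when generating. */
uint16_t icmp6_checksum(const uint8_t src[16], const uint8_t dst[16],
                        const uint8_t *icmp6, uint32_t len)
{
    uint8_t tail[8] = {
        (uint8_t) (len >> 24), (uint8_t) (len >> 16),
        (uint8_t) (len >> 8), (uint8_t) len,
        0, 0, 0, 58             /* Three zero bytes, then next header. */
    };
    uint32_t sum = 0;

    sum = csum_add(sum, src, 16);
    sum = csum_add(sum, dst, 16);
    sum = csum_add(sum, tail, sizeof tail);
    sum = csum_add(sum, icmp6, len);
    return csum_finish(sum);
}
```

Recomputing over a message that already carries a correct checksum
yields 0; after a source or destination address rewrite it no longer
does, unless the checksum field is updated as well.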

Reported-by: Neal Shrader <icosahedral@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoINSTALL: Correct typo.
Ben Pfaff [Fri, 29 Aug 2014 17:39:25 +0000 (10:39 -0700)]
INSTALL: Correct typo.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
9 years agothread: Use explicit wide type when shifting > 32 bits
Thomas Graf [Fri, 29 Aug 2014 10:21:49 +0000 (12:21 +0200)]
thread: Use explicit wide type when shifting > 32 bits

Without the explicit wide type, the shift operation may be performed
on an int, which will result in implementation-defined behaviour on a
system with more than 32 CPUs.
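
A minimal sketch of the idea, not the code from this patch: the helper
name cpu_bit() is hypothetical, and the point is only that the constant
is widened before the shift rather than after it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a 64-bit mask with only bit 'cpu' set.
 *
 * UINT64_C(1) makes the left operand 64 bits wide, so the shift is done
 * in 64-bit arithmetic.  Writing '1 << cpu' would shift a plain int
 * instead, which is not valid for cpu >= 32 where int is 32 bits wide. */
static uint64_t
cpu_bit(unsigned int cpu)
{
    assert(cpu < 64);
    return UINT64_C(1) << cpu;
}
```

With this form, cpu_bit(40) yields a mask whose only set bit is bit 40,
which a 32-bit int could not represent.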

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agonetdev-linux: Cast policer rate to uint64_t to avoid overflow
Thomas Graf [Fri, 29 Aug 2014 10:20:21 +0000 (12:20 +0200)]
netdev-linux: Cast policer rate to uint64_t to avoid overflow

tc_fill_rate() takes a 64-bit int. Casting kbits_rate from int
to uint64_t avoids a possible overflow when translating from
kbits to bytes.
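
The overflow can be illustrated with a small sketch. The function name
kbits_to_bytes_per_sec() is hypothetical and the conversion factor
(1000 bits per kbit, 8 bits per byte) is an assumption made for the
example, not a quote from netdev-linux.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts a rate in kbit/s to bytes/s.
 *
 * The cast is applied before the multiplication, so the product is
 * computed in 64 bits.  Without it, 'kbits_rate * 1000' would be an int
 * multiplication and could overflow for rates above roughly 2 Gbit/s. */
static uint64_t
kbits_to_bytes_per_sec(int kbits_rate)
{
    assert(kbits_rate >= 0);
    return (uint64_t) kbits_rate * 1000 / 8;
}
```

For a 10 Gbit/s rate (10000000 kbit/s) the intermediate product is
10^10, which does not fit in a 32-bit int but is fine in a uint64_t.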

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto/ofproto: Use relaxed atomics.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:53 +0000 (10:34 -0700)]
ofproto/ofproto: Use relaxed atomics.

Neither 'miss_config', 'n_missed', nor 'n_matched' is used to
synchronize the state of any other variable, so we can use relaxed
atomic operations on them.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/timeval: Use relaxed atomics also when writing on 'slow_path'.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:53 +0000 (10:34 -0700)]
lib/timeval: Use relaxed atomics also when writing on 'slow_path'.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto/ofproto-dpif-upcall: Use relaxed atomic operations.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:53 +0000 (10:34 -0700)]
ofproto/ofproto-dpif-upcall: Use relaxed atomic operations.

Neither 'enable_megaflows', 'udpif->flow_limit', 'udpif->n_flows', nor
'udpif->n_flows_timestamp' are used to synchronize the state of any
other variables, so we can use relaxed atomic operations to access
them.

Move the atomic read operation of 'enable_megaflows' outside the loop
in handle_upcalls().

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto/netflow: Use atomic_count for 'netflow_count'.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:53 +0000 (10:34 -0700)]
ofproto/netflow: Use atomic_count for 'netflow_count'.

'netflow_count' and the existence of actual netflow objects is not
tightly synchronized, so we can use the relaxed atomic_count for it.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/netdev-linux: Use atomic_count for 'miimon_cnt'.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:53 +0000 (10:34 -0700)]
lib/netdev-linux: Use atomic_count for 'miimon_cnt'.

'miimon_cnt' and the actual device miimon configuration is only
loosely coupled, so we can use the relaxed atomic_count for it.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/netdev-dummy: Use relaxed atomics for a trivial counter.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:53 +0000 (10:34 -0700)]
lib/netdev-dummy: Use relaxed atomics for a trivial counter.

Even though there is no need to optimize netdev-dummy, it might be
good to do this right, in case it serves as an inspiration for
something else later.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/netdev: Do not use atomics when not needed.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:53 +0000 (10:34 -0700)]
lib/netdev: Do not use atomics when not needed.

All access to struct netdev_registered_class ref_cnt member was done
with netdev_class_mutex held, so it does not need to be an atomic
variable.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/dpif-linux: Use relaxed atomics for 'dump->status'.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib/dpif-linux: Use relaxed atomics for 'dump->status'.

'dump->status' does not synchronize the state of any other variable, so
we can use relaxed atomics on it.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/cfm: Use relaxed atomics and optimize cfm_should_process_flow().
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib/cfm: Use relaxed atomics and optimize cfm_should_process_flow().

The atomics here do not synchronize the state of any other variables,
so we can use relaxed atomics.

cfm_should_process_flow() is rearranged to set the megaflow mask bits
only if necessary, and to avoid the atomic operation on non-CFM
packets.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/bfd: Use relaxed atomics and optimize bfd_should_process_flow().
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib/bfd: Use relaxed atomics and optimize bfd_should_process_flow().

The atomics here do not synchronize the state of any other variables,
so we can use atomic_count and relaxed atomics.

bfd_should_process_flow() is rearranged to set the megaflow mask bits
only if necessary, and to avoid the atomic operation on non-BFD
packets.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/jsonrpc, lib/ofp-msgs, lib/ofp-parse: Use atomic_count.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib/jsonrpc, lib/ofp-msgs, lib/ofp-parse: Use atomic_count.

Trivial ID counters do not synchronize anything, therefore can use
atomic_count.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib: Use shorter form of relaxed atomic access.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib: Use shorter form of relaxed atomic access.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/ovs-atomic: Add atomic_count.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib/ovs-atomic: Add atomic_count.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/ovs-atomic: Add helpers for relaxed atomic access.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib/ovs-atomic: Add helpers for relaxed atomic access.

When an atomic variable is not serving to synchronize threads about
the state of other (atomic or non-atomic) variables, no memory barrier
is needed with the atomic operation.  However, the default memory
order for an atomic operation is memory_order_seq_cst, which always
causes a system-wide locking of the memory bus and prevents both the
CPU and the compiler from reordering memory accesses across the
atomic operation.  This can add considerable stalls as each atomic
operation (regardless of memory order) always includes a memory
access.

In most cases we can let the compiler reorder memory accesses to
minimize the time we spend waiting for the completion of the atomic
memory accesses by using the relaxed memory order.  This patch adds
helpers to make such accesses a little easier on the eye (and the
fingers :-), but does not try to hide them completely.

Following patches make use of these and remove all the (implied)
memory_order_seq_cst use from the OVS code base.
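
As a rough illustration of what such helpers look like, here is a sketch
written against standard C11 <stdatomic.h> rather than lib/ovs-atomic.h.
The names counter_read_relaxed() and counter_add_relaxed() are
hypothetical and are not the helpers this patch adds.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical wrappers around C11 atomics.  They only shorten the
 * spelling of a relaxed access; the memory order is still visible in
 * the name, so the choice is not hidden from the reader. */
static inline unsigned long
counter_read_relaxed(atomic_ulong *counter)
{
    return atomic_load_explicit(counter, memory_order_relaxed);
}

static inline void
counter_add_relaxed(atomic_ulong *counter, unsigned long n)
{
    /* Relaxed order: the addition itself is atomic, but it imposes no
     * ordering on accesses to other variables. */
    atomic_fetch_add_explicit(counter, n, memory_order_relaxed);
}
```

This is suitable only for variables such as statistics counters, whose
value is not used to publish or guard the state of other data.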

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/ovs-atomic: Clarified comments on ovs_refcount_unref().
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib/ovs-atomic: Clarified comments on ovs_refcount_unref().

ovs_refcount_unref() needs to synchronize with the other instances of
itself rather than with ovs_refcount_ref().

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/ovs-atomic: Add missing macro argument parentheses.
Jarno Rajahalme [Fri, 29 Aug 2014 17:34:52 +0000 (10:34 -0700)]
lib/ovs-atomic: Add missing macro argument parentheses.

Otherwise the dereference operator could target a portion of a ternary
expression, for example.
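
The problem can be shown with a tiny sketch. The macro DEREF() and the
function pick() are hypothetical and exist only for this example; they
are not macros from lib/ovs-atomic.h.

```c
#include <assert.h>

/* Without parentheses around the argument, '(*PTR)' applied to
 * 'c ? a : b' would expand to '(*c ? a : b)', so the dereference would
 * bind to 'c' only.  Parenthesizing the argument avoids that. */
#define DEREF(PTR) (*(PTR))

/* Hypothetical user: dereferences whichever pointer the condition
 * selects. */
static int
pick(int use_first, int *first, int *second)
{
    assert(first && second);
    return DEREF(use_first ? first : second);
}
```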

Also minor style fixes.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: Always initialize fix_segment for GSO packet.
Pravin B Shelar [Wed, 27 Aug 2014 14:24:44 +0000 (07:24 -0700)]
datapath: Always initialize fix_segment for GSO packet.

OVS tunnel compat code depends on this function pointer to
handle GSO packets. Currently we do not initialize it for all
GRE GSO packets. The following patch fixes that.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agoINSTALL: Describe steps to use/install continuous integration
Thomas Graf [Thu, 28 Aug 2014 23:50:21 +0000 (01:50 +0200)]
INSTALL: Describe steps to use/install continuous integration

Describe the steps required to set up travis-ci for any
GitHub ovs repository.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoChanging hash used for selecting bucket in a group action
Srini Seetharaman [Fri, 15 Aug 2014 16:42:46 +0000 (09:42 -0700)]
Changing hash used for selecting bucket in a group action

The current hash uses just the dl_dst field. This patch expands the hash to
include all L2, L3 and L4 fields, allowing for more balanced selection.

Signed-off-by: Srini Seetharaman <srini.seetharaman@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: Rename files.
Samuel Ghinet [Fri, 29 Aug 2014 04:06:48 +0000 (04:06 +0000)]
datapath-windows: Rename files.

This patch includes the file renaming and accommodations needed for the file
renaming to build the forwarding extension for Hyper-V.

This patch is also a follow-up for the thread:
http://openvswitch.org/pipermail/dev/2014-August/044005.html

Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agonetlink-socket: Use read/write ioctl instead of ReadFile/WriteFile.
Nithin Raju [Wed, 27 Aug 2014 15:36:19 +0000 (08:36 -0700)]
netlink-socket: Use read/write ioctl instead of ReadFile/WriteFile.

The Windows datapath supports a READ/WRITE ioctl instead of ReadFile/WriteFile.
In this change, we update the following:
- WriteFile() in nl_sock_send__() to use DeviceIoControl(OVS_IOCTL_WRITE)
- ReadFile() in nl_sock_recv__() to use DeviceIoControl(OVS_IOCTL_READ)

The WriteFile() call in nl_sock_transact_multiple__() has not been touched
since it is not needed yet.

The main motive for this change is to unblock the DP Dump workflow.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
9 years agoCodingStyle: Relax restrictions on types of bit-fields.
Ben Pfaff [Thu, 28 Aug 2014 15:50:13 +0000 (08:50 -0700)]
CodingStyle: Relax restrictions on types of bit-fields.

C99 only requires compilers to support four types for bit-fields: signed
int, unsigned int, int, and _Bool.  "int" should not be used because it
is implementation-defined whether it is signed.  In practice, we have found
that compilers (in particular, GCC, Clang, and MSVC 2013) support any
integer type.
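
For reference, a sketch of the conservative form that every C99 compiler
has to accept, with explicitly unsigned bit-fields. The struct and
function names are hypothetical. Wider or narrower integer types for
bit-fields, which this change permits, rely on the compiler support
described above.

```c
#include <assert.h>

/* Hypothetical example.  'unsigned int' avoids the question of whether
 * a plain 'int' bit-field is signed. */
struct example_flags {
    unsigned int valid : 1;     /* Holds 0 or 1. */
    unsigned int kind : 3;      /* Holds 0 through 7. */
};

static struct example_flags
make_flags(unsigned int valid, unsigned int kind)
{
    struct example_flags f;

    assert(kind < 8);
    f.valid = valid != 0;       /* Normalize to 0 or 1. */
    f.kind = kind;
    return f;
}
```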

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agoAdd build of ovsext.sln using MSBuild
Alin Serdean [Thu, 28 Aug 2014 13:49:24 +0000 (13:49 +0000)]
Add build of ovsext.sln using MSBuild

This commit adds to the automake build system the full build required
by the forwarding extension solution.

It will help a lot in the future CI to check the full build of the project.

To configure the forwarding extension to be built one could use the following:
./configure CC=./build-aux/cccl LD="`which link`" LIBS="-lws2_32" \
    --prefix="C:/openvswitch/usr" --localstatedir="C:/openvswitch/var" \
    --sysconfdir="C:/openvswitch/etc" --with-pthread="C:/pthread" \
    --with-vstudioddk="Win8.1 Release"

Documentation will be updated in another patch.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
9 years agoovsdb: Fix error leak for negative timeout and invalid until case
Thomas Graf [Thu, 28 Aug 2014 12:40:50 +0000 (14:40 +0200)]
ovsdb: Fix error leak for negative timeout and invalid until case

Although the check for negative timeout is present, the error string
is overwritten if an invalid "until" is found right after. This leaks
an error string and results in not reporting the negative timeout back
to the user even though it is encountered first.
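
The pattern behind this kind of fix can be sketched as follows. This is
not the ovsdb code: parse_wait(), make_error() and the accepted "until"
values are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc()ed copy of 'msg'. */
static char *
make_error(const char *msg)
{
    size_t len = strlen(msg) + 1;
    char *copy = malloc(len);

    assert(copy != NULL);
    memcpy(copy, msg, len);
    return copy;
}

/* Hypothetical validator.  Returns NULL on success, otherwise an error
 * string that the caller must free(). */
static char *
parse_wait(long timeout_ms, const char *until)
{
    char *error = NULL;

    if (timeout_ms < 0) {
        error = make_error("timeout must be nonnegative");
    }

    /* Check 'until' only if nothing has failed yet.  Assigning to
     * 'error' unconditionally here would overwrite, and so leak, the
     * first error string and hide the first problem from the user. */
    if (!error && strcmp(until, "==") && strcmp(until, "!=")) {
        error = make_error("invalid \"until\"");
    }
    return error;
}
```

With both inputs invalid, the caller sees the timeout error, which was
encountered first, and has exactly one string to free.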

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agolib/flow.h Revert bitfield back to uint64_t.
Jarno Rajahalme [Wed, 27 Aug 2014 15:21:45 +0000 (08:21 -0700)]
lib/flow.h Revert bitfield back to uint64_t.

Using different types for the two bitfields did not work on MSVC, so
reverting back to "64-bit bool" :-)

Reported-by: Saurabh Shah <ssaurabh@vmware.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agodatapath-windows: Update netlink family IDs
Nithin Raju [Wed, 27 Aug 2014 03:37:18 +0000 (20:37 -0700)]
datapath-windows: Update netlink family IDs

I didn't realize earlier while defining OvsDpInterfaceExt.h that
there are special values defined in netlink-protocol.h for nlmsg_type.
For example, NLMSG_ERROR is defined to be 2. In this patch, we update the
values of the family IDs to not clash with the special defines.
I'm using NLMSG_MIN_TYPE as a reference.
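
A sketch of the numbering scheme, under stated assumptions: every
EXAMPLE_ name below is hypothetical, the value 0x10 mirrors the Linux
definition of NLMSG_MIN_TYPE, and the actual IDs chosen in
OvsDpInterfaceExt.h may differ.

```c
#include <assert.h>

/* Local stand-ins so the sketch builds without netlink headers. */
#define EXAMPLE_NLMSG_ERROR    2       /* Reserved control message type. */
#define EXAMPLE_NLMSG_MIN_TYPE 0x10    /* Types below this are reserved. */

/* Hypothetical family IDs, all numbered above the reserved range. */
enum example_family_id {
    EXAMPLE_FAMILY_DATAPATH = EXAMPLE_NLMSG_MIN_TYPE + 1,
    EXAMPLE_FAMILY_VPORT,
    EXAMPLE_FAMILY_FLOW,
    EXAMPLE_FAMILY_PACKET
};

/* Returns nonzero if 'id' cannot be mistaken for a control message. */
static int
family_id_is_valid(int id)
{
    return id >= EXAMPLE_NLMSG_MIN_TYPE;
}
```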

All this points to doing family ID lookup from the kernel rather than
returning values from netlink-socket.c. We should move to that model
after we get through the first round of netlink commands.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agonetlink-socket: fix typo to get_sock_pid_from_kernel()
Nithin Raju [Wed, 27 Aug 2014 03:20:51 +0000 (20:20 -0700)]
netlink-socket: fix typo to get_sock_pid_from_kernel()

A typo crept in while respinning get_sock_pid_from_kernel() in the previous
patch. Fixing it now. Also, get_sock_pid_from_kernel() doesn't need an OUT
argument. Fixing that too.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agonetlink-socket: add support for nl_lookup_genl_mcgroup()
Nithin Raju [Wed, 27 Aug 2014 03:17:03 +0000 (20:17 -0700)]
netlink-socket: add support for nl_lookup_genl_mcgroup()

While we work out whether nl_sock_join_mcgroup() will be the mechanism
to support VPORT events, it is easy to add support for
nl_lookup_genl_mcgroup() and make progress on the other commands.

In this patch, we implement support for nl_lookup_genl_mcgroup() only
for the VPORT family though, which is all that dpif-linux.c needs.

Validation:
- A ported dpif-linux.c with epoll code commented out went so far as
to call dp_enumerate! DP Dump commands can be implemented next.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: define mcgroup IDs for VPORT and other families
Nithin Raju [Wed, 27 Aug 2014 03:17:02 +0000 (20:17 -0700)]
datapath-windows: define mcgroup IDs for VPORT and other families

dpif-linux.c makes a nl_lookup_genl_mcgroup(OVS_VPORT_FAMILY) call that is
not yet implemented on Windows. The multicast group is currently used to
subscribe to events related to VPORTs. Whether the exact same mechanism
would be used is still unclear.

In the interim, we can implement code to support nl_lookup_genl_mcgroup()
and make progress with the other simpler commands.

In this patch, we define an ID for the VPORT MC group and use it. The IDs for
other families were also defined, but without usage, were seen as unclear.
Hence we define only the VPORT group.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoAUTHORS: Add Madhu Challa.
Ben Pfaff [Wed, 27 Aug 2014 15:05:22 +0000 (08:05 -0700)]
AUTHORS: Add Madhu Challa.

Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agovtep-ctl: Free error string before return from cmd_remove().
Madhu Challa [Wed, 27 Aug 2014 01:16:12 +0000 (18:16 -0700)]
vtep-ctl: Free error string before return from cmd_remove().

Error string should be freed in all cases.

Found by Coverity.

Signed-off-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoFix memory leaks in error paths.
yinpeijun [Wed, 27 Aug 2014 01:52:54 +0000 (09:52 +0800)]
Fix memory leaks in error paths.

Found by Fortify.

Signed-off-by: yinpeijun <yinpeijun@huawei.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agotests: Fix race conditions.
Joe Stringer [Thu, 31 Jul 2014 22:55:59 +0000 (10:55 +1200)]
tests: Fix race conditions.

These tests had the potential to fail due to statistics not updating
before the test script retrieves them. Fix them by waiting until the
next revalidation cycle.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: Fix recirc bug where skb is double freed.
Andy Zhou [Mon, 25 Aug 2014 22:18:19 +0000 (15:18 -0700)]
datapath: Fix recirc bug where skb is double freed.

If the recirc action is the last action of an action list, the SKB that
triggers the recirc will be freed twice. This patch fixes this bug.

Reported-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
9 years agolib/flow.h: Improve struct miniflow comment and definition.
Jarno Rajahalme [Tue, 26 Aug 2014 22:11:39 +0000 (15:11 -0700)]
lib/flow.h: Improve struct miniflow comment and definition.

Miniflows can nowadays be dynamically allocated to different inline
sizes, as done by lib/classifier.c, but this had not been documented
at the struct miniflow definition.

Also, MINI_N_INLINE had a different value for 32-bit and 64-bit builds
due to a historical reason.  Now we use 8 for both.

Finally, change the storage type of 'values_inline' to uint8_t, as
uint64_t looks kind of wide for a boolean, even though we intend the
bit to be carved out from the uint64_t where 'map' resides.

Suggested-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agoRevert "miniflow: Remove unused values_inline branch from miniflow_move()"
Jarno Rajahalme [Tue, 26 Aug 2014 22:48:48 +0000 (15:48 -0700)]
Revert "miniflow: Remove unused values_inline branch from miniflow_move()"

This reverts commit 29d2aa3aa74ab97df0f00af5bddaa12485c1d39a.

Turns out the code was correct but the dynamic nature of miniflow
inline storage was poorly documented.  A following patch fixes the
documentation.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agoAUTHORS: Add Ed Swierk.
Ben Pfaff [Tue, 26 Aug 2014 18:44:28 +0000 (11:44 -0700)]
AUTHORS: Add Ed Swierk.

Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodebian: Fix cross build.
Ed Swierk [Sun, 24 Aug 2014 17:37:29 +0000 (10:37 -0700)]
debian: Fix cross build.

Cross-building openvswitch with debuild -aARCH (or equivalent) fails
because the target architecture is not getting passed to configure.
Thus binaries like ovs-appctl get built using the build host
architecture.

Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodpif-netdev: Fix leaked port, port->rxq, port->type in error path
Thomas Graf [Tue, 26 Aug 2014 16:36:08 +0000 (18:36 +0200)]
dpif-netdev: Fix leaked port, port->rxq, port->type in error path

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
[blp@nicira.com added free of port->type]
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoodp-util: Only add recirc_id mask to Netlink message if mask is provided
Thomas Graf [Tue, 26 Aug 2014 16:34:52 +0000 (18:34 +0200)]
odp-util: Only add recirc_id mask to Netlink message if mask is provided

Current unconditional call may result in NULL being passed to
nl_msg_put_u32().

Cc: Andy Zhou <azhou@nicira.com>
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agobuild: Use errtrace to simplify travis-ci failure detection
Thomas Graf [Tue, 26 Aug 2014 10:24:04 +0000 (12:24 +0200)]
build: Use errtrace to simplify travis-ci failure detection

Causes the build script to fail if any command inside the
script returns nonzero.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agotest-controller: Rename to ovs-testcontroller, again install.
Ben Pfaff [Fri, 15 Aug 2014 17:32:50 +0000 (10:32 -0700)]
test-controller: Rename to ovs-testcontroller, again install.

mininet uses the Open vSwitch controller by default, for testing.

CC: 757761@bugs.debian.org
Reported-at: https://bugs.debian.org/757761
Requested-by: Tomasz Buchert <tomasz.buchert@inria.fr>
Requested-by: Dariusz Dwornikowski <dariusz.dwornikowski@cs.put.poznan.pl>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agovswitch.xml: Fix a typo.
Alex Wang [Tue, 26 Aug 2014 05:42:04 +0000 (22:42 -0700)]
vswitch.xml: Fix a typo.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agojson: Fix leaked nodes in json_hash_object()
Thomas Graf [Tue, 26 Aug 2014 10:23:03 +0000 (12:23 +0200)]
json: Fix leaked nodes in json_hash_object()

nodes is allocated through shash_sort() but never freed.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agominiflow: Remove unused values_inline branch from miniflow_move()
Thomas Graf [Tue, 26 Aug 2014 10:01:34 +0000 (12:01 +0200)]
miniflow: Remove unused values_inline branch from miniflow_move()

The branch is unused as size < sizeof dst->inline_values must
always be true for inlined values. Hitting the branch would lead
to corruption as inline_values is accessed out of bounds.

Remove branch and add assertion.

Cc: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>