cascardo/ovs.git
9 years agoovs-vtep: Store physical switch name globally.
Gurucharan Shetty [Tue, 29 Jul 2014 15:40:19 +0000 (08:40 -0700)]
ovs-vtep: Store physical switch name globally.

ovs-vtep is an emulator and it works only on one
physical switch. This switch name is stored in the variable
'ps_name' and then passed around. An upcoming commit requires
access to this variable at more places and it is easier if this
variable is global.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ariel Tubaltsev <atubaltsev@vmware.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agoovs-vtep: Clear left-over local mac information.
Gurucharan Shetty [Thu, 24 Jul 2014 19:40:39 +0000 (12:40 -0700)]
ovs-vtep: Clear left-over local mac information.

Before destroying a logical switch, clean up any leftover local
mac information in the Ucast_Macs_Local or Mcast_Macs_Local table.
We need to do this to at least clean up the 'unknown-dst' information
added in the Mcast_Macs_Local table while creating the Logical_Switch
class in setup_ls().

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ariel Tubaltsev <atubaltsev@vmware.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agovtep-ctl: Add Tunnel table to vtep_ctl_table_class.
Gurucharan Shetty [Tue, 29 Jul 2014 13:53:47 +0000 (06:53 -0700)]
vtep-ctl: Add Tunnel table to vtep_ctl_table_class.

This is needed to create, get, set records in the Tunnel table.

(We need to add the Tunnel table's 'local' and 'remote' columns
that point to the Physical_Locator record to cache because vtep-ctl
commands like 'add-ucast-local' will try to add an entry in
Physical_Locator table based on the contents of the cache.)

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ariel Tubaltsev <atubaltsev@vmware.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agoREADME.ovs-vtep: Remotes can be connected through VTEP's manager table.
Gurucharan Shetty [Thu, 4 Sep 2014 19:03:28 +0000 (12:03 -0700)]
README.ovs-vtep: Remotes can be connected through VTEP's manager table.

Reported-by: Ziyou Wang <ziyouw@vmware.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agolib/ofproto: Sync RSTP operational state after configuration changes.
Jarno Rajahalme [Thu, 25 Sep 2014 21:53:50 +0000 (14:53 -0700)]
lib/ofproto: Sync RSTP operational state after configuration changes.

Otherwise the RSTP port operational state could be out of sync until
the next time the port's carrier status changes.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Daniele Venturino <daniele.venturino@m3s.it>
9 years agonetdev-dpdk: Fix crash when there is no pci numa info.
Alex Wang [Thu, 25 Sep 2014 20:10:55 +0000 (13:10 -0700)]
netdev-dpdk: Fix crash when there is no pci numa info.

When the kernel cannot obtain the pci numa info, the numa_node file
in the corresponding pci directory in sysfs will show -1.  Then the
rte_eth_dev_socket_id() function will return it to ovs.  On
current master, ovs assumes rte_eth_dev_socket_id() always
returns a non-negative value.  So using this -1 in pmd thread
creation will cause ovs to crash.

To fix the above issue, this commit makes ovs always check the
return value of rte_eth_dev_socket_id() and use numa node 0 if
the return value is negative.
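
A minimal sketch of the fallback described above, not the actual
netdev-dpdk code.  The helper name numa_id_or_default() is hypothetical;
its argument stands in for whatever rte_eth_dev_socket_id() returned.

```c
/* Hypothetical helper: 'socket_id' stands in for the value returned by
 * rte_eth_dev_socket_id().  A negative value means the NUMA node is
 * unknown, so fall back to node 0 as the commit message describes. */
static int
numa_id_or_default(int socket_id)
{
    return socket_id < 0 ? 0 : socket_id;
}
```

With this shape, the pmd thread creation path never sees a negative
node number.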

Reported-by: Daniel Badea <daniel.badea@windriver.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agonetdev: Fix error check.
Alex Wang [Thu, 25 Sep 2014 18:40:24 +0000 (11:40 -0700)]
netdev: Fix error check.

Reported-by: Daniel Badea <daniel.badea@windriver.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agoofproto-dpif-rid: remove unused return value of rid_pool_add()
Simon Horman [Thu, 25 Sep 2014 11:57:53 +0000 (11:57 +0000)]
ofproto-dpif-rid: remove unused return value of rid_pool_add()

The return value of rid_pool_add() is never used
so the code may be slightly simplified by removing it.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
9 years agoofproto-dpif-rid: remove struct rid_map
Simon Horman [Thu, 25 Sep 2014 11:57:52 +0000 (11:57 +0000)]
ofproto-dpif-rid: remove struct rid_map

struct rid_map only has one member which is a struct hmap.
This allows for a slight simplification of the code by removing
struct rid_map and using a struct hmap directly instead.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
9 years agorstp.at: Fix intermittent test failure.
Alex Wang [Thu, 25 Sep 2014 23:51:01 +0000 (23:51 +0000)]
rstp.at: Fix intermittent test failure.

Sub-test "RSTP - dummy interface" checks the ovs-vswitchd.log
output immediately after command execution.  The check may
fail if the write of new log is delayed by the IO thread.

This commit fixes the above issue by waiting for the
ovs-vswitchd.log output.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agoofproto-dpif-rid: correct logic error in rid_pool_alloc_id()
Simon Horman [Wed, 24 Sep 2014 12:41:02 +0000 (12:41 +0000)]
ofproto-dpif-rid: correct logic error in rid_pool_alloc_id()

When searching through the valid ids an id should
be used if it is not found rather than if it is found.

It appears to me that without this change duplicate recirculation
ids may be used in cases where the last recirculation id has
been allocated; selection loops back to the beginning of the pool; and
reaches a recirculation id that is still in use.

As the number of recirculation ids is currently RECIRC_ID_N_IDS = 1024 this
does not seem beyond the bounds of possibility.

I have not verified that such a scenario can actually occur.  But it seems
that a likely consequence would be that some packets may be forwarded
incorrectly.
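
The corrected search can be sketched roughly as follows.  This is an
illustration only: struct id_pool, id_pool_alloc() and the tiny pool
size are assumptions made for the example, and the real code keeps its
ids in an hmap rather than an array.

```c
#include <stdbool.h>
#include <stdint.h>

#define POOL_BASE  1u   /* First valid id; 0 is used here to mean "none". */
#define POOL_N_IDS 8u   /* Tiny pool for illustration only. */

/* Hypothetical pool: in_use[] stands in for the lookup of an id among
 * the ids that are currently allocated. */
struct id_pool {
    bool in_use[POOL_N_IDS];
    uint32_t next_free;     /* Offset at which the next search starts. */
};

/* Returns an unused id, or 0 if every id in the pool is in use. */
static uint32_t
id_pool_alloc(struct id_pool *pool)
{
    uint32_t i;

    for (i = 0; i < POOL_N_IDS; i++) {
        uint32_t off = (pool->next_free + i) % POOL_N_IDS;

        /* The id may be handed out only if it is NOT found in use. */
        if (!pool->in_use[off]) {
            pool->in_use[off] = true;
            pool->next_free = (off + 1) % POOL_N_IDS;
            return POOL_BASE + off;
        }
    }
    return 0;
}
```

After the search wraps around, ids that are still in use are skipped
instead of being returned a second time.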

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
9 years agoovs-atomic-msvc: Fix 64 bit atomic read/writes.
Gurucharan Shetty [Wed, 24 Sep 2014 17:25:52 +0000 (10:25 -0700)]
ovs-atomic-msvc: Fix 64 bit atomic read/writes.

MSVC converts 64 bit read/writes into two instructions (uses 'mov' as
seen through cl //FAs). So there is a possibility that an interrupt can
make a 64 bit read/write non-atomic even when 8 byte aligned. So we cannot
use a simple assignment. Use a full memory barrier function instead.

Reported-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agoAUTHORS: Add Selvamuthukumar <smkumar@merunetworks.com>.
Ben Pfaff [Wed, 24 Sep 2014 17:02:35 +0000 (10:02 -0700)]
AUTHORS: Add Selvamuthukumar <smkumar@merunetworks.com>.

Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofp-actions: Fix error code for invalid table id.
Selvamuthukumar [Wed, 24 Sep 2014 16:53:13 +0000 (09:53 -0700)]
ofp-actions: Fix error code for invalid table id.

Send OFPET_BAD_INSTRUCTION/OFPBIC_BAD_TABLE_ID if table is invalid
in goto table instruction.

Signed-off-by: Selvamuthukumar <smkumar@merunetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agojsonrpc: Notify excessive sending backlog.
Alex Wang [Thu, 18 Sep 2014 17:49:47 +0000 (10:49 -0700)]
jsonrpc: Notify excessive sending backlog.

This commit adds a log message to report an excessive sending backlog
for jsonrpc.  Ideally, this message should never be printed.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agobridge: Refactor the stats and status update.
Alex Wang [Thu, 18 Sep 2014 21:35:30 +0000 (14:35 -0700)]
bridge: Refactor the stats and status update.

This commit refactors the stats and status update in bridge_run()
by moving the corresponding code to separate functions.  This
makes the code more organized.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agobridge: Rate limit the statistics update.
Alex Wang [Thu, 18 Sep 2014 21:10:24 +0000 (14:10 -0700)]
bridge: Rate limit the statistics update.

When ovs is running with a large topology (e.g. a large number of
interfaces), the stats update to ovsdb becomes huge and normally
requires multiple runs of the ovsdb jsonrpc message processing loop to
consume.

To prevent the periodic stats update from backlogging in the
jsonrpc sending queue, this commit adds rate limiting logic
which only allows a new update if the previous one is done.
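
The "one update in flight" rule can be sketched as below.  The struct
and function names are hypothetical and only model the gating; they are
not the bridge.c implementation.

```c
#include <stdbool.h>

/* Hypothetical state: tracks whether the previous stats update is
 * still waiting to be consumed. */
struct stats_updater {
    bool in_flight;        /* Previous update not yet completed. */
    unsigned int n_sent;   /* Number of updates actually sent. */
};

/* Sends a new update only if the previous one is done.  Returns true
 * if an update was sent. */
static bool
stats_update_try_send(struct stats_updater *u)
{
    if (u->in_flight) {
        return false;      /* Skip this period instead of queueing more. */
    }
    u->in_flight = true;
    u->n_sent++;
    return true;
}

/* Called once the previous update has been fully processed. */
static void
stats_update_done(struct stats_updater *u)
{
    u->in_flight = false;
}
```

Skipped periods are simply dropped, so the sending queue holds at most
one stats update at a time.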

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
9 years agodatapath: Constify various function arguments
Thomas Graf [Tue, 23 Sep 2014 14:02:35 +0000 (16:02 +0200)]
datapath: Constify various function arguments

Help produce better optimized code.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: Restore OVS_CB after skb_segment.
Pravin B Shelar [Sun, 21 Sep 2014 04:10:49 +0000 (21:10 -0700)]
datapath: Restore OVS_CB after skb_segment.

OVS needs to segment a large skb before sending it to userspace for
miss packet handling, but skb_gso_segment uses skb->cb. This
corrupted OVS_CB, which resulted in the following panic.

[  735.419921] BUG: unable to handle kernel paging request at 00000014000001b2
[  735.423168] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[  735.445097] RIP: 0010:[<ffffffffa05df0d7>]  [<ffffffffa05df0d7>] ovs_nla_put_flow+0x37/0x7c0 [openvswitch]
[  735.468858] Call Trace:
[  735.470384]  [<ffffffffa05d7ec2>] queue_userspace_packet+0x332/0x4d0 [openvswitch]
[  735.471741]  [<ffffffffa05d8155>] queue_gso_packets+0xf5/0x180 [openvswitch]
[  735.481862]  [<ffffffffa05da9f5>] ovs_dp_upcall+0x65/0x70 [openvswitch]
[  735.483031]  [<ffffffffa05dab81>] ovs_dp_process_packet+0x181/0x1b0 [openvswitch]
[  735.484391]  [<ffffffffa05e2f55>] ovs_vport_receive+0x65/0x90 [openvswitch]
[  735.492638]  [<ffffffffa05e5738>] internal_dev_xmit+0x68/0x110 [openvswitch]
[  735.495334]  [<ffffffff81588eb6>] dev_hard_start_xmit+0x2e6/0x8b0
[  735.496503]  [<ffffffff81589847>] __dev_queue_xmit+0x3c7/0x920
[  735.499827]  [<ffffffff81589db0>] dev_queue_xmit+0x10/0x20
[  735.500798]  [<ffffffff815d3b60>] ip_finish_output+0x6a0/0x950
[  735.502818]  [<ffffffff815d55f8>] ip_output+0x68/0x110
[  735.503835]  [<ffffffff815d4979>] ip_local_out+0x29/0x90
[  735.504801]  [<ffffffff815d4e46>] ip_queue_xmit+0x1d6/0x640
[  735.507015]  [<ffffffff815ee0d7>] tcp_transmit_skb+0x477/0xac0
[  735.508260]  [<ffffffff815ee856>] tcp_write_xmit+0x136/0xba0
[  735.510829]  [<ffffffff815ef56e>] __tcp_push_pending_frames+0x2e/0xc0
[  735.512296]  [<ffffffff815e0593>] tcp_sendmsg+0xa63/0xd50
[  735.513526]  [<ffffffff81612c2c>] inet_sendmsg+0x10c/0x220
[  735.516025]  [<ffffffff81566b8c>] sock_sendmsg+0x9c/0xe0
[  735.518066]  [<ffffffff81566d41>] SYSC_sendto+0x121/0x1c0
[  735.521398]  [<ffffffff8156801e>] SyS_sendto+0xe/0x10
[  735.522473]  [<ffffffff816df5e9>] system_call_fastpath+0x16/0x1b
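
The save-and-restore pattern named in the subject can be sketched in
plain userspace C.  struct pkt, clobbering_segment() and
segment_preserving_cb() are made-up stand-ins for the skb, the
segmentation call and the fixed caller; this is not the kernel code.

```c
#include <string.h>

/* Hypothetical packet with a scratch control block, loosely modelled
 * on skb->cb. */
struct pkt {
    unsigned char cb[48];
};

/* Stand-in for a helper that uses the control block as scratch space
 * and leaves it overwritten. */
static void
clobbering_segment(struct pkt *p)
{
    memset(p->cb, 0xff, sizeof p->cb);
}

/* Saves the control block, runs the clobbering helper, then restores
 * the saved copy so later readers see the caller's data again. */
static void
segment_preserving_cb(struct pkt *p)
{
    unsigned char saved[sizeof p->cb];

    memcpy(saved, p->cb, sizeof saved);
    clobbering_segment(p);
    memcpy(p->cb, saved, sizeof saved);
}
```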

Reported-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agodatapath: Fix double free when ovs_nla_copy_actions() fails
Thomas Graf [Tue, 23 Sep 2014 17:59:56 +0000 (19:59 +0200)]
datapath: Fix double free when ovs_nla_copy_actions() fails

ovs_nla_copy_actions() already frees the allocated actions buffers,
ovs_flow_cmd_new() will free it a second time when jumping to
err_kfree_acts.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoovs-pki: Use SHA-1 instead of SHA-512 as message digest.
Alex Wang [Mon, 22 Sep 2014 22:34:12 +0000 (15:34 -0700)]
ovs-pki: Use SHA-1 instead of SHA-512 as message digest.

Commit 9ff33ca7 (ovs-pki: Use SHA-512 instead of MD5 as message
digest.) changes the message digest algorithm to SHA-512.  This
seems to break the unit tests on some xenserver 5.6/6.0 builds
causing the error: "SSL_connect: error:0D0C50A1:asn1 encoding
routines:ASN1_item_verify:unknown message digest algorithm".

As a solution, this commit changes the message digest algorithm
to SHA-1 which works for both the above xenserver builds and
centos 7.

VMware-BZ: #1319116

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: compat: Fix compilation for 2.6.32 kernel
Pravin B Shelar [Sun, 21 Sep 2014 18:41:28 +0000 (11:41 -0700)]
datapath: compat: Fix compilation for 2.6.32 kernel

Define alloc_netdev() using alloc_netdev_mq() which is available on all
kernels supported by OVS.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: Remove pkt_key from OVS_CB.
Pravin B Shelar [Thu, 18 Sep 2014 01:58:44 +0000 (18:58 -0700)]
datapath: Remove pkt_key from OVS_CB.

OVS keeps a pointer to the packet key in skb->cb, but the packet key is
stored on the stack. This could make the code a bit tricky. So it is
better to get rid of the pointer.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agodatapath: restore OVS_FLOW_CMD_NEW notifications
Samuel Gauthier [Sat, 20 Sep 2014 13:25:23 +0000 (06:25 -0700)]
datapath: restore OVS_FLOW_CMD_NEW notifications

Since commit fb5d1e9e127a ("openvswitch: Build flow cmd netlink reply only if needed."),
the new flows are not notified to the listeners of OVS_FLOW_MCGROUP.

This commit fixes the problem by using the genl function, i.e.
genl_has_listeners() instead of netlink_has_listeners().

Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: Provide compatibility for kernels up to 3.17
Thomas Graf [Thu, 18 Sep 2014 12:48:56 +0000 (14:48 +0200)]
datapath: Provide compatibility for kernels up to 3.17

Port datapath to work with kernels up to 3.17 and use 3.16.2 as
the new kernel for CI testing.

Tested with 3.14, 3.16.2, and net-next (3.17).

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Co-authored-by: Madhu Challa <challa@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: fix sparse warning.
Pravin B Shelar [Sat, 20 Sep 2014 12:36:47 +0000 (05:36 -0700)]
datapath: fix sparse warning.

datapath/linux/datapath.c:1418:28: warning: symbol
'i' shadows an earlier one
datapath/linux/datapath.c:1396:18: originally declared here

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: change the data type of error status to atomic_long_t
Li RongQing [Sat, 20 Sep 2014 12:10:03 +0000 (05:10 -0700)]
datapath: change the data type of error status to atomic_long_t

Change the data type of error status from u64 to atomic_long_t, and use atomic
operations, then remove the lock which is used to protect the error status.

Atomic operations may be faster than a spin lock.
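
As a userspace analogy only, the same idea can be written with C11
<stdatomic.h>; the kernel's atomic_long_t API is different, and the
struct and function names here are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical error counter kept as an atomic, so that updating and
 * reading it needs no separate lock. */
struct err_stats {
    atomic_long rx_dropped;
};

static void
err_stats_init(struct err_stats *s)
{
    atomic_init(&s->rx_dropped, 0);
}

/* Lock-free increment; safe to call from several threads at once. */
static void
err_stats_inc_rx_dropped(struct err_stats *s)
{
    atomic_fetch_add(&s->rx_dropped, 1);
}

/* Reads the current value without taking a lock. */
static long
err_stats_rx_dropped(struct err_stats *s)
{
    return atomic_load(&s->rx_dropped);
}
```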

Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: Remove support to set vport stats.
Pravin B Shelar [Sat, 20 Sep 2014 12:02:54 +0000 (05:02 -0700)]
datapath: Remove support to set vport stats.

This was required for old compatibility code which updated stats
on the fake bond interface. Now vswitchd has dropped it. This
support was always deprecated, so it is finally being removed.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agodpif-netdev: Fix (packet) memory leaks in the slow path.
Daniele Di Proietto [Fri, 19 Sep 2014 23:20:01 +0000 (16:20 -0700)]
dpif-netdev: Fix (packet) memory leaks in the slow path.

If a packet didn't match a rule in the fast path classifier its memory was
never freed. The issue was particularly clear with DPDK devices because it was
not possible to process more than ~250000 DPDK mbufs in the slow path.

This commit fixes the problem by:
* calling dpif_packet_delete() if the upcalls are disabled
* passing may_steal==true to dp_netdev_execute_actions() during normal upcall
  processing

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoovs-pki: Use SHA-512 instead of MD5 as message digest.
Ben Pfaff [Fri, 19 Sep 2014 23:17:09 +0000 (16:17 -0700)]
ovs-pki: Use SHA-512 instead of MD5 as message digest.

This fixes numerous testsuite failures of the form "SSL_connect:
error:0D0C50A1:asn1 encoding routines:ASN1_item_verify:unknown message
digest algorithm" on systems that disable MD5 in OpenSSL.  Centos 7 is one
example.  Presumably it increases security as well for anyone who generates
certificates based on a new configuration created by the new ovs-pki.

Reported-by: Robert Strickler <anomalyst@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodpif-netdev: Allow multi-rx-queue, multi-pmd-thread configuration.
Alex Wang [Mon, 8 Sep 2014 22:22:26 +0000 (15:22 -0700)]
dpif-netdev: Allow multi-rx-queue, multi-pmd-thread configuration.

This commit adds the multithreading functionality to the OVS dpdk
module.  Users are able to create multiple pmd threads and set
their cpu affinity by specifying a cpu mask string similar
to the EAL '-c COREMASK' option.

Also, the number of rx queues for each dpdk interface is made
configurable to help distribution of rx packets among multiple
pmd threads.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoovs-numa: Add support for cpu-mask configuration.
Alex Wang [Mon, 23 Jun 2014 01:08:15 +0000 (18:08 -0700)]
ovs-numa: Add support for cpu-mask configuration.

This commit adds support in ovs-numa module for reading a user
specified cpu mask, which configures the availability of the cores.

The cpu mask has the format of a hex string similar to the EAL '-c
COREMASK' option input or the 'taskset' mask input.  The lowest order
bit corresponds to the first CPU core.  Bit value '1' means the
corresponding core is available.
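
A rough sketch of how such a mask could be decoded, under stated
assumptions: parse_cpu_mask() and MAX_CORES are hypothetical names, the
string carries no "0x" prefix, and this is not the ovs-numa
implementation.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_CORES 64    /* Illustrative limit, not taken from the commit. */

/* Decodes a hex cpu mask such as "5" or "f0" into available[].  The
 * lowest order bit of the rightmost digit is core 0.  Returns the
 * number of available cores, or -1 on an invalid character or a core
 * number beyond MAX_CORES. */
static int
parse_cpu_mask(const char *mask, bool available[MAX_CORES])
{
    size_t len = strlen(mask);
    size_t i;
    int n_available = 0;

    memset(available, 0, MAX_CORES * sizeof available[0]);
    for (i = 0; i < len; i++) {
        char c = mask[len - 1 - i];     /* Walk from the rightmost digit. */
        int digit, bit;

        if (c >= '0' && c <= '9') {
            digit = c - '0';
        } else if (c >= 'a' && c <= 'f') {
            digit = c - 'a' + 10;
        } else if (c >= 'A' && c <= 'F') {
            digit = c - 'A' + 10;
        } else {
            return -1;
        }

        for (bit = 0; bit < 4; bit++) {
            size_t core = i * 4 + bit;

            if ((digit >> bit) & 1) {
                if (core >= MAX_CORES) {
                    return -1;
                }
                available[core] = true;
                n_available++;
            }
        }
    }
    return n_available;
}
```

For example, the mask "5" marks cores 0 and 2 as available, and "f0"
marks cores 4 through 7.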

An upcoming patch will allow the user to configure the mask via OVSDB.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodpif-netlink: rename linux_flow variable to datapath_flow
Nithin Raju [Fri, 19 Sep 2014 22:34:45 +0000 (15:34 -0700)]
dpif-netlink: rename linux_flow variable to datapath_flow

In the flow related functions, there's a stack variable called
'linux_flow'. Since this code is not specific to Linux anymore,
in this patch, we rename the variable to 'datapath_flow'.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: add OVS_DP_CMD_SET and OVS_DP_CMD_GET transaction support
Nithin Raju [Fri, 19 Sep 2014 21:30:56 +0000 (14:30 -0700)]
datapath-windows: add OVS_DP_CMD_SET and OVS_DP_CMD_GET transaction support

In this patch, we add support for two commands, both of which are issued
as part of transaction semantics from userspace:
1. OVS_DP_CMD_SET is used to get the properties of a DP as well as set
some properties. The set operation does not seem to make much sense for
the Windows datapath right now.
2. There's already support for OVS_DP_CMD_GET command issued via the
dump semantics from userspace. Turns out that userspace can issue
OVS_DP_CMD_GET as a transaction.

There's a lot of common code between these two commands. Hence combining
the implementation and the review.

Also refactors some of the code in the implementation of dump-based
OVS_DP_CMD_GET, and updates some of the comments.

Validation:
- With these series of patches, I was able to run the following command:
> .\utilities\ovs-dpctl.exe show
system@ovs-system:
    lookups: hit:0 missed:22 lost:0
    flows: 0
- I got so far as to hit the PORT_DUMP command which is currently not
implemented.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Tested-by: Nithin Raju <nithin@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/38
Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoextract-odp-netlink-windows-dp-h: add definition of IFNAMSIZ
Nithin Raju [Fri, 19 Sep 2014 21:30:55 +0000 (14:30 -0700)]
extract-odp-netlink-windows-dp-h: add definition of IFNAMSIZ

The Windows kernel datapath needs the definition of 'IFNAMSIZ' for
specifying attribute sizes in netlink policies. Adding the definition
of 'IFNAMSIZ' to be part of OvsDpInterface.h similar to ETH_ADDR_LEN.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agolib/netlink-socket.c: add support for nl_transact() on Windows
Nithin Raju [Fri, 19 Sep 2014 21:30:54 +0000 (14:30 -0700)]
lib/netlink-socket.c: add support for nl_transact() on Windows

In this patch, we add support for nl_transact() on Windows using
the OVS_IOCTL_TRANSACT ioctl that sends down the request and gets
the reply in the same call to the kernel.

This is obviously a digression from the way it is implemented in
Linux where all the sends are done at once using sendmsg() and
replies are received one at a time.

Initial implementation was in the Linux way using multiple writes
followed by reads, but decided against it since it is not efficient
and also it complicates the state machine in the kernel.

The Windows implementation has equivalent code for handling corner
cases and error conditions similar to Linux. Some of it is not
applicable yet. E.g. the Windows kernel does not embed an error
in the netlink message itself. There's userspace code nevertheless
for this.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: add OvsCompareString() to compare strings
Nithin Raju [Fri, 19 Sep 2014 21:30:53 +0000 (14:30 -0700)]
datapath-windows: add OvsCompareString() to compare strings

In this patch we implement a utility function to compare ANSI
strings using the Rtl* functions. As much as possible, in an
NDIS driver, we stick to Rtl* functions for memory/string
manipulation.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: return TRUE on success in NlAttrValidate
Nithin Raju [Fri, 19 Sep 2014 21:30:52 +0000 (14:30 -0700)]
datapath-windows: return TRUE on success in NlAttrValidate

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agotravis: Allow testsuite to run with GCC or Clang.
Ben Pfaff [Fri, 19 Sep 2014 20:09:24 +0000 (13:09 -0700)]
travis: Allow testsuite to run with GCC or Clang.

I don't see why the testsuite is supported only with GCC.

Acked-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agotravis: Include testsuite.log on failure.
Ben Pfaff [Fri, 19 Sep 2014 18:11:58 +0000 (11:11 -0700)]
travis: Include testsuite.log on failure.

Acked-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agonetdev-dpdk: Fix a bug in netdev_dpdk_set_multiq().
Alex Wang [Fri, 19 Sep 2014 17:38:39 +0000 (10:38 -0700)]
netdev-dpdk: Fix a bug in netdev_dpdk_set_multiq().

Commit 5a0340 (dpif-netdev: Create multiple tx/rx queues when
adding dpdk interface.) introduced a bug which causes the function
netdev_dpdk_set_multiq() to never reset the tx queues.  This bug
could cause a pmd thread to access unassigned memory, resulting in
a segfault.

This commit fixes the bug.

Reported-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agonetdev-dpdk: Fix a typo.
Alex Wang [Fri, 19 Sep 2014 17:37:08 +0000 (10:37 -0700)]
netdev-dpdk: Fix a typo.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agocoverage: Remove unused macro.
Ben Pfaff [Fri, 19 Sep 2014 15:24:11 +0000 (08:24 -0700)]
coverage: Remove unused macro.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agoFAQ: Fix a newline problem
YAMAMOTO Takashi [Fri, 19 Sep 2014 16:10:30 +0000 (01:10 +0900)]
FAQ: Fix a newline problem

Fix a newline problem in commit dd63a57e55daddccbbd4f0bedfdc86b6827b6b1a
("FAQ: Add an entry about reconfiguration")

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
9 years agoFAQ: Add an entry about reconfiguration
YAMAMOTO Takashi [Thu, 18 Sep 2014 04:44:52 +0000 (13:44 +0900)]
FAQ: Add an entry about reconfiguration

It seems that the behaviour is not so intuitive.
cf. https://bugs.launchpad.net/neutron/+bug/1346861

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agonetdev-dpdk: Pass queue id to dpdk_do_tx_copy().
Alex Wang [Fri, 19 Sep 2014 00:02:17 +0000 (17:02 -0700)]
netdev-dpdk: Pass queue id to dpdk_do_tx_copy().

Since dpdk_do_tx_copy() will be called by both pmd and
non-pmd thread, it should take the queue id as input.
The current ovs always uses NON_PMD_THREAD_TX_QUEUE
as queue id, which causes unprotected multi-access
to the same queue.

This commit fixes the issue by passing the queue id
to dpdk_do_tx_copy().

Reported-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agodatapath-windows: NetLink kernel side, Event subscription and notification
Eitan Eliahu [Thu, 18 Sep 2014 06:12:05 +0000 (23:12 -0700)]
datapath-windows: NetLink kernel side, Event subscription and notification

This code handles an event notification subscription for a user mode thread
which joins an MC group. The event wait handler queues an IRP which is
completed upon change in a port state.

Signed-off-by: Eitan Eliahu <eliahue@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
9 years agodpif-linux: Rename dpif-netlink; change to compile with MSVC.
Alin Gabriel Serdean [Thu, 18 Sep 2014 11:17:54 +0000 (04:17 -0700)]
dpif-linux: Rename dpif-netlink; change to compile with MSVC.

The patch contains the necessary modifications to compile and also to run
under MSVC.

Added the files to the build system and also changed dpif_linux to be under
a more generic name dpif_netlink.

Added a TODO under the windows part in case we want to implement another
counterpart for epoll functions.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto: Warn about excessive rule counts in OpenFlow tables.
Ethan Jackson [Wed, 17 Sep 2014 20:22:14 +0000 (13:22 -0700)]
ofproto: Warn about excessive rule counts in OpenFlow tables.

Frequently we've run into controller bugs which result in hundreds of
thousands, or even millions of rules being installed in an OpenFlow
table.  This isn't something trouble-shooters naturally think of to
check for, so it's nice to have a low rate warning message to hint at
the potential problem.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodpif-netdev: Store miniflow length in exact match cache
Daniele Di Proietto [Sat, 6 Sep 2014 08:10:43 +0000 (08:10 +0000)]
dpif-netdev: Store miniflow length in exact match cache

This optimization is done to avoid calling count_1bits(), which, if
the popcnt instruction is not available, might be slow. popcnt may not
be available because:

- We are running on old hardware
- (more likely) We're using a generic build (i.e. packaged OVS from a
  distro), not tuned for the specific CPU
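
The idea of caching the count can be sketched as below.  The names are
hypothetical, and a plain bit count stands in for the real miniflow
length; this is not the dpif-netdev code.

```c
#include <stdint.h>

/* Portable bit count used when no popcnt instruction is available.
 * It loops once per set bit, which is what the cache avoids paying on
 * every lookup. */
static unsigned int
count_1bits_slow(uint64_t x)
{
    unsigned int n = 0;

    while (x) {
        x &= x - 1;     /* Clear the lowest set bit. */
        n++;
    }
    return n;
}

/* Hypothetical cache entry: the length is computed once when the
 * entry is stored and then reused. */
struct cached_key {
    uint64_t map;
    unsigned int len;   /* Number of 1-bits in 'map'. */
};

static void
cached_key_init(struct cached_key *key, uint64_t map)
{
    key->map = map;
    key->len = count_1bits_slow(map);
}
```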

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agodpif-netdev: Introduce netdev_flow_key_* functions
Daniele Di Proietto [Sat, 6 Sep 2014 08:10:42 +0000 (08:10 +0000)]
dpif-netdev: Introduce netdev_flow_key_* functions

netdev_flow_key is a miniflow with the following constraints:

1) It is used only inside dpif-netdev.c.
2) It always has inline values.
3) It contains only miniflows created by miniflow_extract().

Therefore, by using these new functions instead of the miniflow_*
ones, we get the following (performance related) benefits:

- Because of (1) the functions can be inlined.
- Because of (2) and (3) the netdev_flow_key can be treated as POD.
  Specifically, because of (3), we can do comparisons with memcmp,
  since if the map is different the miniflow must be different.
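
To illustrate the memcmp point, here is a sketch with a made-up
fixed-size key.  struct flat_key is an assumption for the example and
does not mirror the real netdev_flow_key layout.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define KEY_MAX_VALUES 8

/* Hypothetical flat key: a bitmap plus inline values, with no pointers
 * and no padding, so it can be compared as raw bytes. */
struct flat_key {
    uint64_t map;
    uint32_t values[KEY_MAX_VALUES];
};

/* Zeroes the whole key first so that unused value slots compare equal. */
static void
flat_key_init(struct flat_key *key, uint64_t map,
              const uint32_t *values, size_t n)
{
    if (n > KEY_MAX_VALUES) {
        n = KEY_MAX_VALUES;
    }
    memset(key, 0, sizeof *key);
    key->map = map;
    memcpy(key->values, values, n * sizeof values[0]);
}

/* Byte-wise comparison: a different map or a different value makes
 * the keys unequal. */
static bool
flat_key_equal(const struct flat_key *a, const struct flat_key *b)
{
    return !memcmp(a, b, sizeof *a);
}
```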

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agoofproto-dpif-xlate: Wildcard skb_priority if QoS is disabled
Daniele Di Proietto [Wed, 17 Sep 2014 22:01:48 +0000 (15:01 -0700)]
ofproto-dpif-xlate: Wildcard skb_priority if QoS is disabled

This optimization should give a small performance benefit to the userspace
datapath.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agodatapath-windows/Netlink: Add optional flag in policy.
Ankur Sharma [Tue, 16 Sep 2014 01:18:05 +0000 (18:18 -0700)]
datapath-windows/Netlink: Add optional flag in policy.

Added the optional flag in the policy structure. This would allow
the caller to avoid checks for mandatory attributes if parsing
succeeds.
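
One way to picture the flag, as a sketch only: struct attr_policy and
attrs_satisfy_policy() are hypothetical and much simpler than the
datapath-windows Netlink parser.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-attribute policy entry. */
struct attr_policy {
    bool optional;      /* If false, the attribute must be present. */
};

/* Returns true if every mandatory attribute is present.  present[i]
 * says whether attribute i was found while parsing. */
static bool
attrs_satisfy_policy(const struct attr_policy policy[],
                     const bool present[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!present[i] && !policy[i].optional) {
            return false;
        }
    }
    return true;
}
```

When the check is done inside the parser like this, a successful parse
already implies that all mandatory attributes were found.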

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: Add support for getting 64 and 16 bit attributes.
Ankur Sharma [Tue, 16 Sep 2014 01:17:36 +0000 (18:17 -0700)]
datapath-windows: Add support for getting 64 and 16 bit attributes.

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows/Netlink: Added support for variable length attributes in validation.
Ankur Sharma [Tue, 16 Sep 2014 01:17:22 +0000 (18:17 -0700)]
datapath-windows/Netlink: Added support for variable length attributes in validation.

Added a minor fix to allow support for variable length attributes in
the parsing policy.

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoovs-dev.py: Support running the clang binaries.
Ethan Jackson [Fri, 5 Sep 2014 21:18:27 +0000 (14:18 -0700)]
ovs-dev.py: Support running the clang binaries.

They have slightly different support characteristics, so it's nice to
easily switch between them for testing.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agoovs-dev.py: Support additional optimization flags.
Ethan Jackson [Fri, 5 Sep 2014 20:53:31 +0000 (13:53 -0700)]
ovs-dev.py: Support additional optimization flags.

They may or may not make a difference, but there's no reason not to
support passing them.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agodatapath-windows: use the Netlink set API and need new APIs
Nithin Raju [Mon, 15 Sep 2014 18:38:02 +0000 (11:38 -0700)]
datapath-windows: use the Netlink set API and need new APIs

In this change:
1. we refactor the code that fills up information about the DP into
a separate function.
2. use the netlink set APIs to fill up the netlink attributes.
3. we define an OVS_DP_STATS to be a typedef of 'struct ovs_dp_stats'
in keeping with the Windows kernel naming conventions.
4. In the absence of netlink set API, I had put in an ASSERT earlier
that the output buffer should be limited to 512 bytes. This is not
true anymore. The netlink set API checks for bounds of the buffer.
Hence removed the ASSERT.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows: fix bug in NlBufCopyAtTailUninit
Nithin Raju [Mon, 15 Sep 2014 18:38:01 +0000 (11:38 -0700)]
datapath-windows: fix bug in NlBufCopyAtTailUninit

We should be returning value of tail before the increment
and not after.
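
The fix described above can be sketched as follows. This is a minimal
illustration only: 'struct sketch_buf' and sketch_buf_put_uninit() are
hypothetical names and do not reflect the real buffer layout or the
signature of NlBufCopyAtTailUninit.

```c
#include <stddef.h>

/* Hypothetical, simplified buffer; not the real datapath-windows layout. */
struct sketch_buf {
    char *tail;        /* First unused byte. */
    size_t remaining;  /* Bytes left starting at 'tail'. */
};

/* Reserves 'len' uninitialized bytes at the tail and returns a pointer to
 * the start of the reserved region, or NULL if there is not enough room. */
static char *
sketch_buf_put_uninit(struct sketch_buf *buf, size_t len)
{
    char *start;

    if (len > buf->remaining) {
        return NULL;
    }
    start = buf->tail;      /* Save the tail before advancing it. */
    buf->tail += len;
    buf->remaining -= len;
    return start;           /* Returning buf->tail here would be the bug. */
}
```

A caller that received the post-increment tail would write past the
region it had just reserved.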

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto-dpif-xlate: Suppress some warnings on non-Linux OSes
YAMAMOTO Takashi [Tue, 16 Sep 2014 03:45:42 +0000 (12:45 +0900)]
ofproto-dpif-xlate: Suppress some warnings on non-Linux OSes

These warnings were introduced by
commit 7d031d7e511aeea8dd45348922fe8e3bbdd2956e
("ofproto-dpif-xlate: Work around Linux netdev_max_backlog limit.")
and found by --enable-Werror build on NetBSD.

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agocompiler: Define NO_RETURN for MSVC.
Gurucharan Shetty [Mon, 15 Sep 2014 19:58:09 +0000 (12:58 -0700)]
compiler: Define NO_RETURN for MSVC.

To prevent warnings such as "Not all control paths return a value",
we should define NO_RETURN for MSVC.

Currently for gcc, we add NO_RETURN at the end of function declaration.
But for MSVC, "__declspec(noreturn)" is needed at the beginning of function
declaration. So this commit moves NO_RETURN to the beginning of the function
declaration as it works with gcc and clang too.
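
A rough sketch of the idea, assuming a hypothetical macro name
SKETCH_NO_RETURN (the actual NO_RETURN definition in the tree may be
written differently):

```c
#include <stdlib.h>

/* Hypothetical macro; illustrates the per-compiler spelling only. */
#if defined(_MSC_VER)
#define SKETCH_NO_RETURN __declspec(noreturn)
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_NO_RETURN __attribute__((__noreturn__))
#else
#define SKETCH_NO_RETURN
#endif

/* The attribute precedes the declaration, a position that MSVC, gcc and
 * clang all accept. */
SKETCH_NO_RETURN void sketch_fatal(int status);

void
sketch_fatal(int status)
{
    exit(status);
}

/* The compiler knows sketch_fatal() never returns, so it does not warn
 * that a control path falls off the end without returning a value. */
int
sketch_checked_div(int a, int b)
{
    if (b != 0) {
        return a / b;
    }
    sketch_fatal(EXIT_FAILURE);
}
```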

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoFix remaining "uninitialized local variable" used warning by MSVC.
Gurucharan Shetty [Mon, 15 Sep 2014 17:10:34 +0000 (10:10 -0700)]
Fix remaining "uninitialized local variable" used warning by MSVC.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoFix remaining "void function returning a value" warning by MSVC.
Gurucharan Shetty [Mon, 15 Sep 2014 17:04:32 +0000 (10:04 -0700)]
Fix remaining "void function returning a value" warning by MSVC.

MSVC complains about a void function returning a value if there is a
statement of the form - 'return foo()' even if foo() has a void return
type.
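
For illustration, the pattern and its usual rewrite look roughly like
this. The function names are made up for the example and are not taken
from the OVS sources.

```c
static int sketch_calls;

static void
sketch_log(void)
{
    sketch_calls++;
}

/* Form that triggers the warning (ISO C does not allow a return statement
 * with an expression in a void function, although gcc tolerates it):
 *
 *     static void sketch_wrapper(void) { return sketch_log(); }
 */
static void
sketch_wrapper(void)
{
    sketch_log();   /* Call the void function first... */
    return;         /* ...then return without an expression. */
}
```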

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoovs-atomic-msvc: Disable a compiler warning.
Gurucharan Shetty [Mon, 15 Sep 2014 15:41:14 +0000 (08:41 -0700)]
ovs-atomic-msvc: Disable a compiler warning.

MSVC does not support c11 style atomics for the C compiler.
Windows has different InterLocked* functions for different data
sizes.  ovs-atomic-msvc.h maps the api in ovs-atomic.h (which is similar
to c11 atomics) to the available atomic functions in Windows. In some
cases, this causes compiler warnings about mismatched data sizes because
the generated code has 'if else' conditions on different data sizes and
proper casting is not possible.

In current OVS code base, we get one compiler warning through ovs-rcu.h
which says "‘void *’ differs in levels of indirection from LONGLONG."
This comes from the following in ovs-atomic-msvc.h for atomic_read64():
*(DST) = InterlockedOr64((int64_t volatile *) (SRC), 0);
when *DST is a void pointer (because InterLockedOr64 returns LONGLONG).
But this code path is only ever hit for 64 bit data. So it should be safe to
disable the warning. (Any real bugs in api calls would hopefully be caught
while compiling on Linux using gcc/clang).

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Eitan Eliahu <eliahue@vmware.com>
9 years agonetdev-dpdk: Fix thread-safety breach.
Alex Wang [Mon, 15 Sep 2014 20:15:38 +0000 (13:15 -0700)]
netdev-dpdk: Fix thread-safety breach.

dpdk_eth_dev_init() must be called with dpdk_mutex.  However,
netdev_dpdk_set_multiq() fails to follow this rule.  This commit
fixes this breach.

Found by clang.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agonetdev-dpdk: Make get_config() report correct queue info.
Alex Wang [Mon, 15 Sep 2014 20:01:12 +0000 (13:01 -0700)]
netdev-dpdk: Make get_config() report correct queue info.

With the separation of tx queue and rx queue configuration
in netdev-dpdk module, the netdev_dpdk_get_config() can no
longer report 'n_rxq' as tx queue configuration.

This commit fixes the above issue.

Reported-by: Daniele Di Proietto <ddiproietto@vmware.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
9 years agodpif-netdev: Create multiple pmd threads by default.
Alex Wang [Fri, 5 Sep 2014 21:14:20 +0000 (14:14 -0700)]
dpif-netdev: Create multiple pmd threads by default.

With this commit, ovs by default will create one pmd thread
for each numa node and pin the pmd thread to available cpu
core on the numa node.

NON_PMD_CORE_ID (currently 0) is used to reserve a particular
cpu core for the I/O of all non-pmd threads.  No pmd thread
can be pinned to this reserved core.

As side-effects of this commit:

-  pmd thread will not be created, if there is no dpdk interface
   from the corresponding numa node added to ovs.

- the exact-match cache for non-pmd threads is removed from
  'struct dp_netdev'.  Instead, all non-pmd threads will use
  the exact-match cache defined in the 'struct dp_netdev_pmd_thread'
  for NON_PMD_CORE_ID.

- the rx packet processing functions are refactored to use
  'struct dp_netdev_pmd_thread' as input.

- the 'netdev_send()' function will be called with the proper
  queue id.

- both pmd and non-pmd threads can call dpif_netdev_execute(),
  so a per-thread key is used to help recognize the calling thread.
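
The per-thread key mentioned in the last item could look roughly like
the sketch below. It uses POSIX thread-specific data as an assumption;
the names are hypothetical and the commit's own helper functions and
data structures may differ.

```c
#include <pthread.h>
#include <stddef.h>

#define SKETCH_NON_PMD_CORE_ID 0   /* Mirrors NON_PMD_CORE_ID (currently 0). */

static pthread_key_t sketch_core_key;
static pthread_once_t sketch_once = PTHREAD_ONCE_INIT;

static void
sketch_key_init(void)
{
    pthread_key_create(&sketch_core_key, NULL);
}

/* Called by a pmd thread at startup; '*core_id' must outlive the thread. */
static void
sketch_set_core(const int *core_id)
{
    pthread_once(&sketch_once, sketch_key_init);
    pthread_setspecific(sketch_core_key, core_id);
}

/* Threads that never registered a core id are treated as non-pmd threads
 * and map to the reserved core. */
static int
sketch_current_core(void)
{
    const int *core_id;

    pthread_once(&sketch_once, sketch_key_init);
    core_id = pthread_getspecific(sketch_core_key);
    return core_id ? *core_id : SKETCH_NON_PMD_CORE_ID;
}
```

With something like this, a shared entry point can look up the
per-thread state for the calling thread instead of taking it as an
argument.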

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetdev-dpdk: Remove the tx queue spinlock.
Alex Wang [Fri, 5 Sep 2014 17:56:18 +0000 (10:56 -0700)]
netdev-dpdk: Remove the tx queue spinlock.

The previous commit makes OVS create one tx queue for each
cpu core, each pmd thread will use a separate tx queue.
Also, tx of non-pmd threads on dpdk interface is all through
'NON_PMD_THREAD_TX_QUEUE', protected by the 'nonpmd_mempool_mutex'.
Therefore, the spinlock is no longer needed.  And this commit
removes it from 'struct dpdk_tx_queue'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetdev-dpdk: Add indicator for flushing tx queue.
Alex Wang [Thu, 4 Sep 2014 20:09:22 +0000 (13:09 -0700)]
netdev-dpdk: Add indicator for flushing tx queue.

Previous commit makes OVS create one tx queue for each cpu
core.  An upcoming patch will allow multiple pmd threads to be
created and pinned to cpu cores.  So each pmd thread will use
the tx queue corresponding to its core id.

Moreover, the pmd threads running on a different numa node than
the dpdk interface (called non-local pmd threads) will not
handle the rx of the interface.  Consequently, there needs to
be a way to flush the tx queues of the non-local pmd threads.

To address the queue flushing issue, this commit introduces a
new flag 'flush_tx' in the 'struct dpdk_tx_queue' which is
set if the queue is to be used by a non-local pmd thread.
Then, when enqueueing the tx pkts, if the flag is set, the tx
queue will always be flushed immediately after the enqueue.
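
A simplified model of the enqueue path described above. Only the
'flush_tx' flag name comes from this commit message; the struct layout,
the batch size and the function names are assumptions made for the
example.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BURST 4   /* Hypothetical batch size. */

struct sketch_tx_queue {
    bool flush_tx;               /* Set if used by a non-local pmd thread. */
    int pkts[SKETCH_BURST];      /* Stand-in for queued packets. */
    size_t count;                /* Packets currently queued. */
    size_t sent;                 /* Packets handed to the device so far. */
};

static void
sketch_flush(struct sketch_tx_queue *txq)
{
    txq->sent += txq->count;
    txq->count = 0;
}

/* Queues one packet, then flushes when the batch is full or when the
 * queue is marked 'flush_tx'. */
static void
sketch_enqueue(struct sketch_tx_queue *txq, int pkt)
{
    txq->pkts[txq->count++] = pkt;
    if (txq->flush_tx || txq->count == SKETCH_BURST) {
        sketch_flush(txq);
    }
}
```

A queue without the flag keeps batching, while a flagged queue never
holds packets across calls.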

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodpif-netdev: Create multiple tx/rx queues when adding dpdk interface.
Alex Wang [Tue, 17 Jun 2014 17:52:20 +0000 (10:52 -0700)]
dpif-netdev: Create multiple tx/rx queues when adding dpdk interface.

Before this commit, ovs creates one tx and one rx queue for
each dpdk interface and uses only one poll thread for handling
I/O of all dpdk interfaces.  An upcoming patch will allow multiple
poll threads to be created.  As preparation, this commit changes
the dpif-netdev to create multiple tx/rx queues when the dpdk
interface is added.

Specifically, the number of rx queues will still be one per dpdk
interface for this commit.  But upcoming work will allow the user
to create multiple rx queues.  The number of tx queues will be the
number of cpu cores on the machine.  Although not all the tx queues
will be used, each poll thread will have its own queue for
transmission on the dpdk interface.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetdev: Add function for configuring tx and rx queues.
Alex Wang [Mon, 8 Sep 2014 21:52:54 +0000 (14:52 -0700)]
netdev: Add function for configuring tx and rx queues.

This commit adds a new API to the 'struct netdev_class' which
allows user to configure the number of tx queues and rx queues
of 'netdev'.  Upcoming patches will use this function to set
multiple tx/rx queues when adding the netdev to dpif-netdev.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoofproto: Do not update stats on fake bond interface.
Pravin B Shelar [Fri, 12 Sep 2014 23:00:50 +0000 (16:00 -0700)]
ofproto: Do not update stats on fake bond interface.

There are a couple of reasons to remove this support:
*   This is used in very old OVS use-case. It is much better
    to read stats directly from OVS.
*   Forthcoming commit will remove support for setting stats
    for vport. The stats update depends on stats-set.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: Improve robustness of this_cpu_ptr definition in compat layer
Andy Zhou [Wed, 10 Sep 2014 22:36:06 +0000 (15:36 -0700)]
datapath: Improve robustness of this_cpu_ptr definition in compat layer

Current autoconfig detection logic for HAVE_PER_CPU_PTR is not robust.
Depending on the Linux kernel version, the definition can be in either
linux/percpu.h or asm/percpu.h.

Turns out it is simpler and safer to handle missing percpu.h
definitions in linux/percpu.h rather than asm/percpu.h. With this
change, there is no need for the autoconfig detection logic above.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoovs-dev.py: do not pass --enable-dummy to ovsdb
Daniele Di Proietto [Sat, 13 Sep 2014 01:35:10 +0000 (01:35 +0000)]
ovs-dev.py: do not pass --enable-dummy to ovsdb

--enable-dummy was useless anyway for ovsdb-server. Now it is an error to pass
it.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agoofproto: Increase default datapath max_idle time.
Joe Stringer [Fri, 12 Sep 2014 06:03:56 +0000 (06:03 +0000)]
ofproto: Increase default datapath max_idle time.

The datapath max_idle value determines how long to wait before deleting
an idle datapath flow when operating below the flow_limit. This patch
increases the max_idle to 10 seconds, which allows datapath flows to
remain cached even if they are used less consistently, and provides a
small improvement in the supported number of flows when operating around
the flow_limit.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
9 years agodatapath: Add IS_ERR_OR_NULL for backward compatibility.
Pravin B Shelar [Fri, 12 Sep 2014 23:03:34 +0000 (16:03 -0700)]
datapath: Add IS_ERR_OR_NULL for backward compatibility.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoopenvswitch: rename ->sync to ->syncp
WANG Cong [Fri, 12 Sep 2014 21:12:24 +0000 (14:12 -0700)]
openvswitch: rename ->sync to ->syncp

Openvswitch defines u64_stats_sync as ->sync rather than ->syncp,
so fails to compile with netdev_alloc_pcpu_stats(). So just rename it to ->syncp.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 1c213bd24ad04f4430031 (net: introduce netdev_alloc_pcpu_stats() for drivers)
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: introduce netdev_alloc_pcpu_stats() for drivers
WANG Cong [Fri, 12 Sep 2014 21:05:11 +0000 (14:05 -0700)]
datapath: introduce netdev_alloc_pcpu_stats() for drivers

There are many drivers calling alloc_percpu() to allocate pcpu stats
and then initializing ->syncp. So just introduce a helper function for them.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: Use IS_ERR_OR_NULL
Himangi Saraogi [Fri, 12 Sep 2014 18:34:04 +0000 (11:34 -0700)]
datapath: Use IS_ERR_OR_NULL

This patch introduces the use of the macro IS_ERR_OR_NULL in place of
tests for NULL and IS_ERR.

The following Coccinelle semantic patch was used for making the change:

@@
expression e;
@@

- e == NULL || IS_ERR(e)
+ IS_ERR_OR_NULL(e)
 || ...
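
As a userspace model of what the combined test checks, assuming the
kernel convention that the top 4095 pointer values encode error codes.
The sketch_* names are hypothetical and stand in for the kernel's
ERR_PTR, IS_ERR and IS_ERR_OR_NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095   /* Bound assumed for encoded error values. */

/* Encodes a negative error code as a pointer value. */
static void *
sketch_err_ptr(long error)
{
    return (void *) (intptr_t) error;
}

static bool
sketch_is_err(const void *ptr)
{
    return (uintptr_t) ptr >= (uintptr_t) -SKETCH_MAX_ERRNO;
}

/* One call replaces the two-part test "e == NULL || IS_ERR(e)". */
static bool
sketch_is_err_or_null(const void *ptr)
{
    return ptr == NULL || sketch_is_err(ptr);
}
```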

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodatapath: fix duplicate #include headers
Jean Sacren [Fri, 12 Sep 2014 18:31:27 +0000 (11:31 -0700)]
datapath: fix duplicate #include headers

The #include headers net/genetlink.h and linux/genetlink.h both were
included twice, so delete each of the duplicates.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: dev@openvswitch.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: Replace rcu_dereference() with rcu_access_pointer()
Andreea-Cristina Bernat [Fri, 12 Sep 2014 18:26:01 +0000 (11:26 -0700)]
datapath: Replace rcu_dereference() with rcu_access_pointer()

The "rcu_dereference()" call is used directly in a condition.
Since its return value is never dereferenced it is recommended to use
"rcu_access_pointer()" instead of "rcu_dereference()".
Therefore, this patch makes the replacement.

The following Coccinelle semantic patch was used:
@@
@@

(
 if(
 (<+...
- rcu_dereference
+ rcu_access_pointer
  (...)
  ...+>)) {...}
|
 while(
 (<+...
- rcu_dereference
+ rcu_access_pointer
  (...)
  ...+>)) {...}
)

Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetdev: Add n_txq to 'struct netdev'.
Alex Wang [Wed, 3 Sep 2014 21:37:35 +0000 (14:37 -0700)]
netdev: Add n_txq to 'struct netdev'.

This commit adds new variable n_txq to 'struct netdev' for recording
the number of tx queues.  Correspondingly, the send_*() functions are
extended to accept queue id as input argument.

All 'netdev-*' implementations will ignore the queue id since having
multiple tx queues is not supported.  Upcoming patches will start
using it and create multiple tx queues for dpdk netdev.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetdev: Add function for getting the numa node id of netdev.
Alex Wang [Wed, 11 Jun 2014 23:33:08 +0000 (16:33 -0700)]
netdev: Add function for getting the numa node id of netdev.

This commit adds a new API to the 'struct netdev_class' which
allows user to query the numa node id the 'netdev' is on.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoovs-rcu: Make ovsrcu_quiesce() flush the callback event set.
Alex Wang [Tue, 9 Sep 2014 18:01:52 +0000 (11:01 -0700)]
ovs-rcu: Make ovsrcu_quiesce() flush the callback event set.

On current master, the per-thread callback event set is flushed
when ovsrcu_quiesce_start() is called or when the callback
event set is full.  For threads that only call 'ovsrcu_quiesce()'
to indicate quiescent state, their callback event set will not
be flushed for execution until the set is full.  And this could
take a very long time.

Theoretically, this should not be an issue, since rcu postponed
callback events should only free the old version of objects.
However, current ovs does not follow this rule, and some callback
events include other activities like unregistering the netdev
from global name-netdev map.  The delay of unregistering the netdev
(by threads that only call ovsrcu_quiesce()) will indefinitely
prevent the same netdev from being recreated.

As a short-term workaround, this commit makes every call to
ovsrcu_quiesce() flush the callback event set.  In the long run,
there will be a refactor of the use of ovs-rcu module, in which all
callback events only free the old version of objects.
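
A toy model of the workaround, not the real ovs-rcu code: the struct,
its capacity and the function names are all hypothetical, and "flush"
here only counts callbacks as handed off rather than running them after
a grace period.

```c
#include <stddef.h>

#define SKETCH_CBSET_MAX 16          /* Hypothetical capacity. */

struct sketch_cbset {
    void (*cbs[SKETCH_CBSET_MAX])(void *);
    void *auxs[SKETCH_CBSET_MAX];
    size_t n_pending;                /* Callbacks still held by this thread. */
    size_t n_handed_off;             /* Callbacks passed on for execution. */
};

/* Stand-in for handing the set to the code that runs callbacks once the
 * grace period has elapsed. */
static void
sketch_flush(struct sketch_cbset *set)
{
    set->n_handed_off += set->n_pending;
    set->n_pending = 0;
}

static void
sketch_postpone(struct sketch_cbset *set, void (*cb)(void *), void *aux)
{
    set->cbs[set->n_pending] = cb;
    set->auxs[set->n_pending] = aux;
    if (++set->n_pending == SKETCH_CBSET_MAX) {
        sketch_flush(set);           /* Flush when the set is full. */
    }
}

static void
sketch_quiesce(struct sketch_cbset *set)
{
    if (set->n_pending) {
        sketch_flush(set);           /* Added behavior: flush on every call. */
    }
    /* A real implementation would now announce the quiescent state. */
}
```

Without the flush in sketch_quiesce(), a thread that postpones only a
few callbacks would hold them until the set eventually fills up.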

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoNetlink_socket.c Join/Unjoin an MC group for event subscription
Eitan Eliahu [Thu, 11 Sep 2014 17:01:02 +0000 (10:01 -0700)]
Netlink_socket.c Join/Unjoin an MC group for event subscription

Use a specific out of band device control to subscribe/unsubscribe a socket
to the driver event queue for notification.

Signed-off-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows/Netlink: Nested attributes put/parse.
Ankur Sharma [Thu, 11 Sep 2014 00:36:22 +0000 (17:36 -0700)]
datapath-windows/Netlink: Nested attributes put/parse.

Added APIs for creating and parsing nested netlink attributes.
The APIs follow the same lines as the userspace netlink code.

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath-windows/NetlinkBuf.h: Added NlBufSize
Ankur Sharma [Wed, 10 Sep 2014 23:20:16 +0000 (16:20 -0700)]
datapath-windows/NetlinkBuf.h: Added NlBufSize

Added an inline function to return used size in the buffer.

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodebian: Don't depend on $RUNLEVEL at startup to create bridges.
Gurucharan Shetty [Thu, 11 Sep 2014 16:35:10 +0000 (09:35 -0700)]
debian: Don't depend on $RUNLEVEL at startup to create bridges.

Commit b2a0daa5bd (debian: Don't recreate bridges during manual restart.)
added a check on $RUNLEVEL to only create bridges and ports when the
system starts up. This fix does not work with systemd.

This commit uses a different approach to solve the same problem.

Reported-at: https://bugs.debian.org/686518
Reported-by: Philipp S. Schmidt <phils@in-panik.de>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Tested-by: Philipp S. Schmidt <phils@in-panik.de>
9 years agoAvoid uninitialized variable warnings with OBJECT_OFFSETOF() in MSVC.
Gurucharan Shetty [Tue, 9 Sep 2014 21:23:07 +0000 (14:23 -0700)]
Avoid uninitialized variable warnings with OBJECT_OFFSETOF() in MSVC.

Implementation of OBJECT_OFFSETOF() for non-GNUC compilers like MSVC
causes "uninitialized variable" warnings. Since OBJECT_OFFSETOF() is
indirectly used through all the *_FOR_EACH() (through ASSIGN_CONTAINER()
and  OBJECT_CONTAINING()) macros, the OVS build
on Windows gets littered with "uninitialized variable" warnings.
This patch attempts to work around the problem.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agounixctl: Make command description all lowercase.
Alex Wang [Fri, 22 Aug 2014 23:27:22 +0000 (16:27 -0700)]
unixctl: Make command description all lowercase.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoovsdb-server: Remove the 'enable-dummy' option.
Alex Wang [Thu, 21 Aug 2014 20:54:58 +0000 (13:54 -0700)]
ovsdb-server: Remove the 'enable-dummy' option.

There is no use case for this option in ovsdb-server.  Also,
it causes the dpif-dummy and netdev-dummy modules to register unrelated
unixctl commands.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto-dpif: Probe for userdata after backer is fully operational.
Jarno Rajahalme [Thu, 11 Sep 2014 20:27:29 +0000 (13:27 -0700)]
ofproto-dpif: Probe for userdata after backer is fully operational.

When probing for variable length userdata before handler threads are
set, the pid included in the userspace action will be 0, which is
flagged as an error by the linux kernel datapath.  As a result the
feature probe will produce an unnecessary log message.  By probing for
variable length userdata later the probe works as intended and the
unnecessary log message is avoided.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agohash.h: Avoid compiler warnings with MSVC.
Gurucharan Shetty [Tue, 9 Sep 2014 21:16:16 +0000 (14:16 -0700)]
hash.h: Avoid compiler warnings with MSVC.

The lack of 'const' in function declaration causes MSVC to complain
because the function definition uses it.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoovs-ofctl: Workaround a compiler warning on MSVC.
Gurucharan Shetty [Tue, 9 Sep 2014 21:12:57 +0000 (14:12 -0700)]
ovs-ofctl: Workaround a compiler warning on MSVC.

MSVC complains about a void function returning a value if there is a
statement of the form - 'return foo()' even if foo() has a void return
type.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agotravis: Fix DPDK build and treat bad-function-cast warning as non-error
Thomas Graf [Thu, 11 Sep 2014 19:34:22 +0000 (21:34 +0200)]
travis: Fix DPDK build and treat bad-function-cast warning as non-error

A missing " prevented the DPDK build in the matrix from functioning
so far. This patch enables the DPDK build by properly building DPDK
as a single library and by pointing the OVS build to the corresponding
build directory. Also removes the 'make install' as it is not required
and only slows down the build.

Due to incorrect casts in the DPDK headers, we have to disable
bad-function-cast and cast-align warnings as being treated as errors
for now.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Co-authored-by: Daniele Di Proietto <ddiproietto@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agobuild: Respect CFLAGS and LDFLAGS passed to make
Thomas Graf [Thu, 11 Sep 2014 19:34:21 +0000 (21:34 +0200)]
build: Respect CFLAGS and LDFLAGS passed to make

configure cannot expect that the user will not pass additional CFLAGS
and LDFLAGS at make time [0]. Use OVS_CFLAGS and OVS_LDFLAGS instead to
collect compiler and linker flags and substitute in Makefile.am.

This allows for:
./configure --with-dpdk=[...]
make CFLAGS=-Wno-error=foo

[0] http://www.gnu.org/software/automake/manual/html_node/Flag-Variables-Ordering.html

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: Add this_cpu_{read, inc, dec} APIs for backward compatibility
Andy Zhou [Wed, 10 Sep 2014 20:22:08 +0000 (13:22 -0700)]
datapath: Add this_cpu_{read, inc, dec} APIs for backward compatibility

The upstream module uses this_cpu_xxx APIs. Add those functions for
older kernels (<3.0.0) that do not provide them.

VMware-BZ: #1319082

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetlink-socket: Convert from error number to string correctly.
Gurucharan Shetty [Tue, 9 Sep 2014 18:55:45 +0000 (11:55 -0700)]
netlink-socket: Convert from error number to string correctly.

As mentioned in the comment above the function ovs_strerror(), it
should not be used to convert WINAPI error numbers to string.
Use ovs_lasterror_to_string() instead.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
9 years agodatapath: Backport __ip_select_ident() function
Pravin B Shelar [Wed, 25 Sep 2013 01:42:43 +0000 (18:42 -0700)]
datapath: Backport __ip_select_ident() function

The definition of __ip_select_ident() changed in newer kernels and
was backported to stable kernels. Therefore, add a configure
check to detect the new function.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agoopenvswitch.h: Fix the type of struct ovs_key_nd nd_target field.
Jarno Rajahalme [Wed, 10 Sep 2014 20:02:46 +0000 (13:02 -0700)]
openvswitch.h: Fix the type of struct ovs_key_nd nd_target field.

Should be the same as other IPv6 address fields.

Current master produces sparse warnings without this change.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>