cascardo/ovs.git
10 years agodatapath: Fix ovs_flow_free() ovs-lock assert.
Pravin B Shelar [Tue, 28 Jan 2014 02:18:33 +0000 (18:18 -0800)]
datapath: Fix ovs_flow_free() ovs-lock assert.

ovs_flow_free() is not called under ovs-lock during packet
execute path (ovs_packet_cmd_execute()). Since packet execute
does not touch flow->mask, there is no need to take that
lock either. So move the assert into the case where flow->mask is checked.

Found by code inspection.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agobridge: Set ofport column in every database transaction.
Ben Pfaff [Fri, 31 Jan 2014 00:57:16 +0000 (16:57 -0800)]
bridge: Set ofport column in every database transaction.

Database transactions can occasionally fail due to concurrent changes in
the database.  When that happens, the next transaction should repeat the
changes that ovs-vswitchd tried to make the first time (adjusted for the
changes to the database).

The code to report the OpenFlow port number in use didn't do that.  It set
the ofport field once when it created the port and never set it again, even
if the transaction to set it failed.  This commit fixes the problem.

Bug #23047.
Reported-by: Suganya Ramachandran <suganyar@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agoofproto-dpif-upcall: Hardcode max_idle to 1500ms.
Ethan Jackson [Tue, 28 Jan 2014 00:40:27 +0000 (16:40 -0800)]
ofproto-dpif-upcall: Hardcode max_idle to 1500ms.

Before this patch, OVS tried to guess an optimal max idle time for
datapath flows based on the number of datapath flows relative to the
limit.  This caused instability because the limit was based on the
dump duration which was affected by the max idle time.  This patch
chooses instead to hardcode the max idle time to 1.5s except in
extreme case where the datapath flow limit is exceeded.  1.5s was
chosen to ensure pings occurring at once per second stay cached in the
datapath.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
10 years agodatapath: Fix ovs_dp_cmd_msg_size()
Daniele Di Proietto [Thu, 23 Jan 2014 16:18:59 +0000 (17:18 +0100)]
datapath: Fix ovs_dp_cmd_msg_size()

commit c58cc9a460fd158e5250e59902e96ac677dc115f (datapath: Allow user space to
announce ability to accept unaligned Netlink messages) introduced
OVS_DP_ATTR_USER_FEATURES netlink attribute in datapath responses,
but the attribute size was not taken into account in ovs_dp_cmd_msg_size().

Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoipsec: install iptables rules that set IPsec bit in skb mark
Ansis Atteka [Tue, 21 Jan 2014 01:16:39 +0000 (17:16 -0800)]
ipsec: install iptables rules that set IPsec bit in skb mark

Without these two iptables rules (one for UDP encapsulated IPsec and
another for direct IPsec), ovs-vswitchd would incorrectly conclude
that GRE packet belonged to a plain GRE tunnel instead of IPsec GRE
tunnel.

Reported-by: Aryan TaheriMonfared <aryan.taherimonfared@uis.no>
Reported-by: Daniel Hiltgen <daniel@netkine.com>
Signed-off-by: Ansis Atteka <aatteka@nicira.com>
10 years agobfd: Add bfd_src_ip and bfd_dst_ip.
Alex Wang [Tue, 21 Jan 2014 22:23:27 +0000 (14:23 -0800)]
bfd: Add bfd_src_ip and bfd_dst_ip.

This commit adds two new options, bfd_src_ip and bfd_dst_ip,
which allow the user to configure the source and destination
IP addresses of bfd control packets.  If a user-specified
address cannot be parsed, the default address will be used.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoupcall: Cache the number of flows from the datapath.
Joe Stringer [Wed, 22 Jan 2014 06:50:49 +0000 (06:50 +0000)]
upcall: Cache the number of flows from the datapath.

Fetching the number of flows in the datapath has been causing
unnecessary contention on the kernel ovs_lock in recent TCP CRR tests.
This patch caches this number for up to 100ms in the userspace to reduce
such kernel calls.
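
The caching idea can be sketched as follows. This is a minimal illustration in Python, not the code from the patch: the class name, the `fetch` callback and the injectable `clock` are all hypothetical, and only the 100ms maximum age comes from the commit message.

```python
import time
from typing import Callable, Optional


class CachedFlowCount:
    """Cache an expensive counter for a short time (hypothetical helper)."""

    def __init__(self, fetch: Callable[[], int], max_age: float = 0.1,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # expensive call, e.g. a datapath query
        self._max_age = max_age    # 0.1 s mirrors the 100ms in the message
        self._clock = clock        # injectable so the cache can be tested
        self._value: Optional[int] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> int:
        """Return the cached value, refreshing it once it is too old."""
        now = self._clock()
        stale = (self._fetched_at is None
                 or now - self._fetched_at >= self._max_age)
        if stale:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

Callers that ask for the count many times within the window share a single expensive fetch, which is the contention reduction the message describes.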

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Co-authored-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agovlog: Avoid deadlock in vlog_init__() corner case.
Ben Pfaff [Fri, 6 Dec 2013 00:59:13 +0000 (16:59 -0800)]
vlog: Avoid deadlock in vlog_init__() corner case.

Anything inside vlog_init__() that tried to log a message was going to
deadlock, since it would hit pthread_once() recursively and wait for the
previous call to complete.  Unfortunately, there was a VLOG_ERR call inside
vlog_init__(), only called in the corner case where the system's clock was
wrong.

This fixes the problem by rearranging code so that the logging isn't
inside the "once-only" region.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Fix kernel panic on ovs_flow_free
Andy Zhou [Fri, 10 Jan 2014 23:57:04 +0000 (15:57 -0800)]
datapath: Fix kernel panic on ovs_flow_free

Both mega flow mask's reference counter and per flow table mask list
should only be accessed when holding ovs_mutex() lock. However
this is not true with ovs_flow_table_flush(). The patch fixes this bug.

Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
10 years agodatapath: Pad OVS_PACKET_ATTR_PACKET if linear copy was performed
Thomas Graf [Tue, 14 Jan 2014 09:27:02 +0000 (01:27 -0800)]
datapath: Pad OVS_PACKET_ATTR_PACKET if linear copy was performed

While the zerocopy method is correctly omitted if user space
does not support unaligned Netlink messages, the attribute is
still not padded correctly, as skb_zerocopy() will not ensure
padding and the attribute size is no longer precalculated
through nla_reserve(), which ensured padding previously.

This patch applies appropriate padding if a linear data copy
was performed in skb_zerocopy().

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto-dpif-xlate: Avoid recursive acquisition of xlate_rwlock.
YAMAMOTO Takashi [Wed, 15 Jan 2014 18:06:40 +0000 (10:06 -0800)]
ofproto-dpif-xlate: Avoid recursive acquisition of xlate_rwlock.

Currently xlate_rwlock is recursively acquired.
(xlate_send_packet -> ofproto_dpif_execute_actions -> xlate_actions)
Due to writer-preference in rwlock implementations, this causes
deadlock if another thread tries to acquire the lock exclusively
behind us.

This change avoids the problem by making xlate_send_packet drop
the lock before calling ofproto_dpif_execute_actions.  This is the
simplest fix but opens a race window against port reconfigurations.
Given the way xlate_send_packet is currently used, the race does not
seem a big problem.  An alternative would be passing down the
"xlate_rwlock is held" info to ofproto_dpif_execute_actions.

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Fix a whitespace error.
YAMAMOTO Takashi [Wed, 15 Jan 2014 03:41:22 +0000 (12:41 +0900)]
ofproto-dpif-xlate: Fix a whitespace error.

No functional changes.

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agofat-rwlock: Don't forget to destroy a mutex
YAMAMOTO Takashi [Wed, 15 Jan 2014 03:41:21 +0000 (12:41 +0900)]
fat-rwlock: Don't forget to destroy a mutex

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoclassifier: Use fat_rwlock instead of ovs_rwlock.
Ben Pfaff [Mon, 13 Jan 2014 19:21:12 +0000 (11:21 -0800)]
classifier: Use fat_rwlock instead of ovs_rwlock.

Jarno Rajahalme reported up to 40% performance gain on netperf TCP_CRR with
an earlier version of this patch in combination with a kernel NUMA patch,
together with a reduction in variance:
    http://openvswitch.org/pipermail/dev/2014-January/035867.html

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agofat-rwlock: New big but fast synchronization primitive.
Ben Pfaff [Mon, 13 Jan 2014 19:17:55 +0000 (11:17 -0800)]
fat-rwlock: New big but fast synchronization primitive.

This implements a reader-writer lock that uses a lot of memory (128 to 192
bytes per thread that takes the lock) but avoids cache line bouncing when
taking the read side.  Thus, a fat_rwlock is a good choice for rwlocks
taken frequently by readers.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoovs-thread: Add new support for thread-specific data.
Ben Pfaff [Tue, 14 Jan 2014 22:35:48 +0000 (14:35 -0800)]
ovs-thread: Add new support for thread-specific data.

A couple of times I've wanted to create a dynamic data structure that has
thread-specific data, but I've not been able to do that because
PTHREAD_KEYS_MAX is so low (POSIX says at least 128, glibc is only a little
bigger at 1024).  This commit introduces a new form of thread-specific data
that supports a large number of items.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Un-wildcard nw_frag only for protocols that have fragments.
Ben Pfaff [Fri, 10 Jan 2014 23:17:43 +0000 (15:17 -0800)]
ofproto-dpif: Un-wildcard nw_frag only for protocols that have fragments.

The revalidator code in ofproto-dpif-upcall.c, in revalidate_ukey(),
deletes any datapath flow for which the kernel reports wildcarded bits
that userspace requires to be matched.  Until now, a couple of pieces of
code in ofproto-dpif always marked nw_frag (which tracks whether a packet
is an IPv4 or IPv6 fragment) as exact-match.  For non-IP protocols, this
wasn't meaningful, so adding such a flow to the datapath and then receiving
it back caused nw_frag to become wildcarded, so revalidate_ukey() always
deleted them.

This fixes the problem by only un-wildcarding nw_frag for protocols where
it is defined (IPv4 and IPv6).

Noticed while observing CFM traffic (which isn't IP based) over a tunnel.

Reported-by: Guolin Yang <gyang@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agotunnel: Un-wildcard only flags that really exist in tnl_xlate_init().
Ben Pfaff [Fri, 10 Jan 2014 23:14:27 +0000 (15:14 -0800)]
tunnel: Un-wildcard only flags that really exist in tnl_xlate_init().

The revalidator code in ofproto-dpif-upcall.c, in revalidate_ukey(),
deletes any datapath flow for which the kernel reports wildcarded bits
that userspace requires to be matched.  Until now, tnl_xlate_init() marked
every bit in the tunnel flags as required to be matched.  Since most of
those bits don't actually have defined flags, adding such a flow to the
datapath and then receiving it back caused those bits to become wildcarded,
which meant that revalidate_ukey() always deleted them.

This fixes the problem by only un-wildcarding defined flags.

Reported-by: Guolin Yang <gyang@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-upcall: Avoid unnecessarily installing datapath flows.
Ben Pfaff [Mon, 13 Jan 2014 23:33:27 +0000 (15:33 -0800)]
ofproto-dpif-upcall: Avoid unnecessarily installing datapath flows.

handle_upcalls() always installed a flow for every packet, as long as
the datapath didn't already have too many flows, but there are cases where
we don't want to do this:

    - If we get multiple packets in a single microflow all in one batch
      (perhaps due to GSO breaking up a large TCP packet for sending to
      userspace, or for another reason), then we only need to install the
      datapath flow once.

    - For a slow-pathed flow received via a slow-path action in the kernel,
      we know that the kernel flow is already there (because otherwise it
      would have been received as "no match" instead of an action), so
      there is no benefit to reinstalling it.

Noticed because a CFM slow-pathed flow was getting reinstalled every time
a CFM packet was received.
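
The two cases above can be sketched as a filter over a batch of upcalls. This is a simplified Python illustration under stated assumptions, not the handle_upcalls() code: the `Upcall` tuple, its fields and `flows_to_install()` are hypothetical names, and the datapath flow limit check is left out.

```python
from typing import Iterable, List, NamedTuple


class Upcall(NamedTuple):
    flow_key: str   # stand-in for the microflow key
    is_miss: bool   # True for "no match", False for a slow-path action


def flows_to_install(batch: Iterable[Upcall]) -> List[str]:
    """Return the flow keys that need a datapath install, once each."""
    seen = set()
    installs = []
    for upcall in batch:
        if not upcall.is_miss:
            # Slow-path action: the kernel flow already exists.
            continue
        if upcall.flow_key in seen:
            # Same microflow earlier in this batch: install only once.
            continue
        seen.add(upcall.flow_key)
        installs.append(upcall.flow_key)
    return installs
```

For a batch holding two misses for flow "a", one slow-path upcall for "cfm" and one miss for "b", the sketch returns only "a" and "b".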

Reported-by: Guolin Yang <gyang@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agosFlow: clear the padding
Francesco Fusco [Thu, 19 Dec 2013 17:16:24 +0000 (18:16 +0100)]
sFlow: clear the padding

putString pads the string to the 4-byte boundary without
clearing the "padded" memory. This patch simply sets the
padding to zero.
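
The padding arithmetic can be sketched like this. It is a Python illustration, not the sFlow C code: `put_string()` here is a hypothetical stand-in that assumes a 4-byte big-endian length prefix followed by the string bytes, and it writes the pad bytes explicitly as zeros.

```python
import struct


def put_string(buf: bytearray, data: bytes) -> None:
    """Append a length-prefixed string, zero-padded to a 4-byte boundary."""
    buf.extend(struct.pack("!I", len(data)))  # 4-byte big-endian length
    buf.extend(data)
    pad = -len(data) % 4                      # 0..3 bytes to the boundary
    buf.extend(bytes(pad))                    # bytes(n) is n zero bytes
```

In C the pad bytes are whatever was already in the buffer unless they are cleared, which is why the explicit zeroing matters there.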

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoUpdate build requirements.
Ben Pfaff [Tue, 31 Dec 2013 22:23:34 +0000 (14:23 -0800)]
Update build requirements.

Libtool is now required as of commit 38b7a52b61 (openvswitch: Use libtool
and allow building shared libs).

It seems that a build requirement for Python slipped in a while back, for
generating ovs-vswitchd.conf.db.5, and no one complained, so we might as
well make it official.  (That will let us simplify some bits of the build,
too, since they won't have to be conditional on Python anymore, so I'm all
in favor of this change.)

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agobfd: Fix cpath_down set failure.
Alex Wang [Thu, 9 Jan 2014 02:51:43 +0000 (18:51 -0800)]
bfd: Fix cpath_down set failure.

Commit ccc09689 (bfd: Implement Bidirectional Forwarding Detection.)
set the bfd local diagnostic to "Concatenated Path Down" in response
to the set of cpath_down only when the current local diagnostic is
"None".  However, since the bfd local diagnostic is not reset when
the bfd state is restored from last erroneous state, the set of
cpath_down will not update the local diagnostic in that case.

This commit fixes the bug by always checking for local diagnostic
change when cpath_down is set or reset.

Bug #22625
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: Use kmem_cache_free() instead of kfree()
Wei Yongjun [Wed, 8 Jan 2014 14:07:52 +0000 (06:07 -0800)]
datapath: Use kmem_cache_free() instead of kfree()

memory allocated by kmem_cache_alloc() should be freed using
kmem_cache_free(), not kfree().

Fixes: e298e5057006 ('openvswitch: Per cpu flow stats.')
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto-dpif: Fix a vlan-splinter megaflow bug
Andy Zhou [Tue, 7 Jan 2014 08:17:25 +0000 (00:17 -0800)]
ofproto-dpif: Fix a vlan-splinter megaflow bug

When vlan-splinter is enabled, ovs receives non-vlan flows from the
kernel vlan ports, vlan tag is then added to the incoming flow before
xlating, so that they look like those received from a trunk port.

In case megaflow is enabled, xlating may set vlan masks during rule
processing as usual. If those vlan masks were serialized and downloaded
to the kernel (this bug), those mega flows will be rejected due to
unexpected vlan mask encapsulation, since the original kernel flows do
not have vlan tags. This bug does not break connectivity, but impacts
performance since all traffic received on vlan splinter ports will now
be handled by vswitchd, as no datapath flows can be successfully
installed.

This fix is to make sure no vlan mask encapsulation is generated for
the datapath flow if its in_port was re-written by vlan-splinter
receiving logic.

Bug #22567

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Check for backported sctp_compute_cksum().
Jesse Gross [Fri, 3 Jan 2014 23:44:28 +0000 (15:44 -0800)]
datapath: Check for backported sctp_compute_cksum().

This is backported by RHEL7.

Reported-by: Ashok Byahatti <ashok.byahatti@embrane.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
10 years agoodp-util: Avoid null dereference in parse_8021q_onward().
Ben Pfaff [Tue, 31 Dec 2013 19:32:16 +0000 (11:32 -0800)]
odp-util: Avoid null dereference in parse_8021q_onward().

For parsing a mask, this code in parse_8021q_onward() always read out
the OVS_KEY_ATTR_VLAN attribute without first checking whether it existed.
The correct behavior, implemented by this commit, appears to be treating
the VLAN as wildcarded and to continue parsing the flow.

Bug #22312.
Reported-by: Krishna Miriyala <miriyalak@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agobridge: Fix reversed string parsing in bridge_configure_flow_miss_model().
Alex Wang [Mon, 30 Dec 2013 22:21:40 +0000 (14:21 -0800)]
bridge: Fix reversed string parsing in bridge_configure_flow_miss_model().

This commit fixes a command matching error introduced by commit
7155fa52f (ofproto-dpif: Add 'force-miss-model' configuration).

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agobfd: Notify connectivity_seq on rmt_state changes.
Joe Stringer [Wed, 25 Dec 2013 00:50:53 +0000 (16:50 -0800)]
bfd: Notify connectivity_seq on rmt_state changes.

The bfd module did not previously change the global connectivity_seq
when the remote state changed, which means that such state changes may
not be propagated to the database. This is particularly bad if this is
the last state transition to happen in an otherwise stable environment.
This patch checks for transitions in remote state, and ensures that the
main thread will update the database when these happen.

Bug #22136.

Co-authored-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofp-parse: Check port number only after parsing it in parse_output().
Daisuke Kotani [Mon, 23 Dec 2013 09:19:48 +0000 (18:19 +0900)]
ofp-parse: Check port number only after parsing it in parse_output().

This patch allows max_len to be set to UINT16_MAX in parse_output()
if the output port is OFPP_CONTROLLER.

Signed-off-by: Daisuke Kotani <kotani@net.ist.i.kyoto-u.ac.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoPrepare for 2.1.0.
Justin Pettit [Tue, 24 Dec 2013 00:15:46 +0000 (16:15 -0800)]
Prepare for 2.1.0.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: compat: Configure check GRE DEMUX.
Pravin B Shelar [Mon, 23 Dec 2013 03:43:58 +0000 (19:43 -0800)]
datapath: compat: Configure check GRE DEMUX.

RHEL6-openstack kernel has backported the GRE DEMUX module,
so add a configure check to detect it.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #21936

10 years agodatapath: compat: Add configure check for lockdep_rtnl_is_held()
Pravin B Shelar [Fri, 20 Dec 2013 23:34:40 +0000 (15:34 -0800)]
datapath: compat: Add configure check for lockdep_rtnl_is_held()

RHEL6-openstack kernel has backported lockdep_rtnl_is_held().

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: compat: Fix skb_has_frag_list definition.
Pravin B Shelar [Fri, 20 Dec 2013 22:30:28 +0000 (14:30 -0800)]
datapath: compat: Fix skb_has_frag_list definition.

RHEL6-openstack kernel has already replaced skb_has_frags
with skb_has_frag_list().

Fix compilation error on RHEL6-openstack.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agobitmap: add bitmap_count1 function
Ben Pfaff [Mon, 23 Dec 2013 20:56:14 +0000 (12:56 -0800)]
bitmap: add bitmap_count1 function

Signed-off-by: Alexander Wu <alexander.wu@huawei.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-dev.py: Pass leak-check=full to valgrind.
Ethan Jackson [Thu, 12 Dec 2013 03:04:10 +0000 (19:04 -0800)]
ovs-dev.py: Pass leak-check=full to valgrind.

The valgrind leak checker isn't really useful without this.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
10 years agocompiler.h: Update documentation
Joe Stringer [Fri, 20 Dec 2013 20:52:52 +0000 (12:52 -0800)]
compiler.h: Update documentation

OVS_LOCKS_EXCLUDED doesn't exist. This should be OVS_EXCLUDED.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoNEWS: Mention new ovs-ofctl ofp-parse-pcap command.
Ben Pfaff [Mon, 23 Dec 2013 18:41:14 +0000 (10:41 -0800)]
NEWS: Mention new ovs-ofctl ofp-parse-pcap command.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-ofctl: New command "ofp-parse-pcap" to dump OpenFlow from PCAP files.
Ben Pfaff [Fri, 22 Nov 2013 21:17:23 +0000 (13:17 -0800)]
ovs-ofctl: New command "ofp-parse-pcap" to dump OpenFlow from PCAP files.

Based on the number of people who ask about Wireshark support for OpenFlow,
this is likely to be widely useful.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agopcap-file: Add timestamp support for reading and writing pcap files.
Ben Pfaff [Fri, 22 Nov 2013 19:42:06 +0000 (11:42 -0800)]
pcap-file: Add timestamp support for reading and writing pcap files.

Only the write support is initially useful, but an upcoming commit will add
a user for the read support.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofpbuf: New function ofpbuf_shift().
Ben Pfaff [Fri, 22 Nov 2013 19:42:42 +0000 (11:42 -0800)]
ofpbuf: New function ofpbuf_shift().

An upcoming commit will add the first user.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: bug.h missing from distfiles
Chris Luke [Sun, 22 Dec 2013 22:43:33 +0000 (14:43 -0800)]
datapath: bug.h missing from distfiles

commit 7c359202 introduced datapath/linux/compat/include/linux/bug.h
but did not include it in datapath/linux/Modules.mk, which results
in the following build error:

> The distribution is missing the following files:
> datapath/linux/compat/include/linux/bug.h

Signed-off-by: Chris Luke <chris_luke@cable.comcast.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Fix sparse warning on BUILD_BUG_ON_INVALID()
Andy Zhou [Sat, 21 Dec 2013 00:18:58 +0000 (16:18 -0800)]
datapath: Fix sparse warning on BUILD_BUG_ON_INVALID()

Sparse gives the following warnings when compiling against Linux kernel
3.5:

 CHECK   /root/projs/ovs/openvswitch/datapath/linux/skbuff-openvswitch.c
 include/linux/mm.h:405:9: error: undefined identifier
 'BUILD_BUG_ON_INVALID'
 include/linux/mm.h:405:9: error: not a function <noident>

The same issue may also exist in kernel 3.6.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
10 years agobfd: Send FINAL immediately after receiving POLL.
Alex Wang [Fri, 20 Dec 2013 22:53:52 +0000 (14:53 -0800)]
bfd: Send FINAL immediately after receiving POLL.

Commit 307464a11 (ofproto-dpif-monitor: Use heap to order the mport
wakeup time.) makes bfd send packets only at the specified periodic
instants.  This fails to meet the RFC5880 requirement that bfd send
FINAL immediately after receiving POLL.

This commit fixes the above issue by scheduling bfd to send FINAL
within 100 ms after receiving POLL.

Signed-off-by: Alex Wang <alexw@nicira.com>
10 years agodatapath: Check for backported netdev_features_t.
Jesse Gross [Tue, 17 Dec 2013 18:22:40 +0000 (10:22 -0800)]
datapath: Check for backported netdev_features_t.

This is apparently used by CentOS 6.5.

Reported-by: Phil Daws <uxbod@splatnix.net>
Reported-by: Edouard Bourguignon <madko@linuxed.net>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agolinux: Report supported user features to the kernel
Thomas Graf [Thu, 19 Dec 2013 15:20:42 +0000 (16:20 +0100)]
linux: Report supported user features to the kernel

Following commit (''netlink: Do not enforce alignment of last Netlink
attribute''), signal the ability to receive unaligned Netlink messages
to the datapath to enable utilization of zerocopy optimizations.

Opening a datapath is now done by issuing an OVS_DP_CMD_SET in order
to overwrite previously set user features.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoovs-check-dead-ifs: Flush buffer before calling execvp.
Gurucharan Shetty [Fri, 20 Dec 2013 17:30:21 +0000 (09:30 -0800)]
ovs-check-dead-ifs: Flush buffer before calling execvp.

According to Python documentation here for execvp:
http://docs.python.org/2/library/os.html
"The current process is replaced immediately. Open file objects
and descriptors are not flushed, so if there may be data buffered
on these open files, you should flush them using sys.stdout.flush()
or os.fsync() before calling an exec* function."

Without the flush, the output of print statements issued before
execvp is lost if we redirect the output to a file.
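
The effect can be reproduced with a small experiment. This is a sketch, not the ovs-check-dead-ifs code: `run_child()` and the child script are made up for illustration, and it assumes a POSIX system where `sys.executable` can be re-executed. The child prints a line, optionally flushes, and then replaces itself via os.execvp() while its stdout is a pipe.

```python
import subprocess
import sys

# Child script: print, optionally flush, then replace the process.
CHILD_TEMPLATE = """
import os, sys
print("before exec")
{flush}
os.execvp(sys.executable, [sys.executable, "-c", "print('after exec')"])
"""


def run_child(flush: bool) -> str:
    """Run the child with stdout redirected to a pipe; return its output."""
    code = CHILD_TEMPLATE.format(
        flush="sys.stdout.flush()" if flush else "pass")
    result = subprocess.run([sys.executable, "-c", code],
                            stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    print(repr(run_child(flush=True)))
    print(repr(run_child(flush=False)))
```

With the flush, both lines reach the pipe. Without it, the "before exec" line normally sits in the child's userspace buffer and is discarded when the process image is replaced, unless buffering has been disabled (for example through PYTHONUNBUFFERED).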

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofp-print: Print durations with at least three decimals.
Ben Pfaff [Fri, 20 Dec 2013 16:39:27 +0000 (08:39 -0800)]
ofp-print: Print durations with at least three decimals.

Occasionally I run a command like this:
    watch -n.1 ovs-ofctl dump-flows br0
to see how flows change over time.  Until now, it has been more difficult
than necessary to spot real changes, because flows "jump around" as the
number of decimals printed for duration changes from moment to moment.
That is, you might see
    cookie=0x0, duration=4.566s, table=0, n_packets=0, ...
one moment, and then
    cookie=0x0, duration=4.8s, table=0, n_packets=0, ...
the next moment.  Shortening 4.566 to 4.8 shifts everything following it
two places to the left, creating a visual jump.

This commit avoids that problem by always printing at least three decimals
if we print any.  There can still be an occasional jump if a duration is
exactly on a second boundary, but that only happens 1/1000 of the time.
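
The formatting rule can be sketched as below. This is a Python illustration of the behavior described in this message, not the ofp-print C code: `format_duration()` is a hypothetical name, and it assumes the duration arrives as whole seconds plus nanoseconds.

```python
def format_duration(sec: int, nsec: int) -> str:
    """Format a duration, keeping at least three decimals if any are shown."""
    if nsec == 0:
        return f"{sec}s"             # exact second boundary: no decimals
    frac = f"{nsec:09d}"             # nine fractional digits
    trimmed = frac.rstrip("0")       # drop trailing zeros...
    if len(trimmed) < 3:
        trimmed = frac[:3]           # ...but never below three decimals
    return f"{sec}.{trimmed}s"
```

With this rule, 4.8 seconds prints as "4.800s", the same width as "4.566s", so the columns after the duration stay aligned.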

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agolib/flow: Skip minimask value checks.
Jarno Rajahalme [Fri, 20 Dec 2013 16:16:31 +0000 (08:16 -0800)]
lib/flow: Skip minimask value checks.

We allow zero 'values' in a miniflow for it to have the same map
as the corresponding minimask.  Minimasks themselves never have
zero data values, though.  Document this and optimize the code
accordingly.

v2:
- Made miniflow_get_map_in_range() to return data offset instead of
  a pointer via the last parameter.
- Simplified minimatch_hash_in_range() by removing pointer arithmetic.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agotests/learn.at: Workaround a race
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:06 +0000 (19:31 +0900)]
tests/learn.at: Workaround a race

This test seems to assume that the switch completes
processing of the first packet before it starts processing
the second one.  I don't see any code ensuring that.
Work around the problem by giving 1 second for the upcall.

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agotimeval: Workaround for threaded test failures
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:05 +0000 (19:31 +0900)]
timeval: Workaround for threaded test failures

BFD tests have the code like the following.

    # wait for a while to stablize everything.
    for i in `seq 0 9`; do ovs-appctl time/warp 500; done

They no longer work as intended because BFD code is run in a
separate monitor thread these days.  The loop merely "warps"
the time by 5000.  The monitor thread should have been woken
at least once, but it's far from "wait for a while to stablize
everything."

This commit mitigates the problem by sleeping a little in the
appctl handler.  This is not ideal but makes BFD tests succeed
in my environment.

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agotests/ofproto-dpif.at: Workaround a race
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:04 +0000 (19:31 +0900)]
tests/ofproto-dpif.at: Workaround a race

This test seems to assume that only the first packets in flows
are counted as 'miss'.  I don't see any code ensuring that.
The test would fail if the upcall handler for the flow doesn't
run fast enough.  Work around the problem by giving 1 second
for the miss upcall.

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-upcall: Reduce log level of "Spent unreasonably long" msg
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:03 +0000 (19:31 +0900)]
ofproto-dpif-upcall: Reduce log level of "Spent unreasonably long" msg

This message can be caused by a time warp and make tests fail.

The message was introduced by commit e79a6c83.
("ofproto: Handle flow installation and eviction in upcall.")

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agotests/ofproto.at: Avoid stdout/stderr ordering assumptions
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:02 +0000 (19:31 +0900)]
tests/ofproto.at: Avoid stdout/stderr ordering assumptions

Stop assuming the order of outputs from separate streams.
(stdout and stderr)

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agotests/ofproto-dpif.at: Portability improvement
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:01 +0000 (19:31 +0900)]
tests/ofproto-dpif.at: Portability improvement

The output of "wc -l" has leading spaces on some platforms.
(NetBSD, OSX, ...)

This fixes a test failure introduced by commit e79a6c83.
("ofproto: Handle flow installation and eviction in upcall.")

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agotests/daemon-py.at: Skip if no python
YAMAMOTO Takashi [Fri, 20 Dec 2013 10:31:00 +0000 (19:31 +0900)]
tests/daemon-py.at: Skip if no python

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago.gitignore: add /libtool
Lorand Jakab [Fri, 20 Dec 2013 10:46:53 +0000 (12:46 +0200)]
.gitignore: add /libtool

The ./configure script now generates a 'libtool' file in the top-level
directory.  Add it to .gitignore.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Handle flow installation and eviction in upcall.
Ethan Jackson [Tue, 24 Sep 2013 20:39:56 +0000 (13:39 -0700)]
ofproto: Handle flow installation and eviction in upcall.

This patch moves flow installation and eviction from ofproto-dpif and
the main thread, into ofproto-dpif-upcall.  This performs
significantly better (approximately 2x TCP_CRR improvement), and
allows ovs-vswitchd to maintain significantly larger datapath flow
tables.  On top of that, it significantly simplifies the code,
retiring "struct facet" and friends.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agounixctl: Make dpif/dump-flows fetch kernel flows.
Joe Stringer [Wed, 20 Nov 2013 22:25:43 +0000 (14:25 -0800)]
unixctl: Make dpif/dump-flows fetch kernel flows.

Previously we used facets for ovs-appctl dpif/dump-flows commands.
This switches to fetching flows directly from the dpif.  This is
necessary because future patches remove facets and subfacets entirely.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Fix build failure on RHEL 6.4
Pravin B Shelar [Wed, 18 Dec 2013 18:57:33 +0000 (10:57 -0800)]
datapath: Fix build failure on RHEL 6.4

Patch fixes following build failure:-

make[4]: Entering directory
`/usr/src/kernels/2.6.32-358.18.1.el6.x86_64'
  CC [M]  openvswitch/datapath/linux/actions.o
In file included from
openvswitch/datapath/linux/actions.c:21:
openvswitch/datapath/linux/compat/include/linux/skbuff.h:273:
error: redefinition of ‘__skb_fill_page_desc’
include/linux/skbuff.h:1123: note: previous definition of
‘__skb_fill_page_desc’ was here
-----

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agoentropy: Add Windows support.
Alin Serdean [Thu, 19 Dec 2013 17:20:17 +0000 (09:20 -0800)]
entropy: Add Windows support.

Signed-off-by: Alin Serdean <aserdean at cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoxenserver: Fix build failures because of libraries in /usr/lib.
Gurucharan Shetty [Thu, 19 Dec 2013 05:35:09 +0000 (21:35 -0800)]
xenserver: Fix build failures because of libraries in /usr/lib.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Fix deadlock during stats update.
Pravin B Shelar [Tue, 17 Dec 2013 23:43:30 +0000 (15:43 -0800)]
datapath: Fix deadlock during stats update.

Stats-read needs to lock stats, but the same lock is taken by the stats
update in irq context.  Therefore stats-read needs to disable irqs to
avoid the following deadlock:

BUG: soft lockup - CPU#1 stuck for 23s! [ovs-vswitchd:1425]
CPU 1
Pid: 1425, comm: ovs-vswitchd Tainted: G           O 3.2.39-server-nn23 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffff8103db22>]  [<ffffffff8103db22>] __ticket_spin_lock+0x22/0x30
RSP: 0018:ffff88003fd03b30  EFLAGS: 00000297
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000050
RDX: 0000000000000002 RSI: ffff88003d0a9900 RDI: ffff88003ae19598
RBP: ffff88003fd03b30 R08: 0000000000000000 R09: ffff88003ad44048
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88003fd03aa8
R13: ffffffff8164e5de R14: ffff88003fd03b30 R15: ffff88003ae19580
FS:  00007ffb0b428940(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3c0ef94000 CR3: 00000000250e2000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ovs-vswitchd (pid: 1425, threadinfo ffff88002514a000, task ffff8800250aae00)
Stack:
 ffff88003fd03b40 ffffffff8164596e ffff88003fd03b70 ffffffffa027622d
 ffff88003d0a9900 ffffe8ffffd03800 ffff8800297f5a80 ffff88003fd03ba8
 ffff88003fd03c60 ffffffffa02759af ffff88003fd03de0 ffff88003fd03e4c
Call Trace:
 <IRQ>
 [<ffffffff8164596e>] _raw_spin_lock+0xe/0x20
 [<ffffffffa027622d>] ovs_flow_stats_update+0x5d/0x100 [openvswitch]
 [<ffffffffa02759af>] ovs_dp_process_received_packet+0x8f/0x130 [openvswitch]
 [<ffffffffa027c0ca>] ovs_vport_receive+0x2a/0x30 [openvswitch]
 [<ffffffffa027db18>] netdev_frame_hook+0xb8/0x120 [openvswitch]
 [<ffffffffa027da60>] ? free_port_rcu+0x30/0x30 [openvswitch]
 [<ffffffff81539318>] __netif_receive_skb+0x1c8/0x620
 [<ffffffff8153a4c0>] netif_receive_skb+0x80/0x90
 [<ffffffff8115f14c>] ? ksize+0x1c/0xc0
 [<ffffffff8153a610>] napi_skb_finish+0x50/0x70
 [<ffffffff8153ac15>] napi_gro_receive+0xf5/0x140
 [<ffffffffa00368ae>] vmxnet3_rq_rx_complete+0x51e/0x7c0 [vmxnet3]
 [<ffffffff8101ac90>] ? nommu_map_sg+0xe0/0xe0
 [<ffffffffa0036da5>] vmxnet3_poll_rx_only+0x45/0xc0 [vmxnet3]
 [<ffffffff8153ae64>] net_rx_action+0x134/0x290
 [<ffffffff8103db0d>] ? __ticket_spin_lock+0xd/0x30
 [<ffffffff8106e1a8>] __do_softirq+0xa8/0x210
 [<ffffffff8164596e>] ? _raw_spin_lock+0xe/0x20
 [<ffffffff8164fd6c>] call_softirq+0x1c/0x30
 [<ffffffff81016215>] do_softirq+0x65/0xa0
 [<ffffffff8106e58e>] irq_exit+0x8e/0xb0
 [<ffffffff81650633>] do_IRQ+0x63/0xe0
 [<ffffffff81645e2e>] common_interrupt+0x6e/0x6e

-----------
Bug #21853
Reported-by: Pawan Shukla <shuklap@vmware.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto-dpif: Get rid of mirror_mask_ffs() function.
Ben Pfaff [Wed, 18 Dec 2013 17:20:49 +0000 (09:20 -0800)]
ofproto-dpif: Get rid of mirror_mask_ffs() function.

There's no need for it because we have the equivalent (actually more
convenient) function raw_ctz(), which works with any integer type.
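
As a rough illustration of how a count-trailing-zeros helper can stand in
for an ffs-style function, the sketch below finds the lowest set bit of a
mirror bitmap.  The names mirror_mask_t and lowest_mirror_index() are
hypothetical, and the GCC/Clang builtin __builtin_ctz() is used here as a
stand-in for raw_ctz(); this is not the Open vSwitch implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit bitmap type with one bit per mirror. */
typedef uint32_t mirror_mask_t;

/* Returns the 0-based index of the lowest set bit in 'mask'.
 * Assumes a GCC- or Clang-compatible compiler for __builtin_ctz(),
 * whose result is undefined for 0, hence the assertion. */
static int
lowest_mirror_index(mirror_mask_t mask)
{
    assert(mask != 0);
    return __builtin_ctz(mask);
}
```

Note that the result is 0-based, whereas the POSIX ffs() function returns
a 1-based position, so callers switching between the two need to adjust
by one.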

Signed-off-by: Ben Pfaff <blp@nicira.com>
CC: Alin Serdean <aserdean@cloudbasesolutions.com>
CC: Gurucharan Shetty <shettyg@nicira.com>
10 years agorhel: Fix build failures because of libraries in /usr/lib.
Gurucharan Shetty [Wed, 18 Dec 2013 17:05:40 +0000 (09:05 -0800)]
rhel: Fix build failures because of libraries in /usr/lib.

Reported-by: Igor Sever <igor@xorops.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofp-tcpdump: Fix tcpdump patch breakage due to libtool.
Ben Pfaff [Wed, 18 Dec 2013 21:47:16 +0000 (13:47 -0800)]
ofp-tcpdump: Fix tcpdump patch breakage due to libtool.

The recently introduced use of libtool, in commit 38b7a52b618b98
(openvswitch: Use libtool and allow building shared libs) broke the
tcpdump patch.  This fixes the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agoAUTHORS: update my entry
YAMAMOTO Takashi [Wed, 18 Dec 2013 08:21:23 +0000 (17:21 +0900)]
AUTHORS: update my entry

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev-bsd: remove an unused variable
YAMAMOTO Takashi [Wed, 18 Dec 2013 08:16:15 +0000 (17:16 +0900)]
netdev-bsd: remove an unused variable

This is a leftover of commit da4a6191.
("netdev: Globally track port status changes")

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoINSTALL: Mention --enable-Werror.
Ben Pfaff [Wed, 18 Dec 2013 05:44:25 +0000 (21:44 -0800)]
INSTALL: Mention --enable-Werror.

I think that some developers haven't noticed this, but it's useful, so
mention it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoofproto: Add table config to struct ofproto
Simon Horman [Mon, 16 Dec 2013 08:53:22 +0000 (17:53 +0900)]
ofproto: Add table config to struct ofproto

Add table config to struct ofproto and set it
when a table mod message is received.

This is in preparation for changing the behaviour of the switch
based on table config.

Cc: Andy Zhou <azhou@nicira.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodpif-netdev: Remove unnecessary parameters from dp_netdev_port_input()
Simon Horman [Wed, 27 Nov 2013 05:08:41 +0000 (14:08 +0900)]
dpif-netdev: Remove unnecessary parameters from dp_netdev_port_input()

The skb_priority, pkt_mark and tunl parameters of dp_netdev_port_input()
are always passed as 0, 0 and NULL respectively. So rather than
passing these values to dp_netdev_port_input() just use them directly.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoopenvswitch: Use libtool and allow building shared libs
Helmut Schaa [Fri, 13 Dec 2013 17:54:28 +0000 (18:54 +0100)]
openvswitch: Use libtool and allow building shared libs

Currently openvswitch builds all libraries static only. However,
libopenvswitch is linked into nearly all openvswitch executables
making it hardly possible to run openvswitch on embedded devices
(for example running OpenWrt).

Convert openvswitch to use libtool for building its internal libs.
This allows "--enable-shared" and "--enable-static" as configure
arguments. Default is "--disable-shared" thus keeping the current
behavior with the only change that static libs are installed by
"make install".

Since the openvswitch library interfaces are internal and thus not
stable (yet) encode the openvswitch version into the library name:
libopenvswitch-2.0.90.so

Binary size is reduced to around 1/3 when using shared libs.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agocfm: Add test for fault_override
Joe Stringer [Mon, 16 Dec 2013 18:08:08 +0000 (10:08 -0800)]
cfm: Add test for fault_override

This patch adds tests for the cfm fault_override feature which can be
set through "ovs-appctl cfm/set-fault <port> <value>". It brings up two
ports with CFM, sets a fault, then checks that the fault status has
propagated correctly to the CFM module and the database. Finally, it
sets the fault override behaviour to normal and checks that the fault
has gone away.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoDo not free uninitialized packets.
Jarno Rajahalme [Tue, 17 Dec 2013 23:54:30 +0000 (15:54 -0800)]
Do not free uninitialized packets.

Commit da546e0 (dpif: Allow execute to modify the packet.) uninitializes
the "dpif_upcall.packet" of "struct upcall" when dpif_recv() returns error.
The packet ofpbuf is likely uninitialized in this case, hence calling
ofpbuf_uninit() on it will likely cause a SEGFAULT.

This commit fixes this bug by only uninitializing packet's ofpbuf on
successfully received upcalls.

A note warning about this is added on the comment of dpif_recv() in
dpif.c and dpif-provider.h.

Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoRemove unnecessary memset().
Jarno Rajahalme [Tue, 17 Dec 2013 23:54:30 +0000 (15:54 -0800)]
Remove unnecessary memset().

We already set all the fields of the upcall, so memsetting right before
is unnecessary.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-monitor: Acquire write lock in monitor_run().
Alex Wang [Fri, 13 Dec 2013 19:29:09 +0000 (11:29 -0800)]
ofproto-dpif-monitor: Acquire write lock in monitor_run().

Commit 307464a1 (ofproto-dpif-monitor: Use heap to order the mport
wakeup time.) re-heapifies the heap in monitor_run().  So the
monitor_run() should be protected by the write lock, rather than
the read lock.

This commit fixes the issue.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodpif-linux: Do not call nl_sock_pid() on NULL sock pointer.
Alex Wang [Tue, 17 Dec 2013 22:37:10 +0000 (14:37 -0800)]
dpif-linux: Do not call nl_sock_pid() on NULL sock pointer.

This commit adds a check of the sock pointer in dpif_linux_port_get_pid().
If the pointer is NULL, do not call nl_sock_pid().

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodpif-linux: Fix the return type of dpif_linux_port_dump_next__().
Alex Wang [Tue, 17 Dec 2013 22:37:09 +0000 (14:37 -0800)]
dpif-linux: Fix the return type of dpif_linux_port_dump_next__().

Commit 222837 (dpif-linux: Factor out port dumping helper functions.)
introduced a bug by making dpif_linux_port_dump_next__() return 'bool'
instead of 'int' as defined in dpif-provider.h.  This bug causes
ovs-vswitchd to fail with a SEGFAULT when processing a slow-path packet.

This commit fixes the bug by following the dpif-provider specification.
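
To see why the return type matters, here is a minimal sketch using
hypothetical functions (not the real dpif-linux code).  It assumes the
convention that a dump "next" function returns 0 on success and EOF when
nothing is left.  When such a status is squeezed through 'bool', EOF
collapses to 1, so the caller can no longer tell end-of-dump apart from
any other nonzero result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status source: 0 while entries remain, EOF at the end. */
static int
next_status(int remaining)
{
    assert(remaining >= 0);
    return remaining > 0 ? 0 : EOF;
}

/* Correct: the 'int' return type carries the status through unchanged. */
static int
dump_next_int(int remaining)
{
    return next_status(remaining);
}

/* Buggy: converting to 'bool' turns any nonzero status, including EOF,
 * into 1. */
static bool
dump_next_bool(int remaining)
{
    return next_status(remaining);
}
```

A caller that compares the result against EOF works with dump_next_int()
but never sees EOF from dump_next_bool().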

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoRemove stream, vconn, and rconn functions to get local/remote IPs/ports.
Ben Pfaff [Tue, 17 Dec 2013 23:07:12 +0000 (15:07 -0800)]
Remove stream, vconn, and rconn functions to get local/remote IPs/ports.

These functions don't have any ultimate users.  The in-band control code
used to use them, but not anymore, so we might as well delete them all.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: make functions local
Stephen Hemminger [Tue, 17 Dec 2013 22:57:46 +0000 (14:57 -0800)]
datapath: make functions local

Several functions and data structures could be local.
Found with 'make namespacecheck'.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agobfd: Set next_tx correctly when processing packets
Joe Stringer [Fri, 6 Dec 2013 20:22:07 +0000 (12:22 -0800)]
bfd: Set next_tx correctly when processing packets

In the case where we have not yet sent a control packet for a bfd
connection, and we receive a control packet from the remote host,
bfd->next_tx is updated to an unusual value. This causes the logging to
incorrectly report that there have been long delays (on the order of
weeks) since the last bfd transmission time.

This patch only modifies bfd->next_tx in this case if we are not
expecting to immediately send a control packet. This should mean that
bfd->next_tx is either 0 (immediate tx) or in the order of time_msec().

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agolacp: Give LACP a moment to initialize before testing its state, in tests.
Ben Pfaff [Tue, 17 Dec 2013 22:14:52 +0000 (14:14 -0800)]
lacp: Give LACP a moment to initialize before testing its state, in tests.

These tests configured LACP and then immediately dumped out its state.
Most of the time, this worked, but there was a brief race window in which
the "negotiated" flag could be missing because this took one pass through
the main loop.  This fixes the problem.

This race may be seen in the failures of tests 11 and 12 here:
https://launchpadlibrarian.net/151884888/buildlog_ubuntu-precise-amd64.openvswitch_2.0~201309300804-1ppa1~precise_FAILEDTOBUILD.txt.gz

Reported-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agobridge: Let ofprotos run once before reporting configuration complete.
Ben Pfaff [Mon, 30 Sep 2013 20:07:35 +0000 (13:07 -0700)]
bridge: Let ofprotos run once before reporting configuration complete.

Occasionally in the unit tests the following race can happen:

   1. ovs-vsctl updates database
   2. ovs-vswitchd reconfigures, notifies ovs-vsctl that it is complete
   3. ovs-appctl ofproto/trace fails to see newly added port
   4. ovs-vswitchd main loop calls ofproto's ->type_run(), making the
      new port visible to translation.

This race may be seen in the failures of tests 5 and 624 here:
https://launchpadlibrarian.net/151884888/buildlog_ubuntu-precise-amd64.openvswitch_2.0~201309300804-1ppa1~precise_FAILEDTOBUILD.txt.gz

Reported-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodpif-linux: fix the size of n_masks
Francesco Fusco [Tue, 17 Dec 2013 19:18:18 +0000 (20:18 +0100)]
dpif-linux: fix the size of n_masks

The command ovs-dpctl can wrongly output the masks even if the
datapath does not implement mega flows. In this case the output
will be similar to the following:

system@ovs-system:
lookups: hit:14 missed:41 lost:0
flows: 0
masks: hit:18446744073709551615 total:4294967295
hit/pkt:335395346794719104.00
port 0: ovs-system (internal)
port 1: gre_system (gre: df_default=false, ttl=0)
port 2: ots-br0 (internal)
port 3: int0 (internal)
port 4: vnet0
port 5: vnet1

The problem arises because the n_masks stat is stored as a
uint32 in the struct ovs_dp_megaflow_stats and as a uint64 in the
struct dpif_dp_stats. UINT32_MAX instead of UINT64_MAX should be
used to detect if the datapath supports megaflows or not.
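
The width mismatch can be sketched as follows.  The struct and function
names are simplified, hypothetical stand-ins rather than the real
definitions, and the sketch assumes that an all-ones n_masks value is the
marker for "megaflows not supported".  A 32-bit all-ones value widened to
64 bits is 4294967295, as in the "total:" field above, and so it never
equals UINT64_MAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a stats record whose mask count is 32 bits
 * wide; all-ones is assumed to mean "megaflows not supported". */
struct kernel_stats_sketch {
    uint32_t n_masks;
};

/* Buggy check: a 32-bit value widened to 64 bits can never equal
 * UINT64_MAX, so the sentinel goes undetected. */
static bool
unsupported_wrong(const struct kernel_stats_sketch *s)
{
    uint64_t widened = s->n_masks;
    return widened == UINT64_MAX;
}

/* Fixed check: compare against the sentinel of the field's own width. */
static bool
unsupported_right(const struct kernel_stats_sketch *s)
{
    assert(s != NULL);
    return s->n_masks == UINT32_MAX;
}
```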

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoRename NOT_REACHED to OVS_NOT_REACHED
Harold Lim [Tue, 17 Dec 2013 18:32:12 +0000 (10:32 -0800)]
Rename NOT_REACHED to OVS_NOT_REACHED

This allows util.h to be used by other libraries that have already
defined NOT_REACHED.

Signed-off-by: Harold Lim <haroldl@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoUpdate openvswitch to allow linking from C++ projects
Harold Lim [Tue, 17 Dec 2013 18:32:11 +0000 (10:32 -0800)]
Update openvswitch to allow linking from C++ projects

The input variable of ovs_scan is changed from 'template' to
'format' because 'template' is a keyword in C++.

Signed-off-by: Harold Lim <haroldl@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agovtep: add "Arp_sources" tables
Bruce Davie [Mon, 25 Nov 2013 16:19:50 +0000 (08:19 -0800)]
vtep: add "Arp_sources" tables

Add two new tables to the VTEP schema in support of distributed L3.
Each table contains MAC addresses to be used by VTEPs (both hardware
and software) when issuing ARP requests on behalf of a logical router.

Signed-off-by: Bruce Davie <bdavie@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoFAQ: Reference 2.0, not 1.12.
Justin Pettit [Tue, 17 Dec 2013 20:50:50 +0000 (12:50 -0800)]
FAQ: Reference 2.0, not 1.12.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoFAQ: Update to reflect that tunneling is now in upstream Linux.
Jesse Gross [Tue, 17 Dec 2013 18:49:26 +0000 (10:49 -0800)]
FAQ: Update to reflect that tunneling is now in upstream Linux.

Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: pritesh <pritesh.kothari@cisco.com>
10 years agodpif-linux: Fix a bug.
Alex Wang [Tue, 17 Dec 2013 16:16:24 +0000 (16:16 +0000)]
dpif-linux: Fix a bug.

Commit da546e0 (dpif: Allow execute to modify the packet.) introduced
a bug by subtracting the zero-value ofpbuf size by "sizeof(struct
nlattr)" and assigning the result back to the ofpbuf size.  This bug
causes the ovs-assert failure in facet_push_stats().

This commit fixes the bug by assigning the right value to the ofpbuf
size.
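
The class of bug described here can be sketched in isolation, assuming
the size is held in an unsigned type.  The constant and function names
below are hypothetical, and the guarded variant only illustrates one way
to avoid the wraparound; it is not the fix that this commit applies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header length standing in for sizeof(struct nlattr). */
#define ATTR_HDR_LEN_SKETCH 4

/* Buggy: with an unsigned size of 0, the subtraction wraps around to a
 * huge value instead of going negative. */
static size_t
payload_size_unchecked(size_t size)
{
    return size - ATTR_HDR_LEN_SKETCH;
}

/* Guarded: never subtract more than is available. */
static size_t
payload_size_checked(size_t size)
{
    size_t result = size >= ATTR_HDR_LEN_SKETCH
                    ? size - ATTR_HDR_LEN_SKETCH
                    : 0;

    assert(result <= size);
    return result;
}
```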

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agorconn: Update comments on is_admitted_msg().
Ben Pfaff [Thu, 21 Nov 2013 23:03:23 +0000 (15:03 -0800)]
rconn: Update comments on is_admitted_msg().

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoFAQ: Explain how to add QoS features to Open vSwitch.
Ben Pfaff [Fri, 22 Nov 2013 01:03:13 +0000 (17:03 -0800)]
FAQ: Explain how to add QoS features to Open vSwitch.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoFAQ: Describe weak and strong ES models.
Ben Pfaff [Tue, 17 Dec 2013 06:19:08 +0000 (22:19 -0800)]
FAQ: Describe weak and strong ES models.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agodatapath: Handle different definitions of page in skb_frag_t.
Jesse Gross [Tue, 17 Dec 2013 05:03:09 +0000 (21:03 -0800)]
datapath: Handle different definitions of page in skb_frag_t.

Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Add hash library to .gitignore.
Jesse Gross [Tue, 17 Dec 2013 02:10:40 +0000 (18:10 -0800)]
datapath: Add hash library to .gitignore.

Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Backport __skb_fill_page_desc
Jesse Gross [Tue, 17 Dec 2013 02:07:16 +0000 (18:07 -0800)]
datapath: Backport __skb_fill_page_desc

Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Backport skb_has_frag_list().
Jesse Gross [Tue, 17 Dec 2013 02:05:01 +0000 (18:05 -0800)]
datapath: Backport skb_has_frag_list().

Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Backport skb_frag_ functions
Pravin B Shelar [Sun, 3 Mar 2013 07:53:52 +0000 (23:53 -0800)]
datapath: Backport skb_frag_ functions

Define the skb_frag_* accessor functions, which were introduced in 3.2.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Add missing #include for skb page accessors.
Jesse Gross [Tue, 31 Jan 2012 21:15:30 +0000 (13:15 -0800)]
datapath: Add missing #include for skb page accessors.

Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Compute checksum in skb_gso_segment() if needed
Thomas Graf [Tue, 17 Dec 2013 01:03:45 +0000 (17:03 -0800)]
datapath: Compute checksum in skb_gso_segment() if needed

The copy & csum optimization is no longer present with zerocopy
enabled. Compute the checksum in skb_gso_segment() directly by
dropping the HW CSUM capability from the features passed in.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Use skb_zerocopy() for upcall
Thomas Graf [Tue, 17 Dec 2013 00:56:03 +0000 (16:56 -0800)]
datapath: Use skb_zerocopy() for upcall

Use of skb_zerocopy() can avoid the expensive call to memcpy()
when copying the packet data into the Netlink skb. Completes
checksum through skb_checksum_help() if not already done in
GSO segmentation.

Zerocopy is only performed if user space supports unaligned
Netlink messages. Memory-mapped Netlink I/O is preferred over
zerocopy if it is set up.

Cost of upcall is significantly reduced from:
+   7.48%       vhost-8471  [k] memcpy
+   5.57%     ovs-vswitchd  [k] memcpy
+   2.81%       vhost-8471  [k] csum_partial_copy_generic

to:
+   5.72%     ovs-vswitchd  [k] memcpy
+   3.32%       vhost-5153  [k] memcpy
+   0.68%       vhost-5153  [k] skb_zerocopy

(megaflows disabled)

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jesse Gross <jesse@nicira.com>