cascardo/ovs.git
10 years agoofproto-dpif: Do not dpif_port_del() patch ports in port_del().
Ben Pfaff [Fri, 30 Aug 2013 03:55:10 +0000 (20:55 -0700)]
ofproto-dpif: Do not dpif_port_del() patch ports in port_del().

Patch ports don't have datapath ports so it doesn't make sense to try to
call dpif_port_del() on them.  If we do try, it will fail, which makes the
caller think that the port wasn't really deleted, which in turn keeps
ofproto from reporting the port deletion via OpenFlow.  This fixes the
problem.

Without this patch, the following commands, executed under "make sandbox",
will report the patch port addition in "ovs-ofctl monitor" output, but not
the patch port deletion.  With this patch, both the addition and deletion
will be reported.

    ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
    ovs-ofctl monitor br0 128 &
    ovs-vsctl add-port br0 patch1 \
        -- set interface patch1 type=patch options:peer=patch2 \
        -- add-port br0 patch2 \
        -- set interface patch2 type=patch options:peer=patch1
    ovs-vsctl del-port patch1
    ovs-vsctl del-port patch2

Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Fix confusion between "no stp port" and "stp port 0".
Alex Wang [Wed, 4 Sep 2013 22:21:15 +0000 (15:21 -0700)]
ofproto-dpif-xlate: Fix confusion between "no stp port" and "stp port 0".

Commit 9d189a50e (ofproto-dpif-xlate: Pull STP xlation into
ofproto-dpif-xlate.) introduced the bug that considers 'stp_port_no'
of 0 as stp disabled on the port.  However 'stp_port_no' is
actually the index of the stp struct's port array and ranges
between [0, STP_MAX_PORTS).  So the bug allows the blocked
port keep transmitting packets and generates loop.

This commit fixes this bug.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodpif: fix segfault in CONTROLLER action with sflow or ipfix setup
Romain Lenglet [Wed, 4 Sep 2013 01:58:37 +0000 (18:58 -0700)]
dpif: fix segfault in CONTROLLER action with sflow or ipfix setup

Signed-off-by: Romain Lenglet <rlenglet@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Convert units correctly in ofport_open().
Ben Pfaff [Wed, 4 Sep 2013 20:37:56 +0000 (13:37 -0700)]
ofproto: Convert units correctly in ofport_open().

netdev_features_to_bps() returns a speed in bps, but struct
ofputil_phy_port's curr_speed and max_speed are in kbps, so a conversion
is necessary.  This commit fixes the problem.

Reported-by: Benjamin Lunsky <benjamin.lunsky@netronome.com>
Tested-by: Benjamin Lunsky <benjamin.lunsky@netronome.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoflow: Fix hypothetical memory leak in miniflow_move().
Ben Pfaff [Wed, 4 Sep 2013 19:33:06 +0000 (12:33 -0700)]
flow: Fix hypothetical memory leak in miniflow_move().

Ordinarily a miniflow will use its inline_values if its values can fit, but
there is nothing to prevent a small number of values from being stored
in malloc()'d memory.  If this happened, then miniflow_move() would leak
memory.  This commit fixes the problem.

This is a hypothetical problem.  I haven't seen it in practice.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto: Fix memory leak in add_flow().
Ben Pfaff [Wed, 4 Sep 2013 19:33:43 +0000 (12:33 -0700)]
ofproto: Fix memory leak in add_flow().

Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif-mirror: Fix memory leak in mbridge_unref().
Ben Pfaff [Wed, 4 Sep 2013 19:33:28 +0000 (12:33 -0700)]
ofproto-dpif-mirror: Fix memory leak in mbridge_unref().

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoflow: Fix memory leak in minimask_move().
Ben Pfaff [Wed, 4 Sep 2013 19:39:19 +0000 (12:39 -0700)]
flow: Fix memory leak in minimask_move().

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Fix use-after-free deleting a bridge with active traffic.
Ben Pfaff [Wed, 4 Sep 2013 19:38:12 +0000 (12:38 -0700)]
ofproto-dpif: Fix use-after-free deleting a bridge with active traffic.

When a bridge that has active traffic is deleted, the bridge's ofproto can
be referenced by in-flight "flow_miss"es.  This commit fixes the problem
by destroying "flow_miss"es that reference the ofproto that is being
deleted.

I found the problem by adding and removing a bridge that had active traffic
(via hping3) while running under valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Destroy facets on ofproto removal.
Ethan Jackson [Wed, 4 Sep 2013 16:27:31 +0000 (09:27 -0700)]
ofproto-dpif: Destroy facets on ofproto removal.

Facets maintain a pointer to their owning ofproto-dpif, and therefore
when an ofproto-dpif is destroyed all of their child facets should be
destroyed along with it.  If they aren't, it's possible a subfacet
looked up in the dpif-backer could access it's parent facet and
therefore a defunct parent ofproto causing a segmentation fault.

Bug #19421.
Bug #19423.
Diagnosed-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agocfm: Fix a memory allocation bug in cfm_get_remove_mpids().
Guolin Yang [Wed, 4 Sep 2013 16:00:12 +0000 (09:00 -0700)]
cfm: Fix a memory allocation bug in cfm_get_remove_mpids().

In cfm, when allocating the rmp array, the size is not calculated correctly
which cause the remote MPID not correctly updated in ovsdb.

This bug was introduced by commit 13b1b2ae70 (cfm: Make the CFM module
thread safe.).

Signed-off-by: Guolin Yang <gyang@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-xlate: Fix mac learning deadlock.
Ethan Jackson [Wed, 4 Sep 2013 00:34:00 +0000 (17:34 -0700)]
ofproto-dpif-xlate: Fix mac learning deadlock.

xlate_normal() held the mac_learning lock while calling
output_normal().  When running with patch ports, this could cause
xlate_actions() to be called again, possibly attempting to take a
write lock on the same learning table causing a deadlock.  This patch
solves the problem by holding the lock for a very brief period of
time.

Bug #19423.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agoFix a bug in conversion between flow/mask and flow key
Guolin Yang [Fri, 30 Aug 2013 16:57:13 +0000 (09:57 -0700)]
Fix a bug in conversion between flow/mask and flow key

In odp_flow_key_from_flow__(), when converting ICMPv6 flow/mask
to flow/mask key, we should always use flow to check for whether
ND informaition is present or not. In mask case, both type and code
field should be masked, otherwise ND fields can be masked.

Similarly in reverse conversion (parse_l2_5_onward()), we should
have same check.

Signed-off-by: Guolin Yang <gyang@nicira.com>
[blp@nicira.com changed && to || in parse_l2_5_onward()
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoutil: Include pthread_np.h on FreeBSD
Ed Maste [Fri, 30 Aug 2013 14:56:08 +0000 (10:56 -0400)]
util: Include pthread_np.h on FreeBSD

On FreeBSD pthread_set_name_np's prototype is provided by pthread_np.h.
As I believe it is the only platform to provide the "set_name" (with an
underscore) variant I hope it's fine to use the existing autoconf macro
HAVE_PTHREAD_SET_NAME_NP.

Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Sync vxlan tunneling code with upstream ovs-vxlan.
Pravin B Shelar [Mon, 26 Aug 2013 18:05:35 +0000 (11:05 -0700)]
datapath: Sync vxlan tunneling code with upstream ovs-vxlan.

Upstream vxlan implementation was changed according to few
comments.  Following patch brings back those changes to out
of tree ovs module.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agoPrepare for 2.0.0.
Justin Pettit [Wed, 28 Aug 2013 23:13:41 +0000 (16:13 -0700)]
Prepare for 2.0.0.

We decided to call the next release 2.0 instead of 1.12.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoSet release date for 1.11.0.
Justin Pettit [Wed, 28 Aug 2013 21:32:27 +0000 (14:32 -0700)]
Set release date for 1.11.0.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoofproto-dpif-upcall: Batch upcalls.
Jarno Rajahalme [Wed, 28 Aug 2013 19:06:07 +0000 (12:06 -0700)]
ofproto-dpif-upcall: Batch upcalls.

Batching reduces overheads and enables upto 4 times the upcall processing
performance in a specialized test case.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev-dummy: Stop overriding patch vport type with dummy.
YAMAMOTO Takashi [Wed, 28 Aug 2013 07:33:07 +0000 (16:33 +0900)]
netdev-dummy: Stop overriding patch vport type with dummy.

There's little point to override patch ports, which is implmeneted
purely in our userland process these days, with a dummy implementation.

This allows testing patch ports in "make sandbox" environment.

Signed-off-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev-bsd: fix crashes
YAMAMOTO Takashi [Wed, 28 Aug 2013 05:44:07 +0000 (14:44 +0900)]
netdev-bsd: fix crashes

fix a regression added by commit 89454bf477d1dc95357792677ccbd4d483ab42d8.
"netdev: Fix deadlock when netdev_dump_queues() callback calls into netdev."

Signed-off-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agopackets, pktbuf: Align L3 headers on 32-bit boundary.
Ben Pfaff [Wed, 28 Aug 2013 05:10:22 +0000 (22:10 -0700)]
packets, pktbuf: Align L3 headers on 32-bit boundary.

Memory access tends to be faster when data is properly aligned.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agopackets: Introduce IPv6 headers not aligned on a 32-bit boundary.
Ben Pfaff [Thu, 15 Aug 2013 18:07:24 +0000 (11:07 -0700)]
packets: Introduce IPv6 headers not aligned on a 32-bit boundary.

This fixes the same problem for IPv6 headers treated for other headers in
the previous commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agopackets: Do not assume that IPv4, TCP, or ARP headers are 32-bit aligned.
Ben Pfaff [Thu, 15 Aug 2013 17:47:39 +0000 (10:47 -0700)]
packets: Do not assume that IPv4, TCP, or ARP headers are 32-bit aligned.

Ethernet headers are 14 bytes long, so when the beginning of such a header
is 32-bit aligned, the following data is misaligned.  The usual trick to
fix that is to start the Ethernet header on an odd-numbered 16-bit
boundary.  That trick works OK for Open vSwitch, but there are two
problems:

   - OVS doesn't use that trick everywhere.  Maybe it should, but it's
     difficult to make sure that it does consistently because the CPUs
     most commonly used with OVS don't care about misalignment, so we
     only find problems when porting.

   - Some protocols (GRE, VXLAN) don't use that trick, so in such a case
     one can properly align the inner or outer L3/L4/L7 but not both.  (OVS
     userspace doesn't directly deal with such protocols yet, so this is
     just future-proofing.)

   - OpenFlow uses the alignment trick in a few places but not all of them.

This commit starts the adoption of what I hope will be a more robust way
to avoid misalignment problems and the resulting bus errors on RISC
architectures.  Instead of trying to ensure that 32-bit quantities are
always aligned, we always read them as if they were misaligned.  To ensure
that they are read this way, we change their types from 32-bit types to
pairs of 16-bit types.  (I don't know of any protocols that offset the
next header by an odd number of bytes, so a 16-bit alignment assumption
seems OK.)

The same would be necessary for 64-bit types in protocol headers, but we
don't yet have any protocol definitions with 64-bit types.

IPv6 protocol headers need the same treatment, but for those we rely on
structs provided by system headers, so I'll leave them for an upcoming
patch.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Remove obsolete comment and assertion.
Ben Pfaff [Wed, 14 Aug 2013 23:26:05 +0000 (16:26 -0700)]
ofproto: Remove obsolete comment and assertion.

At one time, the ofproto-dpif implementation of the 'rule_execute' member
function required, indirectly, at least struct(ofp10_packet_in) bytes of
headroom in the packet passed into it.  (This allowed constructing an
OFPT_PACKET_IN without allocating and copying a new ofpbuf.)  This
restriction has long been lifted, but rule_execute() had not yet caught
up.  This commit updates it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agolib: Keep track of usable protocols while parsing.
Jarno Rajahalme [Wed, 21 Aug 2013 01:41:45 +0000 (18:41 -0700)]
lib: Keep track of usable protocols while parsing.

Keep track of usable protocols while parsing actions and matches,
rather than checking for them afterwards.  This fixes silently discarded
meter and goto table instructions when not explicitly specifying the
protocol to use.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev: Fix deadlock when netdev_dump_queues() callback calls into netdev.
Ben Pfaff [Wed, 28 Aug 2013 00:15:53 +0000 (17:15 -0700)]
netdev: Fix deadlock when netdev_dump_queues() callback calls into netdev.

We have a call chain like this:

    iface_configure_qos() calls
        netdev_dump_queues(), which calls
            netdev_linux_dump_queues(), which calls back through 'cb' to
                qos_unixctl_show_cb(), which calls
                    netdev_delete_queue(), which calls
                        netdev_linux_delete_queue().

Both netdev_dump_queues() and netdev_linux_delete_queue() take the same
mutex in the same netdev, which deadlocks.

This commit fixes the problem by getting rid of the callback.

netdev_linux_dump_queue_stats() would benefit from the same treatment but
it's less urgent because I don't see any callbacks from that function that
call back into a netdev function.

Bug #19319.
Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Fully construct rules before putting them in the classifier.
Ben Pfaff [Mon, 26 Aug 2013 19:45:55 +0000 (12:45 -0700)]
ofproto: Fully construct rules before putting them in the classifier.

add_flow() in ofproto.c has a race: it adds a new flow to the flow table
before calling ->rule_construct().  That means that (in ofproto-dpif) the
new flow is visible to the forwarder threads before gets properly
initialized.

One solution would be to lock the flow table across the entire operation,
but this is not desirable:

   * We would need a write-lock but this would be expensive for
     implementing "learn" flow_mods that often don't really modify anything
     at all.

   * The code would need to keep the lock across a couple of different calls
     into the client, which seems undesirable if it can be avoided.

   * The code in add_flow() is difficult to understand already.

Instead, I've decided to refactor add_flow() to simplify it and make it
easier to understand.  I think this will also improve the potential for
concurrency.

This commit completes the initial refactoring and solves the race.  It makes
two key changes:

    1. It disentangles insertion of a new flow from evicting some existing
       flow to make room for it (if necessary).  Instead, if inserting a
       new flow would make the flow table overfull, it evicts a flow and
       commits that operation before it attempts the insertion.  This is
       a user-visible change in behavior, in that previously a flow table
       insertion could never cause the number of flows in the flow table
       to decrease, and now it theoretically could if the eviction succeeds
       but the insertion fails.  However, I do not think that this is a
       serious problem.

    2. It adds two new steps to the life cycle of a rule.  Previously,
       rules had "alloc", "construct", "destruct", and "dealloc" steps,
       like other ofproto objects do.  This adds "insert" and "delete"
       steps between "construct" and "destruct".  The new steps are
       intended to handle the actual insertion and deletion into the
       datapath flow table.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: add rounddown() definition into compatibility code
Andy Zhou [Tue, 27 Aug 2013 23:19:53 +0000 (16:19 -0700)]
datapath: add rounddown() definition into compatibility code

rounddown() was not available in older kenrel, such as kernel 2.6.32.61.
Add it to the compatibility code.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto: Make OFPFC_ADD internally modify a rule instead of swapping.
Ben Pfaff [Tue, 27 Aug 2013 19:53:10 +0000 (12:53 -0700)]
ofproto: Make OFPFC_ADD internally modify a rule instead of swapping.

add_flow() in ofproto.c has a race: it adds a new flow to the flow table
before calling ->rule_construct().  That means that (in ofproto-dpif) the
new flow is visible to the forwarder threads before gets properly
initialized.

One solution would be to lock the flow table across the entire operation,
but this is not desirable:

   * We would need a write-lock but this would be expensive for
     implementing "learn" flow_mods that often don't really modify anything
     at all.

   * The code would need to keep the lock across a couple of differen calls
     into the client, which seems undesirable if it can be avoided.

   * The code in add_flow() is difficult to understand already.

Instead, I've decided to refactor add_flow() to simplify it and make it
easier to understand.  I think this will also improve the potential for
concurrency.

This commit starts off by collapsing two different cases together into
(almost) one.  In particular, OpenFlow has two ways to modify a flow with a
"flow_mod" command.  You can use an "add" flow_mod to replace the flow, or
you can use a "modify" flow_mod to change it.  The differences in semantics
are minor, but until now Open vSwitch has treated them quite differently.
This commit merges both cases, treating them as variants of what was
previously a "modify".  The advantage is that add_flow() no longer has to
deal with two flows at a time in the "add" case (the old flow being
deleted, the new flow replacing that one).  This does not fix the race, but
it makes it easier to deal with in a later commit.

Transforming "add" into a form of "modify" requires that "modify" be able
to reset flow statistic counters.  OpenFlow 1.2 and later make this an
optional flag in a "flow_mod".  Until now, we haven't implemented that
feature of OF1.2+, but we get it pretty much for free with this
refactoring, so this commit also adds that OF1.2+ feature too.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofp-util: Abstract flow_mod OFPFF_* flags.
Ben Pfaff [Mon, 26 Aug 2013 23:23:50 +0000 (16:23 -0700)]
ofp-util: Abstract flow_mod OFPFF_* flags.

The OFPFF_* flags used in flow_mods are just confusing enough that it
seems worthwhile to try to abstract them out.  In particular:

    * OFPFF_EMERG was introduced in OF1.0, deleted in OF1.1, and then
      its bit was reused for a different purpose in OF1.2.

    * OFPFF_RESET_COUNTS was introduced in OF1.2 but the semantics that it
      specifies are implied by "add" commands in earlier versions, so
      proper translation requires the OpenFlow version number and flow_mod
      command.

This commit does the abstraction.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif-xlate: Fix fin_timeout to make rules expirable.
Ben Pfaff [Mon, 26 Aug 2013 20:08:30 +0000 (13:08 -0700)]
ofproto-dpif-xlate: Fix fin_timeout to make rules expirable.

In an Open vSwitch flow table that has a configured maximum number of
flows, flows that have an idle or hard timeout, or both, are expirable,
and flows with neither are permanent.  The fin_timeout action can change
a flow that has no idle or hard timeout into one that has either one or
both, which should make a permanent flow into an expirable one, but the
implementation was buggy and did not actually make the flow expirable.
This commit fixes the problem.

This commit also moves most of the implementation of fin_timeout from
ofproto-dpif-xlate into ofproto, because this seems to better respect the
layering.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoclassifier: New function cls_rule_move().
Ben Pfaff [Tue, 27 Aug 2013 19:25:48 +0000 (12:25 -0700)]
classifier: New function cls_rule_move().

This function will acquire its first user in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoFix typos in a couple of OpenFlow related comments.
Ben Pfaff [Mon, 26 Aug 2013 22:28:46 +0000 (15:28 -0700)]
Fix typos in a couple of OpenFlow related comments.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: optimize flow compare and mask functions
Andy Zhou [Tue, 27 Aug 2013 18:21:26 +0000 (11:21 -0700)]
datapath: optimize flow compare and mask functions

Make sure the sw_flow_key structure and valid mask boundaries are always
machine word aligned. Optimize the flow compare and mask operations
using machine word size operations. This patch improves throughput on
average by 15% when CPU is the bottleneck of forwarding packets.

This patch is inspired by ideas and code from a patch submitted by Peter
Klausler titled "replace memcmp() with specialized comparator".
However, The original patch only optimizes for architectures
support unaligned machine word access. This patch optimizes for all
architectures.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto-dpif.at: Remove push_vlan from an OF1.0 test.
Jarno Rajahalme [Mon, 26 Aug 2013 20:39:37 +0000 (13:39 -0700)]
ofproto-dpif.at: Remove push_vlan from an OF1.0 test.

Remove push_vlan from an OF1.0 test, as it requires OF1.1+ support, but was
silently discarded.  A later patch will make this test to fail due to
validation of usable OpenFlow protocol versions while parsing actions.

It should be noted that existing controllers may depend on the silently
discarded push_vlan being accepted during the action validation code at the
switch, as OpenFlow 1.0 supports setting a vlan ID, which will implicitly
push a vlan header, if it did not exist already.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-save: Do not depend on hardcoded set of $PATH for utilities.
Gurucharan Shetty [Mon, 26 Aug 2013 17:44:18 +0000 (10:44 -0700)]
ovs-save: Do not depend on hardcoded set of $PATH for utilities.

Reported-by: Jian Qiu <swordqiu@gmail.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-atomic: Add native Clang implementation.
Ben Pfaff [Mon, 26 Aug 2013 20:03:02 +0000 (13:03 -0700)]
ovs-atomic: Add native Clang implementation.

With this implementation I get warnings with Clang on GNU/Linux when the
previous patch is not applied.  This ought to make it easier to avoid
introducing new problems in the future even without building on FreeBSD.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agodpif: Change get_max_ports() to return uint32_t.
Alex Wang [Thu, 22 Aug 2013 21:24:18 +0000 (14:24 -0700)]
dpif: Change get_max_ports() to return uint32_t.

The declaration of 'get_max_ports()' to return odp_port_t adds
unwanted complexity to coding.  This commit changes it back to
return uint32_t type.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Change max_ports and alloc_port_no to uint16_t type.
Alex Wang [Thu, 22 Aug 2013 21:24:17 +0000 (14:24 -0700)]
ofproto: Change max_ports and alloc_port_no to uint16_t type.

The declaration of 'max_ports' and 'alloc_port_no' as ofp_port_t
adds unwanted complexity to coding.  This commit changes them back
to uint16_t type.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: remove the extra reset to make code clear to read.
pritesh [Thu, 22 Aug 2013 20:59:32 +0000 (13:59 -0700)]
ofproto: remove the extra reset to make code clear to read.

Signed-off-by: pritesh <pritesh.kothari@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agometa-flow: Use correct OF1.2+ errors for invalid fields in actions.
Ben Pfaff [Fri, 23 Aug 2013 18:11:03 +0000 (11:11 -0700)]
meta-flow: Use correct OF1.2+ errors for invalid fields in actions.

OFPERR_OFPBAC_BAD_ARGUMENT is not as specific as the errors provided by
OpenFlow 1.2 and later.

Some of these errors needed Nicira extension numbers for use with OpenFlow
1.0 and 1.1, so this commit also adds those.

Some of these errors had poor explanations likely to confuse users, so this
commits improves them.

Some of the errors had the wrong names, so this commit fixes them.

Reported-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jean Tourrilhes <jt@hpl.hp.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoofproto-dpif-xlate: Refactor xlate_table_action() to avoid Clang warnings.
Ben Pfaff [Fri, 23 Aug 2013 18:03:55 +0000 (11:03 -0700)]
ofproto-dpif-xlate: Refactor xlate_table_action() to avoid Clang warnings.

I get a bunch of thread-safety warnings with the latest Clang without this
patch, because Clang is smart enough to see locking and unlocking but not
smart enough to figure out the relationships.  This refactoring avoids the
warnings.

I first noticed these warnings with Clang 1:3.4~svn188890-1~exp1.
I previously used version 1:3.4~svn187484-1~exp1.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoofproto-dpif: Fix thread safety annotation on rule_dpif_lookup_in_table().
Ben Pfaff [Wed, 21 Aug 2013 18:27:25 +0000 (11:27 -0700)]
ofproto-dpif: Fix thread safety annotation on rule_dpif_lookup_in_table().

New Clang versions raise warnings about the incorrect old annotation.

I first noticed these warnings with Clang 1:3.4~svn188890-1~exp1.
I previously used version 1:3.4~svn187484-1~exp1.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoofproto-dpif-upcall.c: Fix indentation.
Jarno Rajahalme [Thu, 22 Aug 2013 23:01:41 +0000 (16:01 -0700)]
ofproto-dpif-upcall.c: Fix indentation.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoRemove duplicate call to subfacet_create().
Jarno Rajahalme [Thu, 22 Aug 2013 22:51:39 +0000 (15:51 -0700)]
Remove duplicate call to subfacet_create().

There were two calls to subfacet_create(), which is redundant.  The first
one seemed the one to remove, as there is a subsequent test on the facet's
subfacet list being empty, which is never true after a call to
subfacet_create().  The correctness of this needs to be checked, however.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agobfd: Implement forwarding_if_rx.
Alex Wang [Thu, 22 Aug 2013 22:44:23 +0000 (15:44 -0700)]
bfd: Implement forwarding_if_rx.

This commit adds a new boolean option "forwarding_if_rx" to bfd.

When forwarding_if_rx is true the interface will be considered
capable of packet I/O as long as there is packet received at
interface.  This is important in that when link becomes temporarily
conjested, consecutive BFD control packets can be lost.  And the
forwarding_if_rx can prevent link failover by detecting non-control
packets received at interface.

Signed-off-by: Alex Wang <alexw@nicira.com>
10 years agodatapath: Remove redundant EtherType check in vlan translation.
Jesse Gross [Thu, 22 Aug 2013 20:30:45 +0000 (13:30 -0700)]
datapath: Remove redundant EtherType check in vlan translation.

This was supposed to be part of the previous patch but I forgot
to commit it before pushing.

Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: More strict vlan encap netlink check
Andy Zhou [Thu, 15 Aug 2013 21:16:52 +0000 (14:16 -0700)]
datapath: More strict vlan encap netlink check

Only parse the encap key field if eth_type is 802.1Q and
VLAN_TAG_PRESENT bit is set. Add a few more eror checks and logs.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Rename key_len to key_end
Andy Zhou [Fri, 16 Aug 2013 23:01:29 +0000 (16:01 -0700)]
datapath: Rename key_len to key_end

Key_end is a better name describing the ending boundary than key_len.
Rename those variables to make it less confusing.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agobond: Fix error in bond_shift_load().
Alex Wang [Thu, 22 Aug 2013 18:29:04 +0000 (11:29 -0700)]
bond: Fix error in bond_shift_load().

Commit 4a1b8f30e(bond: Stop using tags.) introduced the bug that
prevents the load shifting when the traffic needs to be balanced.
This commit fixes this bug.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Fix typo in comment.
Ben Pfaff [Wed, 21 Aug 2013 21:15:33 +0000 (14:15 -0700)]
ofproto: Fix typo in comment.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Mark rule_release() as no_thread_safety_analysis.
Ben Pfaff [Wed, 21 Aug 2013 18:33:49 +0000 (11:33 -0700)]
ofproto-dpif: Mark rule_release() as no_thread_safety_analysis.

Otherwise new Clang complains about this function because it only sometimes
releases the lock (that is, it only does it when there is a lock to
release).

I first noticed these warnings with Clang 1:3.4~svn188890-1~exp1.
I previously used version 1:3.4~svn187484-1~exp1.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-thread: Mark lock and unlock functions as no_thread_safety_analysis.
Ben Pfaff [Wed, 21 Aug 2013 18:24:49 +0000 (11:24 -0700)]
ovs-thread: Mark lock and unlock functions as no_thread_safety_analysis.

I don't see any other way to make Clang realize that these are the real
mutex implementation functions.

I first noticed these warnings with Clang 1:3.4~svn188890-1~exp1.
I previously used version 1:3.4~svn187484-1~exp1.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofp-util: Add SCTP support
Joe Stringer [Thu, 22 Aug 2013 08:24:45 +0000 (20:24 +1200)]
ofp-util: Add SCTP support

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Add SCTP support
Joe Stringer [Thu, 22 Aug 2013 08:24:44 +0000 (20:24 +1200)]
ofproto-dpif: Add SCTP support

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Add SCTP support
Joe Stringer [Thu, 22 Aug 2013 08:24:43 +0000 (20:24 +1200)]
datapath: Add SCTP support

This patch adds support for rewriting SCTP src,dst ports similar to the
functionality already available for TCP/UDP.

Rewriting SCTP ports is expensive due to double-recalculation of the
SCTP checksums; this is performed to ensure that packets traversing OVS
with invalid checksums will continue to the destination with any
checksum corruption intact.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agolib: Add CRC32C Implementation
Joe Stringer [Thu, 22 Aug 2013 08:24:42 +0000 (20:24 +1200)]
lib: Add CRC32C Implementation

This implementation was derived from FreeBSD:
http://svnweb.freebsd.org/base/head/sys/libkern/crc32.c

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoipfix: implement flow caching and aggregation in exporter
Romain Lenglet [Wed, 21 Aug 2013 20:49:04 +0000 (13:49 -0700)]
ipfix: implement flow caching and aggregation in exporter

Implement a per-exporter flow cache with active timeout expiration.
Add columns "cache_active_timeout" and "cache_max_flows" into table
"IPFIX" to configure each cache.

Add per-flow elements "octetDeltaSumOfSquares",
"minimumIpTotalLength", and "maximumIpTotalLength" to replace
"ethernetTotalLength".  Add per-flow element "flowEndReason" to
indicate whether a flow has expired because of an active timeout, the
cache size limit being reached, or the exporter being stopped.

Signed-off-by: Romain Lenglet <rlenglet@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofp-parse: Silence uninitialized use warnings with optimized gcc.
Ethan Jackson [Wed, 21 Aug 2013 21:49:09 +0000 (14:49 -0700)]
ofp-parse: Silence uninitialized use warnings with optimized gcc.

GCC 4.6.3 gets confused by the str_to_*() functions in ofp-parse and
spits out the following warning.

error: â€˜priority’ may be used uninitialized in this function

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Start ofport allocation from the previous max after restart.
Gurucharan Shetty [Mon, 19 Aug 2013 21:45:02 +0000 (14:45 -0700)]
ofproto: Start ofport allocation from the previous max after restart.

We currently do not recycle ofport numbers from interfaces that are recently
deleted by maintaining the maximum allocated ofport value and
allocating new ofport numbers greater than the previous maximum.
But after a restart of ovs-vswitchd, we start again from ofport value of '1'.
This means that after a restart, we can immeditaley recycle the 'ofport'
value of the most recently deleted interface.

With this commit, during ovs-vswitchd initial configuration, we figure
out the previously allocated max ofport value. New interfaces get ofport
value that is greater than this.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agonetdev: Make run and wait functions optional as documented.
Ethan Jackson [Wed, 21 Aug 2013 19:59:28 +0000 (12:59 -0700)]
netdev: Make run and wait functions optional as documented.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Reported-by: Guolin Yang <gyang@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif-upcall: Increase upcall stub size.
Ethan Jackson [Wed, 21 Aug 2013 19:56:53 +0000 (12:56 -0700)]
ofproto-dpif-upcall: Increase upcall stub size.

The existing upcall stub size is not large enough even for the
smallest miss upcalls.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Reported-by: Guolin Yang <gyang@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoReinterpret base for meter band types bitmap.
Ben Pfaff [Wed, 21 Aug 2013 20:05:36 +0000 (13:05 -0700)]
Reinterpret base for meter band types bitmap.

OpenFlow 1.3 says that the band_types member of struct ofp_meter_features
is a bitmap of OFPMBT_* values.  The OFPMBT_* values are 1-based, so
until now, to avoid wasting bit 0, OVS mapped an OFPMBT_* with value 1 to
bit 0, value 2 to bit 1, and so on.  However, according to
http://openvswitch.org/pipermail/dev/2013-July/029666.html,
other OpenFlow implementations directly translate value 1 to bit 1,
value 2 to bit 2, and so on.  This commit changes Open vSwitch to use this
more common interpretation.

A request for clarification of this issue in the OpenFlow standard has been
filed with the ONF Extensibility Working Group as issue EXT-345.

Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
10 years agoovs-atomic: Fix typo in comment.
Ben Pfaff [Wed, 21 Aug 2013 16:58:38 +0000 (09:58 -0700)]
ovs-atomic: Fix typo in comment.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ed Maste <emaste@freebsd.org>
10 years agoodp-util: New function odp_flow_key_to_mask().
Guolin Yang [Tue, 20 Aug 2013 17:40:50 +0000 (10:40 -0700)]
odp-util: New function odp_flow_key_to_mask().

With megaflow support, there is API to convert mask to nlattr key based
format.  This change introduces API to do the reverse conversion.  We
leverage the existing odp_flow_key_to_flow() API to reuse the code.

Signed-off-by: Guolin Yang <gyang@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agosparse: Suppress sparse warnings for global variables.
Alex Wang [Tue, 20 Aug 2013 22:24:19 +0000 (15:24 -0700)]
sparse: Suppress sparse warnings for global variables.

sparse warns if a non-static variable with external linkage has an
initializer at first declaration.  This commit suppresses the
warnings issued when adding custom section is not supported by
compiler.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-atomic: atomic_load() must take a non-const argument.
Ben Pfaff [Tue, 20 Aug 2013 17:46:15 +0000 (10:46 -0700)]
ovs-atomic: atomic_load() must take a non-const argument.

C11 says that atomic_load() requires a non-const argument, and Clang
enforces that.  This fixes warnings with FreeBSD <stdatomic.h> that uses
the Clang extensions.

Reported-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ed Maste <emaste@freebsd.org>
10 years agoofproto-dpif: Enable smooth transition between CFM and BFD.
Alex Wang [Wed, 21 Aug 2013 02:45:00 +0000 (02:45 +0000)]
ofproto-dpif: Enable smooth transition between CFM and BFD.

When user switches between using CFM and BFD, there will be a short
down time before the new protocol goes up.  This can unintentionally
change the traffic pattern of the bundled ports.  To prevent this,
it is proposed that user can enable both CFM and BFD before transition,
wait for the new protocol to go up, and then disable the old protocol.

To make this scheme work, this commit modifies the port_run() in
ofproto-dpif.c, so that when both CFM and BFD are used, if either shows
correct status, the port is considered usable in the bundle.

Signed-off-by: Alex Wang <alexw@nicira.com>
10 years agobfd: Implement BFD decay.
Alex Wang [Tue, 20 Aug 2013 23:41:05 +0000 (23:41 +0000)]
bfd: Implement BFD decay.

When there is no incoming data traffic at the interface for a period,
BFD decay allows the bfd session to increase the min_rx. This is
helpful in that some interfaces may usually be idle for a long time.
And cpu consumption can be reduced by processing fewer bfd control
packets.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agocoverage: Make thread-safe.
Ben Pfaff [Tue, 20 Aug 2013 20:46:33 +0000 (13:46 -0700)]
coverage: Make thread-safe.

This makes each of the coverage counters per-thread.  It abandons the
idea of trying to keep track of the number of hits in the "current" poll
loop, since there are many poll loops running, each in its own thread, as
well as the idea of numbering epochs for the same reason.  Instead, we
just keep track of overall totals for the process for each coverage
counter, accumulating per-thread counts into the global total each time
a thread's main loop passes through poll_block().

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoUse "error-checking" mutexes in place of other kinds wherever possible.
Ben Pfaff [Tue, 20 Aug 2013 20:40:02 +0000 (13:40 -0700)]
Use "error-checking" mutexes in place of other kinds wherever possible.

We've seen a number of deadlocks in the tree since thread safety was
introduced.  So far, all of these are self-deadlocks, that is, a single
thread acquiring a lock and then attempting to re-acquire the same lock
recursively.  When this has happened, the process simply hung, and it was
somewhat difficult to find the cause.

POSIX "error-checking" mutexes check for this specific problem (and
others).  This commit switches from other types of mutexes to
error-checking mutexes everywhere that we can, that is, everywhere that
we're not using recursive mutexes.  This ought to help find problems more
quickly in the future.

There might be performance advantages to other kinds of mutexes in some
cases.  However, the existing mutex type choices were just guesses, so I'd
rather go for easy detection of errors until we know that other mutex
types actually perform better in specific cases.  Also, I did a quick
microbenchmark of glibc mutex types on my host and found that the
error checking mutexes weren't any slower than the other types, at least
when the mutex is uncontended.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif-sflow: Fix memory leak.
Ben Pfaff [Tue, 20 Aug 2013 18:27:32 +0000 (11:27 -0700)]
ofproto-dpif-sflow: Fix memory leak.

dpif_sflow_set_options() uses xcalloc() to allocate space for the SFLAgent
structure, but nothing ever freed it.  This fixes the problem.

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif-xlate: Add comment on xlate_actions().
Ben Pfaff [Tue, 20 Aug 2013 18:16:14 +0000 (11:16 -0700)]
ofproto-dpif-xlate: Add comment on xlate_actions().

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Fix argument descriptions in vport.c.
Justin Pettit [Tue, 20 Aug 2013 00:25:08 +0000 (17:25 -0700)]
datapath: Fix argument descriptions in vport.c.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Fix function description in vport-lisp.c.
Justin Pettit [Tue, 20 Aug 2013 00:24:39 +0000 (17:24 -0700)]
datapath: Fix function description in vport-lisp.c.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Remove old argument description in flow.c.
Justin Pettit [Tue, 20 Aug 2013 00:23:26 +0000 (17:23 -0700)]
datapath: Remove old argument description in flow.c.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Fix typos in Netlink debugging messages.
pritesh [Mon, 19 Aug 2013 19:00:22 +0000 (12:00 -0700)]
datapath: Fix typos in Netlink debugging messages.

Signed-off-by: pritesh <pritesh.kothari@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agotests: Fix build on FreeBSD
Ed Maste [Sat, 17 Aug 2013 20:40:11 +0000 (16:40 -0400)]
tests: Fix build on FreeBSD

Avoid relying on a non-portable implementation detail for atomic_flag
tests.  Per the standard, the only way to obtain the value of the flag
is via the return value from atomic_flag_test_and_set.

Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: compat: remove __net_init and __net_exit annotations.
Pravin B Shelar [Thu, 15 Aug 2013 03:53:32 +0000 (20:53 -0700)]
datapath: compat: remove __net_init and __net_exit annotations.

net_namespace-device can get registered after module init, e.g. vxlan
registers name-space-device on port add.  On kernel without namespace
support __net_init is defined as __init which cause panic on vxlan port
add. Following patch fixes it.

BUG: unable to handle kernel paging request at ffffffffa02b6293
IP: [<ffffffffa02b6293>] 0xffffffffa02b6293
PGD 1a87067 PUD 1a8b063 PMD 8371de067 PTE 0
Oops: 0010 [#1] SMP
Process ovs-vswitchd (pid: 10330, threadinfo ffff8808367fe000, task
      f880839e16aa0)
Stack:
Call Trace:
 [<ffffffff8144b254>] ? register_pernet_gen_device+0x74/0xd0
 [<ffffffffa027e220>] ? vxlan_rcv+0x0/0x60 [openvswitch]
 [<ffffffffa0280a7b>] vxlan_handler_add+0x3cb/0x480 [openvswitch]
 [<ffffffffa027e1f4>] vxlan_tnl_create+0xc4/0xf0 [openvswitch]
 [<ffffffffa027b6f3>] ovs_vport_add+0x53/0xb0 [openvswitch]
 [<ffffffffa0273bc6>] new_vport+0x16/0x60 [openvswitch]
 [<ffffffffa0276399>] ovs_vport_cmd_new+0x109/0x210 [openvswitch]
 [<ffffffff81478f80>] genl_rcv_msg+0x1d0/0x210
 [<ffffffff81477e29>] netlink_rcv_skb+0xa9/0xd0
 [<ffffffff81478d95>] genl_rcv+0x25/0x40
 [<ffffffff81477a63>] netlink_unicast+0x283/0x2d0
 [<ffffffff814783de>] netlink_sendmsg+0x1fe/0x2e0
 [<ffffffff8143c8d3>] sock_sendmsg+0x123/0x150
 [<ffffffff8143e0b6>] __sys_sendmsg+0x406/0x420
 [<ffffffff8143e2d9>] sys_sendmsg+0x49/0x90
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Code:  Bad RIP value.
RIP  [<ffffffffa02b6293>] 0xffffffffa02b6293

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #19178

10 years agonetdev-linux: Fix self-deadlocks in traffic control code.
Ben Pfaff [Fri, 16 Aug 2013 21:25:16 +0000 (14:25 -0700)]
netdev-linux: Fix self-deadlocks in traffic control code.

htb_parse_qdisc_details__(), which is called with the netdev mutex, called
netdev_get_mtu(), which tried to reacquire the mutex and thus deadlocked.
This commit fixes the problem and similar problems in
htb_parse_class_details__() and hfsc_parse_qdisc_details__().

Bug #19180.
Reported-by: Dhaval Badiani <dbadiani@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-dummy: Add Clang lock annotations.
Ben Pfaff [Fri, 16 Aug 2013 21:04:08 +0000 (14:04 -0700)]
netdev-dummy: Add Clang lock annotations.

This commit adds some superfluous locks and unlocks, but they should not
hurt performance and do make Clang happy.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agonetdev-dummy: Fix synchronization error in netdev_dummy_get_config().
Ben Pfaff [Mon, 12 Aug 2013 19:51:47 +0000 (12:51 -0700)]
netdev-dummy: Fix synchronization error in netdev_dummy_get_config().

Found by Clang.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoofproto-dpif-upcall: ofproto_dpif_send_packet_in() needs object on heap.
YAMAMOTO Takashi [Fri, 16 Aug 2013 05:17:23 +0000 (14:17 +0900)]
ofproto-dpif-upcall: ofproto_dpif_send_packet_in() needs object on heap.

fix a bug introduced by commit e1ec7dd46 (ofproto-dpif: Implement
multi-threaded miss handling.), in which execute_flow_miss() passes a
stack-allocated object to ofproto_dpif_send_packet_in() whereas that
function requires a heap-allocated object.  Also fixes two related bugs:
the 'packet' previously used in the packet-in was invalid and its data
was not copied with xmemdup().

Previously, when there were multiple packets in a single flow miss,
execute_flow_miss() would only send the first one to the controller.  This
was intentional (the goal is to find out whether the controller is
operational, and sending a single packet is sufficient for that) but
possibly confusing to the reader.  This commit switches to sending all of
the packets (the common case is a single packet anyhow).

Signed-off-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoseq: Initialize 'ovsthread_id' member of waiters in seq_wait__().
Ben Pfaff [Fri, 16 Aug 2013 18:20:57 +0000 (11:20 -0700)]
seq: Initialize 'ovsthread_id' member of waiters in seq_wait__().

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoofproto-dpif-xlate: Unreference handles on xbridge removal.
Ethan Jackson [Fri, 16 Aug 2013 01:37:41 +0000 (18:37 -0700)]
ofproto-dpif-xlate: Unreference handles on xbridge removal.

Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
[blp@nicira.com added one more hmap_destroy()]
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agobfd: Include prerequisite header for FreeBSD
Ed Maste [Fri, 16 Aug 2013 12:24:30 +0000 (08:24 -0400)]
bfd: Include prerequisite header for FreeBSD

Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Fix bad pointer dereference when deleting a bridge.
Ben Pfaff [Fri, 16 Aug 2013 00:54:04 +0000 (17:54 -0700)]
ofproto-dpif: Fix bad pointer dereference when deleting a bridge.

The 'pins' list contains packet-ins, not flow-mods.

Introduced by commit ada3a58d1f8 (ofproto-dpif: Make packet_ins thread
safe.).

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto: Avoid extra O(N) work in common case on flow addition.
Ben Pfaff [Fri, 16 Aug 2013 00:38:40 +0000 (17:38 -0700)]
ofproto: Avoid extra O(N) work in common case on flow addition.

The OpenFlow OFPFF_CHECK_OVERLAP flag requires us to check whether the flow
being inserted overlaps with any existing flows.  That isn't efficiently
implemented and typically requires us to compare the new flow against most
or all of the existing flows.  We don't have to do that work if
OFPFF_CHECK_OVERLAP is not requested, but commit 0b4f207828c (classifier:
Make use of the classifier thread safe.) inadvertently made us do it
anyway.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto: Make number of packet handler threads runtime configurable.
Alex Wang [Thu, 15 Aug 2013 07:23:54 +0000 (00:23 -0700)]
ofproto: Make number of packet handler threads runtime configurable.

This commit adds a new column "n-handler-threads" to the Open_vSwitch
table. This is used to set the number of upcall handler threads created by
the ofproto-dpif-upcall module.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: move per-backer wait calls from wait to type_wait
YAMAMOTO Takashi [Thu, 15 Aug 2013 11:25:44 +0000 (20:25 +0900)]
ofproto-dpif: move per-backer wait calls from wait to type_wait

move per-backer wait calls (dpif_wait/udpif_wait) from ofproto_wait
to ofproto_type_wait.

This eliminates excessive poll slot consumption when there is more
than one bridge.

Signed-off-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Fix RHEL compat for netdev_rx_handler_register
Chris Wright [Wed, 14 Aug 2013 01:02:48 +0000 (18:02 -0700)]
datapath: Fix RHEL compat for netdev_rx_handler_register

Andrei Andone reported an oops on CentOS 6.4 when adding a device to an
ovs instance.  The problem is easy to reproduce and generates the
following stack trace:

kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at ffff88033afa49c0
IP: [<ffff88033afa49c0>] 0xffff88033afa49c0
...
Call Trace:
 <IRQ>
 [<ffffffff814487ba>] ? __netif_receive_skb+0x60a/0x750
 [<ffffffff81055f96>] ? enqueue_task+0x66/0x80
 [<ffffffff8144aa38>] netif_receive_skb+0x58/0x60
 [<ffffffff8144ab40>] napi_skb_finish+0x50/0x70
 [<ffffffff8144d0e9>] napi_gro_receive+0x39/0x50
 [<ffffffffa025f46c>] igb_poll+0x7ec/0xc70 [igb]
 [<ffffffff810e6881>] ? cpu_quiet_msk+0xc1/0x130
...

As you can see, we jumped to data rather than code.  This is caused by
a mistake in the compat code for netdev_rx_handler_register which sets
the handler to rx_handler_data rather than rx_handler.  This was
introduced by commit "3e35fe3 datapath: rhel: Move RHEL OVS hook
registration to netdev_rx_handler_register() backport".

Reported-by: Andrei Andone <andrei.andone@softvision.ro>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Thomas Graf <tgraf@redhat.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
Reviewed-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: Move generic tunnel functions to lisp module.
Pravin B Shelar [Tue, 13 Aug 2013 07:19:53 +0000 (00:19 -0700)]
datapath: Move generic tunnel functions to lisp module.

Generic tunnel rcv and send function are only used by lisp tunneling
module, so It make sense to move them to lisp module.

CC: Lori Jakab <lojakab@cisco.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Lorand Jakab <lojakab@cisco.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Move find_route() to compat.h
Pravin B Shelar [Mon, 29 Jul 2013 22:47:34 +0000 (15:47 -0700)]
datapath: Move find_route() to compat.h

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: move tunnel-init function to flow.h
Pravin B Shelar [Mon, 29 Jul 2013 22:47:27 +0000 (15:47 -0700)]
datapath: move tunnel-init function to flow.h

This makes ovs-module more in-sync with upstream ovs-module.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: tunnel: Fix gre64 tunnel when key not specified.
Pravin B Shelar [Thu, 15 Aug 2013 00:31:57 +0000 (17:31 -0700)]
datapath: tunnel: Fix gre64 tunnel when key not specified.

User is allowed to create tunnel without any keys.  In this
case userspace set tunnel action does not set tun-key flag
which was confusing gre64 vport header calculations.  Following
patch fixes it by always assuming TUNNEL_KEY parameter as we
do it with tun-seq.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #19121

10 years agodatapath: tunnel: Do not use inner ip-header-id for tunnel ip-header-id.
Pravin B Shelar [Wed, 14 Aug 2013 18:46:15 +0000 (11:46 -0700)]
datapath: tunnel: Do not use inner ip-header-id for tunnel ip-header-id.

Using inner-id for tunnel id is not safe in some rare cases.
E.g. packets coming from multiple sources entering same tunnel
can have same id. Therefore on tunnel packet receive we
could have packets from two different stream but with same
source and dst IP with same ip-id which could confuse ip packet
reassembly.

CC: Jarno Rajahalme <jrajahalme@nicira.com>
CC: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agonetdev-bsd: implement netdev_arp_lookup for NetBSD
YAMAMOTO Takashi [Tue, 13 Aug 2013 20:57:41 +0000 (05:57 +0900)]
netdev-bsd: implement netdev_arp_lookup for NetBSD

Signed-off-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agobfd: Increase configuration efficiency.
Alex Wang [Tue, 13 Aug 2013 23:51:08 +0000 (16:51 -0700)]
bfd: Increase configuration efficiency.

Currently, when there are multiple bfd configuration changes,
the bfd_poll() will only update one change at a time with the
other side. This commit moves the call to bfd_poll() at the
end of configuration processing function, so that bfd_poll()
will update all configuration changes together.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agobfd: Fix check_tnl_key error.
Alex Wang [Wed, 14 Aug 2013 20:12:33 +0000 (13:12 -0700)]
bfd: Fix check_tnl_key error.

This commit fixes the bug introduced by commit 26131299fa5 (bfd: Make the
BFD module thread safe.). The bug caused the opposite of the intended
behaviour.

Unit test is added for the 'check_tnl_key' feature.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Destroy bundle after moving its last port out.
Ben Pfaff [Wed, 14 Aug 2013 00:44:14 +0000 (17:44 -0700)]
ofproto-dpif: Destroy bundle after moving its last port out.

When the ofp_port argument to bundle_add_port() refers to an ofport_dpif
that already belongs to some other bundle, bundle_add_port() removed
the port from the other bundle, correctly, with bundle_del_port().
If the other bundle now contained no ports, however, this violated the
invariant that a bundle always contains at least one port.

Normally, this would get fixed up when the other bundle was processed
later during reconfiguration.  I haven't quite zeroed in on the exact
case where this is not true, but segfaults have happened here in
production, in particular when port adds and deletes happen simultaneously
and the new port reuses the OpenFlow port number of one of the deleted
ports.  It seems that the duplicate port number allows some port to rip
away the new port from its bundle without destroying that bundle.  I
suspect, therefore, that there is still a more subtle bug here, but I
hope that this will fix the segfault.

Bug #18967.
Signed-off-by: Ben Pfaff <blp@nicira.com>