cascardo/ovs.git
10 years ago  vlan-splinter: Fix inverted logic bug.  v1.10.2
Alex Wang [Tue, 23 Jul 2013 01:15:49 +0000 (18:15 -0700)]
vlan-splinter: Fix inverted logic bug.

When "other-config:enable-vlan-splinters=true" is set, the existing
VLANs that have an IP address must be retained.  The bug did the
opposite and retained the VLANs without an IP address.  This commit
fixes it.

Reported-by: Roman Sokolkov <rsokolkov@gmail.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
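
A minimal sketch of the corrected predicate from the vlan-splinter commit above, written in Python rather than the project's C. The `Vlan` record and the `vlans_to_retain` helper are hypothetical names for illustration, not Open vSwitch code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vlan:
    """Hypothetical record for an existing VLAN device."""
    name: str
    ip_address: Optional[str]  # None when no address is configured


def vlans_to_retain(vlans: List[Vlan]) -> List[Vlan]:
    """Keep VLANs that have an IP address (the bug kept the ones without)."""
    return [v for v in vlans if v.ip_address is not None]


vlans = [Vlan("eth0.10", "192.0.2.1"), Vlan("eth0.20", None)]
retained = vlans_to_retain(vlans)
```

With the inverted test (`is None`), only `eth0.20` would have been retained.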
10 years ago  Release Open vSwitch 1.10.2.
Justin Pettit [Wed, 4 Sep 2013 22:56:46 +0000 (15:56 -0700)]
Release Open vSwitch 1.10.2.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years ago  ofproto-dpif: Do not dpif_port_del() patch ports in port_del().
Ben Pfaff [Fri, 30 Aug 2013 03:55:10 +0000 (20:55 -0700)]
ofproto-dpif: Do not dpif_port_del() patch ports in port_del().

Patch ports don't have datapath ports so it doesn't make sense to try to
call dpif_port_del() on them.  If we do try, it will fail, which makes the
caller think that the port wasn't really deleted, which in turn keeps
ofproto from reporting the port deletion via OpenFlow.  This fixes the
problem.

Without this patch, the following commands, executed under "make sandbox",
will report the patch port addition in "ovs-ofctl monitor" output, but not
the patch port deletion.  With this patch, both the addition and deletion
will be reported.

    ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
    ovs-ofctl monitor br0 128 &
    ovs-vsctl add-port br0 patch1 \
        -- set interface patch1 type=patch options:peer=patch2 \
        -- add-port br0 patch2 \
        -- set interface patch2 type=patch options:peer=patch1
    ovs-vsctl del-port patch1
    ovs-vsctl del-port patch2

Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
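
The idea behind the patch-port fix above can be sketched as follows. This is an illustrative Python model, not the C code in ofproto-dpif; `FakeDatapath` and this `port_del` signature are assumptions made for the example.

```python
class FakeDatapath:
    """Hypothetical stand-in for a datapath that tracks its port names."""

    def __init__(self):
        self.ports = set()

    def port_del(self, name):
        # Deleting a port the datapath never had is an error.
        if name not in self.ports:
            raise KeyError(name)
        self.ports.remove(name)


def port_del(datapath, name, port_type):
    """Delete a port, touching the datapath only if the port lives there."""
    if port_type != "patch":
        datapath.port_del(name)
    return True  # deletion succeeded, so it can be reported via OpenFlow


dp = FakeDatapath()
dp.ports.add("eth0")
patch_deleted = port_del(dp, "patch1", "patch")
eth_deleted = port_del(dp, "eth0", "system")
```

Skipping the datapath call for patch ports avoids the spurious failure that previously hid the deletion from OpenFlow monitors.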
10 years ago  ofproto: Convert units correctly in ofport_open().
Ben Pfaff [Wed, 4 Sep 2013 20:37:56 +0000 (13:37 -0700)]
ofproto: Convert units correctly in ofport_open().

netdev_features_to_bps() returns a speed in bps, but struct
ofputil_phy_port's curr_speed and max_speed are in kbps, so a conversion
is necessary.  This commit fixes the problem.

Reported-by: Benjamin Lunsky <benjamin.lunsky@netronome.com>
Tested-by: Benjamin Lunsky <benjamin.lunsky@netronome.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
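
The unit conversion described in the commit above amounts to a division by 1000. A minimal Python sketch, where `bps_to_kbps` is a hypothetical helper name and not a function from the Open vSwitch tree:

```python
def bps_to_kbps(bps: int) -> int:
    """Convert bits per second to kilobits per second (1 kbps = 1000 bps)."""
    return bps // 1000


curr_speed_kbps = bps_to_kbps(10_000_000_000)  # a 10 Gbps link
```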
10 years ago  ofproto-dpif: Destroy bundle after moving its last port out.
Ben Pfaff [Wed, 14 Aug 2013 00:44:14 +0000 (17:44 -0700)]
ofproto-dpif: Destroy bundle after moving its last port out.

When the ofp_port argument to bundle_add_port() refers to an ofport_dpif
that already belongs to some other bundle, bundle_add_port() removed
the port from the other bundle, correctly, with bundle_del_port().
If the other bundle now contained no ports, however, this violated the
invariant that a bundle always contains at least one port.

Normally, this would get fixed up when the other bundle was processed
later during reconfiguration.  I haven't quite zeroed in on the exact
case where this is not true, but segfaults have happened here in
production, in particular when port adds and deletes happen simultaneously
and the new port reuses the OpenFlow port number of one of the deleted
ports.  It seems that the duplicate port number allows some port to rip
away the new port from its bundle without destroying that bundle.  I
suspect, therefore, that there is still a more subtle bug here, but I
hope that this will fix the segfault.

Bug #18967.
Signed-off-by: Ben Pfaff <blp@nicira.com>
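
The invariant discussed in the bundle commit above (a bundle always contains at least one port) can be illustrated with a small Python model. The `Bridge` and `Bundle` classes here are hypothetical and only mirror the behavior the message describes, not the real data structures.

```python
class Bundle:
    def __init__(self, name):
        self.name = name
        self.ports = set()


class Bridge:
    """Hypothetical container; a bundle must always hold at least one port."""

    def __init__(self):
        self.bundles = {}      # bundle name -> Bundle
        self.port_bundle = {}  # port number -> Bundle

    def bundle_add_port(self, bundle_name, port):
        bundle = self.bundles.setdefault(bundle_name, Bundle(bundle_name))
        old = self.port_bundle.get(port)
        if old is not None and old is not bundle:
            old.ports.discard(port)
            if not old.ports:
                # Moving out the last port: destroy the now-empty bundle.
                del self.bundles[old.name]
        bundle.ports.add(port)
        self.port_bundle[port] = bundle


br = Bridge()
br.bundle_add_port("a", 1)
br.bundle_add_port("b", 1)  # port 1 moves from bundle "a" to bundle "b"
```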
10 years ago  debian: Fix build with old versions of dpkg-buildflags.
Ben Pfaff [Tue, 13 Aug 2013 19:54:35 +0000 (12:54 -0700)]
debian: Fix build with old versions of dpkg-buildflags.

dpkg-buildflags has not always supported --export=configure, but commit
6c2d4c8780 (debian: Apply hardening options to build.) used it
unconditionally, causing the build to fail on old Debian distributions.
This fixes the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  ovs-ofctl: Avoid groff warning due to too-long line.
Ben Pfaff [Mon, 12 Aug 2013 22:11:35 +0000 (15:11 -0700)]
ovs-ofctl: Avoid groff warning due to too-long line.

Avoids these warnings from groff:

<standard input>:1037: warning [p 14, 6.0i]: cannot adjust line
<standard input>:1037: warning [p 14, 6.2i]: can't break line

Found by lintian.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  debian: Apply hardening options to build.
Ben Pfaff [Mon, 12 Aug 2013 22:10:39 +0000 (15:10 -0700)]
debian: Apply hardening options to build.

Debian now encourages building every program with various GCC hardening
options.  This commit implements that recommendation for Open vSwitch.

See https://wiki.debian.org/Hardening for details.

Found by lintian.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  datapath: Use kernel eth_mac_addr() on old kernels.
Jesse Gross [Fri, 12 Jul 2013 17:02:15 +0000 (10:02 -0700)]
datapath: Use kernel eth_mac_addr() on old kernels.

The OVS MAC address set function was removed in favor of the version
in the kernel but the function pointer for older kernels was not
updated.

Reported-by: Cali Ente <calientepermanente@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
10 years ago  FAQ: Add supported kernel versions for newer OVS releases.
Jesse Gross [Tue, 9 Jul 2013 21:11:12 +0000 (14:11 -0700)]
FAQ: Add supported kernel versions for newer OVS releases.

Reported-by: Kris zhang <zhang.kris@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
10 years ago  tests: Tolerate init process pid != 1.
James Page [Thu, 20 Jun 2013 21:31:52 +0000 (22:31 +0100)]
tests: Tolerate init process pid != 1.

On Ubuntu Saucy based desktops, upstart runs with user sessions
enabled which means that the init process under which a daemon
might run is not always pid = 1.

Instead of checking for pid = 1, check to ensure that the parent
pid of the monitor is not the pid of the shell that started it.

Signed-off-by: James Page <james.page@ubuntu.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  vswitchd: Update flow-eviction-threshold documentation
Joe Stringer [Thu, 20 Jun 2013 00:25:30 +0000 (09:25 +0900)]
vswitchd: Update flow-eviction-threshold documentation

Patch 27a88d1373cbfcceac6d901bbf1c17051aa7845f caused the vswitchd
documentation and the code to diverge. This brings them back in line.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  ovs-xapi-sync: Cache the bridge-id value for non nicira-bridge-id too.
Gurucharan Shetty [Sun, 16 Jun 2013 12:09:20 +0000 (05:09 -0700)]
ovs-xapi-sync: Cache the bridge-id value for non nicira-bridge-id too.

Currently, when there are multiple external_ids:xs-network-uuids, we
connect to xapi to get the single bridge id every time the database
changes in any of the columns that ovs-xapi-sync is interested in.
The xs-network-uuids value can also change whenever
new VLANs are added or deleted, which is a common use case. The
disadvantage with this approach is that we query XAPI more often
and set the bridge-id as "" if we don't get a valid response for
our query. This can take down the logical connectivity for all the
VMs on that xenserver.

Instead of looking at the PIF records for all the xs-network-uuids,
we can instead just look at the xapi record which has the same bridge
name as the OVS bridge name and then cache its uuid. This value will
hold true till the OVS bridge is recreated in which case we will re-read
the value.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
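
The caching approach described in the commit above might look roughly like this. It is a sketch under stated assumptions: `BridgeIdCache`, `forget`, and `fake_xapi_lookup` are hypothetical names, and the real ovs-xapi-sync code may be organized differently.

```python
class BridgeIdCache:
    """Hypothetical cache keyed by OVS bridge name."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable: bridge name -> uuid (queries xapi)
        self._cache = {}

    def get(self, bridge_name):
        if bridge_name not in self._cache:
            self._cache[bridge_name] = self._lookup(bridge_name)
        return self._cache[bridge_name]

    def forget(self, bridge_name):
        """Call when the OVS bridge is recreated so the value is re-read."""
        self._cache.pop(bridge_name, None)


calls = []


def fake_xapi_lookup(name):
    # Stand-in for the real xapi query; records each call for inspection.
    calls.append(name)
    return "uuid-for-" + name


cache = BridgeIdCache(fake_xapi_lookup)
first = cache.get("xenbr0")
second = cache.get("xenbr0")  # served from the cache, no second query
```

Keying on the bridge name means VLAN additions and deletions, which change xs-network-uuids, no longer trigger a xapi query.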
10 years ago  ovs-xapi-sync: Retry getting bridge-ids in case xapi is not ready.
Gurucharan Shetty [Fri, 14 Jun 2013 18:23:25 +0000 (11:23 -0700)]
ovs-xapi-sync: Retry getting bridge-ids in case xapi is not ready.

When there are multiple xs-network-uuids set for a bridge,
we query xapi to get the record that does not have a VLAN
associated with it. When xapi does not respond,
retry after a second.

While xapi is not responding, set
external_ids:bridge_id to "".

Bug #17877.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
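
The retry behavior described in the commit above can be sketched as a small loop. The function name, the number of attempts, and the injectable `sleep` parameter are assumptions for illustration; the commit message only states that the query is retried after a second and that the value is "" while xapi is unresponsive.

```python
import time


def get_bridge_id_with_retry(query, attempts=3, delay=1.0, sleep=time.sleep):
    """Call query() until it returns a value; return "" if it never does."""
    for attempt in range(attempts):
        result = query()
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(delay)  # wait a second before asking xapi again
    return ""  # xapi never answered: fall back to an empty bridge-id


responses = iter([None, "uuid-1234"])
bridge_id = get_bridge_id_with_retry(
    lambda: next(responses), sleep=lambda seconds: None
)
```

Passing a no-op `sleep` keeps the example fast; real use would keep the default `time.sleep`.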
10 years ago  odp-util: Use proper formatting for ODP port number.
Jarno Rajahalme [Fri, 14 Jun 2013 14:09:34 +0000 (17:09 +0300)]
odp-util: Use proper formatting for ODP port number.

Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  ofproto: Fix use of uninitialized local variable.
Jarno Rajahalme [Fri, 14 Jun 2013 14:09:33 +0000 (17:09 +0300)]
ofproto: Fix use of uninitialized local variable.

Also make the table id arithmetic less confusing.

Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  ovsdb-server: Preserve remotes across crash and restart.
Ben Pfaff [Thu, 13 Jun 2013 19:25:39 +0000 (12:25 -0700)]
ovsdb-server: Preserve remotes across crash and restart.

Commit b421d2af0ab (ovsdb-server: Add commands for adding and removing
remotes) made it possible to make ovsdb-server connect to OVS managers only
after ovs-vswitchd has completed its initial configuration.  But this
results in an undesirable effect: whenever ovsdb-server crashes, the
monitor restarts it, but ovsdb-server can no longer connect to the manager
because the remotes were added during runtime and that information is lost
during the crash.

This commit fixes the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  ovsdb-idlc: Write a new-line at the end of "annotate" output.
Ben Pfaff [Mon, 10 Jun 2013 17:25:29 +0000 (10:25 -0700)]
ovsdb-idlc: Write a new-line at the end of "annotate" output.

Some tools do not like text files that lack a trailing new-line.  In
particular, Debian's dpkg-source utility complains about a missing new-line
in the file generated by ovsdb-idlc:

    dpkg-source: warning: file
    openvswitch-1.9.2+git20130605/lib/vswitch-idl.ovsidl has no final
    newline (either original or modified version)

This commit fixes the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years ago  dpif-netdev: Don't run port names through netdev_vport_get_dpif_port().
Ben Pfaff [Thu, 6 Jun 2013 22:27:15 +0000 (15:27 -0700)]
dpif-netdev: Don't run port names through netdev_vport_get_dpif_port().

The ports that exist within a dpif have already been translated through
netdev_vport_get_dpif_port(), so there is no value to translating them
again in the interfaces that query or dump ports (and possibly a drawback
if somehow the translation could change).

After this change, dpif-netdev translates port names in just one place,
the port_add path, which makes dpif-netdev act the same way as dpif-linux
in this respect.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years ago  ofproto-dpif.c: Modify vsp_realdev_to_vlandev() function
Alex Wang [Wed, 5 Jun 2013 19:34:01 +0000 (12:34 -0700)]
ofproto-dpif.c: Modify vsp_realdev_to_vlandev() function

Commit 52a90c29 (Implement new "VLAN splinters" feature) passed in OpenFlow
port number to vsp_realdev_to_vlandev() function which asks for datapath port
number.

This patch fixes this bug by making the vsp_realdev_to_vlandev() function
take in and return OpenFlow port number.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  ofproto-dpif: Do not give stats to rules bypassed by "drop" frag policy.
Ben Pfaff [Wed, 5 Jun 2013 17:11:55 +0000 (10:11 -0700)]
ofproto-dpif: Do not give stats to rules bypassed by "drop" frag policy.

When the OFPC_FRAG_DROP policy is in effect, IP fragments are supposed to
be dropped before they reach the flow table.  Open vSwitch properly dropped
IP fragments in this case, but still accounted them to the packet and byte
counters for the flow that they would have hit if the OFPC_FRAG_NX_MATCH
policy had been in effect.

Reported-by: love you <thunder.love07@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years ago  ofproto-dpif: Don't count misses in OpenFlow table stats.
Jesse Gross [Sat, 25 May 2013 00:01:34 +0000 (17:01 -0700)]
ofproto-dpif: Don't count misses in OpenFlow table stats.

Originally no rule existed for packets that did not match an
OpenFlow flow and therefore every packet with a rule could be
counted as a hit. However, newer versions of OVS have hidden
miss rules so this is no longer true. To return the correct
table stats, this subtracts packets that hit the miss rule
from the total and removes the separate counter.

Reported-by: love you <thunder.love07@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
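
The arithmetic in the table-stats commit above is a single subtraction. A minimal Python sketch, with a hypothetical helper name and made-up counter values:

```python
def table_matched_count(total_rule_packets: int, miss_rule_packets: int) -> int:
    """Packets that matched a real OpenFlow flow, excluding hidden miss rules."""
    return total_rule_packets - miss_rule_packets


# Example values only: 1000 packets hit some rule, 150 of them a miss rule.
matched = table_matched_count(total_rule_packets=1000, miss_rule_packets=150)
```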
10 years ago  packets: Fix typo in reserved multicast Ethernet addresses.
Ben Pfaff [Tue, 28 May 2013 23:05:34 +0000 (16:05 -0700)]
packets: Fix typo in reserved multicast Ethernet addresses.

The reserved multicast Ethernet addresses begin with 01:80:c2, not
01:08:c2.

Reported-by: Padmanabhan Krishnan <kprad1@yahoo.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
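
To make the one-digit difference in the commit above concrete, here is a simplified Python prefix check. It only tests the three-octet prefix named in the commit message; the helper name is hypothetical and the real check in Open vSwitch may be narrower.

```python
RESERVED_PREFIX = bytes([0x01, 0x80, 0xC2])  # not 01:08:c2


def has_reserved_multicast_prefix(mac: str) -> bool:
    """True if a colon-separated MAC address starts with 01:80:c2."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    return len(octets) == 6 and octets[:3] == RESERVED_PREFIX


correct = has_reserved_multicast_prefix("01:80:c2:00:00:00")
typo = has_reserved_multicast_prefix("01:08:c2:00:00:00")
```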
10 years ago  Always use valid ids pointer in dec_ttl_cnt_ids_from_openflow()
Simon Horman [Mon, 3 Jun 2013 05:46:30 +0000 (14:46 +0900)]
Always use valid ids pointer in dec_ttl_cnt_ids_from_openflow()

Always update the ids pointer after calling ofpbuf_put()
to ensure that it is valid when accessed.

During testing a case came up where the call to ofpbuf_put() in the
for (i = 0; i < ids->n_controllers; i++) loop would cause the underlying
buffer to be reallocated. This resulted in ids->n_controllers being an
incorrect value, the loop continuing on longer than desired and finally a
segmentation fault.

Reported-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  debian: Don't fail ovs-controller restart if daemon not running.
Gurucharan Shetty [Wed, 29 May 2013 00:18:12 +0000 (17:18 -0700)]
debian: Don't fail ovs-controller restart if daemon not running.

Reported-by: Maxime Brun <m.brun@alphalink.fr>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
10 years ago  ovs-xapi-sync: Handle exceptions from XAPI for get_single_bridge_id.
Gurucharan Shetty [Thu, 23 May 2013 23:14:19 +0000 (16:14 -0700)]
ovs-xapi-sync: Handle exceptions from XAPI for get_single_bridge_id.

Records can disappear underneath ovs-xapi-sync.
In this particular case, when a VLAN network was deleted, the corresponding
record in the bridge's external_ids:xs_network_ids column was not deleted by
xenserver.  In situations like that, handle the exceptions cleanly.

Bug #17390.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
10 years ago  ovs-xapi-sync: Handle multiple xs-network-uuids for xs 6.1.
Gurucharan Shetty [Sun, 19 May 2013 07:05:09 +0000 (00:05 -0700)]
ovs-xapi-sync: Handle multiple xs-network-uuids for xs 6.1.

For xenservers with version less than 6.1, interface reconfiguration
happened through interface-reconfigure scripts in this repo. In cases
where there were multiple xs-network-uuids for a single bridge,
interface-reconfigure script would add the network uuid associated
with the non-VLAN network as the first record. ovs-xapi-sync would
just blindly use the first record to create the bridge-id.

But it looks like for xenserver 6.1, interface-reconfigure script
is no longer used and xenserver natively writes the xs-network-uuids.
So, in ovs-xapi-sync we no longer can copy the first value in
xs-network-uuids as bridge-id. This commit fetches the PIF record
for each xs-network-uuids and the network that does not have a VLAN
associated with it is copied over to bridge-id.

Bug #17090.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
10 years ago  Release Open vSwitch 1.10.1.
Justin Pettit [Wed, 15 May 2013 05:24:07 +0000 (22:24 -0700)]
Release Open vSwitch 1.10.1.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years ago  datapath: Fix compilation with Linux kernel 3.7.
Pravin B Shelar [Mon, 13 May 2013 22:53:06 +0000 (15:53 -0700)]
datapath: Fix compilation with Linux kernel 3.7.

The definitions of __sum16 and __wsum were moved to a uapi header.
This patch adds a check in the config script for the second possible
header.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years ago  ofproto-dpif: Make fin_timeout work when governor kicks in.
Ben Pfaff [Sun, 12 May 2013 21:53:51 +0000 (14:53 -0700)]
ofproto-dpif: Make fin_timeout work when governor kicks in.

The xlate_actions() call in handle_flow_miss_without_facet() didn't
implement fin_timeout properly because tcp_flags wasn't getting set.

I have not tested that this fixes the problem, but it seems "obviously
correct".

Bug #16506.
Reported-by: Ying Chen <yingchen@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years ago  flow: Fix IPv6 fragment packet handling
Takashi Kawaguchi [Thu, 9 May 2013 17:39:34 +0000 (02:39 +0900)]
flow: Fix IPv6 fragment packet handling

IPv6 fragmented packets (other than the first fragment) are not handled
correctly. When parse_ipv6() extracts a packet, nw_frag should have
both FLOW_NW_FRAG_ANY and FLOW_NW_FRAG_LATER set for a
later fragment, but only FLOW_NW_FRAG_LATER is set.

Signed-off-by: Takashi Kawaguchi <kawaguchi-takashi@mxd.nes.nec.co.jp>
Signed-off-by: Ken Ajiro <ajiro@mxw.nes.nec.co.jp>
Signed-off-by: Jesse Gross <jesse@nicira.com>
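
The intended flag combination from the IPv6 fragment commit above can be sketched in Python. The bit values assigned to the two constants and the `nw_frag_for` helper are assumptions for illustration; only the constant names come from the commit message.

```python
FLOW_NW_FRAG_ANY = 1 << 0    # set on every fragment (bit value assumed)
FLOW_NW_FRAG_LATER = 1 << 1  # set on fragments with nonzero offset (assumed)


def nw_frag_for(is_fragment: bool, offset: int) -> int:
    """Return the nw_frag bits for a packet, given its fragment offset."""
    if not is_fragment:
        return 0
    nw_frag = FLOW_NW_FRAG_ANY
    if offset != 0:
        nw_frag |= FLOW_NW_FRAG_LATER  # later fragments carry both bits
    return nw_frag


first = nw_frag_for(True, 0)
later = nw_frag_for(True, 1480)
```

The bug corresponds to assigning `FLOW_NW_FRAG_LATER` alone for later fragments, which loses the `FLOW_NW_FRAG_ANY` bit.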
10 years ago  ovsdb-client: Fix recently introduced svec_sort() bug.
Justin Pettit [Tue, 7 May 2013 04:30:26 +0000 (21:30 -0700)]
ovsdb-client: Fix recently introduced svec_sort() bug.

Commit 66980be9 (ovsdb-client: Avoid assertion with multiple databases.)
passed in a pointer to an svec pointer, when it should have just been an
svec pointer.  This corrects the bug.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years ago  ovsdb-client: Avoid assertion with multiple databases.
Justin Pettit [Mon, 6 May 2013 19:43:48 +0000 (12:43 -0700)]
ovsdb-client: Avoid assertion with multiple databases.

When using ovsdb-client with an ovsdb-server with multiple databases, an
assertion could trigger due to them being returned in non-sorted order.
This commit changes the fetch_dbs() function to always return databases
in sorted order, since both callers are expecting that behavior.

Bug #16882

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Reported-by: Spiro Kourtessis <spiro@vmware.com>
10 years ago  netdev-bsd: Use UINT64_MAX for unsupported stats.
Ed Maste [Fri, 3 May 2013 20:57:38 +0000 (16:57 -0400)]
netdev-bsd: Use UINT64_MAX for unsupported stats.

As documented in netdev-provider.h for the get_stats method.

Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  Change sFlow model to reflect per-bridge sampling
Neil Mckee [Wed, 1 May 2013 05:38:53 +0000 (22:38 -0700)]
Change sFlow model to reflect per-bridge sampling

Until now, we were presenting a separate sFlow data-source (sampler) for
each ifIndex-interface.  This caused problems with samples that did not
easily map to an ifIndex being aliased together and breaking the sFlow
containment rules.  This patch changes the model to present a single sFlow
data-source for each bridge.  Now we can still make all reasonable effort
to map packet samples to ingress/egress ifIndex numbers, knowing that the
fallback to "unknown" does not break the sFlow model.  Note that
interface-counter-polling is still handled the same way as before, with
sFlow counter-polling data only being exported for ifIndex-interfaces.

Signed-off-by: Neil Mckee <neil.mckee@inmon.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years ago  dpif-linux: Close channel Netlink sockets when a port number gets recycled.
Ben Pfaff [Thu, 2 May 2013 00:08:20 +0000 (17:08 -0700)]
dpif-linux: Close channel Netlink sockets when a port number gets recycled.

When ovs-vswitchd deletes a port with dpif_linux_port_del(), that function
uses del_channel() to delete the corresponding channel, including closing
its Netlink socket fd.  However, if the vport gets removed by some other
process (e.g. "ip link delete" for veths) then this function never gets
called and thus the channel never gets deleted.

This commit partially fixes the problem.  Now, if a port number gets
reused, add_channel() closes the old Netlink socket assigned to that port
before it installs the new one.

Bug #16784.
Reported-by: Paul Ingram <paul@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
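
The partial fix described in the commit above (closing the old socket when a port number is reused) can be modeled in Python. The `Channels` class is hypothetical, and ordinary socket pairs stand in for the Netlink sockets used by dpif-linux.

```python
import socket


class Channels:
    """Hypothetical port-number -> socket table."""

    def __init__(self):
        self.by_port = {}

    def add_channel(self, port_no, sock):
        old = self.by_port.get(port_no)
        if old is not None and old is not sock:
            old.close()  # the port number was recycled; drop the stale socket
        self.by_port[port_no] = sock


old_sock, old_peer = socket.socketpair()
new_sock, new_peer = socket.socketpair()
channels = Channels()
channels.add_channel(5, old_sock)
channels.add_channel(5, new_sock)  # port 5 reused by a new vport
old_closed = old_sock.fileno() == -1  # a closed socket reports fileno -1
current = channels.by_port[5]
for s in (old_peer, new_sock, new_peer):
    s.close()
```

As the commit notes, this is only a partial fix: a stale socket lingers until its port number is reused.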
10 years ago  Set release date for 1.10.0.  v1.10.0
Justin Pettit [Wed, 1 May 2013 21:30:38 +0000 (14:30 -0700)]
Set release date for 1.10.0.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years ago  worker: Prevent worker from being responsible for pidfile deletion.
Gurucharan Shetty [Mon, 29 Apr 2013 02:25:55 +0000 (19:25 -0700)]
worker: Prevent worker from being responsible for pidfile deletion.

Currently we are creating the worker process after creation of the pidfile.
This means that the responsibility of deleting the pidfile after process
termination rests with the worker process.

When we restart openvswitch using the startup scripts, we SIGTERM the main
process and once it is cleaned up, we start ovs-vswitchd again. This results
in a race condition. The new ovs-vswitchd will create a pidfile because it is
unlocked. But, if the old worker process exits after the start of new
ovs-vswitchd, it will simply delete the pidfile underneath the new ovs-vswitchd.
This will eventually result in multiple ovs-vswitchd daemons.

This patch gives the responsibility of deleting the pidfile to the main
process.

Bug #16669.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
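
The ownership rule from the pidfile commit above can be illustrated with a small Python model. It is not how the daemon code is structured; the `Pidfile` class and its `caller_pid` parameter are hypothetical and exist only to show that a process other than the creator leaves the file alone.

```python
import os
import tempfile


class Pidfile:
    """Hypothetical pidfile that only its creating process may delete."""

    def __init__(self, path):
        self.path = path
        self.owner_pid = os.getpid()
        with open(path, "w") as f:
            f.write("%d\n" % self.owner_pid)

    def remove(self, caller_pid=None):
        caller_pid = os.getpid() if caller_pid is None else caller_pid
        if caller_pid != self.owner_pid:
            return False  # a worker process must leave the file alone
        os.unlink(self.path)
        return True


path = os.path.join(tempfile.mkdtemp(), "ovs-vswitchd.pid")
pidfile = Pidfile(path)
worker_removed = pidfile.remove(caller_pid=pidfile.owner_pid + 1)
still_there = os.path.exists(path)
main_removed = pidfile.remove()
```

With deletion tied to the main process, a late-exiting worker can no longer remove the pidfile of a newly started daemon.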
11 years ago  vswitchd: Disable system stats collection on a concurrently running daemon.
Gurucharan Shetty [Sun, 28 Apr 2013 02:58:12 +0000 (19:58 -0700)]
vswitchd: Disable system stats collection on a concurrently running daemon.

There are very rare cases (e.g. ovs-vswitchd.pid is inadvertently deleted),
when multiple ovs-vswitchd daemons can end up running at the same time.
In a situation like that one of the daemons can wait on the poll()
with a 0 ms wait time as it would be expecting system stats to be collected.

But system stats are never run for the daemon that does not have the
lock on the database and hence it takes up 100% of the CPU if its state
machine for stats collection previously was S_WAITING.

With this patch, we disable the system stats collection for the daemon that
does not have the database lock. When it eventually gets the lock on the
database, system stats are automatically enabled if other_config:\
enable-statistics=true.

Bug #16669.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years ago  datapath: Account for RHEL6.4 backports in compat layer
Thomas Graf [Fri, 26 Apr 2013 10:03:11 +0000 (12:03 +0200)]
datapath: Account for RHEL6.4 backports in compat layer

Explicitly check the availability of several kernel API functions
instead of relying on the kernel version to account for Red Hat
Enterprise Linux backports.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
(cherry picked from commit 42d5dd9595cce35a8825a20be7d71a3a8f6f5640)

Conflicts:
datapath/linux/compat/include/asm/percpu.h
datapath/linux/compat/include/linux/netdevice.h

11 years ago  datapath: Use openvswitch_handle_frame hook in >=RHEL6.4 to live side by side with...
Thomas Graf [Fri, 26 Apr 2013 10:03:10 +0000 (12:03 +0200)]
datapath: Use openvswitch_handle_frame hook in >=RHEL6.4 to live side by side with bridging

Due to the missing register rx_handler API in the kernel RHEL6 is
based on, the datapath currently falls back to using the bridging
hook with the consequence that bridging and OVS cannot be used in
parallel on any RHEL6 release.

For this purpose, >=RHEL6.4 releases provide a special rx frame hook
to be used by OVS. It captures frames at the same location in the
stack as the rx_handler would do in more recent kernel releases. In
order to store the vport pointer, the net_device's ax25_ptr field is
utilized under the assumption that an AX25 device will never be
attached to an OVS bridge.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
(cherry picked from commit f285d3e715512571c4b2f92a4d1c65022bbcc9d5)

Conflicts:
datapath/vport-netdev.c

11 years ago  Add FAQ entries around the VXLAN support in Open vSwitch.
Kyle Mestery [Fri, 26 Apr 2013 18:30:25 +0000 (14:30 -0400)]
Add FAQ entries around the VXLAN support in Open vSwitch.

Add a section to the FAQ explaining VXLAN with a pointer to the IETF draft.
Add sections detailing how much of the VXLAN protocol is currently supported
in OVS, along with a section explaining the default UDP port and how to change
this when creating VXLAN ports.

Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Conflicts:
FAQ

11 years ago  Update the default VXLAN destination UDP port to the IANA assigned port
Kyle Mestery [Fri, 26 Apr 2013 18:30:24 +0000 (14:30 -0400)]
Update the default VXLAN destination UDP port to the IANA assigned port

VXLAN was recently assigned UDP port 4789 by IANA. This
commit updates the OVS VXLAN implementation to reflect the new UDP port
number.

Cc: Kenneth Duda <kduda@aristanetworks.com>
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Conflicts:
NEWS

11 years ago  python: fix a typo error in python/ovs/socket_util.py.
Alex Wang [Thu, 18 Apr 2013 00:35:04 +0000 (17:35 -0700)]
python: fix a typo error in python/ovs/socket_util.py.

Commit 89d7ffa9 (python: Workaround UNIX socket path
length limits) fixes most failed tests, but it has a
typo that causes the test <unixctl
server errors - Python> to fail when the path length is very
long (e.g. more than 90 characters).

This patch fixes the above issue.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  python/ovs/poller.py: workaround an eventlet bug
YAMAMOTO Takashi [Tue, 16 Apr 2013 06:56:31 +0000 (15:56 +0900)]
python/ovs/poller.py: workaround an eventlet bug

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years ago  ovs-vsctl: Fix a segfault.
Gurucharan Shetty [Wed, 10 Apr 2013 18:55:06 +0000 (11:55 -0700)]
ovs-vsctl: Fix a segfault.

The following two commands result in an ovs-vsctl segfault:

    ovs-vsctl -vfatal_signal:off --timeout=0 wait-until \
        Open_vswitch . external_ids:blah="1"
    /etc/init.d/openvswitch-switch restart

This patch fixes the segfault by properly setting the global
variable the_idl_txn to NULL when the underlying memory is
freed.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years ago  bridge: Complete initial configuration even with empty database.
Ben Pfaff [Thu, 11 Apr 2013 22:47:08 +0000 (15:47 -0700)]
bridge: Complete initial configuration even with empty database.

If the database was empty, that is, it did not even contain an Open_vSwitch
top-level configuration record, at ovs-vswitchd startup time, then
OVS failed to detach and used 100% CPU.  This commit fixes the problem.

This problem was introduced by commit 63ff04e82623e765 (bridge: Only
complete daemonization after db commits initial config.).

This problem did not manifest if the initscripts supplied with Open vSwitch
were used, because those initscripts always initialize the database before
starting ovs-vswitchd, so this problem affects only users with hand-rolled
local OVS startup scripts.

Bug #16090.
Reported-by: Pravin Shelar <pshelar@nicira.com>
Tested-by: Pravin Shelar <pshelar@nicira.com>
Reported-by: Paul Ingram <paul@nicira.com>
Reported-by: Amre Shakimov <ashakimov@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
11 years ago  bridge: Only complete daemonization after db commits initial config.
Ben Pfaff [Wed, 10 Apr 2013 17:33:39 +0000 (10:33 -0700)]
bridge: Only complete daemonization after db commits initial config.

An earlier commit changed the Open vSwitch startup scripts so that they
connect to remote managers only after ovs-vswitchd does its initial
configuration, as signaled by ovs-vswitchd detaching from its parent
process.  However, a race window remains, because ovs-vswitchd detaching
does not mean that the database server has received and committed the
transaction, only that ovs-vswitchd has sent it.  This commit fixes that
race window, by changing ovs-vswitchd to complete detaching only after
the database server acknowledges the transaction.

It is still possible for unusual events to cause ovs-vswitchd to detach
before ephemeral columns are filled in.  There is always a slim possibility
that the transaction will fail or that some other client has added new
bridges, ports, etc. while ovs-vswitchd was configuring using an old
configuration.  The latter race is inherent to the design of the system
and cannot be avoided without radical changes.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
Bug #15983.

11 years ago  ovs-ctl: Connect to remote OVSDB managers only after ovs-vswitchd starts.
Ben Pfaff [Wed, 10 Apr 2013 16:53:54 +0000 (09:53 -0700)]
ovs-ctl: Connect to remote OVSDB managers only after ovs-vswitchd starts.

Until now, ovs-ctl has started ovsdb-server with the full set of remote
managers configured.  This means that ovsdb-server immediately connects to
these managers, before ovs-vswitchd even starts.  Because the Open vSwitch
schema has several ephemeral columns, there will be considerable startup
churn in the database.   For example, ovs-vswitchd will initially fill in
the datapath-id and ofport columns as it starts and sets up the initial
configuration.  This churn wastes bandwidth to the remote managers and has
potential for confusing them.

This commit reduces the churn by changing ovs-ctl so that ovsdb-server
connects to the remote managers only after ovs-vswitchd has finished its
initial configuration.  This means that remote managers will initially
see a filled-in database, not one that has its ephemeral columns empty.

This commit does not mean that managers can ignore the possibility that
some columns have not yet been filled in.  For example, some columns will
still be briefly blank after a new bridge or a new port is added at
runtime, because adding a bridge or port occurs in one transaction (made by
the client adding the port, e.g. ovs-vsctl) and filling in those columns
happens in a different transaction (made by ovs-vswitchd).  But this commit
does reduce the quantity of empty columns that I would expect a database
client to observe in practice.

Reported-by: Jeff Merrick <jmerrick@vmware.com>
CC: Amar Padmanabhan <amar@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
Bug #15983.

11 years ago  ovsdb-server: Add commands for adding and removing remotes at runtime.
Ben Pfaff [Wed, 10 Apr 2013 16:34:49 +0000 (09:34 -0700)]
ovsdb-server: Add commands for adding and removing remotes at runtime.

This will make it possible, in later commits, to make ovsdb-server connect
to OVS managers only after ovs-vswitchd has completed its initial
configuration.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
11 years ago  ovsdb-server: Refactor parsing of remote names to avoid ovs_fatal().
Ben Pfaff [Wed, 10 Apr 2013 23:22:00 +0000 (16:22 -0700)]
ovsdb-server: Refactor parsing of remote names to avoid ovs_fatal().

The current users of parse_db_column() are content to terminate with a
fatal error if parsing fails.  An upcoming commit requires more flexibility,
so this commit refactors parse_db_column() to make this possible.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
11 years agosset: New function sset_sort().
Ben Pfaff [Wed, 10 Apr 2013 16:27:49 +0000 (09:27 -0700)]
sset: New function sset_sort().

This will have its first caller in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
11 years agodpif-linux: Reset epoll() on channel deletion.
Ethan Jackson [Wed, 10 Apr 2013 20:05:04 +0000 (13:05 -0700)]
dpif-linux: Reset epoll() on channel deletion.

The list of epoll events contains references to channels which may
be stale when one of those channels is deleted.  The safest thing
to do is simply refresh epoll() whenever a channel is deleted.

Bug #16057.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoovs-lib: Do not tee the ovs-ctl o/p in case of strace.
Gurucharan Shetty [Sat, 6 Apr 2013 23:56:06 +0000 (16:56 -0700)]
ovs-lib: Do not tee the ovs-ctl o/p in case of strace.

Running the OVS daemons with the strace option enabled
will block if we pipe the output, and we pipe it through tee
to log the output of ovs-ctl to ovs-ctl.log.

This patch disables the startup script logging when we run the
OVS daemons with the strace option.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agoofproto-dpif: Disable miss handling in rule_get_stats().
Ethan Jackson [Sat, 6 Apr 2013 22:22:14 +0000 (15:22 -0700)]
ofproto-dpif: Disable miss handling in rule_get_stats().

rule_get_stats() is often called when iterating over every rule in
the flow table.  To ensure up-to-date statistics, rule_get_stats()
calls push_all_stats() which can cause flow misses to be handled.
When using the learn action, this can cause rules to be added (and
potentially removed) from the OpenFlow table.  This could corrupt
the caller's data structures, leading to a segmentation fault.
This patch fixes the issue by disabling flow miss handling from
within rule_get_stats().

Bug #15999.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoovs-appctl: dpif/show display bug fix
Andy Zhou [Thu, 4 Apr 2013 23:35:27 +0000 (16:35 -0700)]
ovs-appctl: dpif/show display bug fix

Fixes a bug where per-ofproto moving average stats did not update
when there are no active dp flows.

Reported-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years agorhel: Add depmod.d conf file for rhel6 kmod package.
Gurucharan Shetty [Sun, 31 Mar 2013 01:32:25 +0000 (18:32 -0700)]
rhel: Add depmod.d conf file for rhel6 kmod package.

It looks like for Centos6.4, there is an upstream openvswitch
kernel module already installed. When we try to install kmod-openvswitch
package from this tree's pre-1.10 branches, we get the following warning:
"brcompat.ko needs unknown symbol ovs_dp_ioctl_hook".

Also, after installing the kmod-openvswitch package, if we run
"modprobe openvswitch", the upstream kernel module gets loaded.
We should instead load the kernel module compiled from this tree.

This patch fixes both the above issues.

Bug #15829.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agojsonrpc-server: Disconnect connections that queue too much data.
Ben Pfaff [Wed, 27 Mar 2013 21:38:11 +0000 (14:38 -0700)]
jsonrpc-server: Disconnect connections that queue too much data.

Consider this situation:

    * OVSDB client A executes transactions very quickly for a long time.

    * OVSDB client B monitors the tables that A modifies, but (either
      because B is connected over a slow network, or because B is slow to
      process updates) cannot keep up.

In this situation, the data that ovsdb-server has queued to send B grows
without bound and eventually ovsdb-server runs out of memory.  This commit
avoids the problem by noticing that more data is queued to B than necessary
to express the whole contents of the database and dropping the connection
to B.  When B reconnects later, it can then fetch the contents of the
database using less data than was previously queued to it.

(This is not entirely hypothetical.  We have seen this behavior in
intentional stress tests.)
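
The check described above can be sketched roughly as follows.  This is only
an illustration of the idea, not the code from the commit: the struct and
function names are hypothetical, and the size of a full database snapshot is
assumed to be supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection accounting; these names are illustrative and
 * are not taken from the ovsdb-server sources. */
struct conn_backlog {
    size_t queued_bytes;        /* Bytes queued to the peer, not yet sent. */
};

/* Records that 'n' more bytes have been queued for the peer. */
static void
conn_backlog_add(struct conn_backlog *c, size_t n)
{
    assert(c != NULL);
    c->queued_bytes += n;
}

/* Returns true if more data is queued than would be needed to send the whole
 * database ('db_snapshot_bytes').  In that case dropping the connection and
 * letting the peer fetch a fresh copy is cheaper than draining the queue. */
static bool
conn_backlog_exceeded(const struct conn_backlog *c, size_t db_snapshot_bytes)
{
    assert(c != NULL);
    return c->queued_bytes > db_snapshot_bytes;
}
```

A caller would run conn_backlog_exceeded() after queuing each update and
close the connection when it returns true.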

Bug #15637.
Reported-by: Jeff Merrick <jmerrick@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoovsdb-data: New functions for predicting serialized length of data.
Ben Pfaff [Wed, 27 Mar 2013 16:32:56 +0000 (09:32 -0700)]
ovsdb-data: New functions for predicting serialized length of data.

These will be used for the first time in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agojson: New function json_serialized_length().
Ben Pfaff [Mon, 1 Apr 2013 20:16:59 +0000 (13:16 -0700)]
json: New function json_serialized_length().

This will be used for the first time in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoofproto-dpif: Don't rate limit facet_learn() with fin_timeouts.
Ethan Jackson [Tue, 2 Apr 2013 19:32:22 +0000 (12:32 -0700)]
ofproto-dpif: Don't rate limit facet_learn() with fin_timeouts.

In the standard case, rate limiting facet_learn() to once every
500ms makes sense.  The worst that can happen is that a learning entry
is expired half a second too early.  However, when using
fin_timeouts, we really need to react quickly to delete the newly
stale flow.

Bug #15915.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoofproto: Increase default flow-eviction-threshold.
Ethan Jackson [Fri, 29 Mar 2013 21:19:04 +0000 (14:19 -0700)]
ofproto: Increase default flow-eviction-threshold.

The flow-eviction-threshold presents a trade-off between the
expense of maintaining large numbers of datapath flows and the
benefit of avoiding unnecessary flow misses.  In some large Open
vSwitch deployments, we've seen the previous default flow eviction
threshold negatively impact performance with reasonably typical
traffic patterns.  This patch increases the default to a level
which should represent a better trade-off: still relatively safe,
but much more amenable to large numbers of long-lived flows.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoofproto-dpif: Push statistics less frequently.
Ethan Jackson [Fri, 22 Mar 2013 02:04:52 +0000 (19:04 -0700)]
ofproto-dpif: Push statistics less frequently.

The most natural place to push facet statistics is in
update_stats() where they're pulled from the datapath.  However,
under load, update_stats() can be called as many as 10 times per
second causing us to push statistics so frequently it hurts
performance.  By pushing statistics much less frequently, this
patch generates a roughly 8% improvement in TCP_CRR performance.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoofproto-dpif: Run fast internally.
Ethan Jackson [Wed, 27 Mar 2013 18:33:22 +0000 (11:33 -0700)]
ofproto-dpif: Run fast internally.

ofproto-dpif is responsible for quite a few bookkeeping tasks in
addition to handling flow misses.  Many of these tasks (flow
expiration, flow revalidation, etc.) can take many hundreds of
milliseconds, during which no misses can be handled.  The ideal
long-term solution to this problem is to isolate flow miss
handling into its own thread.  However, for now this patch
provides a 5% increase in TCP_CRR performance, and smooths out
results during revalidations.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoofproto-dpif: Systematically push stats upon request.
Ethan Jackson [Sat, 30 Mar 2013 22:13:00 +0000 (15:13 -0700)]
ofproto-dpif: Systematically push stats upon request.

Commit bf1e8ff (ofproto-dpif: Push statistics in rule_get_stats()),
started down the road towards pushing stats on demand, but it
didn't go quite far enough.  First, it neglected to push stats in
port_get_stats() and mirror_get_stats().  Second, it only pushes
stats for a single ofproto, making it incomplete when patch ports
are used.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoofproto-dpif.at: Fix timing issue in show rates test.
Jarno Rajahalme [Thu, 28 Mar 2013 13:01:18 +0000 (15:01 +0200)]
ofproto-dpif.at: Fix timing issue in show rates test.

Fix a test failure due to timing differences in different test runs.

Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoofproto-dpif: Keep track of exact-match flow info
Andy Zhou [Tue, 26 Mar 2013 02:49:13 +0000 (19:49 -0700)]
ofproto-dpif: Keep track of exact-match flow info

This patch adds more flow-related stats to the output of
"ovs-appctl dpif/show".  Specifically, the following information
is added per ofproto:

- Max flow table size
- Average flow table size
- Average flow table add rate
- Average flow table delete rate
- Average flow entry life in milliseconds

Feature #15366

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoovs-appctl: dpif/show display per bridge stats
Andy Zhou [Tue, 12 Mar 2013 21:19:18 +0000 (14:19 -0700)]
ovs-appctl: dpif/show display per bridge stats

This fixes fallout from the single datapath change.
ovs-appctl dpif/show displays per-bridge miss, hit,
and flow counts, but the backend was
obtaining that information from the datapath.
With a single datapath, all bridges of the same
datapath would display the same (global)
counters maintained by the datapath, which is
not correct.

This patch fixes the bug by maintaining per-ofproto_dpif
miss and hit counts, which are used for display output.
The flow count is obtained by counting the
number of facets per ofproto.

ovs-dpctl show still displays the counters maintained by
the datapath, as before.

Bug #15369

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoofproto-dpif: Rate limit calls to facet_learn().
Ethan Jackson [Fri, 22 Mar 2013 02:40:49 +0000 (19:40 -0700)]
ofproto-dpif: Rate limit calls to facet_learn().

In the TCP_CRR benchmark, ovs-vswitchd spends so much time in
update_stats() that it has a significant impact on flow setup
performance.  Further work is needed in this area, but for now,
simply rate limiting facet_learn() yields a roughly 10% improvement
with complex flow tables.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoofproto-dpif: Rate limit facet_check_consistency()
Ethan Jackson [Thu, 21 Mar 2013 20:31:14 +0000 (13:31 -0700)]
ofproto-dpif: Rate limit facet_check_consistency()

With complex flow tables, facet_check_consistency() can be
expensive enough to show up in flow setup performance benchmarks.
In my testing this patch gives us a roughly 10% improvement in
TCP_CRR and ovs-benchmark.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoovs-lib: Wait for a longer time after SIGKILL.
Gurucharan Shetty [Wed, 27 Mar 2013 21:15:05 +0000 (14:15 -0700)]
ovs-lib: Wait for a longer time after SIGKILL.

Currently, when we stop a daemon, we first send it SIGTERM.
If SIGTERM did not work within ~5 seconds, we send a SIGKILL.
After sending SIGKILL, we wait only for 4 seconds, before giving
up.

If the system is extremely busy, there is a chance that a
process is not killed by the kernel within 4 seconds. In such
a case, when we try to start the daemon immediately, we see that
the pid inside the pid-file is valid and assume that the daemon
is still running. This leaves us in a state where the daemon is
not actually running.

This patch increases the time waiting for the kernel to kill the
process to 60 seconds.

Bug #15404.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agodatapath: Fix IP ID setting.
Jarno Rajahalme [Mon, 25 Mar 2013 19:03:38 +0000 (21:03 +0200)]
datapath: Fix IP ID setting.

Eliminate the extra call to ip_select_ident(), and place the
__ip_select_ident() call where the ip_select_ident() call was.
This fixes two problems.  First, the call to ip_select_ident()
always zeroed out the value set earlier by __ip_select_ident().  Second,
when __ip_select_ident() was called before setting iph->daddr, the
ident calculation was possibly based on uninitialized data (but as
the result was masked by the later call to ip_select_ident() it was
not visible).

Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
11 years agodatapath: Factor out common code from *_build_header() to ovs_tnl_send().
Jarno Rajahalme [Mon, 25 Mar 2013 19:03:37 +0000 (21:03 +0200)]
datapath: Factor out common code from *_build_header() to ovs_tnl_send().

Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Conflicts:
datapath/vport-lisp.c

11 years agoovs-bugtool: Add iptables output for all tables.
Gurucharan Shetty [Mon, 25 Mar 2013 15:41:18 +0000 (08:41 -0700)]
ovs-bugtool: Add iptables output for all tables.

Currently we list all the rules only from the 'filter' table.
Include the rules from all the other tables too.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agoAdd binary option for command outputs collected by ovs-bugtool
Shih-Hao Li [Fri, 22 Feb 2013 16:54:04 +0000 (08:54 -0800)]
Add binary option for command outputs collected by ovs-bugtool

Currently, ovs-bugtool collects command outputs as text strings,
so it reads the output line by line.  For commands that generate
huge binary data, reading the output this way is very inefficient.

The change here is to use a 1MB buffer to read binary data
instead of reading it line by line.
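
The same idea, reading in fixed-size chunks rather than by lines, can be
sketched as below.  ovs-bugtool itself is not written in C, so this is an
illustration of the technique only; copy_binary() and CHUNK_SIZE are
hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE (1024 * 1024)    /* 1 MB, as in the commit message. */

/* Copies everything from 'in' to 'out' in CHUNK_SIZE pieces without looking
 * for line boundaries.  Returns the number of bytes copied, or -1 on a read,
 * write, or allocation error. */
static long long
copy_binary(FILE *in, FILE *out)
{
    long long total = 0;
    char *buf;
    size_t n;

    assert(in != NULL && out != NULL);
    buf = malloc(CHUNK_SIZE);
    if (!buf) {
        return -1;
    }
    while ((n = fread(buf, 1, CHUNK_SIZE, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            free(buf);
            return -1;
        }
        total += (long long) n;
    }
    if (ferror(in)) {
        total = -1;
    }
    free(buf);
    return total;
}
```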

Signed-off-by: Shih-Hao Li <shihli@vmware.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agoodp-utils: Fix memory corruption while flow parsing.
Gurucharan Shetty [Fri, 22 Mar 2013 23:25:36 +0000 (16:25 -0700)]
odp-utils: Fix memory corruption while flow parsing.

Currently, when a flow attribute type is greater than OVS_KEY_ATTR_MAX,
we can write to an arbitrary memory address, causing corruption. Fix it.
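
A sketch of the kind of bounds check involved, under the assumption that
parsed attributes are stored in an array indexed by attribute type.  The
enum values, struct, and function here are hypothetical stand-ins, not the
odp-util code or the real OVS_KEY_ATTR_* numbering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute numbering, for illustration only. */
enum key_attr {
    KEY_ATTR_UNSPEC,
    KEY_ATTR_IN_PORT,
    KEY_ATTR_ETHERNET,
    KEY_ATTR_N                  /* Number of known attribute types. */
};
#define KEY_ATTR_MAX (KEY_ATTR_N - 1)

/* Hypothetical parse result: one slot per known attribute type. */
struct parsed_key {
    size_t len[KEY_ATTR_MAX + 1];
};

/* Records the payload length of attribute 'type'.  Returns false, without
 * touching 'key', if 'type' is beyond KEY_ATTR_MAX; indexing the array with
 * such a value would write outside it. */
static bool
parsed_key_record(struct parsed_key *key, unsigned int type, size_t len)
{
    assert(key != NULL);
    if (type > (unsigned int) KEY_ATTR_MAX) {
        return false;
    }
    key->len[type] = len;
    return true;
}
```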

Bug #15702.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agoofproto-dpif: Push statistics in rule_get_stats().
Ethan Jackson [Sat, 23 Mar 2013 22:11:21 +0000 (15:11 -0700)]
ofproto-dpif: Push statistics in rule_get_stats().

As time goes on, and flow tables become more complicated, the
tradeoff between keeping up to date statistics, and the CPU
resources needed to maintain them, will become more important.
Commit 5c0243a (ofproto-dpif: xlate actions once with subfacets.)
delayed the reporting of some statistics in an effort to achieve
higher flow setup performance.  Future commits will continue in the
same direction.

This patch helps to alleviate the issue by pushing statistics in
rule_get_stats(), when users actually want them.  Presumably, this
happens rarely, and thus will not have a negative impact on
ovs-vswitchd performance.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoofproto-dpif: xlate actions once with subfacets.
Ethan Jackson [Thu, 21 Mar 2013 18:17:00 +0000 (11:17 -0700)]
ofproto-dpif: xlate actions once with subfacets.

Before this patch, when ofproto-dpif decided that a particular flow
miss needed a facet, it would do action translation multiple times.
Once in subfacet_make_actions(), and once per packet in
subfacet_update_stats().  In the common case (once per miss), this
would double the amount of work required in xlate_actions().

The call to facet_push_stats() in subfacet_update_stats() is
unnecessary.  If the packets are simply accounted to the facet,
they will eventually be pushed to the relevant rules in
update_stats() or when the facet is removed.  Removing the
unnecessary step gives us a 20% improvement in the netperf TCP_CRR
benchmark with the complex flow tables installed by our controller.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
11 years agoovs-bugtool: Add ovs-ofctl commands to bugtool plugin scripts.
Gurucharan Shetty [Thu, 21 Mar 2013 20:46:15 +0000 (13:46 -0700)]
ovs-bugtool: Add ovs-ofctl commands to bugtool plugin scripts.

This patch adds two new scripts that run "ovs-ofctl show" and
"ovs-ofctl dump-flows" on each bridge.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agoovs-bugtool: Remove calls of ovs-ofctl on ovs-system.
Gurucharan Shetty [Thu, 21 Mar 2013 20:22:56 +0000 (13:22 -0700)]
ovs-bugtool: Remove calls of ovs-ofctl on ovs-system.

With single datapath, making ovs-ofctl calls on ovs-system
does not give the necessary o/p. This patch removes those calls.

The next patch adds the correct commands to bugtool plugin scripts.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agobridge: Rate-limit updates to "instant stats".
Ben Pfaff [Tue, 19 Mar 2013 21:02:48 +0000 (14:02 -0700)]
bridge: Rate-limit updates to "instant stats".

Some information in the database must be kept as up-to-date as
possible to allow controllers to respond rapidly to network outages.
We call these statistics "instant" stats.

Until now, the instant stats have been updated on every trip through
the main loop.  This work scales with the number of interfaces that
ovs-vswitchd manages.  With CFM enabled on 5000 interfaces, even with
a low transmission rate, we see ovs-vswitchd using 100% CPU just to
maintain statistics, even with no actual changes.

This commit rate-limits updates to instant stats to at most 10 times
per second.  Earlier tests I did with similar patches showed a major
reduction in CPU usage.  I have not rerun those tests with this patch,
but I expect that the CPU usage should similarly decline.

CC: Ram Jothikumar <rjothikumar@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
11 years agodebian: Re-add --timeout option for ifupdown script.
Gurucharan Shetty [Mon, 18 Mar 2013 19:33:17 +0000 (12:33 -0700)]
debian: Re-add --timeout option for ifupdown script.

Commit fba6bd1d3f (ovs-vsctl: Try connecting only once for active connections..)
removed the timeout option from ifupdown.sh. Removing the "--timeout=" option
can cause the ifupdown script to hang if ovs-vswitchd is not running and the
ifupdown script changes the OVSDB. So, re-add it.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agoovs-vsctl: Try connecting only once for active connections by default.
Ben Pfaff [Fri, 15 Mar 2013 23:14:28 +0000 (16:14 -0700)]
ovs-vsctl: Try connecting only once for active connections by default.

Until now, ovs-vsctl has kept trying to connect to the database server until it
succeeded or the timeout expired (if one was specified with --timeout).
This meant that if ovsdb-server wasn't running, then ovs-vsctl would hang.
The result was that almost every ovs-vsctl invocation in scripts specified
a timeout on the off-chance that the database server might not be running.
But it's difficult to choose a good timeout.  A timeout that is too short
can cause spurious failures.  A timeout that is too long causes long delays
if the server really isn't running.

This commit should alleviate this problem.  It changes ovs-vsctl's behavior
so that, if it fails to connect to the server, it exits unsuccessfully.
This makes --timeout obsolete for the purpose of avoiding a hang if the
database server isn't running.  (--timeout is still useful to avoid a hang
if ovsdb-server is running but ovs-vswitchd is not, for ovs-vsctl commands
that modify the database.  --no-wait also avoids that issue.)

Bug #2393.
Bug #15594.
Reported-by: Jeff Merrick <jmerrick@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoipsec: unset IPSEC_MARK flag from skb_mark after tunnel packet is decapsulated
Ansis Atteka [Thu, 14 Mar 2013 18:53:00 +0000 (11:53 -0700)]
ipsec: unset IPSEC_MARK flag from skb_mark after tunnel packet is decapsulated

After a tunnel packet is decapsulated, we should unset the IPsec flag from
skb_mark.

Otherwise, IPsec policies would be applied one more time on internal
interfaces, if there are any. This is especially necessary once we
introduce a global, low-priority IPsec drop policy that will make
sure that we never let through marked but unencrypted packets.

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Issue: 15074

11 years agoovs-bugtool: Add ovs-ctl.log to debug bundle.
Gurucharan Shetty [Fri, 15 Mar 2013 16:21:25 +0000 (09:21 -0700)]
ovs-bugtool: Add ovs-ctl.log to debug bundle.

ovs-ctl.log will include the o/p of ovs-ctl when
run from rhel, debian and xenserver startup scripts.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agodebian, rhel, xenserver: Ability to collect ovs-ctl logs.
Gurucharan Shetty [Wed, 13 Mar 2013 22:07:06 +0000 (15:07 -0700)]
debian, rhel, xenserver: Ability to collect ovs-ctl logs.

We use ovs-ctl from startup scripts to start, stop, restart,
force-reload-kmod OVS daemons. ovs-ctl gives quite a descriptive
o/p while running the above commands. But the o/p goes to stdout.
Sometimes, this output is quite useful to debug issues.

With this patch, we store the o/p of ovs-ctl when called from
startup scripts in /var/log/openvswitch/ovs-ctl.log

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
11 years agotunnel: Remove references to multicast tunnels in schema documentation.
Jesse Gross [Wed, 13 Mar 2013 15:35:15 +0000 (08:35 -0700)]
tunnel: Remove references to multicast tunnels in schema documentation.

The vestigial multicast support in tunnels has been removed at this
point, so this deletes the remaining references in the documentation.

Reported-by: Guangvy <1965837689@qq.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
11 years agodatapath: Check for Centos 6.4 backports.
Jesse Gross [Tue, 12 Mar 2013 18:34:29 +0000 (11:34 -0700)]
datapath: Check for Centos 6.4 backports.

Centos 6.4 backported a number of additional functions so our existing
versions started causing conflicts.

Reported-by: Denis Iskandarov <d.iskandarov@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
11 years agobridge: Store the 'mac_in_use' for interfaces in OVSDB.
Justin Pettit [Tue, 12 Mar 2013 21:47:22 +0000 (14:47 -0700)]
bridge: Store the 'mac_in_use' for interfaces in OVSDB.

It can be useful to remotely determine the MAC addresses of attached
interfaces without going through OpenFlow.  This adds the MAC address to
a new 'mac_in_use' column on the Interface table.

Feature #15551

Requested-by: Paul Ingram <paul@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years agodatapath: Reduce loop limit by one to 4.
Jesse Gross [Tue, 12 Mar 2013 19:36:03 +0000 (12:36 -0700)]
datapath: Reduce loop limit by one to 4.

We currently allow five trips through the kernel datapath
before dropping the packet to protect the stack.  However, there
have been a few reports recently involving tunneling that this is
still too much.  Although it's not a complete solution, this reduces
the limit by one to balance safety in common situations with
flexibility.

Bug #15477

Reported-by: Paul Ingram <paul@nicira.com>
Reported-by: 謝秉融 <faithfulman@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
11 years agoconnmgr: Fix memory leak in ofconn monitor table.
Ben Pfaff [Fri, 18 Jan 2013 23:17:15 +0000 (15:17 -0800)]
connmgr: Fix memory leak in ofconn monitor table.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
11 years agoovsdb: Fix memory leak.
Ben Pfaff [Thu, 24 Jan 2013 19:33:35 +0000 (11:33 -0800)]
ovsdb: Fix memory leak.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
11 years agoSet dates for 1.9.0 release.
Justin Pettit [Tue, 26 Feb 2013 19:24:20 +0000 (11:24 -0800)]
Set dates for 1.9.0 release.

This also sets the dates for 1.8.0, even though it was an internal-only
release.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years agoNEWS: Note tunneling feature removals in the correct release.
Jesse Gross [Mon, 11 Mar 2013 23:00:17 +0000 (16:00 -0700)]
NEWS: Note tunneling feature removals in the correct release.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Conflicts:
NEWS

11 years agoAdd table_id to NXM flow_removed messages.
Ben Pfaff [Wed, 6 Mar 2013 17:13:37 +0000 (09:13 -0800)]
Add table_id to NXM flow_removed messages.

Feature #15466.
Requested-by: Ronghua Zhang <rzhang@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoofproto-dpif: Fix up user specifying wrong bridge on "ofproto/trace".
Ben Pfaff [Wed, 6 Mar 2013 00:48:21 +0000 (16:48 -0800)]
ofproto-dpif: Fix up user specifying wrong bridge on "ofproto/trace".

If there is more than one bridge, then it's easy to specify the wrong one
on an ofproto/trace command.  Previously, this would produce surprising
results.  With this commit, "ofproto/trace" should silently fix up the
problem.

It would be nice to not require the user to specify a bridge at all, but
it's theoretically possible to have more than one backer, in which case we
need some way to distinguish, and a bridge name is as good an identifier
as we have.  We could ask the user to specify the datapath_type, I guess,
but that's a less familiar name to most users and it would be a somewhat
gratuitous change in syntax for ofproto/trace.

Bug #15419.
Reported-by: Paul Ingram <paul@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
11 years agoofproto-dpif: Print slow-path actions instead of "drop" in dump-flows.
Justin Pettit [Thu, 7 Mar 2013 01:11:35 +0000 (17:11 -0800)]
ofproto-dpif: Print slow-path actions instead of "drop" in dump-flows.

The command "ovs-appctl dpif/dump-flows" would print slow-path actions
as "drop", which could be confusing to users.  This is different from
"ovs-dpctl dump-flows", which prints a descriptive reason.  This commit
replaces "drop" with the reason.

Bug #14840

Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years agotimeval: Avoid backtrace() from signal handler on x86-64.
Ben Pfaff [Fri, 8 Mar 2013 01:13:49 +0000 (17:13 -0800)]
timeval: Avoid backtrace() from signal handler on x86-64.

backtrace() is really useful, but it is not signal safe everywhere.  We
need to reassess whether it is reasonable to use it anywhere, but
immediately we need to disable it on x86-64 (with glibc) because it is
causing segfaults in testing.

Bug #15497.
Reported-by: Ram Jothikumar <rjothikumar@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
11 years agotunnel: Mark ECN status on decapsulated tunnel packets.
Justin Pettit [Wed, 13 Feb 2013 22:50:24 +0000 (14:50 -0800)]
tunnel: Mark ECN status on decapsulated tunnel packets.

In the kernel tunnel implementation, if a packet was marked as ECN CE on
the outer packet then we would carry this over to the inner packet on
decapsulation.  With the switch to flow based tunneling, this stopped
happening.  This commit reintroduces that behavior by using the set IP
header action.

Bug #15072

Signed-off-by: Justin Pettit <jpettit@nicira.com>
11 years agotunnel: Generate datapath flows for tunneled packets dropped due to ECN.
Justin Pettit [Wed, 13 Feb 2013 22:08:15 +0000 (14:08 -0800)]
tunnel: Generate datapath flows for tunneled packets dropped due to ECN.

Move the check for whether tunneled packets should be dropped due to
congestion encountered (CE) when the encapsulated packet is not ECN
capable (non-ECT).  This also adds some additional tests for ECN
handling on tunnel decapsulation.

Signed-off-by: Justin Pettit <jpettit@nicira.com>