cascardo/ovs.git
7 years ago datapath-windows: Add ECN support on STT decapsulation
Paul Boca [Mon, 6 Jun 2016 16:45:06 +0000 (16:45 +0000)]
datapath-windows: Add ECN support on STT decapsulation

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago datapath-windows: STT reassemble small fix
Paul Boca [Mon, 6 Jun 2016 16:45:05 +0000 (16:45 +0000)]
datapath-windows: STT reassemble small fix

Fixed a possible deadlock in case NdisGetDataBuffer fails.
Validate the segment length and offset on reassembly to avoid a buffer overflow.
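
A minimal sketch of the kind of bounds check described above, assuming a
reassembly buffer of known total length. The name segment_fits and its
parameters are hypothetical and are not taken from the driver source:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the driver's actual code: returns true only if
 * the segment [offset, offset + seg_len) fits inside a reassembly buffer of
 * total_len bytes.  The comparison is arranged so that offset + seg_len is
 * never computed, which avoids integer overflow. */
static bool
segment_fits(uint32_t offset, uint32_t seg_len, uint32_t total_len)
{
    if (seg_len == 0 || offset > total_len) {
        return false;
    }
    return seg_len <= total_len - offset;
}
```

A caller would drop the segment when the check fails instead of copying it
into the buffer.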

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago datapath-windows: Add VLAN support to STT
Paul Boca [Mon, 6 Jun 2016 16:45:04 +0000 (16:45 +0000)]
datapath-windows: Add VLAN support to STT

Add VLAN to the STT header and, on receive, apply it to the encapsulated packet.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago datapath-windows: Improved offloading on STT tunnel
Paul Boca [Mon, 6 Jun 2016 16:45:00 +0000 (16:45 +0000)]
datapath-windows: Improved offloading on STT tunnel

* Added OvsExtractLayers, which populates only the layers field without
  unnecessary memory operations for the flow part.
* If the flags in the STT header are 0, force packet checksum calculation
  on receive.
* Ensure the correct pseudo checksum is set for LSO on both send and receive.
  Linux includes the segment length in the TCP pseudo-checksum, conforming to
  RFC 793, but for LSO Windows expects the pseudo-checksum to cover only the
  Source IP Address, Destination IP Address, and Protocol.
* Fragment expiration on the rx side of STT was set to 30 seconds, but the
  correct timeout is the TTL of the packet.
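
To illustrate the pseudo-checksum difference noted above, here is a rough
sketch that sums an IPv4 pseudo-header with and without the TCP length.
The function names are hypothetical, and values are taken in host byte order
for simplicity; this is not the datapath's code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Folds a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t
fold_sum(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Illustrative only: returns the (uncomplemented) IPv4 pseudo-header sum.
 * With include_len set, the TCP segment length is added, as RFC 793
 * describes.  Without it, only source address, destination address, and
 * protocol are summed, which is what the commit says Windows expects for
 * LSO. */
static uint16_t
pseudo_hdr_sum(uint32_t src, uint32_t dst, uint8_t proto,
               uint16_t tcp_len, bool include_len)
{
    uint32_t sum = 0;

    sum += (src >> 16) + (src & 0xffff);
    sum += (dst >> 16) + (dst & 0xffff);
    sum += proto;               /* Zero byte followed by the protocol byte. */
    if (include_len) {
        sum += tcp_len;
    }
    return fold_sum(sum);
}
```

The two variants differ exactly by the one's-complement addition of the
segment length, so a receiver can convert one form into the other.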

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago tests: Skip "daemon --service" test on Windows from non-admin console
Paul Boca [Tue, 7 Jun 2016 08:12:16 +0000 (08:12 +0000)]
tests: Skip "daemon --service" test on Windows from non-admin console

Check whether we have sufficient rights to create a service on Windows;
otherwise, skip the daemon test.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago debian, rhel: Ship ovs shared libraries and header files
Edwin Chiu [Tue, 31 May 2016 21:32:59 +0000 (14:32 -0700)]
debian, rhel: Ship ovs shared libraries and header files

Compile and package ovs shared libraries and create new header
package for debian (openvswitch-dev) and rhel (openvswitch-devel).

VMware-BZ: #1556299
Signed-off-by: Edwin Chiu <echiu@vmware.com>
Co-authored-by: Harold Lim <haroldl@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago INSTALL.md: Note use of "hacking" flake8 plugin.
Russell Bryant [Thu, 2 Jun 2016 19:53:46 +0000 (15:53 -0400)]
INSTALL.md: Note use of "hacking" flake8 plugin.

The automatic flake8 check that runs against Python code has some
warnings enabled that come from the "hacking" flake8 plugin.  If it's
not installed, the warnings just won't occur until it's run on a system
with "hacking" installed.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-northd: logical router icmp response should not care about inport
Flavio Fernandes [Fri, 27 May 2016 15:53:35 +0000 (11:53 -0400)]
ovn-northd: logical router icmp response should not care about inport

When responding to ICMP echo request (aka ping) packets, the logical
router should not restrict responses based on the inport.

Example diagram:

vm: IP1.1 (subnet1)
logical_router: IP1.2 (subnet1) and IP2.2 (subnet2)

   vm -------[subnet1]------- logical_router -------[subnet2]
   <IP1.1>                <IP1.2>        <IP2.2>

vm should be able to ping <IP2.2>, even though it is an address
of a subnet that can only be reached through L3 routing.

Reference to the mailing list thread:
http://openvswitch.org/pipermail/discuss/2016-May/021172.html

Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago tests: Fixed access denied on ovs-vswitchd.log
Paul Boca [Fri, 3 Jun 2016 13:05:54 +0000 (13:05 +0000)]
tests: Fixed access denied on ovs-vswitchd.log

On Windows, trying to overwrite the open ovs-vswitchd.log
fails with access denied. Closing it before trying to overwrite it
solves the problem.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago netdev-dummy: Add multiqueue support to dummy-pmd.
Ilya Maximets [Fri, 27 May 2016 13:32:53 +0000 (16:32 +0300)]
netdev-dummy: Add multiqueue support to dummy-pmd.

All previous multi-open logic is preserved for rx queues.
Also added a new optional parameter '--qid' for 'netdev-dummy/receive'
that lets the user choose the id of the rx queue to which the packet will
be sent.

Ex.:
ovs-appctl netdev-dummy/receive p1 --qid 3 'in_port(1) ...'

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years ago ovs-vsctl.at: Use OVS_VSCTL_CLEANUP.
Ilya Maximets [Fri, 27 May 2016 13:32:52 +0000 (16:32 +0300)]
ovs-vsctl.at: Use OVS_VSCTL_CLEANUP.

OVSDB_SERVER_SHUTDOWN is defined in another module and not inside
'*-macros.at', so it should not be used inside ovs-vsctl.at.

Also, OVS_VSCTL_CLEANUP should be used instead of direct calls
to OVSDB_SERVER_SHUTDOWN.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years ago dpctl: Implement dpctl/flow-get for dpif-netdev.
Ilya Maximets [Fri, 27 May 2016 13:32:50 +0000 (16:32 +0300)]
dpctl: Implement dpctl/flow-get for dpif-netdev.

Currently 'dpctl/flow-get' doesn't work for flows installed by
PMD threads.

Fix that by searching across all PMD threads. The flow from the first
PMD thread with a match is returned.
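
The first-match search can be sketched roughly as follows. The struct and
function names here are hypothetical stand-ins, with flows reduced to integer
ids; the real dpif-netdev structures are more involved:

```c
#include <stddef.h>

/* Hypothetical stand-in for the flows held by one PMD thread. */
struct pmd_flows {
    const int *flow_ids;
    size_t n_flows;
};

/* Returns the index of the first PMD thread that holds 'flow_id', or -1 if
 * no thread has it.  Later threads holding the same flow are ignored. */
static int
first_pmd_with_flow(const struct pmd_flows *pmds, size_t n_pmds, int flow_id)
{
    for (size_t i = 0; i < n_pmds; i++) {
        for (size_t j = 0; j < pmds[i].n_flows; j++) {
            if (pmds[i].flow_ids[j] == flow_id) {
                return (int) i;
            }
        }
    }
    return -1;
}
```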

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years ago netdev-dummy: Add dummy-pmd class.
Ilya Maximets [Fri, 27 May 2016 13:32:48 +0000 (16:32 +0300)]
netdev-dummy: Add dummy-pmd class.

The 'dummy-pmd' class is a new dummy class, created for testing
PMD interfaces.

Ex.:
ovs-vsctl set interface <iface> type=dummy-pmd

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years ago Add *.c to datapath/linux/.gitignore
Aaron Rosen [Mon, 6 Jun 2016 18:41:32 +0000 (14:41 -0400)]
Add *.c to datapath/linux/.gitignore

This should prevent any additional *.c files from sneaking in here.

Signed-off-by: Aaron Rosen <aaronorosen@gmail.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
7 years ago ipfix: Bug fix for configuring IPFIX for flows
Benli Ye [Fri, 27 May 2016 15:32:40 +0000 (23:32 +0800)]
ipfix: Bug fix for configuring IPFIX for flows

There are two kinds of IPFIX: bridge level IPFIX and flow level
IPFIX. Now if we only configure flow level IPFIX, even if there
is no bridge IPFIX configuration, the datapath flow will contain
a sample action for bridge IPFIX. Fix it.

Steps to configure flow level IPFIX:
1) Create a new record in Flow_Sample_Collector_Set table:
   'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
2) Add an IPFIX configuration that is referred to by the corresponding
   row in the Flow_Sample_Collector_Set table:
   'ovs-vsctl -- set Flow_Sample_Collector_Set
   "Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX
   targets=\"IP:4739\" obs_domain_id=123 obs_point_id=456
   cache_active_timeout=60 cache_max_flows=13'
3) Add sample action to the flows:
   'ovs-ofctl add-flow mybridge in_port=1,
   actions=sample'('probability=65535,collector_set_id=1,
   obs_domain_id=123,obs_point_id=456')',output:LOCAL'

Before this fix, if you only configure flow IPFIX, the datapath flow is:
   id(0),in_port(2),eth_type(0x0806), packets:0, bytes:0, used:never,
   actions:sample(sample=0.0%,actions(userspace(pid=4294960835,
   ipfix(output_port=4294967295)))),sample(sample=100.0%,
   actions(userspace(pid=4294960835,flow_sample(probability=65535,
   collector_set_id=1,obs_domain_id=123,obs_point_id=456)))),
   sample(sample=0.0%,actions(userspace(pid=4294960835,
   ipfix(output_port=1)))),1

The datapath flow should only contain the sample action like below:
   id(0),in_port(2),eth_type(0x0800),ipv4(frag=no), packets:9, bytes:871,
   used:0.656s, actions:sample(sample=100.0%,actions(userspace(pid=4294962911,
   flow_sample(probability=65535,collector_set_id=1,obs_domain_id=123,
   obs_point_id=456)))),1

Signed-off-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago INSTALL.md: Document system-traffic tests.
Joe Stringer [Thu, 19 May 2016 01:51:51 +0000 (18:51 -0700)]
INSTALL.md: Document system-traffic tests.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago tests: Avoid endianness sensitivity in MPLS handling test.
Ben Pfaff [Fri, 27 May 2016 00:02:38 +0000 (17:02 -0700)]
tests: Avoid endianness sensitivity in MPLS handling test.

The test "ofproto-dpif - MPLS handling" included a test of the "multipath"
action whose results depended on the hash function in use.  The OVS hash
function yields different results on little-endian and big-endian systems,
so this caused a failure.

This commit fixes the problem by changing the modulus in the multipath
action from 256 to 1; any (nonnegative) value modulo 1 is 0, so this makes
the results consistent across endianness (and across hash function
changes).  I think that this is still a good enough test.
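
The arithmetic behind this can be shown with a tiny sketch. pick_link is a
hypothetical helper that reduces a hash by a modulus; it is not the multipath
action's actual algorithm:

```c
#include <stdint.h>

/* Hypothetical helper: maps a hash onto one of n_links links.  With
 * n_links == 1 the result is always 0, whatever the hash value is, so the
 * outcome cannot depend on byte order or on the hash function. */
static uint32_t
pick_link(uint32_t hash, uint32_t n_links)
{
    return hash % n_links;
}
```

Two hashes that differ only in byte order, such as 0x12345678 and 0x78563412,
both map to link 0 when n_links is 1.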

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
7 years ago tests: Fix select group test on big-endian systems.
Ben Pfaff [Thu, 26 May 2016 23:57:00 +0000 (16:57 -0700)]
tests: Fix select group test on big-endian systems.

This test ensures that, when the selection criteria for a select group are
the same from packet to packet, the same bucket is always selected.
However, it hardcoded the bucket that was selected to the one that happens
to be selected with the current OVS hash function on little-endian systems.
On big-endian systems, the current OVS hash function turns out to select
the other bucket.  That's fine (it's consistent, it just consistently makes
the other choice), so this commit fixes the problem by allowing either
bucket to be selected.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
7 years ago ofp-print: Sort queues before printing in OFPT_QUEUE_GET_CONFIG_REPLY.
Ben Pfaff [Thu, 26 May 2016 22:14:54 +0000 (15:14 -0700)]
ofp-print: Sort queues before printing in OFPT_QUEUE_GET_CONFIG_REPLY.

Otherwise the ordering tends to vary across endianness.
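
As an illustration of sorting before printing, here is a small sketch that
assumes queues are identified by 32-bit ids held in a plain array. The names
are hypothetical and the real ofp-print code works on richer structures:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort() comparator for queue ids in ascending order.  Comparing instead
 * of subtracting avoids wraparound for large unsigned values. */
static int
compare_queue_ids(const void *a_, const void *b_)
{
    uint32_t a = *(const uint32_t *) a_;
    uint32_t b = *(const uint32_t *) b_;

    return a < b ? -1 : a > b;
}

/* Sorts 'ids' in place so that output order no longer depends on the order
 * in which the queues happened to be collected. */
static void
sort_queue_ids(uint32_t *ids, size_t n)
{
    qsort(ids, n, sizeof *ids, compare_queue_ids);
}
```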

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
7 years ago netdev-native-tnl: Fix treatment of GRE key on big-endian systems.
Ben Pfaff [Thu, 26 May 2016 23:53:52 +0000 (16:53 -0700)]
netdev-native-tnl: Fix treatment of GRE key on big-endian systems.

The GRE implementation used bitwise shifts to convert an ovs_be32 to an
ovs_be64 (with zero extension), but on big-endian systems these conversions
are no-ops.  This fixes the problem.
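
One endianness-independent way to do such a zero-extending conversion is to
copy bytes rather than shift. The sketch below uses hypothetical be32/be64
typedefs as stand-ins for ovs_be32/ovs_be64 and is not the fix that was
actually committed:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for ovs_be32/ovs_be64: plain integers whose
 * in-memory bytes are in network (big-endian) order. */
typedef uint32_t be32;
typedef uint64_t be64;

/* Zero-extends a network-order 32-bit value to a network-order 64-bit value
 * by placing its four bytes in the low-order (last) half of the result.
 * Because only bytes are copied, the result is the same on little-endian
 * and big-endian hosts. */
static be64
be32_to_be64_portable(be32 key)
{
    uint8_t bytes[8] = {0};
    be64 out;

    memcpy(bytes + 4, &key, sizeof key);
    memcpy(&out, bytes, sizeof out);
    return out;
}
```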

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
7 years ago types: Change ofp_port_t from uint16_t to uint32_t.
Ben Pfaff [Fri, 3 Jun 2016 20:15:01 +0000 (13:15 -0700)]
types: Change ofp_port_t from uint16_t to uint32_t.

This fixes several tests that failed on big-endian systems because "union
flow_in_port" overlays an ofp_port_t and odp_port_t and in some cases it
is not easy to determine which one is in use.
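
The overlay problem can be illustrated with two simplified unions. These are
hypothetical stand-ins for "union flow_in_port", reduced to bare integer
members:

```c
#include <stdint.h>

/* Mixed widths, as before this commit: a 16-bit member overlays the first
 * two bytes of a 32-bit member, and which two bytes those are depends on
 * the host byte order. */
union in_port_mixed {
    uint16_t ofp_port;
    uint32_t odp_port;
};

/* Same widths, as after this commit: both members cover the same bytes. */
union in_port_same {
    uint32_t ofp_port;
    uint32_t odp_port;
};

/* Stores through one member and reads through the other. */
static uint16_t
read_ofp_mixed(uint32_t odp)
{
    union in_port_mixed u;

    u.odp_port = odp;
    return u.ofp_port;
}

static uint32_t
read_ofp_same(uint32_t odp)
{
    union in_port_same u;

    u.odp_port = odp;
    return u.ofp_port;
}
```

With mixed widths, storing 5 and reading the 16-bit member gives 5 on a
little-endian host but 0 on a big-endian one; with equal widths the value
read back is always the value stored.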

This commit also fixes up a few places where this broke other code.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
7 years ago FAQ: Explain that the order of actions is significant.
Ben Pfaff [Fri, 3 Jun 2016 16:10:15 +0000 (09:10 -0700)]
FAQ: Explain that the order of actions is significant.

I've seen users make this error several times.  This FAQ will provide a
useful answer to pass along.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Gurucharan Shetty <guru@ovn.org>
7 years ago netdev: Fix typo in comment.
Ben Pfaff [Fri, 3 Jun 2016 19:31:34 +0000 (12:31 -0700)]
netdev: Fix typo in comment.

The name of the macro was wrong.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
7 years ago route-table: If device is not there, route is still parseable.
Thadeu Lima de Souza Cascardo [Thu, 26 May 2016 20:34:58 +0000 (17:34 -0300)]
route-table: If device is not there, route is still parseable.

Do not return failure to parse a route if device has been removed before we are
able to parse the route. That prevents "received bad netlink message" warnings
on the log.

This can be reproduced by simply removing interfaces.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago vswitch.xml: Document interface name length restrictions.
Ben Pfaff [Thu, 26 May 2016 17:30:39 +0000 (10:30 -0700)]
vswitch.xml: Document interface name length restrictions.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years ago rconn: Disable probe for local connections.
nghosh@us.ibm.com [Tue, 24 May 2016 22:47:20 +0000 (15:47 -0700)]
rconn: Disable probe for local connections.

There are four sessions established from ovn-controller to the following:
OVN Southbound — JSONRPC based
Local ovsdb — JSONRPC based
Local vswitchd — openflow based from ofctrl
Local vswitchd — openflow based from pinctrl

Each of these sessions has its own probe_interval. The last two
connections do not need a probe_timer because they run over a unix domain
socket. This patch takes care of that.

This change has been tested by putting logs in several places, such as
ovn-controller.c and lib/rconn.c, to make sure the probe_timer is
disabled, and by checking ovn-controller's log file to confirm that no
more reconnects happen due to probes under heavy load.

Signed-off-by: Nirapada Ghosh <nghosh@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-nbctl: Add lrouter and lrport related commands.
Nirapada Ghosh [Fri, 3 Jun 2016 18:48:49 +0000 (11:48 -0700)]
ovn-nbctl: Add lrouter and lrport related commands.

ovn-nbctl provides a shortcut to perform commands related to lswitch, lport
and such but it doesn't have similar commands related to logical routers
and logical router ports. Also, 'ovn-nbctl show' is supposed to show an
overview of database contents, which means it should show the routers
as well. "ovn-nbctl show LSWITCH" shows the switch details, similarly
"ovn-nbctl show LROUTER" should show the router details too. This patch
takes care of all of these.

Modifications:
1) ovn-nbctl show -- will now show lrouters as well
2) ovn-nbctl show <lrouter> -- will show the router now

New commands added:
3) ovn-nbctl lrouter-add [LROUTER]
4) ovn-nbctl lrouter-del LROUTER
5) ovn-nbctl lrouter-list
6) lrport-add LROUTER LRPORT
7) lrport-del LRPORT
8) lrport-list LROUTER
9) lrport-set-mac-address LRPORT [ADDRESS]
10) lrport-get-mac-address LRPORT
11) lrport-set-enabled LRPORT STATE
12) lrport-get-enabled LRPORT

Unit test cases have been added to test all of these modifications and
additions.

Signed-off-by: Nirapada Ghosh <nghosh@us.ibm.com>
[blp@ovn.org added features to match the lswitch and lport commands]
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago INSTALL.DPDK: Replace tabs with spaces
Ciara Loftus [Tue, 24 May 2016 14:13:30 +0000 (15:13 +0100)]
INSTALL.DPDK: Replace tabs with spaces

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-controller: Assign conntrack zones for gateway router.
Gurucharan Shetty [Wed, 11 May 2016 00:19:15 +0000 (17:19 -0700)]
ovn-controller: Assign conntrack zones for gateway router.

OVS NAT currently cannot do snat and dnat in the same zone.
So we need two zones per gateway router.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-northd.8.xml: fix sock path of NB and SB database.
Li Wei [Thu, 2 Jun 2016 01:09:42 +0000 (09:09 +0800)]
ovn-northd.8.xml: fix sock path of NB and SB database.

commit 60bdd01148e4 ("Separating OVN NB and SB database processes")
separated the OVN NB and SB database processes, so the paths of the
sock files need to be updated.

Fixes: 60bdd01148e4 ("Separating OVN NB and SB database processes")
Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
7 years ago ovn-controller: Refactor conntrack zone allocation.
Gurucharan Shetty [Tue, 10 May 2016 23:35:05 +0000 (16:35 -0700)]
ovn-controller: Refactor conntrack zone allocation.

We currently allocate conntrack zones in binding.c. It fits
in nicely there because we currently only allocate conntrack
zones to logical ports and binding.c is where we figure out
the local ones.

An upcoming commit needs conntrack zone allocation for routers
in a gateway. For that reason, this commit moves conntrack zone
allocation code to ovn-controller.c where it would be easily
accessible for router zone allocation too.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn: Introduce l3 gateway router.
Gurucharan Shetty [Mon, 9 May 2016 20:44:34 +0000 (13:44 -0700)]
ovn: Introduce l3 gateway router.

Currently OVN has distributed switches and routers. When a packet
exits a container or a VM, the entire lifecycle of the packet
through multiple switches and routers is calculated on the source
chassis itself. When the destination endpoint resides on a different
chassis, the packet is sent to the other chassis, where it only goes
through the egress pipeline once before reaching the real destination.

When the packet returns, the same thing happens. The return
packet leaves the VM/container on the chassis where it resides.
The packet goes through all the switches and routers in the logical
pipeline on that chassis and is then sent to the eventual destination
over the tunnel.

The above makes the logical pipeline very flexible and easy, but it
creates a problem for cases where you need to add stateful services
(via conntrack) on switches and routers.

For l3 gateways, we plan to leverage DNAT and SNAT functionality
and we want to apply DNAT and SNAT rules on a router. So we ideally need
the packet to go through that router in both directions in the same
chassis. To achieve this, this commit introduces a new gateway router which is
static and can be connected to your distributed router via a switch.

To make minimal changes in OVN's logical pipeline, this commit
tries to make the switch port connected to a l3 gateway router look like
a container/VM endpoint for every other chassis except the chassis
on which the l3 gateway router resides. On the chassis where the
gateway router resides, the connection looks just like a patch port.

This is achieved by doing the following:
Introduces a new type of port_binding record called 'gateway'.
On the chassis where the gateway router resides, this port behaves just
like the port of type 'patch'. The ovn-controller on that chassis
populates the "chassis" column for this record as an indication for
other ovn-controllers of its physical location. Other ovn-controllers
treat this port as they would treat a VM/Container port on a different
chassis.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-northd: Support connecting multiple routers to a switch.
Gurucharan Shetty [Fri, 6 May 2016 16:02:08 +0000 (09:02 -0700)]
ovn-northd: Support connecting multiple routers to a switch.

Currently we can connect routers via "peer"ing. This limits
the number of routers that can be connected with each other
directly to 2.

One of the design goals for L3 Gateway is to be able to
have multiple gateways (each with their own router)
connected to a distributed router via a switch.

With the above goal in mind, this commit gives the general
ability to connect multiple routers via a switch.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago AUTHORS: Add Dustin Lundquist.
Ben Pfaff [Thu, 2 Jun 2016 22:31:17 +0000 (15:31 -0700)]
AUTHORS: Add Dustin Lundquist.

Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovn-northd: Restrict use of unspecified source addresses
Dustin Lundquist [Fri, 20 May 2016 19:48:16 +0000 (12:48 -0700)]
ovn-northd: Restrict use of unspecified source addresses

Restrict use of the unspecified source addresses (:: and 0.0.0.0) to
traffic necessary to obtain an IP address: DHCP discovery messages for
the IPv4 case, and the ICMPv6 types necessary for duplicate address
detection for IPv6.

This breaks the existing ovn -- portsecurity : 3 HVs, 1 LS, 3 lports/HV
test since it tests sourcing IPv6 packets from the unspecified address
with an invalid ICMPv6 type (0). The test is modified accordingly; it should
be extended to verify that ICMPv6 types for DAD are permitted and that other
IPv6 traffic sourced from the unspecified address is dropped.

Signed-off-by: Dustin Lundquist <dustin@null-ptr.net>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago tests: Add a tunnel packet-out test.
Daniele Di Proietto [Fri, 20 May 2016 18:15:56 +0000 (11:15 -0700)]
tests: Add a tunnel packet-out test.

We only stress the same code path in testcase "ovn -- 3 HVs, 3 LS,
3 lports/LS, 1 LR", which is slow to execute under valgrind.

It's probably worth adding a separate case.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago tests: Wait for ARPs to be sent in tunnel-push-pop.
Daniele Di Proietto [Fri, 20 May 2016 18:14:13 +0000 (11:14 -0700)]
tests: Wait for ARPs to be sent in tunnel-push-pop.

Otherwise the tests can fail under heavy load (or with valgrind).

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago netdev-dpdk: vhost-user port link state fix
Zoltán Balogh [Thu, 2 Jun 2016 12:42:39 +0000 (12:42 +0000)]
netdev-dpdk: vhost-user port link state fix

OVS reports that link state of a vhost-user port (type=dpdkvhostuser) is
DOWN, even when traffic is running through the port between a Virtual
Machine and the vSwitch. Changing admin state with the
"ovs-ofctl mod-port <BR> <PORT> up/down" command over OpenFlow does
not affect either the reported link state or the traffic.

The patch below does the following:
 - Triggers link state change by altering netdev's change_seq member.
 - Controls sending/receiving of packets through vhost-user port
   according to the port's current admin state.
 - Sets admin state of newly created vhost-user port to UP.

Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
Co-authored-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years ago ofproto-dpif: Cache result of time_msec() for rule_expire().
Daniele Di Proietto [Thu, 2 Jun 2016 02:01:10 +0000 (19:01 -0700)]
ofproto-dpif: Cache result of time_msec() for rule_expire().

In the run() function of ofproto-dpif we call rule_expire() for every
possible flow that has a timeout and rule_expire() calls time_msec().
Calling time_msec() repeatedly can be pretty expensive, even though most
of the time it involves only a vdso call.

This commit calls time_msec only once in run(), to reduce the workload.
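
The pattern can be sketched as below, using a fake clock that counts how
often it is read. All names (fake_time_msec, run_once, struct rule) are
hypothetical and only mimic the shape of the change, not the ofproto-dpif
code:

```c
#include <stddef.h>

/* Hypothetical rule with an absolute expiration time in milliseconds. */
struct rule {
    long long deadline_ms;
};

/* Fake clock for illustration; counts how many times it is read. */
static long long fake_now_ms;
static int clock_calls;

static long long
fake_time_msec(void)
{
    clock_calls++;
    return fake_now_ms;
}

/* Takes the current time as an argument instead of reading the clock. */
static int
rule_is_expired(const struct rule *rule, long long now_ms)
{
    return now_ms >= rule->deadline_ms;
}

/* Reads the clock once, then reuses the cached value for every rule.
 * Returns the number of expired rules. */
static size_t
run_once(const struct rule *rules, size_t n)
{
    long long now_ms = fake_time_msec();
    size_t n_expired = 0;

    for (size_t i = 0; i < n; i++) {
        n_expired += rule_is_expired(&rules[i], now_ms);
    }
    return n_expired;
}
```

All rules in one pass are judged against the same timestamp, which costs a
little precision in exchange for a single clock read per run.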

Keeping the flows ordered by expiration in some kind of heap or timing
wheel data structure could help make this process more efficient, if
rule_expire() turns out to be a bottleneck.

VMware-BZ: #1655122
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years ago netdev-vport: Update copyright headers
Thadeu Lima de Souza Cascardo [Thu, 2 Jun 2016 10:18:49 +0000 (07:18 -0300)]
netdev-vport: Update copyright headers

Red Hat has contributed to the original code that has moved to netdev-native-tnl
module and to code that has been kept in netdev-vport as well.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
7 years ago netdev-vport: remove unneeded headers
Thadeu Lima de Souza Cascardo [Thu, 2 Jun 2016 10:18:47 +0000 (07:18 -0300)]
netdev-vport: remove unneeded headers

Over the years, changes in netdev vport have removed the need for some of
the headers, like shash, hmap, and many others. With the recent split of
push/pop code, fewer headers are needed in each of the two modules.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
7 years ago system-tests: fix module removal during cleanup
Thadeu Lima de Souza Cascardo [Tue, 24 May 2016 00:57:52 +0000 (21:57 -0300)]
system-tests: fix module removal during cleanup

Currently, cleanup files for system tests will look like this:

modprobe -q -r vport_vxlan
modprobe -q -r vport_sttmodprobe
modprobe -q -r vport_lispmodprobe
modprobe -q -r vport_gremodprobe
modprobe -q -r vport_genevemodprobe
modprobe -r openvswitch

This is caused by a missing newline in m4_foreach EXPRESSION and the fact that
on_exit is a shell function. It was being expanded like this:

on_exit 'modprobe -q -r vport_genevemodprobe' -q vport_gre

Fixes: 53eb8cb83013 ("tests: Replace ON_EXIT m4 macro by on_exit() shell function.")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
7 years ago datapath-windows: Fix alignment on Offload.c
Alin Serdean [Tue, 24 May 2016 16:14:26 +0000 (16:14 +0000)]
datapath-windows: Fix alignment on Offload.c

Found by inspection.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago datapath-windows: Add UDP checksum verifications for VXLAN
Alin Serdean [Tue, 24 May 2016 16:14:19 +0000 (16:14 +0000)]
datapath-windows: Add UDP checksum verifications for VXLAN

Compute the UDP checksum on Tx if it was specified in the tunnel
information.

Set the tunnel checksum information flag on the flow if the
UDP checksum was non zero on the Rx.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
7 years ago datapath-windows: Move UDP checksum computation to Offload.c
Yin Lin [Tue, 24 May 2016 23:28:03 +0000 (23:28 +0000)]
datapath-windows: Move UDP checksum computation to Offload.c

UDP checksum computation is shared by both vxlan and geneve on Windows.
Move the function so that the code can be shared.

Signed-off-by: Yin Lin <linyi@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago acinclude: fix issue when configuring with --with-dpdk
Mauricio Vasquez B [Wed, 1 Jun 2016 16:48:07 +0000 (18:48 +0200)]
acinclude: fix issue when configuring with --with-dpdk

When an empty path is given to the --with-dpdk option
(--with-dpdk="" or --with-dpdk=$NON_SET_ENV_VARIABLE), the configure
script does not show any error and configures OvS without DPDK support,
which can create some confusion.

This patch modifies that behavior by showing an explicit error in that case.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago ovs-vtep: Make compatible with python2.7 and 3.
Joe Stringer [Tue, 24 May 2016 01:11:03 +0000 (18:11 -0700)]
ovs-vtep: Make compatible with python2.7 and 3.

Translate commandline calls to UTF-8, appease flake8 and use six's
integer types. This allows the testsuite to pass when using python3 as
your default system python version.

Signed-off-by: Joe Stringer <joe@ovn.org>
Tested-by: Darrell Ball <dlu998@gmail.com>
7 years ago list.h: Define OVS_LIST_POISON statically
Nithin Raju [Fri, 27 May 2016 17:54:58 +0000 (10:54 -0700)]
list.h: Define OVS_LIST_POISON statically

Looks like part of the patch committed in e32c1f7c
got left out while moving header files.

Fixes: e32c1f7c0d65 ("list.h: Define OVS_LIST_POISON statically")
Signed-off-by: Nithin Raju <nithin@vmware.com>
Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago xlate: Skip recirculation for output and set actions
Simon Horman [Wed, 25 May 2016 01:34:31 +0000 (10:34 +0900)]
xlate: Skip recirculation for output and set actions

Until 8bf009bf8ab4 ("xlate: Always recirculate after an MPLS POP to a
non-MPLS ethertype.") the translation code took some care to only
recirculate as a result of a pop_mpls action if necessary. This was
implemented using per-action checks and resulted in some maintenance
burden.

Unfortunately recirculation is a relatively expensive operation and a
performance degradation of up to 35% has been observed with the above
mentioned patch applied for the arguably common case of:

pop_mpls,set(l2 field),output

This patch attempts to strike a balance between performance and
maintainability by special casing set and output actions such
that recirculation may be avoided.

This partially reverts the above mentioned commit. In particular most
of the C code outside of do_xlate_actions().

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
7 years ago tests: bfd.at: Fix bridge name in comment
Markos Chandras [Thu, 26 May 2016 15:23:02 +0000 (16:23 +0100)]
tests: bfd.at: Fix bridge name in comment

The bridge sitting between 'br-bfd0' and 'br-bfd1' is called 'br-sw'
instead of 'br2' and the patch ports are 'p0-sw' and 'p1-sw' instead
of 'p0-2' and 'p1-2' respectively so fix the comment.

Signed-off-by: Markos Chandras <mchandras@suse.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years ago util: Drop 'date' and 'time' arguments from ovs_set_program_name
Markos Chandras [Thu, 26 May 2016 09:02:54 +0000 (10:02 +0100)]
util: Drop 'date' and 'time' arguments from ovs_set_program_name

The 'date' and 'time' arguments are normally being set by
'ovs_set_program_name' using __DATE__ and __TIME__. However, this
breaks reproducible builds since even without any changes in the
toolchain, build system etc, the end binary will still differ in
that regard. This is also visible when building with -Wdate-time:

utilities/ovs-dpctl.c:61:29: warning: macro "__DATE__" might prevent
reproducible builds [-Wdate-time]
     set_program_name(argv[0]);
                             ^

and it's also something that triggers the following warning in the
openSUSE OBS builds:

[...]
openvswitch.x86_64: W: file-contains-date-and-time /usr/bin/ovs-ofctl
openvswitch.x86_64: W: file-contains-date-and-time /usr/bin/ovs-appctl
Your file uses  __DATE and __TIME__ this causes the package to rebuild
when not needed
[...]

This patch drops these two arguments from ovs_set_program_name__ and
renames the function to ovs_set_program_name dropping the previous
preprocessor macro in the process.

This finally removes the remaining references to __DATE__ and __TIME__
from the sources which is something that has already been done in
commit 26bfaeaa9687 ("Stop using __DATE__ and __TIME__ in startup
string.") for the kernel datapath.

Cc: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Markos Chandras <mchandras@suse.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agovagrant: Enable silent-rules for configure.
Joe Stringer [Fri, 20 May 2016 18:49:59 +0000 (11:49 -0700)]
vagrant: Enable silent-rules for configure.

In the majority of cases, developers debugging their code using vagrant
will be more interested in compiler errors/warnings than the exact
invocation of the compiler. By enabling silent-rules, the verbosity of
compilation is lowered and it is easier to identify these pieces of
information.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agovagrant: Update default box to Fedora-23.
Joe Stringer [Thu, 26 May 2016 00:17:37 +0000 (17:17 -0700)]
vagrant: Update default box to Fedora-23.

This brings a newer kernel (4.2) and newer iproute2, allowing more of
the tests to run by default.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
7 years agovagrant: Update dependencies.
Joe Stringer [Fri, 20 May 2016 18:49:55 +0000 (11:49 -0700)]
vagrant: Update dependencies.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
7 years agotests: Eliminate some intermittent failures due to races.
Jarno Rajahalme [Wed, 25 May 2016 23:43:42 +0000 (16:43 -0700)]
tests: Eliminate some intermittent failures due to races.

Wait until a megaflow is set up before sending more packets to have
deterministic stats for the megaflows.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoofproto: update mtu when port is getting removed as well
ak47izatool@gmail.com [Wed, 25 May 2016 15:03:43 +0000 (21:03 +0600)]
ofproto: update mtu when port is getting removed as well

When we're adding the port into ovs bridge, its mtu is updated to the minimal
mtu of the included port. But when the port is getting removed, no such update
is performed, which leads to a bug. For example, when the port with minimal mtu
is removed, bridge's mtu must adapt to new value, but it won't happen.
How to reproduce the problem:

$ ovs-vsctl add-br testing
$ ip link add name gretap11 type gretap local 10.0.0.1 remote 10.0.0.100
$ ip link add name gretap12 type gretap local 10.0.0.1 remote 10.0.0.200
$ ip link set dev gretap12 mtu 1600
$ ovs-vsctl add-port testing gretap11
$ ovs-vsctl add-port testing gretap12
$ ip a sh testing
16: testing: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN
group default qlen 1
    link/ether 7a:42:95:00:96:40 brd ff:ff:ff:ff:ff:ff

$ ovs-vsctl del-port gretap11
$ ip a sh testing
16: testing: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN
group default qlen 1
    link/ether 7a:42:95:00:96:40 brd ff:ff:ff:ff:ff:ff

$## as we can see here, 'testing' bridge mtu is stuck, while it must
adapt to new '1600' value,
$## because there is only one port 'gretap12' left, and its mtu is '1600':

$ ip a sh gretap12
19: gretap12@NONE: <BROADCAST,MULTICAST> mtu 1600 qdisc noop master
ovs-system state DOWN group default qlen 1000
    link/ether b2:c6:1d:9f:be:0d brd ff:ff:ff:ff:ff:ff

My commit fixes this problem - mtu update is performed on port removal as well.
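
The recalculation described above can be sketched as follows. This is a
minimal illustration, not the actual ofproto code: bridge_min_mtu() and its
'fallback' parameter are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the smallest MTU among the 'n' ports that are
 * still attached to a bridge, or 'fallback' if no ports are left.  Calling
 * it on both port addition and port removal keeps the bridge MTU current. */
int
bridge_min_mtu(const int *port_mtus, size_t n, int fallback)
{
    assert(port_mtus != NULL || n == 0);

    if (n == 0) {
        return fallback;
    }

    int min = port_mtus[0];
    for (size_t i = 1; i < n; i++) {
        if (port_mtus[i] < min) {
            min = port_mtus[i];
        }
    }
    return min;
}
```

With the MTUs from the reproduction above, { 1462, 1600 } yields 1462 and,
once gretap11 is removed, { 1600 } yields 1600.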

Signed-off-by: wisd0me <ak47izatool@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agodatapath-windows: o/p buffer must fit NL error message
Nithin Raju [Thu, 19 May 2016 22:31:49 +0000 (15:31 -0700)]
datapath-windows: o/p buffer must fit NL error message

OVS_IOCTL_WRITE and OVS_IOCTL_TRANSACT can generate a
netlink error that is represented by a OVS_MESSAGE_ERROR
struct. We want to make sure at the entry point of the
ioctl processing that the output buffer is big enough
to hold the error message. We were earlier checking
for struct OVS_MESSAGE which is smaller.

Since we are ensuring that output buffer can fit
OVS_MESSAGE_ERROR at the top of the ioctl function,
there's no need to check for that later.
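
A rough sketch of the entry-point check, under stated assumptions: the
structs below are simplified stand-ins whose names and layouts are invented
for this example. They are not the real OVS_MESSAGE and OVS_MESSAGE_ERROR
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical message layouts. */
struct example_nl_hdr { uint32_t len; uint16_t type, flags; uint32_t seq, pid; };
struct example_genl_hdr { uint8_t cmd, version; uint16_t reserved; };
struct example_ovs_hdr { int32_t dp_ifindex; };

/* Stand-in for a regular message: three headers. */
struct example_msg {
    struct example_nl_hdr nl;
    struct example_genl_hdr genl;
    struct example_ovs_hdr ovs;
};

/* Stand-in for an error message: header, error code, and a copy of the
 * header of the request that failed. */
struct example_msg_error {
    struct example_nl_hdr nl;
    int32_t error;
    struct example_nl_hdr orig;
};

/* Returns true if an output buffer of 'output_len' bytes can hold the error
 * message.  Checking against the larger struct once, at the top of the
 * ioctl handler, makes later per-path checks unnecessary. */
bool
example_output_buffer_ok(size_t output_len)
{
    /* The check only matters because the error message is the larger one. */
    assert(sizeof(struct example_msg) < sizeof(struct example_msg_error));

    return output_len >= sizeof(struct example_msg_error);
}
```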

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
7 years agodatapath-windows: don't map output buffer in OVS_IOCTL_WRITE
Nithin Raju [Thu, 19 May 2016 22:31:48 +0000 (15:31 -0700)]
datapath-windows: don't map output buffer in OVS_IOCTL_WRITE

The contract of OVS_IOCTL_WRITE is that write operations
will not need the output buffer. Only the input buffer
will be used in the IRP. So, better to not map the output
buffer at all.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
7 years agodatapath-windows: remove extract flow in OvsDoRecirc()
Nithin Raju [Tue, 17 May 2016 17:15:22 +0000 (10:15 -0700)]
datapath-windows: remove extract flow in OvsDoRecirc()

It is not necessary to do a flow extract in OvsDoRecirc().
In fact, doing it would overwrite the tunnel key within
'key'. So, let's remove the call.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Co-Authored-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
7 years agodatapath-windows: Use l2 port and tunkey during execute
Nithin Raju [Tue, 17 May 2016 17:15:21 +0000 (10:15 -0700)]
datapath-windows: Use l2 port and tunkey during execute

While testing DFW and recirc code it was found that userspace
was calling into packet execute with the tunnel key and the
vport added as part of the execute structure. We were not passing
this along to the code that executes actions. The right thing is
to construct the key based on all of the attributes sent down from
userspace.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
7 years agodatapath-windows: Make _MapTunAttrToFlowPut() global
Nithin Raju [Tue, 17 May 2016 17:15:20 +0000 (10:15 -0700)]
datapath-windows: Make _MapTunAttrToFlowPut() global

Move this function out from file scope.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
7 years agodatapath-windows: add nlMsgHdr to OvsPacketExecute
Nithin Raju [Tue, 17 May 2016 17:15:19 +0000 (10:15 -0700)]
datapath-windows: add nlMsgHdr to OvsPacketExecute

We'll need this for parsing nested attributes.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
7 years agonetdev-dpdk.c: Add ingress-policing functionality.
Ian Stokes [Tue, 24 May 2016 16:36:51 +0000 (17:36 +0100)]
netdev-dpdk.c: Add ingress-policing functionality.

This patch provides the modifications required in netdev-dpdk.c and
vswitch.xml to enable ingress policing for DPDK interfaces.

This patch implements the necessary netdev functions to netdev-dpdk.c as
well as various helper functions required for ingress policing.

The vswitch.xml has been modified to explain the expected parameters and
behaviour when using ingress policing.

The INSTALL.DPDK.md guide has been modified to provide an example
configuration of ingress policing.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agonetdev-dpdk.c: Add generic policer functions.
Ian Stokes [Tue, 24 May 2016 16:36:50 +0000 (17:36 +0100)]
netdev-dpdk.c: Add generic policer functions.

Add generic policer functions to avoid code duplication.

Policing can be implemented on both egress and ingress paths.
Currently the QoS egress-policer implementation uses its own specific run
and packet handle policer functions. This patch makes the policer functions
generic so that they can be used regardless of whether the policer is egress
or ingress by just requiring a pointer to the rte_meter used for policing
to be passed.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agoofp-actions: Allow conntrack action in group buckets.
Jarno Rajahalme [Tue, 24 May 2016 07:25:36 +0000 (00:25 -0700)]
ofp-actions: Allow conntrack action in group buckets.

Conntrack action used in group buckets lets
us do simple load-balancing.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
[guru@ovn.org updated the commit message and made
a small change to the test output format]
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agotests: Use on_exit in ovsdb-idl tests.
Daniele Di Proietto [Mon, 23 May 2016 21:36:24 +0000 (14:36 -0700)]
tests: Use on_exit in ovsdb-idl tests.

Instead of hardcoding 'kill `cat pid`' on every call to AT_CHECK, it is
simpler to use on_exit.

This makes sure that we kill every started daemon and fixes a travis
build timeout.

Suggested-by: Ben Pfaff <blp@ovn.org>
Tested-at: https://travis-ci.org/ddiproietto/ovs/builds/132404750
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agonetdev-native-tnl: Fix IPv6 tos bits handling.
Pravin B Shelar [Tue, 24 May 2016 03:27:14 +0000 (20:27 -0700)]
netdev-native-tnl: Fix IPv6 tos bits handling.

IPv6 tunnels ignore the outer tos bits on receive and do not
set them on xmit. The following patch fixes it.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agonetdev-native-tnl: Introduce ip_build_header()
Pravin B Shelar [Tue, 24 May 2016 03:27:14 +0000 (20:27 -0700)]
netdev-native-tnl: Introduce ip_build_header()

The code that builds native tunnel headers is spread across
two different modules, which makes it pretty hard to follow.
The following patch refactors it, moving all of the code to the
netdev-native-tnl module.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agoupcall: Unregister dpif cbs in udpif_destroy().
Joe Stringer [Tue, 17 May 2016 03:08:01 +0000 (20:08 -0700)]
upcall: Unregister dpif cbs in udpif_destroy().

During udpif_create(), we register callbacks for handling upcalls and
purging the datapath; however, in the corresponding udpif_destroy() we
never unregistered them. This could potentially lead to dereference of
uninitialized memory in the userspace datapath if the main thread
destroys the udpif then executes an OpenFlow packet-out.

Fixes: e4e74c3a2b9a ("dpif-netdev: Purge all ukeys when reconfigure pmd.")
Fixes: 623540e4617e ("dpif-netdev: Streamline miss handling.")
Reported-by: William Tu <u9012063@gmail.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agonetdev-dpdk: Use ->reconfigure() call to change rx/tx queues.
Daniele Di Proietto [Fri, 26 Feb 2016 23:58:24 +0000 (15:58 -0800)]
netdev-dpdk: Use ->reconfigure() call to change rx/tx queues.

This introduces in dpif-netdev and netdev-dpdk the first use of the
newly introduced reconfigure netdev call.

When a request to change the number of queues comes, netdev-dpdk will
remember this and notify the upper layer via
netdev_request_reconfigure().

The datapath, instead of periodically calling netdev_set_multiq(), can
detect this and call reconfigure().

This mechanism can also be used to:
* Automatically match the number of rxq with the one provided by qemu
  via the new_device callback.
* Provide a way to change the MTU of dpdk devices at runtime.
* Move a DPDK vhost device to the proper NUMA socket.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agonetdev: Add reconfigure request mechanism.
Daniele Di Proietto [Thu, 25 Feb 2016 01:25:11 +0000 (17:25 -0800)]
netdev: Add reconfigure request mechanism.

A netdev provider, especially a PMD provider (like netdev DPDK) might
not be able to change some of its parameters (such as MTU, or number of
queues) without stopping everything and restarting.

This commit introduces a mechanism that allows a netdev provider to
request a restart (netdev_request_reconfigure()).  The upper layer can
be notified via netdev_wait_reconf_required() and
netdev_is_reconf_required().  After closing all the rxqs the upper layer
can finally call netdev_reconfigure(), to make sure that the new
configuration is in place.

This will be used by the next commit to reconfigure rx and tx queues in
netdev-dpdk.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
7 years agodpif-netdev: Handle errors in reconfigure_pmd_threads().
Daniele Di Proietto [Thu, 7 Apr 2016 22:19:28 +0000 (15:19 -0700)]
dpif-netdev: Handle errors in reconfigure_pmd_threads().

Errors returned by netdev_set_multiq() and netdev_rxq_open() weren't
handled properly in reconfigure_pmd_threads().  On error, we now
remove the port from the datapath.

Also, part of the code is moved into a new function, port_reconfigure().

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agodpif-netdev: Change pmd thread configuration in dpif_netdev_run().
Daniele Di Proietto [Tue, 23 Feb 2016 23:33:43 +0000 (15:33 -0800)]
dpif-netdev: Change pmd thread configuration in dpif_netdev_run().

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agoofproto-dpif: Call dpif_poll_threads_set() before dpif_run().
Daniele Di Proietto [Tue, 23 Feb 2016 19:36:10 +0000 (11:36 -0800)]
ofproto-dpif: Call dpif_poll_threads_set() before dpif_run().

An upcoming commit will make dpif_poll_threads_set() record the
requested configuration and dpif_run() apply it, so it makes sense to
change the order.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
7 years agoovs-thread: Do not quiesce in ovs_mutex_cond_wait().
Daniele Di Proietto [Mon, 4 Apr 2016 23:38:57 +0000 (16:38 -0700)]
ovs-thread: Do not quiesce in ovs_mutex_cond_wait().

ovs_mutex_cond_wait() is used in many functions in dpif-netdev to
synchronize with pmd threads, but we can't guarantee that the callers do
not hold RCU references, so it's better to avoid quiescing.

In system_stats_thread_func() the code relied on ovs_mutex_cond_wait()
to introduce a quiescent state, so explicit calls to
ovsrcu_quiesce_start() and ovsrcu_quiesce_end() are added there.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agodpif-netdev: Use hmap for ports.
Daniele Di Proietto [Mon, 4 Apr 2016 18:15:12 +0000 (11:15 -0700)]
dpif-netdev: Use hmap for ports.

netdev objects are hard to use with RCU, because it's not possible to
split removal and reclamation.  Postponing the removal means that the
port is not removed and cannot be readded immediately.  Waiting for
reclamation means introducing a quiescent state, and that may introduce
subtle bugs, due to the RCU model we use in userspace.

This commit changes the port container from cmap to hmap.  'port_mutex'
must be held by readers and writers.  This shouldn't have performance
impact, as readers in the fast path use a thread local cache.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agohmap: Use struct for hmap_at_position().
Daniele Di Proietto [Sat, 2 Apr 2016 01:31:22 +0000 (18:31 -0700)]
hmap: Use struct for hmap_at_position().

The interface will be more similar to the cmap.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agodpif-netdev: Add pmd thread local port cache for transmission.
Daniele Di Proietto [Wed, 6 Apr 2016 01:41:09 +0000 (18:41 -0700)]
dpif-netdev: Add pmd thread local port cache for transmission.

A future commit will stop using RCU for 'dp->ports' and use a mutex for
reading/writing them.  To avoid taking a mutex in dp_execute_cb(), which
is called in the fast path, this commit introduces a pmd thread local
cache of ports.

The downside is that every port add/remove now needs to synchronize with
every pmd thread.

Among the advantages, keeping a per thread port mapping could allow
greater control over the txq assignment.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agodpif-netdev: Fix race condition in pmd thread initialization.
Daniele Di Proietto [Wed, 6 Apr 2016 01:02:14 +0000 (18:02 -0700)]
dpif-netdev: Fix race condition in pmd thread initialization.

The pmds and the main threads are synchronized using a condition
variable.  The main thread writes a new configuration, then it waits on
the condition variable.  A pmd thread reads the new configuration, then
it calls signal() on the condition variable. To make sure that the pmds
and the main thread have a consistent view, each signal() should be
backed by a wait().

Currently the first signal() doesn't have a corresponding wait().  If
the pmd thread takes a long time to start and the signal() is received
by a later wait, the threads will have an inconsistent view.

The commit fixes the problem by removing the first signal() from the
pmd thread.

This is hardly a problem on current master, because the main thread
will call the first wait() a long time after the creation of a pmd
thread.  It becomes a problem with the next commits.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agodpif-netdev: Add functions to modify rxq without reloading pmd threads.
Daniele Di Proietto [Wed, 6 Apr 2016 00:01:25 +0000 (17:01 -0700)]
dpif-netdev: Add functions to modify rxq without reloading pmd threads.

This commit introduces some functions to add/remove rxqs from pmd
threads without reloading them.  They will be used by next commits.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agodpif-netdev: Factor out port_create() from do_add_port().
Daniele Di Proietto [Tue, 5 Apr 2016 20:14:56 +0000 (13:14 -0700)]
dpif-netdev: Factor out port_create() from do_add_port().

Instead of performing every operation inside do_add_port() it seems
clearer to introduce port_create(), since we already have
port_destroy().

No functional change.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agodpif-netdev: Remove unused 'index' in dp_netdev_pmd_thread.
Daniele Di Proietto [Thu, 7 Apr 2016 19:54:10 +0000 (12:54 -0700)]
dpif-netdev: Remove unused 'index' in dp_netdev_pmd_thread.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agodpif-netdev: Destroy 'port_mutex' in dp_netdev_free().
Daniele Di Proietto [Tue, 5 Apr 2016 01:10:33 +0000 (18:10 -0700)]
dpif-netdev: Destroy 'port_mutex' in dp_netdev_free().

Found by inspection.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agonetdev-native-tnl: Fix a build error on NetBSD 7.0
YAMAMOTO Takashi [Fri, 20 May 2016 05:52:19 +0000 (05:52 +0000)]
netdev-native-tnl: Fix a build error on NetBSD 7.0

netinet/ip6.h is not a standalone header there.

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Tested-by: Jeff Feng <jianhua@us.ibm.com>
7 years agonetdev-dpdk: Improve pthread_getaffinity_np() fail handling.
Kevin Traynor [Thu, 19 May 2016 12:51:32 +0000 (13:51 +0100)]
netdev-dpdk: Improve pthread_getaffinity_np() fail handling.

Prevent pthread_setaffinity_np() being called with a potentially
invalid cpu_set_t and add a default (core 0x1).

Also, only call pthread_getaffinity_np() if no dpdk-lcore-mask specified.

Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agonetdev-dpdk: Fix coremask logic.
Kevin Traynor [Thu, 19 May 2016 12:51:31 +0000 (13:51 +0100)]
netdev-dpdk: Fix coremask logic.

Only set the thread affinity back to the pre rte_eal_init() value
when the user has not specified a coremask.

Fixes: 88964e6428dc ("netdev-dpdk: Autofill lcore coremask if absent")
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agoofproto-dpif-xlate: Fix IGMP megaflow matching.
Ben Pfaff [Sun, 8 May 2016 17:34:10 +0000 (10:34 -0700)]
ofproto-dpif-xlate: Fix IGMP megaflow matching.

IGMP translation wasn't setting enough bits in the wildcards to ensure
different packets were handled differently.

Reported-by: "O'Reilly, Darragh" <darragh.oreilly@hpe.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-April/021036.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agodpif-netdev: Initialize packet RSS hash in dpif_netdev_execute().
Daniele Di Proietto [Wed, 18 May 2016 01:38:20 +0000 (18:38 -0700)]
dpif-netdev: Initialize packet RSS hash in dpif_netdev_execute().

The datapath code expects the RSS hash to always be initialized.  This
is enforced by checking in emc_processing() that the hash is valid, and
eventually by computing a new one.

Unfortunately, there is another entry point to the datapath,
dpif_netdev_execute().  A packet generated by OVS (BFD frame,
packet-out from controller) doesn't have a valid RSS hash and so is
allowed to enter the datapath with an uninitialized hash value.

This commit recomputes the hash (if not valid) in dpif_netdev_execute().

The only place where we would use an invalid hash is netdev-vport, in
push_udp_header().  This caused an uninitialized memory read, and a
random value to be assigned to the outer tunnel header source port.

Reported-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agodpif: Pass flow parameter to dpif_execute().
Daniele Di Proietto [Wed, 18 May 2016 01:26:02 +0000 (18:26 -0700)]
dpif: Pass flow parameter to dpif_execute().

All the callers of the function already have a copy of the extracted
flow in their stack (or a few frames before).

This is useful for different reasons:
* It forces the callers to also call flow_extract() on the packet, which
  is necessary to initialize the l2,l3,l4 pointers.
* It will be used in the userspace datapath to generate the RSS hash by
  a following commit
* It can be used by the userspace connection tracker to avoid extracting
  the l3 type again.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoflow: Fix uninitialized reads in [mini]flow_hash_5tuple().
Daniele Di Proietto [Wed, 18 May 2016 02:18:51 +0000 (19:18 -0700)]
flow: Fix uninitialized reads in [mini]flow_hash_5tuple().

Almost every caller expects [mini]flow_hash_5tuple() to be able to deal
with all kinds of flows, not only TCP and UDP.

Currently, when dealing with non L4 flows, the function may access
uninitialized memory.  This commit changes it to return prematurely with
a partial hash value instead of reading uninitialized memory.

Found by valgrind.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoutilities/checkpatch.py: Check for appropriate bracing
Aaron Conole [Fri, 20 May 2016 15:52:59 +0000 (11:52 -0400)]
utilities/checkpatch.py: Check for appropriate bracing

Teach checkpatch.py to understand that if/for/while blocks should always
end with braces on the same line (if possible). This does not address
multi-line if/for/while blocks, but provides a point where such blocks
could be added.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agostp: Initialize mutex whenever we register unixctl command.
Ben Pfaff [Fri, 20 May 2016 14:49:02 +0000 (07:49 -0700)]
stp: Initialize mutex whenever we register unixctl command.

The stp/tcn command, which locks the mutex, was being registered without
initializing the mutex, so calling stp/tcn before STP was enabled on the
switch caused a crash.  This commit fixes the bug by initializing the mutex
at the same time we register the stp/tcn command.

Reported-by: Ding Zhi <zhi.ding@6wind.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-May/071381.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Quentin Monnet <quentin.monnet@6wind.com>
7 years agopython: Add TCP passive-mode to IDL.
Ofer Ben-Yacov [Wed, 18 May 2016 15:29:13 +0000 (18:29 +0300)]
python: Add TCP passive-mode to IDL.

Requested-by: "D M, Vikas" <vikas.d-m@hpe.com>
Requested-by: "Kamat, Maruti Haridas" <maruti.kamat@hpe.com>
Requested-by: "Sukhdev Kapur" <sukhdev@arista.com>
Requested-by: "Migliaccio, Armando" <armando.migliaccio@hpe.com>
Signed-off-by: "Ofer Ben-Yacov" <ofer.benyacov@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoutilities/ovs-ctl.in: Only add_managers with vswitchd
Aaron Conole [Fri, 20 May 2016 14:50:46 +0000 (10:50 -0400)]
utilities/ovs-ctl.in: Only add_managers with vswitchd

The ovs-ctl script was changed recently to have per-service start/stop
control. However, when that change was made the add_managers() call was
overlooked. This results in calls to `ovs-ctl --no-ovs-vswitchd start`
telling the ovsdb-server to connect to the remote controllers.

The fix presented will defer signaling to remote managers until the
following are both true:
1. At least one of OVSDB_SERVER or OVS_VSWITCHD was told to start
2. Both daemons are running.

Fixes: 7fc28c50c012 ("ovs-ctl: Allow selective start for db and switch")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoutilities: Tweak python shebangs to use env
YAMAMOTO Takashi [Fri, 13 May 2016 14:36:15 +0000 (14:36 +0000)]
utilities: Tweak python shebangs to use env

"python" command provided by pkg_alternatives is a shell script.
At least on NetBSD-7, execve can't execute scripts whose interpreter
is another shell script.  (While some "rich" shells like zsh seem
to handle this case themselves, NetBSD's /bin/sh doesn't.)
Work around the issue by using the env command in the shebangs of
these scripts.

Noticed with the recent tunnel-push-pop.at tests using ovs-pcap command.

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoovn-controller-vtep.at: Pre-sort output before feeding to "sort -d"
YAMAMOTO Takashi [Fri, 13 May 2016 14:11:20 +0000 (14:11 +0000)]
ovn-controller-vtep.at: Pre-sort output before feeding to "sort -d"

NetBSD's "sort -d" preserves the order of lines that contain no
alphanumerics or blanks, e.g. empty lines and [].
This means it sometimes preserves the unstable order of the list output.

Also, simply remove -d option where the expected output doesn't
include [].

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoovsdb-server.at: Fix races
YAMAMOTO Takashi [Fri, 13 May 2016 12:57:48 +0000 (12:57 +0000)]
ovsdb-server.at: Fix races

As ovsdb-server creates pid file before unixctl socket, waiting
for pid file creation is not enough.  Fix the race by retrying
with "version" command before assuming the server is up.

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agodpif: Remove a warning
YAMAMOTO Takashi [Fri, 13 May 2016 11:42:55 +0000 (11:42 +0000)]
dpif: Remove a warning

Remove "attempted to unregister a datapath provider that is not registered"
warning.  It's normal for --enable-dummy=system with userland-only build.
ovn-controller-vtep.at tests use the flag and fail on the extra warning.

Alternatively, we can make the tests ignore this specific warning.
But currently it doesn't make much sense as dp_unregister_provider
is only used for --enable-dummy.

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoovn test: add '-O OpenFlow13' to ovs-ofctl
Flavio Fernandes [Tue, 17 May 2016 01:02:52 +0000 (21:02 -0400)]
ovn test: add '-O OpenFlow13' to ovs-ofctl

Make the calls to ovs-ofctl in the test use the protocol parameter
'-O OpenFlow13', so that they are consistent with the existing dump-flows
invocations.

Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoovn test: remove check for non-existing bridge in hv3
Flavio Fernandes [Tue, 17 May 2016 01:02:51 +0000 (21:02 -0400)]
ovn test: remove check for non-existing bridge in hv3

In OVN vtep test, the network topology is like this:

  hv1---\
         >-- [net1] <-- vtep --> [net2] <-- hv3
  hv2---/

The logical switch lsw0 created in this test has no logical
port corresponding to hv3, so that hypervisor does not have
any bridges created by OVN. With this test change, we are
replacing the 'show br-int' with a check to ensure that
'br-int' is not present.

Fixes: 8dab102238f0 ("ovn: Add more details to test output.")
Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>