cascardo/ovs.git
Thadeu Lima de Souza Cascardo [Thu, 26 May 2016 14:21:36 +0000 (11:21 -0300)]
dpif-netlink: add GENEVE creation support

Creates GENEVE devices using rtnetlink and tunnel metadata. If the kernel does
not support tunnel metadata, it will return EINVAL because of the missing ID and
REMOTE attributes.

This was tested on kernels 4.2.3, 4.3.6, 4.4.9 and 4.5.5. All of them worked
with the system traffic test "datapath - ping over geneve tunnel".

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Thadeu Lima de Souza Cascardo [Thu, 26 May 2016 14:21:36 +0000 (11:21 -0300)]
dpif-netlink: add GRE creation support

Creates GRE devices using rtnetlink and tunnel metadata. If the kernel does
not support tunnel metadata, it will return EEXIST because of the fallback
tunnel. However, on kernels between v3.10 and v3.12, it will not. So, we need to
verify the created tunnel has the tunnel metadata attribute.

This was tested on kernels 4.2.3, 4.3.6, 4.4.9, 4.5.5 and RHEL 3.10 based. All
of them worked with the system traffic test "datapath - ping over gre tunnel".
Also tested on a 3.10 based kernel without the fix for the existing fallback
tunnel.  That is, the kernel would not return EEXIST. Yet, the test works fine.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Thadeu Lima de Souza Cascardo [Thu, 26 May 2016 14:21:36 +0000 (11:21 -0300)]
dpif-netlink: add VXLAN creation support

Creates VXLAN devices using rtnetlink and tunnel metadata. If the kernel does
not support tunnel metadata, it will return EINVAL because of the missing VNI
attribute.

This was tested on kernels 4.2.3, 4.3.6, 4.4.9, 4.5.5 and RHEL-based 3.10. All
of them worked with the system traffic test "datapath - ping over vxlan tunnel".

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Thadeu Lima de Souza Cascardo [Thu, 26 May 2016 13:18:42 +0000 (10:18 -0300)]
dpif-netlink: break out code to add compat and non-compat vports

The vport type for adding tunnels is now compatibility code and any new features
from tunnels must configure the tunnel as an interface using the tunnel metadata
support.

In order to be able to add those tunnels, we need to add code to create the
tunnels and add them as NETDEV vports. And when there is no support to create
them, we need to use the compatibility code and add them as tunnel vports.
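The create-then-fall-back flow described above can be sketched roughly as follows. This is a minimal model and not the dpif-netlink code: the function name and the three callbacks are hypothetical, and the only errno values handled are the ones the GENEVE, GRE and VXLAN messages in this series mention (EINVAL and EEXIST).

```python
import errno

def add_tunnel_port(name, tnl_type, create_rtnetlink, add_netdev_vport,
                    add_compat_vport):
    """Prefer an rtnetlink-created device; fall back to a compat tunnel vport.

    All three callbacks are hypothetical stand-ins for the real operations.
    """
    try:
        create_rtnetlink(name, tnl_type)
    except OSError as e:
        # EINVAL / EEXIST are the errors these commit messages associate with
        # kernels that lack tunnel metadata support.
        if e.errno in (errno.EINVAL, errno.EEXIST):
            return add_compat_vport(name, tnl_type)
        raise
    # The device exists as a regular interface, so add it as a NETDEV vport.
    return add_netdev_vport(name)
```

Any other error is re-raised, since it does not indicate missing tunnel metadata support.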

When removing those tunnels, we need to remove the interfaces as well, and
detecting the right type might be important, at least to distinguish the tunnel
vports that we should remove and the interfaces that we shouldn't.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Thadeu Lima de Souza Cascardo [Thu, 26 May 2016 12:56:34 +0000 (09:56 -0300)]
netdev: get device type from vport prefix if it uses one

If the device name uses a vport prefix, then use that vport type.

Since these names are reserved, we can assume this is the right type.

This is important when we are querying the datapath right after vswitch has
started, and using the right type will be even more important when we add
support for creating tunnel ports with rtnetlink.
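As a rough illustration of the prefix lookup, the sketch below maps a device name to a type when the name starts with a reserved prefix. The prefix table, the function name and the "system" default are assumptions made for this example; the real netdev code keeps its own list.

```python
# Illustrative table only: reserved name prefix -> vport type.
VPORT_PREFIXES = {
    "genev_sys": "geneve",
    "gre_sys": "gre",
    "vxlan_sys": "vxlan",
}

def type_from_name(name, default="system"):
    """Return the vport type implied by a reserved name prefix, else default."""
    for prefix, vport_type in VPORT_PREFIXES.items():
        if name.startswith(prefix):
            return vport_type
    return default
```

Because the names are reserved, a prefix match is treated as sufficient evidence of the type.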

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Thadeu Lima de Souza Cascardo [Fri, 17 Jun 2016 19:33:23 +0000 (16:33 -0300)]
netlink-notifier: change message to a less scary one

"received bad netlink message" may be interpreted as a corrupt netlink message.
However, the parse functions may return failure when the message contains
unexpected attributes or is missing non-optional attributes. Indicating that the
message contained "unexpected contents" avoids the interpretation that the
netlink message was corrupted.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Cc: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
RYAN D. MOATS [Thu, 9 Jun 2016 01:01:39 +0000 (20:01 -0500)]
lport: Persist lport_index and mcgroup_index structures.

This is preparatory to making physical_run and lflow_run process
incrementally as changes to the data in these structures control
that processing.

Signed-off-by: RYAN D. MOATS <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Flavio Fernandes [Wed, 22 Jun 2016 21:49:38 +0000 (17:49 -0400)]
ofproto-dpif.at: Fix typo.

Correct spelling of the word 'dropped'.

The typo appears to have been introduced in this changeset:
http://openvswitch.org/pipermail/dev/2014-March/037433.html

Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
RYAN D. MOATS [Tue, 7 Jun 2016 18:52:51 +0000 (13:52 -0500)]
Convert binding_run to incremental processing.

Ensure that the entire port binding table is processed
when chassis are added/removed or when get_local_iface_ids
finds new ports on the local vswitch.

Side effects:
  - Persist local_datapaths and patch_datapaths across runs so
    that changes to either can be used as a trigger to reset
    incremental flow processing.
  - Persist all_lports structure
  - Revert commit 9baaabfff3c7df014e9acbd4c68189b568552ca9
    (ovn: Fix localnet ports deletion and recreation sometimes
    after restart.) as these changes are not desirable once
    local_datapaths is persisted.

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ryan Moats [Tue, 7 Jun 2016 18:52:50 +0000 (13:52 -0500)]
ovn-controller: Change encaps_run to work incrementally.

As a side effect, tunnel context is persisted.

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Sairam Venugopal [Wed, 22 Jun 2016 20:50:34 +0000 (13:50 -0700)]
datapath-windows: Rename local variable in Vport.c

Declaration of 'event' hides previous local declaration. Rename this to
evt. The other variable wasn't being used.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Flavio Fernandes [Mon, 20 Jun 2016 20:57:22 +0000 (16:57 -0400)]
ovn-northd: no logical router icmp response for directed broadcasts

Responding to icmp queries where the L3 destination is a directed broadcast
was not being properly handled, causing the reply to be sent to all logical
ports except for the one port that should receive it.

This is a proposal for using choice B in the mail discussion, where icmp
queries to broadcast are simply not answered by the logical router.
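To make "directed broadcast" concrete, here is a small sketch using Python's standard ipaddress module. The helper names and the example addresses are hypothetical, and the actual change is expressed as logical flows in ovn-northd rather than as code like this.

```python
import ipaddress

def is_directed_broadcast(dst, networks):
    """True if dst is the broadcast address of one of the attached networks."""
    addr = ipaddress.ip_address(dst)
    return any(addr == ipaddress.ip_network(net).broadcast_address
               for net in networks)

def router_replies_to_echo(dst, router_ips, networks):
    """Model of choice B: never answer echo requests sent to a broadcast."""
    if is_directed_broadcast(dst, networks):
        return False
    return dst in router_ips
```

With a router port 192.168.1.1 on 192.168.1.0/24, an echo request to 192.168.1.1 is answered while one to 192.168.1.255 is not.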

Reported-at: http://openvswitch.org/pipermail/discuss/2016-June/021610.html
Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Han Zhou [Tue, 7 Jun 2016 05:56:50 +0000 (22:56 -0700)]
doc: Fix an error in FAQ.

Signed-off-by: Han Zhou <zhouhan@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Sairam Venugopal [Tue, 21 Jun 2016 22:23:47 +0000 (15:23 -0700)]
datapath-windows: Remove unused headers in Event.c

Cleanup unused headers. Found by inspection.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Chandra S Vejendla [Wed, 22 Jun 2016 01:36:43 +0000 (18:36 -0700)]
ovn: Allow IP packets destined to router ip for SNAT

By default all the ip traffic destined to router ip is dropped in
lr_in_ip_input stage. When the router ip is used as snat ip, allow
reverse snat traffic destined to the router ip.

Signed-off-by: Chandra Sekhar Vejendla <csvejend@us.ibm.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Sairam Venugopal [Tue, 21 Jun 2016 18:09:52 +0000 (11:09 -0700)]
datapath-windows: Remove unused headers from Datapath.c

Clean up unused headers in Datapath.c. Found by inspection.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Yin Lin <linyi@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Gurucharan Shetty [Wed, 11 May 2016 01:59:01 +0000 (18:59 -0700)]
ovn: DNAT and SNAT on a gateway router.

For traffic from physical space to virtual space we need DNAT.
The DNAT happens in the gateway router and reaches the logical
port. The return traffic should be unDNATed.

Traffic originating in virtual space heading to physical space
should be SNATed. The return traffic is unSNATed.

East-west traffic with the public destination IP address needs
a DNAT. This traffic is punted to the l3 gateway where DNAT
takes place. This traffic is also SNATed and eventually loops back to
its destination. The SNAT is needed because we need the reverse traffic
to go back to the l3 gateway and not short-circuit directly to the source.

This commit introduces 4 new logical actions.
1. ct_snat: To send the packet through SNAT zone to unSNAT packets.
2. ct_snat(IP): To SNAT to the provided IP address.
3. ct_dnat: To send the packet through DNAT zone to unDNAT packets.
4. ct_dnat(IP): To DNAT to the provided IP.

This commit only provides the ability to do IP based NAT. This will
eventually be enhanced to do PORT based NAT too.

Command hints:

Consider a distributed router "R1" that has switch foo (192.168.1.0/24)
with a lport foo1 (192.168.1.2) and bar (192.168.2.0/24) with lport bar1
(192.168.2.2) connected to it. You connect "R1" to
a gateway router "R2" via a switch "join" in (20.0.0.0/24) network.

R2 has a switch "alice" (172.16.1.0/24) connected to it (to simulate
external network).

case1 : Add pure DNAT (north-south)

Add a DNAT rule in R2:
ovn-nbctl -- --id=@nat create nat type="dnat" logical_ip=192.168.1.2 \
external_ip=30.0.0.2 -- add logical_router R2 nat @nat

Now alice1 should be able to ping 192.168.1.2 via 30.0.0.2.

case2 : Add pure SNAT (south-north)

Add a SNAT rule in R2:

ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \
external_ip=30.0.0.1 -- add logical_router R2 nat @nat

(You need a static route in R1 to send packets destined to outside
world to go through R2. The logical_ip can be a subnet.)

When bar1 pings alice1, alice1 receives traffic from 30.0.0.1.

case3 : SNAT and DNAT (east-west traffic)

When bar1 pings 30.0.0.2, the traffic jumps to the gateway router
and loops back to foo1 with a source ip address of 30.0.0.1.
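The address rewriting in the three cases can be traced with a toy model. It only applies the two rules configured above to a (source, destination) pair; it ignores conntrack zones, the reverse direction and subnet-valued logical_ip, and the alice1 address used in the usage note is an assumed example.

```python
# NAT rules from the commands above.
DNAT = {"30.0.0.2": "192.168.1.2"}   # external_ip -> logical_ip
SNAT = {"192.168.2.2": "30.0.0.1"}   # logical_ip  -> external_ip

def gateway_nat(src, dst):
    """Return the (src, dst) pair after the gateway router applies its rules."""
    new_dst = DNAT.get(dst, dst)   # DNAT rewrites the destination
    new_src = SNAT.get(src, src)   # SNAT rewrites the source
    return new_src, new_dst
```

For case3, `gateway_nat("192.168.2.2", "30.0.0.2")` yields source 30.0.0.1 and destination 192.168.1.2, which is the packet foo1 receives. For case1, a packet from a hypothetical alice1 address such as 172.16.1.2 keeps its source and only has its destination rewritten.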

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Flavio Fernandes <flavio@flaviof.com>
Lance Richardson [Mon, 6 Jun 2016 18:03:00 +0000 (14:03 -0400)]
tests: make ovn logical router test case more reliable

The "ovn -- 1 HVs, 2 LSs, 1 lport/LS, 1 LR" test case creates a
configuration including a logical router, then:
    1) Sends a packet that is expected to be forwarded by the
       logical router.
    2) Disables the logical router.
    3) Sends another packet, identical to the one sent in (1), that
       should not be forwarded.

This test case fails intermittently, apparently because the disabling
of the logical router in (2) has not yet been propagated to the
forwarding plane at the time the second packet is sent. (When the
failure occurs, two packets are captured whereas only one is expected.)

Address this issue by adding a one second sleep between steps (2) and
(3). Adding a sleep does not actually fix anything, but it
does make this test case more likely to work correctly.

In one series of tests, this test case failed 11 times out of 20
without this fix and succeeded 20 times out of 20 attempts with
this fix.

Fixes: 5412db307420 ("ovn: Add column enabled to table Logical_Router")
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
Jesse Gross [Sun, 29 May 2016 02:17:27 +0000 (19:17 -0700)]
tun-metadata: Use correct offset when accessing fragmented metadata.

Since tunnel metadata is stored in a fixed area in the flow match
field, we must allocate space for options as they are registered with
the switch. In order to avoid exposing implementation complexity to
the controller, we support fragmentation when we run out of contiguous
blocks that are large enough to handle new requests.

When reading or writing to these fragmented blocks, there is a bug
that would cause us to keep on using the area after the allocated
space rather than moving to the next offset. This corrects that to
use the offset for each block.

Unfortunately, while we did have a test for this exact use case, since
the same bug was present in both reading and writing code, everything
appeared to work as normal from the outside.
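A simplified model shows why a matching bug in the reader and the writer can pass a round-trip test. The layout (a list of (offset, length) blocks in a fixed byte area) and all function names are assumptions for illustration and do not mirror the tun-metadata data structures.

```python
def write_fragmented(area, chunks, data):
    """Correct: copy each piece of data to the offset of its own block."""
    if len(data) != sum(length for _, length in chunks):
        raise ValueError("data must exactly fill the allocated blocks")
    pos = 0
    for offset, length in chunks:
        area[offset:offset + length] = data[pos:pos + length]
        pos += length

def read_fragmented(area, chunks):
    """Correct: gather the pieces from each block's offset."""
    return b"".join(bytes(area[offset:offset + length])
                    for offset, length in chunks)

def write_buggy(area, chunks, data):
    """Buggy: keeps writing past the first block instead of moving on."""
    start = chunks[0][0]
    area[start:start + len(data)] = data

def read_buggy(area, chunks):
    """Buggy in the same way, so it reads back what write_buggy stored."""
    start = chunks[0][0]
    total = sum(length for _, length in chunks)
    return bytes(area[start:start + total])
```

With blocks at (0, 4) and (8, 4), the buggy pair still returns the original eight bytes, but it has overwritten bytes 4 to 7, which belong to some other option.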

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
Jarno Rajahalme [Tue, 21 Jun 2016 16:41:02 +0000 (09:41 -0700)]
ofproto: Set to revalidate when a new version is available.

There is no need to set the revalidate flag after each flow mod
separately, as we can do it once after the whole transaction is
finished.  It is not done at all if the transaction fails.

In the successful case this change makes no functional difference,
since the revalidation thread is triggered by the main thread only
after a bundle transaction has been fully processed.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Tue, 21 Jun 2016 16:41:01 +0000 (09:41 -0700)]
xlate: Fix typo in comment.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Tue, 21 Jun 2016 01:51:09 +0000 (18:51 -0700)]
datapath: Fix cached ct with helper.

Upstream commit:
    commit 16ec3d4fbb967bd0e1c8d9dce9ef70e915a86615
    Author: Joe Stringer <joe@ovn.org>
    Date:   Wed May 11 10:29:26 2016 -0700

    openvswitch: Fix cached ct with helper.

    When using conntrack helpers from OVS, a common configuration is to
    perform a lookup without specifying a helper, then go through a
    firewalling policy, only to decide to attach a helper afterwards.

    In this case, the initial lookup will cause a ct entry to be attached to
    the skb, then the later commit with helper should attach the helper and
    confirm the connection. However, the helper attachment has been missing.
    If the user has enabled automatic helper attachment, then this issue
    will be masked as it will be applied in init_conntrack(). It is also
    masked if the action is executed from ovs_packet_cmd_execute() as that
    will construct a fresh skb.

    This patch fixes the issue by making an explicit call to try to assign
    the helper if there is a discrepancy between the action's helper and the
    current skb->nfct.

    Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Fixes: 11251c170d92 ("datapath: Allow attaching helpers to ct action")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pablo Neira Ayuso [Tue, 21 Jun 2016 01:51:09 +0000 (18:51 -0700)]
datapath: __nf_ct_l{3,4}proto_find() always return a valid pointer

Upstream commit:
    commit 3b78155b1b3688dbe910fecdc3e003f431b46630
    Author: Pablo Neira Ayuso <pablo@netfilter.org>
    Date:   Tue May 3 11:13:29 2016 +0200

    openvswitch: __nf_ct_l{3,4}proto_find() always return a valid pointer

    If the protocol is not natively supported, this assigns generic protocol
    tracker so we can always assume a valid pointer after these calls.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:09 +0000 (18:51 -0700)]
datapath: change nf_connlabels_get bit arg to 'highest used'

Upstream commit:
    commit adff6c65600000ec2bb71840c943ee12668080f5
    Author: Florian Westphal <fw@strlen.de>
    Date:   Tue Apr 12 18:14:25 2016 +0200

    netfilter: connlabels: change nf_connlabels_get bit arg to 'highest used'

    nf_connlabel_set() takes the bit number that we would like to set.
    nf_connlabels_get() however took the number of bits that we want to
    support.

    So e.g. nf_connlabels_get(32) supports bits 0 to 31, but not 32.
    This changes nf_connlabels_get() to take the highest bit that we want
    to set.

    Callers then don't have to cope with a potential integer wrap
    when using nf_connlabels_get(bit + 1) anymore.

    Current callers are fine, this change is only to make follow-up
    nft ct label set support simpler.
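The difference between the two argument conventions, and the wrap the old one invites, can be checked with plain arithmetic. The helpers below are a hypothetical model of the semantics described above, not the kernel functions, and the 32-bit mask stands in for an unsigned int argument.

```python
def old_supports(nbits, bit):
    """Old convention: the argument is the number of bits to support."""
    return bit < nbits

def new_supports(highest_bit, bit):
    """New convention: the argument is the highest bit that may be set."""
    return bit <= highest_bit

def u32(value):
    """Truncate to 32 bits, as an unsigned int argument would."""
    return value & 0xFFFFFFFF
```

Under the old convention a caller that wants to set bit N has to pass N + 1, and `u32(0xFFFFFFFF + 1)` is 0, so the largest bit number wraps to a request for zero bits. Passing the highest bit directly avoids the addition.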

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
OVS compat code defined nf_connlabels_get() if it was missing.  Now we
redefine it if it is missing, or if it has the old signature.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Arnd Bergmann [Tue, 21 Jun 2016 01:51:09 +0000 (18:51 -0700)]
datapath: call only into reachable nf-nat code

Upstream commit:
    commit 99b7248e2ad57ca93ada10c6598affb267ffc99a
    Author: Arnd Bergmann <arnd@arndb.de>
    Date:   Fri Mar 18 14:33:45 2016 +0100

    openvswitch: call only into reachable nf-nat code

    The openvswitch code has gained support for calling into the
    nf-nat-ipv4/ipv6 modules, however those can be loadable modules
    in a configuration in which openvswitch is built-in, leading
    to link errors:

    net/built-in.o: In function `__ovs_ct_lookup':
    :(.text+0x2cc2c8): undefined reference to `nf_nat_icmp_reply_translation'
    :(.text+0x2cc66c): undefined reference to `nf_nat_icmpv6_reply_translation'

    The dependency on (!NF_NAT || NF_NAT) prevents similar issues,
    but NF_NAT is set to 'y' if any of the symbols selecting
    it are built-in, but the link error happens when any of them
    are modular.

    A second issue is that even if CONFIG_NF_NAT_IPV6 is built-in,
    CONFIG_NF_NAT_IPV4 might be completely disabled. This is unlikely
    to be useful in practice, but the driver currently only handles
    IPv6 being optional.

    This patch improves the Kconfig dependency so that openvswitch
    cannot be built-in if either of the two other symbols are set
    to 'm', and it replaces the incorrect #ifdef in ovs_ct_nat_execute()
    with two "if (IS_ENABLED())" checks that should catch all corner
    cases and also make the code more readable.

    The same #ifdef exists in ovs_ct_nat_to_attr(), where it does not
    cause a link error, but for consistency I'm changing it the same
    way.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Fixes: c5f6c06b58d6 ("datapath: Interface with NAT.")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: Fix checking for new expected connections.

Upstream commit:
    commit 5745b0be05a0f8ccbc92a36b69f3a6bc58e91954
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Mon Mar 21 11:15:19 2016 -0700

    openvswitch: Fix checking for new expected connections.

    OVS should call into CT NAT for packets of new expected connections only
    when the conntrack state is persisted with the 'commit' option to the
    OVS CT action.  The test for this condition is doubly wrong, as the CT
    status field is ANDed with the bit number (IPS_EXPECTED_BIT) rather
    than the mask (IPS_EXPECTED), and due to the wrong assumption that the
    expected bit would apply only for the first (i.e., 'new') packet of a
    connection, while in fact the expected bit remains on for the lifetime of
    an expected connection.  The 'ctinfo' value IP_CT_RELATED derived from
    the ct status can be used instead, as it is only ever applicable to
    the 'new' packets of the expected connection.
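The first half of the bug is easy to reproduce in isolation. In the kernel's conntrack status enum, IPS_EXPECTED_BIT is 0 and IPS_EXPECTED is 1 << IPS_EXPECTED_BIT; the two test functions below are hypothetical names used only to contrast the checks.

```python
IPS_EXPECTED_BIT = 0                      # bit number
IPS_EXPECTED = 1 << IPS_EXPECTED_BIT      # mask

def expected_wrong(status):
    """ANDs with the bit number: with bit 0 this is always False."""
    return bool(status & IPS_EXPECTED_BIT)

def expected_mask(status):
    """ANDs with the mask: True whenever the expected bit is set."""
    return bool(status & IPS_EXPECTED)
```

Even the mask check would not have been enough here, since the expected bit stays set for the lifetime of the connection, which is why the fix relies on the IP_CT_RELATED ctinfo value instead.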

    Fixes: 05752523e565 ('openvswitch: Interface with NAT.')
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: c5f6c06b58d6 ("datapath: Interface with NAT.")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Haishuang Yan [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: Use proper buffer size in nla_memcpy

Upstream commit:
    commit ac71b46efd2838c02ec193987c8f61c3ba33b495
    Author: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
    Date:   Mon Mar 28 18:08:59 2016 +0800

    openvswitch: Use proper buffer size in nla_memcpy

    For the input parameter count, it's better to use the size of
    the destination buffer, as nla_memcpy() already takes into
    account the length of the source netlink attribute when
    data is copied from an attribute.
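The reasoning can be modelled in a few lines: the copy length is the smaller of the requested count and the attribute payload length, so passing the destination size bounds the copy on both sides. The function below is a Python stand-in with an assumed signature, not the kernel's nla_memcpy().

```python
def nla_memcpy_model(dest, payload, count):
    """Copy at most count bytes, never more than the attribute payload holds."""
    n = min(count, len(payload))
    dest[:n] = payload[:n]
    return n
```

Calling it with `count=len(dest)` can never write past the destination, whatever the attribute length turns out to be.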

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Fixes: c5f6c06b58d6 ("datapath: Interface with NAT.")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: conntrack NF_NAT_RANGE_PROTO_RANDOM_FULLY compat code.

Linux kernel 3.13 and older do not have
NF_NAT_RANGE_PROTO_RANDOM_FULLY (unless backported by the
distribution).  Silently fall back to NF_NAT_RANGE_PROTO_RANDOM to
maintain OVS API compatibility.
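One way to picture the silent fallback is as a flag rewrite. The flag values below match the kernel's NAT range flags (1 << 2 and 1 << 4), but the function and its boolean parameter are hypothetical; the commit message does not show how the compat code implements the fallback.

```python
NF_NAT_RANGE_PROTO_RANDOM = 1 << 2
NF_NAT_RANGE_PROTO_RANDOM_FULLY = 1 << 4

def compat_nat_flags(flags, kernel_has_random_fully):
    """Replace RANDOM_FULLY with RANDOM when the kernel lacks the former."""
    if not kernel_has_random_fully and flags & NF_NAT_RANGE_PROTO_RANDOM_FULLY:
        flags &= ~NF_NAT_RANGE_PROTO_RANDOM_FULLY
        flags |= NF_NAT_RANGE_PROTO_RANDOM
    return flags
```

Other flags pass through unchanged, so a request that never asked for full randomization is unaffected.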

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: conntrack NAT helper compat code for Linux 4.5 and earlier.

Upstream commit:
    commit 264619055bd52bc2278af848472176642d759874
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:17 2016 -0800

    netfilter: Allow calling into nat helper without skb_dst.

    NAT checksum recalculation code assumes existence of skb_dst, which
    becomes a problem for a later patch in the series ("openvswitch:
    Interface with NAT.").  Simplify this by removing the check on
    skb_dst, as the checksum will be dealt with later in the stack.

Suggested-by: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds a corresponding backport for Linux 4.5 and older into
datapath/conntrack.c, changing a TCP or UDP packet to CHECKSUM_PARTIAL
to avoid triggering the skb_dst dependency that otherwise crashes the
kernel when checksums are recalculated after NAT helper has mangled
TCP or UDP packet contents.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:08 +0000 (18:51 -0700)]
datapath: Interface with NAT.

Upstream commit:
    commit 05752523e56502cd9975aec0a2ded465d51a71f3
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:23 2016 -0800

    openvswitch: Interface with NAT.

    Extend OVS conntrack interface to cover NAT.  New nested
    OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action.
    A bare OVS_CT_ATTR_NAT only mangles existing and expected connections.
    If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested
    attributes, new (non-committed/non-confirmed) connections are mangled
    according to the rest of the nested attributes.

    The corresponding OVS userspace patch series includes test cases (in
    tests/system-traffic.at) that also serve as example uses.

    This work extends on a branch by Thomas Graf at
    https://github.com/tgraf/ovs/tree/nat.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Delay conntrack helper call for new connections.

Upstream commit:
    commit 28b6e0c1ace45779c60e7cefe6d469b7ecb520b8
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:22 2016 -0800

    openvswitch: Delay conntrack helper call for new connections.

    There is no need to help connections that are not confirmed, so we can
    delay helping new connections to the time when they are confirmed.
    This change is needed for NAT support, and having this as a separate
    patch will make the following NAT patch a bit easier to review.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Handle NF_REPEAT in conntrack action.

Upstream commit:
    commit 5b6b929376a621e2bd3367f5de563d7123506597
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:21 2016 -0800

    openvswitch: Handle NF_REPEAT in conntrack action.

    Repeat the nf_conntrack_in() call when it returns NF_REPEAT.  This
    avoids dropping a SYN packet re-opening an existing TCP connection.
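The retry can be sketched as follows. The verdict values are the standard netfilter ones (NF_DROP 0, NF_ACCEPT 1, NF_REPEAT 4); `conntrack_in` is a hypothetical callable standing in for nf_conntrack_in(), and retrying exactly once is an assumption of this sketch.

```python
NF_DROP, NF_ACCEPT, NF_REPEAT = 0, 1, 4

def ct_lookup(conntrack_in, pkt):
    """Run conntrack on pkt, repeating the call once if asked to."""
    verdict = conntrack_in(pkt)
    if verdict == NF_REPEAT:
        # e.g. a SYN re-opening an existing TCP connection
        verdict = conntrack_in(pkt)
    return verdict == NF_ACCEPT
```

Without the second call, an NF_REPEAT verdict would be treated like any other non-accept result and the packet would be dropped.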

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Find existing conntrack entry after upcall.

Upstream commit:
    commit 289f225349cb2a97448fd14599ab34b741f706f3
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:20 2016 -0800

    openvswitch: Find existing conntrack entry after upcall.

    Add a new function ovs_ct_find_existing() to find an existing
    conntrack entry to which this packet was already applied.  This is
    only to be called when there is evidence that the packet was already
    tracked and committed, but we lost the ct reference due to a
    userspace upcall.

    ovs_ct_find_existing() is called from skb_nfct_cached(), which can now
    hide the fact that the ct reference may have been lost due to an
    upcall.  This allows ovs_ct_commit() to be simplified.

    This patch is needed by later "openvswitch: Interface with NAT" patch,
    as we need to be able to pass the packet through NAT using the
    original ct reference also after the reference is lost after an
    upcall.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Update the CT state key only after nf_conntrack_in().

Upstream commit:
    commit 394e910e909b174270b8231fd51942eb2f541fb9
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:19 2016 -0800

    openvswitch: Update the CT state key only after nf_conntrack_in().

    Only a successful nf_conntrack_in() call can effect a connection state
    change, so it suffices to update the key only after the
    nf_conntrack_in() returns.

    This change is needed for the later NAT patches.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:07 +0000 (18:51 -0700)]
datapath: Add commentary to conntrack.c

Upstream commit:
    commit 9f13ded8d3c715147c4759f937cfb712c185ca13
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:18 2016 -0800

    openvswitch: Add commentary to conntrack.c

    This makes the code easier to understand and the following patches
    more focused.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
datapath: Remove NF_CT_NEW_REPLY

Upstream commit:
    commit bfa3f9d7f3b349acea8982d2248e33a0ed84c687
    Author: Jarno Rajahalme <jarno@ovn.org>
    Date:   Thu Mar 10 10:54:16 2016 -0800

    netfilter: Remove IP_CT_NEW_REPLY definition.

    Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
    not make sense.  This allows the definition of IP_CT_NUMBER to be
    simplified as well.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
datapath: compat for NAT.

Compat code required to make the NAT code in the following patch
compile with Linux 3.10 - 4.6.

Some compat code applies to the conntrack.c itself; these are added
after the main NAT backport for conntrack.c later in the series.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
acinclude: Add OVS_FIND_PARAM_IFELSE.

OVS_FIND_PARAM_IFELSE is a more robust macro for checking function
parameters, as it does not require the parameter to be on the same
line as the function name like the OVS_GREP_IFELSE does.

Use this to fix the check for struct conntrack_zone parameter, which
is on a different line on Linux 4.3 and higher.
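
The difference between the two kinds of check can be sketched in Python.
This is not the macro's actual m4/shell implementation; the declaration
text and the helpers grep_same_line() and find_param() are made up for
illustration only.

```python
import re

# Hypothetical header text: the parameter of interest sits on the line
# after the function name.
HEADER = """\
int example_ct_alloc(struct net *net,
                     const struct example_zone *zone);
"""


def grep_same_line(text, func, param):
    """Line-based check: true only if func and param share one line."""
    return any(func in line and param in line for line in text.splitlines())


def find_param(text, func, param):
    """Multi-line check: search func's whole parameter list for param."""
    match = re.search(re.escape(func) + r"\s*\(([^)]*)\)", text)
    return match is not None and param in match.group(1)
```

With this sample text the line-based check misses the parameter, while
the multi-line check finds it.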

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agotests: Clear TCP state from conntrack dumps.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
tests: Clear TCP state from conntrack dumps.

When the TCP state is not important, it is better to ignore it.  This
makes test cases more robust w.r.t. kernel versions and timing.
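
As a rough illustration of the idea (the test suite itself may do this
differently, e.g. with sed), the state can be replaced by a fixed
placeholder before comparing dumps.  The dump lines and the helper
clear_tcp_state() below are assumptions, not taken from the tests.

```python
import re


def clear_tcp_state(dump):
    """Replace every TCP state name in a conntrack dump with a placeholder."""
    return re.sub(r"state=[0-9A-Z_]+", "state=CLEARED", dump)


# Two hypothetical dumps of the same connection taken at different times.
early = "tcp,orig=(src=10.1.1.1,dst=10.1.1.2),protoinfo=(state=SYN_SENT)"
late = "tcp,orig=(src=10.1.1.1,dst=10.1.1.2),protoinfo=(state=TIME_WAIT)"
```

After clearing, both dumps compare equal, so the expected output no
longer depends on how far the connection has progressed.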

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agodatapath-windows: comment cleanup and indentation
Nithin Raju [Thu, 16 Jun 2016 17:17:09 +0000 (10:17 -0700)]
datapath-windows: comment cleanup and indentation

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolution.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years agonetdev-dpdk: NUMA Aware vHost User
Ciara Loftus [Mon, 13 Jun 2016 10:10:09 +0000 (11:10 +0100)]
netdev-dpdk: NUMA Aware vHost User

This commit allows for vHost User memory from QEMU, DPDK and OVS, as
well as the servicing PMD, to all come from the same socket.

The socket id of a vhost-user port used to be set to that of the master
lcore. Now it is possible to update the socket id if it is detected
(during VM boot) that the vhost device memory is not on this node. If
this is the case, a new mempool is created from the new node, and the
PMD thread currently servicing the port stops doing so, in favour of a
thread from the new node (if enabled in the pmd-cpu-mask).

To avail of this functionality, one must enable the
CONFIG_RTE_LIBRTE_VHOST_NUMA DPDK configuration option.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agodatapath-windows: use ip proto for tunnel port lookup
Nithin Raju [Fri, 17 Jun 2016 17:51:52 +0000 (10:51 -0700)]
datapath-windows: use ip proto for tunnel port lookup

In Actions.c, we look up the tunnel port based on the IP protocol type
and L4 port of the outer packet. The function that did this took the
tunnel type as an argument. Semantically, it is better to pass the IP
protocol type and let the lookup code map IP protocol type to tunnel
type.

In the vport add code, we make sure that we block tunnel
port addition if there's already a tunnel port that uses
the same IP protocol type and L4 port number.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Yin Lin <linyi@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
7 years agoipfix: Support tunnel information for Flow IPFIX.
Benli Ye [Tue, 14 Jun 2016 08:53:34 +0000 (16:53 +0800)]
ipfix: Support tunnel information for Flow IPFIX.

Add support to export tunnel information for flow-based IPFIX.
The original steps to configure flow level IPFIX:
    1) Create a new record in Flow_Sample_Collector_Set table:
       'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
    2) Add IPFIX configuration which is referred by corresponding
       row in Flow_Sample_Collector_Set table:
       'ovs-vsctl -- set Flow_Sample_Collector_Set
       "Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX
       targets=\"IP:4739\" obs_domain_id=123 obs_point_id=456
       cache_active_timeout=60 cache_max_flows=13'
    3) Add sample action to the flows:
       'ovs-ofctl add-flow mybridge in_port=1,
       actions=sample'('probability=65535,collector_set_id=1,
       obs_domain_id=123,obs_point_id=456')',output:3'
The NXAST_SAMPLE action was used in step 3. To support exporting tunnel
information, this patch adds the NXAST_SAMPLE2 action, with which step 3
should be configured as follows:
       'ovs-ofctl add-flow mybridge in_port=1,
       actions=sample'('probability=65535,collector_set_id=1,obs_domain_id=123,
       obs_point_id=456,sampling_port=3')',output:3'
'sampling_port' can be the ingress port or one of the egress ports. If the
sampling port is the output port and the output port is a tunnel port,
OVS_USERSPACE_ATTR_EGRESS_TUN_PORT will be set in the datapath flow sample action.
When the flow sample action upcall happens, tunnel information will be retrieved from
the datapath and IPFIX can then export egress tunnel port information. If
sampling_port=65535 (OFPP_NONE), flow-based IPFIX keeps the same behavior
as before.

This patch mainly does four things:
    1) Add a new flow sample action NXAST_SAMPLE2 to support exporting
       tunnel information. The NXAST_SAMPLE2 action has a newly added field,
       'sampling_port'.
    2) Use 'other_config:enable-tunnel-sampling' to enable or disable
       exporting tunnel information.
    3) If 'sampling_port' is equal to output port and output port is a tunnel
       port, the translation of OpenFlow "sample" action should first emit
       set(tunnel(...)), then the sample action itself. It makes sure the
       egress tunnel information can be sampled.
    4) Add a test of flow-based IPFIX for tunnel set.

How to test flow-based IPFIX:
    1) Set up a test environment with two Linux hosts with Docker support
    2) Create a Docker container and a GRE tunnel port on each host
    3) Use ovs-docker to add the container on the bridge
    4) Listen on port 4739 on the collector machine and use wireshark to filter
       'cflow' packets.
    5) Configure flow-based IPFIX:
       - 'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
       - 'ovs-vsctl -- set Flow_Sample_Collector_Set
          "Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX \
          targets=\"IP:4739\" cache_active_timeout=60 cache_max_flows=13 \
          other_config:enable-tunnel-sampling=true'
       - 'ovs-ofctl add-flow mybridge in_port=1,
          actions=sample'('probability=65535,collector_set_id=1,obs_domain_id=123,
          obs_point_id=456,sampling_port=3')',output:3'
       Note: The in-port is the container port. The output port and sampling_port
             are both OpenFlow ports, and the output port is a GRE tunnel port.
    6) Ping from the container whose host has flow-based IPFIX enabled.
    7) Get the IPFIX template packets and IPFIX information packets.

Signed-off-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agonetdev-dpdk: Remove vhost send retries when no packets have been sent.
Kevin Traynor [Fri, 10 Jun 2016 16:49:38 +0000 (17:49 +0100)]
netdev-dpdk: Remove vhost send retries when no packets have been sent.

If the guest is connected but not servicing the virt queue, this leads
to vhost send retries until timeout. This is fine in isolation but if
there are other high rate queues also being serviced by the same PMD
it can lead to a performance hit on those queues. Change to only retry
when at least some packets have been successfully sent on the previous
attempt.

Also, limit retries to avoid similar delays if packets are being sent
at a very low rate due to few available descriptors.
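
The retry policy described above can be sketched as follows.  This is a
Python model, not the DPDK/OVS code: send_with_retries(), the enqueue
callback and the default of 8 retries are all assumptions chosen for
illustration.

```python
def send_with_retries(packets, enqueue, max_retries=8):
    """Send packets, retrying only while the previous attempt made progress.

    enqueue(pending) is a hypothetical callback that returns how many
    packets from the front of `pending` were accepted.
    """
    pending = list(packets)
    retries = 0
    while pending:
        sent = enqueue(pending)
        if sent == 0:
            break  # no progress: stop instead of spinning until a timeout
        pending = pending[sent:]
        if retries >= max_retries:
            break  # bounded retries even when progress is slow
        retries += 1
    return len(packets) - len(pending)
```

A guest that accepts nothing costs a single enqueue attempt, and a guest
that accepts one packet per attempt costs at most max_retries + 1
attempts.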

Reported-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agoofp-util: Fix parsing of parenthesized values within key-value pairs.
Ben Pfaff [Mon, 13 Jun 2016 21:53:01 +0000 (14:53 -0700)]
ofp-util: Fix parsing of parenthesized values within key-value pairs.

Reported-by: james hopper <jameshopper@email.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-June/021662.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoat test vlog: Switch from stderr to log
Alin Serdean [Wed, 8 Jun 2016 14:02:20 +0000 (14:02 +0000)]
at test vlog: Switch from stderr to log

With the --detach parameter, the child does not propagate the first
message to the parent.

The proposed change uses the log file instead of stderr.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Tested-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoovs-ofctl: Fixed PID file naming on windows
Paul Boca [Wed, 8 Jun 2016 08:40:34 +0000 (08:40 +0000)]
ovs-ofctl: Fixed PID file naming on windows

On Windows, if a relative file name (one not containing ':') is given to
the --pidfile parameter, the application name is used for the PID file and
the given name is ignored.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agodatapath-windows: Fix misc on vport
Alin Serdean [Tue, 10 May 2016 00:46:01 +0000 (00:46 +0000)]
datapath-windows: Fix misc on vport

Remove unused variables, found by inspection.

On failure, reset the extInfo name.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agodatapath-windows: Sample action support.
Sorin Vinturis [Wed, 1 Jun 2016 15:50:27 +0000 (15:50 +0000)]
datapath-windows: Sample action support.

This patch adds support for sampling to the OVS extension.

The following flow was used for generating sample actions:
  ovs-ofctl add-flow tcp:127.0.0.1:9999 "actions=sample(
    probability=12345,collector_set_id=23456,obs_domain_id=34567,
    obs_point_id=45678)"

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoipfix: Bug fix for not sending template packets on 32-bit OS
Benli Ye [Tue, 14 Jun 2016 03:09:45 +0000 (11:09 +0800)]
ipfix: Bug fix for not sending template packets on 32-bit OS

'last_template_set_time' in struct dpif_ipfix_exporter is declared
as time_t, and time_t is a long int type. If we initialize
'last_template_set_time' to TIME_MIN, its value is -2147483648
on a 32-bit OS and -2^63 on a 64-bit OS. On a 32-bit OS this causes a
problem when 'last_template_set_time' is compared with an unsigned int
variable, because the negative value is converted to unsigned and
becomes a large positive number. Fix this problem by simply initializing
'last_template_set_time' to 0.
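
The effect can be modelled in Python by emulating the conversion to a
32-bit unsigned value.  The exact comparison in the IPFIX code is not
quoted in this message, so the "last + interval <= now" form and the
sample numbers below are assumptions.

```python
def as_uint32(value):
    """Emulate C conversion of an integer to a 32-bit unsigned type."""
    return value & 0xFFFFFFFF


def template_due_32bit(last, interval, now):
    """Comparison carried out in 32-bit unsigned arithmetic."""
    return as_uint32(as_uint32(last) + interval) <= as_uint32(now)


def template_due_64bit(last, interval, now):
    """Comparison carried out in wide signed arithmetic."""
    return last + interval <= now


NOW = 1465873785       # example export time, seconds since the epoch
INTERVAL = 600         # example template interval, seconds
TIME_MIN_32 = -2**31   # TIME_MIN where time_t is 32 bits wide
TIME_MIN_64 = -2**63   # TIME_MIN where time_t is 64 bits wide
```

In this model the 32-bit comparison never reports the template as due
when starting from TIME_MIN, while starting from 0 works for both widths.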

Signed-off-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
7 years agoipfix: Add support for exporting ipfix statistics.
Benli Ye [Mon, 13 Jun 2016 21:44:09 +0000 (14:44 -0700)]
ipfix: Add support for exporting ipfix statistics.

It is useful for users to check IPFIX statistics. Using IPFIX stats,
a user can see how many flows the system can support. The stats can
also be used to check IPFIX performance.

IPFIX stats are kept per IPFIX exporter. If bridge IPFIX is
enabled on the bridge, the whole bridge has one exporter.
For flow IPFIX, the system keeps one exporter per id (a column in
Flow_Sample_Collector_Set).

1) Add 'ovs-ofctl dump-ipfix-bridge SWITCH' to export IPFIX stats of
   a bridge that has bridge IPFIX enabled. The output format:
   NXST_IPFIX_BRIDGE reply (xid=0x2):
     bridge ipfix: flows=0, current flows=0, sampled pkts=0, \
                   ipv4 ok=0, ipv6 ok=0, tx pkts=0
                   pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
2) Add 'ovs-ofctl dump-ipfix-flow SWITCH' to export IPFIX stats of
   a bridge that has flow IPFIX enabled. The output format:
   NXST_IPFIX_FLOW reply (xid=0x2): 2 ids
     id   1: flows=4, current flows=4, sampled pkts=14, ipv4 ok=13, \
             ipv6 ok=0, tx pkts=0
             pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
     id   2: flows=0, current flows=0, sampled pkts=0, ipv4 ok=0, \
             ipv6 ok=0, tx pkts=0
             pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0

flows: the total number of flow records, including those exported.
current flows: the number of flow records currently cached.
sampled pkts: the count of successfully sampled packets.
ipv4 ok: the count of successfully sampled IPv4 flow packets.
ipv6 ok: the count of successfully sampled IPv6 flow packets.
tx pkts: the count of IPFIX exported packets sent to the collector(s).
pkts errs: the count of packets that failed sampling, because they are unsupported or due to another error.
ipv4 errs: the count of IPv4 flow packets among the error packets.
ipv6 errs: the count of IPv6 flow packets among the error packets.
tx errs: the count of IPFIX exported packets that failed to be sent to the collector(s).

Signed-off-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoovs-vsctl: Support identifying Flow_Sample_Collector_Set records by id.
Ben Pfaff [Fri, 10 Jun 2016 22:19:03 +0000 (15:19 -0700)]
ovs-vsctl: Support identifying Flow_Sample_Collector_Set records by id.

This allows commands like
    ovs-vsctl list Flow_Sample_Collector_Set 123
if there's a record with id 123.  It's not perfect, since there can be
more than one record with the same id, but it's helpful.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
7 years agonetlink-notifier: Support multiple groups.
Jarno Rajahalme [Mon, 13 Jun 2016 21:22:32 +0000 (14:22 -0700)]
netlink-notifier: Support multiple groups.

A netlink notifier ('nln') already supports multiple notifiers.  This
patch allows each of these notifiers to subscribe to a different
multicast group.  Sharing a single socket for multiple event types
(each on their own multicast group) provides serialization of events
when reordering of different event types could be problematic.  For
example, if a 'create' event and a 'delete' event are on different
netlink multicast groups, we may want to process those events in the
order in which the kernel issued them, rather than in the order we happen
to check for them.

Moving the multicast group argument from nln_create() to
nln_notifier_create() allows each notifier to specify a different
multicast group.  The parse callback needs to identify the group the
message belongs to by returning the corresponding group number, or 0
when a parse error occurs.
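
The dispatch model can be sketched in Python.  This is a toy model, not
the C API: the Nln class, the (group, change) return value of parse()
and the 'group:payload' message format are all hypothetical.

```python
class Nln(object):
    """Toy stand-in for a netlink notifier that shares one socket."""

    def __init__(self, parse):
        # parse(raw) returns (group, change); group 0 signals a parse error.
        self.parse = parse
        self.notifiers = []

    def notifier_create(self, group, callback):
        """Register callback for messages on one multicast group."""
        self.notifiers.append((group, callback))

    def run(self, messages):
        """Deliver messages in arrival order, whatever their group."""
        for raw in messages:
            group, change = self.parse(raw)
            if group == 0:
                continue  # unparseable message: skipped in this sketch
            for wanted, callback in self.notifiers:
                if wanted == group:
                    callback(change)


def parse(raw):
    """Hypothetical parser for 'group:payload' strings."""
    group, _, payload = raw.partition(":")
    return (int(group), payload) if group.isdigit() else (0, None)


log = []
nln = Nln(parse)
nln.notifier_create(1, lambda change: log.append(("create", change)))
nln.notifier_create(2, lambda change: log.append(("delete", change)))
nln.run(["1:port-a", "2:port-a", "bad", "1:port-b"])
```

Because both groups are read from the same queue, the 'delete' for
port-a is seen after its 'create' and before the 'create' for port-b.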

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
7 years agodpif-netdev: Print installed flows in dpif format.
Jesse Gross [Sat, 28 May 2016 16:56:07 +0000 (09:56 -0700)]
dpif-netdev: Print installed flows in dpif format.

When debug logging is enabled, dpif-netdev can print each flow as it is
installed, which it currently does using OpenFlow match formatting. Compared
to ODP formatting, there generally isn't too much difference since the
fields are largely the same but it is inconsistent with other logging in
dpif-netdev as well as the analogous functions that deal with the kernel.

However, in some cases there is a difference between the two formats, such
as in the cases of input port or tunnel metadata. For input port, datapath
format helped detect that the generated masks were incorrect. As for tunnels,
at the moment, it's possible to convert between the two formats on demand as
we have a global metadata table. In the future, though, this won't be possible
as the metadata table becomes per-bridge, which the datapath won't have access
to.

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agoodp-util: Remove odp_in_port from struct odp_flow_key_parms.
Jesse Gross [Thu, 9 Jun 2016 20:32:50 +0000 (13:32 -0700)]
odp-util: Remove odp_in_port from struct odp_flow_key_parms.

When calling odp_flow_key_from_flow (or _mask), the in_port included
as part of the flow is ignored and must be explicitly passed as a
separate parameter. This is because the assumption was that the flow's
version would often be in OFP format, rather than ODP.

However, at this point all flows that are ready for serialization in
netlink format already have their in_port properly set to ODP format.
As a result, every caller needs to explicitly initialize the extra
parameter to the value that is in the flow. This switches to just use
the value in the flow to simplify things and avoid the possibility of
forgetting to initialize the extra parameter.

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agoofproto-dpif-upcall: Translate input port as part of upcall translation.
Jesse Gross [Thu, 9 Jun 2016 20:18:45 +0000 (13:18 -0700)]
ofproto-dpif-upcall: Translate input port as part of upcall translation.

When we generate wildcards for upcalled flows, the flows, and therefore
the wildcards, are in OpenFlow format. OpenFlow and datapath formats are
mostly the same, but one exception is the input port. We work around this problem by simply
performing an exact match on the input port when generating netlink
formatted keys. (This does not lose any information in practice because
action translation also always exact matches on input port.)

While this works fine for kernel based flows, it misses the userspace
datapath, which directly consumes the OFP format mask for the input
port. The effect of this is that the in_port mask is sometimes only
the lower 16 bits of the field. (This is because OFP format is a 16-bit
value stored in a 32-bit field. The full width of the field is initialized
with an exact match mask but certain operations result in cleaving this
down to 16 bits.) In practice this does not cause a problem because datapath
port numbers are almost always in the lower 16 bits of the range anyways.

This moves the masking of the datapath format field to translation so that
all datapaths see the same result. This also makes more sense conceptually
as the input port in the flow is also in ODP format at this stage.

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agoovn-architecture.7.xml: Fix ovn-controller behavior in VIF life cycle
Hui Kang [Mon, 13 Jun 2016 16:43:26 +0000 (12:43 -0400)]
ovn-architecture.7.xml: Fix ovn-controller behavior in VIF life cycle

Signed-off-by: Hui Kang <kangh@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoovn: Replace tabs with spaces and clean up alignment in unit tests.
Justin Pettit [Wed, 18 May 2016 06:15:40 +0000 (23:15 -0700)]
ovn: Replace tabs with spaces and clean up alignment in unit tests.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
7 years agoovn-nbctl: Update logical switch commands.
Justin Pettit [Thu, 9 Jun 2016 00:15:02 +0000 (17:15 -0700)]
ovn-nbctl: Update logical switch commands.

    A few minor changes related to logical switch commands:

        - Use "ls" instead of "lswitch" to be more consistent with other
          command changes.
        - Use commands where possible in ovn unit tests.
        - Update references from "lswitch" to "ls" (code) or "switch" (user).

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoovn-nbctl: Update logical switch port commands.
Justin Pettit [Tue, 7 Jun 2016 23:43:34 +0000 (16:43 -0700)]
ovn-nbctl: Update logical switch port commands.

A few minor changes related to logical switch port commands:

    - Use "lsp" instead of "lport" to be more consistent with later
      changes.
    - Use commands where possible in ovn unit tests.
    - Update references from "lport" to "lsp" (code) or "port" (user).

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoovn: Use Logical_Switch_Port in NB.
Justin Pettit [Tue, 7 Jun 2016 23:22:06 +0000 (16:22 -0700)]
ovn: Use Logical_Switch_Port in NB.

We have both logical switch and router ports.  Router ports are
referenced in the "Logical_Router_Port" table, so this makes naming more
consistent.

Also change internal use of "lport" to "lsp".

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoovn-nbctl: Add static route commands.
Justin Pettit [Tue, 17 May 2016 13:02:28 +0000 (06:02 -0700)]
ovn-nbctl: Add static route commands.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agopackets: Parse IP address strings with a zero length prefix.
Justin Pettit [Tue, 17 May 2016 14:08:29 +0000 (07:08 -0700)]
packets: Parse IP address strings with a zero length prefix.

A zero prefix length is used to match any IP address, which is useful
for defining default routes.
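
Why a zero-length prefix matches everything can be shown with a short
Python sketch.  It uses Python's standard ipaddress module, not the OVS
parser; prefix_mask() and matches() are illustrative helpers.

```python
import ipaddress


def prefix_mask(plen):
    """Return the 32-bit netmask for an IPv4 prefix length (0..32)."""
    return (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF


def matches(addr, network, plen):
    """True if addr falls inside network/plen."""
    mask = prefix_mask(plen)
    a = int(ipaddress.IPv4Address(addr))
    n = int(ipaddress.IPv4Address(network))
    return (a & mask) == (n & mask)


# The conventional IPv4 default route, written with a zero-length prefix.
default_route = ipaddress.ip_network("0.0.0.0/0")
```

With a prefix length of 0 the mask is all zeroes, so every address
compares equal to the network after masking.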

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Flavio Fernandes <flavio@flaviof.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovn-nbctl: Update logical router port commands.
Justin Pettit [Wed, 18 May 2016 00:56:12 +0000 (17:56 -0700)]
ovn-nbctl: Update logical router port commands.

A few minor changes related to logical router port commands:

    - Use "lrp" instead of "lrport" to be more consistent with later
      changes.
    - Use commands where possible in ovn unit tests.
    - Move documentation to group router commands together.
    - Add mac/network/peer to the lrp-add command.  The existing command
      doesn't require creating a mac or network address, which shouldn't
      be possible.
    - Drop lrport-[get|set]-mac-addresses commands in favor of
      initializing them in the lrp-add command.
    - Update references from "lrport" to "lrp" (code) or "port" (user).

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoovn-nbctl: Update basic router commands.
Justin Pettit [Tue, 17 May 2016 13:39:46 +0000 (06:39 -0700)]
ovn-nbctl: Update basic router commands.

A few minor changes related to router commands:

    - Use "lr" instead of "lrouter" to be more consistent with later
      changes.
    - Use the commands where possible in ovn unit tests.
    - Move documentation to group router commands together.
    - Update references from "lrouter" to "router".

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovn-nbctl: Use "ctx->output" instead of printf for list ACLs.
Justin Pettit [Wed, 18 May 2016 18:55:02 +0000 (11:55 -0700)]
ovn-nbctl: Use "ctx->output" instead of printf for list ACLs.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovn-controller: Fix memory leak reported by valgrind.
William Tu [Sun, 5 Jun 2016 14:37:35 +0000 (07:37 -0700)]
ovn-controller: Fix memory leak reported by valgrind.

Calling ovsdb_idl_set_remote() might overwrite 'idl->session'.  The patch
fixes the leak by freeing 'idl->session' before it is overwritten.

The test case "ovn-controller - ovn-bridge-mappings" reports two "definitely lost" records in:
    xmalloc (util.c:112)
    jsonrpc_session_open (jsonrpc.c:784)
    ovsdb_idl_create (ovsdb-idl.c:246)
    main (ovn-controller.c:384)
and,
    xmalloc (util.c:112)
    jsonrpc_session_open (jsonrpc.c:784)
    ovsdb_idl_set_remote (ovsdb-idl.c:289)
    main (ovn-controller.c:409)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agotests: Remove "test" from test names.
Ben Pfaff [Thu, 9 Jun 2016 02:01:16 +0000 (19:01 -0700)]
tests: Remove "test" from test names.

Every test is a test, so each test doesn't need to attest to being a test.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovn-nb.xml: Fix typo.
Ben Pfaff [Thu, 9 Jun 2016 22:17:45 +0000 (15:17 -0700)]
ovn-nb.xml: Fix typo.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
7 years agoovs-bugtool: Fix flake8 errors.
Russell Bryant [Thu, 9 Jun 2016 20:20:11 +0000 (21:20 +0100)]
ovs-bugtool: Fix flake8 errors.

A previous commit added this file to be checked by flake8, but the file
failed a number of checks done by the 'hacking' flake8 plugin.

Fixes: b00bdc728e7a ("automake: Add ovs-bugtool.in to flake8-check.")
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-By: Kyle Mestery <mestery@mestery.com>
7 years agodatapath:backport: openvswitch: use flow protocol when recalculating ipv6 checksums
Pravin B Shelar [Thu, 9 Jun 2016 05:53:23 +0000 (22:53 -0700)]
datapath:backport: openvswitch: use flow protocol when recalculating ipv6 checksums

Upstream commit:
    commit b4f70527f052b0c00be4d7cac562baa75b212df5
    Author: Simon Horman <simon.horman@netronome.com>
    Date:   Thu Apr 21 11:49:15 2016 +1000

    openvswitch: use flow protocol when recalculating ipv6 checksums

    When using masked actions the ipv6_proto field of an action
    to set IPv6 fields may be zero rather than the prevailing protocol
    which will result in skipping checksum recalculation.

    This patch resolves the problem by relying on the protocol
    in the flow key rather than that in the set field action.

    Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.")
Cc: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
7 years agoautomake: Add ovs-bugtool.in to flake8-check.
Gurucharan Shetty [Mon, 6 Jun 2016 06:57:58 +0000 (23:57 -0700)]
automake: Add ovs-bugtool.in to flake8-check.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovs-bugtool.in: Do not assign a lambda expression, use a def.
Gurucharan Shetty [Mon, 6 Jun 2016 06:56:51 +0000 (23:56 -0700)]
ovs-bugtool.in: Do not assign a lambda expression, use a def.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovs-bugtool.in: Comparison to None should be 'if cond is None:'
Gurucharan Shetty [Mon, 6 Jun 2016 06:12:36 +0000 (23:12 -0700)]
ovs-bugtool.in: Comparison to None should be 'if cond is None:'

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovs-bugtool.in: Test for membership should be 'not in'.
Gurucharan Shetty [Mon, 6 Jun 2016 06:11:08 +0000 (23:11 -0700)]
ovs-bugtool.in: Test for membership should be 'not in'.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovs-bugtool.in: Remove usage of 'has_key'.
Gurucharan Shetty [Mon, 6 Jun 2016 06:07:12 +0000 (23:07 -0700)]
ovs-bugtool.in: Remove usage of 'has_key'.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovs-bugtool.in: Remove unused variables.
Gurucharan Shetty [Mon, 6 Jun 2016 05:55:19 +0000 (22:55 -0700)]
ovs-bugtool.in: Remove unused variables.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovs-bugtool.in: Fix errors around spaces and line length.
Gurucharan Shetty [Mon, 6 Jun 2016 05:20:39 +0000 (22:20 -0700)]
ovs-bugtool.in: Fix errors around spaces and line length.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovs-bugtool.in: Remove unused imports.
Gurucharan Shetty [Fri, 3 Jun 2016 11:57:53 +0000 (04:57 -0700)]
ovs-bugtool.in: Remove unused imports.

Also take care of an 'import not at top of file' warning from
flake8.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
7 years agoovs-numa: Fix a compilation error
YAMAMOTO Takashi [Wed, 8 Jun 2016 04:15:20 +0000 (04:15 +0000)]
ovs-numa: Fix a compilation error

Fix the following error on NetBSD 7.0.

    ../lib/ovs-numa.c: In function 'ovs_numa_set_cpu_mask':
    ../lib/ovs-numa.c:555:9: error: array subscript has type 'char' [-Werror=char-subscripts]

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agosparse: Fix conflict between netinet/in.h and linux/in.h
Daniele Di Proietto [Thu, 2 Jun 2016 01:35:55 +0000 (18:35 -0700)]
sparse: Fix conflict between netinet/in.h and linux/in.h

linux/in.h (from linux uapi headers) carries many of the same
definitions as netinet/in.h (from glibc).

If linux/in.h is included after netinet/in.h, conflicts are avoided in
two ways:

1) linux/libc-compat.h (included by linux/in.h) detects the include
   guard of netinet/in.h and defines some macros (e.g.
   __UAPI_DEF_IN_IPPROTO) to 0.  linux/in.h avoids exporting the same
   enums if those macros are 0.

2) The two files are allowed to redefine the same macros as long as the
   values are the same.

Our include/sparse/netinet/in.h creates problems, because:

1) It uses a custom include guard
2) It uses dummy values for some macros.

This commit changes include/sparse/netinet/in.h to use the same include
guard as glibc netinet/in.h, and to use the same values for some macros.

I think this problem is present with linux headers after
a263653ed798("netfilter: don't pull include/linux/netfilter.h from netns
headers"), which causes our lib/netlink-conntrack.c to include linux/in.h
after netinet/in.h.

sample output from sparse:

/usr/include/linux/in.h:29:9: warning: preprocessor token IPPROTO_IP
redefined
../include/sparse/netinet/in.h:60:9: this was the original definition
/usr/include/linux/in.h:31:9: warning: preprocessor token IPPROTO_ICMP
redefined
../include/sparse/netinet/in.h:63:9: this was the original definition
[...]
/usr/include/linux/in.h:28:3: error: bad enum definition
/usr/include/linux/in.h:28:3: error: Expected } at end of specifier
/usr/include/linux/in.h:28:3: error: got 0
/usr/include/linux/in.h:84:16: error: redefinition of struct in_addr

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoAdd optional C extension wrapper for Python JSON parsing
Terry Wilson [Wed, 8 Jun 2016 13:55:14 +0000 (08:55 -0500)]
Add optional C extension wrapper for Python JSON parsing

The pure Python in-tree JSON parser is *much* slower than the
in-tree C JSON parser. A local test parsing a 100Mb JSON file
showed the Python version taking 270 seconds. With the C wrapper,
it took under 4 seconds.

The C extension will be used automatically if it can be built. If
the extension fails to build, a warning is displayed and the build
is restarted without the extension.

The Serializer class is replaced with Python's built-in
JSON library since the ability to process chunked data is not
needed in that case.

The extension should work with both Python 2.7 and Python 3.3+.

Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoEnsure significand remains an integer in Python3 json parser
Terry Wilson [Wed, 8 Jun 2016 13:55:13 +0000 (08:55 -0500)]
Ensure significand remains an integer in Python3 json parser

The / operation in Python 2 is "floor division" for int/long types,
while in Python 3 it is "true division". This means that the
significand can become a float with the existing code in Python 3.
This, in turn, can result in a parse of something like [1.10e1]
returning 11 in Python 2 and 11.0 in Python 3. Switching to the
// operator resolves this difference.

The JSON tests do not catch this difference because the built-in
serializer prints floats with the %.15g format which will convert
floats with no fractional part to an integer representation.
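
The difference can be reproduced with a small sketch, run under Python 3.
normalize() is only loosely modelled on the idea of stripping trailing
zeroes from the significand; it is not the parser's code, and the
'floor' switch exists only to compare the two operators.

```python
def normalize(significand, pow10, floor=True):
    """Strip trailing zeroes from significand while pow10 is negative."""
    while pow10 < 0 and significand % 10 == 0:
        if floor:
            significand = significand // 10  # stays an int
        else:
            significand = significand / 10   # becomes a float in Python 3
        pow10 += 1
    return significand, pow10


# "1.10e1" read as significand 110 with a decimal exponent of -1.
with_floor = normalize(110, -1)
with_true = normalize(110, -1, floor=False)
```

Both results compare equal numerically, but only the // version keeps an
int, and formatting the float with %.15g hides the difference.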

Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
7 years agoofproto-dpif-upcall: Prevent memory leak on log message.
Thadeu Lima de Souza Cascardo [Wed, 8 Jun 2016 16:04:11 +0000 (13:04 -0300)]
ofproto-dpif-upcall: Prevent memory leak on log message.

When the DPIF does not support UFID (as with old kernels), OVS may log
this message quite frequently if the OVS version in use does not include
the upstream fix af50de800ecb ("ofproto-dpif-upcall: Pass key to
dpif_flow_get().").

Fixes: 64bb477f0568 ("dpif: Minimize memory copy for revalidation.")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
7 years agotunnels: Update schema documentation related to tunnels.
Jesse Gross [Tue, 7 Jun 2016 20:53:44 +0000 (13:53 -0700)]
tunnels: Update schema documentation related to tunnels.

As both OVS and tunnel protocols themselves have evolved, some changes
have caused the documentation to drift from current reality.

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoxenserver: Remove deprecated print statement.
Joe Stringer [Tue, 24 May 2016 01:20:31 +0000 (18:20 -0700)]
xenserver: Remove deprecated print statement.

PEP 3105 removed the print statement in favour of a print function.
Replace usage of the old statement with equivalent functionality that
works in both python2.7 and python3.
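
One common way to write print calls that behave the same on both
interpreters is the __future__ import sketched below. Whether the commit
uses this import or another equivalent is not stated here, and the
Collector class and report() function are hypothetical names for the
example.

```python
from __future__ import print_function


class Collector(object):
    """Minimal file-like object that records everything written to it."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


def report(message, stream):
    # Old statement form (Python 2 only): print >>stream, message
    print(message, file=stream)


out = Collector()
report("interface eth0 configured", out)
captured = "".join(out.chunks)
```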

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoxenserver: Use PEP 3110 exception syntax.
Joe Stringer [Tue, 24 May 2016 01:20:30 +0000 (18:20 -0700)]
xenserver: Use PEP 3110 exception syntax.

This syntax is usable with both python2.7 and python3, so use it instead
of the outdated syntax.
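
A minimal sketch of the two forms, using a hypothetical parse_mtu()
helper that is not taken from the xenserver scripts:

```python
def parse_mtu(text):
    """Return (value, error); exactly one of the two is None."""
    try:
        return int(text), None
    # Old form, rejected by Python 3: except ValueError, e:
    except ValueError as e:
        return None, str(e)
```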

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoxenserver: Remove tuple unpacking in lambdas.
Joe Stringer [Tue, 24 May 2016 01:20:29 +0000 (18:20 -0700)]
xenserver: Remove tuple unpacking in lambdas.

PEP 3113 removed tuple parameter unpacking, including its use in
lambdas. Replace this code with something that works in python2.7
and python3.
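
A sketch of the kind of rewrite involved, with made-up data rather than
the code touched by the commit:

```python
# Hypothetical (name, priority) pairs.
pairs = [("eth1", 20), ("eth0", 10)]

# Python 2 only; a syntax error in Python 3:
#   sorted(pairs, key=lambda (name, prio): prio)

# Portable form: take one argument and index into it.
by_prio = sorted(pairs, key=lambda pair: pair[1])

# Unpacking in a for clause is still allowed in both versions.
names = [name for name, _prio in by_prio]
```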

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoxenserver: Fix list/map access for python3.
Joe Stringer [Tue, 24 May 2016 01:20:28 +0000 (18:20 -0700)]
xenserver: Fix list/map access for python3.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoxenserver: Fix string compatibility in python3.
Joe Stringer [Tue, 24 May 2016 01:20:27 +0000 (18:20 -0700)]
xenserver: Fix string compatibility in python3.

PEP 3120 made UTF-8 the default source encoding for python3 strings;
ensure that the output for strings is consistent between python2.7 and
python3.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoxenserver: Sort vsctl port options.
Joe Stringer [Tue, 24 May 2016 01:20:26 +0000 (18:20 -0700)]
xenserver: Sort vsctl port options.

In python3, dictionaries are less likely to be iterated in the same
order from one run to the next, so sort port options on output to
provide reliable test results.
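
The idea can be sketched as below. The format_options() helper and the
exact output format are assumptions for illustration, not the code in
the xenserver scripts.

```python
def format_options(options):
    """Render a dict of port options in a stable, sorted order."""
    # sorted() orders the (key, value) pairs by key, so the output does
    # not depend on the dictionary's iteration order.
    return " ".join("options:%s=%s" % (key, value)
                    for key, value in sorted(options.items()))


rendered = format_options({"remote_ip": "10.0.0.1", "key": "flow"})
```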

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
7 years agoINSTALL.Debian.md: Describe a pitfall and some solutions.
Ben Pfaff [Thu, 2 Jun 2016 23:13:13 +0000 (16:13 -0700)]
INSTALL.Debian.md: Describe a pitfall and some solutions.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
7 years agotestsuite: Add PMD specific tests.
Ilya Maximets [Tue, 7 Jun 2016 12:36:21 +0000 (15:36 +0300)]
testsuite: Add PMD specific tests.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agoofproto-dpif.at: Run tests with dummy-pmd.
Ilya Maximets [Tue, 7 Jun 2016 12:36:20 +0000 (15:36 +0300)]
ofproto-dpif.at: Run tests with dummy-pmd.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agodpif-netdev.at: Run tests with dummy-pmd.
Ilya Maximets [Tue, 7 Jun 2016 12:36:19 +0000 (15:36 +0300)]
dpif-netdev.at: Run tests with dummy-pmd.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
7 years agotests: Allow extra cmd line args to OVS_VSWITCHD_START.
Daniele Di Proietto [Tue, 7 Jun 2016 00:05:49 +0000 (17:05 -0700)]
tests: Allow extra cmd line args to OVS_VSWITCHD_START.

This will be used by a following commit, to add dummy-numa options.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agovswitchd: Add --dummy-numa command line option.
Daniele Di Proietto [Tue, 7 Jun 2016 00:05:49 +0000 (17:05 -0700)]
vswitchd: Add --dummy-numa command line option.

This option is used to initialize the ovs_numa module with a fake
configuration and to avoid pthread_setaffinity_np() calls.  It will be
useful to test dpif-netdev with pmd threads.

Since it is only used for testing, it is not documented in the man pages.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agonetdev-dummy: Introduce sched_yield() in rxq_recv() for pmd devices.
Daniele Di Proietto [Tue, 7 Jun 2016 00:05:49 +0000 (17:05 -0700)]
netdev-dummy: Introduce sched_yield() in rxq_recv() for pmd devices.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agoovs-numa: Introduce function to set current thread affinity.
Daniele Di Proietto [Tue, 7 Jun 2016 00:05:49 +0000 (17:05 -0700)]
ovs-numa: Introduce function to set current thread affinity.

This commit moves the code that sets the pmd threads affinity from
netdev-dpdk to ovs-numa.  There's one small part left in netdev-dpdk, to
set the lcore_id.

Now dpif-netdev will call both modules (ovs-numa and netdev-dpdk) when
starting a pmd thread.

This change will allow having a dummy implementation of the set affinity
call, for testing purposes.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
7 years agoovs-numa: Remove non-linux stubs.
Daniele Di Proietto [Tue, 7 Jun 2016 00:05:49 +0000 (17:05 -0700)]
ovs-numa: Remove non-linux stubs.

Instead of having static inline stubs for non-Linux platforms, we can use
the implementations in ovs-numa.c.  With one small change to
ovs_numa_dump_cores_on_numa(), they will behave exactly like the
stubs for the non-linux case, because 'found_numa_and_core' will be
false and the socket and cpu hmaps will be empty.

There are a few places where conditional compilation is required: the
code that parses the Linux-specific sysfs entries and its dependencies.
It requires opendir() and readdir() and doesn't make sense outside of
Linux anyway.

This change is required to have a cross-platform ovs-numa dummy
implementation for testing.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>