** New OVN logical actions

*** icmp4 { action... }

Generates an ICMPv4 packet based on the current IPv4 packet and
processes it according to each nested action (and then pops back to
processing the original IPv4 packet). The intended use case is for
generating "time exceeded" and "destination unreachable" errors.

ovn-sb.xml includes a tentative specification for this action.

Tentatively, the icmp4 action sets a default icmp_type and icmp_code
and lets the nested actions override them. This means that we'd have to
make icmp_type and icmp_code writable. Because changing icmp_type and
icmp_code can change the interpretation of the rest of the data in the
ICMP packet, we would want to think this through carefully. If it
seems like a bad idea then we could instead make the type and code
parameters to the action: icmp4(type, code) { action... }

It is worth deciding what should be treated as the ingress port for
the ICMPv4 packet. It's quite likely that the ICMPv4 packet is going
to go back out the ingress port. Maybe the icmp4 action, therefore,
should clear the inport, so that output to the original inport won't

Transforms the current TCP packet into a RST reply.

ovn-sb.xml includes a tentative specification for this action.

*** Other actions for IPv6.

IPv6 will probably need an action or actions for ND that are similar to
the "arp" action, and an action for generating

*** ct_label 128-bit support.

We only support 64 bits for the ct_label argument to ct_commit(), but ct_label
is a 128-bit field. The OVN lexer only supports parsing 64-bit integers, but
we can use parse_int_string() to support larger integers.
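
To illustrate what accepting a value wider than 64 bits involves, here
is a minimal Python sketch. It is not the OVN lexer and does not
reproduce parse_int_string(); the helper name parse_ct_label and the
choice to return the value as two 64-bit halves are assumptions made
only for this example.

```python
MASK64 = (1 << 64) - 1


def parse_ct_label(text):
    """Parse a decimal or 0x-prefixed integer of up to 128 bits.

    Hypothetical helper for illustration only.  Returns (high, low),
    the upper and lower 64-bit halves of the parsed value.
    """
    value = int(text, 0)  # base 0 accepts "0x..." as well as decimal
    if value < 0 or value >> 128:
        raise ValueError("ct_label must fit in 128 bits: %r" % (text,))
    return (value >> 64) & MASK64, value & MASK64
```

A value such as 0x10000000000000000 (2^64) does not fit in a 64-bit
integer, but here it parses to a high half of 1 and a low half of 0.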

** Dynamic IP to MAC bindings

OVN has basic support for establishing IP to MAC bindings dynamically,

From casual observation, Linux appears to generate at most one ARP per
second per destination.
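
A per-destination limit of the kind observed for Linux could be
sketched as follows. This is only an illustration in Python, not
ovn-controller code; the class name ArpRateLimiter, the one-second
default, and passing the current time in explicitly are all assumptions.

```python
class ArpRateLimiter:
    """Allow at most one request per `interval` seconds per destination.

    Hypothetical helper for illustration only.
    """

    def __init__(self, interval=1.0):
        self.interval = interval
        self.last_sent = {}  # destination IP -> time of last allowed request

    def allow(self, dst_ip, now):
        """Return True if a request for dst_ip may be sent at time `now`."""
        last = self.last_sent.get(dst_ip)
        if last is not None and now - last < self.interval:
            return False  # too soon after the previous request
        self.last_sent[dst_ip] = now
        return True
```

Keeping the state per destination means that a burst of packets to one
unresolved address does not delay resolution of a different address.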

This might be supported by adding a new OVN logical action for

It's probably best to only record in the database responses to queries
actually issued by an L3 logical router, so somehow they have to be
tracked, probably by putting a tentative binding without a MAC address

*** Renewal and expiration.

Something needs to make sure that bindings remain valid and expire
those that become stale.

One way to do this might be to add some support for time to the
database server itself.
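
Whichever component ends up owning the notion of time, the bookkeeping
could look roughly like the Python sketch below. It is an in-memory
illustration only; the class name MacBindingTable, the max_age
parameter, and the (mac, timestamp) layout are assumptions, not part of
any OVN schema.

```python
class MacBindingTable:
    """IP-to-MAC bindings that expire unless refreshed.

    Hypothetical structure for illustration only.
    """

    def __init__(self, max_age):
        self.max_age = max_age
        self.bindings = {}  # ip -> (mac, time last refreshed)

    def refresh(self, ip, mac, now):
        """Record or renew a binding at time `now`."""
        self.bindings[ip] = (mac, now)

    def expire(self, now):
        """Drop bindings older than max_age; return the removed IPs."""
        stale = [ip for ip, (_, seen) in self.bindings.items()
                 if now - seen > self.max_age]
        for ip in stale:
            del self.bindings[ip]
        return stale
```

Renewal is just a call to refresh() with a newer timestamp, so a
binding that keeps being confirmed is never removed by expire().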

*** Table size limiting.

The table of MAC bindings must not be allowed to grow unreasonably

** MTU handling (fragmentation on output)

** ovn-controller parameters and configuration.

*** SSL configuration.

Can probably get this from the Open_vSwitch database.

*** Limiting the impact of a compromised chassis.

Every instance of ovn-controller has the same full access to the central
OVN_Southbound database. This means that a compromised chassis can
interfere with the normal operation of the rest of the deployment. Some
specific examples include writing to the logical flow table to alter
traffic handling or updating the port binding table to claim ports that are
actually present on a different chassis. In practice, the compromised host
would be fighting against ovn-northd and other instances of ovn-controller
that would be trying to restore the correct state. The impact could include
at least temporarily redirecting traffic (so the compromised host could
receive traffic that it shouldn't) and potentially a more general denial of

There are several potential improvements in this area. The first would be
to add some sort of ACL scheme to ovsdb-server. A proposal for this should
first include an ACL scheme for ovn-controller. An example policy would
be to make Logical_Flow read-only. Table-level control is needed, but is
not enough. For example, ovn-controller must be able to update the Chassis
and Encap tables, but should only be able to modify the rows associated with
that chassis and no others.
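
The combination of table-level and row-level control described above
can be sketched in Python as follows. This is not an ovsdb-server
feature or API; the POLICY layout, the is_allowed() helper, and the
column names used to identify the owning chassis are all assumptions
made for illustration.

```python
# Hypothetical policy: for each table, the operations ovn-controller may
# perform and, for writes, the column that must name the requesting
# chassis.  The owner column names are assumptions, not schema facts.
POLICY = {
    "Logical_Flow": {"ops": {"read"}, "owner_column": None},
    "Chassis": {"ops": {"read", "modify"}, "owner_column": "name"},
    "Encap": {"ops": {"read", "modify"}, "owner_column": "chassis_name"},
}


def is_allowed(chassis, table, op, row):
    """Return True if `chassis` may perform `op` on `row` of `table`."""
    rule = POLICY.get(table)
    if rule is None or op not in rule["ops"]:
        return False  # table-level check failed
    if op == "read" or rule["owner_column"] is None:
        return True
    # Row-level check: writes are limited to the chassis's own rows.
    return row.get(rule["owner_column"]) == chassis
```

With this policy, a chassis named hv1 can modify its own Chassis row
but not the row for hv2, and cannot modify Logical_Flow at all.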

A more complex example is the Port_Binding table. Currently, ovn-controller
is the source of truth for where a port is located. There seems to be no
policy that can prevent malicious behavior of a compromised host with this

An alternative scheme for port bindings would be to provide an optional mode
where an external entity controls port bindings and makes them read-only to
ovn-controller. This is actually how OpenStack works today, for example.
The part of OpenStack that manages VMs (Nova) tells the networking component
(Neutron) where a port will be located, as opposed to the networking
component discovering it.

** Gratuitous ARP generation

ovn-controller should generate a GARP when a port is bound to a chassis.
This is needed when ports are migrated from one chassis to another, such
as when live migrating a VM.

ovsdb-server should have adequate features for OVN but it probably
needs work for scale and possibly for availability as deployments
grow. Here are some thoughts.

Andy Zhou is looking at these issues.

*** Reducing amount of data sent to clients.

Currently, whenever a row monitored by a client changes,
ovsdb-server sends the client every monitored column in the row,
even if only one column changes. It might be valuable to reduce
this to only the columns that change.
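
Sending only the changed columns amounts to a per-row diff restricted
to what the client monitors. A minimal Python sketch, with rows
modeled as plain dictionaries and a hypothetical helper name
changed_columns (neither reflects ovsdb-server internals):

```python
def changed_columns(old_row, new_row, monitored):
    """Return {column: new value} for monitored columns that differ.

    Hypothetical helper for illustration only; rows are plain dicts.
    """
    return {col: new_row.get(col) for col in monitored
            if old_row.get(col) != new_row.get(col)}
```

For a row in which only the "up" column flips from False to True, the
result contains just that column instead of the whole row.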

Also, whenever a column changes, ovsdb-server sends the entire
contents of the column. It might be valuable, for columns that
are sets or maps, to send only added or removed values or

Currently, clients monitor the entire contents of a table. It
might make sense to allow clients to monitor only rows that
satisfy specific criteria, e.g. to allow an ovn-controller to
receive only Logical_Flow rows for logical networks on its hypervisor.
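
Monitoring by criteria can be pictured as the server applying a
client-supplied predicate to each row before sending it. The Python
sketch below is only an illustration; the helper rows_for_client, the
predicate form, and the sample data are assumptions, not a proposed
protocol.

```python
def rows_for_client(rows, predicate):
    """Return only the rows that satisfy the client's criteria.

    Hypothetical helper for illustration only; rows are plain dicts.
    """
    return [row for row in rows if predicate(row)]


# Example: a hypervisor interested only in datapath "dp1" (sample data).
local_datapaths = {"dp1"}
flows = [
    {"logical_datapath": "dp1", "match": "ip4"},
    {"logical_datapath": "dp2", "match": "arp"},
]
wanted = rows_for_client(
    flows, lambda row: row["logical_datapath"] in local_datapaths)
```

Here wanted holds only the flow for dp1, so the client never receives
rows for logical networks that have no presence on its hypervisor.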

*** Reducing redundant data and code within ovsdb-server.

Currently, ovsdb-server separately composes database update
information to send to each of its clients. This is fine for a
small number of clients, but it wastes time and memory when
hundreds of clients all want the same updates (as will be in the

(This is somewhat opposed to the idea of letting a client monitor
only some rows in a table, since that would increase the diversity

If it turns out that other changes don't let ovsdb-server scale
adequately, we can multithread ovsdb-server. Initially one might
only break protocol handling into separate threads, leaving the
actual database work serialized through a lock.

** Increasing availability.

Database availability might become an issue. The OVN system
shouldn't grind to a halt if the database becomes unavailable, but
it would become impossible to bring VIFs up or down, etc.

My current thought on how to increase availability is to add
clustering to ovsdb-server, probably via the Raft consensus
algorithm. As an experiment, I wrote an implementation of Raft
for Open vSwitch that you can clone from:

https://github.com/blp/ovs-reviews.git raft

** Reducing startup time.

As-is, if ovsdb-server restarts, every client will fetch a fresh
copy of the part of the database that it cares about. With
hundreds of clients, this could cause heavy CPU load on
ovsdb-server and use excessive network bandwidth. It would be
better to allow incremental updates even across connection loss.
One way might be to use "Difference Digests" as described in
Epstein et al., "What's the Difference? Efficient Set
Reconciliation Without Prior Context". (I'm not yet aware of
previous non-academic use of this technique.)

** Support multiple tunnel encapsulations in Chassis.

So far, both ovn-controller and ovn-controller-vtep only allow
a chassis to have one tunnel encapsulation entry. We should extend
the implementation to support multiple tunnel encapsulations.

** Update learned MAC addresses from VTEP to OVN

The VTEP gateway stores all MAC addresses learned from its
physical interfaces in the 'Ucast_Macs_Local' and the
'Mcast_Macs_Local' tables. ovn-controller-vtep should be
able to write that information back to the ovn-sb database,
so that other chassis know where to send packets destined
to the extended external network instead of broadcasting.

** Translate ovn-sb Multicast_Group table into VTEP config

The ovn-controller-vtep daemon should be able to translate
Multicast_Group table entries in the ovn-sb database into
Mcast_Macs_Remote table configuration in the VTEP database.

* Consider the use of BFD as a tunnel monitor.

The use of BFD for hypervisor-to-hypervisor tunnels is probably not worth it,
since there's no alternative to switch to if a tunnel goes down. It could
make sense at a slow rate if someone does OVN monitoring system integration,

When OVN gets to supporting HA for gateways (see ovn/OVN-GW-HA.md), BFD is
likely needed as a part of that solution.

There's more commentary in this ML post:
http://openvswitch.org/pipermail/dev/2015-November/062385.html

** Support reject action.

** Support log option.

* Software L2 gateway

** Support "chassis" option in Logical_Switch_Port with type of "l2gateway".

Right now an "l2gateway" port is bound to a chassis by setting the "chassis"
column of the port binding in the southbound database directly. We should
support a "chassis" option in the "options" column of the
"Logical_Switch_Port" in the northbound database. This would bring
"l2gateway" into alignment with how chassis binding is done for L3 gateways
(a "chassis" option for Logical_Router).