X-Git-Url: http://git.cascardo.eti.br/?a=blobdiff_plain;f=ovn%2FTODO;h=2c696a0749efdbdbc7d8176758710b689c912aa0;hb=220b0d192e8d8bd6fd9b932874fd6ed927278198;hp=ec05a83ae728754f8bccbcc7dc66de480ec4a471;hpb=f2d371f7f6e765d8e97a3bd31fffa8197bd45fe2;p=cascardo%2Fovs.git

diff --git a/ovn/TODO b/ovn/TODO
index ec05a83ae..2c696a074 100644
--- a/ovn/TODO
+++ b/ovn/TODO
@@ -1,84 +1,121 @@
-* ovn-controller
+-*- outline -*-
+
+* L3 support
+
+** New OVN logical actions
+
+*** arp
+
+Generates an ARP packet based on the current IPv4 packet and allows it
+to be processed as part of the current pipeline (and then pop back to
+processing the original IPv4 packet).
+
+TCP/IP stacks typically limit the rate at which ARPs are sent, e.g. to
+one per second for a given target. We might need to do this too.
+
+We probably need to buffer the packet that generated the ARP. I don't
+know where to do that.
+
+*** icmp4 { action... }
+
+Generates an ICMPv4 packet based on the current IPv4 packet and
+processes it according to each nested action (and then pops back to
+processing the original IPv4 packet). The intended use case is for
+generating "time exceeded" and "destination unreachable" errors.
+
+ovn-sb.xml includes a tentative specification for this action.
+
+Tentatively, the icmp4 action sets a default icmp_type and icmp_code
+and lets the nested actions override it. This means that we'd have to
+make icmp_type and icmp_code writable. Because changing icmp_type and
+icmp_code can change the interpretation of the rest of the data in the
+ICMP packet, we would want to think this through carefully. If it
+seems like a bad idea then we could instead make the type and code a
+parameter to the action: icmp4(type, code) { action... }
+
+It is worth considering what should be considered the ingress port for
+the ICMPv4 packet. It's quite likely that the ICMPv4 packet is going
+to go back out the ingress port.
+Maybe the icmp4 action, therefore,
+should clear the inport, so that output to the original inport won't
+be discarded.
+
+*** tcp_reset
+
+Transforms the current TCP packet into a RST reply.
+
+ovn-sb.xml includes a tentative specification for this action.
+
+*** Other actions for IPv6.
+
+IPv6 will probably need an action or actions for ND that is similar to
+the "arp" action, and an action for generating
+
+** IPv6
+
+*** ND versus ARP
+
+*** IPv6 routing
+
+*** ICMPv6
+
+** Dynamic IP to MAC bindings
+
+Some bindings from IP address to MAC will undoubtedly need to be
+discovered dynamically through ARP requests. It's straightforward
+enough for a logical L3 router to generate ARP requests and forward
+them to the appropriate switch.
+
+It's more difficult to figure out where the reply should be processed
+and stored. It might seem at first that a first-cut implementation
+could just keep track of the binding on the hypervisor that needs to
+know, but that can't happen easily because the VM that sends the reply
+might not be on the same HV as the VM that needs the answer (that is,
+the VM that sent the packet that needs the binding to be resolved) and
+there isn't an easy way for it to know which HV needs the answer.
-** Flow table handling in ovn-controller.
-
- ovn-controller has to transform logical datapath flows from the
- database into OpenFlow flows.
-
-*** Definition (or choice) of data structure for flows and flow table.
-
- It would be natural enough to use "struct flow" and "struct
- classifier" for this. Maybe that is what we should do. However,
- "struct classifier" is optimized for searches based on packet
- headers, whereas all we care about here can be implemented with a
- hash table. Also, we may want to make it easy to add and remove
- support for fields without recompiling, which is not possible with
- "struct flow" or "struct classifier".
-
- On the other hand, we may find that it is difficult to decide that
- two OXM flow matches are identical (to normalize them) without a
- lot of domain-specific knowledge that is already embedded in struct
- flow. It's also going to be a pain to come up with a way to make
- anything other than "struct flow" work with the ofputil_*()
- functions for encoding and decoding OpenFlow.
-
- It's also possible we could use struct flow without struct
- classifier.
-
-*** Translating logical datapath actions into OpenFlow actions.
-
- Some of the logical datapath actions do not have natural
- representations as OpenFlow actions: they require
- packet-in/packet-out round trips through ovn-controller. The
- trickiest part of that is going to be making sure that the
- packet-out resumes the control flow that was broken off by the
- packet-in. That's tricky; we'll probably have to restrict control
- flow or add OVS features to make resuming in general possible. Not
- sure which is better at this point.
-
-*** OpenFlow flow table synchronization.
-
- The internal representation of the OpenFlow flow table has to be
- synced across the controller connection to OVS. This probably
- boils down to the "flow monitoring" feature of OF1.4 which was then
- made available as a "standard extension" to OF1.3. (OVS hasn't
- implemented this for OF1.4 yet, but the feature is based on a OVS
- extension to OF1.0, so it should be straightforward to add it.)
-
- We probably need some way to catch cases where OVS and OVN don't
- see eye-to-eye on what exactly constitutes a flow, so that OVN
- doesn't waste a lot of CPU time hammering at OVS trying to install
- something that it's not going to do.
-
-*** Logical/physical translation.
-
- When a packet comes into the integration bridge, the first stage of
- processing needs to translate it from a physical to a logical
- context.
- When a packet leaves the integration bridge, the final
- stage of processing needs to translate it back into a physical
- context. ovn-controller needs to populate the OpenFlow flows
- tables to do these translations.
-
-*** Determine how to split logical pipeline across physical nodes.
-
- From the original OVN architecture document:
-
- The pipeline processing is split between the ingress and egress
- transport nodes. In particular, the logical egress processing may
- occur at either hypervisor. Processing the logical egress on the
- ingress hypervisor requires more state about the egress vif's
- policies, but reduces traffic on the wire that would eventually be
- dropped. Whereas, processing on the egress hypervisor can reduce
- broadcast traffic on the wire by doing local replication. We
- initially plan to process logical egress on the egress hypervisor
- so that less state needs to be replicated. However, we may change
- this behavior once we gain some experience writing the logical
- flows.
-
- The split pipeline processing split will influence how tunnel keys
- are encoded.
-
-*** Monitor Pipeline table in OVN, trigger flow table recomputation on change.
+Thus, the HV that processes the ARP reply (which is unknown when the
+ARP is sent) has to tell all the HVs the binding. The most obvious
+place for this is in the OVN_Southbound database.
+
+Details need to be worked out, including:
+
+*** OVN_Southbound schema changes.
+
+Possibly bindings could be added to the Port_Binding table by adding
+or modifying columns. Another possibility is that another table
+should be added.
+
+*** Logical_Flow representation
+
+It would be really nice to maintain the general-purpose nature of
+logical flows, but these bindings might have to include some
+hard-coded special cases, especially when it comes to the relationship
+with populating the bindings into the OVN_Southbound table.
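
To illustrate why a shared store addresses the "which HV needs the answer" problem described above, the following sketch models the database as a single in-memory table that every hypervisor reads and writes. It is only an illustration under stated assumptions: the names `BindingStore`, `publish`, and `lookup`, and the choice of (logical port, IP) as the key, are hypothetical and do not correspond to any existing OVN_Southbound schema. Python is used only for brevity.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: the logical router port and the IP address being resolved.
Key = Tuple[str, str]


class BindingStore:
    """Stand-in for a shared database table (hypothetical, not an OVN API)."""

    def __init__(self) -> None:
        self._rows: Dict[Key, str] = {}

    def publish(self, logical_port: str, ip: str, mac: str) -> None:
        # Called by whichever hypervisor happens to process the ARP reply.
        self._rows[(logical_port, ip)] = mac

    def lookup(self, logical_port: str, ip: str) -> Optional[str]:
        # Called by any hypervisor that needs the binding; None if unknown.
        return self._rows.get((logical_port, ip))


store = BindingStore()

# The hypervisor hosting the replying VM records the binding it learned.
store.publish("lrp0", "192.0.2.10", "00:00:5e:00:53:01")

# A different hypervisor, which sent the original packet, can now resolve it
# without the replying hypervisor knowing who asked.
resolved = store.lookup("lrp0", "192.0.2.10")
unresolved = store.lookup("lrp0", "192.0.2.99")
```

The point of the sketch is that the writer never needs to know which reader is waiting; whether such rows belong in Port_Binding or in a separate table is the open question the outline raises.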
+
+*** Tracking queries
+
+It's probably best to only record in the database responses to queries
+actually issued by an L3 logical router, so somehow they have to be
+tracked, probably by putting a tentative binding without a MAC address
+into the database.
+
+*** Renewal and expiration.
+
+Something needs to make sure that bindings remain valid and expire
+those that become stale.
+
+** MTU handling (fragmentation on output)
+
+** Ratelimiting.
+
+*** ARP.
+
+*** ICMP error generation, TCP reset, UDP unreachable, protocol unreachable, ...
+
+As a point of comparison, Linux doesn't ratelimit TCP resets but I
+think it does everything else.
+
+* ovn-controller
 ** ovn-controller parameters and configuration.
@@ -86,6 +123,48 @@
 Can probably get this from Open_vSwitch database.
+** Security
+
+*** Limiting the impact of a compromised chassis.
+
+ Every instance of ovn-controller has the same full access to the central
+ OVN_Southbound database. This means that a compromised chassis can
+ interfere with the normal operation of the rest of the deployment. Some
+ specific examples include writing to the logical flow table to alter
+ traffic handling or updating the port binding table to claim ports that are
+ actually present on a different chassis. In practice, the compromised host
+ would be fighting against ovn-northd and other instances of ovn-controller
+ that would be trying to restore the correct state. The impact could include
+ at least temporarily redirecting traffic (so the compromised host could
+ receive traffic that it shouldn't) and potentially a more general denial of
+ service.
+
+ There are different potential improvements to this area. The first would be
+ to add some sort of ACL scheme to ovsdb-server. A proposal for this should
+ first include an ACL scheme for ovn-controller. An example policy would
+ be to make Logical_Flow read-only. Table-level control is needed, but is
+ not enough.
+ For example, ovn-controller must be able to update the Chassis
+ and Encap tables, but should only be able to modify the rows associated with
+ that chassis and no others.
+
+ A more complex example is the Port_Binding table. Currently, ovn-controller
+ is the source of truth of where a port is located. There seems to be no
+ policy that can prevent malicious behavior of a compromised host with this
+ table.
+
+ An alternative scheme for port bindings would be to provide an optional mode
+ where an external entity controls port bindings and make them read-only to
+ ovn-controller. This is actually how OpenStack works today, for example.
+ The part of OpenStack that manages VMs (Nova) tells the networking component
+ (Neutron) where a port will be located, as opposed to the networking
+ component discovering it.
+
+** Gratuitous ARP generation
+
+ ovn-controller should generate a GARP when a port is bound to a chassis.
+ This is needed when ports are migrated from one chassis to another, such
+ as live migrating a VM.
+
 * ovsdb-server
 ovsdb-server should have adequate features for OVN but it probably
@@ -94,13 +173,6 @@
 Andy Zhou is looking at these issues.
-** Scaling number of connections.
-
- In typical use today a given ovsdb-server has only a single-digit
- number of simultaneous connections. The OVN Southbound database will
- have a connection from every hypervisor. This use case needs testing
- and probably coding work. Here are some possible improvements.
-
 *** Reducing amount of data sent to clients.
 Currently, whenever a row monitored by a client changes,
@@ -116,7 +188,7 @@
 Currently, clients monitor the entire contents of a table. It
 might make sense to allow clients to monitor only rows that
 satisfy specific criteria, e.g. to allow an ovn-controller to
- receive only Pipeline rows for logical networks on its hypervisor.
+ receive only Logical_Flow rows for logical networks on its hypervisor.
 *** Reducing redundant data and code within ovsdb-server.
@@ -162,21 +234,44 @@
 Reconciliation Without Prior Context". (I'm not yet aware of
 previous non-academic use of this technique.)
-* Miscellaneous:
+** Support multiple tunnel encapsulations in Chassis.
+
+ So far, both ovn-controller and ovn-controller-vtep only allow
+ chassis to have one tunnel encapsulation entry. We should extend
+ the implementation to support multiple tunnel encapsulations.
+
+** Update learned MAC addresses from VTEP to OVN
+
+ The VTEP gateway stores all MAC addresses learned from its
+ physical interfaces in the 'Ucast_Macs_Local' and the
+ 'Mcast_Macs_Local' tables. ovn-controller-vtep should be
+ able to update that information back to ovn-sb database,
+ so that other chassis know where to send packets destined
+ to the extended external network instead of broadcasting.
+
+** Translate ovn-sb Multicast_Group table into VTEP config
+
+ The ovn-controller-vtep daemon should be able to translate
+ the Multicast_Group table entry in ovn-sb database into
+ Mcast_Macs_Remote table configuration in VTEP database.
-** Init scripts for ovn-controller (on HVs), ovn-northd, OVN DB server.
+* Consider the use of BFD as tunnel monitor.
-** Distribution packaging.
+ The use of BFD for hypervisor-to-hypervisor tunnels is probably not worth it,
+ since there's no alternative to switch to if a tunnel goes down. It could
+ make sense at a slow rate if someone does OVN monitoring system integration,
+ but not otherwise.
-* Not yet scoped:
+ When OVN gets to supporting HA for gateways (see ovn/OVN-GW-HA.md), BFD is
+ likely needed as a part of that solution.
-** Neutron plugin.
+ There's more commentary in this ML post:
+ http://openvswitch.org/pipermail/dev/2015-November/062385.html
- This is being developed on OpenStack's development infrastructure
- to be along side most of the other Neutron plugins.
+* ACL
- http://git.openstack.org/cgit/stackforge/networking-ovn
+** Support FTP ALGs.
- http://git.openstack.org/cgit/stackforge/networking-ovn/tree/doc/source/todo.rst
+** Support reject action.
-** Gateways.
+** Support log option.
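
The ARP rate limit mentioned under the "arp" action and the "Ratelimiting" heading (for example, at most one ARP per second for a given target) could be sketched as below. This is a minimal illustration, not a proposed implementation: the class name `ArpRateLimiter`, its `allow` method, and passing the current time in explicitly are all assumptions made for the example, and nothing here reflects existing ovn-controller code.

```python
from typing import Dict


class ArpRateLimiter:
    """Allow at most one ARP per target IP per interval (hypothetical helper)."""

    def __init__(self, min_interval: float = 1.0) -> None:
        self.min_interval = min_interval
        # Time at which the last ARP was sent, keyed by target IP.
        self._last_sent: Dict[str, float] = {}

    def allow(self, target_ip: str, now: float) -> bool:
        last = self._last_sent.get(target_ip)
        if last is not None and now - last < self.min_interval:
            # Too soon after the previous ARP for this target: suppress it.
            return False
        self._last_sent[target_ip] = now
        return True


limiter = ArpRateLimiter(min_interval=1.0)
decisions = [
    limiter.allow("192.0.2.1", now=0.0),  # first ARP for this target
    limiter.allow("192.0.2.1", now=0.4),  # same target, within the interval
    limiter.allow("192.0.2.2", now=0.4),  # different target, limited separately
    limiter.allow("192.0.2.1", now=1.0),  # interval has elapsed
]
```

Keying the limit on the target means a burst of packets toward one unresolved address does not delay resolution of other addresses. What happens to the packets that triggered a suppressed ARP is the buffering question the outline leaves open.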