ovn/TODO

* Flow match expression handling library.

  ovn-controller is the primary user of flow match expressions, but
  the same syntax and I imagine the same code ought to be useful in
  ovn-nbd for ACL match expressions.

** Definition of data structures to represent a match expression as a
   syntax tree.
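
   As a starting point for discussion, here is a minimal sketch of
   what such a tree might look like.  It is written in Python for
   brevity, not in the C that OVN would actually use.  All of the
   class names, and the field names in the usage note, are
   hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical node types; none of these names exist in OVN.

@dataclass(frozen=True)
class Cmp:
    """Leaf: compare a numeric field against a constant."""
    field: str
    op: str
    value: int

@dataclass(frozen=True)
class Not:
    arg: "Expr"

@dataclass(frozen=True)
class And:
    args: Tuple["Expr", ...]

@dataclass(frozen=True)
class Or:
    args: Tuple["Expr", ...]

Expr = Union[Cmp, Not, And, Or]

RELOPS = {"==": operator.eq, "!=": operator.ne,
          "<": operator.lt, "<=": operator.le,
          ">": operator.gt, ">=": operator.ge}

def evaluate(expr, packet):
    """Evaluate expr against a dict that maps field names to integers."""
    if isinstance(expr, Cmp):
        return RELOPS[expr.op](packet[expr.field], expr.value)
    if isinstance(expr, Not):
        return not evaluate(expr.arg, packet)
    if isinstance(expr, And):
        return all(evaluate(a, packet) for a in expr.args)
    if isinstance(expr, Or):
        return any(evaluate(a, packet) for a in expr.args)
    raise TypeError("not an expression node: %r" % (expr,))
```

   For example, And((Cmp("ip.proto", "==", 6), Cmp("tcp.dst", "==", 80)))
   would stand for a match on TCP port 80.  The evaluate() helper is
   only there to pin down the intended semantics of each node type.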

** Definition of data structures to represent variables (fields).

   Fields need names and prerequisites.  Most fields are numeric and
   thus need widths.  We also need a way to represent nominal
   fields (currently just logical port names).  It might be
   appropriate to associate fields directly with OXM/NXM code points;
   we have to decide whether we want OVN to use the OVS flow structure
   or work with OXM more directly.

   Probably should be defined so that the data structure is also
   useful for references to fields in action parsing.
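
   One possible shape for a field definition, sketched in Python
   under the assumption that a prerequisite can be written as a
   match expression on another field.  The Field class, the FIELDS
   table, and every field name and prerequisite string below are
   made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Field:
    name: str
    width: Optional[int] = None     # In bits; None marks a nominal field.
    prereq: Optional[str] = None    # Expression that must also hold.

    @property
    def nominal(self):
        return self.width is None

    def fits(self, value):
        """Return True if value is representable in this numeric field."""
        return not self.nominal and 0 <= value < (1 << self.width)

# Hypothetical symbol table; names and prerequisites are examples only.
FIELDS = {f.name: f for f in (
    Field("eth.type", 16),
    Field("ip.proto", 8, prereq="eth.type == 0x800"),
    Field("tcp.dst", 16, prereq="ip.proto == 6"),
    Field("inport"),                # Nominal: a logical port name.
)}

def prerequisites(name):
    """Collect the chain of prerequisites implied by a reference to name."""
    chain = []
    prereq = FIELDS[name].prereq
    while prereq is not None:
        chain.append(prereq)
        # The first word of a prerequisite is the field that it tests.
        prereq = FIELDS[prereq.split()[0]].prereq
    return chain
```

   With these definitions, prerequisites("tcp.dst") walks up the
   chain and returns the ip.proto and eth.type conditions, which is
   roughly what the "applying prerequisites" step would have to add
   to a match that mentions tcp.dst.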

** Lexical analysis.

   Probably should be defined so that the lexer can be reused for
   parsing actions.
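
   A rough sketch of a lexer that could serve both matches and
   actions.  The token set here is a guess, since the syntax has not
   been settled; in particular the "=" and ";" tokens are included
   only on the assumption that actions will look like assignments.

```python
import re

# Hypothetical token set; the real syntax is not yet defined.
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<int>0[xX][0-9a-fA-F]+|\d+)
  | (?P<str>"[^"]*")
  | (?P<id>[A-Za-z_][A-Za-z0-9_.]*)
  | (?P<op>==|!=|<=|>=|&&|\|\||[<>!(){}=;,])
""", re.VERBOSE)

def lex(text):
    """Return a list of (type, value) tokens; raise ValueError on bad input."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r at offset %d"
                             % (text[pos], pos))
        pos = m.end()
        kind = m.lastgroup
        if kind == "ws":
            continue                # Whitespace only separates tokens.
        value = m.group(kind)
        if kind == "int":
            hexadecimal = value[:2].lower() == "0x"
            value = int(value, 16) if hexadecimal else int(value)
        elif kind == "str":
            value = value[1:-1]     # Strip the surrounding quotes.
        tokens.append((kind, value))
    return tokens
```

   Because the lexer knows nothing about fields or operators beyond
   their spelling, the same token stream can feed either a match
   parser or an action parser.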

** Parsing into syntax tree.

** Semantic checking against variable definitions.

** Applying prerequisites.

** Simplification into conjunction-of-disjunctions (CoD) form.
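
   A small sketch of the simplification, assuming that negation has
   already been pushed down into the leaves.  Expressions are modeled
   here as nested tuples and atoms as opaque strings, purely for
   illustration; to_cod() is a hypothetical name.

```python
from itertools import product

def to_cod(expr):
    """Convert expr into a list of clauses, read as an AND of ORs.

    expr is either an atom (a string such as "tcp.dst == 80") or a tuple
    ("and", e1, e2, ...) or ("or", e1, e2, ...).  Each returned clause is
    a frozenset of atoms, any one of which satisfies the clause.
    """
    if isinstance(expr, str):
        return [frozenset([expr])]
    op, *args = expr
    parts = [to_cod(a) for a in args]
    if op == "and":
        # AND of CoD forms: just concatenate the clause lists.
        clauses = [c for part in parts for c in part]
    elif op == "or":
        # OR distributes over AND: take one clause from each operand and
        # union them, for every possible combination.
        clauses = [frozenset().union(*combo) for combo in product(*parts)]
    else:
        raise ValueError("unknown operator %r" % (op,))
    # Drop duplicate clauses while preserving order.
    return list(dict.fromkeys(clauses))
```

   Note that distributing OR over AND can grow the expression
   exponentially in the worst case, so a real implementation would
   probably need some limit or smarter simplification.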

** Transformation from CoD form into OXM matches.

* ovn-controller

** Flow table handling in ovn-controller.

   ovn-controller has to transform logical datapath flows from the
   database into OpenFlow flows.

*** Definition (or choice) of data structure for flows and flow table.

    It would be natural enough to use "struct flow" and "struct
    classifier" for this.  Maybe that is what we should do.  However,
    "struct classifier" is optimized for searches based on packet
    headers, whereas all we care about here can be implemented with a
    hash table.  Also, we may want to make it easy to add and remove
    support for fields without recompiling, which is not possible with
    "struct flow" or "struct classifier".

    On the other hand, we may find that it is difficult to decide that
    two OXM flow matches are identical (to normalize them) without a
    lot of domain-specific knowledge that is already embedded in struct
    flow.  It's also going to be a pain to come up with a way to make
    anything other than "struct flow" work with the ofputil_*()
    functions for encoding and decoding OpenFlow.

    It's also possible we could use struct flow without struct
    classifier.

*** Assembling conjunctive flows from flow match expressions.

    This transformation explodes logical datapath flows into multiple
    OpenFlow flow table entries, since a flow match expression in CoD
    form requires several OpenFlow flow table entries.  It also
    requires merging together OpenFlow flow table entries that contain
    "conjunction" actions (really just concatenating their actions).
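
    To make the explosion and the merging concrete, here is a sketch
    that models the flow table as a Python dict from match to list of
    actions.  Matches and actions are plain strings, priorities are
    ignored, and add_logical_flow() is a hypothetical helper.  Only
    the "conjunction(id, k/n)" action and the "conj_id" match field
    are taken from OVS.

```python
def add_logical_flow(table, clauses, conj_id, actions):
    """Expand one logical flow in CoD form into entries in table.

    clauses is a list of lists of atoms: the flow matches when at least
    one atom from every clause matches.  table maps a match string to a
    list of action strings and is updated in place.
    """
    n = len(clauses)
    if n < 2:
        raise ValueError("a conjunction needs at least two clauses")
    for k, clause in enumerate(clauses, start=1):
        for atom in clause:
            # If another logical flow already produced an entry with the
            # same match, merge by concatenating conjunction actions.
            table.setdefault(atom, []).append(
                "conjunction(%d,%d/%d)" % (conj_id, k, n))
    # One more entry fires when every clause of this conjunction matched.
    table["conj_id=%d" % conj_id] = list(actions)
```

    Two logical flows that both mention the same atom end up sharing
    one OpenFlow entry whose actions list carries a conjunction
    action for each of them.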

*** Translating logical datapath port names into port numbers.

    Logical ports are specified by name in logical datapath flows, but
    OpenFlow only works in terms of numbers.

*** Translating logical datapath actions into OpenFlow actions.

    Some of the logical datapath actions do not have natural
    representations as OpenFlow actions: they require
    packet-in/packet-out round trips through ovn-controller.  The
    trickiest part of that is going to be making sure that the
    packet-out resumes the control flow that was broken off by the
    packet-in.  We'll probably have to restrict control flow or add
    OVS features to make resuming in general possible.  Not sure
    which is better at this point.

*** OpenFlow flow table synchronization.

    The internal representation of the OpenFlow flow table has to be
    synced across the controller connection to OVS.  This probably
    boils down to the "flow monitoring" feature of OF1.4 which was then
    made available as a "standard extension" to OF1.3.  (OVS hasn't
    implemented this for OF1.4 yet, but the feature is based on an OVS
    extension to OF1.0, so it should be straightforward to add it.)

    We probably need some way to catch cases where OVS and OVN don't
    see eye-to-eye on what exactly constitutes a flow, so that OVN
    doesn't waste a lot of CPU time hammering at OVS trying to install
    a flow that OVS is never going to accept.
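
    Whatever mechanism reports the installed flows, the core of the
    synchronization is a diff between the desired and installed
    tables.  The sketch below assumes that matches have already been
    normalized into hashable values and that actions are tuples; the
    "rejected" set is one hypothetical way to avoid retrying flows
    that OVS has already refused.

```python
def diff_flow_tables(desired, installed, rejected=frozenset()):
    """Return (adds, modifies, deletes) that turn installed into desired.

    desired and installed map a normalized match to a tuple of actions.
    rejected holds (match, actions) pairs that the switch refused earlier;
    they are left out of adds so that we do not keep retrying them.
    """
    adds = {m: a for m, a in desired.items()
            if m not in installed and (m, a) not in rejected}
    modifies = {m: a for m, a in desired.items()
                if m in installed and installed[m] != a}
    deletes = [m for m in installed if m not in desired]
    return adds, modifies, deletes
```

    Each element of the result would correspond to one flow_mod
    message (add, modify, or delete) sent to OVS.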

*** Logical/physical translation.

    When a packet comes into the integration bridge, the first stage of
    processing needs to translate it from a physical to a logical
    context.  When a packet leaves the integration bridge, the final
    stage of processing needs to translate it back into a physical
    context.  ovn-controller needs to populate the OpenFlow flow
    tables to do these translations.

*** Determine how to split logical pipeline across physical nodes.

    From the original OVN architecture document:

    The pipeline processing is split between the ingress and egress
    transport nodes.  In particular, the logical egress processing may
    occur at either hypervisor.  Processing the logical egress on the
    ingress hypervisor requires more state about the egress vif's
    policies, but reduces traffic on the wire that would eventually be
    dropped.  Whereas, processing on the egress hypervisor can reduce
    broadcast traffic on the wire by doing local replication.  We
    initially plan to process logical egress on the egress hypervisor
    so that less state needs to be replicated.  However, we may change
    this behavior once we gain some experience writing the logical
    flows.

    The pipeline processing split will influence how tunnel keys
    are encoded.

** Interaction with Open_vSwitch and OVN databases:

*** Monitor VIFs attached to the integration bridge in Open_vSwitch.

    In response to changes, add or remove corresponding rows in
    the Bindings table in OVN.

*** Populate Chassis row in OVN at startup.  Maintain Chassis row over time.

    (Warn if any other Chassis claims the same IP address.)

*** Remove Chassis and Bindings rows from OVN on exit.

*** Monitor Chassis table in OVN.

    Populate Port records for tunnels to other chassis into the
    Open_vSwitch database.  As a scale optimization later on, one can
    populate only records for tunnels to other chassis that have
    logical networks in common with this one.
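
    The scale optimization amounts to a set intersection.  A sketch,
    assuming we can derive from the OVN database a map from each
    chassis to the set of logical networks with ports bound there
    (the map, its layout, and the function name are hypothetical):

```python
def tunnels_needed(local, chassis_networks):
    """Return the sorted names of remote chassis that local must tunnel to.

    chassis_networks maps a chassis name to the set of logical networks
    that have at least one port bound on that chassis.
    """
    mine = chassis_networks.get(local, set())
    return sorted(name for name, networks in chassis_networks.items()
                  if name != local and mine & networks)
```

    A chassis with no bound ports shares no logical network with
    anyone, so no tunnels are created to or from it.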

*** Monitor Pipeline table in OVN, trigger flow table recomputation on change.

** ovn-controller parameters and configuration.

*** Tunnel encapsulation to publish.

    Default: VXLAN? Geneve?

*** Location of Open_vSwitch database.

    We can probably use the same default as ovs-vsctl.

*** Location of OVN database.

    Probably no useful default.

*** SSL configuration.

    Can probably get this from Open_vSwitch database.

* ovn-nbd

** Monitor OVN_Northbound database, trigger Pipeline recomputation on change.

** Translate each OVN_Northbound entity into Pipeline logical datapath flows.

   We have to first sit down and figure out what the general
   translation of each entity is.  The original OVN architecture
   description at
   http://openvswitch.org/pipermail/dev/2015-January/050380.html had
   some sketches of these, but they need to be completed and
   elaborated.

   Initially, the simplest way to do this is probably to write
   straight C code to do a full translation of the entire
   OVN_Northbound database into the format for the Pipeline table in
   the OVN database.  As scale increases, this will probably be too
   inefficient since a small change in OVN_Northbound requires a full
   recomputation.  At that point, we probably want to adopt a more
   systematic approach, such as something akin to the "nlog" system
   used in NVP (see Koponen et al. "Network Virtualization in
   Multi-tenant Datacenters", NSDI 2014).

** Push logical datapath flows to Pipeline table.

** Monitor OVN database Bindings table.

   Sync rows in the OVN Bindings table to the "up" column in the
   OVN_Northbound database.

* ovsdb-server

  ovsdb-server should have adequate features for OVN but it probably
  needs work for scale and possibly for availability as deployments
  grow.  Here are some thoughts.

  Andy Zhou is looking at these issues.

** Scaling number of connections.

   In typical use today a given ovsdb-server has only a single-digit
   number of simultaneous connections.  The OVN database will have a
   connection from every hypervisor.  This use case needs testing and
   probably coding work.  Here are some possible improvements.

*** Reducing amount of data sent to clients.

    Currently, whenever a row monitored by a client changes,
    ovsdb-server sends the client every monitored column in the row,
    even if only one column changes.  It might be valuable to reduce
    this to only the columns that change.

    Also, whenever a column changes, ovsdb-server sends the entire
    contents of the column.  It might be valuable, for columns that
    are sets or maps, to send only added or removed values or
    key-value pairs.

    Currently, clients monitor the entire contents of a table.  It
    might make sense to allow clients to monitor only rows that
    satisfy specific criteria, e.g. to allow an ovn-controller to
    receive only Pipeline rows for logical networks on its hypervisor.
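
    The first two ideas can be illustrated with a sketch that models
    rows as Python dicts.  The ("value", ...) and ("delta", ...)
    encodings are invented here for illustration and are not part of
    the OVSDB protocol; maps could be handled the same way as sets.

```python
def row_delta(old, new):
    """Return only the columns of new that differ from old.

    Set-valued columns are reported as ("delta", added, removed) instead
    of in full; other columns are reported as ("value", new_value).
    """
    delta = {}
    for column, value in new.items():
        before = old.get(column)
        if before == value:
            continue                # Unchanged columns are not sent.
        both_sets = (isinstance(value, (set, frozenset))
                     and isinstance(before, (set, frozenset)))
        if both_sets:
            delta[column] = ("delta", value - before, before - value)
        else:
            delta[column] = ("value", value)
    return delta
```

    For a row whose only change is one element added to a large set,
    the update shrinks from the whole row to a single element.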

*** Reducing redundant data and code within ovsdb-server.

    Currently, ovsdb-server separately composes database update
    information to send to each of its clients.  This is fine for a
    small number of clients, but it wastes time and memory when
    hundreds of clients all want the same updates (as will be the
    case in OVN).

    (This is somewhat opposed to the idea of letting a client monitor
    only some rows in a table, since that would increase the diversity
    among clients.)
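
    One way to picture the saving: compose each update once per
    distinct monitor request rather than once per client.  In this
    sketch a monitor request is reduced to a tuple of column names,
    and compose_updates() is a hypothetical name; the real server
    would key on the full monitor specification.

```python
import json

def compose_updates(clients, changed_row):
    """Build the update text for every client, sharing identical ones.

    clients maps a client name to the tuple of columns it monitors.
    Returns (updates, composed), where updates maps each client name to
    its JSON text and composed counts how many texts were actually built.
    """
    cache = {}
    updates = {}
    for client, columns in clients.items():
        if columns not in cache:
            update = {c: changed_row[c] for c in columns if c in changed_row}
            cache[columns] = json.dumps(update, sort_keys=True)
        # Clients with the same monitor request share one string object.
        updates[client] = cache[columns]
    return updates, len(cache)
```

    With hundreds of hypervisors issuing the same monitor request,
    the update is serialized once and the same buffer is queued on
    every connection.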

*** Multithreading.

    If it turns out that other changes don't let ovsdb-server scale
    adequately, we can multithread ovsdb-server.  Initially one might
    only break protocol handling into separate threads, leaving the
    actual database work serialized through a lock.

** Increasing availability.

   Database availability might become an issue.  The OVN system
   shouldn't grind to a halt if the database becomes unavailable, but
   it would become impossible to bring VIFs up or down, etc.

   My current thought on how to increase availability is to add
   clustering to ovsdb-server, probably via the Raft consensus
   algorithm.  As an experiment, I wrote an implementation of Raft
   for Open vSwitch that you can clone from:

       https://github.com/blp/ovs-reviews.git raft

** Reducing startup time.

   As-is, if ovsdb-server restarts, every client will fetch a fresh
   copy of the part of the database that it cares about.  With
   hundreds of clients, this could cause heavy CPU load on
   ovsdb-server and use excessive network bandwidth.  It would be
   better to allow incremental updates even across connection loss.
   One way might be to use "Difference Digests" as described in
   Eppstein et al., "What's the Difference? Efficient Set
   Reconciliation Without Prior Context".  (I'm not yet aware of
   previous non-academic use of this technique.)
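
   For readers who have not seen the technique, below is a
   simplified sketch of the invertible Bloom filter at the heart of
   the paper cited above.  It assumes that each row has been reduced
   to an integer key (say, a hash of its UUID and contents), and it
   omits the paper's strata estimator for choosing the digest size.
   The class and its methods are hypothetical names.

```python
import hashlib

K = 3   # Number of cells that each key is hashed into.

def _hash(key, salt):
    data = ("%d:%d" % (salt, key)).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

class DifferenceDigest:
    def __init__(self, ncells):
        # ncells should be a multiple of K: each hash function gets its
        # own partition, so a key always lands in K distinct cells.
        self.n = ncells
        self.count = [0] * ncells
        self.id_sum = [0] * ncells
        self.hash_sum = [0] * ncells

    def _update(self, key, delta):
        part = self.n // K
        for i in range(K):
            c = i * part + _hash(key, i) % part
            self.count[c] += delta
            self.id_sum[c] ^= key
            self.hash_sum[c] ^= _hash(key, K)

    def insert(self, key):
        self._update(key, 1)

    def subtract(self, other):
        """Return a new digest that encodes the difference self - other."""
        d = DifferenceDigest(self.n)
        for c in range(self.n):
            d.count[c] = self.count[c] - other.count[c]
            d.id_sum[c] = self.id_sum[c] ^ other.id_sum[c]
            d.hash_sum[c] = self.hash_sum[c] ^ other.hash_sum[c]
        return d

    def decode(self):
        """Peel a difference digest, consuming it.

        Returns (only_in_self, only_in_other), or None if the digest was
        too small for the number of differences.
        """
        mine, theirs = set(), set()
        progress = True
        while progress:
            progress = False
            for c in range(self.n):
                pure = (self.count[c] in (1, -1)
                        and _hash(self.id_sum[c], K) == self.hash_sum[c])
                if pure:
                    key, sign = self.id_sum[c], self.count[c]
                    (mine if sign == 1 else theirs).add(key)
                    self._update(key, -sign)    # Remove key from its cells.
                    progress = True
        if any(self.count) or any(self.id_sum) or any(self.hash_sum):
            return None
        return mine, theirs
```

   The appeal for ovsdb-server is that the digest size depends on the
   number of differences and not on the size of the database, so a
   reconnecting client that is only slightly out of date would
   exchange a small digest instead of a full copy.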

* Miscellaneous:

** Write ovn-nbctl utility.

   The idea here is that we need a utility to act on the OVN_Northbound
   database in a way similar to a CMS, so that we can do some testing
   without an actual CMS in the picture.

   No details yet.

** Init scripts for ovn-controller (on HVs), ovn-nbd, OVN DB server.

** Distribution packaging.

* Not yet scoped:

** Neutron plugin.

   This is being developed on OpenStack's development infrastructure
   to be alongside most of the other Neutron plugins.

   http://git.openstack.org/cgit/stackforge/networking-ovn

   http://git.openstack.org/cgit/stackforge/networking-ovn/tree/doc/source/todo.rst

** Gateways.