X-Git-Url: http://git.cascardo.eti.br/?a=blobdiff_plain;f=DESIGN.md;h=6865d477fe665aa41bd091569fbb71c7327b211d;hb=7ae1f322d7794dc5c0528ef1a01154cbc58684d0;hp=bd0ed272d38494ae3e5e10f065e319d03f6c0873;hpb=ca26eb4437120e3b95b3727ccc6037dfa4e4065d;p=cascardo%2Fovs.git

diff --git a/DESIGN.md b/DESIGN.md
index bd0ed272d..6865d477f 100644
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -54,16 +54,34 @@ sent, an entry labeled "---" means that the message is suppressed.
       OFPR_NO_MATCH                 yes       ---
       OFPR_ACTION                   yes       ---
       OFPR_INVALID_TTL              ---       ---
+      OFPR_ACTION_SET (OF1.4+)      yes       ---
+      OFPR_GROUP (OF1.4+)           yes       ---

     OFPT_FLOW_REMOVED / NXT_FLOW_REMOVED
       OFPRR_IDLE_TIMEOUT            yes       ---
       OFPRR_HARD_TIMEOUT            yes       ---
       OFPRR_DELETE                  yes       ---
+      OFPRR_GROUP_DELETE (OF1.4+)   yes       ---
+      OFPRR_METER_DELETE (OF1.4+)   yes       ---
+      OFPRR_EVICTION (OF1.4+)       yes       ---

     OFPT_PORT_STATUS
       OFPPR_ADD                     yes       yes
       OFPPR_DELETE                  yes       yes
       OFPPR_MODIFY                  yes       yes
+
+    OFPT_ROLE_REQUEST / OFPT_ROLE_REPLY (OF1.4+)
+      OFPCRR_MASTER_REQUEST         ---       ---
+      OFPCRR_CONFIG                 ---       ---
+      OFPCRR_EXPERIMENTER           ---       ---
+
+    OFPT_TABLE_STATUS (OF1.4+)
+      OFPTR_VACANCY_DOWN            ---       ---
+      OFPTR_VACANCY_UP              ---       ---
+
+    OFPT_REQUESTFORWARD (OF1.4+)
+      OFPRFR_GROUP_MOD              ---       ---
+      OFPRFR_METER_MOD              ---       ---
     ```

 The NXT_SET_ASYNC_CONFIG message directly sets all of the values in
@@ -275,13 +293,68 @@ The table for 1.3 is the same as the one shown above for 1.2.


 OpenFlow 1.4
-------------
+-----------
+
+OpenFlow 1.4 makes these changes:
+
+  - Adds the "importance" field to flow_mods, but it does not
+    explicitly specify which kinds of flow_mods set the importance.
+    For consistency, Open vSwitch uses the same rule for importance
+    as for idle_timeout and hard_timeout, that is, only an "ADD"
+    flow_mod sets the importance. (This issue has been filed with
+    the ONF as EXT-496.)
+
+  - Eviction Mechanism to automatically delete entries of lower
+    importance to make space for newer entries.
+
+
+OpenFlow 1.4 Bundles
+====================
+
+Open vSwitch makes all flow table modifications atomically, i.e., any
+datapath packet only sees flow table configurations either before or
+after any change made by any flow_mod. For example, if a controller
+removes all flows with a single OpenFlow "flow_mod", no packet sees an
+intermediate version of the OpenFlow pipeline where only some of the
+flows have been deleted.
+
+It should be noted that Open vSwitch caches datapath flows, and that
+the cached flows are NOT flushed immediately when a flow table
+changes. Instead, the datapath flows are revalidated against the new
+flow table as soon as possible, and usually within one second of the
+modification. This design amortizes the cost of datapath cache
+flushing across multiple flow table changes, and has a significant
+performance effect during simultaneous heavy flow table churn and high
+traffic load. This means that different cached datapath flows may
+have been computed based on a different flow table configurations, but
+each of the datapath flows is guaranteed to have been computed over a
+coherent view of the flow tables, as described above.
+
+With OpenFlow 1.4 bundles this atomicity can be extended across an
+arbitrary set of flow_mods. Bundles are supported for flow_mod and
+port_mod messages only. For flow_mods, both 'atomic' and 'ordered'
+bundle flags are trivially supported, as all bundled messages are
+executed in the order they were added and all flow table modifications
+are now atomic to the datapath. Port mods may not appear in atomic
+bundles, as port status modifications are not atomic.
+
+To support bundles, ovs-ofctl has a '--bundle' option that makes the
+flow mod commands ('add-flow', 'add-flows', 'mod-flows', 'del-flows',
+and 'replace-flows') use an OpenFlow 1.4 bundle to operate the
+modifications as a single atomic transaction. If any of the flow mods
+in a transaction fail, none of them are executed. All flow mods in a
+bundle appear to datapath lookups simultaneously.
+
+Furthermore, ovs-ofctl 'add-flow' and 'add-flows' commands now accept
+arbitrary flow mods as an input by allowing the flow specification to
+start with an explicit 'add', 'modify', 'modify_strict', 'delete', or
+'delete_strict' keyword. A missing keyword is treated as 'add', so
+this is fully backwards compatible. With the new '--bundle' option
+all the flow mods are executed as a single atomic transaction using an
+OpenFlow 1.4 bundle. Without the '--bundle' option the flow mods are
+executed in order up to the first failing flow_mod, and in case of an
+error the earlier successful flow_mods are not rolled back.

-OpenFlow 1.4 adds the "importance" field to flow_mods, but it does not
-explicitly specify which kinds of flow_mods set the importance.For
-consistency, Open vSwitch uses the same rule for importance as for
-idle_timeout and hard_timeout, that is, only an "ADD" flow_mod sets
-the importance. (This issue has been filed with the ONF as EXT-496.)

 OFPT_PACKET_IN
 ==============
@@ -363,11 +436,13 @@ Each column is interpreted as follows.
     NXM_OF_VLAN_TCI(_W), a mask of ffff is equivalent to
     NXM_OF_VLAN_TCI.

-  - OF1.0 and OF1.1: wwww/x,yy/z means dl_vlan wwww, OFPFW_DL_VLAN
-    x, dl_vlan_pcp yy, and OFPFW_DL_VLAN_PCP z. ? means that the
-    given nibble is ignored (and conventionally 0 for wwww or yy,
-    conventionally 1 for x or z). means that the given match
-    is not supported.
+  - OF1.0 and OF1.1: wwww/x,yy/z means dl_vlan wwww, OFPFW_DL_VLAN x,
+    dl_vlan_pcp yy, and OFPFW_DL_VLAN_PCP z. If OFPFW_DL_VLAN or
+    OFPFW_DL_VLAN_PCP is 1, the corresponding field value is
+    wildcarded, otherwise it is matched. ? means that the given bits
+    are ignored (their conventional values are 0000/x,00/0 in OF1.0,
+    0000/x,00/1 in OF1.1; x is never ignored). means that the
+    given match is not supported.

   - OF1.2: xxxx/yyyy,zz means OXM_OF_VLAN_VID_W with value xxxx and
     mask yyyy, and OXM_OF_VLAN_PCP (which is not maskable) with
@@ -555,6 +630,73 @@ Tables 128 and above are reserved for use by the switch itself.
 Controllers should use only tables 0 through 127.


+OFPTC_* Table Configuration
+===========================
+
+This section covers the history of the OFPTC_* table configuration
+bits across OpenFlow versions.
+
+OpenFlow 1.0 flow tables had fixed configurations.
+
+OpenFlow 1.1 enabled controllers to configure behavior upon flow table
+miss and added the OFPTC_MISS_* constants for that purpose. OFPTC_*
+did not control anything else but it was nevertheless conceptualized
+as a set of bit-fields instead of an enum. OF1.1 added the
+OFPT_TABLE_MOD message to set OFPTC_MISS_* for a flow table and added
+the 'config' field to the OFPST_TABLE reply to report the current
+setting.
+
+OpenFlow 1.2 did not change anything in this regard.
+
+OpenFlow 1.3 switched to another means to changing flow table miss
+behavior and deprecated OFPTC_MISS_* without adding any more OFPTC_*
+constants. This meant that OFPT_TABLE_MOD now had no purpose at all,
+but OF1.3 kept it around "for backward compatibility with older and
+newer versions of the specification." At the same time, OF1.3
+introduced a new message OFPMP_TABLE_FEATURES that included a field
+'config' documented as reporting the OFPTC_* values set with
+OFPT_TABLE_MOD; of course this served no real purpose because no
+OFPTC_* values are defined. OF1.3 did remove the OFPTC_* field from
+OFPMP_TABLE (previously named OFPST_TABLE).
+
+OpenFlow 1.4 defined two new OFPTC_* constants, OFPTC_EVICTION and
+OFPTC_VACANCY_EVENTS, using bits that did not overlap with
+OFPTC_MISS_* even though those bits had not been defined since OF1.2.
+OFPT_TABLE_MOD still controlled these settings. The field for OFPTC_*
+values in OFPMP_TABLE_FEATURES was renamed from 'config' to
+'capabilities' and documented as reporting the flags that are
+supported in a OFPT_TABLE_MOD message. The OFPMP_TABLE_DESC message
+newly added in OF1.4 reported the OFPTC_* setting.
+
+OpenFlow 1.5 did not change anything in this regard.
+
+The following table summarizes. The columns say:
+
+  - OpenFlow version(s).
+
+  - The OFPTC_* flags defined in those versions.
+
+  - Whether OFPT_TABLE_MOD can modify OFPTC_* flags.
+
+  - Whether OFPST_TABLE/OFPMP_TABLE reports the OFPTC_* flags.
+
+  - What OFPMP_TABLE_FEATURES reports (if it exists): either the
+    current configuration or the switch's capabilities.
+
+  - Whether OFPMP_TABLE_DESC reports the current configuration.
+
+OpenFlow  OFPTC_* flags           TABLE_MOD stats? TABLE_FEATURES TABLE_DESC
+--------- ----------------------- --------- ------ -------------- ----------
+OF1.0     none                    no[*][+]  no[*]  nothing[*][+]  no[*][+]
+OF1.1/1.2 MISS_*                  yes       yes    nothing[+]     no[+]
+OF1.3     none                    yes[*]    no[*]  config[*]      no[*][+]
+OF1.4/1.5 EVICTION/VACANCY_EVENTS yes       no     capabilities   yes
+
+    [*] Nothing to report/change anyway.
+
+    [+] No such message.
+
+
 IPv6
 ====

@@ -842,7 +984,7 @@ not know the MAC address of the local port that is sending the traffic or
 the MAC address of the remote in the guest VM.

 With a few notable exceptions below, in-band should work in most
-network setups. The following are considered "supported' in the
+network setups. The following are considered "supported" in the
 current implementation:

   - Locally Connected. The switch and remote are on the same