1..  SPDX-License-Identifier: BSD-3-Clause
2    Copyright(c) 2018 6WIND S.A.
3
4.. _switch_representation:
5
6Switch Representation within DPDK Applications
7==============================================
8
9.. contents:: :local:
10
11Introduction
12------------
13
14Network adapters with multiple physical ports and/or SR-IOV capabilities
15usually support the offload of traffic steering rules between their virtual
16functions (VFs), sub functions (SFs), physical functions (PFs) and ports.
17
As with standard Ethernet switches, this involves a combination of
19automatic MAC learning and manual configuration. For most purposes it is
20managed by the host system and fully transparent to users and applications.
21
22On the other hand, applications typically found on hypervisors that process
23layer 2 (L2) traffic (such as OVS) need to steer traffic themselves
according to their own criteria.
25
26Without a standard software interface to manage traffic steering rules
27between VFs, SFs, PFs and the various physical ports of a given device,
28applications cannot take advantage of these offloads; software processing is
29mandatory even for traffic which ends up re-injected into the device it
30originates from.
31
This document describes how such steering rules can be configured through
the DPDK flow API (**rte_flow**), with emphasis on the SR-IOV use case
(PF/VF steering) using a single physical port for clarity. The same logic
applies to any number of ports without necessarily involving SR-IOV.
36
37Sub Function
38------------
In addition to SR-IOV virtual functions, a device may expose sub functions
(SFs). A sub function is a portion of the PCI device, and a SF netdev has its
own dedicated queues (txq, rxq). A SF netdev supports E-Switch representation
offload similar to existing PF and VF representors. A SF shares PCI-level
resources with other SFs and/or with its parent PCI function.

Sub functions are created on demand and coexist with VFs. The number of SFs
is limited by hardware resources.
47
48Port Representors
49-----------------
50
51In many cases, traffic steering rules cannot be determined in advance;
52applications usually have to process a bit of traffic in software before
53thinking about offloading specific flows to hardware.
54
55Applications therefore need the ability to receive and inject traffic to
56various device endpoints (other VFs, SFs, PFs or physical ports) before
connecting them together. Device drivers must provide means to hook the
"other end" of these endpoints and to refer to them when configuring flow
rules.
60
61This role is left to so-called "port representors" (also known as "VF
62representors" in the specific context of VFs, "SF representors" in the
63specific context of SFs), which are to DPDK what the Ethernet switch
64device driver model (**switchdev**) [1]_ is to Linux, and which can be
thought of as a software "patch panel" front-end for applications.
66
67- DPDK port representors are implemented as additional virtual Ethernet
  device (**ethdev**) instances, spawned on an as-needed basis through
69  configuration parameters passed to the driver of the underlying
70  device using devargs.
71
72::
73
74   -a pci:dbdf,representor=vf0
75   -a pci:dbdf,representor=vf[0-3]
76   -a pci:dbdf,representor=vf[0,5-11]
77   -a pci:dbdf,representor=sf1
78   -a pci:dbdf,representor=sf[0-1023]
79   -a pci:dbdf,representor=sf[0,2-1023]
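
The bracketed lists above combine single indices with inclusive ranges. As a
rough illustration only, the following sketch expands the text found between
the brackets into individual indices. ``expand_repr_list`` is a hypothetical
helper written for this example; it is not the parser used by DPDK and makes
no claim about how drivers actually process devargs.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>
   #include <stdlib.h>

   /*
    * Hypothetical helper, not part of DPDK: expand a list such as "0,5-11"
    * (the text between the brackets of "vf[0,5-11]") into single indices.
    * Returns the number of indices written to out, or -1 if the input is
    * malformed or does not fit in max entries.
    */
   static int
   expand_repr_list(const char *list, uint16_t *out, int max)
   {
       int n = 0;

       assert(list != NULL && out != NULL);
       while (*list != '\0') {
           char *end;
           unsigned long first = strtoul(list, &end, 10);
           unsigned long last = first;

           if (end == list)
               return -1; /* no digits where an index was expected */
           if (*end == '-') {
               const char *start = end + 1;

               last = strtoul(start, &end, 10);
               if (end == start || last < first)
                   return -1; /* missing or reversed upper bound */
           }
           if (last > UINT16_MAX)
               return -1;
           for (unsigned long i = first; i <= last; i++) {
               if (n >= max)
                   return -1; /* caller-provided array is too small */
               out[n++] = (uint16_t)i;
           }
           if (*end == ',')
               end++;
           else if (*end != '\0')
               return -1; /* unexpected separator */
           list = end;
       }
       return n;
   }

Under these assumptions, ``"0,5-11"`` expands to eight indices (0, then 5
through 11), while a reversed range such as ``"3-1"`` is rejected.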
80
81- As virtual devices, they may be more limited than their physical
82  counterparts, for instance by exposing only a subset of device
83  configuration callbacks and/or by not necessarily having Rx/Tx capability.
84
85- Among other things, they can be used to assign MAC addresses to the
86  resource they represent.
87
- Applications can tell port representors apart from other physical or virtual
  ports by checking the ``dev_flags`` field within their device information
  structure for the ``RTE_ETH_DEV_REPRESENTOR`` flag.
91
92.. code-block:: c
93
94  struct rte_eth_dev_info {
95      ...
96      uint32_t dev_flags; /**< Device flags */
97      ...
98  };
99
- The device or group relationship of ports can be discovered using the
  switch ``domain_id`` field within the device's switch information structure.
  By default, the switch ``domain_id`` of a port is
  ``RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID``, indicating that the port does not
  support the concept of a switch domain. Ports which do support it are
  allocated a unique switch ``domain_id``, and ports within the same switch
  domain share the same ``domain_id``. The switch ``port_id`` identifies the
  port in terms of the switch, so in the case of SR-IOV devices the switch
  ``port_id`` would represent the virtual function identifier of the port.
110
111.. code-block:: c
112
113   /**
114    * Ethernet device associated switch information
115    */
116   struct rte_eth_switch_info {
117       const char *name; /**< switch name */
118       uint16_t domain_id; /**< switch domain id */
119       uint16_t port_id; /**< switch port id */
120   };
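
The two checks described above (representor flag and switch domain) can be
sketched as follows. This is only an illustration under stated assumptions:
``struct ex_port`` and the ``EX_*`` macros are stand-ins defined for this
example, and their numeric values are placeholders rather than the definitions
found in the DPDK headers. Real applications would rely on
``RTE_ETH_DEV_REPRESENTOR`` and ``RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID``.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stdint.h>

   /* Placeholder values for illustration; not taken from DPDK headers. */
   #define EX_DEV_REPRESENTOR (UINT32_C(1) << 4)
   #define EX_SWITCH_DOMAIN_ID_INVALID UINT16_MAX

   /* Stand-in gathering the fields discussed above (not a DPDK structure). */
   struct ex_port {
       uint32_t dev_flags; /* device flags */
       uint16_t domain_id; /* switch domain id */
       uint16_t port_id;   /* switch port id, e.g. VF identifier */
   };

   /* True if the representor flag is set in the device flags. */
   static bool
   ex_is_representor(const struct ex_port *p)
   {
       assert(p != NULL);
       return (p->dev_flags & EX_DEV_REPRESENTOR) != 0;
   }

   /*
    * True if both ports belong to the same switch domain. A port reporting
    * the invalid domain ID is never grouped with anything, even itself.
    */
   static bool
   ex_same_switch(const struct ex_port *a, const struct ex_port *b)
   {
       assert(a != NULL && b != NULL);
       return a->domain_id != EX_SWITCH_DOMAIN_ID_INVALID &&
              a->domain_id == b->domain_id;
   }

An application could use such helpers to group a PF with its representors
before creating flow rules between them.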
121
122
123.. [1] `Ethernet switch device driver model (switchdev)
124       <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_
125
- For some PMDs, the memory usage of representors becomes very large as the
  number of representors grows, since mbufs are allocated for each descriptor
  of every Rx queue. Polling a large number of ports also increases CPU load,
  cache misses and latency. A shared Rx queue can be used to share an Rx queue
  between the PF and representors within the same Rx domain.
  ``RTE_ETH_DEV_CAPA_RXQ_SHARE`` in device info indicates this capability.
  Sharing is enabled by setting a non-zero share group in the Rx queue
  configuration; ``share_qid`` identifies the shared Rx queue within the
  group. Polling any member port can receive packets of all member ports in
  the group; the originating port ID is saved in ``mbuf.port``.
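
Since a burst polled from a shared Rx queue may mix packets from several
member ports, the application has to look at the port recorded in each mbuf.
The sketch below illustrates that step with a stand-in ``struct ex_mbuf``
holding only a ``port`` field; it is an assumption made for this example and
not the DPDK ``rte_mbuf`` layout, and ``EX_MAX_PORTS`` is an arbitrary limit.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>

   #define EX_MAX_PORTS 8 /* arbitrary limit chosen for the example */

   /* Stand-in for the only mbuf field used here; not the DPDK rte_mbuf. */
   struct ex_mbuf {
       uint16_t port; /* port the packet was received on */
   };

   /*
    * Count packets per originating port in a burst polled from a shared
    * Rx queue. Packets whose port is out of range are not counted;
    * the number of such packets is returned.
    */
   static size_t
   ex_demux_burst(struct ex_mbuf *const *pkts, size_t n,
                  unsigned int counts[EX_MAX_PORTS])
   {
       size_t skipped = 0;

       assert(pkts != NULL && counts != NULL);
       for (size_t i = 0; i < n; i++) {
           if (pkts[i]->port < EX_MAX_PORTS)
               counts[pkts[i]->port]++;
           else
               skipped++;
       }
       return skipped;
   }

A real application would dispatch each packet to per-port processing instead
of merely counting, but the lookup on the ``port`` field stays the same.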
136
137Basic SR-IOV
138------------
139
140"Basic" in the sense that it is not managed by applications, which
141nonetheless expect traffic to flow between the various endpoints and the
142outside as if everything was linked by an Ethernet hub.
143
144The following diagram pictures a setup involving a device with one PF, two
VFs and one shared physical port:
146
147::
148
149       .-------------.                 .-------------. .-------------.
150       | hypervisor  |                 |    VM 1     | |    VM 2     |
151       | application |                 | application | | application |
152       `--+----------'                 `----------+--' `--+----------'
153          |                                       |       |
154    .-----+-----.                                 |       |
155    | port_id 3 |                                 |       |
156    `-----+-----'                                 |       |
157          |                                       |       |
158        .-+--.                                .---+--. .--+---.
159        | PF |                                | VF 1 | | VF 2 |
160        `-+--'                                `---+--' `--+---'
161          |                                       |       |
162          `---------.     .-----------------------'       |
163                    |     |     .-------------------------'
164                    |     |     |
165                 .--+-----+-----+--.
166                 | interconnection |
167                 `--------+--------'
168                          |
169                     .----+-----.
170                     | physical |
171                     |  port 0  |
172                     `----------'
173
174- A DPDK application running on the hypervisor owns the PF device, which is
175  arbitrarily assigned port index 3.
176
177- Both VFs are assigned to VMs and used by unknown applications; they may be
178  DPDK-based or anything else.
179
180- Interconnection is not necessarily done through a true Ethernet switch and
181  may not even exist as a separate entity. The role of this block is to show
182  that something brings PF, VFs and physical ports together and enables
183  communication between them, with a number of built-in restrictions.
184
185Subsequent sections in this document describe means for DPDK applications
186running on the hypervisor to freely assign specific flows between PF, VFs
187and physical ports based on traffic properties, by managing this
188interconnection.
189
190Controlled SR-IOV
191-----------------
192
193Initialization
194~~~~~~~~~~~~~~
195
196When a DPDK application gets assigned a PF device and is deliberately not
197started in `basic SR-IOV`_ mode, any traffic coming from physical ports is
198received by PF according to default rules, while VFs remain isolated.
199
200::
201
202       .-------------.                 .-------------. .-------------.
203       | hypervisor  |                 |    VM 1     | |    VM 2     |
204       | application |                 | application | | application |
205       `--+----------'                 `----------+--' `--+----------'
206          |                                       |       |
207    .-----+-----.                                 |       |
208    | port_id 3 |                                 |       |
209    `-----+-----'                                 |       |
210          |                                       |       |
211        .-+--.                                .---+--. .--+---.
212        | PF |                                | VF 1 | | VF 2 |
213        `-+--'                                `------' `------'
214          |
215          `-----.
216                |
217             .--+----------------------.
218             | managed interconnection |
219             `------------+------------'
220                          |
221                     .----+-----.
222                     | physical |
223                     |  port 0  |
224                     `----------'
225
In this mode, interconnection must be configured by the application to
enable VF communication, for instance by explicitly directing traffic with a
given destination MAC address to VF 1 and allowing traffic with the same
source MAC address to come out of it.
230
231For this to work, hypervisor applications need a way to refer to either VF 1
232or VF 2 in addition to the PF. This is addressed by `VF representors`_.
233
234VF Representors
235~~~~~~~~~~~~~~~
236
237VF representors are virtual but standard DPDK network devices (albeit with
238limited capabilities) created by PMDs when managing a PF device.
239
240Since they represent VF instances used by other applications, configuring
241them (e.g. assigning a MAC address or setting up promiscuous mode) affects
242interconnection accordingly. If supported, they may also be used as two-way
communication ports with VFs (assuming **switchdev** topology).
244
245
246::
247
248       .-------------.                 .-------------. .-------------.
249       | hypervisor  |                 |    VM 1     | |    VM 2     |
250       | application |                 | application | | application |
251       `--+---+---+--'                 `----------+--' `--+----------'
252          |   |   |                               |       |
253          |   |   `-------------------.           |       |
254          |   `---------.             |           |       |
255          |             |             |           |       |
256    .-----+-----. .-----+-----. .-----+-----.     |       |
257    | port_id 3 | | port_id 4 | | port_id 5 |     |       |
258    `-----+-----' `-----+-----' `-----+-----'     |       |
259          |             |             |           |       |
260        .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
261        | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
262        `-+--'    `-----+-----' `-----+-----' `---+--' `--+---'
263          |             |             |           |       |
264          |             |   .---------'           |       |
265          `-----.       |   |   .-----------------'       |
266                |       |   |   |   .---------------------'
267                |       |   |   |   |
268             .--+-------+---+---+---+--.
269             | managed interconnection |
270             `------------+------------'
271                          |
272                     .----+-----.
273                     | physical |
274                     |  port 0  |
275                     `----------'
276
277- VF representors are assigned arbitrary port indices 4 and 5 in the
278  hypervisor application and are respectively associated with VF 1 and VF 2.
279
280- They can't be dissociated; even if VF 1 and VF 2 were not connected,
281  representors could still be used for configuration.
282
- In this context, port index 3 can be thought of as a representor for physical
284  port 0.
285
286As previously described, the "interconnection" block represents a logical
287concept. Interconnection occurs when hardware configuration enables traffic
288flows from one place to another (e.g. physical port 0 to VF 1) according to
289some criteria.
290
291This is discussed in more detail in `traffic steering`_.
292
293Traffic Steering
294~~~~~~~~~~~~~~~~
295
296In the following diagram, each meaningful traffic origin or endpoint as seen
297by the hypervisor application is tagged with a unique letter from A to F.
298
299::
300
301       .-------------.                 .-------------. .-------------.
302       | hypervisor  |                 |    VM 1     | |    VM 2     |
303       | application |                 | application | | application |
304       `--+---+---+--'                 `----------+--' `--+----------'
305          |   |   |                               |       |
306          |   |   `-------------------.           |       |
307          |   `---------.             |           |       |
308          |             |             |           |       |
309    .----(A)----. .----(B)----. .----(C)----.     |       |
310    | port_id 3 | | port_id 4 | | port_id 5 |     |       |
311    `-----+-----' `-----+-----' `-----+-----'     |       |
312          |             |             |           |       |
313        .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
314        | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
315        `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
316          |             |             |           |       |
317          |             |   .---------'           |       |
318          `-----.       |   |   .-----------------'       |
319                |       |   |   |   .---------------------'
320                |       |   |   |   |
321             .--+-------+---+---+---+--.
322             | managed interconnection |
323             `------------+------------'
324                          |
325                     .---(F)----.
326                     | physical |
327                     |  port 0  |
328                     `----------'
329
330- **A**: PF device.
331- **B**: port representor for VF 1.
332- **C**: port representor for VF 2.
333- **D**: VF 1 proper.
334- **E**: VF 2 proper.
335- **F**: physical port.
336
Although uncommon, some devices do not enforce a one-to-one mapping between
338PF and physical ports. For instance, by default all ports of **mlx4**
339adapters are available to all their PF/VF instances, in which case
340additional ports appear next to **F** in the above diagram.
341
342Assuming no interconnection is provided by default in this mode, setting up
343a `basic SR-IOV`_ configuration involving physical port 0 could be broken
344down as:
345
346PF:
347
348- **A to F**: let everything through.
349- **F to A**: PF MAC as destination.
350
351VF 1:
352
353- **A to D**, **E to D** and **F to D**: VF 1 MAC as destination.
354- **D to A**: VF 1 MAC as source and PF MAC as destination.
355- **D to E**: VF 1 MAC as source and VF 2 MAC as destination.
356- **D to F**: VF 1 MAC as source.
357
358VF 2:
359
360- **A to E**, **D to E** and **F to E**: VF 2 MAC as destination.
361- **E to A**: VF 2 MAC as source and PF MAC as destination.
362- **E to D**: VF 2 MAC as source and VF 1 MAC as destination.
363- **E to F**: VF 2 MAC as source.
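
The rules listed above can be read as a single forwarding predicate. The
following sketch encodes them in plain C for illustration; the endpoint enum,
the MAC addresses and ``ex_allowed`` are all hypothetical names introduced
here, and the code models the rules only, not how a device implements them.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stdint.h>
   #include <string.h>

   /* Endpoints as lettered in the diagram above. */
   enum ex_ep { EX_A, EX_D, EX_E, EX_F };

   /* Hypothetical MAC addresses chosen for the example. */
   static const uint8_t ex_pf_mac[6]  = { 0x02, 0, 0, 0, 0, 0x0a };
   static const uint8_t ex_vf1_mac[6] = { 0x02, 0, 0, 0, 0, 0x0d };
   static const uint8_t ex_vf2_mac[6] = { 0x02, 0, 0, 0, 0, 0x0e };

   static bool
   ex_mac_eq(const uint8_t *a, const uint8_t *b)
   {
       return memcmp(a, b, 6) == 0;
   }

   /* MAC address owned by an endpoint, or NULL for the physical port. */
   static const uint8_t *
   ex_mac_of(enum ex_ep ep)
   {
       switch (ep) {
       case EX_A: return ex_pf_mac;
       case EX_D: return ex_vf1_mac;
       case EX_E: return ex_vf2_mac;
       default:   return NULL;
       }
   }

   /* True if a frame may travel from "from" to "to" under the rules above. */
   static bool
   ex_allowed(enum ex_ep from, enum ex_ep to,
              const uint8_t *src, const uint8_t *dst)
   {
       const uint8_t *to_mac = ex_mac_of(to);

       assert(src != NULL && dst != NULL);
       if (from == to)
           return false;
       /* A to F: let everything through. */
       if (from == EX_A && to == EX_F)
           return true;
       /* Traffic leaving a VF must carry that VF's MAC as source. */
       if ((from == EX_D || from == EX_E) &&
           !ex_mac_eq(src, ex_mac_of(from)))
           return false;
       /* Traffic entering the PF or a VF must carry its MAC as destination. */
       if (to_mac != NULL && !ex_mac_eq(dst, to_mac))
           return false;
       return true;
   }

For instance, a frame from **D** to **E** is accepted only when its source is
the VF 1 MAC and its destination is the VF 2 MAC, matching the **D to E** rule.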
364
365Devices may additionally support advanced matching criteria such as
366IPv4/IPv6 addresses or TCP/UDP ports.
367
368The combination of matching criteria with target endpoints fits well with
369**rte_flow** [6]_, which expresses flow rules as combinations of patterns
370and actions.
371
372Enhancing **rte_flow** with the ability to make flow rules match and target
373these endpoints provides a standard interface to manage their
interconnection without introducing new concepts and a whole new API to
375implement them. This is described in `flow API (rte_flow)`_.
376
377.. [6] :doc:`Generic flow API (rte_flow) <rte_flow>`
378
379Flow API (rte_flow)
380-------------------
381
382Extensions
383~~~~~~~~~~
384
385Compared to creating a brand new dedicated interface, **rte_flow** was
deemed flexible enough to manage representor traffic with only minor
387extensions:
388
389- Using physical ports, PF, SF, VF or port representors as targets.
390
391- Affecting traffic that is not necessarily addressed to the DPDK port ID a
392  flow rule is associated with (e.g. forcing VF traffic redirection to PF).
393
394For advanced uses:
395
396- Rule-based packet counters.
397
398- The ability to combine several identical actions for traffic duplication
399  (e.g. VF representor in addition to a physical port).
400
401- Dedicated actions for traffic encapsulation / decapsulation before
402  reaching an endpoint.
403
404Traffic Direction
405~~~~~~~~~~~~~~~~~
406
407From an application standpoint, "ingress" and "egress" flow rule attributes
408apply to the DPDK port ID they are associated with. They select a traffic
409direction for matching patterns, but have no impact on actions.
410
411When matching traffic coming from or going to a different place than the
412immediate port ID a flow rule is associated with, these attributes keep
413their meaning while applying to the chosen origin, as highlighted by the
following diagram:
415
416::
417
418       .-------------.                 .-------------. .-------------.
419       | hypervisor  |                 |    VM 1     | |    VM 2     |
420       | application |                 | application | | application |
421       `--+---+---+--'                 `----------+--' `--+----------'
422          |   |   |                               |       |
423          |   |   `-------------------.           |       |
424          |   `---------.             |           |       |
425          | ^           | ^           | ^         |       |
426          | | ingress   | | ingress   | | ingress |       |
427          | | egress    | | egress    | | egress  |       |
428          | v           | v           | v         |       |
429    .----(A)----. .----(B)----. .----(C)----.     |       |
430    | port_id 3 | | port_id 4 | | port_id 5 |     |       |
431    `-----+-----' `-----+-----' `-----+-----'     |       |
432          |             |             |           |       |
433        .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
434        | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
435        `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
436          |             |             |         ^ |       | ^
437          |             |             |  egress | |       | | egress
438          |             |             | ingress | |       | | ingress
439          |             |   .---------'         v |       | v
440          `-----.       |   |   .-----------------'       |
441                |       |   |   |   .---------------------'
442                |       |   |   |   |
443             .--+-------+---+---+---+--.
444             | managed interconnection |
445             `------------+------------'
446                        ^ |
447                ingress | |
448                 egress | |
449                        v |
450                     .---(F)----.
451                     | physical |
452                     |  port 0  |
453                     `----------'
454
455Ingress and egress are defined as relative to the application creating the
456flow rule.
457
For instance, matching traffic sent by VM 2 would be done through an ingress
flow rule on VF 2 (**E**). The same goes for incoming traffic on the physical
port (**F**). This also applies to **C** and **A** respectively.
461
462Transferring Traffic
463~~~~~~~~~~~~~~~~~~~~
464
465Without Port Representors
466^^^^^^^^^^^^^^^^^^^^^^^^^
467
468`Traffic direction`_ describes how an application could match traffic coming
469from or going to a specific place reachable from a DPDK port ID. This makes
470sense when the traffic in question is normally seen (i.e. sent or received)
471by the application creating the flow rule (e.g. as in "redirect all traffic
472coming from VF 1 to local queue 6").
473
However, this does not force such traffic to take a specific route. Creating
475a flow rule on **A** matching traffic coming from **D** is only meaningful
476if it can be received by **A** in the first place, otherwise doing so simply
477has no effect.
478
479A new flow rule attribute named "transfer" is necessary for that. Combining
480it with "ingress" or "egress" and a specific origin requests a flow rule to
be applied at the lowest level:
482
483::
484
485             ingress only           :       ingress + transfer
486                                    :
487    .-------------. .-------------. : .-------------. .-------------.
488    | hypervisor  | |    VM 1     | : | hypervisor  | |    VM 1     |
489    | application | | application | : | application | | application |
490    `------+------' `--+----------' : `------+------' `--+----------'
491           |           | | traffic  :        |           | | traffic
492     .----(A)----.     | v          :  .----(A)----.     | v
493     | port_id 3 |     |            :  | port_id 3 |     |
494     `-----+-----'     |            :  `-----+-----'     |
495           |           |            :        | ^         |
496           |           |            :        | | traffic |
497         .-+--.    .---+--.         :      .-+--.    .---+--.
498         | PF |    | VF 1 |         :      | PF |    | VF 1 |
499         `-+--'    `--(D)-'         :      `-+--'    `--(D)-'
500           |           | | traffic  :        | ^         | | traffic
501           |           | v          :        | | traffic | v
502        .--+-----------+--.         :     .--+-----------+--.
503        | interconnection |         :     | interconnection |
504        `--------+--------'         :     `--------+--------'
505                 | | traffic        :              |
506                 | v                :              |
507            .---(F)----.            :         .---(F)----.
508            | physical |            :         | physical |
509            |  port 0  |            :         |  port 0  |
510            `----------'            :         `----------'
511
With "ingress" only, traffic is matched on **A** and thus still goes to
physical port **F** by default:
514
515
516::
517
518   testpmd> flow create 3 ingress pattern vf id is 1 / end
519              actions queue index 6 / end
520
With "ingress + transfer", traffic is matched on **D** and is therefore
successfully assigned to queue 6 on **A**:
523
524
525::
526
527    testpmd> flow create 3 ingress transfer pattern vf id is 1 / end
528              actions queue index 6 / end
529
530
531With Port Representors
532^^^^^^^^^^^^^^^^^^^^^^
533
534When port representors exist, implicit flow rules with the "transfer"
attribute (described in `without port representors`_) are assumed to
536exist between them and their represented resources. These may be immutable.
537
538In this case, traffic is received by default through the representor and
539neither the "transfer" attribute nor traffic origin in flow rule patterns
540are necessary. They simply have to be created on the representor port
541directly and may target a different representor as described in `PORT_ID
542action`_.
543
544Implicit traffic flow with port representor
545
546::
547
548       .-------------.   .-------------.
549       | hypervisor  |   |    VM 1     |
550       | application |   | application |
551       `--+-------+--'   `----------+--'
552          |       | ^               | | traffic
553          |       | | traffic       | v
554          |       `-----.           |
555          |             |           |
556    .----(A)----. .----(B)----.     |
557    | port_id 3 | | port_id 4 |     |
558    `-----+-----' `-----+-----'     |
559          |             |           |
560        .-+--.    .-----+-----. .---+--.
561        | PF |    | VF 1 rep. | | VF 1 |
562        `-+--'    `-----+-----' `--(D)-'
563          |             |           |
564       .--|-------------|-----------|--.
565       |  |             |           |  |
566       |  |             `-----------'  |
567       |  |              <-- traffic   |
568       `--|----------------------------'
569          |
570     .---(F)----.
571     | physical |
572     |  port 0  |
573     `----------'
574
575Pattern Items And Actions
576~~~~~~~~~~~~~~~~~~~~~~~~~
577
578PORT Pattern Item
579^^^^^^^^^^^^^^^^^
580
581Matches traffic originating from (ingress) or going to (egress) a physical
582port of the underlying device.
583
584Using this pattern item without specifying a port index matches the physical
585port associated with the current DPDK port ID by default. As described in
`traffic steering`_, specifying it should rarely be needed.
587
588- Matches **F** in `traffic steering`_.
589
590PORT Action
591^^^^^^^^^^^
592
593Directs matching traffic to a given physical port index.
594
595- Targets **F** in `traffic steering`_.
596
597PORT_ID Pattern Item
598^^^^^^^^^^^^^^^^^^^^
599
600Matches traffic originating from (ingress) or going to (egress) a given DPDK
601port ID.
602
603Normally only supported if the port ID in question is known by the
604underlying PMD and related to the device the flow rule is created against.
605
606This must not be confused with the `PORT pattern item`_ which refers to the
607physical port of a device. ``PORT_ID`` refers to a ``struct rte_eth_dev``
608object on the application side (also known as "port representor" depending
609on the kind of underlying device).
610
611- Matches **A**, **B** or **C** in `traffic steering`_.
612
613PORT_ID Action
614^^^^^^^^^^^^^^
615
616Directs matching traffic to a given DPDK port ID.
617
618Same restrictions as `PORT_ID pattern item`_.
619
620- Targets **A**, **B** or **C** in `traffic steering`_.
621
622PF Pattern Item
623^^^^^^^^^^^^^^^
624
625Matches traffic originating from (ingress) or going to (egress) the physical
626function of the current device.
627
628If supported, should work even if the physical function is not managed by
629the application and thus not associated with a DPDK port ID. Its behavior is
630otherwise similar to `PORT_ID pattern item`_ using PF port ID.
631
632- Matches **A** in `traffic steering`_.
633
634PF Action
635^^^^^^^^^
636
637Directs matching traffic to the physical function of the current device.
638
639Same restrictions as `PF pattern item`_.
640
641- Targets **A** in `traffic steering`_.
642
643VF Pattern Item
644^^^^^^^^^^^^^^^
645
646Matches traffic originating from (ingress) or going to (egress) a given
647virtual function of the current device.
648
649If supported, should work even if the virtual function is not managed by
650the application and thus not associated with a DPDK port ID. Its behavior is
651otherwise similar to `PORT_ID pattern item`_ using VF port ID.
652
Note this pattern item does not match VF representor traffic which, as
654separate entities, should be addressed through their own port IDs.
655
656- Matches **D** or **E** in `traffic steering`_.
657
658VF Action
659^^^^^^^^^
660
661Directs matching traffic to a given virtual function of the current device.
662
663Same restrictions as `VF pattern item`_.
664
665- Targets **D** or **E** in `traffic steering`_.
666
667\*_ENCAP actions
668^^^^^^^^^^^^^^^^
669
670These actions are named according to the protocol they encapsulate traffic
with (e.g. ``VXLAN_ENCAP``) and use specific parameters (e.g. VNI for
672VXLAN).
673
674While they modify traffic and can be used multiple times (order matters),
675unlike `PORT_ID action`_ and friends, they have no impact on steering.
676
As described in `actions order and repetition`_, this means they are useless
if used alone in an action list; the resulting traffic gets dropped unless
they are combined with either ``PASSTHRU`` or other endpoint-targeting actions.
680
681\*_DECAP actions
682^^^^^^^^^^^^^^^^
683
684They perform the reverse of `\*_ENCAP actions`_ by popping protocol headers
685from traffic instead of pushing them. They can be used multiple times as
686well.
687
688Note that using these actions on non-matching traffic results in undefined
689behavior. It is recommended to match the protocol headers to decapsulate on
690the pattern side of a flow rule in order to use these actions or otherwise
691make sure only matching traffic goes through.
692
693Actions Order and Repetition
694~~~~~~~~~~~~~~~~~~~~~~~~~~~~
695
696Flow rules are currently restricted to at most a single action of each
697supported type, performed in an unpredictable order (or all at once). To
698repeat actions in a predictable fashion, applications have to make rules
699pass-through and use priority levels.
700
701It's now clear that PMD support for chaining multiple non-terminating flow
702rules of varying priority levels is prohibitively difficult to implement
703compared to simply allowing multiple identical actions performed in a
704defined order by a single flow rule.
705
706- This change is required to support protocol encapsulation offloads and the
707  ability to perform them multiple times (e.g. VLAN then VXLAN).
708
709- It makes the ``DUP`` action redundant since multiple ``QUEUE`` actions can
710  be combined for duplication.
711
712- The (non-)terminating property of actions must be discarded. Instead, flow
713  rules themselves must be considered terminating by default (i.e. dropping
714  traffic if there is no specific target) unless a ``PASSTHRU`` action is
715  also specified.
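
The semantics described above (ordered actions, repetition allowed, rules
terminating by default) can be modelled with a few lines of C. This is a
simplified sketch for illustration: the ``ex_action_type`` enum and
``ex_apply`` are hypothetical and do not correspond to **rte_flow**
definitions, and header handling is reduced to a counter.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>

   /* Simplified action kinds for illustration; not the rte_flow enum. */
   enum ex_action_type {
       EX_ACT_ENCAP,    /* push one header, no effect on steering */
       EX_ACT_DECAP,    /* pop one header, no effect on steering */
       EX_ACT_TARGET,   /* deliver to an endpoint (e.g. QUEUE, PORT_ID) */
       EX_ACT_PASSTHRU, /* leave traffic to subsequent processing */
   };

   struct ex_result {
       int headers;  /* net number of headers pushed by the rule */
       int copies;   /* number of endpoint deliveries */
       bool dropped; /* true if traffic ends up nowhere */
   };

   /* Walk the action list in order and summarize its outcome. */
   static struct ex_result
   ex_apply(const enum ex_action_type *acts, size_t n)
   {
       struct ex_result r = { 0, 0, false };
       bool passthru = false;

       assert(acts != NULL || n == 0);
       for (size_t i = 0; i < n; i++) {
           switch (acts[i]) {
           case EX_ACT_ENCAP:    r.headers++;     break;
           case EX_ACT_DECAP:    r.headers--;     break;
           case EX_ACT_TARGET:   r.copies++;      break;
           case EX_ACT_PASSTHRU: passthru = true; break;
           }
       }
       /* Terminating by default: no target and no PASSTHRU means drop. */
       r.dropped = (r.copies == 0 && !passthru);
       return r;
   }

In this model, an action list made of a single encapsulation is reported as
dropped, two identical target actions yield two copies, and adding a
pass-through action to an encapsulation keeps traffic alive.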
716
717Switching Examples
718------------------
719
720This section provides practical examples based on the established testpmd
flow command syntax [2]_, in the context described in `traffic steering`_:
722
723::
724
725      .-------------.                 .-------------. .-------------.
726      | hypervisor  |                 |    VM 1     | |    VM 2     |
727      | application |                 | application | | application |
728      `--+---+---+--'                 `----------+--' `--+----------'
729         |   |   |                               |       |
730         |   |   `-------------------.           |       |
731         |   `---------.             |           |       |
732         |             |             |           |       |
733   .----(A)----. .----(B)----. .----(C)----.     |       |
734   | port_id 3 | | port_id 4 | | port_id 5 |     |       |
735   `-----+-----' `-----+-----' `-----+-----'     |       |
736         |             |             |           |       |
737       .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
738       | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
739       `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
740         |             |             |           |       |
741         |             |   .---------'           |       |
742         `-----.       |   |   .-----------------'       |
743               |       |   |   |   .---------------------'
744               |       |   |   |   |
745            .--|-------|---|---|---|--.
746            |  |       |   `---|---'  |
747            |  |       `-------'      |
748            |  `---------.            |
749            `------------|------------'
750                         |
751                    .---(F)----.
752                    | physical |
753                    |  port 0  |
754                    `----------'
755
756By default, PF (**A**) can communicate with the physical port it is
757associated with (**F**), while VF 1 (**D**) and VF 2 (**E**) are isolated
758and restricted to communicate with the hypervisor application through their
759respective representors (**B** and **C**) if supported.
760
761Examples in subsequent sections apply to hypervisor applications only and
762are based on port representors **A**, **B** and **C**.
763
764.. [2] :ref:`Flow syntax <testpmd_rte_flow>`
765
766Associating VF 1 with Physical Port 0
767~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
768
769Assign all port traffic (**F**) to VF 1 (**D**) indiscriminately through
their representors:
771
772::
773
774   flow create 3 ingress pattern / end actions port_id id 4 / end
775   flow create 4 ingress pattern / end actions port_id id 3 / end
776
A more practical example with MAC address restrictions:
778
779::
780
781   flow create 3 ingress
782       pattern eth dst is {VF 1 MAC} / end
783       actions port_id id 4 / end
784
785::
786
787   flow create 4 ingress
788       pattern eth src is {VF 1 MAC} / end
789       actions port_id id 3 / end
790
791
792Sharing Broadcasts
793~~~~~~~~~~~~~~~~~~
794
795From outside to PF and VFs
796
797::
798
799   flow create 3 ingress
800      pattern eth dst is ff:ff:ff:ff:ff:ff / end
801      actions port_id id 3 / port_id id 4 / port_id id 5 / end
802
Note that ``port_id id 3`` is necessary; otherwise only VFs would receive
matching traffic.
805
806From PF to outside and VFs
807
808::
809
810   flow create 3 egress
811      pattern eth dst is ff:ff:ff:ff:ff:ff / end
812      actions port / port_id id 4 / port_id id 5 / end
813
814From VFs to outside and PF
815
816::
817
818   flow create 4 ingress
819      pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 1 MAC} / end
820      actions port_id id 3 / port_id id 5 / end
821
822   flow create 5 ingress
823      pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 2 MAC} / end
      actions port_id id 3 / port_id id 4 / end
825
826Similar ``33:33:*`` rules based on known MAC addresses should be added for
827IPv6 traffic.
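
For software paths that need to recognize the same traffic classes as the
rules above, the destination address tests can be sketched in C. The helper
names below are hypothetical; the ``33:33`` prefix check reflects the standard
mapping of IPv6 multicast onto Ethernet addresses.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stdint.h>

   /* True for the Ethernet broadcast address ff:ff:ff:ff:ff:ff. */
   static bool
   ex_is_broadcast(const uint8_t mac[6])
   {
       assert(mac != NULL);
       for (int i = 0; i < 6; i++) {
           if (mac[i] != 0xff)
               return false;
       }
       return true;
   }

   /* True for addresses in the 33:33:xx:xx:xx:xx IPv6 multicast range. */
   static bool
   ex_is_ipv6_mcast(const uint8_t mac[6])
   {
       assert(mac != NULL);
       return mac[0] == 0x33 && mac[1] == 0x33;
   }

Such checks only mirror the matching side of the flow rules; they do not
replace them.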
828
829Encapsulating VF 2 Traffic in VXLAN
830~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
831
832Assuming pass-through flow rules are supported
833
834::
835
836   flow create 5 ingress
837      pattern eth / end
838      actions vxlan_encap vni 42 / passthru / end
839
840::
841
842   flow create 5 egress
843      pattern vxlan vni is 42 / end
844      actions vxlan_decap / passthru / end
845
Here ``passthru`` is needed since, as described in `actions order and
repetition`_, flow rules are otherwise terminating; if supported, a rule
without a target endpoint will drop traffic.
849
Without pass-through support, ingress encapsulation on the destination
endpoint might not be supported and the action list must provide one:
853::
854
855   flow create 5 ingress
856      pattern eth src is {VF 2 MAC} / end
857      actions vxlan_encap vni 42 / port_id id 3 / end
858
859   flow create 3 ingress
860      pattern vxlan vni is 42 / end
861      actions vxlan_decap / port_id id 5 / end
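
To make the ``vni 42`` parameter more concrete, the sketch below builds and
parses the 8-byte VXLAN header that such an encapsulation places in front of
the inner frame, following the layout defined in RFC 7348. It is only an
illustration: the ``ex_vxlan_*`` helpers are hypothetical, are unrelated to
how a PMD programs the hardware, and leave out the outer Ethernet, IP and UDP
headers.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>
   #include <string.h>

   #define EX_VXLAN_HDR_LEN 8
   #define EX_VXLAN_FLAG_VNI 0x08 /* "I" flag: the VNI field is valid */

   /* Fill an 8-byte VXLAN header carrying the given 24-bit VNI. */
   static void
   ex_vxlan_hdr_set(uint8_t hdr[EX_VXLAN_HDR_LEN], uint32_t vni)
   {
       assert(hdr != NULL && vni <= 0xffffff);
       memset(hdr, 0, EX_VXLAN_HDR_LEN); /* reserved fields are zero */
       hdr[0] = EX_VXLAN_FLAG_VNI;
       hdr[4] = (uint8_t)(vni >> 16);
       hdr[5] = (uint8_t)(vni >> 8);
       hdr[6] = (uint8_t)vni;
   }

   /* Return the VNI stored in the header, or -1 if the "I" flag is unset. */
   static int32_t
   ex_vxlan_hdr_vni(const uint8_t hdr[EX_VXLAN_HDR_LEN])
   {
       assert(hdr != NULL);
       if ((hdr[0] & EX_VXLAN_FLAG_VNI) == 0)
           return -1;
       return ((int32_t)hdr[4] << 16) | ((int32_t)hdr[5] << 8) | hdr[6];
   }

The decapsulation rule above matches on the same value, which is why both
rules must agree on the VNI.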
862