.. SPDX-License-Identifier: BSD-3-Clause
   Copyright 2020 Broadcom Inc.

BNXT Poll Mode Driver
=====================

The Broadcom BNXT PMD (**librte_net_bnxt**) implements support for adapters
based on Ethernet controllers and SoCs belonging to the Broadcom
BCM574XX/BCM575XX NetXtreme-E® Family of Ethernet Network Controllers,
the Broadcom BCM588XX Stingray Family of Smart NIC Adapters, and the Broadcom
StrataGX® BCM5873X Series of Communications Processors.

A complete list with links to reference material is in the Appendix section.

CPU Support
-----------

The BNXT PMD supports multiple CPU architectures, including x86-32, x86-64, and ARMv8.

Kernel Dependency
-----------------

The BNXT PMD requires a kernel module (VFIO or UIO) to set up a device, map
device memory to userspace, register interrupts, etc.
VFIO is more secure than UIO because it relies on IOMMU protection;
UIO requires the IOMMU to be disabled or configured in pass-through mode.

The BNXT PMD supports operating with:

* Linux vfio-pci
* Linux uio_pci_generic
* Linux igb_uio
* BSD nic_uio

Running BNXT PMD
----------------

Bind the device to one of the kernel modules listed above:

.. code-block:: console

    ./dpdk-devbind.py -b vfio-pci|igb_uio|uio_pci_generic bus_id:device_id.function_id

The BNXT PMD can run on a PF or a VF.

PCI-SIG Single Root I/O Virtualization (SR-IOV) involves the direct assignment
of part of the network port resources to guest operating systems using the
SR-IOV standard.
The NIC is logically distributed among multiple virtual machines (VMs), while
still having global data in common to share with the PF and other VFs.

A sysadmin can create and configure VFs:

.. code-block:: console

  echo num_vfs > /sys/bus/pci/devices/domain_id:bus_id:device_id.function_id/sriov_numvfs
  (ex) echo 4 > /sys/bus/pci/devices/0000:82:00.0/sriov_numvfs

A sysadmin can also change VF properties such as the MAC address, transparent
VLAN, TX rate limit, and trusted VF setting:

.. code-block:: console

  ip link set pf_if_name vf vf_id mac (mac_address) vlan (vlan_id) txrate (rate_value) trust (on|off)
  (ex) ip link set ens2f0 vf 0 mac 00:11:22:33:44:55 vlan 0x100 txrate 100 trust off

Running on VF
~~~~~~~~~~~~~

Flow Bifurcation
^^^^^^^^^^^^^^^^

Flow bifurcation splits incoming traffic between user space applications
(such as DPDK applications) and kernel space programs (such as the Linux
kernel stack).
It can direct some traffic, for example data plane traffic, to DPDK, while
the rest of the traffic, for example control plane traffic, is redirected to
the traditional Linux networking stack.

Refer to https://doc.dpdk.org/guides/howto/flow_bifurcation.html

Benefits of flow bifurcation include:

* Better performance with less CPU overhead, as the user application can
  directly access the NIC for the data path
* The NIC is still controlled by the kernel, as control traffic is forwarded
  only to the kernel driver
* Control commands, e.g. ethtool, work as usual

Running on a VF, the BNXT PMD supports flow bifurcation with a combination of
SR-IOV and packet classification and/or forwarding capability.
In the simplest case of flow bifurcation, a PF driver configures the NIC to
forward all user traffic directly to VFs with a matching destination MAC
address, while the rest of the traffic is forwarded to the PF.
Note that broadcast packets will be forwarded to both the PF and the VF.

.. code-block:: console

    (ex) ethtool --config-ntuple ens2f0 flow-type ether dst 00:01:02:03:00:01 vlan 10 vlan-mask 0xf000 action 0x100000000
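
The per-packet decision described above can be sketched as a destination-MAC
classifier. The sketch below is purely illustrative (the ``classify`` helper
and ``enum dest`` are made up for this example, not a DPDK or ethtool API):
traffic whose DMAC matches the VF's MAC lands on the VF, broadcast is
replicated to both, and everything else goes to the PF.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sketch (not a driver API) of the bifurcation decision. */
enum dest { TO_PF, TO_VF, TO_BOTH };

static enum dest classify(const uint8_t dmac[6], const uint8_t vf_mac[6])
{
    static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    if (memcmp(dmac, bcast, 6) == 0)
        return TO_BOTH;   /* broadcast: forwarded to both PF and VF */
    if (memcmp(dmac, vf_mac, 6) == 0)
        return TO_VF;     /* data plane traffic: directly to the VF */
    return TO_PF;         /* control plane traffic: Linux stack via PF */
}
```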

Trusted VF
^^^^^^^^^^

By default, VFs are *not* allowed to perform privileged operations, such as
modifying the VF's MAC address in the guest. These security measures are
designed to prevent possible attacks.
However, when a DPDK application can be trusted (e.g., OVS-DPDK), these
operations performed by a VF are legitimate and can be allowed.

To allow a VF to request "trusted mode," the trusted VF concept was introduced
in Linux kernel 4.4; it allows VFs to become "trusted" and perform some
privileged operations.

The BNXT PMD supports the trusted VF mode of operation. Only a PF can enable
the trusted attribute on a VF. It is preferable to enable the trusted setting
on a VF before starting applications.
However, the BNXT PMD handles dynamic changes in trusted settings as well.

Note that control commands, e.g. ethtool, will work via the kernel PF driver,
*not* via the trusted VF driver.

Operations supported by a trusted VF:

* MAC address configuration
* Flow rule creation

Operations *not* supported by a trusted VF:

* Firmware upgrade
* Promiscuous mode setting

Running on PF
~~~~~~~~~~~~~

Unlike on a VF, when the BNXT PMD runs on a PF there are no restrictions placed
on the features which the PF can enable or request. In a multiport NIC, each
port has a corresponding PF. Depending on the configuration of the NIC, there
can also be more than one PF associated per port.
A sysadmin can load the kernel driver on one PF and run the BNXT PMD on the
other PF, or run the PMD on both PFs. In such cases, the firmware picks one of
the PFs as the master PF.

Much like with the trusted VF, the DPDK application must be *trusted* and
expected to be *well-behaved*.

Features
--------

The BNXT PMD supports the following features:

* Port Control
    * Port MTU
    * LED
    * Flow Control and Autoneg
* Packet Filtering
    * Unicast MAC Filter
    * Multicast MAC Filter
    * VLAN Filtering
    * Allmulticast Mode
    * Promiscuous Mode
* Stateless Offloads
    * CRC Offload
    * Checksum Offload (IPv4, TCP, and UDP)
    * Multi-Queue (TSS and RSS)
    * Segmentation and Reassembly (TSO and LRO)
* VLAN Insert/Strip
* Stats Collection
* Generic Flow Offload

Port Control
~~~~~~~~~~~~

**Port MTU**: The BNXT PMD supports an MTU (Maximum Transmission Unit) of up to
9,574 bytes:

.. code-block:: console

    testpmd> port config mtu (port_id) mtu_value
    testpmd> show port info (port_id)

**LED**: The application turns on (or off) a port LED, typically for port
identification:

.. code-block:: console

    int rte_eth_led_on (uint16_t port_id)
    int rte_eth_led_off (uint16_t port_id)

**Flow Control and Autoneg**: The application turns on (or off) flow control
and/or auto-negotiation on a port:

.. code-block:: console

    testpmd> set flow_ctrl rx (on|off) (port_id)
    testpmd> set flow_ctrl tx (on|off) (port_id)
    testpmd> set flow_ctrl autoneg (on|off) (port_id)

Note that the BNXT PMD does *not* support some options and ignores them when
requested:

* high_water
* low_water
* pause_time
* mac_ctrl_frame_fwd
* send_xon

Packet Filtering
~~~~~~~~~~~~~~~~

Applications control packet-forwarding behavior with packet filters.

The BNXT PMD supports hardware-based packet filtering:

* UC (Unicast) MAC Filters
    * No unicast packets are forwarded to an application except those with a
      DMAC address added to the port
    * At initialization, the station MAC address is added to the port
* MC (Multicast) MAC Filters
    * No multicast packets are forwarded to an application except those with an
      MC address added to the port
    * When the application listens to a multicast group, it adds the MC address
      to the port
* VLAN Filtering Mode
    * When enabled, no packets are forwarded to an application except the ones
      with the VLAN tag assigned to the port
* Allmulticast Mode
    * When enabled, every multicast packet received on the port is forwarded to
      the application
    * Typical users are routing applications
* Promiscuous Mode
    * When enabled, every packet received on the port is forwarded to the
      application

Unicast MAC Filter
^^^^^^^^^^^^^^^^^^

The application can add (or remove) MAC addresses to enable (or disable)
filtering on the MAC addresses used to accept packets.

.. code-block:: console

    testpmd> show port (port_id) macs
    testpmd> mac_addr (add|remove) (port_id) (XX:XX:XX:XX:XX:XX)

Multicast MAC Filter
^^^^^^^^^^^^^^^^^^^^

The application can add (or remove) multicast addresses to enable (or disable)
allowlist filtering on the multicast MAC addresses used to accept packets.

.. code-block:: console

    testpmd> show port (port_id) mcast_macs
    testpmd> mcast_addr (add|remove) (port_id) (XX:XX:XX:XX:XX:XX)

Note that the BNXT PMD supports up to 16 MC MAC filters. If the user adds more
than 16 MC MACs, the BNXT PMD puts the port into Allmulticast mode.
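
A minimal sketch of that fallback rule. The constant and struct below are
illustrative stand-ins, not the BNXT PMD's internal data structures:

```c
#include <stdbool.h>

/* Illustrative: when the MC filter table overflows, accept all multicast. */
#define MAX_MC_FILTERS 16

struct port_cfg {
    int  nb_mc_addr;   /* number of programmed MC MAC filters */
    bool allmulti;     /* fallback when the filter table overflows */
};

static void set_mc_addr_list(struct port_cfg *cfg, int nb_addrs)
{
    if (nb_addrs > MAX_MC_FILTERS) {
        cfg->nb_mc_addr = 0;
        cfg->allmulti = true;    /* too many filters: Allmulticast mode */
    } else {
        cfg->nb_mc_addr = nb_addrs;
        cfg->allmulti = false;
    }
}
```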

VLAN Filtering
^^^^^^^^^^^^^^

The application enables (or disables) VLAN filtering mode. When the mode is
enabled, no packets are forwarded to an application except ones with a VLAN
tag assigned for the application.

.. code-block:: console

    testpmd> vlan set filter (on|off) (port_id)
    testpmd> rx_vlan (add|rm) (vlan_id) (port_id)

Allmulticast Mode
^^^^^^^^^^^^^^^^^

The application enables (or disables) allmulticast mode. When the mode is
enabled, every multicast packet received is forwarded to the application.

.. code-block:: console

    testpmd> show port info (port_id)
    testpmd> set allmulti (port_id) (on|off)

Promiscuous Mode
^^^^^^^^^^^^^^^^

The application enables (or disables) promiscuous mode. When the mode is
enabled on a port, every packet received on the port is forwarded to the
application.

.. code-block:: console

    testpmd> show port info (port_id)
    testpmd> set promisc (port_id) (on|off)

Stateless Offloads
~~~~~~~~~~~~~~~~~~

Like Linux, DPDK allows enabling hardware offload of some stateless processing
(such as checksum calculation) from the stack, relieving the CPU from having to
burn cycles on every packet.

Listed below are the stateless offloads supported by the BNXT PMD:

* CRC offload (for both TX and RX packets)
* Checksum Offload (for both TX and RX packets)
    * IPv4 Checksum Offload
    * TCP Checksum Offload
    * UDP Checksum Offload
* Segmentation/Reassembly Offloads
    * TCP Segmentation Offload (TSO)
    * Large Receive Offload (LRO)
* Multi-Queue
    * Transmit Side Scaling (TSS)
    * Receive Side Scaling (RSS)

Also, the BNXT PMD supports stateless offloads on inner frames for tunneled
packets. Listed below are the tunneling protocols supported by the BNXT PMD:

* VXLAN
* GRE
* NVGRE

Note that enabling (or disabling) stateless offloads requires the application
to stop the port before changing the configuration.

CRC Offload
^^^^^^^^^^^

The FCS (Frame Check Sequence) in the Ethernet frame is a four-octet CRC
(Cyclic Redundancy Check) that allows the receiver to detect data corruption
anywhere in the frame.

The BNXT PMD supports hardware-based CRC offload:

* TX: calculate and insert the CRC
* RX: check and remove the CRC, notify the application on CRC error

Note that the CRC offload is always turned on.
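
For reference, the FCS is the reflected CRC-32 over the frame contents
(generator polynomial 0x04C11DB7, bit-reversed to 0xEDB88320). A minimal
bitwise sketch of the computation the hardware performs; real implementations
are table-driven or hardware-assisted:

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise (slow but simple) Ethernet CRC-32, as offloaded by the NIC.
 * Polynomial 0x04C11DB7, reflected form 0xEDB88320. */
static uint32_t eth_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xffffffffu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1));
    }
    return ~crc;   /* final XOR; appended little-endian as the FCS */
}
```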

Checksum Offload
^^^^^^^^^^^^^^^^

The application enables hardware checksum calculation for IPv4, TCP, and UDP.

.. code-block:: console

    testpmd> port stop (port_id)
    testpmd> csum set (ip|tcp|udp|outer-ip|outer-udp) (sw|hw) (port_id)
    testpmd> set fwd csum
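
The IPv4 calculation the hardware takes over is the one's-complement sum of the
header read as 16-bit big-endian words. A self-contained sketch (not a DPDK
API) for reference:

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over a header of even length: the calculation
 * offloaded for IPv4 when "csum set ip hw" is enabled. */
static uint16_t ipv4_cksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    while (sum >> 16)                    /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Writing the computed value into the checksum field and re-running the sum over
the whole header yields zero, which is how a receiver verifies it.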

Multi-Queue
^^^^^^^^^^^

Multi-Queue, also known as TSS (Transmit Side Scaling) or RSS (Receive Side
Scaling), is a common networking technique that allows for more efficient load
balancing across multiple CPU cores.

The application enables multiple TX and RX queues when it is started.

.. code-block:: console

    testpmd -l 1,3,5 --main-lcore 1 --txq=2 --rxq=2 --nb-cores=2

**TSS**

TSS distributes network transmit processing across several hardware-based
transmit queues, allowing outbound network traffic to be processed by multiple
CPU cores.

**RSS**

RSS distributes network receive processing across several hardware-based receive
queues, allowing inbound network traffic to be processed by multiple CPU cores.

The application can select the RSS mode, i.e. select the header fields that are
included in the hash calculation. The BNXT PMD supports the RSS modes
``default|ip|tcp|udp|none``, where the default mode covers both L3 and L4.

For tunneled packets, the RSS hash is calculated over the inner frame header
fields. Applications may want to select the tunnel header fields for hash
calculation; this will be supported in 20.08 using RSS level.

.. code-block:: console

    testpmd> port config (port_id) rss (all|default|ip|tcp|udp|none)

    // note that testpmd defaults the RSS mode to ip
    // issue the command below to enable the L4 header (TCP or UDP) along with the IPv4 header
    testpmd> port config (port_id) rss default

    // to check the current RSS configuration, such as RSS function and RSS key
    testpmd> show port (port_id) rss-hash key

    // RSS is enabled by default. However, the application can disable RSS as follows
    testpmd> port config (port_id) rss none

The application can change the flow distribution, i.e. remap the received
traffic to CPU cores, using the RSS RETA (Redirection Table).

.. code-block:: console

    // the application queries the current RSS RETA configuration
    testpmd> show port (port_id) rss reta size (mask0, mask1)

    // the application changes the RSS RETA configuration
    testpmd> port config (port_id) rss reta (hash, queue) [, (hash, queue)]
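
The RETA remapping reduces to an indexed lookup: the packet's RSS hash selects
a table entry modulo the table size, and the entry names the RX queue. An
illustrative sketch (the table size and contents are examples, not the
device's defaults):

```c
#include <stdint.h>

#define RETA_SIZE 128   /* illustrative table size */

/* The packet's RSS hash indexes the redirection table; the entry selects
 * the RX queue the packet is steered to. */
static unsigned int rss_queue(uint32_t rss_hash, const uint8_t reta[RETA_SIZE])
{
    return reta[rss_hash % RETA_SIZE];
}
```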

TSO
^^^

TSO (TCP Segmentation Offload), also known as LSO (Large Send Offload), enables
the TCP/IP stack to pass the NIC a datagram larger than the MTU (Maximum
Transmission Unit). The NIC breaks it into multiple segments before sending it
to the network.

The BNXT PMD supports hardware-based TSO.

.. code-block:: console

    // display the status of TSO
    testpmd> tso show (port_id)

    // enable/disable TSO
    testpmd> port config (port_id) tx_offload tcp_tso (on|off)

    // set TSO segment size
    testpmd> tso set segment_size (port_id)

The BNXT PMD also supports hardware-based tunneled TSO.

.. code-block:: console

    // display the status of tunneled TSO
    testpmd> tunnel_tso show (port_id)

    // enable/disable tunneled TSO
    testpmd> port config (port_id) tx_offload vxlan_tnl_tso|gre_tnl_tso (on|off)

    // set tunneled TSO segment size
    testpmd> tunnel_tso set segment_size (port_id)

Note that checksum offload is always assumed to be enabled for TSO.
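
The on-wire effect of TSO is easy to quantify: a payload larger than the
segment size is cut into a ceiling-division number of segments, each of which
gets its own headers and checksums (hence the note above). A small sketch,
purely illustrative:

```c
/* Number of on-wire segments the NIC produces for one TSO send:
 * a plain ceiling division of the payload by the segment size. */
static unsigned int tso_nb_segments(unsigned int payload_len, unsigned int mss)
{
    return (payload_len + mss - 1) / mss;
}
```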

LRO
^^^

LRO (Large Receive Offload) enables the NIC to aggregate multiple incoming
TCP/IP packets from a single stream into a larger buffer before passing it to
the networking stack.

The BNXT PMD supports hardware-based LRO.

.. code-block:: console

    // display the status of LRO
    testpmd> show port (port_id) rx_offload capabilities
    testpmd> show port (port_id) rx_offload configuration

    // enable/disable LRO
    testpmd> port config (port_id) rx_offload tcp_lro (on|off)

    // set max LRO packet (datagram) size
    testpmd> port config (port_id) max-lro-pkt-size (max_size)

The BNXT PMD also supports tunneled LRO.

Some applications, such as routing, should *not* change the packet headers as
packets pass through (i.e. are received from and sent back to the network). In
such a case, GRO (Generic Receive Offload) should be used instead of LRO.

VLAN Insert/Strip
~~~~~~~~~~~~~~~~~

A DPDK application offloads VLAN insert/strip to improve performance. The BNXT
PMD supports hardware-based VLAN insert/strip offload for both single and
double VLAN packets.


VLAN Insert
^^^^^^^^^^^

The application configures the VLAN TPID (Tag Protocol ID). By default, the
TPID is 0x8100.

.. code-block:: console

    // configure outer TPID value for a port
    testpmd> vlan set outer tpid (tpid_value) (port_id)

Setting the inner TPID will be rejected, as the BNXT PMD supports inserting
only an outer VLAN. Note that when a packet has a single VLAN, the tag is
considered as outer, i.e. the inner VLAN is relevant only when a packet is
double-tagged.

The BNXT PMD supports the TPID values shown below. Any other values will be
rejected.

* ``0x8100``
* ``0x88a8``
* ``0x9100``
* ``0x9200``
* ``0x9300``

The BNXT PMD supports the VLAN insert offload on a per-packet basis. The
application provides the TCI (Tag Control Info) for a packet via the mbuf. In
turn, the BNXT PMD inserts the VLAN tag (via hardware) using the provided TCI
along with the configured TPID.

.. code-block:: console

    // enable VLAN insert offload
    testpmd> port config (port_id) tx_offload vlan_insert|qinq_insert (on|off)

    if (mbuf->ol_flags & PKT_TX_QINQ)       // case-1: insert VLAN to single-tagged packet
        tci_value = mbuf->vlan_tci_outer
    else if (mbuf->ol_flags & PKT_TX_VLAN)  // case-2: insert VLAN to untagged packet
        tci_value = mbuf->vlan_tci
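
What the hardware does with the configured TPID and the per-packet TCI can be
pictured as splicing a 4-byte tag in after the two MAC addresses. The
buffer-level sketch below is illustrative only; the real offload happens in
the NIC, not in driver software:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Insert an outer VLAN tag (TPID + TCI, network byte order) after the
 * 12 bytes of DMAC + SMAC. The caller must have 4 spare bytes of room
 * past frame[len - 1]. */
static void vlan_tag_insert(uint8_t *frame, size_t len, uint16_t tpid, uint16_t tci)
{
    memmove(frame + 16, frame + 12, len - 12);  /* shift EtherType + payload */
    frame[12] = (uint8_t)(tpid >> 8);
    frame[13] = (uint8_t)(tpid & 0xff);
    frame[14] = (uint8_t)(tci >> 8);
    frame[15] = (uint8_t)(tci & 0xff);
}
```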

VLAN Strip
^^^^^^^^^^

The application configures the per-port VLAN strip offload.

.. code-block:: console

    // enable VLAN strip on a port
    testpmd> port config (port_id) rx_offload vlan_strip (on|off)

    // the application is notified of the VLAN strip via the mbuf
    mbuf->ol_flags |= PKT_RX_VLAN | PKT_RX_VLAN_STRIPPED // outer VLAN is found and stripped
    mbuf->vlan_tci = tci_value                           // TCI of the stripped VLAN

Time Synchronization
~~~~~~~~~~~~~~~~~~~~

System operators may run a PTP (Precision Time Protocol) client application to
synchronize the time on the NIC (and optionally, on the system) to a PTP
master.

The BNXT PMD supports a PTP client application communicating with a PTP master
clock using the DPDK IEEE 1588 APIs. Note that the PTP client application needs
to run on a PF, and vector mode needs to be disabled.

.. code-block:: console

    testpmd> set fwd ieee1588 // enable IEEE 1588 mode

When enabled, the BNXT PMD configures the hardware to insert IEEE 1588
timestamps into outgoing PTP packets and reports IEEE 1588 timestamps from
incoming PTP packets to the application via the mbuf.

.. code-block:: console

    // RX packet completion will indicate whether the packet is PTP
    mbuf->ol_flags |= PKT_RX_IEEE1588_PTP

Statistics Collection
~~~~~~~~~~~~~~~~~~~~~

In Linux, *ethtool -S* enables us to query the NIC stats. DPDK provides similar
functionality via rte_eth_stats and rte_eth_xstats.

The BNXT PMD supports both basic and extended stats collection:

* Basic stats
* Extended stats

Basic Stats
^^^^^^^^^^^

The application collects per-port and per-queue stats using the rte_eth_stats
APIs.

.. code-block:: console

    testpmd> show port stats (port_id)

Basic stats include:

* ipackets
* ibytes
* opackets
* obytes
* imissed
* ierrors
* oerrors

By default, per-queue stats for 16 queues are supported. For more than 16
queues, the BNXT PMD should be compiled with ``RTE_ETHDEV_QUEUE_STAT_CNTRS``
set to the desired number of queues.

Extended Stats
^^^^^^^^^^^^^^

Unlike basic stats, extended stats are vendor-specific, i.e. each vendor
provides its own set of counters.

The BNXT PMD provides a rich set of counters, including per-flow counters,
per-cos counters, per-priority counters, etc.

.. code-block:: console

    testpmd> show port xstats (port_id)

Shown below is the elaborated sequence to retrieve extended stats:

.. code-block:: c

    // the application queries the number of xstats
    len = rte_eth_xstats_get(port_id, NULL, 0);
    // the BNXT PMD returns the size of the xstats array (i.e. the number of entries)
    // the BNXT PMD returns 0 if the feature is compiled out or disabled

    // the application allocates memory for xstats
    struct rte_eth_xstat_name *names; // each name is 64 characters or less
    struct rte_eth_xstat *xstats;
    names = calloc(len, sizeof(*names));
    xstats = calloc(len, sizeof(*xstats));

    // the application retrieves the xstats names and values
    ret = rte_eth_xstats_get_names(port_id, names, len);
    ret = rte_eth_xstats_get(port_id, xstats, len);

    // the application checks the xstats
    // the application may repeat the below:
    rte_eth_xstats_reset(port_id); // reset the xstats

    // reset can be skipped if the application wants to see accumulated stats
    // run traffic
    // probably stop the traffic
    // retrieve xstats // no need to retrieve the xstats names again
    // check xstats

Generic Flow Offload
~~~~~~~~~~~~~~~~~~~~

Applications can benefit by offloading all or part of flow processing to
hardware. For example, applications can offload packet classification only
(partial offload) or the whole match-action (full offload).

DPDK offers the Generic Flow API (rte_flow API) to configure the hardware to
perform flow processing.

Listed below are the rte_flow APIs the BNXT PMD supports:

* rte_flow_validate
* rte_flow_create
* rte_flow_destroy
* rte_flow_flush

Host Based Flow Table Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Starting with 20.05, the BNXT PMD supports host-based flow table management.
This is a new mechanism that should allow higher flow scalability than what is
currently supported. This new approach also defines a new rte_flow parser and
mapper which currently support basic packet classification in the receive path.

The feature uses a newly implemented control-plane firmware interface which
optimizes flow insertions and deletions.

This is a tech preview feature, and is disabled by default. It can be enabled
using bnxt devargs, for example: ``-a 0000:0d:00.0,host-based-truflow=1``.

This feature is currently supported on Whitney+ and Stingray devices.

Notes
-----

- On stopping a device port, all the flows created on the port by the
  application will be flushed from the hardware and any tables maintained by
  the PMD. After stopping the device port, all flows on the port become invalid
  and are no longer represented in the system.
  Instead of destroying or flushing such flows, an application should discard
  all references to these flows and re-create the flows as required after the
  port is restarted.

- While an application is free to use the group id attribute to group flows
  together using a specific criterion, the BNXT PMD currently associates this
  group id with a VNIC id. One such case is the grouping of flows which are
  filtered on the same source or destination MAC address. This allows packets
  of such flows to be directed to one or more queues associated with the VNIC
  id. This implementation is supported only when TRUFLOW functionality is
  disabled.

- An application can issue a VXLAN decap offload request using the rte_flow API
  either as a single rte_flow request or as a combination of two stages.
  The PMD currently supports the two-stage offload design.
  In this approach the offload request may come as two flow offload requests,
  Flow1 and Flow2. The match criteria for Flow1 are O_DMAC, O_SMAC, O_DST_IP,
  O_UDP_DPORT and the actions are COUNT, MARK, JUMP. The match criteria for
  Flow2 are O_SRC_IP, O_DST_IP, VNI and inner header fields.
  Flow1 and Flow2 flow offload requests can come in any order. If the Flow2
  offload request comes first, Flow2 cannot be offloaded as there is no O_DMAC
  information in it. In this case, Flow2 is deferred until the Flow1 offload
  request arrives, which carries the O_DMAC information. Using Flow1's O_DMAC,
  the driver creates an L2 context entry in the hardware as part of offloading
  Flow1. Flow2 then uses Flow1's O_DMAC to look up the L2 context id associated
  with this O_DMAC, together with the other flow fields that were cached at the
  time Flow2 was deferred. Flow2 requests that arrive after Flow1 is offloaded
  are programmed directly and not cached.

- The PMD supports thread-safe rte_flow operations.

Note: A VNIC represents a virtual interface in the hardware. It is a resource
in the RX path of the chip and is used to set up various target actions such as
RSS, MAC filtering etc. for the physical function in use.

Virtual Function Port Representors
----------------------------------
The BNXT PMD supports the creation of VF port representors for the control
and monitoring of BNXT virtual function devices. Each port representor
corresponds to a single virtual function of that device. When there is no
hardware flow offload, each packet transmitted by the VF will be received by
the corresponding representor. Similarly, each packet that is sent to a
representor will be received by the VF. Applications can take advantage of
this feature when SR-IOV is enabled. The representor allows the first packet
transmitted by the VF to be received by the DPDK application, which can then
decide if the flow should be offloaded to the hardware. Once the flow is
offloaded in the hardware, any packet matching the flow will be received by
the VF while the DPDK application will not receive it any more. The BNXT PMD
supports the creation and handling of port representors when the PMD is
initialized on a PF or trusted VF. The user can specify the list of VF IDs of
the VFs for which representors are needed by using the ``devargs`` option
``representor``::

  -a DBDF,representor=[0,1,4]

Note that currently hot-plugging of representor ports is not supported, so all
the required representors must be specified at the creation of the PF or the
trusted VF.

Representors on Stingray SoC
----------------------------
A representor created on an x86 host typically represents a VF running in the
same x86 domain. But in the case of the SoC, the application can run on the CPU
complex inside the SoC. A representor can be created on the SoC to represent a
PF or a VF running in the x86 domain. Since representor creation requires
passing the bus:device.function of the PCI device endpoint, which is not
necessarily in the same host domain, additional devargs have been added to the
PMD.

* rep_is_vf - false to indicate a VF representor
* rep_is_pf - true to indicate a PF representor
* rep_based_pf - physical index of the PF
* rep_q_r2f - logical COS queue index for the rep to endpoint direction
* rep_q_f2r - logical COS queue index for the endpoint to rep direction
* rep_fc_r2f - flow control for the representor to endpoint direction
* rep_fc_f2r - flow control for the endpoint to representor direction

The sample command line with the new ``devargs`` looks like this::

  -a 0000:06:02.0,host-based-truflow=1,representor=[1],rep-based-pf=8,\
	rep-is-pf=1,rep-q-r2f=1,rep-fc-r2f=0,rep-q-f2r=1,rep-fc-f2r=1

.. code-block:: console

	testpmd -l1-4 -n2 -a 0008:01:00.0,host-based-truflow=1,\
	representor=[0],rep-based-pf=8,rep-is-pf=0,rep-q-r2f=1,rep-fc-r2f=1,\
	rep-q-f2r=0,rep-fc-f2r=1 --log-level="pmd.*",8 -- -i --rxq=3 --txq=3

Number of flows supported
-------------------------
The number of flows that can be supported can be changed using the devargs
parameter ``max_num_kflows``. The default number of flows supported is 16K each
in the ingress and egress paths.

Selecting EM vs EEM
-------------------
Broadcom devices can support filter creation in on-chip memory or in external
memory, referred to as EM and EEM mode respectively.
The decision for internal/external EM support is based on the ``devargs``
parameter ``max_num_kflows``. If this is set by the user, external EM is used.
Otherwise, EM support is enabled with flows created in internal memory.

Application Support
-------------------

Firmware
~~~~~~~~

The BNXT PMD allows an application to retrieve the firmware version.

.. code-block:: console

    testpmd> show port info (port_id)

Note that applications cannot update the firmware using the BNXT PMD.

Multiple Processes
~~~~~~~~~~~~~~~~~~

When two or more DPDK applications (e.g., testpmd and dpdk-pdump) share a single
instance of DPDK, the BNXT PMD supports a single primary application and one or
more secondary applications. Note that the DPDK layer (not the PMD) ensures
there is only one primary application.

There are two modes:

Manual mode

* The application states whether it is primary or secondary using the
  *proc-type* flag
* The 1st process should be spawned with ``--proc-type=primary``
* All subsequent processes should be spawned with ``--proc-type=secondary``

Auto detection mode

* The application uses the ``proc-type=auto`` flag
* A process is spawned as a secondary if a primary is already running

The BNXT PMD uses this information to skip device initialization, i.e. it
performs device initialization only when brought up by a primary application.

Runtime Queue Setup
~~~~~~~~~~~~~~~~~~~

Typically, a DPDK application allocates TX and RX queues statically, i.e.
queues are allocated at start. However, an application may want to increase (or
decrease) the number of queues dynamically for various reasons, e.g. power
savings.

The BNXT PMD allows applications to increase or decrease queues at runtime.

.. code-block:: console

    testpmd> port config all (rxq|txq) (num_queues)

Note that a DPDK application must allocate default queues (one for TX and one
for RX at minimum) at initialization.

Descriptor Status
~~~~~~~~~~~~~~~~~

Applications may use the descriptor status for various reasons, e.g. for power
savings. For example, an application may stop polling and change to interrupt
mode when the descriptor status shows no packets to service for a while.

The BNXT PMD allows the application to retrieve both TX and RX descriptor
status.

.. code-block:: console

    testpmd> show port (port_id) (rxq|txq) (queue_id) desc (desc_id) status

Bonding
~~~~~~~

DPDK implements a light-weight library to allow PMDs to be bonded together,
providing a single logical PMD to the application.

.. code-block:: console

    testpmd -l 0-3 -n4 --vdev 'net_bonding0,mode=0,slave=<PCI B:D.F device 1>,slave=<PCI B:D.F device 2>,mac=XX:XX:XX:XX:XX:XX' -- --socket-num=1 -- -i --port-topology=chained
    (ex) testpmd -l 1,3,5,7,9 -n4 --vdev 'net_bonding0,mode=0,slave=0000:82:00.0,slave=0000:82:00.1,mac=00:1e:67:1d:fd:1d' -- --socket-num=1 -- -i --port-topology=chained
852
Vector Processing
-----------------

Vector processing provides significantly improved performance over scalar
processing.

The BNXT PMD supports vector processing using SSE (Streaming SIMD
Extensions) instructions on x86 platforms. It also supports NEON intrinsics for
vector processing on ARM CPUs. The BNXT vPMD (vector mode PMD) is available for
Intel/AMD and ARM CPU architectures.

This improved performance comes from several optimizations:

* Batching
    * TX: processing completions in bulk
    * RX: allocating mbufs in bulk
* Chained mbufs are *not* supported, i.e. a packet should fit a single mbuf
* Some stateless offloads are *not* supported with vector processing
    * TX: no offloads will be supported
    * RX: reduced RX offloads (listed below) will be supported::

       DEV_RX_OFFLOAD_VLAN_STRIP
       DEV_RX_OFFLOAD_KEEP_CRC
       DEV_RX_OFFLOAD_JUMBO_FRAME
       DEV_RX_OFFLOAD_IPV4_CKSUM
       DEV_RX_OFFLOAD_UDP_CKSUM
       DEV_RX_OFFLOAD_TCP_CKSUM
       DEV_RX_OFFLOAD_OUTER_IPV4_CKSUM
       DEV_RX_OFFLOAD_RSS_HASH
       DEV_RX_OFFLOAD_VLAN_FILTER

The BNXT Vector PMD is enabled in DPDK builds by default.

However, the decision to enable vector mode is made when the port transitions
from stopped to started. Enabling any TX offload, or any RX offload other than
those listed above, disables vector mode.
Offload configuration changes that impact vector mode must be made while the
port is stopped.
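
For example, toggling an RX offload in testpmd with the port stopped (an
illustrative sequence; vector mode eligibility is re-evaluated on the next
port start):

.. code-block:: console

    testpmd> port stop 0
    testpmd> port config 0 rx_offload vlan_strip on
    testpmd> port start 0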

Note that TX (or RX) vector mode can be enabled independently of RX (or TX)
vector mode.

Vector mode is also allowed when jumbo frames are enabled,
as long as the MTU setting does not require scattered Rx.

Appendix
--------

Supported Chipsets and Adapters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

BCM5730x NetXtreme-C® Family of Ethernet Network Controllers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Information about Ethernet adapters in the NetXtreme family of adapters can be
found in the `NetXtreme® Brand section <https://www.broadcom.com/products/ethernet-connectivity/network-adapters/>`_ of the `Broadcom website <http://www.broadcom.com/>`_.

* ``M150c ... Single-port 40/50 Gigabit Ethernet Adapter``
* ``P150c ... Single-port 40/50 Gigabit Ethernet Adapter``
* ``P225c ... Dual-port 10/25 Gigabit Ethernet Adapter``

BCM574xx/575xx NetXtreme-E® Family of Ethernet Network Controllers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Information about Ethernet adapters in the NetXtreme family of adapters can be
found in the `NetXtreme® Brand section <https://www.broadcom.com/products/ethernet-connectivity/network-adapters/>`_ of the `Broadcom website <http://www.broadcom.com/>`_.

* ``M125P .... Single-port OCP 2.0 10/25 Gigabit Ethernet Adapter``
* ``M150P .... Single-port OCP 2.0 50 Gigabit Ethernet Adapter``
* ``M150PM ... Single-port OCP 2.0 Multi-Host 50 Gigabit Ethernet Adapter``
* ``M210P .... Dual-port OCP 2.0 10 Gigabit Ethernet Adapter``
* ``M210TP ... Dual-port OCP 2.0 10 Gigabit Ethernet Adapter``
* ``M1100G ... Single-port OCP 2.0 10/25/50/100 Gigabit Ethernet Adapter``
* ``N150G .... Single-port OCP 3.0 50 Gigabit Ethernet Adapter``
* ``M225P .... Dual-port OCP 2.0 10/25 Gigabit Ethernet Adapter``
* ``N210P .... Dual-port OCP 3.0 10 Gigabit Ethernet Adapter``
* ``N210TP ... Dual-port OCP 3.0 10 Gigabit Ethernet Adapter``
* ``N225P .... Dual-port OCP 3.0 10/25 Gigabit Ethernet Adapter``
* ``N250G .... Dual-port OCP 3.0 50 Gigabit Ethernet Adapter``
* ``N410SG ... Quad-port OCP 3.0 10 Gigabit Ethernet Adapter``
* ``N410SGBT . Quad-port OCP 3.0 10 Gigabit Ethernet Adapter``
* ``N425G .... Quad-port OCP 3.0 10/25 Gigabit Ethernet Adapter``
* ``N1100G ... Single-port OCP 3.0 10/25/50/100 Gigabit Ethernet Adapter``
* ``N2100G ... Dual-port OCP 3.0 10/25/50/100 Gigabit Ethernet Adapter``
* ``N2200G ... Dual-port OCP 3.0 10/25/50/100/200 Gigabit Ethernet Adapter``
* ``P150P .... Single-port 50 Gigabit Ethernet Adapter``
* ``P210P .... Dual-port 10 Gigabit Ethernet Adapter``
* ``P210TP ... Dual-port 10 Gigabit Ethernet Adapter``
* ``P225P .... Dual-port 10/25 Gigabit Ethernet Adapter``
* ``P410SG ... Quad-port 10 Gigabit Ethernet Adapter``
* ``P410SGBT . Quad-port 10 Gigabit Ethernet Adapter``
* ``P425G .... Quad-port 10/25 Gigabit Ethernet Adapter``
* ``P1100G ... Single-port 10/25/50/100 Gigabit Ethernet Adapter``
* ``P2100G ... Dual-port 10/25/50/100 Gigabit Ethernet Adapter``
* ``P2200G ... Dual-port 10/25/50/100/200 Gigabit Ethernet Adapter``

BCM588xx NetXtreme-S® Family of SmartNIC Network Controllers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Information about the Stingray family of SmartNIC adapters can be found in the
`Stingray® Brand section <https://www.broadcom.com/products/ethernet-connectivity/smartnic/>`_ of the `Broadcom website <http://www.broadcom.com/>`_.

* ``PS225 ... Dual-port 25 Gigabit Ethernet SmartNIC``

BCM5873x StrataGX® Family of Communications Processors
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These ARM-based processors target a broad range of networking applications,
including virtual CPE (vCPE) and NFV appliances, 10G service routers and
gateways, control plane processing for Ethernet switches, and network-attached
storage (NAS).

* ``StrataGX BCM58732 ... Octal-Core 3.0GHz 64-bit ARM®v8 Cortex®-A72 based SoC``