..  BSD LICENSE
    Copyright(c) 2017 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Flow Classify Sample Application
================================

The Flow Classify sample application is based on the simple *skeleton* example
of a forwarding application.

It is intended as a demonstration of the basic components of a DPDK forwarding
application which uses the Flow Classify library APIs.

Please refer to the
:doc:`../prog_guide/flow_classify_lib`
for more information.

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``flow_classify`` sub-directory.

Running the Application
-----------------------

To run the example in a ``linuxapp`` environment:

.. code-block:: console

    cd ~/dpdk/examples/flow_classify
    ./build/flow_classify -c 4 -n 4 -- --rule_ipv4="../ipv4_rules_file.txt"

Please refer to the *DPDK Getting Started Guide*, section
:doc:`../linux_gsg/build_sample_apps`
for general information on running applications and the Environment Abstraction
Layer (EAL) options.


Sample ipv4_rules_file.txt
--------------------------

.. code-block:: console

    #file format:
    #src_ip/masklen dst_ip/masklen src_port : mask dst_port : mask proto/mask priority
    #
    2.2.2.3/24 2.2.2.7/24 32 : 0xffff 33 : 0xffff 17/0xff 0
    9.9.9.3/24 9.9.9.7/24 32 : 0xffff 33 : 0xffff 17/0xff 1
    9.9.9.3/24 9.9.9.7/24 32 : 0xffff 33 : 0xffff 6/0xff 2
    9.9.8.3/24 9.9.8.7/24 32 : 0xffff 33 : 0xffff 6/0xff 3
    6.7.8.9/24 2.3.4.5/24 32 : 0x0000 33 : 0x0000 132/0xff 4
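
To make the file format concrete, the following is a minimal sketch of how one
rule line could be split into its fields with ``sscanf()``. It is not the
parser used by the application: ``struct demo_rule``, ``demo_pack_ipv4()`` and
``demo_parse_rule()`` are hypothetical names introduced only for this
illustration, and the range checks are assumptions about what a sensible rule
looks like.

.. code-block:: c

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical holder for one parsed rule line; not a DPDK type. */
    struct demo_rule {
        uint32_t src_ip, dst_ip;        /* addresses in host byte order */
        unsigned int src_len, dst_len;  /* prefix lengths, 0..32 */
        unsigned int src_port, src_port_mask;
        unsigned int dst_port, dst_port_mask;
        unsigned int proto, proto_mask;
        unsigned int priority;
    };

    /* Pack four dotted-quad octets into 32 bits; -1 if an octet is > 255. */
    static int
    demo_pack_ipv4(const unsigned int o[4], uint32_t *out)
    {
        int i;

        for (i = 0; i < 4; i++)
            if (o[i] > 255)
                return -1;
        *out = ((uint32_t)o[0] << 24) | ((uint32_t)o[1] << 16) |
               ((uint32_t)o[2] << 8) | (uint32_t)o[3];
        return 0;
    }

    /* Parse one rule line; returns 0 on success, -1 on malformed input. */
    int
    demo_parse_rule(const char *line, struct demo_rule *r)
    {
        unsigned int s[4], d[4];
        int sp_mask, dp_mask, pr_mask;  /* %i accepts the 0x prefix */
        int n;

        n = sscanf(line,
                   "%u.%u.%u.%u/%u %u.%u.%u.%u/%u %u : %i %u : %i %u/%i %u",
                   &s[0], &s[1], &s[2], &s[3], &r->src_len,
                   &d[0], &d[1], &d[2], &d[3], &r->dst_len,
                   &r->src_port, &sp_mask, &r->dst_port, &dp_mask,
                   &r->proto, &pr_mask, &r->priority);
        if (n != 17)
            return -1;
        if (demo_pack_ipv4(s, &r->src_ip) != 0 ||
            demo_pack_ipv4(d, &r->dst_ip) != 0)
            return -1;
        if (r->src_len > 32 || r->dst_len > 32 ||
            r->src_port > 0xffff || r->dst_port > 0xffff || r->proto > 0xff)
            return -1;
        if (sp_mask < 0 || sp_mask > 0xffff ||
            dp_mask < 0 || dp_mask > 0xffff ||
            pr_mask < 0 || pr_mask > 0xff)
            return -1;

        r->src_port_mask = (unsigned int)sp_mask;
        r->dst_port_mask = (unsigned int)dp_mask;
        r->proto_mask = (unsigned int)pr_mask;
        return 0;
    }

Comment lines starting with ``#`` do not match the pattern, so a caller is
expected to skip them before calling the parser.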

Explanation
-----------

The following sections provide an explanation of the main components of the
code.

All DPDK library functions used in the sample code are prefixed with ``rte_``
and are explained in detail in the *DPDK API Documentation*.

ACL field definitions for the IPv4 5 tuple rule
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following field definitions are used when creating the ACL table during
initialisation of the ``Flow Classify`` application.

.. code-block:: c

    enum {
        PROTO_FIELD_IPV4,
        SRC_FIELD_IPV4,
        DST_FIELD_IPV4,
        SRCP_FIELD_IPV4,
        DSTP_FIELD_IPV4,
        NUM_FIELDS_IPV4
    };

    enum {
        PROTO_INPUT_IPV4,
        SRC_INPUT_IPV4,
        DST_INPUT_IPV4,
        SRCP_DESTP_INPUT_IPV4
    };

    static struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {
        /* first input field - always one byte long. */
        {
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint8_t),
            .field_index = PROTO_FIELD_IPV4,
            .input_index = PROTO_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                offsetof(struct ipv4_hdr, next_proto_id),
        },
        /* next input field (IPv4 source address) - 4 consecutive bytes. */
        {
            /* rte_flow uses a bit mask for IPv4 addresses */
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint32_t),
            .field_index = SRC_FIELD_IPV4,
            .input_index = SRC_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                offsetof(struct ipv4_hdr, src_addr),
        },
        /* next input field (IPv4 destination address) - 4 consecutive bytes. */
        {
            /* rte_flow uses a bit mask for IPv4 addresses */
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint32_t),
            .field_index = DST_FIELD_IPV4,
            .input_index = DST_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                offsetof(struct ipv4_hdr, dst_addr),
        },
        /*
         * Next 2 fields (src & dst ports) form 4 consecutive bytes.
         * They share the same input index.
         */
        {
            /* rte_flow uses a bit mask for protocol ports */
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint16_t),
            .field_index = SRCP_FIELD_IPV4,
            .input_index = SRCP_DESTP_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                sizeof(struct ipv4_hdr) +
                offsetof(struct tcp_hdr, src_port),
        },
        {
            /* rte_flow uses a bit mask for protocol ports */
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint16_t),
            .field_index = DSTP_FIELD_IPV4,
            .input_index = SRCP_DESTP_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                sizeof(struct ipv4_hdr) +
                offsetof(struct tcp_hdr, dst_port),
        },
    };
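
The ``.offset`` values above are byte offsets from the start of the Ethernet
frame. The sketch below works them out with stand-in header structs so the
numbers can be checked without DPDK. The ``demo_`` types are hypothetical and
assume the standard 14-byte Ethernet header and a 20-byte IPv4 header without
options, which is the same assumption the field definitions make by using
``sizeof(struct ipv4_hdr)``.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Stand-ins for the Ethernet and IPv4 headers (hypothetical names). */
    struct demo_ether_hdr {
        uint8_t  dst_mac[6];
        uint8_t  src_mac[6];
        uint16_t ether_type;
    } __attribute__((__packed__));

    struct demo_ipv4_hdr {
        uint8_t  version_ihl;
        uint8_t  type_of_service;
        uint16_t total_length;
        uint16_t packet_id;
        uint16_t fragment_offset;
        uint8_t  time_to_live;
        uint8_t  next_proto_id;
        uint16_t hdr_checksum;
        uint32_t src_addr;
        uint32_t dst_addr;
    } __attribute__((__packed__));

    /* Only the leading port fields of the L4 header are needed here. */
    struct demo_l4_ports {
        uint16_t src_port;
        uint16_t dst_port;
    } __attribute__((__packed__));

    /* Byte offsets from the start of the frame, one per ACL field. */
    enum {
        DEMO_OFF_PROTO = sizeof(struct demo_ether_hdr) +
                         offsetof(struct demo_ipv4_hdr, next_proto_id),
        DEMO_OFF_SRC   = sizeof(struct demo_ether_hdr) +
                         offsetof(struct demo_ipv4_hdr, src_addr),
        DEMO_OFF_DST   = sizeof(struct demo_ether_hdr) +
                         offsetof(struct demo_ipv4_hdr, dst_addr),
        DEMO_OFF_SRCP  = sizeof(struct demo_ether_hdr) +
                         sizeof(struct demo_ipv4_hdr) +
                         offsetof(struct demo_l4_ports, src_port),
        DEMO_OFF_DSTP  = sizeof(struct demo_ether_hdr) +
                         sizeof(struct demo_ipv4_hdr) +
                         offsetof(struct demo_l4_ports, dst_port)
    };

    /* Read the L4 protocol byte from a raw Ethernet/IPv4 frame. */
    uint8_t
    demo_frame_proto(const uint8_t *frame)
    {
        return frame[DEMO_OFF_PROTO];
    }

Under these assumptions the five fields sit at bytes 23, 26, 30, 34 and 36 of
the frame. The two port fields are adjacent, which is why they can share one
4-byte input index.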

The Main Function
~~~~~~~~~~~~~~~~~

The ``main()`` function performs the initialization and calls the execution
threads for each lcore.

The first task is to initialize the Environment Abstraction Layer (EAL).
The ``argc`` and ``argv`` arguments are provided to the ``rte_eal_init()``
function. The value returned is the number of parsed arguments:

.. code-block:: c

    int ret = rte_eal_init(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");
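
Because the return value is the number of arguments the EAL consumed, DPDK
applications commonly advance ``argc`` and ``argv`` by that amount before
parsing their own options. The sketch below illustrates the idea without DPDK:
``demo_eal_init()`` is a hypothetical stand-in that only mimics the "returns
the number of parsed arguments" behaviour by stopping at the ``--`` separator,
and ``demo_skip_eal_args()`` is an assumed helper name. The real
``rte_eal_init()`` does far more than this.

.. code-block:: c

    #include <string.h>

    /*
     * Hypothetical stand-in for rte_eal_init(): treats everything before
     * the "--" separator as EAL arguments and returns how many it parsed.
     */
    int
    demo_eal_init(int argc, char **argv)
    {
        int i;

        if (argc < 1)
            return -1;

        for (i = 1; i < argc; i++) {
            if (strcmp(argv[i], "--") == 0) {
                /* Keep a program name in front of the remaining options. */
                argv[i] = argv[0];
                return i;
            }
        }

        /* No separator: every argument belonged to the EAL. */
        argv[argc - 1] = argv[0];
        return argc - 1;
    }

    /* Skip the EAL arguments so application parsing starts at its own. */
    int
    demo_skip_eal_args(int *argc, char ***argv)
    {
        int ret = demo_eal_init(*argc, *argv);

        if (ret < 0)
            return ret;
        *argc -= ret;
        *argv += ret;
        return ret;
    }

For the command line shown in "Running the Application", five arguments are
consumed, leaving the program name and the ``--rule_ipv4`` option for the
application's own parser.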

It then parses the ``flow_classify`` application arguments:

.. code-block:: c

    ret = parse_args(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Invalid flow_classify parameters\n");

The ``main()`` function also allocates a mempool to hold the mbufs
(Message Buffers) used by the application:

.. code-block:: c

    mbuf_pool = rte_mempool_create("MBUF_POOL",
                                   NUM_MBUFS * nb_ports,
                                   MBUF_SIZE,
                                   MBUF_CACHE_SIZE,
                                   sizeof(struct rte_pktmbuf_pool_private),
                                   rte_pktmbuf_pool_init, NULL,
                                   rte_pktmbuf_init, NULL,
                                   rte_socket_id(),
                                   0);

mbufs are the packet buffer structure used by DPDK. They are explained in
detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*.

The ``main()`` function also initializes all the ports using the user defined
``port_init()`` function which is explained in the next section:

.. code-block:: c

    for (portid = 0; portid < nb_ports; portid++) {
        if (port_init(portid, mbuf_pool) != 0) {
            rte_exit(EXIT_FAILURE,
                     "Cannot init port %" PRIu8 "\n", portid);
        }
    }

The ``main()`` function creates the ``flow classifier object`` and adds an
``ACL table`` to the flow classifier:

.. code-block:: c

    struct flow_classifier {
        struct rte_flow_classifier *cls;
        uint32_t table_id[RTE_FLOW_CLASSIFY_TABLE_MAX];
    };

    struct flow_classifier_acl {
        struct flow_classifier cls;
    } __rte_cache_aligned;

    /* Memory allocation */
    size = RTE_CACHE_LINE_ROUNDUP(sizeof(struct flow_classifier_acl));
    cls_app = rte_zmalloc(NULL, size, RTE_CACHE_LINE_SIZE);
    if (cls_app == NULL)
        rte_exit(EXIT_FAILURE, "Cannot allocate classifier memory\n");

    cls_params.name = "flow_classifier";
    cls_params.socket_id = socket_id;
    cls_params.type = RTE_FLOW_CLASSIFY_TABLE_TYPE_ACL;

    cls_app->cls = rte_flow_classifier_create(&cls_params);
    if (cls_app->cls == NULL) {
        rte_free(cls_app);
        rte_exit(EXIT_FAILURE, "Cannot create classifier\n");
    }

    /* initialise ACL table params */
    table_acl_params.name = "table_acl_ipv4_5tuple";
    table_acl_params.n_rule_fields = RTE_DIM(ipv4_defs);
    table_acl_params.n_rules = FLOW_CLASSIFY_MAX_RULE_NUM;
    memcpy(table_acl_params.field_format, ipv4_defs, sizeof(ipv4_defs));

    /* initialise table create params */
    cls_table_params.ops = &rte_table_acl_ops;
    cls_table_params.arg_create = &table_acl_params;
    cls_table_params.table_metadata_size = 0;

    ret = rte_flow_classify_table_create(cls_app->cls, &cls_table_params,
                  &cls_app->table_id[0]);
    if (ret) {
        rte_flow_classifier_free(cls_app->cls);
        rte_free(cls_app);
        rte_exit(EXIT_FAILURE, "Failed to create classifier table\n");
    }

It then reads the ``ipv4_rules_file.txt`` file and initialises the parameters
for the ``rte_flow_classify_table_entry_add`` API.
This API adds a rule to the ACL table.

.. code-block:: c

    if (add_rules(parm_config.rule_ipv4_name)) {
        rte_flow_classifier_free(cls_app->cls);
        rte_free(cls_app);
        rte_exit(EXIT_FAILURE, "Failed to add rules\n");
    }

Once the initialization is complete, the application is ready to launch a
function on an lcore. In this example ``lcore_main()`` is called on a single
lcore.

.. code-block:: c

    lcore_main(cls_app);

The ``lcore_main()`` function is explained below.

The Port Initialization Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The main functional part of the port initialization, which follows the Basic
Forwarding (skeleton) application, is shown below:

.. code-block:: c

    static inline int
    port_init(uint8_t port, struct rte_mempool *mbuf_pool)
    {
        struct rte_eth_conf port_conf = port_conf_default;
        const uint16_t rx_rings = 1, tx_rings = 1;
        struct ether_addr addr;
        int retval;
        uint16_t q;

        if (port >= rte_eth_dev_count())
            return -1;

        /* Configure the Ethernet device. */
        retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
        if (retval != 0)
            return retval;

        /* Allocate and set up 1 RX queue per Ethernet port. */
        for (q = 0; q < rx_rings; q++) {
            retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
                    rte_eth_dev_socket_id(port), NULL, mbuf_pool);
            if (retval < 0)
                return retval;
        }

        /* Allocate and set up 1 TX queue per Ethernet port. */
        for (q = 0; q < tx_rings; q++) {
            retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
                    rte_eth_dev_socket_id(port), NULL);
            if (retval < 0)
                return retval;
        }

        /* Start the Ethernet port. */
        retval = rte_eth_dev_start(port);
        if (retval < 0)
            return retval;

        /* Display the port MAC address. */
        rte_eth_macaddr_get(port, &addr);
        printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
               " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n",
               port,
               addr.addr_bytes[0], addr.addr_bytes[1],
               addr.addr_bytes[2], addr.addr_bytes[3],
               addr.addr_bytes[4], addr.addr_bytes[5]);

        /* Enable RX in promiscuous mode for the Ethernet device. */
        rte_eth_promiscuous_enable(port);

        return 0;
    }

The Ethernet ports are configured with default settings using the
``rte_eth_dev_configure()`` function and the ``port_conf_default`` struct.

.. code-block:: c

    static const struct rte_eth_conf port_conf_default = {
        .rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN }
    };

For this example the ports are set up with 1 RX and 1 TX queue using the
``rte_eth_rx_queue_setup()`` and ``rte_eth_tx_queue_setup()`` functions.

The Ethernet port is then started:

.. code-block:: c

    retval = rte_eth_dev_start(port);

Finally, the RX port is set in promiscuous mode:

.. code-block:: c

    rte_eth_promiscuous_enable(port);

The Add Rules function
~~~~~~~~~~~~~~~~~~~~~~

The ``add_rules`` function reads the ``ipv4_rules_file.txt`` file and calls the
``add_classify_rule`` function which calls the
``rte_flow_classify_table_entry_add`` API.

.. code-block:: c

    static int
    add_rules(const char *rule_path)
    {
        FILE *fh;
        char buff[LINE_MAX];
        unsigned int i = 0;
        unsigned int total_num = 0;
        struct rte_eth_ntuple_filter ntuple_filter;

        fh = fopen(rule_path, "rb");
        if (fh == NULL)
            rte_exit(EXIT_FAILURE, "%s: Open %s failed\n", __func__,
                     rule_path);

        fseek(fh, 0, SEEK_SET);

        i = 0;
        while (fgets(buff, LINE_MAX, fh) != NULL) {
            i++;

            if (is_bypass_line(buff))
                continue;

            if (total_num >= FLOW_CLASSIFY_MAX_RULE_NUM - 1) {
                printf("\nINFO: classify rule capacity %d reached\n",
                       total_num);
                break;
            }

            if (parse_ipv4_5tuple_rule(buff, &ntuple_filter) != 0)
                rte_exit(EXIT_FAILURE,
                         "%s Line %u: parse rules error\n",
                         rule_path, i);

            if (add_classify_rule(&ntuple_filter) != 0)
                rte_exit(EXIT_FAILURE, "add rule error\n");

            total_num++;
        }

        fclose(fh);
        return 0;
    }
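
The ``is_bypass_line()`` helper called in the loop is not shown in this guide.
A minimal sketch of such a filter follows, assuming that a line is skipped
when it starts with ``#`` (as the comment lines in the sample rules file do)
or contains only whitespace. The name ``demo_is_bypass_line()`` and the
``DEMO_COMMENT_LEAD_CHAR`` macro are hypothetical and used only for this
illustration.

.. code-block:: c

    #include <ctype.h>
    #include <stddef.h>

    /* Assumed comment marker, matching the sample rules file. */
    #define DEMO_COMMENT_LEAD_CHAR '#'

    /* Return 1 for comment or blank lines, 0 for lines to be parsed. */
    int
    demo_is_bypass_line(const char *buff)
    {
        size_t i;

        /* Comment line. */
        if (buff[0] == DEMO_COMMENT_LEAD_CHAR)
            return 1;

        /* Any non-whitespace character means the line carries a rule. */
        for (i = 0; buff[i] != '\0'; i++)
            if (!isspace((unsigned char)buff[i]))
                return 0;

        /* Empty or whitespace-only line. */
        return 1;
    }

Filtering before parsing keeps the rule parser simple, since it then only ever
sees lines that are expected to match the rule format.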


The Lcore Main function
~~~~~~~~~~~~~~~~~~~~~~~

As shown above, the ``main()`` function calls ``lcore_main()`` on a single
lcore.
The ``lcore_main()`` function calls the ``rte_flow_classifier_query`` API.
For the Flow Classify application the ``lcore_main()`` function looks like the
following:

.. code-block:: c

    /* flow classify data */
    static int num_classify_rules;
    static struct rte_flow_classify_rule *rules[MAX_NUM_CLASSIFY];
    static struct rte_flow_classify_ipv4_5tuple_stats ntuple_stats;
    static struct rte_flow_classify_stats classify_stats = {
            .stats = (void *)&ntuple_stats
    };

    static __attribute__((noreturn)) void
    lcore_main(struct flow_classifier *cls_app)
    {
        const uint8_t nb_ports = rte_eth_dev_count();
        uint8_t port;
        int ret;
        int i;

        /*
         * Check that the port is on the same NUMA node as the polling thread
         * for best performance.
         */
        for (port = 0; port < nb_ports; port++)
            if (rte_eth_dev_socket_id(port) > 0 &&
                rte_eth_dev_socket_id(port) != (int)rte_socket_id()) {
                printf("\n\n");
                printf("WARNING: port %u is on remote NUMA node\n",
                       port);
                printf("to polling thread.\n");
                printf("Performance will not be optimal.\n");
            }

        printf("\nCore %u forwarding packets. \n",
               rte_lcore_id());
        printf("[Ctrl+C to quit]\n");

        /* Run until the application is quit or killed. */
        for (;;) {
            /*
             * Receive packets on a port and forward them on the paired
             * port. The mapping is 0 -> 1, 1 -> 0, 2 -> 3, 3 -> 2, etc.
             */
            for (port = 0; port < nb_ports; port++) {

                /* Get burst of RX packets, from first port of pair. */
                struct rte_mbuf *bufs[BURST_SIZE];
                const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
                        bufs, BURST_SIZE);

                if (unlikely(nb_rx == 0))
                    continue;

                /* Query every installed rule against the received burst. */
                for (i = 0; i < MAX_NUM_CLASSIFY; i++) {
                    if (rules[i]) {
                        ret = rte_flow_classifier_query(
                            cls_app->cls,
                            cls_app->table_id[0],
                            bufs, nb_rx, rules[i],
                            &classify_stats);
                        if (ret)
                            printf(
                                "rule [%d] query failed ret [%d]\n\n",
                                i, ret);
                        else {
                            printf(
                                "rule[%d] count=%"PRIu64"\n",
                                i, ntuple_stats.counter1);

                            printf("proto = %d\n",
                                ntuple_stats.ipv4_5tuple.proto);
                        }
                    }
                }

                /* Send burst of TX packets, to second port of pair. */
                const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
                        bufs, nb_rx);

                /* Free any unsent packets. */
                if (unlikely(nb_tx < nb_rx)) {
                    uint16_t buf;
                    for (buf = nb_tx; buf < nb_rx; buf++)
                        rte_pktmbuf_free(bufs[buf]);
                }
            }
        }
    }

The main work of the application is done within the loop. The forwarding part
of that loop, with the classifier query left out, is:

.. code-block:: c

    for (;;) {
        for (port = 0; port < nb_ports; port++) {

            /* Get burst of RX packets, from first port of pair. */
            struct rte_mbuf *bufs[BURST_SIZE];
            const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
                    bufs, BURST_SIZE);

            if (unlikely(nb_rx == 0))
                continue;

            /* Send burst of TX packets, to second port of pair. */
            const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
                    bufs, nb_rx);

            /* Free any unsent packets. */
            if (unlikely(nb_tx < nb_rx)) {
                uint16_t buf;
                for (buf = nb_tx; buf < nb_rx; buf++)
                    rte_pktmbuf_free(bufs[buf]);
            }
        }
    }

Packets are received in bursts on the RX ports and transmitted in bursts on
the TX ports. The ports are grouped in pairs with a simple mapping scheme
using an XOR on the port number::

    0 -> 1
    1 -> 0

    2 -> 3
    3 -> 2

    etc.
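
The pairing can be checked in isolation. The sketch below reproduces the
``port ^ 1`` expression from the loop. The function names are hypothetical,
and ``demo_pair_exists()`` is an added illustration of what happens with an
odd number of ports, which the loop itself does not check.

.. code-block:: c

    #include <stdint.h>

    /* Flip the lowest bit: 0 <-> 1, 2 <-> 3, 4 <-> 5, ... */
    uint16_t
    demo_paired_port(uint16_t port)
    {
        return (uint16_t)(port ^ 1);
    }

    /* With an odd port count the last port has no partner. */
    int
    demo_pair_exists(uint16_t port, uint16_t nb_ports)
    {
        return demo_paired_port(port) < nb_ports;
    }

Applying the mapping twice returns the original port, so each pair forwards
traffic in both directions.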

The ``rte_eth_tx_burst()`` function frees the memory buffers of packets that
are transmitted. If packets fail to transmit, ``(nb_tx < nb_rx)``, then they
must be freed explicitly using ``rte_pktmbuf_free()``.
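
The ownership rule can be sketched without DPDK by tracking what happens to
each buffer. Everything below is a stand-in: ``struct demo_mbuf``,
``demo_tx_burst()`` and ``demo_forward_burst()`` are hypothetical names, and
the ``room`` parameter is an assumed way of modelling a TX queue that accepts
only part of a burst.

.. code-block:: c

    #include <stdint.h>

    enum demo_state { DEMO_HELD, DEMO_SENT, DEMO_FREED };

    /* Hypothetical stand-in for an mbuf; it only records its fate. */
    struct demo_mbuf {
        enum demo_state state;
    };

    /* Stand-in TX burst: accepts at most `room` packets, returns the count. */
    static uint16_t
    demo_tx_burst(struct demo_mbuf **bufs, uint16_t nb_pkts, uint16_t room)
    {
        const uint16_t n = nb_pkts < room ? nb_pkts : room;
        uint16_t i;

        for (i = 0; i < n; i++)
            bufs[i]->state = DEMO_SENT;  /* ownership passes to the driver */
        return n;
    }

    /* Send a burst and release whatever the TX queue did not accept. */
    uint16_t
    demo_forward_burst(struct demo_mbuf **bufs, uint16_t nb_rx, uint16_t room)
    {
        const uint16_t nb_tx = demo_tx_burst(bufs, nb_rx, room);
        uint16_t buf;

        /* Stands in for the rte_pktmbuf_free() loop in lcore_main(). */
        for (buf = nb_tx; buf < nb_rx; buf++)
            bufs[buf]->state = DEMO_FREED;
        return nb_tx;
    }

After the call every buffer is either sent or freed. Skipping the second loop
would leave the rejected buffers held by nobody, which in the real application
would drain the mempool.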

The forwarding loop can be interrupted and the application closed using
``Ctrl-C``.
