..  BSD LICENSE
    Copyright(c) 2017 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Flow Classify Sample Application
================================

The Flow Classify sample application is based on the simple *skeleton* example
of a forwarding application.

It demonstrates the basic components of a DPDK forwarding application that
uses the Flow Classify library APIs.

Please refer to the
:doc:`../prog_guide/flow_classify_lib`
for more information.

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``flow_classify`` sub-directory.

Running the Application
-----------------------

To run the example in a ``linuxapp`` environment:

.. code-block:: console

    cd ~/dpdk/examples/flow_classify
    ./build/flow_classify -c 4 -n 4 -- --rule_ipv4="../ipv4_rules_file.txt"

Please refer to the *DPDK Getting Started Guide*, section
:doc:`../linux_gsg/build_sample_apps`
for general information on running applications and the Environment Abstraction
Layer (EAL) options.
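
In the command line above, ``-c`` takes a hexadecimal bit mask of the cores to
run on and ``-n`` the number of memory channels. As an illustration only, the
standalone sketch below decodes such a mask into lcore ids. The helper
``coremask_to_lcores()`` is hypothetical and is not part of DPDK; the EAL does
this parsing internally.

.. code-block:: c

    #include <errno.h>
    #include <stdlib.h>

    /*
     * Hypothetical helper (not a DPDK function): decode a hexadecimal core
     * mask, such as the argument of "-c", into a list of lcore ids.
     * Returns the number of ids written, or -1 if the mask is not valid hex.
     */
    static int
    coremask_to_lcores(const char *mask, unsigned int *lcores, int max)
    {
        char *end;
        unsigned long long bits;
        unsigned int id;
        int n = 0;

        errno = 0;
        bits = strtoull(mask, &end, 16);
        if (errno != 0 || end == mask || *end != '\0')
            return -1;

        /* Bit i of the mask selects lcore i. */
        for (id = 0; bits != 0 && n < max; id++, bits >>= 1)
            if (bits & 1)
                lcores[n++] = id;

        return n;
    }

With the mask ``4`` (binary ``100``) the sketch reports a single lcore, id 2.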

Sample ipv4_rules_file.txt
--------------------------

.. code-block:: console

    #file format:
    #src_ip/masklen dst_ip/masklen src_port : mask dst_port : mask proto/mask priority
    #
    2.2.2.3/24 2.2.2.7/24 32 : 0xffff 33 : 0xffff 17/0xff 0
    9.9.9.3/24 9.9.9.7/24 32 : 0xffff 33 : 0xffff 17/0xff 1
    9.9.9.3/24 9.9.9.7/24 32 : 0xffff 33 : 0xffff 6/0xff 2
    9.9.8.3/24 9.9.8.7/24 32 : 0xffff 33 : 0xffff 6/0xff 3
    6.7.8.9/24 2.3.4.5/24 32 : 0x0000 33 : 0x0000 132/0xff 4
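
To make the file format concrete, the following standalone sketch parses one
rule line with ``sscanf()`` and converts each prefix length into a bit mask.
It is not the application's own parser: ``struct rule_line``,
``prefix_to_mask()`` and ``parse_rule_line()`` are hypothetical names, and
storing addresses in host byte order is an assumption made for readability.

.. code-block:: c

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical container for one parsed rule; not a DPDK type. */
    struct rule_line {
        uint32_t src_ip, src_mask;          /* host byte order (assumption) */
        uint32_t dst_ip, dst_mask;
        uint16_t src_port, src_port_mask;
        uint16_t dst_port, dst_port_mask;
        uint8_t proto, proto_mask;
        unsigned int priority;
    };

    /* Convert a prefix length (0..32) into a mask, e.g. 24 -> 0xffffff00. */
    static uint32_t
    prefix_to_mask(unsigned int len)
    {
        return len == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - len));
    }

    /* Parse one rule line; returns 0 on success, -1 on a malformed line. */
    static int
    parse_rule_line(const char *line, struct rule_line *r)
    {
        unsigned int s[4], d[4], slen, dlen;
        unsigned int sp, spm, dp, dpm, proto, pm, prio;
        int i;

        if (sscanf(line,
                   "%u.%u.%u.%u/%u %u.%u.%u.%u/%u %u : %x %u : %x %u/%x %u",
                   &s[0], &s[1], &s[2], &s[3], &slen,
                   &d[0], &d[1], &d[2], &d[3], &dlen,
                   &sp, &spm, &dp, &dpm, &proto, &pm, &prio) != 17)
            return -1;

        /* Range checks: octets, prefix lengths, 16-bit ports, 8-bit proto. */
        for (i = 0; i < 4; i++)
            if (s[i] > 255 || d[i] > 255)
                return -1;
        if (slen > 32 || dlen > 32 || sp > 0xffff || spm > 0xffff ||
            dp > 0xffff || dpm > 0xffff || proto > 0xff || pm > 0xff)
            return -1;

        r->src_ip = (s[0] << 24) | (s[1] << 16) | (s[2] << 8) | s[3];
        r->dst_ip = (d[0] << 24) | (d[1] << 16) | (d[2] << 8) | d[3];
        r->src_mask = prefix_to_mask(slen);
        r->dst_mask = prefix_to_mask(dlen);
        r->src_port = (uint16_t)sp;
        r->src_port_mask = (uint16_t)spm;
        r->dst_port = (uint16_t)dp;
        r->dst_port_mask = (uint16_t)dpm;
        r->proto = (uint8_t)proto;
        r->proto_mask = (uint8_t)pm;
        r->priority = prio;
        return 0;
    }

Comment lines such as ``#file format:`` do not match the pattern, so the sketch
rejects them with ``-1``; a caller would normally skip them before parsing.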

Explanation
-----------

The following sections provide an explanation of the main components of the
code.

All DPDK library functions used in the sample code are prefixed with ``rte_``
and are explained in detail in the *DPDK API Documentation*.

ACL field definitions for the IPv4 5 tuple rule
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following field definitions are used when creating the ACL table during
initialisation of the ``Flow Classify`` application.

.. code-block:: c

    enum {
        PROTO_FIELD_IPV4,
        SRC_FIELD_IPV4,
        DST_FIELD_IPV4,
        SRCP_FIELD_IPV4,
        DSTP_FIELD_IPV4,
        NUM_FIELDS_IPV4
    };

    enum {
        PROTO_INPUT_IPV4,
        SRC_INPUT_IPV4,
        DST_INPUT_IPV4,
        SRCP_DESTP_INPUT_IPV4
    };

    static struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {
        /* first input field - always one byte long. */
        {
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint8_t),
            .field_index = PROTO_FIELD_IPV4,
            .input_index = PROTO_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                offsetof(struct ipv4_hdr, next_proto_id),
        },
        /* next input field (IPv4 source address) - 4 consecutive bytes. */
        {
            /* rte_flow uses a bit mask for IPv4 addresses */
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint32_t),
            .field_index = SRC_FIELD_IPV4,
            .input_index = SRC_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                offsetof(struct ipv4_hdr, src_addr),
        },
        /* next input field (IPv4 destination address) - 4 consecutive bytes. */
        {
            /* rte_flow uses a bit mask for IPv4 addresses */
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint32_t),
            .field_index = DST_FIELD_IPV4,
            .input_index = DST_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                offsetof(struct ipv4_hdr, dst_addr),
        },
        /*
         * Next 2 fields (src & dst ports) form 4 consecutive bytes.
         * They share the same input index.
         */
        {
            /* rte_flow uses a bit mask for protocol ports */
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint16_t),
            .field_index = SRCP_FIELD_IPV4,
            .input_index = SRCP_DESTP_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                sizeof(struct ipv4_hdr) +
                offsetof(struct tcp_hdr, src_port),
        },
        {
            /* rte_flow uses a bit mask for protocol ports */
            .type = RTE_ACL_FIELD_TYPE_BITMASK,
            .size = sizeof(uint16_t),
            .field_index = DSTP_FIELD_IPV4,
            .input_index = SRCP_DESTP_INPUT_IPV4,
            .offset = sizeof(struct ether_hdr) +
                sizeof(struct ipv4_hdr) +
                offsetof(struct tcp_hdr, dst_port),
        },
    };
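
The ``.offset`` values above are byte offsets from the start of the Ethernet
frame. The standalone sketch below reproduces the arithmetic with simplified
stand-in structures so that the resulting numbers can be checked without DPDK.
The ``*_sketch`` types are assumptions that mirror the standard Ethernet and
IPv4 header layouts; they are not the DPDK definitions. Like the definitions
above, the port offsets assume an untagged frame and a 20-byte IPv4 header
without options.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Simplified stand-ins for the header structures (assumed layouts). */
    struct eth_hdr_sketch {
        uint8_t  d_addr[6];
        uint8_t  s_addr[6];
        uint16_t ether_type;
    };

    struct ipv4_hdr_sketch {
        uint8_t  version_ihl;
        uint8_t  type_of_service;
        uint16_t total_length;
        uint16_t packet_id;
        uint16_t fragment_offset;
        uint8_t  time_to_live;
        uint8_t  next_proto_id;
        uint16_t hdr_checksum;
        uint32_t src_addr;
        uint32_t dst_addr;
    };

    /* TCP and UDP both start with the source and destination ports. */
    struct l4_ports_sketch {
        uint16_t src_port;
        uint16_t dst_port;
    };

    /* Byte offsets from the start of the frame for each of the five fields. */
    enum {
        OFF_PROTO = sizeof(struct eth_hdr_sketch) +
                    offsetof(struct ipv4_hdr_sketch, next_proto_id),
        OFF_SRC   = sizeof(struct eth_hdr_sketch) +
                    offsetof(struct ipv4_hdr_sketch, src_addr),
        OFF_DST   = sizeof(struct eth_hdr_sketch) +
                    offsetof(struct ipv4_hdr_sketch, dst_addr),
        OFF_SRCP  = sizeof(struct eth_hdr_sketch) +
                    sizeof(struct ipv4_hdr_sketch) +
                    offsetof(struct l4_ports_sketch, src_port),
        OFF_DSTP  = sizeof(struct eth_hdr_sketch) +
                    sizeof(struct ipv4_hdr_sketch) +
                    offsetof(struct l4_ports_sketch, dst_port)
    };

Under these assumptions the protocol byte sits at offset 23, the addresses at
26 and 30, and the two ports at 34 and 36, which is why the ports form four
consecutive bytes and can share one input index.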

The Main Function
~~~~~~~~~~~~~~~~~

The ``main()`` function performs the initialization and calls the execution
threads for each lcore.

The first task is to initialize the Environment Abstraction Layer (EAL).
The ``argc`` and ``argv`` arguments are provided to the ``rte_eal_init()``
function. The value returned is the number of parsed arguments:

.. code-block:: c

    int ret = rte_eal_init(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");
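
It then parses the flow_classify application arguments, which follow the EAL
arguments on the command line:

.. code-block:: c

    /* Skip the arguments already consumed by rte_eal_init(). */
    argc -= ret;
    argv += ret;
    ret = parse_args(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Invalid flow_classify parameters\n");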
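
The body of ``parse_args()`` is not shown in this guide. As a rough sketch of
how the ``--rule_ipv4`` option from the command line above could be handled,
the following standalone example uses ``getopt_long()``. The names
``parse_app_args()`` and ``rule_ipv4_path`` are hypothetical, and the use of
``getopt_long()`` is an assumption rather than a description of the
application's code.

.. code-block:: c

    #include <getopt.h>
    #include <stddef.h>

    /* Hypothetical storage for the rules file path. */
    static const char *rule_ipv4_path;

    /*
     * Parse the application arguments (those left after the EAL ones).
     * Returns 0 if --rule_ipv4 was given, -1 otherwise.
     */
    static int
    parse_app_args(int argc, char **argv)
    {
        static const struct option lgopts[] = {
            { "rule_ipv4", required_argument, NULL, 0 },
            { NULL, 0, NULL, 0 }
        };
        int opt;

        optind = 1;             /* restart scanning at the first argument */
        rule_ipv4_path = NULL;

        while ((opt = getopt_long(argc, argv, "", lgopts, NULL)) != -1) {
            if (opt != 0)       /* unknown option or missing value */
                return -1;
            rule_ipv4_path = optarg;    /* the only long option defined */
        }

        return rule_ipv4_path != NULL ? 0 : -1;
    }

Treating a missing ``--rule_ipv4`` as an error is a design choice of this
sketch: without a rules file there would be nothing to classify.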

The ``main()`` function also allocates a mempool to hold the mbufs
(Message Buffers) used by the application:

.. code-block:: c

    mbuf_pool = rte_mempool_create("MBUF_POOL",
                                   NUM_MBUFS * nb_ports,
                                   MBUF_SIZE,
                                   MBUF_CACHE_SIZE,
                                   sizeof(struct rte_pktmbuf_pool_private),
                                   rte_pktmbuf_pool_init, NULL,
                                   rte_pktmbuf_init, NULL,
                                   rte_socket_id(),
                                   0);

Mbufs are the packet buffer structures used by DPDK. They are explained in
detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*.

The ``main()`` function also initializes all the ports using the user-defined
``port_init()`` function, which is explained in a later section:

.. code-block:: c

    for (portid = 0; portid < nb_ports; portid++) {
        if (port_init(portid, mbuf_pool) != 0) {
            rte_exit(EXIT_FAILURE,
                     "Cannot init port %" PRIu8 "\n", portid);
        }
    }

The ``main()`` function creates the ``flow classifier object`` and adds an ``ACL
table`` to the flow classifier.

.. code-block:: c

    struct flow_classifier {
        struct rte_flow_classifier *cls;
    };

    struct flow_classifier_acl {
        struct flow_classifier cls;
    } __rte_cache_aligned;

    /* Memory allocation */
    size = RTE_CACHE_LINE_ROUNDUP(sizeof(struct flow_classifier_acl));
    cls_app = rte_zmalloc(NULL, size, RTE_CACHE_LINE_SIZE);
    if (cls_app == NULL)
        rte_exit(EXIT_FAILURE, "Cannot allocate classifier memory\n");

    cls_params.name = "flow_classifier";
    cls_params.socket_id = socket_id;

    cls_app->cls = rte_flow_classifier_create(&cls_params);
    if (cls_app->cls == NULL) {
        rte_free(cls_app);
        rte_exit(EXIT_FAILURE, "Cannot create classifier\n");
    }

    /* initialise ACL table params */
    table_acl_params.name = "table_acl_ipv4_5tuple";
    table_acl_params.n_rule_fields = RTE_DIM(ipv4_defs);
    table_acl_params.n_rules = FLOW_CLASSIFY_MAX_RULE_NUM;
    memcpy(table_acl_params.field_format, ipv4_defs, sizeof(ipv4_defs));

    /* initialise table create params */
    cls_table_params.ops = &rte_table_acl_ops;
    cls_table_params.arg_create = &table_acl_params;
    cls_table_params.type = RTE_FLOW_CLASSIFY_TABLE_ACL_IP4_5TUPLE;

    ret = rte_flow_classify_table_create(cls_app->cls, &cls_table_params);
    if (ret) {
        rte_flow_classifier_free(cls_app->cls);
        rte_free(cls_app);
        rte_exit(EXIT_FAILURE, "Failed to create classifier table\n");
    }
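
The allocation above rounds the structure size up to a whole number of cache
lines before calling ``rte_zmalloc()``. The rounding itself is plain integer
arithmetic, sketched below outside of DPDK. The helper name
``cache_line_roundup()`` is hypothetical and the 64-byte line size is an
assumption; in DPDK the value comes from ``RTE_CACHE_LINE_SIZE``.

.. code-block:: c

    #include <stddef.h>

    #define CACHE_LINE_SIZE 64  /* assumption: 64-byte cache lines */

    /* Round size up to the next multiple of CACHE_LINE_SIZE. */
    static size_t
    cache_line_roundup(size_t size)
    {
        return (size + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE * CACHE_LINE_SIZE;
    }

A structure holding a single pointer therefore occupies one full cache line in
this sketch, so that it does not share a line with unrelated data.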

It then reads the ``ipv4_rules_file.txt`` file and initialises the parameters
for the ``rte_flow_classify_table_entry_add`` API.
This API adds a rule to the ACL table.

.. code-block:: c

    if (add_rules(parm_config.rule_ipv4_name)) {
        rte_flow_classifier_free(cls_app->cls);
        rte_free(cls_app);
        rte_exit(EXIT_FAILURE, "Failed to add rules\n");
    }

Once the initialization is complete, the application is ready to launch a
function on an lcore. In this example ``lcore_main()`` is called on a single
lcore.

.. code-block:: c

    lcore_main(cls_app);

The ``lcore_main()`` function is explained below.

The Port Initialization Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The main functional part of the port initialization used in the Flow Classify
application is shown below:

.. code-block:: c

    static inline int
    port_init(uint8_t port, struct rte_mempool *mbuf_pool)
    {
        struct rte_eth_conf port_conf = port_conf_default;
        const uint16_t rx_rings = 1, tx_rings = 1;
        struct ether_addr addr;
        int retval;
        uint16_t q;

        if (port >= rte_eth_dev_count())
            return -1;

        /* Configure the Ethernet device. */
        retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
        if (retval != 0)
            return retval;

        /* Allocate and set up 1 RX queue per Ethernet port. */
        for (q = 0; q < rx_rings; q++) {
            retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
                    rte_eth_dev_socket_id(port), NULL, mbuf_pool);
            if (retval < 0)
                return retval;
        }

        /* Allocate and set up 1 TX queue per Ethernet port. */
        for (q = 0; q < tx_rings; q++) {
            retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
                    rte_eth_dev_socket_id(port), NULL);
            if (retval < 0)
                return retval;
        }

        /* Start the Ethernet port. */
        retval = rte_eth_dev_start(port);
        if (retval < 0)
            return retval;

        /* Display the port MAC address. */
        rte_eth_macaddr_get(port, &addr);
        printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
               " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n",
               port,
               addr.addr_bytes[0], addr.addr_bytes[1],
               addr.addr_bytes[2], addr.addr_bytes[3],
               addr.addr_bytes[4], addr.addr_bytes[5]);

        /* Enable RX in promiscuous mode for the Ethernet device. */
        rte_eth_promiscuous_enable(port);

        return 0;
    }

The Ethernet ports are configured with default settings using the
``rte_eth_dev_configure()`` function and the ``port_conf_default`` struct.

.. code-block:: c

    static const struct rte_eth_conf port_conf_default = {
        .rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN }
    };

For this example the ports are set up with 1 RX and 1 TX queue using the
``rte_eth_rx_queue_setup()`` and ``rte_eth_tx_queue_setup()`` functions.

The Ethernet port is then started:

.. code-block:: c

    retval = rte_eth_dev_start(port);

Finally, the RX port is set in promiscuous mode:

.. code-block:: c

    rte_eth_promiscuous_enable(port);

The Add Rules Function
~~~~~~~~~~~~~~~~~~~~~~

The ``add_rules`` function reads the ``ipv4_rules_file.txt`` file and calls the
``add_classify_rule`` function which calls the
``rte_flow_classify_table_entry_add`` API.

.. code-block:: c

    static int
    add_rules(const char *rule_path)
    {
        FILE *fh;
        char buff[LINE_MAX];
        unsigned int i = 0;
        unsigned int total_num = 0;
        struct rte_eth_ntuple_filter ntuple_filter;

        fh = fopen(rule_path, "rb");
        if (fh == NULL)
            rte_exit(EXIT_FAILURE, "%s: Open %s failed\n", __func__,
                     rule_path);

        fseek(fh, 0, SEEK_SET);

        i = 0;
        while (fgets(buff, LINE_MAX, fh) != NULL) {
            i++;

            if (is_bypass_line(buff))
                continue;

            if (total_num >= FLOW_CLASSIFY_MAX_RULE_NUM - 1) {
                printf("\nINFO: classify rule capacity %u reached\n",
                       total_num);
                break;
            }

            if (parse_ipv4_5tuple_rule(buff, &ntuple_filter) != 0)
                rte_exit(EXIT_FAILURE,
                         "%s Line %u: parse rules error\n",
                         rule_path, i);

            if (add_classify_rule(&ntuple_filter) != 0)
                rte_exit(EXIT_FAILURE, "add rule error\n");

            total_num++;
        }

        fclose(fh);
        return 0;
    }
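
The ``is_bypass_line()`` helper called above is not listed in this guide. A
minimal sketch of such a filter, assuming that comment lines start with ``#``
(as in the sample rules file) and that blank lines are also skipped, could look
as follows. The name ``is_bypass_line_sketch()`` and the ``COMMENT_LEAD_CHAR``
value are assumptions made for this illustration.

.. code-block:: c

    #include <ctype.h>

    #define COMMENT_LEAD_CHAR '#'   /* assumption, based on the sample file */

    /* Return 1 if the line is a comment or contains only white space. */
    static int
    is_bypass_line_sketch(const char *buff)
    {
        /* Comment line. */
        if (buff[0] == COMMENT_LEAD_CHAR)
            return 1;

        /* Empty line: every character up to the terminator is white space. */
        for (; *buff != '\0'; buff++)
            if (!isspace((unsigned char)*buff))
                return 0;

        return 1;
    }

Skipping such lines before parsing keeps the line counter ``i`` in step with
the file, so the "parse rules error" message still reports the right line.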

The Lcore Main Function
~~~~~~~~~~~~~~~~~~~~~~~

As we saw above, the ``main()`` function calls ``lcore_main()`` on a single
lcore.
The ``lcore_main()`` function calls the ``rte_flow_classifier_query`` API for
each rule on every received burst.
For the Flow Classify application the ``lcore_main()`` function looks like the
following:

.. code-block:: c

    /* flow classify data */
    static int num_classify_rules;
    static struct rte_flow_classify_rule *rules[MAX_NUM_CLASSIFY];
    static struct rte_flow_classify_ipv4_5tuple_stats ntuple_stats;
    static struct rte_flow_classify_stats classify_stats = {
            .stats = (void *)&ntuple_stats
    };

    static __attribute__((noreturn)) void
    lcore_main(struct flow_classifier *cls_app)
    {
        const uint8_t nb_ports = rte_eth_dev_count();
        uint8_t port;
        int ret;
        int i;

        /*
         * Check that the port is on the same NUMA node as the polling thread
         * for best performance.
         */
        for (port = 0; port < nb_ports; port++)
            if (rte_eth_dev_socket_id(port) > 0 &&
                rte_eth_dev_socket_id(port) != (int)rte_socket_id()) {
                printf("\n\n");
                printf("WARNING: port %u is on remote NUMA node\n",
                       port);
                printf("to polling thread.\n");
                printf("Performance will not be optimal.\n");
            }

        printf("\nCore %u forwarding packets.\n", rte_lcore_id());
        printf("[Ctrl+C to quit]\n");

        /* Run until the application is quit or killed. */
        for (;;) {
            /*
             * Receive packets on a port and forward them on the paired
             * port. The mapping is 0 -> 1, 1 -> 0, 2 -> 3, 3 -> 2, etc.
             */
            for (port = 0; port < nb_ports; port++) {

                /* Get burst of RX packets, from first port of pair. */
                struct rte_mbuf *bufs[BURST_SIZE];
                const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
                        bufs, BURST_SIZE);

                if (unlikely(nb_rx == 0))
                    continue;

                for (i = 0; i < MAX_NUM_CLASSIFY; i++) {
                    if (rules[i]) {
                        ret = rte_flow_classifier_query(
                            cls_app->cls,
                            bufs, nb_rx, rules[i],
                            &classify_stats);
                        if (ret)
                            printf(
                                "rule [%d] query failed ret [%d]\n\n",
                                i, ret);
                        else {
                            printf(
                                "rule[%d] count=%"PRIu64"\n",
                                i, ntuple_stats.counter1);

                            printf("proto = %d\n",
                                ntuple_stats.ipv4_5tuple.proto);
                        }
                    }
                }

                /* Send burst of TX packets, to second port of pair. */
                const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
                        bufs, nb_rx);

                /* Free any unsent packets. */
                if (unlikely(nb_tx < nb_rx)) {
                    uint16_t buf;
                    for (buf = nb_tx; buf < nb_rx; buf++)
                        rte_pktmbuf_free(bufs[buf]);
                }
            }
        }
    }
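
The query in the loop above reports, for one rule, a counter for the burst that
was just received. As a conceptual model only, and not the library's
implementation, the following standalone sketch counts how many 5-tuples in a
burst match a rule when every field is compared under a bit mask. The types
``struct tuple5`` and ``struct rule5`` and both function names are
hypothetical.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical types for illustration; not the library's structures. */
    struct tuple5 {
        uint32_t src_ip, dst_ip;        /* host byte order (assumption) */
        uint16_t src_port, dst_port;
        uint8_t  proto;
    };

    struct rule5 {
        struct tuple5 value;
        struct tuple5 mask;
    };

    /* A field matches when packet and rule agree on every masked bit. */
    static int
    rule_matches(const struct rule5 *r, const struct tuple5 *p)
    {
        return ((p->src_ip ^ r->value.src_ip) & r->mask.src_ip) == 0 &&
               ((p->dst_ip ^ r->value.dst_ip) & r->mask.dst_ip) == 0 &&
               ((p->src_port ^ r->value.src_port) & r->mask.src_port) == 0 &&
               ((p->dst_port ^ r->value.dst_port) & r->mask.dst_port) == 0 &&
               ((p->proto ^ r->value.proto) & r->mask.proto) == 0;
    }

    /* Count the packets of a burst that match the rule. */
    static uint64_t
    count_matches(const struct rule5 *r, const struct tuple5 *burst, size_t n)
    {
        uint64_t count = 0;
        size_t i;

        for (i = 0; i < n; i++)
            if (rule_matches(r, &burst[i]))
                count++;

        return count;
    }

With the first rule of the sample file (``2.2.2.3/24 2.2.2.7/24``, ports 32
and 33, protocol 17), a UDP packet from ``2.2.2.9`` to ``2.2.2.1`` with those
ports matches in this model, while the same packet carried over TCP does not.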

The forwarding work of the application is done within the loop, shown here
without the classifier query:

.. code-block:: c

        for (;;) {
            for (port = 0; port < nb_ports; port++) {

                /* Get burst of RX packets, from first port of pair. */
                struct rte_mbuf *bufs[BURST_SIZE];
                const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
                        bufs, BURST_SIZE);

                if (unlikely(nb_rx == 0))
                    continue;

                /* Send burst of TX packets, to second port of pair. */
                const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
                        bufs, nb_rx);

                /* Free any unsent packets. */
                if (unlikely(nb_tx < nb_rx)) {
                    uint16_t buf;
                    for (buf = nb_tx; buf < nb_rx; buf++)
                        rte_pktmbuf_free(bufs[buf]);
                }
            }
        }

Packets are received in bursts on the RX ports and transmitted in bursts on
the TX ports. The ports are grouped in pairs with a simple mapping scheme
using an XOR on the port number::

    0 -> 1
    1 -> 0

    2 -> 3
    3 -> 2

    etc.
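
The mapping can be checked in isolation with a one-line helper. The name
``paired_port()`` is hypothetical; the application simply writes ``port ^ 1``
inline.

.. code-block:: c

    #include <stdint.h>

    /* Flip the lowest bit: even ports map to the next odd port and back. */
    static uint16_t
    paired_port(uint16_t port)
    {
        return port ^ 1;
    }

Because only the lowest bit changes, applying the mapping twice returns the
original port. The scheme assumes an even number of ports; with an odd count
the last port would be paired with a port that does not exist.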

The ``rte_eth_tx_burst()`` function frees the memory buffers of packets that
are transmitted. If packets fail to transmit, ``(nb_tx < nb_rx)``, then they
must be freed explicitly using ``rte_pktmbuf_free()``.
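
This ownership rule can be exercised without any hardware. In the sketch below
``mock_tx_burst()`` is a hypothetical stand-in for ``rte_eth_tx_burst()`` that
accepts at most ``room`` buffers, and plain ``malloc()``/``free()`` stand in
for mbuf allocation. Freeing accepted buffers immediately is a simplification
made for the illustration.

.. code-block:: c

    #include <stdint.h>
    #include <stdlib.h>

    /*
     * Hypothetical stand-in for rte_eth_tx_burst(): takes ownership of (and
     * frees) at most `room` buffers and returns how many it accepted.
     */
    static uint16_t
    mock_tx_burst(void **bufs, uint16_t n, uint16_t room)
    {
        const uint16_t sent = n < room ? n : room;
        uint16_t i;

        for (i = 0; i < sent; i++) {
            free(bufs[i]);
            bufs[i] = NULL;
        }
        return sent;
    }

    /*
     * Forward a burst, then free whatever the TX side did not accept.
     * Returns the number of buffers that had to be dropped.
     */
    static uint16_t
    send_or_drop(void **bufs, uint16_t nb_rx, uint16_t room)
    {
        const uint16_t nb_tx = mock_tx_burst(bufs, nb_rx, room);
        uint16_t dropped = 0;
        uint16_t buf;

        /* Buffers [nb_tx, nb_rx) still belong to the caller. */
        for (buf = nb_tx; buf < nb_rx; buf++) {
            free(bufs[buf]);
            bufs[buf] = NULL;
            dropped++;
        }
        return dropped;
    }

Without the second loop, every burst that the TX queue only partly accepts
would leak the remaining buffers.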

The forwarding loop can be interrupted and the application closed using
``Ctrl-C``.