..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

.. _multi_process_app:

Multi-process Sample Application
================================

This chapter describes the example applications for multi-processing that are included in the DPDK.

Example Applications
--------------------

Building the Sample Applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The multi-process example applications are built in the same way as other sample applications,
and as documented in the *DPDK Getting Started Guide*.

To compile the sample application see :doc:`compiling`.

The applications are located in the ``multi_process`` sub-directory.

.. note::

    If just a specific multi-process application needs to be built,
    the final make command can be run just in that application's directory,
    rather than at the top-level multi-process directory.

Basic Multi-process Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The examples/simple_mp folder in the DPDK release contains a basic example application to demonstrate how
two DPDK processes can work together using queues and memory pools to share information.

Running the Application
^^^^^^^^^^^^^^^^^^^^^^^

To run the application, start one copy of the simple_mp binary in one terminal,
passing at least two cores in the coremask/corelist, as follows:

.. code-block:: console

    ./<build_dir>/examples/dpdk-simple_mp -l 0-1 -n 4 --proc-type=primary

For the first DPDK process run, the proc-type flag can be omitted or set to auto,
since all DPDK processes will default to being a primary instance,
meaning they have control over the hugepage shared memory regions.
The process should start successfully and display a command prompt as follows:

.. code-block:: console

    $ ./<build_dir>/examples/dpdk-simple_mp -l 0-1 -n 4 --proc-type=primary
    EAL: coremask set to 3
    EAL: Detected lcore 0 on socket 0
    EAL: Detected lcore 1 on socket 0
    EAL: Detected lcore 2 on socket 0
    EAL: Detected lcore 3 on socket 0
    ...

    EAL: Requesting 2 pages of size 1073741824
    EAL: Requesting 768 pages of size 2097152
    EAL: Ask a virtual area of 0x40000000 bytes
    EAL: Virtual area found at 0x7ff200000000 (size = 0x40000000)
    ...

    EAL: check module finished
    EAL: Main core 0 is ready (tid=54e41820)
    EAL: Core 1 is ready (tid=53b32700)

    Starting core 1

    simple_mp >

To run the secondary process to communicate with the primary process,
again run the same binary setting at least two cores in the coremask/corelist:

.. code-block:: console

    ./<build_dir>/examples/dpdk-simple_mp -l 2-3 -n 4 --proc-type=secondary

When running a secondary process such as that shown above, the proc-type parameter can again be specified as auto.
However, omitting the parameter altogether will cause the process to try to start as a primary rather than a secondary process.

Once the process type is specified correctly,
the process starts up, displaying largely similar status messages to the primary instance as it initializes.
Once again, you will be presented with a command prompt.

Once both processes are running, messages can be sent between them using the send command.
At any stage, either process can be terminated using the quit command.

.. code-block:: console

    EAL: Main core 10 is ready (tid=b5f89820)           EAL: Main core 8 is ready (tid=864a3820)
    EAL: Core 11 is ready (tid=84ffe700)                EAL: Core 9 is ready (tid=85995700)
    Starting core 11                                    Starting core 9
    simple_mp > send hello_secondary                    simple_mp > core 9: Received 'hello_secondary'
    simple_mp > core 11: Received 'hello_primary'       simple_mp > send hello_primary
    simple_mp > quit                                    simple_mp > quit

.. note::

    If the primary instance is terminated, the secondary instance must also be shut down and restarted after the primary.
    This is necessary because the primary instance will clear and reset the shared memory regions on startup,
    invalidating the secondary process's pointers.
    The secondary process can be stopped and restarted without affecting the primary process.

How the Application Works
^^^^^^^^^^^^^^^^^^^^^^^^^

The core of this example application is based on using two queues and a single memory pool in shared memory.
These three objects are created at startup by the primary process,
since the secondary process cannot create objects in memory as it cannot reserve memory zones,
and the secondary process then uses lookup functions to attach to these objects as it starts up.

.. code-block:: c

    if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
        /* Primary: create both rings and the message pool in shared memory. */
        send_ring = rte_ring_create(_PRI_2_SEC, ring_size, SOCKET0, flags);
        recv_ring = rte_ring_create(_SEC_2_PRI, ring_size, SOCKET0, flags);
        message_pool = rte_mempool_create(_MSG_POOL, pool_size,
                string_size, pool_cache, priv_data_sz,
                NULL, NULL, NULL, NULL,
                SOCKET0, flags);
    } else {
        /* Secondary: attach to the same objects by name, with ring roles swapped. */
        recv_ring = rte_ring_lookup(_PRI_2_SEC);
        send_ring = rte_ring_lookup(_SEC_2_PRI);
        message_pool = rte_mempool_lookup(_MSG_POOL);
    }

Note, however, that the named ring structure used as send_ring in the primary process is the recv_ring in the secondary process.

Once the rings and memory pools are all available in both the primary and secondary processes,
the application simply dedicates two threads to sending and receiving messages respectively.
The receive thread simply dequeues any messages on the receive ring, prints them,
and frees the buffer space used by the messages back to the memory pool.
The send thread makes use of the command-prompt library to interactively request user input for messages to send.
Once a send command is issued by the user, a buffer is allocated from the memory pool, filled in with the message contents,
then enqueued on the appropriate rte_ring.

Symmetric Multi-process Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The second example of DPDK multi-process support demonstrates how a set of processes can run in parallel,
with each process performing the same set of packet-processing operations.
(Since each process is identical in functionality to the others,
we refer to this as symmetric multi-processing, to differentiate it from asymmetric multi-processing,
such as the client-server mode of operation seen in the next example,
where different processes perform different tasks, yet co-operate to form a packet-processing system.)
The following diagram shows the data-flow through the application, using two processes.

.. _figure_sym_multi_proc_app:

.. figure:: img/sym_multi_proc_app.*

   Example Data Flow in a Symmetric Multi-process Application


As the diagram shows, each process reads packets from each of the network ports in use.
RSS is used to distribute incoming packets on each port to different hardware RX queues.
Each process reads a different RX queue on each port and so does not contend with any other process for that queue access.
Similarly, each process writes outgoing packets to a different TX queue on each port.

Running the Application
^^^^^^^^^^^^^^^^^^^^^^^

As with the simple_mp example, the first instance of the symmetric_mp process must be run as the primary instance,
though with a number of other application-specific parameters also provided after the EAL arguments.
These additional parameters are:

*   -p <portmask>, where portmask is a hexadecimal bitmask of what ports on the system are to be used.
    For example: -p 3 to use ports 0 and 1 only.

*   --num-procs <N>, where N is the total number of symmetric_mp instances that will be run side-by-side to perform packet processing.
    This parameter is used to configure the appropriate number of receive queues on each network port.

*   --proc-id <n>, where n is a numeric value in the range 0 <= n < N (number of processes, specified above).
    This identifies which symmetric_mp instance is being run, so that each process can read a unique receive queue on each network port.

The secondary symmetric_mp instances must also have these parameters specified,
and the first two must be the same as those passed to the primary instance; otherwise, errors result.

For example, to run a set of four symmetric_mp instances, running on lcores 1-4,
all performing level-2 forwarding of packets between ports 0 and 1,
the following commands can be used (assuming run as root):

.. code-block:: console

    # ./<build_dir>/examples/dpdk-symmetric_mp -l 1 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=0
    # ./<build_dir>/examples/dpdk-symmetric_mp -l 2 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=1
    # ./<build_dir>/examples/dpdk-symmetric_mp -l 3 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=2
    # ./<build_dir>/examples/dpdk-symmetric_mp -l 4 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=3

.. note::

    In the above example, the process type can be explicitly specified as primary or secondary, rather than auto.
    When using auto, the first process run creates all the memory structures needed for all processes,
    irrespective of whether it has a proc-id of 0, 1, 2 or 3.

.. note::

    For the symmetric multi-process example, since all processes work in the same manner,
    once the hugepage shared memory and the network ports are initialized,
    it is not necessary to restart all processes if the primary instance dies.
    Instead, that process can be restarted as a secondary,
    by explicitly setting the proc-type to secondary on the command line.
    (All subsequent instances launched will also need this explicitly specified,
    as auto-detection will detect no primary processes running and therefore attempt to re-initialize shared memory.)

How the Application Works
^^^^^^^^^^^^^^^^^^^^^^^^^

The initialization calls in both the primary and secondary instances are the same for the most part:
calling rte_eal_init(), initializing the 1 G and 10 G drivers, and then probing devices.
Thereafter, the initialization done depends on whether the process is configured as a primary or secondary instance.

In the primary instance, a memory pool is created for the packet mbufs and the network ports to be used are initialized,
with the number of RX and TX queues per port determined by the num-procs parameter passed on the command line.
The structures for the initialized network ports are stored in shared memory and
therefore will be accessible by the secondary process as it initializes.

.. code-block:: c

    /* Reject an odd number of ports. */
    if (num_ports & 1)
        rte_exit(EXIT_FAILURE, "Application must use an even number of ports\n");

    /* Only the primary process initializes the ports. */
    for (i = 0; i < num_ports; i++) {
        if (proc_type == RTE_PROC_PRIMARY)
            if (smp_port_init(ports[i], mp, (uint16_t)num_procs) < 0)
                rte_exit(EXIT_FAILURE, "Error initializing ports\n");
    }

In the secondary instance, rather than initializing the network ports, the port information exported by the primary process is used,
giving the secondary process access to the hardware and software rings for each network port.
Similarly, the memory pool of mbufs is accessed by doing a lookup for it by name:

.. code-block:: c

    mp = (proc_type == RTE_PROC_SECONDARY) ?
            rte_mempool_lookup(_SMP_MBUF_POOL) :
            rte_mempool_create(_SMP_MBUF_POOL, NB_MBUFS, MBUF_SIZE, ... )

Once this initialization is complete, the main loop of each process, both primary and secondary,
is exactly the same: each process reads from each port using the queue corresponding to its proc-id parameter,
and writes to the corresponding transmit queue on the output port.

Client-Server Multi-process Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The third example multi-process application included with the DPDK shows how one can
use a client-server type multi-process design to do packet processing.
In this example, a single server process performs the packet reception from the ports being used and
distributes these packets using round-robin ordering among a set of client processes,
which perform the actual packet processing.
In this case, the client applications just perform level-2 forwarding of packets by sending each packet out on a different network port.

The following diagram shows the data-flow through the application, using two client processes.

.. _figure_client_svr_sym_multi_proc_app:

.. figure:: img/client_svr_sym_multi_proc_app.*

   Example Data Flow in a Client-Server Symmetric Multi-process Application


Running the Application
^^^^^^^^^^^^^^^^^^^^^^^

The server process must be run initially as the primary process to set up all memory structures for use by the clients.
In addition to the EAL parameters, the application-specific parameters are:

*   -p <portmask>, where portmask is a hexadecimal bitmask of what ports on the system are to be used.
    For example: -p 3 to use ports 0 and 1 only.

*   -n <num-clients>, where the num-clients parameter is the number of client processes that will process the packets received
    by the server application.

.. note::

    In the server process, a single thread, the main thread (that is, the lowest numbered lcore in the coremask/corelist), performs all packet I/O.
    If a coremask/corelist is specified with more than a single lcore bit set in it,
    an additional lcore will be used for a thread to periodically print packet count statistics.

Since the server application stores configuration data in shared memory, including the network ports to be used,
the only application parameter needed by a client process is its client instance ID.
Therefore, to run a server application on lcore 1 (with lcore 2 printing statistics) along with two client processes running on lcores 3 and 4,
the following commands could be used:

.. code-block:: console

    # ./<build_dir>/examples/dpdk-mp_server -l 1-2 -n 4 -- -p 3 -n 2
    # ./<build_dir>/examples/dpdk-mp_client -l 3 -n 4 --proc-type=auto -- -n 0
    # ./<build_dir>/examples/dpdk-mp_client -l 4 -n 4 --proc-type=auto -- -n 1

.. note::

    If the server application dies and needs to be restarted, all client applications also need to be restarted,
    as there is no support in the server application for it to run as a secondary process.
    Any client processes that need restarting can be restarted without affecting the server process.

How the Application Works
^^^^^^^^^^^^^^^^^^^^^^^^^

The server process performs the network port and data structure initialization much as the symmetric multi-process application does when run as primary.
One additional enhancement in this sample application is that the server process stores its port configuration data in a memory zone in hugepage shared memory.
This eliminates the need for the client processes to have the portmask parameter passed into them on the command line,
as is done for the symmetric multi-process application, and therefore eliminates mismatched parameters as a potential source of errors.

In the same way that the server process is designed to be run as a primary process instance only,
the client processes are designed to be run as secondary instances only.
They have no code to attempt to create shared memory objects.
Instead, handles to all needed rings and memory pools are obtained via calls to rte_ring_lookup() and rte_mempool_lookup().
The network ports for use by the processes are obtained by loading the network port drivers and probing the PCI bus,
which will, as in the symmetric multi-process example,
automatically get access to the network ports using the settings already configured by the primary/server process.

Once all applications are initialized, the server operates by reading packets from each network port in turn and
distributing those packets to the client queues (software rings, one for each client process) in round-robin order.
On the client side, the packets are read from the rings in bursts that are as large as possible, then routed out to a different network port.
The routing used is very simple. All packets received on the first NIC port are transmitted back out on the second port and vice versa.
Similarly, packets are routed between the 3rd and 4th network ports and so on.
The sending of packets is done by writing the packets directly to the network ports; they are not transferred back via the server process.
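
The round-robin distribution and the port pairing described above can be sketched in plain C.
This is only an illustration, not the sample's actual code: the helper names ``next_client`` and ``paired_port`` are hypothetical,
and pairing ports with an XOR is an assumption about one simple way to map port 0 to 1, 2 to 3, and back.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the client that received the previous packet,
 * return the index of the client ring that should receive the next one.
 */
static unsigned int
next_client(unsigned int current, unsigned int num_clients)
{
    assert(num_clients > 0);
    return (current + 1) % num_clients;
}

/*
 * Hypothetical helper: ports are treated as pairs (0,1), (2,3), ...
 * Flipping the lowest bit yields the other port of the pair.
 */
static uint16_t
paired_port(uint16_t in_port)
{
    return (uint16_t)(in_port ^ 1);
}
```

With two clients, ``next_client`` alternates between ring 0 and ring 1,
and ``paired_port`` sends traffic received on port 0 out on port 1 and vice versa.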

In both the server and the client processes, outgoing packets are buffered before being sent,
so as to allow the sending of multiple packets in a single burst to improve efficiency.
For example, the client process will buffer packets to send
until either the buffer is full or no further packets are received from the server.
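
The buffering policy described above can be modelled with a minimal, DPDK-free sketch.
Everything here is an assumption made for illustration: the ``tx_buffer`` structure, the ``TX_BURST`` size,
and the helper names are hypothetical, packets are represented as plain integers,
and the point where a real application would call a burst transmit function is only marked by a comment.

```c
#include <assert.h>
#include <stddef.h>

#define TX_BURST 4 /* hypothetical burst size, chosen only for the example */

struct tx_buffer {
    int pkts[TX_BURST]; /* stand-in for packet pointers */
    size_t count;       /* packets currently buffered */
    size_t sent;        /* total packets flushed so far */
    size_t flushes;     /* number of bursts sent */
};

/* Send everything that is buffered as one burst. */
static void
tx_flush(struct tx_buffer *b)
{
    if (b->count == 0)
        return;
    /* A real application would hand b->pkts to the NIC here. */
    b->sent += b->count;
    b->flushes++;
    b->count = 0;
}

/* Buffer one packet and flush as soon as the buffer is full. */
static void
tx_enqueue(struct tx_buffer *b, int pkt)
{
    assert(b->count < TX_BURST);
    b->pkts[b->count++] = pkt;
    if (b->count == TX_BURST)
        tx_flush(b);
}

/* Handle one batch read from the server; an empty batch triggers a flush. */
static void
handle_batch(struct tx_buffer *b, const int *pkts, size_t n)
{
    size_t i;

    if (n == 0) {
        tx_flush(b);
        return;
    }
    for (i = 0; i < n; i++)
        tx_enqueue(b, pkts[i]);
}
```

Feeding six packets into an empty buffer sends one full burst of four and leaves two buffered;
a following empty batch flushes those two, so no packet waits indefinitely when traffic stops.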