..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

.. _multi_process_app:

Multi-process Sample Application
================================

This chapter describes the example applications for multi-processing that are included in the DPDK.

Example Applications
--------------------

Building the Sample Applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The multi-process example applications are built in the same way as other sample applications,
and as documented in the *DPDK Getting Started Guide*.

To compile the sample applications, see :doc:`compiling`.

The applications are located in the ``multi_process`` sub-directory.

.. note::

    If only one multi-process application needs to be built,
    the final make command can be run in that application's directory
    rather than in the top-level multi-process directory.

Basic Multi-process Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``examples/multi_process/simple_mp`` folder in the DPDK release contains a basic example application that demonstrates how
two DPDK processes can work together using queues and memory pools to share information.

Running the Application
^^^^^^^^^^^^^^^^^^^^^^^

To run the application, start one copy of the simple_mp binary in one terminal,
passing at least two cores in the coremask/corelist, as follows:

.. code-block:: console

    ./<build_dir>/examples/dpdk-simple_mp -l 0-1 -n 4 --proc-type=primary

For the first DPDK process run, the proc-type flag can be omitted or set to auto,
since all DPDK processes default to being a primary instance,
meaning they have control over the hugepage shared memory regions.
The process should start successfully and display a command prompt as follows:

.. code-block:: console

    $ ./<build_dir>/examples/dpdk-simple_mp -l 0-1 -n 4 --proc-type=primary
    EAL: coremask set to 3
    EAL: Detected lcore 0 on socket 0
    EAL: Detected lcore 1 on socket 0
    EAL: Detected lcore 2 on socket 0
    EAL: Detected lcore 3 on socket 0
    ...

    EAL: Requesting 2 pages of size 1073741824
    EAL: Requesting 768 pages of size 2097152
    EAL: Ask a virtual area of 0x40000000 bytes
    EAL: Virtual area found at 0x7ff200000000 (size = 0x40000000)
    ...

    EAL: check module finished
    EAL: Main core 0 is ready (tid=54e41820)
    EAL: Core 1 is ready (tid=53b32700)

    Starting core 1

    simple_mp >

To run the secondary process to communicate with the primary process,
run the same binary again, setting at least two cores in the coremask/corelist:

.. code-block:: console

    ./<build_dir>/examples/dpdk-simple_mp -l 2-3 -n 4 --proc-type=secondary

When running a secondary process such as that shown above, the proc-type parameter can again be specified as auto.
However, omitting the parameter altogether will cause the process to try to start as a primary rather than a secondary process.

Once the process type is specified correctly,
the process starts up, displaying status messages largely similar to those of the primary instance as it initializes.
Once again, you will be presented with a command prompt.

Once both processes are running, messages can be sent between them using the send command.
At any stage, either process can be terminated using the quit command.

.. code-block:: console

   EAL: Main core 10 is ready (tid=b5f89820)             EAL: Main core 8 is ready (tid=864a3820)
   EAL: Core 11 is ready (tid=84ffe700)                  EAL: Core 9 is ready (tid=85995700)
   Starting core 11                                      Starting core 9
   simple_mp > send hello_secondary                      simple_mp > core 9: Received 'hello_secondary'
   simple_mp > core 11: Received 'hello_primary'         simple_mp > send hello_primary
   simple_mp > quit                                      simple_mp > quit

.. note::

    If the primary instance is terminated, the secondary instance must also be shut down and restarted after the primary.
    This is necessary because the primary instance will clear and reset the shared memory regions on startup,
    invalidating the secondary process's pointers.
    The secondary process can be stopped and restarted without affecting the primary process.

How the Application Works
^^^^^^^^^^^^^^^^^^^^^^^^^

The core of this example application is based on using two queues and a single memory pool in shared memory.
These three objects are created at startup by the primary process,
because a secondary process cannot reserve memory zones and so cannot create objects in shared memory.
The secondary process then uses lookup functions to attach to these objects as it starts up.

.. code-block:: c

    if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
        /* Primary: create the two rings and the message pool. */
        send_ring = rte_ring_create(_PRI_2_SEC, ring_size, SOCKET0, flags);
        recv_ring = rte_ring_create(_SEC_2_PRI, ring_size, SOCKET0, flags);
        message_pool = rte_mempool_create(_MSG_POOL, pool_size,
                string_size, pool_cache, priv_data_sz,
                NULL, NULL, NULL, NULL,
                SOCKET0, flags);
    } else {
        /* Secondary: attach by name; the ring roles are swapped. */
        recv_ring = rte_ring_lookup(_PRI_2_SEC);
        send_ring = rte_ring_lookup(_SEC_2_PRI);
        message_pool = rte_mempool_lookup(_MSG_POOL);
    }

Note, however, that the named ring structure used as send_ring in the primary process is the recv_ring in the secondary process.

Once the rings and memory pools are all available in both the primary and secondary processes,
the application dedicates two threads to sending and receiving messages respectively.
The receive thread dequeues any messages on the receive ring, prints them,
and frees the buffer space used by the messages back to the memory pool.
The send thread makes use of the command-prompt library to interactively request user input for messages to send.
Once a send command is issued by the user, a buffer is allocated from the memory pool, filled in with the message contents,
then enqueued on the appropriate rte_ring.
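
The buffer ownership described above can be modelled without DPDK.
The following is a minimal sketch in plain C, not the sample's code:
``msg_pool``, ``msg_ring`` and the helper functions are hypothetical stand-ins for ``rte_mempool`` and ``rte_ring``,
the sizes are assumed values, and the model is single-threaded with no shared memory.
It only illustrates that a buffer leaves the pool on send and returns to it on receive.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    #define MSG_SIZE  64   /* bytes per message buffer (assumed value) */
    #define POOL_SIZE 4    /* buffers in the pool (assumed value) */
    #define RING_SIZE 4    /* slots in the ring (assumed value) */

    /* Hypothetical stand-in for rte_mempool: a stack of free buffers. */
    struct msg_pool {
        char bufs[POOL_SIZE][MSG_SIZE];
        char *free_list[POOL_SIZE];
        size_t n_free;
    };

    /* Hypothetical stand-in for rte_ring: a FIFO of buffer pointers. */
    struct msg_ring {
        char *slots[RING_SIZE];
        size_t head;   /* next slot to dequeue */
        size_t count;  /* number of queued pointers */
    };

    static void pool_init(struct msg_pool *p)
    {
        for (size_t i = 0; i < POOL_SIZE; i++)
            p->free_list[i] = p->bufs[i];
        p->n_free = POOL_SIZE;
    }

    static void ring_init(struct msg_ring *r)
    {
        r->head = 0;
        r->count = 0;
    }

    /* Send: take a buffer from the pool, fill it, enqueue the pointer.
     * Returns 0 on success, -1 if the pool is empty or the ring is full. */
    static int msg_send(struct msg_pool *p, struct msg_ring *r, const char *text)
    {
        if (p->n_free == 0 || r->count == RING_SIZE)
            return -1;
        char *buf = p->free_list[--p->n_free];
        strncpy(buf, text, MSG_SIZE - 1);
        buf[MSG_SIZE - 1] = '\0';
        r->slots[(r->head + r->count) % RING_SIZE] = buf;
        r->count++;
        return 0;
    }

    /* Receive: dequeue a pointer, copy the text out, then return the
     * buffer to the pool. Returns 0 on success, -1 if the ring is empty. */
    static int msg_recv(struct msg_pool *p, struct msg_ring *r,
                        char *out, size_t out_len)
    {
        assert(out != NULL && out_len > 0);
        if (r->count == 0)
            return -1;
        char *buf = r->slots[r->head];
        r->head = (r->head + 1) % RING_SIZE;
        r->count--;
        strncpy(out, buf, out_len - 1);
        out[out_len - 1] = '\0';
        p->free_list[p->n_free++] = buf;
        return 0;
    }

In this model only pointers travel through the ring, so the message text is written once by the sender and read in place by the receiver.
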

Symmetric Multi-process Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The second example of DPDK multi-process support demonstrates how a set of processes can run in parallel,
with each process performing the same set of packet-processing operations.
(Since each process is identical in functionality to the others,
we refer to this as symmetric multi-processing, to differentiate it from asymmetric multi-processing,
such as the client-server mode of operation seen in the next example,
where different processes perform different tasks yet co-operate to form a packet-processing system.)
The following diagram shows the data-flow through the application, using two processes.

.. _figure_sym_multi_proc_app:

.. figure:: img/sym_multi_proc_app.*

   Example Data Flow in a Symmetric Multi-process Application


As the diagram shows, each process reads packets from each of the network ports in use.
RSS is used to distribute incoming packets on each port to different hardware RX queues.
Each process reads a different RX queue on each port and so does not contend with any other process for access to that queue.
Similarly, each process writes outgoing packets to a different TX queue on each port.

Running the Application
^^^^^^^^^^^^^^^^^^^^^^^

As with the simple_mp example, the first instance of the symmetric_mp process must be run as the primary instance,
though with a number of other application-specific parameters also provided after the EAL arguments.
These additional parameters are:

*   ``-p <portmask>``, where portmask is a hexadecimal bitmask of which ports on the system are to be used.
    For example: -p 3 to use ports 0 and 1 only.

*   ``--num-procs <N>``, where N is the total number of symmetric_mp instances that will be run side-by-side to perform packet processing.
    This parameter is used to configure the appropriate number of receive queues on each network port.

*   ``--proc-id <n>``, where n is a numeric value in the range 0 <= n < N (number of processes, specified above).
    This identifies which symmetric_mp instance is being run, so that each process can read a unique receive queue on each network port.
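
To make the parameter semantics concrete, the following is a minimal sketch in plain C of how a hexadecimal portmask expands to port IDs and how a proc-id is range-checked.
It is not the sample's parser: the function names, the ``MAX_PORTS`` bound and the choice to reject an empty mask are assumptions made for illustration.

.. code-block:: c

    #include <assert.h>
    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define MAX_PORTS 32  /* assumed upper bound for this sketch */

    /* Parse a hexadecimal portmask string such as "3".
     * Returns 0 and stores the mask on success, -1 on malformed input.
     * Rejecting a zero mask is a choice made by this sketch. */
    static int parse_portmask(const char *arg, uint32_t *mask)
    {
        char *end = NULL;
        unsigned long val;

        assert(arg != NULL && mask != NULL);
        errno = 0;
        val = strtoul(arg, &end, 16);
        if (errno != 0 || end == arg || *end != '\0' ||
            val == 0 || val > UINT32_MAX)
            return -1;
        *mask = (uint32_t)val;
        return 0;
    }

    /* Expand a mask into port IDs; returns how many ports are enabled. */
    static unsigned int portmask_to_ids(uint32_t mask, uint16_t ids[MAX_PORTS])
    {
        unsigned int n = 0;

        for (uint16_t port = 0; port < MAX_PORTS; port++)
            if (mask & (UINT32_C(1) << port))
                ids[n++] = port;
        return n;
    }

    /* Check that a proc-id value n satisfies 0 <= n < N. */
    static int proc_id_is_valid(unsigned int proc_id, unsigned int num_procs)
    {
        return num_procs > 0 && proc_id < num_procs;
    }

With the mask ``3`` (binary 11), bits 0 and 1 are set, so the enabled ports are 0 and 1, matching the example given for ``-p 3``.
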

The secondary symmetric_mp instances must also have these parameters specified,
and the first two must be the same as those passed to the primary instance; otherwise, errors result.

For example, to run a set of four symmetric_mp instances, running on lcores 1-4,
all performing layer 2 forwarding of packets between ports 0 and 1,
the following commands can be used (assuming they are run as root):

.. code-block:: console

    # ./<build_dir>/examples/dpdk-symmetric_mp -l 1 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=0
    # ./<build_dir>/examples/dpdk-symmetric_mp -l 2 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=1
    # ./<build_dir>/examples/dpdk-symmetric_mp -l 3 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=2
    # ./<build_dir>/examples/dpdk-symmetric_mp -l 4 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=3

.. note::

    In the above example, the process type can be explicitly specified as primary or secondary, rather than auto.
    When using auto, the first process run creates all the memory structures needed for all processes,
    irrespective of whether it has a proc-id of 0, 1, 2 or 3.

.. note::

    For the symmetric multi-process example, since all processes work in the same manner,
    once the hugepage shared memory and the network ports are initialized,
    it is not necessary to restart all processes if the primary instance dies.
    Instead, that process can be restarted as a secondary,
    by explicitly setting the proc-type to secondary on the command line.
    (All subsequent instances launched will also need this explicitly specified,
    as auto-detection will detect no primary processes running and therefore attempt to re-initialize shared memory.)

How the Application Works
^^^^^^^^^^^^^^^^^^^^^^^^^

The initialization calls in both the primary and secondary instances are the same for the most part:
calling rte_eal_init(), performing 1 G and 10 G driver initialization, and then probing devices.
Thereafter, the initialization done depends on whether the process is configured as a primary or secondary instance.

In the primary instance, a memory pool is created for the packet mbufs and the network ports to be used are initialized,
with the number of RX and TX queues per port determined by the num-procs parameter passed on the command line.
The structures for the initialized network ports are stored in shared memory and
are therefore accessible by the secondary process as it initializes.

.. code-block:: c

    /* The application requires an even number of ports. */
    if (num_ports & 1)
        rte_exit(EXIT_FAILURE, "Application must use an even number of ports\n");

    /* Only the primary process configures the ports. */
    for (i = 0; i < num_ports; i++) {
        if (proc_type == RTE_PROC_PRIMARY)
            if (smp_port_init(ports[i], mp, (uint16_t)num_procs) < 0)
                rte_exit(EXIT_FAILURE, "Error initializing ports\n");
    }

In the secondary instance, rather than initializing the network ports, the port information exported by the primary process is used,
giving the secondary process access to the hardware and software rings for each network port.
Similarly, the memory pool of mbufs is accessed by doing a lookup for it by name:

.. code-block:: c

    /* Secondary looks the pool up by name; primary creates it. */
    mp = (proc_type == RTE_PROC_SECONDARY) ?
            rte_mempool_lookup(_SMP_MBUF_POOL) :
            rte_mempool_create(_SMP_MBUF_POOL, NB_MBUFS, MBUF_SIZE, ... );

Once this initialization is complete, the main loop of each process, both primary and secondary,
is exactly the same: each process reads from each port using the queue corresponding to its proc-id parameter,
and writes to the corresponding transmit queue on the output port.
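
The queue selection in that loop can be sketched as follows.
This is a plain C model rather than the sample's code: ``struct fwd_step`` and ``plan_forwarding()`` are hypothetical names,
and pairing adjacent entries of the port list (first with second, third with fourth) as input and output ports is an assumption of the sketch,
chosen to be consistent with the even-port-count check shown earlier.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* One step of a process's forwarding loop: where to read and write. */
    struct fwd_step {
        uint16_t rx_port;
        uint16_t tx_port;
        uint16_t queue;    /* same index for RX and TX: the proc-id */
    };

    /* Fill steps[] for one pass over all ports. ports[] holds the enabled
     * port IDs and num_ports must be even. Adjacent entries are paired
     * (0<->1, 2<->3), which is an assumption of this sketch.
     * Returns the number of steps written. */
    static unsigned int plan_forwarding(const uint16_t *ports,
                                        unsigned int num_ports,
                                        uint16_t proc_id,
                                        struct fwd_step *steps)
    {
        assert(ports != NULL && steps != NULL);
        assert((num_ports & 1) == 0);
        for (unsigned int i = 0; i < num_ports; i++) {
            steps[i].rx_port = ports[i];
            steps[i].tx_port = ports[i ^ 1];   /* partner in the pair */
            steps[i].queue = proc_id;
        }
        return num_ports;
    }

Because every process uses its own proc-id as the queue index on every port, no two processes in this model ever touch the same RX or TX queue.
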

Client-Server Multi-process Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The third example multi-process application included with the DPDK shows how one can
use a client-server type multi-process design to do packet processing.
In this example, a single server process performs the packet reception from the ports being used and
distributes these packets using round-robin ordering among a set of client processes,
which perform the actual packet processing.
In this case, the client applications just perform layer 2 forwarding of packets by sending each packet out on a different network port.

The following diagram shows the data-flow through the application, using two client processes.

.. _figure_client_svr_sym_multi_proc_app:

.. figure:: img/client_svr_sym_multi_proc_app.*

   Example Data Flow in a Client-Server Symmetric Multi-process Application


Running the Application
^^^^^^^^^^^^^^^^^^^^^^^

The server process must be run initially as the primary process to set up all memory structures for use by the clients.
In addition to the EAL parameters, the application-specific parameters are:

*   ``-p <portmask>``, where portmask is a hexadecimal bitmask of which ports on the system are to be used.
    For example: -p 3 to use ports 0 and 1 only.

*   ``-n <num-clients>``, where the num-clients parameter is the number of client processes that will process the packets received
    by the server application.

.. note::

    In the server process, a single thread performs all packet I/O:
    the main thread, which runs on the lowest-numbered lcore in the coremask/corelist.
    If a coremask/corelist is specified with more than a single lcore bit set in it,
    an additional lcore will be used for a thread to periodically print packet count statistics.

Since the server application stores configuration data in shared memory, including the network ports to be used,
the only application parameter needed by a client process is its client instance ID.
Therefore, to run a server application on lcore 1 (with lcore 2 printing statistics) along with two client processes running on lcores 3 and 4,
the following commands could be used:

.. code-block:: console

    # ./<build_dir>/examples/dpdk-mp_server -l 1-2 -n 4 -- -p 3 -n 2
    # ./<build_dir>/examples/dpdk-mp_client -l 3 -n 4 --proc-type=auto -- -n 0
    # ./<build_dir>/examples/dpdk-mp_client -l 4 -n 4 --proc-type=auto -- -n 1

.. note::

    If the server application dies and needs to be restarted, all client applications also need to be restarted,
    as there is no support in the server application for it to run as a secondary process.
    Any client processes that need restarting can be restarted without affecting the server process.

How the Application Works
^^^^^^^^^^^^^^^^^^^^^^^^^

The server process performs the network port and data structure initialization much as the symmetric multi-process application does when run as primary.
One additional enhancement in this sample application is that the server process stores its port configuration data in a memory zone in hugepage shared memory.
This eliminates the need for the client processes to have the portmask parameter passed into them on the command line,
as is done for the symmetric multi-process application, and therefore eliminates mismatched parameters as a potential source of errors.

In the same way that the server process is designed to be run as a primary process instance only,
the client processes are designed to be run as secondary instances only.
They have no code to attempt to create shared memory objects.
Instead, handles to all needed rings and memory pools are obtained via calls to rte_ring_lookup() and rte_mempool_lookup().
The network ports for use by the processes are obtained by loading the network port drivers and probing the PCI bus,
which, as in the symmetric multi-process example,
automatically gives access to the network ports using the settings already configured by the primary/server process.

Once all applications are initialized, the server operates by reading packets from each network port in turn and
distributing those packets to the client queues (software rings, one for each client process) in round-robin order.
On the client side, the packets are read from the rings in bursts as large as possible, then routed out to a different network port.
The routing used is very simple: all packets received on the first NIC port are transmitted back out on the second port and vice versa.
Similarly, packets are routed between the 3rd and 4th network ports and so on.
The sending of packets is done by writing the packets directly to the network ports; they are not transferred back via the server process.
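
The round-robin distribution performed by the server can be illustrated with a small plain C model.
This is a sketch under stated assumptions, not the server's implementation: ``struct rr_state`` and the ``rr_*`` functions are hypothetical names,
and client IDs stand in for the per-client software rings.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical distributor state: which client gets the next packet. */
    struct rr_state {
        unsigned int num_clients;
        unsigned int next;
    };

    static void rr_init(struct rr_state *s, unsigned int num_clients)
    {
        assert(s != NULL && num_clients > 0);
        s->num_clients = num_clients;
        s->next = 0;
    }

    /* Return the client ID for the next packet and advance the cursor. */
    static unsigned int rr_pick(struct rr_state *s)
    {
        unsigned int client = s->next;

        s->next = (s->next + 1) % s->num_clients;
        return client;
    }

    /* Assign a burst of n packets; out[i] is the client ID for packet i.
     * The cursor persists across calls, so the rotation continues from
     * one burst (and one port) to the next. */
    static void rr_assign_burst(struct rr_state *s, unsigned int *out, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            out[i] = rr_pick(s);
    }

Keeping the cursor across bursts means that, in this model, a run of short bursts does not favour client 0.
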

In both the server and the client processes, outgoing packets are buffered before being sent,
so as to allow the sending of multiple packets in a single burst to improve efficiency.
For example, the client process will buffer packets to send
until either the buffer is full or no further packets are received from the server.
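
The two flush conditions can be sketched in plain C as follows.
This is an illustrative model only: ``struct tx_buffer``, the function names and the ``TX_BUF_SIZE`` value are assumptions,
packets are represented as integers, and the point where a real application would call the NIC transmit function is marked in a comment.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    #define TX_BUF_SIZE 4   /* assumed burst size; small for illustration */

    struct tx_buffer {
        int pkts[TX_BUF_SIZE];   /* packet handles, modelled as ints */
        size_t count;            /* packets currently buffered */
        size_t flushes;          /* number of bursts sent so far */
        size_t sent;             /* total packets sent so far */
    };

    /* Send everything buffered as one burst. A real application would
     * hand the array to the NIC transmit function here. */
    static void tx_flush(struct tx_buffer *b)
    {
        if (b->count == 0)
            return;
        b->sent += b->count;
        b->flushes++;
        b->count = 0;
    }

    /* Buffer one packet; flush automatically when the buffer fills. */
    static void tx_enqueue(struct tx_buffer *b, int pkt)
    {
        assert(b->count < TX_BUF_SIZE);
        b->pkts[b->count++] = pkt;
        if (b->count == TX_BUF_SIZE)
            tx_flush(b);
    }

    /* Handle one received burst of n packets. An empty burst (n == 0)
     * means nothing further arrived, so any partial buffer is flushed. */
    static void handle_rx_burst(struct tx_buffer *b, const int *pkts, size_t n)
    {
        if (n == 0) {
            tx_flush(b);
            return;
        }
        for (size_t i = 0; i < n; i++)
            tx_enqueue(b, pkts[i]);
    }

Flushing on an empty receive bounds how long a packet can wait in a partially filled buffer when traffic is light.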