..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2019 Intel Corporation.

.. include:: <isonum.txt>

IOAT Rawdev Driver
===================

The ``ioat`` rawdev driver provides a poll-mode driver (PMD) for Intel\ |reg|
Data Streaming Accelerator `(Intel DSA)
<https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator>`_ and for Intel\ |reg|
QuickData Technology, part of Intel\ |reg| I/O Acceleration Technology
`(Intel I/OAT)
<https://www.intel.com/content/www/us/en/wireless-network/accel-technology.html>`_.
This PMD, when used on supported hardware, allows data copies, for example,
cloning packet data, to be accelerated by that hardware rather than having to
be done by software, freeing up CPU cycles for other tasks.

Hardware Requirements
----------------------

The ``dpdk-devbind.py`` script, included with DPDK,
can be used to show the presence of supported hardware.
Running ``dpdk-devbind.py --status-dev misc`` will show all the miscellaneous,
or rawdev-based, devices on the system.
For Intel\ |reg| QuickData Technology devices, the hardware is often listed as "Crystal Beach DMA",
or "CBDMA".
Intel\ |reg| DSA devices currently (at time of writing) appear as devices with type "0b25",
due to the absence of pci-id database entries for them at this point.

Compilation
------------

For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based.
No additional compilation steps are necessary.

Device Setup
-------------

Depending on support provided by the PMD, HW devices can either use the kernel configured driver
or be bound to a user-space IO driver for use.
For example, Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers,
such as ``vfio-pci``.

Intel\ |reg| DSA devices using idxd kernel driver
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured.
The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration.

.. note::
        The device configuration can also be done by directly interacting with the sysfs nodes.
        An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py``
        included in the driver source directory.

There are some mandatory configuration steps before being able to use a device with an application.
The internal engines, which do the copies or other operations,
and the work-queues, which are used by applications to assign work to the device,
need to be assigned to groups, and the various other configuration options,
such as priority or queue depth, need to be set for each queue.

To assign an engine to a group::

        $ accel-config config-engine dsa0/engine0.0 --group-id=0
        $ accel-config config-engine dsa0/engine0.1 --group-id=1

To assign work queues to groups for passing descriptors to the engines, a similar accel-config command can be used.
However, the work queues also need to be configured depending on the use-case.
Some configuration options include:

* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously.
* priority: WQ priority between 1 and 15. Larger value means higher priority.
* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device.
* type: WQ type (kernel/mdev/user). Determines how the device is presented.
* name: identifier given to the WQ.

Example configuration for a work queue::

        $ accel-config config-wq dsa0/wq0.0 --group-id=0 \
           --mode=dedicated --priority=10 --wq-size=8 \
           --type=user --name=app1

Once the devices have been configured, they need to be enabled::

        $ accel-config enable-device dsa0
        $ accel-config enable-wq dsa0/wq0.0

Check the device configuration::

        $ accel-config list

Devices using VFIO/UIO drivers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The HW devices to be used will need to be bound to a user-space IO driver for use.
The ``dpdk-devbind.py`` script can be used to view the state of the devices
and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``.
For example::

        $ dpdk-devbind.py -b vfio-pci 00:04.0 00:04.1

Device Probing and Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will
be found as part of the device scan done at application initialization time without
the need to pass parameters to the application.

If the device is bound to the IDXD kernel driver (and previously configured with sysfs),
then a specific work queue needs to be passed to the application via a vdev parameter.
This vdev parameter takes the driver name and work queue name as parameters.
For example, to use work queue 0 on Intel\ |reg| DSA instance 0::

        $ dpdk-test --no-pci --vdev=rawdev_idxd,wq=0.0

Once probed successfully, the device will appear as a ``rawdev``, that is a
"raw device type" inside DPDK, and can be accessed using APIs from the
``rte_rawdev`` library.

Using IOAT Rawdev Devices
--------------------------

To use the devices from an application, the rawdev API can be used, along
with definitions taken from the device-specific header file
``rte_ioat_rawdev.h``.
This header is needed to get the definition of
structure parameters used by some of the rawdev APIs for IOAT rawdev
devices, as well as providing key functions for using the device for memory
copies.

Getting Device Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Basic information about each rawdev device can be queried using the
``rte_rawdev_info_get()`` API. For most applications, this API will be
needed to verify that the rawdev in question is of the expected type. For
example, the following code snippet can be used to identify an IOAT
rawdev device for use by an application:

.. code-block:: C

        for (i = 0; i < count && !found; i++) {
                struct rte_rawdev_info info = { .dev_private = NULL };
                found = (rte_rawdev_info_get(i, &info, 0) == 0 &&
                                strcmp(info.driver_name,
                                                IOAT_PMD_RAWDEV_NAME_STR) == 0);
        }

When calling the ``rte_rawdev_info_get()`` API for an IOAT rawdev device,
the ``dev_private`` field in the ``rte_rawdev_info`` struct should either
be NULL, or else be set to point to a structure of type
``rte_ioat_rawdev_config``, in which case the size of the configured device
input ring will be returned in that structure.

Device Configuration
~~~~~~~~~~~~~~~~~~~~~

Configuring an IOAT rawdev device is done using the
``rte_rawdev_configure()`` API, which takes the same structure parameters
as the previously referenced ``rte_rawdev_info_get()`` API. The main
difference is that, because the parameter is used as input rather than
output, the ``dev_private`` structure element cannot be NULL, and must
point to a valid ``rte_ioat_rawdev_config`` structure, containing the ring
size to be used by the device. The ring size must be a power of two,
between 64 and 4096.
If the driver's tracking of user-provided completion handles is not needed,
it may be disabled by setting the ``hdls_disable`` flag in
the configuration structure.
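
The ring-size constraint described above can be checked in the application
before ``rte_rawdev_configure()`` is called. The following is a minimal
sketch only: ``ioat_ring_size_is_valid()`` and the two limit macros are
hypothetical names, not part of the DPDK API, and the sketch assumes that
the limits of 64 and 4096 are both inclusive.

.. code-block:: C

        #include <stdbool.h>

        /* Hypothetical limits, taken from the text above and assumed inclusive. */
        #define IOAT_RING_SIZE_MIN 64
        #define IOAT_RING_SIZE_MAX 4096

        /* Hypothetical helper, not part of DPDK: returns true if ring_size is
         * a power of two within the assumed limits.
         */
        static inline bool
        ioat_ring_size_is_valid(unsigned int ring_size)
        {
                /* A power of two has exactly one bit set, so n & (n - 1) == 0. */
                bool is_pow2 = ring_size != 0 &&
                                (ring_size & (ring_size - 1)) == 0;

                return is_pow2 && ring_size >= IOAT_RING_SIZE_MIN &&
                                ring_size <= IOAT_RING_SIZE_MAX;
        }

With this helper, a ring size of 512 would be accepted, while 100 (not a
power of two) and 8192 (above the upper limit) would be rejected.
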

The following code shows how the device is configured in
``test_ioat_rawdev.c``:

.. code-block:: C

   #define IOAT_TEST_RINGSIZE 512
        struct rte_ioat_rawdev_config p = { .ring_size = -1 };
        struct rte_rawdev_info info = { .dev_private = &p };

        /* ... */

        p.ring_size = IOAT_TEST_RINGSIZE;
        if (rte_rawdev_configure(dev_id, &info, sizeof(p)) != 0) {
                printf("Error with rte_rawdev_configure()\n");
                return -1;
        }

Once configured, the device can then be made ready for use by calling the
``rte_rawdev_start()`` API.

Performing Data Copies
~~~~~~~~~~~~~~~~~~~~~~~

To perform data copies using IOAT rawdev devices, the functions
``rte_ioat_enqueue_copy()`` and ``rte_ioat_perform_ops()`` should be used.
Once copies have been completed, the completion will be reported back when
the application calls ``rte_ioat_completed_ops()``.

The ``rte_ioat_enqueue_copy()`` function enqueues a single copy to the
device ring for copying at a later point. The parameters to that function
include the IOVA addresses of both the source and destination buffers,
as well as two "handles" to be returned to the user when the copy is
completed. These handles can be arbitrary values, but two are provided so
that the library can track handles for both source and destination on
behalf of the user, e.g. virtual addresses for the buffers, or mbuf
pointers if packet data is being copied.

While the ``rte_ioat_enqueue_copy()`` function enqueues a copy operation on
the device ring, the copy will not actually be performed until after the
application calls the ``rte_ioat_perform_ops()`` function. This function
informs the device hardware of the elements enqueued on the ring, and the
device will begin to process them.
It is expected that, for efficiency
reasons, a burst of operations will be enqueued to the device via multiple
enqueue calls between calls to the ``rte_ioat_perform_ops()`` function.

The following code from ``test_ioat_rawdev.c`` demonstrates how to enqueue
a burst of copies to the device and start the hardware processing of them:

.. code-block:: C

        struct rte_mbuf *srcs[32], *dsts[32];
        unsigned int j;

        for (i = 0; i < RTE_DIM(srcs); i++) {
                char *src_data;

                srcs[i] = rte_pktmbuf_alloc(pool);
                dsts[i] = rte_pktmbuf_alloc(pool);
                srcs[i]->data_len = srcs[i]->pkt_len = length;
                dsts[i]->data_len = dsts[i]->pkt_len = length;
                src_data = rte_pktmbuf_mtod(srcs[i], char *);

                for (j = 0; j < length; j++)
                        src_data[j] = rand() & 0xFF;

                if (rte_ioat_enqueue_copy(dev_id,
                                srcs[i]->buf_iova + srcs[i]->data_off,
                                dsts[i]->buf_iova + dsts[i]->data_off,
                                length,
                                (uintptr_t)srcs[i],
                                (uintptr_t)dsts[i]) != 1) {
                        printf("Error with rte_ioat_enqueue_copy for buffer %u\n",
                                        i);
                        return -1;
                }
        }
        rte_ioat_perform_ops(dev_id);

To retrieve information about completed copies, the API
``rte_ioat_completed_ops()`` should be used. This API will return to the
application a set of completion handles passed in when the relevant copies
were enqueued.

The following code from ``test_ioat_rawdev.c`` shows the test code
retrieving information about the completed copies and validating the data
is correct before freeing the data buffers using the returned handles:

.. code-block:: C

        if (rte_ioat_completed_ops(dev_id, 64, (void *)completed_src,
                        (void *)completed_dst) != RTE_DIM(srcs)) {
                printf("Error with rte_ioat_completed_ops\n");
                return -1;
        }
        for (i = 0; i < RTE_DIM(srcs); i++) {
                char *src_data, *dst_data;

                if (completed_src[i] != srcs[i]) {
                        printf("Error with source pointer %u\n", i);
                        return -1;
                }
                if (completed_dst[i] != dsts[i]) {
                        printf("Error with dest pointer %u\n", i);
                        return -1;
                }

                src_data = rte_pktmbuf_mtod(srcs[i], char *);
                dst_data = rte_pktmbuf_mtod(dsts[i], char *);
                for (j = 0; j < length; j++)
                        if (src_data[j] != dst_data[j]) {
                                printf("Error with copy of packet %u, byte %u\n",
                                                i, j);
                                return -1;
                        }
                rte_pktmbuf_free(srcs[i]);
                rte_pktmbuf_free(dsts[i]);
        }


Filling an Area of Memory
~~~~~~~~~~~~~~~~~~~~~~~~~~

The IOAT driver also has support for the ``fill`` operation, where an area
of memory is overwritten, or filled, with a short pattern of data.
Fill operations can be performed in much the same way as the copy operations
described above, just using the ``rte_ioat_enqueue_fill()`` function rather
than the ``rte_ioat_enqueue_copy()`` function.


Querying Device Statistics
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The statistics from the IOAT rawdev device can be retrieved via the xstats
functions in the ``rte_rawdev`` library, i.e.
``rte_rawdev_xstats_names_get()``, ``rte_rawdev_xstats_get()`` and
``rte_rawdev_xstats_by_name_get()``. The statistics returned for each device
instance are:

* ``failed_enqueues``
* ``successful_enqueues``
* ``copies_started``
* ``copies_completed``