.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=================================================
BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH
=================================================

.. note::
   - ``BPF_MAP_TYPE_DEVMAP`` was introduced in kernel version 4.14
   - ``BPF_MAP_TYPE_DEVMAP_HASH`` was introduced in kernel version 5.4

``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` are BPF maps primarily
used as backend maps for the XDP BPF helper call ``bpf_redirect_map()``.
``BPF_MAP_TYPE_DEVMAP`` is backed by an array that uses the key as
the index to look up a reference to a net device, while ``BPF_MAP_TYPE_DEVMAP_HASH``
is backed by a hash table that uses a key to look up a reference to a net device.
The user provides either <``key``/ ``ifindex``> or <``key``/ ``struct bpf_devmap_val``>
pairs to update the maps with new net devices.

.. note::
    - The key to a hash map doesn't have to be an ``ifindex``.
    - While ``BPF_MAP_TYPE_DEVMAP_HASH`` allows for densely packing the net devices,
      it comes at the cost of a hash of the key when performing a lookup.

The setup and packet enqueue/send code is shared between the two types of
devmap; only the lookup and insertion are different.
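
To illustrate the density trade-off between the two map types, the following
plain C sketch compares how many slots each type would need when entries are
keyed by ``ifindex``. It assumes that an array-backed devmap needs
``max_entries`` to be larger than the largest key in use, whereas a hash-backed
devmap needs one slot per device. The function names and the sample ifindex
values are hypothetical and are not part of any kernel or libbpf API.

```c
#include <stddef.h>

/*
 * Hypothetical helper: slots needed by an array-backed devmap keyed by
 * ifindex. The key is an array index, so the array has to extend past the
 * largest ifindex in use.
 */
static unsigned int devmap_array_slots(const unsigned int *ifindexes, size_t n)
{
	unsigned int max = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (ifindexes[i] > max)
			max = ifindexes[i];

	return n ? max + 1 : 0;
}

/* Hypothetical helper: a hash-backed devmap needs one slot per device. */
static unsigned int devmap_hash_slots(size_t n)
{
	return (unsigned int)n;
}
```

For three devices with ifindexes 3, 47 and 1200, the array-backed map would
need 1201 slots under this assumption, while the hash-backed map would need 3.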

Usage
=====
Kernel BPF
----------
bpf_redirect_map()
^^^^^^^^^^^^^^^^^^
.. code-block:: c

    long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` this map contains
references to net devices (for forwarding packets through other ports).

The lower two bits of ``flags`` are used as the return code if the map lookup
fails. This is so that the return value can be one of the XDP program return
codes up to ``XDP_TX``, as chosen by the caller. The higher bits of ``flags``
can be set to ``BPF_F_BROADCAST`` or ``BPF_F_EXCLUDE_INGRESS`` as defined
below.

With ``BPF_F_BROADCAST`` the packet will be broadcast to all the interfaces
in the map; with ``BPF_F_EXCLUDE_INGRESS`` the ingress interface will be excluded
from the broadcast.

.. note::
    - The key is ignored if ``BPF_F_BROADCAST`` is set.
    - The broadcast feature can also be used to implement multicast forwarding:
      simply create multiple DEVMAPs, each one corresponding to a single multicast group.
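
The fallback carried in the lower two bits of ``flags`` can be sketched in
plain C. This is a simplified model of the documented return value, not the
kernel implementation: ``model_redirect_map()`` and its ``lookup_ok``
parameter are hypothetical, the broadcast bits are not modelled, and the
constants are redefined locally under ``MODEL_`` names on the assumption that
they match ``enum xdp_action`` in the UAPI header ``linux/bpf.h``.

```c
#include <stdint.h>

/*
 * Local copies of the XDP return codes; assumed to match enum xdp_action
 * in the UAPI header linux/bpf.h.
 */
enum model_xdp_action {
	MODEL_XDP_ABORTED = 0,
	MODEL_XDP_DROP,
	MODEL_XDP_PASS,
	MODEL_XDP_TX,
	MODEL_XDP_REDIRECT,
};

/* The lower two bits of flags can hold any code up to XDP_TX (0..3). */
#define MODEL_ACTION_MASK 0x3ULL

/* Hypothetical model: lookup_ok stands in for the result of the map lookup. */
static long model_redirect_map(int lookup_ok, uint64_t flags)
{
	if (lookup_ok)
		return MODEL_XDP_REDIRECT;

	/* Lookup failed: hand back the fallback code chosen by the caller. */
	return (long)(flags & MODEL_ACTION_MASK);
}
```

In this model, passing ``XDP_PASS`` as ``flags`` makes a failed lookup return
``XDP_PASS``, so unmatched packets continue up the normal network stack, while
passing ``0`` makes a failed lookup return ``XDP_ABORTED``.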

This helper will return ``XDP_REDIRECT`` on success, or the value of the two
lower bits of the ``flags`` argument if the map lookup fails.

More information about redirection can be found in :doc:`redirect`.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

User space
----------
.. note::
    DEVMAP entries can only be updated/deleted from user space and not
    from an eBPF program. Trying to call these functions from a kernel eBPF
    program will result in the program failing to load and a verifier warning.

bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

Net device entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically. The ``value`` parameter
can be ``struct bpf_devmap_val`` or a simple ``int ifindex`` for backwards
compatibility.

.. code-block:: c

    struct bpf_devmap_val {
        __u32 ifindex;   /* device index */
        union {
            int   fd;  /* prog fd on map write */
            __u32 id;  /* prog id on map read */
        } bpf_prog;
    };

The ``flags`` argument can be one of the following:

- ``BPF_ANY``: Create a new element or update an existing element.
- ``BPF_NOEXIST``: Create a new element only if it did not exist.
- ``BPF_EXIST``: Update an existing element.

DEVMAPs can associate a program with a device entry by adding a ``bpf_prog.fd``
to ``struct bpf_devmap_val``. Programs are run after ``XDP_REDIRECT`` and have
access to both the Rx device and the Tx device. The program associated with the
``fd`` must have type XDP with expected attach type ``xdp_devmap``.
When a program is associated with a device index, the program is run on an
``XDP_REDIRECT`` and before the buffer is added to the per-CPU queue. Examples
of how to attach/use ``xdp_devmap`` programs can be found in the kernel selftests:

- ``tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c``
- ``tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c``

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_lookup_elem(int fd, const void *key, void *value);

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_delete_elem(int fd, const void *key);

Net device entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or a negative error in case of
failure.

Examples
========

Kernel BPF
----------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP``
called ``tx_port``.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP);
        __type(key, __u32);
        __type(value, __u32);
        __uint(max_entries, 256);
    } tx_port SEC(".maps");

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP_HASH``
called ``forward_map``.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
        __type(key, __u32);
        __type(value, struct bpf_devmap_val);
        __uint(max_entries, 32);
    } forward_map SEC(".maps");

.. note::

    The value type in the DEVMAP above is a ``struct bpf_devmap_val``.

The following code snippet shows a simple xdp_redirect_map program. This program
would work with a user space program that populates the devmap ``forward_map`` based
on ingress ifindexes. The BPF program (below) redirects packets using the
ingress ``ifindex`` as the ``key``.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        int index = ctx->ingress_ifindex;

        return bpf_redirect_map(&forward_map, index, 0);
    }

The following code snippet shows a BPF program that broadcasts packets to
all the interfaces in the ``tx_port`` devmap.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        return bpf_redirect_map(&tx_port, 0, BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
    }

User space
----------

The following code snippet shows how to update a devmap called ``tx_port``.

.. code-block:: c

    /* tx_port is assumed to be a struct bpf_map * taken from the loaded object. */
    int update_devmap(int ifindex, int redirect_ifindex)
    {
        int ret;

        /* A flags value of 0 is BPF_ANY: create or update the entry. */
        ret = bpf_map_update_elem(bpf_map__fd(tx_port), &ifindex, &redirect_ifindex, 0);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }

        return ret;
    }

The following code snippet shows how to update a ``BPF_MAP_TYPE_DEVMAP_HASH``
called ``forward_map``.

.. code-block:: c

    /* forward_map is assumed to be a struct bpf_map * taken from the loaded object. */
    int update_devmap(int ifindex, int redirect_ifindex)
    {
        struct bpf_devmap_val devmap_val = { .ifindex = redirect_ifindex };
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, 0);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }

        return ret;
    }

References
===========

- https://lwn.net/Articles/728146/
- https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=6f9d451ab1a33728adb72d7ff66a7b374d665176
- https://elixir.bootlin.com/linux/latest/source/net/core/filter.c#L4106