.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=================================================
BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH
=================================================

.. note::
   - ``BPF_MAP_TYPE_DEVMAP`` was introduced in kernel version 4.14
   - ``BPF_MAP_TYPE_DEVMAP_HASH`` was introduced in kernel version 5.4

``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` are BPF maps primarily
used as backend maps for the XDP BPF helper call ``bpf_redirect_map()``.
``BPF_MAP_TYPE_DEVMAP`` is backed by an array that uses the key as
the index to look up a reference to a net device, while ``BPF_MAP_TYPE_DEVMAP_HASH``
is backed by a hash table that uses a key to look up a reference to a net device.
The user provides either <``key``/ ``ifindex``> or <``key``/ ``struct bpf_devmap_val``>
pairs to update the maps with new net devices.

.. note::
   - The key to a hash map doesn't have to be an ``ifindex``.
   - While ``BPF_MAP_TYPE_DEVMAP_HASH`` allows for densely packing the net devices,
     it comes at the cost of hashing the key on every lookup.

The setup and packet enqueue/send code is shared between the two types of
devmap; only the lookup and insertion differ.

Usage
=====
Kernel BPF
----------
.. c:function::
   long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` this map contains
references to net devices (for forwarding packets through other ports).

The lower two bits of ``flags`` are used as the return code if the map lookup
fails. This is so that the return value can be one of the XDP program return
codes up to ``XDP_TX``, as chosen by the caller. The higher bits of ``flags``
can be set to ``BPF_F_BROADCAST`` or ``BPF_F_EXCLUDE_INGRESS`` as defined
below.
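
The fallback behaviour described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: ``model_redirect_map()`` and its ``lookup_ok`` parameter are hypothetical
names, and the action values are copied here to mirror those in
``include/uapi/linux/bpf.h``.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* XDP action values, mirrored here for illustration only. */
    enum { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };

    /*
     * Hypothetical model of the helper's return value.
     * lookup_ok stands in for the result of the devmap lookup.
     */
    static long model_redirect_map(bool lookup_ok, uint64_t flags)
    {
        /* The lower two bits cover XDP_ABORTED through XDP_TX. */
        const uint64_t action_mask = 0x3;

        if (lookup_ok)
            return XDP_REDIRECT;

        /* Lookup failed: return the action the caller encoded in flags. */
        return (long)(flags & action_mask);
    }

In an XDP program this corresponds to passing, for example, ``XDP_PASS`` as
``flags`` so that a packet with no matching map entry continues up the stack.
With ``flags`` set to 0, a failed lookup yields ``XDP_ABORTED``.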

With ``BPF_F_BROADCAST`` the packet will be broadcast to all the interfaces
in the map; with ``BPF_F_EXCLUDE_INGRESS`` the ingress interface will be excluded
from the broadcast.

.. note::
   - The key is ignored if ``BPF_F_BROADCAST`` is set.
   - The broadcast feature can also be used to implement multicast forwarding:
     simply create multiple DEVMAPs, each one corresponding to a single multicast group.

This helper will return ``XDP_REDIRECT`` on success, or the value of the two
lower bits of the ``flags`` argument if the map lookup fails.

More information about redirection can be found in :doc:`redirect`.

.. c:function::
   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

Userspace
---------
.. note::
   DEVMAP entries can only be updated/deleted from user space and not
   from an eBPF program. Trying to call these functions from a kernel eBPF
   program will result in the program failing to load and a verifier warning.

.. c:function::
   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

   Net device entries can be added or updated using the ``bpf_map_update_elem()``
   helper. This helper replaces existing elements atomically. The ``value`` parameter
   can be ``struct bpf_devmap_val`` or a simple ``int ifindex`` for backwards
   compatibility.

   .. code-block:: c

       struct bpf_devmap_val {
           __u32 ifindex;   /* device index */
           union {
               int   fd;  /* prog fd on map write */
               __u32 id;  /* prog id on map read */
           } bpf_prog;
       };

   The ``flags`` argument can be one of the following:

   - ``BPF_ANY``: Create a new element or update an existing element.
   - ``BPF_NOEXIST``: Create a new element only if it did not exist.
   - ``BPF_EXIST``: Update an existing element.
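
   The two value layouts mentioned above (a bare ``ifindex`` or the full
   structure) can be sketched in plain C. This is an illustrative model only:
   ``struct devmap_val_model`` is a hypothetical local mirror of
   ``struct bpf_devmap_val``, and ``make_devmap_val()`` is a hypothetical
   helper; neither is part of libbpf or the kernel headers.

   .. code-block:: c

       #include <stdint.h>

       /* Hypothetical mirror of struct bpf_devmap_val, for illustration only. */
       struct devmap_val_model {
           uint32_t ifindex;        /* device index */
           union {
               int      fd;         /* prog fd on map write */
               uint32_t id;         /* prog id on map read */
           } bpf_prog;
       };

       /* Hypothetical helper: build the value passed on a map write. */
       static struct devmap_val_model make_devmap_val(uint32_t ifindex, int prog_fd)
       {
           struct devmap_val_model val = { .ifindex = ifindex };

           val.bpf_prog.fd = prog_fd;
           return val;
       }

   On common platforms the bare ``ifindex`` form occupies 4 bytes and the
   structure form 8 bytes, so the value size declared for the map should match
   whichever form user space writes.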

   DEVMAPs can associate a program with a device entry by adding a ``bpf_prog.fd``
   to ``struct bpf_devmap_val``. Programs are run after ``XDP_REDIRECT`` and have
   access to both the Rx device and the Tx device. The program associated with the ``fd``
   must have type XDP with expected attach type ``xdp_devmap``.
   When a program is associated with a device index, the program is run on an
   ``XDP_REDIRECT`` and before the buffer is added to the per-cpu queue. Examples
   of how to attach/use xdp_devmap progs can be found in the kernel selftests:

   - ``tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c``
   - ``tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c``

.. c:function::
   int bpf_map_lookup_elem(int fd, const void *key, void *value);

   Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
   helper.

.. c:function::
   int bpf_map_delete_elem(int fd, const void *key);

   Net device entries can be deleted using the ``bpf_map_delete_elem()``
   helper. This helper will return 0 on success, or a negative error in case of
   failure.

Examples
========

Kernel BPF
----------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP``
called ``tx_port``.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP);
        __type(key, __u32);
        __type(value, __u32);
        __uint(max_entries, 256);
    } tx_port SEC(".maps");

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP_HASH``
called ``forward_map``.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
        __type(key, __u32);
        __type(value, struct bpf_devmap_val);
        __uint(max_entries, 32);
    } forward_map SEC(".maps");

.. note::

    The value type in the DEVMAP above is a ``struct bpf_devmap_val``.

The following code snippet shows a simple xdp_redirect_map program.
This program would work with a user space program that populates the devmap
``forward_map`` based on ingress ifindexes. The BPF program (below) redirects
packets using the ingress ``ifindex`` as the ``key``.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        /* Use the ingress interface index as the map key. */
        int index = ctx->ingress_ifindex;

        return bpf_redirect_map(&forward_map, index, 0);
    }

The following code snippet shows a BPF program that broadcasts packets to
all the interfaces in the ``tx_port`` devmap.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        /* The key (0) is ignored because BPF_F_BROADCAST is set. */
        return bpf_redirect_map(&tx_port, 0, BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
    }

User space
----------

The following code snippet shows how to update a devmap called ``tx_port``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(tx_port), &ifindex, &redirect_ifindex, BPF_ANY);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }

        return ret;
    }

The following code snippet shows how to update a ``BPF_MAP_TYPE_DEVMAP_HASH``
map called ``forward_map``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        struct bpf_devmap_val devmap_val = { .ifindex = redirect_ifindex };
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, BPF_ANY);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }

        return ret;
    }

References
===========

- https://lwn.net/Articles/728146/
- https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=6f9d451ab1a33728adb72d7ff66a7b374d665176
- https://elixir.bootlin.com/linux/latest/source/net/core/filter.c#L4106