.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=================================================
BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH
=================================================

.. note::
   - ``BPF_MAP_TYPE_DEVMAP`` was introduced in kernel version 4.14
   - ``BPF_MAP_TYPE_DEVMAP_HASH`` was introduced in kernel version 5.4

``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` are BPF maps primarily
used as backend maps for the XDP BPF helper call ``bpf_redirect_map()``.
``BPF_MAP_TYPE_DEVMAP`` is backed by an array that uses the key as
the index to look up a reference to a net device, while ``BPF_MAP_TYPE_DEVMAP_HASH``
is backed by a hash table that uses the key to look up a reference to a net device.
The user provides either <``key``/ ``ifindex``> or <``key``/ ``struct bpf_devmap_val``>
pairs to update the maps with new net devices.

.. note::
    - The key to a hash map doesn't have to be an ``ifindex``.
    - While ``BPF_MAP_TYPE_DEVMAP_HASH`` allows for densely packing the net devices,
      it comes at the cost of hashing the key on every lookup.

The setup and packet enqueue/send code is shared between the two types of
devmap; only the lookup and insertion differ.

Usage
=====
Kernel BPF
----------
bpf_redirect_map()
^^^^^^^^^^^^^^^^^^
.. code-block:: c

    long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` this map contains
references to net devices (for forwarding packets through other ports).

The lower two bits of ``flags`` are used as the return code if the map lookup
fails. This is so that the return value can be one of the XDP program return
codes up to ``XDP_TX``, as chosen by the caller. The higher bits of ``flags``
can be set to ``BPF_F_BROADCAST`` or ``BPF_F_EXCLUDE_INGRESS`` as defined
below.

With ``BPF_F_BROADCAST`` the packet will be broadcast to all the interfaces
in the map. With ``BPF_F_EXCLUDE_INGRESS`` the ingress interface will be excluded
from the broadcast.

.. note::
    - The key is ignored if ``BPF_F_BROADCAST`` is set.
    - The broadcast feature can also be used to implement multicast forwarding:
      simply create multiple DEVMAPs, each one corresponding to a single multicast group.

This helper will return ``XDP_REDIRECT`` on success, or the value of the two
lower bits of the ``flags`` argument if the map lookup fails.

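The return-value rule can be illustrated with a small user-space model. This is
only a sketch: ``model_redirect_map()`` and the ``MODEL_``-prefixed constants are
hypothetical names, defined locally with the same values as their UAPI
counterparts in ``<linux/bpf.h>`` so that the block builds without kernel
headers. It is not the kernel implementation.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Local copies of the XDP action values (assumption: mirrors <linux/bpf.h>). */
    enum model_xdp_action {
        MODEL_XDP_ABORTED  = 0,
        MODEL_XDP_DROP     = 1,
        MODEL_XDP_PASS     = 2,
        MODEL_XDP_TX       = 3,
        MODEL_XDP_REDIRECT = 4,
    };

    /* Local copies of the broadcast flags; both sit above the two low bits. */
    #define MODEL_F_BROADCAST       (1ULL << 3)
    #define MODEL_F_EXCLUDE_INGRESS (1ULL << 4)

    /* Hypothetical model of the documented return value of bpf_redirect_map(). */
    static int model_redirect_map(bool entry_found, uint64_t flags)
    {
        /* With the broadcast flag the key is ignored, so no lookup can fail. */
        if (entry_found || (flags & MODEL_F_BROADCAST))
            return MODEL_XDP_REDIRECT;

        /* Lookup failed: the two lowest bits of flags pick the fallback action. */
        return (int)(flags & 0x3);
    }

In this model, passing ``0`` as ``flags`` yields ``XDP_ABORTED`` when the lookup
fails, while passing ``XDP_PASS`` lets the packet continue up the network stack
instead.
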
More information about redirection can be found in :doc:`redirect`.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

User space
----------
.. note::
    DEVMAP entries can only be updated/deleted from user space and not
    from an eBPF program. Trying to call these functions from a kernel eBPF
    program will result in the program failing to load and a verifier warning.

bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

Net device entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically. The ``value`` parameter
can be ``struct bpf_devmap_val`` or a simple ``int ifindex`` for backwards
compatibility.

.. code-block:: c

    struct bpf_devmap_val {
        __u32 ifindex;   /* device index */
        union {
            int   fd;  /* prog fd on map write */
            __u32 id;  /* prog id on map read */
        } bpf_prog;
    };

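The two accepted value layouts can be illustrated with a short user-space
sketch. The ``model_``-prefixed struct and helpers are hypothetical local
mirrors, used here only so the block builds without kernel headers; real code
would use ``struct bpf_devmap_val`` from ``<linux/bpf.h>``. On common ABIs the
plain ``ifindex`` value occupies 4 bytes and the struct occupies 8 bytes.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Local mirror of struct bpf_devmap_val (assumption: same field layout). */
    struct model_devmap_val {
        uint32_t ifindex;      /* device index */
        union {
            int      fd;       /* prog fd on map write */
            uint32_t id;       /* prog id on map read */
        } bpf_prog;
    };

    /* Value that only names the target device; bpf_prog is left zeroed. */
    static struct model_devmap_val model_devmap_val_plain(uint32_t ifindex)
    {
        struct model_devmap_val val = { .ifindex = ifindex };

        return val;
    }

    /* Value that also carries the fd of a program to associate with the entry. */
    static struct model_devmap_val model_devmap_val_with_prog(uint32_t ifindex,
                                                              int prog_fd)
    {
        struct model_devmap_val val = { .ifindex = ifindex };

        val.bpf_prog.fd = prog_fd;
        return val;
    }

A pointer to such a value is what would be passed as the ``value`` argument of
``bpf_map_update_elem()``; the value size declared for the map has to match the
layout that is used.
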
The ``flags`` argument can be one of the following:

- ``BPF_ANY``: Create a new element or update an existing element.
- ``BPF_NOEXIST``: Create a new element only if it did not exist.
- ``BPF_EXIST``: Update an existing element.

DEVMAPs can associate a program with a device entry by adding a ``bpf_prog.fd``
to ``struct bpf_devmap_val``. Programs are run after ``XDP_REDIRECT`` and have
access to both the Rx device and the Tx device. The program associated with the
``fd`` must have type XDP with expected attach type ``xdp_devmap``.
When a program is associated with a device index, the program is run on an
``XDP_REDIRECT`` and before the buffer is added to the per-cpu queue. Examples
of how to attach/use xdp_devmap progs can be found in the kernel selftests:

- ``tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c``
- ``tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c``

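As a rough sketch of what such a program might look like, the snippet below
drops frames that would be sent back out of the device they arrived on. The
function name and the drop policy are illustrative assumptions and are not
taken from the selftests. The ``xdp/devmap`` section name assumes a recent
libbpf, and the program must be built with a BPF-capable compiler against the
libbpf headers.

.. code-block:: c

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    /* Runs for each frame redirected to a devmap entry that carries this
     * program's fd in bpf_prog.fd.
     */
    SEC("xdp/devmap")
    int xdp_devmap_egress(struct xdp_md *ctx)
    {
        /* Hypothetical policy: never reflect a frame to its ingress device. */
        if (ctx->ingress_ifindex == ctx->egress_ifindex)
            return XDP_DROP;

        /* Let the frame continue to the Tx device. */
        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";

User space would then place the fd of the loaded program in ``bpf_prog.fd``
when updating the corresponding map entry.
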
bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_lookup_elem(int fd, const void *key, void *value);

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_delete_elem(int fd, const void *key);

Net device entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or negative error in case of
failure.

Examples
========

Kernel BPF
----------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP``
called ``tx_port``.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP);
        __type(key, __u32);
        __type(value, __u32);
        __uint(max_entries, 256);
    } tx_port SEC(".maps");

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP_HASH``
called ``forward_map``.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
        __type(key, __u32);
        __type(value, struct bpf_devmap_val);
        __uint(max_entries, 32);
    } forward_map SEC(".maps");

.. note::

    The value type in the DEVMAP above is a ``struct bpf_devmap_val``.

The following code snippet shows a simple xdp_redirect_map program. It is meant
to work with a user space program that populates the devmap ``forward_map``
based on ingress ifindexes. The BPF program below redirects packets using the
ingress ``ifindex`` as the ``key``.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        int index = ctx->ingress_ifindex;

        return bpf_redirect_map(&forward_map, index, 0);
    }

The following code snippet shows a BPF program that broadcasts packets to
all the interfaces in the ``tx_port`` devmap.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        return bpf_redirect_map(&tx_port, 0, BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
    }

User space
----------

The following code snippet shows how to update a devmap called ``tx_port``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(tx_port), &ifindex, &redirect_ifindex, 0);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }

        return ret;
    }

The following code snippet shows how to update a ``BPF_MAP_TYPE_DEVMAP_HASH``
called ``forward_map``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        struct bpf_devmap_val devmap_val = { .ifindex = redirect_ifindex };
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, 0);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }
        return ret;
    }

References
==========

- https://lwn.net/Articles/728146/
- https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=6f9d451ab1a33728adb72d7ff66a7b374d665176
- https://elixir.bootlin.com/linux/latest/source/net/core/filter.c#L4106