|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14 |
|
| #
ae0a457f |
| 18-Mar-2025 |
Emil Tsalapatis <[email protected]> |
bpf: Make perf_event_read_output accessible in all program types.
The perf_event_read_event_output helper is currently only available to tracing protrams, but is useful for other BPF programs like s
bpf: Make perf_event_read_output accessible in all program types.
The perf_event_read_event_output helper is currently only available to tracing protrams, but is useful for other BPF programs like sched_ext schedulers. When the helper is available, provide its bpf_func_proto directly from the bpf base_proto.
Signed-off-by: Emil Tsalapatis (Meta) <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc7, v6.14-rc6, v6.14-rc5 |
|
| #
daec295a |
| 26-Feb-2025 |
Mykyta Yatsenko <[email protected]> |
bpf/helpers: Introduce bpf_dynptr_copy kfunc
Introducing bpf_dynptr_copy kfunc allowing copying data from one dynptr to another. This functionality is useful in scenarios such as capturing XDP data
bpf/helpers: Introduce bpf_dynptr_copy kfunc
Introducing bpf_dynptr_copy kfunc allowing copying data from one dynptr to another. This functionality is useful in scenarios such as capturing XDP data to a ring buffer. The implementation consists of 4 branches: * A fast branch for contiguous buffer capacity in both source and destination dynptrs * 3 branches utilizing __bpf_dynptr_read and __bpf_dynptr_write to copy data to/from non-contiguous buffer
Signed-off-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
09206af6 |
| 26-Feb-2025 |
Mykyta Yatsenko <[email protected]> |
bpf/helpers: Refactor bpf_dynptr_read and bpf_dynptr_write
Refactor bpf_dynptr_read and bpf_dynptr_write helpers: extract code into the static functions namely __bpf_dynptr_read and __bpf_dynptr_wri
bpf/helpers: Refactor bpf_dynptr_read and bpf_dynptr_write
Refactor bpf_dynptr_read and bpf_dynptr_write helpers: extract code into the static functions namely __bpf_dynptr_read and __bpf_dynptr_write, this allows calling these without compiler warnings.
Signed-off-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4, v6.14-rc3 |
|
| #
f0f8a5b5 |
| 13-Feb-2025 |
Jordan Rome <[email protected]> |
bpf: Add bpf_copy_from_user_task_str() kfunc
This new kfunc will be able to copy a zero-terminated C strings from another task's address space. This is similar to `bpf_copy_from_user_str()` but read
bpf: Add bpf_copy_from_user_task_str() kfunc
This new kfunc will be able to copy a zero-terminated C strings from another task's address space. This is similar to `bpf_copy_from_user_str()` but reads memory of specified task.
Signed-off-by: Jordan Rome <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
|
Revision tags: v6.14-rc2 |
|
| #
deacdc87 |
| 05-Feb-2025 |
Nam Cao <[email protected]> |
bpf: Switch to use hrtimer_setup()
hrtimer_setup() takes the callback function pointer as argument and initializes the timer completely.
Replace hrtimer_init() and the open coded initialization of
bpf: Switch to use hrtimer_setup()
hrtimer_setup() takes the callback function pointer as argument and initializes the timer completely.
Replace hrtimer_init() and the open coded initialization of hrtimer::function with the new setup mechanism.
Patch was created by using Coccinelle.
Signed-off-by: Nam Cao <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/e4be2486f02a8e0ef5aa42624f1708d23e88ad57.1738746821.git.namcao@linutronix.de
show more ...
|
|
Revision tags: v6.14-rc1, v6.13 |
|
| #
58f038e6 |
| 17-Jan-2025 |
Hou Tao <[email protected]> |
bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT
During the update procedure, when overwrite element in a pre-allocated htab, the freeing of old_element is protected by the bucket lo
bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT
During the update procedure, when overwrite element in a pre-allocated htab, the freeing of old_element is protected by the bucket lock. The reason why the bucket lock is necessary is that the old_element has already been stashed in htab->extra_elems after alloc_htab_elem() returns. If freeing the old_element after the bucket lock is unlocked, the stashed element may be reused by concurrent update procedure and the freeing of old_element will run concurrently with the reuse of the old_element. However, the invocation of check_and_free_fields() may acquire a spin-lock which violates the lockdep rule because its caller has already held a raw-spin-lock (bucket lock). The following warning will be reported when such race happens:
BUG: scheduling while atomic: test_progs/676/0x00000003 3 locks held by test_progs/676: #0: ffffffff864b0240 (rcu_read_lock_trace){....}-{0:0}, at: bpf_prog_test_run_syscall+0x2c0/0x830 #1: ffff88810e961188 (&htab->lockdep_key){....}-{2:2}, at: htab_map_update_elem+0x306/0x1500 #2: ffff8881f4eac1b8 (&base->softirq_expiry_lock){....}-{2:2}, at: hrtimer_cancel_wait_running+0xe9/0x1b0 Modules linked in: bpf_testmod(O) Preemption disabled at: [<ffffffff817837a3>] htab_map_update_elem+0x293/0x1500 CPU: 0 UID: 0 PID: 676 Comm: test_progs Tainted: G ... 6.12.0+ #11 Tainted: [W]=WARN, [O]=OOT_MODULE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)... Call Trace: <TASK> dump_stack_lvl+0x57/0x70 dump_stack+0x10/0x20 __schedule_bug+0x120/0x170 __schedule+0x300c/0x4800 schedule_rtlock+0x37/0x60 rtlock_slowlock_locked+0x6d9/0x54c0 rt_spin_lock+0x168/0x230 hrtimer_cancel_wait_running+0xe9/0x1b0 hrtimer_cancel+0x24/0x30 bpf_timer_delete_work+0x1d/0x40 bpf_timer_cancel_and_free+0x5e/0x80 bpf_obj_free_fields+0x262/0x4a0 check_and_free_fields+0x1d0/0x280 htab_map_update_elem+0x7fc/0x1500 bpf_prog_9f90bc20768e0cb9_overwrite_cb+0x3f/0x43 bpf_prog_ea601c4649694dbd_overwrite_timer+0x5d/0x7e bpf_prog_test_run_syscall+0x322/0x830 __sys_bpf+0x135d/0x3ca0 __x64_sys_bpf+0x75/0xb0 x64_sys_call+0x1b5/0xa10 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 ... </TASK>
It seems feasible to break the reuse and refill of per-cpu extra_elems into two independent parts: reuse the per-cpu extra_elems with bucket lock being held and refill the old_element as per-cpu extra_elems after the bucket lock is unlocked. However, it will make the concurrent overwrite procedures on the same CPU return unexpected -E2BIG error when the map is full.
Therefore, the patch fixes the lock problem by breaking the cancelling of bpf_timer into two steps for PREEMPT_RT: 1) use hrtimer_try_to_cancel() and check its return value 2) if the timer is running, use hrtimer_cancel() through a kworker to cancel it again Considering that the current implementation of hrtimer_cancel() will try to acquire a being held softirq_expiry_lock when the current timer is running, these steps above are reasonable. However, it also has downside. When the timer is running, the cancelling of the timer is delayed when releasing the last map uref. The delay is also fixable (e.g., break the cancelling of bpf timer into two parts: one part in locked scope, another one in unlocked scope), it can be revised later if necessary.
It is a bit hard to decide the right fix tag. One reason is that the problem depends on PREEMPT_RT which is enabled in v6.12. Considering the softirq_expiry_lock lock exists since v5.4 and bpf_timer is introduced in v5.15, the bpf_timer commit is used in the fixes tag and an extra depends-on tag is added to state the dependency on PREEMPT_RT.
Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.") Depends-on: v6.12+ with PREEMPT_RT enabled Reported-by: Sebastian Andrzej Siewior <[email protected]> Closes: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Hou Tao <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc7 |
|
| #
1d2dbe71 |
| 08-Jan-2025 |
Hou Tao <[email protected]> |
bpf: Remove migrate_{disable|enable} in bpf_obj_free_fields()
The callers of bpf_obj_free_fields() have already guaranteed that the migration is disabled, therefore, there is no need to invoke migra
bpf: Remove migrate_{disable|enable} in bpf_obj_free_fields()
The callers of bpf_obj_free_fields() have already guaranteed that the migration is disabled, therefore, there is no need to invoke migrate_{disable,enable} pair in bpf_obj_free_fields()'s underly implementation.
This patch removes unnecessary migrate_{disable|enable} pairs from bpf_obj_free_fields() and its callees: bpf_list_head_free() and bpf_rb_root_free().
Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3 |
|
| #
00a5acdb |
| 12-Dec-2024 |
Thomas Weißschuh <[email protected]> |
bpf: Fix configuration-dependent BTF function references
These BTF functions are not available unconditionally, only reference them when they are available.
Avoid the following build warnings:
B
bpf: Fix configuration-dependent BTF function references
These BTF functions are not available unconditionally, only reference them when they are available.
Avoid the following build warnings:
BTF .tmp_vmlinux1.btf.o btf_encoder__tag_kfunc: failed to find kfunc 'bpf_send_signal_task' in BTF btf_encoder__tag_kfuncs: failed to tag kfunc 'bpf_send_signal_task' NM .tmp_vmlinux1.syms KSYMS .tmp_vmlinux1.kallsyms.S AS .tmp_vmlinux1.kallsyms.o LD .tmp_vmlinux2 NM .tmp_vmlinux2.syms KSYMS .tmp_vmlinux2.kallsyms.S AS .tmp_vmlinux2.kallsyms.o LD vmlinux BTFIDS vmlinux WARN: resolve_btfids: unresolved symbol prog_test_ref_kfunc WARN: resolve_btfids: unresolved symbol bpf_crypto_ctx WARN: resolve_btfids: unresolved symbol bpf_send_signal_task WARN: resolve_btfids: unresolved symbol bpf_modify_return_test_tp WARN: resolve_btfids: unresolved symbol bpf_dynptr_from_xdp WARN: resolve_btfids: unresolved symbol bpf_dynptr_from_skb
Signed-off-by: Thomas Weißschuh <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
|
Revision tags: v6.13-rc2 |
|
| #
c8e2ee1f |
| 04-Dec-2024 |
Kumar Kartikeya Dwivedi <[email protected]> |
bpf: Introduce support for bpf_local_irq_{save,restore}
Teach the verifier about IRQ-disabled sections through the introduction of two new kfuncs, bpf_local_irq_save, to save IRQ state and disable t
bpf: Introduce support for bpf_local_irq_{save,restore}
Teach the verifier about IRQ-disabled sections through the introduction of two new kfuncs, bpf_local_irq_save, to save IRQ state and disable them, and bpf_local_irq_restore, to restore IRQ state and enable them back again.
For the purposes of tracking the saved IRQ state, the verifier is taught about a new special object on the stack of type STACK_IRQ_FLAG. This is a 8 byte value which saves the IRQ flags which are to be passed back to the IRQ restore kfunc.
Renumber the enums for REF_TYPE_* to simplify the check in find_lock_state, filtering out non-lock types as they grow will become cumbersome and is unecessary.
To track a dynamic number of IRQ-disabled regions and their associated saved states, a new resource type RES_TYPE_IRQ is introduced, which its state management functions: acquire_irq_state and release_irq_state, taking advantage of the refactoring and clean ups made in earlier commits.
One notable requirement of the kernel's IRQ save and restore API is that they cannot happen out of order. For this purpose, when releasing reference we keep track of the prev_id we saw with REF_TYPE_IRQ. Since reference states are inserted in increasing order of the index, this is used to remember the ordering of acquisitions of IRQ saved states, so that we maintain a logical stack in acquisition order of resource identities, and can enforce LIFO ordering when restoring IRQ state. The top of the stack is maintained using bpf_verifier_state's active_irq_id.
To maintain the stack property when releasing reference states, we need to modify release_reference_state to instead shift the remaining array left using memmove instead of swapping deleted element with last that might break the ordering. A selftest to test this subtle behavior is added in late patches.
The logic to detect initialized and unitialized irq flag slots, marking and unmarking is similar to how it's done for iterators. No additional checks are needed in refsafe for REF_TYPE_IRQ, apart from the usual check_id satisfiability check on the ref[i].id. We have to perform the same check_ids check on state->active_irq_id as well.
To ensure we don't get assigned REF_TYPE_PTR by default after acquire_reference_state, if someone forgets to assign the type, let's also renumber the enum ref_state_type. This way any unassigned types get caught by refsafe's default switch statement, don't assume REF_TYPE_PTR by default.
The kfuncs themselves are plain wrappers over local_irq_save and local_irq_restore macros.
Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6 |
|
| #
2e9a5480 |
| 30-Oct-2024 |
Namhyung Kim <[email protected]> |
bpf: Add open coded version of kmem_cache iterator
Add a new open coded iterator for kmem_cache which can be called from a BPF program like below. It doesn't take any argument and traverses all kme
bpf: Add open coded version of kmem_cache iterator
Add a new open coded iterator for kmem_cache which can be called from a BPF program like below. It doesn't take any argument and traverses all kmem_cache entries.
struct kmem_cache *pos;
bpf_for_each(kmem_cache, pos) { ... }
As it needs to grab slab_mutex, it should be called from sleepable BPF programs only.
Also update the existing iterator code to use the open coded version internally as suggested by Andrii.
Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
e1339383 |
| 30-Oct-2024 |
Hou Tao <[email protected]> |
bpf: Use __u64 to save the bits in bits iterator
On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the content of
bpf: Use __u64 to save the bits in bits iterator
On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the content of the u64. However, bits_copy is only 4 bytes, leading to stack corruption.
The straightforward solution would be to replace u64 with unsigned long in bpf_iter_bits_new(). However, this introduces confusion and problems for 32-bit hosts because the size of ulong in bpf program is 8 bytes, but it is treated as 4-bytes after passed to bpf_iter_bits_new().
Fix it by changing the type of both bits and bit_count from unsigned long to u64. However, the change is not enough. The main reason is that bpf_iter_bits_next() uses find_next_bit() to find the next bit and the pointer passed to find_next_bit() is an unsigned long pointer instead of a u64 pointer. For 32-bit little-endian host, it is fine but it is not the case for 32-bit big-endian host. Because under 32-bit big-endian host, the first iterated unsigned long will be the bits 32-63 of the u64 instead of the expected bits 0-31. Therefore, in addition to changing the type, swap the two unsigned longs within the u64 for 32-bit big-endian host.
Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
393397fb |
| 30-Oct-2024 |
Hou Tao <[email protected]> |
bpf: Check the validity of nr_words in bpf_iter_bits_new()
Check the validity of nr_words in bpf_iter_bits_new(). Without this check, when multiplication overflow occurs for nr_bits (e.g., when nr_w
bpf: Check the validity of nr_words in bpf_iter_bits_new()
Check the validity of nr_words in bpf_iter_bits_new(). Without this check, when multiplication overflow occurs for nr_bits (e.g., when nr_words = 0x0400-0001, nr_bits becomes 64), stack corruption may occur due to bpf_probe_read_kernel_common(..., nr_bytes = 0x2000-0008).
Fix it by limiting the maximum value of nr_words to 511. The value is derived from the current implementation of BPF memory allocator. To ensure compatibility if the BPF memory allocator's size limitation changes in the future, use the helper bpf_mem_alloc_check_size() to check whether nr_bytes is too larger. And return -E2BIG instead of -ENOMEM for oversized nr_bytes.
Fixes: 4665415975b0 ("bpf: Add bits iterator") Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
101ccfba |
| 30-Oct-2024 |
Hou Tao <[email protected]> |
bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()
bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the bits are dynamically allocated. However, the check is incorrect
bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()
bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the bits are dynamically allocated. However, the check is incorrect and may cause a kmemleak as shown below:
unreferenced object 0xffff88812628c8c0 (size 32): comm "swapper/0", pid 1, jiffies 4294727320 hex dump (first 32 bytes): b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U........... f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 .............. backtrace (crc 781e32cc): [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 [<00000000b119f72f>] kernel_init+0x20/0x160 [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30
That is because nr_bits will be set as zero in bpf_iter_bits_next() after all bits have been iterated.
Fix the issue by setting kit->bit to kit->nr_bits instead of setting kit->nr_bits to zero when the iteration completes in bpf_iter_bits_next(). In addition, use "!nr_bits || bits >= nr_bits" to check whether the iteration is complete and still use "nr_bits > 64" to indicate whether bits are dynamically allocated. The "!nr_bits" check is necessary because bpf_iter_bits_new() may fail before setting kit->nr_bits, and this condition will stop the iteration early instead of accessing the zeroed or freed kit->bits.
Considering the initial value of kit->bits is -1 and the type of kit->nr_bits is unsigned int, change the type of kit->nr_bits to int. The potential overflow problem will be handled in the following patch.
Fixes: 4665415975b0 ("bpf: Add bits iterator") Acked-by: Yafang Shao <[email protected]> Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc5 |
|
| #
6fad274f |
| 21-Oct-2024 |
Daniel Borkmann <[email protected]> |
bpf: Add MEM_WRITE attribute
Add a MEM_WRITE attribute for BPF helper functions which can be used in bpf_func_proto to annotate an argument type in order to let the verifier know that the helper wri
bpf: Add MEM_WRITE attribute
Add a MEM_WRITE attribute for BPF helper functions which can be used in bpf_func_proto to annotate an argument type in order to let the verifier know that the helper writes into the memory passed as an argument. In the past MEM_UNINIT has been (ab)used for this function, but the latter merely tells the verifier that the passed memory can be uninitialized.
There have been bugs with overloading the latter but aside from that there are also cases where the passed memory is read + written which currently cannot be expressed, see also 4b3786a6c539 ("bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error").
Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc4 |
|
| #
6280cf71 |
| 16-Oct-2024 |
Puranjay Mohan <[email protected]> |
bpf: Implement bpf_send_signal_task() kfunc
Implement bpf_send_signal_task kfunc that is similar to bpf_send_signal_thread and bpf_send_signal helpers but can be used to send signals to other threa
bpf: Implement bpf_send_signal_task() kfunc
Implement bpf_send_signal_task kfunc that is similar to bpf_send_signal_thread and bpf_send_signal helpers but can be used to send signals to other threads and processes. It also supports sending a cookie with the signal similar to sigqueue().
If the receiving process establishes a handler for the signal using the SA_SIGINFO flag to sigaction(), then it can obtain this cookie via the si_value field of the siginfo_t structure passed as the second argument to the handler.
Signed-off-by: Puranjay Mohan <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
| #
675c3596 |
| 14-Oct-2024 |
Juntong Deng <[email protected]> |
bpf: Add bpf_task_from_vpid() kfunc
bpf_task_from_pid() that currently exists looks up the struct task_struct corresponding to the pid in the root pid namespace (init_pid_ns).
This patch adds bpf_t
bpf: Add bpf_task_from_vpid() kfunc
bpf_task_from_pid() that currently exists looks up the struct task_struct corresponding to the pid in the root pid namespace (init_pid_ns).
This patch adds bpf_task_from_vpid() which looks up the struct task_struct corresponding to vpid in the pid namespace of the current process.
This is useful for getting information about other processes in the same pid namespace.
Signed-off-by: Juntong Deng <[email protected]> Link: https://lore.kernel.org/r/AM6PR03MB5848E50DA58F79CDE65433C399442@AM6PR03MB5848.eurprd03.prod.outlook.com Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc3 |
|
| #
a992d7a3 |
| 10-Oct-2024 |
Namhyung Kim <[email protected]> |
mm/bpf: Add bpf_get_kmem_cache() kfunc
The bpf_get_kmem_cache() is to get a slab cache information from a virtual address like virt_to_cache(). If the address is a pointer to a slab object, it'd re
mm/bpf: Add bpf_get_kmem_cache() kfunc
The bpf_get_kmem_cache() is to get a slab cache information from a virtual address like virt_to_cache(). If the address is a pointer to a slab object, it'd return a valid kmem_cache pointer, otherwise NULL is returned.
It doesn't grab a reference count of the kmem_cache so the caller is responsible to manage the access. The returned point is marked as PTR_UNTRUSTED.
The intended use case for now is to symbolize locks in slab objects from the lock contention tracepoints.
Suggested-by: Vlastimil Babka <[email protected]> Acked-by: Roman Gushchin <[email protected]> (mm/*) Acked-by: Vlastimil Babka <[email protected]> #mm/slab Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc2, v6.12-rc1 |
|
| #
da7d71bc |
| 16-Sep-2024 |
Eduard Zingerman <[email protected]> |
bpf: Use KF_FASTCALL to mark kfuncs supporting fastcall contract
In order to allow pahole add btf_decl_tag("bpf_fastcall") for kfuncs supporting bpf_fastcall, mark such functions with KF_FASTCALL in
bpf: Use KF_FASTCALL to mark kfuncs supporting fastcall contract
In order to allow pahole add btf_decl_tag("bpf_fastcall") for kfuncs supporting bpf_fastcall, mark such functions with KF_FASTCALL in id_set8 objects.
Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.11 |
|
| #
4b3786a6 |
| 13-Sep-2024 |
Daniel Borkmann <[email protected]> |
bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error
For all non-tracing helpers which formerly had ARG_PTR_TO_{LONG,INT} as input arguments, zero the value for the case of an error as other
bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error
For all non-tracing helpers which formerly had ARG_PTR_TO_{LONG,INT} as input arguments, zero the value for the case of an error as otherwise it could leak memory. For tracing, it is not needed given CAP_PERFMON can already read all kernel memory anyway hence bpf_get_func_arg() and bpf_get_func_ret() is skipped in here.
Also, the MTU helpers mtu_len pointer value is being written but also read. Technically, the MEM_UNINIT should not be there in order to always force init. Removing MEM_UNINIT needs more verifier rework though: MEM_UNINIT right now implies two things actually: i) write into memory, ii) memory does not have to be initialized. If we lift MEM_UNINIT, it then becomes: i) read into memory, ii) memory must be initialized. This means that for bpf_*_check_mtu() we're readding the issue we're trying to fix, that is, it would then be able to write back into things like .rodata BPF maps. Follow-up work will rework the MEM_UNINIT semantics such that the intent can be better expressed. For now just clear the *mtu_len on error path which can be lifted later again.
Fixes: 8a67f2de9b1d ("bpf: expose bpf_strtol and bpf_strtoul to all program types") Fixes: d7a4cb9b6705 ("bpf: Introduce bpf_strtol and bpf_strtoul helpers") Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
32556ce9 |
| 13-Sep-2024 |
Daniel Borkmann <[email protected]> |
bpf: Fix helper writes to read-only maps
Lonial found an issue that despite user- and BPF-side frozen BPF map (like in case of .rodata), it was still possible to write into it from a BPF program sid
bpf: Fix helper writes to read-only maps
Lonial found an issue that despite user- and BPF-side frozen BPF map (like in case of .rodata), it was still possible to write into it from a BPF program side through specific helpers having ARG_PTR_TO_{LONG,INT} as arguments.
In check_func_arg() when the argument is as mentioned, the meta->raw_mode is never set. Later, check_helper_mem_access(), under the case of PTR_TO_MAP_VALUE as register base type, it assumes BPF_READ for the subsequent call to check_map_access_type() and given the BPF map is read-only it succeeds.
The helpers really need to be annotated as ARG_PTR_TO_{LONG,INT} | MEM_UNINIT when results are written into them as opposed to read out of them. The latter indicates that it's okay to pass a pointer to uninitialized memory as the memory is written to anyway.
However, ARG_PTR_TO_{LONG,INT} is a special case of ARG_PTR_TO_FIXED_SIZE_MEM just with additional alignment requirement. So it is better to just get rid of the ARG_PTR_TO_{LONG,INT} special cases altogether and reuse the fixed size memory types. For this, add MEM_ALIGNED to additionally ensure alignment given these helpers write directly into the args via *<ptr> = val. The .arg*_size has been initialized reflecting the actual sizeof(*<ptr>).
MEM_ALIGNED can only be used in combination with MEM_FIXED_SIZE annotated argument types, since in !MEM_FIXED_SIZE cases the verifier does not know the buffer size a priori and therefore cannot blindly write *<ptr> = val.
Fixes: 57c3bb725a3d ("bpf: Introduce ARG_PTR_TO_{INT,LONG} arg types") Reported-by: Lonial Con <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Shung-Hsi Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
7d71f59e |
| 13-Sep-2024 |
Daniel Borkmann <[email protected]> |
bpf: Remove truncation test in bpf_strtol and bpf_strtoul helpers
Both bpf_strtol() and bpf_strtoul() helpers passed a temporary "long long" respectively "unsigned long long" to __bpf_strtoll() / __
bpf: Remove truncation test in bpf_strtol and bpf_strtoul helpers
Both bpf_strtol() and bpf_strtoul() helpers passed a temporary "long long" respectively "unsigned long long" to __bpf_strtoll() / __bpf_strtoull().
Later, the result was checked for truncation via _res != ({unsigned,} long)_res as the destination buffer for the BPF helpers was of type {unsigned,} long which is 32bit on 32bit architectures.
Given the latter was a bug in the helper signatures where the destination buffer got adjusted to {s,u}64, the truncation check can now be removed.
Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
cfe69c50 |
| 13-Sep-2024 |
Daniel Borkmann <[email protected]> |
bpf: Fix bpf_strtol and bpf_strtoul helpers for 32bit
The bpf_strtol() and bpf_strtoul() helpers are currently broken on 32bit:
The argument type ARG_PTR_TO_LONG is BPF-side "long", not kernel-side
bpf: Fix bpf_strtol and bpf_strtoul helpers for 32bit
The bpf_strtol() and bpf_strtoul() helpers are currently broken on 32bit:
The argument type ARG_PTR_TO_LONG is BPF-side "long", not kernel-side "long" and therefore always considered fixed 64bit no matter if 64 or 32bit underlying architecture.
This contract breaks in case of the two mentioned helpers since their BPF_CALL definition for the helpers was added with {unsigned,}long *res. Meaning, the transition from BPF-side "long" (BPF program) to kernel-side "long" (BPF helper) breaks here.
Both helpers call __bpf_strtoll() with "long long" correctly, but later assigning the result into 32-bit "*(long *)" on 32bit architectures. From a BPF program point of view, this means upper bits will be seen as uninitialised.
Therefore, fix both BPF_CALL signatures to {s,u}64 types to fix this situation.
Now, changing also uapi/bpf.h helper documentation which generates bpf_helper_defs.h for BPF programs is tricky: Changing signatures there to __{s,u}64 would trigger compiler warnings (incompatible pointer types passing 'long *' to parameter of type '__s64 *' (aka 'long long *')) for existing BPF programs.
Leaving the signatures as-is would be fine as from BPF program point of view it is still BPF-side "long" and thus equivalent to __{s,u}64 on 64 or 32bit underlying architectures.
Note that bpf_strtol() and bpf_strtoul() are the only helpers with this issue.
Fixes: d7a4cb9b6705 ("bpf: Introduce bpf_strtol and bpf_strtoul helpers") Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc7, v6.11-rc6 |
|
| #
866d571e |
| 29-Aug-2024 |
Martin KaFai Lau <[email protected]> |
bpf: Export bpf_base_func_proto
The bpf_testmod needs to use the bpf_tail_call helper in a later selftest patch. This patch is to EXPORT_GPL_SYMBOL the bpf_base_func_proto.
Signed-off-by: Martin Ka
bpf: Export bpf_base_func_proto
The bpf_testmod needs to use the bpf_tail_call helper in a later selftest patch. This patch is to EXPORT_GPL_SYMBOL the bpf_base_func_proto.
Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc5 |
|
| #
65ab5ac4 |
| 23-Aug-2024 |
Jordan Rome <[email protected]> |
bpf: Add bpf_copy_from_user_str kfunc
This adds a kfunc wrapper around strncpy_from_user, which can be called from sleepable BPF programs.
This matches the non-sleepable 'bpf_probe_read_user_str' h
bpf: Add bpf_copy_from_user_str kfunc
This adds a kfunc wrapper around strncpy_from_user, which can be called from sleepable BPF programs.
This matches the non-sleepable 'bpf_probe_read_user_str' helper except it includes an additional 'flags' param, which allows consumers to clear the entire destination buffer on success or failure.
Signed-off-by: Jordan Rome <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc4 |
|
| #
b0966c72 |
| 13-Aug-2024 |
Dave Marchevsky <[email protected]> |
bpf: Support bpf_kptr_xchg into local kptr
Currently, users can only stash kptr into map values with bpf_kptr_xchg(). This patch further supports stashing kptr into local kptr by adding local kptr a
bpf: Support bpf_kptr_xchg into local kptr
Currently, users can only stash kptr into map values with bpf_kptr_xchg(). This patch further supports stashing kptr into local kptr by adding local kptr as a valid destination type.
When stashing into local kptr, btf_record in program BTF is used instead of btf_record in map to search for the btf_field of the local kptr.
The local kptr specific checks in check_reg_type() only apply when the source argument of bpf_kptr_xchg() is local kptr. Therefore, we make the scope of the check explicit as the destination now can also be local kptr.
Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Dave Marchevsky <[email protected]> Signed-off-by: Amery Hung <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|