|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7 |
|
| #
0de20461 |
| 16-Mar-2025 |
Kumar Kartikeya Dwivedi <[email protected]> |
bpf: Implement verifier support for rqspinlock
Introduce verifier-side support for rqspinlock kfuncs. The first step is allowing bpf_res_spin_lock type to be defined in map values and allocated obje
bpf: Implement verifier support for rqspinlock
Introduce verifier-side support for rqspinlock kfuncs. The first step is allowing bpf_res_spin_lock type to be defined in map values and allocated objects, so BTF-side is updated with a new BPF_RES_SPIN_LOCK field to recognize and validate.
Any object cannot have both bpf_spin_lock and bpf_res_spin_lock, only one of them (and at most one of them per-object, like before) must be present. The bpf_res_spin_lock can also be used to protect objects that require lock protection for their kfuncs, like BPF rbtree and linked list.
The verifier plumbing to simulate success and failure cases when calling the kfuncs is done by pushing a new verifier state to the verifier state stack which will verify the failure case upon calling the kfunc. The path where success is indicated creates all lock reference state and IRQ state (if necessary for irqsave variants). In the case of failure, the state clears the registers r0-r5, sets the return value, and skips kfunc processing, proceeding to the next instruction.
When marking the return value for success case, the value is marked as 0, and for the failure case as [-MAX_ERRNO, -1]. Then, in the program, whenever user checks the return value as 'if (ret)' or 'if (ret < 0)' the verifier never traverses such branches for success cases, and would be aware that the lock is not held in such cases.
We push the kfunc state in check_kfunc_call whenever rqspinlock kfuncs are invoked. We introduce a kfunc_class state to avoid mixing lock irqrestore kfuncs with IRQ state created by bpf_local_irq_save.
With all this infrastructure, these kfuncs become usable in programs while satisfying all safety properties required by the kernel.
Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc6, v6.14-rc5 |
|
| #
4e4136c6 |
| 25-Feb-2025 |
Amery Hung <[email protected]> |
selftests/bpf: Test gen_pro/epilogue that generate kfuncs
Test gen_prologue and gen_epilogue that generate kfuncs that have not been seen in the main program.
The main bpf program and return value
selftests/bpf: Test gen_pro/epilogue that generate kfuncs
Test gen_prologue and gen_epilogue that generate kfuncs that have not been seen in the main program.
The main bpf program and return value checks are identical to pro_epilogue.c introduced in commit 47e69431b57a ("selftests/bpf: Test gen_prologue and gen_epilogue"). However, now when bpf_testmod_st_ops detects a program name with prefix "test_kfunc_", it generates slightly different prologue and epilogue: They still add 1000 to args->a in prologue, add 10000 to args->a and set r0 to 2 * args->a in epilogue, but involve kfuncs.
At high level, the alternative version of prologue and epilogue look like this:
cgrp = bpf_cgroup_from_id(0); if (cgrp) bpf_cgroup_release(cgrp); else /* Perform what original bpf_testmod_st_ops prologue or * epilogue does */
Since 0 is never a valid cgroup id, the original prologue or epilogue logic will be performed. As a result, the __retval check should expect the exact same return value.
Signed-off-by: Amery Hung <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4 |
|
| #
59422464 |
| 20-Feb-2025 |
Jason Xing <[email protected]> |
bpf: Support selective sampling for bpf timestamping
Add the bpf_sock_ops_enable_tx_tstamp kfunc to allow BPF programs to selectively enable TX timestamping on a skb during tcp_sendmsg().
For examp
bpf: Support selective sampling for bpf timestamping
Add the bpf_sock_ops_enable_tx_tstamp kfunc to allow BPF programs to selectively enable TX timestamping on a skb during tcp_sendmsg().
For example, BPF program will limit tracking X numbers of packets and then will stop there instead of tracing all the sendmsgs of matched flow all along. It would be helpful for users who cannot afford to calculate latencies from every sendmsg call probably due to the performance or storage space consideration.
Signed-off-by: Jason Xing <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected]
show more ...
|
| #
a687df20 |
| 17-Feb-2025 |
Amery Hung <[email protected]> |
bpf: Support getting referenced kptr from struct_ops argument
Allows struct_ops programs to acqurie referenced kptrs from arguments by directly reading the argument.
The verifier will acquire a ref
bpf: Support getting referenced kptr from struct_ops argument
Allows struct_ops programs to acqurie referenced kptrs from arguments by directly reading the argument.
The verifier will acquire a reference for struct_ops a argument tagged with "__ref" in the stub function in the beginning of the main program. The user will be able to access the referenced kptr directly by reading the context as long as it has not been released by the program.
This new mechanism to acquire referenced kptr (compared to the existing "kfunc with KF_ACQUIRE") is introduced for ergonomic and semantic reasons. In the first use case, Qdisc_ops, an skb is passed to .enqueue in the first argument. This mechanism provides a natural way for users to get a referenced kptr in the .enqueue struct_ops programs and makes sure that a qdisc will always enqueue or drop the skb.
Signed-off-by: Amery Hung <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc3 |
|
| #
c83e2d97 |
| 10-Feb-2025 |
Jiri Olsa <[email protected]> |
bpf: Add tracepoints with null-able arguments
Some of the tracepoints slipped when we did the first scan, adding them now.
Fixes: 838a10bd2ebf ("bpf: Augment raw_tp arguments with PTR_MAYBE_NULL")
bpf: Add tracepoints with null-able arguments
Some of the tracepoints slipped when we did the first scan, adding them now.
Fixes: 838a10bd2ebf ("bpf: Augment raw_tp arguments with PTR_MAYBE_NULL") Signed-off-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc2, v6.14-rc1 |
|
| #
5da7e15f |
| 01-Feb-2025 |
Kuniyuki Iwashima <[email protected]> |
net: Add rx_skb of kfree_skb to raw_tp_null_args[].
Yan Zhai reported a BPF prog could trigger a null-ptr-deref [0] in trace_kfree_skb if the prog does not check if rx_sk is NULL.
Commit c53795d48e
net: Add rx_skb of kfree_skb to raw_tp_null_args[].
Yan Zhai reported a BPF prog could trigger a null-ptr-deref [0] in trace_kfree_skb if the prog does not check if rx_sk is NULL.
Commit c53795d48ee8 ("net: add rx_sk to trace_kfree_skb") added rx_sk to trace_kfree_skb, but rx_sk is optional and could be NULL.
Let's add kfree_skb to raw_tp_null_args[] to let the BPF verifier validate such a prog and prevent the issue.
Now we fail to load such a prog:
libbpf: prog 'drop': -- BEGIN PROG LOAD LOG -- 0: R1=ctx() R10=fp0 ; int BPF_PROG(drop, struct sk_buff *skb, void *location, @ kfree_skb_sk_null.bpf.c:21 0: (79) r3 = *(u64 *)(r1 +24) func 'kfree_skb' arg3 has btf_id 5253 type STRUCT 'sock' 1: R1=ctx() R3_w=trusted_ptr_or_null_sock(id=1) ; bpf_printk("sk: %d, %d\n", sk, sk->__sk_common.skc_family); @ kfree_skb_sk_null.bpf.c:24 1: (69) r4 = *(u16 *)(r3 +16) R3 invalid mem access 'trusted_ptr_or_null_' processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 -- END PROG LOAD LOG --
Note this fix requires commit 838a10bd2ebf ("bpf: Augment raw_tp arguments with PTR_MAYBE_NULL").
[0]: BUG: kernel NULL pointer dereference, address: 0000000000000010 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 P4D 0 PREEMPT SMP RIP: 0010:bpf_prog_5e21a6db8fcff1aa_drop+0x10/0x2d Call Trace: <TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x148/0x420 ? search_bpf_extables+0x5b/0x70 ? fixup_exception+0x27/0x2c0 ? exc_page_fault+0x75/0x170 ? asm_exc_page_fault+0x22/0x30 ? bpf_prog_5e21a6db8fcff1aa_drop+0x10/0x2d bpf_trace_run4+0x68/0xd0 ? unix_stream_connect+0x1f4/0x6f0 sk_skb_reason_drop+0x90/0x120 unix_stream_connect+0x1f4/0x6f0 __sys_connect+0x7f/0xb0 __x64_sys_connect+0x14/0x20 do_syscall_64+0x47/0xc30 entry_SYSCALL_64_after_hwframe+0x4b/0x53
Fixes: c53795d48ee8 ("net: add rx_sk to trace_kfree_skb") Reported-by: Yan Zhai <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: Kuniyuki Iwashima <[email protected]> Tested-by: Yan Zhai <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
53ee0d66 |
| 30-Jan-2025 |
Ihor Solodrai <[email protected]> |
bpf: Allow kind_flag for BTF type and decl tags
BTF type tags and decl tags now may have info->kflag set to 1, changing the semantics of the tag.
Change BTF verification to permit BTF that makes us
bpf: Allow kind_flag for BTF type and decl tags
BTF type tags and decl tags now may have info->kflag set to 1, changing the semantics of the tag.
Change BTF verification to permit BTF that makes use of this feature: * remove kflag check in btf_decl_tag_check_meta(), as both values are valid * allow kflag to be set for BTF_KIND_TYPE_TAG type in btf_ref_type_check_meta()
Make sure kind_flag is NOT set when checking for specific BTF tags, such as "kptr", "user" etc.
Modify a selftest checking for kflag in decl_tag accordingly.
Signed-off-by: Ihor Solodrai <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
|
Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5 |
|
| #
18032c6b |
| 28-Dec-2024 |
Thomas Weißschuh <[email protected]> |
btf: Switch module BTF attribute to sysfs_bin_attr_simple_read()
The generic function from the sysfs core can replace the custom one.
Signed-off-by: Thomas Weißschuh <[email protected]> Acked-by
btf: Switch module BTF attribute to sysfs_bin_attr_simple_read()
The generic function from the sysfs core can replace the custom one.
Signed-off-by: Thomas Weißschuh <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/20241228-sysfs-const-bin_attr-simple-v2-3-7c6f3f1767a3@weissschuh.net Signed-off-by: Greg Kroah-Hartman <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc4 |
|
| #
96ea081e |
| 20-Dec-2024 |
Martin KaFai Lau <[email protected]> |
bpf: Reject struct_ops registration that uses module ptr and the module btf_id is missing
There is a UAF report in the bpf_struct_ops when CONFIG_MODULES=n. In particular, the report is on tcp_conge
bpf: Reject struct_ops registration that uses module ptr and the module btf_id is missing
There is a UAF report in the bpf_struct_ops when CONFIG_MODULES=n. In particular, the report is on tcp_congestion_ops that has a "struct module *owner" member.
For struct_ops that has a "struct module *owner" member, it can be extended either by the regular kernel module or by the bpf_struct_ops. bpf_try_module_get() will be used to do the refcounting and different refcount is done based on the owner pointer. When CONFIG_MODULES=n, the btf_id of the "struct module" is missing:
WARN: resolve_btfids: unresolved symbol module
Thus, the bpf_try_module_get() cannot do the correct refcounting.
Not all subsystem's struct_ops requires the "struct module *owner" member. e.g. the recent sched_ext_ops.
This patch is to disable bpf_struct_ops registration if the struct_ops has the "struct module *" member and the "struct module" btf_id is missing. The btf_type_is_fwd() helper is moved to the btf.h header file for this test.
This has happened since the beginning of bpf_struct_ops which has gone through many changes. The Fixes tag is set to a recent commit that this patch can apply cleanly. Considering CONFIG_MODULES=n is not common and the age of the issue, targeting for bpf-next also.
Fixes: 1611603537a4 ("bpf: Create argument information for nullable arguments.") Reported-by: Robert Morris <[email protected]> Closes: https://lore.kernel.org/bpf/74665.1733669976@localhost/ Signed-off-by: Martin KaFai Lau <[email protected]> Tested-by: Eduard Zingerman <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc3 |
|
| #
838a10bd |
| 13-Dec-2024 |
Kumar Kartikeya Dwivedi <[email protected]> |
bpf: Augment raw_tp arguments with PTR_MAYBE_NULL
Arguments to a raw tracepoint are tagged as trusted, which carries the semantics that the pointer will be non-NULL. However, in certain cases, a ra
bpf: Augment raw_tp arguments with PTR_MAYBE_NULL
Arguments to a raw tracepoint are tagged as trusted, which carries the semantics that the pointer will be non-NULL. However, in certain cases, a raw tracepoint argument may end up being NULL. More context about this issue is available in [0].
Thus, there is a discrepancy between the reality, that raw_tp arguments can actually be NULL, and the verifier's knowledge, that they are never NULL, causing explicit NULL check branch to be dead code eliminated.
A previous attempt [1], i.e. the second fixed commit, was made to simulate symbolic execution as if in most accesses, the argument is a non-NULL raw_tp, except for conditional jumps. This tried to suppress branch prediction while preserving compatibility, but surfaced issues with production programs that were difficult to solve without increasing verifier complexity. A more complete discussion of issues and fixes is available at [2].
Fix this by maintaining an explicit list of tracepoints where the arguments are known to be NULL, and mark the positional arguments as PTR_MAYBE_NULL. Additionally, capture the tracepoints where arguments are known to be ERR_PTR, and mark these arguments as scalar values to prevent potential dereference.
Each hex digit is used to encode NULL-ness (0x1) or ERR_PTR-ness (0x2), shifted by the zero-indexed argument number x 4. This can be represented as follows: 1st arg: 0x1 2nd arg: 0x10 3rd arg: 0x100 ... and so on (likewise for ERR_PTR case).
In the future, an automated pass will be used to produce such a list, or insert __nullable annotations automatically for tracepoints. Each compilation unit will be analyzed and results will be collated to find whether a tracepoint pointer is definitely not null, maybe null, or an unknown state where verifier conservatively marks it PTR_MAYBE_NULL. A proof of concept of this tool from Eduard is available at [3].
Note that in case we don't find a specification in the raw_tp_null_args array and the tracepoint belongs to a kernel module, we will conservatively mark the arguments as PTR_MAYBE_NULL. This is because unlike for in-tree modules, out-of-tree module tracepoints may pass NULL freely to the tracepoint. We don't protect against such tracepoints passing ERR_PTR (which is uncommon anyway), lest we mark all such arguments as SCALAR_VALUE.
While we are it, let's adjust the test raw_tp_null to not perform dereference of the skb->mark, as that won't be allowed anymore, and make it more robust by using inline assembly to test the dead code elimination behavior, which should still stay the same.
[0]: https://lore.kernel.org/bpf/[email protected] [1]: https://lore.kernel.org/all/[email protected] [2]: https://lore.kernel.org/bpf/[email protected] [3]: https://github.com/eddyz87/llvm-project/tree/nullness-for-tracepoint-params
Reported-by: Juri Lelli <[email protected]> # original bug Reported-by: Manu Bretelle <[email protected]> # bugs in masking fix Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs") Fixes: cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") Reviewed-by: Eduard Zingerman <[email protected]> Co-developed-by: Jiri Olsa <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
c00d738e |
| 13-Dec-2024 |
Kumar Kartikeya Dwivedi <[email protected]> |
bpf: Revert "bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"
This patch reverts commit cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"). The patch was well-intended and meant to be as
bpf: Revert "bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"
This patch reverts commit cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"). The patch was well-intended and meant to be as a stop-gap fixing branch prediction when the pointer may actually be NULL at runtime. Eventually, it was supposed to be replaced by an automated script or compiler pass detecting possibly NULL arguments and marking them accordingly.
However, it caused two main issues observed for production programs and failed to preserve backwards compatibility. First, programs relied on the verifier not exploring == NULL branch when pointer is not NULL, thus they started failing with a 'dereference of scalar' error. Next, allowing raw_tp arguments to be modified surfaced the warning in the verifier that warns against reg->off when PTR_MAYBE_NULL is set.
More information, context, and discusson on both problems is available in [0]. Overall, this approach had several shortcomings, and the fixes would further complicate the verifier's logic, and the entire masking scheme would have to be removed eventually anyway.
Hence, revert the patch in preparation of a better fix avoiding these issues to replace this commit.
[0]: https://lore.kernel.org/bpf/[email protected]
Reported-by: Manu Bretelle <[email protected]> Fixes: cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
4e885fab |
| 13-Dec-2024 |
Anton Protopopov <[email protected]> |
bpf: Add a __btf_get_by_fd helper
Add a new helper to get a pointer to a struct btf from a file descriptor. This helper doesn't increase a refcnt. Add a comment explaining this and pointing to a cor
bpf: Add a __btf_get_by_fd helper
Add a new helper to get a pointer to a struct btf from a file descriptor. This helper doesn't increase a refcnt. Add a comment explaining this and pointing to a corresponding function which does take a reference.
Signed-off-by: Anton Protopopov <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
| #
659b9ba7 |
| 12-Dec-2024 |
Kumar Kartikeya Dwivedi <[email protected]> |
bpf: Check size for BTF-based ctx access of pointer members
Robert Morris reported the following program type which passes the verifier in [0]:
SEC("struct_ops/bpf_cubic_init") void BPF_PROG(bpf_cu
bpf: Check size for BTF-based ctx access of pointer members
Robert Morris reported the following program type which passes the verifier in [0]:
SEC("struct_ops/bpf_cubic_init") void BPF_PROG(bpf_cubic_init, struct sock *sk) { asm volatile("r2 = *(u16*)(r1 + 0)"); // verifier should demand u64 asm volatile("*(u32 *)(r2 +1504) = 0"); // 1280 in some configs }
The second line may or may not work, but the first instruction shouldn't pass, as it's a narrow load into the context structure of the struct ops callback. The code falls back to btf_ctx_access to ensure correctness and obtaining the types of pointers. Ensure that the size of the access is correctly checked to be 8 bytes, otherwise the verifier thinks the narrow load obtained a trusted BTF pointer and will permit loads/stores as it sees fit.
Perform the check on size after we've verified that the load is for a pointer field, as for scalar values narrow loads are fine. Access to structs passed as arguments to a BPF program are also treated as scalars, therefore no adjustment is needed in their case.
Existing verifier selftests are broken by this change, but because they were incorrect. Verifier tests for d_path were performing narrow load into context to obtain path pointer, had this program actually run it would cause a crash. The same holds for verifier_btf_ctx_access tests.
[0]: https://lore.kernel.org/bpf/51338.1732985814@localhost
Fixes: 9e15db66136a ("bpf: Implement accurate raw_tp context access via BTF") Reported-by: Robert Morris <[email protected]> Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7 |
|
| #
cb4158ce |
| 04-Nov-2024 |
Kumar Kartikeya Dwivedi <[email protected]> |
bpf: Mark raw_tp arguments with PTR_MAYBE_NULL
Arguments to a raw tracepoint are tagged as trusted, which carries the semantics that the pointer will be non-NULL. However, in certain cases, a raw t
bpf: Mark raw_tp arguments with PTR_MAYBE_NULL
Arguments to a raw tracepoint are tagged as trusted, which carries the semantics that the pointer will be non-NULL. However, in certain cases, a raw tracepoint argument may end up being NULL. More context about this issue is available in [0].
Thus, there is a discrepancy between the reality, that raw_tp arguments can actually be NULL, and the verifier's knowledge, that they are never NULL, causing explicit NULL checks to be deleted, and accesses to such pointers potentially crashing the kernel.
To fix this, mark raw_tp arguments as PTR_MAYBE_NULL, and then special case the dereference and pointer arithmetic to permit it, and allow passing them into helpers/kfuncs; these exceptions are made for raw_tp programs only. Ensure that we don't do this when ref_obj_id > 0, as in that case this is an acquired object and doesn't need such adjustment.
The reason we do mask_raw_tp_trusted_reg logic is because other will recheck in places whether the register is a trusted_reg, and then consider our register as untrusted when detecting the presence of the PTR_MAYBE_NULL flag.
To allow safe dereference, we enable PROBE_MEM marking when we see loads into trusted pointers with PTR_MAYBE_NULL.
While trusted raw_tp arguments can also be passed into helpers or kfuncs where such broken assumption may cause issues, a future patch set will tackle their case separately, as PTR_TO_BTF_ID (without PTR_TRUSTED) can already be passed into helpers and causes similar problems. Thus, they are left alone for now.
It is possible that these checks also permit passing non-raw_tp args that are trusted PTR_TO_BTF_ID with null marking. In such a case, allowing dereference when pointer is NULL expands allowed behavior, so won't regress existing programs, and the case of passing these into helpers is the same as above and will be dealt with later.
Also update the failure case in tp_btf_nullable selftest to capture the new behavior, as the verifier will no longer cause an error when directly dereference a raw tracepoint argument marked as __nullable.
[0]: https://lore.kernel.org/bpf/[email protected]
Reviewed-by: Jiri Olsa <[email protected]> Reported-by: Juri Lelli <[email protected]> Tested-by: Juri Lelli <[email protected]> Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs") Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc6, v6.12-rc5 |
|
| #
1cb80d9e |
| 23-Oct-2024 |
Kui-Feng Lee <[email protected]> |
bpf: Support __uptr type tag in BTF
This patch introduces the "__uptr" type tag to BTF. It is to define a pointer pointing to the user space memory. This patch adds BTF logic to pass the "__uptr" ty
bpf: Support __uptr type tag in BTF
This patch introduces the "__uptr" type tag to BTF. It is to define a pointer pointing to the user space memory. This patch adds BTF logic to pass the "__uptr" type tag.
btf_find_kptr() is reused for the "__uptr" tag. The "__uptr" will only be supported in the map_value of the task storage map. However, btf_parse_struct_meta() also uses btf_find_kptr() but it is not interested in "__uptr". This patch adds a "field_mask" argument to btf_find_kptr() which will return BTF_FIELD_IGNORE if the caller is not interested in a “__uptr” field.
btf_parse_kptr() is also reused to parse the uptr. The btf_check_and_fixup_fields() is changed to do extra checks on the uptr to ensure that its struct size is not larger than PAGE_SIZE. It is not clear how a uptr pointing to a CO-RE supported kernel struct will be used, so it is also not allowed now.
Signed-off-by: Kui-Feng Lee <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc4, v6.12-rc3 |
|
| #
797d73ee |
| 08-Oct-2024 |
Hou Tao <[email protected]> |
bpf: Check the remaining info_cnt before repeating btf fields
When trying to repeat the btf fields for array of nested struct, it doesn't check the remaining info_cnt. The following splat will be re
bpf: Check the remaining info_cnt before repeating btf fields
When trying to repeat the btf fields for array of nested struct, it doesn't check the remaining info_cnt. The following splat will be reported when the value of ret * nelems is greater than BTF_FIELDS_MAX:
------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in ../kernel/bpf/btf.c:3951:49 index 11 is out of range for type 'btf_field_info [11]' CPU: 6 UID: 0 PID: 411 Comm: test_progs ...... 6.11.0-rc4+ #1 Tainted: [O]=OOT_MODULE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ... Call Trace: <TASK> dump_stack_lvl+0x57/0x70 dump_stack+0x10/0x20 ubsan_epilogue+0x9/0x40 __ubsan_handle_out_of_bounds+0x6f/0x80 ? kallsyms_lookup_name+0x48/0xb0 btf_parse_fields+0x992/0xce0 map_create+0x591/0x770 __sys_bpf+0x229/0x2410 __x64_sys_bpf+0x1f/0x30 x64_sys_call+0x199/0x9f0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fea56f2cc5d ...... </TASK> ---[ end trace ]---
Fix it by checking the remaining info_cnt in btf_repeat_fields() before repeating the btf fields.
Fixes: 64e8ee814819 ("bpf: look into the types of the fields of a struct type recursively.") Signed-off-by: Hou Tao <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
45126b15 |
| 07-Oct-2024 |
Jiri Olsa <[email protected]> |
bpf: Fix memory leak in bpf_core_apply
We need to free specs properly.
Fixes: 3d2786d65aaa ("bpf: correctly handle malformed BPF_CORE_TYPE_ID_LOCAL relos") Signed-off-by: Jiri Olsa <[email protected]
bpf: Fix memory leak in bpf_core_apply
We need to free specs properly.
Fixes: 3d2786d65aaa ("bpf: correctly handle malformed BPF_CORE_TYPE_ID_LOCAL relos") Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
|
Revision tags: v6.12-rc2, v6.12-rc1 |
|
| #
7bae563c |
| 15-Sep-2024 |
Christophe JAILLET <[email protected]> |
bpf: Constify struct btf_kind_operations
struct btf_kind_operations are not modified in BTF.
Constifying this structures moves some data to a read-only section, so increase overall security, especi
bpf: Constify struct btf_kind_operations
struct btf_kind_operations are not modified in BTF.
Constifying this structures moves some data to a read-only section, so increase overall security, especially when the structure holds some function pointers.
On a x86_64, with allmodconfig:
Before: ====== text data bss dec hex filename 184320 7091 548 191959 2edd7 kernel/bpf/btf.o
After: ===== text data bss dec hex filename 184896 6515 548 191959 2edd7 kernel/bpf/btf.o
Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/r/9192ab72b2e9c66aefd6520f359a20297186327f.1726417289.git.christophe.jaillet@wanadoo.fr Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.11 |
|
| #
986deb29 |
| 12-Sep-2024 |
Hou Tao <[email protected]> |
bpf: Call the missed kfree() when there is no special field in btf
Call the missed kfree() in btf_parse_struct_metas() when there is no special field in btf, otherwise will get the following kmemlea
bpf: Call the missed kfree() when there is no special field in btf
Call the missed kfree() in btf_parse_struct_metas() when there is no special field in btf, otherwise will get the following kmemleak report:
unreferenced object 0xffff888101033620 (size 8): comm "test_progs", pid 604, jiffies 4295127011 ...... backtrace (crc e77dc444): [<00000000186f90f3>] kmemleak_alloc+0x4b/0x80 [<00000000ac8e9c4d>] __kmalloc_cache_noprof+0x2a1/0x310 [<00000000d99d68d6>] btf_new_fd+0x72d/0xe90 [<00000000f010b7f8>] __sys_bpf+0xec3/0x2410 [<00000000e077ed6f>] __x64_sys_bpf+0x1f/0x30 [<00000000a12f9e55>] x64_sys_call+0x199/0x9f0 [<00000000f3029ea6>] do_syscall_64+0x3b/0xc0 [<000000005640913a>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
Fixes: 7a851ecb1806 ("bpf: Search for kptrs in prog BTF structs") Signed-off-by: Hou Tao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
8aeaed21 |
| 11-Sep-2024 |
Philo Lu <[email protected]> |
bpf: Support __nullable argument suffix for tp_btf
Pointers passed to tp_btf were trusted to be valid, but some tracepoints do take NULL pointer as input, such as trace_tcp_send_reset(). Then the in
bpf: Support __nullable argument suffix for tp_btf
Pointers passed to tp_btf were trusted to be valid, but some tracepoints do take NULL pointer as input, such as trace_tcp_send_reset(). Then the invalid memory access cannot be detected by verifier.
This patch fix it by add a suffix "__nullable" to the unreliable argument. The suffix is shown in btf, and PTR_MAYBE_NULL will be added to nullable arguments. Then users must check the pointer before use it.
A problem here is that we use "btf_trace_##call" to search func_proto. As it is a typedef, argument names as well as the suffix are not recorded. To solve this, I use bpf_raw_event_map to find "__bpf_trace##template" from "btf_trace_##call", and then we can see the suffix.
Suggested-by: Alexei Starovoitov <[email protected]> Signed-off-by: Philo Lu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc7 |
|
| #
bc638d8c |
| 05-Sep-2024 |
JP Kobryn <[email protected]> |
bpf: allow kfuncs within tracepoint and perf event programs
Associate tracepoint and perf event program types with the kfunc tracing hook. This allows calling kfuncs within these types of programs.
bpf: allow kfuncs within tracepoint and perf event programs
Associate tracepoint and perf event program types with the kfunc tracing hook. This allows calling kfuncs within these types of programs.
Signed-off-by: JP Kobryn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc6 |
|
| #
bb6705c3 |
| 31-Aug-2024 |
Jeongjun Park <[email protected]> |
bpf: add check for invalid name in btf_name_valid_section()
If the length of the name string is 1 and the value of name[0] is NULL byte, an OOB vulnerability occurs in btf_name_valid_section() and t
bpf: add check for invalid name in btf_name_valid_section()
If the length of the name string is 1 and the value of name[0] is NULL byte, an OOB vulnerability occurs in btf_name_valid_section() and the return value is true, so the invalid name passes the check.
To solve this, you need to check if the first position is NULL byte and if the first character is printable.
Suggested-by: Eduard Zingerman <[email protected]> Fixes: bd70a8fb7ca4 ("bpf: Allow all printable characters in BTF DATASEC names") Signed-off-by: Jeongjun Park <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Eduard Zingerman <[email protected]>
show more ...
|
| #
b408473e |
| 30-Aug-2024 |
Martin KaFai Lau <[email protected]> |
bpf: Fix a crash when btf_parse_base() returns an error pointer
The pointer returned by btf_parse_base could be an error pointer. IS_ERR() check is needed before calling btf_free(base_btf).
Fixes:
bpf: Fix a crash when btf_parse_base() returns an error pointer
The pointer returned by btf_parse_base could be an error pointer. IS_ERR() check is needed before calling btf_free(base_btf).
Fixes: 8646db238997 ("libbpf,bpf: Share BTF relocate-related code with kernel") Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Reviewed-by: Alan Maguire <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
| #
c6d9dafb |
| 28-Aug-2024 |
Hongbo Li <[email protected]> |
bpf: Use kvmemdup to simplify the code
Use kvmemdup instead of kvmalloc() + memcpy() to simplify the code.
No functional change intended.
Acked-by: Yonghong Song <[email protected]> Signed-o
bpf: Use kvmemdup to simplify the code
Use kvmemdup instead of kvmalloc() + memcpy() to simplify the code.
No functional change intended.
Acked-by: Yonghong Song <[email protected]> Signed-off-by: Hongbo Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc5, v6.11-rc4 |
|
| #
7a851ecb |
| 13-Aug-2024 |
Dave Marchevsky <[email protected]> |
bpf: Search for kptrs in prog BTF structs
Currently btf_parse_fields is used in two places to create struct btf_record's for structs: when looking at mapval type, and when looking at any struct in p
bpf: Search for kptrs in prog BTF structs
Currently btf_parse_fields is used in two places to create struct btf_record's for structs: when looking at mapval type, and when looking at any struct in program BTF. The former looks for kptr fields while the latter does not. This patch modifies the btf_parse_fields call made when looking at prog BTF struct types to search for kptrs as well.
Before this series there was no reason to search for kptrs in non-mapval types: a referenced kptr needs some owner to guarantee resource cleanup, and map values were the only owner that supported this. If a struct with a kptr field were to have some non-kptr-aware owner, the kptr field might not be properly cleaned up and result in resources leaking. Only searching for kptr fields in mapval was a simple way to avoid this problem.
In practice, though, searching for BPF_KPTR when populating struct_meta_tab does not expose us to this risk, as struct_meta_tab is only accessed through btf_find_struct_meta helper, and that helper is only called in contexts where recognizing the kptr field is safe:
* PTR_TO_BTF_ID reg w/ MEM_ALLOC flag * Such a reg is a local kptr and must be free'd via bpf_obj_drop, which will correctly handle kptr field
* When handling specific kfuncs which either expect MEM_ALLOC input or return MEM_ALLOC output (obj_{new,drop}, percpu_obj_{new,drop}, list+rbtree funcs, refcount_acquire) * Will correctly handle kptr field for same reasons as above
* When looking at kptr pointee type * Called by functions which implement "correct kptr resource handling"
* In btf_check_and_fixup_fields * Helper that ensures no ownership loops for lists and rbtrees, doesn't care about kptr field existence
So we should be able to find BPF_KPTR fields in all prog BTF structs without leaking resources.
Further patches in the series will build on this change to support kptr_xchg into non-mapval local kptr. Without this change there would be no kptr field found in such a type.
Acked-by: Martin KaFai Lau <[email protected]> Acked-by: Hou Tao <[email protected]> Signed-off-by: Dave Marchevsky <[email protected]> Signed-off-by: Amery Hung <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|