History log of /linux-6.15/kernel/kprobes.c (Results 1 – 25 of 310)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1
# 7e74a7c0 29-Jan-2025 Sebastian Andrzej Siewior <[email protected]>

kprobes: Use RCU in all users of __module_text_address().

__module_text_address() can be invoked within a RCU section, there is no
requirement to have preemption disabled.

Replace the preempt_disab

kprobes: Use RCU in all users of __module_text_address().

__module_text_address() can be invoked within a RCU section, there is no
requirement to have preemption disabled.

Replace the preempt_disable() section around __module_text_address()
with RCU.

Cc: David S. Miller <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Naveen N Rao <[email protected]>
Cc: [email protected]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Petr Pavlu <[email protected]>

show more ...


# 1751f872 28-Jan-2025 Joel Granados <[email protected]>

treewide: const qualify ctl_tables where applicable

Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysc

treewide: const qualify ctl_tables where applicable

Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.

Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide:
constify the ctl_table argument of proc_handlers") constified all the
proc_handlers.

Created this by running an spatch followed by a sed command:
Spatch:
virtual patch

@
depends on !(file in "net")
disable optional_qualifier
@

identifier table_name != {
watchdog_hardlockup_sysctl,
iwcm_ctl_table,
ucma_ctl_table,
memory_allocation_profiling_sysctls,
loadpin_sysctl_table
};
@@

+ const
struct ctl_table table_name [] = { ... };

sed:
sed --in-place \
-e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \
kernel/utsname_sysctl.c

Reviewed-by: Song Liu <[email protected]>
Acked-by: Steven Rostedt (Google) <[email protected]> # for kernel/trace/
Reviewed-by: Martin K. Petersen <[email protected]> # SCSI
Reviewed-by: Darrick J. Wong <[email protected]> # xfs
Acked-by: Jani Nikula <[email protected]>
Acked-by: Corey Minyard <[email protected]>
Acked-by: Wei Liu <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Reviewed-by: Bill O'Donnell <[email protected]>
Acked-by: Baoquan He <[email protected]>
Acked-by: Ashutosh Dixit <[email protected]>
Acked-by: Anna Schumaker <[email protected]>
Signed-off-by: Joel Granados <[email protected]>

show more ...


Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3
# bef8e6af 09-Dec-2024 Masami Hiramatsu (Google) <[email protected]>

kprobes: Remove remaining gotos

Remove remaining gotos from kprobes.c to clean up the code.
This does not use cleanup macros, but changes code flow for avoiding
gotos.

Link: https://lore.kernel.org

kprobes: Remove remaining gotos

Remove remaining gotos from kprobes.c to clean up the code.
This does not use cleanup macros, but changes code flow for avoiding
gotos.

Link: https://lore.kernel.org/all/173371212474.480397.5684523564137819115.stgit@devnote2/

Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# 5e5b8b49 09-Dec-2024 Masami Hiramatsu (Google) <[email protected]>

kprobes: Remove unneeded goto

Remove unneeded gotos. Since the labels referred by these gotos have
only one reference for each, we can replace those gotos with the
referred code.

Link: https://lore

kprobes: Remove unneeded goto

Remove unneeded gotos. Since the labels referred by these gotos have
only one reference for each, we can replace those gotos with the
referred code.

Link: https://lore.kernel.org/all/173371211203.480397.13988907319659165160.stgit@devnote2/

Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# a35fb2bc 09-Dec-2024 Masami Hiramatsu (Google) <[email protected]>

kprobes: Use guard for rcu_read_lock

Use guard(rcu) for rcu_read_lock so that it can remove unneeded
gotos and make it more structured.

Link: https://lore.kernel.org/all/173371209846.480397.3852648

kprobes: Use guard for rcu_read_lock

Use guard(rcu) for rcu_read_lock so that it can remove unneeded
gotos and make it more structured.

Link: https://lore.kernel.org/all/173371209846.480397.3852648910271029695.stgit@devnote2/

Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# 54c79390 09-Dec-2024 Masami Hiramatsu (Google) <[email protected]>

kprobes: Use guard() for external locks

Use guard() for text_mutex, cpu_read_lock, and jump_label_lock in
the kprobes.

Link: https://lore.kernel.org/all/173371208663.480397.7535769878667655223.stgi

kprobes: Use guard() for external locks

Use guard() for text_mutex, cpu_read_lock, and jump_label_lock in
the kprobes.

Link: https://lore.kernel.org/all/173371208663.480397.7535769878667655223.stgit@devnote2/

Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


Revision tags: v6.13-rc2, v6.13-rc1
# 587e8e6d 29-Nov-2024 Masami Hiramatsu (Google) <[email protected]>

kprobes: Adopt guard() and scoped_guard()

Use guard() or scoped_guard() for critical sections rather than
discrete lock/unlock pairs.

Link: https://lore.kernel.org/all/173289887835.73724.6082232173

kprobes: Adopt guard() and scoped_guard()

Use guard() or scoped_guard() for critical sections rather than
discrete lock/unlock pairs.

Link: https://lore.kernel.org/all/173289887835.73724.608223217359025939.stgit@devnote2/

Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# 1703abb4 08-Dec-2024 Thomas Weißschuh <[email protected]>

kprobes: Reduce preempt disable scope in check_kprobe_access_safe()

Commit a189d0350f387 ("kprobes: disable preempt for module_text_address() and kernel_text_address()")
introduced a preempt_disable

kprobes: Reduce preempt disable scope in check_kprobe_access_safe()

Commit a189d0350f387 ("kprobes: disable preempt for module_text_address() and kernel_text_address()")
introduced a preempt_disable() region to protect against concurrent
module unloading. However this region also includes the call to
jump_label_text_reserved() which takes a long time;
up to 400us, iterating over approx 6000 jump tables.

The scope protected by preempt_disable() is largen than necessary.
core_kernel_text() does not need to be protected as it does not interact
with module code at all.
Only the scope from __module_text_address() to try_module_get() needs to
be protected.
By limiting the critical section to __module_text_address() and
try_module_get() the function responsible for the latency spike remains
preemptible.

This works fine even when !CONFIG_MODULES as in that case
try_module_get() will always return true and that block can be optimized
away.

Limit the critical section to __module_text_address() and
try_module_get(). Use guard(preempt)() for easier error handling.

While at it also remove a spurious *probed_mod = NULL in an error
path. On errors the output parameter is never inspected by the caller.
Some error paths were clearing the parameters, some didn't.
Align them for clarity.

Link: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Thomas Weißschuh <[email protected]>
Reviewed-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


Revision tags: v6.12, v6.12-rc7, v6.12-rc6
# 3fbff988 30-Oct-2024 Nathan Chancellor <[email protected]>

kprobes: Use struct_size() in __get_insn_slot()

__get_insn_slot() allocates 'struct kprobe_insn_page' using a custom
structure size calculation macro, KPROBE_INSN_PAGE_SIZE. Replace
KPROBE_INSN_PAGE

kprobes: Use struct_size() in __get_insn_slot()

__get_insn_slot() allocates 'struct kprobe_insn_page' using a custom
structure size calculation macro, KPROBE_INSN_PAGE_SIZE. Replace
KPROBE_INSN_PAGE_SIZE with the struct_size() macro, which is the
preferred way to calculate the size of flexible structures in the kernel
because it handles overflow and makes it easier to change and audit how
flexible structures are allocated across the entire tree.

Link: https://lore.kernel.org/all/20241030-kprobes-fix-counted-by-annotation-v1-2-8f266001fad0@kernel.org/
(Masami modofied this to be applicable without the 1st patch in the series.)

Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4
# da93dd93 13-Aug-2024 Jinjie Ruan <[email protected]>

kprobes: Cleanup collect_one_slot() and __disable_kprobe()

If kip->nused is not zero, collect_one_slot() return false, otherwise do
a lot of linked list operations, reverse the processing order to m

kprobes: Cleanup collect_one_slot() and __disable_kprobe()

If kip->nused is not zero, collect_one_slot() return false, otherwise do
a lot of linked list operations, reverse the processing order to make the
code if nesting more concise. __disable_kprobe() is the same as well.

Link: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Jinjie Ruan <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# ce7f27dc 13-Aug-2024 Jinjie Ruan <[email protected]>

kprobes: Cleanup the config comment

The CONFIG_KPROBES_ON_FTRACE #if/#else/#endif section is small and doesn't
nest additional #ifdefs so the comment is useless and should be removed,
but the __ARCH

kprobes: Cleanup the config comment

The CONFIG_KPROBES_ON_FTRACE #if/#else/#endif section is small and doesn't
nest additional #ifdefs so the comment is useless and should be removed,
but the __ARCH_WANT_KPROBES_INSN_SLOT and CONFIG_OPTPROBES() nest is long,
it is better to add comment for reading.

Link: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Jinjie Ruan <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


Revision tags: v6.11-rc3, v6.11-rc2
# 8c8acb8f 02-Aug-2024 Masami Hiramatsu (Google) <[email protected]>

kprobes: Fix to check symbol prefixes correctly

Since str_has_prefix() takes the prefix as the 2nd argument and the string
as the first, is_cfi_preamble_symbol() always fails to check the prefix.
Fi

kprobes: Fix to check symbol prefixes correctly

Since str_has_prefix() takes the prefix as the 2nd argument and the string
as the first, is_cfi_preamble_symbol() always fails to check the prefix.
Fix the function parameter order so that it correctly check the prefix.

Link: https://lore.kernel.org/all/172260679559.362040.7360872132937227206.stgit@devnote2/

Fixes: de02f2ac5d8c ("kprobes: Prohibit probing on CFI preamble symbol")
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


Revision tags: v6.11-rc1
# 78eb4ea2 24-Jul-2024 Joel Granados <[email protected]>

sysctl: treewide: constify the ctl_table argument of proc_handlers

const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ct

sysctl: treewide: constify the ctl_table argument of proc_handlers

const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata data which will ensure that proc_handler function
pointers cannot be modified.

This patch has been generated by the following coccinelle script:

```
virtual patch

@r1@
identifier ctl, write, buffer, lenp, ppos;
identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
@@

int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
,int write, void *buffer, size_t *lenp, loff_t *ppos);

@r2@
identifier func, ctl, write, buffer, lenp, ppos;
@@

int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
,int write, void *buffer, size_t *lenp, loff_t *ppos)
{ ... }

@r3@
identifier func;
@@

int func(
- struct ctl_table *
+ const struct ctl_table *
,int , void *, size_t *, loff_t *);

@r4@
identifier func, ctl;
@@

int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
,int , void *, size_t *, loff_t *);

@r5@
identifier func, write, buffer, lenp, ppos;
@@

int func(
- struct ctl_table *
+ const struct ctl_table *
,int write, void *buffer, size_t *lenp, loff_t *ppos);

```

* Code formatting was adjusted in xfs_sysctl.c to comply with code
conventions. The xfs_stats_clear_proc_handler,
xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where
adjusted.

* The ctl_table argument in proc_watchdog_common was const qualified.
This is called from a proc_handler itself and is calling back into
another proc_handler, making it necessary to change it as part of the
proc_handler migration.

Co-developed-by: Thomas Weißschuh <[email protected]>
Signed-off-by: Thomas Weißschuh <[email protected]>
Co-developed-by: Joel Granados <[email protected]>
Signed-off-by: Joel Granados <[email protected]>

show more ...


Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1
# 4b377b48 18-May-2024 Linus Torvalds <[email protected]>

kprobe/ftrace: fix build error due to bad function definition

Commit 1a7d0890dd4a ("kprobe/ftrace: bail out if ftrace was killed")
introduced a bad K&R function definition, which we haven't accepted

kprobe/ftrace: fix build error due to bad function definition

Commit 1a7d0890dd4a ("kprobe/ftrace: bail out if ftrace was killed")
introduced a bad K&R function definition, which we haven't accepted in a
long long time.

Gcc seems to let it slide, but clang notices with the appropriate error:

kernel/kprobes.c:1140:24: error: a function declaration without a prototype is deprecated in all >
1140 | void kprobe_ftrace_kill()
| ^
| void

but this commit was apparently never in linux-next before it was sent
upstream, so it didn't get the appropriate build test coverage.

Fixes: 1a7d0890dd4a kprobe/ftrace: bail out if ftrace was killed
Cc: Stephen Brennan <[email protected]>
Cc: Masami Hiramatsu (Google) <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v6.9, v6.9-rc7
# 1a7d0890 01-May-2024 Stephen Brennan <[email protected]>

kprobe/ftrace: bail out if ftrace was killed

If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet th

kprobe/ftrace: bail out if ftrace was killed

If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.

This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:

[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer

sudo perf probe --add commit_creds
sudo perf trace -e probe:commit_creds
# In another terminal
make
sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug
# Back to perf terminal
# ctrl-c
sudo perf probe --del commit_creds

After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.

Link: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Stephen Brennan <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Guo Ren <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# 7582b7be 05-May-2024 Mike Rapoport (IBM) <[email protected]>

kprobes: remove dependency on CONFIG_MODULES

kprobes depended on CONFIG_MODULES because it has to allocate memory for
code.

Since code allocations are now implemented with execmem, kprobes can be
e

kprobes: remove dependency on CONFIG_MODULES

kprobes depended on CONFIG_MODULES because it has to allocate memory for
code.

Since code allocations are now implemented with execmem, kprobes can be
enabled in non-modular kernels.

Add #ifdef CONFIG_MODULE guards for the code dealing with kprobes inside
modules, make CONFIG_KPROBES select CONFIG_EXECMEM and drop the
dependency of CONFIG_KPROBES on CONFIG_MODULES.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
[mcgrof: rebase in light of NEED_TASKS_RCU ]
Signed-off-by: Luis Chamberlain <[email protected]>

show more ...


# 12af2b83 05-May-2024 Mike Rapoport (IBM) <[email protected]>

mm: introduce execmem_alloc() and execmem_free()

module_alloc() is used everywhere as a mean to allocate memory for code.

Beside being semantically wrong, this unnecessarily ties all subsystems
tha

mm: introduce execmem_alloc() and execmem_free()

module_alloc() is used everywhere as a mean to allocate memory for code.

Beside being semantically wrong, this unnecessarily ties all subsystems
that need to allocate code, such as ftrace, kprobes and BPF to modules and
puts the burden of code allocation to the modules code.

Several architectures override module_alloc() because of various
constraints where the executable memory can be located and this causes
additional obstacles for improvements of code allocation.

Start splitting code allocation from modules by introducing execmem_alloc()
and execmem_free() APIs.

Initially, execmem_alloc() is a wrapper for module_alloc() and
execmem_free() is a replacement of module_memfree() to allow updating all
call sites to use the new APIs.

Since architectures define different restrictions on placement,
permissions, alignment and other parameters for memory that can be used by
different subsystems that allocate executable memory, execmem_alloc() takes
a type argument, that will be used to identify the calling subsystem and to
allow architectures define parameters for ranges suitable for that
subsystem.

No functional changes.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>

show more ...


Revision tags: v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1
# f884cd38 27-Jun-2023 Joel Granados <[email protected]>

kprobes: Remove the now superfluous sentinel elements from ctl_table array

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sent

kprobes: Remove the now superfluous sentinel elements from ctl_table array

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%[email protected]/)

Remove sentinel element from kprobe_sysclts

Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Joel Granados <[email protected]>

show more ...


# 325f3fb5 10-Apr-2024 Zheng Yejian <[email protected]>

kprobes: Fix possible use-after-free issue on kprobe registration

When unloading a module, its state is changing MODULE_STATE_LIVE ->
MODULE_STATE_GOING -> MODULE_STATE_UNFORMED. Each change will t

kprobes: Fix possible use-after-free issue on kprobe registration

When unloading a module, its state is changing MODULE_STATE_LIVE ->
MODULE_STATE_GOING -> MODULE_STATE_UNFORMED. Each change will take
a time. `is_module_text_address()` and `__module_text_address()`
works with MODULE_STATE_LIVE and MODULE_STATE_GOING.
If we use `is_module_text_address()` and `__module_text_address()`
separately, there is a chance that the first one is succeeded but the
next one is failed because module->state becomes MODULE_STATE_UNFORMED
between those operations.

In `check_kprobe_address_safe()`, if the second `__module_text_address()`
is failed, that is ignored because it expected a kernel_text address.
But it may have failed simply because module->state has been changed
to MODULE_STATE_UNFORMED. In this case, arm_kprobe() will try to modify
non-exist module text address (use-after-free).

To fix this problem, we should not use separated `is_module_text_address()`
and `__module_text_address()`, but use only `__module_text_address()`
once and do `try_module_get(module)` which is only available with
MODULE_STATE_LIVE.

Link: https://lore.kernel.org/all/[email protected]/

Fixes: 28f6c37a2910 ("kprobes: Forbid probing on trampoline and BPF code areas")
Cc: [email protected]
Signed-off-by: Zheng Yejian <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# 9efd24ec 19-Sep-2023 Li zeming <[email protected]>

kprobes: Remove unnecessary initial values of variables

ri and sym is assigned first, so it does not need to initialize the
assignment.

Link: https://lore.kernel.org/all/20230919012823.7815-1-zemin

kprobes: Remove unnecessary initial values of variables

ri and sym is assigned first, so it does not need to initialize the
assignment.

Link: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Li zeming <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# d839a656 01-Dec-2023 JP Kobryn <[email protected]>

kprobes: consistent rcu api usage for kretprobe holder

It seems that the pointer-to-kretprobe "rp" within the kretprobe_holder is
RCU-managed, based on the (non-rethook) implementation of get_kretpr

kprobes: consistent rcu api usage for kretprobe holder

It seems that the pointer-to-kretprobe "rp" within the kretprobe_holder is
RCU-managed, based on the (non-rethook) implementation of get_kretprobe().
The thought behind this patch is to make use of the RCU API where possible
when accessing this pointer so that the needed barriers are always in place
and to self-document the code.

The __rcu annotation to "rp" allows for sparse RCU checking. Plain writes
done to the "rp" pointer are changed to make use of the RCU macro for
assignment. For the single read, the implementation of get_kretprobe()
is simplified by making use of an RCU macro which accomplishes the same,
but note that the log warning text will be more generic.

I did find that there is a difference in assembly generated between the
usage of the RCU macros vs without. For example, on arm64, when using
rcu_assign_pointer(), the corresponding store instruction is a
store-release (STLR) which has an implicit barrier. When normal assignment
is done, a regular store (STR) is found. In the macro case, this seems to
be a result of rcu_assign_pointer() using smp_store_release() when the
value to write is not NULL.

Link: https://lore.kernel.org/all/[email protected]/

Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
Cc: [email protected]
Signed-off-by: JP Kobryn <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# 4bbd9345 17-Oct-2023 wuqiang.matt <[email protected]>

kprobes: kretprobe scalability improvement

kretprobe is using freelist to manage return-instances, but freelist,
as LIFO queue based on singly linked list, scales badly and reduces
the overall throu

kprobes: kretprobe scalability improvement

kretprobe is using freelist to manage return-instances, but freelist,
as LIFO queue based on singly linked list, scales badly and reduces
the overall throughput of kretprobed routines, especially for high
contention scenarios.

Here's a typical throughput test of sys_prctl (counts in 10 seconds,
measured with perf stat -a -I 10000 -e syscalls:sys_enter_prctl):

OS: Debian 10 X86_64, Linux 6.5rc7 with freelist
HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s

1T 2T 4T 8T 16T 24T
24150045 29317964 15446741 12494489 18287272 17708768
32T 48T 64T 72T 96T 128T
16200682 13737658 11645677 11269858 10470118 9931051

This patch introduces objpool to replace freelist. objpool is a
high performance queue, which can bring near-linear scalability
to kretprobed routines. Tests of kretprobe throughput show the
biggest ratio as 159x of original freelist. Here's the result:

1T 2T 4T 8T 16T
native: 41186213 82336866 164250978 328662645 658810299
freelist: 24150045 29317964 15446741 12494489 18287272
objpool: 23926730 48010314 96125218 191782984 385091769
32T 48T 64T 96T 128T
native: 1330338351 1969957941 2512291791 2615754135 2671040914
freelist: 16200682 13737658 11645677 10470118 9931051
objpool: 764481096 1147149781 1456220214 1502109662 1579015050

Testings on 96-core ARM64 output similarly, but with the biggest
ratio up to 448x:

OS: Debian 10 AARCH64, Linux 6.5rc7
HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s

1T 2T 4T 8T 16T
native: . 30066096 63569843 126194076 257447289 505800181
freelist: 16152090 11064397 11124068 7215768 5663013
objpool: 13997541 28032100 55726624 110099926 221498787
24T 32T 48T 64T 96T
native: 763305277 1015925192 1521075123 2033009392 3021013752
freelist: 5015810 4602893 3766792 3382478 2945292
objpool: 328192025 439439564 668534502 887401381 1319972072

Link: https://lore.kernel.org/all/[email protected]/

Signed-off-by: wuqiang.matt <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# 8865aea0 25-Jul-2023 Ruan Jinjie <[email protected]>

kernel: kprobes: Use struct_size()

Use struct_size() instead of hand-writing it, when allocating a structure
with a flex array.

This is less verbose.

Link: https://lore.kernel.org/all/202307251954

kernel: kprobes: Use struct_size()

Use struct_size() instead of hand-writing it, when allocating a structure
with a flex array.

This is less verbose.

Link: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Ruan Jinjie <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


# de02f2ac 11-Jul-2023 Masami Hiramatsu (Google) <[email protected]>

kprobes: Prohibit probing on CFI preamble symbol

Do not allow to probe on "__cfi_" or "__pfx_" started symbol, because those
are used for CFI and not executed. Probing it will break the CFI.

Link:

kprobes: Prohibit probing on CFI preamble symbol

Do not allow to probe on "__cfi_" or "__pfx_" started symbol, because those
are used for CFI and not executed. Probing it will break the CFI.

Link: https://lore.kernel.org/all/168904024679.116016.18089228029322008512.stgit@devnote2/

Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>

show more ...


# ed9492df 11-Jul-2023 Li zeming <[email protected]>

kernel: kprobes: Remove unnecessary ‘0’ values

it is assigned first, so it does not need to initialize the assignment.

Link: https://lore.kernel.org/all/[email protected]/

kernel: kprobes: Remove unnecessary ‘0’ values

it is assigned first, so it does not need to initialize the assignment.

Link: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Li zeming <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>

show more ...


12345678910>>...13