History log of /linux-6.15/kernel/rcu/tree.c (Results 1 – 25 of 997)
Revision  Date  Author  Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5
# 5a562b8b 27-Feb-2025 Uladzislau Rezki (Sony) <[email protected]>

rcu: Use _full() API to debug synchronize_rcu()

Switch to using the get_state_synchronize_rcu_full() and
poll_state_synchronize_rcu_full() pair to debug a normal
synchronize_rcu() call.

Using the non-full APIs to identify whether a grace period has
passed might lead to a false-positive kernel splat.

This can happen because get_state_synchronize_rcu() compresses
both normal and expedited states into a single unsigned long
value, so poll_state_synchronize_rcu() can miss GP completion
when synchronize_rcu() and synchronize_rcu_expedited() run
concurrently.

To address this, switch to the poll_state_synchronize_rcu_full() and
get_state_synchronize_rcu_full() APIs, which use separate variables
for expedited and normal states.
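
As a minimal sketch of the debug pattern this enables (using the
rcu_gp_oldstate cookie type these APIs take), the full-state cookie
cannot be fooled by a concurrent expedited grace period:

	struct rcu_gp_oldstate rgos;

	get_state_synchronize_rcu_full(&rgos);	/* Snapshot normal and expedited state. */
	synchronize_rcu();			/* Wait for a normal grace period. */
	WARN_ON_ONCE(!poll_state_synchronize_rcu_full(&rgos));	/* Must now report completion. */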

Reported-by: cheung wall <[email protected]>
Closes: https://lore.kernel.org/lkml/Z5ikQeVmVdsWQrdD@pc636/T/
Fixes: 988f569ae041 ("rcu: Reduce synchronize_rcu() latency")
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Boqun Feng <[email protected]>


Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3
# 85aad7cc 12-Dec-2024 Paul E. McKenney <[email protected]>

rcu: Fix get_state_synchronize_rcu_full() GP-start detection

The get_state_synchronize_rcu_full() and poll_state_synchronize_rcu_full()
functions use the root rcu_node structure's ->gp_seq field to detect
the beginnings and ends of grace periods, respectively. This choice is
necessary for the poll_state_synchronize_rcu_full() function because
(give or take counter wrap), the following sequence is guaranteed not
to trigger:

get_state_synchronize_rcu_full(&rgos);
synchronize_rcu();
WARN_ON_ONCE(!poll_state_synchronize_rcu_full(&rgos));

The RCU callbacks that awaken synchronize_rcu() instances are
guaranteed not to be invoked before the root rcu_node structure's
->gp_seq field is updated to indicate the end of the grace period.
However, these callbacks might start being invoked immediately
thereafter, in particular, before rcu_state.gp_seq has been updated.
Therefore, poll_state_synchronize_rcu_full() must refer to the
root rcu_node structure's ->gp_seq field. Because this field is
updated under this structure's ->lock, any code following a call to
poll_state_synchronize_rcu_full() will be fully ordered after the
full grace-period computation, as is required by RCU's memory-ordering
semantics.

By symmetry, the get_state_synchronize_rcu_full() function should also
use this same root rcu_node structure's ->gp_seq field. But it turns out
that symmetry is profoundly (though extremely infrequently) destructive
in this case. To see this, consider the following sequence of events:

1. CPU 0 starts a new grace period, and updates rcu_state.gp_seq
accordingly.

2. As its first step of grace-period initialization, CPU 0 examines
the current CPU hotplug state and decides that it need not wait
for CPU 1, which is currently offline.

3. CPU 1 comes online, and updates its state. This does not affect
the current grace period, but rather the one after that.
After all, CPU 1 was offline when the current grace period
started, so all pre-existing RCU readers on CPU 1 must have
completed or been preempted before it last went offline.
The current grace period therefore has nothing it needs to wait
for on CPU 1.

4. CPU 1 switches to an rcutorture kthread which is running
rcutorture's rcu_torture_reader() function, which starts a new
RCU reader.

5. CPU 2 is running rcutorture's rcu_torture_writer() function
and collects a new polled grace-period "cookie" using
get_state_synchronize_rcu_full(). Because the newly started
grace period has not completed initialization, the root rcu_node
structure's ->gp_seq field has not yet been updated to indicate
that this new grace period has already started.

This cookie is therefore set up for the end of the current grace
period (rather than the end of the following grace period).

6. CPU 0 finishes grace-period initialization.

7. If CPU 1’s rcutorture reader is preempted, it will be added to
the ->blkd_tasks list, but because CPU 1’s ->qsmask bit is not
set in CPU 1's leaf rcu_node structure, the ->gp_tasks pointer
will not be updated.  Thus, this grace period will not wait on
it.  Which is only fair, given that the CPU did not come online
until after the grace period officially started.

8. CPUs 0 and 2 then detect the new grace period and report
a quiescent state to the RCU core.

9. Because CPU 1 was offline at the start of the current grace
period, CPUs 0 and 2 are the only CPUs that this grace period
needs to wait on. So the grace period ends and post-grace-period
cleanup starts. In particular, the root rcu_node structure's
->gp_seq field is updated to indicate that this grace period
has now ended.

10. CPU 2 continues running rcu_torture_writer() and sees that,
from the viewpoint of the root rcu_node structure consulted by
the poll_state_synchronize_rcu_full() function, the grace period
has ended.  It therefore updates state accordingly.

11. CPU 1 is still running the same RCU reader, which notices this
update and thus complains about the too-short grace period.

The fix is for the get_state_synchronize_rcu_full() function to use
rcu_state.gp_seq instead of the root rcu_node structure's ->gp_seq field.
With this change in place, if step 5's cookie indicates that the grace
period has not yet started, then any prior code executed by CPU 2 must
have happened before CPU 1 came online. This will in turn prevent CPU
1's code in steps 3 and 11 from spanning CPU 2's grace-period wait,
thus preventing CPU 1 from being subjected to a too-short grace period.

This commit therefore makes this change. Note that there is no change to
the poll_state_synchronize_rcu_full() function, which as noted above,
must continue to use the root rcu_node structure's ->gp_seq field.
This is of course an asymmetry between these two functions, but is an
asymmetry that is absolutely required for correct operation. It is a
common human tendency to greatly value symmetry, and sometimes symmetry
is a wonderful thing. Other times, symmetry results in poor performance.
But in this case, symmetry is just plain wrong.

Nevertheless, the asymmetry does require an additional adjustment.
It is possible for get_state_synchronize_rcu_full() to see a given
grace period as having started, but for an immediately following
poll_state_synchronize_rcu_full() to see it as having not yet started.
Given the current rcu_seq_done_exact() implementation, this will
result in a false-positive indication that the grace period is done
from poll_state_synchronize_rcu_full(). This is dealt with by making
rcu_seq_done_exact() reach back three grace periods rather than just
two of them.
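
A sketch of the adjusted helper, assuming it keeps its current shape in
kernel/rcu/rcu.h (the only change being the look-back widened from two
grace periods to three):

	static inline bool rcu_seq_done_exact(unsigned long *sp, unsigned long s)
	{
		unsigned long cur_s = READ_ONCE(*sp);

		/* The "3 *" below used to be "2 *". */
		return ULONG_CMP_GE(cur_s, s) ||
		       ULONG_CMP_LT(cur_s, s - (3 * RCU_SEQ_STATE_MASK + 1));
	}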

However, simply changing get_state_synchronize_rcu_full() function to
use rcu_state.gp_seq instead of the root rcu_node structure's ->gp_seq
field results in a theoretical bug in kernels booted with
rcutree.rcu_normal_wake_from_gp=1 due to the following sequence of
events:

o The rcu_gp_init() function invokes rcu_seq_start() to officially
start a new grace period.

o A new RCU reader begins, referencing X from some RCU-protected
list. The new grace period is not obligated to wait for this
reader.

o An updater removes X, then calls synchronize_rcu(), which queues
a wait element.

o The grace period ends, awakening the updater, which frees X
while the reader is still referencing it.

The reason that this is theoretical is that although the grace period
has officially started, none of the CPUs are officially aware of this,
and thus will have to assume that the RCU reader pre-dated the start of
the grace period. A detailed explanation can be found at [2] and [3].

The exception is kernels built with CONFIG_PROVE_RCU=y, which use the
polled grace-period APIs, and these can and do complain bitterly when
this sequence of events occurs. Not only that, there might be some
future RCU grace-period mechanism that pulls this sequence of events
from theory into practice. This commit therefore also moves the call to
rcu_sr_normal_gp_init() so that it precedes the call to rcu_seq_start().
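
A hedged sketch of the resulting ordering in rcu_gp_init() (surrounding
code elided; the point is only which call comes first):

	/* Snapshot synchronize_rcu() waiters before officially starting the
	 * new grace period, so that later arrivals wait for a subsequent
	 * full grace period. */
	start_new_poll = rcu_sr_normal_gp_init();
	rcu_seq_start(&rcu_state.gp_seq);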

Although this fixes commit 91a967fd6934 ("rcu: Add full-sized polling
for get_completed*() and poll_state*()"), it is not clear that it is
worth backporting this commit. First, it took me many weeks to convince
rcutorture to reproduce this more frequently than once per year.
Second, this cannot be reproduced at all without frequent CPU-hotplug
operations, as in waiting all of 50 milliseconds from the end of the
previous operation until starting the next one. Third, the TREE03.boot
settings cause multi-millisecond delays during RCU grace-period
initialization, which greatly increase the probability of the above
sequence of events. (Don't do this in production workloads!) Fourth,
the TREE03 rcutorture scenario was modified to use four-CPU guest OSes,
to have a single-rcu_node combining tree, no testing of RCU priority
boosting, and no random preemption, and these modifications were
necessary to reproduce this issue in a reasonable timeframe. Fifth,
extremely heavy use of get_state_synchronize_rcu_full() and/or
poll_state_synchronize_rcu_full() is required to reproduce this, and as
of v6.12, only kfree_rcu() uses it, and even then not particularly
heavily.

[boqun: Apply the fix [1], and add the comment before the moved
rcu_sr_normal_gp_init(). Additional links are added for explanation.]

Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Tested-by: Uladzislau Rezki (Sony) <[email protected]>
Link: https://lore.kernel.org/rcu/d90bd6d9-d15c-4b9b-8a69-95336e74e8f4@paulmck-laptop/ [1]
Link: https://lore.kernel.org/rcu/20250303001507.GA3994772@joelnvbox/ [2]
Link: https://lore.kernel.org/rcu/Z8bcUsZ9IpRi1QoP@pc636/ [3]
Reviewed-by: Joel Fernandes <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>


# 7acc2d90 18-Dec-2024 Paul E. McKenney <[email protected]>

rcutorture: Make cur_ops->format_gp_seqs take buffer length

The Tree and Tiny implementations of rcutorture_format_gp_seqs() use
hard-coded constants for the length of the buffer that they format into.
This is of course an accident waiting to happen, so this commit therefore
makes them take a length argument. The rcutorture calling code uses
ARRAY_SIZE() to safely compute this new argument.
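
A hedged sketch of the resulting call pattern (the argument names here
are illustrative rather than the exact rcutorture signature):

	char gp_buf[32];	/* Hypothetical buffer. */

	if (cur_ops->format_gp_seqs)
		cur_ops->format_gp_seqs(seqs, gp_buf, ARRAY_SIZE(gp_buf));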

Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>


Revision tags: v6.13-rc2
# 2db7ab8c 03-Dec-2024 Paul E. McKenney <[email protected]>

rcutorture: Expand failure/close-call grace-period output

With only eight bits per grace-period sequence number, wrap can happen
in 64 grace periods. This commit therefore increases this to sixteen
bits for normal grace-period sequence numbers and the combined short-form
polling sequence numbers, thus deferring wrap for at least 16,384 grace
periods. Because expedited grace periods go faster, expand these to 24
bits, deferring wrap for at least 4,194,304 expedited grace periods.
These longer wrap times make it easier to correlate these numbers with
trace-event output.

Note that the low-order two bits are reserved for intra-grace-period
state, hence the above wrap numbers being a factor of four smaller than
you might expect.
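
As a quick check of those figures: with the two low-order state bits
reserved, a 16-bit field distinguishes 2^16 / 4 = 16,384 grace periods
and a 24-bit field distinguishes 2^24 / 4 = 4,194,304 expedited grace
periods, matching the numbers above.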

Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>


Revision tags: v6.13-rc1, v6.12
# 84ae9101 14-Nov-2024 Paul E. McKenney <[email protected]>

rcutorture: Include grace-period sequence numbers in failure/close-call

This commit includes the grace-period sequence numbers at the beginning
and end of each segment in the "Failure/close-call rcutorture reader
segments" list. These are in hexadecimal, and only the bottom byte.
Currently, only RCU is supported, with its three sequence numbers (normal,
expedited, and polled).

Note that if all the grace-period sequence numbers remain the same across
a given reader segment, only one copy of the number will be printed.
Of course, if there is a change, both sets of values will be printed.

Because the overhead of collecting this information can suppress
heisenbugs, this information is collected and printed only in kernels
built with CONFIG_RCU_TORTURE_TEST_LOG_GP=y.

[ paulmck: Apply Nathan Chancellor feedback for IS_ENABLED(). ]
[ paulmck: Apply feedback from kernel test robot. ]

Signed-off-by: Paul E. McKenney <[email protected]>
Tested-by: kernel test robot <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>


# 7f4b19ef 03-Feb-2025 Vlastimil Babka <[email protected]>

rcu: remove trace_rcu_kvfree_callback

Tree RCU no longer handles kvfree_rcu() by queueing individual objects
via call_rcu(), so the tracepoint and the associated
__is_kvfree_rcu_offset() check are now dead code. Remove them.

Reviewed-by: Joel Fernandes (Google) <[email protected]>
Reviewed-by: Hyeonggon Yoo <[email protected]>
Tested-by: Paul E. McKenney <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>


Revision tags: v6.12-rc7
# 764f6a81 10-Nov-2024 Zilin Guan <[email protected]>

rcu: Remove READ_ONCE() for rdp->gpwrap access in __note_gp_changes()

There is one access to the per-CPU rdp->gpwrap field in the
__note_gp_changes() function that does not use READ_ONCE(), but all other
accesses do use READ_ONCE(). When using the 8*TREE03 and CONFIG_NR_CPUS=8
configuration, KCSAN found no data races at that point. This is because
all calls to __note_gp_changes() hold rnp->lock, which excludes writes
to the rdp->gpwrap fields for all CPUs associated with that same leaf
rcu_node structure.

This commit therefore removes READ_ONCE() from rdp->gpwrap accesses
within the __note_gp_changes() function.
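
A hedged sketch of the resulting access pattern inside
__note_gp_changes() (the action taken is hypothetical; the point is the
plain read under rnp->lock):

	raw_lockdep_assert_held_rcu_node(rnp);	/* Writers to ->gpwrap are excluded. */
	if (rdp->gpwrap)			/* Plain read; READ_ONCE() not needed here. */
		handle_gp_wrap(rdp);		/* Hypothetical action. */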

Signed-off-by: Zilin Guan <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>


# 053ca725 09-Jan-2025 Paul E. McKenney <[email protected]>

rcu: Add CONFIG_RCU_LAZY delays to call_rcu() kernel-doc header

This commit adds a description of the energy-efficiency delays that
call_rcu() can impose, along with a pointer to call_rcu_hurry() for
latency-sensitive kernel code.
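
For reference, a minimal sketch of the trade-off the new kernel-doc
text points at (object and callback names are illustrative):

	/* Energy-efficient default: may be delayed under CONFIG_RCU_LAZY. */
	call_rcu(&obj->rh, obj_free_cb);

	/* Latency-sensitive alternative: bypasses the laziness. */
	call_rcu_hurry(&other_obj->rh, other_free_cb);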

Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>


# 21ef2498 09-Jan-2025 Paul E. McKenney <[email protected]>

rcu: Document self-propagating callbacks

This commit documents the fact that a given RCU callback function can
repost itself.
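
A minimal sketch of such a self-propagating callback (names are
illustrative):

	static void self_prop_cb(struct rcu_head *rhp)
	{
		/* ... do the periodic work ... */
		call_rcu(rhp, self_prop_cb);	/* Repost for the next grace period. */
	}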

Reported-by: Jens Axboe <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>


# d40797d6 22-Nov-2024 Peter Zijlstra <[email protected]>

kasan: make kasan_record_aux_stack_noalloc() the default behaviour

kasan_record_aux_stack_noalloc() was introduced to record a stack trace
without allocating memory in the process. It has been added to callers
which were invoked while a raw_spinlock_t was held. More and more callers
were identified and changed over time. Is it a good thing to have this
while functions try their best to do a lockless setup? The only
downside of having kasan_record_aux_stack() not allocate any memory is
that we end up without a stacktrace if stackdepot runs out of memory and
the same stacktrace was not recorded before. To quote Marco Elver from
https://lore.kernel.org/all/CANpmjNPmQYJ7pv1N3cuU8cP18u7PP_uoZD8YxwZd4jtbof9nVQ@mail.gmail.com/

| I'd be in favor, it simplifies things. And stack depot should be
| able to replenish its pool sufficiently in the "non-aux" cases
| i.e. regular allocations. Worst case we fail to record some
| aux stacks, but I think that's only really bad if there's a bug
| around one of these allocations. In general the probabilities
| of this being a regression are extremely small [...]

Make the kasan_record_aux_stack_noalloc() behaviour the default
behaviour of kasan_record_aux_stack().
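
In other words, after this change a call site that needs the
non-allocating behaviour simply uses the plain API (sketch; the pointer
name is illustrative):

	kasan_record_aux_stack(ptr);	/* No longer allocates, even under a raw_spinlock_t. */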

[[email protected]: dressed the diff as patch]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 7cb3007ce2da ("kasan: generic: introduce kasan_record_aux_stack_noalloc()")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Reported-by: [email protected]
Closes: https://lore.kernel.org/all/[email protected]
Reviewed-by: Andrey Konovalov <[email protected]>
Reviewed-by: Marco Elver <[email protected]>
Reviewed-by: Waiman Long <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Ben Segall <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Hyeonggon Yoo <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Zqiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# bbe658d6 12-Dec-2024 Uladzislau Rezki (Sony) <[email protected]>

mm/slab: Move kvfree_rcu() into SLAB

Move kvfree_rcu() functionality to the slab_common.c file.

The reason to have the kvfree_rcu() functionality as part of SLAB is
that there is a clear trend toward, and need for, closer integration.
One recent example is creating a barrier function for SLAB caches.

Another reason is to avoid having several implementations of the RCU
machinery for reclaiming objects after a grace period. As a future
step, it can be integrated more easily with SLAB internals.

Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Acked-by: Hyeonggon Yoo <[email protected]>
Tested-by: Hyeonggon Yoo <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>


# c18bcd85 12-Dec-2024 Uladzislau Rezki (Sony) <[email protected]>

rcu/kvfree: Adjust a shrinker name

Rename "rcu-kfree" to "slab-kvfree-rcu" since it goes to the
slab_common.c file soon.

Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Acked-by: Hyeonggon Yoo <[email protected]>
Tested-by: Hyeonggon Yoo <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>


# ba5cac52 12-Dec-2024 Uladzislau Rezki (Sony) <[email protected]>

rcu/kvfree: Adjust names passed into trace functions

Currently the trace functions are supplied with the "rcu_state.name"
member, which is located in the rcu_state structure. The problem is
that the "rcu_state" structure variable is local and cannot be
accessed from another place.

To address this, this preparation patch passes the "slab" string
as the first argument.

Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Acked-by: Hyeonggon Yoo <[email protected]>
Tested-by: Hyeonggon Yoo <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>


# d824ed70 12-Dec-2024 Uladzislau Rezki (Sony) <[email protected]>

rcu/kvfree: Move some functions under CONFIG_TINY_RCU

Currently, when Tiny RCU is enabled, the tree.c file is not
compiled, so the duplicate function names do not conflict with
each other.

Because the kvfree_rcu() functionality is moving to SLAB, we have
to reorder some functions and place them together under the
CONFIG_TINY_RCU macro definition. This way, those function names
will not conflict when a kernel is compiled for the CONFIG_TINY_RCU
flavor.

Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Acked-by: Hyeonggon Yoo <[email protected]>
Tested-by: Hyeonggon Yoo <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>


# 0f52b4db 12-Dec-2024 Uladzislau Rezki (Sony) <[email protected]>

rcu/kvfree: Initialize kvfree_rcu() separately

Introduce a separate initialization of the kvfree_rcu() functionality.
For this purpose, kfree_rcu_batch_init() is renamed to kvfree_rcu_init(),
and it is invoked from main.c right after rcu_init() is done.
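
A sketch of the resulting boot-time ordering in init/main.c
(surrounding calls elided):

	rcu_init();
	kvfree_rcu_init();	/* Formerly kfree_rcu_batch_init(); now a separate step. */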

Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Acked-by: Hyeonggon Yoo <[email protected]>
Tested-by: Hyeonggon Yoo <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>


Revision tags: v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1
# 8044c589 26-Sep-2024 Frederic Weisbecker <[email protected]>

rcu: Use kthread preferred affinity for RCU exp kworkers

Now that kthreads have an infrastructure to handle preferred affinity
against CPU hotplug and housekeeping cpumask, convert RCU exp workers to
use it instead of handling all the constraints by themselves.

Acked-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>


# b04e317b 26-Sep-2024 Frederic Weisbecker <[email protected]>

treewide: Introduce kthread_run_worker[_on_cpu]()

kthread_create() creates a kthread without running it yet. kthread_run()
creates a kthread and runs it.

On the other hand, kthread_create_worker() creates a kthread worker and
runs it.

This difference in behaviours is confusing. Also there is no way to
create a kthread worker and affine it using kthread_bind_mask() or
kthread_affine_preferred() before starting it.

Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]()
that behaves just like kthread_run(). kthread_create_worker[_on_cpu]()
will now only create a kthread worker without starting it.
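
A hedged sketch of the two resulting patterns (the flags and names are
illustrative; error handling elided):

	struct kthread_worker *kw1, *kw2;

	/* Create only: the worker can be affined before it starts running. */
	kw1 = kthread_create_worker(0, "my_worker_a");

	/* Create and start in one step, mirroring kthread_run(). */
	kw2 = kthread_run_worker(0, "my_worker_b");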

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>


# db7ee3cb 26-Sep-2024 Frederic Weisbecker <[email protected]>

rcu: Use kthread preferred affinity for RCU boost

Now that kthreads have an infrastructure to handle preferred affinity
against CPU hotplug and housekeeping cpumask, convert RCU boost to use
it instead of handling all the constraints by itself.

Acked-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>


# 049dfe96 02-Oct-2024 Frederic Weisbecker <[email protected]>

rcu: Report callbacks enqueued on offline CPU blind spot

Callbacks enqueued after rcutree_report_cpu_dead() fall into the RCU
barrier blind spot. Report any potential misuse.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>


# a23da88c 22-Oct-2024 Uladzislau Rezki (Sony) <[email protected]>

rcu/kvfree: Fix data-race in __mod_timer / kvfree_call_rcu

KCSAN reports a data race when accessing the krcp->monitor_work.timer.expires
variable in the schedule_delayed_monitor_work() function:

<snip>
BUG: KCSAN: data-race in __mod_timer / kvfree_call_rcu

read to 0xffff888237d1cce8 of 8 bytes by task 10149 on cpu 1:
schedule_delayed_monitor_work kernel/rcu/tree.c:3520 [inline]
kvfree_call_rcu+0x3b8/0x510 kernel/rcu/tree.c:3839
trie_update_elem+0x47c/0x620 kernel/bpf/lpm_trie.c:441
bpf_map_update_value+0x324/0x350 kernel/bpf/syscall.c:203
generic_map_update_batch+0x401/0x520 kernel/bpf/syscall.c:1849
bpf_map_do_batch+0x28c/0x3f0 kernel/bpf/syscall.c:5143
__sys_bpf+0x2e5/0x7a0
__do_sys_bpf kernel/bpf/syscall.c:5741 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5739 [inline]
__x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5739
x64_sys_call+0x2625/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:322
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

write to 0xffff888237d1cce8 of 8 bytes by task 56 on cpu 0:
__mod_timer+0x578/0x7f0 kernel/time/timer.c:1173
add_timer_global+0x51/0x70 kernel/time/timer.c:1330
__queue_delayed_work+0x127/0x1a0 kernel/workqueue.c:2523
queue_delayed_work_on+0xdf/0x190 kernel/workqueue.c:2552
queue_delayed_work include/linux/workqueue.h:677 [inline]
schedule_delayed_monitor_work kernel/rcu/tree.c:3525 [inline]
kfree_rcu_monitor+0x5e8/0x660 kernel/rcu/tree.c:3643
process_one_work kernel/workqueue.c:3229 [inline]
process_scheduled_works+0x483/0x9a0 kernel/workqueue.c:3310
worker_thread+0x51d/0x6f0 kernel/workqueue.c:3391
kthread+0x1d1/0x210 kernel/kthread.c:389
ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 56 Comm: kworker/u8:4 Not tainted 6.12.0-rc2-syzkaller-00050-g5b7c893ed5ed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: events_unbound kfree_rcu_monitor
<snip>

kfree_rcu_monitor() rearms the work if a "krcp" still has to be
offloaded, and this is done without holding krcp->lock, whereas
kvfree_call_rcu() holds it.

Fix this by acquiring "krcp->lock" in kfree_rcu_monitor() so that
the two functions no longer race.
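
A hedged sketch of the fixed rearm path in kfree_rcu_monitor() (helper
names as referenced above; surrounding code elided):

	raw_spin_lock_irqsave(&krcp->lock, flags);
	if (need_offload_krc(krcp))
		schedule_delayed_monitor_work(krcp);	/* Timer now rearmed under krcp->lock. */
	raw_spin_unlock_irqrestore(&krcp->lock, flags);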

Reported-by: [email protected]
Link: https://lore.kernel.org/lkml/ZxZ68KmHDQYU0yfD@pc636/T/
Fixes: 8fc5494ad5fa ("rcu/kvfree: Move need_offload_krc() out of krcp->lock")
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Reviewed-by: Neeraj Upadhyay <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>


# a3076380 09-Oct-2024 Paul E. McKenney <[email protected]>

rcu: Permit start_poll_synchronize_rcu*() with interrupts disabled

The header comments for both start_poll_synchronize_rcu() and
start_poll_synchronize_rcu_full() state that interrupts must be enabled
when calling these two functions, and there is a lockdep assertion in
start_poll_synchronize_rcu_common() enforcing this restriction. However,
there is no need for this restriction, as can be seen in call_rcu(),
which does wakeups when interrupts are disabled.

This commit therefore removes the lockdep assertion and the comments.
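
A minimal usage sketch of what this now permits (the lock here is
illustrative):

	unsigned long cookie, flags;

	spin_lock_irqsave(&my_lock, flags);		/* Interrupts disabled. */
	cookie = start_poll_synchronize_rcu();		/* Now legal in this context. */
	spin_unlock_irqrestore(&my_lock, flags);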

Reported-by: Kent Overstreet <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Neeraj Upadhyay <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>


Revision tags: v6.11, v6.11-rc7
# 5d2501f4 02-Sep-2024 Jinjie Ruan <[email protected]>

rcu: Use the BITS_PER_LONG macro

sizeof(unsigned long) * 8 is the number of bits in an unsigned long
variable; replace it with the BITS_PER_LONG macro to make it simpler.
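
Shown as a before/after fragment (the surrounding expression is
illustrative):

	- nbits = sizeof(unsigned long) * 8;
	+ nbits = BITS_PER_LONG;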

Signed-off-by: Jinjie Ruan <[email protected]>
Reviewed-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>


# 3c5d61ae 30-Sep-2024 Uladzislau Rezki (Sony) <[email protected]>

rcu/kvfree: Refactor kvfree_rcu_queue_batch()

Improve the readability of the kvfree_rcu_queue_batch() function so
that, after the first batch is queued, the loop breaks and a success
value is returned to the caller.

There is no reason to loop over and check further batches, as all
outstanding objects have already been picked up and attached to a
certain batch to complete the offloading.

Fixes: 2b55d6a42d14 ("rcu/kvfree: Add kvfree_rcu_barrier() API")
Suggested-by: Linus Torvalds <[email protected]>
Closes: https://lore.kernel.org/lkml/ZvWUt2oyXRsvJRNc@pc636/T/
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>


Revision tags: v6.11-rc6, v6.11-rc5
# 2b55d6a4 20-Aug-2024 Uladzislau Rezki (Sony) <[email protected]>

rcu/kvfree: Add kvfree_rcu_barrier() API

Add a kvfree_rcu_barrier() function. It waits until all
in-flight pointers are freed by the RCU machinery. It does
not wait for any grace-period completion and is within its
rights to return immediately if there are no outstanding pointers.

This function is useful when there is a need to guarantee
that memory is fully freed before destroying memory caches,
for example, when unloading a kernel module.
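
A minimal usage sketch for a module-exit path (the cache name is
illustrative):

	static void __exit my_mod_exit(void)
	{
		kvfree_rcu_barrier();			/* All in-flight kvfree_rcu() pointers freed. */
		kmem_cache_destroy(my_obj_cache);	/* Now safe to destroy the backing cache. */
	}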

Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>


Revision tags: v6.11-rc4
# e68ac2b4 15-Aug-2024 Caleb Sander Mateos <[email protected]>

softirq: Remove unused 'action' parameter from action callback

When soft interrupt actions are called, they are passed a pointer to
the struct softirq_action which contains the action's function pointer.

This pointer isn't useful, as the action callback already knows what
function it is. And since each callback handles a specific soft interrupt,
the callback also knows which soft interrupt number is running.

No soft interrupt action callback actually uses this parameter, so remove
it from the function pointer signature. This clarifies that soft interrupt
actions are global routines and makes it slightly cheaper to call them.
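
For tree.c this means the RCU softirq handler's signature changes along
the lines of this sketch:

	-static void rcu_core_si(struct softirq_action *h)
	+static void rcu_core_si(void)
	 {
	 	rcu_core();
	 }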

Signed-off-by: Caleb Sander Mateos <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
