|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7 |
|
| #
76ce2b3d |
| 29-Apr-2024 |
Valentin Schneider <[email protected]> |
rcu: Rename struct rcu_data .exp_dynticks_snap into .exp_watching_snap
The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, and the snapshot helpers are now prefixed by "rcu_watching". Reflect that change in the storage variables for these snapshots.
Signed-off-by: Valentin Schneider <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Signed-off-by: Neeraj Upadhyay <[email protected]>
|
| #
2dba2230 |
| 29-Apr-2024 |
Valentin Schneider <[email protected]> |
rcu: Rename struct rcu_data .dynticks_snap into .watching_snap
The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, and the snapshot helpers are now prefixed by "rcu_watching". Reflect that change in the storage variables for these snapshots.
Signed-off-by: Valentin Schneider <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Signed-off-by: Neeraj Upadhyay <[email protected]>
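For illustration, a minimal sketch of the naming that results from these two renames (a simplified stand-in, not the kernel's actual struct rcu_data definition):

  /* Simplified model: the per-CPU snapshots follow the RCU_WATCHING naming. */
  struct rcu_data_model {
          int cpu;
          unsigned long watching_snap;            /* was ->dynticks_snap */
          unsigned long exp_watching_snap;        /* was ->exp_dynticks_snap */
  };

  /*
   * Record the current RCU-watching counter so that a later pass can tell
   * whether this CPU went through a quiescent state in the meantime.
   */
  static void snapshot_watching(struct rcu_data_model *rdp,
                                unsigned long (*read_watching)(int cpu))
  {
          rdp->watching_snap = read_watching(rdp->cpu);
  }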
|
| #
7121dd91 |
| 30-May-2024 |
Frederic Weisbecker <[email protected]> |
rcu/nocb: Introduce nocb mutex
The barrier_mutex is currently used to protect (de-)offloading operations and to prevent nocb_lock locking imbalance in rcu_barrier() and the shrinker, as well as misordered RCU barrier invocation.
Now that RCU (de-)offloading is going to happen on offline CPUs, an RCU barrier will have to be executed while transitioning from offloaded to de-offloaded state. And this can't happen while holding the barrier_mutex.
Introduce a NOCB mutex to protect (de-)offloading transitions. The barrier_mutex is still held for now when necessary to avoid barrier callback reordering and nocb_lock imbalance.
Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Neeraj Upadhyay <[email protected]>
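A minimal sketch of the locking scheme described above, under the assumption of hypothetical helpers (rcu_nocb_mutex_sketch, do_offload_cpu() and do_deoffload_cpu() are illustrative names, not kernel symbols):

  #include <linux/types.h>
  #include <linux/mutex.h>

  static DEFINE_MUTEX(rcu_nocb_mutex_sketch);     /* hypothetical mutex */

  int do_offload_cpu(int cpu);                    /* hypothetical helpers */
  int do_deoffload_cpu(int cpu);

  static int rcu_nocb_toggle_offload_sketch(int cpu, bool offload)
  {
          int ret;

          /*
           * All (de-)offloading state transitions serialize here instead
           * of piggy-backing on barrier_mutex, so rcu_barrier() no longer
           * has to worry about nocb_lock imbalance caused by a concurrent
           * offload switch.
           */
          mutex_lock(&rcu_nocb_mutex_sketch);
          ret = offload ? do_offload_cpu(cpu) : do_deoffload_cpu(cpu);
          mutex_unlock(&rcu_nocb_mutex_sketch);
          return ret;
  }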
|
| #
ff81428e |
| 30-May-2024 |
Frederic Weisbecker <[email protected]> |
rcu/nocb: Move nocb field at the end of state struct
nocb_is_setup is a rarely used field, accessed mostly at boot and during CPU hotplug. It shouldn't occupy the middle of the rcu_state hot fields cacheline.
Move it to the end and build it conditionally while at it. More cold NOCB fields are to come.
Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Neeraj Upadhyay <[email protected]>
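An illustrative layout only (simplified field set, not the real struct rcu_state), showing the idea of keeping the cold, conditionally built NOCB flag off the hot cache line:

  struct rcu_state_sketch {
          /* Hot fields, touched on every grace-period cycle. */
          unsigned long gp_seq;
          unsigned long gp_flags;
          /* ... more hot state ... */

          /* Cold fields, boot and CPU-hotplug only, kept at the end. */
  #ifdef CONFIG_RCU_NOCB_CPU
          bool nocb_is_setup;             /* nocb state initialized? */
  #endif
  };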
|
|
Revision tags: v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1 |
|
| #
6f948568 |
| 19-Mar-2024 |
Joel Fernandes (Google) <[email protected]> |
rcu/tree: Reduce wake up for synchronize_rcu() common case
In the synchronize_rcu() common case, we will have fewer than SR_MAX_USERS_WAKE_FROM_GP users per grace period. Waking up the kworker just to free the last injected wait head is pointless, since at that point all the users have already been awakened.
Introduce a new counter to track this and prevent the wakeup in the common case.
[ paulmck: Remove atomic_dec_return_release in cannot-happen state. ]
Signed-off-by: Joel Fernandes (Google) <[email protected]> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
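A rough sketch of the idea, with illustrative names (the kernel's actual counter and helpers may differ):

  #include <linux/atomic.h>
  #include <linux/workqueue.h>

  struct sr_state_sketch {
          atomic_t nr_pending;            /* users not yet awakened */
          struct work_struct cleanup_work;
  };

  static void sr_gp_cleanup_sketch(struct sr_state_sketch *sr)
  {
          /*
           * If the direct wakeups already drained every user, the kworker
           * would only free the last injected wait head, which is not
           * worth a wakeup in the common (few users per GP) case.
           */
          if (atomic_read(&sr->nr_pending))
                  queue_work(system_wq, &sr->cleanup_work);
  }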
|
| #
e4f78057 |
| 25-Apr-2024 |
Frederic Weisbecker <[email protected]> |
rcu/nocb: Remove buggy bypass lock contention mitigation
The bypass lock contention mitigation assumes there can be at most 2 contenders on the bypass lock, following this scheme:
1) One kthread takes the bypass lock.
2) Another one spins on it and increments the contended counter.
3) A third one (a bypass enqueuer) sees the contended counter set and busy-loops waiting for it to decrement.
However, this assumption is wrong. Only one CPU can ever find the lock contended, because call_rcu() (the bypass enqueuer) is the only bypass-lock acquire site that may not already hold the NOCB lock beforehand; all the other sites must first contend on the NOCB lock. Therefore step 2) is impossible.
The other problem is that the mitigation assumes that contenders all belong to the same rdp CPU, which is also impossible for a raw spinlock. In theory the warning could trigger if the enqueuer holds the bypass lock while another CPU flushes the bypass queue concurrently, but all flush users prevent this:
1) NOCB kthreads only flush if they successfully _tried_ to lock the bypass lock. So no contention management here.
2) Flush on callbacks migration happen remotely when the CPU is offline. No concurrency against bypass enqueue.
3) Flush on deoffloading happen either locally with IRQs disabled or remotely when the CPU is not yet online. No concurrency against bypass enqueue.
4) Flush on barrier entrain happen either locally with IRQs disabled or remotely when the CPU is offline. No concurrency against bypass enqueue.
For those reasons, the bypass lock contention mitigation isn't needed and is even wrong. Remove it, but keep the warning reporting a contended bypass lock on a remote CPU, to retain awareness of unexpected contention.
Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
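A simplified sketch of the retained diagnostic (not the exact kernel helper): the busy-loop mitigation is gone, but taking the bypass lock still warns when it is found contended from a remote CPU, which the analysis above argues cannot legitimately happen.

  #include <linux/spinlock.h>
  #include <linux/smp.h>
  #include <linux/bug.h>

  struct rcu_data_sketch {
          int cpu;
          raw_spinlock_t nocb_bypass_lock;
  };

  /* Typically called with IRQs disabled in the enqueue path. */
  static void nocb_bypass_lock_sketch(struct rcu_data_sketch *rdp)
  {
          if (raw_spin_trylock(&rdp->nocb_bypass_lock))
                  return;
          /* Contention is only ever expected from the rdp's own CPU. */
          WARN_ON_ONCE(smp_processor_id() != rdp->cpu);
          raw_spin_lock(&rdp->nocb_bypass_lock);
  }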
|
|
Revision tags: v6.8 |
|
| #
462df2f5 |
| 08-Mar-2024 |
Uladzislau Rezki (Sony) <[email protected]> |
rcu: Support direct wake-up of synchronize_rcu() users
This patch introduces a small enhancement that allows a direct wake-up of synchronize_rcu() callers. It occurs after a grace period completes and is therefore performed by the GP kthread.
The number of clients woken this way is limited by a hard-coded maximum threshold. The remainder, if any, is deferred to the main worker.
Link: https://lore.kernel.org/lkml/Zd0ZtNu+Rt0qXkfS@lothringen/
Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
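A rough sketch of the mechanism, with illustrative names and a hypothetical threshold (the actual limit and data structures live in the kernel patch itself):

  #include <linux/llist.h>
  #include <linux/completion.h>
  #include <linux/workqueue.h>

  #define SR_DIRECT_WAKE_MAX_SKETCH 5     /* hypothetical threshold */

  struct sr_waiter_sketch {
          struct llist_node node;
          struct completion done;
  };

  /* Runs in the GP kthread right after a grace period completes. */
  static void sr_wake_users_sketch(struct llist_node *waiters,
                                   struct work_struct *deferred_work)
  {
          struct llist_node *pos = waiters;
          int woken = 0;

          while (pos && woken < SR_DIRECT_WAKE_MAX_SKETCH) {
                  struct sr_waiter_sketch *w =
                          llist_entry(pos, struct sr_waiter_sketch, node);

                  pos = pos->next;
                  complete(&w->done);     /* direct wakeup, no kworker */
                  woken++;
          }
          if (pos)                        /* too many users: defer the rest */
                  queue_work(system_highpri_wq, deferred_work);
  }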
|
| #
ae2b217a |
| 08-Mar-2024 |
Paul E. McKenney <[email protected]> |
rcu: Make hotplug operations track GP state, not flags
Currently, there are rcu_data structure fields named ->rcu_onl_gp_flags and ->rcu_ofl_gp_flags that track the rcu_state.gp_flags field at the time of the corresponding CPU's last online or offline operation, respectively. However, this information is not particularly useful. It would be better to instead track the grace-period state kept in rcu_state.gp_state. This would also be consistent with the initialization in rcu_boot_init_percpu_data(), which is to RCU_GP_CLEANED (an rcu_state.gp_state value), and also with the diagnostics in rcu_implicit_dynticks_qs(), whose format is consistent with an integer, not a bitmask.
This commit therefore makes this change and renames the fields to ->rcu_onl_gp_state and ->rcu_ofl_gp_state, respectively.
Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
|
| #
dfd458a9 |
| 08-Mar-2024 |
Uladzislau Rezki (Sony) <[email protected]> |
rcu: Add data structures for synchronize_rcu()
The synchronize_rcu() call is going to be reworked, so this patch adds dedicated fields to the rcu_state structure.
Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
|
|
Revision tags: v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1 |
|
| #
23da2ad6 |
| 12-Jan-2024 |
Frederic Weisbecker <[email protected]> |
rcu/exp: Remove rcu_par_gp_wq
TREE04 running on short iterations can produce writer stalls of the following kind:
  ??? Writer stall state RTWS_EXP_SYNC(4) g3968 f0x0 ->state 0x2 cpu 0
  task:rcu_torture_wri state:D stack:14568 pid:83 ppid:2 flags:0x00004000
  Call Trace:
   <TASK>
   __schedule+0x2de/0x850
   ? trace_event_raw_event_rcu_exp_funnel_lock+0x6d/0xb0
   schedule+0x4f/0x90
   synchronize_rcu_expedited+0x430/0x670
   ? __pfx_autoremove_wake_function+0x10/0x10
   ? __pfx_synchronize_rcu_expedited+0x10/0x10
   do_rtws_sync.constprop.0+0xde/0x230
   rcu_torture_writer+0x4b4/0xcd0
   ? __pfx_rcu_torture_writer+0x10/0x10
   kthread+0xc7/0xf0
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x2f/0x50
   ? __pfx_kthread+0x10/0x10
   ret_from_fork_asm+0x1b/0x30
   </TASK>
Waiting for an expedited grace period and polling for an expedited grace period are both operations that internally rely on the same workqueue to perform the necessary asynchronous work.
However, a dependency chain is involved between those two operations, as depicted below:
  ====== CPU 0 =======                     ====== CPU 1 =======

  synchronize_rcu_expedited()
     exp_funnel_lock()
        mutex_lock(&rcu_state.exp_mutex);
                                           start_poll_synchronize_rcu_expedited
                                              queue_work(rcu_gp_wq, &rnp->exp_poll_wq);
     synchronize_rcu_expedited_queue_work()
        queue_work(rcu_gp_wq, &rew->rew_work);
     wait_event() // A, wait for &rew->rew_work completion
     mutex_unlock() // B
                                           //======> switch to kworker

                                           sync_rcu_do_polled_gp() {
                                              synchronize_rcu_expedited()
                                                 exp_funnel_lock()
                                                    mutex_lock(&rcu_state.exp_mutex); // C, wait B
                                                    ....
                                           } // D
Since workqueues are usually implemented on top of several kworkers handling the queue concurrently, the above situation wouldn't deadlock most of the time because A then doesn't depend on D. But in case of memory stress, a single kworker may end up handling alone all the works in a serialized way. In that case the above layout becomes a problem because A then waits for D, closing a circular dependency:
A -> D -> C -> B -> A
This however only happens when CONFIG_RCU_EXP_KTHREAD=n. Indeed synchronize_rcu_expedited() is otherwise implemented on top of a kthread worker while polling still relies on rcu_gp_wq workqueue, breaking the above circular dependency chain.
Fix this by making expedited grace periods always rely on kthread workers. The workqueue-based implementation is essentially a duplicate anyway, now that the per-node initialization is performed by per-node kthread workers.
Meanwhile the CONFIG_RCU_EXP_KTHREAD switch is still kept around to manage the scheduler policy of these kthread workers.
Reported-by: Anna-Maria Behnsen <[email protected]> Reported-by: Thomas Gleixner <[email protected]> Suggested-by: Joel Fernandes <[email protected]> Suggested-by: Paul E. McKenney <[email protected]> Suggested-by: Neeraj upadhyay <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Boqun Feng <[email protected]>
|
| #
8e5e6215 |
| 12-Jan-2024 |
Frederic Weisbecker <[email protected]> |
rcu/exp: Make parallel exp gp kworker per rcu node
When CONFIG_RCU_EXP_KTHREAD=n, the expedited grace period per node initialization is performed in parallel via workqueues (one work per node).
However, when CONFIG_RCU_EXP_KTHREAD=y, this per-node initialization is performed by a single kworker serializing the node initializations (one work for all nodes).
The latter is certainly less scalable and less efficient beyond a single leaf node.
To improve this, expand this single kworker into per-node kworkers. This new layout is eventually intended to replace the workqueue-based implementation, since that will essentially become duplicate code.
Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Boqun Feng <[email protected]>
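A sketch of the per-node kthread_worker setup, with simplified types (struct rnp_sketch and its fields are illustrative; the kthread_worker API calls are real kernel interfaces):

  #include <linux/kthread.h>
  #include <linux/err.h>

  struct rnp_sketch {
          int grplo;                              /* first CPU covered, used for naming */
          struct kthread_worker *exp_kworker;     /* one worker per leaf node */
          struct kthread_work exp_work;
  };

  static int rnp_spawn_exp_worker_sketch(struct rnp_sketch *rnp,
                                         void (*initfn)(struct kthread_work *))
  {
          rnp->exp_kworker = kthread_create_worker(0, "rcu_exp_par_gp_kworker/%d",
                                                   rnp->grplo);
          if (IS_ERR(rnp->exp_kworker))
                  return PTR_ERR(rnp->exp_kworker);
          kthread_init_work(&rnp->exp_work, initfn);
          return 0;
  }

  /* Per expedited GP, each node queues its initialization to its own worker. */
  static void rnp_queue_exp_init_sketch(struct rnp_sketch *rnp)
  {
          kthread_queue_work(rnp->exp_kworker, &rnp->exp_work);
  }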
|
| #
7836b270 |
| 12-Jan-2024 |
Frederic Weisbecker <[email protected]> |
rcu: s/boost_kthread_mutex/kthread_mutex
This mutex currently protects per-node boost kthread creation and affinity setting across CPU hotplug operations.
Since the expedited kworkers will soon be split per node as well, they will be subject to the same concurrency constraints against hotplug.
Therefore their creation and affinity tuning operations will be grouped with those of boost kthreads and then rely on the same mutex.
To prepare for that, generalize its name.
Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Boqun Feng <[email protected]>
|
| #
afd4e696 |
| 09-Jan-2024 |
Frederic Weisbecker <[email protected]> |
rcu/nocb: Re-arrange call_rcu() NOCB specific code
Currently the call_rcu() function interleaves NOCB and !NOCB enqueue code in a complicated way such that:
* The bypass enqueue code may or may not have enqueued and may or may not have locked the ->nocb_lock. Everything that follows is in a Schrödinger locking state for the unwary reviewer's eyes.
* The was_alldone flag is always set but only used in NOCB-related code.
* The NOCB wake-up is distantly related to the locking hopefully performed by the bypass enqueue code that did not enqueue on the bypass list.
Unconfuse the whole thing and gather the NOCB and !NOCB specific enqueue code into their own functions.
Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Boqun Feng <[email protected]>
|
|
Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1 |
|
| #
b96e7a5f |
| 05-Sep-2023 |
Joel Fernandes (Google) <[email protected]> |
rcu/tree: Defer setting of jiffies during stall reset
There are instances where rcu_cpu_stall_reset() is called when jiffies did not get a chance to update for a long time. Before jiffies is updated, the CPU stall detector can go off triggering false-positives where a just-started grace period appears to be ages old. In the past, we disabled stall detection in rcu_cpu_stall_reset() however this got changed [1]. This is resulting in false-positives in KGDB usecase [2].
Fix this by deferring the update of jiffies to the third run of the FQS loop. This is more robust, as, even if rcu_cpu_stall_reset() is called just before jiffies is read, we would end up pushing out the jiffies read by 3 more FQS loops. Meanwhile the CPU stall detection will be delayed and we will not get any false positives.
[1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/
Tested with rcutorture.cpu_stall option as well to verify stall behavior with/without patch.
Tested-by: Huacai Chen <[email protected]> Reported-by: Binbin Zhou <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Suggested-by: Paul McKenney <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Fixes: a80be428fbc1 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()") Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]>
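A self-contained sketch of the deferral (constant and field names are illustrative; the kernel's actual state lives in rcu_state):

  #define NR_FQS_DEFER_SKETCH 3           /* refresh on the third FQS pass */

  struct stall_state_sketch {
          int nr_fqs_until_refresh;       /* armed by the stall reset */
          unsigned long jiffies_stall;    /* stall-warning fence */
  };

  /* rcu_cpu_stall_reset()-like path: jiffies may be stale here. */
  static void stall_reset_sketch(struct stall_state_sketch *s)
  {
          s->nr_fqs_until_refresh = NR_FQS_DEFER_SKETCH;
  }

  /* FQS-loop path: by now jiffies has had a chance to catch up. */
  static void stall_fqs_tick_sketch(struct stall_state_sketch *s,
                                    unsigned long now, unsigned long timeout)
  {
          if (s->nr_fqs_until_refresh && !--s->nr_fqs_until_refresh)
                  s->jiffies_stall = now + timeout;
  }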
|
|
Revision tags: v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6 |
|
| #
be42f00b |
| 19-Nov-2022 |
Zhen Lei <[email protected]> |
rcu: Add RCU stall diagnosis information
Because RCU CPU stall warnings are driven from the scheduling-clock interrupt handler, a workload consisting of a very large number of short-duration hardware interrupts can result in misleading stall-warning messages. On systems supporting only a single level of interrupts, that is, where interrupts handlers cannot be interrupted, this can produce misleading diagnostics. The stack traces will show the innocent-bystander interrupted task, not the interrupts that are at the very least exacerbating the stall.
This situation can be improved by displaying the number of interrupts and the CPU time that they have consumed. Diagnosing other types of stalls can be eased by also providing the count of softirqs and the CPU time that they consumed as well as the number of context switches and the task-level CPU time consumed.
Consider the following output given this change:
  rcu: INFO: rcu_preempt self-detected stall on CPU
  rcu:     0-....: (1250 ticks this GP) <omitted>
  rcu:              hardirqs   softirqs   csw/system
  rcu:      number:      624         45            0
  rcu:     cputime:       69          1         2425   ==> 2500(ms)
This output shows that the number of hard and soft interrupts is small, there are no context switches, and system-level CPU time accounts for almost all of the interval. This indicates that the current task is looping with preemption disabled.
The impact on system performance is negligible because the snapshot is recorded only once per continuous RCU stall.
This added debugging information is suppressed by default and can be enabled by building the kernel with CONFIG_RCU_CPU_STALL_CPUTIME=y or by booting with rcupdate.rcu_cpu_stall_cputime=1.
Signed-off-by: Zhen Lei <[email protected]> Reviewed-by: Mukesh Ojha <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
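For illustration, a simplified snapshot record of the kind this diagnostic relies on (the kernel's actual structure and field names may differ):

  struct stall_snap_sketch {
          unsigned long nr_hardirqs;              /* hard-interrupt count at snapshot */
          unsigned long nr_softirqs;              /* softirq count at snapshot */
          unsigned long nr_csw;                   /* context switches at snapshot */
          unsigned long long cputime_irq;         /* ns of hardirq time */
          unsigned long long cputime_softirq;     /* ns of softirq time */
          unsigned long long cputime_system;      /* ns of task/system time */
          unsigned long jiffies;                  /* when the snapshot was taken */
  };

  /* The report prints current values minus the snapshot, i.e. per-stall deltas. */
  static unsigned long delta_sketch(unsigned long now, unsigned long snap)
  {
          return now - snap;
  }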
|
|
Revision tags: v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1 |
|
| #
3cb278e7 |
| 16-Oct-2022 |
Joel Fernandes (Google) <[email protected]> |
rcu: Make call_rcu() lazy to save power
Implement timer-based RCU callback batching (also known as lazy callbacks). With this we save about 5-10% of the power consumed due to RCU requests that happen when the system is lightly loaded or idle.
By default, all async callbacks (queued via call_rcu) are marked lazy. An alternate API call_rcu_hurry() is provided for the few users, for example synchronize_rcu(), that need the old behavior.
The batch is flushed whenever a certain amount of time has passed, or the batch on a particular CPU grows too big. Also memory pressure will flush it in a future patch.
To handle several corner cases automagically (such as rcu_barrier() and hotplug), we re-use bypass lists which were originally introduced to address lock contention, to handle lazy CBs as well. The bypass list length has the lazy CB length included in it. A separate lazy CB length counter is also introduced to keep track of the number of lazy CBs.
[ paulmck: Fix formatting of inline call_rcu_lazy() definition. ] [ paulmck: Apply Zqiang feedback. ] [ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]
Suggested-by: Paul McKenney <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
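A usage sketch with a hypothetical object type: ordinary asynchronous frees can stay on the (now lazy) call_rcu() path, while latency-sensitive callers opt into call_rcu_hurry():

  #include <linux/rcupdate.h>
  #include <linux/slab.h>

  struct foo_sketch {
          int data;
          struct rcu_head rcu;
  };

  static void foo_free_rcu(struct rcu_head *head)
  {
          kfree(container_of(head, struct foo_sketch, rcu));
  }

  static void foo_release_lazy(struct foo_sketch *p)
  {
          /* Default path: may be batched and deferred to save power. */
          call_rcu(&p->rcu, foo_free_rcu);
  }

  static void foo_release_hurry(struct foo_sketch *p)
  {
          /* Caller needs the old, prompt callback behavior. */
          call_rcu_hurry(&p->rcu, foo_free_rcu);
  }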
|
| #
b8f7aca3 |
| 16-Oct-2022 |
Frederic Weisbecker <[email protected]> |
rcu: Fix missing nocb gp wake on rcu_barrier()
In preparation for RCU lazy changes, wake up the RCU nocb gp thread if needed after an entrain. This change prevents the RCU barrier callback from waiting in the queue for several seconds before the lazy callbacks in front of it are serviced.
Reported-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
|
|
Revision tags: v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3 |
|
| #
d96c52fe |
| 15-Apr-2022 |
Paul E. McKenney <[email protected]> |
rcu: Add polled expedited grace-period primitives
This commit adds expedited grace-period functionality to RCU's polled grace-period API, adding start_poll_synchronize_rcu_expedited() and cond_synchronize_rcu_expedited(), which are similar to the existing start_poll_synchronize_rcu() and cond_synchronize_rcu() functions, respectively.
Note that although start_poll_synchronize_rcu_expedited() can be invoked very early, the resulting expedited grace periods are not guaranteed to start until after workqueues are fully initialized. On the other hand, both synchronize_rcu() and synchronize_rcu_expedited() can also be invoked very early, and the resulting grace periods will be taken into account as they occur.
[ paulmck: Apply feedback from Neeraj Upadhyay. ]
Link: https://lore.kernel.org/all/[email protected]/ Link: https://docs.google.com/document/d/1RNKWW9jQyfjxw2E8dsXVTdvZYh0HnYeSHDKog9jhdN8/edit?usp=sharing Cc: Brian Foster <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Al Viro <[email protected]> Cc: Ian Kent <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
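A usage sketch of the new polled expedited API (the two new functions are those named above; foo_gp_cookie and the surrounding functions are illustrative):

  #include <linux/rcupdate.h>

  static unsigned long foo_gp_cookie;

  static void foo_start_cleanup(void)
  {
          /* Kick off an expedited GP and snapshot the grace-period state. */
          foo_gp_cookie = start_poll_synchronize_rcu_expedited();
  }

  static void foo_finish_cleanup(void)
  {
          /* Non-blocking check... */
          if (poll_state_synchronize_rcu(foo_gp_cookie))
                  return;         /* grace period already elapsed */
          /* ...or wait, expedited, only if it has not yet completed. */
          cond_synchronize_rcu_expedited(foo_gp_cookie);
  }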
|
| #
dd041405 |
| 14-Apr-2022 |
Paul E. McKenney <[email protected]> |
rcu: Make polled grace-period API account for expedited grace periods
Currently, this code could splat:
  oldstate = get_state_synchronize_rcu();
  synchronize_rcu_expedited();
  WARN_ON_ONCE(!poll_state_synchronize_rcu(oldstate));
This situation is counter-intuitive and user-unfriendly. After all, there really was a perfectly valid full grace period right after the call to get_state_synchronize_rcu(), so why shouldn't poll_state_synchronize_rcu() know about it?
This commit therefore makes the polled grace-period API aware of expedited grace periods in addition to the normal grace periods that it is already aware of. With this change, the above code is guaranteed not to splat.
Please note that the above code can still splat due to counter wrap on the one hand and situations involving partially overlapping normal/expedited grace periods on the other. On 64-bit systems, the second is of course much more likely than the first. It is possible to modify this approach to prevent overlapping grace periods from causing splats, but only at the expense of greatly increasing the probability of counter wrap, as in within milliseconds on 32-bit systems and within minutes on 64-bit systems.
This commit is in preparation for polled expedited grace periods.
Link: https://lore.kernel.org/all/[email protected]/ Link: https://docs.google.com/document/d/1RNKWW9jQyfjxw2E8dsXVTdvZYh0HnYeSHDKog9jhdN8/edit?usp=sharing Cc: Brian Foster <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Al Viro <[email protected]> Cc: Ian Kent <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
|
| #
bf95b2bc |
| 14-Apr-2022 |
Paul E. McKenney <[email protected]> |
rcu: Switch polled grace-period APIs to ->gp_seq_polled
This commit switches the existing polled grace-period APIs to use a new ->gp_seq_polled counter in the rcu_state structure. An additional ->gp_seq_polled_snap counter in that same structure allows the normal grace period kthread to interact properly with the !SMP !PREEMPT fastpath through synchronize_rcu(). The first of the two to note the end of a given grace period will make knowledge of this transition available to the polled API.
This commit is in preparation for polled expedited grace periods.
[ paulmck: Fix use of rcu_state.gp_seq_polled to start normal grace period. ]
Link: https://lore.kernel.org/all/[email protected]/ Link: https://docs.google.com/document/d/1RNKWW9jQyfjxw2E8dsXVTdvZYh0HnYeSHDKog9jhdN8/edit?usp=sharing Cc: Brian Foster <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Al Viro <[email protected]> Cc: Ian Kent <[email protected]> Co-developed-by: Boqun Feng <[email protected]> Signed-off-by: Boqun Feng <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
|
| #
51038506 |
| 29-Apr-2022 |
Zqiang <[email protected]> |
rcu: Add nocb_cb_kthread check to rcu_is_callbacks_kthread()
Callbacks are invoked in RCU kthreads when callbacks are offloaded (rcu_nocbs boot parameter) or when RCU's softirq handler has been offloaded to rcuc kthreads (use_softirq==0). The current code allows for the rcu_nocbs case but not the use_softirq case. This commit adds support for the use_softirq case.
Reported-by: kernel test robot <[email protected]> Signed-off-by: Zqiang <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Neeraj Upadhyay <[email protected]>
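A sketch of the extended check (simplified struct; the two task pointers mirror the rcuc and rcuo callback kthreads):

  #include <linux/sched.h>

  struct rdp_sketch {
          struct task_struct *rcu_cpu_kthread_task;       /* rcuc/%u, use_softirq==0 */
          struct task_struct *nocb_cb_kthread;            /* rcuo CB kthread, rcu_nocbs */
  };

  static bool rcu_is_callbacks_kthread_sketch(struct rdp_sketch *rdp)
  {
          /* Callback invocation may run in either kthread flavor. */
          return current == rdp->rcu_cpu_kthread_task ||
                 current == rdp->nocb_cb_kthread;
  }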
|
| #
1598f4a4 |
| 19-Apr-2022 |
Frederic Weisbecker <[email protected]> |
rcu/nocb: Add/del rdp to iterate from rcuog itself
NOCB rdp's are part of a group whose list is iterated by the corresponding rdp leader.
This list is RCU traversed because an rdp can be either added or deleted concurrently. Upon addition, a new iteration to the list after a synchronization point (a pair of LOCK/UNLOCK ->nocb_gp_lock) is forced to make sure:
1) we didn't miss a new element added in the middle of an iteration
2) we didn't ignore a whole subset of the list due to an element being quickly deleted and then re-added.
3) we are protected from other possible surprises...
Although this layout is expected to be safe, it doesn't help anybody to sleep well.
Instead, simplify the nocb state toggling by moving the list modification from the nocb (de-)offloading workqueue to the rcuog kthreads.
Whenever the rdp leader is expected to (re-)set the SEGCBLIST_KTHREAD_GP flag of a target rdp, the latter is queued so that the leader handles the flag flip along with adding or deleting the target rdp to the list to iterate. This way the list modification and iteration happen from the same kthread and those operations can't race altogether.
As a bonus, the flags for each rdp don't need to be checked locklessly before each iteration, which is one less opportunity to produce nightmares.
Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Zqiang <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Neeraj Upadhyay <[email protected]>
|
| #
17211455 |
| 08-Jun-2022 |
Frederic Weisbecker <[email protected]> |
rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking
Move the core RCU eqs/dynticks functions to context tracking so that we can later merge all that code within context tracking.
Acked-by: Paul E. McKenney <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Nicolas Saenz Julienne <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Xiongfeng Wang <[email protected]> Cc: Yu Liao <[email protected]> Cc: Phil Auld <[email protected]> Cc: Paul Gortmaker<[email protected]> Cc: Alex Belits <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Nicolas Saenz Julienne <[email protected]> Tested-by: Nicolas Saenz Julienne <[email protected]>
|
| #
95e04f48 |
| 08-Jun-2022 |
Frederic Weisbecker <[email protected]> |
rcu/context_tracking: Move dynticks_nmi_nesting to context tracking
The RCU eqs tracking is going to be performed by the context tracking subsystem. The related nesting counters thus need to be moved to the context tracking structure.
Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Nicolas Saenz Julienne <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Xiongfeng Wang <[email protected]> Cc: Yu Liao <[email protected]> Cc: Phil Auld <[email protected]> Cc: Paul Gortmaker<[email protected]> Cc: Alex Belits <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Nicolas Saenz Julienne <[email protected]> Tested-by: Nicolas Saenz Julienne <[email protected]>
|
| #
904e600e |
| 08-Jun-2022 |
Frederic Weisbecker <[email protected]> |
rcu/context_tracking: Move dynticks_nesting to context tracking
The RCU eqs tracking is going to be performed by the context tracking subsystem. The related nesting counters thus need to be moved to the context tracking structure.
Acked-by: Paul E. McKenney <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Nicolas Saenz Julienne <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Xiongfeng Wang <[email protected]> Cc: Yu Liao <[email protected]> Cc: Phil Auld <[email protected]> Cc: Paul Gortmaker<[email protected]> Cc: Alex Belits <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Nicolas Saenz Julienne <[email protected]> Tested-by: Nicolas Saenz Julienne <[email protected]>
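A rough sketch of where these nesting counters end up (heavily simplified; the real struct context_tracking uses atomic state, config guards, and more members):

  struct context_tracking_sketch {
          int state;                      /* user/guest/kernel context state */
          long dynticks_nesting;          /* process-level eqs nesting depth */
          long dynticks_nmi_nesting;      /* irq/NMI eqs nesting depth */
          int dynticks;                   /* eqs counter sampled by RCU */
  };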
|