|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6 |
|
| #
2af8780d |
| 06-Mar-2025 |
Paul E. McKenney <[email protected]> |
stop-machine: Add comment for rcu_momentary_eqs()
Add a comment to explain the purpose of the rcu_momentary_eqs() call from multi_cpu_stop(), which is to suppress false-positive RCU CPU stall warnings.
Reported-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/87wmeuanti.ffs@tglx/ Signed-off-by: Paul E. McKenney <[email protected]> Cc: Valentin Schneider <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Frederic Weisbecker <[email protected]>
|
|
Revision tags: v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3 |
|
| #
c861cac9 |
| 11-Dec-2024 |
Mukesh Ojha <[email protected]> |
stop_machine: Fix rcu_momentary_eqs() call in multi_cpu_stop()
multi_cpu_stop() contains a loop that can initially be executed with interrupts enabled (in the MULTI_STOP_NONE and MULTI_STOP_PREPARE states). Interrupts are guaranteed to be disabled once the MULTI_STOP_DISABLE_IRQ state is reached. Unfortunately, the rcu_momentary_eqs() function that is currently invoked on each pass through this loop requires that interrupts be disabled.
This commit therefore moves this call to rcu_momentary_eqs() to the body of the "else if (curstate > MULTI_STOP_PREPARE)" portion of the loop, thus guaranteeing that interrupts will be disabled on each call, as required.
Kudos to 朱恺乾 (Kaiqian) for noting that this had not made it to mainline.
[ paulmck: Update from rcu_momentary_dyntick_idle() to rcu_momentary_eqs(). ]
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Mukesh Ojha <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
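
For orientation, a sketch of the resulting wait-loop shape, paraphrased from kernel/stop_machine.c rather than quoted from the diff; the quiescent-state report now sits only in the branch taken after MULTI_STOP_PREPARE, where interrupts are already off:

    do {
        /* Chill out and ensure we re-read multi_stop_state. */
        stop_machine_yield(cpumask);
        newstate = READ_ONCE(msdata->state);
        if (newstate != curstate) {
            curstate = newstate;
            switch (curstate) {
            case MULTI_STOP_DISABLE_IRQ:
                local_irq_disable();
                hard_irq_disable();
                break;
            case MULTI_STOP_RUN:
                if (is_active)
                    err = msdata->fn(msdata->data);
                break;
            default:
                break;
            }
            ack_state(msdata);
        } else if (curstate > MULTI_STOP_PREPARE) {
            /* Interrupts are disabled here, as rcu_momentary_eqs() requires. */
            touch_nmi_watchdog();
            rcu_momentary_eqs();
        }
    } while (curstate != MULTI_STOP_EXIT);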
|
|
Revision tags: v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7 |
|
| #
32a9f26e |
| 29-Apr-2024 |
Valentin Schneider <[email protected]> |
rcu: Rename rcu_momentary_dyntick_idle() into rcu_momentary_eqs()
The context_tracking.state RCU_DYNTICKS subvariable has been renamed to RCU_WATCHING, so replace "dyntick_idle" with "eqs" to drop the dyntick reference.
Signed-off-by: Valentin Schneider <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
Revision tags: v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6 |
|
| #
2760f5a4 |
| 06-May-2022 |
Peter Zijlstra <[email protected]> |
stop_machine: Add stop_core_cpuslocked() for per-core operations
Hardware core level testing features require near simultaneous execution of WRMSR instructions on all threads of a core to initiate a test.
Provide a customized cut down version of stop_machine_cpuslocked() that just operates on the threads of a single core.
Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Tony Luck <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
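
A hedged usage sketch (the callback and its payload are illustrative, not from this commit): the helper runs fn near-simultaneously on every SMT sibling of the core containing cpu, and the caller must already hold the CPU hotplug lock:

    #include <linux/cpu.h>
    #include <linux/stop_machine.h>

    /* Hypothetical per-core test callback; each sibling thread runs it. */
    static int core_test_fn(void *data)
    {
        /* e.g. write the test-trigger MSR here (wrmsrl(...)) */
        return 0;
    }

    static int run_core_test(unsigned int cpu, void *data)
    {
        int ret;

        cpus_read_lock();    /* _cpuslocked: hotplug lock must be held */
        ret = stop_core_cpuslocked(cpu, core_test_fn, data);
        cpus_read_unlock();
        return ret;
    }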
|
|
Revision tags: v5.18-rc5, v5.18-rc4, v5.18-rc3 |
|
| #
d664e399 |
| 13-Apr-2022 |
Thomas Gleixner <[email protected]> |
sched: Fix missing prototype warnings
A W=1 build emits more than a dozen missing prototype warnings related to scheduler and scheduler specific includes.
Reported-by: kernel test robot <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
|
Revision tags: v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10 |
|
| #
2a2f80ff |
| 10-Dec-2020 |
Valentin Schneider <[email protected]> |
stop_machine: Add caller debug info to queue_stop_cpus_work
Most callsites were covered by commit
a8b62fd08505 ("stop_machine: Add function and caller debug info")
but this skipped queue_stop_cpus_work(). Add caller debug info to it.
Signed-off-by: Valentin Schneider <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
|
|
Revision tags: v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7 |
|
| #
a8b62fd0 |
| 21-Sep-2020 |
Peter Zijlstra <[email protected]> |
stop_machine: Add function and caller debug info
Crashes in stop-machine are hard to connect to the calling code, add a little something to help with that.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Reviewed-by: Daniel Bristot de Oliveira <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
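
Sketched (paraphrased, not the verbatim hunk): each cpu_stop_work records the requested function and the queueing caller's return address, which can later be printed with the %pS symbolic format when a stopper appears stuck:

    struct cpu_stop_work work = {
        .fn     = fn,
        .arg    = arg,
        .caller = _RET_IP_,    /* who queued this stop work */
    };

    /* later, e.g. from a stall report: */
    printk("%sStopper: %pS <- %pS\n",
           log_lvl, stopper->fn, (void *)stopper->caller);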
|
| #
4230e2de |
| 21-Oct-2020 |
Zong Li <[email protected]> |
stop_machine, rcu: Mark functions as notrace
Some architectures assume that the stopped CPUs don't make function calls to traceable functions when they are in the stopped state. See also commit cb9d7fd51d9f ("watchdog: Mark watchdog touch functions as notrace").
Violating this assumption causes kernel crashes when switching tracer on RISC-V.
Mark rcu_momentary_dyntick_idle() and stop_machine_yield() notrace to prevent this.
Fixes: 4ecf0a43e729 ("processor: get rid of cpu_relax_yield") Fixes: 366237e7b083 ("stop_machine: Provide RCU quiescent state in multi_cpu_stop()") Signed-off-by: Zong Li <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Atish Patra <[email protected]> Tested-by: Colin Ian King <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
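
The annotation itself, sketched (paraphrased): notrace, which is roughly __attribute__((no_instrument_function)), keeps ftrace from patching the function; rcu_momentary_dyntick_idle() receives the same marking:

    /* before: void __weak stop_machine_yield(...) */
    notrace void __weak stop_machine_yield(const struct cpumask *cpumask)
    {
        cpu_relax();
    }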
|
|
Revision tags: v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4 |
|
| #
35f4cd96 |
| 28-Dec-2019 |
Yangtao Li <[email protected]> |
stop_machine: Make stop_cpus() static
The function stop_cpus() is only used internally by stop_machine to stop multiple CPUs.
Make it static.
Signed-off-by: Yangtao Li <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
|
|
Revision tags: v5.5-rc3, v5.5-rc2 |
|
| #
a5e37de9 |
| 14-Dec-2019 |
Yangtao Li <[email protected]> |
stop_machine: remove try_stop_cpus helper
try_stop_cpus is not used after this:
commit c190c3b16c0f ("rcu: Switch synchronize_sched_expedited() to stop_one_cpu()")
So remove it.
Signed-off-by: Yangtao Li <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
|
|
Revision tags: v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3 |
|
| #
b1fc5833 |
| 07-Oct-2019 |
Mark Rutland <[email protected]> |
stop_machine: Avoid potential race behaviour
Both multi_cpu_stop() and set_state() access multi_stop_data::state racily using plain accesses. These are subject to compiler transformations which could break the intended behaviour of the code, and this situation is detected by KCSAN on both arm64 and x86 (splats below).
Improve matters by using READ_ONCE() and WRITE_ONCE() to ensure that the compiler cannot elide, replay, or tear loads and stores.
In multi_cpu_stop() the two loads of multi_stop_data::state are expected to be a consistent value, so snapshot the value into a temporary variable to ensure this.
The state transitions are serialized by atomic manipulation of multi_stop_data::num_threads, and other fields in multi_stop_data are not modified while subject to concurrent reads.
KCSAN splat on arm64:
| BUG: KCSAN: data-race in multi_cpu_stop+0xa8/0x198 and set_state+0x80/0xb0
|
| write to 0xffff00001003bd00 of 4 bytes by task 24 on cpu 3:
|  set_state+0x80/0xb0
|  multi_cpu_stop+0x16c/0x198
|  cpu_stopper_thread+0x170/0x298
|  smpboot_thread_fn+0x40c/0x560
|  kthread+0x1a8/0x1b0
|  ret_from_fork+0x10/0x18
|
| read to 0xffff00001003bd00 of 4 bytes by task 14 on cpu 1:
|  multi_cpu_stop+0xa8/0x198
|  cpu_stopper_thread+0x170/0x298
|  smpboot_thread_fn+0x40c/0x560
|  kthread+0x1a8/0x1b0
|  ret_from_fork+0x10/0x18
|
| Reported by Kernel Concurrency Sanitizer on:
| CPU: 1 PID: 14 Comm: migration/1 Not tainted 5.3.0-00007-g67ab35a199f4-dirty #3
| Hardware name: linux,dummy-virt (DT)
KCSAN splat on x86:
| write to 0xffffb0bac0013e18 of 4 bytes by task 19 on cpu 2:
|  set_state kernel/stop_machine.c:170 [inline]
|  ack_state kernel/stop_machine.c:177 [inline]
|  multi_cpu_stop+0x1a4/0x220 kernel/stop_machine.c:227
|  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
|  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
|  kthread+0x1b5/0x200 kernel/kthread.c:255
|  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
|
| read to 0xffffb0bac0013e18 of 4 bytes by task 44 on cpu 7:
|  multi_cpu_stop+0xb4/0x220 kernel/stop_machine.c:213
|  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
|  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
|  kthread+0x1b5/0x200 kernel/kthread.c:255
|  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
|
| Reported by Kernel Concurrency Sanitizer on:
| CPU: 7 PID: 44 Comm: migration/7 Not tainted 5.3.0+ #1
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Signed-off-by: Mark Rutland <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Marco Elver <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
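
The writer side, sketched (paraphrased from kernel/stop_machine.c after this change): the ack counter is reset before the state is published with WRITE_ONCE(), and readers snapshot the state once per iteration into a local variable:

    static void set_state(struct multi_stop_data *msdata,
                          enum multi_stop_state newstate)
    {
        /* Reset ack counter. */
        atomic_set(&msdata->thread_ack, msdata->num_threads);
        smp_wmb();
        WRITE_ONCE(msdata->state, newstate);
    }

    /* reader, in multi_cpu_stop()'s loop: */
    newstate = READ_ONCE(msdata->state);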
|
|
Revision tags: v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1 |
|
| #
366237e7 |
| 10-Jul-2019 |
Paul E. McKenney <[email protected]> |
stop_machine: Provide RCU quiescent state in multi_cpu_stop()
When multi_cpu_stop() loops waiting for other tasks, it can trigger an RCU CPU stall warning. This can be misleading because what is instead needed is information on whatever task is blocking multi_cpu_stop(). This commit therefore inserts an RCU quiescent state into the multi_cpu_stop() function's waitloop.
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
Revision tags: v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3 |
|
| #
99d84bf8 |
| 29-May-2019 |
Peter Zijlstra <[email protected]> |
stop_machine: Fix stop_cpus_in_progress ordering
Make sure the entire for loop has stop_cpus_in_progress set.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Aaron Lu <[email protected]> Cc: Valentin Schneider <[email protected]> Cc: [email protected] Cc: Phil Auld <[email protected]> Cc: Julien Desfossez <[email protected]> Cc: Nishanth Aravamudan <[email protected]> Link: https://lkml.kernel.org/r/0fd8fd4b99b9b9aa88d8b2dff897f7fd0d88f72c.1559129225.git.vpillai@digitalocean.com
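
A sketch of the shape this describes, paraphrased from queue_stop_cpus_work() in the current source (assuming compiler barriers are the mechanism used):

    stop_cpus_in_progress = true;
    barrier();                      /* flag visibly set before queueing... */
    for_each_cpu(cpu, cpumask) {
        work = &per_cpu(cpu_stopper.stop_work, cpu);
        work->fn = fn;
        work->arg = arg;
        work->done = done;
        if (cpu_stop_queue_work(cpu, work))
            queued = true;
    }
    barrier();                      /* ...and not cleared until after it */
    stop_cpus_in_progress = false;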
|
| #
4ecf0a43 |
| 08-Jun-2019 |
Heiko Carstens <[email protected]> |
processor: get rid of cpu_relax_yield
stop_machine is the only user left of cpu_relax_yield. Given that it now has special semantics which are tied to stop_machine, introduce a weak stop_machine_yield function which architectures can override, and get rid of the generic cpu_relax_yield implementation.
Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
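
Sketched (paraphrased): the new weak hook plus its call site; architectures such as s390 override it with a directed yield:

    void __weak stop_machine_yield(const struct cpumask *cpumask)
    {
        cpu_relax();
    }

    /* in multi_cpu_stop()'s loop, replacing cpu_relax_yield(): */
    stop_machine_yield(cpumask);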
|
|
Revision tags: v5.2-rc2, v5.2-rc1 |
|
| #
38f2c691 |
| 17-May-2019 |
Martin Schwidefsky <[email protected]> |
s390: improve wait logic of stop_machine
The stop_machine loop that advances the state machine and waits for all affected CPUs to check in calls cpu_relax_yield in a tight loop until the last missing CPU has acknowledged the state transition.
On a virtual system where not all logical CPUs are backed by real CPUs all the time, it can take a while for all CPUs to check in. With the current definition of cpu_relax_yield, a diagnose 0x44 is done, which tells the hypervisor to schedule *some* other CPU. That can be any CPU, and not necessarily one of the CPUs that need to run in order to advance the state machine. This can lead to a pretty bad diagnose 0x44 storm until the last missing CPU finally checks in.
Replace the undirected cpu_relax_yield based on diagnose 0x44 with a directed yield. Each CPU in the wait loop will pick up the next CPU in the cpumask of stop_machine. The diagnose 0x9c is used to tell the hypervisor to run this next CPU instead of the current one. If there is only a limited number of real CPUs backing the virtual CPUs we end up with the real CPUs passed around in a round-robin fashion.
[[email protected]]: Use cpumask_next_wrap as suggested by Peter Zijlstra.
Signed-off-by: Martin Schwidefsky <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
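
A heavily hedged sketch of the directed-yield idea (the helper names, e.g. smp_yield_cpu() issuing the diagnose 0x9c, are assumed from the s390 tree of that era, not quoted from this diff):

    void notrace cpu_relax_yield(const struct cpumask *cpumask)
    {
        int this_cpu = smp_processor_id();
        /* pick the next CPU participating in stop_machine, wrapping around */
        int cpu = cpumask_next_wrap(this_cpu, cpumask, this_cpu, false);

        if (cpu < nr_cpu_ids)
            smp_yield_cpu(cpu);    /* diagnose 0x9c: run precisely this CPU */
    }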
|
| #
6ff3f917 |
| 20-May-2019 |
Thomas Gleixner <[email protected]> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 38
Based on 1 normalized pattern(s):
this file is released under the gplv2 and any later version
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 1 file(s).
Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Allison Randal <[email protected]> Reviewed-by: Kate Stewart <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Revision tags: v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3 |
|
| #
d75f773c |
| 25-Mar-2019 |
Sakari Ailus <[email protected]> |
treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively
%pF and %pf are functionally equivalent to %pS and %ps conversion specifiers. The former are deprecated, therefore switch the current users to use the preferred variant.
The changes have been produced by the following command:
    git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | \
    while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done
And verifying the result.
Link: http://lkml.kernel.org/r/[email protected] Cc: Andy Shevchenko <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Sakari Ailus <[email protected]> Acked-by: David Sterba <[email protected]> (for btrfs) Acked-by: Mike Rapoport <[email protected]> (for mm/memblock.c) Acked-by: Bjorn Helgaas <[email protected]> (for drivers/pci) Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Petr Mladek <[email protected]>
|
|
Revision tags: v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8 |
|
| #
cfd35514 |
| 03-Aug-2018 |
Prasad Sodagudi <[email protected]> |
stop_machine: Atomically queue and wake stopper threads
When cpu_stop_queue_work() releases the lock for the stopper thread that was queued into its wake queue, preemption is enabled, which leads to the following deadlock:
CPU0                                 CPU1

sched_setaffinity(0, ...)
__set_cpus_allowed_ptr()
stop_one_cpu(0, ...)                 stop_two_cpus(0, 1, ...)
cpu_stop_queue_work(0, ...)          cpu_stop_queue_two_works(0, ..., 1, ...)

-grabs lock for migration/0-
                                     -spins with preemption disabled,
                                      waiting for migration/0's lock to
                                      be released-

-adds work items for migration/0
 and queues migration/0 to its
 wake_q-

-releases lock for migration/0
 and preemption is enabled-

-current thread is preempted, and
 __set_cpus_allowed_ptr has changed
 the thread's cpu allowed mask to
 CPU1 only-

                                     -acquires migration/0 and
                                      migration/1's locks-

                                     -adds work for migration/0 but does
                                      not add migration/0 to wake_q,
                                      since it is already in a wake_q-

                                     -adds work for migration/1 and adds
                                      migration/1 to its wake_q-

                                     -releases migration/0 and
                                      migration/1's locks, wakes
                                      migration/1, and enables
                                      preemption-

                                     -since migration/1 is requested to
                                      run, migration/1 begins to run and
                                      waits on migration/0, but
                                      migration/0 will never be able to
                                      run, since the thread that can
                                      wake it is affine to CPU1-
Disable preemption in cpu_stop_queue_work() before queueing works for stopper threads, and queueing the stopper thread in the wake queue, to ensure that the operation of queueing the works and waking the stopper threads is atomic.
Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock") Signed-off-by: Prasad Sodagudi <[email protected]> Signed-off-by: Isaac J. Manjarres <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
Co-Developed-by: Isaac J. Manjarres <[email protected]>
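
Sketched (paraphrased from kernel/stop_machine.c after this fix): preemption stays disabled across both the queueing and the wakeup, making the pair atomic with respect to this CPU:

    static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
    {
        struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
        DEFINE_WAKE_Q(wakeq);
        unsigned long flags;
        bool enabled;

        preempt_disable();
        raw_spin_lock_irqsave(&stopper->lock, flags);
        enabled = stopper->enabled;
        if (enabled)
            __cpu_stop_queue_work(stopper, work, &wakeq);
        else if (work->done)
            cpu_stop_signal_done(work->done);
        raw_spin_unlock_irqrestore(&stopper->lock, flags);

        wake_up_q(&wakeq);
        preempt_enable();

        return enabled;
    }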
|
| #
b80a2bfc |
| 30-Jul-2018 |
Peter Zijlstra <[email protected]> |
stop_machine: Reflow cpu_stop_queue_two_works()
The code flow in cpu_stop_queue_two_works() is a little arcane; fix this by lifting the preempt_disable() to the top to create more natural nesting wrt the spinlocks and make the wake_up_q() and preempt_enable() unconditional at the end.
Furthermore, enable preemption in the -EDEADLK case, such that we spin-wait with preemption enabled.
Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
|
|
Revision tags: v4.18-rc7, v4.18-rc6 |
|
| #
2610e889 |
| 17-Jul-2018 |
Isaac J. Manjarres <[email protected]> |
stop_machine: Disable preemption after queueing stopper threads
This commit:
9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")
does not fully address the race condition that can occur as follows:
On one CPU, call it CPU 3, thread 1 invokes cpu_stop_queue_two_works(2, 3,...), and the execution is such that thread 1 queues the works for migration/2 and migration/3, and is preempted after releasing the locks for migration/2 and migration/3, but before waking the threads.
Then, On CPU 2, a kworker, call it thread 2, is running, and it invokes cpu_stop_queue_two_works(1, 2,...), such that thread 2 queues the works for migration/1 and migration/2. Meanwhile, on CPU 3, thread 1 resumes execution, and wakes migration/2 and migration/3. This means that when CPU 2 releases the locks for migration/1 and migration/2, but before it wakes those threads, it can be preempted by migration/2.
If thread 2 is preempted by migration/2, then migration/2 will execute the first work item successfully, since migration/3 was woken up by CPU 3, but when it goes to execute the second work item, it disables preemption, calls multi_cpu_stop(), and thus, CPU 2 will wait forever for migration/1, which should have been woken up by thread 2. However migration/1 cannot be woken up by thread 2, since it is a kworker, so it is affine to CPU 2, but CPU 2 is running migration/2 with preemption disabled, so thread 2 will never run.
Disable preemption after queueing works for stopper threads to ensure that the operation of queueing the works and waking the stopper threads is atomic.
Co-Developed-by: Prasad Sodagudi <[email protected]> Co-Developed-by: Pavankumar Kondeti <[email protected]> Signed-off-by: Isaac J. Manjarres <[email protected]> Signed-off-by: Prasad Sodagudi <[email protected]> Signed-off-by: Pavankumar Kondeti <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
|
|
Revision tags: v4.18-rc5, v4.18-rc4 |
|
| #
9fb8d5dc |
| 03-Jul-2018 |
Isaac J. Manjarres <[email protected]> |
stop_machine: Disable preemption when waking two stopper threads
When cpu_stop_queue_two_works() begins to wake the stopper threads, it does so without preemption disabled, which leads to the following race condition:
The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source CPU, and cpu2 as the destination CPU. When adding the stopper threads to the wake queue used in this function, the source CPU stopper thread is added first, and the destination CPU stopper thread is added last.
When wake_up_q() is invoked to wake the stopper threads, the threads are woken up in the order that they are queued in, so the source CPU's stopper thread is woken up first, and it preempts the thread running on the source CPU.
The stopper thread will then execute on the source CPU, disable preemption, and begin executing multi_cpu_stop(), and wait for an ack from the destination CPU's stopper thread, with preemption still disabled. Since the worker thread that woke up the stopper thread on the source CPU is affine to the source CPU, and preemption is disabled on the source CPU, that thread will never run to dequeue the destination CPU's stopper thread from the wake queue, and thus, the destination CPU's stopper thread will never run, causing the source CPU's stopper thread to wait forever, and stall.
Disable preemption when waking the stopper threads in cpu_stop_queue_two_works().
Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock") Co-Developed-by: Prasad Sodagudi <[email protected]> Signed-off-by: Prasad Sodagudi <[email protected]> Co-Developed-by: Pavankumar Kondeti <[email protected]> Signed-off-by: Pavankumar Kondeti <[email protected]> Signed-off-by: Isaac J. Manjarres <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
|
|
Revision tags: v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3 |
|
| #
de5b55c1 |
| 23-Apr-2018 |
Thomas Gleixner <[email protected]> |
stop_machine: Use raw spinlocks
Use raw-locks in stop_machine() to allow locking in irq-off and preempt-disabled regions on -RT. This also documents the possible locking context in general.
[bigeasy: update patch description.] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
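
Sketched (trimmed; the real struct carries a few more fields): the per-CPU stopper lock becomes a raw spinlock, so it stays a spinning lock on PREEMPT_RT and is safe to take in irq-off and preempt-disabled regions:

    struct cpu_stopper {
        struct task_struct   *thread;
        raw_spinlock_t       lock;       /* was: spinlock_t */
        bool                 enabled;    /* is this stopper enabled? */
        struct list_head     works;      /* list of pending works */
        struct cpu_stop_work stop_work;  /* for stop_cpus */
    };

    /* taken as: raw_spin_lock_irqsave(&stopper->lock, flags); */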
|
|
Revision tags: v4.17-rc2 |
|
| #
0b26351b |
| 20-Apr-2018 |
Peter Zijlstra <[email protected]> |
stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
Matt reported the following deadlock:
CPU0                                 CPU1

schedule(.prev=migrate/0)            <fault>
  pick_next_task()                     ...
    idle_balance()                       migrate_swap()
      active_balance()                     stop_two_cpus()
                                             spin_lock(stopper0->lock)
                                             spin_lock(stopper1->lock)
                                             ttwu(migrate/0)
                                               smp_cond_load_acquire() -- waits for schedule()
        stop_one_cpu(1)
          spin_lock(stopper1->lock) -- waits for stopper lock
Fix this deadlock by taking the wakeups out from under stopper->lock. This allows the active_balance() to queue the stop work and finish the context switch, which in turn allows the wakeup from migrate_swap() to observe the context and complete the wakeup.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reported-by: Matt Fleming <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Matt Fleming <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
|
|
Revision tags: v4.17-rc1, v4.16, v4.16-rc7, v4.16-rc6, v4.16-rc5, v4.16-rc4, v4.16-rc3, v4.16-rc2, v4.16-rc1, v4.15, v4.15-rc9, v4.15-rc8, v4.15-rc7, v4.15-rc6, v4.15-rc5, v4.15-rc4, v4.15-rc3, v4.15-rc2, v4.15-rc1, v4.14, v4.14-rc8, v4.14-rc7, v4.14-rc6, v4.14-rc5, v4.14-rc4, v4.14-rc3, v4.14-rc2, v4.14-rc1, v4.13, v4.13-rc7, v4.13-rc6, v4.13-rc5, v4.13-rc4, v4.13-rc3, v4.13-rc2, v4.13-rc1, v4.12, v4.12-rc7, v4.12-rc6, v4.12-rc5, v4.12-rc4, v4.12-rc3 |
|
| #
fe5595c0 |
| 24-May-2017 |
Sebastian Andrzej Siewior <[email protected]> |
stop_machine: Provide stop_machine_cpuslocked()
Some call sites of stop_machine() are within a get_online_cpus() protected region.
stop_machine() calls get_online_cpus() as well, which is possible in the current implementation but prevents converting the hotplug locking to a percpu rwsem.
Provide stop_machine_cpuslocked() to avoid nested calls to get_online_cpus().
Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Paul E. McKenney <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
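
A hedged usage sketch (the callback and caller are illustrative placeholders): a caller that already holds the hotplug lock, today spelled cpus_read_lock(), uses the _cpuslocked variant so stop_machine does not take the lock recursively:

    #include <linux/cpu.h>
    #include <linux/stop_machine.h>

    static int my_update_fn(void *data)    /* hypothetical callback */
    {
        return 0;
    }

    static int do_update(void *data)       /* hypothetical caller */
    {
        int ret;

        cpus_read_lock();                  /* hotplug lock held here... */
        ret = stop_machine_cpuslocked(my_update_fn, data, cpu_online_mask);
        cpus_read_unlock();                /* ...so no nested hotplug locking */
        return ret;
    }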
|
|
Revision tags: v4.12-rc2, v4.12-rc1, v4.11, v4.11-rc8, v4.11-rc7, v4.11-rc6, v4.11-rc5, v4.11-rc4, v4.11-rc3, v4.11-rc2, v4.11-rc1, v4.10, v4.10-rc8, v4.10-rc7, v4.10-rc6, v4.10-rc5, v4.10-rc4, v4.10-rc3, v4.10-rc2, v4.10-rc1, v4.9, v4.9-rc8, v4.9-rc7, v4.9-rc6, v4.9-rc5, v4.9-rc4, v4.9-rc3 |
|
| #
bf0d31c0 |
| 25-Oct-2016 |
Christian Borntraeger <[email protected]> |
locking/core, stop_machine: Yield the CPU during stop machine()
Some time ago the following commit:
57f2ffe14fd125c2 ("s390: remove diag 44 calls from cpu_relax()")
... stopped cpu_relax() on s390 yielding to the hypervisor.
As it turns out this made stop_machine() run really slowly on virtualized, overcommitted systems. For example, the kprobes test during bootup took several seconds instead of just running unnoticed with large guests.
Therefore, yielding was reintroduced with commit:
4d92f50249eb ("s390: reintroduce diag 44 calls for cpu_relax()")
... but in fact the stop machine code seems to be the only place where this yielding was really necessary. This place is probably the most important one as it makes all but one guest CPUs wait for one guest CPU.
As we now have cpu_relax_yield(), we can use this in multi_cpu_stop(). For now let's only add it here. We can add it later in other places when necessary.
Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Noam Camus <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russell King <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
|