Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7 |
ac08f68f | 16-Mar-2025 | Kumar Kartikeya Dwivedi <[email protected]>
locking: Move common qspinlock helpers to a private header
Move qspinlock helper functions that encode, decode tail word, set and clear the pending and locked bits, and other miscellaneous definitions and macros to a private header. To this end, create a qspinlock.h header file in kernel/locking. Subsequent commits will introduce a modified qspinlock slow path function, thus moving shared code to a private header will help minimize unnecessary code duplication.
Reviewed-by: Barret Rhoden <[email protected]> Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
Revision tags: v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10 |
9fe6a8c5 | 10-Jul-2024 | Juergen Gross <[email protected]>
x86/xen: remove deprecated xen_nopvspin boot parameter
The xen_nopvspin boot parameter is deprecated since 2019. nopvspin can be used instead.
Remove the xen_nopvspin boot parameter and replace the xen_pvspin variable use cases with nopvspin.
This requires moving the nopvspin variable out of the .initdata section, as it needs to be accessed for CPU hotplug, too.
Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
Revision tags: v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1 |
79a34e3d | 21-Mar-2024 | Uros Bizjak <[email protected]>
locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail()
Use atomic_try_cmpxchg_relaxed(*ptr, &old, new) instead of atomic_cmpxchg_relaxed(*ptr, old, new) == old in xchg_tail().
x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after CMPXCHG.
No functional change intended.
Since this code requires NR_CPUS >= 16k, I have tested it by unconditionally setting _Q_PENDING_BITS to 1 in <asm-generic/qspinlock_types.h>.
Signed-off-by: Uros Bizjak <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Waiman Long <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lore.kernel.org/r/[email protected]
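For illustration, a minimal sketch of the transformation (stand-alone pseudo-kernel C; variable names are illustrative, not the exact xchg_tail() body):

    /* Before: compare the returned old value explicitly. */
    old = atomic_cmpxchg_relaxed(&lock->val, val, new);
    if (old == val)
            break;          /* success */
    val = old;              /* retry with the freshly observed value */

    /* After: atomic_try_cmpxchg_relaxed() returns a boolean and updates
     * 'val' in place on failure, so the extra compare after CMPXCHG
     * disappears on x86.
     */
    if (atomic_try_cmpxchg_relaxed(&lock->val, &val, new))
            break;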
Revision tags: v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3 |
4282494a | 05-Jan-2023 | Guo Ren <[email protected]>
locking/qspinlock: Micro-optimize pending state waiting for unlock
When we're pending, we only care about the lock value; xchg_tail() cannot affect the pending state. That means the hardware thread could stay in a sleep state and leave the rest of the pipeline's execution resources to other hardware threads. This concerns SMT scenarios within the same core, not entering a low-power state: the granularity between cores is a cacheline, but the granularity between SMT hardware threads of the same core can be a byte, handled internally by the LSU. For example, when a hardware thread yields the core's resources to other hardware threads, this patch helps it stay in the sleep state instead of being woken up by another hardware thread's xchg_tail().
Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Waiman Long <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Peter Zijlstra <[email protected]>
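The essence of the change, as a sketch: while pending, wait on the locked byte alone rather than re-reading the whole lock word, so a sibling's xchg_tail() on the tail bytes does not wake the waiter.

    /* Before: monitor the full 32-bit lock word. */
    if (val & _Q_LOCKED_MASK)
            atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));

    /* After: monitor only the locked byte. */
    if (val & _Q_LOCKED_MASK)
            smp_cond_load_acquire(&lock->locked, !VAL);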
Revision tags: v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1 |
501f7f69 | 10-Aug-2022 | Namhyung Kim <[email protected]>
locking: Add __lockfunc to slow path functions
So that we can skip these functions in perf lock contention output and other places like /proc/PID/wchan.
Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Link: https://lore.kernel.org/r/[email protected]
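For context, __lockfunc places a function in the .spinlock.text section, which is what lets perf and /proc/PID/wchan recognise and skip lock internals. A sketch of the resulting annotation:

    /* The section attribute comes from include/linux/spinlock.h. */
    void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
    {
            /* ... slow path ... */
    }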
Revision tags: v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1 |
ee042be1 | 22-Mar-2022 | Namhyung Kim <[email protected]>
locking: Apply contention tracepoints in the slow path
Add the lock contention tracepoints in various lock function slow paths. Note that each arch can define spinlocks differently; for now they are added only to the generic qspinlock.
Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Hyeonggon Yoo <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
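A sketch of the shape of the instrumentation in the generic qspinlock slow path (LCB_F_SPIN flags a spinning wait; the exact placement within the function is elided):

    void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
    {
            trace_contention_begin(lock, LCB_F_SPIN);
            /* ... pending/MCS-queue contention handling ... */
            trace_contention_end(lock, 0);
    }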
Revision tags: v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5 |
05eee619 | 23-Oct-2019 | Zhenzhong Duan <[email protected]>
x86/kvm: Add "nopvspin" parameter to disable PV spinlocks
There are cases where a guest tries to switch spinlocks to bare metal behavior (e.g. by setting "xen_nopvspin" on XEN platform and "hv_nopvspin" on HYPER_V).
That feature is missing on KVM, so add a new parameter "nopvspin" to disable PV spinlocks for KVM guests.
The new 'nopvspin' parameter will also replace Xen and Hyper-V specific parameters in future patches.
Define the variable nopvspin as global because it will be used in future patches as above.
Signed-off-by: Zhenzhong Duan <[email protected]> Reviewed-by: Vitaly Kuznetsov <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krcmar <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Jim Mattson <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
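A minimal sketch of how such a boot parameter is typically wired up (early_param() is the standard kernel mechanism; note that the variable initially lived in .initdata and was moved out by the v6.11 Xen commit above):

    bool nopvspin __initdata;

    static __init int parse_nopvspin(char *arg)
    {
            nopvspin = true;
            return 0;
    }
    early_param("nopvspin", parse_nopvspin);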
57097124 | 07-Jan-2020 | Waiman Long <[email protected]>
locking/qspinlock: Fix inaccessible URL of MCS lock paper
It turns out that the URL of the MCS lock paper listed in the source code is no longer accessible. I did get questions about where the paper was. This patch updates the URL to BZ 206115, which contains a copy of the paper from
https://www.cs.rochester.edu/u/scott/papers/1991_TOCS_synch.pdf
Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
Revision tags: v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3 |
c942fddf | 27-May-2019 | Thomas Gleixner <[email protected]>
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157
Based on 3 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version [author] [graeme] [gregory] [gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i] [kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema] [hk] [hemahk]@[ti] [com] this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 1105 file(s).
Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Allison Randal <[email protected]> Reviewed-by: Richard Fontana <[email protected]> Reviewed-by: Kate Stewart <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Revision tags: v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4 |
ad53fa10 | 04-Apr-2019 | Waiman Long <[email protected]>
locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs
The percpu event counts used by qspinlock code can be useful for other locking code as well. So a new set of lockevent_* counting APIs is introduced with the lock event names extracted out into the new lock_events_list.h header file for easier addition in the future.
The existing qstat_inc() calls are replaced by either lockevent_inc() or lockevent_cond_inc() calls.
The qstat_hop() call is renamed to lockevent_pv_hop(). The "reset_counters" debugfs file is also renamed to ".reset_counts".
Signed-off-by: Waiman Long <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tim Chen <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
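A sketch of the renamed API (event names are drawn from the commit message and may not match the final list in lock_events_list.h exactly):

    lockevent_inc(lock_pending);            /* replaces a qstat_inc() call    */
    lockevent_cond_inc(lock_slowpath, ret); /* counted only when 'ret' holds  */
    lockevent_pv_hop(hopcnt);               /* renamed from qstat_hop()       */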
Revision tags: v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0 |
733000c7 | 25-Feb-2019 | Waiman Long <[email protected]>
locking/qspinlock: Remove unnecessary BUG_ON() call
With the > 4 nesting levels case handled by the commit:
d682b596d993 ("locking/qspinlock: Handle > 4 slowpath nesting levels")
the BUG_ON() call in encode_tail() will never actually be triggered.
Remove it.
Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
Revision tags: v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5 |
412f34a8 | 29-Jan-2019 | Waiman Long <[email protected]>
locking/qspinlock_stat: Track the no MCS node available case
Track the number of slowpath locking operations that are being done without any MCS node available, as well as renaming lock_index[123] to make them more descriptive.
Using these stat counters is one way to find out if a code path is being exercised.
Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: James Morse <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: SRINIVAS <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Zhenzhong Duan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
d682b596 | 29-Jan-2019 | Waiman Long <[email protected]>
locking/qspinlock: Handle > 4 slowpath nesting levels
Four queue nodes per CPU are allocated to enable up to 4 nesting levels using the per-CPU nodes. Nested NMIs are possible in some architectures. Still it is very unlikely that we will ever hit more than 4 nested levels with contention in the slowpath.
When that rare condition happens, however, it is likely that the system will hang or crash shortly after that. It is not good and we need to handle this exception case.
This is done by spinning directly on the lock using repeated trylock. This alternative code path should only be used when there are nested NMIs. Assuming that the locks used by those NMI handlers will not be heavily contended, simple TAS locking should work out.
Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: James Morse <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: SRINIVAS <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Zhenzhong Duan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
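The essence of the fallback, as a sketch (MAX_NODES is the four per-CPU queue nodes; on overflow, spin on trylock instead of queuing):

    if (unlikely(idx >= MAX_NODES)) {
            /* No MCS node left: degrade to simple test-and-set spinning. */
            while (!queued_spin_trylock(lock))
                    cpu_relax();
            goto release;
    }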
Revision tags: v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19 |
0fa809ca | 16-Oct-2018 | Waiman Long <[email protected]>
locking/pvqspinlock: Extend node size when pvqspinlock is configured
The qspinlock code supports up to 4 levels of slowpath nesting using four per-CPU mcs_spinlock structures. For 64-bit architectures, they fit nicely in one 64-byte cacheline.
For para-virtualized (PV) qspinlocks it needs to store more information in the per-CPU node structure than there is space for. It uses a trick to use a second cacheline to hold the extra information that it needs. So PV qspinlock needs to access two extra cachelines for its information whereas the native qspinlock code only needs one extra cacheline.
Freshly added counter profiling of the qspinlock code, however, revealed that it was very rare to use more than two levels of slowpath nesting. So it doesn't make sense to penalize PV qspinlock code in order to have four mcs_spinlock structures in the same cacheline to optimize for a case in the native qspinlock code that rarely happens.
Extend the per-CPU node structure to have two more long words when PV qspinlock locks are configured to hold the extra data that it needs.
As a result, the PV qspinlock code will enjoy the same benefit of using just one extra cacheline like the native counterpart, for most cases.
[ mingo: Minor changelog edits. ]
Signed-off-by: Waiman Long <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
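The shape of the change, as a sketch (two extra long words reserved only when PV qspinlocks are configured):

    struct qnode {
            struct mcs_spinlock mcs;
    #ifdef CONFIG_PARAVIRT_SPINLOCKS
            long reserved[2];   /* room for PV state in the same cacheline */
    #endif
    };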
1222109a | 16-Oct-2018 | Waiman Long <[email protected]>
locking/qspinlock_stat: Count instances of nested lock slowpaths
Queued spinlock supports up to 4 levels of lock slowpath nesting - user context, soft IRQ, hard IRQ and NMI. However, we are not sure how often the nesting happens.
So add 3 more per-CPU stat counters to track the number of instances where nesting index goes to 1, 2 and 3 respectively.
On a dual-socket 64-core 128-thread Zen server, the following were the new stat counter values under different circumstances:
    State                              slowpath  index1  index2  index3
    -----                              --------  ------  ------  ------
    After bootup                      1,012,150      82       0       0
    After parallel build + perf-top 125,195,009      82       0       0
So the chance of having more than 2 levels of nesting is extremely low.
[ mingo: Minor changelog edits. ]
Signed-off-by: Waiman Long <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
Revision tags: v4.19-rc8, v4.19-rc7, v4.19-rc6 |
7aa54be2 | 26-Sep-2018 | Peter Zijlstra <[email protected]>
locking/qspinlock, x86: Provide liveness guarantee
On x86 we cannot do fetch_or() with a single instruction and thus end up using a cmpxchg loop, this reduces determinism. Replace the fetch_or() with a composite operation: tas-pending + load.
Using two instructions of course opens a window we previously did not have. Consider the scenario:
    CPU0                     CPU1                     CPU2

 1) lock
      trylock -> (0,0,1)

 2)                                                  lock
                                                       trylock /* fail */

 3) unlock -> (0,0,0)

 4)                          lock
                               trylock -> (0,0,1)

 5)                                                  tas-pending -> (0,1,1)
                                                     load-val <- (0,1,0) from 3

 6)                                                  clear-pending-set-locked -> (0,0,1)

                                                     FAIL: _2_ owners
where 5) is our new composite operation. When we consider each part of the qspinlock state as a separate variable (as we can when _Q_PENDING_BITS == 8) then the above is entirely possible, because tas-pending will only RmW the pending byte, so the later load is able to observe prior tail and lock state (but not earlier than its own trylock, which operates on the whole word, due to coherence).
To avoid this we need 2 things:
- the load must come after the tas-pending (obviously, otherwise it can trivially observe prior state).
- the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for example, such that we cannot observe other state prior to setting pending.
On x86 we can realize this by using "LOCK BTS m32, r32" for tas-pending followed by a regular load.
Note that observing later state is not a problem:
- if we fail to observe a later unlock, we'll simply spin-wait for that store to become visible.
- if we observe a later xchg_tail(), there is no difference from that xchg_tail() having taken place before the tas-pending.
Suggested-by: Will Deacon <[email protected]> Reported-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Will Deacon <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath") Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
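A conceptual sketch of the composite operation (the real x86 version implements tas-pending with "LOCK BTS" via inline asm; lock_bts_acquire() is a hypothetical stand-in for that instruction):

    u32 val = 0;

    /* tas-pending: full-word locked RmW that sets only the pending bit. */
    if (lock_bts_acquire(&lock->val, _Q_PENDING_OFFSET))    /* hypothetical */
            val |= _Q_PENDING_VAL;

    /* Regular load of the rest of the word, ordered after the RmW. */
    val |= atomic_read(&lock->val) & ~_Q_PENDING_MASK;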
756b1df4 | 26-Sep-2018 | Peter Zijlstra <[email protected]>
locking/qspinlock: Rework some comments
While working my way through the code again, I felt the comments could use help.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
53bf57fa | 26-Sep-2018 | Peter Zijlstra <[email protected]>
locking/qspinlock: Re-order code
Flip the branch condition after atomic_fetch_or_acquire(_Q_PENDING_VAL) such that we lose the indent. This also results in a more natural code flow, IMO.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
Revision tags: v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3 |
81d3dc9a | 26-Apr-2018 | Waiman Long <[email protected]>
locking/qspinlock: Add stat tracking for pending vs. slowpath
Currently, the qspinlock_stat code tracks only statistical counts in the PV qspinlock code. However, it may also be useful to track the number of locking operations done via the pending code vs. the MCS lock queue slowpath for the non-PV case.
The qspinlock stat code is modified to do that. The stat counter pv_lock_slowpath is renamed to lock_slowpath so that it can be used by both the PV and non-PV cases.
Signed-off-by: Waiman Long <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
ae75d908 | 26-Apr-2018 | Will Deacon <[email protected]>
locking/qspinlock: Use try_cmpxchg() instead of cmpxchg() when locking
When reaching the head of an uncontended queue on the qspinlock slow-path, using a try_cmpxchg() instead of a cmpxchg() operation to transition the lock word to _Q_LOCKED_VAL generates slightly better code for x86 and pretty much identical code for arm64.
Reported-by: Peter Zijlstra <[email protected]> Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
9d4646d1 | 26-Apr-2018 | Will Deacon <[email protected]>
locking/qspinlock: Elide back-to-back RELEASE operations with smp_wmb()
The qspinlock slowpath must ensure that the MCS node is fully initialised before it can be reached by another CPU. This is currently enforced by using a RELEASE operation when updating the tail and also when linking the node into the waitqueue, since the control dependency off xchg_tail is insufficient to enforce the required ordering, see:
95bcade33a8a ("locking/qspinlock: Ensure node is initialised before updating prev->next")
Back-to-back RELEASE operations may be expensive on some architectures, particularly those that implement them using fences under the hood. We can replace the two RELEASE operations with a single smp_wmb() fence and use RELAXED operations for the subsequent publishing of the node.
Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
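The resulting pattern, as a sketch: one write barrier after node initialisation, then RELAXED publication of the node:

    node->locked = 0;
    node->next = NULL;

    smp_wmb();                      /* order node init before publication */

    old = xchg_tail(lock, tail);    /* RELAXED RmW of the tail */
    if (old & _Q_TAIL_MASK) {
            prev = decode_tail(old);
            WRITE_ONCE(prev->next, node);   /* RELAXED link into the queue */
    }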
c131a198 | 26-Apr-2018 | Will Deacon <[email protected]>
locking/qspinlock: Use smp_cond_load_relaxed() to wait for next node
When a locker reaches the head of the queue and takes the lock, a concurrent locker may enqueue and force the lock holder to spin whilst its node->next field is initialised. Rather than open-code a READ_ONCE/cpu_relax() loop, this can be implemented using smp_cond_load_relaxed() instead.
Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
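The transformation, as a sketch (VAL names the loaded value inside the helper macro's condition expression):

    /* Before: open-coded wait for the successor link. */
    while (!(next = READ_ONCE(node->next)))
            cpu_relax();

    /* After: same wait via the generic helper. */
    next = smp_cond_load_relaxed(&node->next, (VAL));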
f9c811fa | 26-Apr-2018 | Will Deacon <[email protected]>
locking/qspinlock: Use atomic_cond_read_acquire()
Rather than dig into the counter field of the atomic_t inside the qspinlock structure so that we can call smp_cond_load_acquire(), use atomic_cond_read_acquire() instead, which operates on the atomic_t directly.
Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
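The transformation, as a sketch:

    /* Before: reach into the atomic_t's counter field. */
    smp_cond_load_acquire(&lock->val.counter, !(VAL & _Q_LOCKED_PENDING_MASK));

    /* After: operate on the atomic_t directly. */
    val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK));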
c61da58d | 26-Apr-2018 | Will Deacon <[email protected]>
locking/qspinlock: Kill cmpxchg() loop when claiming lock from head of queue
When a queued locker reaches the head of the queue, it claims the lock by setting _Q_LOCKED_VAL in the lockword. If there isn't contention, it must also clear the tail as part of this operation so that subsequent lockers can avoid taking the slowpath altogether.
Currently this is expressed as a cmpxchg() loop that practically only runs up to two iterations. This is confusing to the reader and unhelpful to the compiler. Rewrite the cmpxchg() loop without the loop, so that a failed cmpxchg() implies that there is contention and we just need to write to _Q_LOCKED_VAL without considering the rest of the lockword.
Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
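The rewritten claim step, as a sketch: a failed cmpxchg() now simply means there is contention, so only the locked byte needs setting (the ae75d908 commit above later converts the cmpxchg() to atomic_try_cmpxchg_relaxed()):

    if ((val & _Q_TAIL_MASK) == tail) {
            if (atomic_cmpxchg_relaxed(&lock->val, val, _Q_LOCKED_VAL) == val)
                    goto release;   /* no contention: word is now (0,0,1) */
    }

    /* Either somebody is queued behind us or the pending bit is set. */
    set_locked(lock);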
59fb586b | 26-Apr-2018 | Will Deacon <[email protected]>
locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath
The qspinlock locking slowpath utilises a "pending" bit as a simple form of an embedded test-and-set lock that can avoid the overhead of explicit queuing in cases where the lock is held but uncontended. This bit is managed using a cmpxchg() loop which tries to transition the uncontended lock word from (0,0,0) -> (0,0,1) or (0,0,1) -> (0,1,1).
Unfortunately, the cmpxchg() loop is unbounded and lockers can be starved indefinitely if the lock word is seen to oscillate between unlocked (0,0,0) and locked (0,0,1). This could happen if concurrent lockers are able to take the lock in the cmpxchg() loop without queuing and pass it around amongst themselves.
This patch fixes the problem by unconditionally setting _Q_PENDING_VAL using atomic_fetch_or, and then inspecting the old value to see whether we need to spin on the current lock owner, or whether we now effectively hold the lock. The tricky scenario is when concurrent lockers end up queuing on the lock and the lock becomes available, causing us to see a lockword of (n,0,0). With pending now set, simply queuing could lead to deadlock as the head of the queue may not have observed the pending flag being cleared. Conversely, if the head of the queue did observe pending being cleared, then it could transition the lock from (n,0,0) -> (0,0,1) meaning that any attempt to "undo" our setting of the pending bit could race with a concurrent locker trying to set it.
We handle this race by preserving the pending bit when taking the lock after reaching the head of the queue and leaving the tail entry intact if we saw pending set, because we know that the tail is going to be updated shortly.
Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
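The core of the fix, as a sketch: set pending unconditionally with a fetch_or, then branch on the old value instead of looping on cmpxchg():

    val = atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);
    if (val & ~_Q_LOCKED_MASK)
            goto queue;     /* pending or tail already set: others contend */

    /* We are the pending owner: wait for the holder, then take the lock. */
    if (val & _Q_LOCKED_MASK)
            smp_cond_load_acquire(&lock->locked, !VAL);
    clear_pending_set_locked(lock);
    return;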