History log of /linux-6.15/kernel/locking/qspinlock.c (Results 1 – 25 of 55)
Revision    Date    Author    Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7
# ac08f68f 16-Mar-2025 Kumar Kartikeya Dwivedi <[email protected]>

locking: Move common qspinlock helpers to a private header

Move qspinlock helper functions that encode, decode tail word, set and
clear the pending and locked bits, and other miscellaneous definitions
and macros to a private header. To this end, create a qspinlock.h header
file in kernel/locking. Subsequent commits will introduce a modified
qspinlock slow path function, thus moving shared code to a private
header will help minimize unnecessary code duplication.

Reviewed-by: Barret Rhoden <[email protected]>
Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>



Revision tags: v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10
# 9fe6a8c5 10-Jul-2024 Juergen Gross <[email protected]>

x86/xen: remove deprecated xen_nopvspin boot parameter

The xen_nopvspin boot parameter is deprecated since 2019. nopvspin
can be used instead.

Remove the xen_nopvspin boot parameter and replace the xen_pvspin
variable use cases with nopvspin.

This requires moving the nopvspin variable out of the .initdata
section, as it also needs to be accessed for CPU hotplug.

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>



Revision tags: v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1
# 79a34e3d 21-Mar-2024 Uros Bizjak <[email protected]>

locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail()

Use atomic_try_cmpxchg_relaxed(*ptr, &old, new) instead of
atomic_cmpxchg_relaxed(*ptr, old, new) == old in xchg_tail().

The x86 CMPXCHG instruction reports success in the ZF flag, so this
change saves a compare after CMPXCHG.

No functional change intended.

Since this code requires NR_CPUS >= 16k, I have tested it
by unconditionally setting _Q_PENDING_BITS to 1 in
<asm-generic/qspinlock_types.h>.

Signed-off-by: Uros Bizjak <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
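
A minimal sketch of the two loop shapes being contrasted (identifiers
follow xchg_tail() in kernel/locking/qspinlock.c; an illustration, not
the exact diff):

  u32 old, new;

  /* Before: compare the returned old value by hand. */
  old = atomic_read(&lock->val);
  for (;;) {
          u32 seen;

          new  = (old & _Q_LOCKED_PENDING_MASK) | tail;
          seen = atomic_cmpxchg_relaxed(&lock->val, old, new);
          if (seen == old)        /* extra CMP after CMPXCHG */
                  break;
          old = seen;
  }

  /* After: on failure, try_cmpxchg updates "old" in place and
   * returns a bool, which on x86 maps onto CMPXCHG's ZF result. */
  old = atomic_read(&lock->val);
  do {
          new = (old & _Q_LOCKED_PENDING_MASK) | tail;
  } while (!atomic_try_cmpxchg_relaxed(&lock->val, &old, new));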



Revision tags: v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3
# 4282494a 05-Jan-2023 Guo Ren <[email protected]>

locking/qspinlock: Micro-optimize pending state waiting for unlock

When we're pending, we only care about the lock value; xchg_tail()
wouldn't affect the pending state. That means the hardware thread
could stay in a sleep state, leaving the rest of the pipeline's
execution resources to the other hardware threads. This is about SMT
scenarios within the same core, not about entering a low-power state.
Of course, the granularity between cores is "cacheline", but the
granularity between SMT hw-threads of the same core could be "byte",
which the internal LSU handles. For example, when a hw-thread yields
the core's resources to other hw-threads, this patch could help it
stay in the sleep state and prevent it from being woken up by another
hw-thread's xchg_tail().

Signed-off-by: Guo Ren <[email protected]>
Signed-off-by: Guo Ren <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Waiman Long <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: Peter Zijlstra <[email protected]>
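
A hedged sketch of the waiting-side change (the before/after lines are
reconstructed from the description, not quoted from the diff):

  /* Before: wait on the whole lock word; a sibling's xchg_tail()
   * dirties the watched location and needlessly wakes the waiter. */
  atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));

  /* After: wait only on the locked byte, which xchg_tail() never
   * touches, so a tail update cannot wake the pending waiter. */
  smp_cond_load_acquire(&lock->locked, !VAL);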



Revision tags: v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1
# 501f7f69 10-Aug-2022 Namhyung Kim <[email protected]>

locking: Add __lockfunc to slow path functions

Add __lockfunc so that these functions can be skipped by perf lock
contention and in other places like /proc/PID/wchan.

Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
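
For context, __lockfunc works by placing the annotated function into a
dedicated text section, so tools that know that section's address
range can filter these functions out. A sketch (the section name
follows include/linux/linkage.h):

  #define __lockfunc __section(".spinlock.text")

  void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);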



Revision tags: v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1
# ee042be1 22-Mar-2022 Namhyung Kim <[email protected]>

locking: Apply contention tracepoints in the slow path

Add the lock contention tracepoints to various lock-function slow
paths. Note that each arch can define spinlocks differently; for now
they are added only to the generic qspinlock.

Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Hyeonggon Yoo <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
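
A hedged sketch of how the tracepoints bracket the slowpath (the names
follow the lock-contention tracepoint API; placement is illustrative):

  trace_contention_begin(lock, LCB_F_SPIN);

  /* ... spin on pending / queue on the MCS chain until acquired ... */

  trace_contention_end(lock, 0);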



Revision tags: v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5
# 05eee619 23-Oct-2019 Zhenzhong Duan <[email protected]>

x86/kvm: Add "nopvspin" parameter to disable PV spinlocks

There are cases where a guest tries to switch spinlocks to bare-metal
behavior (e.g. by setting "xen_nopvspin" on the XEN platform and
"hv_nopvspin" on HYPER_V).

That capability is missing on KVM, so add a new parameter "nopvspin"
to disable PV spinlocks for KVM guests.

The new 'nopvspin' parameter will also replace the Xen- and Hyper-V-
specific parameters in future patches.

Define the nopvspin variable as global because those future patches
will use it.

Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krcmar <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Jim Mattson <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
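
For reference, a minimal sketch of how such a boot parameter is
typically wired up via early_param() (the exact definition site in the
patch may differ):

  bool nopvspin;

  static __init int parse_nopvspin(char *arg)
  {
          nopvspin = true;
          return 0;
  }
  early_param("nopvspin", parse_nopvspin);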



# 57097124 07-Jan-2020 Waiman Long <[email protected]>

locking/qspinlock: Fix inaccessible URL of MCS lock paper

It turns out that the URL of the MCS lock paper listed in the source
code is no longer accessible, and I did get questions about where the
paper was. This patch updates the URL to BZ 206115, which contains a
copy of the paper from

https://www.cs.rochester.edu/u/scott/papers/1991_TOCS_synch.pdf

Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]



Revision tags: v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3
# c942fddf 27-May-2019 Thomas Gleixner <[email protected]>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157

Based on 3 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version [author] [kishon] [vijay] [abraham]
[i] [kishon]@[ti] [com] this program is distributed in the hope that
it will be useful but without any warranty without even the implied
warranty of merchantability or fitness for a particular purpose see
the gnu general public license for more details

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version [author] [graeme] [gregory]
[gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i]
[kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema]
[hk] [hemahk]@[ti] [com] this program is distributed in the hope
that it will be useful but without any warranty without even the
implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 1105 file(s).

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Reviewed-by: Richard Fontana <[email protected]>
Reviewed-by: Kate Stewart <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>



Revision tags: v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4
# ad53fa10 04-Apr-2019 Waiman Long <[email protected]>

locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs

The percpu event counts used by qspinlock code can be useful for
other locking code as well. So a new set of lockevent_* counting APIs
is introduced with the lock event names extracted out into the new
lock_events_list.h header file for easier addition in the future.

The existing qstat_inc() calls are replaced by either lockevent_inc() or
lockevent_cond_inc() calls.

The qstat_hop() call is renamed to lockevent_pv_hop(). The "reset_counters"
debugfs file is also renamed to ".reset_counts".

Signed-off-by: Waiman Long <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Acked-by: Davidlohr Bueso <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
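
A hedged sketch of how the new API reads at call sites (the event
names come from lock_events_list.h; the condition shown here is made
up for illustration):

  /* Unconditional count of a lock event: */
  lockevent_inc(lock_pending);

  /* Count only when a condition holds: */
  lockevent_cond_inc(lock_slowpath, !pending_path);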



Revision tags: v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0
# 733000c7 25-Feb-2019 Waiman Long <[email protected]>

locking/qspinlock: Remove unnecessary BUG_ON() call

With the > 4 nesting levels case handled by the commit:

d682b596d993 ("locking/qspinlock: Handle > 4 slowpath nesting levels")

the BUG_ON() call in encode_tail() will never actually be triggered.

Remove it.

Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



Revision tags: v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5
# 412f34a8 29-Jan-2019 Waiman Long <[email protected]>

locking/qspinlock_stat: Track the no MCS node available case

Track the number of slowpath locking operations that are done without
any MCS node available, as well as renaming lock_index[123] to make
them more descriptive.

Using these stat counters is one way to find out if a code path is
being exercised.

Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: James Morse <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: SRINIVAS <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Zhenzhong Duan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



# d682b596 29-Jan-2019 Waiman Long <[email protected]>

locking/qspinlock: Handle > 4 slowpath nesting levels

Four queue nodes per CPU are allocated to enable up to 4 nesting levels
using the per-CPU nodes. Nested NMIs are possible on some architectures.
Still, it is very unlikely that we will ever hit more than 4 nesting
levels with contention in the slowpath.

When that rare condition happens, however, the system is likely to hang
or crash shortly afterwards. That is not good, and we need to handle
this exceptional case.

This is done by spinning directly on the lock using repeated trylock.
This alternative code path should only be used when there are nested
NMIs. Assuming that the locks used by those NMI handlers will not be
heavily contended, a simple TAS (test-and-set) lock should work out.

Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: James Morse <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: SRINIVAS <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Zhenzhong Duan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
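
A hedged sketch of the fallback described above (the surrounding
slowpath is elided; identifiers follow kernel/locking/qspinlock.c):

  /* All 4 per-CPU MCS nodes are in use (deep NMI nesting): fall
   * back to spinning on the lock word with repeated trylock. */
  if (unlikely(idx >= MAX_NODES)) {
          while (!queued_spin_trylock(lock))
                  cpu_relax();
          goto release;
  }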



Revision tags: v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19
# 0fa809ca 16-Oct-2018 Waiman Long <[email protected]>

locking/pvqspinlock: Extend node size when pvqspinlock is configured

The qspinlock code supports up to 4 levels of slowpath nesting using
four per-CPU mcs_spinlock structures. For 64-bit architectures, they
fit nicely in one 64-byte cacheline.

The para-virtualized (PV) qspinlock code, however, needs to store more
information in the per-CPU node structure than there is space for. It
resorts to the trick of using a second cacheline to hold the extra
information. So the PV qspinlock code needs to access two extra
cachelines, whereas the native qspinlock code only needs one.

Freshly added counter profiling of the qspinlock code, however, revealed
that it was very rare to use more than two levels of slowpath nesting.
So it doesn't make sense to penalize PV qspinlock code in order to have
four mcs_spinlock structures in the same cacheline to optimize for a case
in the native qspinlock code that rarely happens.

Extend the per-CPU node structure with two more long words when PV
qspinlocks are configured, to hold the extra data needed.

As a result, the PV qspinlock code will enjoy the same benefit of using
just one extra cacheline as its native counterpart, in most cases.

[ mingo: Minor changelog edits. ]

Signed-off-by: Waiman Long <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
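
A sketch of the resulting node layout (matching the description; the
struct name follows kernel/locking/qspinlock.c):

  struct qnode {
          struct mcs_spinlock mcs;
  #ifdef CONFIG_PARAVIRT_SPINLOCKS
          long reserved[2];       /* room for the PV node's extra data */
  #endif
  };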



# 1222109a 16-Oct-2018 Waiman Long <[email protected]>

locking/qspinlock_stat: Count instances of nested lock slowpaths

Queued spinlock supports up to 4 levels of lock slowpath nesting -
user context, soft IRQ, hard IRQ and NMI. However, we are not sure how
often the nesting happens.

So add 3 more per-CPU stat counters to track the number of instances where
nesting index goes to 1, 2 and 3 respectively.

On a dual-socket 64-core 128-thread Zen server, the following were the
new stat counter values under different circumstances:

  State                              slowpath  index1  index2  index3
  -----                              --------  ------  ------  ------
  After bootup                       1,012,150     82       0       0
  After parallel build + perf-top  125,195,009     82       0       0

So the chance of having more than 2 levels of nesting is extremely low.

[ mingo: Minor changelog edits. ]

Signed-off-by: Waiman Long <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



Revision tags: v4.19-rc8, v4.19-rc7, v4.19-rc6
# 7aa54be2 26-Sep-2018 Peter Zijlstra <[email protected]>

locking/qspinlock, x86: Provide liveness guarantee

On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, which reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not
have. Consider the scenario:

      CPU0                    CPU1                    CPU2

   1) lock
        trylock -> (0,0,1)

   2)                         lock
                                trylock /* fail */

   3) unlock -> (0,0,0)

   4)                                                 lock
                                                        trylock -> (0,0,1)

   5)                         tas-pending -> (0,1,1)
                                load-val <- (0,1,0) from 3

   6)                         clear-pending-set-locked -> (0,0,1)

                              FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it
   can trivially observe prior state).

 - the tas-pending must be a full-word RmW instruction (it cannot be
   an XCHGB, for example), such that we cannot observe other state
   prior to setting pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for
tas-pending followed by a regular load.

Note that observing later state is not a problem:

- if we fail to observe a later unlock, we'll simply spin-wait for
that store to become visible.

- if we observe a later xchg_tail(), there is no difference from that
xchg_tail() having taken place before the tas-pending.

Suggested-by: Will Deacon <[email protected]>
Reported-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Will Deacon <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
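
A simplified, hedged sketch of the x86 composite (the real arch/x86
implementation goes through GEN_BINARY_RMWcc(); this illustrates the
instruction sequence only):

  static inline u32 fetch_set_pending_acquire(struct qspinlock *lock)
  {
          bool was_pending;
          u32 val;

          /* Full-word RmW: LOCK BTS sets the pending bit and reports
           * its previous value in CF. */
          asm volatile(LOCK_PREFIX "btsl %2, %0"
                       : "+m" (lock->val), "=@ccc" (was_pending)
                       : "i" (_Q_PENDING_OFFSET) : "memory");

          /* The load must come after the BTS, per the argument above. */
          val = atomic_read(&lock->val) & ~_Q_PENDING_MASK;
          return val | (was_pending ? _Q_PENDING_VAL : 0);
  }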



# 756b1df4 26-Sep-2018 Peter Zijlstra <[email protected]>

locking/qspinlock: Rework some comments

While working my way through the code again, I felt the comments could
use some help.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



# 53bf57fa 26-Sep-2018 Peter Zijlstra <[email protected]>

locking/qspinlock: Re-order code

Flip the branch condition after atomic_fetch_or_acquire(_Q_PENDING_VAL)
such that we lose the indent. This also results in a more natural code
flow, IMO.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



Revision tags: v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3
# 81d3dc9a 26-Apr-2018 Waiman Long <[email protected]>

locking/qspinlock: Add stat tracking for pending vs. slowpath

Currently, the qspinlock_stat code tracks only statistical counts in the
PV qspinlock code. However, it may also be useful to track the number
of locking operations done via the pending code vs. the MCS lock queue
slowpath for the non-PV case.

The qspinlock stat code is modified to do that. The stat counter
pv_lock_slowpath is renamed to lock_slowpath so that it can be used by
both the PV and non-PV cases.

Signed-off-by: Waiman Long <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



# ae75d908 26-Apr-2018 Will Deacon <[email protected]>

locking/qspinlock: Use try_cmpxchg() instead of cmpxchg() when locking

When reaching the head of an uncontended queue on the qspinlock slow-path,
using a try_cmpxchg() instead of a cmpxchg() operation to transition the
lock word to _Q_LOCKED_VAL generates slightly better code for x86 and
pretty much identical code for arm64.

Reported-by: Peter Zijlstra <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



# 9d4646d1 26-Apr-2018 Will Deacon <[email protected]>

locking/qspinlock: Elide back-to-back RELEASE operations with smp_wmb()

The qspinlock slowpath must ensure that the MCS node is fully initialised
before it can be reached by another CPU. This is currently enforced
by using a RELEASE operation when updating the tail and also when linking
the node into the waitqueue, since the control dependency off xchg_tail
is insufficient to enforce the required ordering, see:

95bcade33a8a ("locking/qspinlock: Ensure node is initialised before updating prev->next")

Back-to-back RELEASE operations may be expensive on some architectures,
particularly those that implement them using fences under the hood. We
can replace the two RELEASE operations with a single smp_wmb() fence and
use RELAXED operations for the subsequent publishing of the node.

Signed-off-by: Will Deacon <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
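
A hedged sketch of the resulting publish sequence, reconstructed from
the description (the RELAXED xchg_tail() variant is implied rather
than spelled out):

  node->locked = 0;
  node->next = NULL;

  /* A single fence orders the node's initialisation before both of
   * the stores that make the node reachable. */
  smp_wmb();

  old = xchg_tail(lock, tail);            /* now a RELAXED RmW */

  if (old & _Q_TAIL_MASK) {
          prev = decode_tail(old);
          WRITE_ONCE(prev->next, node);   /* RELAXED publish */
  }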



# c131a198 26-Apr-2018 Will Deacon <[email protected]>

locking/qspinlock: Use smp_cond_load_relaxed() to wait for next node

When a locker reaches the head of the queue and takes the lock, a
concurrent locker may enqueue and force the lock holder to spin
whilst its node->next field is initialised. Rather than open-code
a READ_ONCE/cpu_relax() loop, this can be implemented using
smp_cond_load_relaxed() instead.

Signed-off-by: Will Deacon <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
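
The change in a nutshell (the "before" loop is the open-coded form the
message describes):

  /* Before: open-coded wait for the successor link. */
  while (!(next = READ_ONCE(node->next)))
          cpu_relax();

  /* After: same semantics, but architectures can back the helper
   * with a more efficient waiting primitive. */
  next = smp_cond_load_relaxed(&node->next, (VAL));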



# f9c811fa 26-Apr-2018 Will Deacon <[email protected]>

locking/qspinlock: Use atomic_cond_read_acquire()

Rather than dig into the counter field of the atomic_t inside the
qspinlock structure so that we can call smp_cond_load_acquire(), use
atomic_cond_read_acquire() instead, which operates on the atomic_t
directly.

Signed-off-by: Will Deacon <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
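
A sketch of the substitution (the predicate shown is illustrative of
how the helper is used on the qspinlock word):

  /* Before: dig into the atomic_t's counter field. */
  val = smp_cond_load_acquire(&lock->val.counter,
                              !(VAL & _Q_LOCKED_PENDING_MASK));

  /* After: operate on the atomic_t directly. */
  val = atomic_cond_read_acquire(&lock->val,
                                 !(VAL & _Q_LOCKED_PENDING_MASK));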



# c61da58d 26-Apr-2018 Will Deacon <[email protected]>

locking/qspinlock: Kill cmpxchg() loop when claiming lock from head of queue

When a queued locker reaches the head of the queue, it claims the lock
by setting _Q_LOCKED_VAL in the lockword. If there isn't contention, it
must also clear the tail as part of this operation so that subsequent
lockers can avoid taking the slowpath altogether.

Currently this is expressed as a cmpxchg() loop that in practice only
runs up to two iterations. This is confusing to the reader and unhelpful
to the compiler. Rewrite the sequence without the loop, so that a failed
cmpxchg() implies that there is contention and we just need to
write to _Q_LOCKED_VAL without considering the rest of the lockword.

Signed-off-by: Will Deacon <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
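
A hedged sketch of the loop-free form (identifiers follow qspinlock.c;
reconstructed from the description):

  /* No contention observed: one cmpxchg claims the lock and clears
   * the tail in a single step. */
  if ((val & _Q_TAIL_MASK) == tail) {
          if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
                  goto release;   /* lock claimed, tail cleared */
  }

  /* Contention, or the cmpxchg failed: somebody is queued behind us,
   * so only the locked byte needs setting. */
  set_locked(lock);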



# 59fb586b 26-Apr-2018 Will Deacon <[email protected]>

locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath

The qspinlock locking slowpath utilises a "pending" bit as a simple form
of an embedded test-and-set lock that can avoid the overhead of explicit
queuing in cases where the lock is held but uncontended. This bit is
managed using a cmpxchg() loop which tries to transition the uncontended
lock word from (0,0,0) -> (0,0,1) or (0,0,1) -> (0,1,1).

Unfortunately, the cmpxchg() loop is unbounded and lockers can be starved
indefinitely if the lock word is seen to oscillate between unlocked
(0,0,0) and locked (0,0,1). This could happen if concurrent lockers are
able to take the lock in the cmpxchg() loop without queuing and pass it
around amongst themselves.

This patch fixes the problem by unconditionally setting _Q_PENDING_VAL
using atomic_fetch_or, and then inspecting the old value to see whether
we need to spin on the current lock owner, or whether we now effectively
hold the lock. The tricky scenario is when concurrent lockers end up
queuing on the lock and the lock becomes available, causing us to see
a lockword of (n,0,0). With pending now set, simply queuing could lead
to deadlock as the head of the queue may not have observed the pending
flag being cleared. Conversely, if the head of the queue did observe
pending being cleared, then it could transition the lock from (n,0,0) ->
(0,0,1) meaning that any attempt to "undo" our setting of the pending
bit could race with a concurrent locker trying to set it.

We handle this race by preserving the pending bit when taking the lock
after reaching the head of the queue and leaving the tail entry intact
if we saw pending set, because we know that the tail is going to be
updated shortly.

Signed-off-by: Will Deacon <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
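
A hedged sketch of the new pending-bit path (simplified from the
description; the waiting and locking steps are elided):

  /* Unconditionally set pending, then decide from the old value. */
  val = atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);

  if (unlikely(val & ~_Q_LOCKED_MASK)) {
          /* Someone else is pending or queued: undo the pending bit
           * only if we were the ones who set it, then queue. */
          if (!(val & _Q_PENDING_MASK))
                  clear_pending(lock);
          goto queue;
  }

  /* We are now the pending owner: wait for the lock owner to go
   * away, then take the lock and clear pending. */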


