Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1 |
# fa1f5116 | 19-Mar-2024 | Sebastian Andrzej Siewior <[email protected]>
locking: Make rwsem_assert_held_write_nolockdep() build with PREEMPT_RT=y
The commit cited below broke the build for PREEMPT_RT because rwsem_assert_held_write_nolockdep() passes a struct rw_semaphore but rw_base_assert_held_write() expects struct rwbase_rt. Fixing the type alone leads to the problem that WARN_ON() is not found because bug.h is missing.
In order to resolve this:
- Keep the assert (WARN_ON()) in rwsem.h (not rwbase_rt.h)
- Make rwsem_assert_held_write_nolockdep() do the implementation-specific (rw_base) writer check.
- Replace "inline" with __always_inline, which was used before.
Fixes: f70405afc99b1 ("locking: Add rwsem_assert_held() and rwsem_assert_held_write()")
Reported-by: Clark Williams <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Waiman Long <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
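A minimal sketch of the resulting shape (hedged: the rw_base_is_write_locked() helper name and the exact layout are assumptions based on the description above):

  #ifdef CONFIG_PREEMPT_RT
  static __always_inline void
  rwsem_assert_held_write_nolockdep(const struct rw_semaphore *sem)
  {
          /* the assert (WARN_ON()) stays here in rwsem.h; only the
           * writer check is delegated to the rw_base implementation */
          WARN_ON(!rw_base_is_write_locked(&sem->rwbase));
  }
  #endif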
Revision tags: v6.8, v6.8-rc7, v6.8-rc6 |
# f70405af | 19-Feb-2024 | Matthew Wilcox (Oracle) <[email protected]>
locking: Add rwsem_assert_held() and rwsem_assert_held_write()
These are modelled after lockdep_assert_held() and lockdep_assert_held_write(), but they are always active, even when lockdep is disabled. Of course, they don't test that _this_ thread is the owner, but it's sufficient to catch many bugs and doesn't incur the same performance penalty as lockdep.
Acked-by: "Peter Zijlstra (Intel)" <[email protected]> Acked-by: Waiman Long <[email protected]> Acked-by: "Darrick J. Wong" <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: "Matthew Wilcox (Oracle)" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
Revision tags: v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2 |
# e4ab322f | 17-Sep-2023 | Peter Zijlstra <[email protected]>
cleanup: Add conditional guard support
Adds:
- DEFINE_GUARD_COND() / DEFINE_LOCK_GUARD_1_COND() to extend existing guards with conditional lock primitives, eg. mutex_trylock(), mutex_lock_interruptible().
nb. both primitives allow NULL 'locks', which causes the lock acquisition to fail (obviously).
- extends scoped_guard() to not take the body when the conditional guard 'fails', eg.
scoped_guard (mutex_intr, &task->signal_cred_guard_mutex) { ... }
will only execute the body when the mutex is held.
- provides scoped_cond_guard(name, fail, args...), which extends scoped_guard() to run the 'fail' statement when the lock acquire fails.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/20231102110706.460851167%40infradead.org
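A hedged usage sketch (struct and field names are hypothetical; mutex_intr is the conditional guard defined via DEFINE_GUARD_COND()):

  #include <linux/cleanup.h>
  #include <linux/mutex.h>

  struct my_dev {
          struct mutex lock;
          unsigned long stat_count;
  };

  static int update_stats(struct my_dev *dev)
  {
          /* body runs only if mutex_lock_interruptible() succeeded;
           * the mutex is dropped automatically at scope exit */
          scoped_guard (mutex_intr, &dev->lock) {
                  dev->stat_count++;
                  return 0;
          }
          return -EINTR;  /* acquisition was interrupted */
  }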
Revision tags: v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4 |
# 54da6a09 | 26-May-2023 | Peter Zijlstra <[email protected]>
locking: Introduce __cleanup() based infrastructure
Use __attribute__((__cleanup__(func))) to build:
- simple auto-release pointers using __free()
- 'classes' with constructor and destructor semantics for scope-based resource management.
- lock guards based on the above classes.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/20230612093537.614161713%40infradead.org
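A hedged sketch of the primitives in use (struct name hypothetical; __free(), guard() and linux/cleanup.h are the interfaces introduced here):

  #include <linux/cleanup.h>
  #include <linux/mutex.h>
  #include <linux/slab.h>

  struct my_ctx {
          struct mutex lock;
  };

  static int parse_thing(struct my_ctx *ctx)
  {
          /* kfree()d automatically when 'buf' goes out of scope */
          char *buf __free(kfree) = kzalloc(64, GFP_KERNEL);

          if (!buf)
                  return -ENOMEM;

          guard(mutex)(&ctx->lock);       /* unlocked at scope exit */
          /* ... use buf under ctx->lock ... */
          return 0;
  }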
Revision tags: v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1 |
# d72d84ae | 14-Jan-2022 | Guchun Chen <[email protected]>
locking/rwsem: drop redundant semicolon of down_write_nest_lock
Otherwise, braces are needed when using it.
Signed-off-by: Guchun Chen <[email protected]>
Acked-by: Christian König <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Christian König <[email protected]>
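A hypothetical illustration of the problem (names invented): with a trailing semicolon in the macro body, the expansion becomes two statements, so a braceless if/else no longer parses.

  #define BAD_LOCK(sem, nest)     do_lock(sem, nest);     /* note the ';' */

  if (cond)
          BAD_LOCK(sem, nest);    /* expands to "do_lock(sem, nest);;" */
  else                            /* error: 'else' without a previous 'if' */
          do_unlock(sem);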
Revision tags: v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1 |
# f5d80614 | 09-Nov-2021 | Andy Shevchenko <[email protected]>
kernel.h: drop unneeded <linux/kernel.h> inclusion from other headers
Patch series "kernel.h further split", v5.
kernel.h is a grab bag of unrelated definitions, often pulled into compilation units, especially drivers, that need only one or two macro definitions from it.
This patch (of 7):
There is no evidence we need kernel.h inclusion in certain headers. Drop unneeded <linux/kernel.h> inclusion from other headers.
[[email protected]: bottom_half.h needs kernel]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Brendan Higgins <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Sakari Ailus <[email protected]>
Cc: Laurent Pinchart <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Thorsten Leemhuis <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Revision tags: v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1 |
# 15eb7c88 | 31-Aug-2021 | Mike Galbraith <[email protected]>
locking/rwsem: Add missing __init_rwsem() for PREEMPT_RT
730633f0b7f95 became the first direct caller of __init_rwsem() vs the usual init_rwsem(), exposing PREEMPT_RT's lack thereof. Add it.
[ tglx: Move it out of line ]
Signed-off-by: Mike Galbraith <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
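A hedged sketch approximating the out-of-line RT variant (helper names are assumptions based on the rwbase infrastructure of that era):

  void __init_rwsem(struct rw_semaphore *sem, const char *name,
                    struct lock_class_key *key)
  {
          init_rwbase_rt(&sem->rwbase);   /* the RT substrate */
  #ifdef CONFIG_DEBUG_LOCK_ALLOC
          debug_check_no_locks_freed((void *)sem, sizeof(*sem));
          lockdep_init_map_wait(&sem->dep_map, name, key, 0, LD_WAIT_SLEEP);
  #endif
  }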
Revision tags: v5.14, v5.14-rc7, v5.14-rc6 |
# 42254105 | 15-Aug-2021 | Thomas Gleixner <[email protected]>
locking/rwsem: Add rtmutex based R/W semaphore implementation
The RT specific R/W semaphore implementation used to restrict the number of readers to one, because a writer cannot block on multiple readers and inherit its priority or budget.
The single reader restriction was painful in various ways:
- Performance bottleneck for multi-threaded applications in the page fault path (mmap sem)
- Progress blocker for drivers which are carefully crafted to avoid the potential reader/writer deadlock in mainline.
The analysis of the writer code paths shows that properly written RT tasks should not take them. Syscalls like mmap() and file accesses which take the mmap sem write-locked have unbounded latencies, which are completely unrelated to the mmap sem. Other R/W sem users like graphics drivers are not suitable for RT tasks either.
So there is little risk to hurt RT tasks when the RT rwsem implementation is done in the following way:
- Allow concurrent readers
- Make writers block until the last reader left the critical section. This blocking is not subject to priority/budget inheritance.
- Readers blocked on a writer inherit their priority/budget in the normal way.
There is a drawback with this scheme: R/W semaphores become writer unfair, though the applications which have triggered writer starvation (mostly on mmap_sem) in the past are not really the typical workloads running on a RT system. So while it's unlikely to hit writer starvation, it's possible. If there are unexpected workloads on RT systems triggering it, the problem has to be revisited.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
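A hedged sketch of the resulting data structure (simplified; exact member names are assumptions):

  struct rwbase_rt {
          atomic_t                readers;        /* reader count; a writer applies a bias */
          struct rt_mutex_base    rtmutex;        /* writers and blocked readers queue here */
  };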
Revision tags: v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5 |
# e2db7592 | 22-Mar-2021 | Ingo Molnar <[email protected]>
locking: Fix typos in comments
Fix ~16 single-word typos in locking code comments.
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Revision tags: v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7 |
# 31784cff | 03-Dec-2020 | Eric W. Biederman <[email protected]>
rwsem: Implement down_read_interruptible
In preparation for converting exec_update_mutex to a rwsem so that multiple readers can execute in parallel and not deadlock, add down_read_interruptible. This is needed for perf_event_open to be converted (with no semantic changes) from working on a mutex to working on a rwsem.
Signed-off-by: Eric W. Biederman <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
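A hedged usage sketch (the exec_update_lock name follows the eventual conversion; treat it as an assumption):

  static int with_exec_update_read(struct task_struct *task)
  {
          int err;

          err = down_read_interruptible(&task->signal->exec_update_lock);
          if (err)
                  return err;     /* typically -EINTR: a signal arrived while waiting */
          /* ... read-side critical section ... */
          up_read(&task->signal->exec_update_lock);
          return 0;
  }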
# 0f9368b5 | 03-Dec-2020 | Eric W. Biederman <[email protected]>
rwsem: Implement down_read_killable_nested
In preparation for converting exec_update_mutex to a rwsem so that multiple readers can execute in parallel and not deadlock, add down_read_killable_nested. This is needed so that kcmp_lock can be converted from working on mutexes to working on rw_semaphores.
Signed-off-by: Eric W. Biederman <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
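A hedged sketch modelled on the kcmp conversion (simplified; locking in address order avoids ABBA deadlock between the two rwsems):

  static int kcmp_lock(struct rw_semaphore *l1, struct rw_semaphore *l2)
  {
          int err;

          if (l2 > l1)
                  swap(l1, l2);   /* always lock in address order */

          err = down_read_killable(l1);
          if (!err && likely(l1 != l2)) {
                  err = down_read_killable_nested(l2, SINGLE_DEPTH_NESTING);
                  if (err)
                          up_read(l1);
          }
          return err;
  }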
Revision tags: v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5 |
# a9232dc5 | 11-Jul-2020 | Alexey Dobriyan <[email protected]>
rwsem: fix commas in initialisation
Leading comma prevents arbitrary reordering of initialisation clauses. The whole point of C99 initialisation is to allow any such reordering.
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
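An illustrative before/after (hypothetical, abridged initializer): with trailing commas every clause is self-contained and can be reordered or removed independently.

  /* before: the leading comma ties each clause to the previous line */
  #define __RWSEM_INIT_OLD(name)  { .count = RWSEM_UNLOCKED_VALUE         \
                                  , .wait_list = LIST_HEAD_INIT((name).wait_list) }

  /* after: trailing commas; any clause can move or disappear */
  #define __RWSEM_INIT_NEW(name)  {                                       \
          .count = RWSEM_UNLOCKED_VALUE,                                  \
          .wait_list = LIST_HEAD_INIT((name).wait_list),                  \
  }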
Revision tags: v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7 |
# de8f5e4f | 21-Mar-2020 | Peter Zijlstra <[email protected]>
lockdep: Introduce wait-type checks
Extend lockdep to validate lock wait-type context.
The current wait-types are:
LD_WAIT_FREE,   /* wait free, rcu etc.. */
LD_WAIT_SPIN,   /* spin loops, raw_spinlock_t etc.. */
LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
LD_WAIT_SLEEP,  /* sleeping locks, mutex_t etc.. */
Where lockdep validates that the current lock (the one being acquired) fits in the current wait-context (as generated by the held stack).
This ensures that there is no attempt to acquire mutexes while holding spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In other words, it's a fancier might_sleep().
Obviously RCU made the entire ordeal more complex than a simple single value test because RCU can be acquired in (pretty much) any context and while it presents a context to nested locks it is not the same as it got acquired in.
Therefore it's necessary to split the wait_type into two values, one representing the acquire (outer) and one representing the nested context (inner). For most 'normal' locks these two are the same.
[ To make static initialization easier we have the rule that: .outer == INV means .outer == .inner; because INV == 0. ]
It further means that it's required to find the minimal .inner of the held stack to compare against the outer of the new lock; because while 'normal' RCU presents a CONFIG type to nested locks, if it is taken while already holding a SPIN type it obviously doesn't relax the rules.
Below is an example output generated by the trivial test code:
raw_spin_lock(&foo);
spin_lock(&bar);
spin_unlock(&bar);
raw_spin_unlock(&foo);
[ BUG: Invalid wait context ]
-----------------------------
swapper/0/1 is trying to lock:
ffffc90000013f20 (&bar){....}-{3:3}, at: kernel_init+0xdb/0x187
other info that might help us debug this:
1 lock held by swapper/0/1:
 #0: ffffc90000013ee0 (&foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187
The way to read it is to look at the new -{n,m} part in the lock description; -{3:3} for the attempted lock, and try and match that up to the held locks, which in this case is the one: -{2,2}.
This tells that the acquiring lock requires a more relaxed environment than presented by the lock stack.
Currently only the normal locks and RCU are converted; the rest of the lockdep users default to .inner = INV, which is ignored. More conversions can be done when desired.
The check for spinlock_t nesting is not enabled by default. It's a separate config option for now as there are known problems which are currently being addressed. The config option allows to identify these problems and to verify that the solutions found are indeed solving them.
The config switch will be removed and the checks will be permanently enabled once the vast majority of issues has been addressed.
[ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP] [ tglx: Add the config option ]
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Revision tags: v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1 |
# bcba67cd | 04-Feb-2020 | Peter Zijlstra <[email protected]>
locking/rwsem: Remove RWSEM_OWNER_UNKNOWN
Remove the now unused RWSEM_OWNER_UNKNOWN hack. This hack breaks PREEMPT_RT and getting rid of it was the entire motivation for re-writing the percpu rwsem.
The biggest problem is that it is fundamentally incompatible with any form of Priority Inheritance: any exclusively held lock must have a distinct owner.
Requested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Waiman Long <[email protected]>
Tested-by: Juri Lelli <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Revision tags: v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3 |
# fce45cd4 | 29-Jul-2019 | Davidlohr Bueso <[email protected]>
locking/rwsem: Check for operations on an uninitialized rwsem
Currently the rwsem is the only locking primitive that lacks this debug feature. Add it under CONFIG_DEBUG_RWSEMS and do the magic checking in the locking fastpath (trylock) operation such that we cover all cases. The unlocking part is pretty straightforward.
Signed-off-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: [email protected]
Cc: Davidlohr Bueso <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
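A hedged sketch of the check (the wrapper is hypothetical; the DEBUG_RWSEMS_WARN_ON() usage mirrors the fastpath checks described above):

  static inline void rwsem_debug_check(struct rw_semaphore *sem)
  {
          /* ->magic is set to the rwsem's own address at init time;
           * a mismatch means the rwsem was never initialized */
          DEBUG_RWSEMS_WARN_ON(sem->magic != sem, sem);
  }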
Revision tags: v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5 |
# 387b1468 | 10-Apr-2019 | Mauro Carvalho Chehab <[email protected]>
docs: locking: convert docs to ReST and rename to *.rst
Convert the locking documents to ReST and add them to the kernel development book where they belong.
Most of the stuff here is just to make Sphinx properly parse the text files, as they're already in good shape, not requiring massive changes in order to be parsed.
The conversion is actually:
- add blank lines and indentation in order to identify paragraphs;
- fix table markups;
- add some list markups;
- mark literal blocks;
- adjust title markups.
At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Acked-by: Federico Vaga <[email protected]>
# 94a9717b | 20-May-2019 | Waiman Long <[email protected]>
locking/rwsem: Make rwsem->owner an atomic_long_t
The rwsem->owner contains not just the task structure pointer, it also holds some flags for storing the current state of the rwsem. Some of the flags may have to be atomically updated. To reflect the new reality, the owner is now changed to an atomic_long_t type.
New helper functions are added to properly separate out the task structure pointer and the embedded flags.
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: huang ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
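A hedged sketch of the helper pattern (flag names and mask are assumptions based on that era's rwsem internals):

  #define RWSEM_READER_OWNED      (1UL << 0)
  #define RWSEM_ANONYMOUSLY_OWNED (1UL << 1)
  #define RWSEM_OWNER_FLAGS_MASK  (RWSEM_READER_OWNED | RWSEM_ANONYMOUSLY_OWNED)

  /* strip the embedded flags to recover the task_struct pointer */
  static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
  {
          return (struct task_struct *)
                  (atomic_long_read(&sem->owner) & ~RWSEM_OWNER_FLAGS_MASK);
  }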
# 02f1082b | 20-May-2019 | Waiman Long <[email protected]>
locking/rwsem: Clarify usage of owner's nonspinnable bit
Bit 1 of sem->owner (RWSEM_ANONYMOUSLY_OWNED) is used to designate an anonymous owner - readers or an anonymous writer. The setting of this anonymous bit is used as an indicator that optimistic spinning cannot be done on this rwsem.
With the upcoming reader optimistic spinning patches, a reader-owned rwsem can be spun on for a limited period of time. We still need this bit to indicate that a rwsem is nonspinnable, but the bit will no longer imply that the owner is anonymous. So rename the bit to RWSEM_NONSPINNABLE to clarify its meaning.
This patch also fixes a DEBUG_RWSEMS_WARN_ON() bug in __up_write().
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: huang ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
# c71fd893 | 20-May-2019 | Waiman Long <[email protected]>
locking/rwsem: Make owner available even if !CONFIG_RWSEM_SPIN_ON_OWNER
The owner field in the rw_semaphore structure is used primarily for optimistic spinning. However, identifying the rwsem owner can also be helpful in debugging as well as tracing locking-related issues when analyzing a crash dump. The owner field may also store state information that can be important to the operation of the rwsem.
So the owner field is now made a permanent member of the rw_semaphore structure irrespective of CONFIG_RWSEM_SPIN_ON_OWNER.
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: huang ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Revision tags: v5.1-rc4 |
# 364f784f | 04-Apr-2019 | Waiman Long <[email protected]>
locking/rwsem: Optimize rwsem structure for uncontended lock acquisition
For an uncontended rwsem, count and owner are the only fields a task needs to touch when acquiring the rwsem. So they are put next to each other to increase the chance that they will share the same cacheline.
On a ThunderX2 99xx (arm64) system with 32K L1 cache and 256K L2 cache, a rwsem locking microbenchmark with one locking thread was run to write-lock and write-unlock an array of rwsems separated 2 cachelines apart in a 1M byte memory block. The locking rates (kops/s) of the microbenchmark when the rwsems are at various "long" (8-byte) offsets from beginning of the cacheline before and after the patch were as follows:
Cacheline Offset    Pre-patch    Post-patch
----------------    ---------    ----------
       0             17,449       16,588
       1             17,450       16,465
       2             17,450       16,460
       3             17,453       16,462
       4             14,867       16,471
       5             14,867       16,470
       6             14,853       16,464
       7             14,867       13,172
Before the patch, the count and owner are 4 "long"s apart. After the patch, they are only 1 "long" apart.
The rwsem data have to be loaded from the L3 cache for each access. It can be seen that the locking rates are more consistent after the patch than before. Note that for this particular system, the performance drop happens whenever the count and owner are an odd multiple of "long"s apart. No performance drop was observed when only a single rwsem was used (hot cache). So the drop is likely just an idiosyncrasy of the cache architecture of this chip rather than an inherent problem with the patch.
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
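A hedged sketch of the layout idea (abridged; the exact field set and any #ifdef wrapping are assumptions, the point is count and owner sharing a cacheline):

  struct rw_semaphore {
          atomic_long_t           count;
          struct task_struct      *owner;         /* now adjacent to count */
          struct optimistic_spin_queue osq;
          raw_spinlock_t          wait_lock;
          struct list_head        wait_list;
  };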
# 12a30a7f | 04-Apr-2019 | Waiman Long <[email protected]>
locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.h
We don't need to expose rwsem internal functions which are not supposed to be called directly from other kernel code.
Signed-off-by: Waiman Long <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Davidlohr Bueso <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Revision tags: v5.1-rc3, v5.1-rc2 |
# 390a0c62 | 22-Mar-2019 | Waiman Long <[email protected]>
locking/rwsem: Remove rwsem-spinlock.c & use rwsem-xadd.c for all archs
Currently, we have two different implementations of rwsem:

1) CONFIG_RWSEM_GENERIC_SPINLOCK (rwsem-spinlock.c)
2) CONFIG_RWSEM_XCHGADD_ALGORITHM (rwsem-xadd.c)
As we are going to use a single generic implementation based on rwsem-xadd.c, and no architecture-specific code will be needed, there is no point in keeping two different implementations of rwsem. In most cases, the performance of rwsem-spinlock.c will be worse. It also doesn't get all the performance tuning and optimizations that had been implemented in rwsem-xadd.c over the years.
For simplification, we are going to remove rwsem-spinlock.c and make all architectures use a single implementation of rwsem - rwsem-xadd.c.
All references to RWSEM_GENERIC_SPINLOCK and RWSEM_XCHGADD_ALGORITHM in the code are removed.
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
# 46ad0840 | 22-Mar-2019 | Waiman Long <[email protected]>
locking/rwsem: Remove arch specific rwsem files
As the generic rwsem-xadd code is using the appropriate acquire and release versions of the atomic operations, the arch specific rwsem.h files will not be that much faster than the generic code as long as the atomic functions are properly implemented. So we can remove those arch specific rwsem.h and stop building asm/rwsem.h to reduce maintenance effort.
Currently, only x86, alpha and ia64 have implemented architecture specific fast paths. I don't have access to alpha and ia64 systems for testing, but they are legacy systems that are not likely to be updated to the latest kernel anyway.
By using a rwsem microbenchmark, the total locking rates on a 4-socket 56-core 112-thread x86-64 system before and after the patch were as follows (mixed means equal # of read and write locks):
                    Before Patch              After Patch
# of Threads    wlock   rlock   mixed     wlock   rlock   mixed
------------    -----   -----   -----     -----   -----   -----
     1          29,201  30,143  29,458    28,615  30,172  29,201
     2           6,807  13,299   1,171     7,725  15,025   1,804
     4           6,504  12,755   1,520     7,127  14,286   1,345
     8           6,762  13,412     764     6,826  13,652     726
    16           6,693  15,408     662     6,599  15,938     626
    32           6,145  15,286     496     5,549  15,487     511
    64           5,812  15,495      60     5,858  15,572      60
There were some run-to-run variations for the multi-thread tests. For x86-64, using the generic C code fast path seems to be a little bit faster than the assembly version with low lock contention. Looking at the assembly version of the fast paths, there are assembly to/from C code wrappers that save and restore all the callee-clobbered registers (7 registers on x86-64). The assembly generated from the generic C code doesn't need to do that. That may explain the slight performance gain here.
The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h with no code change as no other code other than those under kernel/locking needs to access the internal rwsem macros and functions.
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Revision tags: v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3 |
# 925b9cd1 | 06-Sep-2018 | Waiman Long <[email protected]>
locking/rwsem: Make owner store task pointer of last owning reader
Currently, when a reader acquires a lock, it only sets the RWSEM_READER_OWNED bit in the owner field. The other bits are simply not used. When debugging hanging cases involving rwsems and readers, the owner value does not provide much useful information at all.
This patch modifies the current behavior to always store the task_struct pointer of the last rwsem-acquiring reader in a reader-owned rwsem. This may be useful in debugging rwsem hanging cases especially if only one reader is involved. However, the task in the owner field may not be the real owner or one of the real owners at all when the owner value is examined, for example, in a crash dump. So it is just an additional hint about the past history.
If CONFIG_DEBUG_RWSEMS=y is enabled, the owner field will be checked at unlock time too to make sure the task pointer value is valid. That does have a slight performance cost and so is only enabled as part of that debug option.
From the performance point of view, it is expected that the changes shouldn't have any noticeable performance impact. A rwsem microbenchmark (with 48 worker threads and 1:1 reader/writer ratio) was run on a 2-socket 24-core 48-thread Haswell system. The locking rates on a 4.19-rc1 based kernel were as follows:
1) Unpatched kernel:                        543.3 kops/s
2) Patched kernel:                          549.2 kops/s
3) Patched kernel (CONFIG_DEBUG_RWSEMS on): 546.6 kops/s
There was actually a slight increase in performance (1.1%) in this particular case. Maybe it was caused by the elimination of a branch or just a testing noise. Turning on the CONFIG_DEBUG_RWSEMS option also had less than the expected impact on performance.
The least significant 2 bits of the owner value are now used to designate the rwsem is readers owned and the owners are anonymous.
Signed-off-by: Waiman Long <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
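A hedged sketch of the reader-side bookkeeping (simplified; owner was still a plain pointer in this era, and the flag values follow the description above):

  #define RWSEM_READER_OWNED      (1UL << 0)
  #define RWSEM_ANONYMOUSLY_OWNED (1UL << 1)

  static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
  {
          /* record the last acquiring reader, plus the two marker bits */
          WRITE_ONCE(sem->owner, (struct task_struct *)
                     ((unsigned long)current | RWSEM_READER_OWNED |
                      RWSEM_ANONYMOUSLY_OWNED));
  }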
Revision tags: v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6 |
# 5a817641 | 15-May-2018 | Waiman Long <[email protected]>
locking/percpu-rwsem: Annotate rwsem ownership transfer by setting RWSEM_OWNER_UNKNOWN
The filesystem freezing code needs to transfer ownership of a rwsem embedded in a percpu-rwsem from the task that does the freezing to another one that does the thawing by calling percpu_rwsem_release() after freezing and percpu_rwsem_acquire() before thawing.
However, the new rwsem debug code runs afoul of this scheme by warning that the task that releases the rwsem isn't the one that acquired it, as reported by Amir Goldstein:
DEBUG_LOCKS_WARN_ON(sem->owner != get_current())
WARNING: CPU: 1 PID: 1401 at /home/amir/build/src/linux/kernel/locking/rwsem.c:133 up_write+0x59/0x79

Call Trace:
 percpu_up_write+0x1f/0x28
 thaw_super_locked+0xdf/0x120
 do_vfs_ioctl+0x270/0x5f1
 ksys_ioctl+0x52/0x71
 __x64_sys_ioctl+0x16/0x19
 do_syscall_64+0x5d/0x167
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
To work properly with the rwsem debug code, we need to annotate that the rwsem ownership is unknown during the transfer period until a brave soul comes forward to acquire the ownership. During that period, optimistic spinning will be disabled.
Reported-by: Amir Goldstein <[email protected]>
Tested-by: Amir Goldstein <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Theodore Y. Ts'o <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
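A hedged sketch of the handoff (modelled on the fs freezing code; the per-level helpers are simplifications):

  /* freezing task: give up rwsem ownership for debug purposes */
  static void sb_freeze_release_level(struct super_block *sb, int level)
  {
          percpu_rwsem_release(sb->s_writers.rw_sem + level, 0, _THIS_IP_);
  }

  /* thawing task (possibly a different one): reclaim ownership */
  static void sb_freeze_acquire_level(struct super_block *sb, int level)
  {
          percpu_rwsem_acquire(sb->s_writers.rw_sem + level, 0, _THIS_IP_);
  }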