Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13
# 9d4f8e54 | 14-Jan-2025 | Herbert Xu <[email protected]>
rhashtable: Fix rhashtable_try_insert test
The test on whether rhashtable_insert_one did an insertion relies on the value returned by rhashtable_lookup_one. Unfortunately that value is overwritten after rhashtable_insert_one returns. Fix this by moving the test before data gets overwritten.
Simplify the test as only data == NULL matters.
Finally move atomic_inc back within the lock as otherwise it may be reordered with the atomic_dec on the removal side, potentially leading to an underflow.
Reported-by: Michael Kelley <[email protected]>
Fixes: e1d3422c95f0 ("rhashtable: Fix potential deadlock by moving schedule_work outside lock")
Signed-off-by: Herbert Xu <[email protected]>
Tested-by: Michael Kelley <[email protected]>
Reviewed-by: Breno Leitao <[email protected]>
Tested-by: Mikhail Zaslonko <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
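Schematically, the fix captures the insertion decision while data still holds the lookup result, and keeps the counter bump under the bucket lock (a sketch of the described ordering, not the verbatim diff; 'inserted' is a local introduced here for illustration):

    flags = rht_lock(tbl, bkt);

    data = rhashtable_lookup_one(ht, bkt, tbl, hash, key, obj);
    new_tbl = rhashtable_insert_one(ht, bkt, tbl, hash, obj, data);

    /* Decide now: per the message, only data == NULL matters, and
     * data is about to be overwritten below. */
    inserted = data == NULL;
    if (PTR_ERR(new_tbl) != -EEXIST)
        data = ERR_CAST(new_tbl);

    /* Still under the bucket lock, so this cannot be reordered
     * against the atomic_dec() on the removal side. */
    if (inserted)
        atomic_inc(&ht->nelems);

    rht_unlock(tbl, bkt, flags);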
Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1
# f3a6101b | 23-Nov-2024 | Pratyush Mittal <[email protected]>
lib/rhashtable: fix the typo for preemptible
Fix the spelling of the mis-spelled word
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Pratyush Mittal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
# e1d3422c | 28-Nov-2024 | Breno Leitao <[email protected]>
rhashtable: Fix potential deadlock by moving schedule_work outside lock
Move the hash table growth check and work scheduling outside the rht lock to prevent a possible circular locking dependency.
The original implementation could trigger a lockdep warning due to a potential deadlock scenario involving nested locks between the rhashtable bucket, rq lock, and dsq lock. By relocating the growth check and work scheduling after releasing the rht lock, we break this potential deadlock chain.
This change expands the flexibility of rhashtable by removing restrictive locking that previously limited its use in scheduler and workqueue contexts.
It is important to note that this calls rht_grow_above_75(), which reads from struct rhashtable without holding the lock. If this turns out to be a problem, we can move the check inside the lock and schedule the workqueue after the lock is released.
Fixes: f0e1a0643a59 ("sched_ext: Implement BPF extensible scheduler class")
Suggested-by: Tejun Heo <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
Modified so that atomic_inc is also moved outside of the bucket lock along with the growth above 75% check.
Signed-off-by: Herbert Xu <[email protected]>
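In outline, the change moves the growth check past the unlock (a schematic sketch; the exact context in rhashtable_try_insert() differs, and the later fix above moves atomic_inc() back inside the lock):

    rht_unlock(tbl, bkt, flags);

    /* Now outside the bucket lock, breaking the bucket-lock ->
     * rq-lock -> dsq-lock chain lockdep complained about.
     * rht_grow_above_75() reads the table state unlocked, which the
     * commit message flags as acceptable. */
    atomic_inc(&ht->nelems);
    if (rht_grow_above_75(ht, tbl))
        schedule_work(&ht->run_work);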
Revision tags: v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3
# a15bec6a | 06-Aug-2024 | Davidlohr Bueso <[email protected]>
lib/rhashtable: cleanup fallback check in bucket_table_alloc()
Upon allocation failure, the current check with the nofail bits is unnecessary, and further stands in the way of discouraging direct use of __GFP_NOFAIL. Remove this and replace it with the proper way of determining whether a non-blocking allocation is being done, for the nested table case.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Davidlohr Bueso <[email protected]>
Suggested-by: Michal Hocko <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Herbert Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
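The shape of the change, as described (the 'before' condition is paraphrased; gfpflags_allow_blocking() is the standard helper for this question):

    /* before: open-coded test involving the nofail bits */
    if (tbl == NULL && (gfp & ~__GFP_NOFAIL) != GFP_KERNEL)
        tbl = nested_bucket_table_alloc(ht, nbuckets, gfp);

    /* after: ask directly whether the allocation may block */
    if (tbl == NULL && !gfpflags_allow_blocking(gfp))
        tbl = nested_bucket_table_alloc(ht, nbuckets, gfp);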
Revision tags: v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1
# 9e54dd8b | 21-Mar-2024 | Kent Overstreet <[email protected]>
rhashtable: plumb through alloc tag
This gives better memory allocation profiling results; rhashtable allocations will be accounted to the code that initialized the rhashtable.
[[email protected]: undo _noprof additions in the documentation]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kent Overstreet <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Tested-by: Kees Cook <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Alex Gaynor <[email protected]>
Cc: Alice Ryhl <[email protected]>
Cc: Andreas Hindborg <[email protected]>
Cc: Benno Lossin <[email protected]>
Cc: "Björn Roy Baron" <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Gary Guo <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wedson Almeida Filho <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
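The plumbing follows the usual alloc-tag pattern: the real initializer becomes a _noprof variant and the public name becomes an alloc_hooks() wrapper, so allocations are accounted to the rhashtable_init() call site (a sketch):

    int rhashtable_init_noprof(struct rhashtable *ht,
                               const struct rhashtable_params *params);
    #define rhashtable_init(...) \
        alloc_hooks(rhashtable_init_noprof(__VA_ARGS__))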
Revision tags: v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1
# e47877c7 | 06-Dec-2022 | Tejun Heo <[email protected]>
rhashtable: Allow rhashtable to be used from irq-safe contexts
rhashtable currently only does bh-safe synchronization making it impossible to use from irq-safe contexts. Switch it to use irq-safe synchronization to remove the restriction.
v2: Update the lock functions to return the ulong flags value and unlock functions to take the value directly instead of passing around the pointer. Suggested by Linus.
Signed-off-by: Tejun Heo <[email protected]>
Reviewed-by: David Vernet <[email protected]>
Acked-by: Josh Don <[email protected]>
Acked-by: Hao Luo <[email protected]>
Acked-by: Barret Rhoden <[email protected]>
Cc: Linus Torvalds <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
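With the change, the lock helpers look roughly like this, using v2's flags-by-value convention:

    static inline unsigned long rht_lock(struct bucket_table *tbl,
                                         struct rhash_lock_head __rcu **bkt)
    {
        unsigned long flags;

        local_irq_save(flags);           /* was local_bh_disable() */
        bit_spin_lock(0, (unsigned long *)bkt);
        lock_map_acquire(&tbl->dep_map);
        return flags;                    /* v2: return, don't pass a pointer */
    }

    static inline void rht_unlock(struct bucket_table *tbl,
                                  struct rhash_lock_head __rcu **bkt,
                                  unsigned long flags)
    {
        lock_map_release(&tbl->dep_map);
        bit_spin_unlock(0, (unsigned long *)bkt);
        local_irq_restore(flags);        /* was local_bh_enable() */
    }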
Revision tags: v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1
# 9dbbc3b9 | 08-Jul-2021 | Zhen Lei <[email protected]>
lib: fix spelling mistakes
Fix some spelling mistakes in comments:
permanentely ==> permanently
wont ==> won't
remaning ==> remaining
succed ==> succeed
shouldnt ==> shouldn't
alpha-numeric ==> alphanumeric
storeing ==> storing
funtion ==> function
documenation ==> documentation
Determin ==> Determine
intepreted ==> interpreted
ammount ==> amount
obious ==> obvious
interupts ==> interrupts
occured ==> occurred
asssociated ==> associated
taking into acount ==> taking into account
squence ==> sequence
stil ==> still
contiguos ==> contiguous
matchs ==> matches
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Zhen Lei <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Revision tags: v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7
# ce9b362b | 24-Jul-2020 | Herbert Xu <[email protected]>
rhashtable: Restore RCU marking on rhash_lock_head
This patch restores the RCU marking on bucket_table->buckets as it really does need RCU protection. Its removal had led to a fatal bug.
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Revision tags: v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1
# 4a3084aa | 03-Jun-2020 | Herbert Xu <[email protected]>
rhashtable: Drop raw RCU deref in nested_table_free
This patch replaces some unnecessary uses of rcu_dereference_raw in the rhashtable code with rcu_dereference_protected.
The top-level nested table entry is only marked as RCU because it shares the same type as the tree entries underneath it. So it doesn't need any RCU protection.
We also don't need RCU protection when we're freeing a nested RCU table because by this stage we've long passed a memory barrier when anyone could change the nested table.
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
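The free path then reads roughly as follows; rcu_dereference_protected(..., 1) documents that no protection is required, where rcu_dereference_raw() merely suppressed the check:

    static void nested_table_free(union nested_table *ntbl, unsigned int size)
    {
        const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
        const unsigned int len = 1 << shift;
        unsigned int i;

        /* Condition '1': we are past the final memory barrier, so
         * nobody can still be updating this table. */
        ntbl = rcu_dereference_protected(ntbl->table, 1);
        if (!ntbl)
            return;

        if (size > len) {
            size >>= shift;
            for (i = 0; i < len; i++)
                nested_table_free(ntbl + i, size);
        }

        kfree(ntbl);
    }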
Revision tags: v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4
# d2912cb1 | 04-Jun-2019 | Thomas Gleixner <[email protected]>
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Enrico Weigelt <[email protected]>
Reviewed-by: Kate Stewart <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Revision tags: v5.2-rc3, v5.2-rc2, v5.2-rc1
# e9458a4e | 16-May-2019 | Herbert Xu <[email protected]>
rhashtable: Fix cmpxchg RCU warnings
As cmpxchg is a non-RCU mechanism it will cause sparse warnings when we use it for RCU. This patch adds explicit casts to silence those warnings. This should probably be moved to RCU itself in future.
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
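The silencing amounts to stripping the __rcu address space with a plain cast before handing the pointer to cmpxchg(), along these lines (sketched):

    /* cmpxchg() is not RCU-aware; cast away __rcu to keep sparse quiet */
    if (cmpxchg((struct bucket_table **)&old_tbl->future_tbl, NULL,
                new_tbl) != NULL)
        return -EEXIST;    /* someone else attached a future table first */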
# ba6306e3 | 16-May-2019 | Herbert Xu <[email protected]>
rhashtable: Remove RCU marking from rhash_lock_head
The opaque type rhash_lock_head should not be marked with __rcu because it can never be dereferenced. We should apply the RCU marking when we turn it into a pointer which can be dereferenced.
This patch does exactly that. This fixes a number of sparse warnings as well as getting rid of some unnecessary RCU checking.
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Revision tags: v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5
# ca0b709d | 12-Apr-2019 | NeilBrown <[email protected]>
rhashtable: use BIT(0) for locking.
As reported by Guenter Roeck, the new bit-locking using BIT(1) doesn't work on the m68k architecture. m68k only requires 2-byte alignment for words and longwords, so there is only one unused bit in pointers to structs - we currently use two, one for the NULLS marker at the end of the linked list, and one for the bit-lock in the head of the list.
The two uses don't need to conflict as we never need the head of the list to be a NULLS marker - the marker is only needed to check if an object has moved to a different table, and the bucket head cannot move. The NULLS marker is only needed in a ->next pointer.
As we already have different types for the bucket head pointer (struct rhash_lock_head) and the ->next pointers (struct rhash_head), it is fairly easy to treat the lsb differently in each.
So: Initialize bucket heads to NULL, and use the lsb for locking. When loading the pointer from the bucket head, if it is NULL (ignoring the lock bit), report it as the expected NULLS marker. When storing a value into a bucket head, if it is a NULLS marker, store NULL instead.
And convert all places that used bit 1 for locking, to use bit 0.
Fixes: 8f0db018006a ("rhashtable: use bit_spin_locks to protect hash bucket.")
Reported-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
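The load-side mapping collapses to a single expression: mask off the lock bit, and if nothing remains, substitute the NULLS marker callers expect (close to the final helper):

    static inline struct rhash_head *__rht_ptr(
        struct rhash_lock_head *p, struct rhash_lock_head __rcu *const *bkt)
    {
        return (struct rhash_head *)
            ((unsigned long)p & ~BIT(0) ?:               /* strip lock bit */
             (unsigned long)RHT_NULLS_MARKER(bkt));      /* NULL -> NULLS */
    }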
# f4712b46 | 12-Apr-2019 | NeilBrown <[email protected]>
rhashtable: replace rht_ptr_locked() with rht_assign_locked()
The only times rht_ptr_locked() is used, it is to store a new value in a bucket-head. This is the only time it makes sense to use it too. So replace it by a function which does the whole task: Sets the lock bit and assigns to a bucket head.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
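In its eventual BIT(0) form (after the m68k fix above) the combined helper is roughly:

    static inline void rht_assign_locked(struct rhash_lock_head __rcu **bkt,
                                         struct rhash_head *obj)
    {
        if (rht_is_a_nulls(obj))
            obj = NULL;    /* NULLS markers are stored as NULL */
        rcu_assign_pointer(*bkt, (void *)((unsigned long)obj | BIT(0)));
    }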
# adc6a3ab | 12-Apr-2019 | NeilBrown <[email protected]>
rhashtable: move dereference inside rht_ptr()
Rather than dereferencing a pointer to a bucket and then passing the result to rht_ptr(), we now pass in the pointer and do the dereference in rht_ptr().
This requires that we pass in the tbl and hash as well to support RCU checks, and means that the various rht_for_each functions can expect a pointer that can be dereferenced without further care.
There are two places where we dereference a bucket pointer where there is no testable protection - in each case we know that we must have exclusive access without having taken a lock. The previous code used rht_dereference() to pretend that holding the mutex provided protection, but holding the mutex never provides protection for accessing buckets.
So instead introduce rht_ptr_exclusive() that can be used when there is known to be exclusive access without holding any locks.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
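Building on the __rht_ptr() mapping sketched earlier, the checked and exclusive variants differ only in the dereference they perform (roughly):

    static inline struct rhash_head *rht_ptr(
        struct rhash_lock_head __rcu *const *bkt,
        struct bucket_table *tbl, unsigned int hash)
    {
        return __rht_ptr(rht_dereference_bucket(*bkt, tbl, hash), bkt);
    }

    /* For known-exclusive access where no lock is held to test against. */
    static inline struct rhash_head *rht_ptr_exclusive(
        struct rhash_lock_head __rcu *const *bkt)
    {
        return __rht_ptr(rcu_dereference_protected(*bkt, 1), bkt);
    }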
# e4edbe3c | 12-Apr-2019 | NeilBrown <[email protected]>
rhashtable: fix some __rcu annotation errors
With these annotations, the rhashtable now gets no warnings when compiled with "C=1" for sparse checking.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
# c252aa3e | 11-Apr-2019 | Gustavo A. R. Silva <[email protected]>
rhashtable: use struct_size() in kvzalloc()
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example:
    struct foo {
        int stuff;
        struct boo entry[];
    };

    size = sizeof(struct foo) + count * sizeof(struct boo);
    instance = kvzalloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper:

    instance = kvzalloc(struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Revision tags: v5.1-rc4
# 149212f0 | 01-Apr-2019 | NeilBrown <[email protected]>
rhashtable: add lockdep tracking to bucket bit-spin-locks.
Native bit_spin_locks are not tracked by lockdep.
The bit_spin_locks used for rhashtable buckets are local to the rhashtable implementation, so there is little opportunity for the sort of misuse that lockdep might detect. However locks are held while a hash function or compare function is called, and if one of these took a lock, a misbehaviour is possible.
As it is quite easy to add lockdep support this unlikely possibility seems to be enough justification.
So create a lockdep class for bucket bit_spin_lock and attach through a lockdep_map in each bucket_table.
Without the 'nested' annotation in rhashtable_rehash_one(), lockdep correctly reports a possible problem as this lock is taken while another bucket lock (in another table) is held. This confirms that the added support works. With the correct nested annotation in place, lockdep reports no problems.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
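Concretely (sketched from the description): one static class key covers every bucket lock, each bucket_table carries a lockdep_map, and the rehash path uses the nested annotation:

    static struct lock_class_key __key;    /* one class for all buckets */

    /* at bucket_table allocation time */
    lockdep_init_map(&tbl->dep_map, "rhashtable_bucket", &__key, 0);

    /* rhashtable_rehash_one(): a bucket lock in the new table is taken
     * while one in the old table is held, hence the nested annotation */
    rht_lock_nested(new_tbl, &new_tbl->buckets[new_hash],
                    SINGLE_DEPTH_NESTING);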
# 8f0db018 | 01-Apr-2019 | NeilBrown <[email protected]>
rhashtable: use bit_spin_locks to protect hash bucket.
This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the bucket pointer to lock the hash chain for that bucket.
The benefits of a bit spin_lock are:
- no need to allocate a separate array of locks.
- no need to have a configuration option to guide the choice of the size of this array.
- locking cost is often a single test-and-set in a cache line that will have to be loaded anyway. When inserting at, or removing from, the head of the chain, the unlock is free - writing the new address in the bucket head implicitly clears the lock bit. For __rhashtable_insert_fast() we ensure this always happens when adding a new key.
- even when locking costs 2 updates (lock and unlock), they are in a cacheline that needs to be read anyway.
The cost of using a bit spin_lock is a little bit of code complexity, which I think is quite manageable.
Bit spin_locks are sometimes inappropriate because they are not fair - if multiple CPUs repeatedly contend for the same lock, one CPU can easily be starved. This is not a credible situation with rhashtable. Multiple CPUs may want to repeatedly add or remove objects, but they will typically do so at different buckets, so they will attempt to acquire different locks.
As we have more bit-locks than we previously had spinlocks (by at least a factor of two) we can expect slightly less contention to go with the slightly better cache behavior and reduced memory consumption.
To enhance type checking, a new struct is introduced to represent the pointer plus lock-bit that is stored in the bucket-table. This is "struct rhash_lock_head" and is empty. A pointer to this needs to be cast to either an unsigned long, or a "struct rhash_head *" to be useful. Variables of this type are most often called "bkt".
Previously "pprev" would sometimes point to a bucket, and sometimes a ->next pointer in an rhash_head. As these are now different types, pprev is NULL when it would have pointed to the bucket. In that case, "bkt" is used, together with the correct locking protocol.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
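A sketch of the scheme as introduced here (BIT(1); the later m68k fix above moves it to BIT(0)): the bucket head gets an opaque, empty type, and lock/unlock are bit spin operations on the pointer word itself:

    struct rhash_lock_head {};    /* opaque; only ever used via casts */

    static inline void rht_lock(struct rhash_lock_head **bkt)
    {
        local_bh_disable();
        bit_spin_lock(1, (unsigned long *)bkt);
    }

    static inline void rht_unlock(struct rhash_lock_head **bkt)
    {
        bit_spin_unlock(1, (unsigned long *)bkt);
        local_bh_enable();
    }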
# ff302db9 | 01-Apr-2019 | NeilBrown <[email protected]>
rhashtable: allow rht_bucket_var to return NULL.
Rather than returning a pointer to a static nulls, rht_bucket_var() now returns NULL if the bucket doesn't exist. This will make the next patch, which stores a bitlock in the bucket pointer, somewhat cleaner.
This change involves introducing __rht_bucket_nested() which is like rht_bucket_nested(), but doesn't provide the static nulls, and changing rht_bucket_nested() to call this and possibly provide a static nulls - as is still needed for the non-var case.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
# 7a41c294 | 01-Apr-2019 | NeilBrown <[email protected]>
rhashtable: use cmpxchg() in nested_table_alloc()
nested_table_alloc() relies on the fact that there is at most one spinlock allocated for every slot in the top level nested table, so it is not possible for two threads to try to allocate the same table at the same time.
This assumption is a little fragile (it is not explicit) and is unnecessary as cmpxchg() can be used instead.
A future patch will replace the spinlocks by per-bucket bitlocks, and then we won't be able to protect the slot pointer with a spinlock.
So replace rcu_assign_pointer() with cmpxchg() - which has equivalent barrier properties. If the cmpxchg fails, free the table that was just allocated.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
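The tail of nested_table_alloc() then becomes, roughly:

    if (cmpxchg((union nested_table **)prev, NULL, ntbl) == NULL)
        return ntbl;    /* we won the race; barrier matches rcu_assign_pointer() */
    /* Another thread installed a table first: free ours, use theirs. */
    kfree(ntbl);
    return rcu_dereference(*prev);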
Revision tags: v5.1-rc3, v5.1-rc2
# f7ad68bf | 21-Mar-2019 | NeilBrown <[email protected]>
rhashtable: rename rht_for_each*continue as *from.
The pattern set by list.h is that for_each..continue() iterators start at the next entry after the given one, while for_each..from() iterators start at the given entry.
The rht_for_each*continue() iterators are documented as though they start at the 'next' entry, but they actually start at the given entry, and they are used expecting that behaviour. So fix the documentation and change the names to *from for consistency with list.h.
Acked-by: Herbert Xu <[email protected]>
Acked-by: Miguel Ojeda <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
# 4feb7c7a | 21-Mar-2019 | NeilBrown <[email protected]>
rhashtable: don't hold lock on first table throughout insertion.
rhashtable_try_insert() currently holds a lock on the bucket in the first table, while also locking buckets in subsequent tables. This is unnecessary and looks like a hold-over from some earlier version of the implementation.
As insert and remove always lock a bucket in each table in turn, and as insert only inserts in the final table, there cannot be any races that are not covered by simply locking a bucket in each table in turn.
When an insert call reaches that last table it can be sure that there is no matching entry in any other table as it has searched them all, and insertion never happens anywhere but in the last table. The fact that code tests for the existence of future_tbl while holding a lock on the relevant bucket ensures that two threads inserting the same key will make compatible decisions about which is the "last" table.
This simplifies the code and allows the ->rehash field to be discarded.
We still need a way to ensure that a dead bucket_table is never re-linked by rhashtable_walk_stop(). This can be achieved by calling call_rcu() inside the locked region, and checking with rcu_head_after_call_rcu() in rhashtable_walk_stop() to see if the bucket table is empty and dead.
Acked-by: Herbert Xu <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
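In rhashtable_walk_stop() the check looks roughly like this:

    if (rcu_head_after_call_rcu(&tbl->rcu, bucket_table_free_rcu))
        /* Table already handed to call_rcu(): dead, don't re-link. */
        iter->walker.tbl = NULL;
    else
        list_add(&iter->walker.list, &tbl->walkers);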
# 408f13ef | 21-Mar-2019 | Herbert Xu <[email protected]>
rhashtable: Still do rehash when we get EEXIST
As it stands if a shrink is delayed because of an outstanding rehash, we will go into a rescheduling loop without ever doing the rehash.
This patch fixes this by still carrying out the rehash and then rescheduling so that we can shrink after the completion of the rehash should it still be necessary.
The return value of EEXIST captures this case and other cases (e.g., another thread expanded/rehashed the table at the same time) where we should still proceed with the rehash.
Fixes: da20420f83ea ("rhashtable: Add nested tables")
Reported-by: Josh Elsasser <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Tested-by: Josh Elsasser <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Revision tags: v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7
# 6c4128f6 | 14-Feb-2019 | Herbert Xu <[email protected]>
rhashtable: Remove obsolete rhashtable_walk_init function
The rhashtable_walk_init function has been obsolete for more than two years. This patch finally converts its last users over to rhashtable_walk_enter and removes it.
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
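For reference, a hypothetical caller of the surviving API ('ht' and 'struct my_obj' are illustrative); unlike rhashtable_walk_init(), rhashtable_walk_enter() cannot fail:

    struct rhashtable_iter iter;
    struct my_obj *obj;

    rhashtable_walk_enter(&ht, &iter);
    rhashtable_walk_start(&iter);
    while ((obj = rhashtable_walk_next(&iter)) != NULL) {
        if (IS_ERR(obj)) {
            if (PTR_ERR(obj) == -EAGAIN)
                continue;    /* a resize interrupted the walk */
            break;
        }
        /* process obj */
    }
    rhashtable_walk_stop(&iter);
    rhashtable_walk_exit(&iter);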