History log of /linux-6.15/lib/rhashtable.c (Results 1 – 25 of 168)
Revision    Date    Author    Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13
# 9d4f8e54 14-Jan-2025 Herbert Xu <[email protected]>

rhashtable: Fix rhashtable_try_insert test

The test on whether rhashtable_insert_one did an insertion relies
on the value returned by rhashtable_lookup_one. Unfortunately that
value is overwritten after rhashtable_insert_one returns. Fix this
by moving the test before data gets overwritten.

Simplify the test as only data == NULL matters.

Finally move atomic_inc back within the lock as otherwise it may
be reordered with the atomic_dec on the removal side, potentially
leading to an underflow.

Reported-by: Michael Kelley <[email protected]>
Fixes: e1d3422c95f0 ("rhashtable: Fix potential deadlock by moving schedule_work outside lock")
Signed-off-by: Herbert Xu <[email protected]>
Tested-by: Michael Kelley <[email protected]>
Reviewed-by: Breno Leitao <[email protected]>
Tested-by: Mikhail Zaslonko <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
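
A minimal generic sketch of the two hazards being fixed; the helper names
below (lookup, insert, did_insert, unlock) are hypothetical and this is not
the actual rhashtable_try_insert() code:

    data = lookup(bkt, key);
    inserted = did_insert(data);     /* evaluate the test first ...           */
    data = insert(bkt, key, obj);    /* ... because this call overwrites data */
    if (inserted)
        atomic_inc(&ht->nelems);     /* still inside the locked region, so it
                                      * cannot be reordered against the
                                      * removal path's atomic_dec()           */
    unlock(bkt);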



Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1
# f3a6101b 23-Nov-2024 Pratyush Mittal <[email protected]>

lib/rhashtable: fix the typo for preemptible

Fix the spelling of the misspelled word.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Pratyush Mittal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>



# e1d3422c 28-Nov-2024 Breno Leitao <[email protected]>

rhashtable: Fix potential deadlock by moving schedule_work outside lock

Move the hash table growth check and work scheduling outside the
rht lock to prevent a possible circular locking dependency.

The original implementation could trigger a lockdep warning due to
a potential deadlock scenario involving nested locks between
rhashtable bucket, rq lock, and dsq lock. By relocating the
growth check and work scheduling after releasing the rht lock, we break
this potential deadlock chain.

This change expands the flexibility of rhashtable by removing
restrictive locking that previously limited its use in scheduler
and workqueue contexts.

Note that this calls rht_grow_above_75(), which reads from struct
rhashtable without holding the lock; if this turns out to be a problem,
the check can be moved back inside the lock and the workqueue scheduled
after the lock is released.

Fixes: f0e1a0643a59 ("sched_ext: Implement BPF extensible scheduler class")
Suggested-by: Tejun Heo <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>

Modified so that atomic_inc is also moved outside of the bucket
lock along with the growth above 75% check.

Signed-off-by: Herbert Xu <[email protected]>
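
A condensed sketch of the resulting ordering; rht_grow_above_75(),
schedule_work() and ht->run_work are the real identifiers, the surrounding
insert path is abbreviated:

    /* insertion work completes under the bucket bit-lock ... */
    rht_unlock(tbl, bkt, flags);

    /* ... the growth check and deferred-work scheduling now run unlocked,
     * breaking the bucket-lock -> rq/dsq lock dependency chain */
    if (rht_grow_above_75(ht, tbl))
        schedule_work(&ht->run_work);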



Revision tags: v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3
# a15bec6a 06-Aug-2024 Davidlohr Bueso <[email protected]>

lib/rhashtable: cleanup fallback check in bucket_table_alloc()

Upon allocation failure, the current check with the nofail bits is
unnecessary, and further stands in the way of discouraging direct use of
__GFP_NOFAIL. Remove this and instead determine properly whether a
non-blocking allocation is being done for the nested table case.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Davidlohr Bueso <[email protected]>
Suggested-by: Michal Hocko <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Herbert Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
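
A sketch of the replacement check, assuming it sits in bucket_table_alloc();
gfpflags_allow_blocking() is the standard gfp helper, the rest is condensed:

    tbl = kvzalloc(size, gfp);

    /* fall back to the nested layout only when the caller cannot block */
    if (tbl == NULL && !gfpflags_allow_blocking(gfp))
        tbl = nested_bucket_table_alloc(ht, nbuckets, gfp);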



Revision tags: v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1
# 9e54dd8b 21-Mar-2024 Kent Overstreet <[email protected]>

rhashtable: plumb through alloc tag

This gives better memory allocation profiling results; rhashtable
allocations will be accounted to the code that initialized the rhashtable.

[[email protected]: undo _noprof additions in the documentation]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kent Overstreet <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Tested-by: Kees Cook <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Alex Gaynor <[email protected]>
Cc: Alice Ryhl <[email protected]>
Cc: Andreas Hindborg <[email protected]>
Cc: Benno Lossin <[email protected]>
Cc: "Björn Roy Baron" <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Gary Guo <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wedson Almeida Filho <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>



Revision tags: v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1
# e47877c7 06-Dec-2022 Tejun Heo <[email protected]>

rhashtable: Allow rhashtable to be used from irq-safe contexts

rhashtable currently only does bh-safe synchronization making it impossible
to use from irq-safe contexts. Switch it to use irq-safe synchronization to
remove the restriction.

v2: Update the lock functions to return the ulong flags value and unlock
functions to take the value directly instead of passing around the
pointer. Suggested by Linus.

Signed-off-by: Tejun Heo <[email protected]>
Reviewed-by: David Vernet <[email protected]>
Acked-by: Josh Don <[email protected]>
Acked-by: Hao Luo <[email protected]>
Acked-by: Barret Rhoden <[email protected]>
Cc: Linus Torvalds <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
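
Roughly the v2 shape described above; a simplified sketch of the bucket lock
helper, not the verbatim header code:

    static inline unsigned long rht_lock(struct bucket_table *tbl,
                                         struct rhash_lock_head __rcu **bkt)
    {
        unsigned long flags;

        local_irq_save(flags);                  /* irq-safe instead of bh-safe  */
        bit_spin_lock(0, (unsigned long *)bkt); /* lock bit in the head pointer */
        lock_map_acquire(&tbl->dep_map);
        return flags;                           /* caller hands this to unlock  */
    }
    /* rht_unlock(tbl, bkt, flags) mirrors this, ending in local_irq_restore(flags) */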



Revision tags: v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1
# 9dbbc3b9 08-Jul-2021 Zhen Lei <[email protected]>

lib: fix spelling mistakes

Fix some spelling mistakes in comments:
permanentely ==> permanently
wont ==> won't
remaning ==> remaining
succed ==> succeed
shouldnt ==> shouldn't
alpha-numeric ==> alphanumeric
storeing ==> storing
funtion ==> function
documenation ==> documentation
Determin ==> Determine
intepreted ==> interpreted
ammount ==> amount
obious ==> obvious
interupts ==> interrupts
occured ==> occurred
asssociated ==> associated
taking into acount ==> taking into account
squence ==> sequence
stil ==> still
contiguos ==> contiguous
matchs ==> matches

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Zhen Lei <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>



Revision tags: v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7
# ce9b362b 24-Jul-2020 Herbert Xu <[email protected]>

rhashtable: Restore RCU marking on rhash_lock_head

This patch restores the RCU marking on bucket_table->buckets as
it really does need RCU protection. Its removal had led to a fatal
bug.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>



Revision tags: v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1
# 4a3084aa 03-Jun-2020 Herbert Xu <[email protected]>

rhashtable: Drop raw RCU deref in nested_table_free

This patch replaces some unnecessary uses of rcu_dereference_raw
in the rhashtable code with rcu_dereference_protected.

The top-level nested table entry is only marked as RCU because it
shares the same type as the tree entries underneath it. So it
doesn't need any RCU protection.

We also don't need RCU protection when we're freeing a nested RCU
table because by this stage we've long passed a memory barrier
when anyone could change the nested table.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
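
The substitution described above in general form; the slot variable and the
"1" condition are illustrative of a teardown path where no updaters remain:

    /* rcu_dereference_raw(p):          no justification recorded           */
    /* rcu_dereference_protected(p, c): documents why no RCU read lock is
     *                                  needed on this path                  */
    ntbl = rcu_dereference_protected(*slot, 1);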



Revision tags: v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4
# d2912cb1 04-Jun-2019 Thomas Gleixner <[email protected]>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Enrico Weigelt <[email protected]>
Reviewed-by: Kate Stewart <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>



Revision tags: v5.2-rc3, v5.2-rc2, v5.2-rc1
# e9458a4e 16-May-2019 Herbert Xu <[email protected]>

rhashtable: Fix cmpxchg RCU warnings

As cmpxchg is a non-RCU mechanism it will cause sparse warnings
when we use it for RCU. This patch adds explicit casts to silence
those warnings. This should probably be moved to RCU itself in
future.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>



# ba6306e3 16-May-2019 Herbert Xu <[email protected]>

rhashtable: Remove RCU marking from rhash_lock_head

The opaque type rhash_lock_head should not be marked with __rcu
because it can never be dereferenced. We should apply the RCU
marking when we turn it into a pointer which can be dereferenced.

This patch does exactly that. This fixes a number of sparse
warnings as well as getting rid of some unnecessary RCU checking.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>



Revision tags: v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5
# ca0b709d 12-Apr-2019 NeilBrown <[email protected]>

rhashtable: use BIT(0) for locking.

As reported by Guenter Roeck, the new bit-locking using
BIT(1) doesn't work on the m68k architecture. m68k only requires
2-byte alignment for words and longwords, so there is only one
unused bit in pointers to structs - we currently use two, one for the
NULLS marker at the end of the linked list, and one for the bit-lock
in the head of the list.

The two uses don't need to conflict as we never need the head of the
list to be a NULLS marker - the marker is only needed to check if an
object has moved to a different table, and the bucket head cannot
move. The NULLS marker is only needed in a ->next pointer.

As we already have different types for the bucket head pointer (struct
rhash_lock_head) and the ->next pointers (struct rhash_head), it is
fairly easy to treat the lsb differently in each.

So: Initialize buckets heads to NULL, and use the lsb for locking.
When loading the pointer from the bucket head, if it is NULL (ignoring
the lock bit), report it as being the expected NULLS marker.
When storing a value into a bucket head, if it is a NULLS marker,
store NULL instead.

And convert all places that used bit 1 for locking, to use bit 0.

Fixes: 8f0db018006a ("rhashtable: use bit_spin_locks to protect hash bucket.")
Reported-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
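
A sketch of the load-side translation described above; the helper name is
illustrative, while RHT_NULLS_MARKER() and BIT(0) are the real identifiers:

    static inline struct rhash_head *head_from_bucket(
            struct rhash_lock_head *p, struct rhash_lock_head __rcu *const *bkt)
    {
        unsigned long v = (unsigned long)p & ~BIT(0);   /* strip the lock bit */

        /* an empty bucket (NULL head) is reported as the expected NULLS marker */
        return (struct rhash_head *)(v ?: (unsigned long)RHT_NULLS_MARKER(bkt));
    }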



# f4712b46 12-Apr-2019 NeilBrown <[email protected]>

rhashtable: replace rht_ptr_locked() with rht_assign_locked()

The only times rht_ptr_locked() is used, it is to store a new
value in a bucket-head. This is the only time it makes sense
to use it too. So replace it by a function which does the
whole task: Sets the lock bit and assigns to a bucket head.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
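
Roughly what the replacement helper does, per the description; a simplified
sketch (at the time of this patch the lock still lived in BIT(1)):

    static inline void rht_assign_locked(struct rhash_lock_head __rcu **bkt,
                                         struct rhash_head *obj)
    {
        /* store the new head with the lock bit kept set; the caller still
         * holds the bit lock and releases it separately */
        rcu_assign_pointer(*(struct rhash_head __rcu **)bkt,
                           (void *)((unsigned long)obj | BIT(1)));
    }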



# adc6a3ab 12-Apr-2019 NeilBrown <[email protected]>

rhashtable: move dereference inside rht_ptr()

Rather than dereferencing a pointer to a bucket and then passing the
result to rht_ptr(), we now pass in the pointer and do the dereference
in rht_ptr().

This requires that we pass in the tbl and hash as well to support RCU
checks, and means that the various rht_for_each functions can expect a
pointer that can be dereferenced without further care.

There are two places where we dereference a bucket pointer
where there is no testable protection - in each case we know
that we must have exclusive access without having taken a lock.
The previous code used rht_dereference() to pretend that holding
the mutex provided protection, but holding the mutex never provides
protection for accessing buckets.

So instead introduce rht_ptr_exclusive() that can be used when
there is known to be exclusive access without holding any locks.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
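
Approximate shapes of the two accessors described above; simplified
signatures, not the verbatim header declarations:

    /* dereferences the bucket itself; tbl and hash feed the RCU checks */
    struct rhash_head *rht_ptr(struct rhash_lock_head __rcu *const *bkt,
                               struct bucket_table *tbl, unsigned int hash);

    /* for paths with known exclusive access and no lock held, no RCU check */
    struct rhash_head *rht_ptr_exclusive(struct rhash_lock_head __rcu *const *bkt);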



# e4edbe3c 12-Apr-2019 NeilBrown <[email protected]>

rhashtable: fix some __rcu annotation errors

With these annotations, the rhashtable now gets no
warnings when compiled with "C=1" for sparse checking.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>



# c252aa3e 11-Apr-2019 Gustavo A. R. Silva <[email protected]>

rhashtable: use struct_size() in kvzalloc()

One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along with
memory for some number of elements for that array. For example:

    struct foo {
        int stuff;
        struct boo entry[];
    };

    size = sizeof(struct foo) + count * sizeof(struct boo);
    instance = kvzalloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

    instance = kvzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: David S. Miller <[email protected]>



Revision tags: v5.1-rc4
# 149212f0 01-Apr-2019 NeilBrown <[email protected]>

rhashtable: add lockdep tracking to bucket bit-spin-locks.

Native bit_spin_locks are not tracked by lockdep.

The bit_spin_locks used for rhashtable buckets are local
to the rhashtable implementation, so there is little opportunity
for the sort of misuse that lockdep might detect.
However locks are held while a hash function or compare
function is called, and if one of these took a lock,
a misbehaviour is possible.

As it is quite easy to add lockdep support this unlikely
possibility seems to be enough justification.

So create a lockdep class for bucket bit_spin_lock and attach
through a lockdep_map in each bucket_table.

Without the 'nested' annotation in rhashtable_rehash_one(), lockdep
correctly reports a possible problem as this lock is taken
while another bucket lock (in another table) is held. This
confirms that the added support works.
With the correct nested annotation in place, lockdep reports
no problems.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
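
A sketch of how the class is attached, assuming it happens when the
bucket_table is allocated; lockdep_init_map() and SINGLE_DEPTH_NESTING are
the real lockdep identifiers:

    static struct lock_class_key __key;    /* one class shared by all bucket locks */

    lockdep_init_map(&tbl->dep_map, "rhashtable_bucket", &__key, 0);

    /* in rhashtable_rehash_one() the new table's bucket is taken with the
     * nested annotation so lockdep accepts holding two bucket locks at once */
    rht_lock_nested(new_tbl, &new_tbl->buckets[new_hash], SINGLE_DEPTH_NESTING);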



# 8f0db018 01-Apr-2019 NeilBrown <[email protected]>

rhashtable: use bit_spin_locks to protect hash bucket.

This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the
bucket pointer to lock the hash chain for that bucket.

The benefits of a bit spin_lock are:
- no need to allocate a separate array of locks.
- no need to have a configuration option to guide the
choice of the size of this array
- locking cost is often a single test-and-set in a cache line
that will have to be loaded anyway. When inserting at, or removing
from, the head of the chain, the unlock is free - writing the new
address in the bucket head implicitly clears the lock bit.
For __rhashtable_insert_fast() we ensure this always happens
when adding a new key.
- even when locking costs 2 updates (lock and unlock), they are
in a cacheline that needs to be read anyway.

The cost of using a bit spin_lock is a little bit of code complexity,
which I think is quite manageable.

Bit spin_locks are sometimes inappropriate because they are not fair -
if multiple CPUs repeatedly contend for the same lock, one CPU can
easily be starved. This is not a credible situation with rhashtable.
Multiple CPUs may want to repeatedly add or remove objects, but they
will typically do so at different buckets, so they will attempt to
acquire different locks.

As we have more bit-locks than we previously had spinlocks (by at
least a factor of two) we can expect slightly less contention to
go with the slightly better cache behavior and reduced memory
consumption.

To enhance type checking, a new struct is introduced to represent the
pointer plus lock-bit
that is stored in the bucket-table. This is "struct rhash_lock_head"
and is empty. A pointer to this needs to be cast to either an
unsigned lock, or a "struct rhash_head *" to be useful.
Variables of this type are most often called "bkt".

Previously "pprev" would sometimes point to a bucket, and sometimes a
->next pointer in an rhash_head. As these are now different types,
pprev is NULL when it would have pointed to the bucket. In that case,
'bkt' is used, together with the correct locking protocol.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
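
The opaque bucket-head type described above amounts to an empty struct; a
brief sketch of it as introduced here:

    /* never dereferenced directly: cast to unsigned long for the bit lock, or
     * (after masking the lock bit) to struct rhash_head * to walk the chain */
    struct rhash_lock_head {};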



# ff302db9 01-Apr-2019 NeilBrown <[email protected]>

rhashtable: allow rht_bucket_var to return NULL.

Rather than returning a pointer to a static nulls, rht_bucket_var()
now returns NULL if the bucket doesn't exist.
This will make the next patch, which stores a bitlock in the
bucket pointer, somewhat cleaner.

This change involves introducing __rht_bucket_nested() which is
like rht_bucket_nested(), but doesn't provide the static nulls,
and changing rht_bucket_nested() to call this and possibly
provide a static nulls - as is still needed for the non-var case.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>



# 7a41c294 01-Apr-2019 NeilBrown <[email protected]>

rhashtable: use cmpxchg() in nested_table_alloc()

nested_table_alloc() relies on the fact that there is
at most one spinlock allocated for every slot in the top
level nested table, so it is not possible for two threads
to try to allocate the same table at the same time.

This assumption is a little fragile (it is not explicit) and is
unnecessary as cmpxchg() can be used instead.

A future patch will replace the spinlocks by per-bucket bitlocks,
and then we won't be able to protect the slot pointer with a spinlock.

So replace rcu_assign_pointer() with cmpxchg() - which has equivalent
barrier properties.
If the cmpxchg fails, free the table that was just allocated.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
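
A sketch of the allocate-then-cmpxchg pattern described; the slot and field
names are illustrative:

    new = kzalloc(sizeof(*new), GFP_ATOMIC);
    if (!new)
        return NULL;

    /* publish only if the slot is still empty; cmpxchg provides the same
     * ordering guarantees rcu_assign_pointer() gave here */
    if (cmpxchg(&slot->table, NULL, new) != NULL) {
        kfree(new);                 /* lost the race: use the winner's table */
        new = slot->table;
    }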



Revision tags: v5.1-rc3, v5.1-rc2
# f7ad68bf 21-Mar-2019 NeilBrown <[email protected]>

rhashtable: rename rht_for_each*continue as *from.

The pattern set by list.h is that for_each..continue()
iterators start at the next entry after the given one,
while for_each..from() iterators start at the given
entry.

The rht_for_each*continue() iterators are documented as though they
start at the 'next' entry, but actually start at the given entry,
and they are used expecting that behaviour.
So fix the documentation and change the names to *from for consistency
with list.h

Acked-by: Herbert Xu <[email protected]>
Acked-by: Miguel Ojeda <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
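
For reference, the list.h convention being matched (these are the existing
list.h iterators; process() is a placeholder):

    list_for_each_entry_from(pos, head, list)       /* visits pos, then its successors */
        process(pos);
    list_for_each_entry_continue(pos, head, list)   /* skips pos, starts at pos->next  */
        process(pos);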



# 4feb7c7a 21-Mar-2019 NeilBrown <[email protected]>

rhashtable: don't hold lock on first table throughout insertion.

rhashtable_try_insert() currently holds a lock on the bucket in
the first table, while also locking buckets in subsequent tables.
This is unnecessary and looks like a hold-over from some earlier
version of the implementation.

As insert and remove always lock a bucket in each table in turn, and
as insert only inserts in the final table, there cannot be any races
that are not covered by simply locking a bucket in each table in turn.

When an insert call reaches that last table it can be sure that there
is no matching entry in any other table as it has searched them all, and
insertion never happens anywhere but in the last table. The fact that
code tests for the existence of future_tbl while holding a lock on
the relevant bucket ensures that two threads inserting the same key
will make compatible decisions about which is the "last" table.

This simplifies the code and allows the ->rehash field to be
discarded.

We still need a way to ensure that a dead bucket_table is never
re-linked by rhashtable_walk_stop(). This can be achieved by calling
call_rcu() inside the locked region, and checking with
rcu_head_after_call_rcu() in rhashtable_walk_stop() to see if the
bucket table is empty and dead.

Acked-by: Herbert Xu <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
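
A condensed sketch of the rhashtable_walk_stop() check described above;
rcu_head_after_call_rcu() and bucket_table_free_rcu are the real names, the
surrounding function is abbreviated:

    /* only re-link the walker if the table has not already been handed
     * to call_rcu() for freeing */
    if (rcu_head_after_call_rcu(&tbl->rcu, bucket_table_free_rcu))
        iter->walker.tbl = NULL;
    else
        list_add(&iter->walker.list, &tbl->walkers);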



# 408f13ef 21-Mar-2019 Herbert Xu <[email protected]>

rhashtable: Still do rehash when we get EEXIST

As it stands if a shrink is delayed because of an outstanding
rehash, we will go into a rescheduling loop without ever doing
the rehash.

This patch fixes this by still carrying out the rehash and then
rescheduling so that we can shrink after the completion of the
rehash should it still be necessary.

The return value of EEXIST captures this case and other cases
(e.g., another thread expanded/rehashed the table at the same
time) where we should still proceed with the rehash.

Fixes: da20420f83ea ("rhashtable: Add nested tables")
Reported-by: Josh Elsasser <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Tested-by: Josh Elsasser <[email protected]>
Signed-off-by: David S. Miller <[email protected]>



Revision tags: v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7
# 6c4128f6 14-Feb-2019 Herbert Xu <[email protected]>

rhashtable: Remove obsolete rhashtable_walk_init function

The rhashtable_walk_init function has been obsolete for more than
two years. This patch finally converts its last users over to
rhashtable_walk_enter and removes it.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
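
The conversion pattern involved, per the two APIs named above (error
handling and locking context elided):

    /* old, returned an error and took a gfp_t: */
    err = rhashtable_walk_init(ht, &iter, GFP_KERNEL);

    /* new: */
    rhashtable_walk_enter(ht, &iter);
    rhashtable_walk_start(&iter);
    while ((obj = rhashtable_walk_next(&iter)) != NULL) {
        if (IS_ERR(obj))
            continue;               /* -EAGAIN: the table was resized under us */
        /* ... use obj ... */
    }
    rhashtable_walk_stop(&iter);
    rhashtable_walk_exit(&iter);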
