History log of /linux-6.15/include/linux/find.h (Results 1 – 23 of 23)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1
# fda1dd3c 19-Jul-2024 Yury Norov <[email protected]>

find: Switch from inline to __always_inline

'inline' keyword is only a recommendation for compiler. If it decides to
not inline find_bit nodemask functions, the whole small_const_nbits()
machinery d

find: Switch from inline to __always_inline

'inline' keyword is only a recommendation for compiler. If it decides to
not inline find_bit nodemask functions, the whole small_const_nbits()
machinery doesn't work.

This is how a standard GCC 11.3.0 does for my x86_64 build now. This patch
replaces 'inline' directive with unconditional '__always_inline' to make
sure that there's always a chance for compile-time optimization. It doesn't
change size of kernel image, according to bloat-o-meter.

[[ Brian: split out from:
Subject: [PATCH 1/3] bitmap: switch from inline to __always_inline
https://lore.kernel.org/all/[email protected]/
But rewritten, as there were too many conflicts. ]]

Co-developed-by: Brian Norris <[email protected]>
Signed-off-by: Brian Norris <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Yury Norov <[email protected]>

show more ...


Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7
# 0b2811ba 02-May-2024 Yury Norov <[email protected]>

bitmap: relax find_nth_bit() limitation on return value

The function claims to return the bitmap size, if Nth bit doesn't exist.
This rule is violated in inline case because the fns() that is used
t

bitmap: relax find_nth_bit() limitation on return value

The function claims to return the bitmap size, if Nth bit doesn't exist.
This rule is violated in inline case because the fns() that is used
there doesn't know anything about size of the bitmap.

So, relax this requirement to '>= size', and make the outline
implementation a bit cheaper.

All in-tree kernel users of find_nth_bit() are safe against that.

Reported-by: Rasmus Villemoes <[email protected]>
Closes: https://lore.kernel.org/all/Zi50cAgR8nZvgLa3@yury-ThinkPad/T/#m6da806a0525e74dcc91f35e5f20766ed4e853e8a
Signed-off-by: Yury Norov <[email protected]>

show more ...


Revision tags: v6.9-rc6, v6.9-rc5
# cdc66553 16-Apr-2024 Dawei Li <[email protected]>

cpumask: Introduce cpumask_first_and_and()

Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
works

cpumask: Introduce cpumask_first_and_and()

Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
works in-place with all inputs and produces desired output directly.

Signed-off-by: Dawei Li <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Yury Norov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


Revision tags: v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6
# 27c82f14 28-Oct-2023 Yury Norov <[email protected]>

lib/find: optimize find_*_bit_wrap

When an offset is 0, there's no need to search a bitmap from the
beginning after the 1st search failed, because each bit has already
been tested.

Signed-off-by: Y

lib/find: optimize find_*_bit_wrap

When an offset is 0, there's no need to search a bitmap from the
beginning after the 1st search failed, because each bit has already
been tested.

Signed-off-by: Yury Norov <[email protected]>

show more ...


# 92697139 27-Nov-2023 Guanjun <[email protected]>

lib/find_bit: Fix the code comments about find_next_bit_wrap

The function find_next_bit_wrap only has one memory region
to search on. Adjust the comments.

Signed-off-by: Guanjun <[email protected]

lib/find_bit: Fix the code comments about find_next_bit_wrap

The function find_next_bit_wrap only has one memory region
to search on. Adjust the comments.

Signed-off-by: Guanjun <[email protected]>
Signed-off-by: Yury Norov <[email protected]>

show more ...


Revision tags: v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3
# 1470afef 16-Mar-2023 Dave Chinner <[email protected]>

cpumask: introduce for_each_cpu_or

Equivalent of for_each_cpu_and, except it ORs the two masks together
so it iterates all the CPUs present in either mask.

Signed-off-by: Dave Chinner <dchinner@red

cpumask: introduce for_each_cpu_or

Equivalent of for_each_cpu_and, except it ORs the two masks together
so it iterates all the CPUs present in either mask.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

show more ...


Revision tags: v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5
# 43245117 21-Jan-2023 Yury Norov <[email protected]>

lib/find: introduce find_nth_and_andnot_bit

In the following patches the function is used to implement in-place bitmaps
traversing without storing intermediate result in temporary bitmaps.

Signed-o

lib/find: introduce find_nth_and_andnot_bit

In the following patches the function is used to implement in-place bitmaps
traversing without storing intermediate result in temporary bitmaps.

Signed-off-by: Yury Norov <[email protected]>
Acked-by: Tariq Toukan <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Reviewed-by: Peter Lafreniere <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1
# 5f75ff29 03-Oct-2022 Valentin Schneider <[email protected]>

cpumask: Introduce for_each_cpu_andnot()

for_each_cpu_and() is very convenient as it saves having to allocate a
temporary cpumask to store the result of cpumask_and(). The same issue
applies to cpum

cpumask: Introduce for_each_cpu_andnot()

for_each_cpu_and() is very convenient as it saves having to allocate a
temporary cpumask to store the result of cpumask_and(). The same issue
applies to cpumask_andnot() which doesn't actually need temporary storage
for iteration purposes.

Following what has been done for for_each_cpu_and(), introduce
for_each_cpu_andnot().

Signed-off-by: Valentin Schneider <[email protected]>

show more ...


# 90d48290 03-Oct-2022 Valentin Schneider <[email protected]>

lib/find_bit: Introduce find_next_andnot_bit()

In preparation of introducing for_each_cpu_andnot(), add a variant of
find_next_bit() that negate the bits in @addr2 when ANDing them with the
bits in

lib/find_bit: Introduce find_next_andnot_bit()

In preparation of introducing for_each_cpu_andnot(), add a variant of
find_next_bit() that negate the bits in @addr2 when ANDing them with the
bits in @addr1.

Signed-off-by: Valentin Schneider <[email protected]>

show more ...


Revision tags: v6.0, v6.0-rc7
# fdae96a3 19-Sep-2022 Yury Norov <[email protected]>

lib/find: optimize for_each() macros

Moving an iterator of the macros inside conditional part of for-loop
helps to generate a better code. It had been first implemented in commit
7baac8b91f9871ba ("

lib/find: optimize for_each() macros

Moving an iterator of the macros inside conditional part of for-loop
helps to generate a better code. It had been first implemented in commit
7baac8b91f9871ba ("cpumask: make for_each_cpu_mask a bit smaller").

Now that cpumask for-loops are the aliases to bitmap loops, it's worth
to optimize them the same way.

Bloat-o-meter says:
add/remove: 8/12 grow/shrink: 147/592 up/down: 4876/-24416 (-19540)

Signed-off-by: Yury Norov <[email protected]>

show more ...


# 4fe49b3b 19-Sep-2022 Yury Norov <[email protected]>

lib/bitmap: introduce for_each_set_bit_wrap() macro

Add for_each_set_bit_wrap() macro and use it in for_each_cpu_wrap(). The
new macro is based on __for_each_wrap() iterator, which is simpler and
sm

lib/bitmap: introduce for_each_set_bit_wrap() macro

Add for_each_set_bit_wrap() macro and use it in for_each_cpu_wrap(). The
new macro is based on __for_each_wrap() iterator, which is simpler and
smaller than cpumask_next_wrap().

Signed-off-by: Yury Norov <[email protected]>

show more ...


# 6cc18331 19-Sep-2022 Yury Norov <[email protected]>

lib/find_bit: add find_next{,_and}_bit_wrap

The helper is better optimized for the worst case: in case of empty
cpumask, current code traverses 2 * size:

next = cpumask_next_and(prev, src1p, src2

lib/find_bit: add find_next{,_and}_bit_wrap

The helper is better optimized for the worst case: in case of empty
cpumask, current code traverses 2 * size:

next = cpumask_next_and(prev, src1p, src2p);
if (next >= nr_cpu_ids)
next = cpumask_first_and(src1p, src2p);

At bitmap level we can stop earlier after checking 'size + offset' bits.

Signed-off-by: Yury Norov <[email protected]>

show more ...


# 33e67710 19-Sep-2022 Yury Norov <[email protected]>

cpumask: switch for_each_cpu{,_not} to use for_each_bit()

The difference between for_each_cpu() and for_each_set_bit()
is that the latter uses cpumask_next() instead of find_next_bit(),
and so calls

cpumask: switch for_each_cpu{,_not} to use for_each_bit()

The difference between for_each_cpu() and for_each_set_bit()
is that the latter uses cpumask_next() instead of find_next_bit(),
and so calls cpumask_check().

This check is useless because the iterator value is not provided by
user. It generates false-positives for the very last iteration
of for_each_cpu().

Signed-off-by: Yury Norov <[email protected]>

show more ...


Revision tags: v6.0-rc6
# 3cea8d47 18-Sep-2022 Yury Norov <[email protected]>

lib: add find_nth{,_and,_andnot}_bit()

Kernel lacks for a function that searches for Nth bit in a bitmap.
Usually people do it like this:
for_each_set_bit(bit, mask, size)
if (n-- == 0)
return

lib: add find_nth{,_and,_andnot}_bit()

Kernel lacks for a function that searches for Nth bit in a bitmap.
Usually people do it like this:
for_each_set_bit(bit, mask, size)
if (n-- == 0)
return bit;

We can do it more efficiently, if we:
1. find a word containing Nth bit, using hweight(); and
2. find the bit, using a helper fns(), that works similarly to
__ffs() and ffz().

fns() is implemented as a simple loop. For x86_64, there's PDEP instruction
to do that: ret = clz(pdep(1 << idx, num)). However, for large bitmaps the
most of improvement comes from using hweight(), so I kept fns() simple.

New find_nth_bit() is ~70 times faster on x86_64/kvm in find_bit benchmark:
find_nth_bit: 7154190 ns, 16411 iterations
for_each_bit: 505493126 ns, 16315 iterations

With all that, a family of 3 new functions is added, and used where
appropriate in the following patches.

Signed-off-by: Yury Norov <[email protected]>

show more ...


# e79864f3 15-Sep-2022 Yury Norov <[email protected]>

lib/find_bit: optimize find_next_bit() functions

Over the past couple years, the function _find_next_bit() was extended
with parameters that modify its behavior to implement and- zero- and le-
flavo

lib/find_bit: optimize find_next_bit() functions

Over the past couple years, the function _find_next_bit() was extended
with parameters that modify its behavior to implement and- zero- and le-
flavors. The parameters are passed at compile time, but current design
prevents a compiler from optimizing out the conditionals.

As find_next_bit() API grows, I expect that more parameters will be added.
Current design would require more conditional code in _find_next_bit(),
which would bloat the helper even more and make it barely readable.

This patch replaces _find_next_bit() with a macro FIND_NEXT_BIT, and adds
a set of wrappers, so that the compile-time optimizations become possible.

The common logic is moved to the new macro, and all flavors may be
generated by providing a FETCH macro parameter, like in this example:

#define FIND_NEXT_BIT(FETCH, MUNGE, size, start) ...

find_next_xornot_and_bit(addr1, addr2, addr3, size, start)
{
return FIND_NEXT_BIT(addr1[idx] ^ ~addr2[idx] & addr3[idx],
/* nop */, size, start);
}

The FETCH may be of any complexity, as soon as it only refers the bitmap(s)
and an iterator idx.

MUNGE is here to support _le code generation for BE builds. May be
empty.

I ran find_bit_benchmark 16 times on top of 6.0-rc2 and 16 times on top
of 6.0-rc2 + this series. The results for kvm/x86_64 are:

v6.0-rc2 Optimized Difference Z-score
Random dense bitmap ns ns ns %
find_next_bit: 787735 670546 117189 14.9 3.97
find_next_zero_bit: 777492 664208 113284 14.6 10.51
find_last_bit: 830925 687573 143352 17.3 2.35
find_first_bit: 3874366 3306635 567731 14.7 1.84
find_first_and_bit: 40677125 37739887 2937238 7.2 1.36
find_next_and_bit: 347865 304456 43409 12.5 1.35

Random sparse bitmap
find_next_bit: 19816 14021 5795 29.2 6.10
find_next_zero_bit: 1318901 1223794 95107 7.2 1.41
find_last_bit: 14573 13514 1059 7.3 6.92
find_first_bit: 1313321 1249024 64297 4.9 1.53
find_first_and_bit: 8921 8098 823 9.2 4.56
find_next_and_bit: 9796 7176 2620 26.7 5.39

Where the statistics is significant (z-score > 3), the improvement
is ~15%.

According to the bloat-o-meter, the Image size is 10-11K less:

x86_64/defconfig:
add/remove: 32/14 grow/shrink: 61/782 up/down: 6344/-16521 (-10177)

arm64/defconfig:
add/remove: 3/2 grow/shrink: 50/714 up/down: 608/-11556 (-10948)

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Yury Norov <[email protected]>

show more ...


# 14a99e13 15-Sep-2022 Yury Norov <[email protected]>

lib/find_bit: create find_first_zero_bit_le()

find_first_zero_bit_le() is an alias to find_next_zero_bit_le(),
despite that 'next' is known to be slower than 'first' version.

Now that we have commo

lib/find_bit: create find_first_zero_bit_le()

find_first_zero_bit_le() is an alias to find_next_zero_bit_le(),
despite that 'next' is known to be slower than 'first' version.

Now that we have common FIND_FIRST_BIT() macro helper, it's trivial
to implement find_first_zero_bit_le() as a real function.

Reviewed-by: Valentin Schneider <[email protected]>
Signed-off-by: Yury Norov <[email protected]>

show more ...


Revision tags: v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3
# 6d7131bd 11-Apr-2022 Anna-Maria Behnsen <[email protected]>

include/linux/find: Fix documentation

The order of the arguments in function documentation doesn't fit the
implementation. Change the documentation so that it corresponds to the
code. This prevent p

include/linux/find: Fix documentation

The order of the arguments in function documentation doesn't fit the
implementation. Change the documentation so that it corresponds to the
code. This prevent people to get confused when reading the documentation.

Signed-off-by: Anna-Maria Behnsen <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Yury Norov <[email protected]>

show more ...


Revision tags: v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6
# ec288a2c 14-Aug-2021 Yury Norov <[email protected]>

bitmap: unify find_bit operations

bitmap_for_each_{set,clear}_region() are similar to for_each_bit()
macros in include/linux/find.h, but interface and implementation
of them are different.

This pat

bitmap: unify find_bit operations

bitmap_for_each_{set,clear}_region() are similar to for_each_bit()
macros in include/linux/find.h, but interface and implementation
of them are different.

This patch adds for_each_bitrange() macros and drops unused
bitmap_*_region() API in sake of unification.

Signed-off-by: Yury Norov <[email protected]>
Tested-by: Wolfram Sang <[email protected]>
Acked-by: Dennis Zhou <[email protected]>
Acked-by: Ulf Hansson <[email protected]> # For MMC

show more ...


# 7516be99 14-Aug-2021 Yury Norov <[email protected]>

find: micro-optimize for_each_{set,clear}_bit()

The macros iterate thru all set/clear bits in a bitmap. They search a
first bit using find_first_bit(), and the rest bits using find_next_bit().

Sinc

find: micro-optimize for_each_{set,clear}_bit()

The macros iterate thru all set/clear bits in a bitmap. They search a
first bit using find_first_bit(), and the rest bits using find_next_bit().

Since find_next_bit() is called shortly after find_first_bit(), we can
save few lines of I-cache by not using find_first_bit().

Signed-off-by: Yury Norov <[email protected]>
Tested-by: Wolfram Sang <[email protected]>

show more ...


# bc9d6635 14-Aug-2021 Yury Norov <[email protected]>

include/linux: move for_each_bit() macros from bitops.h to find.h

for_each_bit() macros depend on find_bit() machinery, and so the
proper place for them is the find.h header.

Signed-off-by: Yury No

include/linux: move for_each_bit() macros from bitops.h to find.h

for_each_bit() macros depend on find_bit() machinery, and so the
proper place for them is the find.h header.

Signed-off-by: Yury Norov <[email protected]>
Tested-by: Wolfram Sang <[email protected]>

show more ...


# f68edc92 14-Aug-2021 Yury Norov <[email protected]>

lib: add find_first_and_bit()

Currently find_first_and_bit() is an alias to find_next_and_bit(). However,
it is widely used in cpumask, so it worth to optimize it. This patch adds
its own implementa

lib: add find_first_and_bit()

Currently find_first_and_bit() is an alias to find_next_and_bit(). However,
it is widely used in cpumask, so it worth to optimize it. This patch adds
its own implementation for find_first_and_bit().

On x86_64 find_bit_benchmark says:

Before (#define find_first_and_bit(...) find_next_and_bit(..., 0):
Start testing find_bit() with random-filled bitmap
[ 140.291468] find_first_and_bit: 46890919 ns, 32671 iterations
Start testing find_bit() with sparse bitmap
[ 140.295028] find_first_and_bit: 7103 ns, 1 iterations

After:
Start testing find_bit() with random-filled bitmap
[ 162.574907] find_first_and_bit: 25045813 ns, 32846 iterations
Start testing find_bit() with sparse bitmap
[ 162.578458] find_first_and_bit: 4900 ns, 1 iterations

(Thanks to Alexey Klimov for thorough testing.)

Signed-off-by: Yury Norov <[email protected]>
Tested-by: Wolfram Sang <[email protected]>
Tested-by: Alexey Klimov <[email protected]>

show more ...


# c126a53c 14-Aug-2021 Yury Norov <[email protected]>

arch: remove GENERIC_FIND_FIRST_BIT entirely

In 5.12 cycle we enabled GENERIC_FIND_FIRST_BIT config option for ARM64
and MIPS. It increased performance and shrunk .text size; and so far
I didn't rec

arch: remove GENERIC_FIND_FIRST_BIT entirely

In 5.12 cycle we enabled GENERIC_FIND_FIRST_BIT config option for ARM64
and MIPS. It increased performance and shrunk .text size; and so far
I didn't receive any negative feedback on the change.

https://lore.kernel.org/linux-arch/[email protected]/

Now I think it's a good time to switch all architectures to use
find_{first,last}_bit() unconditionally, and so remove corresponding
config option.

The patch does't introduce functioal changes for arc, arm, arm64, mips,
m68k, s390 and x86, for other architectures I expect improvement both in
performance and .text size.

Signed-off-by: Yury Norov <[email protected]>
Tested-by: Alexander Lobakin <[email protected]> (mips)
Reviewed-by: Alexander Lobakin <[email protected]> (mips)
Reviewed-by: Andy Shevchenko <[email protected]>
Acked-by: Will Deacon <[email protected]>
Tested-by: Wolfram Sang <[email protected]>

show more ...


# 47d8c156 14-Aug-2021 Yury Norov <[email protected]>

include: move find.h from asm_generic to linux

find_bit API and bitmap API are closely related, but inclusion paths
are different - include/asm-generic and include/linux, correspondingly.
In the pas

include: move find.h from asm_generic to linux

find_bit API and bitmap API are closely related, but inclusion paths
are different - include/asm-generic and include/linux, correspondingly.
In the past it made a lot of troubles due to circular dependencies
and/or undefined symbols. Fix this by moving find.h under include/linux.

Signed-off-by: Yury Norov <[email protected]>
Tested-by: Wolfram Sang <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>

show more ...