|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7 |
|
| #
c320592f |
| 07-Jan-2025 |
Wolfram Sang <[email protected]> |
bitops: add generic parity calculation for u8
There are multiple open-coded implementations for getting the parity of a byte in the kernel, even using different approaches. Take the pretty efficient version from the SPD5118 driver and make it generally available by putting it into the bitops header. As long as there is just one parity calculation helper, the idea of creating a distinct 'parity.h' header was discarded. Likewise, using hweight8() for architectures that have a popcnt instruction is postponed until a use case in a hot path arises. The motivation for this patch is the frequent use of odd parity in the I3C specification and the wish to simplify drivers there.
Changes compared to the original SPD5118 version are the addition of kernel documentation, switching the return type from bool to int, and renaming the argument of the function.
Signed-off-by: Wolfram Sang <[email protected]> Tested-by: Guenter Roeck <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Acked-by: Yury Norov <[email protected]> Reviewed-by: Kuan-Wei Chiu <[email protected]> Tested-by: Kuan-Wei Chiu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
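A sketch of the helper as described above (the merged version in include/linux/bitops.h also carries kernel-doc; details may differ):

static inline int parity8(u8 val)
{
        /*
         * Fold the byte in half, then index a 16-bit lookup constant:
         * bit n of 0x6996 is the parity of the 4-bit value n.
         */
        val ^= val >> 4;
        return (0x6996 >> (val & 0xf)) & 1;
}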
|
|
Revision tags: v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4 |
|
| #
e0eeb938 |
| 11-Jun-2024 |
Dan Carpenter <[email protected]> |
bitops: Add a comment explaining the double underscore macros
Linus Walleij pointed out that a newcomer might be confused about the difference between set_bit() and __set_bit(). Add a comment explaining the difference.
Link: https://lore.kernel.org/all/CACRpkdZFPG_YLici-BmYfk9HZ36f4WavCN3JNotkk8cPgCODCg@mail.gmail.com/ Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Signed-off-by: Yury Norov <[email protected]>
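The distinction the comment documents, illustrated (a hedged sketch, not the comment's literal wording):

static unsigned long flags;     /* word-sized bitmap shared between contexts */

static void bitop_example(void)
{
        set_bit(0, &flags);     /* atomic read-modify-write: safe under concurrency */
        __set_bit(1, &flags);   /* non-atomic, faster: caller must ensure exclusive access */
}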
|
|
Revision tags: v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9 |
|
| #
9f2c2d6b |
| 07-May-2024 |
Andy Shevchenko <[email protected]> |
bitops: Move aligned_byte_mask() to wordpart.h
bitops.h is for bit-related operations. aligned_byte_mask() is about byte (or part-of-machine-word) operations, for which we have a separate header, so move the mentioned macro to wordpart.h to consolidate similar operations.
Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Yury Norov <[email protected]>
|
|
Revision tags: v6.9-rc7 |
|
| #
1c2aa561 |
| 02-May-2024 |
Kuan-Wei Chiu <[email protected]> |
bitops: Optimize fns() for improved performance
The current fns() repeatedly uses __ffs() to find the index of the least significant bit and then clears the corresponding bit using __clear_bit(). The method for clearing the least significant bit can be optimized by using word &= word - 1 instead.
Typically, the execution time of one __ffs() plus one __clear_bit() is longer than that of a bitwise AND operation and a subtraction. To improve performance, the loop for clearing the least significant bit has been replaced with word &= word - 1, followed by a single __ffs() operation to obtain the answer. This change reduces the number of __ffs() iterations from n to just one, enhancing overall performance.
This modification significantly accelerates the fns() function in the test_bitops benchmark, improving its speed by approximately 7.6 times. Additionally, it enhances the performance of find_nth_bit() in the find_bit benchmark by approximately 26%.
Before:
  test_bitops:  fns: 58033164 ns
  find_nth_bit: 4254313 ns, 16525 iterations

After:
  test_bitops:  fns: 7637268 ns
  find_nth_bit: 3362863 ns, 16501 iterations
CC: Andrew Morton <[email protected]> CC: Rasmus Villemoes <[email protected]> Signed-off-by: Kuan-Wei Chiu <[email protected]> Signed-off-by: Yury Norov <[email protected]>
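A minimal sketch of the optimized fns() as described (the upstream version in include/linux/bitops.h carries kernel-doc):

static inline unsigned int fns(unsigned long word, unsigned int n)
{
        /* clear the n least significant set bits ... */
        while (word && n--)
                word &= word - 1;       /* drop the lowest set bit */

        /* ... then a single __ffs() yields the answer */
        return word ? __ffs(word) : BITS_PER_LONG;
}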
|
|
Revision tags: v6.9-rc6, v6.9-rc5 |
|
| #
9c313ccd |
| 20-Apr-2024 |
Thorsten Blum <[email protected]> |
bitops: Change function return types from long to int
Change the return types of bitops functions (ffs, fls, and fns) from long to int. The expected return values are in the range [0, 64], for which int is sufficient.
Additionally, int aligns well with the return types of the corresponding __builtin_* functions, potentially reducing overall type conversions.
Many of the existing bitops functions already return an int and don't need to be changed. The bitops functions in arch/ should be considered separately.
Adjust some return variables to match the function return types.
With GCC 13 and defconfig, these changes reduced the size of a test kernel image by 5,432 bytes on arm64 and by 248 bytes on riscv; there were no changes in size on x86_64, powerpc, or m68k.
Signed-off-by: Thorsten Blum <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
|
|
Revision tags: v6.9-rc4, v6.9-rc3, v6.9-rc2 |
|
| #
5259401e |
| 27-Mar-2024 |
Alexander Lobakin <[email protected]> |
bitops: let the compiler optimize {__,}assign_bit()
Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops on compile-time constants"), the compilers have been able to expand inline bitmap operations to compile-time initializers when possible. However, during a round of replacing the if-__set-else-__clear pattern with __assign_bit() as per Andy's advice, bloat-o-meter showed a +1024 byte difference in object code size for one module (even for one function), where the pattern:
DECLARE_BITMAP(foo) = { }; // on the stack, zeroed
if (a)
        __set_bit(const_bit_num, foo);
if (b)
        __set_bit(another_const_bit_num, foo);
...
is heavily used, although there should be no difference: the bitmap is zeroed, so the second half of __assign_bit() should be compiled out as a no-op. I either missed the fact that __assign_bit() has the bitmap pointer marked as `volatile` (as we usually do for bitops) or was hoping that the compilers would at least try to look past the `volatile` for __always_inline functions. Anyhow, due to that attribute, the compilers were always compiling the whole expression and none of the mentioned compile-time optimizations were taking effect.
Convert __assign_bit() to a macro, since it's a very simple if-else and all of the checks are performed inside __set_bit() and __clear_bit(); thus the wrapper has to be as transparent as possible. After that change, despite it showing only a -20 byte change for vmlinux (because it's still relatively unpopular), no drastic code size changes happen when replacing if-set-else-clear for on-stack bitmaps with __assign_bit(), meaning the compiler now expands them to the actual operations with all the expected optimizations.
The atomic assign_bit() is less affected due to its nature, but let's convert it to a macro as well to keep the code consistent and not leave room for possible suboptimal codegen. Moreover, with certain kernel configurations it actually gives some savings (x86):
do_ip_setsockopt 4154 4099 -55
Suggested-by: Yury Norov <[email protected]> # assign_bit(), too Cc: Andy Shevchenko <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Acked-by: Yury Norov <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
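The converted macros, roughly as described (a sketch; see include/linux/bitops.h for the exact definitions):

#define __assign_bit(nr, addr, value)                                   \
        ((value) ? __set_bit((nr), (addr)) : __clear_bit((nr), (addr)))

#define assign_bit(nr, addr, value)                                     \
        ((value) ? set_bit((nr), (addr)) : clear_bit((nr), (addr)))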
|
| #
7d8296b2 |
| 27-Mar-2024 |
Alexander Lobakin <[email protected]> |
bitops: make BYTES_TO_BITS() treewide-available
Avoid open-coding that simple expression each time by moving BYTES_TO_BITS() from the probes code to <linux/bitops.h> to export it to the rest of the kernel. Simplify the macro while at it: `BITS_PER_LONG / sizeof(long)` always equals %BITS_PER_BYTE, regardless of the target architecture. Do the same for the tools ecosystem as well (incl. its version of bitops.h). The previous implementation's implicit type was long, while the new one's is int, so adjust the format literal accordingly in the perf code.
Suggested-by: Andy Shevchenko <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Acked-by: Yury Norov <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
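The simplified macro boils down to this (a sketch consistent with the description above):

#define BYTES_TO_BITS(nb)       ((nb) * BITS_PER_BYTE)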
|
| #
72cc1980 |
| 27-Mar-2024 |
Alexander Lobakin <[email protected]> |
bitops: add missing prototype check
Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added a new bitop, test_bit_acquire(), with proper wrapping in order to try to optimize it at compile time, but missed the list of bitops used for checking their prototypes a bit below. The added functions have consistent prototypes, so no further changes are required and no functional changes take place.
Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier") Reviewed-by: Przemek Kitszel <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
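A sketch of the check list in question (modeled on __check_bitop_pr() in include/linux/bitops.h; exact names may differ):

#define __check_bitop_pr(name)                                          \
        static_assert(__same_type(arch_##name, generic_##name) &&      \
                      __same_type(const_##name, generic_##name) &&     \
                      __same_type(_##name, generic_##name))

__check_bitop_pr(__set_bit);
/* ... */
__check_bitop_pr(test_bit);
__check_bitop_pr(test_bit_acquire);     /* the entry this commit adds */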
|
|
Revision tags: v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6 |
|
| #
3cea8d47 |
| 18-Sep-2022 |
Yury Norov <[email protected]> |
lib: add find_nth{,_and,_andnot}_bit()
The kernel lacks a function that searches for the Nth bit in a bitmap. Usually people do it like this:

for_each_set_bit(bit, mask, size)
        if (n-- == 0)
                return bit;

We can do it more efficiently if we: 1. find the word containing the Nth bit, using hweight(); and 2. find the bit, using a helper fns() that works similarly to __ffs() and ffz().
fns() is implemented as a simple loop. On x86_64, there's the PDEP instruction to do that: ret = clz(pdep(1 << idx, num)). However, for large bitmaps most of the improvement comes from using hweight(), so I kept fns() simple.
The new find_nth_bit() is ~70 times faster on x86_64/kvm in the find_bit benchmark:

find_nth_bit: 7154190 ns, 16411 iterations
for_each_bit: 505493126 ns, 16315 iterations
With all that, a family of 3 new functions is added, and used where appropriate in the following patches.
Signed-off-by: Yury Norov <[email protected]>
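A simplified sketch of the word-at-a-time scheme described above (the real implementation lives in lib/find_bit.c, masks the tail word, and covers the _and/_andnot variants):

static unsigned long find_nth_bit_sketch(const unsigned long *addr,
                                         unsigned long size, unsigned long n)
{
        unsigned long idx, w;

        for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
                w = hweight_long(addr[idx]);    /* 1. count set bits per word */
                if (n < w)                      /* 2. the Nth bit is in this word */
                        return idx * BITS_PER_LONG + fns(addr[idx], n);
                n -= w;
        }

        return size;    /* fewer than n+1 bits set */
}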
|
|
Revision tags: v6.0-rc5, v6.0-rc4, v6.0-rc3 |
|
| #
e1d7c760 |
| 22-Aug-2022 |
Uros Bizjak <[email protected]> |
bitops: use try_cmpxchg in set_mask_bits and bit_clear_unless
Use try_cmpxchg instead of cmpxchg(*ptr, old, new) == old in set_mask_bits and bit_clear_unless. The x86 CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after cmpxchg (and the related move instruction in front of cmpxchg).
Also, try_cmpxchg implicitly assigns the old *ptr value to "old" when cmpxchg fails, enabling further code simplifications.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Uros Bizjak <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
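The shape of the change, sketched on set_mask_bits()-style code (names abbreviated; an illustration, not the literal diff):

/* before: a separate compare of the cmpxchg() return value */
do {
        old = READ_ONCE(*ptr);
        new = (old & ~mask) | bits;
} while (cmpxchg(ptr, old, new) != old);

/* after: try_cmpxchg() tests ZF directly and refreshes 'old' on failure */
old = READ_ONCE(*ptr);
do {
        new = (old & ~mask) | bits;
} while (!try_cmpxchg(ptr, &old, new));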
|
| #
8238b457 |
| 26-Aug-2022 |
Mikulas Patocka <[email protected]> |
wait_on_bit: add an acquire memory barrier
There are several places in the kernel where wait_on_bit is not followed by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read).
On architectures with weak memory ordering, it may happen that memory accesses that follow wait_on_bit are reordered before wait_on_bit and they may return invalid data.
Fix this class of bugs by introducing a new function "test_bit_acquire" that works like test_bit, but has acquire memory ordering semantics.
Signed-off-by: Mikulas Patocka <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
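A sketch of the generic helper as described (simplified; the merged version is wrapped like the other bitops):

static __always_inline bool
test_bit_acquire(unsigned long nr, const volatile unsigned long *addr)
{
        /* smp_load_acquire() orders this load before all later accesses */
        unsigned long val = smp_load_acquire(&addr[BIT_WORD(nr)]);

        return 1UL & (val >> (nr & (BITS_PER_LONG - 1)));
}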
|
|
Revision tags: v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4 |
|
| #
b03fc117 |
| 24-Jun-2022 |
Alexander Lobakin <[email protected]> |
bitops: let optimize out non-atomic bitops on compile-time constants
Currently, many architecture-specific non-atomic bitop implementations use inline asm or other hacks which are faster or more robust when working with "real" variables (i.e. fields from the structures etc.), but the compilers have no clue how to optimize them out when called on compile-time constants. That said, the following code:
DECLARE_BITMAP(foo, BITS_PER_LONG) = { }; // -> unsigned long foo[1];
unsigned long bar = BIT(BAR_BIT);
unsigned long baz = 0;

__set_bit(FOO_BIT, foo);
baz |= BIT(BAZ_BIT);

BUILD_BUG_ON(!__builtin_constant_p(test_bit(FOO_BIT, foo)));
BUILD_BUG_ON(!__builtin_constant_p(bar & BAR_BIT));
BUILD_BUG_ON(!__builtin_constant_p(baz & BAZ_BIT));

triggers the first assertion on x86_64, which means that the compiler is unable to evaluate it to a compile-time initializer when the architecture-specific bitop is used, even if it's obvious. In order to let the compiler optimize out such cases, expand the bitop() macro to use the "constant" C non-atomic bitop implementations when all of the arguments passed are compile-time constants, which means that the result will be a compile-time constant as well, so that it produces more efficient and simple code in 100% of cases, compared to the architecture-specific counterparts.
The savings are architecture, compiler and compiler flags dependent, for example, on x86_64 -O2:
GCC 12:  add/remove: 78/29 grow/shrink: 332/525 up/down: 31325/-61560 (-30235)
LLVM 13: add/remove: 79/76 grow/shrink: 184/537 up/down: 55076/-141892 (-86816)
LLVM 14: add/remove: 10/3 grow/shrink: 93/138 up/down: 3705/-6992 (-3287)
and ARM64 (courtesy of Mark):
GCC 11:  add/remove: 92/29 grow/shrink: 933/2766 up/down: 39340/-82580 (-43240)
LLVM 14: add/remove: 21/11 grow/shrink: 620/651 up/down: 12060/-15824 (-3764)
Cc: Mark Rutland <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Marco Elver <[email protected]> Signed-off-by: Yury Norov <[email protected]>
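The expanded wrapper, roughly (a sketch of the merged bitop() macro; see include/linux/bitops.h for the exact form):

#define bitop(op, nr, addr)                                             \
        ((__builtin_constant_p(nr) &&                                   \
          __builtin_constant_p((uintptr_t)(addr) != (uintptr_t)NULL) && \
          (uintptr_t)(addr) != (uintptr_t)NULL &&                       \
          __builtin_constant_p(*(const unsigned long *)(addr))) ?       \
         const##op(nr, addr) : op(nr, addr))

#define __set_bit(nr, addr)     bitop(___set_bit, nr, addr)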
|
| #
e69eb9c4 |
| 24-Jun-2022 |
Alexander Lobakin <[email protected]> |
bitops: wrap non-atomic bitops with a transparent macro
In preparation for altering the non-atomic bitops with a macro, wrap them in a transparent definition. This requires prepending one more '_' to their names in order to be able to do that seamlessly. It is a simple change, given that all the non-prefixed definitions are now in asm-generic. sparc32 already has several triple-underscored functions, so I had to rename them ('___' -> 'sp32_').
Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Marco Elver <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Yury Norov <[email protected]>
|
| #
bb7379bf |
| 24-Jun-2022 |
Alexander Lobakin <[email protected]> |
bitops: define const_*() versions of the non-atomics
Define const_*() variants of the non-atomic bitops to be used when the input arguments are compile-time constants, so that the compiler will always be able to resolve those to compile-time constants as well. Those are mostly direct aliases for generic_*(), with one exception for const_test_bit(): the original one is declared atomic-safe and thus doesn't discard the `volatile` qualifier, so in order to let the compiler optimize code, define it separately, disregarding the qualifier. Add them to the compile-time type checks as well, just in case.
Suggested-by: Marco Elver <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Marco Elver <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Yury Norov <[email protected]>
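A sketch of the exception called out above: const_test_bit() reads through a plain (non-volatile) pointer so the load can be folded away:

static __always_inline bool
const_test_bit(unsigned long nr, const volatile unsigned long *addr)
{
        const unsigned long *p = (const unsigned long *)addr + BIT_WORD(nr);
        unsigned long mask = BIT_MASK(nr);
        unsigned long val = *p;         /* plain load: the volatile qualifier is dropped */

        return !!(val & mask);
}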
|
| #
0e862838 |
| 24-Jun-2022 |
Alexander Lobakin <[email protected]> |
bitops: unify non-atomic bitops prototypes across architectures
Currently, there is a mess with the prototypes of the non-atomic bitops across the different architectures:
ret   bool, int, unsigned long
nr    int, long, unsigned int, unsigned long
addr  volatile unsigned long *, volatile void *
Thankfully, it doesn't provoke any bugs, but can sometimes make the compiler angry when it's not handy at all. Adjust all the prototypes to the following standard:
ret   bool                       retval can be only 0 or 1
nr    unsigned long              native; signed makes no sense
addr  volatile unsigned long *   bitmaps are arrays of ulongs
Next, some architectures don't define 'arch_' versions as they don't support instrumentation, others do. To make sure there is always the same set of callables present and to ease any potential future changes, make them all follow the rule: * architecture-specific files define only 'arch_' versions; * non-prefixed versions can be defined only in asm-generic files; and place the non-prefixed definitions into a new file in asm-generic to be included by non-instrumented architectures.
Finally, add some static assertions in order to prevent people from making a mess in this room again. I also used the %__always_inline attribute consistently, so that they always get resolved to the actual operations.
Suggested-by: Andy Shevchenko <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Acked-by: Mark Rutland <[email protected]> Reviewed-by: Yury Norov <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Yury Norov <[email protected]>
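The standard, shown on one op (a sketch following the generic implementation; every non-atomic bitop now has this shape):

static __always_inline bool
generic___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
{
        unsigned long mask = BIT_MASK(nr);
        unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
        unsigned long old = *p;

        *p = old | mask;
        return !!(old & mask);
}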
|
|
Revision tags: v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6 |
|
| #
bc9d6635 |
| 14-Aug-2021 |
Yury Norov <[email protected]> |
include/linux: move for_each_bit() macros from bitops.h to find.h
for_each_bit() macros depend on find_bit() machinery, and so the proper place for them is the find.h header.
Signed-off-by: Yury Norov <[email protected]> Tested-by: Wolfram Sang <[email protected]>
|
|
Revision tags: v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1 |
|
| #
cb0f8003 |
| 02-Jul-2021 |
Kumar Kartikeya Dwivedi <[email protected]> |
bitops: Add non-atomic bitops for pointers
cpumap needs to set, clear, and test the lowest bit in the skb pointer in various places. To make these checks less noisy, add pointer-friendly bitop macros that also do some typechecking to sanitize the argument.
These wrap the non-atomic bitops __set_bit, __clear_bit, and test_bit but for pointer arguments. The pointer's address has to be passed in, and it is treated as an unsigned long *, since the width and representation of pointers and unsigned long match on the targets Linux supports. They are prefixed with a double underscore to indicate the lack of atomicity.
Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
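The pointer-friendly wrappers, roughly as merged (a sketch; typecheck_pointer() rejects non-pointer arguments at compile time):

#define __ptr_set_bit(nr, addr)                                 \
        ({                                                      \
                typecheck_pointer(*(addr));                     \
                __set_bit(nr, (unsigned long *)(addr));         \
        })

#define __ptr_test_bit(nr, addr)                                \
        ({                                                      \
                typecheck_pointer(*(addr));                     \
                test_bit(nr, (unsigned long *)(addr));          \
        })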
|
|
Revision tags: v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1 |
|
| #
2cc7b6a4 |
| 07-May-2021 |
Yury Norov <[email protected]> |
lib: add fast path for find_first_*_bit() and find_last_bit()
Similarly to the bitmap functions, users would benefit if we handle the case of small-size bitmaps that fit into a single word.
While here, move the find_last_bit() declaration to bitops/find.h where other find_*_bit() functions sit.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yury Norov <[email protected]> Acked-by: Rasmus Villemoes <[email protected]> Acked-by: Andy Shevchenko <[email protected]> Cc: Alexey Klimov <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: David Sterba <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Jianpeng Ma <[email protected]> Cc: Joe Perches <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Rich Felker <[email protected]> Cc: Stefano Brivio <[email protected]> Cc: Wei Yang <[email protected]> Cc: Wolfram Sang <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
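The fast-path shape, sketched (small_const_nbits() is true for compile-time-constant sizes of at most one word; _find_first_bit() is the out-of-line slow path):

static inline unsigned long find_first_bit(const unsigned long *addr,
                                           unsigned long size)
{
        if (small_const_nbits(size)) {
                unsigned long val = *addr & GENMASK(size - 1, 0);

                return val ? __ffs(val) : size; /* resolved inline, one word */
        }

        return _find_first_bit(addr, size);
}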
|
|
Revision tags: v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse |
|
| #
4945cca2 |
| 26-Feb-2021 |
Geert Uytterhoeven <[email protected]> |
include/linux/bitops.h: spelling s/synomyn/synonym/
Fix a misspelling of "synonym".
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
|
|
Revision tags: v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1 |
|
| #
aa6159ab |
| 16-Dec-2020 |
Andy Shevchenko <[email protected]> |
kernel.h: split out mathematical helpers
kernel.h has been used as a dump for all kinds of stuff for a long time. Here is an attempt to start cleaning it up by splitting out the mathematical helpers.
At the same time, convert users in the header and lib folders to use the new header. Though, for the time being, include the new header back into kernel.h to avoid twisted indirect includes for existing users.
[[email protected]: fix powerpc build] Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Trond Myklebust <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Rasmus Villemoes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
|
|
Revision tags: v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1 |
|
| #
004fba1a |
| 16-Oct-2020 |
Wei Yang <[email protected]> |
bitops: use the same mechanism for get_count_order[_long]
These two functions share the same logic.
Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Andy Shevchenko <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
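The shared mechanism, sketched (both helpers now follow the same fls-based pattern):

static inline int get_count_order(unsigned int count)
{
        if (count == 0)
                return -1;

        return fls(--count);    /* order of the next power of two >= count */
}

static inline int get_count_order_long(unsigned long l)
{
        if (l == 0UL)
                return -1;

        return (int)fls_long(--l);
}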
|
| #
a9eb6370 |
| 16-Oct-2020 |
Wei Yang <[email protected]> |
bitops: simplify get_count_order_long()
These two cases could be unified into one.
Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Andy Shevchenko <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
|
|
Revision tags: v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1 |
|
| #
bd93f003 |
| 04-Jun-2020 |
Arnd Bergmann <[email protected]> |
include/linux/bitops.h: avoid clang shift-count-overflow warnings
Clang normally does not warn about certain issues in inline functions when they only occur in an eliminated code path. However, if something else goes wrong, it does tend to complain about the definition of hweight_long() on 32-bit targets:
include/linux/bitops.h:75:41: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
        return sizeof(w) == 4 ? hweight32(w) : hweight64(w);
                                               ^~~~~~~~~~~~
include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64'
#define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))
include/asm-generic/bitops/const_hweight.h:21:76: note: expanded from macro '__const_hweight64'
#define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))
include/asm-generic/bitops/const_hweight.h:20:49: note: expanded from macro '__const_hweight32'
#define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16))
include/asm-generic/bitops/const_hweight.h:19:72: note: expanded from macro '__const_hweight16'
#define __const_hweight16(w) (__const_hweight8(w) + __const_hweight8((w) >> 8 ))
include/asm-generic/bitops/const_hweight.h:12:9: note: expanded from macro '__const_hweight8'
        (!!((w) & (1ULL << 2))) + \
Adding an explicit cast to __u64 avoids that warning and makes it easier to read other output.
Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Christian Brauner <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Nick Desaulniers <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
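The fix, sketched: an explicit __u64 cast keeps clang from seeing a 64-bit shift applied to a 32-bit value:

static __always_inline unsigned long hweight_long(unsigned long w)
{
        return sizeof(w) == 4 ? hweight32(w) : hweight64((__u64)w);
}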
|
|
Revision tags: v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1 |
|
| #
f80ac98a |
| 07-Apr-2020 |
Josh Poimboeuf <[email protected]> |
bitops: always inline sign extension helpers
With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr() from the user_access_begin/end critical region (i.e, with SMAP disabled).
While it's probably harmless in this case, in general we like to avoid extra function calls in SMAP-disabled regions because it can open up inadvertent security holes.
Fix the warning by changing the sign extension helpers to __always_inline. This convinces GCC to inline gen8_canonical_addr().
The sign extension functions are trivial anyway, so it makes sense to always inline them. With my test optimize-for-size-based config, this actually shrinks the text size of i915_gem_execbuffer.o by 45 bytes -- and no change for vmlinux.
Reported-by: Randy Dunlap <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Al Viro <[email protected]> Cc: Chris Wilson <[email protected]> Link: http://lkml.kernel.org/r/740179324b2b18b750b16295c48357f00b5fa9ed.1582982020.git.jpoimboe@redhat.com Signed-off-by: Linus Torvalds <[email protected]>
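The helpers after the change (a sketch; the arithmetic right shift replicates the sign bit selected by 'index'):

static __always_inline __s32 sign_extend32(__u32 value, int index)
{
        __u8 shift = 31 - index;

        return (__s32)(value << shift) >> shift;
}

static __always_inline __s64 sign_extend64(__u64 value, int index)
{
        __u8 shift = 63 - index;

        return (__s64)(value << shift) >> shift;
}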
|
|
Revision tags: v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1 |
|
| #
0bddc1bd |
| 04-Feb-2020 |
Yury Norov <[email protected]> |
bitops: more BITS_TO_* macros
Introduce BITS_TO_U64, BITS_TO_U32 and BITS_TO_BYTES, as they are handy in the following patches (BITS_TO_U32 specifically). Reimplement the tools/ version of the macros according to the kernel implementation.
Also fix indentation for BITS_PER_TYPE definition.
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Yury Norov <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Cc: Amritha Nambiar <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Kees Cook <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Miklos Szeredi <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Steffen Klassert <[email protected]> Cc: "Tobin C . Harding" <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Cc: Willem de Bruijn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
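The new macros, roughly as merged (a sketch; __KERNEL_DIV_ROUND_UP() rounds the division up):

#define BITS_TO_U64(nr)         __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
#define BITS_TO_U32(nr)         __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
#define BITS_TO_BYTES(nr)       __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(char))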
|