History log of /linux-6.15/arch/arm/crypto/Makefile (Results 1 – 25 of 40)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2
# 1684e829 02-Dec-2024 Eric Biggers <[email protected]>

arm/crc-t10dif: expose CRC-T10DIF function through lib

Move the arm CRC-T10DIF assembly code into the lib directory and wire it
up to the library interface. This allows it to be used without going

arm/crc-t10dif: expose CRC-T10DIF function through lib

Move the arm CRC-T10DIF assembly code into the lib directory and wire it
up to the library interface. This allows it to be used without going
through the crypto API. It remains usable via the crypto API too via
the shash algorithms that use the library interface. Thus all the
arch-specific "shash" code becomes unnecessary and is removed.

Note: to see the diff from arch/arm/crypto/crct10dif-ce-glue.c to
arch/arm/lib/crc-t10dif-glue.c, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>

show more ...


# 1e1b6dbc 02-Dec-2024 Eric Biggers <[email protected]>

arm/crc32: expose CRC32 functions through lib

Move the arm CRC32 assembly code into the lib directory and wire it up
to the library interface. This allows it to be used without going
through the cr

arm/crc32: expose CRC32 functions through lib

Move the arm CRC32 assembly code into the lib directory and wire it up
to the library interface. This allows it to be used without going
through the crypto API. It remains usable via the crypto API too via
the shash algorithms that use the library interface. Thus all the
arch-specific "shash" code becomes unnecessary and is removed.

Note: to see the diff from arch/arm/crypto/crc32-ce-glue.c to
arch/arm/lib/crc32-glue.c, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>

show more ...


Revision tags: v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5
# 2f62847c 18-Jan-2023 Nathan Chancellor <[email protected]>

ARM: 9287/1: Reduce __thumb2__ definition to crypto files that require it

Commit 1d2e9b67b001 ("ARM: 9265/1: pass -march= only to compiler") added
a __thumb2__ define to ASFLAGS to avoid build error

ARM: 9287/1: Reduce __thumb2__ definition to crypto files that require it

Commit 1d2e9b67b001 ("ARM: 9265/1: pass -march= only to compiler") added
a __thumb2__ define to ASFLAGS to avoid build errors in the crypto code,
which relies on __thumb2__ for preprocessing. Commit 59e2cf8d21e0 ("ARM:
9275/1: Drop '-mthumb' from AFLAGS_ISA") followed up on this by removing
-mthumb from AFLAGS so that __thumb2__ would not be defined when the
default target was ARMv7 or newer.

Unfortunately, the second commit's fix assumes that the toolchain
defaults to -mno-thumb / -marm, which is not the case for Debian's
arm-linux-gnueabihf target, which defaults to -mthumb:

$ echo | arm-linux-gnueabihf-gcc -dM -E - | grep __thumb
#define __thumb2__ 1
#define __thumb__ 1

This target is used by several CI systems, which will still see
redefined macro warnings, despite '-mthumb' not being present in the
flags:

<command-line>: warning: "__thumb2__" redefined
<built-in>: note: this is the location of the previous definition

Remove the global AFLAGS __thumb2__ define and move it to the crypto
folder where it is required by the imported OpenSSL algorithms; the rest
of the kernel should use the internal CONFIG_THUMB2_KERNEL symbol to
know whether or not Thumb2 is being used or not. Be sure that __thumb2__
is undefined first so that there are no macro redefinition warnings.

Link: https://github.com/ClangBuiltLinux/linux/issues/1772

Reported-by: "kernelci.org bot" <[email protected]>
Suggested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
Fixes: 59e2cf8d21e0 ("ARM: 9275/1: Drop '-mthumb' from AFLAGS_ISA")
Fixes: 1d2e9b67b001 ("ARM: 9265/1: pass -march= only to compiler")
Signed-off-by: Russell King (Oracle) <[email protected]>

show more ...


Revision tags: v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1
# 2d16803c 28-May-2022 Jason A. Donenfeld <[email protected]>

crypto: blake2s - remove shash module

BLAKE2s has no currently known use as an shash. Just remove all of this
unnecessary plumbing. Removing this shash was something we talked about
back when we wer

crypto: blake2s - remove shash module

BLAKE2s has no currently known use as an shash. Just remove all of this
unnecessary plumbing. Removing this shash was something we talked about
back when we were making BLAKE2s a built-in, but I simply never got
around to doing it. So this completes that project.

Importantly, this fixs a bug in which the lib code depends on
crypto_simd_disabled_for_test, causing linker errors.

Also add more alignment tests to the selftests and compare SIMD and
non-SIMD compression functions, to make up for what we lose from
testmgr.c.

Reported-by: gaochao <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: [email protected]
Fixes: 6048fdcc5f26 ("lib/crypto: blake2s: include as built-in")
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7
# 6048fdcc 22-Dec-2021 Jason A. Donenfeld <[email protected]>

lib/crypto: blake2s: include as built-in

In preparation for using blake2s in the RNG, we change the way that it
is wired-in to the build system. Instead of using ifdefs to select the
right symbol, w

lib/crypto: blake2s: include as built-in

In preparation for using blake2s in the RNG, we change the way that it
is wired-in to the build system. Instead of using ifdefs to select the
right symbol, we use weak symbols. And because ARM doesn't need the
generic implementation, we make the generic one default only if an arch
library doesn't need it already, and then have arch libraries that do
need it opt-in. So that the arch libraries can remain tristate rather
than bool, we then split the shash part from the glue code.

Acked-by: Herbert Xu <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Jason A. Donenfeld <[email protected]>

show more ...


Revision tags: v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12
# 8116138c 25-Apr-2021 Masahiro Yamada <[email protected]>

crypto: arm - use a pattern rule for generating *.S files

Unify similar build rules.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>


# 7c0303ff 25-Apr-2021 Masahiro Yamada <[email protected]>

crypto: arm - generate *.S by Perl at build time instead of shipping them

Generate *.S by Perl like arch/{mips,x86}/crypto/Makefile.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off

crypto: arm - generate *.S by Perl at build time instead of shipping them

Generate *.S by Perl like arch/{mips,x86}/crypto/Makefile.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1
# 1862eb00 23-Dec-2020 Eric Biggers <[email protected]>

crypto: arm/blake2b - add NEON-accelerated BLAKE2b

Add a NEON-accelerated implementation of BLAKE2b.

On Cortex-A7 (which these days is the most common ARM processor that
doesn't have the ARMv8 Cryp

crypto: arm/blake2b - add NEON-accelerated BLAKE2b

Add a NEON-accelerated implementation of BLAKE2b.

On Cortex-A7 (which these days is the most common ARM processor that
doesn't have the ARMv8 Crypto Extensions), this is over twice as fast as
SHA-256, and slightly faster than SHA-1. It is also almost three times
as fast as the generic implementation of BLAKE2b:

Algorithm Cycles per byte (on 4096-byte messages)
=================== =======================================
blake2b-256-neon 14.0
sha1-neon 16.3
blake2s-256-arm 18.8
sha1-asm 20.8
blake2s-256-generic 26.0
sha256-neon 28.9
sha256-asm 32.0
blake2b-256-generic 38.9

This implementation isn't directly based on any other implementation,
but it borrows some ideas from previous NEON code I've written as well
as from chacha-neon-core.S. At least on Cortex-A7, it is faster than
the other NEON implementations of BLAKE2b I'm aware of (the
implementation in the BLAKE2 official repository using intrinsics, and
Andrew Moon's implementation which can be found in SUPERCOP). It does
only one block at a time, so it performs well on short messages too.

NEON-accelerated BLAKE2b is useful because there is interest in using
BLAKE2b-256 for dm-verity on low-end Android devices (specifically,
devices that lack the ARMv8 Crypto Extensions) to replace SHA-1. On
these devices, the performance cost of upgrading to SHA-256 may be
unacceptable, whereas BLAKE2b-256 would actually improve performance.

Although BLAKE2b is intended for 64-bit platforms (unlike BLAKE2s which
is intended for 32-bit platforms), on 32-bit ARM processors with NEON,
BLAKE2b is actually faster than BLAKE2s. This is because NEON supports
64-bit operations, and because BLAKE2s's block size is too small for
NEON to be helpful for it. The best I've been able to do with BLAKE2s
on Cortex-A7 is 18.8 cpb with an optimized scalar implementation.

(I didn't try BLAKE2sp and BLAKE3, which in theory would be faster, but
they're more complex as they require running multiple hashes at once.
Note that BLAKE2b already uses all the NEON bandwidth on the Cortex-A7,
so I expect that any speedup from BLAKE2sp or BLAKE3 would come only
from the smaller number of rounds, not from the extra parallelism.)

For now this BLAKE2b implementation is only wired up to the shash API,
since there is no library API for BLAKE2b yet. However, I've tried to
keep things consistent with BLAKE2s, e.g. by defining
blake2b_compress_arch() which is analogous to blake2s_compress_arch()
and could be exported for use by the library API later if needed.

Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
Tested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


# 5172d322 23-Dec-2020 Eric Biggers <[email protected]>

crypto: arm/blake2s - add ARM scalar optimized BLAKE2s

Add an ARM scalar optimized implementation of BLAKE2s.

NEON isn't very useful for BLAKE2s because the BLAKE2s block size is too
small for NEON

crypto: arm/blake2s - add ARM scalar optimized BLAKE2s

Add an ARM scalar optimized implementation of BLAKE2s.

NEON isn't very useful for BLAKE2s because the BLAKE2s block size is too
small for NEON to help. Each NEON instruction would depend on the
previous one, resulting in poor performance.

With scalar instructions, on the other hand, we can take advantage of
ARM's "free" rotations (like I did in chacha-scalar-core.S) to get an
implementation get runs much faster than the C implementation.

Performance results on Cortex-A7 in cycles per byte using the shash API:

4096-byte messages:
blake2s-256-arm: 18.8
blake2s-256-generic: 26.0

500-byte messages:
blake2s-256-arm: 20.3
blake2s-256-generic: 27.9

100-byte messages:
blake2s-256-arm: 29.7
blake2s-256-generic: 39.2

32-byte messages:
blake2s-256-arm: 50.6
blake2s-256-generic: 66.2

Except on very short messages, this is still slower than the NEON
implementation of BLAKE2b which I've written; that is 14.0, 16.4, 25.8,
and 76.1 cpb on 4096, 500, 100, and 32-byte messages, respectively.
However, optimized BLAKE2s is useful for cases where BLAKE2s is used
instead of BLAKE2b, such as WireGuard.

This new implementation is added in the form of a new module
blake2s-arm.ko, which is analogous to blake2s-x86_64.ko in that it
provides blake2s_compress_arch() for use by the library API as well as
optionally register the algorithms with the shash API.

Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
Tested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7
# d8f1308a 08-Nov-2019 Jason A. Donenfeld <[email protected]>

crypto: arm/curve25519 - wire up NEON implementation

This ports the SUPERCOP implementation for usage in kernel space. In
addition to the usual header, macro, and style changes required for
kernel s

crypto: arm/curve25519 - wire up NEON implementation

This ports the SUPERCOP implementation for usage in kernel space. In
addition to the usual header, macro, and style changes required for
kernel space, it makes a few small changes to the code:

- The stack alignment is relaxed to 16 bytes.
- Superfluous mov statements have been removed.
- ldr for constants has been replaced with movw.
- ldreq has been replaced with moveq.
- The str epilogue has been made more idiomatic.
- SIMD registers are not pushed and popped at the beginning and end.
- The prologue and epilogue have been made idiomatic.
- A hole has been removed from the stack, saving 32 bytes.
- We write-back the base register whenever possible for vld1.8.
- Some multiplications have been reordered for better A7 performance.

There are more opportunities for cleanup, since this code is from qhasm,
which doesn't always do the most opportune thing. But even prior to
extensive hand optimizations, this code delivers significant performance
improvements (given in get_cycles() per call):

----------- -------------
| generic C | this commit |
------------ ----------- -------------
| Cortex-A7 | 49136 | 22395 |
------------ ----------- -------------
| Cortex-A17 | 17326 | 4983 |
------------ ----------- -------------

Signed-off-by: Jason A. Donenfeld <[email protected]>
[ardb: - move to arch/arm/crypto
- wire into lib/crypto framework
- implement crypto API KPP hooks ]
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


# a6b803b3 08-Nov-2019 Ard Biesheuvel <[email protected]>

crypto: arm/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation

This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305 implementation
for NEON authored by Andy Polyakov, and contribu

crypto: arm/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation

This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305 implementation
for NEON authored by Andy Polyakov, and contributed by him to the OpenSSL
project. The file 'poly1305-armv4.pl' is taken straight from this upstream
GitHub repository [0] at commit ec55a08dc0244ce570c4fc7cade330c60798952f,
and already contains all the changes required to build it as part of a
Linux kernel module.

[0] https://github.com/dot-asm/cryptogams

Co-developed-by: Andy Polyakov <[email protected]>
Signed-off-by: Andy Polyakov <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


# b36d8c09 08-Nov-2019 Ard Biesheuvel <[email protected]>

crypto: arm/chacha - remove dependency on generic ChaCha driver

Instead of falling back to the generic ChaCha skcipher driver for
non-SIMD cases, use a fast scalar implementation for ARM authored
by

crypto: arm/chacha - remove dependency on generic ChaCha driver

Instead of falling back to the generic ChaCha skcipher driver for
non-SIMD cases, use a fast scalar implementation for ARM authored
by Eric Biggers. This removes the module dependency on chacha-generic
altogether, which also simplifies things when we expose the ChaCha
library interface from this module.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3
# b4d0c0aa 11-Oct-2019 Ard Biesheuvel <[email protected]>

crypto: arm - use Kconfig based compiler checks for crypto opcodes

Instead of allowing the Crypto Extensions algorithms to be selected when
using a toolchain that does not support them, and complain

crypto: arm - use Kconfig based compiler checks for crypto opcodes

Instead of allowing the Crypto Extensions algorithms to be selected when
using a toolchain that does not support them, and complain about it at
build time, use the information we have about the compiler to prevent
them from being selected in the first place. Users that are stuck with
a GCC version <4.8 are unlikely to care about these routines anyway, and
it cleans up the Makefile considerably.

While at it, add explicit 'armv8-a' CPU specifiers to the code that uses
the 'crypto-neon-fp-armv8' FPU specifier so we don't regress Clang, which
will complain about this in version 10 and later.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5
# 8e9b61b2 01-Dec-2018 Masahiro Yamada <[email protected]>

kbuild: move .SECONDARY special target to Kbuild.include

In commit 54a702f70589 ("kbuild: mark $(targets) as .SECONDARY and
remove .PRECIOUS markers"), I missed one important feature of the
.SECONDA

kbuild: move .SECONDARY special target to Kbuild.include

In commit 54a702f70589 ("kbuild: mark $(targets) as .SECONDARY and
remove .PRECIOUS markers"), I missed one important feature of the
.SECONDARY target:

.SECONDARY with no prerequisites causes all targets to be
treated as secondary.

... which agrees with the policy of Kbuild.

Let's move it to scripts/Kbuild.include, with no prerequisites.

Note:
If an intermediate file is generated by $(call if_changed,...), you
still need to add it to "targets" so its .*.cmd file is included.

The arm/arm64 crypto files are generated by $(call cmd,shipped),
so they do not need to be added to "targets", but need to be added
to "clean-files" so "make clean" can properly clean them away.

Signed-off-by: Masahiro Yamada <[email protected]>

show more ...


Revision tags: v4.20-rc4, v4.20-rc3
# 16aae359 17-Nov-2018 Eric Biggers <[email protected]>

crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305

Add an ARM NEON implementation of NHPoly1305, an ε-almost-∆-universal
hash function used in the Adiantum encryption mode. For now, only the

crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305

Add an ARM NEON implementation of NHPoly1305, an ε-almost-∆-universal
hash function used in the Adiantum encryption mode. For now, only the
NH portion is actually NEON-accelerated; the Poly1305 part is less
performance-critical so is just implemented in C.

Signed-off-by: Eric Biggers <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


# 3cc21519 17-Nov-2018 Eric Biggers <[email protected]>

crypto: arm/chacha20 - refactor to allow varying number of rounds

In preparation for adding XChaCha12 support, rename/refactor the NEON
implementation of ChaCha20 to support different numbers of rou

crypto: arm/chacha20 - refactor to allow varying number of rounds

In preparation for adding XChaCha12 support, rename/refactor the NEON
implementation of ChaCha20 to support different numbers of rounds.

Reviewed-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18
# 578bdaab 07-Aug-2018 Jason A. Donenfeld <[email protected]>

crypto: speck - remove Speck

These are unused, undesired, and have never actually been used by
anybody. The original authors of this code have changed their mind about
its inclusion. While originall

crypto: speck - remove Speck

These are unused, undesired, and have never actually been used by
anybody. The original authors of this code have changed their mind about
its inclusion. While originally proposed for disk encryption on low-end
devices, the idea was discarded [1] in favor of something else before
that could really get going. Therefore, this patch removes Speck.

[1] https://marc.info/?l=linux-crypto-vger&m=153359499015659

Signed-off-by: Jason A. Donenfeld <[email protected]>
Acked-by: Eric Biggers <[email protected]>
Cc: [email protected]
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3, v4.17-rc2, v4.17-rc1, v4.16, v4.16-rc7
# 54a702f7 23-Mar-2018 Masahiro Yamada <[email protected]>

kbuild: mark $(targets) as .SECONDARY and remove .PRECIOUS markers

GNU Make automatically deletes intermediate files that are updated
in a chain of pattern rules.

Example 1) %.dtb.o <- %.dtb.S <- %

kbuild: mark $(targets) as .SECONDARY and remove .PRECIOUS markers

GNU Make automatically deletes intermediate files that are updated
in a chain of pattern rules.

Example 1) %.dtb.o <- %.dtb.S <- %.dtb <- %.dts
Example 2) %.o <- %.c <- %.c_shipped

A couple of makefiles mark such targets as .PRECIOUS to prevent Make
from deleting them, but the correct way is to use .SECONDARY.

.SECONDARY
Prerequisites of this special target are treated as intermediate
files but are never automatically deleted.

.PRECIOUS
When make is interrupted during execution, it may delete the target
file it is updating if the file was modified since make started.
If you mark the file as precious, make will never delete the file
if interrupted.

Both can avoid deletion of intermediate files, but the difference is
the behavior when Make is interrupted; .SECONDARY deletes the target,
but .PRECIOUS does not.

The use of .PRECIOUS is relatively rare since we do not want to keep
partially constructed (possibly corrupted) targets.

Another difference is that .PRECIOUS works with pattern rules whereas
.SECONDARY does not.

.PRECIOUS: $(obj)/%.lex.c

works, but

.SECONDARY: $(obj)/%.lex.c

has no effect. However, for the reason above, I do not want to use
.PRECIOUS which could cause obscure build breakage.

The targets specified as .SECONDARY must be explicit. $(targets)
contains all targets that need to include .*.cmd files. So, the
intermediates you want to keep are mostly in there. Therefore, mark
$(targets) as .SECONDARY. It means primary targets are also marked
as .SECONDARY, but I do not see any drawback for this.

I replaced some .SECONDARY / .PRECIOUS markers with 'targets'. This
will make Kbuild search for non-existing .*.cmd files, but this is
not a noticeable performance issue.

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Frank Rowand <[email protected]>
Acked-by: Ingo Molnar <[email protected]>

show more ...


Revision tags: v4.16-rc6
# 6aaf49b4 13-Mar-2018 Leonard Crestez <[email protected]>

crypto: arm,arm64 - Fix random regeneration of S_shipped

The decision to rebuild .S_shipped is made based on the relative
timestamps of .S_shipped and .pl files but git makes this essentially
random

crypto: arm,arm64 - Fix random regeneration of S_shipped

The decision to rebuild .S_shipped is made based on the relative
timestamps of .S_shipped and .pl files but git makes this essentially
random. This means that the perl script might run anyway (usually at
most once per checkout), defeating the whole purpose of _shipped.

Fix by skipping the rule unless explicit make variables are provided:
REGENERATE_ARM_CRYPTO or REGENERATE_ARM64_CRYPTO.

This can produce nasty occasional build failures downstream, for example
for toolchains with broken perl. The solution is minimally intrusive to
make it easier to push into stable.

Another report on a similar issue here: https://lkml.org/lkml/2018/3/8/1379

Signed-off-by: Leonard Crestez <[email protected]>
Cc: <[email protected]>
Reviewed-by: Masahiro Yamada <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v4.16-rc5, v4.16-rc4, v4.16-rc3, v4.16-rc2
# ede96221 14-Feb-2018 Eric Biggers <[email protected]>

crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS

Add an ARM NEON-accelerated implementation of Speck-XTS. It operates on
128-byte chunks at a time, i.e. 8 blocks for Speck128 or

crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS

Add an ARM NEON-accelerated implementation of Speck-XTS. It operates on
128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for
Speck64. Each 128-byte chunk goes through XTS preprocessing, then is
encrypted/decrypted (doing one cipher round for all the blocks, then the
next round, etc.), then goes through XTS postprocessing.

The performance depends on the processor but can be about 3 times faster
than the generic code. For example, on an ARMv7 processor we observe
the following performance with Speck128/256-XTS:

xts-speck128-neon: Encryption 107.9 MB/s, Decryption 108.1 MB/s
xts(speck128-generic): Encryption 32.1 MB/s, Decryption 36.6 MB/s

In comparison to AES-256-XTS without the Cryptography Extensions:

xts-aes-neonbs: Encryption 41.2 MB/s, Decryption 36.7 MB/s
xts(aes-asm): Encryption 31.7 MB/s, Decryption 30.8 MB/s
xts(aes-generic): Encryption 21.2 MB/s, Decryption 20.9 MB/s

Speck64/128-XTS is even faster:

xts-speck64-neon: Encryption 138.6 MB/s, Decryption 139.1 MB/s

Note that as with the generic code, only the Speck128 and Speck64
variants are supported. Also, for now only the XTS mode of operation is
supported, to target the disk and file encryption use cases. The NEON
code also only handles the portion of the data that is evenly divisible
into 128-byte chunks, with any remainder handled by a C fallback. Of
course, other modes of operation could be added later if needed, and/or
the NEON code could be updated to handle other buffer sizes.

The XTS specification is only defined for AES which has a 128-bit block
size, so for the GF(2^64) math needed for Speck64-XTS we use the
reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX
paper. Of course, when possible users should use Speck128-XTS, but even
that may be too slow on some processors; Speck64-XTS can be faster.

Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v4.16-rc1, v4.15, v4.15-rc9, v4.15-rc8, v4.15-rc7, v4.15-rc6, v4.15-rc5, v4.15-rc4, v4.15-rc3, v4.15-rc2, v4.15-rc1, v4.14, v4.14-rc8
# b2441318 01-Nov-2017 Greg Kroah-Hartman <[email protected]>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <[email protected]>
Reviewed-by: Philippe Ombredanne <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

show more ...


Revision tags: v4.14-rc7, v4.14-rc6, v4.14-rc5, v4.14-rc4, v4.14-rc3, v4.14-rc2, v4.14-rc1, v4.13, v4.13-rc7, v4.13-rc6, v4.13-rc5, v4.13-rc4, v4.13-rc3, v4.13-rc2, v4.13-rc1, v4.12, v4.12-rc7, v4.12-rc6, v4.12-rc5, v4.12-rc4, v4.12-rc3, v4.12-rc2, v4.12-rc1, v4.11, v4.11-rc8, v4.11-rc7, v4.11-rc6, v4.11-rc5, v4.11-rc4, v4.11-rc3, v4.11-rc2, v4.11-rc1
# efa7cebd 28-Feb-2017 Ard Biesheuvel <[email protected]>

crypto: arm/crc32 - add build time test for CRC instruction support

The accelerated CRC32 module for ARM may use either the scalar CRC32
instructions, the NEON 64x64 to 128 bit polynomial multiplica

crypto: arm/crc32 - add build time test for CRC instruction support

The accelerated CRC32 module for ARM may use either the scalar CRC32
instructions, the NEON 64x64 to 128 bit polynomial multiplication
(vmull.p64) instruction, or both, depending on what the current CPU
supports.

However, this also requires support in binutils, and as it turns out,
versions of binutils exist that support the vmull.p64 instruction but
not the crc32 instructions.

So refactor the Makefile logic so that this module only gets built if
binutils has support for both.

Signed-off-by: Ard Biesheuvel <[email protected]>
Acked-by: Jon Hunter <[email protected]>
Tested-by: Jon Hunter <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


Revision tags: v4.10, v4.10-rc8, v4.10-rc7, v4.10-rc6, v4.10-rc5, v4.10-rc4
# cc477bf6 11-Jan-2017 Ard Biesheuvel <[email protected]>

crypto: arm/aes - replace bit-sliced OpenSSL NEON code

This replaces the unwieldy generated implementation of bit-sliced AES
in CBC/CTR/XTS modes that originated in the OpenSSL project with a
new ve

crypto: arm/aes - replace bit-sliced OpenSSL NEON code

This replaces the unwieldy generated implementation of bit-sliced AES
in CBC/CTR/XTS modes that originated in the OpenSSL project with a
new version that is heavily based on the OpenSSL implementation, but
has a number of advantages over the old version:
- it does not rely on the scalar AES cipher that also originated in the
OpenSSL project and contains redundant lookup tables and key schedule
generation routines (which we already have in crypto/aes_generic.)
- it uses the same expanded key schedule for encryption and decryption,
reducing the size of the per-key data structure by 1696 bytes
- it adds an implementation of AES in ECB mode, which can be wrapped by
other generic chaining mode implementations
- it moves the handling of corner cases that are non critical to performance
to the glue layer written in C
- it was written directly in assembler rather than generated from a Perl
script

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


# 81edb426 11-Jan-2017 Ard Biesheuvel <[email protected]>

crypto: arm/aes - replace scalar AES cipher

This replaces the scalar AES cipher that originates in the OpenSSL project
with a new implementation that is ~15% (*) faster (on modern cores), and
reuses

crypto: arm/aes - replace scalar AES cipher

This replaces the scalar AES cipher that originates in the OpenSSL project
with a new implementation that is ~15% (*) faster (on modern cores), and
reuses the lookup tables and the key schedule generation routines from the
generic C implementation (which is usually compiled in anyway due to
networking and other subsystems depending on it).

Note that the bit sliced NEON code for AES still depends on the scalar cipher
that this patch replaces, so it is not removed entirely yet.

* On Cortex-A57, the performance increases from 17.0 to 14.9 cycles per byte
for 128-bit keys.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


# afaf712e 11-Jan-2017 Ard Biesheuvel <[email protected]>

crypto: arm/chacha20 - implement NEON version based on SSE3 code

This is a straight port to ARM/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher. It uses the new skcipher walksize
a

crypto: arm/chacha20 - implement NEON version based on SSE3 code

This is a straight port to ARM/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher. It uses the new skcipher walksize
attribute to process the input in strides of 4x the block size.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

show more ...


12