History log of /linux-6.15/arch/arm/lib/delay-loop.S (Results 1 – 8 of 8)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6
# 7339fb11 23-Apr-2024 Linus Walleij <[email protected]>

ARM: 9390/2: lib: Annotate loop delay instructions for CFI

When we annotate the loop delay code with SYM_TYPED_FUNC_START()
a function prototype signature will be emitted into the object
file above

ARM: 9390/2: lib: Annotate loop delay instructions for CFI

When we annotate the loop delay code with SYM_TYPED_FUNC_START()
a function prototype signature will be emitted into the object
file above each site called from C, and the delay loop code is
using "fallthroughs" from the different assembly callbacks. This
will not work as the execution flow will run into the prototype
signatures.

Rewrite the code to use explicit branches to the other code
segments and annotate the code using SYM_TYPED_FUNC_START().

Tested on the ARM Versatile which uses the calibrated loop delay.

Tested-by: Kees Cook <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Russell King (Oracle) <[email protected]>

show more ...


Revision tags: v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3
# a2faac39 24-Oct-2022 Nick Desaulniers <[email protected]>

ARM: 9263/1: use .arch directives instead of assembler command line flags

Similar to commit a6c30873ee4a ("ARM: 8989/1: use .fpu assembler
directives instead of assembler arguments").

GCC and GNU b

ARM: 9263/1: use .arch directives instead of assembler command line flags

Similar to commit a6c30873ee4a ("ARM: 8989/1: use .fpu assembler
directives instead of assembler arguments").

GCC and GNU binutils support setting the "sub arch" via -march=,
-Wa,-march, target function attribute, and .arch assembler directive.

Clang was missing support for -Wa,-march=, but this was implemented in
clang-13.

The behavior of both GCC and Clang is to
prefer -Wa,-march= over -march= for assembler and assembler-with-cpp
sources, but Clang will warn about the -march= being unused.

clang: warning: argument unused during compilation: '-march=armv6k'
[-Wunused-command-line-argument]

Since most assembler is non-conditionally assembled with one sub arch
(modulo arch/arm/delay-loop.S which conditionally is assembled as armv4
based on CONFIG_ARCH_RPC, and arch/arm/mach-at91/pm-suspend.S which is
conditionally assembled as armv7-a based on CONFIG_CPU_V7), prefer the
.arch assembler directive.

Add a few more instances found in compile testing as found by Arnd and
Nathan.

Link: https://github.com/llvm/llvm-project/commit/1d51c699b9e2ebc5bcfdbe85c74cc871426333d4
Link: https://bugs.llvm.org/show_bug.cgi?id=48894
Link: https://github.com/ClangBuiltLinux/linux/issues/1195
Link: https://github.com/ClangBuiltLinux/linux/issues/1315

Suggested-by: Arnd Bergmann <[email protected]>
Suggested-by: Nathan Chancellor <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Russell King (Oracle) <[email protected]>

show more ...


Revision tags: v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4
# d2912cb1 04-Jun-2019 Thomas Gleixner <[email protected]>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of th

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Enrico Weigelt <[email protected]>
Reviewed-by: Kate Stewart <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

show more ...


Revision tags: v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3, v4.17-rc2, v4.17-rc1, v4.16, v4.16-rc7, v4.16-rc6, v4.16-rc5, v4.16-rc4, v4.16-rc3, v4.16-rc2, v4.16-rc1, v4.15, v4.15-rc9, v4.15-rc8, v4.15-rc7, v4.15-rc6, v4.15-rc5, v4.15-rc4, v4.15-rc3, v4.15-rc2, v4.15-rc1, v4.14, v4.14-rc8, v4.14-rc7, v4.14-rc6, v4.14-rc5, v4.14-rc4, v4.14-rc3, v4.14-rc2, v4.14-rc1, v4.13, v4.13-rc7, v4.13-rc6, v4.13-rc5, v4.13-rc4, v4.13-rc3, v4.13-rc2, v4.13-rc1, v4.12, v4.12-rc7, v4.12-rc6, v4.12-rc5, v4.12-rc4, v4.12-rc3, v4.12-rc2, v4.12-rc1, v4.11, v4.11-rc8, v4.11-rc7, v4.11-rc6, v4.11-rc5, v4.11-rc4, v4.11-rc3, v4.11-rc2, v4.11-rc1, v4.10, v4.10-rc8, v4.10-rc7, v4.10-rc6, v4.10-rc5, v4.10-rc4, v4.10-rc3, v4.10-rc2, v4.10-rc1, v4.9, v4.9-rc8, v4.9-rc7, v4.9-rc6, v4.9-rc5, v4.9-rc4, v4.9-rc3, v4.9-rc2, v4.9-rc1
# 207b1150 07-Oct-2016 Nicolas Pitre <[email protected]>

ARM: 8619/1: udelay: document the various constants

Explain where the value for UDELAY_MULT and UDELAY_SHIFT come from.
Also fix/clarify some comments pertaining to their usage in the
assembly code.

ARM: 8619/1: udelay: document the various constants

Explain where the value for UDELAY_MULT and UDELAY_SHIFT come from.
Also fix/clarify some comments pertaining to their usage in the
assembly code.

Signed-off-by: Nicolas Pitre <[email protected]>
Signed-off-by: Russell King <[email protected]>

show more ...


Revision tags: v4.8, v4.8-rc8, v4.8-rc7, v4.8-rc6, v4.8-rc5, v4.8-rc4, v4.8-rc3, v4.8-rc2, v4.8-rc1, v4.7, v4.7-rc7, v4.7-rc6, v4.7-rc5, v4.7-rc4, v4.7-rc3, v4.7-rc2, v4.7-rc1, v4.6, v4.6-rc7, v4.6-rc6, v4.6-rc5, v4.6-rc4, v4.6-rc3, v4.6-rc2, v4.6-rc1, v4.5, v4.5-rc7, v4.5-rc6, v4.5-rc5, v4.5-rc4, v4.5-rc3, v4.5-rc2, v4.5-rc1, v4.4, v4.4-rc8, v4.4-rc7, v4.4-rc6, v4.4-rc5, v4.4-rc4, v4.4-rc3, v4.4-rc2, v4.4-rc1, v4.3, v4.3-rc7, v4.3-rc6, v4.3-rc5, v4.3-rc4, v4.3-rc3, v4.3-rc2, v4.3-rc1, v4.2, v4.2-rc8, v4.2-rc7, v4.2-rc6, v4.2-rc5, v4.2-rc4, v4.2-rc3, v4.2-rc2, v4.2-rc1, v4.1, v4.1-rc8, v4.1-rc7, v4.1-rc6, v4.1-rc5, v4.1-rc4, v4.1-rc3, v4.1-rc2, v4.1-rc1, v4.0, v4.0-rc7, v4.0-rc6, v4.0-rc5, v4.0-rc4, v4.0-rc3, v4.0-rc2
# 215e362d 25-Feb-2015 Nicolas Pitre <[email protected]>

ARM: 8306/1: loop_udelay: remove bogomips value limitation

Now that we don't support ARMv3 anymore, the loop based delay code can
convert microsecs into number of loops using a 64-bit multiplication

ARM: 8306/1: loop_udelay: remove bogomips value limitation

Now that we don't support ARMv3 anymore, the loop based delay code can
convert microsecs into number of loops using a 64-bit multiplication
and more precision.

This allows us to lift the hard limit of 3355 on the bogomips value as
loops_per_jiffy may now safely span the full 32-bit range.

Signed-off-by: Nicolas Pitre <[email protected]>
Signed-off-by: Russell King <[email protected]>

show more ...


Revision tags: v4.0-rc1, v3.19, v3.19-rc7, v3.19-rc6, v3.19-rc5, v3.19-rc4, v3.19-rc3, v3.19-rc2, v3.19-rc1, v3.18, v3.18-rc7, v3.18-rc6, v3.18-rc5, v3.18-rc4, v3.18-rc3, v3.18-rc2, v3.18-rc1, v3.17, v3.17-rc7, v3.17-rc6, v3.17-rc5, v3.17-rc4, v3.17-rc3, v3.17-rc2, v3.17-rc1, v3.16, v3.16-rc7, v3.16-rc6, v3.16-rc5, v3.16-rc4
# 6ebbf2ce 30-Jun-2014 Russell King <[email protected]>

ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+

ARMv6 and greater introduced a new instruction ("bx") which can be used
to return from function calls. Recent CPUs perform better when the
"b

ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+

ARMv6 and greater introduced a new instruction ("bx") which can be used
to return from function calls. Recent CPUs perform better when the
"bx lr" instruction is used rather than the "mov pc, lr" instruction,
and this sequence is strongly recommended to be used by the ARM
architecture manual (section A.4.1.1).

We provide a new macro "ret" with all its variants for the condition
code which will resolve to the appropriate instruction.

Rather than doing this piecemeal, and miss some instances, change all
the "mov pc" instances to use the new macro, with the exception of
the "movs" instruction and the kprobes code. This allows us to detect
the "mov pc, lr" case and fix it up - and also gives us the possibility
of deploying this for other registers depending on the CPU selection.

Reported-by: Will Deacon <[email protected]>
Tested-by: Stephen Warren <[email protected]> # Tegra Jetson TK1
Tested-by: Robert Jarzmik <[email protected]> # mioa701_bootresume.S
Tested-by: Andrew Lunn <[email protected]> # Kirkwood
Tested-by: Shawn Guo <[email protected]>
Tested-by: Tony Lindgren <[email protected]> # OMAPs
Tested-by: Gregory CLEMENT <[email protected]> # Armada XP, 375, 385
Acked-by: Sekhar Nori <[email protected]> # DaVinci
Acked-by: Christoffer Dall <[email protected]> # kvm/hyp
Acked-by: Haojian Zhuang <[email protected]> # PXA3xx
Acked-by: Stefano Stabellini <[email protected]> # Xen
Tested-by: Uwe Kleine-König <[email protected]> # ARMv7M
Tested-by: Simon Horman <[email protected]> # Shmobile
Signed-off-by: Russell King <[email protected]>

show more ...


Revision tags: v3.16-rc3, v3.16-rc2, v3.16-rc1, v3.15, v3.15-rc8, v3.15-rc7, v3.15-rc6, v3.15-rc5, v3.15-rc4, v3.15-rc3, v3.15-rc2, v3.15-rc1, v3.14, v3.14-rc8, v3.14-rc7, v3.14-rc6, v3.14-rc5, v3.14-rc4, v3.14-rc3, v3.14-rc2, v3.14-rc1, v3.13, v3.13-rc8, v3.13-rc7, v3.13-rc6, v3.13-rc5, v3.13-rc4, v3.13-rc3
# 11d4bb1b 30-Nov-2013 Fabio Estevam <[email protected]>

ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation

Currently mx53 (CortexA8) running at 1GHz reports:
Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760)

Tom Evans ve

ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation

Currently mx53 (CortexA8) running at 1GHz reports:
Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760)

Tom Evans verified that alignments of 0x0 and 0x8 run the two instructions of __loop_delay in one clock cycle (1 clock/loop), while alignments of 0x4 and 0xc take 3 clocks to run the loop twice. (1.5 clock/loop)

The original object code looks like this:

00000010 <__loop_const_udelay>:
10: e3e01000 mvn r1, #0
14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8>
18: e5922000 ldr r2, [r2]
1c: e0800921 add r0, r0, r1, lsr #18
20: e1a00720 lsr r0, r0, #14
24: e0822b21 add r2, r2, r1, lsr #22
28: e1a02522 lsr r2, r2, #10
2c: e0000092 mul r0, r2, r0
30: e0800d21 add r0, r0, r1, lsr #26
34: e1b00320 lsrs r0, r0, #6
38: 01a0f00e moveq pc, lr

0000003c <__loop_delay>:
3c: e2500001 subs r0, r0, #1
40: 8afffffe bhi 3c <__loop_delay>
44: e1a0f00e mov pc, lr

After adding the 'align 3' directive to __loop_delay (align to 8 bytes):

00000010 <__loop_const_udelay>:
10: e3e01000 mvn r1, #0
14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8>
18: e5922000 ldr r2, [r2]
1c: e0800921 add r0, r0, r1, lsr #18
20: e1a00720 lsr r0, r0, #14
24: e0822b21 add r2, r2, r1, lsr #22
28: e1a02522 lsr r2, r2, #10
2c: e0000092 mul r0, r2, r0
30: e0800d21 add r0, r0, r1, lsr #26
34: e1b00320 lsrs r0, r0, #6
38: 01a0f00e moveq pc, lr
3c: e320f000 nop {0}

00000040 <__loop_delay>:
40: e2500001 subs r0, r0, #1
44: 8afffffe bhi 40 <__loop_delay>
48: e1a0f00e mov pc, lr
4c: e320f000 nop {0}

, which now reports:
Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736)

Some more test results:

On mx31 (ARM1136) running at 532 MHz, before the patch:
Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184)

On mx31 (ARM1136) running at 532 MHz after the patch:
Calibrating delay loop... 528.79 BogoMIPS (lpj=2643968)

Also tested on mx6 (CortexA9) and on mx27 (ARM926), which shows the same
BogoMIPS value before and after this patch.

Reported-by: Tom Evans <[email protected]>
Suggested-by: Tom Evans <[email protected]>
Signed-off-by: Fabio Estevam <[email protected]>
Signed-off-by: Russell King <[email protected]>

show more ...


Revision tags: v3.13-rc2, v3.13-rc1, v3.12, v3.12-rc7, v3.12-rc6, v3.12-rc5, v3.12-rc4, v3.12-rc3, v3.12-rc2, v3.12-rc1, v3.11, v3.11-rc7, v3.11-rc6, v3.11-rc5, v3.11-rc4, v3.11-rc3, v3.11-rc2, v3.11-rc1, v3.10, v3.10-rc7, v3.10-rc6, v3.10-rc5, v3.10-rc4, v3.10-rc3, v3.10-rc2, v3.10-rc1, v3.9, v3.9-rc8, v3.9-rc7, v3.9-rc6, v3.9-rc5, v3.9-rc4, v3.9-rc3, v3.9-rc2, v3.9-rc1, v3.8, v3.8-rc7, v3.8-rc6, v3.8-rc5, v3.8-rc4, v3.8-rc3, v3.8-rc2, v3.8-rc1, v3.7, v3.7-rc8, v3.7-rc7, v3.7-rc6, v3.7-rc5, v3.7-rc4, v3.7-rc3, v3.7-rc2, v3.7-rc1, v3.6, v3.6-rc7, v3.6-rc6, v3.6-rc5, v3.6-rc4, v3.6-rc3, v3.6-rc2, v3.6-rc1, v3.5, v3.5-rc7, v3.5-rc6
# d0a533b1 06-Jul-2012 Will Deacon <[email protected]>

ARM: 7452/1: delay: allow timer-based delay implementation to be selected

This patch allows a timer-based delay implementation to be selected by
switching the delay routines over to use get_cycles,

ARM: 7452/1: delay: allow timer-based delay implementation to be selected

This patch allows a timer-based delay implementation to be selected by
switching the delay routines over to use get_cycles, which is
implemented in terms of read_current_timer. This further allows us to
skip the loop calibration and have a consistent delay function in the
face of core frequency scaling.

To avoid the pain of dealing with memory-mapped counters, this
implementation uses the co-processor interface to the architected timers
when they are available. The previous loop-based implementation is
kept around for CPUs without the architected timers and we retain both
the maximum delay (2ms) and the corresponding conversion factors for
determining the number of loops required for a given interval. Since the
indirection of the timer routines will only work when called from C,
the sa1100 sleep routines are modified to branch to the loop-based delay
functions directly.

Tested-by: Shinya Kuribayashi <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Russell King <[email protected]>

show more ...