|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
70a5c525 |
| 06-May-2022 |
Lucas Prates <[email protected]> |
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
show more ...
|
| #
8f2ba363 |
| 15-Jun-2022 |
Krasimir Georgiev <[email protected]> |
Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records AND [NFC][Thumb] Update frame-chain codegen test to use thumbv6m"
This reverts commit 7625e01d661644a560884057755d48a
Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records AND [NFC][Thumb] Update frame-chain codegen test to use thumbv6m"
This reverts commit 7625e01d661644a560884057755d48a0da8b77b4 and dependent cbcce82ef6b512d97e92a319a75a03e997c844e1.
Commit 7625e01d661644a560884057755d48a0da8b77b4 causes some new codegen test failures under asan, e.g., CodeGen/ARM/execute-only.ll: https://lab.llvm.org/buildbot/#/builders/5/builds/24659/steps/15/logs/stdio.
show more ...
|
| #
7625e01d |
| 06-May-2022 |
Lucas Prates <[email protected]> |
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
show more ...
|
| #
33b9ad64 |
| 13-Jun-2022 |
Lucas Prates <[email protected]> |
Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records"
Reverting change due to test failure.
This reverts commit 6119053dab67129eb1700dbf36db3524dd3e421f.
|
| #
6119053d |
| 06-May-2022 |
Lucas Prates <[email protected]> |
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
show more ...
|
| #
485432f3 |
| 01-Jun-2022 |
Martin Storsjö <[email protected]> |
[ARM] Make a narrow tMOVi8 where possible in SEH prologues
We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions
[ARM] Make a narrow tMOVi8 where possible in SEH prologues
We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering.
But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering.
Differential Revision: https://reviews.llvm.org/D126949
show more ...
|
| #
bd52506d |
| 01-Jun-2022 |
Martin Storsjö <[email protected]> |
[ARM] Make narrow push/pop in SEH prologues/epilogues where applicable
We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the i
[ARM] Make narrow push/pop in SEH prologues/epilogues where applicable
We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering.
But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering.
Differential Revision: https://reviews.llvm.org/D126948
show more ...
|
| #
40c937cb |
| 02-Jun-2022 |
Martin Storsjö <[email protected]> |
[ARM] Fix restoring stack for varargs with SEH split frame pointer push
Previously, the "add sp, #12" ended up inserted after "bx lr".
Differential Revision: https://reviews.llvm.org/D126872
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
2ab19bfa |
| 26-Nov-2021 |
Martin Storsjö <[email protected]> |
[ARM] Adjust the frame pointer when it's needed for SEH unwinding
For functions that require restoring SP from FP (e.g. that need to align the stack, or that have variable sized allocations), the pr
[ARM] Adjust the frame pointer when it's needed for SEH unwinding
For functions that require restoring SP from FP (e.g. that need to align the stack, or that have variable sized allocations), the prologue and epilogue previously used to look like this:
push {r4-r5, r11, lr} add r11, sp, #8 ... sub r4, r11, #8 mov sp, r4 pop {r4-r5, r11, pc}
This is problematic, because this unwinding operation (restoring sp from r11 - offset) can't be expressed with the SEH unwind opcodes (probably because this unwind procedure doesn't map exactly to individual instructions; note the detour via r4 in the epilogue too).
To make unwinding work, the GPR push is split into two; the first one pushing all other registers, and the second one pushing r11+lr, so that r11 can be set pointing at this spot on the stack:
push {r4-r5} push {r11, lr} mov r11, sp ... mov sp, r11 pop {r11, lr} pop {r4-r5} bx lr
For the same setup, MSVC generates code that uses two registers; r11 still pointing at the {r11,lr} pair, but a separate register used for restoring the stack at the end:
push {r4-r5, r7, r11, lr} add r11, sp, #12 mov r7, sp ... mov sp, r7 pop {r4-r5, r7, r11, pc}
For cases with clobbered float/vector registers, they are pushed after the GPRs, before the {r11,lr} pair.
Differential Revision: https://reviews.llvm.org/D125649
show more ...
|
| #
d8e67c1c |
| 26-Nov-2021 |
Martin Storsjö <[email protected]> |
[ARM] Add SEH opcodes in frame lowering
Skip inserting regular CFI instructions if using WinCFI.
This is based a fair amount on the corresponding ARM64 implementation, but instead of trying to inse
[ARM] Add SEH opcodes in frame lowering
Skip inserting regular CFI instructions if using WinCFI.
This is based a fair amount on the corresponding ARM64 implementation, but instead of trying to insert the SEH opcodes one by one where we generate other prolog/epilog instructions, we try to walk over the whole prolog/epilog range and insert them. This is done because in many cases, the exact number of instructions inserted is abstracted away deeper.
For some cases, we manually insert specific SEH opcodes directly where instructions are generated, where the automatic mapping of instructions to SEH opcodes doesn't hold up (e.g. for __chkstk stack probes).
Skip Thumb2SizeReduction for SEH prologs/epilogs, and force tail calls to wide instructions (just like on MachO), to make sure that the unwind info actually matches the width of the final instructions, without heuristics about what later passes will do.
Mark SEH instructions as scheduling boundaries, to make sure that they aren't reordered away from the instruction they describe by PostRAScheduler.
Mark the SEH instructions with the NoMerge flag, to avoid doing tail merging of functions that have multiple epilogs that all end with the same sequence of "b <other>; .seh_nop_w, .seh_endepilogue".
Differential Revision: https://reviews.llvm.org/D125648
show more ...
|
| #
ad73ce31 |
| 26-May-2022 |
Zongwei Lan <[email protected]> |
[Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
|
| #
bd606afe |
| 03-May-2022 |
Zhiyao Ma <[email protected]> |
[ARM] Only update the successor edges for immediate predecessors of PrologueMBB
When adjusting the function prologue for segmented stacks, only update the successor edges of the immediate predecesso
[ARM] Only update the successor edges for immediate predecessors of PrologueMBB
When adjusting the function prologue for segmented stacks, only update the successor edges of the immediate predecessors of the original prologue.
Differential Revision: https://reviews.llvm.org/D122959
show more ...
|
| #
d7938b1a |
| 17-Apr-2022 |
Matt Arsenault <[email protected]> |
MachineModuleInfo: Move HasSplitStack handling to AsmPrinter
This is used to emit one field in doFinalization for the module. We can accumulate this when emitting all individual functions directly i
MachineModuleInfo: Move HasSplitStack handling to AsmPrinter
This is used to emit one field in doFinalization for the module. We can accumulate this when emitting all individual functions directly in the AsmPrinter, rather than accumulating additional state in MachineModuleInfo.
Move the special case behavior predicate into MachineFrameInfo to share it. This now promotes it to generic behavior. I'm assuming this is fine because no other target implements adjustForSegmentedStacks, or has tests using the split-stack attribute.
show more ...
|
| #
adc26b4e |
| 10-Mar-2022 |
Zhiyao Ma <[email protected]> |
[ARM] Fix 8-bit immediate overflow in the instruction of segmented stack prologue.
It fixes the overflow of 8-bit immediate field in the emitted instruction that allocates large stacklet.
For thumb
[ARM] Fix 8-bit immediate overflow in the instruction of segmented stack prologue.
It fixes the overflow of 8-bit immediate field in the emitted instruction that allocates large stacklet.
For thumb2 targets, load large immediate by a pair of movw and movt instruction. For thumb1 and ARM targets, load large immediate by reading from literal pool.
Differential Revision: https://reviews.llvm.org/D118545
show more ...
|
| #
f15014ff |
| 26-Jan-2022 |
Benjamin Kramer <[email protected]> |
Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef8206320769ad31422a803a0d6de6077fd231d2.
- It conflicts with the existing llvm::size in STLEx
Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef8206320769ad31422a803a0d6de6077fd231d2.
- It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat
show more ...
|
| #
ef820632 |
| 26-Jan-2022 |
serge-sans-paille <[email protected]> |
Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no buil
Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).
show more ...
|
| #
d6b07348 |
| 19-Jan-2022 |
Jim Lin <[email protected]> |
[NFC] Use Register instead of unsigned
|
| #
7c70f96a |
| 13-Jan-2022 |
Ties Stuij <[email protected]> |
[ARM] fix bug causing shrinkwrapping not always being off using PAC
If you want to check for all uses of PAC, the SpillsLR argument to shouldSignReturnAddress should be true instead of false, as tha
[ARM] fix bug causing shrinkwrapping not always being off using PAC
If you want to check for all uses of PAC, the SpillsLR argument to shouldSignReturnAddress should be true instead of false, as that value will be returned from the function if the other checks fall through.
Reviewed By: miyuki
Differential Revision: https://reviews.llvm.org/D116213
show more ...
|
| #
d0f55a0d |
| 09-Dec-2021 |
Mikael Holmen <[email protected]> |
[ARM] Fix gcc warning about mix of enumeral and non-enumeral types
gcc warned with ../lib/Target/ARM/ARMFrameLowering.cpp:797:31: warning: enumeral and non-enumeral type in conditional expression [-
[ARM] Fix gcc warning about mix of enumeral and non-enumeral types
gcc warned with ../lib/Target/ARM/ARMFrameLowering.cpp:797:31: warning: enumeral and non-enumeral type in conditional expression [-Wextra] 797 | Reg == ARM::R12 ? ARM::RA_AUTH_CODE : Reg, true); | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
show more ...
|
| #
63eb7ff4 |
| 07-Dec-2021 |
Ties Stuij <[email protected]> |
[ARM] Implement PAC return address signing mechanism for PACBTI-M
This patch implements PAC return address signing for armv8-m. This patch roughly accomplishes the following things:
- PAC and AUT i
[ARM] Implement PAC return address signing mechanism for PACBTI-M
This patch implements PAC return address signing for armv8-m. This patch roughly accomplishes the following things:
- PAC and AUT instructions are generated. - They're part of the stack frame setup, so that shrink-wrapping can move them inwards to cover only part of a function - The auth code generated by PAC is saved across subroutine calls so that AUT can find it again to check - PAC is emitted before stacking registers (so that the SP it signs is the one on function entry). - The new pseudo-register ra_auth_code is mentioned in the DWARF frame data - With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT validates the corresponding value of SP - Emit correct unwind information when PAC is replaced by PACBTI - Handle tail calls correctly
Some notes:
We make the assembler accept the `.save {ra_auth_code}` directive that is emitted by the compiler when it saves a register that contains a return address authentication code.
For EHABI we need to have the `FrameSetup` flag on the instruction and handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit `.save {ra_auth_code}`, instead of `.save {r12}`.
For PACBTI-M, the instruction which computes return address PAC should use SP value before adjustment for the argument registers save are (used for variadic functions and when a parameter is is split between stack and register), but at the same it should be after the instruction that saves FPCXT when compiling a CMSE entry function.
This patch moves the varargs SP adjustment after the FPCXT save (they are never enabled at the same time), so in a following patch handling of the `PAC` instruction can be placed between them.
Epilogue emission code adjusted in a similar manner.
PACBTI-M code generation should not emit any instructions for architectures v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases is handled separately by a future ticket.
note on tail calls:
If the called function has four arguments that occupy registers `r0`-`r3`, the only option for holding the function pointer itself is `r12`, but this register is used to keep the PAC during function/prologue epilogue and clobbers the function pointer.
When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to keep six values - the four function arguments, the function pointer and the PAC, which is obviously impossible.
One option would be to authenticate the return address before all callee-saved registers are restored, so we have a scratch register to temporarily keep the value of `r12`. The issue with this approach is that it violates a fundamental invariant that PAC is computed using CFA as a modifier. It would also mean using separate instructions to pop `lr` and the rest of the callee-saved registers, which would offset the advantages of doing a tail call.
Instead, this patch disables indirect tail calls when the called function take four or more arguments and the return address sign and authentication is enabled for the caller function, conservatively assuming the caller function would spill LR.
This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here:
https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension
The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual:
https://developer.arm.com/documentation/ddi0553/latest
The following people contributed to this patch:
- Momchil Velikov - Ties Stuij
Reviewed By: danielkiss
Differential Revision: https://reviews.llvm.org/D112429
show more ...
|
| #
b8f1ccb0 |
| 02-Dec-2021 |
David Green <[email protected]> |
[ARM] Introduce i8neg and i8pos addressing modes
Some instructions with i8 immediate ranges can only hold negative values (like t2LDRHi8), only hold positive values (like t2STRT) or hold +/- dependi
[ARM] Introduce i8neg and i8pos addressing modes
Some instructions with i8 immediate ranges can only hold negative values (like t2LDRHi8), only hold positive values (like t2STRT) or hold +/- depending on the U bit (like the pre/post inc instructions. e.g t2LDRH_POST). This patch splits the AddrModeT2_i8 into AddrModeT2_i8, AddrModeT2_i8pos and AddrModeT2_i8neg to make this clear.
This allows us to get the offset ranges of t2LDRHi8 correct in the load/store optimizer, fixing issues where we could end up creating instructions with positive offsets (which may then be encoded as ldrht).
Differential Revision: https://reviews.llvm.org/D114638
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
fc981ced |
| 21-Nov-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use range-based for loops (NFC)
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
19d2e42b |
| 21-Jul-2021 |
Tim Northover <[email protected]> |
ARM: don't return by popping PC if we have to adjust the stack afterwards.
In mandatory tail calling conventions we might have to deallocate stack space used by our arguments before return. This hap
ARM: don't return by popping PC if we have to adjust the stack afterwards.
In mandatory tail calling conventions we might have to deallocate stack space used by our arguments before return. This happens after popping CSRs, so the pop cannot be turned into the return itself in this case.
The else branch here was already a nop, so removing it as a tidy-up.
show more ...
|
| #
c82957e7 |
| 29-Jun-2021 |
Tim Northover <[email protected]> |
ARM: fix vacuously true assertion to actually check what it should. NFC.
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
| #
18dbe689 |
| 17-Jun-2021 |
Tomas Matheson <[email protected]> |
[ARM][NFC] Tidy up subtarget frame pointer routines
getFramePointerReg only depends on information in ARMSubtarget, so move it in there so it can be accessed from more places.
Make use of ARMSubtar
[ARM][NFC] Tidy up subtarget frame pointer routines
getFramePointerReg only depends on information in ARMSubtarget, so move it in there so it can be accessed from more places.
Make use of ARMSubtarget::getFramePointerReg to remove duplicated code.
The main use of useR7AsFramePointer is getFramePointerReg, so inline it.
Differential Revision: https://reviews.llvm.org/D104476
show more ...
|