|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
| #
e0e687a6 |
| 20-Jun-2022 |
Kazu Hirata <[email protected]> |
[llvm] Don't use Optional::hasValue (NFC)
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
3a24df99 |
| 13-May-2022 |
Archibald Elliott <[email protected]> |
[ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum
This adds a late Machine Pass to work around a Cortex CPU Erratum affecting Cortex-A57 and Cortex-A72: - Cortex-A57 Erratum 1742098 - Corte
[ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum
This adds a late Machine Pass to work around a Cortex CPU Erratum affecting Cortex-A57 and Cortex-A72: - Cortex-A57 Erratum 1742098 - Cortex-A72 Erratum 1655431
The pass inserts instructions to make the inputs to the fused AES instruction pairs no longer trigger the erratum. Here the pass errs on the side of caution, inserting the instructions wherever we cannot prove that the inputs came from a safe instruction.
The pass is used: - for Cortex-A57 and Cortex-A72, - for "generic" cores (which are used when using `-march=`), - when the user specifies `-mfix-cortex-a57-aes-1742098` or `mfix-cortex-a72-aes-1655431` in the command-line arguments to clang.
Reviewed By: dmgreen, simon_tatham
Differential Revision: https://reviews.llvm.org/D119720
show more ...
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
dcb77643 |
| 29-Mar-2022 |
David Penry <[email protected]> |
Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM
Fixed "private field is not used" warning when compiled with clang.
original commit: 28d09bbbc3d09c912b54a4d5edb32cab7de32a6f reverted
Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM
Fixed "private field is not used" warning when compiled with clang.
original commit: 28d09bbbc3d09c912b54a4d5edb32cab7de32a6f reverted in: fa49021c68ef7a7adcdf7b8a44b9006506523191
------
This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch.
MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body.
The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues.
It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In additional, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D122672
show more ...
|
| #
fa49021c |
| 28-Apr-2022 |
David Penry <[email protected]> |
Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM"
This reverts commit 28d09bbbc3d09c912b54a4d5edb32cab7de32a6f while I investigate a buildbot failure.
|
| #
28d09bbb |
| 29-Mar-2022 |
David Penry <[email protected]> |
[CodeGen][ARM] Enable Swing Module Scheduling for ARM
This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a l
[CodeGen][ARM] Enable Swing Module Scheduling for ARM
This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch.
MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body.
The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues.
It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In additional, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D122672
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
ed98c1b3 |
| 09-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
cb216076 |
| 15-Feb-2022 |
Mircea Trofin <[email protected]> |
[nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described, fixed file headers and include guards, updated all files where the old paths
[nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described, fixed file headers and include guards, updated all files where the old paths were detected (simple grep through the repo), and `clang-format`-ed it all.
Differential Revision: https://reviews.llvm.org/D119876
show more ...
|
| #
c4b1a63a |
| 25-Feb-2022 |
Jameson Nash <[email protected]> |
mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D1
mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D120518
show more ...
|
| #
f9270214 |
| 10-Feb-2022 |
Yuanfang Chen <[email protected]> |
Reland "[clang-cl] Support the /JMC flag"
This relands commit b380a31de084a540cfa38b72e609b25ea0569bb7.
Restrict the tests to Windows only since the flag symbol hash depends on system-dependent pat
Reland "[clang-cl] Support the /JMC flag"
This relands commit b380a31de084a540cfa38b72e609b25ea0569bb7.
Restrict the tests to Windows only since the flag symbol hash depends on system-dependent path normalization.
show more ...
|
| #
b380a31d |
| 10-Feb-2022 |
Yuanfang Chen <[email protected]> |
Revert "[clang-cl] Support the /JMC flag"
This reverts commit bd3a1de683f80d174ea9c97000db3ec3276bc022.
Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x6
Revert "[clang-cl] Support the /JMC flag"
This reverts commit bd3a1de683f80d174ea9c97000db3ec3276bc022.
Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8822587673277278177/overview
show more ...
|
| #
bd3a1de6 |
| 10-Feb-2022 |
Yuanfang Chen <[email protected]> |
[clang-cl] Support the /JMC flag
The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/
The `/JMC` flag enables these
[clang-cl] Support the /JMC flag
The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/
The `/JMC` flag enables these instrumentations: - Insert at the beginning of every function immediately after the prologue with a call to `void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag)`. The argument for `__CheckForDebuggerJustMyCode` is the address of a boolean global variable (the global variable is initialized to 1) with the name convention `__<hash>_<filename>`. All such global variables are placed in the `.msvcjmc` section. - The `<hash>` part of `__<hash>_<filename>` has a one-to-one mapping with a directory path. MSVC uses some unknown hashing function. Here I used DJB. - Add a dummy/empty COMDAT function `__JustMyCode_Default`. - Add `/alternatename:__CheckForDebuggerJustMyCode=__JustMyCode_Default` link option via ".drectve" section. This is to prevent failure in case `__CheckForDebuggerJustMyCode` is not provided during linking.
Implementation: All the instrumentations are implemented in an IR codegen pass. The pass is placed immediately before CodeGenPrepare pass. This is to not interfere with mid-end optimizations and make the instrumentation target-independent (I'm still working on an ELF port in a separate patch).
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D118428
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
75e164f6 |
| 20-Jan-2022 |
serge-sans-paille <[email protected]> |
[llvm] Cleanup header dependencies in ADT and Support
The cleanup was manual, but assisted by "include-what-you-use". It consists in
1. Removing unused forward declaration. No impact expected. 2. R
[llvm] Cleanup header dependencies in ADT and Support
The cleanup was manual, but assisted by "include-what-you-use". It consists in
1. Removing unused forward declaration. No impact expected. 2. Removing unused headers in .cpp files. No impact expected. 3. Removing unused headers in .h files. This removes implicit dependencies and is generally considered a good thing, but this may break downstream builds. I've updated llvm, clang, lld, lldb and mlir deps, and included a list of the modification in the second part of the commit. 4. Replacing header inclusion by forward declaration. This has the same impact as 3.
Notable changes:
- llvm/Support/TargetParser.h no longer includes llvm/Support/AArch64TargetParser.h nor llvm/Support/ARMTargetParser.h - llvm/Support/TypeSize.h no longer includes llvm/Support/WithColor.h - llvm/Support/YAMLTraits.h no longer includes llvm/Support/Regex.h - llvm/ADT/SmallVector.h no longer includes llvm/Support/MemAlloc.h nor llvm/Support/ErrorHandling.h
You may need to add some of these headers in your compilation units, if needs be.
As an hint to the impact of the cleanup, running
clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Support/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 8000919 lines after: 7917500 lines
Reduced dependencies also helps incremental rebuilds and is more ccache friendly, something not shown by the above metric :-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc2 |
|
| #
f5f28d5b |
| 01-Dec-2021 |
Ties Stuij <[email protected]> |
[ARM] Implement BTI placement pass for PACBTI-M
This patch implements a new MachineFunction in the ARM backend for placing BTI instructions. It is similar to the existing AArch64 aarch64-branch-targ
[ARM] Implement BTI placement pass for PACBTI-M
This patch implements a new MachineFunction in the ARM backend for placing BTI instructions. It is similar to the existing AArch64 aarch64-branch-targets pass.
BTI instructions are inserted into basic blocks that: - Have their address taken - Are the entry block of a function, if the function has external linkage or has its address taken - Are mentioned in jump tables - Are exception/cleanup landing pads
Each BTI instructions is placed in the beginning of a BB after the so-called meta instructions (e.g. exception handler labels).
Each outlining candidate and the outlined function need to be in agreement about whether BTI placement is enabled or not. If branch target enforcement is disabled for a function, the outliner should not covertly enable it by emitting a call to an outlined function, which begins with BTI.
The cost mode of the outliner is adjusted to account for the extra BTI instructions in the outlined function.
The ARM Constant Islands pass will maintain the count of the jump tables, which reference a block. A `BTI` instruction is removed from a block only if the reference count reaches zero.
PAC instructions in entry blocks are replaced with PACBTI instructions (tests for this case will be added in a later patch because the compiler currently does not generate PAC instructions).
The ARM Constant Island pass is adjusted to handle BTI instructions correctly.
Functions with static linkage that don't have their address taken can still be called indirectly by linker-generated veneers and thus their entry points need be marked with BTI or PACBTI.
The changes are tested using "LLVM IR -> assembly" tests, jump tables also have a MIR test. Unfortunately it is not possible add MIR tests for exception handling and computed gotos because of MIR parser limitations.
This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here:
https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension
The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual:
https://developer.arm.com/documentation/ddi0553/latest
The following people contributed to this patch:
- Mikhail Maltsev - Momchil Velikov - Ties Stuij
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D112426
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
09124402 |
| 04-Nov-2021 |
David Green <[email protected]> |
[ARM] Move VPTBlock pass after post-ra scheduling
Currently when tail predicating loops, vpt blocks need to be created with the vctp predicate in case we need to revert to non-tail predicated form.
[ARM] Move VPTBlock pass after post-ra scheduling
Currently when tail predicating loops, vpt blocks need to be created with the vctp predicate in case we need to revert to non-tail predicated form. This has the unfortunate side effect of severely hampering post-ra scheduling at times as the instructions are already stuck in vpt blocks, not allowed to be independently ordered.
This patch addresses that by just moving the creation of VPT blocks later in the pipeline, after post-ra scheduling has been performed. This allows more optimal scheduling post-ra before the vpt blocks are created, leading to more optimal tail predicated loops.
Differential Revision: https://reviews.llvm.org/D113094
show more ...
|
| #
89b57061 |
| 08-Oct-2021 |
Reid Kleckner <[email protected]> |
Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually us
Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5 |
|
| #
8815ce03 |
| 01-Apr-2021 |
Arthur Eubanks <[email protected]> |
Remove "Rewrite Symbols" from codegen pipeline
It breaks up the function pass manager in the codegen pipeline.
With empty parameters, it looks at the -mllvm flag -rewrite-map-file. This is likely n
Remove "Rewrite Symbols" from codegen pipeline
It breaks up the function pass manager in the codegen pipeline.
With empty parameters, it looks at the -mllvm flag -rewrite-map-file. This is likely not in use.
Add a check that we only have one function pass manager in the codegen pipeline.
Some tests relied on the fact that we had a module pass somewhere in the codegen pipeline.
addr-label.ll crashes on ARM due to this change. This is because a ARMConstantPoolConstant containing a BasicBlock to represent a blockaddress may hold an invalid pointer to a BasicBlock if the blockaddress is invalidated by its BasicBlock getting removed. In that case all referencing blockaddresses are RAUW a constant int. Making ARMConstantPoolConstant::CVal a WeakVH fixes the crash, but I'm not sure that's the right fix. As a workaround, create a barrier right before ISel so that IR optimizations can't happen while a ARMConstantPoolConstant has been created.
Reviewed By: rnk, MaskRay, compnerd
Differential Revision: https://reviews.llvm.org/D99707
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc4 |
|
| #
d6de1e1a |
| 24-Mar-2021 |
Serge Guelton <[email protected]> |
Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from
Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from
if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
to
if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
Introduce a getValueAsBool that normalize the check, with the following behavior:
no attributes or attribute set to "false" => return false attribute set to "true" => return true
Differential Revision: https://reviews.llvm.org/D99299
show more ...
|
| #
7b6f760f |
| 28-Mar-2021 |
David Green <[email protected]> |
[ARM] MVE vector lane interleaving
MVE does not have a single sext/zext or trunc instruction that takes the bottom half of a vector and extends to a full width, like NEON has with MOVL. Instead it i
[ARM] MVE vector lane interleaving
MVE does not have a single sext/zext or trunc instruction that takes the bottom half of a vector and extends to a full width, like NEON has with MOVL. Instead it is expected that this happens through top/bottom instructions. So the MVE equivalent VMOVLT/B instructions take either the even or odd elements of the input and extend them to the larger type, producing a vector with half the number of elements each of double the bitwidth. As there is no simple instruction for a normal extend, we often have to expand sext/zext/trunc into a series of lane moves (or stack loads/stores, which we do not do yet).
This pass takes vector code that starts at truncs, looks for interconnected blobs of operations that end with sext/zext and transforms them by adding shuffles so that the lanes are interleaved and the MVE VMOVL/VMOVN instructions can be used. This is done pre-ISel so that it can work across basic blocks.
This initial version of the pass just handles a limited set of instructions, not handling constants or splats or FP, which can all come as extensions to this base.
Differential Revision: https://reviews.llvm.org/D95804
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc3 |
|
| #
8a316045 |
| 25-Feb-2021 |
Amara Emerson <[email protected]> |
[AArch64][GlobalISel] Enable use of the optsize predicate in the selector.
To do this while supporting the existing functionality in SelectionDAG of using PGO info, we add the ProfileSummaryInfo and
[AArch64][GlobalISel] Enable use of the optsize predicate in the selector.
To do this while supporting the existing functionality in SelectionDAG of using PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis dependencies to the instruction selector pass.
Then, use the predicate to generate constant pool loads for f32 materialization, if we're targeting optsize/minsize.
Differential Revision: https://reviews.llvm.org/D97732
show more ...
|
| #
e880f8b8 |
| 01-Mar-2021 |
David Green <[email protected]> |
[ARM] Rename pass to MVETPAndVPTOptimisationsPass
This pass has for a while performed Tail predication as well as VPT block optimizations. Rename the pass to make that clear.
|
|
Revision tags: llvmorg-12.0.0-rc2 |
|
| #
08086647 |
| 15-Feb-2021 |
Arlo Siemsen <[email protected]> |
Add ehcont section support
In the future Windows will enable Control-flow Enforcement Technology (CET aka shadow stacks). To protect the path where the context is updated during exception handling,
Add ehcont section support
In the future Windows will enable Control-flow Enforcement Technology (CET aka shadow stacks). To protect the path where the context is updated during exception handling, the binary is required to enumerate valid unwind entrypoints in a dedicated section which is validated when the context is being set during exception handling.
This change allows llvm to generate the section that contains the appropriate symbol references in the form expected by the msvc linker.
This feature is enabled through a new module flag, ehcontguard, which was modelled on the cfguard flag.
The change includes a test that when the module flag is enabled the section is correctly generated.
The set of exception continuation information includes returns from exceptional control flow (catchret in llvm).
In order to collect catchret we: 1) Includes an additional flag on machine basic blocks to indicate that the given block is the target of a catchret operation, 2) Introduces a new machine function pass to insert and collect symbols at the start of each block, and 3) Combines these targets with the other EHCont targets that were already being collected.
Change originally authored by Daniel Frampton <[email protected]>
For more details, see MSVC documentation for `/guard:ehcont` https://docs.microsoft.com/en-us/cpp/build/reference/guard-enable-eh-continuation-metadata
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D94835
show more ...
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
5e4480b6 |
| 15-Jan-2021 |
Sam Tebbs <[email protected]> |
[ARM] Don't run the block placement pass at O0
The block placement pass shouldn't run unless optimisations are enabled.
Differential Revision: https://reviews.llvm.org/D94691
|
|
Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
60fda8eb |
| 26-Nov-2020 |
Sam Tebbs <[email protected]> |
[ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch
Blocks can be laid out such that a t2WhileLoopStart branches backwards. This is forbidden by the architecture and so it
[ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch
Blocks can be laid out such that a t2WhileLoopStart branches backwards. This is forbidden by the architecture and so it fails to be converted into a low-overhead loop. This new pass checks for these cases and moves the target block, fixing any fall-through that would then be broken.
Differential Revision: https://reviews.llvm.org/D92385
show more ...
|
|
Revision tags: llvmorg-11.0.1-rc1 |
|
| #
a4c1f516 |
| 20-Nov-2020 |
Kristof Beyls <[email protected]> |
[ARM] Harden indirect calls against SLS
To make sure that no barrier gets placed on the architectural execution path, each indirect call calling the function in register rN, it gets transformed to a
[ARM] Harden indirect calls against SLS
To make sure that no barrier gets placed on the architectural execution path, each indirect call calling the function in register rN, it gets transformed to a direct call to __llvm_slsblr_thunk_mode_rN. mode is either arm or thumb, depending on the mode of where the indirect call happens.
The llvm_slsblr_thunk_mode_rN thunk contains:
bx rN <speculation barrier>
Therefore, the indirect call gets split into 2; one direct call and one indirect jump. This transformation results in not inserting a speculation barrier on the architectural execution path.
The mitigation is off by default and can be enabled by the harden-sls-blr subtarget feature.
As a linker is allowed to clobber r12 on function calls, the above code transformation is not correct in case a linker does so. Similarly, the transformation is not correct when register lr is used. Avoiding r12/lr being used is done in a follow-on patch to make reviewing this code easier.
Differential Revision: https://reviews.llvm.org/D92468
show more ...
|
| #
195f4427 |
| 28-Oct-2020 |
Kristof Beyls <[email protected]> |
[ARM] Implement harden-sls-retbr for ARM mode
Some processors may speculatively execute the instructions immediately following indirect control flow, such as returns, indirect jumps and indirect fun
[ARM] Implement harden-sls-retbr for ARM mode
Some processors may speculatively execute the instructions immediately following indirect control flow, such as returns, indirect jumps and indirect function calls.
To avoid a potential miss-speculatively executed gadget after these instructions leaking secrets through side channels, this pass places a speculation barrier immediately after every indirect control flow where control flow doesn't return to the next instruction, such as returns and indirect jumps, but not indirect function calls.
Hardening of indirect function calls will be done in a later, independent patch.
This patch is implementing the same functionality as the AArch64 counter part implemented in https://reviews.llvm.org/D81400. For AArch64, returns and indirect jumps only occur on RET and BR instructions and hence the function attribute to control the hardening is called "harden-sls-retbr" there. On AArch32, there is a much wider variety of instructions that can trigger an indirect unconditional control flow change. I've decided to stick with the name "harden-sls-retbr" as introduced for the corresponding AArch64 mitigation.
This patch implements this for ARM mode. A future patch will extend this to also support Thumb mode.
The inserted barriers are never on the correct, architectural execution path, and therefore performance overhead of this is expected to be low. To ensure these barriers are never on an architecturally executed path, when the harden-sls-retbr function attribute is present, indirect control flow is never conditionalized/predicated.
On targets that implement that Armv8.0-SB Speculation Barrier extension, a single SB instruction is emitted that acts as a speculation barrier. On other targets, a DSB SYS followed by a ISB is emitted to act as a speculation barrier.
These speculation barriers are implemented as pseudo instructions to avoid later passes to analyze them and potentially remove them.
The mitigation is off by default and can be enabled by the harden-sls-retbr subtarget feature.
Differential Revision: https://reviews.llvm.org/D92395
show more ...
|