|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
9e6d1f4b |
| 17-Jul-2022 |
Kazu Hirata <[email protected]> |
[CodeGen] Qualify auto variables in for loops (NFC)
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
b4ad28da |
| 11-Apr-2022 |
Momchil Velikov <[email protected]> |
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CG
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CGA aware) nature of the unwind tables.
Unlike the `CFIInstrInserer` pass, this one almost always emits only `.cfi_remember_state`/`.cfi_restore_state`, which results in smaller unwind tables and also transparently handles custom unwind info extensions like CFA offset adjustement and save locations of SVE registers.
This pass takes advantage of the constraints taht LLVM imposes on the placement of save/restore points (cf. `ShrinkWrap.cpp`):
* there is a single basic block, containing the function prologue
* possibly multiple epilogue blocks, where each epilogue block is complete and self-contained, i.e. CSR restore instructions (and the corresponding CFI instructions are not split across two or more blocks.
* prologue and epilogue blocks are outside of any loops
Thus, during execution, at the beginning and at the end of each basic block the function can be in one of two states:
- "has a call frame", if the function has executed the prologue, or has not executed any epilogue
- "does not have a call frame", if the function has not executed the prologue, or has executed an epilogue
These properties can be computed for each basic block by a single RPO traversal.
From the point of view of the unwind tables, the "has/does not have call frame" state at beginning of each block is determined by the state at the end of the previous block, in layout order.
Where these states differ, we insert compensating CFI instructions, which come in two flavours:
- CFI instructions, which reset the unwind table state to the initial one. This is done by a target specific hook and is expected to be trivial to implement, for example it could be: ``` .cfi_def_cfa <sp>, 0 .cfi_same_value <rN> .cfi_same_value <rN-1> ... ``` where `<rN>` are the callee-saved registers.
- CFI instructions, which reset the unwind table state to the one created by the function prologue. These are the sequence: ``` .cfi_restore_state .cfi_remember_state ``` In this case we also insert a `.cfi_remember_state` after the last CFI instruction in the function prologue.
Reviewed By: MaskRay, danielkiss, chill
Differential Revision: https://reviews.llvm.org/D114545
show more ...
|
| #
0320115c |
| 05-Apr-2022 |
Muhammad Omair Javaid <[email protected]> |
Revert "[CodeGen] Async unwind - add a pass to fix CFI information"
This reverts commit 980c3e6dd223a8e628367144b8180117950bb364.
This commit had failing tests with clang crashing across various AA
Revert "[CodeGen] Async unwind - add a pass to fix CFI information"
This reverts commit 980c3e6dd223a8e628367144b8180117950bb364.
This commit had failing tests with clang crashing across various AArch64/Linux buildots.
https://lab.llvm.org/buildbot/#/builders/179/builds/3346
Differential Revision: https://reviews.llvm.org/D114545
show more ...
|
| #
980c3e6d |
| 04-Apr-2022 |
Momchil Velikov <[email protected]> |
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CF
[CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CFG aware) nature of the unwind tables.
Unlike the `CFIInstrInserer` pass, this one almost always emits only `.cfi_remember_state`/`.cfi_restore_state`, which results in smaller unwind tables and also transparently handles custom unwind info extensions like CFA offset adjustement and save locations of SVE registers.
This pass takes advantage of the constraints that LLVM imposes on the placement of save/restore points (cf. `ShrinkWrap.cpp`):
* there is a single basic block, containing the function prologue
* possibly multiple epilogue blocks, where each epilogue block is complete and self-contained, i.e. CSR restore instructions (and the corresponding CFI instructions are not split across two or more blocks.
* prologue and epilogue blocks are outside of any loops
Thus, during execution, at the beginning and at the end of each basic block the function can be in one of two states:
- "has a call frame", if the function has executed the prologue, or has not executed any epilogue
- "does not have a call frame", if the function has not executed the prologue, or has executed an epilogue
These properties can be computed for each basic block by a single RPO traversal.
In order to accommodate backends which do not generate unwind info in epilogues we compute an additional property "strong no call frame on entry" which is set for the entry point of the function and for every block reachable from the entry along a path that does not execute the prologue. If this property holds, it takes precedence over the "has a call frame" property.
From the point of view of the unwind tables, the "has/does not have call frame" state at beginning of each block is determined by the state at the end of the previous block, in layout order.
Where these states differ, we insert compensating CFI instructions, which come in two flavours:
- CFI instructions, which reset the unwind table state to the initial one. This is done by a target specific hook and is expected to be trivial to implement, for example it could be: ``` .cfi_def_cfa <sp>, 0 .cfi_same_value <rN> .cfi_same_value <rN-1> ... ``` where `<rN>` are the callee-saved registers.
- CFI instructions, which reset the unwind table state to the one created by the function prologue. These are the sequence: ``` .cfi_restore_state .cfi_remember_state ``` In this case we also insert a `.cfi_remember_state` after the last CFI instruction in the function prologue.
Reviewed By: MaskRay, danielkiss, chill
Differential Revision: https://reviews.llvm.org/D114545
show more ...
|
| #
37b37838 |
| 16-Mar-2022 |
Shengchen Kan <[email protected]> |
[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments
|
| #
989f1c72 |
| 15-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-in
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
a278250b |
| 10-Mar-2022 |
Nico Weber <[email protected]> |
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https:/
Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
show more ...
|
| #
7f230fee |
| 07-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup codegen includes
after: 1061034926 before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
|
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
630c847b |
| 07-Dec-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use range-based for loops (NFC)
|
| #
98a021fc |
| 03-Dec-2021 |
Stephen Tozer <[email protected]> |
[DebugInfo] Attempt to preserve more information during tail duplication
Prior to this patch, tail duplication handled debug info poorly - specifically, debug instructions would be dropped instead o
[DebugInfo] Attempt to preserve more information during tail duplication
Prior to this patch, tail duplication handled debug info poorly - specifically, debug instructions would be dropped instead of being set undef, potentially extending the lifetimes of prior debug values that should be killed. The pass was also very aggressive with dropping debug info, dropping debug info even when the SSA value it referred to was still present. This patch attempts to handle debug info more carefully, checking to see whether each affected debug value can still be live, setting it undef if not.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D106875
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
17eb6b61 |
| 24-Nov-2021 |
Jun Ma <[email protected]> |
Revert "[Taildup] Don't tail-duplicate loop header with multiple successors as its latches"
This reverts commit 1f9fa549841a2ec55aa5a131bfaf83f0383c4713.
|
| #
7f00806a |
| 16-Nov-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use make_early_inc_range (NFC)
|
| #
1f9fa549 |
| 28-Sep-2021 |
Jun Ma <[email protected]> |
[Taildup] Don't tail-duplicate loop header with multiple successors as its latches
when Taildup hit loop with multiple latches like: // 1 -> 2 <-> 3 | // \ <-> 4
[Taildup] Don't tail-duplicate loop header with multiple successors as its latches
when Taildup hit loop with multiple latches like: // 1 -> 2 <-> 3 | // \ <-> 4 | // \ <-> 5 | // \---> rest | it may transform this loop into multiple loops by duplicate loop header. However, this change may has little benefit while makes cfg much complex. In some uncommon cases, it causes large compile time regression (offered by @alexfh in D106056).
This patch disable tail-duplicate of such cases.
TestPlan: check-llvm
Differential Revision: https://reviews.llvm.org/D110613
show more ...
|
| #
c78640ee |
| 22-Oct-2021 |
Neubauer, Sebastian <[email protected]> |
[TailDuplicator] Fix merging block with terminator
The TailDuplicator merged two blocks, even if the first one ended with a terminator, resulting in invalid MIR, where a terminator is in the middle
[TailDuplicator] Fix merging block with terminator
The TailDuplicator merged two blocks, even if the first one ended with a terminator, resulting in invalid MIR, where a terminator is in the middle of a block.
Abort merging if the first block ends with a terminator.
Differential Revision: https://reviews.llvm.org/D112226
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
e2febc2e |
| 17-Sep-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use drop_begin (NFC)
|
| #
cfc74024 |
| 16-Sep-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use drop_begin (NFC)
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
31e75512 |
| 26-Jul-2021 |
Stephen Tozer <[email protected]> |
[DebugInfo] Correctly update debug users of SSA values in tail duplication
During tail duplication, SSA values may be updated and have their uses replaced with a virtual register, and any debug inst
[DebugInfo] Correctly update debug users of SSA values in tail duplication
During tail duplication, SSA values may be updated and have their uses replaced with a virtual register, and any debug instructions that use that value are deleted. This patch fixes the implementation of the debug instruction deletion to work correctly for debug instructions that use the SSA value multiple times, by batching deletions so that we don't attempt to delete the same instruction twice.
Differential Revision: https://reviews.llvm.org/D106557
show more ...
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
| #
bd524955 |
| 17-Jun-2021 |
Hongtao Yu <[email protected]> |
[CSSPGO] Undoing the concept of dangling pseudo probe
As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen.
I'm s
[CSSPGO] Undoing the concept of dangling pseudo probe
As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen.
I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Ciner. Certain benchmark such as 602.gcc has a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D104477
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3 |
|
| #
89855158 |
| 25-Feb-2021 |
Hongtao Yu <[email protected]> |
[CSSPGO] Unblocking optimizations by dangling pseudo probes.
This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unbl
[CSSPGO] Unblocking optimizations by dangling pseudo probes.
This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unblock those optimizations, the blocking pseudo probes are moved out of the original blocks and tagged dangling, instead of allowing pseudo probes to be literally removed. The reason is that when the original block is removed, we won't be able to sample it. Instead of assigning it a zero weight, moving all its pseudo probes into another block and marking them dangling should allow the counts inference a chance to assign them a more reasonable weight. We have not seen counts quality degradation from our experiments.
The optimizations being unblocked are:
1. Removing conditional probes for if-converted branches. Conditional probes are tagged dangling when their homing branch arms are folded so that they will not be over-counted. 2. Unblocking jump threading from removing empty blocks. Pseudo probe prevents jump threading from removing logically empty blocks that only has one unconditional jump instructions. 3. Unblocking SimplifyCFG and MIR tail duplicate to thread empty blocks and blocks with redundant branch checks.
Since dangling probes are logically deleted, they should not consume any samples in LTO postLink. This can be achieved by setting their distribution factors to zero when dangled.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D97481
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc2 |
|
| #
a205fa5c |
| 20-Feb-2021 |
Kazu Hirata <[email protected]> |
[CodeGen] Use range-based for loops (NFC)
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
| #
7bc76fd0 |
| 31-Dec-2020 |
Kazu Hirata <[email protected]> |
[CodeGen] Construct SmallVector with iterator ranges (NFC)
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6 |
|
| #
d2c61d2b |
| 07-Oct-2020 |
Bill Wendling <[email protected]> |
[CodeGen][TailDuplicator] Don't duplicate blocks with INLINEASM_BR
Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introdu
[CodeGen][TailDuplicator] Don't duplicate blocks with INLINEASM_BR
Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introduces a copy for that PHI node *after* the INLINEASM_BR, which is not okay.
See: https://github.com/ClangBuiltLinux/linux/issues/1125
Differential Revision: https://reviews.llvm.org/D88823
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
4b0aa572 |
| 16-May-2020 |
James Y Knight <[email protected]> |
Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
Before this instruction supported output values, it fit fairly naturally as a terminator. However, being a terminator while
Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
Before this instruction supported output values, it fit fairly naturally as a terminator. However, being a terminator while also supporting outputs causes some trouble, as the physreg->vreg COPY operations cannot be in the same block.
Modeling it as a non-terminator allows it to be handled the same way as invoke is handled already.
Most of the changes here were created by auditing all the existing users of MachineBasicBlock::isEHPad() and MachineBasicBlock::hasEHPadSuccessor(), and adding calls to isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.
Reviewed By: nickdesaulniers, void
Differential Revision: https://reviews.llvm.org/D79794
show more ...
|
| #
edb4a5cb |
| 30-Jun-2020 |
Matt Arsenault <[email protected]> |
TailDuplicator: Use Register
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
| #
1978309d |
| 19-Feb-2020 |
James Y Knight <[email protected]> |
MachineBasicBlock::updateTerminator now requires an explicit layout successor.
Previously, it tried to infer the correct destination block from the successor list, but this is a rather tricky propsp
MachineBasicBlock::updateTerminator now requires an explicit layout successor.
Previously, it tried to infer the correct destination block from the successor list, but this is a rather tricky propspect, given the existence of successors that occur mid-block, such as invoke, and potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in particular would be problematic, because its successor blocks are not distinct from "normal" successors, as EHPads are.)
Instead, require the caller to pass in the expected fallthrough successor explicitly. In most callers, the correct block is immediately clear. But, in MachineBlockPlacement, we do need to record the original ordering, before starting to reorder blocks.
Unfortunately, the goal of decoupling the behavior of end-of-block jumps from the successor list has not been fully accomplished in this patch, as there is currently no other way to determine whether a block is intended to fall-through, or end as unreachable. Further work is needed there.
Differential Revision: https://reviews.llvm.org/D79605
show more ...
|