|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
8d0383eb |
| 24-Jun-2022 |
Matt Arsenault <[email protected]> |
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.
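The change described above can be pictured with a small toy model, written in Python rather than LLVM's C++. Every name here (`MemFlag`, `MemOperand`, `set_flags_upfront`, `is_rematerializable_load`) is hypothetical and is not an LLVM API. The sketch only illustrates the idea of answering the alias-analysis question once, when the memory operand is created, and consulting nothing but the flags later.

```python
from dataclasses import dataclass
from enum import Flag


class MemFlag(Flag):
    # Hypothetical stand-ins for the invariant/dereferenceable operand flags.
    NONE = 0
    INVARIANT = 1
    DEREFERENCEABLE = 2


@dataclass
class MemOperand:
    flags: MemFlag = MemFlag.NONE


def set_flags_upfront(mmo, points_to_constant_memory):
    """Record the alias-analysis answer on the operand when it is created.

    Follows the behaviour the commit says it preserves for now: constant
    memory is assumed to imply dereferenceable as well as invariant.
    """
    if points_to_constant_memory:
        mmo.flags |= MemFlag.INVARIANT | MemFlag.DEREFERENCEABLE
    return mmo


def is_rematerializable_load(mmo):
    """Later check: consult only the flags, with no alias-analysis query."""
    needed = MemFlag.INVARIANT | MemFlag.DEREFERENCEABLE
    return (mmo.flags & needed) == needed
```

The trade-off mentioned in the message shows up in `set_flags_upfront`: the query is paid for every operand, whether or not the rematerialization check is ever reached.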
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
e9f7263b |
| 16-Jun-2022 |
Kito Cheng <[email protected]> |
Reland "[SplitKit] Handle early clobber + tied to def correctly"
This reverts commit 7207373e1eb0dd419b4e13a5e2d0ca146ef9544e.
We found another RISC-V bug when landing D126048, and it has been fixed by D127642 now.
Differential Revision: https://reviews.llvm.org/D126048
|
|
Revision tags: llvmorg-14.0.5 |
|
| #
7207373e |
| 08-Jun-2022 |
Kito Cheng <[email protected]> |
Revert "[SplitKit] Handle early clobber + tied to def correctly"
Reverted due to failures with LLVM_ENABLE_EXPENSIVE_CHECKS.
This reverts commit e14d04909df4e52e531f6c2e045c3cf9638dd817.
|
| #
e14d0490 |
| 26-May-2022 |
Kito Cheng <[email protected]> |
[SplitKit] Handle early clobber + tied to def correctly
The splitter will try to extend a live range into the `r` slot for a use operand. That works in most situations, but it does not work correctly when the operand is tied to a def and the def operand is early clobber.
An example to demonstrate what goes wrong:
  0  %0 = ...
  16 early-clobber %0 = Op %0 (tied-def 0), ...
  32 ... = Op %0
Before extend:
  %0 = [0r, 0d) [16e, 32d)
The point we want to extend to is 16e, not 16r, so that the range grows from 0d to 16e. If we use 16r here we extend nothing, because that point is already contained in [16e, 32d).
This patch adds a check to detect such cases and adjusts the extend point.
Detailed explanation for testcase: https://reviews.llvm.org/D126047
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D126048
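The effect of the extend point can be reproduced with a toy model in Python. The integer encoding of slot indexes and the helpers `idx`, `is_live_before`, and `extend_point` are assumptions made for illustration and do not correspond to LLVM functions. The sketch also assumes that an extension is a no-op whenever the slot just before the extend point is already covered by a segment.

```python
# Slots within one instruction, in increasing order: block, early-clobber,
# register, dead. The numeric encoding below is an illustrative assumption.
BLOCK, EARLY_CLOBBER, REGISTER, DEAD = range(4)


def idx(instr, slot):
    """Encode (instruction number, slot) as one comparable integer."""
    return instr * 4 + slot


def is_live_before(segments, point):
    """True if a half-open segment [start, end) covers the slot before `point`."""
    prev = point - 1
    return any(start <= prev < end for start, end in segments)


def extend_point(instr, tied_to_early_clobber_def):
    """Pick the slot that a use at `instr` should be extended to."""
    slot = EARLY_CLOBBER if tied_to_early_clobber_def else REGISTER
    return idx(instr, slot)


# %0 = [0r, 0d) [16e, 32d), as in the example from the commit message.
segments = [
    (idx(0, REGISTER), idx(0, DEAD)),
    (idx(16, EARLY_CLOBBER), idx(32, DEAD)),
]

# Extending to 16r finds the range already live, so nothing would be added.
already_live_at_16r = is_live_before(segments, extend_point(16, False))
# Extending to 16e finds a gap after 0d, so the range really is extended.
already_live_at_16e = is_live_before(segments, extend_point(16, True))
```

In this model `already_live_at_16r` is true and `already_live_at_16e` is false, which matches the message's claim that using 16r extends nothing.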
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1 |
|
| #
a40dc4ea |
| 05-Feb-2022 |
Benjamin Kramer <[email protected]> |
Simplify mask creation with llvm::seq. NFCI.
|
| #
592f52de |
| 03-Feb-2022 |
Mircea Trofin <[email protected]> |
[nfc][regalloc] const LiveIntervals within the allocator
Once built, LiveIntervals are immutable. This patch captures that.
Differential Revision: https://reviews.llvm.org/D118918
|
|
Revision tags: llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
87c00878 |
| 28-Aug-2021 |
Matt Arsenault <[email protected]> |
SplitKit: Remove decade old live interval hack
This was trying to fixup broken live intervals coming out of the coalescer. The verifier is more complete now and no tests seem to fail without this.
|
|
Revision tags: llvmorg-13.0.0-rc2 |
|
| #
e1beebba |
| 10-Aug-2021 |
Ruiling Song <[email protected]> |
SplitKit: Don't further split subrange mask in buildCopy
We may use several COPY instructions to copy the needed sub-registers during split. But the way we split the lanes during the COPYs may be different from the subranges of the old register. This would fail when we extend the subranges of the new register because the LaneMasks do not match exactly between subranges of new register and old register. Since we are bundling the COPYs, I think there is no need to further refine the subranges of the new register based on the set of LaneMasks of the inserted COPYs.
I am not sure whether there will be further breaking cases. But since the subranges of the new register are created from the LaneMasks of the old register's subranges, it is highly likely that we will always find an exact LaneMask match. We can think about how to make extendPHIKillRanges() work for the subrange mask mismatch case if we meet more such cases in the future.
The test case was from D105065 by @arsenm.
Differential Revision: https://reviews.llvm.org/D107829
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
9f631d14 |
| 04-May-2021 |
Serguei Katkov <[email protected]> |
[GreedyRA] Add support for invoke statepoint with tied-defs.
The statepoint instruction uses tied-def registers to represent a live gc value that is both used and defined at the call. At the same time, an invoke statepoint is a last split point, since it can throw and jump to a landing pad. As a result, we have an instruction that is a last split point with tied-def registers, and we need to teach Greedy RA to work with it.
The option -use-registers-for-gc-values-in-landing-pad controls whether statepoint lowering will generate tied-defs for invoke statepoint and is off by default now.
To resolve all issues, the following changes have been made.
1) Last split point for an invoke statepoint should be the statepoint itself
If statepoint has a def it is a relocated gc pointer and it should be available in landing pad. So we cannot split interval after statepoint at end of basic block.
2) Do not split interval on tied-def
If the end of the interval for the overlap utility is a use that has a tied-def, we should not split the interval on this instruction, because the use and def may then get different registers, which breaks the tied-def property.
3) Take into account Last Split Point for enterIntvAtEnd
If the use after the Last Split Point is a def, it must be a tied-def, so we can take the def of the tied-use as ParentVNI; the tied-use and tied-def will then both be live in the resulting interval.
4) Handle the case when def is after LIP in InlineSpiller
If def of LI is after last insertion point of basic block we cannot hoist in this BB.
The example of such instruction is invoke statepoint where def represents the relocated live gc pointer. Invoke is a last insertion point and its def is located after it. In this case there is no place to insert spill and we bail out.
5) Fix removeBackCopies to account empty copies
RegAssignMap cannot hold empty interval, so do not set stop to kill value if it produces empty interval.
This can happen if we remove back-copy and right before that we have another back-copy.
For example, for parent %0 we can get
  %1 = COPY %0
  %2 = COPY %0
When removing %2, we cannot set the kill for %1 because its interval would be empty.
6) Do not hoist copy to BB if its def is after LSP
If the parent def is a LastSplitPoint or later we cannot hoist copy to this basic block because inserted copy (or re-materialization) will be located before the def.
All parts have been reviewed separately as follows:
https://reviews.llvm.org/D100747
https://reviews.llvm.org/D100748
https://reviews.llvm.org/D100750
https://reviews.llvm.org/D100927
https://reviews.llvm.org/D100945
https://reviews.llvm.org/D101028
Reviewers: reames, rnk, void, MatzeB, wmi, qcolombet
Reviewed By: reames, qcolombet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D101150
|
| #
b98807df |
| 13-Apr-2021 |
Hongtao Yu <[email protected]> |
[CSSPGO] Exclude pseudo probes from slot index
Pseudo probes are currently given a slot index like other regular instructions. This affects register pressure and lifetime weight computation, because pseudo probe instructions lengthen lifetimes. As a consequence, a program could get different code generated with and without pseudo probes. I'm closing the gap by excluding pseudo probes from slot indexes and downstream register allocation related passes.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D100334
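A rough Python sketch of the numbering idea follows. The function `number_instructions`, the `(name, is_pseudo_probe)` tuples, and the stride of 16 are illustrative assumptions rather than LLVM's actual interface. The point is only that skipping probes during numbering leaves the indexes of real instructions unchanged.

```python
def number_instructions(instrs, stride=16):
    """Assign increasing slot indexes to instructions, skipping pseudo probes.

    `instrs` is a list of (name, is_pseudo_probe) pairs in program order.
    """
    indexes = {}
    next_index = 0
    for name, is_pseudo_probe in instrs:
        if is_pseudo_probe:
            continue  # probes get no index, so they cannot stretch a lifetime
        indexes[name] = next_index
        next_index += stride
    return indexes


without_probes = number_instructions([("def", False), ("use", False)])
with_probes = number_instructions(
    [("def", False), ("probe", True), ("use", False)]
)
```

Both calls give `def` and `use` the same indexes, so the distance between a definition and its use no longer depends on whether probes were inserted.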
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
ffba9e59 |
| 22-Feb-2021 |
Kazu Hirata <[email protected]> |
[CodeGen] Use range-based for loops (NFC)
|
| #
0b417ba2 |
| 21-Feb-2021 |
Kazu Hirata <[email protected]> |
[CodeGen] Use range-based for loops (NFC)
|
| #
82492f24 |
| 17-Feb-2021 |
Mircea Trofin <[email protected]> |
[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit
VirtRegAuxInfo is an extensibility point, so the register allocator's decision on which implementation to use should be communicated to the other users - namely, LiveRangeEdit.
Differential Revision: https://reviews.llvm.org/D96898
|
| #
fd04f3a3 |
| 19-Feb-2021 |
Kazu Hirata <[email protected]> |
[CodeGen] Use range-based for loops (NFC)
|
| #
5318d9e5 |
| 18-Feb-2021 |
Philip Reames <[email protected]> |
[splitkit] Add a minor wrapper function for readability [NFC]
|
| #
1dfb06d0 |
| 18-Feb-2021 |
Philip Reames <[email protected]> |
[regalloc] Add a couple of dump routines for ease of debugging [NFC]
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
1b3d8dde |
| 02-Dec-2020 |
Matt Arsenault <[email protected]> |
CodeGen: Move function to get subregister indexes to cover a LaneMask
Return the best covering index, and additional needed to complete the mask. This logically belongs in TargetRegisterInfo, although I ended up not needing it for why I originally split this out.
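One way to picture "best covering index plus additional indexes" is a greedy cover over lane masks. The Python sketch below is an assumption about the general shape of such a search, not a transcription of the LLVM function. The name `covering_subreg_indexes` and the sub-register table in the usage lines are hypothetical.

```python
def covering_subreg_indexes(wanted, index_masks):
    """Return sub-register index names whose lane masks together equal `wanted`.

    Greedy sketch: pick the index that covers the most remaining lanes
    without touching any unwanted lane, then repeat for what is left.
    Returns None if the available indexes cannot cover the mask exactly.
    """
    chosen = []
    remaining = wanted
    while remaining:
        best = None
        best_lanes = 0
        for name, mask in index_masks.items():
            if mask & ~wanted:
                continue  # would include lanes that were not asked for
            lanes = bin(mask & remaining).count("1")
            if lanes > best_lanes:
                best, best_lanes = name, lanes
        if best is None:
            return None
        chosen.append(best)
        remaining &= ~index_masks[best]
    return chosen


# Hypothetical register with four sub-registers of two lanes each.
masks = {"sub0": 0x03, "sub1": 0x0C, "sub2": 0x30, "sub3": 0xC0,
         "sub0_sub1": 0x0F}
cover = covering_subreg_indexes(0x3F, masks)
```

With this table, a request for lanes `0x3F` is covered by the wide index first and then by one additional index for the lanes that remain.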
|
| #
c5c4dbd2 |
| 22-Jan-2021 |
Kazu Hirata <[email protected]> |
[CodeGen] Use llvm::append_range (NFC)
|
| #
29bd6519 |
| 30-Nov-2020 |
Matt Arsenault <[email protected]> |
SplitKit: Use Register
|
|
Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4 |
|
| #
cdac4492 |
| 28-Sep-2020 |
Jay Foad <[email protected]> |
[SplitKit] Cope with no live subranges in defFromParent
Following on from D87757 "[SplitKit] Only copy live lanes", it is possible to split a live range at a point when none of its subranges are live. This patch handles that case by inserting an implicit def of the superreg.
Patch by Quentin Colombet!
Differential Revision: https://reviews.llvm.org/D88397
|
|
Revision tags: llvmorg-11.0.0-rc3 |
|
| #
b34ddfcc |
| 21-Sep-2020 |
Jay Foad <[email protected]> |
[SplitKit] In addDeadDef tolerate parent range that defines more lanes
Following on from D87757 "[SplitKit] Only copy live lanes", in SplitEditor::addDeadDef, when we're checking whether the parent live interval has a subrange defining the same lanes, tolerate the case where the parent subrange defines a superset of the lanes. This can happen when the child subrange comes from SplitEditor::buildCopy decomposing a partial copy into a sequence of subreg copies that cover the required lanes.
Differential Revision: https://reviews.llvm.org/D88020
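The relaxed check can be sketched in a few lines of Python. The helper `find_parent_subrange` and the representation of subranges as bare integer lane masks are illustrative assumptions. The only point is the change from an exact-match test to a superset test.

```python
def find_parent_subrange(parent_masks, child_mask):
    """Return the first parent lane mask covering every lane of `child_mask`.

    An exact-match version would test `mask == child_mask`; the superset
    test below also accepts a parent subrange that defines extra lanes.
    """
    for mask in parent_masks:
        if (mask & child_mask) == child_mask:
            return mask
    return None


# The parent defines lanes 0x0F as one subrange; the child, produced by a
# decomposed sub-register copy, only defines lanes 0x03.
match = find_parent_subrange([0x0F, 0x30], 0x03)
```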
|
| #
6f6d389d |
| 16-Sep-2020 |
Jay Foad <[email protected]> |
[SplitKit] Only copy live lanes
When splitting a live interval with subranges, only insert copies for the lanes that are live at the point of the split. This avoids some unnecessary copies and fixes a problem where copying dead lanes was generating MIR that failed verification. The test case for this is test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir.
Without this fix, some earlier live range splitting would create %430:
  %430 [256r,848r:0)[848r,2584r:1) 0@256r 1@848r L0000000000000003 [848r,2584r:0) 0@848r L0000000000000030 [256r,2584r:0) 0@256r weight:1.480938e-03
  ...
  256B undef %430.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec
  ...
  848B %430.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec
  ...
  2584B %431:vreg_128 = COPY %430:vreg_128
Then RAGreedy::tryLocalSplit would split %430 into %432 and %433 just before 848B giving:
  %432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03
  %433 [844r,848r:0)[848r,2584r:1) 0@844r 1@848r L0000000000000030 [844r,2584r:0) 0@844r L0000000000000003 [844r,844d:0)[848r,2584r:1) 0@844r 1@848r weight:2.831776e-03
  ...
  256B undef %432.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec
  ...
  844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128 {
    internal %433.sub2:vreg_128 = COPY %432.sub2:vreg_128
  848B } %433.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec
  ...
  2584B %431:vreg_128 = COPY %433:vreg_128
Note that the copy from %432 to %433 at 844B is a curious bundle-without-a-BUNDLE-instruction that SplitKit creates deliberately, and it includes a copy of .sub0 which is not live at this point, and that causes it to fail verification:
  *** Bad machine code: No live subrange at use ***
  - function: zextload_global_v64i16_to_v64i64
  - basic block: %bb.0 (0x7faed48) [0B;2848B)
  - instruction: 844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128
  - operand 1: %432.sub0:vreg_128
  - interval: %432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03
  - at: 844B
Using real bundles with a BUNDLE instruction might also fix this problem, but the current fix is less invasive and also avoids some unnecessary copies.
https://bugs.llvm.org/show_bug.cgi?id=47492
Differential Revision: https://reviews.llvm.org/D87757
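The "only copy live lanes" rule can be illustrated with a small Python model of the %430 interval from the message. The function `live_lanes_at`, the use of plain integers for slot indexes (ignoring the slot letter), and the tuple layout are simplifying assumptions, not LLVM data structures.

```python
def live_lanes_at(subranges, point):
    """OR together the lane masks of every subrange that is live at `point`.

    `subranges` is a list of (lane_mask, segments) pairs, where each segment
    is a half-open (start, end) interval of slot indexes.
    """
    live = 0
    for mask, segments in subranges:
        if any(start <= point < end for start, end in segments):
            live |= mask
    return live


# Subranges of %430 from the commit message:
#   L...03 [848r,2584r)   and   L...30 [256r,2584r)
subranges_430 = [
    (0x03, [(848, 2584)]),
    (0x30, [(256, 2584)]),
]

lanes_to_copy = live_lanes_at(subranges_430, 844)
```

At the split point 844 only the `0x30` lanes are live, so in this model a copy of the `0x03` lanes (sub0 in the dump above) would not be emitted.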
|
| #
6e85c3d5 |
| 15-Sep-2020 |
Mircea Trofin <[email protected]> |
[NFC][Regalloc] accessors for 'reg' and 'weight'
Also renamed the fields to follow style guidelines.
Accessors help with readability - weight mutation, in particular, is easier to follow this way.
Differential Revision: https://reviews.llvm.org/D87725
|
|
Revision tags: llvmorg-11.0.0-rc2 |
|
| #
ebfa4104 |
| 13-Aug-2020 |
Simon Pilgrim <[email protected]> |
SplitKit.cpp - remove includes already included by SplitKit.h. NFC.
Don't duplicate includes already provided by the module header.
|
|
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
4b0aa572 |
| 16-May-2020 |
James Y Knight <[email protected]> |
Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
Before this instruction supported output values, it fit fairly naturally as a terminator. However, being a terminator while also supporting outputs causes some trouble, as the physreg->vreg COPY operations cannot be in the same block.
Modeling it as a non-terminator allows it to be handled the same way as invoke is handled already.
Most of the changes here were created by auditing all the existing users of MachineBasicBlock::isEHPad() and MachineBasicBlock::hasEHPadSuccessor(), and adding calls to isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.
Reviewed By: nickdesaulniers, void
Differential Revision: https://reviews.llvm.org/D79794
|