|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
547e3cba |
| 17-Jul-2022 |
Carl Ritson <[email protected]> |
[AMDGPU] Improve liveness copying in si-optimize-exec-masking-pre-ra
Further improve liveness copying for CC register post optimization by mirroring live internal splits. The fixes a bug in register
[AMDGPU] Improve liveness copying in si-optimize-exec-masking-pre-ra
Further improve liveness copying for CC register post optimization by mirroring live internal splits. The fixes a bug in register allocation when CC register liveness is extended across a branches instead of split.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D129557
show more ...
|
| #
8bc5e7ac |
| 06-Jul-2022 |
Carl Ritson <[email protected]> |
[AMDGPU] Additional liveness tests for si-optimize-exec-masking-pre-ra
Merge tests and fixes from D128110 and D128315 on top of already committed D128800.
Original author: arsenm
Reviewed By: arse
[AMDGPU] Additional liveness tests for si-optimize-exec-masking-pre-ra
Merge tests and fixes from D128110 and D128315 on top of already committed D128800.
Original author: arsenm
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D128882
show more ...
|
| #
d0f66416 |
| 30-Jun-2022 |
Carl Ritson <[email protected]> |
[AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra
Follow up to D127894, new liveness update code needs to handle the case where S_ANDN2 input must be extended through loops when V_C
[AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra
Follow up to D127894, new liveness update code needs to handle the case where S_ANDN2 input must be extended through loops when V_CNDMASK_B32 has been hoisted.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D128800
show more ...
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
b03d902b |
| 10-Jun-2022 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix invalid liveness after si-optimize-exec-masking-pre-ra
This was leaving behind a use at the deleted instruction which the verifier would fail during allocation.
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
6527b2a4 |
| 18-Feb-2022 |
Sebastian Neubauer <[email protected]> |
[AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
dccf83ac |
| 18-Mar-2021 |
alex-t <[email protected]> |
[AMDGPU] SIOptimizeExecMaskingPreRA should check constant bus constraint when folds EXEC copy
Folding EXEC copy into it's single use may lead to constant bus constraint violation as it adds one more
[AMDGPU] SIOptimizeExecMaskingPreRA should check constant bus constraint when folds EXEC copy
Folding EXEC copy into it's single use may lead to constant bus constraint violation as it adds one more SGPR operand. This change makes it validate the user instruction with the new SGPR operand and only fold it if it is legal.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D98888
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
560d7e04 |
| 20-Jan-2021 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
|
Revision tags: llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
5dc47541 |
| 03-Nov-2020 |
Mircea Trofin <[email protected]> |
[NFC] Use Register/MCRegister
Differential Revision: https://reviews.llvm.org/D90724
|
| #
be2afbd0 |
| 20-Oct-2020 |
Carl Ritson <[email protected]> |
[AMDGPU] Remove fix up operand from SI_ELSE
Remove immediate operand from SI_ELSE which indicates if EXEC has been modified. Instead always emit code that handles EXEC and remove unnecessary instru
[AMDGPU] Remove fix up operand from SI_ELSE
Remove immediate operand from SI_ELSE which indicates if EXEC has been modified. Instead always emit code that handles EXEC and remove unnecessary instructions during pre-RA optimisation.
This facilitates passes (i.e. SIWholeQuadMode) adding exec mask manipulation post control flow lowering, and pre control flow lower passes do not need to be aware of SI_ELSE handling.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D89644
show more ...
|
| #
6aabbead |
| 20-Oct-2020 |
Carl Ritson <[email protected]> |
[AMDGPU][NFC] Tidy SIOptimizeExecMaskingPreRA for extensibility
Remove duplicate code and move things around to make it easier to add additional optimisations to the pass.
Reviewed By: rampitec
Di
[AMDGPU][NFC] Tidy SIOptimizeExecMaskingPreRA for extensibility
Remove duplicate code and move things around to make it easier to add additional optimisations to the pass.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D89619
show more ...
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
| #
34978602 |
| 20-Aug-2020 |
Jay Foad <[email protected]> |
[AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister
... in favour of the isPhysical/isVirtual methods.
|
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1 |
|
| #
e9b236f4 |
| 25-Jul-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Check for other defs when folding conditions into s_andn2_b64
We can't fold the masked compare value through the select if the select condition is re-defed after the and instruction. Fixes a
AMDGPU: Check for other defs when folding conditions into s_andn2_b64
We can't fold the masked compare value through the select if the select condition is re-defed after the and instruction. Fixes a verifier error and trying to use the outgoing value defined in the block.
I'm not sure why this pass is bothering to handle physregs. It's making this more complex and forces extra liveness computation.
show more ...
|
|
Revision tags: llvmorg-12-init |
|
| #
74c14202 |
| 14-Jul-2020 |
Carl Ritson <[email protected]> |
[AMDGPU] Propagate dead flag during pre-RA exec mask optimizations
Preserve SCC dead flags in SIOptimizeExecMaskingPreRA. This helps with removing redundant s_andn2 instructions later.
Reviewed By:
[AMDGPU] Propagate dead flag during pre-RA exec mask optimizations
Preserve SCC dead flags in SIOptimizeExecMaskingPreRA. This helps with removing redundant s_andn2 instructions later.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D83637
show more ...
|
|
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
eab9a4f1 |
| 13-Apr-2020 |
Austin Kerbow <[email protected]> |
[AMDGPU] Don't assert on partial exec copy
After Machine CSE and coalescing we can end up with copies of exec to subregister SGPRs.
Differential Revision: https://reviews.llvm.org/D77992
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4 |
|
| #
a7352864 |
| 12-Mar-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Simplify exec copies
The patch removes late endcf handling and only leaves the related portion with redundant exec mask copy elimination.
Differential Revision: https://reviews.llvm.org/D7
[AMDGPU] Simplify exec copies
The patch removes late endcf handling and only leaves the related portion with redundant exec mask copy elimination.
Differential Revision: https://reviews.llvm.org/D76095
show more ...
|
| #
9801e546 |
| 10-Mar-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Disable nested endcf collapse
The assumption is that conditional regions are perfectly nested and a mask restored at the exit from the inner block will be completely covered by a mask resto
[AMDGPU] Disable nested endcf collapse
The assumption is that conditional regions are perfectly nested and a mask restored at the exit from the inner block will be completely covered by a mask restored in the outer.
It turns out with our current structurizer this is not always the case.
Disable the optimization for now, but I want to keep it around for a while to either try after further structurizer changes or to move it into control flow lowering where we have more info and reuse the test.
Differential Revision: https://reviews.llvm.org/D75958
show more ...
|
|
Revision tags: llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
| #
05da2fe5 |
| 13-Nov-2019 |
Reid Kleckner <[email protected]> |
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of reco
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
show more ...
|
| #
df6e6769 |
| 08-Oct-2019 |
Nicolai Haehnle <[email protected]> |
AMDGPU: Propagate undef flag during pre-RA exec mask optimizations
Summary: Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wd
AMDGPU: Propagate undef flag during pre-RA exec mask optimizations
Summary: Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68184
llvm-svn: 374041
show more ...
|
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3 |
|
| #
4b7fc85c |
| 20-Aug-2019 |
Matt Arsenault <[email protected]> |
Revert "AMDGPU: Fix iterator error when lowering SI_END_CF"
This reverts r367500 and r369203. This is causing various test failures.
llvm-svn: 369417
|
| #
0c476111 |
| 15-Aug-2019 |
Daniel Sanders <[email protected]> |
Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Re
Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
show more ...
|
|
Revision tags: llvmorg-9.0.0-rc2 |
|
| #
2bea69bf |
| 01-Aug-2019 |
Daniel Sanders <[email protected]> |
Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC
llvm-svn: 367633
|
| #
d48324ff |
| 01-Aug-2019 |
Matt Arsenault <[email protected]> |
Reapply "AMDGPU: Split block for si_end_cf"
This reverts commit r359363, reapplying r357634
llvm-svn: 367500
|
|
Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3 |
|
| #
52500216 |
| 16-Jun-2019 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] gfx10 conditional registers handling
This is cpp source part of wave32 support, excluding overriden getRegClass().
Differential Revision: https://reviews.llvm.org/D63351
llvm-svn: 363513
|
|
Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
76c5b629 |
| 27-Apr-2019 |
Mark Searles <[email protected]> |
Revert "AMDGPU: Split block for si_end_cf"
This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4.
We discovered some internal test failures, so reverting for now.
Differential Revision: htt
Revert "AMDGPU: Split block for si_end_cf"
This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4.
We discovered some internal test failures, so reverting for now.
Differential Revision: https://reviews.llvm.org/D61213
llvm-svn: 359363
show more ...
|