|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
37b37838 |
| 16-Mar-2022 |
Shengchen Kan <[email protected]> |
[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
2bbe6506 |
| 27-Feb-2022 |
Carl Ritson <[email protected]> |
[AMDGPU] Remove redundant isVALU in SIPreEmitPeephole. NFC
Remove redundant isVALU call added in D120202.
|
| #
565af157 |
| 25-Feb-2022 |
Carl Ritson <[email protected]> |
[AMDGPU] Extend pre-emit peephole to redundantly masked VCC
Extend pre-emit peephole for S_CBRANCH_VCC[N]Z to eliminate redundant S_AND operations against EXEC for V_CMP results in VCC. These occur
[AMDGPU] Extend pre-emit peephole to redundantly masked VCC
Extend pre-emit peephole for S_CBRANCH_VCC[N]Z to eliminate redundant S_AND operations against EXEC for V_CMP results in VCC. These occur after after register allocation when VCC has been selected as the comparison destination.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D120202
show more ...
|
| #
6527b2a4 |
| 18-Feb-2022 |
Sebastian Neubauer <[email protected]> |
[AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
59a26448 |
| 22-Nov-2021 |
Kazu Hirata <[email protected]> |
[Target] Use range-based for loops (NFC)
|
| #
30b27ecf |
| 19-Nov-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Use new opcode for indexed vgpr reads
Introduce V_MOV_B32_indirect_read for indexed vgpr reads (and rename the old V_MOV_B32_indirect to V_MOV_B32_indirect_write) so they can be unambiguous
[AMDGPU] Use new opcode for indexed vgpr reads
Introduce V_MOV_B32_indirect_read for indexed vgpr reads (and rename the old V_MOV_B32_indirect to V_MOV_B32_indirect_write) so they can be unambiguously distinguished from regular V_MOV_B32_e32. Previously they were distinguished by looking for extra implicit operands but this is fragile because regular moves sometimes have extra implicit operands too: - either by accident, when instructions end up with duplicate implicit operands (see e.g. D100939) - or by design, when SIInstrInfo::copyPhysReg breaks a multi-dword copy into individual subreg mov instructions and adds implicit operands for the super-register.
The effect of this is that SIInstrInfo::isFoldableCopy can be simplified and identifies more foldable copies. The test diffs show that more immediate 0 values have been folded as inline operands.
SIInstrInfo::isReallyTriviallyReMaterializable could probably be simplified too but that is not part of this patch.
Differential Revision: https://reviews.llvm.org/D114230
show more ...
|
| #
d1f45ed5 |
| 11-Nov-2021 |
Neubauer, Sebastian <[email protected]> |
[AMDGPU][NFC] Fix typos
Differential Revision: https://reviews.llvm.org/D113672
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
7e43483d |
| 30-Apr-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Remove set_gpr_idx instructions in conditional blocks
SIPreEmitPeephole did not try to remove redundant s_set_gpr_idx_* instructions in blocks that end with a conditional branch instruction
[AMDGPU] Remove set_gpr_idx instructions in conditional blocks
SIPreEmitPeephole did not try to remove redundant s_set_gpr_idx_* instructions in blocks that end with a conditional branch instruction. This seems like a simple oversight.
Differential Revision: https://reviews.llvm.org/D101629
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
fe5f4c39 |
| 20-Mar-2021 |
Carl Ritson <[email protected]> |
[AMDGPU] Rename SIInsertSkips Pass
Pass no longer handles skips. Pass now removes unnecessary unconditional branches and lowers early termination branches. Hence rename to SILateBranchLowering.
Mo
[AMDGPU] Rename SIInsertSkips Pass
Pass no longer handles skips. Pass now removes unnecessary unconditional branches and lowers early termination branches. Hence rename to SILateBranchLowering.
Move code to handle returns to epilog from SIPreEmitPeephole into SILateBranchLowering. This means SIPreEmitPeephole only contains optional optimisations, and all required transforms are in SILateBranchLowering.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D98915
show more ...
|
| #
5df2af8b |
| 20-Mar-2021 |
Carl Ritson <[email protected]> |
[AMDGPU] Merge SIRemoveShortExecBranches into SIPreEmitPeephole
SIRemoveShortExecBranches is an optimisation so fits well in the context of SIPreEmitPeephole.
Test changes relate to early terminati
[AMDGPU] Merge SIRemoveShortExecBranches into SIPreEmitPeephole
SIRemoveShortExecBranches is an optimisation so fits well in the context of SIPreEmitPeephole.
Test changes relate to early termination from kills which have now been lowered prior to considering branches for removal. As these use s_cbranch the execz skips are now retained instead. Currently either behaviour is valid as kill with EXEC=0 is a nop; however, if early termination is used differently in future then the new behaviour is the correct one.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D98917
show more ...
|
| #
b76c0902 |
| 20-Mar-2021 |
Carl Ritson <[email protected]> |
[AMDGPU] Allow index optimisation in SIPreEmitPeephole for bundles
Add code so duplication index register changes can be removed from inside bundles.
Reviewed By: rampitec, foad
Differential Revis
[AMDGPU] Allow index optimisation in SIPreEmitPeephole for bundles
Add code so duplication index register changes can be removed from inside bundles.
Reviewed By: rampitec, foad
Differential Revision: https://reviews.llvm.org/D98940
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
560d7e04 |
| 20-Jan-2021 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
|
Revision tags: llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2 |
|
| #
d538c583 |
| 13-Aug-2020 |
Carl Ritson <[email protected]> |
[AMDGPU] Fix missed SI_RETURN_TO_EPILOG in pre-emit peephole
SIPreEmitPeephole does not process all terminators, which means it can fail to handle SI_RETURN_TO_EPILOG if immediately preceeded by a b
[AMDGPU] Fix missed SI_RETURN_TO_EPILOG in pre-emit peephole
SIPreEmitPeephole does not process all terminators, which means it can fail to handle SI_RETURN_TO_EPILOG if immediately preceeded by a branch to the early exit block.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D85872
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc1 |
|
| #
3a186657 |
| 17-Jul-2020 |
Carl Ritson <[email protected]> |
[AMDGPU] Translate s_and/s_andn2 to s_mov in vcc optimisation
When SCC is dead, but VCC is required then replace s_and / s_andn2 with s_mov into VCC when mask value is 0 or -1.
Reviewed By: rampite
[AMDGPU] Translate s_and/s_andn2 to s_mov in vcc optimisation
When SCC is dead, but VCC is required then replace s_and / s_andn2 with s_mov into VCC when mask value is 0 or -1.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D83850
show more ...
|
|
Revision tags: llvmorg-12-init |
|
| #
67422612 |
| 15-Jul-2020 |
Carl Ritson <[email protected]> |
[AMDGPU] Apply pre-emit s_cbranch_vcc optimation to more patterns
Add handling of s_andn2 and mask of 0. This eliminates redundant instructions from uniform control flow.
Reviewed By: rampitec
Dif
[AMDGPU] Apply pre-emit s_cbranch_vcc optimation to more patterns
Add handling of s_andn2 and mask of 0. This eliminates redundant instructions from uniform control flow.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D83641
show more ...
|
|
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
226cda58 |
| 17-Jun-2020 |
Christudasan Devadasan <[email protected]> |
[AMDGPU] Moving SI_RETURN_TO_EPILOG handling out of SIInsertSkips.
For now, moving it to SIPreEmitPeephole. Should find a right place to have this code.
Reviewed By: nhaehnle
Differential revision
[AMDGPU] Moving SI_RETURN_TO_EPILOG handling out of SIInsertSkips.
For now, moving it to SIPreEmitPeephole. Should find a right place to have this code.
Reviewed By: nhaehnle
Differential revision: https://reviews.llvm.org/D77544
show more ...
|
| #
677929e3 |
| 19-May-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Process V_MOV_B32_indirect in SET_GPR_IDX optimization
Differential Revision: https://reviews.llvm.org/D80256
|
|
Revision tags: llvmorg-10.0.1-rc1 |
|
| #
7d16a22e |
| 13-May-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Peephole adjacent equivalent S_SET_GPR_IDX_ON
Differential Revision: https://reviews.llvm.org/D79907
|
| #
ce984129 |
| 25-Mar-2020 |
cdevadas <[email protected]> |
[AMDGPU] Add SIPreEmitPeephole pass.
This pass can handle all the optimization opportunities found just before code emission. Presently it includes the handling of vcc branch optimization that was h
[AMDGPU] Add SIPreEmitPeephole pass.
This pass can handle all the optimization opportunities found just before code emission. Presently it includes the handling of vcc branch optimization that was handled earlier in SIInsertSkips.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D76712
show more ...
|