|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
bc9b964f |
| 20-Jul-2022 |
Arthur Eubanks <[email protected]> |
[NFC] Suppress unused variable warning in non-assert builds
|
| #
dc850fbf |
| 19-Jul-2022 |
Joe Nash <[email protected]> |
[AMDGPU] NFC. Assert that mask is full with VOPC DPP
VOPC DPP should not be formed when the row_mask and bank_mask are not 0xf (full) because the resulting VOP DPP would have different semantics than the MOV DPP followed by VOP. Existing checks in GCNDPPCombine cover this case but for different reasons, so assert the property for future-proofing.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D130101
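The property asserted by this commit can be illustrated with a small standalone sketch. It does not use the LLVM API: `masksAreFull`, `Lane`, `movThenCompare`, and `fusedCompare` are hypothetical names, and the single-lane model assumes that a lane masked off by row_mask/bank_mask leaves its destination unwritten. Under that assumption, the two forms agree only when the lane is enabled, which is why the combine requires both masks to be 0xf.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: both 4-bit masks must be full (0xf).
constexpr bool masksAreFull(unsigned RowMask, unsigned BankMask) {
  return RowMask == 0xF && BankMask == 0xF;
}

// Simplified single-lane model (an assumption, not the hardware spec).
struct Lane {
  int Old;       // value of the "old" operand for this lane
  int Shuffled;  // value DPP would read from the selected source lane
  int Src1;      // second compare operand
  bool Enabled;  // lane selected by row_mask and bank_mask
};

// MOV DPP followed by a separate compare: the compare always yields a bit,
// using the old value when the lane is masked off.
bool movThenCompare(const Lane &L) {
  int Tmp = L.Enabled ? L.Shuffled : L.Old;
  return Tmp < L.Src1;
}

// Fused compare with DPP: a masked-off lane produces no result at all.
std::optional<bool> fusedCompare(const Lane &L) {
  if (!L.Enabled)
    return std::nullopt;
  return L.Shuffled < L.Src1;
}

// Small self-check of the model.
inline void checkMaskModel() {
  Lane On{0, 5, 7, true};
  assert(fusedCompare(On).has_value());
  assert(movThenCompare(On) == *fusedCompare(On));
}
```

With a masked-off lane, `movThenCompare` still returns a value while `fusedCompare` returns nothing, so the fused form is not a drop-in replacement unless `masksAreFull` holds.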
|
| #
b28bb8cc |
| 15-Jul-2022 |
Joe Nash <[email protected]> |
[AMDGPU] Remove old operand from VOPC DPP
For most DPP instructions, the old operand stores the value that was in the current lane before the DPP operation, and is tied to the destination. For VOPC DPP, this is unnecessary and incorrect.
There appears to have been a latent bug related to D122737 in SIInstrInfo::isOperandLegal. If you checked whether a register operand was legal when the InstructionDesc expected an immediate, it reported the operand as valid. The fix is necessary for, and tested in, this patch.
Reviewed By: foad, rampitec
Differential Revision: https://reviews.llvm.org/D130040
|
| #
0483c91e |
| 27-Jun-2022 |
Joe Nash <[email protected]> |
[AMDGPU] gfx11 CodeGen for new DPP instructions
Modifies the GCNDPPCombine pass to enable DPP formation for the new DPP instruction in gfx11, namely VOP3 encoded instructions with DPP and VOPC with DPP.
Depends on D128656
Reviewed By: #amdgpu, rampitec
Differential Revision: https://reviews.llvm.org/D128682
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2 |
|
| #
ba6c8d42 |
| 21-Apr-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Combine DPP mov even if old reg def is in different BB
Given a DPP mov like this:
  %2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
  ...
  %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec
this patch just removes a check that %2 (the "old reg") was defined in the same BB as the DPP mov instruction. GCNDPPCombine requires that the MIR is in SSA form so I don't understand why the BB matters.
This lets the optimization work in more real world cases when the definition of %2 gets hoisted out of a loop.
Differential Revision: https://reviews.llvm.org/D124182
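The reasoning about SSA form can be sketched outside LLVM as follows. This is a minimal model, not the pass itself: `Def`, `DefMap`, `oldImmSameBlock`, and `oldImmAnyBlock` are hypothetical names. It assumes each virtual register has exactly one definition, so looking up the "old" register's defining value does not depend on which basic block the definition lives in.

```cpp
#include <cassert>
#include <map>
#include <optional>

// Hypothetical record of a virtual register's single SSA definition.
struct Def {
  int Block;     // basic block containing the definition
  int ImmValue;  // immediate the register is set to (e.g. V_MOV_B32 0)
};

// In SSA form there is exactly one definition per virtual register.
using DefMap = std::map<unsigned, Def>;

// Lookup with the same-block restriction that the commit removes.
std::optional<int> oldImmSameBlock(const DefMap &Defs, unsigned OldReg,
                                   int UseBlock) {
  auto It = Defs.find(OldReg);
  if (It == Defs.end() || It->second.Block != UseBlock)
    return std::nullopt;
  return It->second.ImmValue;
}

// Lookup without the restriction: the unique definition is enough.
std::optional<int> oldImmAnyBlock(const DefMap &Defs, unsigned OldReg) {
  auto It = Defs.find(OldReg);
  if (It == Defs.end())
    return std::nullopt;
  return It->second.ImmValue;
}

// Small self-check: %2 is defined in block 0 and used in block 1.
inline void checkOldRegLookup() {
  DefMap Defs{{2u, Def{0, 0}}};
  assert(!oldImmSameBlock(Defs, 2u, 1).has_value());
  assert(oldImmAnyBlock(Defs, 2u).has_value());
}
```

In this model, a definition hoisted out of a loop (block 0) is invisible to the restricted lookup from the loop body (block 1) but is still found by the unrestricted one.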
|
|
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4 |
|
| #
31f215ab |
| 10-Mar-2022 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Support v_mov_b64 in dpp combine
Differential Revision: https://reviews.llvm.org/D121411
|
|
Revision tags: llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
4bef0304 |
| 03-Nov-2021 |
Kazu Hirata <[email protected]> |
[AArch64, AMDGPU] Use make_early_inc_range (NFC)
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
b22721f0 |
| 19-Apr-2021 |
Jay Foad <[email protected]> |
[AMDGPU] GCNDPPCombine: don't shrink V_ADD_CO_U32 if carry out is used
Don't shrink VOP3 instructions if there are any uses of a carry-out operand, because the shrunken form of the instruction would write the carry-out to vcc instead of to a virtual register.
Differential Revision: https://reviews.llvm.org/D100760
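The rule described here can be written as a small standalone predicate. This is a sketch under stated assumptions rather than the code in the pass: `VOP3Inst` and `isShrinkableSketch` are hypothetical, and the real check inspects machine operands instead of a precomputed use count.

```cpp
#include <cassert>

// Hypothetical, simplified description of a VOP3 instruction.
struct VOP3Inst {
  bool HasCarryOut;       // e.g. V_ADD_CO_U32 defines a carry-out
  unsigned CarryOutUses;  // number of uses of the carry-out virtual register
};

// The shrunken encoding would write the carry-out to vcc instead of a
// virtual register, so any existing use of that register blocks shrinking.
bool isShrinkableSketch(const VOP3Inst &MI) {
  if (MI.HasCarryOut && MI.CarryOutUses != 0)
    return false;
  return true;
}

// Small self-check of the predicate.
inline void checkShrinkRule() {
  assert(isShrinkableSketch(VOP3Inst{true, 0}));
  assert(!isShrinkableSketch(VOP3Inst{true, 1}));
}
```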
|
| #
a02aa913 |
| 19-Apr-2021 |
Jay Foad <[email protected]> |
[AMDGPU] GCNDPPCombine: simplify API of isShrinkable. NFC.
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
538bda0b |
| 22-Mar-2021 |
Joe Nash <[email protected]> |
[AMDGPU] Refactor DPPCombine
NFC. Extract IsShrinkable into a helper function, and make Subtarget a member variable.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D99099
Change-Id: If4bc97a88a9ae4eb1df47e717345d46a6ed515bf
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
a8d9d507 |
| 17-Feb-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
560d7e04 |
| 20-Jan-2021 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
|
Revision tags: llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init |
|
| #
79f67cae |
| 14-Jul-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Rename add/sub with carry out instructions
The hardware has created a real mess in the naming for add/sub, which have been renamed basically every generation. Switch the carry out pseudos to have the gfx9/gfx10 names. We were using the original SI/CI v_add_i32/v_sub_i32 names. Later targets reintroduced these names as carryless instructions with a saturating clamp bit, which we do not define. Do this rename so we can unambiguously add these missing instructions.
The carry-in versions should also be renamed, but at least those had a consistent _u32 name to begin with. The 16-bit instructions were also renamed, but aren't ambiguous.
This does regress assembler error message quality in some cases. In mismatched wave32/wave64 situations, this will switch from "unsupported instruction" to "invalid operand", with the error pointing at the wrong position. I couldn't quite follow how the assembler selects these, but the previous behavior seemed accidental to me. It looked like there was a partial attempt to handle this which was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it isn't used for anything).
|
|
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
bb69ca82 |
| 25-Jun-2020 |
vpykhtin <[email protected]> |
[AMDGPU] Don't combine DPP if DPP register is used more than once per instruction
Reviewers: arsenm, rampitec, foad
Reviewed By: rampitec, foad
Subscribers: wuzish, kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82551
|
| #
d0b0b252 |
| 28-Jun-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Use IsSSA property check instead of asserting on isSSA
Also fix an SSA violation in a test the MIRParser/verifier fails to catch. It's illegal to define a subregister in SSA. For the purpose of the test, it just needs to define the super-register to use the subregister in the use operand.
|
| #
07cd19ef |
| 27-May-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix dropping MI flags when rewriting instructions
All 3 passes that change instruction encodings were dropping MI flags. This avoids scheduling regressions caused by setting mayRaiseFPExceptions on FP instructions for non-strictfp functions.
|
|
Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
| #
525f9c0b |
| 20-Nov-2019 |
Dmitry Preobrazhensky <[email protected]> |
[AMDGPU][DPP] Corrected DPP combiner
Added a check to make sure that the selected dpp opcode is supported by target.
Reviewers: vpykhtin, arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D70402
|
| #
c9c18e5a |
| 25-Oct-2019 |
vpykhtin <[email protected]> |
[AMDGPU] Disallow dpp combining for dpp instructions without Src2 operand (when Src2 is required)
Differential revision: https://reviews.llvm.org/D69430
|
| #
edcd5815 |
| 16-Oct-2019 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Do not combine dpp mov reading physregs
We cannot be sure physregs will stay unchanged.
Differential Revision: https://reviews.llvm.org/D69065
llvm-svn: 375033
|
| #
3d99310c |
| 16-Oct-2019 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Do not combine dpp with physreg def
We will remove dpp mov along with the physreg def otherwise.
Differential Revision: https://reviews.llvm.org/D69063
llvm-svn: 375030
|
| #
1184c27f |
| 15-Oct-2019 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Support mov dpp with 64 bit operands
We define mov/update dpp intrinsics as overloaded but do not support i64, which is a practically useful type. Fix the selection and lowering.
Differential Revision: https://reviews.llvm.org/D68673
llvm-svn: 374910
|
| #
6e8599d9 |
| 15-Oct-2019 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Allow DPP combiner to work with REG_SEQUENCE
Differential Revision: https://reviews.llvm.org/D68828
llvm-svn: 374908
|
| #
19a1a739 |
| 10-Oct-2019 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Handle undef old operand in DPP combine
It was missing an undef flag.
Differential Revision: https://reviews.llvm.org/D68813
llvm-svn: 374455
|
| #
c6dec1d8 |
| 09-Oct-2019 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Fixed dpp combine of VOP1
If original instruction did not have source modifiers they were not added to the new DPP instruction as well, even if needed.
Differential Revision: https://reviews.llvm.org/D68729
llvm-svn: 374241
|