GCNDPPCombine.cpp - OpenGrok history log for /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# bc9b964f	20-Jul-2022	Arthur Eubanks <[email protected]>	[NFC] Suppress unused variable warning in non-assert builds
# dc850fbf	19-Jul-2022	Joe Nash <[email protected]>	[AMDGPU] NFC. Assert that mask is full with VOPC DPP. VOPC DPP should not be formed when the row_mask and bank_mask are not 0xf (full) because the resulting VOP DPP would have different semantics than the MOV DPP followed by VOP. Existing checks in GCNDPPCombine cover this case but for different reasons, so assert the property for future-proofing. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D130101
# b28bb8cc	15-Jul-2022	Joe Nash <[email protected]>	[AMDGPU] Remove old operand from VOPC DPP. For most DPP instructions, the old operand stores the value that was in the current lane before the DPP operation, and is tied to the destination. For VOPC DPP, this is unnecessary and incorrect. There appears to have been a latent bug related to D122737 with SIInstrInfo::isOperandLegal. If you checked if a register operand was legal when the InstructionDesc expected an immediate, it reported that it was valid. Its fix is necessary for and tested in this patch. Reviewed By: foad, rampitec Differential Revision: https://reviews.llvm.org/D130040
# 0483c91e	27-Jun-2022	Joe Nash <[email protected]>	[AMDGPU] gfx11 CodeGen for new DPP instructions. Modifies the GCNDPPCombine pass to enable DPP formation for the new DPP instruction in gfx11, namely VOP3 encoded instructions with DPP and VOPC with DPP. Depends on D128656 Reviewed By: #amdgpu, rampitec Differential Revision: https://reviews.llvm.org/D128682
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2
# ba6c8d42	21-Apr-2022	Jay Foad <[email protected]>	[AMDGPU] Combine DPP mov even if old reg def is in different BB. Given a DPP mov like this: %2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec ... %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec this patch just removes a check that %2 (the "old reg") was defined in the same BB as the DPP mov instruction. GCNDPPCombine requires that the MIR is in SSA form so I don't understand why the BB matters. This lets the optimization work in more real world cases when the definition of %2 gets hoisted out of a loop. Differential Revision: https://reviews.llvm.org/D124182
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4
# 31f215ab	10-Mar-2022	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Support v_mov_b64 in dpp combine Differential Revision: https://reviews.llvm.org/D121411
Revision tags: llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 4bef0304	03-Nov-2021	Kazu Hirata <[email protected]>	[AArch64, AMDGPU] Use make_early_inc_range (NFC)
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# b22721f0	19-Apr-2021	Jay Foad <[email protected]>	[AMDGPU] GCNDPPCombine: don't shrink V_ADD_CO_U32 if carry out is used. Don't shrink VOP3 instructions if there are any uses of a carry-out operand, because the shrunken form of the instruction would write the carry-out to vcc instead of to a virtual register. Differential Revision: https://reviews.llvm.org/D100760
# a02aa913	19-Apr-2021	Jay Foad <[email protected]>	[AMDGPU] GCNDPPCombine: simplify API of isShrinkable. NFC.
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 538bda0b	22-Mar-2021	Joe Nash <[email protected]>	[AMDGPU] Refactor DPPCombine NFC. Extract IsShrinkable into a helper function, and make Subtarget a member variable. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D99099 Change-Id: If4bc97a88a9ae4eb1df47e717345d46a6ed515bf
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# a8d9d507	17-Feb-2021	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04	20-Jan-2021	dfukalov <[email protected]>	[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets ... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036
Revision tags: llvmorg-11.1.0-rc1
# 6a87e9b0	25-Dec-2020	dfukalov <[email protected]>	[NFC][AMDGPU] Reduce include files dependency. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init
# 79f67cae	14-Jul-2020	Matt Arsenault <[email protected]>	AMDGPU: Rename add/sub with carry out instructions. The hardware has created a real mess in the naming for add/sub, which have been renamed basically every generation. Switch the carry out pseudos to have the gfx9/gfx10 names. We were using the original SI/CI v_add_i32/v_sub_i32 names. Later targets reintroduced these names as carryless instructions with a saturating clamp bit, which we do not define. Do this rename so we can unambiguously add these missing instructions. The carry-in versions should also be renamed, but at least those had a consistent _u32 name to begin with. The 16-bit instructions were also renamed, but aren't ambiguous. This does regress assembler error message quality in some cases. In mismatched wave32/wave64 situations, this will switch from "unsupported instruction" to "invalid operand", with the error pointing at the wrong position. I couldn't quite follow how the assembler selects these, but the previous behavior seemed accidental to me. It looked like there was a partial attempt to handle this which was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it isn't used for anything).
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# bb69ca82	25-Jun-2020	vpykhtin <[email protected]>	[AMDGPU] Don't combine DPP if DPP register is used more than once per instruction. Reviewers: arsenm, rampitec, foad Reviewed By: rampitec, foad Subscribers: wuzish, kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82551
# d0b0b252	28-Jun-2020	Matt Arsenault <[email protected]>	AMDGPU: Use IsSSA property check instead of asserting on isSSA. Also fix an SSA violation in a test the MIRParser/verifier fails to catch. It's illegal to define a subregister in SSA. For the purpose of the test, it just needs to define the super-register to use the subregister in the use operand.
# 07cd19ef	27-May-2020	Matt Arsenault <[email protected]>	AMDGPU: Fix dropping MI flags when rewriting instructions. All 3 passes that change instruction encodings were dropping MI flags. This avoids scheduling regressions caused by setting mayRaiseFPExceptions on FP instructions for non-strictfp functions.
Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 525f9c0b	20-Nov-2019	Dmitry Preobrazhensky <[email protected]>	[AMDGPU][DPP] Corrected DPP combiner. Added a check to make sure that the selected dpp opcode is supported by target. Reviewers: vpykhtin, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D70402
# c9c18e5a	25-Oct-2019	vpykhtin <[email protected]>	[AMDGPU] Disallow dpp combining for dpp instructions without Src2 operand (when Src2 is required) Differential revision: https://reviews.llvm.org/D69430
# edcd5815	16-Oct-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Do not combine dpp mov reading physregs We cannot be sure physregs will stay unchanged. Differential Revision: https://reviews.llvm.org/D69065 llvm-svn: 375033
# 3d99310c	16-Oct-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Do not combine dpp with physreg def We will remove dpp mov along with the physreg def otherwise. Differential Revision: https://reviews.llvm.org/D69063 llvm-svn: 375030
# 1184c27f	15-Oct-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Support mov dpp with 64 bit operands. We define mov/update dpp intrinsics as overloaded but do not support i64, which is a practically useful type. Fix the selection and lowering. Differential Revision: https://reviews.llvm.org/D68673 llvm-svn: 374910
# 6e8599d9	15-Oct-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Allow DPP combiner to work with REG_SEQUENCE Differential Revision: https://reviews.llvm.org/D68828 llvm-svn: 374908
# 19a1a739	10-Oct-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Handle undef old operand in DPP combine It was missing an undef flag. Differential Revision: https://reviews.llvm.org/D68813 llvm-svn: 374455
# c6dec1d8	09-Oct-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Fixed dpp combine of VOP1. If original instruction did not have source modifiers they were not added to the new DPP instruction as well, even if needed. Differential Revision: https://reviews.llvm.org/D68729 llvm-svn: 374241