History log of /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp (Results 1 – 25 of 34)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# bc9b964f 20-Jul-2022 Arthur Eubanks <[email protected]>

[NFC] Suppress unused variable warning in non-assert builds


# dc850fbf 19-Jul-2022 Joe Nash <[email protected]>

[AMDGPU] NFC. Assert that mask is full with VOPC DPP

VOPC DPP should not be formed when the row_mask and bank_mask are not
0xf (full) because the resulting VOP DPP would have different semantics
tha

[AMDGPU] NFC. Assert that mask is full with VOPC DPP

VOPC DPP should not be formed when the row_mask and bank_mask are not
0xf (full) because the resulting VOP DPP would have different semantics
than the MOV DPP followed by VOP. Existing checks in GCNDPPCombine cover
this case but for different reasons, so assert the property for
future-proofing.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D130101

show more ...


# b28bb8cc 15-Jul-2022 Joe Nash <[email protected]>

[AMDGPU] Remove old operand from VOPC DPP

For most DPP instructions, the old operand stores the value that was in
the current lane before the DPP operation, and is tied to the
destination. For VOPC

[AMDGPU] Remove old operand from VOPC DPP

For most DPP instructions, the old operand stores the value that was in
the current lane before the DPP operation, and is tied to the
destination. For VOPC DPP, this is unnecessary and incorrect.

There appears to have been a latent bug related to D122737 with
SIInstrInfo::isOperandLegal. If you checked if a register operand was legal
when the InstructionDesc expected an immediate, it reported that is valid.
Its fix is necessary for and tested in this patch.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D130040

show more ...


# 0483c91e 27-Jun-2022 Joe Nash <[email protected]>

[AMDGPU] gfx11 CodeGen for new DPP instructions

Modifies the GCNDPPCombine pass to enable DPP formation for the new DPP
instruction in gfx11, namely VOP3 encoded instructions with DPP and VOPC
with

[AMDGPU] gfx11 CodeGen for new DPP instructions

Modifies the GCNDPPCombine pass to enable DPP formation for the new DPP
instruction in gfx11, namely VOP3 encoded instructions with DPP and VOPC
with DPP.

Depends on D128656

Reviewed By: #amdgpu, rampitec

Differential Revision: https://reviews.llvm.org/D128682

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2
# ba6c8d42 21-Apr-2022 Jay Foad <[email protected]>

[AMDGPU] Combine DPP mov even if old reg def is in different BB

Given a DPP mov like this:

%2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
...
%3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, impl

[AMDGPU] Combine DPP mov even if old reg def is in different BB

Given a DPP mov like this:

%2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
...
%3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec

this patch just removes a check that %2 (the "old reg") was defined in
the same BB as the DPP mov instruction. GCNDPPCombine requires that the
MIR is in SSA form so I don't understand why the BB matters.

This lets the optimization work in more real world cases when the
definition of %2 gets hoisted out of a loop.

Differential Revision: https://reviews.llvm.org/D124182

show more ...


Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4
# 31f215ab 10-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Support v_mov_b64 in dpp combine

Differential Revision: https://reviews.llvm.org/D121411


Revision tags: llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 4bef0304 03-Nov-2021 Kazu Hirata <[email protected]>

[AArch64, AMDGPU] Use make_early_inc_range (NFC)


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# b22721f0 19-Apr-2021 Jay Foad <[email protected]>

[AMDGPU] GCNDPPCombine: don't shrink V_ADD_CO_U32 if carry out is used

Don't shrink VOP3 instructions if there are any uses of a carry-out
operand, because the shrunken form of the instruction would

[AMDGPU] GCNDPPCombine: don't shrink V_ADD_CO_U32 if carry out is used

Don't shrink VOP3 instructions if there are any uses of a carry-out
operand, because the shrunken form of the instruction would write the
carry-out to vcc instead of to a virtual register.

Differential Revision: https://reviews.llvm.org/D100760

show more ...


# a02aa913 19-Apr-2021 Jay Foad <[email protected]>

[AMDGPU] GCNDPPCombine: simplify API of isShrinkable. NFC.


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 538bda0b 22-Mar-2021 Joe Nash <[email protected]>

[AMDGPU] Refactor DPPCombine

NFC. Extract IsShrinkable into a helper function, and
make Subtarget a member variable.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D99099

C

[AMDGPU] Refactor DPPCombine

NFC. Extract IsShrinkable into a helper function, and
make Subtarget a member variable.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D99099

Change-Id: If4bc97a88a9ae4eb1df47e717345d46a6ed515bf

show more ...


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# a8d9d507 17-Feb-2021 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] gfx90a support

Differential Revision: https://reviews.llvm.org/D96906


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04 20-Jan-2021 dfukalov <[email protected]>

[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets

... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036


Revision tags: llvmorg-11.1.0-rc1
# 6a87e9b0 25-Dec-2020 dfukalov <[email protected]>

[NFC][AMDGPU] Reduce include files dependency.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init
# 79f67cae 14-Jul-2020 Matt Arsenault <[email protected]>

AMDGPU: Rename add/sub with carry out instructions

The hardware has created a real mess in the naming for add/sub, which
have been renamed basically every generation. Switch the carry out
pseudos to

AMDGPU: Rename add/sub with carry out instructions

The hardware has created a real mess in the naming for add/sub, which
have been renamed basically every generation. Switch the carry out
pseudos to have the gfx9/gfx10 names. We were using the original SI/CI
v_add_i32/v_sub_i32 names. Later targets reintroduced these names as
carryless instructions with a saturating clamp bit, which we do not
define. Do this rename so we can unambiguously add these missing
instructions.

The carry-in versions should also be renamed, but at least those had a
consistent _u32 name to begin with. The 16-bit instructions were also
renamed, but aren't ambiguous.

This does regress assembler error message quality in some cases. In
mismatched wave32/wave64 situations, this will switch from
"unsupported instruction" to "invalid operand", with the error
pointing at the wrong position. I couldn't quite follow how the
assembler selects these, but the previous behavior seemed accidental
to me. It looked like there was a partial attempt to handle this which
was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it
isn't used for anything).

show more ...


Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# bb69ca82 25-Jun-2020 vpykhtin <[email protected]>

[AMDGPU] Don't combine DPP if DPP register is used more than once per instruction

Reviewers: arsenm, rampitec, foad

Reviewed By: rampitec, foad

Subscribers: wuzish, kzhuravl, nemanjai, jvesely, wd

[AMDGPU] Don't combine DPP if DPP register is used more than once per instruction

Reviewers: arsenm, rampitec, foad

Reviewed By: rampitec, foad

Subscribers: wuzish, kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82551

show more ...


# d0b0b252 28-Jun-2020 Matt Arsenault <[email protected]>

AMDGPU: Use IsSSA property check instead of asserting on isSSA

Also fix an SSA violation in a test the MIRParser/verifier fails to
catch. It's illegal to define a subregister in SSA. For the purpose

AMDGPU: Use IsSSA property check instead of asserting on isSSA

Also fix an SSA violation in a test the MIRParser/verifier fails to
catch. It's illegal to define a subregister in SSA. For the purpose of
the test, it just needs to define the super-register to use the
subregister in the use operand.

show more ...


# 07cd19ef 27-May-2020 Matt Arsenault <[email protected]>

AMDGPU: Fix dropping MI flags when rewriting instructions

All 3 passes that change instruction encodings were dropping MI
flags. This avoids scheduling regressions caused by setting
mayRaiseFPExcept

AMDGPU: Fix dropping MI flags when rewriting instructions

All 3 passes that change instruction encodings were dropping MI
flags. This avoids scheduling regressions caused by setting
mayRaiseFPExceptions on FP instructions for non-strictfp functions.

show more ...


Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 525f9c0b 20-Nov-2019 Dmitry Preobrazhensky <[email protected]>

[AMDGPU][DPP] Corrected DPP combiner

Added a check to make sure that the selected dpp opcode is supported by target.

Reviewers: vpykhtin, arsenm, rampitec

Differential Revision: https://reviews.ll

[AMDGPU][DPP] Corrected DPP combiner

Added a check to make sure that the selected dpp opcode is supported by target.

Reviewers: vpykhtin, arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D70402

show more ...


# c9c18e5a 25-Oct-2019 vpykhtin <[email protected]>

[AMDGPU] Disallow dpp combining for dpp instructions without Src2 operand (when Src2 is required)

Differential revision: https://reviews.llvm.org/D69430


# edcd5815 16-Oct-2019 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Do not combine dpp mov reading physregs

We cannot be sure physregs will stay unchanged.

Differential Revision: https://reviews.llvm.org/D69065

llvm-svn: 375033


# 3d99310c 16-Oct-2019 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Do not combine dpp with physreg def

We will remove dpp mov along with the physreg def otherwise.

Differential Revision: https://reviews.llvm.org/D69063

llvm-svn: 375030


# 1184c27f 15-Oct-2019 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Support mov dpp with 64 bit operands

We define mov/update dpp intrinsics as overloaded but do not
support i64, which is a practically useful type. Fix the
selection and lowering.

Different

[AMDGPU] Support mov dpp with 64 bit operands

We define mov/update dpp intrinsics as overloaded but do not
support i64, which is a practically useful type. Fix the
selection and lowering.

Differential Revision: https://reviews.llvm.org/D68673

llvm-svn: 374910

show more ...


# 6e8599d9 15-Oct-2019 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Allow DPP combiner to work with REG_SEQUENCE

Differential Revision: https://reviews.llvm.org/D68828

llvm-svn: 374908


# 19a1a739 10-Oct-2019 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Handle undef old operand in DPP combine

It was missing an undef flag.

Differential Revision: https://reviews.llvm.org/D68813

llvm-svn: 374455


# c6dec1d8 09-Oct-2019 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Fixed dpp combine of VOP1

If original instruction did not have source modifiers they were
not added to the new DPP instruction as well, even if needed.

Differential Revision: https://revie

[AMDGPU] Fixed dpp combine of VOP1

If original instruction did not have source modifiers they were
not added to the new DPP instruction as well, even if needed.

Differential Revision: https://reviews.llvm.org/D68729

llvm-svn: 374241

show more ...


12