|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
| #
c155a944 |
| 14-Jun-2022 |
Jay Foad <[email protected]> |
[AMDGPU] GFX11 CodeGen support for MIMG instructions
This includes:
- New llvm.amdgcn.image.msaa.load.* intrinsics
- NSA changes, because MIMG-NSA is now limited to 3 dwords
- Split CD forms of IMAGE_SAMPLE instructions out into separate test files since they are no longer supported in GFX11
Differential Revision: https://reviews.llvm.org/D127837
|
| #
77851cc1 |
| 15-Jun-2022 |
David Stuttard <[email protected]> |
[AMDGPU] Change use null for dead sdst to be gfx1030+
Before gfx1030, null as an sdst operand behaves differently. Commit c97436f8b6e2 ("[AMDGPU] Use null for dead sdst operand") therefore needs a change so that it does not apply to pre-gfx1030 targets.
Differential Revision: https://reviews.llvm.org/D127869
|
| #
c97436f8 |
| 10-Jun-2022 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Use null for dead sdst operand
Differential Revision: https://reviews.llvm.org/D127542
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
e2926501 |
| 16-May-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Aggressively fold immediates in SIShrinkInstructions
Fold immediates regardless of how many uses they have. This is expected to increase overall code size, but decrease register usage.
Differential Revision: https://reviews.llvm.org/D114644
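The size-versus-registers trade-off described in this commit can be illustrated with a small cost model. This is only a sketch under assumed numbers: it supposes a 4-byte base instruction encoding and a 4-byte trailing literal, so that materializing a constant in a register costs 8 bytes once, while folding it costs 4 extra bytes at every use. The names `Cost`, `keepInRegister` and `foldIntoUses` are hypothetical and do not come from the LLVM sources.

```cpp
#include <cassert>

// Assumed encoding sizes for this sketch (not taken from the commit).
constexpr unsigned BaseBytes = 4;    // one instruction without a literal
constexpr unsigned LiteralBytes = 4; // extra dword for an inline literal

struct Cost {
  unsigned bytes; // total code size of the mov (if any) plus all uses
  unsigned regs;  // registers needed to hold the constant
};

// One "mov reg, literal" followed by `uses` instructions reading reg.
Cost keepInRegister(unsigned uses) {
  return {BaseBytes + LiteralBytes + BaseBytes * uses, 1};
}

// No mov; every use carries its own copy of the literal.
Cost foldIntoUses(unsigned uses) {
  return {(BaseBytes + LiteralBytes) * uses, 0};
}
```

Under these assumptions folding is smaller for a single use, breaks even at two uses, and grows larger from three uses on, while always freeing the register that held the constant.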
|
| #
dd12c343 |
| 17-May-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Shrink F16 MAD/FMA to MADAK/MADMK/FMAAK/FMAMK on GFX10
Differential Revision: https://reviews.llvm.org/D125803
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
27fa4158 |
| 18-Feb-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Shrink MAD/FMA to MADAK/MADMK/FMAAK/FMAMK on GFX10
On GFX10 VOP3 instructions can have a literal operand, so the conversion from VOP3 MAD/FMA to VOP2 MADAK/MADMK/FMAAK/FMAMK will not happen in SIFoldOperands. The only benefit of the VOP2 form is code size, so do it in SIShrinkInstructions instead.
Differential Revision: https://reviews.llvm.org/D125567
|
| #
c1af2d32 |
| 18-Feb-2022 |
Jay Foad <[email protected]> |
[AMDGPU] SIShrinkInstructions: change static functions to methods
This is a mechanical change to avoid passing MRI and TII around explicitly. NFC.
Differential Revision: https://reviews.llvm.org/D125566
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
718aec20 |
| 01-Feb-2022 |
Thomas Symalla <[email protected]> |
[AMDGPU] Improve v_cmpx usage on GFX10.3.
On GFX10.3 targets, the following instruction sequence
  v_cmp_* SGPR, ...
  s_and_saveexec ..., SGPR
leads to a fairly long stall caused by a VALU write to a SGPR and having the following SALU wait for the SGPR.
An equivalent sequence is to save the exec mask manually instead of letting s_and_saveexec do the work and use a v_cmpx instruction instead to do the comparison.
This patch modifies the SIOptimizeExecMasking pass as this is the last position where s_and_saveexec instructions are inserted. It does the transformation by trying to find the pattern, extracting the operands and generating the new instruction sequence.
It also changes some existing lit tests and introduces a few new tests to show the changed behavior on GFX10.3 targets.
Same as D119696, plus a buildbot fix and a MIR test fix.
Reviewed By: critson
Differential Revision: https://reviews.llvm.org/D122332
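The equivalence of the two sequences described in this commit can be checked on a toy model. The sketch below is an illustration only, not the SIOptimizeExecMasking pass: it treats the exec mask as a 32-bit integer with one bit per lane and assumes that a VALU comparison yields a result bit only for active lanes. The names `ExecState`, `vCmp`, `viaSaveexec` and `viaCmpx` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct ExecState {
  std::uint32_t exec;  // exec mask after the sequence
  std::uint32_t saved; // copy of the exec mask from before the sequence
};

// Model of a VALU compare: inactive lanes contribute a 0 bit.
std::uint32_t vCmp(std::uint32_t exec, std::uint32_t laneResults) {
  return exec & laneResults;
}

// v_cmp_* SGPR, ...  followed by  s_and_saveexec saved, SGPR
ExecState viaSaveexec(std::uint32_t exec, std::uint32_t laneResults) {
  std::uint32_t sgpr = vCmp(exec, laneResults); // VALU writes an SGPR
  return {exec & sgpr, exec};                   // SALU must wait for that SGPR
}

// Manual save of exec, then v_cmpx_* writing exec directly.
ExecState viaCmpx(std::uint32_t exec, std::uint32_t laneResults) {
  std::uint32_t saved = exec;              // s_mov saved, exec
  return {vCmp(exec, laneResults), saved}; // v_cmpx_* ...
}
```

In this model both functions return the same state for any input, and the second form has no SALU instruction that depends on a VALU-written SGPR.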
|
| #
7de6107d |
| 21-Mar-2022 |
Thomas Symalla <[email protected]> |
Revert "[AMDGPU] Improve v_cmpx usage on GFX10.3."
This reverts commit 011c64191ef9ccc6538d52f4b57f98f37d4ea36e and e725e2afe02e18398525652c9bceda1eb055ea64.
Differential Revision: https://reviews.llvm.org/D122117
|
| #
011c6419 |
| 01-Feb-2022 |
Thomas Symalla <[email protected]> |
[AMDGPU] Improve v_cmpx usage on GFX10.3.
On GFX10.3 targets, the following instruction sequence
  v_cmp_* SGPR, ...
  s_and_saveexec ..., SGPR
leads to a fairly long stall caused by a VALU write to a SGPR and having the following SALU wait for the SGPR.
An equivalent sequence is to save the exec mask manually instead of letting s_and_saveexec do the work and use a v_cmpx instruction instead to do the comparison.
This patch modifies the SIOptimizeExecMasking pass as this is the last position where s_and_saveexec instructions are inserted. It does the transformation by trying to find the pattern, extracting the operands and generating the new instruction sequence.
It also changes some existing lit tests and introduces a few new tests to show the changed behavior on GFX10.3 targets.
Reviewed By: sebastian-ne, critson
Differential Revision: https://reviews.llvm.org/D119696
|
| #
37b37838 |
| 16-Mar-2022 |
Shengchen Kan <[email protected]> |
[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments
|
| #
6527b2a4 |
| 18-Feb-2022 |
Sebastian Neubauer <[email protected]> |
[AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
| #
476bb2d9 |
| 09-Feb-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Remove dead code from shrinkScalarLogicOp
It looks like this code has been dead since shrinkScalarLogicOp was introduced in svn r348601.
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
16de2c09 |
| 13-Dec-2021 |
Jay Foad <[email protected]> |
[AMDGPU] SIShrinkInstructions: sink code to where it's used. NFC.
|
| #
63681527 |
| 13-Dec-2021 |
Jay Foad <[email protected]> |
[AMDGPU] SIShrinkInstructions: remove redundant check
canShrink already calls hasVALU32BitEncoding, so there is no need to call it again here.
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
d1f45ed5 |
| 11-Nov-2021 |
Neubauer, Sebastian <[email protected]> |
[AMDGPU][NFC] Fix typos
Differential Revision: https://reviews.llvm.org/D113672
|
| #
74cd4dee |
| 22-Oct-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Preserve deadness of vcc when shrinking instructions
This doesn't have any effect on codegen now, but it might do in the future if we shrink instructions before post-RA scheduling, which is sensitive to live vs dead defs.
Differential Revision: https://reviews.llvm.org/D112305
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
6efb3220 |
| 22-Jul-2021 |
Carl Ritson <[email protected]> |
[AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions
Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr. Previously these were rounded up to VReg_256; this saves VGPRs.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D103800
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
f8816c74 |
| 08-Jun-2021 |
Carl Ritson <[email protected]> |
[AMDGPU] Add v5f32/VReg_160 support for MIMG instructions
Avoid having to round up to v8f32/VReg_256 when only 5 VGPRs are required for a MIMG address operand.
Maintain _V8 instruction variants of pseudo instructions, allowing assembly prior to GFX10 to work as-is. Currently the validator can tell what the correct size is for GFX10, so it will disallow oversized address registers.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D103672
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
16d707e6 |
| 29-Apr-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Fix v_swap_b32 formation on physical registers
As explained in the comments, matchSwap matches:
// mov t, x
// mov x, y
// mov y, t
and turns it into:
// mov t, x (t is potentially dead and move eliminated)
// v_swap_b32 x, y
On physical registers we don't have full use-def chains, so the check for T being live-out was not working properly with subregs/superregs.
Differential Revision: https://reviews.llvm.org/D101546
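The pattern that this commit message says matchSwap recognizes can be sketched on a toy instruction list. This is a simplified illustration and not the LLVM implementation: registers are plain strings, only adjacent triples are considered, and the liveness check on t (the part this commit fixes) is left out, so the copy into t is always kept. `Inst` and `formSwaps` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction: either "mov a, b" or "swap a, b".
struct Inst {
  enum Kind { Mov, Swap } kind;
  std::string a; // Mov: destination; Swap: first register
  std::string b; // Mov: source;      Swap: second register
};

// Rewrites each adjacent triple
//   mov t, x ; mov x, y ; mov y, t
// into
//   mov t, x ; swap x, y
// Whether "mov t, x" can then be deleted depends on t being dead,
// which this sketch does not attempt to decide.
std::vector<Inst> formSwaps(const std::vector<Inst> &in) {
  std::vector<Inst> out;
  std::size_t i = 0;
  while (i < in.size()) {
    if (i + 2 < in.size()) {
      const Inst &m0 = in[i], &m1 = in[i + 1], &m2 = in[i + 2];
      bool allMovs = m0.kind == Inst::Mov && m1.kind == Inst::Mov &&
                     m2.kind == Inst::Mov;
      const std::string &t = m0.a, &x = m0.b, &y = m1.b;
      bool distinct = t != x && t != y && x != y;
      if (allMovs && distinct && m1.a == x && m2.a == y && m2.b == t) {
        out.push_back(m0);                 // keep mov t, x
        out.push_back({Inst::Swap, x, y}); // replaces the other two movs
        i += 3;
        continue;
      }
    }
    out.push_back(in[i]);
    ++i;
  }
  return out;
}
```

Both forms leave x holding the old y, y holding the old x, and t holding the old x, which is why the rewrite is safe as long as the three registers are distinct.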
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1 |
|
| #
fc8e7411 |
| 27-Jan-2021 |
Piotr Sobczak <[email protected]> |
[AMDGPU] Avoid an illegal operand in si-shrink-instructions
Before the patch it was possible to trigger a constant bus violation when folding immediates into a shrunk instruction.
The patch adds a check to enforce the legality of the new operand.
Differential Revision: https://reviews.llvm.org/D95527
|
|
Revision tags: llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
560d7e04 |
| 20-Jan-2021 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
|
Revision tags: llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
ad8131bb |
| 26-Oct-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Fix VC warning about signed/unsigned comparison. NFC.
This is the warning reported in https://reviews.llvm.org/D89599
|
| #
611959f0 |
| 16-Oct-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Fixed v_swap_b32 match
1. Fixed liveness issue with implicit kills.
2. Fixed potential problem with an indirect mov.
Fixes: SWDEV-256848
Differential Revision: https://reviews.llvm.org/D89599
|