|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
c17450a0 |
| 23-Jul-2022 |
Fangrui Song <[email protected]> |
[AMDGPU] Change DEBUG_TYPE from isel to amdgpu-isel
to match all other *ISelDAGToDAG.cpp
|
| #
432cbd78 |
| 18-Jul-2022 |
Ivan Kosarev <[email protected]> |
[AMDGPU][CodeGen] Support (register + immediate) SMRD offsets.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D129381
|
| #
9c66c02e |
| 18-Jul-2022 |
Ivan Kosarev <[email protected]> |
[AMDGPU][CodeGen] Match SMRDs with constant bases and register offsets.
Saves some add instructions on a couple Rage 2 shaders and is also a prerequisite for a coming-soon change matching (register + immediate) offsets.
Reviewed By: foad, arsenm
Differential Revision: https://reviews.llvm.org/D129095
|
| #
4696a33d |
| 05-Jul-2022 |
Ivan Kosarev <[email protected]> |
[AMDGPU][NFC] Refine matching SMRD offsets.
Tell the matcher what we are looking for instead of matching everything and then discarding the result if it doesn't fit.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D128171
|
| #
4874838a |
| 28-Jun-2022 |
Piotr Sobczak <[email protected]> |
[AMDGPU] gfx11 WMMA instruction support
gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate) instructions.
Reviewed By: arsenm, #amdgpu
Differential Revision: https://reviews.llvm.org/D128756
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
20d20156 |
| 09-Jun-2022 |
Joe Nash <[email protected]> |
[AMDGPU] gfx11 VINTERP intrinsics and ISel support
Depends on D127664
Reviewed By: rampitec, #amdgpu
Differential Revision: https://reviews.llvm.org/D127756
|
| #
2d43de13 |
| 15-Jun-2022 |
Joe Nash <[email protected]> |
[AMDGPU] gfx11 new dot instruction codegen support
Reviewed By: rampitec, #amdgpu
Differential Revision: https://reviews.llvm.org/D127904
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
7b9f620e |
| 06-Apr-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Work around GFX11 flat scratch SVS swizzling bug
Differential Revision: https://reviews.llvm.org/D127635
|
| #
d943c514 |
| 13-Jun-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Fix GFX11 codegen for V_MAD_U64_U32 and V_MAD_I64_I32
GFX11 uses different pseudos for these because of a new constraint on which operands' registers can overlap.
Differential Revision: https://reviews.llvm.org/D127659
|
| #
07881861 |
| 03-Jun-2022 |
Guillaume Chatelet <[email protected]> |
[Alignment][NFC] Remove usage of MemSDNode::getAlignment
I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times.
Differential Revision: https://reviews.llvm.org/D126910
|
| #
f59cb41b |
| 15-Mar-2022 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Select buffer_atomic_cmpswap* in tblgen
This change replaces the manual selection of buffer_atomic_cmpswap* instructions in SelectionDAG and GlobalISel with a tblgen based selection in BUFInstructions.td. This allows us to select the return and no-return variants in tblgen.
Differential Revision: https://reviews.llvm.org/D121770
|
| #
c4500de2 |
| 14-Mar-2022 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] gfx940: disable OP_SEL on V_DOT instructions
Differential Revision: https://reviews.llvm.org/D121634
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
36fe3f13 |
| 08-Mar-2022 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] flat scratch SVS addressing mode for gfx940
Both VADDR and SADDR are used in SVS mode.
Differential Revision: https://reviews.llvm.org/D121254
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
6527b2a4 |
| 18-Feb-2022 |
Sebastian Neubauer <[email protected]> |
[AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
d7e03df7 |
| 12-Nov-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Implement widening multiplies with v_mad_i64_i32/v_mad_u64_u32
Select SelectionDAG ops smul_lohi/umul_lohi to v_mad_i64_i32/v_mad_u64_u32 respectively, with an addend of 0. v_mul_lo, v_mul_hi and v_mad_i64/u64 are all quarter-rate instructions so it is better to use one instruction than two.
Further improvements are possible to make better use of the addend operand, but this is already a strict improvement over what we have now.
Differential Revision: https://reviews.llvm.org/D113986
|
| #
078da26b |
| 08-Nov-2021 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Check for unneeded shift mask in shift PatFrags.
The existing constrained shift PatFrags only dealt with masked shift from OpenCL front-ends. This change copies the X86DAGToDAGISel::isUnneededShiftMask() function to AMDGPU and uses it in the shift PatFrag predicates.
Differential Revision: https://reviews.llvm.org/D113448
|
| #
0a3d755e |
| 31-Oct-2021 |
alex-t <[email protected]> |
[AMDGPU] Enable divergence-driven BFE selection
Detailed description: This change enables the bit field extract patterns selection to s_bfe_u32 or v_bfe_u32 dependent on the pattern root node divergence.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D110950
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
47d6274d |
| 07-Sep-2021 |
Daniil Fukalov <[email protected]> |
[NFC][AMDGPU] Reduce includes dependencies, part 2
1. Split out some parts of the R600 target into separate modules/headers. 2. Reduced some include lists in headers. 3. Minor cleanup of forward declarations, redundant includes and flags in GCNSubtarget.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D109351
|
| #
598bebea |
| 20-Sep-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Prefer fmac over fma when selecting FMA_W_CHAIN
FMA_W_CHAIN is used when lowering fdiv f32. Prefer to select it to fmac if there are no source modifiers, just like we do for other mad/mac and fma/fmac cases.
Differential Revision: https://reviews.llvm.org/D110074
|
| #
dc6e8dfd |
| 20-Sep-2021 |
Jacob Lambert <[email protected]> |
[AMDGPU][NFC] Correct typos in lib/Target/AMDGPU/AMDGPU*.cpp files. Test commit for new contributor.
|
| #
9af8f1b1 |
| 09-Sep-2021 |
Craig Topper <[email protected]> |
[SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode.
Soft deprecate isNullValue/isAllOnesValue and update in-tree callers. This matches the changes to the APInt interface from D109483.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D109535
|
| #
735f4671 |
| 09-Sep-2021 |
Chris Lattner <[email protected]> |
[APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things:
1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things.
2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct!
APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment.
Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical.
Differential Revision: https://reviews.llvm.org/D109483
|
|
Revision tags: llvmorg-13.0.0-rc2 |
|
| #
48958d02 |
| 23-Aug-2021 |
Daniil Fukalov <[email protected]> |
[NFC][AMDGPU] Reduce includes dependencies.
1. Split out some parts of the R600 target into separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed an issue where the overrides `GCNTargetMachine::getSubtargetImpl()` and `R600TargetMachine::getSubtargetImpl()` had a different return type than the base class. 4. Minor forward declarations cleanup.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D108596
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
9ad8a1f6 |
| 15-Jun-2021 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix high 16-bit optimization on gfx9
We can do this optimization in the majority of cases, but we currently don't have a way to do it. We do not track/model which instructions have which behavior, the control bit to change the high bit behavior, or making use of preserved bits at all. This is a bit fuzzy since we don't know precisely how the source instruction will be lowered, but that only really matters in one case (for fma_mixlo).
We do need to fixup some of these cases after selection, but the pattern helps eliminate many of these zexts.
|
| #
a7786bad |
| 15-Jun-2021 |
Matt Arsenault <[email protected]> |
AMDGPU: Move zeroed FP high bits optimization to patterns
|