|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
17a81ecf |
| 08-May-2022 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Use the HasNoUse predicate for no-ret atomic op selection
This change replaces the C++ predicates with the HasNoUse builtin predicate that would enable the no-ret atomic op selection in GlobalISel.
Differential Revision: https://reviews.llvm.org/D125213
|
| #
7504c7a8 |
| 20-Jun-2022 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Use AddedComplexity for ret and noret atomic ops selection
This patch removes the predicate for return atomic ops and uses AddedComplexity to distinguish its selection from its no-return variant. This will produce better matchers that don't unnecessarily check for the negated predicate if the initial predicate failed. It also simplifies enabling no-return atomic op selection in GlobalISel.
Differential Revision: https://reviews.llvm.org/D128241
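The selection scheme described in this message can be sketched as a priority-ordered matcher. This is a minimal Python model, not TableGen or LLVM code: the `Pattern` class, the `select` function, and the instruction names are hypothetical, and the sketch assumes the no-return pattern is the one that carries the higher complexity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Pattern:
    name: str        # hypothetical instruction name
    complexity: int  # stands in for AddedComplexity
    # None means the pattern has no predicate and always matches.
    predicate: Optional[Callable[[int], bool]] = None


def has_no_use(num_uses: int) -> bool:
    # Models a HasNoUse-style check: the atomic's result is never read.
    return num_uses == 0


def select(patterns: List[Pattern], num_uses: int) -> str:
    # Try patterns from highest to lowest complexity; the first match wins.
    for pat in sorted(patterns, key=lambda p: p.complexity, reverse=True):
        if pat.predicate is None or pat.predicate(num_uses):
            return pat.name
    raise ValueError("no pattern matched")


PATTERNS = [
    # Returning variant: no predicate, default complexity (assumption).
    Pattern("ATOMIC_ADD_RTN", complexity=0),
    # No-return variant: tried first, guarded by the no-use check (assumption).
    Pattern("ATOMIC_ADD", complexity=1, predicate=has_no_use),
]
```

In this model only one predicate is ever evaluated: when the no-use check fails, the returning pattern matches without testing a negated "has a use" predicate.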
|
| #
07b7fada |
| 25-May-2022 |
Joe Nash <[email protected]> |
[AMDGPU] gfx11 VOPD instructions MC support
VOPD is a new encoding for dual-issue instructions for use in wave32. This patch includes MC layer support only.
A VOPD instruction is constituted of an X component (for which there are 13 possible opcodes) and a Y component (for which there are the 13 X opcodes plus 3 more). Most of the complexity in defining and parsing a VOPD operation arises from the possible different total numbers of operands and deferred parsing of certain operands depending on the constituent X and Y opcodes.
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D128218
|
| #
e243ead6 |
| 18-May-2022 |
Joe Nash <[email protected]> |
Reland [AMDGPU] gfx11 vop3dpp instructions
There was an issue with encoding wide (>64 bit) instructions on BigEndian hosts, which is fixed in D127195. Therefore reland this.
gfx11 adds the ability to use dpp modifiers on vop3 instructions. This patch adds machine code layer support for that. The MCCodeEmitter is changed to use APInt instead of uint64_t to support these wider instructions.
Patch 16/N for upstreaming of AMDGPU gfx11 architecture
Differential Revision: https://reviews.llvm.org/D126483
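Why a 64-bit integer no longer suffices for these encodings can be illustrated with a small Python sketch. It is not the MCCodeEmitter code: the 96-bit value `WIDE`, the helper names, and the little-endian byte order are assumptions made for the example.

```python
def fits_in_u64(encoding: int) -> bool:
    # True when the encoding can be held in an unsigned 64-bit integer.
    return 0 <= encoding < (1 << 64)


def to_le_bytes(encoding: int, bit_width: int) -> bytes:
    # Serialize an arbitrary-width encoding byte by byte, independent of
    # how the host stores multi-word integers.
    if bit_width % 8 != 0:
        raise ValueError("bit width must be a whole number of bytes")
    if encoding < 0 or encoding >> bit_width:
        raise ValueError("encoding does not fit in the given width")
    return encoding.to_bytes(bit_width // 8, "little")


# Hypothetical 96-bit encoding: a 64-bit base word plus a 32-bit extension.
WIDE = (0xAABBCCDD << 64) | 0x1122334455667788
```

Truncating `WIDE` to 64 bits would silently drop the upper 32 bits, which is the kind of loss an arbitrary-precision type avoids.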
|
| #
eaed07eb |
| 06-Jun-2022 |
Joe Nash <[email protected]> |
Revert "[AMDGPU] gfx11 vop3dpp instructions"
This reverts commit 99a83b1286748501e0ccf199a582dc3ec5451ef5.
|
| #
99a83b12 |
| 18-May-2022 |
Joe Nash <[email protected]> |
[AMDGPU] gfx11 vop3dpp instructions
gfx11 adds the ability to use dpp modifiers on vop3 instructions. This patch adds machine code layer support for that. The MCCodeEmitter is changed to use APInt instead of uint64_t to support these wider instructions.
Patch 16/N for upstreaming of AMDGPU gfx11 architecture
Depends on D126475
Reviewed By: rampitec, #amdgpu
Differential Revision: https://reviews.llvm.org/D126483
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
0ecbb683 |
| 05-Apr-2022 |
Matt Arsenault <[email protected]> |
TableGen/GlobalISel: Make address space/align predicates consistent
The builtin predicate handling has a strange behavior where the code assumes that a PatFrag is a stack of PatFrags, and each level adds at most one predicate. I don't think this particularly makes sense, especially without a diagnostic to ensure you aren't trying to set multiple at once.
This wasn't followed for address spaces and alignment, which could potentially fall through to report no builtin predicate was added. Just switch these to follow the existing convention for now.
|
| #
45ca9433 |
| 08-Apr-2022 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Select no-return atomic intrinsics in tblgen
This is to avoid relying on the post-isel hook.
This change also enables the saddr pattern selection for atomic intrinsics in GlobalISel.
Differential Revision: https://reviews.llvm.org/D123583
|
| #
b7df7152 |
| 19-Apr-2022 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU][GlobalISel] Force return atomic selection for now
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
f4e8cf25 |
| 27-Dec-2021 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Select no-return ds_* atomic ops in tblgen.
SelectionDAG relies on MachineInstr's HasPostISelHook for selecting the no-return atomic ops. GlobalISel, at the moment, doesn't handle HasPostISelHook.
This change adds the selection for no-return ds_* atomic ops in tblgen so that it can work with both GlobalISel and SelectionDAG. I couldn't add the predicates for GlobalISel in this change since there's a restriction in GlobalISelEmitter that disallows selecting generic atomic ops that return with instructions that don't return.
We can't remove the HasPostISelHook code that selects the no return atomic ops in SelectionDAG yet since we still need to cover selections in FLATInstructions.td, BUFInstructions.td.
Differential Revision: https://reviews.llvm.org/D115881
|
| #
d8b69040 |
| 20-Jan-2022 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Set MemoryVT for truncstores in tblgen.
GlobalISelEmitter was skipping these patterns when its predicates were checked. This patch should allow us to select d16_hi stores in GlobalISel.
Differential Revision: https://reviews.llvm.org/D117762
show more ...
|
| #
9392b40d |
| 12-Jan-2022 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Fix selection of constant 32-bit addrspace loads
Unfortunately the selection patterns still rely on the address space from the memory operand instead of using the pointer type. Add this address space to the list of cases supported by global-like loads.
Alternatively we would have to adjust the address space of the memory operand to deviate from the underlying IR value, which looks ugly and is more work in the legalizer.
This doesn't come up in the DAG path because it uses a different selection strategy where the cast is inserted during the addressing mode matching.
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
ca57b80c |
| 20-Sep-2021 |
Mateja Marjanovic <[email protected]> |
Code quality: Combine V_RSQ
Combine V_RCP and V_SQRT into V_RSQ on AMDGPU for GlobalISel.
Change-Id: I93c5dcb412483156a6e8b68c4085cbce83ac9703
|
| #
14c40511 |
| 30-Nov-2021 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU][NFC] Remove unused defvar in AMDGPUInstructions.td.
|
| #
078da26b |
| 08-Nov-2021 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Check for unneeded shift mask in shift PatFrags.
The existing constrained shift PatFrags only dealt with masked shift from OpenCL front-ends. This change copies the X86DAGToDAGISel::isUnneededShiftMask() function to AMDGPU and uses it in the shift PatFrag predicates.
Differential Revision: https://reviews.llvm.org/D113448
|
| #
61e3b9fe |
| 22-Sep-2021 |
Abinav Puthan Purayil <[email protected]> |
[AMDGPU] Add constrained shift pattern matches.
This is motivated by clang's conformance to the OpenCL C shift-operator rules (https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#operators-shift), which make clang emit (<shift> a, (and b, <width> - 1)) for `a <shift> b` in OpenCL, where a is an int of bit width <width>.
Differential revision: https://reviews.llvm.org/D110231
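The redundancy these patterns exploit can be shown with a short Python model. It assumes a shift that reads only the low log2(width) bits of its amount and a power-of-two width. The function names are hypothetical, and the mask check is a simplification that ignores the known-bits reasoning a real implementation may use.

```python
def shift_left(a: int, amount: int, width: int = 32) -> int:
    # Models a shift that reads only the low log2(width) bits of `amount`
    # (an assumption of this sketch; width must be a power of two).
    return (a << (amount & (width - 1))) & ((1 << width) - 1)


def masked_shift_left(a: int, b: int, width: int = 32) -> int:
    # The form emitted for OpenCL: (shl a, (and b, width - 1)).
    return shift_left(a, b & (width - 1), width)


def is_unneeded_shift_mask(mask: int, width: int = 32) -> bool:
    # The `and` can be dropped when the mask keeps every bit the shift reads.
    low_bits = width - 1
    return (mask & low_bits) == low_bits
```

Under these assumptions `masked_shift_left(a, b)` and `shift_left(a, b)` agree for every `b`, so a pattern can match the masked form and select the plain shift.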
|
| #
d8699210 |
| 15-Oct-2021 |
Piotr Sobczak <[email protected]> |
[AMDGPU] Add patterns for i8/i16 local atomic load/store
Add patterns for i8/i16 local atomic load/store.
Added tests for new patterns.
Copied atomic_[store/load]_local.ll to GlobalISel directory.
Differential Revision: https://reviews.llvm.org/D111869
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
ce098ccc |
| 06-Jul-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Simplify tablegen files. NFC.
There is no need to cast records to strings before comparing them.
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
46adccc5 |
| 12-May-2021 |
Julien Pagès <[email protected]> |
[AMDGPU] Improve Codegen for build_vector
Improve the code generation of build_vector. Use the v_pack_b32_f16 instruction instead of v_and_b32 + v_lshl_or_b32
Differential Revision: https://reviews.llvm.org/D98081
Patch by Julien Pagès!
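The two instruction sequences compared in this message produce the same bit layout, which a small Python model can illustrate. The helpers below are hypothetical stand-ins that model only the integer bit manipulation. They do not model any floating-point behaviour of the real instructions.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def v_and_b32(a: int, b: int) -> int:
    # 32-bit bitwise AND.
    return a & b & MASK32


def v_lshl_or_b32(s0: int, s1: int, s2: int) -> int:
    # (s0 << s1) | s2, truncated to 32 bits.
    return ((s0 << s1) | s2) & MASK32


def pack_two_step(lo: int, hi: int) -> int:
    # Old sequence: clear the upper half of `lo`, then shift `hi` in above it.
    return v_lshl_or_b32(hi, 16, v_and_b32(lo, MASK16))


def pack_one_step(lo: int, hi: int) -> int:
    # Bit-layout model of a single pack: low half from `lo`, high half from `hi`.
    return ((hi & MASK16) << 16) | (lo & MASK16)
```

Both helpers place the low 16 bits of `lo` in bits 15:0 and the low 16 bits of `hi` in bits 31:16, so the single pack can replace the two-instruction sequence in this model.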
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
d92b4956 |
| 26-Mar-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Inline FSHRPattern into its only use. NFC.
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
91d2e5c8 |
| 05-Nov-2020 |
Paul C. Anagnostopoulos <[email protected]> |
[TableGen] Add the !filter bang operator.
Add a test. Update the Programmer's Reference.
Use it in some TableGen files.
Differential Revision: https://reviews.llvm.org/D91008
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4 |
|
| #
0d5989bb |
| 25-Sep-2020 |
Jay Foad <[email protected]> |
[AMDGPU] Split R600 and GCN bfe patterns
This is in preparation for making the GCN patterns divergence-aware. NFC.
Differential Revision: https://reviews.llvm.org/D88579
|
| #
286d3fc7 |
| 24-Sep-2020 |
Jay Foad <[email protected]> |
[AMDGPU] Split R600 and GCN bfi patterns
This is in preparation for making the GCN patterns divergence-aware. NFC.
Differential Revision: https://reviews.llvm.org/D88244
|
|
Revision tags: llvmorg-11.0.0-rc3 |
|
| #
277de43d |
| 10-Sep-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Unify intrinsic ret/nortn interface
We have a single noret intrinsic and a lot of special handling around it. Declare it just as any other but do not define rtn instructions itself instead.
Differential Revision: https://reviews.llvm.org/D87719
|
| #
d17ea67b |
| 21-Aug-2020 |
Mirko Brkusanin <[email protected]> |
[AMDGPU][GlobalISel] Fix 96 and 128 local loads and stores
Fix local ds_read/write_b96/b128 so they can be selected if the alignment allows. Otherwise, either pick appropriate ds_read2/write2 instructions or break them down.
Differential Revision: https://reviews.llvm.org/D81638
|