History log of /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td (Results 1 – 25 of 126)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4
# 17a81ecf 08-May-2022 Abinav Puthan Purayil <[email protected]>

[AMDGPU] Use the HasNoUse predicate for no-ret atomic op selection

This change replaces the C++ predicates with the HasNoUse builtin
predicate that would enable the no-ret atomic op selection in
Glo

[AMDGPU] Use the HasNoUse predicate for no-ret atomic op selection

This change replaces the C++ predicates with the HasNoUse builtin
predicate that would enable the no-ret atomic op selection in
GlobalISel.

Differential Revision: https://reviews.llvm.org/D125213

show more ...


# 7504c7a8 20-Jun-2022 Abinav Puthan Purayil <[email protected]>

[AMDGPU] Use AddedComplexity for ret and noret atomic ops selection

This patch removes the predicate for return atomic ops and uses
AddedComplexity to distinguish its selection from its no return va

[AMDGPU] Use AddedComplexity for ret and noret atomic ops selection

This patch removes the predicate for return atomic ops and uses
AddedComplexity to distinguish its selection from its no return variant.
This will produce better matchers that doesn't unnecessarily check for
the negated predicate if the initial predicate failed. Also, it
simplifies the enabling of no return atomic ops selection in GlobalISel.

Differential Revision: https://reviews.llvm.org/D128241

show more ...


# 07b7fada 25-May-2022 Joe Nash <[email protected]>

[AMDGPU] gfx11 VOPD instructions MC support

VOPD is a new encoding for dual-issue instructions for use in wave32.
This patch includes MC layer support only.

A VOPD instruction is constituted of an

[AMDGPU] gfx11 VOPD instructions MC support

VOPD is a new encoding for dual-issue instructions for use in wave32.
This patch includes MC layer support only.

A VOPD instruction is constituted of an X component (for which there are
13 possible opcodes) and a Y component (for which there are the 13 X
opcodes plus 3 more). Most of the complexity in defining and parsing
a VOPD operation arises from the possible different total numbers of
operands and deferred parsing of certain operands depending on the
constituent X and Y opcodes.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D128218

show more ...


# e243ead6 18-May-2022 Joe Nash <[email protected]>

Reland [AMDGPU] gfx11 vop3dpp instructions

There was an issue with encoding wide (>64 bit) instructions on
BigEndian hosts, which is fixed in D127195. Therefore reland this.

gfx11 adds the ability

Reland [AMDGPU] gfx11 vop3dpp instructions

There was an issue with encoding wide (>64 bit) instructions on
BigEndian hosts, which is fixed in D127195. Therefore reland this.

gfx11 adds the ability to use dpp modifiers on vop3 instructions.
This patch adds machine code layer support for that. The MCCodeEmitter
is changed to use APInt instead of uint64_t to support these wider
instructions.

Patch 16/N for upstreaming of AMDGPU gfx11 architecture

Differential Revision: https://reviews.llvm.org/D126483

show more ...


# eaed07eb 06-Jun-2022 Joe Nash <[email protected]>

Revert "[AMDGPU] gfx11 vop3dpp instructions"

This reverts commit 99a83b1286748501e0ccf199a582dc3ec5451ef5.


# 99a83b12 18-May-2022 Joe Nash <[email protected]>

[AMDGPU] gfx11 vop3dpp instructions

gfx11 adds the ability to use dpp modifiers on vop3 instructions.
This patch adds machine code layer support for that. The MCCodeEmitter
is changed to use APInt i

[AMDGPU] gfx11 vop3dpp instructions

gfx11 adds the ability to use dpp modifiers on vop3 instructions.
This patch adds machine code layer support for that. The MCCodeEmitter
is changed to use APInt instead of uint64_t to support these wider
instructions.

Patch 16/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126475

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126483

show more ...


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 0ecbb683 05-Apr-2022 Matt Arsenault <[email protected]>

TableGen/GlobalISel: Make address space/align predicates consistent

The builtin predicate handling has a strange behavior where the code
assumes that a PatFrag is a stack of PatFrags, and each level

TableGen/GlobalISel: Make address space/align predicates consistent

The builtin predicate handling has a strange behavior where the code
assumes that a PatFrag is a stack of PatFrags, and each level adds at
most one predicate. I don't think this particularly makes sense,
especially without a diagnostic to ensure you aren't trying to set
multiple at once.

This wasn't followed for address spaces and alignment, which could
potentially fall through to report no builtin predicate was
added. Just switch these to follow the existing convention for now.

show more ...


# 45ca9433 08-Apr-2022 Abinav Puthan Purayil <[email protected]>

[AMDGPU] Select no-return atomic intrinsics in tblgen

This is to avoid relying on the post-isel hook.

This change also enable the saddr pattern selection for atomic
intrinsics in GlobalISel.

Diffe

[AMDGPU] Select no-return atomic intrinsics in tblgen

This is to avoid relying on the post-isel hook.

This change also enable the saddr pattern selection for atomic
intrinsics in GlobalISel.

Differential Revision: https://reviews.llvm.org/D123583

show more ...


# b7df7152 19-Apr-2022 Abinav Puthan Purayil <[email protected]>

[AMDGPU][GlobalISel] Force return atomic selection for now


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# f4e8cf25 27-Dec-2021 Abinav Puthan Purayil <[email protected]>

[AMDGPU] Select no-return ds_* atomic ops in tblgen.

SelectionDAG relies on MachineInstr's HasPostISelHook for selecting the
no-return atomic ops. GlobalISel, at the moment, doesn't handle
HasPostIS

[AMDGPU] Select no-return ds_* atomic ops in tblgen.

SelectionDAG relies on MachineInstr's HasPostISelHook for selecting the
no-return atomic ops. GlobalISel, at the moment, doesn't handle
HasPostISelHook.

This change adds the selection for no-return ds_* atomic ops in tblgen
so that it can work with both GlobalISel and SelectionDAG. I couldn't
add the predicates for GlobalISel in this change since there's a
restriction in GlobalISelEmitter that disallows selecting generic
atomics ops that return with instructions that doesn't return.

We can't remove the HasPostISelHook code that selects the no return
atomic ops in SelectionDAG yet since we still need to cover selections
in FLATInstructions.td, BUFInstructions.td.

Differential Revision: https://reviews.llvm.org/D115881

show more ...


# d8b69040 20-Jan-2022 Abinav Puthan Purayil <[email protected]>

[AMDGPU] Set MemoryVT for truncstores in tblgen.

GlobalISelEmitter was skipping these patterns when its predicates were
checked. This patch should allow us to select d16_hi stores in
GlobalISel.

Di

[AMDGPU] Set MemoryVT for truncstores in tblgen.

GlobalISelEmitter was skipping these patterns when its predicates were
checked. This patch should allow us to select d16_hi stores in
GlobalISel.

Differential Revision: https://reviews.llvm.org/D117762

show more ...


# 9392b40d 12-Jan-2022 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Fix selection of constant 32-bit addrspace loads

Unfortunately the selection patterns still rely on the address space
from the memory operand instead of using the pointer type. Ad

AMDGPU/GlobalISel: Fix selection of constant 32-bit addrspace loads

Unfortunately the selection patterns still rely on the address space
from the memory operand instead of using the pointer type. Add this
address space to the list of cases supported by global-like loads.

Alternatively we would have to adjust the address space of the memory
operand to deviate from the underlying IR value, which looks ugly and
is more work in the legalizer.

This doesn't come up in the DAG path because it uses a different
selection strategy where the cast is inserted during the addressing
mode matching.

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# ca57b80c 20-Sep-2021 Mateja Marjanovic <[email protected]>

Code quality: Combine V_RSQ

Combine V_RCP and V_SQRT into V_RSQ on AMDGPU for GlobalISel.

Change-Id: I93c5dcb412483156a6e8b68c4085cbce83ac9703


# 14c40511 30-Nov-2021 Abinav Puthan Purayil <[email protected]>

[AMDGPU][NFC] Remove unused defvar in AMDGPUInstructions.td.


# 078da26b 08-Nov-2021 Abinav Puthan Purayil <[email protected]>

[AMDGPU] Check for unneeded shift mask in shift PatFrags.

The existing constrained shift PatFrags only dealt with masked shift
from OpenCL front-ends. This change copies the
X86DAGToDAGISel::isUnnee

[AMDGPU] Check for unneeded shift mask in shift PatFrags.

The existing constrained shift PatFrags only dealt with masked shift
from OpenCL front-ends. This change copies the
X86DAGToDAGISel::isUnneededShiftMask() function to AMDGPU and uses it in
the shift PatFrag predicates.

Differential Revision: https://reviews.llvm.org/D113448

show more ...


# 61e3b9fe 22-Sep-2021 Abinav Puthan Purayil <[email protected]>

[AMDGPU] Add constrained shift pattern matches.

The motivation for this is due to clang's conformance to
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#operators-shift

[AMDGPU] Add constrained shift pattern matches.

The motivation for this is due to clang's conformance to
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#operators-shift
which makes clang emit (<shift> a, (and b, <width> - 1)) for `a <shift> b`
in OpenCL where a is an int of bit width <width>.

Differential revision: https://reviews.llvm.org/D110231

show more ...


# d8699210 15-Oct-2021 Piotr Sobczak <[email protected]>

[AMDGPU] Add patterns for i8/i16 local atomic load/store

Add patterns for i8/i16 local atomic load/store.

Added tests for new patterns.

Copied atomic_[store/load]_local.ll to GlobalISel directory.

[AMDGPU] Add patterns for i8/i16 local atomic load/store

Add patterns for i8/i16 local atomic load/store.

Added tests for new patterns.

Copied atomic_[store/load]_local.ll to GlobalISel directory.

Differential Revision: https://reviews.llvm.org/D111869

show more ...


Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# ce098ccc 06-Jul-2021 Jay Foad <[email protected]>

[AMDGPU] Simplify tablegen files. NFC.

There is no need to cast records to strings before comparing them.


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 46adccc5 12-May-2021 Julien Pagès <[email protected]>

[AMDGPU] Improve Codegen for build_vector

Improve the code generation of build_vector.
Use the v_pack_b32_f16 instruction instead of
v_and_b32 + v_lshl_or_b32

Differential Revision: https://reviews

[AMDGPU] Improve Codegen for build_vector

Improve the code generation of build_vector.
Use the v_pack_b32_f16 instruction instead of
v_and_b32 + v_lshl_or_b32

Differential Revision: https://reviews.llvm.org/D98081

Patch by Julien Pagès!

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# d92b4956 26-Mar-2021 Jay Foad <[email protected]>

[AMDGPU] Inline FSHRPattern into its only use. NFC.


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 91d2e5c8 05-Nov-2020 Paul C. Anagnostopoulos <[email protected]>

[TableGen] Add the !filter bang operator.

Add a test. Update the Programmer's Reference.

Use it in some TableGen files.

Differential Revision: https://reviews.llvm.org/D91008


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4
# 0d5989bb 25-Sep-2020 Jay Foad <[email protected]>

[AMDGPU] Split R600 and GCN bfe patterns

This is in preparation for making the GCN patterns divergence-aware.
NFC.

Differential Revision: https://reviews.llvm.org/D88579


# 286d3fc7 24-Sep-2020 Jay Foad <[email protected]>

[AMDGPU] Split R600 and GCN bfi patterns

This is in preparation for making the GCN patterns divergence-aware.
NFC.

Differential Revision: https://reviews.llvm.org/D88244


Revision tags: llvmorg-11.0.0-rc3
# 277de43d 10-Sep-2020 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Unify intrinsic ret/nortn interface

We have a single noret intrinsic an a lot of special handling
around it. Declare it just as any other but do not define rtn
instructions itself instead.

[AMDGPU] Unify intrinsic ret/nortn interface

We have a single noret intrinsic an a lot of special handling
around it. Declare it just as any other but do not define rtn
instructions itself instead.

Differential Revision: https://reviews.llvm.org/D87719

show more ...


# d17ea67b 21-Aug-2020 Mirko Brkusanin <[email protected]>

[AMDGPU][GlobalISel] Fix 96 and 128 local loads and stores

Fix local ds_read/write_b96/b128 so they can be selected if the alignment
allows. Otherwise, either pick appropriate ds_read2/write2 instru

[AMDGPU][GlobalISel] Fix 96 and 128 local loads and stores

Fix local ds_read/write_b96/b128 so they can be selected if the alignment
allows. Otherwise, either pick appropriate ds_read2/write2 instructions or break
them down.

Differential Revision: https://reviews.llvm.org/D81638

show more ...


123456