History log of /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h (Results 1 – 25 of 166)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# d1af09ad 23-Jun-2022 Joe Nash <[email protected]>

[AMDGPU] gfx11 Generate VOPD Instructions

We form VOPD instructions in the GCNCreateVOPD pass by combining
back-to-back component instructions. There are strict register
constraints for creating a

[AMDGPU] gfx11 Generate VOPD Instructions

We form VOPD instructions in the GCNCreateVOPD pass by combining
back-to-back component instructions. There are strict register
constraints for creating a legal VOPD, namely that the matching operands
(e.g. src0x and src0y, src1x and src1y) must be in different register
banks. We add a PostRA scheduler
mutation to put possible VOPD components back-to-back.

Depends on D128442, D128270

Reviewed By: #amdgpu, rampitec

Differential Revision: https://reviews.llvm.org/D128656

show more ...


# 4874838a 28-Jun-2022 Piotr Sobczak <[email protected]>

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D1287

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D128756

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5
# 07b7fada 25-May-2022 Joe Nash <[email protected]>

[AMDGPU] gfx11 VOPD instructions MC support

VOPD is a new encoding for dual-issue instructions for use in wave32.
This patch includes MC layer support only.

A VOPD instruction is constituted of an

[AMDGPU] gfx11 VOPD instructions MC support

VOPD is a new encoding for dual-issue instructions for use in wave32.
This patch includes MC layer support only.

A VOPD instruction is constituted of an X component (for which there are
13 possible opcodes) and a Y component (for which there are the 13 X
opcodes plus 3 more). Most of the complexity in defining and parsing
a VOPD operation arises from the possible different total numbers of
operands and deferred parsing of certain operands depending on the
constituent X and Y opcodes.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D128218

show more ...


# 485e8b4f 20-Jun-2022 Dmitry Preobrazhensky <[email protected]>

[AMDGPU][MC][GFX11] Correct disassembly of DPP variants of VOPC64 opcodes

Fix bugs https://github.com/llvm/llvm-project/issues/56091, https://github.com/llvm/llvm-project/issues/56065.

Differential

[AMDGPU][MC][GFX11] Correct disassembly of DPP variants of VOPC64 opcodes

Fix bugs https://github.com/llvm/llvm-project/issues/56091, https://github.com/llvm/llvm-project/issues/56065.

Differential Revision: https://reviews.llvm.org/D128075

show more ...


Revision tags: llvmorg-14.0.4
# e8860bee 12-May-2022 Joe Nash <[email protected]>

[AMDGPU] gfx11 Image instructions

MC layer support for instructions in the MIMG encoding(Image
instructions).

Contributors:
Carl Ritson <[email protected]>

Patch 13/N for upstreaming of AMDGPU g

[AMDGPU] gfx11 Image instructions

MC layer support for instructions in the MIMG encoding(Image
instructions).

Contributors:
Carl Ritson <[email protected]>

Patch 13/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125992

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126463

show more ...


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2
# d21b9b49 21-Apr-2022 Joe Nash <[email protected]>

[AMDGPU] gfx11 scalar alu instructions

MC layer support for SOP(scalar alu operations) including encoding
support for s_delay_alu and s_sendmsg_rtn.

Contributors:
Jay Foad <[email protected]>

Patch

[AMDGPU] gfx11 scalar alu instructions

MC layer support for SOP(scalar alu operations) including encoding
support for s_delay_alu and s_sendmsg_rtn.

Contributors:
Jay Foad <[email protected]>

Patch 7/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125319

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D125498

show more ...


# c7025940 19-Apr-2022 Joe Nash <[email protected]>

[AMDGPU] gfx11 BUF Instructions

Includes MachineCode layer support and tests, and MIR tests not requiring
CodeGen pass changes.
Includes a small change in SMInstructions.td to correct encoded bits.

[AMDGPU] gfx11 BUF Instructions

Includes MachineCode layer support and tests, and MIR tests not requiring
CodeGen pass changes.
Includes a small change in SMInstructions.td to correct encoded bits.

Contributors:
Petar Avramovic <[email protected]>
Dmitry Preobrazhensky <[email protected]>

Depends on D125316

Patch 6/N for upstreaming of AMDGPU gfx11 architecture.

Reviewed By: dp, Petar.Avramovic

Differential Revision: https://reviews.llvm.org/D125319

show more ...


# 8edaf259 12-Apr-2022 Changpeng Fang <[email protected]>

AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally

Summary:
Introduce a new function attribute, amdgpu-no-multigrid-sync-arg, which is default.
We use implicitarg_ptr + offset t

AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally

Summary:
Introduce a new function attribute, amdgpu-no-multigrid-sync-arg, which is default.
We use implicitarg_ptr + offset to check whether the multigrid synchronization
pointer is used. If yes, we remove this attribute and also remove
amdgpu-no-implicitarg-ptr. We generate metadata for the hidden_multigrid_sync_arg
only when the amdgpu-no-multigrid-sync-arg attribute is removed from the function.

Reviewers: arsenm, sameerds, b-sumner and foad

Differential Revision: https://reviews.llvm.org/D123548

show more ...


Revision tags: llvmorg-14.0.1
# 1f6aa903 07-Apr-2022 Dmitry Preobrazhensky <[email protected]>

[AMDGPU][MC][GFX10] Added syntactic sugar for s_waitcnt_depctr operand

Added the following helpers:

depctr_hold_cnt(...)
depctr_sa_sdst(...)
depctr_va_vdst(...)
depctr_va_sdst(...)

[AMDGPU][MC][GFX10] Added syntactic sugar for s_waitcnt_depctr operand

Added the following helpers:

depctr_hold_cnt(...)
depctr_sa_sdst(...)
depctr_va_vdst(...)
depctr_va_sdst(...)
depctr_va_ssrc(...)
depctr_va_vcc(...)
depctr_vm_vsrc(...)

Differential Revision: https://reviews.llvm.org/D123022

show more ...


# 64838ba3 23-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Use GenericTable to classify DGEMM

Since there is a table introduced for MAI instructions extend it
to use for DGEMM classification.

Differential Revision: https://reviews.llvm.org/D122337


# cad9de71 23-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] gfx940 MAI hazard recognizer

Differential Revision: https://reviews.llvm.org/D122263


# 1d817a14 21-Mar-2022 Dmitry Preobrazhensky <[email protected]>

[AMDGPU][MC][NFC] Refactored sendmsg(...) handling

Differential Revision: https://reviews.llvm.org/D121995


# dd5895cc 17-Mar-2022 Changpeng Fang <[email protected]>

AMDGPU: Use the implicit kernargs for code object version 5

Summary:
Specifically, for trap handling, for targets that do not support getDoorbellID,
we load the queue_ptr from the implicit kernarg

AMDGPU: Use the implicit kernargs for code object version 5

Summary:
Specifically, for trap handling, for targets that do not support getDoorbellID,
we load the queue_ptr from the implicit kernarg, and move queue_ptr to s[0:1].
To get aperture bases when targets do not have aperture registers, we load
private_base or shared_base directly from the implicit kernarg. In clang, we use
implicitarg_ptr + offsets to implement __builtin_amdgcn_workgroup_size_{xyz}.

Reviewers: arsenm, sameerds, yaxunl

Differential Revision: https://reviews.llvm.org/D120265

show more ...


# 5977dfba 16-Mar-2022 Dmitry Preobrazhensky <[email protected]>

[AMDGPU][MC][NFC] Refactored custom operands handling

The original design of custom operands support assumed that most GPUs
have the same or very similar operand names end encodings. This is
no long

[AMDGPU][MC][NFC] Refactored custom operands handling

The original design of custom operands support assumed that most GPUs
have the same or very similar operand names end encodings. This is
no longer the case. As a result the support code becomes over-complicated
and difficult to maintain.

This change implements a different design with the following benefits:

- support of aliases;
- support of operands with overlapped encodings;
- identification of defined but unsupported operands.

Differential Revision: https://reviews.llvm.org/D121696

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4
# 8dd3d1cf 10-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Add symbolic names for gfx940 HWREGs

The namespaces of HWREGs is now overlapping with gfx10. Thus the
patch is longer than necessary to just support new names. It also
need to handle proper

[AMDGPU] Add symbolic names for gfx940 HWREGs

The namespaces of HWREGs is now overlapping with gfx10. Thus the
patch is longer than necessary to just support new names. It also
need to handle proper error messages, i.e. to issue a "specified
hardware register is not supported on this GPU" message.

This may need a major refactoring in the future.

Differential Revision: https://reviews.llvm.org/D121418

show more ...


Revision tags: llvmorg-14.0.0-rc3
# 0f20a35b 09-Mar-2022 Changpeng Fang <[email protected]>

AMDGPU: Set up User SGPRs for queue_ptr only when necessary

Summary:
In general, we need queue_ptr for aperture bases and trap handling,
and user SGPRs have to be set up to hold queue_ptr. In curr

AMDGPU: Set up User SGPRs for queue_ptr only when necessary

Summary:
In general, we need queue_ptr for aperture bases and trap handling,
and user SGPRs have to be set up to hold queue_ptr. In current implementation,
user SGPRs are set up unnecessarily for some cases. If the target has aperture
registers, queue_ptr is not needed to reference aperture bases. For trap
handling, if target suppots getDoorbellID, queue_ptr is also not necessary.
Futher, code object version 5 introduces new kernel ABI which passes queue_ptr
as an implicit kernel argument, so user SGPRs are no longer necessary for
queue_ptr. Based on the trap handling document:
https://llvm.org/docs/AMDGPUUsage.html#amdgpu-trap-handler-for-amdhsa-os-v4-onwards-table,
llvm.debugtrap does not need queue_ptr, we remove queue_ptr suport for llvm.debugtrap
in the backend.

Reviewers: sameerds, arsenm

Fixes: SWDEV-307189

Differential Revision: https://reviews.llvm.org/D119762

show more ...


# 8992b50e 02-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] gfx940 uses new names for coherency bits

Differential Revision: https://reviews.llvm.org/D120855


Revision tags: llvmorg-14.0.0-rc2
# ca62b1db 25-Feb-2022 Changpeng Fang <[email protected]>

[AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg

Summary:
Emit metadata for hidden_heap_v1 kernarg

Reviewers:
sameerds, b-sumner

Fixes:
SWDEV-307188

Differential Revision:
https://

[AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg

Summary:
Emit metadata for hidden_heap_v1 kernarg

Reviewers:
sameerds, b-sumner

Fixes:
SWDEV-307188

Differential Revision:
https://reviews.llvm.org/D119027

show more ...


# 6527b2a4 18-Feb-2022 Sebastian Neubauer <[email protected]>

[AMDGPU][NFC] Fix typos

Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# 74702444 02-Feb-2022 Jacob Lambert <[email protected]>

[AMDGPU] Add agpr_count to metadata and AsmParser

gfx90a allows the number of ACC registers (AGPRs) to be set
independently to the VGPR registers. For both HSA and PAL metadata, we
now include an "a

[AMDGPU] Add agpr_count to metadata and AsmParser

gfx90a allows the number of ACC registers (AGPRs) to be set
independently to the VGPR registers. For both HSA and PAL metadata, we
now include an "agpr_count" key to report the number of AGPRs set for
supported devices (gfx90a, gfx908, as determined by hasMAIInsts()).
This is collected from SIProgramInfo.NumAccVGPR for both HSA and PAL.
The AsmParser also now recognizes ".kernel.agpr_count" for supported
devices.

Differential Revision: https://reviews.llvm.org/D116140

show more ...


# 0bad7cb5 16-Feb-2022 Jacob Lambert <[email protected]>

Hoist getTotalNumVGPRs into AMDGPUBaseInfo for use in both codegen and MC

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D119912


# 6655c5a6 16-Feb-2022 Dmitry Preobrazhensky <[email protected]>

[AMDGPU][MC][GFX10] Added an alias for HW_REG_HW_ID1

Enabled HW_REG_HW_ID as an alias for HW_REG_HW_ID1. This is required for compatibility with existing code.

Differential Revision: https://review

[AMDGPU][MC][GFX10] Added an alias for HW_REG_HW_ID1

Enabled HW_REG_HW_ID as an alias for HW_REG_HW_ID1. This is required for compatibility with existing code.

Differential Revision: https://reviews.llvm.org/D119939

show more ...


# cb199e0f 11-Feb-2022 Jay Foad <[email protected]>

[MC] Define and use MCRegisterInfo::regsOverlap

Separate MCRegisterInfo::regsOverlap out from
TargetRegisterInfo::regsOverlap. This is useful in the AMDGPU AsmParser
where we only have access to MCR

[MC] Define and use MCRegisterInfo::regsOverlap

Separate MCRegisterInfo::regsOverlap out from
TargetRegisterInfo::regsOverlap. This is useful in the AMDGPU AsmParser
where we only have access to MCRegisterInfo.

Differential Revision: https://reviews.llvm.org/D119533

show more ...


# c7eb8463 11-Feb-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Merge AMDGPULDSUtils into AMDGPUMemoryUtils

Differential Revision: https://reviews.llvm.org/D119502


# d8f99bb6 11-Feb-2022 Sameer Sahasrabuddhe <[email protected]>

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
r

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216

show more ...


1234567