History log of /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp (Results 1 – 25 of 154)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 3a205977 19-Jul-2022 Jon Chesterfield <[email protected]>

[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS varia

[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments accessed by intrinsic already
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.

The intent is to emit a __const array of LDS addresses and index into it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060

show more ...


Revision tags: llvmorg-14.0.6
# cef65864 18-Jun-2022 Guillaume Chatelet <[email protected]>

[Alignment] Use Align for MaxKernArgAlign

Differential Revision: https://reviews.llvm.org/D128118


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2
# cc5a1b3d 16-Apr-2022 Matt Arsenault <[email protected]>

llvm-reduce: Add cloning of target MachineFunctionInfo

MIR support is totally unusable for AMDGPU without this, since the set
of reserved registers is set from fields here.

Add a clone method to Ma

llvm-reduce: Add cloning of target MachineFunctionInfo

MIR support is totally unusable for AMDGPU without this, since the set
of reserved registers is set from fields here.

Add a clone method to MachineFunctionInfo. This is a subtle variant of
the copy constructor that is required if there are any MIR constructs
that use pointers. Specifically, at minimum fields that reference
MachineBasicBlocks or the MachineFunction need to be adjusted to the
values in the new function.

show more ...


# cfe51684 18-Apr-2022 Matt Arsenault <[email protected]>

AMDGPU: Make PSV instances static members


# dd7e407d 02-Jun-2022 Matt Arsenault <[email protected]>

AMDGPU: Move SpilledReg from MFI to SIRegisterInfo

This isn't the most natural place for it, but it avoids a circular
include dependency in an out of tree patch.


# 5bd87350 21-Apr-2022 hsmahesha <[email protected]>

[AMDGPU] On gfx908, reserve VGPR for AGPR copy based on register budget.

Based on available register budget, reserve highest available VGPR for
AGPR copy before RA. After RA, shift it to lowest unus

[AMDGPU] On gfx908, reserve VGPR for AGPR copy based on register budget.

Based on available register budget, reserve highest available VGPR for
AGPR copy before RA. After RA, shift it to lowest unused VGPR if the one
exist.

Fixes SWDEV-330006.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D123525

show more ...


# 987df725 16-Apr-2022 Matt Arsenault <[email protected]>

AMDGPU: Serialize VGPRForAGPRCopy


# b5ec1312 16-Apr-2022 Matt Arsenault <[email protected]>

AMDGPU: Fix allocating GDS globals to LDS offsets

These don't seem to be very well used or tested, but try to make the
behavior a bit more consistent with LDS globals.

I'm not sure what the definit

AMDGPU: Fix allocating GDS globals to LDS offsets

These don't seem to be very well used or tested, but try to make the
behavior a bit more consistent with LDS globals.

I'm not sure what the definition for amdgpu-gds-size is supposed to
mean. For now I assumed it's allocating a static size at the beginning
of the allocation, and any known globals are allocated after it.

show more ...


# 378bb801 16-Apr-2022 Matt Arsenault <[email protected]>

AMDGPU: Serialize a few more MachineFunctionInfo fields in MIR


# f90f4884 16-Apr-2022 Matt Arsenault <[email protected]>

AMDGPU: Serialize gds size in MIR


# 5cd17f9d 16-Apr-2022 Matt Arsenault <[email protected]>

AMDGPU: Serialize WWM registers


# e0d585d7 16-Apr-2022 Matt Arsenault <[email protected]>

AMDGPU: Defer creation of WWM VGPR spill slots

There's no reason to create these immediately. They can be created in
the prolog/epilog code like CSR spills. There's probably a cleaner way
to do this

AMDGPU: Defer creation of WWM VGPR spill slots

There's no reason to create these immediately. They can be created in
the prolog/epilog code like CSR spills. There's probably a cleaner way
to do this by utilizing the CSR spill code.

This makes the frame index used transient state for
PrologEpilogInserter, and thus makes serialization easier. Really this
doesn't need to be saved here but there isn't really a better place
for it.

show more ...


# ea47373a 14-Apr-2022 hsmahesha <[email protected]>

[AMDGPU][NFC] Organize code around reserving VGPR32 for AGPR copy.

This is an NFC patch in preparation to fix a bug related to always
reserving VGPR32 for AGPR copy.

Reviewed By: rampitec

Differen

[AMDGPU][NFC] Organize code around reserving VGPR32 for AGPR copy.

This is an NFC patch in preparation to fix a bug related to always
reserving VGPR32 for AGPR copy.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D123651

show more ...


Revision tags: llvmorg-14.0.1
# 8384ced9 28-Mar-2022 Changpeng Fang <[email protected]>

[AMDGPU][NFC]: Remove unnecessary MFI functions

Summary:
hasHostcallPtr() and hasHeapPtr() are only used in metadata emit.
However, we can use the corresponding function attributes directly
instea

[AMDGPU][NFC]: Remove unnecessary MFI functions

Summary:
hasHostcallPtr() and hasHeapPtr() are only used in metadata emit.
However, we can use the corresponding function attributes directly
instead introducing the functions.

Reviewers: arsenm

Differential Revision: https://reviews.llvm.org/D122600

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# ca62b1db 25-Feb-2022 Changpeng Fang <[email protected]>

[AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg

Summary:
Emit metadata for hidden_heap_v1 kernarg

Reviewers:
sameerds, b-sumner

Fixes:
SWDEV-307188

Differential Revision:
https://

[AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg

Summary:
Emit metadata for hidden_heap_v1 kernarg

Reviewers:
sameerds, b-sumner

Fixes:
SWDEV-307188

Differential Revision:
https://reviews.llvm.org/D119027

show more ...


# 6527b2a4 18-Feb-2022 Sebastian Neubauer <[email protected]>

[AMDGPU][NFC] Fix typos

Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235


# d8f99bb6 11-Feb-2022 Sameer Sahasrabuddhe <[email protected]>

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
r

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216

show more ...


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3
# aeaf85b9 13-Jan-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Select VGPR versions of MFMA if possible

We can select _vgprcd versions of MAI instructions and have no
AGPRs with the whole budget left for VGPRs if:

1. This is a kernel;
2. It has no cal

[AMDGPU] Select VGPR versions of MFMA if possible

We can select _vgprcd versions of MAI instructions and have no
AGPRs with the whole budget left for VGPRs if:

1. This is a kernel;
2. It has no calls;
3. It runs at least on 2 waves thus having not more that 256 VGPRs.
4. There is no inline asm requesting AGPRs.

Differential Revision: https://reviews.llvm.org/D117253

show more ...


Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# d6fdbbca 24-Nov-2021 Matt Arsenault <[email protected]>

AMDGPU: Add second emergency slot for SGPR to vmem for large frames

In a future change, we will sometimes use a VGPR offset for doing
spills to memory, in which case we need 2 free VGPRs to do the S

AMDGPU: Add second emergency slot for SGPR to vmem for large frames

In a future change, we will sometimes use a VGPR offset for doing
spills to memory, in which case we need 2 free VGPRs to do the SGPR
spill. In most cases we could spill the VGPR along with the SGPR being
spilled, but we don't have any free lanes for SGPR_1024 in wave32 so
we could still potentially need a second scavenging slot.

show more ...


# de1600a1 10-Jan-2022 Matt Arsenault <[email protected]>

AMDGPU: Avoid enabling kernel workitem IDs with reqd_work_group_size


# 8470bf2b 12-Jan-2022 Austin Kerbow <[email protected]>

[AMDGPU] Do not reserve any VGPR for SGPR spills

After the split register allocation changes in eebe841a47cb it is no
longer necessary to reserve a VGPR before RA. This can also create bugs
when IPR

[AMDGPU] Do not reserve any VGPR for SGPR spills

After the split register allocation changes in eebe841a47cb it is no
longer necessary to reserve a VGPR before RA. This can also create bugs
when IPRA is enabled since we cannot predict that a called function may
not reserve any register if it does not have any SGPR spills. If that
happens those functions may override reserved registers that are
normally callee saved. Added a test to show this.

Fixes: SWDEV-309900

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D115551

show more ...


# d45a2479 18-Dec-2021 Brendon Cahoon <[email protected]>

[AMDGPU] Don't remove VGPR to AGPR dead spills from frame info

Removing dead frame indices for VGPR to AGPR spills is incorrect
when the frame index is shared by multiple objects, which may
occur du

[AMDGPU] Don't remove VGPR to AGPR dead spills from frame info

Removing dead frame indices for VGPR to AGPR spills is incorrect
when the frame index is shared by multiple objects, which may
occur due to stack slot coloring. The problem is that subsequent
code that processes the other object will assert because the stack
frame index is marked dead.

Removing dead frame indices is needed prior to stack slot
coloring, which is what happens with SGPR to VGPR spills. These
spills are lowered prior to stack slot coloring, but the VGPR
to AGPR spills are processed afterwards during the Prolog/Epilog
Inserter pass. This patch marks the VGPR to AGPR spill slot as
dead if the slot is not used by another object.

Differential Revision: https://reviews.llvm.org/D115996

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 06b90175 14-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Remove fixed function ABI option


# 729bf9b2 14-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Enable fixed function ABI by default

Code using indirect calls is broken without this, and there isn't
really much value in supporting the old attempt to vary the argument
placement based on

AMDGPU: Enable fixed function ABI by default

Code using indirect calls is broken without this, and there isn't
really much value in supporting the old attempt to vary the argument
placement based on uses. This resulted in more argument shuffling code
anyway.

Also have the option stop implying all inputs need to be passed. This
will no rely on the amdgpu-no-* attributes to avoid passing
unnecessary values.

show more ...


# e5340ed3 27-Oct-2021 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Fix global isel for kernels using agprs on gfx90a

With Global ISel getReservedRegs() is called before function is
regbank selected for the first time. Defer caching of usesAGPRs()
in this c

[AMDGPU] Fix global isel for kernels using agprs on gfx90a

With Global ISel getReservedRegs() is called before function is
regbank selected for the first time. Defer caching of usesAGPRs()
in this case.

Differential Revision: https://reviews.llvm.org/D112644

show more ...


1234567