History log of /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td (Results 1 – 25 of 83)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 04fff547 07-Mar-2022 Venkata Ramanaiah Nalamothu <[email protected]>

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added a

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added as a live-in on the function entry to
preserve its value when we have calls so that it gets saved and restored
around the calls.

But the DWARF unwind information (CFI) needs to track where the return address
resides in a frame and the above approach makes it difficult to track the
return address when the CFI information is emitted during the frame lowering,
due to the involvment of understanding the control flow.

This patch moves the return address ABI registers s[30:31] into callee saved
registers range and stops adding live-in for return address registers, so that
the CFI machinery will know where the return address resides when CSR
save/restore happen during the frame lowering.

And doing the above poses an issue that now the return instruction uses undefined
register `sgpr30_sgpr31`. This is resolved by hiding the return address register
use by the return instruction through the `SI_RETURN` pseudo instruction, which
doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the
`S_SETPC_B64_return` during the `expandPostRAPseudo()`.

As an added benefit, this patch simplifies overall return instruction handling.

Note: The AMDGPU CFI changes are there only in the downstream code and another
version of this patch will be posted for review for the downstream code.

Reviewed By: arsenm, ronlieb

Differential Revision: https://reviews.llvm.org/D114652

show more ...


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 09b53296 22-Dec-2021 Ron Lieberman <[email protected]>

Revert "[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range"

This reverts commit 9075009d1fd5f2bf9aa6c2f362d2993691a316b3.

Failed amdgpu runtime buildbot # 3514


# 9075009d 22-Dec-2021 RamNalamothu <[email protected]>

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added a

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added as a live-in on the function entry to
preserve its value when we have calls so that it gets saved and restored
around the calls.

But the DWARF unwind information (CFI) needs to track where the return address
resides in a frame and the above approach makes it difficult to track the
return address when the CFI information is emitted during the frame lowering,
due to the involvment of understanding the control flow.

This patch moves the return address ABI registers s[30:31] into callee saved
registers range and stops adding live-in for return address registers, so that
the CFI machinery will know where the return address resides when CSR
save/restore happen during the frame lowering.

And doing the above poses an issue that now the return instruction uses undefined
register `sgpr30_sgpr31`. This is resolved by hiding the return address register
use by the return instruction through the `SI_RETURN` pseudo instruction, which
doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the
`S_SETPC_B64_return` during the `expandPostRAPseudo()`.

As an added benefit, this patch simplifies overall return instruction handling.

Note: The AMDGPU CFI changes are there only in the downstream code and another
version of this patch will be posted for review for the downstream code.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D114652

show more ...


Revision tags: llvmorg-13.0.1-rc1
# 76cbe622 25-Oct-2021 Thomas Symalla <[email protected]>

[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not goin

[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not going to change the argument values.

This patch changes the AMDGPU_Gfx calling convention. It defines the SGPR registers s[4:29] as callee-save and leaves some SGPRs usable for callers. The intention is to avoid unneccessary s_mov instructions for arguments the caller would otherwise save and restore in these registers.

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D111637

show more ...


# 9bd5cfeb 23-Oct-2021 Abinav Puthan Purayil <[email protected]>

[AMDGPU] Implement llvm.amdgcn.mulhi.[i,u]24 intrinsics.

These intrinsics maps to the 24-bit v_mul_hi instructions.

This change also fixes an incorrect assumption on the associativity of
24-bit mul

[AMDGPU] Implement llvm.amdgcn.mulhi.[i,u]24 intrinsics.

These intrinsics maps to the 24-bit v_mul_hi instructions.

This change also fixes an incorrect assumption on the associativity of
24-bit mulhi in its SDNode record in tblgen.

Differential Revision: https://reviews.llvm.org/D112394

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 48958d02 23-Aug-2021 Daniil Fukalov <[email protected]>

[NFC][AMDGPU] Reduce includes dependencies.

1. Splitted out some parts of R600 target to separate modules/headers.
2. Reduced some include lists in headers.
3. Found and fixed issue with override `G

[NFC][AMDGPU] Reduce includes dependencies.

1. Splitted out some parts of R600 target to separate modules/headers.
2. Reduced some include lists in headers.
3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()`
and `R600TargetMachine::getSubtargetImpl()` had different return value type
than base class.
4. Minor forward declarations cleanup.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D108596

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# a7786bad 15-Jun-2021 Matt Arsenault <[email protected]>

AMDGPU: Move zeroed FP high bits optimization to patterns


Revision tags: llvmorg-12.0.1-rc1
# c714d037 06-May-2021 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32

Differential Revision: https://reviews.llvm.org/D102022


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3
# f89f6d1e 02-Feb-2021 Thomas Symalla <[email protected]>

[AMDGPU]: Fixes an invalid clamp selection pattern.

When running the tests on PowerPC and x86, the lit test GlobalISel/trunc.ll fails at the memory sanitize step. This seems to be due to wrong inval

[AMDGPU]: Fixes an invalid clamp selection pattern.

When running the tests on PowerPC and x86, the lit test GlobalISel/trunc.ll fails at the memory sanitize step. This seems to be due to wrong invalid logic (which matches even if it shouldn't) and likely missing variable initialisation."

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D95878

show more ...


# d49efdc9 03-Feb-2021 Sebastian Neubauer <[email protected]>

Revert "[AMDGPU] Add a new Clamp Pattern to the GlobalISel Path."

This reverts commits 62af0305b7cc..677a3529d3e6 from D93708.
They cause failures in the sanitizer builds because of uninitialized
va

Revert "[AMDGPU] Add a new Clamp Pattern to the GlobalISel Path."

This reverts commits 62af0305b7cc..677a3529d3e6 from D93708.
They cause failures in the sanitizer builds because of uninitialized
values.

A fix is in D95878, but it might take some time until this is pushed,
so reverting the changes for now.

show more ...


Revision tags: llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 6604d81e 13-Jan-2021 Thomas Symalla <[email protected]>

Added and used new target pseudo for v_cvt_pk_i16_i32, changes due to code review.


Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# c5c58fd6 17-Jun-2020 Matt Arsenault <[email protected]>

AMDGPU: Remove intermediate DAG node for trig_preop intrinsic

We weren't doing anything with this, and keeping it would just add
more boilerplate for GlobalISel.


Revision tags: llvmorg-10.0.1-rc1
# 97f3f0ba 06-May-2020 Matt Arsenault <[email protected]>

AMDGPU: Add intrinsic for s_setreg

This will be more useful with fenv access implemented.


# 0312b9f5 23-Apr-2020 Kazuaki Ishizaki <[email protected]>

[llvm] NFC: Fix trivial typo in rst and td files

Differential Revision: https://reviews.llvm.org/D77469


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4
# f4637925 06-Sep-2019 Matt Arsenault <[email protected]>

AMDGPU: Remove custom node for RSQ_LEGACY

Directly select from the intrinsic. This wasn't getting much value
from the custom node.


# e87ec667 05-Apr-2020 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Fix llvm.amdgcn.div.fmas.ll


# 4c1c9422 18-Feb-2020 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Select llvm.amdgcn.fdot2

I'm slighly worried about the generated checks, since they won't catch
incorrect modifiers being added at the end of the line.


# fab4cdea 20-Feb-2020 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Select llvm.amdgcn.fmul.legacy


# 361f2a78 10-Feb-2020 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Handle sbfe/ubfe intrinsic

Try to handle arbitrary scalar BFEs by packing the operands. The DAG
gives up on non-constant arguments. We're still missing any constant
folding, so we

AMDGPU/GlobalISel: Handle sbfe/ubfe intrinsic

Try to handle arbitrary scalar BFEs by packing the operands. The DAG
gives up on non-constant arguments. We're still missing any constant
folding, so we end up with pretty ugly code most of the time. Also
handle the 64-bit scalar case, which the DAG doesn't try to do.

show more ...


# d1b393d9 07-Feb-2020 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Select G_CTTZ_ZERO_UNDEF

Directly select this rather than going through the intermediate
instruction, which may provide some combine value in the future.


# 045a8921 07-Feb-2020 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Select G_CTLZ_ZERO_UNDEF

Directly select this rather than going through the intermediate
instruction, which may provide some combine value in the future.


# 00115d76 06-Jan-2020 Matt Arsenault <[email protected]>

AMDGPU: Remove dead kill handling

At one point a custom node was used for kill handling, but now the
intrinsic is directly selected. Remove leftover pattern machinery.


# 68b102b9 01-Jan-2020 Matt Arsenault <[email protected]>

AMDGPU: Directly select 16-bank LDS case of llvm.amdgcn.interp.p1.f16

Manually select this is as a tablegen workraound. Both SelectionDAG
and GlobalISel end up misplacing the copy to m0 when both in

AMDGPU: Directly select 16-bank LDS case of llvm.amdgcn.interp.p1.f16

Manually select this is as a tablegen workraound. Both SelectionDAG
and GlobalISel end up misplacing the copy to m0 when both instructions
in the output need it. Neither considers that both output instructions
depend on m0. I don't know of any other pattern we need to handle this
case, so it's less effort to just workaround this for now.

show more ...


# eef92f25 05-Jan-2020 Matt Arsenault <[email protected]>

AMDGPU: Remove custom node for exports

I'm mildly worried about potentially reordering exp/exp_done with
IntrWriteMem on the intrinsic.

Requires hacking out the illegal type on SI, so manually sele

AMDGPU: Remove custom node for exports

I'm mildly worried about potentially reordering exp/exp_done with
IntrWriteMem on the intrinsic.

Requires hacking out the illegal type on SI, so manually select that
case during lowering.

show more ...


# 7fa0bfe7 08-Sep-2019 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Select mul24 intrinsics


1234