AMDGPUInstrInfo.td - OpenGrok history log for /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 04fff547	07-Mar-2022	Venkata Ramanaiah Nalamothu <[email protected]>	[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added a [AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls. But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvment of understanding the control flow. This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering. And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`. As an added benefit, this patch simplifies overall return instruction handling. Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code. Reviewed By: arsenm, ronlieb Differential Revision: https://reviews.llvm.org/D114652 show more ...
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 09b53296	22-Dec-2021	Ron Lieberman <[email protected]>	Revert "[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range" This reverts commit 9075009d1fd5f2bf9aa6c2f362d2993691a316b3. Failed amdgpu runtime buildbot # 3514
# 9075009d	22-Dec-2021	RamNalamothu <[email protected]>	[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added a [AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls. But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvment of understanding the control flow. This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering. And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`. As an added benefit, this patch simplifies overall return instruction handling. Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D114652 show more ...
Revision tags: llvmorg-13.0.1-rc1
# 76cbe622	25-Oct-2021	Thomas Symalla <[email protected]>	[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not goin [AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not going to change the argument values. This patch changes the AMDGPU_Gfx calling convention. It defines the SGPR registers s[4:29] as callee-save and leaves some SGPRs usable for callers. The intention is to avoid unneccessary s_mov instructions for arguments the caller would otherwise save and restore in these registers. Reviewed By: sebastian-ne Differential Revision: https://reviews.llvm.org/D111637 show more ...
# 9bd5cfeb	23-Oct-2021	Abinav Puthan Purayil <[email protected]>	[AMDGPU] Implement llvm.amdgcn.mulhi.[i,u]24 intrinsics. These intrinsics maps to the 24-bit v_mul_hi instructions. This change also fixes an incorrect assumption on the associativity of 24-bit mul [AMDGPU] Implement llvm.amdgcn.mulhi.[i,u]24 intrinsics. These intrinsics maps to the 24-bit v_mul_hi instructions. This change also fixes an incorrect assumption on the associativity of 24-bit mulhi in its SDNode record in tblgen. Differential Revision: https://reviews.llvm.org/D112394 show more ...
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 48958d02	23-Aug-2021	Daniil Fukalov <[email protected]>	[NFC][AMDGPU] Reduce includes dependencies. 1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `G [NFC][AMDGPU] Reduce includes dependencies. 1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()` and `R600TargetMachine::getSubtargetImpl()` had different return value type than base class. 4. Minor forward declarations cleanup. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D108596 show more ...
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# a7786bad	15-Jun-2021	Matt Arsenault <[email protected]>	AMDGPU: Move zeroed FP high bits optimization to patterns
Revision tags: llvmorg-12.0.1-rc1
# c714d037	06-May-2021	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32 Differential Revision: https://reviews.llvm.org/D102022
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3
# f89f6d1e	02-Feb-2021	Thomas Symalla <[email protected]>	[AMDGPU]: Fixes an invalid clamp selection pattern. When running the tests on PowerPC and x86, the lit test GlobalISel/trunc.ll fails at the memory sanitize step. This seems to be due to wrong inval [AMDGPU]: Fixes an invalid clamp selection pattern. When running the tests on PowerPC and x86, the lit test GlobalISel/trunc.ll fails at the memory sanitize step. This seems to be due to wrong invalid logic (which matches even if it shouldn't) and likely missing variable initialisation." Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D95878 show more ...
# d49efdc9	03-Feb-2021	Sebastian Neubauer <[email protected]>	Revert "[AMDGPU] Add a new Clamp Pattern to the GlobalISel Path." This reverts commits 62af0305b7cc..677a3529d3e6 from D93708. They cause failures in the sanitizer builds because of uninitialized va Revert "[AMDGPU] Add a new Clamp Pattern to the GlobalISel Path." This reverts commits 62af0305b7cc..677a3529d3e6 from D93708. They cause failures in the sanitizer builds because of uninitialized values. A fix is in D95878, but it might take some time until this is pushed, so reverting the changes for now. show more ...
Revision tags: llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 6604d81e	13-Jan-2021	Thomas Symalla <[email protected]>	Added and used new target pseudo for v_cvt_pk_i16_i32, changes due to code review.
Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# c5c58fd6	17-Jun-2020	Matt Arsenault <[email protected]>	AMDGPU: Remove intermediate DAG node for trig_preop intrinsic We weren't doing anything with this, and keeping it would just add more boilerplate for GlobalISel.
Revision tags: llvmorg-10.0.1-rc1
# 97f3f0ba	06-May-2020	Matt Arsenault <[email protected]>	AMDGPU: Add intrinsic for s_setreg This will be more useful with fenv access implemented.
# 0312b9f5	23-Apr-2020	Kazuaki Ishizaki <[email protected]>	[llvm] NFC: Fix trivial typo in rst and td files Differential Revision: https://reviews.llvm.org/D77469
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4
# f4637925	06-Sep-2019	Matt Arsenault <[email protected]>	AMDGPU: Remove custom node for RSQ_LEGACY Directly select from the intrinsic. This wasn't getting much value from the custom node.
# e87ec667	05-Apr-2020	Matt Arsenault <[email protected]>	AMDGPU/GlobalISel: Fix llvm.amdgcn.div.fmas.ll
# 4c1c9422	18-Feb-2020	Matt Arsenault <[email protected]>	AMDGPU/GlobalISel: Select llvm.amdgcn.fdot2 I'm slighly worried about the generated checks, since they won't catch incorrect modifiers being added at the end of the line.
# fab4cdea	20-Feb-2020	Matt Arsenault <[email protected]>	AMDGPU/GlobalISel: Select llvm.amdgcn.fmul.legacy
# 361f2a78	10-Feb-2020	Matt Arsenault <[email protected]>	AMDGPU/GlobalISel: Handle sbfe/ubfe intrinsic Try to handle arbitrary scalar BFEs by packing the operands. The DAG gives up on non-constant arguments. We're still missing any constant folding, so we AMDGPU/GlobalISel: Handle sbfe/ubfe intrinsic Try to handle arbitrary scalar BFEs by packing the operands. The DAG gives up on non-constant arguments. We're still missing any constant folding, so we end up with pretty ugly code most of the time. Also handle the 64-bit scalar case, which the DAG doesn't try to do. show more ...
# d1b393d9	07-Feb-2020	Matt Arsenault <[email protected]>	AMDGPU/GlobalISel: Select G_CTTZ_ZERO_UNDEF Directly select this rather than going through the intermediate instruction, which may provide some combine value in the future.
# 045a8921	07-Feb-2020	Matt Arsenault <[email protected]>	AMDGPU/GlobalISel: Select G_CTLZ_ZERO_UNDEF Directly select this rather than going through the intermediate instruction, which may provide some combine value in the future.
# 00115d76	06-Jan-2020	Matt Arsenault <[email protected]>	AMDGPU: Remove dead kill handling At one point a custom node was used for kill handling, but now the intrinsic is directly selected. Remove leftover pattern machinery.
# 68b102b9	01-Jan-2020	Matt Arsenault <[email protected]>	AMDGPU: Directly select 16-bank LDS case of llvm.amdgcn.interp.p1.f16 Manually select this is as a tablegen workraound. Both SelectionDAG and GlobalISel end up misplacing the copy to m0 when both in AMDGPU: Directly select 16-bank LDS case of llvm.amdgcn.interp.p1.f16 Manually select this is as a tablegen workraound. Both SelectionDAG and GlobalISel end up misplacing the copy to m0 when both instructions in the output need it. Neither considers that both output instructions depend on m0. I don't know of any other pattern we need to handle this case, so it's less effort to just workaround this for now. show more ...
# eef92f25	05-Jan-2020	Matt Arsenault <[email protected]>	AMDGPU: Remove custom node for exports I'm mildly worried about potentially reordering exp/exp_done with IntrWriteMem on the intrinsic. Requires hacking out the illegal type on SI, so manually sele AMDGPU: Remove custom node for exports I'm mildly worried about potentially reordering exp/exp_done with IntrWriteMem on the intrinsic. Requires hacking out the illegal type on SI, so manually select that case during lowering. show more ...
# 7fa0bfe7	08-Sep-2019	Matt Arsenault <[email protected]>	AMDGPU/GlobalISel: Select mul24 intrinsics
12 3 4