Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
# 04fff547 | 07-Mar-2022 | Venkata Ramanaiah Nalamothu <[email protected]>
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call-clobbered register range, are added as live-ins on function entry to preserve their value when the function makes calls, so that they get saved and restored around the calls.
But the DWARF unwind information (CFI) needs to track where the return address resides in a frame, and the above approach makes the return address difficult to track when the CFI information is emitted during frame lowering, because it requires understanding the control flow.
This patch moves the return address ABI registers s[30:31] into the callee-saved register range and stops adding them as live-ins, so that the CFI machinery knows where the return address resides when the CSR save/restore happens during frame lowering.
Doing the above poses an issue: the return instruction now uses the undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use behind the `SI_RETURN` pseudo instruction, which takes no input operands, until the `SI_RETURN` pseudo gets lowered to `S_SETPC_B64_return` during `expandPostRAPseudo()`.
As an added benefit, this patch simplifies overall return instruction handling.
Note: The AMDGPU CFI changes exist only in the downstream code; a separate version of this patch will be posted for downstream review.
Reviewed By: arsenm, ronlieb
Differential Revision: https://reviews.llvm.org/D114652
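To make the lowering step concrete, here is a minimal sketch of how the `SI_RETURN` pseudo could be expanded into `S_SETPC_B64_return`. This is an editorial illustration, not the code from the patch; helper names such as `getReturnAddressReg` and the exact operand handling are assumptions based on the surrounding AMDGPU backend APIs.

```cpp
// Hypothetical sketch (not the actual patch): expanding the SI_RETURN pseudo
// into S_SETPC_B64_return after register allocation, re-introducing the
// return address register only once CSR save/restore has been placed.
#include "GCNSubtarget.h"
#include "SIInstrInfo.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"

using namespace llvm;

static void expandSIReturn(const SIInstrInfo &TII, MachineInstr &MI) {
  MachineBasicBlock &MBB = *MI.getParent();
  const MachineFunction &MF = *MBB.getParent();
  const SIRegisterInfo *TRI = MF.getSubtarget<GCNSubtarget>().getRegisterInfo();

  // SI_RETURN carries no explicit return-address operand; the Undef flag on
  // s[30:31] keeps the machine verifier happy, since frame lowering has
  // already restored the real value via the CSR mechanism.
  MachineInstrBuilder MIB =
      BuildMI(MBB, MI, MI.getDebugLoc(), TII.get(AMDGPU::S_SETPC_B64_return))
          .addReg(TRI->getReturnAddressReg(MF), RegState::Undef);
  MIB.copyImplicitOps(MI); // preserve any implicit operands of the pseudo
  MI.eraseFromParent();
}
```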
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
# 09b53296 | 22-Dec-2021 | Ron Lieberman <[email protected]>
Revert "[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range"
This reverts commit 9075009d1fd5f2bf9aa6c2f362d2993691a316b3.
Failed amdgpu runtime buildbot # 3514
# 9075009d | 22-Dec-2021 | RamNalamothu <[email protected]>
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call-clobbered register range, are added as live-ins on function entry to preserve their value when the function makes calls, so that they get saved and restored around the calls.
But the DWARF unwind information (CFI) needs to track where the return address resides in a frame, and the above approach makes the return address difficult to track when the CFI information is emitted during frame lowering, because it requires understanding the control flow.
This patch moves the return address ABI registers s[30:31] into the callee-saved register range and stops adding them as live-ins, so that the CFI machinery knows where the return address resides when the CSR save/restore happens during frame lowering.
Doing the above poses an issue: the return instruction now uses the undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use behind the `SI_RETURN` pseudo instruction, which takes no input operands, until the `SI_RETURN` pseudo gets lowered to `S_SETPC_B64_return` during `expandPostRAPseudo()`.
As an added benefit, this patch simplifies overall return instruction handling.
Note: The AMDGPU CFI changes exist only in the downstream code; a separate version of this patch will be posted for downstream review.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D114652
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
# 654c89d8 | 06-Sep-2021 | Christudasan Devadasan <[email protected]>
[AMDGPU] Make vector superclasses allocatable
The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch makes them allocatable as a prerequisite for enabling copies between VGPR and AGPR registers during register allocation.
Also adds the missing AV register classes from 192b to 1024b.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D109300
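For orientation, the sketch below shows what an "allocatable vector superclass" could mean for client code; the helper `isAllocatableAVClass` is hypothetical, while `isAllocatable()`, `hasVGPRs()`, and `hasAGPRs()` are existing queries this illustration leans on.

```cpp
// Hypothetical helper (not from the patch): once the combined VGPR+AGPR "AV"
// superclasses are allocatable, the register allocator may assign from them
// directly, and code can recognize such a class like this.
#include "SIRegisterInfo.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"

using namespace llvm;

static bool isAllocatableAVClass(const SIRegisterInfo &TRI,
                                 const TargetRegisterClass *RC) {
  // A vector superclass contains both VGPRs and AGPRs; after this change it
  // is also allocatable rather than a query-only container class.
  return RC->isAllocatable() && TRI.hasVGPRs(RC) && TRI.hasAGPRs(RC);
}
```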
# 76cbe622 | 25-Oct-2021 | Thomas Symalla <[email protected]>
[AMDGPU] Change the AMDGPU_Gfx calling convention by making SGPRs 4..29 callee-saved. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls, as the callee is likely not going to change the argument values.
This patch changes the AMDGPU_Gfx calling convention. It defines the SGPR registers s[4:29] as callee-saved and leaves some SGPRs usable for callers. The intention is to avoid unnecessary s_mov instructions for arguments the caller would otherwise save and restore in these registers.
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D111637
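A hedged sketch of where a convention-dependent callee-saved list typically plugs in is shown below; the CSR_* symbols are assumed names for TableGen-generated save lists and may not match the identifiers the patch actually uses.

```cpp
// Hypothetical sketch: picking a callee-saved register list per calling
// convention. The CSR_* arrays are assumed names for TableGen-generated save
// lists; the amdgpu_gfx list is taken to now include s[4:29].
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/Function.h"
#include "llvm/MC/MCRegisterInfo.h"

using namespace llvm;

extern const MCPhysReg CSR_AMDGPU_SI_Gfx_SaveList[];   // assumed name
extern const MCPhysReg CSR_AMDGPU_HighRegs_SaveList[]; // assumed name
extern const MCPhysReg CSR_AMDGPU_NoRegs_SaveList[];   // assumed name

static const MCPhysReg *getCalleeSavedRegsSketch(const MachineFunction &MF) {
  switch (MF.getFunction().getCallingConv()) {
  case CallingConv::AMDGPU_Gfx:
    return CSR_AMDGPU_SI_Gfx_SaveList; // amdgpu_gfx callees preserve s[4:29]
  case CallingConv::C:
  case CallingConv::Fast:
    return CSR_AMDGPU_HighRegs_SaveList;
  default:
    return CSR_AMDGPU_NoRegs_SaveList; // kernels/entry points save nothing
  }
}
```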
# f0331100 | 25-Oct-2021 | Thomas Symalla <[email protected]>
[AMDGPU] Regenerate some tests with the current version of update_mir_test_checks.py
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
# c7f2f81f | 17-Jul-2021 | Carl Ritson <[email protected]>
[AMDGPU] Tidy SReg/SGPR definitions using template class
Use a multiclass to consistently define SReg/SGPR/TTMP register classes. Add missing TTMP registers for 96b, 160b, 192b, 224b.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D105800
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
# 78b6d73a | 19-Feb-2021 | Matt Arsenault <[email protected]>
AMDGPU: Add even aligned VGPR/AGPR register classes
gfx90a operations require even aligned registers, but this was previously achieved by reserving registers inside the full class.
Ideally this would be captured in the static instruction definitions for the operands, and we would have different instructions per subtarget. The hackiest part of this is that we need to manually reassign AGPR register classes after instruction selection (we get away without this for VGPRs since those types are actually registered for legal types).
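As a small hedged illustration of the "even aligned classes instead of reserved registers" idea, the snippet below selects an Align2 class on subtargets that need it; the class and subtarget query names are assumptions about the generated AMDGPU definitions.

```cpp
// Hypothetical sketch: prefer the even-aligned (Align2) 64-bit VGPR class on
// subtargets such as gfx90a that require even register alignment, instead of
// reserving odd registers inside the full class. Names are assumed.
#include "GCNSubtarget.h"
#include "SIRegisterInfo.h"

using namespace llvm;

static const TargetRegisterClass *pickVReg64Class(const GCNSubtarget &ST) {
  // VReg_64_Align2 contains only even-aligned 64-bit VGPR tuples.
  return ST.needsAlignedVGPRs() ? &AMDGPU::VReg_64_Align2RegClass
                                : &AMDGPU::VReg_64RegClass;
}
```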
# a8d9d507 | 17-Feb-2021 | Stanislav Mekhanoshin <[email protected]>
[AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3 |
# af2cbe8e | 02-Feb-2021 | Matt Arsenault <[email protected]>
AMDGPU: Fix adding extra operands for i128 asm constraints
We don't register i128 as a legal type with addRegisterClass, but it appears in the list of legal register types. This inconsistency resulted in the asm constraint lowering trying to use 2 128-bit registers for these operands. This would leave behind a dead def that would waste registers.
Regresses GlobalISel tests for i128 load/store, but these aren't very important right now. Ideally these would not depend on the list of register types.
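To make the failure mode concrete, a hedged example of the kind of source that hits this path is sketched below; the constraint and assembly text are illustrative rather than taken from the regression tests.

```cpp
// Hypothetical example (not from the patch): an i128 inline-asm output on
// AMDGPU. With the old inconsistency, asm constraint lowering could try to
// cover this value with two 128-bit register tuples and leave a dead def
// behind; the fix removes that wasteful extra def.
__int128 readWideValue() {
  __int128 Value;
  // "=s" requests an SGPR (tuple) for the 128-bit output operand.
  __asm__ volatile("; produce %0 here" : "=s"(Value));
  return Value;
}
```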