|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
3e0bf1c7 |
| 14-Jul-2022 |
David Green <[email protected]> |
[CodeGen] Move instruction predicate verification to emitInstruction
D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added
[CodeGen] Move instruction predicate verification to emitInstruction
D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. The allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments.
Recommitted with some fixes for the leftover MCII variables in release builds.
Differential Revision: https://reviews.llvm.org/D129506
show more ...
|
| #
95252133 |
| 13-Jul-2022 |
David Green <[email protected]> |
Revert "Move instruction predicate verification to emitInstruction"
This reverts commit e2fb8c0f4b940e0285ee36c112469fa75d4b60ff as it does not build for Release builds, and some buildbots are givin
Revert "Move instruction predicate verification to emitInstruction"
This reverts commit e2fb8c0f4b940e0285ee36c112469fa75d4b60ff as it does not build for Release builds, and some buildbots are giving more warning than I saw locally. Reverting to fix those issues.
show more ...
|
| #
e2fb8c0f |
| 13-Jul-2022 |
David Green <[email protected]> |
Move instruction predicate verification to emitInstruction
D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Targ
Move instruction predicate verification to emitInstruction
D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. The allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments.
Differential Revision: https://reviews.llvm.org/D129506
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
2db70021 |
| 25-Mar-2022 |
Austin Kerbow <[email protected]> |
[AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic
Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If there is a need to have highly optimized codegen and kernel developers
[AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic
Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If there is a need to have highly optimized codegen and kernel developers have knowledge of inter-wave runtime behavior which is unknown to the compiler this builtin can be used to tune scheduling.
This intrinsic creates a barrier between scheduling regions. The immediate parameter is a mask to determine the types of instructions that should be prevented from crossing the sched_barrier. In this initial patch, there are only two variations. A mask of 0 means that no instructions may be scheduled across the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing instructions may cross the sched_barrier.
Note that this intrinsic is only meant to work with the scheduling passes. Any other transformations that may move code will not be impacted in the ways described above.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D124700
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
04fff547 |
| 07-Mar-2022 |
Venkata Ramanaiah Nalamothu <[email protected]> |
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added a
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls.
But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvment of understanding the control flow.
This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering.
And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`.
As an added benefit, this patch simplifies overall return instruction handling.
Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code.
Reviewed By: arsenm, ronlieb
Differential Revision: https://reviews.llvm.org/D114652
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
2aed07e9 |
| 16-Feb-2022 |
Shao-Ce SUN <[email protected]> |
[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D119846
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
09b53296 |
| 22-Dec-2021 |
Ron Lieberman <[email protected]> |
Revert "[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range"
This reverts commit 9075009d1fd5f2bf9aa6c2f362d2993691a316b3.
Failed amdgpu runtime buildbot # 3514
|
| #
9075009d |
| 22-Dec-2021 |
RamNalamothu <[email protected]> |
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added a
[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range
Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls.
But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvment of understanding the control flow.
This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering.
And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`.
As an added benefit, this patch simplifies overall return instruction handling.
Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D114652
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
5b8bbbec |
| 18-Nov-2021 |
Zarko Todorovski <[email protected]> |
[NFC][llvm] Inclusive language: reword and remove uses of sanity in llvm/lib/Target
Reworded removed code comments that contain `sanity check` and `sanity test`.
|
| #
76cbe622 |
| 25-Oct-2021 |
Thomas Symalla <[email protected]> |
[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not goin
[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not going to change the argument values.
This patch changes the AMDGPU_Gfx calling convention. It defines the SGPR registers s[4:29] as callee-save and leaves some SGPRs usable for callers. The intention is to avoid unneccessary s_mov instructions for arguments the caller would otherwise save and restore in these registers.
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D111637
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
47d6274d |
| 07-Sep-2021 |
Daniil Fukalov <[email protected]> |
[NFC][AMDGPU] Reduce includes dependencies, part 2
1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Minor forward declarations, red
[NFC][AMDGPU] Reduce includes dependencies, part 2
1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Minor forward declarations, redundant includes and flags in GCNSubtarget cleanup.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D109351
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
bf980930 |
| 16-Jul-2021 |
Sebastian Neubauer <[email protected]> |
[AMDGPU] Ignore KILLs when forming clauses
KILL instructions are sometimes present and prevented hard clauses from being formed.
Fix this by ignoring all meta instructions in clauses.
Differential
[AMDGPU] Ignore KILLs when forming clauses
KILL instructions are sometimes present and prevented hard clauses from being formed.
Fix this by ignoring all meta instructions in clauses.
Differential Revision: https://reviews.llvm.org/D106042
show more ...
|
| #
48958d02 |
| 23-Aug-2021 |
Daniil Fukalov <[email protected]> |
[NFC][AMDGPU] Reduce includes dependencies.
1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `G
[NFC][AMDGPU] Reduce includes dependencies.
1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()` and `R600TargetMachine::getSubtargetImpl()` had different return value type than base class. 4. Minor forward declarations cleanup.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D108596
show more ...
|
| #
b0402a35 |
| 18-Jul-2021 |
Michael Liao <[email protected]> |
[amdgpu] Add 64-bit PC support when expanding unconditional branches.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D106445
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3 |
|
| #
f0ccdde3 |
| 25-Feb-2021 |
Ruiling Song <[email protected]> |
[AMDGPU] Remove SI_MASK_BRANCH
This is already deprecated, so remove code working on this. Also update the tests by using S_CBRANCH_EXECZ instead of SI_MASK_BRANCH.
Reviewed By: foad
Differential
[AMDGPU] Remove SI_MASK_BRANCH
This is already deprecated, so remove code working on this. Also update the tests by using S_CBRANCH_EXECZ instead of SI_MASK_BRANCH.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D97545
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
560d7e04 |
| 20-Jan-2021 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
|
Revision tags: llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
d6199647 |
| 23-Oct-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Increase branch size estimate with offset bug
This will be relaxed to insert a nop if the offset hits the bad value, so over estimate branch instruction sizes.
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
| #
bcd24b2d |
| 14-Feb-2020 |
Fangrui Song <[email protected]> |
[AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI*
|
|
Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init |
|
| #
aa708763 |
| 03-Jan-2020 |
Fangrui Song <[email protected]> |
[MC] Add parameter `Address` to MCInstPrinter::printInst
printInst prints a branch/call instruction as `b offset` (there are many variants on various targets) instead of `b address`.
It is a conven
[MC] Add parameter `Address` to MCInstPrinter::printInst
printInst prints a branch/call instruction as `b offset` (there are many variants on various targets) instead of `b address`.
It is a convention to use address instead of offset in most external symbolizers/disassemblers. This difference makes `llvm-objdump -d` output unsatisfactory.
Add `uint64_t Address` to printInst(), so that it can pass the argument to printInstruction(). `raw_ostream &OS` is moved to the last to be consistent with other print* methods.
The next step is to pass `Address` to printInstruction() (generated by tablegen from the instruction set description). We can gradually migrate targets to print addresses instead of offsets.
In any case, downstream projects which don't know `Address` can pass 0 as the argument.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D72172
show more ...
|
|
Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
| #
e2d104f6 |
| 11-Oct-2019 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] link dpp pseudos and real instructions on gfx10
This defaults to zero fi operand, but we do not expose it anyway. Should we expose it later it needs to be added to the pseudo.
This enables
[AMDGPU] link dpp pseudos and real instructions on gfx10
This defaults to zero fi operand, but we do not expose it anyway. Should we expose it later it needs to be added to the pseudo.
This enables dpp combining on gfx10.
Differential Revision: https://reviews.llvm.org/D68888
llvm-svn: 374604
show more ...
|
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3 |
|
| #
41abf276 |
| 16-Jun-2019 |
Nicolai Haehnle <[email protected]> |
AMDGPU: Prepare for explicit absolute relocations in code generation
Summary: We will use absolute relocations for LDS symbols.
Change-Id: I9a32795ed0ea835e433a787129cfe3c57ee9a325
Reviewers: arse
AMDGPU: Prepare for explicit absolute relocations in code generation
Summary: We will use absolute relocations for LDS symbols.
Change-Id: I9a32795ed0ea835e433a787129cfe3c57ee9a325
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61492
llvm-svn: 363517
show more ...
|
|
Revision tags: llvmorg-8.0.1-rc2 |
|
| #
0f8a764e |
| 05-Jun-2019 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix using 2 different enums for same operand flags
These enums are really for the same namespace of flags set on arbitrary MachineOperands, so merge them to avoid value collisions.
llvm-svn
AMDGPU: Fix using 2 different enums for same operand flags
These enums are really for the same namespace of flags set on arbitrary MachineOperands, so merge them to avoid value collisions.
llvm-svn: 362640
show more ...
|
|
Revision tags: llvmorg-8.0.1-rc1 |
|
| #
33cb8f5b |
| 14-May-2019 |
Tim Renouf <[email protected]> |
[AMDGPU] Fixed +DumpCode
The +DumpCode attribute is a horrible hack in AMDGPU to embed the disassembly of the generated code into the elf file. It is used by LLPC to implement an extension that allo
[AMDGPU] Fixed +DumpCode
The +DumpCode attribute is a horrible hack in AMDGPU to embed the disassembly of the generated code into the elf file. It is used by LLPC to implement an extension that allows the application to read back the disassembly of the code. Longer term, we should re-implement that by using the LLVM disassembler from the Vulkan driver.
Recent LLVM changes broke +DumpCode. With -filetype=asm it crashed, and with -filetype=obj I think it did not include any instructions, only the labels. Fixed with this commit: now it has no effect with -filetype=asm, and works as intended with -filetype=obj.
Differential Revision: https://reviews.llvm.org/D60682
Change-Id: I6436d86fe2ea220d74a643a85e64753747c9366b llvm-svn: 360688
show more ...
|
| #
c0bd7bd4 |
| 11-May-2019 |
Richard Trieu <[email protected]> |
[AMDGPU] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targ
[AMDGPU] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure.
llvm-svn: 360487
show more ...
|