AMDGPUMCInstLower.cpp - OpenGrok history log for /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 3e0bf1c7	14-Jul-2022	David Green <[email protected]>	[CodeGen] Move instruction predicate verification to emitInstruction D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses. This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. The allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally. The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments. Recommitted with some fixes for the leftover MCII variables in release builds. Differential Revision: https://reviews.llvm.org/D129506
# 95252133	13-Jul-2022	David Green <[email protected]>	Revert "Move instruction predicate verification to emitInstruction" This reverts commit e2fb8c0f4b940e0285ee36c112469fa75d4b60ff as it does not build for Release builds, and some buildbots are giving more warning than I saw locally. Reverting to fix those issues.
# e2fb8c0f	13-Jul-2022	David Green <[email protected]>	Move instruction predicate verification to emitInstruction D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses. This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. The allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally. The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments. Differential Revision: https://reviews.llvm.org/D129506
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 2db70021	25-Mar-2022	Austin Kerbow <[email protected]>	[AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If there is a need to have highly optimized codegen and kernel developers have knowledge of inter-wave runtime behavior which is unknown to the compiler this builtin can be used to tune scheduling. This intrinsic creates a barrier between scheduling regions. The immediate parameter is a mask to determine the types of instructions that should be prevented from crossing the sched_barrier. In this initial patch, there are only two variations. A mask of 0 means that no instructions may be scheduled across the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing instructions may cross the sched_barrier. Note that this intrinsic is only meant to work with the scheduling passes. Any other transformations that may move code will not be impacted in the ways described above. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D124700
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 04fff547	07-Mar-2022	Venkata Ramanaiah Nalamothu <[email protected]>	[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls. But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvment of understanding the control flow. This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering. And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`. As an added benefit, this patch simplifies overall return instruction handling. Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code. Reviewed By: arsenm, ronlieb Differential Revision: https://reviews.llvm.org/D114652
Revision tags: llvmorg-14.0.0-rc2
# 2aed07e9	16-Feb-2022	Shao-Ce SUN <[email protected]>	[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter` Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 09b53296	22-Dec-2021	Ron Lieberman <[email protected]>	Revert "[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range" This reverts commit 9075009d1fd5f2bf9aa6c2f362d2993691a316b3. Failed amdgpu runtime buildbot # 3514
# 9075009d	22-Dec-2021	RamNalamothu <[email protected]>	[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve its value when we have calls so that it gets saved and restored around the calls. But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvment of understanding the control flow. This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering. And doing the above poses an issue that now the return instruction uses undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`. As an added benefit, this patch simplifies overall return instruction handling. Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D114652
Revision tags: llvmorg-13.0.1-rc1
# 5b8bbbec	18-Nov-2021	Zarko Todorovski <[email protected]>	[NFC][llvm] Inclusive language: reword and remove uses of sanity in llvm/lib/Target Reworded removed code comments that contain `sanity check` and `sanity test`.
# 76cbe622	25-Oct-2021	Thomas Symalla <[email protected]>	[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not going to change the argument values. This patch changes the AMDGPU_Gfx calling convention. It defines the SGPR registers s[4:29] as callee-save and leaves some SGPRs usable for callers. The intention is to avoid unneccessary s_mov instructions for arguments the caller would otherwise save and restore in these registers. Reviewed By: sebastian-ne Differential Revision: https://reviews.llvm.org/D111637
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# 47d6274d	07-Sep-2021	Daniil Fukalov <[email protected]>	[NFC][AMDGPU] Reduce includes dependencies, part 2 1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Minor forward declarations, redundant includes and flags in GCNSubtarget cleanup. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D109351
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# bf980930	16-Jul-2021	Sebastian Neubauer <[email protected]>	[AMDGPU] Ignore KILLs when forming clauses KILL instructions are sometimes present and prevented hard clauses from being formed. Fix this by ignoring all meta instructions in clauses. Differential Revision: https://reviews.llvm.org/D106042
# 48958d02	23-Aug-2021	Daniil Fukalov <[email protected]>	[NFC][AMDGPU] Reduce includes dependencies. 1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()` and `R600TargetMachine::getSubtargetImpl()` had different return value type than base class. 4. Minor forward declarations cleanup. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D108596
# b0402a35	18-Jul-2021	Michael Liao <[email protected]>	[amdgpu] Add 64-bit PC support when expanding unconditional branches. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D106445
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3
# f0ccdde3	25-Feb-2021	Ruiling Song <[email protected]>	[AMDGPU] Remove SI_MASK_BRANCH This is already deprecated, so remove code working on this. Also update the tests by using S_CBRANCH_EXECZ instead of SI_MASK_BRANCH. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D97545
Revision tags: llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04	20-Jan-2021	dfukalov <[email protected]>	[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets ... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036
Revision tags: llvmorg-11.1.0-rc1
# 6a87e9b0	25-Dec-2020	dfukalov <[email protected]>	[NFC][AMDGPU] Reduce include files dependency. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# d6199647	23-Oct-2020	Matt Arsenault <[email protected]>	AMDGPU: Increase branch size estimate with offset bug This will be relaxed to insert a nop if the offset hits the bad value, so over estimate branch instruction sizes.
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3
# bcd24b2d	14-Feb-2020	Fangrui Song <[email protected]>	[AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI*
Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init
# aa708763	03-Jan-2020	Fangrui Song <[email protected]>	[MC] Add parameter `Address` to MCInstPrinter::printInst printInst prints a branch/call instruction as `b offset` (there are many variants on various targets) instead of `b address`. It is a convention to use address instead of offset in most external symbolizers/disassemblers. This difference makes `llvm-objdump -d` output unsatisfactory. Add `uint64_t Address` to printInst(), so that it can pass the argument to printInstruction(). `raw_ostream &OS` is moved to the last to be consistent with other print* methods. The next step is to pass `Address` to printInstruction() (generated by tablegen from the instruction set description). We can gradually migrate targets to print addresses instead of offsets. In any case, downstream projects which don't know `Address` can pass 0 as the argument. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72172
Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# e2d104f6	11-Oct-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] link dpp pseudos and real instructions on gfx10 This defaults to zero fi operand, but we do not expose it anyway. Should we expose it later it needs to be added to the pseudo. This enables dpp combining on gfx10. Differential Revision: https://reviews.llvm.org/D68888 llvm-svn: 374604
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# 41abf276	16-Jun-2019	Nicolai Haehnle <[email protected]>	AMDGPU: Prepare for explicit absolute relocations in code generation Summary: We will use absolute relocations for LDS symbols. Change-Id: I9a32795ed0ea835e433a787129cfe3c57ee9a325 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61492 llvm-svn: 363517
Revision tags: llvmorg-8.0.1-rc2
# 0f8a764e	05-Jun-2019	Matt Arsenault <[email protected]>	AMDGPU: Fix using 2 different enums for same operand flags These enums are really for the same namespace of flags set on arbitrary MachineOperands, so merge them to avoid value collisions. llvm-svn: 362640
Revision tags: llvmorg-8.0.1-rc1
# 33cb8f5b	14-May-2019	Tim Renouf <[email protected]>	[AMDGPU] Fixed +DumpCode The +DumpCode attribute is a horrible hack in AMDGPU to embed the disassembly of the generated code into the elf file. It is used by LLPC to implement an extension that allows the application to read back the disassembly of the code. Longer term, we should re-implement that by using the LLVM disassembler from the Vulkan driver. Recent LLVM changes broke +DumpCode. With -filetype=asm it crashed, and with -filetype=obj I think it did not include any instructions, only the labels. Fixed with this commit: now it has no effect with -filetype=asm, and works as intended with -filetype=obj. Differential Revision: https://reviews.llvm.org/D60682 Change-Id: I6436d86fe2ea220d74a643a85e64753747c9366b llvm-svn: 360688
# c0bd7bd4	11-May-2019	Richard Trieu <[email protected]>	[AMDGPU] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360487