|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
8871c3c5 |
| 27-Jun-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Regenerate MIR checks. NFC.
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
873a7ee7 |
| 10-Jan-2022 |
Nikita Popov <[email protected]> |
[MachineInstr] Don't include debug uses in bundle header (PR52817)
Following the recommendation in https://github.com/llvm/llvm-project/issues/52817#issuecomment-1007635426, this excludes debug inst
[MachineInstr] Don't include debug uses in bundle header (PR52817)
Following the recommendation in https://github.com/llvm/llvm-project/issues/52817#issuecomment-1007635426, this excludes debug instructions when finalizing the bundle. As uses in debug instructions don't have effects, they will no longer be included in the BUNDLE header.
Fixes https://github.com/llvm/llvm-project/issues/52817.
Differential Revision: https://reviews.llvm.org/D116945
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
f8e61546 |
| 18-Nov-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Fix SIPostRABundler crash on null register used by dbg value
Recently we started generate DBG_VALUEs with $noreg operands. This crashes SIPostRABundler, and it should not iterate these regi
[AMDGPU] Fix SIPostRABundler crash on null register used by dbg value
Recently we started generate DBG_VALUEs with $noreg operands. This crashes SIPostRABundler, and it should not iterate these registers anyway.
Fixes: SWDEV-311733
Differential Revision: https://reviews.llvm.org/D114202
show more ...
|
| #
24cc79b9 |
| 18-Nov-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Regenerate postra-bundle-memops.mir checks. NFC.
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
fae05692 |
| 20-May-2021 |
Matt Arsenault <[email protected]> |
CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert it to a scalar. This should be removed in the near future (I think I converted
CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert it to a scalar. This should be removed in the near future (I think I converted all of the tests already, but likely missed a few).
Not sure what the exact syntax and policy should be. We can continue printing the number of bytes for non-generic instructions to avoid test churn and only allow non-scalar types for generic instructions.
This will currently print the LLT in parentheses, but accept parsing the existing integers and implicitly converting to scalar. The parentheses are a bit ugly, but the parser logic seems unable to deal without either parentheses or some keyword to indicate the start of a type.
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
3bffb1cd |
| 09-Feb-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and I hope the amoun
[AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and I hope the amount of code. These operands are mostly 0 anyway.
Additional advantage that parser will accept these flags in any order unlike now.
Differential Revision: https://reviews.llvm.org/D96469
show more ...
|
| #
81b2c23b |
| 12-Feb-2021 |
Matt Arsenault <[email protected]> |
AMDGPU: Use kill instruction to hint soft clause live ranges
Previously we would use a bundle to hint the register allocator to not overwrite the pointers in a sequence of loads to avoid breaking so
AMDGPU: Use kill instruction to hint soft clause live ranges
Previously we would use a bundle to hint the register allocator to not overwrite the pointers in a sequence of loads to avoid breaking soft clauses. This bundling was based on a fuzzy register pressure heuristic, so we could not guarantee using more registers than are really available. This would result in register allocator failing on unsatisfiable bundles. Use a kill to artificially extend the live ranges, so we can always succeed at register allocation even if it means extra spills in the worst case.
This seems to capture most of the benefit of the bundle while avoiding most of the risk presented by the bundle. However the lit tests do show a handful of regressions. In some cases with sequences of volatile loads, unused load components end up getting reallocated to the next load which forces a wait between. There are also a few small scheduling regressions where a hazard used to be avoided, and one spill torture test which for some reason nearly doubles the stack usage. There is also a bit of noise from leftover kills (it may make sense for post-RA pseudos to strip all of these out).
show more ...
|
| #
a8d9d507 |
| 17-Feb-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
|
| #
a7455d7b |
| 14-Feb-2021 |
Matt Arsenault <[email protected]> |
AMDGPU: Remove kills following clusters of memory instruction
In a future commit, soft clauses will be hinted with kill instructions rather than forced together with bundles. Look for kills that loo
AMDGPU: Remove kills following clusters of memory instruction
In a future commit, soft clauses will be hinted with kill instructions rather than forced together with bundles. Look for kills that look like this, and erase them. I'm not sure if the check for specific uses is worthwhile, or if it would be better to just unconditionally erase kills.
This reduces test churn in a future patch.
show more ...
|
| #
c320e819 |
| 14-Feb-2021 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix debug info handling in post-RA bundler
This was allowing debug instructions to break the bundling, which would change scheduling behavior. Bundle debug info / kills inside the bundle. Th
AMDGPU: Fix debug info handling in post-RA bundler
This was allowing debug instructions to break the bundling, which would change scheduling behavior. Bundle debug info / kills inside the bundle. This seems to work OK, although the asm printer doesn't understand these in a bundle. This implicitly expects the memory legalizer to unbundle. It would probably be slightly nicer to move these after.
Rewrite the loop to be clearer and make sure we don't end a bundle on a meta instruction, only allow them in between other valid bundle instructions.
show more ...
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
f012c58a |
| 29-May-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Move MIMG MMO check to verifier
|
|
Revision tags: llvmorg-10.0.1-rc1 |
|
| #
2e94a64b |
| 10-Apr-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Define 16 bit SGPR subregs
These are needed as a counterpart for VGPR subregs even though there are no scalar instructions which can operate 16 bit values. When we are materializing a const
[AMDGPU] Define 16 bit SGPR subregs
These are needed as a counterpart for VGPR subregs even though there are no scalar instructions which can operate 16 bit values. When we are materializing a constant that is done into an SGPR and that SGPR may/will be copied into a 16 bit VGPR subreg. Such copy is illegal. There are also similar problems if a source operand of a 16 bit VALU instruction is an SGPR. In addition we need to get a register with a lo16 subregister of an SGPR RC during selection and this fails as well.
All of that makes me believe we need these subregisters as a syntactic glue.
Differential Revision: https://reviews.llvm.org/D78250
show more ...
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2 |
|
| #
08682dcc |
| 05-Feb-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Define 16 bit VGPR subregs
We have loads preserving low and high 16 bits of their destinations. However, we always use a whole 32 bit register for these. The same happens with 16 bit stores
[AMDGPU] Define 16 bit VGPR subregs
We have loads preserving low and high 16 bits of their destinations. However, we always use a whole 32 bit register for these. The same happens with 16 bit stores, we have to use full 32 bit register so if high bits are clobbered the register needs to be copied. One example of such code is added to the load-hi16.ll.
The proper solution to the problem is to define 16 bit subregs and use them in the operations which do not read another half of a VGPR or preserve it if the VGPR is written.
This patch simply defines subregisters and register classes. At the moment there should be no difference in code generation. A lot more work is needed to actually use these new register classes. Therefore, there are no new tests at this time.
Register weight calculation has changed with new subregs so appropriate changes were made to keep all calculations just as they are now, especially calculations of register pressure.
Differential Revision: https://reviews.llvm.org/D74873
show more ...
|
|
Revision tags: llvmorg-10.0.0-rc1, llvmorg-11-init |
|
| #
555d8f4e |
| 13-Jan-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Bundle loads before post-RA scheduler
We are relying on atrificial DAG edges inserted by the MemOpClusterMutation to keep loads and stores together in the post-RA scheduler. This does not w
[AMDGPU] Bundle loads before post-RA scheduler
We are relying on atrificial DAG edges inserted by the MemOpClusterMutation to keep loads and stores together in the post-RA scheduler. This does not work all the time since it allows to schedule a completely independent instruction in the middle of the cluster.
Removed the DAG mutation and added pass to bundle already clustered instructions. These bundles are unpacked before the memory legalizer because it does not work with bundles but also because it allows to insert waitcounts in the middle of a store cluster.
Removing artificial edges also allows a more relaxed scheduling.
Differential Revision: https://reviews.llvm.org/D72737
show more ...
|