|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
5648f717 |
| 07-Sep-2021 |
Kazu Hirata <[email protected]> |
[Analysis, Target, Transforms] Construct SmallVector with iterator ranges (NFC)
|
|
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
d9b9fdd9 |
| 08-Jul-2021 |
Ruiling Song <[email protected]> |
[AMDGPU] Don't handle export done when unify exit nodes
This patch aims to revert the changes introduced by D70781, D71192 and D76364.
D70781 was introduced to fix a hardware hang where we do not insert exp-null-done for a kill inside an infinite loop. At that time we had not added exp-null-done for kill early termination, but I believe that, as of now, we will always add the exp-null-done for the early termination case in LaterBranchLowering.
D71192 was introduced to handle the only_kill case, which is also handled by the kill early termination work.
D76364 was used to fix a regression caused by D71192, where we cleared the done bit of the export in the existing program and did not let the normal return block branch to the new unified return block.
With this change, we just trust that frontends have set up exp-done correctly, which is true for all existing frontends. The backend only inserts exp-null-done for the kill cases, which is handled in SILateBranchLowering.cpp.
Reviewed by: critson
Differential Revision: https://reviews.llvm.org/D105610
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
092a3ce5 |
| 18-May-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Fix typo in comment
|
| #
dc2f6bf5 |
| 27-Apr-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Minor refactoring in AMDGPUUnifyDivergentExitNodes. NFC.
Make unifyReturnBlockSet a member function so we don't have to pass TTI around as an argument.
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3 |
|
| #
7925aa09 |
| 29-Jan-2021 |
Kazu Hirata <[email protected]> |
[llvm] Populate SmallVector at construction time (NFC)
|
|
Revision tags: llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
| #
985f899b |
| 03-Jan-2021 |
Kazu Hirata <[email protected]> |
[Target] Use llvm::append_range (NFC)
|
| #
7c8b8063 |
| 02-Jan-2021 |
Roman Lebedev <[email protected]> |
[SimplifyCFG][AMDGPU] AMDGPUUnifyDivergentExitNodes: SimplifyCFG isn't ready to preserve PostDomTree
There are a number of transforms in SimplifyCFG that take DomTree out of DomTreeUpdater and do updates manually. Until they are fixed, user passes are unable to claim that PDT is preserved.
Note that the default for SimplifyCFG is still not to preserve DomTree, so this is still effectively NFC.
|
| #
4b806473 |
| 01-Jan-2021 |
Roman Lebedev <[email protected]> |
[AMDGPU][SimplifyCFG] Teach AMDGPUUnifyDivergentExitNodes to preserve {,Post}DomTree
This is a (last big?) part of the patch series to make SimplifyCFG preserve DomTree. Currently, it still does not actually preserve it, even though it is pretty much fully updated to preserve it.
Once the default is flipped, a valid DomTree must be passed into simplifyCFG, which means that whatever pass calls simplifyCFG, should also be smart about DomTree's.
As far as i can see from `check-llvm` with default flipped, this is the last LLVM test batch (other than bugpoint tests) that needed fixes to not break with default flipped.
The changes here are boringly identical to the ones i did over 42+ times/commits recently already, so while AMDGPU is outside of my normal ecosystem, i'm going to go for post-commit review here, like in all the other 42+ changes.
Note that while the pass is taught to preserve {,Post}DomTree, it still doesn't do that by default, because simplifycfg still doesn't do that by default, and flipping default in this pass will implicitly flip the default for simplifycfg. That will happen, but not right now.
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
49dac4ac |
| 16-Dec-2020 |
Roman Lebedev <[email protected]> |
[SimplifyCFG] MergeBlockIntoPredecessor() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater into it.
Fixes DomTree preservation for a large number of tests, all of which are marked as such so that they do not regress.
|
|
Revision tags: llvmorg-11.0.1-rc1 |
|
| #
ad3ec089 |
| 13-Nov-2020 |
Jay Foad <[email protected]> |
[AMDGPU] One more use of the new export target names. NFC.
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1 |
|
| #
4028409d |
| 16-Jul-2020 |
Roman Lebedev <[email protected]> |
Reland "[NFC] SimplifyCFGOptions: drop multi-parameter ctor, use default member-init"
This reverts commit 5831e86190966d58385678eb74b26aefacbfd101, which reverted commit 90c1b0442a031d6cad686fdc4e5d3db03c3603a6 in preparation for reverting commit b2018198c32a0535bb1f5bb5b40fbcf50d8d47b7 in commit 1067d3e176ea7b0b1942c163bf8c6c90107768c1 due to the introduction of a dependency cycle.
Now that the other revert is reverted with a fix, this can be relanded.
|
| #
5831e861 |
| 16-Jul-2020 |
Adrian Kuegel <[email protected]> |
Revert "[NFC] SimplifyCFGOptions: drop multi-parameter ctor, use default member-init"
This reverts commit 90c1b0442a031d6cad686fdc4e5d3db03c3603a6. This is based on another commit which also needs to be reverted. The other commit introduced a Dependency Cycle between Transforms/Scalar and TransformUtils. Scalar already depends (in many ways) on TransformUtils, so making TransformUtils depend on Scalar should be avoided.
|
| #
90c1b044 |
| 15-Jul-2020 |
Roman Lebedev <[email protected]> |
[NFC] SimplifyCFGOptions: drop multi-parameter ctor, use default member-init
Likewise, just use the builder pattern. Taking multiple params is unmaintainable.
|
|
Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
dfcc68c5 |
| 18-May-2020 |
Nicolai Hähnle <[email protected]> |
DomTree: Remove getRoots() accessor
Summary: Avoid exposing details about how roots are stored. This enables subsequent type-erasure changes.
v5: - cleanup a unit test by using EXPECT_EQ instead of EXPECT_TRUE
Change-Id: I532b774cc71f2224e543bc7d79131d97f63f093d
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: jvesely, wdng, hiraditya, kuhar, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83085
|
| #
72e4da45 |
| 05-Jun-2020 |
Jay Foad <[email protected]> |
Correctly report modified status for AMDGPUUnifyDivergentExitNodes
Related to https://reviews.llvm.org/D80916
Differential Revision: https://reviews.llvm.org/D81271
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5 |
|
| #
d1a7bfca |
| 18-Mar-2020 |
Piotr Sobczak <[email protected]> |
[AMDGPU] Fix AMDGPUUnifyDivergentExitNodes
Summary: For the case where "done" bits on existing exports are removed by unifyReturnBlockSet(), unify all return blocks - even the uniformly reached ones. We do not want to end up with a non-unified, uniformly reached block containing a normal export with the "done" bit cleared.
That case is believed to be rare - possible with infinite loops in pixel shaders.
This is a fix for D71192.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76364
|
|
Revision tags: llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3 |
|
| #
ce06d507 |
| 09-Dec-2019 |
Connor Abbott <[email protected]> |
AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns
Summary: The code was assuming in a few places that if there was only one exit from the function that it was a normal return, which is invalid. It could be an infinite loop, in which case we still need to insert the usual fake edge so that the null export happens. This fixes shaders that end with an infinite loop that discards.
Reviewers: arsenm, nhaehnle, critson
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71192
|
|
Revision tags: llvmorg-9.0.1-rc2 |
|
| #
87d98c14 |
| 27-Nov-2019 |
Connor Abbott <[email protected]> |
AMDGPU: Fix handling of infinite loops in fragment shaders
Summary: Due to the fact that kill is just a normal intrinsic, even though it's supposed to terminate the thread, we can end up with provably infinite loops that are actually supposed to end successfully. The AMDGPUUnifyDivergentExitNodes pass breaks up these loops, but because there's no obvious place to make the loop branch to, it just makes it return immediately, which skips the exports that are supposed to happen at the end and hangs the GPU if all the threads end up being killed.
While it would be nice if the fact that kill terminates the thread were modeled in the IR, I think that the structurizer as-is would make a mess if we did that when the kill is inside control flow. For now, we just add a null export at the end to make sure that it always exports something, which fixes the immediate problem without penalizing the more common case. This means that we sometimes do two "done" exports when only some of the threads enter the discard loop, but from tests the hardware seems ok with that.
This fixes dEQP-VK.graphicsfuzz.while-inside-switch with radv.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70781
|
| #
08b205bb |
| 29-Jan-2020 |
Connor Abbott <[email protected]> |
Revert "AMDGPU: Fix handling of infinite loops in fragment shaders"
This reverts commit 0994c485e61322a04e580d83617eab547292aba2.
|
| #
13ab22ab |
| 29-Jan-2020 |
Connor Abbott <[email protected]> |
Revert "AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns"
This reverts commit 323bfde20c5f3e63db3d6b385b394ed38542abe6.
|
| #
323bfde2 |
| 09-Dec-2019 |
Connor Abbott <[email protected]> |
AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns
Summary: The code was assuming in a few places that if there was only one exit from the function that it was a normal return, which is invalid. It could be an infinite loop, in which case we still need to insert the usual fake edge so that the null export happens. This fixes shaders that end with an infinite loop that discards.
Reviewers: arsenm, nhaehnle, critson
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71192
|
| #
0994c485 |
| 27-Nov-2019 |
Connor Abbott <[email protected]> |
AMDGPU: Fix handling of infinite loops in fragment shaders
Summary: Due to the fact that kill is just a normal intrinsic, even though it's supposed to terminate the thread, we can end up with provably infinite loops that are actually supposed to end successfully. The AMDGPUUnifyDivergentExitNodes pass breaks up these loops, but because there's no obvious place to make the loop branch to, it just makes it return immediately, which skips the exports that are supposed to happen at the end and hangs the GPU if all the threads end up being killed.
While it would be nice if the fact that kill terminates the thread were modeled in the IR, I think that the structurizer as-is would make a mess if we did that when the kill is inside control flow. For now, we just add a null export at the end to make sure that it always exports something, which fixes the immediate problem without penalizing the more common case. This means that we sometimes do two "done" exports when only some of the threads enter the discard loop, but from tests the hardware seems ok with that.
This fixes dEQP-VK.graphicsfuzz.while-inside-switch with radv.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70781
|
|
Revision tags: llvmorg-9.0.1-rc1 |
|
| #
05da2fe5 |
| 13-Nov-2019 |
Reid Kleckner <[email protected]> |
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation.
I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
  recompiles  touches  affected_files  header
  342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
  314730      234      1345            llvm/include/llvm/InitializePasses.h
  307036      118      2602            llvm/include/llvm/ADT/APInt.h
  213049      59       3611            llvm/include/llvm/Support/MathExtras.h
  170422      47       3626            llvm/include/llvm/Support/Compiler.h
  162225      45       3605            llvm/include/llvm/ADT/Optional.h
  158319      63       2513            llvm/include/llvm/ADT/Triple.h
  140322      39       3598            llvm/include/llvm/ADT/StringRef.h
  137647      59       2333            llvm/include/llvm/Support/Error.h
  131619      73       1803            llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
|
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3 |
|
| #
688afeb8 |
| 25-Jun-2019 |
Diego Novillo <[email protected]> |
Update phis in AMDGPUUnifyDivergentExitNodes
Original patch https://reviews.llvm.org/D63659 from Steven Perron <[email protected]>
The pass AMDGPUUnifyDivergentExitNodes does not update the phi nodes in the successors of blocks that it splits. This is fixed by calling BasicBlock::splitBasicBlock to split the block instead of doing it manually. This does extra work because a new conditional branch is created in BB which is immediately replaced, but I think the simplicity is worth it. It also helps make the code more future-proof in case other things need to be updated.
llvm-svn: 364342
show more ...
|