|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
8b3ed1fa |
| 17-Jul-2022 |
Kazu Hirata <[email protected]> |
Remove redundant return statements (NFC)
Identified with readability-redundant-control-flow.
|
| #
b941857b |
| 28-Jun-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] New cost model for loop interchange
This is another attempt to land this patch.
The patch proposed to use a new cost model for loop interchange, which is obtained from loop cache
[LoopInterchange] New cost model for loop interchange
This is another attempt to land this patch.
The patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis.
Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern.
Updates made to test cases are mostly minor changes and some corrections. One change that applies to all tests is that we added an option `-cache-line-size=64` to the RUN lines. This is ensure that loop cache analysis receives a valid number of cache line size for correct analysis. Test coverage for loop interchange is not reduced.
Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model compeletely.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124926
show more ...
|
| #
878309cc |
| 23-Jun-2022 |
Evgenii Stepanov <[email protected]> |
Revert "[LoopInterchange] New cost model for loop interchange"
llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed integer overflow: 6148914691236517209 * 100 cannot be represented
Revert "[LoopInterchange] New cost model for loop interchange"
llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed integer overflow: 6148914691236517209 * 100 cannot be represented in type 'long'
https://lab.llvm.org/buildbot/#/builders/5/builds/25185
This reverts commit 1b24fe34b06cd9f2337313f513a8b19f9a37c5de.
show more ...
|
| #
1b24fe34 |
| 23-Jun-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] New cost model for loop interchange
This is the second attempt to land this patch.
The patch proposed to use a new cost model for loop interchange, which is obtained from loop cac
[LoopInterchange] New cost model for loop interchange
This is the second attempt to land this patch.
The patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis.
Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern.
Updates made to test cases are mostly minor changes and some corrections. One change that applies to all tests is that we added an option `-cache-line-size=64` to the RUN lines. This is ensure that loop cache analysis receives a valid number of cache line size for correct analysis. Test coverage for loop interchange is not reduced.
Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model compeletely.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124926
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
f1940a58 |
| 03-Jun-2022 |
Daniil Suchkov <[email protected]> |
Revert "[LoopInterchange] New cost model for loop interchange"
Reverting the commit due to numerous buildbot failures.
This reverts commit 006334470d8d1b5d8f630890336fcb45795749d1.
|
| #
00633447 |
| 02-Jun-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] New cost model for loop interchange
This patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis.
Given a loopnest, what loop cache
[LoopInterchange] New cost model for loop interchange
This patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis.
Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern.
Updates made to test cases are mostly minor changes and some corrections. Test coverage for loop interchange is not reduced.
Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model compeletely.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124926
show more ...
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
eac34875 |
| 06-Apr-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Try to achieve the most optimal access pattern after interchange
Motivated by pr43326 (https://bugs.llvm.org/show_bug.cgi?id=43326), where a slightly modified case is as follows.
[LoopInterchange] Try to achieve the most optimal access pattern after interchange
Motivated by pr43326 (https://bugs.llvm.org/show_bug.cgi?id=43326), where a slightly modified case is as follows.
void f(int e[10][10][10], int f[10][10][10]) { for (int a = 0; a < 10; a++) for (int b = 0; b < 10; b++) for (int c = 0; c < 10; c++) f[c][b][a] = e[c][b][a]; }
The ideal optimal access pattern after running interchange is supposed to be the following
void f(int e[10][10][10], int f[10][10][10]) { for (int c = 0; c < 10; c++) for (int b = 0; b < 10; b++) for (int a = 0; a < 10; a++) f[c][b][a] = e[c][b][a]; }
Currently loop interchange is limited to picking up the innermost loop and finding an order that is locally optimal for it. However, the pass failed to produce the globally optimal loop access order. For more complex examples what we get could be quite far from the globally optimal ordering.
What is proposed in this patch is to do a "bubble-sort" fashion when doing interchange. By comparing neighbors in `LoopList` in each iteration, we would be able to move each loop onto a most appropriate place, hence this is an approach that tries to achieve the globally optimal ordering.
The motivating example above is added as a test case.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D120386
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
abc8ca65 |
| 09-Mar-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Detect output dependency of a store instruction with itself
This patch is motivated by pr48057 where an output dependency is not detected since loop interchange did not check a sto
[LoopInterchange] Detect output dependency of a store instruction with itself
This patch is motivated by pr48057 where an output dependency is not detected since loop interchange did not check a store instruction with itself. Fixed that deficiency.
Reviewed By: bmahjour, Meinersbur, #loopoptwg
Differential Revision: https://reviews.llvm.org/D118102
show more ...
|
| #
59630917 |
| 02-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cl
Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1 |
|
| #
1ef04326 |
| 06-Feb-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Support loop interchange with floating point reductions
Enabled loop interchange support for floating point reductions if it is allowed to reorder floating point operations.
Previ
[LoopInterchange] Support loop interchange with floating point reductions
Enabled loop interchange support for floating point reductions if it is allowed to reorder floating point operations.
Previously when we encouter a floating point PHI node in the outer loop exit block, we bailed out since we could not detect floating point reductions in the early days. Now we remove this limiation since we are able to detect floating point reductions.
Reviewed By: #loopoptwg, Meinersbur
Differential Revision: https://reviews.llvm.org/D117450
show more ...
|
|
Revision tags: llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
fa6a2876 |
| 14-Jan-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Enable interchange with multiple inner loop indvars
Currently loop interchange only supports loops with one inner loop induction variable. This patch adds support for transformatio
[LoopInterchange] Enable interchange with multiple inner loop indvars
Currently loop interchange only supports loops with one inner loop induction variable. This patch adds support for transformation with more than one inner loop induction variables. The induction PHIs and induction increment instructions are moved/duplicated properly to the new outer header and the new outer latch, respectively.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D114917
show more ...
|
| #
37e34b74 |
| 13-Jan-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Enable interchange with multiple outer loop indvars
This patch enables loop interchange with multiple outer loop induction variables, and hence removes the limitation that only a s
[LoopInterchange] Enable interchange with multiple outer loop indvars
This patch enables loop interchange with multiple outer loop induction variables, and hence removes the limitation that only a single outer loop induction variable is supported. In fact, it turns out that the current pass already trivially supports multiple outer indvars, which is the result of a previous patch `https://reviews.llvm.org/D102743`. Therefore, this patch removed that limitation and provides test cases for multiple outer indvars.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D114916
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc2 |
|
| #
c251bfc3 |
| 06-Jan-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Remove a limitation in LoopInterchange legality
There was a limitation in legality that in the original inner loop latch, no instruction was allowed between the induction variable
[LoopInterchange] Remove a limitation in LoopInterchange legality
There was a limitation in legality that in the original inner loop latch, no instruction was allowed between the induction variable increment and the branch instruction. This is because we used to split the inner latch at the induction variable increment instruction. Since now we have split at the inner latch branch instruction and have properly duplicated instructions over to the split block, we remove this limitation.
Please refer to the test case updates to see how we now interchange loops where instructions exist between the induction variable increment and the branch instruction.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D115238
show more ...
|
| #
31b79b86 |
| 06-Jan-2022 |
David Blaikie <[email protected]> |
Revert "Remove unused variable (-Wunused)"
Patch that removed the use of this variable was reverted in 8ade3d43a3e48eb739c9db2f38b618fa213f0546
This reverts commit 3988a06d86e1a14dfd5f5fdae84ddbf9
Revert "Remove unused variable (-Wunused)"
Patch that removed the use of this variable was reverted in 8ade3d43a3e48eb739c9db2f38b618fa213f0546
This reverts commit 3988a06d86e1a14dfd5f5fdae84ddbf928e85dab.
show more ...
|
| #
8ade3d43 |
| 06-Jan-2022 |
Congzhe Cao <[email protected]> |
Revert "[LoopInterchange] Remove a limitation in LoopInterchange legality"
This reverts commit 15702ff9ce28b3f4aafec13be561359d4c721595 while I investigate a ppc build bot failure at https://lab.llv
Revert "[LoopInterchange] Remove a limitation in LoopInterchange legality"
This reverts commit 15702ff9ce28b3f4aafec13be561359d4c721595 while I investigate a ppc build bot failure at https://lab.llvm.org/buildbot#builders/36/builds/16051.
show more ...
|
| #
3988a06d |
| 06-Jan-2022 |
David Blaikie <[email protected]> |
Remove unused variable (-Wunused)
|
| #
15702ff9 |
| 06-Jan-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Remove a limitation in LoopInterchange legality
There was a limitation in legality that in the original inner loop latch, no instruction was allowed between the induction variable
[LoopInterchange] Remove a limitation in LoopInterchange legality
There was a limitation in legality that in the original inner loop latch, no instruction was allowed between the induction variable increment and the branch instruction. This is because we used to split the inner latch at the induction variable increment instruction. Since now we have split at the inner latch branch instruction and have properly duplicated instructions over to the split block, we remove this limitation.
Please refer to the test case updates to see how we now interchange loops where instructions exist between the induction variable increment and the branch instruction.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D115238
show more ...
|
| #
c16fd6a3 |
| 05-Jan-2022 |
Philip Reames <[email protected]> |
Rename doesNotReadMemory to onlyWritesMemory globally [NFC]
The naming has come up as a source of confusion in several recent reviews. onlyWritesMemory is consist with onlyReadsMemory which we use
Rename doesNotReadMemory to onlyWritesMemory globally [NFC]
The naming has come up as a source of confusion in several recent reviews. onlyWritesMemory is consist with onlyReadsMemory which we use for the corresponding readonly case as well.
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
9b8b1645 |
| 07-Nov-2021 |
Benjamin Kramer <[email protected]> |
Put implementation details into anonymous namespaces. NFCI.
|
| #
c714da2c |
| 31-Oct-2021 |
Kazu Hirata <[email protected]> |
[Transforms] Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC)
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
a7b7d22d |
| 16-Jul-2021 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Check lcssa phis in the inner latch in scenarios of multi-level nested loops
We already know that we need to check whether lcssa phis are supported in inner loop exit block or in o
[LoopInterchange] Check lcssa phis in the inner latch in scenarios of multi-level nested loops
We already know that we need to check whether lcssa phis are supported in inner loop exit block or in outer loop exit block, and we have logic to check them already. Presumably the inner loop latch does not have lcssa phis and there is no code that deals with lcssa phis in the inner loop latch. However, that assumption is not true, when we have loops with more than two-level nesting. This patch adds checks for lcssa phis in the inner latch.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D102300
show more ...
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
bfefde22 |
| 31-May-2021 |
Congzhe Cao <[email protected]> |
[LoopInterhcange] Handle movement of reduction phis appropriately
This patch fixes pr43326 and pr48212.
Currently when we move reduction phis to the right place, loop interchange assumes the first
[LoopInterhcange] Handle movement of reduction phis appropriately
This patch fixes pr43326 and pr48212.
Currently when we move reduction phis to the right place, loop interchange assumes the first phi in loop headers is an induction phi, skips the first phi and assumes the rest of phis are candidate reduction phis to move. However, it may not always be the case.
This patch loops over all phis in loop headers and considers a phi node as a candidate reduction phi to move only when it is indeed a reduction phi across outer and inner loop.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D102743
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
3f8be15f |
| 12-May-2021 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Handle lcssa PHIs with multiple predecessors
This is a bugfix in the transformation phase.
If the original outer loop header branches to both the inner loop (header) and the outer
[LoopInterchange] Handle lcssa PHIs with multiple predecessors
This is a bugfix in the transformation phase.
If the original outer loop header branches to both the inner loop (header) and the outer loop latch, and if there is an lcssa PHI node outside the loop nest, then after interchange the new outer latch will have an lcssa PHI node inserted which has two predecessors, i.e., the original outer header and the original outer latch. Currently the transformation assumes it has only one predecessor (the original outer latch) and crashes, since the inserted lcssa PHI node does not take both predecessors as incoming BBs.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D100792
show more ...
|
| #
40e3aa39 |
| 11-May-2021 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Fix legality for triangular loops
This is a bug fix in legality check.
When we encounter triangular loops such as the following form: for (int i = 0; i < m; i++) for (in
[LoopInterchange] Fix legality for triangular loops
This is a bug fix in legality check.
When we encounter triangular loops such as the following form: for (int i = 0; i < m; i++) for (int j = 0; j < i; j++), or
for (int i = 0; i < m; i++) for (int j = 0; j*i < n; j++),
we should not perform interchange since the number of executions of the loop body will be different before and after interchange, resulting in incorrect results.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D101305
show more ...
|
| #
d3f89d4d |
| 11-May-2021 |
Congzhe Cao <[email protected]> |
Revert "[LoopInterchange] Fix legality for triangular loops"
This reverts commit 29342291d25b83da97e74d75004b177ba41114fc.
The test case requires an assert build. Will add REQUIRES and re-commit.
|