|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
b941857b |
| 28-Jun-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] New cost model for loop interchange
This is another attempt to land this patch.
The patch proposed to use a new cost model for loop interchange, which is obtained from loop cache
[LoopInterchange] New cost model for loop interchange
This is another attempt to land this patch.
The patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis.
Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern.
Updates made to test cases are mostly minor changes and some corrections. One change that applies to all tests is that we added an option `-cache-line-size=64` to the RUN lines. This is ensure that loop cache analysis receives a valid number of cache line size for correct analysis. Test coverage for loop interchange is not reduced.
Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model compeletely.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124926
show more ...
|
| #
878309cc |
| 23-Jun-2022 |
Evgenii Stepanov <[email protected]> |
Revert "[LoopInterchange] New cost model for loop interchange"
llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed integer overflow: 6148914691236517209 * 100 cannot be represented
Revert "[LoopInterchange] New cost model for loop interchange"
llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed integer overflow: 6148914691236517209 * 100 cannot be represented in type 'long'
https://lab.llvm.org/buildbot/#/builders/5/builds/25185
This reverts commit 1b24fe34b06cd9f2337313f513a8b19f9a37c5de.
show more ...
|
| #
1b24fe34 |
| 23-Jun-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] New cost model for loop interchange
This is the second attempt to land this patch.
The patch proposed to use a new cost model for loop interchange, which is obtained from loop cac
[LoopInterchange] New cost model for loop interchange
This is the second attempt to land this patch.
The patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis.
Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern.
Updates made to test cases are mostly minor changes and some corrections. One change that applies to all tests is that we added an option `-cache-line-size=64` to the RUN lines. This is ensure that loop cache analysis receives a valid number of cache line size for correct analysis. Test coverage for loop interchange is not reduced.
Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model compeletely.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124926
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
eac34875 |
| 06-Apr-2022 |
Congzhe Cao <[email protected]> |
[LoopInterchange] Try to achieve the most optimal access pattern after interchange
Motivated by pr43326 (https://bugs.llvm.org/show_bug.cgi?id=43326), where a slightly modified case is as follows.
[LoopInterchange] Try to achieve the most optimal access pattern after interchange
Motivated by pr43326 (https://bugs.llvm.org/show_bug.cgi?id=43326), where a slightly modified case is as follows.
void f(int e[10][10][10], int f[10][10][10]) { for (int a = 0; a < 10; a++) for (int b = 0; b < 10; b++) for (int c = 0; c < 10; c++) f[c][b][a] = e[c][b][a]; }
The ideal optimal access pattern after running interchange is supposed to be the following
void f(int e[10][10][10], int f[10][10][10]) { for (int c = 0; c < 10; c++) for (int b = 0; b < 10; b++) for (int a = 0; a < 10; a++) f[c][b][a] = e[c][b][a]; }
Currently loop interchange is limited to picking up the innermost loop and finding an order that is locally optimal for it. However, the pass failed to produce the globally optimal loop access order. For more complex examples what we get could be quite far from the globally optimal ordering.
What is proposed in this patch is to do a "bubble-sort" fashion when doing interchange. By comparing neighbors in `LoopList` in each iteration, we would be able to move each loop onto a most appropriate place, hence this is an approach that tries to achieve the globally optimal ordering.
The motivating example above is added as a test case.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D120386
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5 |
|
| #
e4961218 |
| 11-Sep-2019 |
Florian Hahn <[email protected]> |
[LoopInterchange] Properly move condition, induction increment and ops to latch.
Currently we only rely on the induction increment to come before the condition to ensure the required instructions ge
[LoopInterchange] Properly move condition, induction increment and ops to latch.
Currently we only rely on the induction increment to come before the condition to ensure the required instructions get moved to the new latch.
This patch duplicates and moves the required instructions to the newly created latch. We move the condition to the end of the new block, then process its operands. We stop at operands that are defined outside the loop, or are the induction PHI.
We duplicate the instructions and update the uses in the moved instructions, to ensure other users remain intact. See the added test2 for such an example.
Reviewers: efriedma, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D67367
llvm-svn: 371595
show more ...
|
|
Revision tags: llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
cee313d2 |
| 17-Apr-2019 |
Eric Christopher <[email protected]> |
Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.
Will be re-reverting again.
llvm-svn: 358552
|
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
| #
c8bd6ea3 |
| 01-Nov-2018 |
Florian Hahn <[email protected]> |
[LoopInterchange] Remove support for inner-only reductions.
Inner-loop only reductions require additional checks to make sure they form a load-phi-store cycle across inner and outer loop. Otherwise
[LoopInterchange] Remove support for inner-only reductions.
Inner-loop only reductions require additional checks to make sure they form a load-phi-store cycle across inner and outer loop. Otherwise the reduction value is not properly preserved. This patch disables interchanging such loops for now, as it causes miscompiles in some cases and it seems to apply only for a tiny amount of loops. Across the test-suite, SPEC2000 and SPEC2006, 61 instead of 62 loops are interchange with inner loop reduction support disabled. With -loop-interchange-threshold=-1000, 3256 instead of 3267.
See the discussion and history of D53027 for an outline of how such legality checks could look like.
Reviewers: efriedma, mcrosier, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D53027
llvm-svn: 345877
show more ...
|
| #
6feb6371 |
| 26-Sep-2018 |
Florian Hahn <[email protected]> |
[LoopInterchange] Preserve LCSSA.
This patch extends LoopInterchange to move LCSSA to the right place after interchanging. This is required for LoopInterchange to become a function pass.
An alterna
[LoopInterchange] Preserve LCSSA.
This patch extends LoopInterchange to move LCSSA to the right place after interchanging. This is required for LoopInterchange to become a function pass.
An alternative to the manual moving of the PHIs, we could also re-form the LCSSA phis for a set of interchanged loops, but that's more expensive.
Reviewers: efriedma, mcrosier, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D52154
llvm-svn: 343132
show more ...
|
| #
3afb974a |
| 14-Sep-2018 |
Florian Hahn <[email protected]> |
[LoopInterchange] Preserve ScalarEvolution, by forgetting about interchanged loops.
As preparation for LoopInterchange becoming a loop pass, it needs to preserve ScalarEvolution. Even though interch
[LoopInterchange] Preserve ScalarEvolution, by forgetting about interchanged loops.
As preparation for LoopInterchange becoming a loop pass, it needs to preserve ScalarEvolution. Even though interchanging should not change the trip count of the loop, it modifies loop entry, latch and exit blocks.
I added -verify-scev to some loop interchange tests, but the verification does not catch problems caused by missing invalidation of SE in loop interchange, as the trip counts themselves do not change. So there might be potential to make the SE verification covering more stuff in the future.
Reviewers: mkazantsev, efriedma, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D52026
llvm-svn: 342209
show more ...
|
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1 |
|
| #
831a7577 |
| 05-Apr-2018 |
Florian Hahn <[email protected]> |
[LoopInterchange] Preserve LoopInfo after interchanging.
LoopInterchange relies on LoopInfo being up-to-date, so we should preserve it after interchanging. This patch updates restructureLoops to mov
[LoopInterchange] Preserve LoopInfo after interchanging.
LoopInterchange relies on LoopInfo being up-to-date, so we should preserve it after interchanging. This patch updates restructureLoops to move the BBs of the interchanged loops to the right place.
Reviewers: davide, efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45278
llvm-svn: 329264
show more ...
|
| #
b7921707 |
| 03-Apr-2018 |
Florian Hahn <[email protected]> |
[LoopInterchange] Update tests so DA can handle access after D35430.
I have taken the opportunity to simplify some tests slightly and move parts around.
It also brings back a few IR checks for inte
[LoopInterchange] Update tests so DA can handle access after D35430.
I have taken the opportunity to simplify some tests slightly and move parts around.
It also brings back a few IR checks for interchangable loops.
Reviewers: karthikthecool, sebpop, grosser
Reviewed By: sebpop
Differential Revision: https://reviews.llvm.org/D45207
llvm-svn: 329081
show more ...
|
|
Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1 |
|
| #
bf6e1c26 |
| 06-Mar-2018 |
Sebastian Pop <[email protected]> |
DA: remove uses of GEP, only ask SCEV
It's been quite some time the Dependence Analysis (DA) is broken, as it uses the GEP representation to "identify" multi-dimensional arrays. It even wrongly dete
DA: remove uses of GEP, only ask SCEV
It's been quite some time the Dependence Analysis (DA) is broken, as it uses the GEP representation to "identify" multi-dimensional arrays. It even wrongly detects multi-dimensional arrays in single nested loops:
from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6 ;; for (long int i = 0; i < 50; i++) { ;; A[i][3*i - 6] = i; ;; *B++ = A[i][i];
DA used to detect two subscripts, which makes no sense in the LLVM IR or in C/C++ semantics, as there are no guarantees as in Fortran of subscripts not overlapping into a next array dimension:
maximum nesting levels = 1 SrcPtrSCEV = %A DstPtrSCEV = %A using GEPs subscript 0 src = {0,+,1}<nuw><nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} subscript 1 src = {-6,+,3}<nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} Separable = {} Coupled = {1}
With the current patch, DA will correctly work on only one dimension:
maximum nesting levels = 1 SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body> DstSCEV = {%A,+,404}<%for.body> subscript 0 src = {(-2424 + %A)<nsw>,+,1212}<%for.body> dst = {%A,+,404}<%for.body> class = 1 loops = {1} Separable = {0} Coupled = {}
This change removes all uses of GEP from DA, and we now only rely on the SCEV representation.
The patch does not turn on -da-delinearize by default, and so the DA analysis will be more conservative in the case of multi-dimensional memory accesses in nested loops.
I disabled some interchange tests, as the DA is not able to disambiguate the dependence anymore. To make DA stronger, we may need to compute a bound on the number of iterations based on the access functions and array dimensions.
The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to avoid checking for snippets of LLVM IR: this form of checking is very hard to maintain. Instead, we now check for output of the pass that are more meaningful than dozens of lines of LLVM IR. Some tests now require -debug messages and thus only enabled with asserts.
Patch written by Sebastian Pop and Aditya Kumar.
Differential Revision: https://reviews.llvm.org/D35430
llvm-svn: 326837
show more ...
|
|
Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3 |
|
| #
c6296fea |
| 14-Feb-2018 |
Florian Hahn <[email protected]> |
[LoopInterchange] Incrementally update the dominator tree.
We can use incremental dominator tree updates to avoid re-calculating the dominator tree after interchanging 2 loops.
Reviewers: dmgreen,
[LoopInterchange] Incrementally update the dominator tree.
We can use incremental dominator tree updates to avoid re-calculating the dominator tree after interchanging 2 loops.
Reviewers: dmgreen, kuhar
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D43176
llvm-svn: 325122
show more ...
|
|
Revision tags: llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1 |
|
| #
907b60fb |
| 21-Oct-2017 |
David Green <[email protected]> |
[LoopInterchange] Fix phi node ordering miscompile.
The way that splitInnerLoopHeader splits blocks requires that the induction PHI will be the first PHI in the inner loop header. This makes sure th
[LoopInterchange] Fix phi node ordering miscompile.
The way that splitInnerLoopHeader splits blocks requires that the induction PHI will be the first PHI in the inner loop header. This makes sure that is actually the case when there are both IV and reduction phis.
Differential Revision: https://reviews.llvm.org/D38682
llvm-svn: 316261
show more ...
|